xiaolong/datablocks-gsod

Get Started

Check out our Blocks Directory for a full list of data blocks and use cases.

Start modeling by reading through this Discourse post.

DDL Statements

Use the following DDL statements as a reference and starting point for loading our datasets into your own database.

Note: Our DDL statements follow Amazon Redshift syntax. You may need to adjust data types and other fields as necessary for your dialect; see the PostgreSQL sketch after the schema below for an example.

Working with Redshift? Refer to our steps below.

Schema


CREATE TABLE zcta_distances(
   zip1        INTEGER  PRIMARY KEY
  ,zip2        INTEGER
  ,mi_to_zcta5 NUMERIC(16,13)
);

CREATE TABLE zipcode(
   zip_code  INTEGER
  ,latitude  FLOAT
  ,longitude FLOAT
  ,city      VARCHAR(50)
  ,state     VARCHAR(2)
  ,county    VARCHAR(50)
);

CREATE TABLE zipcode_station(
   zip_code                      INTEGER
  ,year                          INTEGER
  ,nearest_station_id            INTEGER
  ,distance_from_nearest_station NUMERIC(13,10)
);

CREATE TABLE zcta_county_map(
   zcta5         VARCHAR(50)
  ,state         VARCHAR(2)
  ,county        VARCHAR(3)
  ,geoid         VARCHAR(5)
  ,poppt         INTEGER
  ,hupt          INTEGER
  ,areapt        INTEGER
  ,arealandpt    INTEGER
  ,zpop          INTEGER
  ,zhu           INTEGER
  ,zarea         INTEGER
  ,zarealand     INTEGER
  ,copop         INTEGER
  ,cohu          INTEGER
  ,coarea        INTEGER
  ,coarealand    INTEGER
  ,zpoppct       NUMERIC(4,1)
  ,zhupct        NUMERIC(5,2)
  ,zareapct      NUMERIC(5,2)
  ,zarealandpct  NUMERIC(5,2)
  ,copoppct      NUMERIC(4,2)
  ,cohupct       NUMERIC(4,2)
  ,coareapct     NUMERIC(4,2)
  ,coarealandpct NUMERIC(4,2)
);

CREATE TABLE stations(
   usaf    INTEGER
  ,wban    INTEGER
  ,name    VARCHAR(30)
  ,country VARCHAR(2)
  ,fips    VARCHAR(2)
  ,state   VARCHAR(2)
  ,call    VARCHAR(30)
  ,lat     FLOAT
  ,lon     FLOAT
  ,elev    NUMERIC(7,1)
  ,begin   DATETIME
  ,"end"   DATETIME
);

CREATE TABLE gsod(
   stn                  FLOAT
  ,wban                 FLOAT
  ,year                 FLOAT
  ,mo                   FLOAT
  ,da                   FLOAT
  ,temp                 NUMERIC(4,2)
  ,count_temp           FLOAT
  ,dewp                 NUMERIC(4,2)
  ,count_dewp           FLOAT
  ,slp                  NUMERIC(4,2)
  ,count_slp            FLOAT
  ,stp                  NUMERIC(4,2)
  ,count_stp            FLOAT
  ,visib                NUMERIC(4,2)
  ,count_visib          FLOAT
  ,wdsp                 NUMERIC(4,2)
  ,count_wdsp           FLOAT
  ,mxpsd                NUMERIC(4,2)
  ,gust                 NUMERIC(4,2)
  ,max                  NUMERIC(4,2)
  ,flag_max             VARCHAR(50)
  ,min                  NUMERIC(4,2)
  ,flag_min             VARCHAR(50)
  ,prcp                 NUMERIC(4,2)
  ,flag_prcp            VARCHAR(50)
  ,sndp                 NUMERIC(4,2)
  ,fog                  FLOAT
  ,rain_drizzle         FLOAT
  ,snow_ice_pellets     FLOAT
  ,hail                 FLOAT
  ,thunder              FLOAT
  ,tornado_funnel_cloud FLOAT
);
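
For example, a minimal sketch of how the stations DDL might be adapted for PostgreSQL (an illustration only; your dialect may need different changes): DATETIME is a Redshift alias for TIMESTAMP that PostgreSQL does not accept, while "end" remains a reserved word and stays quoted.

CREATE TABLE stations(
   usaf    INTEGER
  ,wban    INTEGER
  ,name    VARCHAR(30)
  ,country VARCHAR(2)
  ,fips    VARCHAR(2)
  ,state   VARCHAR(2)
  ,call    VARCHAR(30)
  ,lat     FLOAT
  ,lon     FLOAT
  ,elev    NUMERIC(7,1)
  ,begin   TIMESTAMP  -- DATETIME is not a PostgreSQL type
  ,"end"   TIMESTAMP  -- reserved word, so it must stay quoted
);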

Copy Data from S3

COPY zipcode
FROM 's3://looker-datablocks/gsod/zipcode/'
CREDENTIALS 'aws_access_key_id=<aws_access_key_id>;aws_secret_access_key=<aws_secret_access_key>'  -- replace with access key and secret key from step 1
REGION 'us-east-1'
IGNOREHEADER as 1
CSV;

COPY zipcode_station
FROM 's3://looker-datablocks/gsod/zipcode_station/'
CREDENTIALS 'aws_access_key_id=<aws_access_key_id>;aws_secret_access_key=<aws_secret_access_key>'  -- replace with access key and secret key from step 1
REGION 'us-east-1'
IGNOREHEADER as 1
CSV;

COPY zcta_county_map
FROM 's3://looker-datablocks/gsod/zcta_county_map/'
CREDENTIALS 'aws_access_key_id=<aws_access_key_id>;aws_secret_access_key=<aws_secret_access_key>'  -- replace with access key and secret key from step 1
REGION 'us-east-1'
IGNOREHEADER as 1
CSV;

COPY stations
FROM 's3://looker-datablocks/gsod/stations/'
CREDENTIALS 'aws_access_key_id=<aws_access_key_id>;aws_secret_access_key=<aws_secret_access_key>'  -- replace with access key and secret key from step 1
REGION 'us-east-1'
IGNOREHEADER as 1
CSV;

COPY gsod
FROM 's3://looker-datablocks/gsod/gsod/'
CREDENTIALS 'aws_access_key_id=<aws_access_key_id>;aws_secret_access_key=<aws_secret_access_key>'  -- replace with access key and secret key from step 1
REGION 'us-east-1'
IGNOREHEADER as 1
CSV;
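
If a COPY command fails or loads fewer rows than expected, Redshift records per-row diagnostics in the STL_LOAD_ERRORS system table. A quick check of the most recent failures:

-- Inspect the latest COPY errors: which file, line, and column failed, and why.
SELECT starttime,
       filename,
       line_number,
       colname,
       err_reason
FROM stl_load_errors
ORDER BY starttime DESC
LIMIT 10;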

datablocks-gsod-redshift

Using Redshift? These instructions walk through uploading the GSOD dataset into your Redshift database. Note: if you already have an AWS IAM user with the proper policy, you may skip step 1.

Overall Steps:

  1. In your AWS console, apply our policy to your IAM user and grab the IAM access key ID and secret access key (these are used in the CREDENTIALS clause of the COPY commands in step 3)
  2. Create tables in Redshift
  3. Copy data to Redshift from Looker's S3 bucket
  4. Add LookML files to your Looker project

Step 1: Add Policy to IAM User and Get Access Key

If you don't already have an IAM user with an access key and secret access key, you will need to create one in the AWS console.

Once the user is created, you will be provided with an access key ID and secret access key. Write these down; the secret key is shown only once. See the AWS documentation for more information on access keys.

Next, you will need to attach our policy to your IAM user so it can copy data from the Looker S3 bucket. You can copy the policy directly from below:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt1507928463000",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::looker-datablocks",
                "arn:aws:s3:::looker-datablocks/*"
            ]
        }
    ]
}


Step 2: Create Tables in Redshift

Run the following CREATE TABLE commands in Redshift. Refer to the schema above for the full list of tables you may need to define.

CREATE TABLE stations(
...
);

Step 3: Copy Data to Redshift from Looker's S3 Bucket

Run the following COPY commands in Redshift. Refer to the full list of COPY commands above. Note: you will need to substitute your aws_access_key_id and aws_secret_access_key from step 1 into each statement.

COPY stations
FROM 's3://looker-datablocks/gsod/stations/'
CREDENTIALS 'aws_access_key_id=<aws_access_key_id>;aws_secret_access_key=<aws_secret_access_key>'  -- replace with access key and secret key from step 1
REGION 'us-east-1'
IGNOREHEADER as 1
CSV;
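
Once the copies complete, a quick sanity check confirms each table received rows (a minimal sketch; extend the UNION ALL with one branch per table you loaded):

-- Row counts per table; zero rows suggests a failed or skipped COPY.
SELECT 'stations' AS table_name, COUNT(*) AS row_count FROM stations
UNION ALL
SELECT 'gsod', COUNT(*) FROM gsod
UNION ALL
SELECT 'zipcode', COUNT(*) FROM zipcode;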

Step 4: Add LookML Files to your Looker Project

  • Copy the LookML files from this repo (or download them directly)
  • Add the files to your Looker project (the Redshift files are prefixed with rs)
  • Change the connection parameter in the model file to your Redshift connection


Now you're ready to explore the GSOD data and combine it with your other datasets!
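
For example, a starting-point query (a minimal sketch, assuming the standard NOAA GSOD convention that gsod.stn and gsod.wban correspond to stations.usaf and stations.wban) that averages the daily mean temperature by US state for one year:

-- Average daily mean temperature by US state for 2016.
SELECT s.state,
       AVG(g.temp) AS avg_daily_temp
FROM gsod AS g
JOIN stations AS s
  ON g.stn  = s.usaf   -- GSOD station id = USAF id (assumed standard mapping)
 AND g.wban = s.wban
WHERE g.year = 2016
  AND s.country = 'US'
GROUP BY s.state
ORDER BY avg_daily_temp DESC;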
