tariq452/Data-Modeling-with-Postgres

Purpose:

Build a data warehouse for analytical operations so that Sparkify can analyze the data it has been collecting on songs and user activity in its new music streaming app. The analytics team is particularly interested in understanding what songs users are listening to.

Benefits

Queries become fast and simple.
Currently, the analytics team has no easy way to query the data, which resides in a directory of JSON logs on user activity in the app and a directory of JSON metadata on the songs in the app.

Design schema

A star schema is used because it keeps analytical queries simple: the fact table joins directly to each dimension table.

Fact Table

songplays - records in log data associated with song plays, i.e. records with page NextSong
    songplay_id, start_time, user_id, level, song_id, artist_id, session_id, location, user_agent

Dimension Tables

users - users in the app
    user_id, first_name, last_name, gender, level

songs - songs in music database
    song_id, title, artist_id, year, duration

artists - artists in music database
    artist_id, name, location, lattitude, longitude

time - timestamps of records in songplays broken down into specific units
    start_time, hour, day, week, month, year, weekday
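As a rough illustration of how this schema could be expressed in a file such as sql_queries.py, the sketch below builds one CREATE TABLE statement per table. The column names come from the schema above; every SQL type and constraint, and the names SCHEMA, create_table_sql and CREATE_TABLE_QUERIES, are assumptions made for this example and may differ from the repository's actual code.

```python
# Column names follow the README's schema. All types and constraints are
# assumptions for illustration; the README lists column names only.
SCHEMA = {
    "songplays": [
        ("songplay_id", "SERIAL PRIMARY KEY"),
        ("start_time", "TIMESTAMP NOT NULL"),
        ("user_id", "INT NOT NULL"),
        ("level", "VARCHAR"),
        ("song_id", "VARCHAR"),
        ("artist_id", "VARCHAR"),
        ("session_id", "INT"),
        ("location", "VARCHAR"),
        ("user_agent", "VARCHAR"),
    ],
    "users": [
        ("user_id", "INT PRIMARY KEY"),
        ("first_name", "VARCHAR"),
        ("last_name", "VARCHAR"),
        ("gender", "VARCHAR"),
        ("level", "VARCHAR"),
    ],
    "songs": [
        ("song_id", "VARCHAR PRIMARY KEY"),
        ("title", "VARCHAR"),
        ("artist_id", "VARCHAR"),
        ("year", "INT"),
        ("duration", "NUMERIC"),
    ],
    "artists": [
        ("artist_id", "VARCHAR PRIMARY KEY"),
        ("name", "VARCHAR"),
        ("location", "VARCHAR"),
        ("lattitude", "NUMERIC"),  # spelled as in the README's column list
        ("longitude", "NUMERIC"),
    ],
    "time": [
        ("start_time", "TIMESTAMP PRIMARY KEY"),
        ("hour", "INT"),
        ("day", "INT"),
        ("week", "INT"),
        ("month", "INT"),
        ("year", "INT"),
        ("weekday", "INT"),
    ],
}


def create_table_sql(table, columns):
    """Render a CREATE TABLE statement from (column name, SQL type) pairs."""
    cols = ",\n    ".join(f"{name} {sqltype}" for name, sqltype in columns)
    return f"CREATE TABLE IF NOT EXISTS {table} (\n    {cols}\n);"


# One statement per table: the fact table first, then the four dimensions.
CREATE_TABLE_QUERIES = [create_table_sql(t, c) for t, c in SCHEMA.items()]
```

Keeping the column list in one structure makes it easy to check the DDL against the schema description; a script like create_tables.py could then execute each statement in turn.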

[Figure: sample rows from the songplays table]

ETL

The ETL pipeline is written in Python. It reads the JSON files, processes the data, and inserts it into the tables.
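The transform step of such a pipeline might look like the sketch below. It keeps only log records whose page is NextSong (as the songplays definition requires) and breaks a timestamp into the units of the time table. It assumes each log line is one JSON object and that the timestamp is stored in milliseconds since the Unix epoch; the function names iter_next_song_events and time_row are hypothetical and not taken from etl.py.

```python
import json
from datetime import datetime, timezone


def iter_next_song_events(lines):
    """Yield parsed log records whose page is NextSong.

    Assumes one JSON object per line (an assumption about the log format).
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("page") == "NextSong":
            yield event


def time_row(ts_ms):
    """Split a millisecond epoch timestamp (assumed unit) into time-table columns.

    Returns (start_time, hour, day, week, month, year, weekday), using UTC,
    the ISO week number, and Monday == 0 for weekday.
    """
    t = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc)
    return (t, t.hour, t.day, t.isocalendar()[1], t.month, t.year, t.weekday())
```

Each resulting tuple could then be passed as parameters to an INSERT statement for the time table, with the filtered events feeding the users and songplays inserts.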

Example Query

SELECT u.gender, COUNT(*) FROM songplays s INNER JOIN users u ON (s.user_id = u.user_id) GROUP BY u.gender;

The query above counts song plays by user gender, showing how much male and female users use the application.
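To try the query from Python, a minimal sketch is shown below. The function plays_by_gender accepts any DB-API 2.0 connection, so in the real project it could be given a connection to the Postgres database. So that the example runs without a database server, demo_connection builds a tiny in-memory SQLite stand-in with made-up rows; both function names and the sample data are assumptions for illustration, not part of the repository.

```python
import sqlite3

GENDER_COUNT_QUERY = """
SELECT u.gender, COUNT(*)
FROM songplays s
INNER JOIN users u ON (s.user_id = u.user_id)
GROUP BY u.gender;
"""


def plays_by_gender(conn):
    """Run the example query on a DB-API connection and return {gender: count}."""
    cur = conn.cursor()
    try:
        cur.execute(GENDER_COUNT_QUERY)
        return dict(cur.fetchall())
    finally:
        cur.close()


def demo_connection():
    """Build an in-memory SQLite stand-in holding only the columns the query needs.

    The rows are invented sample data, not real Sparkify records.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (user_id INTEGER PRIMARY KEY, gender TEXT);
        CREATE TABLE songplays (songplay_id INTEGER PRIMARY KEY, user_id INTEGER);
        INSERT INTO users VALUES (1, 'F'), (2, 'M');
        INSERT INTO songplays (user_id) VALUES (1), (1), (2);
        """
    )
    return conn
```

Calling plays_by_gender(demo_connection()) returns the number of plays per gender for the sample rows. Note that the query counts rows in songplays, so a user with many plays contributes many times to the total.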

To start this project, follow the steps below:

1. Run create_tables.py to create the database and the tables.
2. Run etl.py to read the JSON files, process the data, and insert it into the tables.
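The two steps can also be chained from a small driver script, sketched below under the assumption that both scripts sit in the current working directory and need no command-line arguments. The run_steps helper and its runner parameter are hypothetical and not part of the repository.

```python
import subprocess
import sys

# Order matters: the tables must exist before the ETL step inserts into them.
STEPS = ["create_tables.py", "etl.py"]


def run_steps(scripts=STEPS, runner=subprocess.run):
    """Run each script with the current Python interpreter, in order.

    check=True makes subprocess.run raise if a script exits with an error,
    so etl.py is not started when create_tables.py fails.
    """
    for script in scripts:
        runner([sys.executable, script], check=True)
```

Calling run_steps() from the repository directory runs both scripts in sequence; the runner parameter exists only so the sequence can be exercised without launching real processes.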

Repository files:

README.md
create_tables.py
etl.py
sql_queries.py
