ecigwe/udacity_tmdb_movie

TMDB Movie Data Analysis

by Igwe Emmanuel

Objectives

This is a repository for Udacity Data Analyst Project 1 (Investigate a Dataset). The dataset used in the project is also included in this repository.

Installation

The libraries used in this project are:

  1. Pandas – For storing and manipulating structured data. Pandas is built on NumPy (upgrade Pandas to version 0.25.1)
  2. NumPy – For multi-dimensional array and matrix data structures, and for performing mathematical operations
  3. Matplotlib – For all visualizations (including maps and graphs)

Introduction

I analyzed a dataset containing information on about 10,000 movies collected from The Movie Database (TMDb), including user ratings and revenue. The analysis focuses on the following questions:

  1. How do vote count, popularity and budget affect the revenue generated?
  2. How does runtime affect the revenue generated?
  3. How strong is the effect of budget and vote count on revenue compared with the other properties?
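One way to approach these questions is to compute the correlation of each property with revenue. The snippet below is a minimal sketch, not the project's actual code: the column names (`budget`, `vote_count`, `popularity`, `runtime`, `revenue`) are assumed, and the rows are made up for illustration.

```python
import pandas as pd

# Made-up rows for illustration only; these are NOT values from the TMDb dataset.
# Column names are assumptions about how the dataset is laid out.
movies = pd.DataFrame({
    "budget": [5, 20, 52, 100, 150],
    "vote_count": [50, 300, 900, 2000, 4000],
    "popularity": [0.4, 0.9, 1.5, 3.0, 5.2],
    "runtime": [95, 130, 88, 110, 102],
    "revenue": [8, 45, 160, 330, 520],
})


def revenue_correlations(df, target="revenue"):
    """Return the Pearson correlation of every other column with `target`,
    sorted from strongest positive to strongest negative."""
    corr = df.corr()[target].drop(target)
    return corr.sort_values(ascending=False)


correlations = revenue_correlations(movies)
print(correlations)
```

A correlation close to 1 or -1 points to a strong linear relationship with revenue, and a value near 0 to a weak one. Correlation alone does not establish that one property causes the other.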

Project Methodology

The main steps for this project can be summarized as follows:

  1. Data Wrangling
  2. Data Assessment
  3. Data Cleaning
  4. Exploratory Analysis
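The four steps above might look roughly like this in Pandas. This is a sketch under stated assumptions: a tiny made-up frame stands in for the dataset file, the column names are assumed, and treating a zero budget or revenue as missing is an assumption for illustration rather than a rule taken from the project.

```python
import numpy as np
import pandas as pd

# 1. Data wrangling: in the project this would be a pd.read_csv call on the
#    dataset file. A tiny made-up frame stands in for it here.
raw = pd.DataFrame({
    "original_title": ["A", "B", "B", "C", "D"],
    "budget": [10, 0, 0, 30, 40],
    "revenue": [25, 5, 5, 0, 90],
    "runtime": [100, 90, 90, 120, 110],
})

# 2. Data assessment: count fully duplicated rows and suspicious zero values.
n_duplicates = int(raw.duplicated().sum())
n_zero_budget = int((raw["budget"] == 0).sum())


# 3. Data cleaning: drop duplicate rows, then treat zero budget/revenue as
#    missing (an assumption) and drop the rows where either is missing.
def clean(df):
    out = df.drop_duplicates()
    out = out.assign(
        budget=out["budget"].replace(0, np.nan),
        revenue=out["revenue"].replace(0, np.nan),
    )
    return out.dropna(subset=["budget", "revenue"]).reset_index(drop=True)


cleaned = clean(raw)

# 4. Exploratory analysis: summary statistics of the numeric columns.
summary = cleaned[["budget", "revenue", "runtime"]].describe()
print(summary)
```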

Conclusions/Results

Based on the data and the analysis carried out:

  1. properties such as vote count, popularity and budget have a strong effect on the revenue generated
  2. runtime has a weaker effect on the revenue generated
  3. budget and vote count have the strongest effect

The budget of a movie that generates low revenue is about 5 million, while that of a high-revenue movie is over 52 million. This indicates that a movie's budget is correlated with its revenue. The result has limitations, however: it does not account for factors such as the year the movie was released (release_year) and the director of the movie.
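A comparison like the one above can be sketched by splitting movies into low- and high-revenue groups and averaging the budget in each. Splitting at the median revenue is an assumption here, since the exact grouping rule is not stated. The column names are also assumed, and the numbers are made up, so they will not reproduce the figures quoted above.

```python
import pandas as pd

# Made-up values for illustration only; NOT taken from the TMDb dataset.
movies = pd.DataFrame({
    "budget": [1, 2, 3, 4, 30, 40, 50, 60],
    "revenue": [1, 3, 5, 7, 100, 150, 200, 250],
})


def mean_budget_by_revenue_group(df):
    """Label each movie 'low' or 'high' relative to the median revenue
    (an assumed cut-off) and return the mean budget of each group."""
    median_revenue = df["revenue"].median()
    group = df["revenue"].ge(median_revenue).map({False: "low", True: "high"})
    return df.groupby(group)["budget"].mean()


budget_by_group = mean_budget_by_revenue_group(movies)
print(budget_by_group)
```

To address the release year limitation, the same function could be applied separately within each release_year.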
