A Python API that scrapes movie information from Netflix. A nice substitute for their now-privatized API. Used by http://www.tomatoflix.com

jameskang410/scraping-netflix

Scraping Netflix - The Unofficial Way!

A Python class that scrapes information about Netflix movies that are available for streaming. Used by [www.tomatoflix.com][1].

Warning: Netflix's APIs have changed and this package has not yet been updated to follow those changes.

Why?

This project started because I wanted to create [Tomatoflix][1], an interactive website that helps lazy people like me find random Netflix movies to watch. I was surprised to find out that Netflix privatized their API. I took matters into my own hands and decided to forge a Netflix API of my own.

Requirements

  • Python 3
  • Modules:
    • BeautifulSoup
    • Requests
    • Fuzzywuzzy (Not very impressed with this one... Open to fuzzy matching alternatives. But it'll do for now.)

Installation

Clone this repository to your local computer and it should be good to go. Installation via pip is not available yet, but it is in progress.

Instructions

from netflix import Netflix

# Insert netflix ID as a raw string
# To find Netflix ID:
# Sign into Netflix > Chrome Developer Tools > Resources > Cookies > www.netflix.com > NetflixId
netflix_id = r'INSERT NETFLIX ID HERE'

movie = Netflix(netflix_id)

# Initialization only has to be done once.
# It creates the genre JSON files that the other methods pull data from.
movie.initialize()

"""
>Genres were successfully downloaded as JSON files
"""

# search() checks whether the movie is available on Netflix streaming.
# Other methods are chained to search() and return specific information about the movie.
movie.search('Jerry Maguire').duration()
movie.search('Jerry Maguire').netflix_rating()

"""
>Movie was found
>2hr 18m
>3.6 stars
"""
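
Because `initialize()` should only be run once, it can help to guard the call. The following is a minimal sketch, not part of this package: `initialize_once` and the marker file name `.netflix_initialized` are hypothetical, and the only thing assumed about the client is that it has an `initialize()` method as shown above.

```python
from pathlib import Path


def initialize_once(client, marker=".netflix_initialized"):
    """Call client.initialize() only if the marker file does not exist yet.

    Returns True if initialize() was called, False if it was skipped.
    The marker file is a hypothetical convention, not something the
    netflix module itself creates or checks.
    """
    marker_path = Path(marker)
    if marker_path.exists():
        # A previous run already pulled the genre JSON files.
        return False
    client.initialize()
    # Record that initialization succeeded so later runs skip it.
    marker_path.touch()
    return True
```

With this helper, `initialize_once(movie)` could replace the direct `movie.initialize()` call in a script that is run repeatedly.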

Check out the example.py file. E-mail any specific questions to [email protected]
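
To gather several fields for one title at once, the chained calls shown above can be wrapped in a small helper. This is only a sketch under assumptions: `collect_info` is a hypothetical name, it assumes `search()` returns an object exposing the documented methods (as the chained example suggests), and it returns `None` if `search()` itself returns `None`.

```python
def collect_info(client, title, fields=("title", "duration", "netflix_rating")):
    """Look up `title` and return a dict of the requested fields.

    `fields` holds method names from the function list in this README.
    Assumes client.search() returns an object with those methods;
    if it returns None, this helper returns None as well.
    """
    result = client.search(title)
    if result is None:
        return None
    # Call each requested method by name and store its return value.
    return {name: getattr(result, name)() for name in fields}
```

For example, `collect_info(movie, 'Jerry Maguire')` would be expected to produce a dict with the keys `title`, `duration`, and `netflix_rating`, provided the movie is found.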

All Available Functions

[1]: http://www.tomatoflix.com
| __Functions__ | __Return Data Type__ | __Description__ |
| --- | --- | --- |
| initialize(_netflix\_id\_as\_string_) | None | Creates a JSON file for each movie, organized by genre. This method has to be run __only once__ and __should not be run after the JSON files have been pulled successfully__. This will minimize your chances of getting "caught" by Netflix (as if they don't know what we're up to...). |
| all\_titles() | List | Returns a list of every title that's available for streaming on Netflix. Loop through this list to get information about every movie. |
| search(_movie\_string_) | None | Checks if the string is a movie that is currently available on Netflix. Prints one of the following messages: ```Movie was found``` or ```Movie could not be found. Did you mean any of the following movies?```. If the movie is not found, a list of movies that partially matched the search string is printed to the console. __In order to find specific information about a movie, the algorithm must find a movie match.__ |
| movie\_number() | Int | Returns the Netflix movie ID number. |
| genres() | List | Returns a list of genres the movie belongs to on Netflix. |
| title() | String | Returns the title of the movie. |
| tv\_show() | String | Returns _"Y"_ if the movie is considered a TV show. Returns _"N"_ if it is only a movie. |
| synopsis() | String | Returns the synopsis for the movie. |
| year() | Int | Returns the year the movie was made. NOTE: This year does not always match the year listed on other movie websites like Rotten Tomatoes. |
| netflix\_rating() | String | Returns the average Netflix rating for the movie. |
| cert\_rating() | String | Returns the maturity rating for the movie. |
| actors\_list() | List | Returns a list of the prominent actors in the movie. |
| actors\_string() | String | Returns a string of the prominent actors in the movie. |
| url() | String | Returns the movie's URL in a form friendly to non-Netflix members. |
| duration() | String | Returns the duration (hours and minutes or number of seasons) of the movie or TV show. |
| box\_art() | String | Returns the URL for the small box art of the movie. |
| large\_box\_art() | String | Returns the URL for the large box art of the movie. NOTE: Because of the different layout of Netflix movie pages, this method does not always work. |
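
The "Did you mean" suggestions described for `search()` rely on fuzzy matching. As one possible alternative to Fuzzywuzzy, here is a minimal sketch using `difflib` from the standard library. It is not how this package implements `search()`; `suggest_titles` is a hypothetical helper and the title list is made-up sample data standing in for the output of `all_titles()`.

```python
import difflib


def suggest_titles(query, titles, limit=3, cutoff=0.6):
    """Return up to `limit` titles that approximately match `query`.

    Matching is case-insensitive. `cutoff` is the minimum similarity
    ratio (0 to 1) that difflib requires for a candidate to be kept.
    """
    # Map lowercase titles back to their original spelling.
    by_lower = {title.lower(): title for title in titles}
    close = difflib.get_close_matches(
        query.lower(), list(by_lower), n=limit, cutoff=cutoff
    )
    # get_close_matches returns the best match first.
    return [by_lower[match] for match in close]


# Sample data for illustration only; in practice this could come from all_titles().
sample_titles = ["Jerry Maguire", "The Matrix", "Groundhog Day"]
print(suggest_titles("jery maguire", sample_titles))
```

Raising `cutoff` makes the suggestions stricter, while lowering it returns looser matches.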
