Linked Data Search Engine for Movies using Wikidata and Sparql

saivarshith2000/LinkedMovieSearch


Linked Movie Search

Linked Data Search Engine for Movies using Wikidata and Sparql

TEAM

  1. H.O Sai Varshith (17CS30015)
  2. Harsh Pritam Sanapala (17CS30016)
  3. Madine Manas (17CS10025)
  4. Imandi Deepak (17CS30017)

Introduction

Movie Search lets the user search for movies, cast members and directors. Select the filter you want to use and enter the search term. The app uses SPARQL, via the SPARQLWrapper library, to query the Wikidata knowledge base and displays the search results in a readable layout. The back-end is implemented in Python 3 using the Flask framework, while the front-end is built with the Bulma CSS framework. Images for cast members and directors are provided by Wikidata, but movie posters are unfortunately not available there, so posters are obtained from The Movie Database using their free API.
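As a rough illustration of the kind of request described above, the sketch below builds a title search query and sends it through SPARQLWrapper. It is not the app's actual query (see Report.pdf for that). The function names, the `limit` default and the user-agent string are assumptions made for this example; `wd:Q11424` is the Wikidata item for "film" and `wdt:P31` is "instance of".

```python
def build_movie_query(title, limit=10):
    """Build a SPARQL query for films whose English label contains `title`."""
    # Escape backslashes and double quotes so the title is a valid SPARQL string literal.
    safe_title = title.replace("\\", "\\\\").replace('"', '\\"')
    return f"""
SELECT ?film ?filmLabel WHERE {{
  ?film wdt:P31 wd:Q11424 ;
        rdfs:label ?filmLabel .
  FILTER(LANG(?filmLabel) = "en")
  FILTER(CONTAINS(LCASE(?filmLabel), LCASE("{safe_title}")))
}}
LIMIT {int(limit)}
"""


def run_query(query, endpoint="https://query.wikidata.org/sparql"):
    """Send `query` to the endpoint and return the list of result bindings."""
    # Third-party dependency, imported here so build_movie_query works without it.
    from SPARQLWrapper import SPARQLWrapper, JSON

    # The agent string is a hypothetical value chosen for this example.
    client = SPARQLWrapper(endpoint, agent="LinkedMovieSearch-example/0.1")
    client.setQuery(query)
    client.setReturnFormat(JSON)
    return client.query().convert()["results"]["bindings"]
```

Calling `run_query(build_movie_query("Inception"))` needs network access, and a substring filter over every film label can be slow on the public endpoint, which is the kind of cost discussed in the limitations further down.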

Running the App

Since the app is written in Python, running it is simple. All you need is a recent version of Python 3 and its package manager, pip. To run the app, run the following commands:

git clone https://github.com/saivarshith2000/LinkedMovieSearch  # Clone this repository
cd LinkedMovieSearch                # move into the app directory (this directory contains main.py)
python3 -m venv venv                # create a Python virtual environment for the app
source venv/bin/activate            # activate the virtual environment
pip3 install -r requirements.txt    # install the required dependencies into this environment
python main.py                      # run the app (it runs on the address http://127.0.0.1:5000/)

Note: If the python3 or pip3 commands above don't work, try them again without the 3.

SPARQL Details

The SPARQL details are explained in Report.pdf. Please refer to it.

Recommendation System (Or Lack Thereof)

The app was originally meant to be a combined search and recommendation system. Unfortunately, the recommendation part was not implemented because of certain limitations. Instead, we have worked out how such a system could be designed. The limitations are explained below:

  1. Wikidata: Wikidata has a huge amount of data stored in RDF format. The SPARQL used in the app is quite simple, but it is non-trivial in terms of the computation required for inferencing. Some queries take a very long time, or the service simply gives up and returns empty results. Since it is a free service and the computation is not trivial, Wikidata places restrictions on its free API. For example, we could not send more than 3 requests per minute from the same IP. A recommendation system needs to fire off several requests in a multi-threaded environment and process them as they return. Since we cannot send that many requests, such a system is not possible.

  2. Network: Some queries take quite long (40 s to 1 min). This means the network socket must stay open for longer than usual. Since the API uses the HTTP protocol, neither Wikidata nor the campus network keeps a socket open for that long, and we get an IncompleteRead error. This too prevents the implementation of a recommendation system.
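The two limitations above can be softened, though not removed, by spacing requests out and retrying when the connection drops. The sketch below is a minimal illustration and not part of the app: `fetch_with_retry` and its parameters are hypothetical names, and the 20 second default interval is simply 60 seconds divided by the 3 requests per minute mentioned above. `http.client.IncompleteRead` is the standard-library exception raised when a response is cut short.

```python
import http.client
import time


def fetch_with_retry(fetch, retries=3, min_interval=20.0, sleep=time.sleep):
    """Call fetch() up to `retries` times, pausing `min_interval` seconds between attempts."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for attempt in range(retries):
        if attempt > 0:
            # Wait before retrying so that attempts stay under the request limit.
            sleep(min_interval)
        try:
            return fetch()
        except http.client.IncompleteRead as error:
            # The connection was closed before the full response arrived.
            last_error = error
    raise last_error
```

Here `fetch` is any zero-argument callable that performs one request. Retrying does not help if a query always runs longer than the socket stays open, so this only covers intermittent failures.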

Possible Solutions for a Recommendation System

  1. Static File Caching System: This is a common form of caching used by most online services. It works as follows: fetch results from the API, save them in a file-based or SQL-based database (file-based is preferred for portable apps), and on the next query check whether the saved data can be reused. This reduces the need to send multiple queries to some extent, but it requires a very large cache. This method would give really impressive results if the app were deployed as a single server serving multiple users.

  2. LIMIT SPARQL: Use the SPARQL LIMIT construct to limit the number of results returned. Although the data returned is correct, it is not enough. We cannot run any meaningful recommendation algorithm on such limited data.
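The file-based caching idea in item 1 can be sketched as follows. This is an illustration under assumptions rather than code from the app: `cached_query`, the `cache` directory name and the choice of one JSON file per query (named by the SHA-256 hash of the query text) are all made up for this example.

```python
import hashlib
import json
from pathlib import Path


def cached_query(query, fetch, cache_dir="cache"):
    """Return the result for `query`, reading it from disk when it was fetched before."""
    directory = Path(cache_dir)
    directory.mkdir(parents=True, exist_ok=True)
    # Hash the query text to get a stable, filesystem-safe file name.
    key = hashlib.sha256(query.encode("utf-8")).hexdigest()
    path = directory / f"{key}.json"
    if path.exists():
        # Cache hit: no request is sent to the API.
        return json.loads(path.read_text(encoding="utf-8"))
    # Cache miss: fetch once, then store the result for later calls.
    result = fetch(query)
    path.write_text(json.dumps(result), encoding="utf-8")
    return result
```

`fetch` is any function that takes the query string and returns JSON-serializable data. The cache only helps when the exact same query text is repeated, which is why the text above notes that a large cache and a shared server would be needed for it to pay off.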
