We built a real-time, general-purpose web search engine that returns the 50 most relevant pages for any query within 3 seconds.
The code is in the 'src' directory. It contains 8 subdirectories, one for each step of the project, as detailed in the following project report:
https://drive.google.com/open?id=0B_51lX7odCb7U3UxRi1KSG4yVFk
This project consists of 4500 lines of code. The back end is written in Java, and the front end is built with Node.js.