PaperLight is a project to create an AI-powered web app for exploring and visualizing scientific papers. The goal is to provide a tool that helps people find and understand the scientific papers of their choice.
The project started as a demonstration app for the workshop End-to-End AI App Engineering with Open Source. Its secondary goal is to serve as a learning resource on AI-powered applications, both for the workshop participants and for anyone else.
- You can find papers that interest you based on your search criteria.
- Icebreaker: Summarizes papers (based on their abstracts) in a way that makes it easy to understand what they are about. You can choose the audience level of the summary (e.g. high-school, college, or university students) and its language (e.g. English, Slovak). Below you can see a process diagram of the Icebreaker tool:
- Paper Buddy: A tool that lets you view papers and ask questions about them. AI agents answer the questions, basing their answers on the content of the papers. Below you can see a process diagram of the Paper Buddy tool:
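To make the Icebreaker idea concrete, here is a minimal sketch of how a summary prompt could be assembled from an abstract, an audience level, and a language. The function name `build_summary_prompt`, the `AUDIENCES` tuple, and the prompt wording are assumptions made for illustration; they are not taken from the PaperLight code.

```python
# Hypothetical helper: names and prompt wording are illustrative assumptions,
# not the prompt PaperLight actually uses.
AUDIENCES = ("high-school", "college", "university")


def build_summary_prompt(abstract: str, audience: str = "college",
                         language: str = "English") -> str:
    """Compose an LLM prompt asking for an audience-appropriate summary."""
    if audience not in AUDIENCES:
        raise ValueError(f"audience must be one of {AUDIENCES}, got {audience!r}")
    if not abstract.strip():
        raise ValueError("abstract must not be empty")
    return (
        f"Summarize the following paper abstract for {audience} students.\n"
        f"Write the summary in {language}.\n"
        "Use only information stated in the abstract.\n\n"
        f"Abstract:\n{abstract.strip()}"
    )


if __name__ == "__main__":
    # Example: a Slovak summary aimed at high-school students.
    print(build_summary_prompt(
        "We study the rotation curves of spiral galaxies.",
        audience="high-school",
        language="Slovak",
    ))
```

The resulting string would then be sent to the LLM. Keeping the prompt construction in a plain function like this makes it easy to test without calling any model.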
- App: Streamlit
- Data: arXiv, Astrophysics Data System
- Gen AI framework: LangChain
- LLM and embedding models: OpenAI
This project is in the early stages of development. The core functionality has been tested on only a small number of papers, and the user interface still needs work, so expect some bugs, inconsistencies, and inaccuracies in both the code and the results.
Additionally, only arXiv and the Astrophysics Data System are currently supported as paper sources.
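As an illustration of the source restriction, the sketch below checks whether a URL points to an abstract page on one of the two supported sites. The helper `is_supported_source` is hypothetical, and the `/abs/` path check is an assumption based on the abstract-page URL patterns used in the setup steps.

```python
from urllib.parse import urlparse

# Domains of the two supported sources; the helper itself is hypothetical.
SUPPORTED_DOMAINS = ("arxiv.org", "ui.adsabs.harvard.edu")


def is_supported_source(url: str) -> bool:
    """Return True for abstract pages hosted on a supported domain.

    URLs without a scheme (e.g. "arxiv.org/abs/...") have no parsed host
    and are therefore rejected.
    """
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    on_supported_domain = any(
        host == domain or host.endswith("." + domain)
        for domain in SUPPORTED_DOMAINS
    )
    # Assumption: only abstract pages (paths starting with /abs/) count.
    return on_supported_domain and parsed.path.startswith("/abs/")
```

A check like this could be used to reject unsupported links early, before any paper content is fetched.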
- More paper sources: Add support for more paper sources (e.g. PubMed).
- Chat history: Add the chat history to the internal prompt of the Paper Buddy tool. This would allow the LLM to take into account not only the current question and the paper content, but also the previous questions and answers.
- Better UI: Improve the user interface and user experience of the app.
- Open-source models: Use open-source models for the LLM and the embeddings. They could be quantized to save resources.
- Downloadable summaries: Allow users to download the summaries.
- Choice of prompt: Allow users to choose the prompt for the LLM.
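The chat-history item could look roughly like the sketch below, which folds the most recent question-and-answer turns into the prompt alongside the paper excerpts and the new question. This is only one possible approach: the function name `build_chat_prompt`, the prompt layout, and the `max_turns` cap are assumptions, not part of the current Paper Buddy implementation.

```python
from typing import List, Tuple


# Hypothetical sketch: the function name, prompt layout, and max_turns
# parameter are illustrative assumptions.
def build_chat_prompt(question: str, paper_context: str,
                      history: List[Tuple[str, str]],
                      max_turns: int = 3) -> str:
    """Combine paper excerpts, recent Q&A turns, and the new question."""
    # Keep only the last max_turns turns so the prompt stays bounded.
    # (history[-0:] would return everything, hence the explicit guard.)
    recent = history[-max_turns:] if max_turns > 0 else []
    lines = [
        "Answer the question using only the paper excerpts below.",
        "",
        "Paper excerpts:",
        paper_context.strip(),
        "",
    ]
    if recent:
        lines.append("Previous conversation:")
        for past_question, past_answer in recent:
            lines.append(f"Q: {past_question}")
            lines.append(f"A: {past_answer}")
        lines.append("")
    lines.append(f"Current question: {question}")
    return "\n".join(lines)
```

Capping the number of turns is a simple way to keep the prompt within the model's context window. Summarizing older turns would be an alternative.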
- Clone the repository
- Create a virtual environment:
python -m venv .venv
- Install the requirements:
pip install -r requirements.txt
- Set up Google Custom Search:
- Create a Google project
- Enable the Custom Search API
- Create a new Google API key
- Create a new Google Custom Search Engine (CSE) and get the CSE ID. Enable only the arXiv (*.arxiv.org/abs/*) and Astrophysics Data System (*.ui.adsabs.harvard.edu/abs/*) domains.
- Create an OpenAI API key.
- Create a .env file in the root of this repository with the following content:
OPENAI_API_KEY=your_openai_api_key
GOOGLE_CSE_ID=your_google_cse_id
GOOGLE_API_KEY=your_google_api_key
- Run the app:
streamlit run 1_📜🔦👀_About_Paperlight.py
- Open the app in your browser:
http://localhost:8501/
- Enjoy!
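If the app fails at startup, one thing worth checking is whether the .env file defines all three keys listed above. The following is a small standalone sketch of such a check using only the standard library. How the app itself loads .env is not described here, so `parse_env` and `missing_keys` are hypothetical helpers rather than part of PaperLight.

```python
from pathlib import Path
from typing import Dict, List

# The three keys named in the setup steps.
REQUIRED_KEYS = ("OPENAI_API_KEY", "GOOGLE_CSE_ID", "GOOGLE_API_KEY")


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip surrounding whitespace and optional quotes from the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(values: Dict[str, str]) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    text = env_path.read_text(encoding="utf-8") if env_path.exists() else ""
    missing = missing_keys(parse_env(text))
    if missing:
        print("Missing keys: " + ", ".join(missing))
    else:
        print("All required keys are set.")
```

Run it from the repository root, where the .env file is expected to live.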