Final project for Yale CPSC 477. Using RAG to tackle the lay summary problem for papers in biology.


YUXUANCHENG/477RAG

 
 


Final project for Yale CPSC 477: using Retrieval-Augmented Generation (RAG) to tackle the BioLaySumm 2024 BioNLP task. -- Andrew Ton and Yuxuan Cheng

The repo is adapted from localGPT; see more details on setup here: https://github.com/PromtEngineer/localGPT

Environment Setup 🌍

  1. 📥 Clone the repo using git:
git clone https://github.com/YUXUANCHENG/477RAG.git
  2. 🐍 Install conda for virtual environment management, then create and activate a new virtual environment:
conda create -n localGPT python=3.10.0
conda activate localGPT
  3. 🛠️ Install the dependencies using pip:

To set up your environment to run the code, first install all requirements:

pip install -r requirements.txt

Installing LLAMA-CPP:

LocalGPT uses LlamaCpp-Python for GGML (you will need llama-cpp-python <=0.1.76) and GGUF (llama-cpp-python >=0.1.83) models.
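The version bounds above can be checked in code. The following is a minimal sketch, assuming only the two bounds stated in this README; the helper names `parse_version` and `supported_format` are hypothetical and are not part of localGPT or llama-cpp-python.

```python
import re
from importlib import metadata

GGML_MAX = (0, 1, 76)  # highest llama-cpp-python version this README lists for GGML
GGUF_MIN = (0, 1, 83)  # lowest llama-cpp-python version this README lists for GGUF


def parse_version(version: str) -> tuple:
    """Turn a string such as '0.1.83' into a comparable tuple (0, 1, 83)."""
    parts = re.findall(r"\d+", version)[:3]
    if not parts:
        raise ValueError(f"no version numbers found in {version!r}")
    return tuple(int(p) for p in parts)


def supported_format(version: str) -> str:
    """Return the model format the README associates with this version."""
    parsed = parse_version(version)
    if parsed <= GGML_MAX:
        return "GGML"
    if parsed >= GGUF_MIN:
        return "GGUF"
    return "unknown"  # versions between the two bounds are not covered here


if __name__ == "__main__":
    try:
        installed = metadata.version("llama-cpp-python")
        print(installed, "->", supported_format(installed))
    except metadata.PackageNotFoundError:
        print("llama-cpp-python is not installed")
```

Running the file directly reports the installed version, if any, and the format this README associates with it.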

If you want to use BLAS or Metal with llama-cpp, you can set the appropriate flags:

For NVIDIA GPU support, use cuBLAS:

# Example: cuBLAS
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.83 --no-cache-dir

For Apple Metal (M1/M2) support, use:

# Example: METAL
CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.83 --no-cache-dir

For more details, please refer to llama-cpp
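The two install commands differ only in the CMake flag, so the choice can be scripted. This is a rough sketch that reuses the flags and the pinned version from the commands above; the function name `build_install` and the backend labels `"cublas"` and `"metal"` are hypothetical names chosen for illustration.

```python
import os
import sys

# CMake flags copied from the two install commands in this README.
BACKEND_FLAGS = {
    "cublas": "-DLLAMA_CUBLAS=on",  # NVIDIA GPUs
    "metal": "-DLLAMA_METAL=on",    # Apple Metal (M1/M2)
}


def build_install(backend: str, version: str = "0.1.83"):
    """Return (env, argv) for installing llama-cpp-python with one backend."""
    if backend not in BACKEND_FLAGS:
        raise ValueError(f"unknown backend: {backend!r}")
    env = dict(os.environ)  # copy so the current process environment is untouched
    env["CMAKE_ARGS"] = BACKEND_FLAGS[backend]
    env["FORCE_CMAKE"] = "1"
    argv = [sys.executable, "-m", "pip", "install",
            f"llama-cpp-python=={version}", "--no-cache-dir"]
    return env, argv
```

To actually perform the install, the result can be passed to the standard library, for example `env, argv = build_install("metal")` followed by `subprocess.run(argv, env=env, check=True)`.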

To run LocalGPT:

python run_localGPT.py

To embed documents:

python ingest.py

To run evaluation:

python evaluate.py

Final metric scores are stored in gpt_scores.txt and llama_scores.txt.
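For unattended runs, the embedding and evaluation steps could be chained from one script. The sketch below assumes that documents should be embedded before evaluation, which this README does not state explicitly; `run_localGPT.py` is left out because the upstream localGPT script appears to be an interactive prompt loop. The names `STEPS`, `build_commands` and `run_all` are hypothetical.

```python
import subprocess
import sys

# Script names come from this README; the ordering is an assumption.
STEPS = [
    ("embed documents", "ingest.py"),
    ("run evaluation", "evaluate.py"),
]


def build_commands(steps=STEPS):
    """Build one argv list per step, using the current Python interpreter."""
    return [[sys.executable, script] for _, script in steps]


def run_all(runner=subprocess.run):
    """Run each step in order; check=True stops at the first failing step."""
    for (label, _), cmd in zip(STEPS, build_commands()):
        print(f"==> {label}: {' '.join(cmd)}")
        runner(cmd, check=True)
```

Calling `run_all()` from the repository root runs both scripts. Passing a different `runner` allows a dry run that only records the commands.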

Dependencies are listed in requirements.txt
