bRAGAI/bragai-paper

bRAG AI: Pioneering Retrieval-Augmented Fine-Tuning for Code LLMs

Overview

bRAG AI combines Retrieval-Augmented Generation (RAG) with parameter-efficient fine-tuning for code-centric language models. It targets software ecosystems that change quickly, and uses Low-Rank Adaptation (LoRA) and Fill-In-The-Middle (FIM) training to adapt a model to a specific domain.
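
To make the LoRA idea concrete, here is a minimal pure-Python sketch of what a low-rank adapter computes and how few parameters it trains. It is an illustration under stated assumptions, not the code in this repository: the function names (`lora_param_counts`, `lora_forward`) and the example dimensions are hypothetical, and a real run would use a tensor library rather than nested lists.

```python
def lora_param_counts(d_out: int, d_in: int, rank: int) -> tuple:
    """Return (full, lora) trainable-parameter counts for one weight matrix."""
    full = d_out * d_in               # fine-tuning W directly
    lora = rank * (d_out + d_in)      # B is d_out x r, A is r x d_in
    return full, lora


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha: float):
    """Compute W x + (alpha / r) * B (A x).

    W is the frozen pretrained weight; only A and B would be trained.
    """
    r = len(A)                        # adapter rank = number of rows in A
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + (alpha / r) * u for b, u in zip(base, update)]


if __name__ == "__main__":
    # Hypothetical 4096 x 4096 projection with a rank-8 adapter.
    full, lora = lora_param_counts(4096, 4096, rank=8)
    print(full, lora, lora / full)    # 16777216 65536 0.00390625
```

Initializing `B` to zeros makes the adapted layer behave exactly like the frozen base layer at the start of training, which is the usual LoRA initialization.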

Video Demo

(Shortened - 9 minutes) Technical Video on bRAGAI [edited]

(Full 17 minutes) Technical Video on bRAGAI [uncut]

Research Paper

Explore the technical foundations of bRAG AI in our detailed research paper:

  • Title: bRAG AI: Retrieval-Augmented Fine-Tuning for Code LLMs
  • Author: Taha H. Ababou
  • Affiliation: Boston University

Download the Research Paper (PDF)

Key Features

  • Efficient Fine-Tuning: Use LoRA and QLoRA to fine-tune models with minimal computational overhead.
  • Dynamic Context Retrieval: Integrate knowledge from diverse sources such as GitHub repositories, academic papers, and multimedia transcripts.
  • Context-Aware Reasoning: Improve code-generation accuracy and domain-specific adaptability by enriching each query with retrieved context in real time.
  • Engineered for Evolution: Built to handle codebases and frameworks that change frequently.
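
The retrieval and context-enrichment features above can be sketched with the standard library alone. This is a toy illustration, not the pipeline in `eval/rag.py`: it swaps in bag-of-words cosine similarity where a real system would likely use embeddings, and the helper names (`retrieve`, `build_prompt`) and the prompt layout are assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list:
    """Lowercase and split into word-like tokens (identifiers stay whole)."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list, k: int = 2) -> list:
    """Return up to k documents with non-zero similarity, best first."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(query: str, documents: list, k: int = 2) -> str:
    """Prepend the retrieved snippets to the question (hypothetical layout)."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


if __name__ == "__main__":
    corpus = [
        "def load_config(path): read a YAML config file",
        "The retriever ranks chunks by similarity",
        "LoRA adds low-rank adapter matrices",
    ]
    print(build_prompt("how do I load a config file", corpus, k=1))
```

The assembled prompt would then be passed to the language model; because retrieval happens at query time, updating the corpus changes the context without retraining the model.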

Installation

Get started with bRAG AI in a few simple steps:

# Clone the repository
git clone https://github.com/bRAGAI/bragai

# Navigate to the project directory
cd bragai

# Install dependencies
pip install -r requirements.txt

Project Structure

.
├── README.md                   # Documentation
├── eval
│   ├── code-eval
│   │   ├── codellama-base-eval.py  # Evaluation script for base model
│   │   ├── codellama-bragai-eval.py  # Evaluation script for fine-tuned model
│   │   ├── humaneval/              # HumanEval benchmark integration
│   │   └── requirements.txt        # Dependencies for code evaluation
│   ├── rag.py                      # RAG evaluation script
│   ├── requirements.txt            # Dependencies for evaluation scripts
│   └── zeroshot.py                 # Zero-shot evaluation script
├── finetune
│   ├── data
│   │   ├── README.md               # Dataset preparation instructions
│   │   ├── clone_gh_repos.py       # Script to clone GitHub repositories
│   │   ├── prepare_dataset.py      # Script to prepare dataset for fine-tuning
│   │   ├── push_to_hub.py          # Script to upload datasets to Hugging Face Hub
│   │   └── requirements.txt        # Dependencies for dataset preparation
│   ├── inference/                  # Inference-related scripts
│   └── training
│       ├── fim.py                  # Implements FIM (Fill-in-The-Middle) techniques
│       ├── requirements.txt        # Dependencies for training
│       ├── run_peft.sh             # Shell script for PEFT training
│       └── train.py                # Fine-tuning script
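
For readers unfamiliar with the Fill-in-the-Middle technique that `fim.py` refers to, the core data transformation can be sketched as follows. This is a generic character-level illustration and not the contents of `fim.py`: the sentinel strings are placeholders (the actual special tokens depend on the model's tokenizer), and `split_for_fim`, `to_fim_example`, and `fim_rate` are hypothetical names.

```python
import random

# Placeholder sentinels; real models define their own special tokens.
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"


def split_for_fim(text: str, rng: random.Random) -> tuple:
    """Cut text at two random positions into (prefix, middle, suffix)."""
    lo, hi = sorted(rng.randint(0, len(text)) for _ in range(2))
    return text[:lo], text[lo:hi], text[hi:]


def to_fim_example(text: str, rng: random.Random, fim_rate: float = 0.5) -> str:
    """With probability fim_rate, rewrite text in prefix-suffix-middle order.

    The model sees the prefix and suffix first and learns to generate the
    middle; untransformed samples keep ordinary left-to-right training.
    """
    if rng.random() >= fim_rate:
        return text
    prefix, middle, suffix = split_for_fim(text, rng)
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}{middle}"


if __name__ == "__main__":
    sample = "def add(a, b):\n    return a + b\n"
    print(to_fim_example(sample, random.Random(0), fim_rate=1.0))
```

No text is lost in the transformation: concatenating prefix, middle, and suffix always reproduces the original sample.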

How to Cite

If bRAG AI contributes to your research, please use the following citation:

@article{ababou2024bragAI,
  title={bRAG AI: Retrieval-Augmented Fine-Tuning for Code LLMs},
  author={Ababou, Taha H.},
  year={2024},
  institution={Boston University}
}

Future Directions

bRAG AI represents a significant leap forward, but we’re just getting started. Upcoming enhancements include:

  • Dynamic Retrieval Optimization: Implement smarter, adaptive retrievers to refine performance further.
  • Multi-Modal Applications: Extend RAG capabilities to support healthcare, finance, and education.
  • Integrated Fine-Tuning: Develop synchronized fine-tuning for retrieval systems and LLMs to improve synergy.

Join Us

We’re on a mission to revolutionize code LLMs. If you’re interested in contributing, collaborating, or simply learning more, we’d love to hear from you. Reach out via our website or email me directly at [email protected]!


bRAG AI: Empowering Developers, One Line of Code at a Time.

About

bRAGAI Repository For Research Paper
