bRAG AI combines Retrieval-Augmented Generation (RAG) with parameter-efficient fine-tuning for code-centric language models. Built for software ecosystems that change quickly, it uses Low-Rank Adaptation (LoRA) and Fill-In-The-Middle (FIM) training to adapt a model to a specific domain and codebase.
(Shortened - 9 minutes) Technical Video on bRAGAI [edited]
(Full 17 minutes) Technical Video on bRAGAI [uncut]
Explore the technical foundations of bRAG AI in our detailed research paper:
- Title: bRAG AI: Retrieval-Augmented Fine-Tuning for Code LLMs
- Author: Taha H. Ababou
- Affiliation: Boston University
Download the Research Paper (PDF)
- Efficient Fine-Tuning: Harness the power of LoRA and QLoRA to fine-tune models with minimal computational overhead.
- Dynamic Context Retrieval: Seamlessly integrate knowledge from diverse sources like GitHub repositories, academic papers, and multimedia transcripts.
- Context-Aware Reasoning: Improve code-generation accuracy and domain-specific adaptability by enriching prompts with retrieved context at query time.
- Engineered for Evolution: Purpose-built to address the challenges of ever-changing codebases and frameworks.
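As a rough illustration of why LoRA keeps fine-tuning cheap, the sketch below applies a low-rank update to a frozen weight matrix using plain Python lists. It is not bRAG AI's training code: the function names (`matvec`, `lora_forward`, `trainable_params`) are hypothetical, and a real run would use a library such as PEFT on GPU tensors. The only thing it assumes is the standard LoRA formulation, where the adapted layer computes `W x + (alpha / r) * B (A x)` and only `A` and `B` are trained.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def lora_forward(w: Matrix, a: Matrix, b: Matrix, x: Vector, alpha: float) -> Vector:
    """Compute W x + (alpha / r) * B (A x).

    w: frozen base weights, shape (d_out, d_in)
    a: trainable down-projection, shape (r, d_in)
    b: trainable up-projection, shape (d_out, r)
    """
    rank = len(a)
    base = matvec(w, x)               # frozen path
    update = matvec(b, matvec(a, x))  # low-rank trainable path
    scale = alpha / rank
    return [y + scale * u for y, u in zip(base, update)]


def trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters trained by LoRA for one (d_out x d_in) weight matrix."""
    return rank * (d_in + d_out)


if __name__ == "__main__":
    # Illustrative sizes only: a 4096 x 4096 projection adapted at rank 8.
    full = 4096 * 4096
    lora = trainable_params(4096, 4096, rank=8)
    print(f"full fine-tuning: {full} params, LoRA: {lora} params")
```

With `B` initialised to zeros, `lora_forward` returns exactly the base model's output, which is why a freshly attached adapter does not change behaviour until training begins. For the sizes in the example, the adapter trains 65,536 parameters against 16,777,216 in the full matrix, or 1/256 of the total.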
Get started with bRAG AI in a few simple steps:
# Clone the repository
git clone https://github.com/bRAGAI/bragai
# Navigate to the project directory
cd bragai
# Install dependencies
pip install -r requirements.txt
.
├── README.md                           # Documentation
├── eval
│   ├── code-eval
│   │   ├── codellama-base-eval.py      # Evaluation script for base model
│   │   ├── codellama-bragai-eval.py    # Evaluation script for fine-tuned model
│   │   ├── humaneval/                  # HumanEval benchmark integration
│   │   └── requirements.txt            # Dependencies for code evaluation
│   ├── rag.py                          # RAG evaluation script
│   ├── requirements.txt                # Dependencies for evaluation scripts
│   └── zeroshot.py                     # Zero-shot evaluation script
├── finetune
│   ├── data
│   │   ├── README.md                   # Dataset preparation instructions
│   │   ├── clone_gh_repos.py           # Script to clone GitHub repositories
│   │   ├── prepare_dataset.py          # Script to prepare dataset for fine-tuning
│   │   ├── push_to_hub.py              # Script to upload datasets to Hugging Face Hub
│   │   └── requirements.txt            # Dependencies for dataset preparation
│   ├── inference/                      # Inference-related scripts
│   └── training
│       ├── fim.py                      # Implements FIM (Fill-In-The-Middle) techniques
│       ├── requirements.txt            # Dependencies for training
│       ├── run_peft.sh                 # Shell script for PEFT training
│       └── train.py                    # Fine-tuning script
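For readers unfamiliar with Fill-In-The-Middle training, here is a minimal sketch of the general idea, not the contents of `finetune/training/fim.py`. A source file is cut into a prefix, a middle, and a suffix, then rearranged so the model sees the prefix and suffix first and learns to generate the middle. The sentinel strings below are placeholders chosen for illustration; the real ones depend on the tokenizer of the base model. The function names `fim_transform` and `random_fim` are hypothetical, and the prefix-suffix-middle ordering is an assumption about which FIM variant is used.

```python
import random

# Placeholder sentinels (assumption): real values come from the model's tokenizer.
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"


def fim_transform(code: str, lo: int, hi: int) -> str:
    """Rearrange code[lo:hi] to the end, in prefix-suffix-middle order."""
    if not 0 <= lo <= hi <= len(code):
        raise ValueError("expected 0 <= lo <= hi <= len(code)")
    prefix, middle, suffix = code[:lo], code[lo:hi], code[hi:]
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}{middle}"


def random_fim(code: str, rng: random.Random) -> str:
    """Pick two random character-level cut points and apply fim_transform."""
    lo, hi = sorted(rng.randint(0, len(code)) for _ in range(2))
    return fim_transform(code, lo, hi)


if __name__ == "__main__":
    print(fim_transform("abcdef", 2, 4))
    # prints: <fim_prefix>ab<fim_suffix>ef<fim_middle>cd
    print(random_fim("def add(a, b):\n    return a + b\n", random.Random(0)))
```

Cutting at character level keeps the sketch short; a training pipeline would more likely cut on token boundaries and apply the transform to only a fraction of the samples so the model keeps its ordinary left-to-right ability.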
If bRAG AI contributes to your research, please use the following citation:
@article{ababou2024bragAI,
title={bRAG AI: Retrieval-Augmented Fine-Tuning for Code LLMs},
author={Ababou, Taha H.},
year={2024},
institution={Boston University}
}
bRAG AI represents a significant leap forward, but we’re just getting started. Upcoming enhancements include:
- Dynamic Retrieval Optimization: Implement smarter, adaptive retrievers to refine performance further.
- Multi-Modal Applications: Extend RAG capabilities to support healthcare, finance, and education.
- Integrated Fine-Tuning: Develop synchronized fine-tuning for retrieval systems and LLMs to improve synergy.
We’re on a mission to revolutionize code LLMs. We'd love to hear from you if you’re interested in contributing, collaborating, or simply learning more. Reach out via our website or email me directly at [email protected]!
bRAG AI: Empowering Developers, One Line of Code at a Time.