LLM (Large Language Models) FineTuning Projects and notes on common practical techniques

Find me here..


Fine-tuning LLM (and YouTube Video Explanations)

Notebook 🟠 YouTube Video
Fine-tune 4-bit quantized Llama-3-8B with Unsloth and ORPO YouTube Link
Llama-3 fine-tuning on a custom dataset with Unsloth YouTube Link
CodeLlama-34B - Conversational Agent YouTube Link
Inference with Yarn-Llama-2-13b-128k and KV cache to answer a quiz on a very long textbook YouTube Link
Mistral 7B fine-tuning with PEFT and QLoRA YouTube Link
Falcon fine-tuning on openassistant-guanaco YouTube Link
Fine-tuning Phi 1_5 with PEFT and QLoRA YouTube Link
Web scraping with Large Language Models (LLM): AnthropicAI + LangChainAI YouTube Link

Fine-tuning LLM

Notebook Colab
📌 Gemma_2b_finetuning_ORPO_full_precision Open In Colab
📌 Jamba_Finetuning_Colab-Pro Open In Colab
📌 Finetune codellama-34B with QLoRA Open In Colab
📌 Mixtral Chatbot with Gradio
📌 TogetherAI API to run Mixtral Open In Colab
📌 Integrating TogetherAI with LangChain 🦙 Open In Colab
📌 Mistral-7B-Instruct_GPTQ - Fine-tune on finance-alpaca dataset 🦙 Open In Colab
📌 Mistral 7B fine-tuning with DPO (Direct Preference Optimization) Open In Colab
📌 Fine-tune llama_2_GPTQ
📌 TinyLlama with Unsloth and RoPE scaling on the dolly-15 dataset Open In Colab
📌 TinyLlama fine-tuning with Taylor Swift song lyrics Open In Colab

LLM Techniques and utils - Explained

LLM Concepts
📌 DPO (Direct Preference Optimization) training and its datasets
📌 4-bit LLM Quantization with GPTQ
📌 Quantize with HF Transformers
📌 Understanding rank r in LoRA and the related matrix math
📌 Rotary Embeddings (RoPE): one of the fundamental building blocks of the Llama-2 implementation
📌 Chat Templates in HuggingFace
📌 How Mixtral 8x7B is a dense 47Bn param model
📌 The concept of validation log perplexity in LLM training - a note on fundamentals.
📌 Why we need to identify target_layers for LoRA/QLoRA
📌 Evaluate tokens per second
📌 Traversing through nested attributes (or sub-modules) of a PyTorch module
📌 Implementation of Sparse Mixtures-of-Experts layer in PyTorch from Mistral Official Repo
📌 Util method to extract a specific token's representation from the last hidden states of a transformer model.
📌 Convert PyTorch model's parameters and tensors to half-precision floating-point format
📌 Quantizing 🤗 Transformers models with the GPTQ method
📌 Quantize Mixtral-8x7B so it can run in 24GB GPU
📌 What is GGML or GGUF in the world of Large Language Models?
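
As a quick illustration of the "rank r in LoRA" entry above, here is a minimal sketch of the standard parameter-count arithmetic: a frozen weight matrix of shape d x k is adapted by two small matrices B (d x r) and A (r x k), so the trainable parameters drop from d*k to r*(d + k). The function name and the 4096 x 4096 layer size are illustrative assumptions, not taken from the linked note.

```python
def lora_param_counts(d: int, k: int, r: int) -> tuple[int, int]:
    """Return (full_params, lora_params) for one d x k weight matrix.

    full_params: parameters updated when fine-tuning the whole matrix.
    lora_params: parameters in the low-rank update B @ A,
                 where B is d x r and A is r x k.
    """
    full_params = d * k
    lora_params = r * (d + k)
    return full_params, lora_params


# Hypothetical example: one 4096 x 4096 projection adapted with rank r = 8.
full, lora = lora_param_counts(d=4096, k=4096, r=8)
print(f"full fine-tuning: {full:,} params")
print(f"LoRA (r=8):       {lora:,} params ({100 * lora / full:.2f}% of full)")
```

Doubling r doubles the trainable parameter count, which is why r acts as the main capacity/cost knob when configuring LoRA or QLoRA.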

Other Smaller Language Models