An NVIDIA AI Workbench Example Project for Finetuning Llama 2


diwer/workbench-example-llama2

 
 


NVIDIA AI Workbench: Introduction

This is an NVIDIA AI Workbench example project that demonstrates how to fine-tune a Llama 2 large language model (LLM) on a custom dataset in minutes using NVIDIA NeMo Framework. Please note the project requirements:

  • Operating System: Ubuntu 22.04, Windows (WSL2), macOS 12+
  • CPU requirements: None (tested with Intel® Xeon® Platinum 8380 CPU @ 2.30GHz)
  • GPU requirements: Minimum 1x NVIDIA A100-80GB

Sizing Guide

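| GPU VRAM | Example Hardware                      | Compatible? |
|----------|---------------------------------------|-------------|
| 24 GB    | RTX 3090/4090, RTX A5000/5500, A10/30 | N           |
| 32 GB    | RTX 5000 Ada                          | N           |
| 40 GB    | A100-40GB                             | N           |
| 48 GB    | RTX 6000 Ada, L40/L40S, A40           | N           |
| 80 GB    | A100-80GB                             | Y           |
| >80 GB   | 8x A100-80GB                          | Y           |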
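The sizing guide reduces to a single threshold: every configuration with at least 80 GB of VRAM per GPU is marked compatible, and everything below is not. A minimal sketch of that rule follows; the names `MIN_VRAM_GB`, `SIZING_TABLE`, and `is_compatible` are illustrative and not part of the project.

```python
# Minimum VRAM per GPU, read off the sizing table: A100-80GB is the
# smallest configuration marked compatible.
MIN_VRAM_GB = 80


def is_compatible(vram_gb: float, gpu_count: int = 1) -> bool:
    """Return True if there is at least one GPU and each has >= 80 GB VRAM."""
    return gpu_count >= 1 and vram_gb >= MIN_VRAM_GB


# Single-GPU rows from the sizing table: (VRAM per GPU in GB, example hardware).
SIZING_TABLE = [
    (24, "RTX 3090/4090, RTX A5000/5500, A10/30"),
    (32, "RTX 5000 Ada"),
    (40, "A100-40GB"),
    (48, "RTX 6000 Ada, L40/L40S, A40"),
    (80, "A100-80GB"),
]

for vram, hardware in SIZING_TABLE:
    flag = "Y" if is_compatible(vram) else "N"
    print(f"{vram:>3} GB  {hardware:<40} {flag}")
```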

Project Description

Llama 2 has gained traction as a robust, powerful family of Large Language Models that can provide compelling responses on a wide range of tasks. While the base 7B, 13B, and 70B models serve as a strong baseline for multiple downstream tasks, they can lack domain-specific knowledge of proprietary or otherwise sensitive information. Fine-tuning is often used to update a model for one or more specific tasks so that it responds better to domain-specific prompts. These notebooks walk through downloading and configuring the Llama 2 model from Hugging Face, preparing a custom dataset, and fine-tuning the pretrained base model against this new dataset. The 7B model is selected by default in this project, but the project can also be configured to use the 13B or 70B versions of the model, depending on your compute resources and constraints.
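As a rough illustration of the "preparing a custom dataset" step, the sketch below serializes one question-answering example as a single JSON line. The prompt layout and the field names `input` and `output` are assumptions made for this example, as is the helper name `to_jsonl_line`; the notebooks define the format they actually use.

```python
import json


def to_jsonl_line(context: str, question: str, answer: str) -> str:
    """Serialize one QA example as a single line of JSON (one record per line)."""
    # Assumed prompt layout: context, then question, then an answer cue.
    prompt = f"Context: {context} Question: {question} Answer:"
    # Assumed field names: "input" for the prompt, "output" for the target.
    return json.dumps({"input": prompt, "output": answer})


line = to_jsonl_line(
    context="The A100-80GB has 80 GB of GPU memory.",
    question="How much GPU memory does the A100-80GB have?",
    answer="80 GB",
)
print(line)
```

Writing one such line per example produces a JSONL file, which is straightforward to stream, split into training and validation sets, and inspect by hand.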

  • llama2-lora-ft.ipynb: This notebook provides a sample workflow for fine-tuning the Llama 2 base model for extractive Question-Answering on the SQuAD dataset using Low-Rank Adaptation (LoRA), a popular parameter-efficient fine-tuning method.

  • llama2-ptuning.ipynb: This notebook provides a sample workflow for fine-tuning the Llama 2 base model for extractive Question-Answering on a custom dataset using customized prompt formatting and p-tuning.
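To see why LoRA counts as parameter-efficient, it helps to compare trainable parameter counts. LoRA freezes a weight matrix of shape d_out x d_in and trains two low-rank factors of shapes d_out x r and r x d_in instead. The sketch below does that arithmetic for one matrix; the dimensions 4096 x 4096 and the rank 8 are illustrative values chosen for this example, not settings taken from the notebooks.

```python
def lora_param_counts(d_out: int, d_in: int, rank: int) -> tuple[int, int]:
    """Return (full, lora) trainable parameter counts for one weight matrix."""
    full = d_out * d_in            # every entry of the original matrix
    lora = rank * (d_out + d_in)   # factor B is d_out x r, factor A is r x d_in
    return full, lora


# Illustrative values only: a 4096 x 4096 projection with rank 8.
full, lora = lora_param_counts(4096, 4096, 8)
print(f"full fine-tuning: {full:,} parameters")
print(f"LoRA (rank 8):    {lora:,} parameters")
print(f"ratio:            {lora / full:.4%}")
```

For these values the LoRA factors hold 65,536 parameters against 16,777,216 in the full matrix, which is under half a percent.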

If you are interested in trying out another model with NeMo Framework, check out this AI Workbench example project for Nemotron-3.

Quickstart (Llama-2 7B)

Prerequisites

  1. AI Workbench will prompt you to provide a few pieces of information before running any apps in this project. Ensure you have this information ready.

    • The location where you would like the Llama 2 model to live on the underlying host system.
    • Your Hugging Face username.
    • A Hugging Face API key with Llama 2 access (see below).

Llama 2 is a gated model that is available for commercial use. To download the model, submit a request on Meta's portal for access to all models in the Llama family. Please note that your Hugging Face account email address MUST match the email you provide on the Meta website, or your request will not be approved.
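Before launching any apps, it can be useful to confirm that the three pieces of information listed above are in hand. The following is a minimal sketch of such a check; the function `missing_prerequisites` and its parameter names are hypothetical and are not part of AI Workbench, which collects these values through its own prompts.

```python
from pathlib import Path


def missing_prerequisites(model_dir: str, hf_username: str, hf_api_key: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means all set."""
    problems = []
    # The model location must be an existing directory on the host system.
    if not model_dir or not Path(model_dir).expanduser().is_dir():
        problems.append("model location is not an existing directory")
    if not hf_username.strip():
        problems.append("Hugging Face username is empty")
    if not hf_api_key.strip():
        problems.append("Hugging Face API key is empty")
    return problems


# Placeholder values for illustration; substitute your own.
for problem in missing_prerequisites("~", "your-username", "your-api-key"):
    print("Missing:", problem)
```

This only checks that the values are present. It cannot confirm that the API key has been granted Llama 2 access, which depends on the approval described above.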


Tutorial (Desktop App)

If you do not have NVIDIA AI Workbench installed, first complete the installation for AI Workbench here. Then,

  1. Fork this Project to your own GitHub namespace and copy the link

    https://github.com/[your_namespace]/<project_name>
    
  2. Open NVIDIA AI Workbench. Select a location to work in.

  3. Clone this Project onto your desired machine by selecting Clone Project and providing the GitHub link.

  4. Wait for the project to build. You can expand the bottom Building indicator to view real-time build logs.

  5. When the build completes, set the following configurations.

    • Environment > Mounts > Configure. Specify the file path of the mount, e.g. where the Llama 2 model will live on your host machine.

      e.g. if your downloaded Llama 2 model directory resides in your home path, enter /home/[user]

    • Environment > Secrets > Configure. Specify the Hugging Face username and API Key secrets.

  6. On the top right of the window, select Jupyterlab.

  7. Navigate to the code/llama-2-[XX]b directory of the project. Then, open your fine-tuning notebook of choice and get started. Happy coding!

Tutorial (CLI-Only)

Some users may choose to use only the CLI tool instead of the Desktop App. If you do not have NVIDIA AI Workbench installed, first complete the installation for AI Workbench here. Then,

  1. Fork this Project to your own GitHub namespace and copy the link

    https://github.com/[your_namespace]/<project_name>
    
  2. Open a shell and activate the Context you want to clone into by running

    $ nvwb list contexts
    
    $ nvwb activate <desired_context>
    
  3. Clone this Project onto your desired machine by running

    $ nvwb clone project <your_project_link>
    
  4. Open the Project by running

    $ nvwb list projects
    
    $ nvwb open <project_name>
    
  5. Start JupyterLab by running

    $ nvwb start jupyterlab
    
    • Specify the file path of the mount, e.g. where the Llama 2 model will live on your host machine.

      e.g. if your downloaded Llama 2 model directory resides in your home path, enter /home/[user]

    • Specify the Hugging Face username and API Key secrets.

  6. Navigate to the code/llama-2-[XX]b directory of the project. Then, open your fine-tuning notebook of choice and get started. Happy coding!


Tip: Use nvwb help to see a full list of NVIDIA AI Workbench commands.
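The CLI steps above can be collected into one ordered command list, which is handy for reviewing or scripting the sequence. The sketch below only builds and prints the commands rather than running them, on the assumption that nvwb is meant to be used from an interactive shell. The function name `nvwb_commands` and the example argument values are hypothetical; the commands themselves are the ones listed in the tutorial.

```python
import shlex


def nvwb_commands(context: str, project_link: str, project_name: str) -> list[list[str]]:
    """Return the tutorial's nvwb commands, in order, as argument lists."""
    return [
        ["nvwb", "list", "contexts"],
        ["nvwb", "activate", context],
        ["nvwb", "clone", "project", project_link],
        ["nvwb", "list", "projects"],
        ["nvwb", "open", project_name],
        ["nvwb", "start", "jupyterlab"],
    ]


# Hypothetical example values; replace with your own context, link, and name.
commands = nvwb_commands(
    context="my-context",
    project_link="https://github.com/your-namespace/your-project",
    project_name="your-project",
)
for argv in commands:
    # shlex.join quotes arguments where needed so each line is shell-safe.
    print("$", shlex.join(argv))
```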


Tested On

This project has been tested on a system with 1x NVIDIA A100-80GB GPU and the GA version of NVIDIA AI Workbench: nvwb 0.21.3 (internal; linux; amd64; go1.21.3; Tue Mar 5 03:55:43 UTC 2024)

License

This NVIDIA AI Workbench example project is licensed under the Apache 2.0 License.

This project may utilize additional third-party open source software projects. Review the license terms of these open source projects before use. Third party components used as part of this project are subject to their separate legal notices or terms that accompany the components. You are responsible for confirming compliance with third-party component license terms and requirements.

Have questions? Please direct any issues, fixes, suggestions, and discussion on this project to the DevZone Members Only Forum thread here.
