
Extremely hacky implementation of Mixtral 8x7B

New: API access

Try a much faster implementation of this model at https://app.fireworks.ai/

What is it?

Mistral dropped the new MoE model this morning: https://twitter.com/MistralAI/status/1733150512395038967

This is an attempt to hack the original Llama codebase to load it. The implementation is very naive and slow.

You need 2 x 80 GB or 4 x 40 GB cards to load it.

Implementation:

WARNING: There's no official reference model code, so this implementation might be wrong. At least the generation looks coherent, which is a good sign :)
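For context on what the hack has to implement: Mixtral swaps Llama's dense feed-forward block for a sparse mixture-of-experts layer, where a gate picks the top 2 of 8 experts per token and mixes their outputs with the renormalized gate weights. Below is a minimal PyTorch sketch of that idea; the class and variable names are illustrative, not taken from this repo, and the per-token loop is deliberately naive.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class NaiveMoEFeedForward(nn.Module):
    """Top-2 mixture-of-experts feed-forward block (illustrative sketch)."""

    def __init__(self, dim: int, hidden_dim: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(dim, num_experts, bias=False)
        # Each expert is a SwiGLU-style FFN, like Llama's FeedForward.
        self.w1 = nn.ModuleList([nn.Linear(dim, hidden_dim, bias=False) for _ in range(num_experts)])
        self.w2 = nn.ModuleList([nn.Linear(hidden_dim, dim, bias=False) for _ in range(num_experts)])
        self.w3 = nn.ModuleList([nn.Linear(dim, hidden_dim, bias=False) for _ in range(num_experts)])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        tokens = x.reshape(-1, x.shape[-1])                  # (T, dim)
        scores, chosen = torch.topk(self.gate(tokens), self.top_k, dim=-1)
        weights = F.softmax(scores, dim=-1)                  # renormalize over the top-k
        out = torch.zeros_like(tokens)
        # Naive per-token loop: slow, but mirrors the "very naive" approach here.
        for t in range(tokens.shape[0]):
            for k in range(self.top_k):
                e = int(chosen[t, k])
                h = F.silu(self.w1[e](tokens[t])) * self.w3[e](tokens[t])
                out[t] += weights[t, k] * self.w2[e](h)
        return out.reshape_as(x)


# Tiny smoke test (real Mixtral uses dim=4096, hidden_dim=14336).
moe = NaiveMoEFeedForward(dim=32, hidden_dim=64)
print(moe(torch.randn(2, 5, 32)).shape)  # torch.Size([2, 5, 32])
```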

Usage

Download the weights for Mixtral from HF or via torrent. HF is the easiest: https://huggingface.co/someone13574/mixtral-8x7b-32kseqlen/tree/main . Make sure you consolidate the weights into a single checkpoint file first (see the sketch below).
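At the time, the mirrors shipped the checkpoint as split chunks rather than one file. The sketch below assumes raw byte-split chunks named consolidated.00.pth-split* whose names sort in order (as split's default aa, ab, ... suffixes do); both the chunk names and the split scheme are assumptions, so check what your download actually contains before running it.

```python
# A minimal sketch, assuming raw byte-split chunks whose names sort in order.
# Equivalent to: cat consolidated.00.pth-split* > consolidated.00.pth
import glob
import shutil

with open("path/to/mixtral/consolidated.00.pth", "wb") as out:
    for chunk in sorted(glob.glob("path/to/mixtral/consolidated.00.pth-split*")):
        with open(chunk, "rb") as part:
            shutil.copyfileobj(part, out)
```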

Run with 2 GPUs (~45 GB required on each):

python example_text_completion.py path/to/mixtral/ path/to/mixtral/tokenizer.model

To run with 4 GPUs, pass --num-gpus 4:
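python example_text_completion.py path/to/mixtral/ path/to/mixtral/tokenizer.model --num-gpus 4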

Edit the prompts in the example script if needed; a sketch of the prompt list follows.
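The prompts are a plain Python list inside example_text_completion.py. The strings below are reconstructed from the sample outputs further down, so treat them as approximate rather than the file's exact contents.

```python
# Approximate reconstruction of the prompts list in example_text_completion.py.
prompts = [
    "Mistral.ai is a company that",
    "Simply put, the theory of relativity states that",
    """Translate English to French:

        sea otter => loutre de mer
        peppermint => menthe poivrée
        plush girafe => girafe peluche
        cheese =>""",
]
```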

Sample output

Mistral hallucinates about Mistral:

Mistral.ai is a company that
> provides a platform for building, training, and deploying AI models.

The platform offers a variety of tools and services that can help developers and data scientists build and train AI models.

Some of the key features of Mistral.ai's platform include:

- A drag-and-drop

==================================

Simply put, the theory of relativity states that
> 1) the laws of physics are the same for all observers in uniform motion relative to one another, and 2) the speed of light in a vacuum is the same for all observers, regardless of their relative motion or of the motion of the light source.

The first postulate, the principle of

==================================

A brief message congratulating the team on the launch:

        Hi everyone,

        I just
> wanted to say a big congratulations on the launch of your new website.

        I think it looks fantastic and I am sure it will be a great success.

        Well done everyone and keep up the good work.

        Best wishes,

        XXXX

==================================

Translate English to French:

        sea otter => loutre de mer
        peppermint => menthe poivrée
        plush girafe => girafe peluche
        cheese =>
> fromage
        teddy bear => ourson en peluche
        polar bear => ours polaire
        cuddly panda => panda câlin
        fluffy sheep => mouton fluffy
        furry kitten => chaton poilu
        fuzzy

==================================
