🤗 AutoNLP: train state-of-the-art natural language processing models and deploy them in a scalable environment automatically

dongpil/autonlp

🤗 AutoNLP

AutoNLP: faster and easier training and deployments of SOTA NLP models

Installation

You can install the AutoNLP Python package via pip. Note that AutoNLP requires Python >= 3.7.

pip install autonlp

Make sure that Git LFS is installed. See the installation instructions here: https://github.com/git-lfs/git-lfs/wiki/Installation
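
Both prerequisites can be checked from Python before you install anything. This is a minimal sketch, not part of AutoNLP: the helper name `check_prerequisites` is hypothetical, and it assumes Git LFS is exposed as a `git-lfs` executable on your PATH, which is how standard installs provide it.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 7)):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    # Compare only (major, minor) against the required minimum version.
    if sys.version_info[:2] < tuple(min_python):
        required = ".".join(str(part) for part in min_python)
        problems.append(f"Python >= {required} is required")
    # Assumption: Git LFS is installed as a separate `git-lfs` executable.
    if shutil.which("git-lfs") is None:
        problems.append("git-lfs executable not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("WARNING:", problem)
```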

Quick start - in the terminal

Supported languages:

  • English: en
  • French: fr
  • German: de
  • Finnish: fi
  • Hindi: hi
  • Spanish: es
  • Chinese: zh
  • Dutch: nl
  • Turkish: tr

Supported tasks:

  • binary_classification
  • multi_class_classification
  • entity_extraction
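
If you create projects from a script, it can help to reject unsupported settings before calling the service. The sketch below simply copies the two lists above into Python sets; the names `SUPPORTED_LANGUAGES`, `SUPPORTED_TASKS`, and `validate_project_settings` are hypothetical and are not part of the AutoNLP package, and the lists may change as the beta evolves.

```python
# Values copied from the lists in this README; they may be out of date.
SUPPORTED_LANGUAGES = {"en", "fr", "de", "fi", "hi", "es", "zh", "nl", "tr"}
SUPPORTED_TASKS = {
    "binary_classification",
    "multi_class_classification",
    "entity_extraction",
}


def validate_project_settings(language, task):
    """Raise ValueError if the language code or task name is not listed above."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(
            f"Unsupported language {language!r}; expected one of {sorted(SUPPORTED_LANGUAGES)}"
        )
    if task not in SUPPORTED_TASKS:
        raise ValueError(
            f"Unsupported task {task!r}; expected one of {sorted(SUPPORTED_TASKS)}"
        )
```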

Note: AutoNLP is currently in beta release. To participate in the beta, just go to https://huggingface.co/autonlp and apply 🤗

First, create a project:

autonlp login --api-key YOUR_HUGGING_FACE_API_TOKEN
autonlp create_project --name sentiment_detection --language en --task binary_classification

Upload your data files. You need both a training and a validation split. Only CSV files are supported at the moment.

# Train split
autonlp upload --project sentiment_detection --split train \
               --col_mapping review:text,sentiment:target \
               --files ~/datasets/train.csv
# Validation split
autonlp upload --project sentiment_detection --split valid \
               --col_mapping review:text,sentiment:target \
               --files ~/datasets/valid.csv
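
A mistyped column name in `--col_mapping` is easy to miss, so it can be worth checking the mapping against the CSV header before uploading. The following is a rough sketch using only the standard library. It assumes that the left side of each `source:role` pair names a column in your CSV file, as in the example above; the helper names `parse_col_mapping` and `missing_columns` are hypothetical and not part of AutoNLP.

```python
import csv


def parse_col_mapping(spec):
    """Turn 'review:text,sentiment:target' into {'review': 'text', 'sentiment': 'target'}."""
    mapping = {}
    for pair in spec.split(","):
        source, sep, role = pair.partition(":")
        if not sep or not source.strip() or not role.strip():
            raise ValueError(f"Malformed mapping entry: {pair!r}")
        mapping[source.strip()] = role.strip()
    return mapping


def missing_columns(csv_path, mapping):
    """Return the mapped source columns that do not appear in the CSV header row."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        header = next(csv.reader(handle), [])
    return [column for column in mapping if column not in header]
```

For example, `missing_columns("train.csv", parse_col_mapping("review:text,sentiment:target"))` returns an empty list when both `review` and `sentiment` are present in the header of `train.csv`.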

Once the files are uploaded, you can start training the model:

autonlp train --project sentiment_detection

Monitor the progress of your project.

# Project progress
autonlp project_info --name sentiment_detection
# Model metrics
autonlp metrics --model MODEL_ID

Quick start - Python API

Setting up:

from autonlp import AutoNLP
client = AutoNLP()
client.login(token="YOUR_HUGGING_FACE_API_TOKEN")

Creating a project and uploading files to it:

project = client.create_project(name="sentiment_detection", task="binary_classification", language="en")
project.upload(
    filepaths=["/path/to/train.csv"],
    split="train",
    col_mapping={
        "review": "text",
        "sentiment": "target",
    })

# Also upload a validation file in the same way, with split="valid"
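
Since both splits share the same column mapping, the two uploads can be wrapped in one small loop. This is only a sketch: `upload_splits` is a hypothetical helper, and it relies on nothing beyond the `project.upload(filepaths=..., split=..., col_mapping=...)` call shown above.

```python
def upload_splits(project, files_by_split, col_mapping):
    """Call project.upload once per split, reusing the same column mapping."""
    for split, filepaths in files_by_split.items():
        project.upload(
            filepaths=list(filepaths),
            split=split,
            col_mapping=col_mapping,
        )


# Example usage, with `project` created as shown above:
# upload_splits(
#     project,
#     {"train": ["/path/to/train.csv"], "valid": ["/path/to/valid.csv"]},
#     {"review": "text", "sentiment": "target"},
# )
```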

Start the training of your models:

project.train()

To monitor the progress of your training:

project.refresh()
print(project)
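
To avoid re-running these two lines by hand, you could poll in a loop. The sketch below is one possible approach, not an AutoNLP feature: `watch_project` is a hypothetical helper. This README does not document how a project reports its status, so the stop condition is left to a caller-supplied `is_finished` function.

```python
import time


def watch_project(project, is_finished, interval_seconds=60.0, max_checks=60):
    """Refresh and print the project until is_finished(project) is true.

    Returns True if the stop condition was met, or False if max_checks
    refreshes were made without meeting it.
    """
    for attempt in range(max_checks):
        project.refresh()
        print(project)
        # `is_finished` is supplied by the caller; how to detect completion
        # depends on what the printed project information looks like.
        if is_finished(project):
            return True
        if attempt < max_checks - 1:
            time.sleep(interval_seconds)
    return False
```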

After the training of your models has succeeded, you can retrieve the metrics for each model and test them with the 🤗 Inference API:

client.predict(project="sentiment_detection", model_id=42, input_text="i love autonlp")

Or use the command line:

autonlp predict --project sentiment_detection --model_id 42 --sentence "i love autonlp"

How much do I have to pay?

It's difficult to give an exact answer to this question, but we have an estimator that might help. Enter the number of training samples and the language, and you will get an estimate. Keep in mind that this is only an estimate: the actual cost can be noticeably higher or lower (we are actively working on this).

autonlp estimate --num_train_samples 500000 --language en
