All notable changes to this project will be documented in this file. The format is based on Keep a Changelog.
- (BREAKING) Simplified the input and output of encoder/decoder models to avoid needing to take ownership of the possibly cached encoder hidden state, offering a minor performance improvement for text generation tasks. The model output field for encoder hidden states is now optional, and is only returned if the encoder hidden states were not provided for the given forward pass. This may be a breaking change for low-level dependencies that directly manipulate the encoder/decoder model outputs.
- Addition of the MobileBERT language model, task-specific heads and registration in relevant pipelines
- Made all model configurations `Clone`
- Made several base modules of the BERT language model public, and added model output `Struct`s for the newly exposed complex types
- Addition of the Reformer language model, task-specific heads and registration in relevant pipelines
- Pre-trained models for DistilRoBERTa, used as a default for integration tests
- Updated endpoints of the model resources, reflecting changes to Hugging Face's model hub
- Early stopping turned on by default for translation and summarization
- Support for additional models for the conversational pipeline
- Updated the version of the tokenizers crate with consistent visibility
- (BREAKING) Moved the text generation pipeline to its own pipeline. Shared generation utilities are moved to `generation_utils`
- All models, tokenizers and pipelines are now `Send`
- Benchmark scripts for all pipelines
- Addition of the XLNet model and task-specific heads
- (BREAKING) Changed the download method for resources: it is now a method of the resource itself, leveraging the cached-path crate.
- (BREAKING) Changed the return type of models to be output `Struct`s instead of long tuples.
- (BREAKING) Changed the naming of the model main modules from `modelname` to `model_modelname` to avoid confusion with the top-level module name
- Extended the range of allowed types for pipelines input, allowing both owned `Vec` and slices, and both `String` and string slices
- Handling of all activation functions is now done through a common module and `Struct`
- Zero-shot classification pipeline using a natural language inference model
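  As a rough illustration of how the zero-shot pipeline is intended to be used (the default configuration, the `predict_multilabel` signature and the return type below are assumptions and may differ between releases):

```rust
use rust_bert::pipelines::zero_shot_classification::ZeroShotClassificationModel;

fn main() -> anyhow::Result<()> {
    // Assumed: the default configuration downloads an NLI model suitable for zero-shot classification.
    let model = ZeroShotClassificationModel::new(Default::default())?;

    let inputs = ["Who are you voting for in 2020?"];
    let candidate_labels = ["politics", "public health", "economics", "sports"];

    // Assumed signature: inputs, candidate labels, optional hypothesis template,
    // maximum sequence length.
    let output = model.predict_multilabel(&inputs, &candidate_labels, None, 128);
    println!("{:?}", output);
    Ok(())
}
```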
- (BREAKING) Updated version of tokenizers crate with added options for lower casing, accent stripping and prefix addition
- Updated BART classification models to allow running their `forward` method without being mutable.
- (BREAKING) Improved error handling via the addition of `RustBertError` and error propagation throughout the crate.
- Updated version of tokenizers crate with improved error handling
- Addition of the reformer language model and its integration for language generation
- Changed model resources endpoints to leverage the updated Hugging Face model hub
- Updated the beam search processing to use vectorized operations
- Generalization of the accepted input for several pipelines to accept both `Vec` and slices, and to accept both `String` and `&str`
- Addition of the ALBERT language model and task-specific heads
- Addition of German - English translation models
- Addition of the T5 language model and integration in supported pipelines (translation and summarization)
- Updated the modules throughout the crate to accept both owned varstore paths and references to them.
- Addition of a multi-turn conversational pipeline based on DialoGPT.
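  A minimal sketch of a multi-turn exchange with this pipeline (the constructor, the `ConversationManager` helper and the `generate_responses` call are assumptions based on this entry and may not match the exact API of every release):

```rust
use rust_bert::pipelines::conversation::{ConversationManager, ConversationModel};

fn main() -> anyhow::Result<()> {
    // Assumed: the default configuration downloads a DialoGPT checkpoint.
    let model = ConversationModel::new(Default::default())?;
    let mut manager = ConversationManager::new();

    // Register a conversation, then generate a response for every active conversation.
    let _conversation_id = manager.create("Going to the movies tonight - any suggestions?");
    let responses = model.generate_responses(&mut manager);
    println!("{:?}", responses);
    Ok(())
}
```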
- Code formatting using `rustfmt`
- Removed the requirement for generation models to be mutable. Models are now all stateless, and no longer store an internal cache (now provided as an input).
- Updated the BART model to take past layer states as an input instead of storing them internally.
- Fixed sequence classification model logits squeeze causing it to crash for batched inputs.
- Addition of translation between Russian and English
- Fixed a bug causing downloads to be incomplete, and removed the creation of a tokio runtime for the download of resources.
- Addition of the Marian model, leveraging a shared language model implementation with the BART model.
- Addition of translation capabilities. Supports translation between English and French, Spanish, Portuguese, Italian, Catalan and German, and between German and French.
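  By way of illustration, a translation call could look like the sketch below (the `Language` variant names, the configuration constructor and the mutability requirements are assumptions and have changed across releases):

```rust
use rust_bert::pipelines::translation::{Language, TranslationConfig, TranslationModel};
use tch::Device;

fn main() -> anyhow::Result<()> {
    // Assumed: a language-pair variant selects the corresponding Marian weights.
    let config = TranslationConfig::new(Language::EnglishToFrench, Device::cuda_if_available());
    let model = TranslationModel::new(config)?;

    let output = model.translate(&["This sentence will be translated to French."]);
    println!("{:?}", output);
    Ok(())
}
```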
- Addition of multi-label classification capabilities for sequence classification via the `predict_multilabel` function.
- Generalization of pipelines to allow leveraging multiple model architectures. Leveraging `Enum` unpacking, this introduces `ConfigOption`, `TokenizerOption` and pipeline-specific Options.
- Addition of a generic `SentenceClassificationModel` pipeline. The `SentimentModel` now leverages the shared implementation for sentence classification.
- Addition of a `TokenClassificationModel` pipeline. The `NERModel` now leverages the shared implementation for token classification.
- Major rework of tokenization crate, alignment with updated API
- Minor bug fixes for tokenization
- Implementation of the Electra model (generator, discriminator, task-specific heads)
- GPT2-medium and GPT2-large models
- Addition of Resources for handling file dependencies (e.g. vocabularies, model weights, configurations). Resources may be `LocalResources` (pointing to a filesystem location) or `RemoteResources` (pointing to a remote endpoint). These resources can be passed to a `download_resource` method that returns the location in the local filesystem for both types of resources, downloading them if necessary (see the sketch below).
- Resources specifications for all existing architectures, pointing to model files hosted on Hugging Face's model hub.
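  A sketch of how these resources are meant to be used (the `from_pretrained` constructor, the placeholder URL and the exact return types are assumptions; a later entry above replaces the free `download_resource` function with a method on the resource itself):

```rust
use rust_bert::resources::{download_resource, LocalResource, RemoteResource, Resource};
use std::path::PathBuf;

fn main() -> anyhow::Result<()> {
    // A remote resource identified by a cache name and a URL (both placeholders here).
    let remote = Resource::Remote(RemoteResource::from_pretrained((
        "distilbert-sst2/model",
        "https://huggingface.co/path/to/rust_model.ot",
    )));
    // A local resource simply wraps a filesystem path.
    let local = Resource::Local(LocalResource {
        local_path: PathBuf::from("path/to/config.json"),
    });

    // `download_resource` returns a local path for either variant,
    // fetching the remote file into the cache if it is not already present.
    let weights_path = download_resource(&remote)?;
    let config_path = download_resource(&local)?;
    println!("{:?} {:?}", weights_path, config_path);
    Ok(())
}
```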
- (BREAKING) Moved the resources' specification to the `GenerateConfig` for `GPT2Generator`.
- (BREAKING) Creation of pipeline configurations to contain the resources required to build the pipeline, used as an input rather than paths to local files.
- Updated the configuration for the number of target labels to use the `id2label` field instead of `num_labels` (aligning with changes in the standard configuration in the Transformers library). Removed `num_labels` from configurations.
- Made the `output_attentions`, `output_hidden_states` and `torchscript` fields for the DistilBERT configuration optional
- Fixed the device placement for sinusoidal embeddings for the DistilBERT model.
- Optimization of the BART model avoiding unnecessary tensor copies for cache manipulation and residual connections.
- Optimization of DistilBERT model when embeddings are provided as an input
- Minor optimizations to question answering and sentiment analysis pipelines
- Addition of a cache reset for text generation routines
- Implementation of cache reset for BART language model
- BART language model
- Implementation of `LanguageModel` and `PrivateLanguageModel` for BART
- Summarization capabilities
- Tanh activation
- (BREAKING) Moved the `LMHeadModel` Trait from the GPT2 module to the pipelines module
- Updated the `LMHeadModel` inputs to include `encoder_outputs` and `decoder_input_ids` to support causal language models (e.g. BART)
- (BREAKING) Added methods to the `PrivateLanguageGenerator` to support encoder-decoder models
- (BREAKING) Changed the type of `Generator` language model to require mutability (the BART caching mechanism stores the cache in the model, requiring mutability of the entire model - changed at a later point)
- Optimization of the `get_banned_token` method
- Updated the device location of the token update when EOS is not allowed because the minimum sequence length was not reached
- No longer process a given beam hypothesis if it is marked as done
- No longer add beams to a hypothesis if the rank is lower than the number of beams
- Updated final beam update to skip completed hypotheses
- Documentation throughout the crate
- Creation of a `GenerateConfig` configuration structure to hold generation options
- Visibility of low-level utilities in the crate
- Updated the generation options to be passed at text generation model instantiation, rather than at every call to the `generate` method
- Updated the visibility of generation routines into a public API and private lower-level methods
- Text generation now takes an `Option<Vec<&str>>` instead of an `Option<&str>`. Shorter sequences are left-padded with `pad` if available, otherwise with `eos` (see the sketch below).
- Turned off gradient calculations for the generation process
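  A sketch of the resulting generation flow (the module path, the `GenerateConfig` fields and the `generate` signature shown here are assumptions and were reorganised in later releases):

```rust
use rust_bert::pipelines::generation::{GenerateConfig, GPT2Generator, LanguageGenerator};

fn main() -> anyhow::Result<()> {
    // Assumed: generation options are provided once, at model instantiation.
    let generate_config = GenerateConfig {
        max_length: 30,
        do_sample: true,
        num_beams: 5,
        ..Default::default()
    };
    let mut model = GPT2Generator::new(generate_config)?;

    // Several prompts can be passed at once; shorter ones are left-padded internally.
    let output = model.generate(Some(vec!["The dog", "The cat was"]), None);
    println!("{:?}", output);
    Ok(())
}
```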
- Beam search completion validation
- Padding sequence for sentences shorter than the maximum length moved to correct device
- DistilGPT2 pretrained weights for GPT2
- `LMHeadModel` trait for models supporting text generation, offering an interface between the model-specific inputs/outputs and the generic set of inputs/outputs expected for models supporting text generation
- Implementation of `LMHeadModel` for GPT2 and GPT
- Text generation pipeline, supporting beam search, top-k/top-p decoding, repeated token banning, and repetition and length penalties, as a `LanguageGenerator` Trait
- Implementation of `LanguageGenerator` for GPT and GPT2
- Examples and tests for language generation
- Fixed concatenation dimension for GPT2 past
- Updated the input type for `QuestionAnsweringModel`'s `predict` to be `&[QaInput]` instead of a pair of question and context strings. `QuestionAnsweringModel` now works with a list of inputs and returns a list of predictions, processing inputs as batches.
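  For illustration, batched question answering with the updated input type could look like the sketch below (the default configuration and the two trailing `predict` arguments, assumed here to be the number of answers per question and the internal batch size, may differ between releases):

```rust
use rust_bert::pipelines::question_answering::{QaInput, QuestionAnsweringModel};

fn main() -> anyhow::Result<()> {
    let qa_model = QuestionAnsweringModel::new(Default::default())?;

    let inputs = vec![
        QaInput {
            question: String::from("Where does Amy live?"),
            context: String::from("Amy lives in Amsterdam."),
        },
        QaInput {
            question: String::from("Where does Eric live?"),
            context: String::from("While Amy lives in Amsterdam, Eric lives in The Hague."),
        },
    ];

    // One list of answers is returned per input, processed as a batch.
    let answers = qa_model.predict(&inputs, 1, 32);
    println!("{:?}", answers);
    Ok(())
}
```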
- Swish and gelu_new activation functions
- GPT2 language model
- GPT language model
- Addition of a NER pipeline
- Addition of a QuestionAnswering pipeline
- Moved `SentimentClassifier` from the DistilBERT module to the newly created pipelines
- Changed the precision of the id-to-label mapping of the BERT config from `i32` to `i64`
- Simplified calculation of sinusoidal embeddings for DistilBERT
- Addition of RoBERTa language model
- Addition of a `BertEmbedding` trait for BERT-like models
- Updated `BertEmbeddings` to implement the newly created `BertEmbedding` Trait
- Updated `BertModel`'s embeddings to be of type `impl BertEmbedding` rather than specific embeddings, allowing the BERT structure to be re-used for other models by only replacing the embeddings layer.
- Fixed the variable path for BERT models with task-specific heads to allow loading a snapshot from models trained on Transformers.
- BERT Model and examples
- Addition of `DistilBertForTokenClassification` and `DistilBertForQuestionAnswering` model heads
- Collection of activation functions (gelu, relu, mish)
- Dropout module
- Custom Linear layer, allowing creation without bias
- Config trait allowing deserialization from `json` files
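  A brief sketch of the intent of this trait (the `from_file` constructor, the re-export path and the configuration field shown are assumptions):

```rust
use rust_bert::distilbert::DistilBertConfig;
use rust_bert::Config;
use std::path::Path;

fn main() {
    // Assumed: any type implementing the `Config` trait can be deserialized
    // from a Hugging Face style `config.json` file.
    let config = DistilBertConfig::from_file(Path::new("path/to/config.json"));
    println!("{} transformer layers", config.n_layers);
}
```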
- (BREAKING) Updated `DistilBertConfig` to use the newly created `Config` Trait
- Integration tests
- Migrated from `rust_transformers` v0.2.0 (deprecated) to `rust_tokenizers` v1.0.0
- Example for DistilBERT masked language modeling
- Download utilities script for DistilBERT (base and SST2)
- Made `label2id`, `id2label`, `is_decoder`, `output_past` and `use_bfloat` configuration fields optional for `DistilBertConfig`
- Tensor conversion tools from Pytorch to Libtorch format
- DistilBERT model architecture
- Ready-to-use `SentimentClassifier` using a DistilBERT model fine-tuned on SST2