Stars
A very simple framework for state-of-the-art Natural Language Processing (NLP)
A sample app for the Retrieval-Augmented Generation pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences.
Aligning pretrained language models with instruction data generated by themselves.
Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
Unsupervised Word Segmentation for Neural Machine Translation and Text Generation
Prompt Learning for Vision-Language Models (IJCV'22, CVPR'22)
MAGMA - a GPT-style multimodal model that can understand any combination of images and language. NOTE: The freely available model from this repo is only a demo. For the latest multimodal and multil…
The package connects to Telegram's API to generate JSON files containing data for channels, including information and posts. It allows you to search for specific channels or a set of channels provi…
An open etymology dataset created using Wiktionary data. Contains 3.8M entries, 1.8M terms, 2900 languages, and 31 unique relationship types.
Code for "From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models". ACL 2023. Best Paper Award.
Repository for data and evaluation of 2024 Shared Task on SDG classification held by the Swiss Text Conference.
pleyad / medi-magma
Forked from Aleph-Alpha/magma
MAGMA - a GPT-style multimodal model that can understand any combination of images and language. NOTE: The freely available model from this repo is only a demo. For the latest multimodal and multil…