Stars
Aligning pretrained language models with instruction data generated by themselves.
Package to extract connotation frames
Code for "From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models". ACL 2023. Best Paper Award.
Curated papers on Large Language Models in Healthcare and Medical domain
Entry for the shared task at SwissText 2024 - Automatic Classification of the United Nations’ Sustainable Development Goals in scientific abstracts
Repository for data and evaluation of 2024 Shared Task on SDG classification held by the Swiss Text Conference.
A sample app for the Retrieval-Augmented Generation pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences.
Tutorial for surrogate gradient learning in spiking neural networks
pleyad / medi-magma
Forked from Aleph-Alpha/magma
MAGMA - a GPT-style multimodal model that can understand any combination of images and language. NOTE: The freely available model from this repo is only a demo. For the latest multimodal and multil…
A very simple framework for state-of-the-art Natural Language Processing (NLP)
A Python wrapper around the topic modeling functions of MALLET.
Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supporting…
The package connects to Telegram's API to generate JSON files containing data for channels, including information and posts. It allows you to search for specific channels or a set of channels provi…
Tutorials, datasets, and other material associated with the textbook "A First Course in Network Science" by Menczer, Fortunato & Davis
A vocoder framework that has been widely used in the research community since 1999.
Prompt Learning for Vision-Language Models (IJCV'22, CVPR'22)
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
Repo for "Natural Language Processing for Law and Social Science" @ ETH Zurich, Spring 2022
An open etymology dataset created using Wiktionary data. Contains 3.8M entries, 1.8M terms, 2900 languages, and 31 unique relationship types.
Reading list for research topics in multimodal machine learning
Unsupervised Word Segmentation for Neural Machine Translation and Text Generation
An annotated implementation of the Transformer paper.
Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
MAGMA - a GPT-style multimodal model that can understand any combination of images and language. NOTE: The freely available model from this repo is only a demo. For the latest multimodal and multil…
🐙 Guides, papers, lectures, notebooks and resources for prompt engineering