Stars
The official repository of the paper: COLD: A Benchmark for Chinese Offensive Language Detection
Fundamentals of Machine Learning (EEL5840) final project.
indicTranslate v1 - Machine Translation for 11 Indic languages. For latest v2, check: https://github.com/AI4Bharat/IndicTrans2
A state-of-the-art open visual language model | multimodal pre-trained model
MENYO-20k Corpus in "The Effect of Domain and Diacritics in Yorùbá-English Neural Machine Translation" in MT Summit 2021
CVPR 2021 Official Pytorch Code for UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pre-training
Code for the ICML 2021 (long talk) paper: "ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision"
Facebook Low Resource (FLoRes) MT Benchmark
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
A novel embedding training algorithm that leverages ANN search and achieves SOTA retrieval on TREC DL 2019 and OpenQA benchmarks
Dense Passage Retriever is a set of tools and models for open-domain Q&A tasks.
Implementation of "RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation".
Transformer model for Chinese-English translation.
[ACL 2023] S3HQA: A Three-Stage Approach for Multi-hop Text-Table Hybrid Question Answering
AIR retriever for Multi-Hop QA (ACL 2020 paper)
Anserini is a Lucene toolkit for reproducible information retrieval research
Interpretable AMR-Based Question Decomposition for Multi-hop Question Answering
A simple RNN-based encoder-decoder network for machine translation from Chinese to English.
Chinese-English neural machine translation with an encoder-decoder seq2seq model: bidirectional GRU + fastText word embeddings + attention + k-beam search + BLEU score
Attention-based RNN model for Chinese-English translation
Semantic Priming Across Many Languages (PSA Proposal)
An attempt to build a working, locally-running cheap version of Generative Agents: Interactive Simulacra of Human Behavior