Stars
💫 Industrial-strength Natural Language Processing (NLP) in Python
🔒 Consolidating and extending hosts files from several well-curated sources. Optionally pick extensions for porn, social media, and other categories.
Facilitating the design, comparison and sharing of deep text matching models.
Implementation of BERT that can load official pre-trained models for feature extraction and prediction
Fully open data curation for reasoning models
🎯 Task-oriented embedding tuning for BERT, CLIP, etc.
NLP DNN Toolkit - Building Your NLP DNN Models Like Playing Lego
Full text geoparsing as a Python library
TrustRAG: The RAG Framework within Reliable input, Trusted output
Kaggle: Quora Question Pairs, 4th/3396 (https://www.kaggle.com/c/quora-question-pairs)
Grounded search engine (i.e. with source references) based on LLM / ChatGPT / OpenAI API. It supports web search, file content search, etc.
Facilitating the design, comparison and sharing of deep text matching models.
A toolkit for evaluating the linguistic knowledge and transferability of contextual representations. Code for "Linguistic Knowledge and Transferability of Contextual Representations" (NAACL 2019).
Utilities, baselines, statistics and descriptions related to the MSMARCO dataset
Question answering and reading comprehension toolkit
Evaluation tools for Retrieval-augmented Generation (RAG) methods.
A simple version of MatchPyramid implemented in TensorFlow. Paper: https://arxiv.org/abs/1602.06359
A curated list of resources dedicated to retrieval-augmented generation (RAG).
Official repo of the paper "QUITO: Accelerating Long-Context Reasoning through Query-Guided Context Compression".
The IR papers that rocked the world, including best papers, test-of-time papers, and highly cited papers published in IR conferences.
The matchzoo-doc-template contains the code to generate the API documentation for the matchzoo project.
When to Retrieve? Teaching LLMs to Utilize Information Retrieval Effectively