Stars
Code for the Piccolo embedding model from SenseTime
Official repo for "Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale"
[ICLR 2025] LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
YAYI large model for information extraction: instruction fine-tuned on millions of manually constructed, high-quality information extraction samples, developed by the Zhongke Wenge (中科闻歌) algorithm team. (Repo for the YAYI Unified Information Extraction Model)
Freeing data processing from scripting madness by providing a set of platform-agnostic, customizable pipeline processing blocks.
CoreNet: A library for training deep neural networks
[ACL 2020] TACRED Revisited: A Thorough Evaluation of the TACRED Relation Extraction Task
[ACL 2019] Dataset and code for DocRED: A Large-Scale Document-Level Relation Extraction Dataset.
[LREC 2022] An off-the-shelf pre-trained Tweet NLP Toolkit (NER, tokenization, lemmatization, POS tagging, dependency parsing) + Tweebank-NER dataset
Repository for the paper "MultiNERD: A Multilingual, Multi-Genre and Fine-Grained Dataset for Named Entity Recognition (and Disambiguation)" (NAACL 2022).
HarveyNER, a new dataset of tweets annotated with fine-grained locations, with strong baseline models trained via curriculum learning.
The Broad Twitter Corpus, an English NER dataset stratified by time, location, social media genre, and socioeconomic factors (COLING 2016)
A guideline-following large language model for information extraction
Awesome papers about generative Information Extraction (IE) using Large Language Models (LLMs)
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
Large language model training in three stages, plus deployment
This is the official repo for "PromptAgent: Strategic Planning with Language Models Enables Expert-level Prompt Optimization". PromptAgent is a novel automatic prompt optimization method that auton…
[ACL 2023] Code repo for the paper "Augmentation-Adapted Retriever Improves Generalization of Language Models as Generic Plug-In".
Chinese-Mixtral-8x7B: a Chinese version of Mixtral-8x7B
Official inference library for Mistral models
[ICLR 2024] Deita: Data-Efficient Instruction Tuning for Alignment
A principled instruction benchmark for formulating effective queries and prompts for large language models (LLMs). Our paper: https://arxiv.org/abs/2312.16171
ChatGLM3 series: open-source bilingual (Chinese-English) chat LLMs