code for piccolo embedding model from SenseTime

Python 118 6 Updated May 21, 2024

Official Repo for "Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale"

Python 209 17 Updated Oct 16, 2024

[ICLR 2025] LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs

Python 1,587 156 Updated Oct 29, 2024

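YAYI information extraction large model: instruction-tuned on million-scale, manually constructed, high-quality information extraction data; developed by the Wenge (中科闻歌) algorithm team. (Repo for YAYI Unified Information Extraction Model)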

285 13 Updated Aug 8, 2024

Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.

Python 2,177 164 Updated Jan 30, 2025

CoreNet: A library for training deep neural networks

Jupyter Notebook 7,000 546 Updated Oct 14, 2024
Python 31 6 Updated Jan 28, 2024
Python 23 2 Updated Sep 13, 2024

[ACL 20] TACRED Revisited: A Thorough Evaluation of the TACRED Relation Extraction Task

Jupyter Notebook 71 8 Updated May 1, 2020

Dataset and code for the ACL 2019 paper "DocRED: A Large-Scale Document-Level Relation Extraction Dataset".

Python 627 112 Updated Dec 1, 2020

[LREC 2022] An off-the-shelf pre-trained Tweet NLP Toolkit (NER, tokenization, lemmatization, POS tagging, dependency parsing) + Tweebank-NER dataset

Python 104 8 Updated Jan 24, 2024

Repository for the paper "MultiNERD: A Multilingual, Multi-Genre and Fine-Grained Dataset for Named Entity Recognition (and Disambiguation)" (NAACL 2022).

Jupyter Notebook 43 7 Updated Jan 30, 2024

A new dataset HarveyNER with fine-grained locations annotated in tweets with strong baseline models using Curriculum Learning.

Python 6 1 Updated Nov 8, 2022

The Broad Twitter Corpus, an NER dataset in English stratified for time, location, social media genre, socioeconomic factors (COLING 2016)

Jupyter Notebook 66 6 Updated May 12, 2022

Guideline following Large Language Model for Information Extraction

Python 335 27 Updated Oct 27, 2024

Awesome papers about generative Information Extraction (IE) using Large Language Models (LLMs)

861 47 Updated Nov 18, 2024

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Python 2,903 266 Updated Jun 4, 2024
Python 88 17 Updated Aug 3, 2021

Large language model training in three stages, plus deployment

Python 47 12 Updated Aug 14, 2023

This is the official repo for "PromptAgent: Strategic Planning with Language Models Enables Expert-level Prompt Optimization". PromptAgent is a novel automatic prompt optimization method that auton…

Python 238 29 Updated Aug 4, 2024

[ACL 2023] This is the code repo for our ACL'23 paper "Augmentation-Adapted Retriever Improves Generalization of Language Models as Generic Plug-In".

Python 59 5 Updated Jul 12, 2024

Chinese Mixtral-8x7B (Chinese-Mixtral-8x7B)

Python 646 33 Updated Aug 17, 2024
Python 138 4 Updated Jul 1, 2024

Official inference library for Mistral models

Jupyter Notebook 9,913 884 Updated Nov 12, 2024

Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]

Python 531 29 Updated Dec 9, 2024

A principled instruction benchmark on formulating effective queries and prompts for large language models (LLMs). Our paper: https://arxiv.org/abs/2312.16171

Python 944 96 Updated May 28, 2024

Implementation of Chinese ChatGPT

Python 287 36 Updated Nov 20, 2023

ChatGLM3 series: Open Bilingual Chat LLMs

Python 13,615 1,592 Updated Jan 13, 2025

ChatGLM2-6B full-parameter fine-tuning, with efficient fine-tuning that supports multi-turn dialogue.

Python 398 41 Updated Aug 17, 2023