
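MNBVC (Massive Never-ending BT Vast Chinese corpus): a very large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures, and even text written in Martian script (火星文). It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripted dialogue, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.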

3,746 262 Updated Mar 2, 2025

A professional list on Large (Language) Models and Foundation Models (LLM, LM, FM) for Time Series, Spatiotemporal, and Event Data.

997 75 Updated Dec 22, 2024

Inference code for Llama models

Python 57,780 9,715 Updated Jan 26, 2025

ChatYuan: Large Language Model for Dialogue in Chinese and English

Python 1,896 183 Updated Jun 16, 2023

ICME 2022 paper "Improving Image Paragraph Captioning with Dual Relations" code

Python 7 1 Updated Mar 16, 2022

First-place solution for Track 1 of the QQ Browser 2021 AI Algorithm Competition

Python 263 59 Updated Feb 11, 2022

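Abu quantitative trading system (stocks, options, futures, Bitcoin, machine learning): an open-source, Python-based framework for quantitative trading and quantitative investment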

Python 12,986 3,919 Updated Nov 21, 2024

Automotive knowledge graph

JavaScript 75 36 Updated Aug 24, 2020
Python 13 6 Updated Jul 20, 2021

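📙 Chinese Xinhua Dictionary database, including xiehouyu (two-part allegorical sayings), idioms, words, and Chinese characters.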

Python 11,086 2,604 Updated Dec 26, 2023

Python module to generate all regular expression matches

Python 186 45 Updated Nov 19, 2024

CCF 2020 QA matching competition, top-1 solution

Python 266 83 Updated Jan 28, 2021

Keras implementation of transformers for humans

Python 5,389 928 Updated Nov 11, 2024

"Few-shot Text Classification with Distributional Signatures" ICLR 2020

Python 257 54 Updated Dec 17, 2020

2020 Alibaba Cloud Tianchi Big Data Competition: Traditional Chinese Medicine Named Entity Recognition Challenge

Python 27 9 Updated Nov 7, 2020

A PyTorch implementation of the method found in "Adversarially Robust Few-Shot Learning: A Meta-Learning Approach"

Python 50 10 Updated Oct 9, 2020

text-to-image synthesis

Python 6 Updated Sep 11, 2019

BERT-based Seq2Seq architecture trained on SQuAD to generate questions given a text and an answer.

Python 25 3 Updated Nov 5, 2020
Python 38 5 Updated Jul 13, 2020

A Large-Scale Few-Shot Relation Extraction Dataset

Python 732 166 Updated May 4, 2022

Attention-based Induction Networks for Few-Shot Text Classification

Python 45 7 Updated Jun 3, 2020

DBSCAN clustering algorithm on top of Apache Spark

HTML 258 116 Updated Mar 28, 2018

Kuakua (praise) corpus, drawn from data of the Douban mutual-praise group

75 19 Updated Apr 4, 2019

Focal loss for multiple class classification

Python 81 17 Updated Oct 27, 2020

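This project collects the knowledge points and code implementations commonly tested in Machine Learning, Deep Learning, and NLP interviews; they are also the theoretical fundamentals every algorithm engineer needs to know.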

Jupyter Notebook 3 2 Updated Mar 26, 2020

Must-read papers on neural relation extraction (NRE)

TeX 1,030 153 Updated Nov 10, 2020

Novel deep learning research works with PaddlePaddle

Python 1,723 787 Updated Aug 16, 2024

100+ Chinese Word Vectors: over a hundred sets of pre-trained Chinese word vectors

Python 11,934 2,324 Updated Oct 30, 2023