Starred repositories

Genome modeling and design across all domains of life

Jupyter Notebook 2,466 239 Updated Mar 6, 2025

Predict author h-index and paper citation counts on the dataset underlying Semantic Scholar

Python 27 9 Updated Apr 14, 2017

AI for crystal materials

32 2 Updated Mar 8, 2025

Data and code for NeurIPS 2022 Paper "Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering".

Python 643 65 Updated Sep 19, 2024

Code and steps used to generate the Data Citation Corpus dump file

Python 5 1 Updated Feb 2, 2025

Data annotation toolbox that supports image, audio, and video data.

Python 1,065 106 Updated Mar 11, 2025

🧑‍🚀 Summary of the world's best LLM resources (data processing, model training, model deployment, o1 models, small language models, vision-language models).

3,992 420 Updated Mar 10, 2025

OpenResearcher, an advanced Scientific Research Assistant

HTML 437 36 Updated Oct 10, 2024

Scientific Large Language Models: A Survey on Biological & Chemical Domains

292 30 Updated Feb 5, 2025

Must-read papers on NLP for science.

58 6 Updated Jun 19, 2023

[ICLR 2024] Mol-Instructions: A Large-Scale Biomolecular Instruction Dataset for Large Language Models

Python 269 16 Updated Oct 28, 2024

List the AI for Science papers accepted by top conferences

Jupyter Notebook 101 12 Updated Sep 14, 2024
HTML 4 Updated Jan 7, 2025

Artificial Intelligence Research for Science (AIRS)

Python 577 63 Updated Mar 11, 2025

MLCommons Science benchmarking working group

Jupyter Notebook 13 3 Updated May 19, 2023

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Python 38,075 4,653 Updated Mar 1, 2025

DataComp: In search of the next generation of multimodal datasets

Python 684 56 Updated Jan 2, 2024

Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.

Python 3,938 348 Updated Aug 7, 2024

A Comprehensive Survey of Scientific Large Language Models and Their Applications in Scientific Discovery (EMNLP'24)

545 30 Updated Feb 26, 2025

Repository for research in the field of Responsible NLP at Meta.

Python 196 30 Updated Nov 27, 2024

Papers on fairness in NLP

436 53 Updated May 2, 2024

NOMAD lets you manage and share your materials science data in a way that makes it truly useful to you, your group, and the community.

JavaScript 83 21 Updated Mar 11, 2025

Download datasets (MP, OQMD, AFLOW, JARVIS, etc.) using Matminer, RESTful APIs, and AFLUX

Jupyter Notebook 6 2 Updated Jan 20, 2020

FAIR Chemistry's library of machine learning methods for chemistry

Python 1,001 286 Updated Mar 11, 2025

API Client for paperswithcode.com

Python 161 24 Updated May 10, 2024

GitHub repository for "Reduced, Reused and Recycled" (NeurIPS 2021 Best Paper, D&B Track)

Jupyter Notebook 17 4 Updated Jan 8, 2022

Summary of existing representative LLM text datasets.

1,204 120 Updated Mar 7, 2025

OpenAGI: When LLM Meets Domain Experts

Python 2,080 179 Updated Nov 28, 2024

TaiSu (太素): a large-scale Chinese multimodal dataset (a hundred-million-scale Chinese vision-language pre-training dataset)

Python 179 13 Updated Nov 17, 2023