LLM4SR: A Survey on Large Language Models for Scientific Research

Authors: Ziming Luo*, Zonglin Yang*, Zexin Xu, Wei Yang, Xinya Du

This is a repository for organizing papers, code, and other resources related to large language models for the scientific research process.

📚 How to read?

Schematic overview of the scientific research pipeline covered in this survey. This cyclical process begins with scientific hypothesis discovery, followed by experiment planning and implementation, paper writing, and finally peer reviewing of papers. The experiment planning stage consists of optimizing experiment design and executing research tasks, while the paper writing stage consists of citation text generation, related work generation, and drafting & writing. The papers listed below include both task-specific methods and evaluation benchmarks. Note that some papers may appear in both categories.

🔆 This project is still ongoing; pull requests are welcome!!

If you have any suggestions (missing papers, new papers, key researchers, or typos), please feel free to edit and open a pull request. Just letting us know the titles of papers is also a great contribution. You can do this by opening an issue or contacting us directly via email.

⭐ If you find this repo useful, please star it!!!

LLMs for Scientific Hypothesis Discovery

SciMON SciMON: Scientific Inspiration Machines Optimized for Novelty (May. 23, 2023; ACL 2024)
MOOSE Large Language Models for Automated Open-domain Scientific Hypotheses Discovery (Sep. 6, 2023; ICML AI4Science Workshop Best Poster Award; ACL 2024)
MCR Monte Carlo Thought Search: Large Language Model Querying for Complex Scientific Reasoning in Catalyst Design (Oct. 22, 2023; EMNLP 2023)
Large language models are zero shot hypothesis proposers (Nov. 10, 2023; COLM 2024)
FunSearch Mathematical discoveries from program search with large language models (Dec. 14, 2023; Nature)
ChemReasoner ChemReasoner: Heuristic Search over a Large Language Model's Knowledge Space using Quantum-Chemical Feedback (Feb. 15, 2024; ICML 2024)
SGA LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery (May. 16, 2024; ICML 2024)
AIScientist The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery (Aug. 12, 2024)
MLR-Copilot MLR-Copilot: Autonomous Machine Learning Research based on Large Language Models Agents (Aug. 26, 2024)
IGA Can llms generate novel research ideas? a large-scale human study with 100+ nlp researchers (Sep. 6, 2024)
SciAgents SciAgents: Automating scientific discovery through multi-agent intelligent graph reasoning (Sep. 9, 2024)
Scideator Scideator: Human-LLM Scientific Idea Generation Grounded in Research-Paper Facet Recombination (Sep. 23, 2024)
MOOSE-Chem MOOSE-Chem: Large Language Models for Rediscovering Unseen Chemistry Scientific Hypotheses (Oct. 9, 2024; ICLR 2025)
VirSci Two Heads Are Better Than One: A Multi-Agent System Has the Potential to Improve Scientific Idea Generation (Oct. 12, 2024)
CoI Chain of Ideas: Revolutionizing Research in Novel Idea Development with LLM Agents (Oct. 17, 2024)
Nova Nova: An Iterative Planning and Search Approach to Enhance Novelty and Diversity of LLM Generated Ideas (Oct. 18, 2024)

LLMs for Experiment Planning and Implementation

Optimizing Experimental Design

Coscientist Autonomous chemical research with large language models (Dec. 20, 2023)
ChemCrow Augmenting large language models with chemistry tools (May. 08, 2024)
CRISPR-GPT CRISPR-GPT: An LLM Agent for Automated Design of Gene-Editing Experiments (Apr. 27, 2024)
Navigating Complexity Navigating Complexity: Orchestrated Problem Solving with Multi-Agent LLMs (Jul. 10, 2024)
HuggingGPT HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face (Dec. 03, 2024)
AutoGen AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework (Oct. 03, 2023)
LLM-RDF An automatic end-to-end chemical synthesis development platform powered by large language models (Nov. 23, 2024)
Simulating Expert Discussions with Multi-agent for Enhanced Scientific Problem Solving (Jan. 23, 2024)

Automating Experimental Processes

LLMs for Scientific Paper Writing

Citation Text Generation

Automatic Generation of Citation Texts in Scholarly Papers: A Pilot Study (Jul. 30, 2020)
Explaining Relationships Among Research Papers (Feb. 20, 2024)

AutoCite AutoCite: Multi-Modal Representation Fusion for Contextual Citation Generation (Mar. 08, 2021)
BACO BACO: A Background Knowledge- and Content-Based Framework for Citing Sentence Generation (Aug. 1, 2021)
Controllable Citation Sentence Generation with Language Models (Nov. 14, 2022)
Intent-Controllable Citation Text Generation (May. 21, 2022)

Related Work Generation

Shallow Synthesis of Knowledge in GPT-Generated Texts: A Case Study in Automatic Related Work Composition (Feb. 19, 2024)
Leveraging Large Language Models for Literature Review Tasks - A Case Study Using ChatGPT (Dec. 20, 2023)
LitLLM LitLLM: A Toolkit for Scientific Literature Review (Feb. 02, 2024)
HiReview HiReview: Hierarchical Taxonomy-Driven Automatic Literature Review Generation (Oct. 02, 2024)
Towards a Unified Framework for Reference Retrieval and Related Work Generation (Dec. 06, 2023)
Automating Research Synthesis with Domain-Specific Large Language Model Fine-Tuning (Apr. 08, 2024)
Reinforced Subject-Aware Graph Neural Network for Related Work Generation (Jul. 26, 2024)
Toward Structured Related Work Generation with Novelty Statements (Jul. 26, 2024)

Drafting and Writing

Generating Scientific Definitions with Controllable Complexity (May. 22, 2022)
SciCap SciCap: Generating Captions for Scientific Figures (Nov. 07, 2021)
CoAuthor CoAuthor: Designing a Human-AI Collaborative Writing Dataset for Exploring Language Model Capabilities (Apr. 29, 2022)
Autonomous LLM-driven research from data to human-verifiable research papers (Apr. 24, 2024)
PaperRobot PaperRobot: Incremental Draft Generation of Scientific Ideas (Jun. 28, 2019)
AutoSurvey AutoSurvey: Large Language Models Can Automatically Write Surveys (Jun. 10, 2024)
AI Scientist The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery (Aug. 12, 2024)
CycleResearcher CycleResearcher: Improving Automated Research via Automated Review (Oct. 28, 2024)

Benchmarks

Enabling Large Language Models to Generate Text with Citations (Dec. 06, 2023)
CiteBench: A Benchmark for Scientific Citation Text Generation (Dec. 06, 2023)
SciGen SciGen: a Dataset for Reasoning-Aware Text Generation from Scientific Tables (May. 23, 2024)
SciXGen SciXGen: A Scientific Paper Dataset for Context-Aware Text Generation (Nov. 7, 2021)

LLMs for Peer Reviewing

LLM-Review-Sys The Emergence of Large Language Models (LLM) as a Tool in Literature Reviews: An LLM Automated Systematic Review (Sep. 6, 2024)
NLP-for-Peer-Review What Can Natural Language Processing Do for Peer Review? (May. 10, 2024)
A Friend or a Foe? Artificial Intelligence in Scientific Writing: A Friend or a Foe? (Apr. 20, 2024)
Increasing-Use-of-LLMs Mapping the Increasing Use of LLMs in Scientific Papers (Apr. 1, 2024)
Monitoring AI-Modified Content Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference Peer Reviews (Mar. 11, 2024)
Emerging Plagiarism Emerging Plagiarism in Peer-Review Evaluation Reports: A Tip of the Iceberg? (Feb. 29, 2024)
Substantiation-Analysis Automatic Analysis of Substantiation in Scientific Peer Reviews (Nov. 20, 2023)
PR4PR Peer Reviews of Peer Reviews: A Randomized Controlled Trial and Other Experiments (Nov. 16, 2023)
Can-LLM-Provide-Useful-Feedback? Can large language models provide useful feedback on research papers? A large-scale empirical analysis (Oct. 3, 2023)
GPT4-Review-Study GPT-4 is Slightly Helpful for Peer-Review Assistance: A Pilot Study (Jun. 16, 2023)

Automated Peer Reviewing Generation

SEA Automated Peer Reviewing in Paper SEA: Standardization, Evaluation, and Analysis (Jul. 9, 2024)
SWIF2T Automated Focused Feedback Generation for Scientific Writing Assistance (May. 30, 2024)
CGI2 Scientific Opinion Summarization: Paper Meta-Review Generation Dataset, Methods, and Evaluation (May. 24, 2024)
LLM-MetaReview Prompting LLMs to Compose Meta-Review Drafts from Peer-Review Narratives of Scholarly Manuscripts (Feb. 23, 2024)
Reviewer2 Reviewer2: Optimizing Review Generation Through Prompt Generation (Feb. 16, 2024)
MARG MARG: Multi-Agent Review Generation for Scientific Papers (Jan. 8, 2024)
ReviewRobot ReviewRobot: Explainable Paper Review Generation Based on Knowledge Synthesis (INLG(ACL)2020) [](https://github.com/EagleW/Review

Peer Reviewing Tools

LLM-assisted Peer Reviewing Workflows

AI-Mediated Peer Review A Critical Examination of the Ethics of AI-Mediated Peer Review (Sep. 2, 2024)
AgentReview AGENTREVIEW: Exploring Peer Review Dynamics with LLM Agents (Jun. 18, 2024)
ReviewerGPT ReviewerGPT? An Exploratory Study on Using Large Language Models for Paper Reviewing (Jun. 1, 2024)
ReviewFlow ReviewFlow: Intelligent Scaffolding to Support Academic Peer Reviewing (Feb. 5, 2024)
HumanInTheLoop-AI-Reviewing Human-in-the-loop AI Reviewing: Feasibility, Opportunities, and Risks (Jan. 1, 2024)
CocoSciSum CocoSciSum: A Scientific Summarization Toolkit with Compositional Controllability (EMNLP2023)
PaperMage PaperMage: A Unified Toolkit for Processing, Representing, and Manipulating Visually-Rich Scientific Documents (ACL2023)
PaperQA2 Language agents achieve superhuman synthesis of scientific knowledge (Sep. 10, 2023)
ChatGPT-Journal-Reviews ChatGPT and the Future of Journal Reviews (Sep. 29, 2023)
CARE CARE: Collaborative AI-Assisted Reading Environment (Feb. 24, 2023)

Benchmarks

CritiqueReview LLMs Assist NLP Researchers: Critique Paper (Meta-)Reviewing (Jun. 24, 2024)
ORSUM Scientific Opinion Summarization: Paper Meta-Review Generation Dataset, Methods, and Evaluation (May. 24, 2024)
RR-MCQ Is LLM a Reliable Reviewer? A Comprehensive Evaluation of LLM on Automatic Paper Reviewing Tasks (ACL2024)
Reviewer2 Reviewer2: Optimizing Review Generation Through Prompt Generation (Feb. 16, 2024)
ASAP-Review Can We Automate Scientific Reviewing? (Jan. 30, 2024)
PeerSum Summarizing Multiple Documents with Conversational Structure for Meta-Review Generation (May. 2, 2023)
MOPRD MOPRD: A Multidisciplinary Open Peer Review Dataset (Dec. 9, 2022)
NLPeer NLPeer: A Unified Resource for the Computational Study of Peer Review (Nov. 12, 2022)
MReD MReD: A Meta-Review Dataset for Structure-Controllable Text Generation (Findings(ACL)2022)
PeerRead A Dataset of Peer Reviews (PeerRead): Collection, Insights and NLP Applications (Apr. 25, 2018)

Citation

If you find this repository useful in your research, please consider citing:

@misc{luo2025llm4srsurveylargelanguage,
      title={LLM4SR: A Survey on Large Language Models for Scientific Research}, 
      author={Ziming Luo and Zonglin Yang and Zexin Xu and Wei Yang and Xinya Du},
      year={2025},
      eprint={2501.04306},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2501.04306}, 
}

Table of Contents

LLMs for Scientific Hypothesis Discovery

LLMs for Experiment Planning and Implementation

Optimizing Experimental Design

Automating Experimental Processes

Data Preparation

Experiment Execution and Workflow Automation

Data Analysis and Interpretation

Benchmarks

LLMs for Scientific Paper Writing

Citation Text Generation

Related Work Generation

Drafting and Writing

Benchmarks

LLMs for Peer Reviewing

Automated Peer Reviewing Generation

Peer Reviewing Tools

LLM-assisted Peer Reviewing Workflows

Benchmarks

Citation
