Further Readings

This page lists further readings for the course, including related and preliminary material. Because we had limited time to prepare the course, we have certainly missed many excellent resources. Please help us improve and expand the list by opening pull requests! We will actively merge useful contributions, and you will of course be listed as a contributor to the course.

A. Prerequisites

B. Related Topics

C. Corresponding to Each Lecture

L1

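  1. [Video] CS11-747: CMU Natural Language Processing course
  2. [Video] CS224n: Stanford Natural Language Processing course
  3. [Video] A Brief Overview of Large Models
  4. [Web page] Ten Large Models for Natural Language Processing
  5. [Paper] Distributed Representations of Words and Phrases and their Compositionality
  6. [Paper] Survey of foundation models (Fei-Fei Li et al.)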

L2

  1. [Video] CS11-747: CMU Natural Language Processing course
  2. [Video] CS224n: Stanford Natural Language Processing course
  3. [Web page] Official PyTorch tutorials

L3

  1. [Paper list] Must-read papers on pre-trained models
  2. [Paper] Attention Is All You Need (the original Transformer paper)

L4

  1. [Paper list] Prompt Tuning papers
  2. [Paper list] Delta Tuning papers
  3. [Toolkit] OpenPrompt
  4. [Toolkit] OpenDelta

L5

  1. [Paper] BMInf: An Efficient Toolkit for Big Model Inference and Tuning
  2. [Web page] BMTrain: Cutting the Compute Cost of Large-Model Training by 90%
  3. [Web page] Beyond ZeRO: A Brief Look at the Technical Principles Behind BMTrain
  4. [Paper] ZeRO: Memory Optimizations Toward Training Trillion Parameter Models
  5. [Paper] TinyBERT: Distilling BERT for Natural Language Understanding
  6. [Paper] Structured Pruning Learns Compact and Accurate Models
  7. [Paper] MoEfication: Transformer Feed-forward Layers are Mixtures of Experts

L6

  1. Information Retrieval

    1. [Paper] Dense Passage Retrieval for Open-Domain Question Answering
    2. [Paper] Document Ranking with a Pretrained Sequence-to-Sequence Model
    3. [Paper] BEIR: A Heterogeneous Benchmark for Zero-shot Evaluation of Information Retrieval Models
  2. Reading Comprehension / Question Answering

    1. [Paper] SQuAD: 100,000+ Questions for Machine Comprehension of Text
    2. [Paper] Reading Wikipedia to Answer Open-Domain Questions
    3. [Paper] UNIFIEDQA: Crossing Format Boundaries with a Single QA System
    4. [Web page] WebGPT: Browser-assisted question-answering with human feedback
  3. Text Generation

    1. [Paper] Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation
    2. [Paper] BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
    3. [Paper] Prefix-Tuning: Optimizing Continuous Prompts for Generation

L7

  1. [Paper] AlphaFold2: Highly accurate protein structure prediction with AlphaFold
  2. [Paper] Enformer: Effective gene expression prediction from sequence by integrating long-range interactions
  3. [Paper] DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome
  4. [Paper] Biomedical NLP survey: Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing
  5. [Toolkit] TorchDrug

L8

  1. [Paper] How does NLP benefit legal system: A summary of legal artificial intelligence
  2. [Paper] Lawformer: A pre-trained language model for Chinese legal long documents
  3. [Paper] LeCaRD: a legal case retrieval dataset for Chinese law system
  4. [Paper] LEVEN: A Large-Scale Chinese Legal Event Detection Dataset
  5. [Paper] LEGAL-BERT: The muppets straight out of law school

L9

  1. [Paper] MoEfication: Transformer Feed-forward Layers are Mixtures of Experts
  2. [Paper] Finding Experts in Transformer Models
  3. [Paper] Interpreting and improving natural-language processing (in machines) with natural language-processing (in the brain)
  4. [Paper] Exploring Universal Intrinsic Task Subspace via Prompt Tuning
  5. [Paper] On Transferability of Prompt Tuning for Natural Language Understanding