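diff --git a/NLPinterview/TopicModel/readme.md b/NLPinterview/TopicModel/readme.md
new file mode 100644
index 0000000..cf36b87
--- /dev/null
+++ b/NLPinterview/TopicModel/readme.md
@@ -0,0 +1,26 @@
+# 【About Topic Models】 Things You May Not Know
+
+- [【About Topic Models】 Things You May Not Know](#about-topic-models-things-you-may-not-know)
+  - [1. Tricks](#1-tricks)
+
+
+## 1. Tricks
+
+1. Train the model on a small, high-quality subset of the data to ease single-machine speed and memory limits, then run inference on the remaining or new documents (this step is data-parallel). For example, when training LDA on Weibo posts, first filter out very short posts (one study reports that short texts of length 7 are suitable for LDA training) and near-duplicate posts (reposts produce many almost identical posts). When the training set is large and you are limited to a single machine, GraphLab Create is worth trying; it also supports alias LDA, which samples faster. If the goal is not only to learn the topic distribution of the current corpus but also to predict on new data, then the more data the better.
+2. Remove words whose TF/DF is very low or very high: low-frequency words get smoothed away during fitting, and high-frequency words have no discriminative power. Punctuation, auxiliary words, and modal particles can be removed as well (Chinese has roughly 600,000 commonly used words). For Chinese, consider converting full-width characters to half-width, removing garbled text, and converting Traditional to Simplified; for English, consider case normalization. In practice, the number of distinct words after segmentation easily reaches the millions, and many of them are meaningless: numeric tokens, overly long tokens, and garbled tokens. In addition, if two words co-occur frequently, the segmenter may merge them into one, so does merging affect LDA training? Some words should be merged, such as “北京 大学”, while others are better kept apart, such as “阶级 斗争”.
+3. Merge short texts by context: for example, merge all of a user's Weibo posts into one document, merge similar posts into one document, treat a post as a query and use pseudo-relevance feedback to expand its content (Chinese Weibo posts tend to be longer than tweets; long posts can already be classified correctly without expansion, while short posts may be inherently ambiguous, so expansion does not necessarily help), or treat a post together with its comments as one document. This alleviates the short-text problem to some extent.
+4. Training a topic model is a data-fitting process: it finds the latent topics that maximize the likelihood of the training corpus. When the classes in the data are imbalanced, topics with little data may be dominated by topics with a lot of data. LDA inherently tends to fit high-frequency topics. Most of LDA's odd results are caused by word co-occurrence.
+5. During training, the number of iterations can usually be set to 1000 to 2000, depending on time constraints and hardware. After enough iterations, the objective oscillates around its minimum. LDA's running time depends on the number of documents, the number of distinct words, the document length, and the number of topics.
+6. Choosing K: running LDA for each candidate K and inspecting every topic by eye is the most reliable approach, but it is infeasible when the training set is large. In that case, adjust K based on the similarity between topics, on the assumption that lower similarity between different topics is better (perplexity can also be used; GraphLab Create outputs it directly). A rule of thumb is K × vocabulary size ≈ total number of words in the corpus.
+7. Mining a high-quality dictionary is important: it helps word segmentation and also helps make the latent topics clearer.
+8. With large amounts of data, LDA and PLSA perform about the same, but PLSA is easier to parallelize. The biggest difference is that LDA places a prior on each document's topic distribution: in PLSA the doc-topic distribution is treated as a model parameter, whereas in LDA the only parameter is a hyperparameter and the doc-topic distribution is a latent variable. At prediction time, PLSA computes a likelihood, while LDA has two terms: the prior times the likelihood.
+9. In the text domain, LDA abstracts words into topics. LDA can be used in other tasks in the same way. In credit assessment, we treated each user as a document whose words are the accounts that user follows; the resulting topics correspond to user groups, which amounts to clustering users. Another option is to treat the accounts that are @-mentioned or retweeted (rt) in Weibo posts as words. http://www.machinedlearnings.com/2011/03/lda-on-social-graph.html
+10. How do the hyperparameters \alpha and \beta affect training? The larger \alpha is, the stronger the effect of the prior, and the more the inferred topic distribution tends to put similar probability on every topic. A common empirical choice is \alpha = 50/K, where K is the number of topics; \beta is usually set to 0.01.
+11. The color of a word tends to be similar to other words in the same document.
+12. The color of a word tends to be similar to its major color in the whole corpus.
+13. Should you train a general model on a large dataset, or a specific model for a vertical domain? It depends on whether you want niche topics or popular ones.
+14. Why is maximum likelihood hard to compute for LDA? The model contains two continuous latent variables that must be integrated out, and for each word the probability of every topic generating that word must be considered, so there is also a summation term. Because this conditional distribution is intractable, the plain EM algorithm for latent-variable optimization does not work directly, so EM is usually run with an approximating distribution instead (as in variational EM). Gibbs sampling instead draws samples from p(z|…) to approximate this conditional distribution. After many iterations (one iteration samples each word of a document exactly once), the randomly initialized topic-word matrix and doc-topic matrix settle into a stable distribution close to the true one. Within a document, by the exchangeability of words, the topic draws for different words are also conditionally independent given the document's topic distribution.
+15. For short texts, try TwitterLDA (which assumes each short text is about a single topic): https://github.com/smutahoang/ttm
+
+
+
diff --git a/Trick/EarlyStopping/img/20210523220743.png b/Trick/early_stopping/img/20210523220743.png similarity index 100% rename from Trick/EarlyStopping/img/20210523220743.png rename to Trick/early_stopping/img/20210523220743.png diff --git "a/Trick/EarlyStopping/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210301212242.png" "b/Trick/early_stopping/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210301212242.png" similarity index 100% rename from "Trick/EarlyStopping/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210301212242.png" rename to "Trick/early_stopping/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210301212242.png" diff --git "a/Trick/EarlyStopping/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602203923.png" "b/Trick/early_stopping/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602203923.png" similarity index 100% rename from "Trick/EarlyStopping/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602203923.png" rename to "Trick/early_stopping/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602203923.png" diff --git "a/Trick/EarlyStopping/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204003.png" "b/Trick/early_stopping/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204003.png" similarity index 100% rename from "Trick/EarlyStopping/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204003.png" rename to 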
"Trick/early_stopping/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204003.png" diff --git "a/Trick/EarlyStopping/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204205.png" "b/Trick/early_stopping/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204205.png" similarity index 100% rename from "Trick/EarlyStopping/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204205.png" rename to "Trick/early_stopping/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204205.png" diff --git "a/Trick/EarlyStopping/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204441.png" "b/Trick/early_stopping/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204441.png" similarity index 100% rename from "Trick/EarlyStopping/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204441.png" rename to "Trick/early_stopping/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204441.png" diff --git "a/Trick/EarlyStopping/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204518.png" "b/Trick/early_stopping/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204518.png" similarity index 100% rename from "Trick/EarlyStopping/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204518.png" rename to "Trick/early_stopping/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204518.png" diff --git "a/Trick/EarlyStopping/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204551.png" "b/Trick/early_stopping/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204551.png" similarity index 100% rename from "Trick/EarlyStopping/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204551.png" rename to "Trick/early_stopping/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204551.png" diff --git a/Trick/EarlyStopping/readme.md b/Trick/early_stopping/readme.md similarity index 100% rename from Trick/EarlyStopping/readme.md rename to 
Trick/early_stopping/readme.md diff --git a/Trick/LabelSmoothing/img/20210523220743.png b/Trick/label_smoothing/img/20210523220743.png similarity index 100% rename from Trick/LabelSmoothing/img/20210523220743.png rename to Trick/label_smoothing/img/20210523220743.png diff --git "a/Trick/LabelSmoothing/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210301212242.png" "b/Trick/label_smoothing/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210301212242.png" similarity index 100% rename from "Trick/LabelSmoothing/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210301212242.png" rename to "Trick/label_smoothing/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210301212242.png" diff --git "a/Trick/LabelSmoothing/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602203923.png" "b/Trick/label_smoothing/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602203923.png" similarity index 100% rename from "Trick/LabelSmoothing/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602203923.png" rename to "Trick/label_smoothing/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602203923.png" diff --git "a/Trick/LabelSmoothing/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204003.png" "b/Trick/label_smoothing/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204003.png" similarity index 100% rename from "Trick/LabelSmoothing/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204003.png" rename to "Trick/label_smoothing/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204003.png" diff --git "a/Trick/LabelSmoothing/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204205.png" "b/Trick/label_smoothing/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204205.png" similarity index 100% rename from "Trick/LabelSmoothing/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204205.png" rename to 
"Trick/label_smoothing/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204205.png" diff --git "a/Trick/LabelSmoothing/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204441.png" "b/Trick/label_smoothing/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204441.png" similarity index 100% rename from "Trick/LabelSmoothing/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204441.png" rename to "Trick/label_smoothing/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204441.png" diff --git "a/Trick/LabelSmoothing/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204518.png" "b/Trick/label_smoothing/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204518.png" similarity index 100% rename from "Trick/LabelSmoothing/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204518.png" rename to "Trick/label_smoothing/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204518.png" diff --git "a/Trick/LabelSmoothing/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204551.png" "b/Trick/label_smoothing/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204551.png" similarity index 100% rename from "Trick/LabelSmoothing/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204551.png" rename to "Trick/label_smoothing/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210602204551.png" diff --git a/Trick/LabelSmoothing/readme.md b/Trick/label_smoothing/readme.md similarity index 100% rename from Trick/LabelSmoothing/readme.md rename to Trick/label_smoothing/readme.md diff --git a/Trick/SmallSampleProblem/AdversarialTraining/AdversarialTraining.md b/Trick/small_sample_problem/AdversarialTraining/AdversarialTraining.md similarity index 100% rename from Trick/SmallSampleProblem/AdversarialTraining/AdversarialTraining.md rename to Trick/small_sample_problem/AdversarialTraining/AdversarialTraining.md diff --git 
"a/Trick/SmallSampleProblem/AdversarialTraining/img/QQ\346\210\252\345\233\27620201230214346.png" "b/Trick/small_sample_problem/AdversarialTraining/img/QQ\346\210\252\345\233\27620201230214346.png" similarity index 100% rename from "Trick/SmallSampleProblem/AdversarialTraining/img/QQ\346\210\252\345\233\27620201230214346.png" rename to "Trick/small_sample_problem/AdversarialTraining/img/QQ\346\210\252\345\233\27620201230214346.png" diff --git "a/Trick/SmallSampleProblem/AdversarialTraining/img/QQ\346\210\252\345\233\27620201230214955.png" "b/Trick/small_sample_problem/AdversarialTraining/img/QQ\346\210\252\345\233\27620201230214955.png" similarity index 100% rename from "Trick/SmallSampleProblem/AdversarialTraining/img/QQ\346\210\252\345\233\27620201230214955.png" rename to "Trick/small_sample_problem/AdversarialTraining/img/QQ\346\210\252\345\233\27620201230214955.png" diff --git "a/Trick/SmallSampleProblem/AdversarialTraining/img/QQ\346\210\252\345\233\27620201230215334.png" "b/Trick/small_sample_problem/AdversarialTraining/img/QQ\346\210\252\345\233\27620201230215334.png" similarity index 100% rename from "Trick/SmallSampleProblem/AdversarialTraining/img/QQ\346\210\252\345\233\27620201230215334.png" rename to "Trick/small_sample_problem/AdversarialTraining/img/QQ\346\210\252\345\233\27620201230215334.png" diff --git "a/Trick/SmallSampleProblem/AdversarialTraining/img/QQ\346\210\252\345\233\27620201230215453.png" "b/Trick/small_sample_problem/AdversarialTraining/img/QQ\346\210\252\345\233\27620201230215453.png" similarity index 100% rename from "Trick/SmallSampleProblem/AdversarialTraining/img/QQ\346\210\252\345\233\27620201230215453.png" rename to "Trick/small_sample_problem/AdversarialTraining/img/QQ\346\210\252\345\233\27620201230215453.png" diff --git "a/Trick/SmallSampleProblem/AdversarialTraining/img/QQ\346\210\252\345\233\27620201230215528.png" "b/Trick/small_sample_problem/AdversarialTraining/img/QQ\346\210\252\345\233\27620201230215528.png" similarity 
index 100% rename from "Trick/SmallSampleProblem/AdversarialTraining/img/QQ\346\210\252\345\233\27620201230215528.png" rename to "Trick/small_sample_problem/AdversarialTraining/img/QQ\346\210\252\345\233\27620201230215528.png" diff --git a/Trick/SmallSampleProblem/AdversarialTraining/img/eda.drawio b/Trick/small_sample_problem/AdversarialTraining/img/eda.drawio similarity index 100% rename from Trick/SmallSampleProblem/AdversarialTraining/img/eda.drawio rename to Trick/small_sample_problem/AdversarialTraining/img/eda.drawio diff --git "a/Trick/SmallSampleProblem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20201231235616.png" "b/Trick/small_sample_problem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20201231235616.png" similarity index 100% rename from "Trick/SmallSampleProblem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20201231235616.png" rename to "Trick/small_sample_problem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20201231235616.png" diff --git "a/Trick/SmallSampleProblem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210102160355.png" "b/Trick/small_sample_problem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210102160355.png" similarity index 100% rename from "Trick/SmallSampleProblem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210102160355.png" rename to "Trick/small_sample_problem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210102160355.png" diff --git "a/Trick/SmallSampleProblem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210102161002.png" "b/Trick/small_sample_problem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210102161002.png" similarity index 100% rename from 
"Trick/SmallSampleProblem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210102161002.png" rename to "Trick/small_sample_problem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210102161002.png" diff --git "a/Trick/SmallSampleProblem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210102161058.png" "b/Trick/small_sample_problem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210102161058.png" similarity index 100% rename from "Trick/SmallSampleProblem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210102161058.png" rename to "Trick/small_sample_problem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210102161058.png" diff --git "a/Trick/SmallSampleProblem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210102161342.png" "b/Trick/small_sample_problem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210102161342.png" similarity index 100% rename from "Trick/SmallSampleProblem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210102161342.png" rename to "Trick/small_sample_problem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210102161342.png" diff --git "a/Trick/SmallSampleProblem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117001405.png" "b/Trick/small_sample_problem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117001405.png" similarity index 100% rename from "Trick/SmallSampleProblem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117001405.png" rename to "Trick/small_sample_problem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117001405.png" diff --git "a/Trick/SmallSampleProblem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117001650.png" 
"b/Trick/small_sample_problem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117001650.png" similarity index 100% rename from "Trick/SmallSampleProblem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117001650.png" rename to "Trick/small_sample_problem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117001650.png" diff --git "a/Trick/SmallSampleProblem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117001955.png" "b/Trick/small_sample_problem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117001955.png" similarity index 100% rename from "Trick/SmallSampleProblem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117001955.png" rename to "Trick/small_sample_problem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117001955.png" diff --git "a/Trick/SmallSampleProblem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117002233.png" "b/Trick/small_sample_problem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117002233.png" similarity index 100% rename from "Trick/SmallSampleProblem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117002233.png" rename to "Trick/small_sample_problem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117002233.png" diff --git "a/Trick/SmallSampleProblem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117002650.png" "b/Trick/small_sample_problem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117002650.png" similarity index 100% rename from "Trick/SmallSampleProblem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117002650.png" rename to 
"Trick/small_sample_problem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117002650.png" diff --git "a/Trick/SmallSampleProblem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210210200952.png" "b/Trick/small_sample_problem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210210200952.png" similarity index 100% rename from "Trick/SmallSampleProblem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210210200952.png" rename to "Trick/small_sample_problem/AdversarialTraining/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210210200952.png" diff --git "a/Trick/SmallSampleProblem/AdversarialTraining/img/\346\225\260\346\215\256\345\242\236\345\274\272 EDA.xmind" "b/Trick/small_sample_problem/AdversarialTraining/img/\346\225\260\346\215\256\345\242\236\345\274\272 EDA.xmind" similarity index 100% rename from "Trick/SmallSampleProblem/AdversarialTraining/img/\346\225\260\346\215\256\345\242\236\345\274\272 EDA.xmind" rename to "Trick/small_sample_problem/AdversarialTraining/img/\346\225\260\346\215\256\345\242\236\345\274\272 EDA.xmind" diff --git "a/Trick/SmallSampleProblem/AdversarialTraining/img/\346\225\260\346\215\256\345\242\236\345\274\272EDA.png" "b/Trick/small_sample_problem/AdversarialTraining/img/\346\225\260\346\215\256\345\242\236\345\274\272EDA.png" similarity index 100% rename from "Trick/SmallSampleProblem/AdversarialTraining/img/\346\225\260\346\215\256\345\242\236\345\274\272EDA.png" rename to "Trick/small_sample_problem/AdversarialTraining/img/\346\225\260\346\215\256\345\242\236\345\274\272EDA.png" diff --git "a/Trick/SmallSampleProblem/AdversarialTraining/xmind/\343\200\220\345\205\263\344\272\216 \346\225\260\346\215\256\345\242\236\345\274\272 \344\271\213 \345\257\271\346\212\227\350\256\255\347\273\203\343\200\221 \351\202\243\344\272\233\344\275\240\344\270\215\347\237\245\351\201\223\347\232\204\344\272\213.xmind" 
"b/Trick/small_sample_problem/AdversarialTraining/xmind/\343\200\220\345\205\263\344\272\216 \346\225\260\346\215\256\345\242\236\345\274\272 \344\271\213 \345\257\271\346\212\227\350\256\255\347\273\203\343\200\221 \351\202\243\344\272\233\344\275\240\344\270\215\347\237\245\351\201\223\347\232\204\344\272\213.xmind" similarity index 100% rename from "Trick/SmallSampleProblem/AdversarialTraining/xmind/\343\200\220\345\205\263\344\272\216 \346\225\260\346\215\256\345\242\236\345\274\272 \344\271\213 \345\257\271\346\212\227\350\256\255\347\273\203\343\200\221 \351\202\243\344\272\233\344\275\240\344\270\215\347\237\245\351\201\223\347\232\204\344\272\213.xmind" rename to "Trick/small_sample_problem/AdversarialTraining/xmind/\343\200\220\345\205\263\344\272\216 \346\225\260\346\215\256\345\242\236\345\274\272 \344\271\213 \345\257\271\346\212\227\350\256\255\347\273\203\343\200\221 \351\202\243\344\272\233\344\275\240\344\270\215\347\237\245\351\201\223\347\232\204\344\272\213.xmind" diff --git a/Trick/SmallSampleProblem/EDA/eda.drawio b/Trick/small_sample_problem/EDA/eda.drawio similarity index 100% rename from Trick/SmallSampleProblem/EDA/eda.drawio rename to Trick/small_sample_problem/EDA/eda.drawio diff --git a/Trick/SmallSampleProblem/EDA/eda.md b/Trick/small_sample_problem/EDA/eda.md similarity index 100% rename from Trick/SmallSampleProblem/EDA/eda.md rename to Trick/small_sample_problem/EDA/eda.md diff --git "a/Trick/SmallSampleProblem/EDA/img/QQ\346\210\252\345\233\27620201230214346.png" "b/Trick/small_sample_problem/EDA/img/QQ\346\210\252\345\233\27620201230214346.png" similarity index 100% rename from "Trick/SmallSampleProblem/EDA/img/QQ\346\210\252\345\233\27620201230214346.png" rename to "Trick/small_sample_problem/EDA/img/QQ\346\210\252\345\233\27620201230214346.png" diff --git "a/Trick/SmallSampleProblem/EDA/img/QQ\346\210\252\345\233\27620201230214955.png" "b/Trick/small_sample_problem/EDA/img/QQ\346\210\252\345\233\27620201230214955.png" similarity 
index 100% rename from "Trick/SmallSampleProblem/EDA/img/QQ\346\210\252\345\233\27620201230214955.png" rename to "Trick/small_sample_problem/EDA/img/QQ\346\210\252\345\233\27620201230214955.png" diff --git "a/Trick/SmallSampleProblem/EDA/img/QQ\346\210\252\345\233\27620201230215334.png" "b/Trick/small_sample_problem/EDA/img/QQ\346\210\252\345\233\27620201230215334.png" similarity index 100% rename from "Trick/SmallSampleProblem/EDA/img/QQ\346\210\252\345\233\27620201230215334.png" rename to "Trick/small_sample_problem/EDA/img/QQ\346\210\252\345\233\27620201230215334.png" diff --git "a/Trick/SmallSampleProblem/EDA/img/QQ\346\210\252\345\233\27620201230215453.png" "b/Trick/small_sample_problem/EDA/img/QQ\346\210\252\345\233\27620201230215453.png" similarity index 100% rename from "Trick/SmallSampleProblem/EDA/img/QQ\346\210\252\345\233\27620201230215453.png" rename to "Trick/small_sample_problem/EDA/img/QQ\346\210\252\345\233\27620201230215453.png" diff --git "a/Trick/SmallSampleProblem/EDA/img/QQ\346\210\252\345\233\27620201230215528.png" "b/Trick/small_sample_problem/EDA/img/QQ\346\210\252\345\233\27620201230215528.png" similarity index 100% rename from "Trick/SmallSampleProblem/EDA/img/QQ\346\210\252\345\233\27620201230215528.png" rename to "Trick/small_sample_problem/EDA/img/QQ\346\210\252\345\233\27620201230215528.png" diff --git a/Trick/SmallSampleProblem/EDA/img/eda.drawio b/Trick/small_sample_problem/EDA/img/eda.drawio similarity index 100% rename from Trick/SmallSampleProblem/EDA/img/eda.drawio rename to Trick/small_sample_problem/EDA/img/eda.drawio diff --git "a/Trick/SmallSampleProblem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20201231235616.png" "b/Trick/small_sample_problem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20201231235616.png" similarity index 100% rename from "Trick/SmallSampleProblem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20201231235616.png" rename to 
"Trick/small_sample_problem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20201231235616.png" diff --git "a/Trick/SmallSampleProblem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210102160355.png" "b/Trick/small_sample_problem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210102160355.png" similarity index 100% rename from "Trick/SmallSampleProblem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210102160355.png" rename to "Trick/small_sample_problem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210102160355.png" diff --git "a/Trick/SmallSampleProblem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210102161002.png" "b/Trick/small_sample_problem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210102161002.png" similarity index 100% rename from "Trick/SmallSampleProblem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210102161002.png" rename to "Trick/small_sample_problem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210102161002.png" diff --git "a/Trick/SmallSampleProblem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210102161058.png" "b/Trick/small_sample_problem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210102161058.png" similarity index 100% rename from "Trick/SmallSampleProblem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210102161058.png" rename to "Trick/small_sample_problem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210102161058.png" diff --git "a/Trick/SmallSampleProblem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210102161342.png" "b/Trick/small_sample_problem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210102161342.png" similarity index 100% rename from "Trick/SmallSampleProblem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210102161342.png" rename to 
"Trick/small_sample_problem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210102161342.png" diff --git "a/Trick/SmallSampleProblem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117001405.png" "b/Trick/small_sample_problem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117001405.png" similarity index 100% rename from "Trick/SmallSampleProblem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117001405.png" rename to "Trick/small_sample_problem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117001405.png" diff --git "a/Trick/SmallSampleProblem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117001650.png" "b/Trick/small_sample_problem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117001650.png" similarity index 100% rename from "Trick/SmallSampleProblem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117001650.png" rename to "Trick/small_sample_problem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117001650.png" diff --git "a/Trick/SmallSampleProblem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117001955.png" "b/Trick/small_sample_problem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117001955.png" similarity index 100% rename from "Trick/SmallSampleProblem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117001955.png" rename to "Trick/small_sample_problem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117001955.png" diff --git "a/Trick/SmallSampleProblem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117002233.png" "b/Trick/small_sample_problem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117002233.png" similarity index 100% rename from "Trick/SmallSampleProblem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117002233.png" rename to 
"Trick/small_sample_problem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117002233.png" diff --git "a/Trick/SmallSampleProblem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117002650.png" "b/Trick/small_sample_problem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117002650.png" similarity index 100% rename from "Trick/SmallSampleProblem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117002650.png" rename to "Trick/small_sample_problem/EDA/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210117002650.png" diff --git "a/Trick/SmallSampleProblem/EDA/img/\346\225\260\346\215\256\345\242\236\345\274\272 EDA.xmind" "b/Trick/small_sample_problem/EDA/img/\346\225\260\346\215\256\345\242\236\345\274\272 EDA.xmind" similarity index 100% rename from "Trick/SmallSampleProblem/EDA/img/\346\225\260\346\215\256\345\242\236\345\274\272 EDA.xmind" rename to "Trick/small_sample_problem/EDA/img/\346\225\260\346\215\256\345\242\236\345\274\272 EDA.xmind" diff --git "a/Trick/SmallSampleProblem/EDA/img/\346\225\260\346\215\256\345\242\236\345\274\272EDA.png" "b/Trick/small_sample_problem/EDA/img/\346\225\260\346\215\256\345\242\236\345\274\272EDA.png" similarity index 100% rename from "Trick/SmallSampleProblem/EDA/img/\346\225\260\346\215\256\345\242\236\345\274\272EDA.png" rename to "Trick/small_sample_problem/EDA/img/\346\225\260\346\215\256\345\242\236\345\274\272EDA.png" diff --git a/Trick/SmallSampleProblem/activeLearn/img/20201026103601.png b/Trick/small_sample_problem/activeLearn/img/20201026103601.png similarity index 100% rename from Trick/SmallSampleProblem/activeLearn/img/20201026103601.png rename to Trick/small_sample_problem/activeLearn/img/20201026103601.png diff --git a/Trick/SmallSampleProblem/activeLearn/img/20201026103839.png b/Trick/small_sample_problem/activeLearn/img/20201026103839.png similarity index 100% rename from Trick/SmallSampleProblem/activeLearn/img/20201026103839.png rename to 
Trick/small_sample_problem/activeLearn/img/20201026103839.png diff --git a/Trick/SmallSampleProblem/activeLearn/img/20201026104224.png b/Trick/small_sample_problem/activeLearn/img/20201026104224.png similarity index 100% rename from Trick/SmallSampleProblem/activeLearn/img/20201026104224.png rename to Trick/small_sample_problem/activeLearn/img/20201026104224.png diff --git a/Trick/SmallSampleProblem/activeLearn/img/20201026104844.png b/Trick/small_sample_problem/activeLearn/img/20201026104844.png similarity index 100% rename from Trick/SmallSampleProblem/activeLearn/img/20201026104844.png rename to Trick/small_sample_problem/activeLearn/img/20201026104844.png diff --git a/Trick/SmallSampleProblem/activeLearn/img/20201026112155.png b/Trick/small_sample_problem/activeLearn/img/20201026112155.png similarity index 100% rename from Trick/SmallSampleProblem/activeLearn/img/20201026112155.png rename to Trick/small_sample_problem/activeLearn/img/20201026112155.png diff --git "a/Trick/SmallSampleProblem/activeLearn/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210108222507.png" "b/Trick/small_sample_problem/activeLearn/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210108222507.png" similarity index 100% rename from "Trick/SmallSampleProblem/activeLearn/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210108222507.png" rename to "Trick/small_sample_problem/activeLearn/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210108222507.png" diff --git "a/Trick/SmallSampleProblem/activeLearn/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210210194424.png" "b/Trick/small_sample_problem/activeLearn/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210210194424.png" similarity index 100% rename from "Trick/SmallSampleProblem/activeLearn/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210210194424.png" rename to "Trick/small_sample_problem/activeLearn/img/\345\276\256\344\277\241\346\210\252\345\233\276_20210210194424.png" diff --git 
a/Trick/SmallSampleProblem/activeLearn/readme.md b/Trick/small_sample_problem/activeLearn/readme.md similarity index 100% rename from Trick/SmallSampleProblem/activeLearn/readme.md rename to Trick/small_sample_problem/activeLearn/readme.md diff --git "a/Trick/SmallSampleProblem/activeLearn/xmind/\343\200\220\345\205\263\344\272\216 \344\270\273\345\212\250\345\255\246\344\271\240\343\200\221 \351\202\243\344\272\233\344\275\240\344\270\215\347\237\245\351\201\223\347\232\204\344\272\213.xmind" "b/Trick/small_sample_problem/activeLearn/xmind/\343\200\220\345\205\263\344\272\216 \344\270\273\345\212\250\345\255\246\344\271\240\343\200\221 \351\202\243\344\272\233\344\275\240\344\270\215\347\237\245\351\201\223\347\232\204\344\272\213.xmind" similarity index 100% rename from "Trick/SmallSampleProblem/activeLearn/xmind/\343\200\220\345\205\263\344\272\216 \344\270\273\345\212\250\345\255\246\344\271\240\343\200\221 \351\202\243\344\272\233\344\275\240\344\270\215\347\237\245\351\201\223\347\232\204\344\272\213.xmind" rename to "Trick/small_sample_problem/activeLearn/xmind/\343\200\220\345\205\263\344\272\216 \344\270\273\345\212\250\345\255\246\344\271\240\343\200\221 \351\202\243\344\272\233\344\275\240\344\270\215\347\237\245\351\201\223\347\232\204\344\272\213.xmind"
diff --git a/Trick/warm_up/readme.md b/Trick/warm_up/readme.md
new file mode 100644
index 0000000..e4cf368
--- /dev/null
+++ b/Trick/warm_up/readme.md
@@ -0,0 +1,49 @@
+# 【About Warm up】 Things You May Not Know
+
+> Author: 杨夕
+>
+> Paper study project: https://github.com/km1994/nlp_paper_study
+>
+> 《NLP 百面百搭》: https://github.com/km1994/NLP-Interview-Notes
+>
+> About me: Hello everyone, I am 杨夕. This project mainly records what I have seen, thought, and learned while studying top-conference papers and reproducing classic papers. It may contain some misunderstandings, and corrections are very welcome.
+>
+
+![](img/微信截图_20210301212242.png)
+
+> NLP && recommendation study group 【the group is full; add WeChat blqkm601】
+
+![](img/20210523220743.png)
+
+- [【About Warm up】 Things You May Not Know](#about-warm-up-things-you-may-not-know)
+  - [1. What Is Warm up?](#1-what-is-warm-up)
+  - [2. Why Is Warm up Needed?](#2-why-is-warm-up-needed)
+  - [References](#references)
+
+## 1. What Is Warm up?
+
+Warm up is a learning-rate warm-up method mentioned in the ResNet paper. Training starts with a small learning rate for some epochs or steps (for example 4 epochs or 10000 steps), and then switches to the preset learning rate for the rest of training.
+
+## 2. Why Is Warm up Needed?
+
+- **At the start of training, the model weights change rapidly.** At first the model knows nothing about the data "distribution", or equivalently assumes a "uniform distribution" (this depends on your initialization, of course). In the first epoch every data point is new to the model, so the model quickly corrects its view of the data distribution. **If the learning rate is already large at this point, the model is very likely to "overfit" the early data, and it then takes many epochs to pull it back, which wastes time.** After training for a while (say two or three epochs), the model has seen each data point several times, or in other words has a reasonable prior for the current batch, so a larger learning rate is less likely to lead it astray and the learning rate can be raised. This process can be viewed as warm up. Why decay afterwards? Once the model has trained to a certain stage (say ten epochs), its distribution is fairly fixed and there is little new left to learn. Keeping a large learning rate would disrupt this stability. In common terms, the model is already close to a local optimum of the loss, and to approach that point we need to take small steps.
+
+- **The mini-batch size is small and the sample variance is large.** This case is closely tied to the first. During training, **if the data within a mini-batch has very high variance, the model's updates fluctuate sharply and the learned weights become unstable.** This is most pronounced early in training and eases in later stages (which is also why we scale the data).
+
+
+
+
+## References
+
+1. [神经网络中 warmup 策略为什么有效;有什么理论解释么?](https://www.zhihu.com/question/338066667)
+2. [AdaBelief-更稳定的优化器](https://xv44586.github.io/2020/10/25/adabelief/)
+3. [深度神经网络模型训练中的最新 tricks 总结【原理与代码汇总】](https://bbs.cvmart.net/articles/3320/vote_count?)
+4. [【基础知识】Warmup预热学习率_菜鸟起飞-程序员宅基地](http://www.cxyzjd.com/article/nefetaria/110212564)
+5. [”预热学习率“ 的搜索结果](http://www.cxyzjd.com/searchArticle?qc=%E9%A2%84%E7%83%AD%E5%AD%A6%E4%B9%A0%E7%8E%87&page=1)
+6. [ICML 2020 | 摆脱warm-up!巧置LayerNorm使Transformer加速收敛](https://www.msra.cn/zh-cn/news/features/pre-ln-transformer)
+7. [深度學習Warm up策略在幹什麼?](https://chih-sheng-huang821.medium.com/%E6%B7%B1%E5%BA%A6%E5%AD%B8%E7%BF%92warm-up%E7%AD%96%E7%95%A5%E5%9C%A8%E5%B9%B9%E4%BB%80%E9%BA%BC-95d2b56a557f)
+8. [深度学习深度学习模型训练的tricks总结](https://www.codenong.com/cs105809498/)
+
+
+
+
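The warm-up rule described in the readme above (a small learning rate for the first steps, then the preset one) can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the function names `constant_warmup_lr` and `linear_warmup_lr` and the default values (`base_lr=0.1`, `start_lr=0.01`, `warmup_steps=10000`) are hypothetical and chosen for the example, not taken from the readme. The linear ramp is a commonly used variant, not the rule the readme itself states.

```python
def constant_warmup_lr(step, base_lr=0.1, start_lr=0.01, warmup_steps=10000):
    """Constant warm-up: use a small rate for the first warmup_steps, then base_lr.

    All default values are illustrative assumptions.
    """
    return start_lr if step < warmup_steps else base_lr


def linear_warmup_lr(step, base_lr=0.1, start_lr=0.01, warmup_steps=10000):
    """Linear warm-up (assumed variant): ramp from start_lr up to base_lr."""
    if warmup_steps <= 0 or step >= warmup_steps:
        return base_lr
    fraction = step / warmup_steps  # progress through the warm-up phase, in [0, 1)
    return start_lr + fraction * (base_lr - start_lr)


if __name__ == "__main__":
    # Print both schedules at a few steps to compare them side by side.
    for step in (0, 5000, 10000, 20000):
        print(step, constant_warmup_lr(step), linear_warmup_lr(step))
```

The constant version jumps from the small rate to the preset rate in one step, while the linear version avoids that jump; which one behaves better likely depends on the model and batch size, so treat both as starting points.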