
Commit

fix: format
zanussbaum committed Apr 13, 2023
1 parent 9cf38e0 commit 4dd5df1
Showing 1 changed file with 4 additions and 1 deletion.
5 changes: 4 additions & 1 deletion TRAINING_LOG.md
@@ -241,7 +241,10 @@ We tried training a full model using the parameters above, but found that during

### Model Training Divergence

-We trained multiple [GPT-J models](https://huggingface.co/EleutherAI/gpt-j-6b) with varying success. We found that training the full model led to divergence after epoch 1. ![](figs/overfit-gpt-j.png). We release the checkpoint after epoch 1.
+We trained multiple [GPT-J models](https://huggingface.co/EleutherAI/gpt-j-6b) with varying success. We found that training the full model led to divergence after epoch 1. ![](figs/overfit-gpt-j.png)
+
+
+We release the checkpoint after epoch 1.


Using Atlas, we extracted the embeddings of each point in the dataset and calculated the loss per sequence. We then uploaded [this to Atlas](https://atlas.nomic.ai/map/gpt4all-j-post-epoch-1-embeddings) and noticed that the higher-loss items seemed to cluster. On further inspection, the highest-density clusters seemed to be prompt/response pairs that asked for creative generations such as `Generate a story about ...`. ![](figs/clustering_overfit.png)
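
The snippet below is a minimal sketch (not the exact script behind the map above) of how a per-sequence loss and an embedding can be computed with `transformers` and pushed to Atlas via the `nomic` client's `atlas.map_embeddings` call (API as of early 2023). The model id and `texts` list are placeholders standing in for the released epoch-1 checkpoint and the full prompt/response dataset, and mean-pooled hidden states stand in for whatever embedding extraction the original analysis used.

```python
import numpy as np
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer
from nomic import atlas  # requires an Atlas login / API key

# Placeholder model id; substitute the released epoch-1 checkpoint.
model_name = "EleutherAI/gpt-j-6b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16
).cuda().eval()

# Placeholder data; in practice, iterate over the full prompt/response dataset.
texts = ["Generate a story about a robot learning to paint."]

embeddings, records = [], []
with torch.no_grad():
    for text in texts:
        enc = tokenizer(
            text, return_tensors="pt", truncation=True, max_length=1024
        ).to("cuda")
        out = model(**enc, output_hidden_states=True)

        # Per-token cross-entropy (token t predicts token t+1),
        # averaged into a single per-sequence loss.
        logits = out.logits[:, :-1, :]
        labels = enc["input_ids"][:, 1:]
        token_loss = F.cross_entropy(
            logits.reshape(-1, logits.size(-1)), labels.reshape(-1), reduction="none"
        )
        seq_loss = token_loss.mean().item()

        # Mean-pool the final hidden states as a simple sequence embedding.
        emb = out.hidden_states[-1].mean(dim=1).squeeze(0).float().cpu().numpy()

        embeddings.append(emb)
        records.append({"text": text, "loss": seq_loss})

# Build an Atlas map with the per-sequence loss attached to every point.
atlas.map_embeddings(embeddings=np.stack(embeddings), data=records)
```

With the loss stored alongside each point, the resulting map can be filtered to its highest-loss regions, which is how the creative-generation cluster described above stands out.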
