
Commit

typo fix
www committed Jul 5, 2022
1 parent 8556d0f commit 8d78020
Showing 2 changed files with 2 additions and 2 deletions.
2 changes: 1 addition & 1 deletion RWKV-v3/src/model_run.py
@@ -115,7 +115,7 @@ def forward(self, x):
         if self.layer_id == 0:
             x = self.ln0(x)
         if self.layer_id == 0 and RWKV_CFG.model_type == 'RWKV-ffnPre':
-            x = x + self.ffnPre(x)
+            x = x + self.ffnPre(self.ln1(x))
         else:
             x = x + self.att(self.ln1(x))
         x = x + self.ffn(self.ln2(x))
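
The one-line change above makes the 'RWKV-ffnPre' branch consistent with the other branches: ffnPre now receives the LayerNorm-ed input self.ln1(x) rather than the raw residual stream, just as self.att does. A minimal sketch of the corrected forward pass, using only the names visible in the diff (ln0, ln1, ln2, att, ffn, ffnPre, RWKV_CFG); the rest of the Block class is assumed and not shown:

# Sketch only: the surrounding Block class, its __init__ and RWKV_CFG are assumed from the repo.
def forward(self, x):
    if self.layer_id == 0:
        x = self.ln0(x)                   # extra LayerNorm applied only in the first block
    if self.layer_id == 0 and RWKV_CFG.model_type == 'RWKV-ffnPre':
        x = x + self.ffnPre(self.ln1(x))  # fixed: ffnPre sees the normalized input, like att below
    else:
        x = x + self.att(self.ln1(x))     # time-mixing on the normalized residual stream
    x = x + self.ffn(self.ln2(x))         # channel-mixing on the normalized residual stream
    return x
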
2 changes: 1 addition & 1 deletion RWKV-v3/train.py
@@ -59,7 +59,7 @@
 # Let's say you will train a L6-D512 model.
 # 1) Set lr_init = lr_final = 8e-4. Let it run for some mini-epochs, until the improvement of loss becomes slow.
 # 2) Check epoch_save_frequency and make sure the partially-trained model is saved. Ctrl+C to stop the run.
-# 3) Set lr_init = 8e-4, lr_final = 1e-5, warmup_tokens = ctx_len * batch_size * 50, betas = (0.9, 0.999)
+# 3) Set lr_init = 8e-4, lr_final = 1e-5, warmup_tokens = ctx_len * batch_size * 50, betas = (0.9, 0.999).
 # 4) Search for "torch.load" here and modify it to load the partially-trained model. Continue the training.
 #
 # For L12-D768, set lr_init = 6e-4. For L24-D1024, set lr_init = 4e-4. For L24-D2048, set lr_init = 3e-4.
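
The comment block above is a recipe for resuming a partially-trained model with a decaying learning rate. A minimal sketch of what steps 3) and 4) amount to, assuming the names quoted in the comment (lr_init, lr_final, warmup_tokens, ctx_len, batch_size, betas); the example values for ctx_len and batch_size, the checkpoint filename and the load_partial helper are hypothetical, so treat this as an illustration of the recipe rather than the script's actual code:

import torch

ctx_len = 1024                              # example value, not taken from the diff
batch_size = 12                             # example value, not taken from the diff

# Step 3): resume at the plateau learning rate and decay toward a small final value.
lr_init = 8e-4
lr_final = 1e-5
warmup_tokens = ctx_len * batch_size * 50   # brief re-warmup after resuming
betas = (0.9, 0.999)                        # Adam betas suggested for this stage

# Step 4): load the partially-trained weights before continuing the run.
def load_partial(model, path='trained-1.pth'):  # hypothetical helper name and filename
    state = torch.load(path, map_location='cpu')
    model.load_state_dict(state)
    return model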
