RoPE implementation with a shakespeare-char-rope test #590

Open: wants to merge 3 commits into base: master


albertvucinovic commented Jan 26, 2025

Without RoPE: [screenshot: 20250124_23h05m01s_grim]

With RoPE: [screenshot: 20250126_12h15m25s_grim]

The model can still be used without RoPE, and by default everything works as before. RoPE replaces the wpe matrix only when the config file sets the use_rope flag. rope_base is also configurable.
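For readers unfamiliar with RoPE, here is a minimal pure-Python sketch of the idea, not the code in this PR. It assumes the interleaved pairing convention (elements 2i and 2i+1 are rotated together) and a rope_base of 10000, which is the commonly used value and not necessarily this PR's default. The helper names rope_rotate and dot are hypothetical. The point it illustrates is that after rotating queries and keys by their positions, the attention score depends only on the relative offset, which is why a learned absolute-position table like wpe is no longer needed.

```python
import math

def rope_rotate(vec, pos, rope_base=10000.0):
    """Rotate each pair (vec[2i], vec[2i+1]) by the angle pos * rope_base**(-2i/d).

    Illustrative only: the pairing convention and default base are assumptions.
    """
    d = len(vec)
    assert d % 2 == 0, "head dimension must be even"
    out = []
    for i in range(d // 2):
        theta = pos * rope_base ** (-2.0 * i / d)  # per-pair rotation angle
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[2 * i], vec[2 * i + 1]
        out.extend([x * c - y * s, x * s + y * c])  # 2D rotation of (x, y)
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Toy query and key vectors for a single head of dimension 4.
q = [0.3, -1.2, 0.7, 0.5]
k = [1.0, 0.4, -0.6, 0.9]

# Same relative offset (3 positions apart) at two different absolute positions.
score_a = dot(rope_rotate(q, 5), rope_rotate(k, 2))
score_b = dot(rope_rotate(q, 13), rope_rotate(k, 10))
print(abs(score_a - score_b) < 1e-9)
```

The two scores agree up to floating-point rounding because each pair is rotated by an angle proportional to its position, so only the difference of positions survives in the dot product.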

Tested on a 4090, so the reported MFU differs from what an A100 would show (the MFU calculation assumes A100 peak throughput).
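To compare the MFU numbers across GPUs, the reported value can be rescaled by the ratio of peak throughputs. The sketch below assumes the estimate divides achieved FLOPS by 312 TFLOPS (the A100 bfloat16 peak, which to my knowledge is what the existing estimate uses). The function name rescale_mfu is hypothetical, and the peak value passed for the other GPU is a placeholder, not a measured or quoted 4090 figure.

```python
# Assumed denominator of the existing MFU estimate: A100 bfloat16 peak FLOPS.
A100_BF16_PEAK_FLOPS = 312e12

def rescale_mfu(reported_mfu, actual_peak_flops,
                assumed_peak_flops=A100_BF16_PEAK_FLOPS):
    """Convert an MFU computed against one peak into MFU against another peak."""
    achieved_flops = reported_mfu * assumed_peak_flops  # undo the original division
    return achieved_flops / actual_peak_flops

# Placeholder peak for illustration only; substitute the real peak of your GPU.
hypothetical_peak = 156e12
print(rescale_mfu(0.25, hypothetical_peak))
```

A GPU with a lower peak than the A100 will show a lower MFU under the A100-based formula than it would against its own peak, so raw MFU values from different cards are not directly comparable.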

albertvucinovic (Author) commented

I haven't checked whether resuming training from a checkpoint works with this change.
