Anybody observe slow training speed for mamba compared to transformer model? #14

Open
vgthengane opened this issue May 12, 2024 · 2 comments

Comments

@vgthengane

No description provided.

@d62lu

d62lu commented Jun 25, 2024

Yes, I just made the comparison. The Mamba block is indeed slower than the Transformer under the same input dimension.

@d62lu

d62lu commented Jun 25, 2024

This holds for both training and inference. I have not looked into the details of the Mamba block yet; maybe I missed something in the code.
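
For reference, here is a minimal timing sketch for this kind of comparison, assuming a CUDA GPU, PyTorch, and the `mamba_ssm` package (the `Mamba` constructor and `nn.TransformerEncoderLayer` follow their respective reference APIs; batch size, sequence length, and `d_model` are arbitrary choices):

```python
import time

import torch
import torch.nn as nn
from mamba_ssm import Mamba  # assumes the mamba_ssm package is installed

device = "cuda"
batch, seq_len, d_model = 8, 1024, 512

# Two blocks with the same input dimension, as in the comparison above.
mamba = Mamba(d_model=d_model).to(device)
attn = nn.TransformerEncoderLayer(
    d_model=d_model, nhead=8, batch_first=True
).to(device)

x = torch.randn(batch, seq_len, d_model, device=device)

def bench(module, x, iters=50, warmup=10):
    """Average forward+backward time per iteration, in milliseconds."""
    for _ in range(warmup):
        module(x).sum().backward()
    torch.cuda.synchronize()
    start = time.time()
    for _ in range(iters):
        module(x).sum().backward()
    torch.cuda.synchronize()
    return (time.time() - start) / iters * 1e3

print(f"Mamba:       {bench(mamba, x):.2f} ms/iter")
print(f"Transformer: {bench(attn, x):.2f} ms/iter")
```

One common cause of unexpectedly slow Mamba timings is running without the fused CUDA kernels (e.g. `causal-conv1d` and the selective-scan kernel from the reference repo), in which case the block falls back to a much slower reference path. Relative speed also depends heavily on sequence length, since attention cost grows quadratically while Mamba's grows linearly.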
