Question about the optimized rotation matrix for Llama3-70B #11
Comments
Hi @lsjlsj5846, I also got results similar to the paper for Llama2-7B, 13B, 70B, and Llama-3 8B when using RTN as the weight quantizer. However, the GPTQ results I obtained are even worse than the RTN results.
Hi, @ChenMnZ
@lsjlsj5846 I used the W4A4KV4 pretrained rotation matrices before (https://drive.google.com/drive/folders/1R2zix4qeXBjcmgnJN1rny93cguJ4rEE8?usp=sharing). Thanks for the reminder; I will try the W16A4KV4 rotation matrix.
@lsjlsj5846 I ran into the same problem with RTN on Llama3-70B W4A4KV4.
Hi, @ChenMnZ
I also used the provided W16A4KV4 rotation matrix (from the Google Drive link). Here's what I reproduced.
There is a big difference. I think the good results on Wikitext are likely due to overfitting on Wikitext 🤔. Have you encountered the same problem? I look forward to discussing it with you.
I agree that this looks like overfitting. Maybe SpinQuant behaves more like LoRA, which tries to fit downstream tasks.
Hello,
I tried to reproduce the results of the paper, and got similar results for Llama2-7B, 13B, 70B, and Llama-3 8B.
However, when I tested Llama3-70B using the optimized rotation matrix you provided [link], the result of RTN was as follows:
I also found that the GPTQ results for Llama3-70B differ from what you reported. (I used the W4A4KV4 rotation matrix for RTN, and the W16A4KV4 rotation matrix for GPTQ.)
I suspect the provided rotation matrices for Llama3-70B are somehow wrong. Could you check this issue and, if possible, provide the correct rotation matrices for Llama3-70B?
Thank you.
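For anyone else debugging this, one quick sanity check is whether a downloaded matrix is a valid rotation at all, i.e. whether R^T R is close to the identity. The sketch below is only an illustration: `orthogonality_error` is a hypothetical helper, not part of the SpinQuant code, and it works on plain nested lists so it does not assume anything about how the checkpoint is stored. Passing this check would not prove the matrix was optimized for the right model or bit setting; failing it would suggest the file is corrupted or mismatched.

```python
import math


def orthogonality_error(matrix):
    """Return max |(R^T R - I)[i][j]| for a square matrix given as nested lists.

    Hypothetical helper for illustration; a value near 0 means R is orthogonal.
    """
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("rotation matrix must be square")
    worst = 0.0
    for i in range(n):
        for j in range(n):
            # (R^T R)[i][j] is the dot product of columns i and j of R.
            dot = sum(matrix[k][i] * matrix[k][j] for k in range(n))
            target = 1.0 if i == j else 0.0
            worst = max(worst, abs(dot - target))
    return worst


# Toy inputs (assumptions, not real SpinQuant data): a 2x2 rotation by
# 30 degrees, and the same matrix scaled by 2, which is no longer orthogonal.
theta = math.pi / 6
rotation = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]
scaled = [[2.0 * value for value in row] for row in rotation]

print(orthogonality_error(rotation))
print(orthogonality_error(scaled))
```

For a real checkpoint the same idea applies after converting each rotation tensor to a matrix; the double loop here is O(n^3) in pure Python, so a tensor library would be the practical choice at Llama3-70B hidden sizes.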