
Question about the optimized rotation matrix for Llama3-70B #11

Open
lsjlsj5846 opened this issue Sep 9, 2024 · 6 comments

Comments

lsjlsj5846 commented Sep 9, 2024

Hello,

I tried to reproduce the results of the paper, and got similar results for Llama2-7B, 13B, 70B, and Llama-3 8B.
However, when I tested Llama3-70B using the optimized rotation matrix you provided [link], the result of RTN was as follows:

| Model | Wikitext-2 PPL (paper-reported) | Mine | Diff. |
|---|---|---|---|
| Llama3-70B | 4.1 | 7.5821 | 3.4821 |
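For reference, a Wikitext-2 perplexity figure like the ones above is just the exponential of the mean per-token negative log-likelihood over the evaluation set; a minimal sketch (the NLL values here are made up for illustration, not taken from any model):

```python
import math

# Perplexity = exp(mean per-token negative log-likelihood).
# A real evaluation would collect one NLL per token from the model;
# the values passed in below are illustrative only.
def perplexity(token_nlls):
    return math.exp(sum(token_nlls) / len(token_nlls))

# A uniform per-token NLL of ln 2 corresponds to a perplexity of 2.0.
print(perplexity([math.log(2.0)] * 8))
```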

I also found that the GPTQ results for Llama3-70B differ from what you reported. (I used the W4A4KV4 rotation matrix for RTN and the W16A4KV4 rotation matrix for GPTQ.)
I suspect the provided rotation matrices for Llama3-70B are somehow wrong. Could you check this issue and provide the correct rotation matrices for Llama3-70B if possible?
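(For context on what these rotation matrices do: SpinQuant folds an orthogonal matrix R into the weights, so the float output is mathematically unchanged, roughly W x == (W R)(R^T x), while activation outliers get spread out; only the quantized behavior differs. A toy pure-Python sketch of that equivalence, with illustrative names not from the codebase:)

```python
import math

# Helper: naive matrix multiply for small nested-list matrices.
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

# A 2x2 Givens rotation is orthogonal: R^T R = I.
theta = math.pi / 4
R = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]
Rt = [[R[j][i] for j in range(2)] for i in range(2)]  # transpose

W = [[1.0, 2.0], [3.0, 4.0]]   # toy weight matrix
x = [[0.5], [-1.0]]            # toy activation vector

# In float, rotating weights and counter-rotating activations is a no-op.
y_plain = matmul(W, x)
y_rotated = matmul(matmul(W, R), matmul(Rt, x))
```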

Thank you.


ChenMnZ commented Sep 9, 2024

Hi, @lsjlsj5846
Have you successfully reproduced the results when taking GPTQ as the weight quantizer?

I also get results similar to the paper for Llama2-7B, 13B, 70B, and Llama-3 8B when taking RTN as the weight quantizer.

However, the GPTQ results I obtained were even worse than RTN.
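(Aside: RTN here means plain round-to-nearest weight quantization, with no error compensation. A minimal sketch of symmetric 4-bit RTN for one weight group, with illustrative names not taken from the SpinQuant codebase:)

```python
# Symmetric per-group round-to-nearest (RTN) quantization sketch.
# n_bits=4 gives integer levels in [-8, 7]; the scale maps the group's
# max magnitude onto qmax. Illustrative only.
def rtn_quantize(weights, n_bits=4):
    qmax = 2 ** (n_bits - 1) - 1                 # 7 for 4-bit symmetric
    scale = max(abs(w) for w in weights) / qmax or 1.0  # guard all-zero group
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return [v * scale for v in q]                # dequantized weights
```

GPTQ, by contrast, adjusts the remaining unquantized weights to compensate for each rounding error, so it is normally expected to be at least as good as RTN.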

lsjlsj5846 (Author) commented:

Hi, @ChenMnZ
Yes, I got GPTQ results similar to the paper, except for Llama3-70B.
Did you use W16A4KV4 rotation matrices?


ChenMnZ commented Sep 9, 2024

@lsjlsj5846 I used the W4A4KV4 pretrained rotation matrices before (https://drive.google.com/drive/folders/1R2zix4qeXBjcmgnJN1rny93cguJ4rEE8?usp=sharing).

Thanks for the reminder; I will give the W16A4KV4 rotation matrix a try.


ChenMnZ commented Sep 9, 2024

@lsjlsj5846 I met the same problem with RTN on Llama3-70B W4A4KV4.


cokeshao commented Sep 26, 2024

Hi, @ChenMnZ
I also got GPTQ results that were different from the paper.

./scripts/2_eval_ptq.sh meta-llama/Llama-2-7b-hf 4 4 4

I also used the provided W16A4KV4 rotation matrix (Google Drive).

Here's what I reproduced.

| Task | Version | Metric | Value | Stderr | In paper |
|---|---|---|---|---|---|
| arc_easy | 0 | acc | 0.6540 | ±0.0098 | 72.6 |
| | | acc_norm | 0.5198 | ±0.0103 | |
| arc_challenge | 0 | acc | 0.3703 | ±0.0141 | 47.5 |
| | | acc_norm | 0.3891 | ±0.0142 | |

There is a big difference. I think the good results on Wikitext may reflect overfitting to Wikitext 🤔.

Have you encountered the same problem as me? I look forward to discussing it with you.
Thank you.


JingyangXiang commented Nov 19, 2024

> I also got GPTQ results that were different from the paper. … There is a big difference. I think the good results on Wikitext are likely to be overfitting on Wikitext 🤔.

I also agree about the overfitting. Maybe SpinQuant is more like LoRA, which tries to fit downstream tasks.

4 participants