Per Layer result #9

ironartisan · 2024-12-15T06:38:09Z

Thank you for your impressive work.
As mentioned in Appendix C of your paper, "we assign a distinct pruning ratio for each Transformer block instead of each layer."
However, when I reproduced the per-layer perplexity results with the provided code, I obtained a perplexity of 24.36. It seems that your code also performs pruning on a per-layer basis. Could you clarify whether my understanding is correct?
