
Why are the bit settings in PWLQ so strange? #7

Open
dragonbra opened this issue Apr 19, 2023 · 1 comment

Comments

dragonbra commented Apr 19, 2023

In pwlq.py, lines 60 to 67, the quantization bit-width of the middle region is set to `bits`, while the bit-widths of `tail_neg` and `tail_pos` are each set to `bits - 1`:

```python
## option 2: non-overlapping
if pw_opt == 2:
    qw_tail_neg = uniform_affine_quantizer(w,
        bits=bits-1, scale_bits=scale_bits, minv=-abs_max, maxv=-break_point)
    qw_tail_pos = uniform_affine_quantizer(w,
        bits=bits-1, scale_bits=scale_bits, minv=break_point, maxv=abs_max)
    qw_middle = uniform_symmetric_quantizer(w,
        bits=bits, scale_bits=scale_bits, minv=-break_point, maxv=break_point)

    qw = torch.where(-break_point < w, qw_middle, qw_tail_neg)
    qw = torch.where(break_point > w, qw, qw_tail_pos)
```

Won't the total number of quantization levels over the full range then become 2 * 2 ** bits? Is this intended, or is there something wrong with my understanding?

@spineer10 commented

You are right; I was also confused by it when I first saw the code. But look at Section 3.2 of the paper. The last paragraph says: "We emphasize that b-bit PWLQ represents FP32 values as b-bit integers to support b-bit multiply-accumulate operations, even though in total it has the same number of quantization levels as (b+1)-bit uniform quantization."
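The level count works out as follows: the symmetric middle region contributes 2**bits levels and each (bits-1)-bit tail contributes 2**(bits-1), so the total is 2**bits + 2 * 2**(bits-1) = 2**(bits+1), matching (b+1)-bit uniform quantization. A minimal sketch of that arithmetic (the helper functions below are illustrative, not from the PWLQ repo):

```python
def pwlq_num_levels(bits: int) -> int:
    """Total levels for non-overlapping PWLQ: a `bits`-bit middle
    region plus two (`bits` - 1)-bit tail regions."""
    middle = 2 ** bits           # symmetric middle region
    tail = 2 ** (bits - 1)       # each tail region
    return middle + 2 * tail     # = 2 ** (bits + 1)

def uniform_num_levels(bits: int) -> int:
    """Levels of a plain uniform quantizer at the given bit-width."""
    return 2 ** bits

# b-bit PWLQ has as many levels as (b+1)-bit uniform quantization,
# yet each region only needs b-bit (or narrower) integer arithmetic.
for b in (4, 8):
    assert pwlq_num_levels(b) == uniform_num_levels(b + 1)
```

So the questioner's 2 * 2 ** bits count is correct; the point of the design is that each individual region still fits in b-bit integers for the multiply-accumulate hardware.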
