Commit

Ensure logit padding happens on default stream
turboderp committed Aug 27, 2024
1 parent d9f0ecc commit 8d3d4c2
Showing 1 changed file with 3 additions and 0 deletions.
3 changes: 3 additions & 0 deletions exllamav2/model.py
@@ -989,6 +989,9 @@ def forward_chunk(self,
         if self.tp_context:
             self.tp_context.wait_streams()
 
+        if x is not None and x.is_cuda:
+            torch.cuda.set_stream(torch.cuda.default_stream(x.device))
+
         # Apply logit scale
 
         # if x is not None and self.config.logit_scale != 1:
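
The guard added in this commit can be tried in isolation. The following is a minimal sketch, not code from exllamav2: use_default_stream and pad_last_dim are hypothetical helper names, and the zero-padding is only a stand-in for whatever logit padding the model performs after this point. It assumes PyTorch is installed and simply skips the stream switch when no CUDA device is present.

```python
import torch


def use_default_stream(x):
    """Make the default stream of x's device the current stream.

    Mirrors the guard in the diff. Returns True if the switch was made,
    False when x is None or lives on the CPU. (Hypothetical helper.)
    """
    if x is not None and x.is_cuda:
        torch.cuda.set_stream(torch.cuda.default_stream(x.device))
        return True
    return False


def pad_last_dim(x, padded_size):
    """Zero-pad the last dimension up to padded_size.

    Hypothetical stand-in for logit padding, used only for illustration.
    """
    extra = padded_size - x.shape[-1]
    if extra <= 0:
        return x
    return torch.nn.functional.pad(x, (0, extra))


device = torch.device("cuda", 0) if torch.cuda.is_available() else torch.device("cpu")
logits = torch.ones(1, 4, 30, device=device)

if logits.is_cuda:
    # Simulate earlier code that left a side stream as the current stream.
    torch.cuda.set_stream(torch.cuda.Stream(device=logits.device))

# Switch back before doing any further work on the tensor, as the commit does.
switched = use_default_stream(logits)
padded = pad_last_dim(logits, 32)

if logits.is_cuda:
    # The padding above was enqueued on the default stream.
    assert torch.cuda.current_stream(logits.device) == torch.cuda.default_stream(logits.device)

print(switched, tuple(padded.shape))
```

Note that torch.cuda.set_stream only changes which stream later kernels are enqueued on. It does not wait for work already queued on other streams, which is presumably why the original code calls self.tp_context.wait_streams() first.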