Pull requests: NVIDIA/FasterTransformer
Pull requests list
fix: fix position_encoding_table memory error.
#791
opened Mar 27, 2024 by
johnson-magic
•
Review required
Fix shape mismatch on the masked_tokens param in decoder masked multi-head attention kernel.
#773
opened Oct 24, 2023 by
FengDSP
•
Review required
[BugFix] GPT inference error when pipeline_para_size > 1 and int8_mode != 0
#750
opened Aug 23, 2023 by
00why00
•
Review required
[Bugfix] GptJ & GptNeoX batch inference error
#742
opened Aug 11, 2023 by
YZP17121579
•
Review required
[Doc] Add projects section in README which is developed based on FasterTransformer
#731
opened Jul 25, 2023 by
lvhan028
•
Review required
Add triton fastertransformer backend support for deberta
#725
opened Jul 19, 2023 by
sfc-gh-zhwang
•
Review required
Add cuDNN include path as a common include dir
#724
opened Jul 18, 2023 by
jacobkahn
•
Review required
fix: initialize tiled_prompt_lengths_buf_ to zero in gptneox
#716
opened Jul 13, 2023 by
yandai
•
Review required
Huggingface gptj convert script supports sharded checkpoint
#695
opened Jun 29, 2023 by
skyser2003
•
Review required
swin-transformer quantization readme files changes
#675
opened Jun 16, 2023 by
Mhhhaster
•
Review required
fix: fix Qk_vec_acum_fp32_ has already been declared
#659
opened Jun 9, 2023 by
lkm2835
•
Review required
gptneox & gptj int8 quantization & share context
#653
opened Jun 7, 2023 by
rahuan
•
Review required
Update gpt_guide.md: documentation link is invalid
#620
opened May 22, 2023 by
treycheng
•
Review required