Default inject_fused_attention and mlp to True, matching defaults
TheBloke committed Jun 3, 2023
1 parent 4617629 commit edb13d4
Showing 1 changed file with 2 additions and 2 deletions.

examples/benchmark/generation_speed.py
@@ -145,8 +145,8 @@ def load_model_tokenizer(
     use_triton: bool = False,
     use_safetensors: bool = False,
     use_fast_tokenizer: bool = False,
-    inject_fused_attention: bool = False,
-    inject_fused_mlp: bool = False
+    inject_fused_attention: bool = True,
+    inject_fused_mlp: bool = True
 ):
     tokenizer = AutoTokenizer.from_pretrained(
         pretrained_model_name_or_path=tokenizer_name_or_path or model_name_or_path,
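For context, the commit only flips the defaults of the two fused-kernel flags from False to True. A minimal sketch of the revised signature is below; the body is stubbed out for illustration (the real function loads a tokenizer and model via transformers/auto_gptq), and only the parameters visible in this hunk are shown.

```python
import inspect

def load_model_tokenizer(
    use_triton: bool = False,
    use_safetensors: bool = False,
    use_fast_tokenizer: bool = False,
    inject_fused_attention: bool = True,  # was False before this commit
    inject_fused_mlp: bool = True,        # was False before this commit
):
    # Stub body: the real function builds and returns (model, tokenizer).
    pass

# Inspect the defaults to confirm the new behavior: callers that do not
# pass these flags explicitly now get fused attention and MLP injection.
sig = inspect.signature(load_model_tokenizer)
print(sig.parameters["inject_fused_attention"].default)  # True
print(sig.parameters["inject_fused_mlp"].default)        # True
```

With these defaults, a benchmark run that omits both flags measures generation speed with the fused kernels enabled, matching the library's own defaults as the commit message states.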
