Commit

Update README.md
ilil96 authored May 5, 2024
1 parent 534734c commit b5f5c9a
Showing 1 changed file with 0 additions and 3 deletions.
3 changes: 0 additions & 3 deletions README.md
@@ -199,9 +199,6 @@ The demo script will load the quantized model, and perform inference on a custom
Include 16 to measure the latency of the original model in fp16.
The latency at each precision will be measured and displayed.

Please note that this demo serves as a proof-of-concept.
Further optimizations in the inference pipeline are needed to achieve the best performance of our engine.

The demo will look like this when run properly:

![AnyPrec Latency Demo](https://github.com/SNU-ARC/any-precision-llm/assets/48833786/75a42bea-979a-489f-aee8-89697c55411a)