- vllm (Public)
  Forked from vllm-project/vllm: A high-throughput and memory-efficient inference and serving engine for LLMs
  Python · Apache License 2.0 · Updated Oct 23, 2024
- tokenizers (Public)
  Forked from huggingface/tokenizers: 💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
  Rust · Apache License 2.0 · Updated Apr 24, 2024
- tiktoken (Public)
  Forked from openai/tiktoken: tiktoken is a fast BPE tokeniser for use with OpenAI's models.
  Python · MIT License · Updated Apr 12, 2024
- server (Public)
  Forked from triton-inference-server/server: The Triton Inference Server provides an optimized cloud and edge inferencing solution.
  Python · BSD 3-Clause "New" or "Revised" License · Updated Jan 11, 2023
- FasterTransformer (Public)
  Forked from NVIDIA/FasterTransformer: Transformer-related optimizations, including BERT, GPT
  C++ · Apache License 2.0 · Updated Jan 11, 2023
- fastertransformer_backend (Public)
  Forked from triton-inference-server/fastertransformer_backend
  Python · BSD 3-Clause "New" or "Revised" License · Updated Dec 7, 2022
- bertsearch (Public)
  Forked from Hironsan/bertsearch: Elasticsearch with BERT for advanced document search.
  Python · MIT License · Updated Apr 18, 2022
- KLUE (Public)
  Forked from KLUE-benchmark/KLUE: Korean NLU Benchmark
  Creative Commons Attribution Share Alike 4.0 International · Updated Jun 9, 2021