Stars
[NeurIPS ENLSP Workshop'24] CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios
A tool for visualizing attention-score heatmaps in generative LLMs
Tensors and Dynamic neural networks in Python with strong GPU acceleration