Wonderful-Me
🎯 Focusing
  • Rice University
  • Houston, United States

87 results for source starred repositories

Run Llama and other large language models offline on iOS and macOS using the GGML library.

Swift 1,597 110 Updated Jan 27, 2025

Universal LLM Deployment Engine with ML Compilation

Python 20,006 1,666 Updated Feb 12, 2025

FedML - The Research and Production Integrated Federated Learning Library: https://fedml.ai

1,942 331 Updated Sep 3, 2022

MoBA: Mixture of Block Attention for Long-Context LLMs

Python 736 35 Updated Feb 19, 2025

📚FFPA: Yet another Faster Flash Prefill Attention with O(1)⚡️SRAM complexity for headdim > 256, 1.8x~3x↑🎉faster than SDPA EA.

Cuda 106 5 Updated Feb 19, 2025

ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization

Python 102 16 Updated Oct 15, 2024

Low-bit LLM inference on CPU with lookup table

C++ 681 52 Updated Jan 9, 2025

On-device AI across mobile, embedded and edge for PyTorch

C++ 2,519 449 Updated Feb 19, 2025

CoreNet: A library for training deep neural networks

Jupyter Notebook 7,004 546 Updated Oct 14, 2024

MLX: An array framework for Apple silicon

C++ 19,164 1,096 Updated Feb 19, 2025

Rotary Transformer

Python 894 52 Updated Mar 21, 2022

MoonBit 4 2 Updated Feb 19, 2025

Line-by-line profiling for Python

Python 2,860 125 Updated Jan 30, 2025

Push-Button End-to-End Testing of Kubernetes Operators and Controllers

Python 125 43 Updated Feb 14, 2025

Python 314 40 Updated Apr 2, 2024

LLM inference in C/C++

C++ 74,752 10,805 Updated Feb 19, 2025

This is a simple demonstration of more advanced, agentic patterns built on top of the Realtime API.

TypeScript 4,990 515 Updated Feb 13, 2025

Local model support for Microsoft's graphrag using ollama (llama3, mistral, gemma2, phi3): LLM & embedding extraction

Python 892 141 Updated Sep 30, 2024

An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.

Python 21,935 1,920 Updated Jan 23, 2025

High-speed Large Language Model Serving for Local Deployment

C++ 8,107 423 Updated Feb 19, 2025

A paper list of recent mamba efforts for low-level vision.

243 9 Updated Feb 13, 2025

Deep Learning Energy Measurement and Optimization

Python 239 30 Updated Feb 5, 2025

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

C++ 5,279 630 Updated Feb 19, 2025

Tile primitives for speedy kernels

Cuda 2,048 115 Updated Feb 19, 2025

Fast, Flexible and Portable Structured Generation

C++ 706 44 Updated Feb 19, 2025

Fast Multimodal LLM on Mobile Devices

C++ 696 80 Updated Feb 9, 2025

Paper list about multimodal and large language models, only used to record papers I read in the daily arxiv for personal needs.

590 40 Updated Feb 19, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 3,438 300 Updated Feb 19, 2025

xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism

Python 1,271 103 Updated Feb 10, 2025