Amir Gholami amirgholami

216 followers · 1 following

Stars

SqueezeAILab / SqueezedAttention

SQUEEZED ATTENTION: Accelerating Long Prompt LLM Inference

Python · 43 stars · 4 forks · Updated Nov 20, 2024

Jiayi-Pan / TinyZero

Clean, minimal, accessible reproduction of DeepSeek R1-Zero

Python · 10,877 stars · 1,390 forks · Updated Feb 1, 2025

SqueezeAILab / TinyAgent

[EMNLP 2024 Demo] TinyAgent: Function Calling at the Edge!

Python · 370 stars · 58 forks · Updated Sep 4, 2024

HivaMohammadzadeh1 / Presentations

This repository includes some of the presentations and tutorials I have made.

5 stars · Updated Jan 16, 2024

SqueezeAILab / LLM2LLM

[ACL 2024] LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement

Python · 176 stars · 12 forks · Updated Mar 25, 2024

SqueezeAILab / LLMCompiler

[ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling

Python · 1,618 stars · 124 forks · Updated Jul 10, 2024

mherrmann / helium

Lighter web automation with Python

Python · 7,551 stars · 459 forks · Updated Feb 20, 2025

SqueezeAILab / KVQuant

[NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization

Python · 335 stars · 30 forks · Updated Aug 13, 2024

Mooler0410 / LLMsPracticalGuide

A curated list of practical guide resources for LLMs (LLMs Tree, Examples, Papers)

9,746 stars · 755 forks · Updated May 31, 2024

uptrain-ai / uptrain

UpTrain is an open-source unified platform to evaluate and improve Generative AI applications. We provide grades for 20+ preconfigured checks (covering language, code, embedding use-cases), perform…

Python · 2,238 stars · 198 forks · Updated Aug 18, 2024

ShashankSubramanian / adaptive-selfsupervision-pinns

Gradient-based adaptive sampling algorithms for self-supervising PINNs

Python · 24 stars · 2 forks · Updated May 8, 2023

ucb-bar / gemmini-rocc-tests

Fork of seldridge/rocket-rocc-examples with tests for a systolic-array-based matmul accelerator

C · 55 stars · 42 forks · Updated Feb 15, 2025

gdinh / matmap

A modular, automatable, tunable mapper for accelerator programming

Python · 8 stars · 1 fork · Updated Apr 27, 2022

RunLLM / aqueduct

Aqueduct is no longer being maintained. Aqueduct allows you to run LLM and ML workloads on any cloud infrastructure.

Go · 520 stars · 18 forks · Updated Jun 7, 2023

kssteven418 / Squeezeformer

[NeurIPS'22] Squeezeformer: An Efficient Transformer for Automatic Speech Recognition

Python · 249 stars · 19 forks · Updated Feb 12, 2023

yuchenlin / rebiber

A simple tool to update bib entries with their official information (e.g., DBLP or the ACL anthology).

Python · 2,753 stars · 163 forks · Updated Aug 18, 2024

ShashankSubramanian / GLIA

GLioblastoma Image Analysis for integrating brain tumor growth models with medical imaging

C++ · 17 stars · 5 forks · Updated Mar 30, 2023

Efficient-ML / Awesome-Model-Quantization

A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research, and we are continuously improving the project. PRs for new works are welcome (p…

2,002 stars · 216 forks · Updated Nov 1, 2024