Stars
An open source deep research clone: an AI agent that reasons over large amounts of web data extracted with Firecrawl
A flexible, free, and unlimited PDF translator for humans, using a local LLM or ChatGPT
The geometry of multilingual language model representations (EMNLP 2022).
Fully open reproduction of DeepSeek-R1
Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vectors
Sky-T1: Train your own O1 preview model within $450
Modeling, training, eval, and inference code for OLMo
The hub for EleutherAI's work on interpretability and learning dynamics
Converter of invoice and receipt images into a CSV file containing a list of products and prices.
A Self-adaptation Framework🐙 that adapts LLMs for unseen tasks in real time!
Code accompanying the paper "Massive Activations in Large Language Models"
Advanced receipt OCR and analysis using PaddleOCR, GPT-3.5-turbo, Plotly, and Gradio for interactive visualizations.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Function Vectors in Large Language Models (ICLR 2024)
Jobs_Applier_AI_Agent_AIHawk aims to ease the job hunt by automating the job application process. Using artificial intelligence, it enables users to apply for multiple jobs in a tailored way.
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
Let your Claude think
Alpaca Chinese Dataset: a Chinese instruction fine-tuning dataset [manual + GPT4o, continuously updated]
Finetune Llama 3.3, DeepSeek-R1 & Reasoning LLMs 2x faster with 70% less memory
Quick tutorial showing how to fine-tune Llama3.1 with nothing but free tools and text data. All code is included as an ipynb notebook. For a step-by-step walkthrough, take a look at the tutorial below on Medium.
Efficient Video Prediction via Sparsely Conditioned Flow Matching. In ICCV, 2023.
Model Context Protocol Servers
THOI: An efficient library for higher-order interaction analysis based on Gaussian copulas, enhanced by batch processing
A collection of projects designed to help developers quickly get started with building deployable applications using the Anthropic API
Codebase for testing whether hidden states of neural networks encode discrete structures.
Approaching (Almost) Any Machine Learning Problem
Chinese translation of Approaching (Almost) Any Machine Learning Problem; online documentation: https://ytzfhqs.github.io/AAAMLP-CN/