- Canada
- zewen-chi.github.io
Stars
Hackable and optimized Transformers building blocks, supporting a composable construction.
Our solution for the ARC challenge 2024
PyTorch implementation of normalizing flow models
Seamless operability between C++11 and Python
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
Annotated Flow Matching paper
PyTorch implementation for Score-Based Generative Modeling through Stochastic Differential Equations (ICLR 2021, Oral)
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models
High-Resolution Image Synthesis with Latent Diffusion Models
Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment"
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
Collecting research materials on EBM/EBL (Energy Based Models, Energy Based Learning)
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
A curated list of Large Language Model (LLM) Interpretability resources.
Reaching LLaMA2 Performance with 0.1M Dollars
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
Code and data for the paper "Turning English-centric LLMs Into Polyglots: How Much Multilinguality Is Needed?"
MultilingualSIFT: Multilingual Supervised Instruction Fine-tuning
Multilingual Large Language Models Evaluation Benchmark
ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward experts. We released a collection of ModuleFormer-based Languag…
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming