- Ph.D. student @ University of Adelaide
- Sydney, Australia
Highlights
- Pro
Starred repositories
VideoPriorSD: Generating High-Resolution Images via Video-based Priors
This is the official PyTorch implementation of "ZipAR: Accelerating Autoregressive Image Generation through Spatial Locality"
MultimodalC4 is a multimodal extension of c4 that interleaves millions of images with text.
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
A benchmark list for evaluation of large language models.
LLaVA-CoT, a visual language model capable of spontaneous, systematic reasoning
OLMoE: Open Mixture-of-Experts Language Models
Code & Dataset for Paper: "Distill Visual Chart Reasoning Ability from LLMs to MLLMs"
[NeurIPS 2024] Repository for the paper "OVT-B: A New Large-Scale Benchmark for Open-Vocabulary Multi-Object Tracking".
Official code base for the paper "EZIGen: Enhancing zero-shot personalized image generation with precise subject encoding and decoupled guidance"
A collection of awesome autoregressive visual generation models
Finetune Llama 3.3, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 70% less memory
"LightRAG: Simple and Fast Retrieval-Augmented Generation"
A paper list of recent works on token compression for ViT and VLM
ElasticTok: Adaptive Tokenization for Image and Video
OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models
A Simple yet Effective Pathway to Empowering LLaVA to Understand and Interact with 3D World
📰 Must-read papers and blogs on LLM based Long Context Modeling 🔥
Diff-DGMN: A Diffusion-based Dual Graph Multi-attention Network for POI Recommendation
Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"
Official PyTorch Implementation of "Scalable Autoregressive Image Generation with Mamba"
🌋👵🏻 Yo'LLaVA: Your Personalized Language and Vision Assistant
A framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings.
g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains
Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
Official Implementation for "MyVLM: Personalizing VLMs for User-Specific Queries" (ECCV 2024)