Stars
FinMem: A Performance-Enhanced LLM Trading Agent with Layered Memory and Character Design
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-sim…
This is a repo with links to everything you'd ever want to learn about data engineering
The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, and more.
🎨 ML Visuals contains figures and templates which you can reuse and customize to improve your scientific writing.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
A collection of industry practice articles on search, recommendation, advertising, user growth, and related topics (sources: Zhihu, DataFunTalk, and tech WeChat official accounts)
Official implementation of the Law of Vision Representation in MLLMs
An implementation of Deep Canonical Correlation Analysis (DCCA or Deep CCA) with pytorch.
[CVPR 2024 Highlight] Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding
Official Code Repository for LM-Steer Paper: "Word Embeddings Are Steers for Language Models" (ACL 2024 Outstanding Paper Award)
An Open-source Toolkit for LLM Development
Apache ECharts is a powerful, interactive charting and data visualization library for the browser
Hyprland is an independent, highly customizable, dynamic tiling Wayland compositor that doesn't sacrifice on its looks.
Accelerating the development of large multimodal models (LMMs) with one-click evaluation module - lmms-eval.
Continuation of Clash Verge - A Clash Meta GUI based on Tauri (Windows, MacOS, Linux)
Dialogue Action Tokens: Steering Language Models in Goal-Directed Dialogue with a Multi-Turn Planner
Improving Alignment and Robustness with Circuit Breakers
YOLOv10: Real-Time End-to-End Object Detection [NeurIPS 2024]
Discovering Bias in Latent Space: An Unsupervised Debiasing Approach (ICML 2024)
[NeurIPS'24 Spotlight] EVE: Encoder-Free Vision-Language Models
[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
LaVIT: Empower the Large Language Model to Understand and Generate Visual Content
MMICL, a state-of-the-art VLM with in-context learning ability, from PKU
Keeping language models honest by directly eliciting knowledge encoded in their activations.
[ECCV 2024] MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model.