Stars
An efficient implicit semantic augmentation method, complementary to existing non-semantic techniques.
Official repo for CellPLM: Pre-training of Cell Language Model Beyond Single Cells.
Repository for Nicheformer: a foundation model for single-cell and spatial omics
[ARXIV'24] SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints
LLaVA-CoT, a visual language model capable of spontaneous, systematic reasoning
[NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models
A practical and feature-rich paraphrasing framework to augment human intents in text form to build robust NLU models for conversational engines. Created by Prithiviraj Damodaran. Open to pull reque…
[TACL'23] VSR: A probing benchmark for spatial understanding of vision-language models.
Codebase for Aria - an Open Multimodal Native MoE
Reference implementation for Token-level Direct Preference Optimization (TDPO)
An LLM-free Multi-dimensional Benchmark for Multi-modal Hallucination Evaluation
Code accompanying the paper "Noise Contrastive Alignment of Language Models with Explicit Rewards" (NeurIPS 2024)
[Arxiv 2024] AGLA: Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention
Official Code for ICLR 2024 Paper: Non-negative Contrastive Learning
PyTorch implementation of the InfoNCE loss for self-supervised learning.
Repo for ICCV 2021 paper: Beyond Question-Based Biases: Assessing Multimodal Shortcut Learning in Visual Question Answering
RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness
Visualizing the attention of vision-language models
A tool for visualizing attention-score heatmap in generative LLMs
[TPAMI 2024] Measurement Guidance in Diffusion Models: Insight from Medical Image Synthesis
Counterfactual Samples Synthesizing for Robust VQA