- Beijing, China
Stars
[ICLR 2023] Official implementation of the paper "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"
[NeurIPS 2024] OmniTokenizer: one model and one weight for image-video joint tokenization.
An official implementation of Advancing Radiograph Representation Learning with Masked Record Modeling (ICLR'23)
BenchX: A Unified Benchmark Framework for Medical Vision-Language Pretraining on Chest X-Rays
[CVPR'25] Enhanced Contrastive Learning with Multi-view Longitudinal Data for Chest X-ray Report Generation
🐫 CAMEL: Finding the Scaling Law of Agents. The first and the best multi-agent framework. https://www.camel-ai.org
Paper list, datasets, and tools for radiology report generation
[CVPR2024] FedHCA^2: Towards Hetero-Client Federated Multi-Task Learning
Janus-Series: Unified Multimodal Understanding and Generation Models
MedRAX: Medical Reasoning Agent for Chest X-ray
PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO
[CVPR 2024] VoCo: A Simple-yet-Effective Volume Contrastive Learning Framework for 3D Medical Image Analysis
[CVPR 2024 Extension] 160K volumes (42M slices) datasets, new segmentation datasets, 31M-1.2B pre-trained models, various pre-training recipes, 50+ downstream tasks implementation
EMNLP'22 | MedCLIP: Contrastive Learning from Unpaired Medical Images and Texts
Famous Vision Language Models and Their Architectures
Clinical Knowledge Graph (CKG) is a platform with a twofold objective: 1) build a graph database with experimental data and data imported from diverse biomedical databases; 2) automate knowledge disco…
A collection of resources on applications of multi-modal learning in medical imaging.
This repo lists relevant papers summarized in our survey paper: A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models.
Collection of AWESOME vision-language models for vision tasks
[NeurIPS'22] Multi-Granularity Cross-modal Alignment for Generalized Medical Visual Representation Learning
Foundation-model-based medical image analysis
Flexible DICOM conversion into structured directory layouts
[NeurIPS 2023] AbdomenAtlas 1.0 (5,195 CT volumes + 9 annotated classes)
Multimodal deep learning for Alzheimer's disease dementia assessment
[NeurIPS 2024 Spotlight] Code for the paper "Flex-MoE: Modeling Arbitrary Modality Combination via the Flexible Mixture-of-Experts"