Stars
Tips for Writing a Research Paper using LaTeX
BenchX: A Unified Benchmark Framework for Medical Vision-Language Pretraining on Chest X-Rays
[CVPR 2024 Best paper award candidate] EGTR: Extracting Graph from Transformer for Scene Graph Generation
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬
A curated list of foundation models for vision and language tasks
[ECCV2024 Oral🔥] Official Implementation of "GiT: Towards Generalist Vision Transformer through Universal Language Interface"
A collection of resources on applications of multi-modal learning in medical imaging.
[Arxiv-2024] CheXagent: Towards a Foundation Model for Chest X-Ray Interpretation
Segment Anything in Medical Images
Official implementation of project Honeybee (CVPR 2024)
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
EVA Series: Visual Representation Fantasies from BAAI
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
Paper list about multimodal and large language models, only used to record papers I read in the daily arxiv for personal needs.
KoLLaVA: Korean Large Language-and-Vision Assistant (feat. LLaVA)
A full Python Implementation of the ROUGE Metric (not a wrapper)
📺 Discover the latest machine learning / AI courses on YouTube.
🔥Highlighting the top ML papers every week.
Awesome list of Korean Large Language Models.
KoAlpaca: An open-source language model that understands Korean instructions
A curated list of facial expression recognition in both 7-emotion classification and affect estimation.
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
Official Implementation for "TEXTure: Text-Guided Texturing of 3D Shapes"
Transfer the ControlNet with any basemodel in diffusers🔥