Stars
Multimodal Large Language Models for Remote Sensing (RS-MLLMs): A Survey
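This is a step-by-step guide to getting started with AI/LLMs, with tutorials and demo code that take you from using APIs to deploying and fine-tuning local large models. The code files come with online Kaggle or Colab versions, so you can follow along even without a GPU. The project also includes a small code playground 🎡 where you can experiment with some interesting AI scripts. It also contains a complete Chinese mirror of the assignments for the 2024 Introduction to Generative AI course by 李宏毅 (HUNG-YI LEE).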
[CVPR Oral 2022] PyTorch Implementation for "Learning to Deblur using Light Field Generated and Real Defocused Images"
The official code for the NeurIPS 2021 paper "Gaussian Kernel Mixture Network for Single Image Defocus Deblurring".
[CVPR 2021] Official PyTorch Implementation for "Iterative Filter Adaptive Network for Single Image Defocus Deblurring"
[ICCV 2021] Official TensorFlow Implementation for "Single Image Defocus Deblurring Using Kernel-Sharing Parallel Atrous Convolutions"
This is the code of paper 'ASCNet: Asymmetric Sampling Correction Network for Infrared Image Destriping'
An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
Code for PID: Physics-Informed Diffusion Model for Infrared Image Generation
A personal investigation for RGBT tracking.
Hyperbolic Visual Embedding Learning for Zero-Shot Recognition (CVPR 2020)
Code release for Bi-Directional Feature Reconstruction Network for Fine-grained Few-shot Image Classification
[AAAI-2024] High-Order Structure Based Middle-Feature Learning for Visible-Infrared Person Re-Identification
🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
Instruction Tuning with GPT-4
A curated (most recent) list of resources for Learning with Noisy Labels
1.5−3.0× lossless training or pre-training speedup. An off-the-shelf, easy-to-implement algorithm for the efficient training of foundation visual backbones.
Online Coreset Selection for Rehearsal-based Continual Learning, ICLR 2022
A collection of deep learning based RGB-T-Fusion methods, codes, and datasets. The main directions involved are Multispectral Pedestrian Detection, RGB-T Aerial Object Detection, RGB-T Semantic Seg…
✨✨Latest Advances on Multimodal Large Language Models
Best transfer learning and domain adaptation resources (papers, tutorials, datasets, etc.)
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
This is a summary of research on noisy correspondence. There may be omissions. If anything is missing please get in touch with us. Our emails: [email protected] [email protected] qinyang.gm…
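This project aims to share the technical principles behind large models along with hands-on experience (large-model engineering and deploying large-model applications).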
A dataset condensation method, accepted at CVPR 2022.