Microsoft Research Asia (Research Intern)
Stars
Lightning-fast serving engine for any AI model of any size. Flexible. Easy. Enterprise-scale.
An open-source framework for training large multimodal models.
Represent, send, store and search multimodal data
A Unified Library for Parameter-Efficient and Modular Transfer Learning
Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable.
Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data.
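Fengshenbang-LM (封神榜大模型) is an open-source large-model ecosystem led by the Research Center for Cognitive Computing and Natural Language at IDEA, intended to serve as infrastructure for Chinese AIGC and cognitive intelligence.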
Free Google Translator API
A curated list of awesome papers related to pre-trained models for information retrieval (a.k.a., pretraining for IR).
TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.
A data augmentations library for audio, image, text, and video.
Recent Advances in Vision and Language PreTrained Models (VL-PTMs)
gitpod-io / openvscode-server
Forked from microsoft/vscode
Run upstream VS Code on a remote machine with access through a modern web browser from any device, anywhere.
leoxiaobin / CvT
Forked from microsoft/CvT
This is an official implementation of CvT: Introducing Convolutions to Vision Transformers.
Reading list for research topics in multimodal machine learning
A Trimap-Free Portrait Matting Solution in Real Time [AAAI 2022]
CenterFusion: Center-based Radar and Camera Fusion for 3D Object Detection
An online image format converter that runs in the browser, with no need to upload files. It converts jpeg, jpg, png, gif, webp, svg, ico, and bmp files to jpeg, png, webp animation, gif, base64, avif, and mozjpeg. …
Official implementation of "VoxelPose: Towards Multi-Camera 3D Human Pose Estimation in Wild Environment"