Algo
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
Open Academic Research on Improving LLaMA to SOTA LLM
🐙 Guides, papers, lectures, notebooks and resources for prompt engineering
[ICCV 2023] Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
Master programming by recreating your favorite technologies from scratch.
Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀
Implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM.
A tool to convert a TensorRT engine/plan to a fake ONNX model
Code and documentation to train Stanford's Alpaca models, and generate the data.
Code for the paper "MASTER: Multi-Aspect Non-local Network for Scene Text Recognition" (Pattern Recognition 2021)
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
Making large AI models cheaper, faster and more accessible
Pix2Seq codebase: multi-task learning with generative modeling (autoregressive and diffusion)
A synthetic data generator for text recognition
Differentiable IoU of rotated bounding boxes using PyTorch
Deep learning for image processing, including classification, object detection, etc.
A collection of libraries to optimise AI model performance
YOLOX is a high-performance anchor-free YOLO, exceeding YOLOv3 through YOLOv5, with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO supported. Documentation: https://yolox.readthedocs.io/
YOLOv5 Series Multi-backbone (TPH-YOLOv5, Ghostnet, ShuffleNetv2, Mobilenetv3Small, EfficientNetLite, PP-LCNet, SwinTransformer YOLO), Module (CBAM, DCN), Pruning (EagleEye, Network Slimming), Quanti…
A model compression and acceleration toolbox based on PyTorch.
PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.
Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
Chinese version of CLIP that supports Chinese cross-modal retrieval and representation generation.