- China University of Mining and Technology
- Shanghai, China
Lists (24)
2023CVPR
3D handpose estimation
related methods
ChatGPT
computer vision
Data_and_download
detection/segmentation
ECG
FoundationModelMedical
humanpose
IncSDA
life trick
LLM
LLM Inference Quant
LMM
Lung/Gastric cancer segmentation
self search for whether the tumor has invaded other organs?
Medical
OCR
popular
RAG
spider
stroke
smart_health
video
VQA
Frontend (前端)
Stars
Real-time, fine-grained reading list on LLM-synthetic-data.🔥
Medical SAM 2: Segment Medical Images As Video Via Segment Anything Model 2
Discover the repository for "ZePT: Zero-Shot Pan-Tumor Segmentation via Query-Disentangling and Self-Prompting," a pioneering study that has been accepted for presentation at CVPR 2024.
GPT4 & LangChain Chatbot for large PDF docs
Azur Lane bot (CN/EN/JP/TW) | Seamless commissions and research, fully automated Operation Siren (大世界)
A library for efficient similarity search and clustering of dense vectors.
Official Implementation of "CLEFT: Language-Image Contrastive Learning with Efficient Large Language Model and Prompt Fine-Tuning" on MICCAI 2024.
Video classification method for endoscopic ultrasound risk prediction of rectal cancer
Robust Speech Recognition via Large-Scale Weak Supervision
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
One UI for ChatGPT web, Midjourney, GPTs, Suno, Luma, Runway, Viggle, Flux, Ideogram, Realtime, Pika, and Udio; supports Web, PWA, Linux, Windows, and macOS
Minimal web UI for GeminiPro.
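🍒 Cherry Studio is a desktop client that supports multiple LLM providers
OpenAI API management and distribution system supporting Azure, Anthropic Claude, Google PaLM 2 & Gemini, Zhipu ChatGLM, Baidu ERNIE Bot, iFlytek Spark, Alibaba Tongyi Qianwen, 360 Zhinao, and Tencent Hunyuan. It can be used to manage and redistribute keys, ships as a single executable with a prebuilt Docker image, and deploys in one click, ready to use out of the box. OpenAI key management & redistributi…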
Official code of "Towards General Text-guided Universal Image Synthesis Framework for Customized Multimodal Brain MRI"
Generate text images for training deep learning OCR models
Official implementations for various pre-training models of ERNIE-family, covering topics of Language Understanding & Generation, Multimodal Understanding & Generation, and beyond.
Convert PDF to markdown + JSON quickly with high accuracy
A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON.
YOLOv10: Real-Time End-to-End Object Detection [NeurIPS 2024]
The code and the DIW dataset for "Learning From Documents in the Wild to Improve Document Unwarping" (SIGGRAPH 2022)
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting 160+ VLMs and 50+ benchmarks
Baidu Netdisk AI Competition, Image Processing Challenge: 2nd-place solution for document image moiré removal
ChartMimic: Evaluating LMM’s Cross-Modal Reasoning Capability via Chart-to-Code Generation
On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench)
An extremely simple tool for separating vocals and background music, using 2stems/4stems/5stems models. It runs entirely locally through a web interface, with no internet connection required.
Faster Whisper transcription with CTranslate2