Stars
A comprehensive video analysis tool that combines computer vision, audio transcription, and natural language processing to generate detailed descriptions of video content. This tool extracts key fr…
"VideoRAG: Retrieval-Augmented Generation with Extreme Long-Context Videos"
🔥🔥 First-ever hour-scale video understanding models
a tiny project to test the effectiveness of video QA through RAG techniques and multimodal LLMs
Resources on LLM-based applications built with the RAG pattern
Making large AI models cheaper, faster and more accessible
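Yupi's AI knowledge base, a roundup of popular large AI models and tools: Deepseek usage guides, prompt-writing tips, practical know-how, application scenarios, AI monetization, industry news, tutorial resources, and more, to help you quickly master AI technology and stay at the forefront. Models covered: chatGPT, Deepseek, Deepseek-r1, QWEN, GROK, and others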
Aila(AI超元域): The premier AI integration tool for Windows, macOS, and Android. Ask once, get answers from 10+ AIs like ChatGPT, Gemini, Claude3, Copilot, Poe, perplexity and more. Features customiza…
✨✨Latest Advances on Multimodal Large Language Models
[MIR-2023-Survey] A continuously updated paper list for multi-modal pre-trained big models
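This project aims to share the technical principles behind large models along with hands-on experience (large-model engineering and putting large-model applications into production)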
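A curated collection of open-source Chinese large language models, focusing on models that are smaller in scale, can be deployed privately, and are cheaper to train, including base models, vertical-domain fine-tunes and applications, datasets, and tutorials.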
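AI semantic search over local media: search by image, find local assets, match footage to a text description, search video frames, and search videos by scene description. Search local photos and videos through natural language.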
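VCED automatically identifies the segments of a video that match your text description and clips them. The project is built on cross-modal search and vector retrieval, with a decoupled front end and back end, to help you quickly get hands-on with next-generation search technology.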
This is the code for the paper "Towards Generalized UAV Object Detection: A Novel Perspective from Frequency Domain Disentanglement", which is submitted to IJCV. It is an extension of our CVPR 2023 p…
A collection of papers and released code on normalization techniques.
[No longer maintained; please use note286/xduts] XeLaTeX template for Xidian University graduate theses