
Starred repositories
- A simple screen parsing tool towards pure vision-based GUI agents
- An open-source Python alternative to NotebookLM's podcast feature: transforming multimodal content into captivating multilingual audio conversations with GenAI
- Technically-oriented PDF collection (papers, specs, decks, manuals, etc.)
- Self-hosted FLOSS fitness/workout, nutrition, and weight tracker
- No-code LLM platform to launch APIs and ETL pipelines to structure unstructured documents
- 🪄 Create rich visualizations with AI
- Everything you need to build state-of-the-art foundation models, end-to-end.
- An open-source, extensible AI agent that goes beyond code suggestions: install, execute, edit, and test with any LLM
- DeepSeek-VL: Towards Real-World Vision-Language Understanding
- [ICLR 2024] Official implementation of DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior
- Qwen2.5-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud.
- DeepSeek LLM: Let there be answers
- Janus-Series: Unified Multimodal Understanding and Generation Models
- DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
- DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
- Lightpanda: the headless browser designed for AI and automation
- Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚
- MiniCPM-o 2.6: A GPT-4o-level MLLM for vision, speech, and multimodal live streaming on your phone
- Get started quickly with Next.js, Postgres, Stripe, and shadcn/ui.
- dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
- Preset for using Preact with the Vite bundler
- VILA is a family of state-of-the-art vision-language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.