Stars
Repository for ShowUI: One Vision-Language-Action Model for GUI Visual Agent
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Towards Large Multimodal Models as Visual Foundation Agents
Building Open LLM Web Agents with Self-Evolving Online Curriculum RL
A simple screen parsing tool towards a pure vision-based GUI agent
🔥🔥 btrace (AKA RheaTrace) is a high-performance Android trace tool based on Perfetto. It supports defining custom events automatically while building the APK and uses bhook to provide more n…
VisionTasker introduces a novel two-stage framework combining vision-based UI understanding and LLM task planning for mobile task automation in a step-by-step manner.
💻 A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents.
AndroidWorld is an environment and benchmark for autonomous agents
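🔥 A development framework for Android's AccessibilityService and an Android automation scripting framework, for quickly building complex automation tasks, remote assistance, monitoring, and more
Vreo (short for VR Video) is Realsee's 3D space script player, built on Realsee's 3D rendering engine Five and the UI library React.
An input component for controlling your app in natural language using an LLM through LangChain.dart
A state-of-the-art open visual language model | multimodal pre-trained model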
Paper list for Personal LLM Agents
Source code for the paper "Empowering LLM to use Smartphone for Intelligent Task Automation"
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.
Modular and customizable Material Design UI components for Android
Data manipulation and transformation for audio signal processing, powered by PyTorch
Real-Time audio processing library written in Dart.
🦜🔗 Build context-aware reasoning applications
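Noise is an Android wrapper for kissfft, an FFT implementation written in C.
🔥 An Adapter for the Android Kotlin era: use RecyclerView.Adapter in DSL form, with support for collapse/expand, tree structures, sticky items, status-placeholder view switching, load more, multiple item types, swipe menus, and more
🎉 Repo for LaWGPT, a large language model based on Chinese-Llama and tuned with Chinese legal knowledge.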
Diffusion Bee is the easiest way to run Stable Diffusion locally on your M1 Mac. Comes with a one-click installer. No dependencies or technical knowledge needed.
Sharp looking Flutter applications with fractional device pixel ratios.