CommandWindows (⌘ ⊞) is a desktop operating system copilot based on multi-modal large language models. It supports every platform that has application windows.
- GPT-4 Vision from OpenAI (`gpt-4-vision-preview`)
- Gemini Pro Vision from Google (`gemini-pro-vision`)
- Gemini Nano/Ultra Vision from Google
- Local Vision Model
- Vary-toy
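As a rough illustration of what talking to one of these vision models can involve, the sketch below builds a request body in the shape that OpenAI's Chat Completions endpoint accepts for image input. It is not taken from this project's source: `buildVisionRequest`, its parameters, and the `512` token default are hypothetical names and values chosen for the example, and the screenshot is assumed to be a base64-encoded PNG.

```javascript
// Sketch only: builds the JSON body for a vision request to OpenAI's
// Chat Completions endpoint. buildVisionRequest and its parameters are
// hypothetical names for illustration, not part of this project's API.
function buildVisionRequest(instruction, screenshotPngBase64, maxTokens = 512) {
  return {
    model: "gpt-4-vision-preview",
    max_tokens: maxTokens,
    messages: [
      {
        role: "user",
        content: [
          // The user's instruction in plain text.
          { type: "text", text: instruction },
          // The screenshot, embedded as a base64 data URL.
          {
            type: "image_url",
            image_url: { url: `data:image/png;base64,${screenshotPngBase64}` },
          },
        ],
      },
    ],
  };
}

// Example: a placeholder base64 string stands in for a real screenshot.
const body = buildVisionRequest("Create a Google doc", "iVBORw0KGgo=");
console.log(JSON.stringify(body, null, 2));
```

Keeping the request builder a pure function makes it easy to swap in another model from the list above without touching the code that captures the screen.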
Preview:
Help me create a Google doc and write the definition of Blockchain on that
- A digital assistant rather than just software
- Use multi-modal perception to help you operate your computer
- Step-by-step notifications
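The features above suggest a perceive, decide, act loop with a notification at every step. The following is a minimal sketch of that idea under stated assumptions, not the project's actual implementation: `runTask`, `captureScreen`, `askModel`, `performAction`, `notify`, and the `{ type, description }` action shape are all hypothetical and are injected so the loop itself stays independent of Electron or any model provider.

```javascript
// Sketch of a perceive -> decide -> act loop. Every dependency is injected and
// hypothetical: captureScreen, askModel, performAction and notify are
// illustrative names, not this project's real API.
async function runTask(goal, deps, maxSteps = 10) {
  const { captureScreen, askModel, performAction, notify } = deps;
  const history = []; // actions already performed, passed back to the model

  for (let step = 1; step <= maxSteps; step++) {
    // Perceive: grab the current screen.
    const screenshot = await captureScreen();
    // Decide: ask the model for the next action given the goal and history.
    const action = await askModel({ goal, screenshot, history });
    // Tell the user what is about to happen.
    notify(`Step ${step}: ${action.description}`);

    if (action.type === "done") {
      return { finished: true, steps: step, history };
    }

    // Act: carry out the action, then remember it for the next round.
    await performAction(action);
    history.push(action);
  }

  // Stop after maxSteps so a confused model cannot loop forever.
  return { finished: false, steps: maxSteps, history };
}
```

The `maxSteps` cap is a design choice of this sketch: it bounds how many actions a single request can trigger on the user's machine.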
- Testing on more platforms
- More detailed device information for the LLM
- Enhanced chat experience with better reply content
- More interactive operation
- Convenient shortcuts
While an official release is not yet available because the project is still experimental, you can try the tool by cloning the repository and running it on your system.
git clone https://github.com/c0mm4nd/command-windows
cd command-windows
npm i
npm run start
The pre-built releases will be available soon!
To build a distributable yourself, simply run
npm run make
The built file is inside the `make` folder.
Inspired by SOC but written in Electron with JavaScript.
Currently, this project is
- experimental and under active development, so not suitable for production use
- welcoming any kind of issues and pull requests!