Tencent's Marvis Puts an OS-Level AI Agent on Your Desktop
Desktop automation agents that can see your screen, kill processes, and edit the registry lower the barrier between a vague user complaint and a completed system-administration task. For developers and IT support, an agent that handles stubborn uninstalls and locked files without manual troubleshooting removes a persistent, low-level time sink.
Marvis deploys six specialized agents (File, Computer, App, Browser, Search, and a coordinating Marvis agent) that together can operate a Windows desktop at the OS level. A user sends a screenshot of an unwanted app or a folder path, and the system identifies the target, stops background processes, uninstalls software, cleans registry entries, and removes residual files. The same agents handle folders that refuse to delete: they force the deletion and send the removed items to the Recycle Bin, just as a manual delete would.
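The step order described above can be sketched as a small dry-run dispatcher. This is an illustration only, not Marvis's actual implementation: the `Orchestrator` class, the handler signature, and the mapping of each step to a particular agent are all assumptions, and the screenshot-based target identification is left out (the target name is passed in directly). The handlers only return log lines and change nothing on the machine.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Step order follows the article; which agent owns each step is an assumption.
UNINSTALL_PIPELINE = [
    ("Computer", "stop_background_processes"),
    ("App", "uninstall_software"),
    ("Computer", "clean_registry_entries"),
    ("File", "remove_residual_files"),
]


@dataclass
class Orchestrator:
    """Hypothetical coordinator that routes steps to named sub-agents."""
    agents: Dict[str, Callable[[str, str], str]] = field(default_factory=dict)

    def register(self, name: str, handler: Callable[[str, str], str]) -> None:
        self.agents[name] = handler

    def uninstall(self, target: str) -> List[str]:
        # Run the steps strictly in order; each handler returns one log line.
        return [self.agents[agent](step, target) for agent, step in UNINSTALL_PIPELINE]


def dry_run_agent(name: str) -> Callable[[str, str], str]:
    # Stand-in handler that only describes the action; it touches nothing.
    def handle(step: str, target: str) -> str:
        return f"[{name}] {step}: {target}"
    return handle


def build_dry_run_orchestrator() -> Orchestrator:
    orch = Orchestrator()
    for name in ("File", "Computer", "App", "Browser", "Search"):
        orch.register(name, dry_run_agent(name))
    return orch


if __name__ == "__main__":
    for line in build_dry_run_orchestrator().uninstall("ExampleApp"):
        print(line)
```

Keeping the pipeline as plain data makes the ordering explicit: processes are stopped before the uninstaller runs, and leftover files are removed last.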
A mobile companion app extends control to a phone, letting users retrieve files from a remote PC or operate desktop applications through conversation rather than manual remote-mouse input. In the phone-to-PC flow, the user instructs the agent to locate and transfer a file, so no traditional remote-desktop interaction is needed.
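The "locate" half of that flow might look something like the sketch below. The article does not say how Marvis searches for files, so the matching rule here (case-insensitive name fragment, newest file wins) and the function name `locate_newest` are purely illustrative assumptions; the transfer step is not modeled.

```python
from pathlib import Path
from typing import Optional


def locate_newest(root: Path, name_fragment: str) -> Optional[Path]:
    """Return the most recently modified file under root whose name contains the fragment."""
    fragment = name_fragment.lower()
    matches = [
        p for p in root.rglob("*")
        if p.is_file() and fragment in p.name.lower()
    ]
    # max() with a key picks the newest match; default=None signals no match.
    return max(matches, key=lambda p: p.stat().st_mtime, default=None)
```

A request such as "send me the latest budget spreadsheet" would then reduce to something like `locate_newest(Path.home() / "Documents", "budget")`, with the natural-language parsing handled elsewhere.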
Installation requires Windows and more than 20 GB of free space on the system disk. The tool is available through a public website download, and the full interface surfaces all six agents in a single panel where the main Marvis agent dispatches tasks by visually "patting" the responsible sub-agent.
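A quick way to check the disk requirement before downloading is Python's standard `shutil.disk_usage`. This is a minimal sketch: the helper names are made up for illustration, and it assumes "GB" means binary gigabytes (GiB), which the article does not specify.

```python
import shutil

REQUIRED_FREE_GB = 20  # threshold stated in the article


def free_gb(path: str) -> float:
    """Free space on the volume containing path, in GiB (assumed unit)."""
    return shutil.disk_usage(path).free / 1024**3


def meets_requirement(free: float, required: float = REQUIRED_FREE_GB) -> bool:
    # The article says "more than 20 GB", so the comparison is strict.
    return free > required


if __name__ == "__main__":
    # On Windows the system disk is usually C:, but that is an assumption too.
    available = free_gb("C:\\")
    print(f"{available:.1f} GB free; enough for install: {meets_requirement(available)}")
```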
OS-level agents that combine screen understanding with process and registry access turn the desktop into an API — the user describes the outcome, and the agent handles the sysadmin work.
Marvis's agent-to-agent delegation model (one orchestrator, five specialists) mirrors how human IT teams triage, but runs in seconds on a single machine.
Using screenshots as input sidesteps the problem of users not knowing an app's executable name or process ID, which is the core friction in manual uninstalls.
Mobile-to-PC file retrieval through natural language could replace a share of remote-desktop usage for quick, single-file access.
The 20 GB disk requirement hints at a substantial local runtime, likely bundling models or a virtualized environment rather than a thin cloud client.