跪拜 Guibai
Artificial Intelligence

Tencent's Marvis Puts an OS-Level AI Agent on Your Desktop

By 吾鳴
Read original on juejin.cn

Desktop automation agents that can see your screen, kill processes, and edit the registry shorten the path from a vague user complaint to a completed system-administration task. For developers and IT support, an agent that handles stubborn uninstalls and locked files without manual troubleshooting removes a persistent, low-level time sink.

Summary

Marvis deploys six agents that together can operate a Windows desktop at the OS level: a coordinating Marvis agent plus five specialists (File, Computer, App, Browser, and Search). A user sends a screenshot of an unwanted app or a folder path, and the system identifies the target, stops background processes, uninstalls software, cleans registry entries, and removes residual files. The same architecture handles undeletable folders: given the folder path, it forces the deletion and sends the contents to the Recycle Bin, just as a manual delete would.
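The uninstall sequence described above can be read as an ordered pipeline that stops at the first failing step. The following is only a minimal sketch of that idea, not Marvis's actual implementation: the step names, the `UninstallReport` type, and the `run_pipeline` function are all hypothetical, and each step is a stub that always succeeds.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# A step is a (name, action) pair; the action returns True on success.
Step = Tuple[str, Callable[[str], bool]]


@dataclass
class UninstallReport:
    """Records which steps finished and where the pipeline stopped, if anywhere."""
    target: str
    completed: List[str] = field(default_factory=list)
    failed_step: Optional[str] = None


def run_pipeline(target: str, steps: List[Step]) -> UninstallReport:
    """Run the steps in order and halt at the first one that reports failure."""
    report = UninstallReport(target)
    for name, action in steps:
        if not action(target):
            report.failed_step = name
            break
        report.completed.append(name)
    return report


def _stub(target: str) -> bool:
    """Placeholder for real OS work; always succeeds in this sketch."""
    return True


# Hypothetical step names that mirror the order given in the summary.
STEPS: List[Step] = [
    ("identify_target", _stub),
    ("stop_processes", _stub),
    ("uninstall", _stub),
    ("clean_registry", _stub),
    ("remove_residual_files", _stub),
]

report = run_pipeline("ExampleApp", STEPS)
```

Halting on the first failure is a design choice made for the sketch: cleaning the registry after a failed uninstall could leave a half-removed application behind.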

A mobile companion app extends control to a phone, letting users retrieve files from a remote PC or operate desktop applications through conversation rather than manual remote-mouse input. In the phone-to-PC flow, the user tells the agent which file is needed, and the agent locates and transfers it, so no traditional remote-desktop session is required.

On Windows, installation requires more than 20 GB of free space on the system disk. The tool is available as a download from a public website, and the full interface shows all six agents in a single panel where the main Marvis agent dispatches tasks by visually "patting" the responsible sub-agent.
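The disk requirement is easy to check before downloading. The sketch below uses Python's standard `shutil.disk_usage`; it assumes the system disk is `C:\` and treats "20 GB" as 20 GiB, neither of which the article specifies. The function names are illustrative only.

```python
import shutil

# Assumption: "20 GB" is read as 20 GiB; the article does not say which unit is meant.
REQUIRED_FREE_BYTES = 20 * 1024**3


def has_enough_space(free_bytes: int, required: int = REQUIRED_FREE_BYTES) -> bool:
    """Return True only when free space strictly exceeds the requirement."""
    return free_bytes > required


def check_system_disk(path: str = "C:\\") -> bool:
    """Check free space on the drive containing `path` (assumed to be the system disk)."""
    return has_enough_space(shutil.disk_usage(path).free)
```

The comparison is strict because the requirement is stated as "more than 20 GB", so a disk with exactly that much free space does not qualify.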

Takeaways
Marvis abstracts desktop control into six agents: Marvis (orchestrator), File Agent, Computer Agent, App Agent, Browser Agent, and Search Agent.
Sending a screenshot of an unwanted desktop or taskbar app triggers identification, process termination, uninstallation, and registry cleanup.
Locked or undeletable folders can be removed by providing the folder path; deleted items land in the Recycle Bin like a manual delete.
A mobile app lets users request files from their PC or control desktop applications through conversation instead of remote-mouse input.
Windows installation requires more than 20 GB of free space on the system disk.
The main Marvis agent visually "pats" the sub-agent it assigns a task to, making task routing explicit in the UI.
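
The Recycle Bin behavior in the takeaways amounts to a soft delete: the folder disappears from its original location but stays recoverable. The sketch below imitates that with a plain holding directory and `shutil.move`. It does not call the real Windows Recycle Bin and does not release file locks, and `soft_delete` and the holding directory are hypothetical names used for illustration.

```python
import os
import shutil
import tempfile
import time


def soft_delete(folder: str, holding_dir: str) -> str:
    """Move `folder` into `holding_dir` instead of erasing it; return the new path."""
    os.makedirs(holding_dir, exist_ok=True)
    name = os.path.basename(os.path.normpath(folder))
    # A timestamp suffix keeps repeated deletes of same-named folders from colliding.
    destination = os.path.join(holding_dir, f"{name}.{time.time_ns()}")
    shutil.move(folder, destination)
    return destination


if __name__ == "__main__":
    # Demo in a throwaway directory so nothing real is touched.
    base = tempfile.mkdtemp()
    stale = os.path.join(base, "stale_folder")
    os.makedirs(stale)
    print(soft_delete(stale, os.path.join(base, "holding")))
```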
Conclusions

OS-level agents that combine screen understanding with process and registry access turn the desktop into an API: the user describes the outcome, and the agent handles the sysadmin work.

Marvis's agent-to-agent delegation model (one orchestrator, five specialists) mirrors how human IT teams triage, but runs in seconds on a single machine.
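
The one-orchestrator, five-specialist arrangement can be sketched as a simple dispatch table. This is an illustration of the general pattern under stated assumptions, not Marvis's code: the `Orchestrator` class, the lowercase agent keys, and the string-returning handlers are all hypothetical, and the plan is supplied by hand rather than produced by a model.

```python
from typing import Callable, Dict, List, Tuple

# A specialist takes a sub-task description and returns a short result string.
Handler = Callable[[str], str]


class Orchestrator:
    """Routes each sub-task in a plan to the specialist registered under that name."""

    def __init__(self) -> None:
        self._agents: Dict[str, Handler] = {}

    def register(self, name: str, handler: Handler) -> None:
        self._agents[name] = handler

    def dispatch(self, plan: List[Tuple[str, str]]) -> List[str]:
        """Run (agent_name, sub_task) pairs in order and collect one result line each."""
        results: List[str] = []
        for agent_name, sub_task in plan:
            handler = self._agents.get(agent_name)
            if handler is None:
                results.append(f"{agent_name}: no such agent")
                continue
            results.append(f"{agent_name}: {handler(sub_task)}")
        return results


orchestrator = Orchestrator()
# Stub specialists; real ones would perform OS-level work.
orchestrator.register("app", lambda task: f"uninstalled {task}")
orchestrator.register("file", lambda task: f"removed leftovers of {task}")

summary = orchestrator.dispatch([("app", "ExampleApp"), ("file", "ExampleApp")])
```

Keeping the routing in one place means an unknown agent name yields a visible result line rather than a silent skip, which loosely parallels how the UI makes task routing explicit.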

Using screenshots as input sidesteps the problem of users not knowing an app's executable name or process ID, which is the core friction in manual uninstalls.

Mobile-to-PC file retrieval through natural language could replace a chunk of remote-desktop usage for quick, single-file access.

The 20 GB disk requirement hints at a substantial local runtime, likely bundling models or a virtualized environment rather than a thin cloud client.

Concepts & terms
OS-level AI agent
An AI system with permission to interact directly with the operating system (reading the screen, managing processes, editing the registry, and modifying the file system) rather than being sandboxed inside a single application.
Agent orchestration
A design where a main agent breaks down a user request into sub-tasks and delegates each to a specialized agent, then collects and summarizes the results.
Registry cleanup
The process of removing leftover Windows Registry entries after uninstalling software, preventing orphaned settings that can cause conflicts or failed re-installs.
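
One way to picture registry cleanup is as a search for uninstall records whose install folder no longer exists. The sketch below works on an in-memory mapping so it runs anywhere; on Windows the same data could be read from the key named in the code with the standard-library `winreg` module. The function name and the "folder is missing" heuristic are assumptions for illustration, not a description of how Marvis decides what to clean.

```python
import os
from typing import Dict, List

# Windows keeps per-application uninstall records under this key (in HKLM and HKCU).
# It is shown for context only; this sketch never touches the real registry.
UNINSTALL_KEY = r"SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall"


def find_orphaned_entries(entries: Dict[str, str]) -> List[str]:
    """Return entry names whose recorded install location is no longer a directory.

    `entries` maps an uninstall subkey name to its InstallLocation value.
    Entries with an empty location are skipped because nothing can be inferred.
    """
    return sorted(
        name
        for name, location in entries.items()
        if location and not os.path.isdir(location)
    )
```

A real cleaner would need more evidence than a missing folder before deleting anything, since some applications legitimately record no install location or install to removable drives.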
Source: juejin.cn