跪拜 Guibai
Artificial Intelligence · AIGC · Agent

From 10 RMB to 1 RMB: How One Developer Slashed AI Video Costs with Codex + Obsidian

By 后端小肥肠
Read original on juejin.cn

This workflow demonstrates how AI agents can turn a viral content format into a near-automated production line at roughly 1–3 RMB per video instead of more than 10 RMB. For Western developers building AI-powered content tools, the combination of a local knowledge base (Obsidian) with a low-cost agent platform (Codex) offers a template for cheap, customizable, and scalable media pipelines that do not rely on expensive, black-box platforms.

Summary

Viral 'Life Scenario' videos—where a narrator role-plays a dramatic life story—are generating millions of views on Chinese platforms. One solo creator reverse-engineered the format and built a two-part AI pipeline to produce them cheaply and at scale.

The first part is a script-writing Skill that references an Obsidian knowledge base. The creator feeds example scripts into Obsidian, then tells Codex to mimic that style when generating new scripts from a user-provided topic. The second part is a video-generation Skill that automates the full production chain: voiceover, subtitle alignment, semantic scene segmentation, image generation, partial image-to-video, and finally a CapCut draft.
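
The script-writing half can be pictured as a few-shot prompt assembled from notes in the vault. The following is a minimal sketch under stated assumptions, not the creator's actual Skill: the folder name `life-scenario-scripts`, the function names, and the prompt wording are all hypothetical, and the step that sends the prompt to an agent is left out.

```python
from pathlib import Path
import tempfile


def load_example_scripts(vault_dir: Path, folder: str = "life-scenario-scripts") -> list[str]:
    """Read every Markdown note in one vault folder (the folder name is hypothetical)."""
    notes = sorted((vault_dir / folder).glob("*.md"))
    return [note.read_text(encoding="utf-8").strip() for note in notes]


def build_script_prompt(examples: list[str], topic: str, max_examples: int = 3) -> str:
    """Assemble a few-shot prompt: reference scripts first, then the new topic."""
    parts = ["Write a narrated short-video script in the same style as the examples."]
    for i, text in enumerate(examples[:max_examples], start=1):
        parts.append(f"--- Example {i} ---\n{text}")
    parts.append(f"--- New topic ---\n{topic}")
    return "\n\n".join(parts)


def demo() -> str:
    """Build a prompt from a throwaway vault so the sketch runs anywhere."""
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp) / "life-scenario-scripts"
        folder.mkdir()
        (folder / "a.md").write_text("I studied for the exam for eight years...", encoding="utf-8")
        (folder / "b.md").write_text("The day my parents changed the locks...", encoding="utf-8")
        return build_script_prompt(load_example_scripts(Path(tmp)), "Quit my job to open a noodle shop")
```

Because the examples live in ordinary Markdown files, adding or swapping a reference script changes the style without touching the prompt-building code.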

The key trade-off: Coze offers richer CapCut plugins but costs over 10 RMB per video and requires complex drag-and-drop workflow configuration. The Codex Skill approach costs 1–3 RMB per video, offers higher flexibility and easier style changes, but requires minor manual tweaks for CapCut effects. The creator recommends learning both tools and focusing on flowchart thinking rather than picking one.
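
To make the cost gap concrete, the arithmetic can be sketched as below. The per-video figures come from the article (Coze is quoted as "over 10 RMB", so 10 is treated as a lower bound); the 30-video batch in the usage note is an illustrative assumption, and the function names are hypothetical.

```python
COZE_RMB = 10.0                # article says "over 10 RMB", so this is a lower bound
SKILL_RMB_RANGE = (1.0, 3.0)   # article's quoted range for the Codex Skill


def batch_cost(videos: int, cost_per_video_rmb: float) -> float:
    """Total cost in RMB for a batch of videos at a flat per-video price."""
    return videos * cost_per_video_rmb


def savings_range(videos: int) -> tuple[float, float]:
    """(minimum, maximum) RMB saved versus Coze for a batch of the given size."""
    coze_total = batch_cost(videos, COZE_RMB)
    least = coze_total - batch_cost(videos, SKILL_RMB_RANGE[1])
    most = coze_total - batch_cost(videos, SKILL_RMB_RANGE[0])
    return least, most


def reduction_factor_range() -> tuple[float, float]:
    """(minimum, maximum) ratio of the Coze price to the Skill price."""
    return COZE_RMB / SKILL_RMB_RANGE[1], COZE_RMB / SKILL_RMB_RANGE[0]
```

For a hypothetical batch of 30 videos, `savings_range(30)` gives between 210 and 270 RMB saved, and `reduction_factor_range()` shows the per-video price falling by a factor of about 3.3 to 10.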

Takeaways
Viral 'Life Scenario' videos follow a fixed structure: opening video, theme freeze frame, and main content with voiceover, subtitles, images, and background music.
The creator built a two-part Skill: a script generator using Obsidian as a knowledge base, and a video generator that outputs a CapCut draft.
Cost per video dropped from over 10 RMB (Coze) to 1–3 RMB (Codex Skill), a reduction of roughly 3x to 10x.
Changing the visual style in the Skill requires only a single command, unlike in Coze, where a style change affects the entire workflow.
The full pipeline is: topic + script → voiceover → subtitle alignment → semantic scene segmentation → image generation → partial image-to-video → CapCut draft.
The creator recommends learning both Coze and Skill, as each has unique strengths: Coze for rich plugins and easy video production, Skill for low cost and high flexibility.
Flowchart thinking—breaking down a process from surface to components to workflow—is the core skill for building such agents.
The Skill has been shared in a co-learning group's resource library for direct reuse.
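
The pipeline listed in the takeaways can be sketched as an ordered chain of stages, each checking that its inputs exist before it runs. This is an illustrative sketch only: the stage names mirror the article, but the `Job` class, the `make_stage` helper, and the dependency lists are assumptions, and every stage writes a placeholder instead of calling a real voice, image, or CapCut tool.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Job:
    topic: str
    script: str
    artifacts: dict[str, str] = field(default_factory=dict)  # stage name -> output
    log: list[str] = field(default_factory=list)             # stages run, in order


Stage = Callable[[Job], None]


def make_stage(name: str, needs: list[str]) -> Stage:
    """Build a placeholder stage that verifies its inputs and records an output."""
    def run(job: Job) -> None:
        missing = [n for n in needs if n not in job.artifacts]
        if missing:
            raise RuntimeError(f"{name} is missing inputs: {missing}")
        job.artifacts[name] = f"<{name} output>"  # a real stage would call a tool here
        job.log.append(name)
    return run


# Order follows the article; the "needs" lists are assumed dependencies.
PIPELINE: list[Stage] = [
    make_stage("voiceover", needs=[]),
    make_stage("subtitle_alignment", needs=["voiceover"]),
    make_stage("scene_segmentation", needs=["subtitle_alignment"]),
    make_stage("image_generation", needs=["scene_segmentation"]),
    make_stage("image_to_video", needs=["image_generation"]),
    make_stage("capcut_draft", needs=["voiceover", "subtitle_alignment",
                                      "image_generation", "image_to_video"]),
]


def run_pipeline(job: Job) -> Job:
    """Run every stage in order, mutating and returning the job."""
    for stage in PIPELINE:
        stage(job)
    return job
```

Writing the dependencies down explicitly is one way to practice the flowchart thinking the creator recommends: a stage that runs out of order fails immediately with the name of the missing input.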

Conclusions

The real innovation here isn't the AI—it's the cost arbitrage. By swapping a high-cost, low-flexibility platform (Coze) for a low-cost, high-flexibility one (Codex), the creator turned a viral format into a sustainable micro-business.

Using Obsidian as a knowledge base for script generation is a clever way to inject domain-specific style without extensive hand-tuned prompts. It turns the knowledge base into a living style guide that improves as more example scripts are added.

The creator's 'flowchart thinking' approach—reverse-engineering a viral video format into a repeatable workflow—is a transferable skill that applies to any content vertical, not just Life Scenario videos.

The willingness to mix tools (Coze for comics, Codex for Life Scenario) rather than commit to one platform reflects a pragmatic, results-driven mindset that Western developers can learn from.

The 1–3 RMB cost per video suggests that AI-powered content creation is becoming accessible to solo creators, not just teams with budgets. This could democratize short-form video production globally.

Concepts & terms
Codex
As described in the article, an AI agent platform on which users create 'Skills': automated workflows that generate scripts, images, videos, and more, at low cost and with high flexibility.
Coze
A competing AI agent platform with a richer plugin ecosystem (especially for CapCut video editing) but higher per-run costs and more complex workflow configuration.
Obsidian
A local, markdown-based knowledge base tool that can be used as a repository for reference materials, enabling AI agents to mimic specific writing or content styles.
CapCut
A popular video editing app (by ByteDance) that supports AI-generated drafts, which can be produced programmatically via plugins or Skills.
Life Scenario Video
A viral short-video format where a narrator role-plays a dramatic, often emotional life story (e.g., 'Full-time civil service exam for 8 years, kicked out by parents'), using a fixed structure of opening, theme freeze frame, and narrated main content.
Source: juejin.cn