The AI Design Workflow Is Not a Single Tool: A Full Breakdown of Figma MCP, Claude Design, Codex, and Stitch
For Western developers and designers building with AI, this framework is the difference between a demo that looks good and a production system that holds up. The core lesson is that AI amplifies skill but doesn't replace judgment. Teams that ship understand the cost-quality curve of each tool and keep data channels (MCP) separate from operation scripts (Skills); teams that don't end up burning tokens on pretty screenshots.
The idea that a single AI prompt can batch-produce design drafts is a myth. A production-ready pipeline requires four or five tools working in relay, each with a distinct role: Figma as the structured data source, Stitch for cheap exploration, Claude Design for high-fidelity first drafts, and Claude Code or Codex for code generation. Using any one tool for everything wastes tokens and breaks style consistency.
The key insight is the cost-quality curve. Claude Design delivers the highest visual fidelity but can consume 8% of a monthly quota for a single dashboard. Codex uses roughly a quarter to a third of the tokens, but its output is rougher and typically needs 2-5px corrections. Stitch is free and fast (15-30 seconds per screen) but is mobile-only and can't learn your design system. Claude Code is the most stable option for writing back to Figma's auto-layout structure.
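The arithmetic behind that curve can be sketched in a few lines. This is only a rough illustration using the figures quoted above; the function names are hypothetical, and the token baseline that Codex is compared against is an assumption, since the text does not say which tool the "1/3 to 1/4" ratio is measured from.

```python
import math

# Figures quoted in the text; treat them as rough observations, not guarantees.
CLAUDE_DESIGN_QUOTA_SHARE_PER_DASHBOARD = 0.08  # "8% of a monthly quota"
CODEX_TOKEN_RATIO_RANGE = (1 / 4, 1 / 3)        # "1/3 to 1/4 the tokens"


def dashboards_per_quota(share_per_dashboard=CLAUDE_DESIGN_QUOTA_SHARE_PER_DASHBOARD):
    """How many whole dashboards fit in one monthly quota at the given share."""
    return math.floor(1 / share_per_dashboard)


def codex_token_estimate(baseline_tokens):
    """Estimated (low, high) Codex token cost for a task.

    Assumption: baseline_tokens is what the same task costs on the tool
    Codex is being compared against.
    """
    low_ratio, high_ratio = CODEX_TOKEN_RATIO_RANGE
    return round(baseline_tokens * low_ratio), round(baseline_tokens * high_ratio)


print(dashboards_per_quota())         # whole dashboards per monthly quota
print(codex_token_estimate(120_000))  # (low, high) token estimate
```

At 8% per dashboard, a monthly quota covers about a dozen high-fidelity dashboards, which is why the cheaper tools handle exploration and follow-up corrections.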
The practical workflow is a six-step cycle: set up the environment (Figma MCP, CLI tools), explore with Stitch, generate a high-fidelity first draft with Claude Design, collaborate on code with Claude Code and Codex, teach the AI your design system via Skills and Variables, and finally produce the component library. The critical engineering practice is separating 'what AI sees' (MCP) from 'how AI acts' (Skills), and never letting AI build the foundational variable library or basic components itself. Those must be hand-crafted by designers.
The most common failure mode is treating AI design tools as interchangeable alternatives rather than a layered stack. Each tool has a specific role: data layer (Figma), exploration layer (Stitch), high-fidelity layer (Claude Design), and code layer (Claude Code/Codex). Mixing them up is the fastest way to waste budget.
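One way to make the layered stack explicit is a small lookup that rejects a tool used outside its layer. This is a minimal sketch: the layer-to-tool mapping comes from the text, while the `Layer` enum and the `is_valid_assignment` helper are hypothetical names introduced for illustration.

```python
from enum import Enum


class Layer(Enum):
    DATA = "data"
    EXPLORATION = "exploration"
    HIGH_FIDELITY = "high-fidelity"
    CODE = "code"


# Mapping taken from the roles described in the text.
TOOLS_BY_LAYER = {
    Layer.DATA: ("Figma",),
    Layer.EXPLORATION: ("Stitch",),
    Layer.HIGH_FIDELITY: ("Claude Design",),
    Layer.CODE: ("Claude Code", "Codex"),
}


def is_valid_assignment(tool, layer):
    """Return True only if the tool belongs to the layer it is being used for."""
    return tool in TOOLS_BY_LAYER[layer]


print(is_valid_assignment("Stitch", Layer.EXPLORATION))  # tool in its own layer
print(is_valid_assignment("Stitch", Layer.CODE))         # tool outside its layer
```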
The real bottleneck isn't model capability but the engineering discipline of separating 'what AI sees' (MCP) from 'how AI acts' (Skills). Teams that skip this separation end up with AI that 'freestyles' and breaks the design system.
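The separation can be pictured as two distinct data types: a read-only context standing in for what arrives over MCP, and a list of proposed operations standing in for what a Skill directs the AI to do. The sketch below is purely illustrative; `DesignContext`, `StyleOp`, and `freestyle_ops` are hypothetical names, not part of any MCP or Skills API. It flags operations whose values fall outside the design system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DesignContext:
    """What the AI sees: a read-only snapshot of design-system variables."""
    variables: frozenset


@dataclass(frozen=True)
class StyleOp:
    """How the AI acts: one proposed style change on a node."""
    node: str
    prop: str
    value: str


def freestyle_ops(ctx, ops):
    """Return the operations whose value is not a known design-system variable."""
    return [op for op in ops if op.value not in ctx.variables]


ctx = DesignContext(variables=frozenset({"color/text/primary", "space/md"}))
ops = [
    StyleOp("Card/Title", "fill", "color/text/primary"),
    StyleOp("Card/Body", "fill", "#3A3A3A"),  # raw hex: outside the system
]
print(freestyle_ops(ctx, ops))
```

Keeping the context frozen means an operation can be checked against it but never change it, which mirrors the point that the variable library stays designer-owned.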
The 50% threshold is a powerful decision gate. It forces teams to distinguish between high-ROI AI use (generating new pages, where AI saves 80% of the work) and negative-ROI use (building variable libraries, where AI creates 60% more work to fix).
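A decision gate like this reduces to one comparison. The sketch below assumes, and this is an interpretation rather than something the text spells out, that the threshold means "use AI only when it saves at least 50% of the manual effort, rework included". The function names and the hour figures are hypothetical, chosen to match the 80% and 60% examples.

```python
def net_savings(manual_hours, ai_hours_including_fixes):
    """Fraction of manual effort saved by the AI route (negative = extra work)."""
    return (manual_hours - ai_hours_including_fixes) / manual_hours


def passes_gate(manual_hours, ai_hours_including_fixes, threshold=0.5):
    """Assumed reading of the gate: proceed only if savings reach the threshold."""
    return net_savings(manual_hours, ai_hours_including_fixes) >= threshold


# New page: 10 hours by hand, 2 hours with AI plus fixes (80% saved).
print(passes_gate(10, 2))
# Variable library: 10 hours by hand, 16 hours with AI plus fixes (60% more work).
print(passes_gate(10, 16))
```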
The most surprising finding is that specifying a design system in Claude Design's prompt actually reduces output quality. The model's attention is split between 'following rules' and 'making it beautiful', and the default style is already trained on production-grade design languages.
The Skills ecosystem is currently fragmented: Claude and Codex use different file paths and conventions. That leaves a standardization window open, and whoever converges the spec first will likely become the de facto standard for AI design tooling.
The 'relay' model (Claude Code for skeleton, Codex for styles) is more efficient than any single model because it allocates the token budget to the model best suited for each task. Mixing contexts in one session causes Skills contamination and degrades both outputs.
Stitch's mobile-first quality gap is a training data distribution problem, not a fundamental limitation. The model has seen far more beautiful mobile settings pages than beautiful BI dashboards. This is a structural bias that will take time to correct.
The handoff from Claude Design to Claude Code is a 'lossy compression' step: structure is preserved, but styles are lost. This is intentional, not a bug. The engineering insight is to treat the handoff as one phase of a staged strategy, with structure landing first, rather than as a failure.
Training AI on a design system is not a one-time event. Every time a new semantic variable is added or a namespace changes, the Skills file must be updated and regression tests re-run. The Skills file is the design system's API documentation.
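One regression check that fits this maintenance loop is verifying that every design-system variable is mentioned in the Skills file. The sketch below is a minimal version under stated assumptions: the slash-separated variable names and the sample Skills text are invented for illustration, and `undocumented_variables` is a hypothetical helper, not part of any tool mentioned here.

```python
import re


def undocumented_variables(design_variables, skills_text):
    """Return design-system variables that never appear in the Skills text.

    A name counts as mentioned only when it appears as a whole token, so
    'color/text/primary-hover' does not count as 'color/text/primary'.
    """
    missing = []
    for name in design_variables:
        pattern = r"(?<![\w/-])" + re.escape(name) + r"(?![\w/-])"
        if not re.search(pattern, skills_text):
            missing.append(name)
    return sorted(missing)


# Hypothetical variable names and Skills content, for illustration only.
variables = ["color/text/primary", "color/text/secondary", "space/md"]
skills_text = (
    "Use color/text/primary for body copy.\n"
    "Use space/md between cards.\n"
)
print(undocumented_variables(variables, skills_text))
```

Run as part of the regression suite, a non-empty result signals that a variable was added or renamed without the Skills file being updated.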
The ultimate competitive advantage for a team in 2026 will not be the number of designers, but the depth to which their design system can be understood by AI. A well-maintained Skills repository is the new moat.