Streaming AI Agent Terminal Output to the Browser with SSE

Stream & SSE 101 — Receiving AI Agent Terminal Output in Real-Time on the Web

Foreword

Recently, I've been integrating an AI Agent into a web frontend. The requirement is straightforward: a user triggers an AI Agent (Claude Code, Codex, etc.) from a webpage and watches its output stream in real time, like watching a terminal, with text appearing as it is produced instead of arriving all at once after the Agent finishes.

The core technologies for this scenario are Stream and SSE (Server-Sent Events). I spent over a week on this and hit quite a few pitfalls; here's a summary.

Why Not WebSocket

When people hear "real-time communication," many immediately think of WebSocket. But WebSocket is heavier than this scenario needs: it is full-duplex, meaning the client and server can send messages to each other at any time over a separate protocol. An AI Agent's output only needs one-way server-to-client push, so WebSocket is overkill.

SSE is naturally designed for this kind of scenario:

Feature	SSE	WebSocket
Communication Direction	Server → Client (one-way)	Bidirectional
Protocol	Plain HTTP	WebSocket (ws:// or wss://, upgraded from HTTP)
Auto-Reconnect	Built-in	Manual implementation required
Browser Support	All modern browsers	All modern browsers
Proxy/Firewall Traversal	Good (HTTP-based)	Occasionally blocked
Complexity	Low	High

In a nutshell: If you only need the server to push data, use SSE; if you need bidirectional communication, use WebSocket.

What is a Stream

Before discussing SSE, let's clarify the concept of "Stream." A Stream is essentially a chunked data transfer pattern.

Traditional HTTP requests follow a "request-response" model: the client sends a request, waits, the server finishes processing, and returns the entire response body at once. For AI Agent tasks that can take tens of seconds or even minutes, the user experience is terrible—you stare at a spinning loading indicator with no idea what's happening.

The Stream approach: the server doesn't hoard data; it sends a bit as soon as it's generated. The client receives a chunk and renders it immediately, just like watching terminal output.

Traditional: Request ──────────────────────────────> Full Response
Stream:     Request ──> chunk1 ──> chunk2 ──> chunk3 ──> ... ──> [DONE]

Backend Stream

Taking Node.js as an example, the backend pushes data chunk by chunk using ReadableStream or the framework's built-in stream capabilities:

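// Express example
import express from 'express';
import { spawn } from 'node:child_process';

const app = express();
app.use(express.json()); // Parse the JSON body so req.body.prompt is available

app.post('/api/agent/run', (req, res) => {
  // Key: Set SSE response headers
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');
  res.flushHeaders(); // Send headers right away so the client sees the stream open

  const agent = spawn('claude', ['--print', req.body.prompt]);
  // Decode as UTF-8 across chunk boundaries so multi-byte characters are not split
  agent.stdout.setEncoding('utf8');

  agent.stdout.on('data', (text) => {
    // Push each stdout chunk (not necessarily a whole line) to the client as an SSE event
    res.write(`data: ${JSON.stringify({ text })}\n\n`);
  });

  agent.on('close', () => {
    res.write(`data: ${JSON.stringify({ done: true })}\n\n`);
    res.end();
  });
});

app.listen(3000); // Port chosen arbitrarily for the example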

Note the use of res.write() instead of res.send(): res.write() sends one chunk and keeps the response open, while res.send() sends the whole body and ends the response.

SSE Protocol

SSE (Server-Sent Events) is a one-way push protocol built on top of HTTP. Its data format is very simple:

data: {"text": "Hello"}

data: {"text": " World"}

data: {"done": true}

There are only three rules:

Each event starts with data:
The event content follows the colon
An event ends with two newline characters \n\n

That's it. No handshake, no frames, no binary—pure text.

Additional SSE Fields

Besides data, SSE supports several optional fields:

id: 42
event: message
retry: 3000
data: {"text": "Hello"}

Field	Purpose
`data`	Event data, supports multiple lines (one `data:` per line)
`event`	Event type, clients can handle differently based on this
`id`	Event ID, sent back as Last-Event-ID so the stream can resume after a disconnect
`retry`	Reconnection wait time (milliseconds)
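
As a rough illustration of the format rules and optional fields above, here is a minimal sketch of a serializer. The name formatSSE and its argument shape are assumptions made for this example, not part of any library. It also shows why a multi-line payload needs one data: line per line.

```javascript
// Serialize one SSE event. `formatSSE` is an illustrative helper, not a library API.
function formatSSE({ data, id, event, retry }) {
  const lines = [];
  if (id !== undefined) lines.push(`id: ${id}`);
  if (event !== undefined) lines.push(`event: ${event}`);
  if (retry !== undefined) lines.push(`retry: ${retry}`);
  // A payload containing newlines must be split into one `data:` line per line,
  // otherwise an embedded blank line would terminate the event early.
  for (const line of String(data).split(/\r\n|\r|\n/)) {
    lines.push(`data: ${line}`);
  }
  // The trailing blank line marks the end of the event.
  return lines.join('\n') + '\n\n';
}

console.log(formatSSE({ id: 42, event: 'message', data: 'line one\nline two' }));
```

Payloads serialized with JSON.stringify never contain raw newlines, which is one reason the examples in this article wrap every chunk in JSON.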

Architecture Design

Below is a diagram showing the architecture for the entire AI Agent streaming output:

┌─────────────────────────────────────────────────────────────────────┐
│                            Browser                                  │
│                                                                     │
│  ┌───────────────┐   POST /api/agent/run   ┌──────────────────────┐ │
│  │   UI Layer    │ ───────────────────────► │ fetch + Readable    │ │
│  │   (React)     │                          │ Stream (SSE parser) │ │
│  └───────────────┘                          └──────────┬──────────┘ │
│          │                                             │            │
│          │ render chunk by chunk                       │ HTTP       │
│          ▼                                             ▼            │
│  ┌───────────────┐                          ┌─────────────────────┐ │
│  │   Terminal    │ ◄─────────────────────── │ EventSource /       │ │
│  │   xterm.js    │   text: "Hello\n"        │ fetch SSE client    │ │
│  └───────────────┘   text: "World\n"        └─────────────────────┘ │
│                     done: true                                      │
└──────────────────────────────────────┬──────────────────────────────┘
                                       │ HTTP (SSE)
                                       │ Content-Type: text/event-stream
                                       ▼
┌──────────────────────────────────────┴──────────────────────────────┐
│                        Nginx / Reverse Proxy                        │
│              proxy_buffering off;  chunked_transfer_encoding on;    │
└──────────────────────────────────────┬──────────────────────────────┘
                                       │
                                       ▼
┌─────────────────────────────────────────────────────────────────────┐
│                       Backend Server (Node.js)                      │
│                                                                     │
│  ┌──────────────────┐    spawn / API call    ┌───────────────────┐  │
│  │ SSE Route Handler│ ─────────────────────► │ AI Agent Process  │  │
│  │ Set SSE headers  │                        │ Claude Code       │  │
│  │ Push chunk by    │ ◄─── stdout.on('data') │ Codex / Others    │  │
│  └──────────────────┘                        └───────────────────┘  │
│                                                                     │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │ Session Manager                                               │  │
│  │ - Maintain each Agent session state                           │  │
│  │ - Track Last-Event-ID for resume support                      │  │
│  └───────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────┘

Data flow:

The user initiates a request from the frontend (POST prompt)
The backend spawns the AI Agent process and establishes an SSE long connection
Every time the Agent produces a line of output, the backend wraps it as an SSE event and pushes it
The frontend receives it chunk by chunk and renders it in real-time to the terminal component

Resumable Delivery: Last-Event-ID

SSE has a very practical mechanism—resumable delivery. When a network hiccup causes a connection interruption, the client doesn't need to start from scratch; it can continue receiving from where it left off.

This is the role of Last-Event-ID. It is not a query parameter you manage yourself but an HTTP request header that the browser's EventSource sends automatically when it reconnects:

How It Works

Server sends:
  id: 1
  data: {"text": "Hello"}

  id: 2
  data: {"text": " World"}

  id: 3
  data: {"text": "!"}

        ── Network disconnected ──

Client auto-reconnects, request header includes:
  Last-Event-ID: 3    ← Tells the server "I received id=3"

Server continues pushing from id=4:
  id: 4
  data: {"text": " Done!"}

Backend Implementation

app.get('/api/agent/stream', (req, res) => {
  // Last event ID the client received (undefined on a first connection)
  const lastId = req.headers['last-event-id'];

  // SSE response headers
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');

  // sessionStore is an application-specific helper (not shown) that keeps each
  // session's Agent process and the events it has emitted so far.
  // getAgent is assumed here as the way to look up the running process.
  const sessionId = req.query.sessionId;
  const agent = sessionStore.getAgent(sessionId);
  const history = sessionStore.getHistory(sessionId, lastId);

  // First, replay the events the client missed (those after lastId)
  for (const event of history) {
    res.write(`id: ${event.id}\ndata: ${JSON.stringify(event.data)}\n\n`);
  }

  // Then continue forwarding new Agent output
  const onData = (chunk) => {
    const eventId = sessionStore.nextId(sessionId);
    res.write(
      `id: ${eventId}\ndata: ${JSON.stringify({ text: chunk.toString() })}\n\n`,
    );
  };
  agent.stdout.on('data', onData);

  // Stop forwarding to this response once the client goes away
  res.on('close', () => agent.stdout.off('data', onData));
});
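
The sessionStore.getHistory call above depends on the server remembering what it has already sent. As a hedged sketch of one way to do that, the in-memory log below assigns increasing IDs and returns everything after a given Last-Event-ID. The class name EventLog, its methods, and the cap of 1000 events are all assumptions for illustration; a real deployment might persist events elsewhere.

```javascript
// Illustrative in-memory event log; `EventLog` and its methods are assumptions,
// not part of any library.
class EventLog {
  constructor(maxEvents = 1000) {
    this.maxEvents = maxEvents;
    this.events = []; // [{ id, data }], ordered by ascending id
    this.lastId = 0;
  }

  // Store a new event and return it with its assigned id.
  append(data) {
    const event = { id: ++this.lastId, data };
    this.events.push(event);
    // Drop the oldest event once the cap is exceeded.
    if (this.events.length > this.maxEvents) this.events.shift();
    return event;
  }

  // Return the events a client missed. `lastEventId` comes from the
  // Last-Event-ID header, so it is a string or undefined.
  since(lastEventId) {
    const last = Number.parseInt(lastEventId ?? '0', 10);
    if (Number.isNaN(last)) return [...this.events];
    return this.events.filter((event) => event.id > last);
  }
}

const log = new EventLog();
log.append({ text: 'Hello' });
log.append({ text: ' World' });
log.append({ text: '!' });
console.log(log.since('1').map((event) => event.id));
```

Capping the log trades memory for completeness: a client that reconnects after its events were evicted can only resume from the oldest event still held.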

Frontend Implementation

Native EventSource automatically handles Last-Event-ID with no extra code. If using fetch + ReadableStream, you need to implement it manually:

let lastEventId = null;

async function connectWithResume(sessionId) {
  const headers = { Accept: 'text/event-stream' }; // A GET has no body, so no Content-Type is needed
  if (lastEventId) {
    headers['Last-Event-ID'] = lastEventId; // Manually include the breakpoint ID
  }

  const response = await fetch(`/api/agent/stream?sessionId=${sessionId}`, {
    method: 'GET',
    headers,
  });

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    buffer += decoder.decode(value, { stream: true });
    const events = buffer.split('\n\n');
    buffer = events.pop();

    for (const event of events) {
      const idMatch = event.match(/^id:\s*(.+)$/m);
      const dataMatch = event.match(/^data:\s*(.*)$/m);
      if (idMatch) lastEventId = idMatch[1]; // Record the latest ID
      if (dataMatch) {
        const data = JSON.parse(dataMatch[1]);
        if (data.done) return; // Completion marker carries no text
        appendToTerminal(data.text);
      }
    }
  }
}

Note: Last-Event-ID is only automatically carried in EventSource. With fetch, you need to manually parse the id field from the response and add it to the request header upon reconnection.
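
With fetch, the reconnect itself is also your responsibility. The following is a minimal sketch of a retry loop with exponential backoff, assuming the connect function (for example, a wrapper around connectWithResume) throws when the network drops and returns normally when the stream completes. The names reconnectDelay and connectWithRetry and the default timings are made up for this example.

```javascript
// Delay before reconnect attempt `attempt` (0-based): doubles each time, capped.
function reconnectDelay(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Keep calling `connect` until it returns normally (stream completed).
// `connect` is any async function that throws on network failure,
// e.g. () => connectWithResume(sessionId).
// `sleep` can be injected for testing; by default it waits with setTimeout.
async function connectWithRetry(connect, { maxAttempts = 5, sleep } = {}) {
  const wait =
    sleep ?? ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));
  for (let attempt = 0; ; attempt++) {
    try {
      return await connect();
    } catch (err) {
      // Give up after maxAttempts failed connections.
      if (attempt + 1 >= maxAttempts) throw err;
      await wait(reconnectDelay(attempt));
    }
  }
}
```

Because lastEventId lives outside the connect function, each retry picks up from the most recent event that was actually parsed.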

How the Frontend Receives Data

Method 1: EventSource (Native API)

Browsers natively provide the EventSource API, which is extremely simple to use:

const source = new EventSource('/api/agent/stream?id=123');

source.onmessage = (event) => {
  const data = JSON.parse(event.data);
  if (data.done) {
    console.log('Agent execution finished');
    source.close();
    return;
  }
  // Append to terminal UI in real-time
  appendToTerminal(data.text);
};

source.onerror = (err) => {
  console.error('SSE connection error', err);
};

But EventSource has a serious limitation: it only supports GET requests and cannot send a request body or custom headers. AI Agents typically need to POST a prompt, which makes it awkward.

Method 2: fetch + ReadableStream (Recommended)

For SSE scenarios requiring POST requests, using fetch with ReadableStream is a more flexible approach:

async function runAgent(prompt) {
  const response = await fetch('/api/agent/run', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ prompt }),
  });

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    buffer += decoder.decode(value, { stream: true });

    // Split SSE events by \n\n
    const events = buffer.split('\n\n');
    buffer = events.pop(); // The last segment might be incomplete, save for next time

    for (const event of events) {
      const match = event.match(/^data:\s*(.*)$/m);
      if (!match) continue;

      const data = JSON.parse(match[1]);
      if (data.done) {
        console.log('Agent execution finished');
        return;
      }
      appendToTerminal(data.text);
    }
  }
}

Core idea:

fetch gets the response.body (a ReadableStream)
Use getReader() to read chunk by chunk
Use TextDecoder to convert binary to text
Split by \n\n to parse SSE events one by one
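
The regex in the example only picks up the first data: line of each event. As a sketch of a slightly more complete parser built from the same steps, the helper below takes raw byte chunks, tolerates events and multi-byte characters split across chunks, skips comment lines, and joins multiple data: lines. The name createSSEParser is an assumption for illustration, and lone \r line endings are not handled.

```javascript
// Illustrative incremental SSE parser; `createSSEParser` is a made-up name.
// Feed it raw byte chunks; it calls onEvent({ id, event, data }) per complete event.
function createSSEParser(onEvent) {
  const decoder = new TextDecoder();
  let buffer = '';

  function parseBlock(block) {
    const parsed = { id: undefined, event: 'message', data: [] };
    for (const line of block.split('\n')) {
      if (line === '' || line.startsWith(':')) continue; // comment, e.g. heartbeat
      const colon = line.indexOf(':');
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? '' : line.slice(colon + 1);
      if (value.startsWith(' ')) value = value.slice(1); // strip one leading space
      if (field === 'data') parsed.data.push(value);
      else if (field === 'id') parsed.id = value;
      else if (field === 'event') parsed.event = value;
    }
    if (parsed.data.length === 0) return; // nothing to dispatch
    onEvent({ id: parsed.id, event: parsed.event, data: parsed.data.join('\n') });
  }

  return {
    feed(bytes) {
      // stream: true holds back a partial multi-byte character until the next chunk
      const text = decoder.decode(bytes, { stream: true });
      buffer = (buffer + text).replace(/\r\n/g, '\n');
      const blocks = buffer.split('\n\n');
      buffer = blocks.pop(); // last piece may be an incomplete event
      blocks.forEach(parseBlock);
    },
  };
}

// Usage: inside the read loop, call parser.feed(value) instead of decoding by hand.
const parser = createSSEParser((event) => console.log(event));
parser.feed(new TextEncoder().encode('id: 1\ndata: {"text":"Hello"}\n\n'));
```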

Method 3: Third-Party Libraries

If you don't want to parse the SSE format yourself, you can use an off-the-shelf library:

npm install @microsoft/fetch-event-source

import { fetchEventSource } from '@microsoft/fetch-event-source';

await fetchEventSource('/api/agent/run', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ prompt }),

  onmessage(ev) {
    const data = JSON.parse(ev.data);
    if (data.done) return;
    appendToTerminal(data.text);
  },

  onerror(err) {
    console.error('SSE error', err);
    // Returning normally lets the library retry; throw here to stop retrying
  },

  onclose() {
    console.log('Connection closed');
  },
});

The advantage of this Microsoft library: it supports POST, automatic reconnection, and custom headers, making it much more useful than native EventSource.

Frontend Rendering Solutions

Once you have the streaming data, how do you render it? Claude Code's output contains ANSI escape sequences (colors, cursor movements, progress bars, etc.), not plain text, so the choice of rendering solution is critical.

Solution 1: xterm.js (Recommended)

xterm.js is the de facto standard for frontend terminal rendering; VS Code's built-in terminal uses it. It fully supports ANSI escape sequences and can reproduce Claude Code's terminal output 1:1.

npm install @xterm/xterm @xterm/addon-fit @xterm/addon-web-links

import { Terminal } from '@xterm/xterm';
import { FitAddon } from '@xterm/addon-fit';
import { WebLinksAddon } from '@xterm/addon-web-links';
import '@xterm/xterm/css/xterm.css';

const term = new Terminal({
  theme: {
    background: '#1e1e1e',
    foreground: '#d4d4d4',
    cursor: '#d4d4d4',
  },
  fontSize: 14,
  fontFamily: "'Fira Code', 'Menlo', monospace",
  cursorBlink: true,
  scrollback: 10000,
  convertEol: true, // Treat a bare \n as \r\n; piped process output usually lacks \r
});

const fitAddon = new FitAddon();
term.loadAddon(fitAddon);
term.loadAddon(new WebLinksAddon()); // Auto-detect links
term.open(document.getElementById('terminal'));
fitAddon.fit();

// Receive SSE data, write directly to terminal
function onSSEMessage(data) {
  if (data.done) {
    term.writeln('\r\n\x1b[32m✓ Agent execution finished\x1b[0m');
    return;
  }
  // xterm.js natively supports ANSI escape sequences, just write directly
  term.write(data.text);
}

Advantages:

Full support for ANSI colors, cursor movement, screen clearing, etc.
Supports text selection, copy/paste
Excellent performance, no lag with large output
Active community, rich plugins

Solution 2: ansi_up (Lightweight)

If you don't want to introduce a full terminal emulator, ansi_up can convert ANSI escape sequences to HTML for rendering with regular DOM elements:

npm install ansi_up

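import { AnsiUp } from 'ansi_up'; // ansi_up v6+; v5 and earlier use a default export

const ansiUp = new AnsiUp();
const output = document.getElementById('output');

function appendToTerminal(text) {
  const html = ansiUp.ansi_to_html(text);
  // Append only the new fragment; innerHTML += would re-parse all existing output
  output.insertAdjacentHTML('beforeend', html);
  output.scrollTop = output.scrollHeight;
}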

Advantages: Lightweight (~10KB), suitable for simple scenarios needing only color rendering.

Disadvantages: Does not support complex ANSI operations like cursor movement or progress bar overwriting.

Solution Comparison

Feature	xterm.js	ansi_up	Plain `<pre>`
ANSI Colors	✅	✅	❌
Cursor Movement / Clear Screen	✅	❌	❌
Progress Bar Overwrite Refresh	✅	❌	❌
Text Selection/Copy	✅	✅	✅
Bundle Size	~200KB	~10KB	0
Use Case	Full terminal experience	Simple colored output	Plain text

Conclusion: For rendering Claude Code output, xterm.js is the first choice. Its ANSI compatibility is the best and can fully reproduce the terminal experience. If you're just displaying simple colored logs, ansi_up is sufficient.

Streaming Markdown Rendering (For Conversational AI Output)

If the frontend isn't for a pure terminal display but a conversational UI like ChatGPT, you need a renderer that can handle streaming Markdown. Besides terminal streams, Claude Code's output often contains Markdown-formatted content (code blocks, file diffs, lists, etc.).

Plain react-markdown struggles with the unclosed code blocks, incomplete tables, and other partial syntax that appear mid-stream in AI output. Streamdown (by Vercel, 5k+ stars) is designed specifically for this problem:

npm install streamdown @streamdown/code @streamdown/math @streamdown/mermaid @streamdown/cjk

import { useChat } from '@ai-sdk/react';
import { Streamdown } from 'streamdown';
import { code } from '@streamdown/code';
import { mermaid } from '@streamdown/mermaid';
import { math } from '@streamdown/math';
import { cjk } from '@streamdown/cjk';
import 'katex/dist/katex.min.css';
import 'streamdown/styles.css';

export default function Chat() {
  const { messages, status } = useChat();

  return (
    <div>
      {messages.map((message) => (
        <div key={message.id}>
          {message.role === 'user' ? 'User: ' : 'AI: '}
          {message.parts.map((part, index) =>
            part.type === 'text' ? (
              <Streamdown
                key={index}
                plugins={{ code, mermaid, math, cjk }}
                isAnimating={status === 'streaming'}>
                {part.text}
              </Streamdown>
            ) : null,
          )}
        </div>
      ))}
    </div>
  );
}

Streamdown's core advantages:

Designed specifically for AI streaming output, gracefully handles unclosed Markdown blocks
Built-in Shiki code highlighting, KaTeX math formulas, Mermaid diagrams
Supports CJK typography optimization
Plugin-based architecture, import on demand, tree-shakeable
Built-in security hardening (rehype-harden) to prevent XSS

Code Highlighting Library Comparison (Streamdown has Shiki built-in; the following is for reference in standalone usage scenarios):

Library	Bundle Size	Language Support	Features
Shiki	~500KB	100+	VS Code equivalent syntax highlighting, best results
Prism.js	~20KB	270+	Lightweight, rich plugins

Common UI Component Summary

Scenario	Recommended Solution	Representative Project
Full Terminal Experience	xterm.js (14k stars)	VS Code Terminal, Claude Code Web
Simple Colored Logs	ansi_up	Lightweight log panel
Conversational AI Output	Streamdown (5k+ stars)	ChatGPT, Claude Web
File Diff Display	react-diff-viewer	GitHub PR, Code Review tools
Mind Map	react-markmap	Real-time mind map preview

Keeping Long Connections Alive: Heartbeat Keep-Alive

SSE is essentially an HTTP long connection. In theory, it stays open as long as neither side actively closes it. But in real-world production, proxy layers (Nginx, CDN, load balancers) usually have an idle timeout mechanism—if no data is transmitted for a period, they will actively disconnect.

For example, Nginx's default proxy_read_timeout is 60 seconds, meaning if no data flows through for 60 seconds, the connection gets cut. AI Agents sometimes think for a long time (e.g., Claude Code analyzing a large file), during which there might be no output for tens of seconds, putting the connection at risk.

Solution: Server-Side Heartbeat

The server periodically sends an SSE comment (a line starting with :), telling the proxy layer "the connection is still alive." The SSE protocol specifies that lines starting with : are comments and are automatically ignored by the client:

app.post('/api/agent/run', async (req, res) => {
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');

  // Heartbeat: send a comment every 15 seconds to prevent proxy layer timeout disconnection
  const heartbeat = setInterval(() => {
    res.write(': heartbeat\n\n');
  }, 15000);

  const agent = spawn('claude', ['--print', req.body.prompt]);

  agent.stdout.on('data', (chunk) => {
    res.write(`data: ${JSON.stringify({ text: chunk.toString() })}\n\n`);
  });

  agent.on('close', () => {
    res.write(`data: ${JSON.stringify({ done: true })}\n\n`);
    clearInterval(heartbeat); // Clean up heartbeat
    res.end();
  });

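  // Also clean up when the client disconnects. Listen on res rather than req:
  // in recent Node.js versions req can emit 'close' as soon as the request body is read.
  res.on('close', () => {
    clearInterval(heartbeat);
    agent.kill();
  });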
});

Key points:

Recommended heartbeat interval: 15-30 seconds. Too short wastes bandwidth; too long risks exceeding the proxy's idle timeout (60 seconds by default in Nginx).
Use the : heartbeat\n\n format; the client ignores it automatically, so business logic is unaffected.
When the Agent finishes or the client disconnects, clear the setInterval timer; otherwise it keeps firing and holds a reference to the response, leaking memory.

No Need to Poll and Re-establish Connection

Some might ask: Do we need to re-send a request every minute with a since parameter?

No. SSE itself is a persistent connection. As long as the heartbeat keep-alive is done correctly, the connection will remain open. Only when the connection actually breaks (network failure, proxy timeout, etc.) do you need to reconnect—at that point, use Last-Event-ID to resume from the breakpoint, no need to start from scratch.

Pitfalls Encountered

1. Nginx Proxy Buffering

SSE's biggest enemy is proxy layer buffering. Nginx buffers responses by default, causing the client not to receive real-time data. Solution:

location /api/agent/ {
  proxy_pass http://backend;
  proxy_buffering off;          # Disable buffering
  proxy_cache off;              # Disable caching
  proxy_read_timeout 300s;      # Long connection timeout
  chunked_transfer_encoding on;
}
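
If you cannot edit the Nginx configuration, Nginx also honors an X-Accel-Buffering: no response header, which turns off buffering for that one response. A minimal sketch, assuming a Node.js or Express response object (the helper name setSSEHeaders is made up for this example):

```javascript
// `setSSEHeaders` is an illustrative helper; `res` is a Node.js/Express response.
function setSSEHeaders(res) {
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');
  // Asks Nginx not to buffer this particular response, even when
  // proxy_buffering is on for the location.
  res.setHeader('X-Accel-Buffering', 'no');
}
```

Other proxies and CDNs have their own buffering settings, so this header is not a general substitute for checking each layer in front of the backend.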

2. Connection Limit

Under HTTP/1.1, browsers have a limit on concurrent connections to the same domain (usually 6). If multiple Agent sessions are opened simultaneously, connections might be maxed out. Solutions:

Upgrade to HTTP/2 (multiplexing, not subject to this limit)
Or use WebSocket instead

3. Large Output Memory Issues

AI Agents sometimes output large amounts of content (e.g., reading an entire file). If the frontend keeps appending to the DOM, the page will get increasingly laggy. Suggestions:

Limit the maximum number of lines, truncate the head when exceeded
Use virtual scrolling to only render the visible area
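
For the first suggestion, here is a small sketch of a capped line buffer for DOM-based rendering; the name createLineBuffer and the default limit are assumptions for illustration. With xterm.js the scrollback option already plays this role, so the sketch mainly applies to a plain element or ansi_up output.

```javascript
// Illustrative capped buffer; `createLineBuffer` is a made-up helper.
function createLineBuffer(maxLines = 5000) {
  let lines = [''];
  return {
    // Append a chunk of streamed text, which may contain zero or more newlines.
    append(text) {
      const parts = text.split('\n');
      lines[lines.length - 1] += parts[0]; // continue the current line
      for (let i = 1; i < parts.length; i++) lines.push(parts[i]);
      // Truncate from the head once the cap is exceeded.
      if (lines.length > maxLines) lines = lines.slice(-maxLines);
    },
    toString() {
      return lines.join('\n');
    },
    get lineCount() {
      return lines.length;
    },
  };
}

// Usage: keep only the newest lines, then render buffer.toString().
const buffer = createLineBuffer(3);
buffer.append('a\nb');
buffer.append('c\nd\ne');
console.log(buffer.toString());
```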

Summary

Scenario	Recommended Solution
SSE requiring only GET	Native `EventSource`
SSE requiring POST (e.g., AI Agent)	`fetch + ReadableStream` or `@microsoft/fetch-event-source`
Need bidirectional communication	WebSocket
Need full terminal experience	xterm.js
Conversational AI output (streaming Markdown)	Streamdown

The Stream + SSE combo works excellently in AI Agent scenarios: lightweight, one-way, auto-reconnect, good compatibility. Compared to WebSocket's complex handshake and bidirectional communication, SSE is tailor-made for the "server pushes data" scenario.