
WebSocket vs. SSE: A No-Library Guide to Streaming Protocols for AI Apps

Preface

As AI applications spread, from large-model chat to real-time data analysis and AI-driven collaborative tools, streaming transmission is being used more and more widely. Unlike the traditional request-response model, which returns the complete payload in one piece, streaming pushes data in segments as it becomes available, which lowers perceived latency and improves the user experience. The word-by-word replies of AI chatbots, live subtitles for voice transcription, and real-time alerts from smart monitoring all rely on it.

WebSocket and SSE are the two mainstream ways to implement streaming: WebSocket serves full-duplex communication, and SSE serves one-way server push. Both are built on the HTTP ecosystem, so they are widely compatible and straightforward to deploy. Understanding their underlying principles and native implementation is the foundation for building AI streaming applications and real-time interaction efficiently. This article uses no third-party libraries. It works entirely with the native Node.js http module and walks through protocol principles, frame structure, code implementation, and the relevant standards, as a reference for development and solution design.

1. WebSocket Protocol: Full-Duplex Communication Implementation

1.1 Core Protocol Principles (Simplified Explanation)

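WebSocket is an application-layer protocol that provides full-duplex (simultaneous two-way) communication over a single TCP connection. Its core value is breaking out of HTTP's "one question, one answer" model: once established, the connection stays open as a persistent two-way pipeline between client and server, which suits real-time chat, real-time collaboration, and similar scenarios. Three mechanisms make this work:

Handshake upgrade: the client sends an ordinary HTTP request carrying Upgrade: websocket and a random Sec-WebSocket-Key, and the server answers with 101 Switching Protocols and a Sec-WebSocket-Accept value derived from that key.

Frame format communication: after the handshake, both sides exchange data as frames, each consisting of a small binary header followed by the payload.

Persistent two-way connection: the underlying TCP connection stays open, and either side can send a frame at any time until one of them closes it.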
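The handshake step can be checked in isolation. The following is a minimal sketch of how the server derives Sec-WebSocket-Accept from the client's key; `computeAcceptKey` and `WS_GUID` are names made up for this illustration, and the sample key is the one used in the handshake example of RFC 6455.

```javascript
const crypto = require('crypto');

// GUID fixed by RFC 6455; every conforming server appends the same value.
const WS_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// computeAcceptKey is a hypothetical helper name used only in this sketch.
// It concatenates the client's key with the GUID, hashes the result with
// SHA-1, and Base64-encodes the digest.
function computeAcceptKey(secWebSocketKey) {
  return crypto
    .createHash('sha1')
    .update(secWebSocketKey + WS_GUID)
    .digest('base64');
}

// Sample key from the handshake example in RFC 6455.
console.log(computeAcceptKey('dGhlIHNhbXBsZSBub25jZQ=='));
// → s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

The client performs the same computation and drops the connection if the values differ, which guards against servers or intermediaries that do not actually speak WebSocket. It is not an authentication mechanism.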

WebSocket Frame Structure Diagram (Corresponding to RFC 6455 Standard)

The frame is divided into two parts: the frame header (at least 2 bytes) and the payload data (the content actually transmitted). The header fields are packed bit by bit, in the same order in which the frame-parsing code reads them.

Standard Frame Structure (Byte-level Breakdown):

Byte 1: 1 bit (FIN) + 3 bits (RSV1-RSV3) + 4 bits (opcode)

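Byte 2: 1 bit (MASK) + 7 bits (Payload length; the values 126 and 127 signal that an extended length field follows)

Optional fields: Extended payload length (2 bytes when the 7-bit length is 126, 8 bytes when it is 127) + Masking-key (4 bytes, present only when MASK is 1, which is required for every frame sent by the client) + Payload data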

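Field Meanings (Corresponding to Code Parsing Logic):

FIN: 1 if this frame is the final fragment of a message, 0 if more fragments follow.

RSV1-RSV3: reserved for extensions; must be 0 unless an extension has been negotiated.

opcode: the frame type. 0x0 continuation, 0x1 text, 0x2 binary, 0x8 close, 0x9 ping, 0xA pong.

MASK: 1 if the payload is masked. Frames from client to server must be masked; frames from server to client must not be.

Payload length: 0 to 125 is the actual length in bytes; 126 means the next 2 bytes hold the length; 127 means the next 8 bytes hold it.

Masking-key: 4 bytes; payload byte i is XORed with key byte i mod 4.

Payload data: the application data, UTF-8 text when the opcode is 0x1.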

Reference Diagram Source: In addition to the original diagram in RFC 6455, you can refer to the visualization diagram of MDN WebSocket data frame format for an easier understanding of the field relationships.
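To make the byte-level layout concrete, here is a minimal sketch of a frame parser that also covers the extended length fields. `parseFrame` and the shape of its return value are assumptions made for this illustration, not part of any library. The sample bytes are the masked "Hello" text frame given as an example in RFC 6455.

```javascript
// parseFrame is a hypothetical helper for this sketch.
// It returns null when the buffer does not yet hold one complete frame.
function parseFrame(buffer) {
  if (buffer.length < 2) return null;

  const fin = (buffer[0] & 0x80) !== 0;    // Byte 1, highest bit
  const opcode = buffer[0] & 0x0f;         // Byte 1, low 4 bits
  const masked = (buffer[1] & 0x80) !== 0; // Byte 2, highest bit
  let payloadLen = buffer[1] & 0x7f;       // Byte 2, low 7 bits
  let offset = 2;

  if (payloadLen === 126) {
    // The next 2 bytes hold the real length (big-endian).
    if (buffer.length < offset + 2) return null;
    payloadLen = buffer.readUInt16BE(offset);
    offset += 2;
  } else if (payloadLen === 127) {
    // The next 8 bytes hold the real length (big-endian).
    if (buffer.length < offset + 8) return null;
    const bigLen = buffer.readBigUInt64BE(offset);
    if (bigLen > BigInt(Number.MAX_SAFE_INTEGER)) {
      throw new RangeError('Frame too large for this sketch');
    }
    payloadLen = Number(bigLen);
    offset += 8;
  }

  let maskKey = null;
  if (masked) {
    if (buffer.length < offset + 4) return null;
    maskKey = buffer.subarray(offset, offset + 4);
    offset += 4;
  }

  if (buffer.length < offset + payloadLen) return null;

  // Copy the payload so that unmasking does not modify the input buffer.
  const payload = Buffer.from(buffer.subarray(offset, offset + payloadLen));
  if (maskKey) {
    for (let i = 0; i < payload.length; i++) payload[i] ^= maskKey[i % 4];
  }

  // frameLength tells the caller how many bytes this frame consumed.
  return { fin, opcode, payload, frameLength: offset + payloadLen };
}

// Masked "Hello" text frame from the examples in RFC 6455.
const sample = Buffer.from([
  0x81, 0x85, 0x37, 0xfa, 0x21, 0x3d, 0x7f, 0x9f, 0x4d, 0x51, 0x58,
]);
const frame = parseFrame(sample);
console.log(frame.opcode, frame.payload.toString('utf8')); // → 1 Hello
```

Returning null for an incomplete frame, together with `frameLength`, lets a caller keep appending TCP chunks to a buffer and peel off frames as they complete.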

1.2 Implementing WebSocket with Node.js http Module

The built-in Node.js http module exposes protocol upgrade requests through the server's 'upgrade' event, so the WebSocket handshake, frame parsing, and data transmission can all be implemented with custom logic. Writing them by hand, with no third-party libraries, is the most direct way to understand the protocol. The implementation below maps each piece of code to the principle it realizes.

1.3 Native Implementation Breakdown (Principles Corresponding to Code)

The code below implements the handshake upgrade and the sending and receiving of single-frame text messages. Each step is annotated with the principle it corresponds to, along with the details a production implementation would still need (such as multi-frame handling).

Building on the frame structure above, the code also unmasks the client payload; without this step the received text appears garbled. Each parsing step maps to one frame field defined in RFC 6455:

const http = require('http');
const crypto = require('crypto');

// Create an HTTP server (WebSocket is based on HTTP handshake, so an HTTP service must be started first)
const server = http.createServer((req, res) => {
  res.writeHead(200);
  res.end('Non-WebSocket request');
});

// Listen for the 'upgrade' event: corresponds to the [Handshake Upgrade] principle, capturing the client's upgrade request
// This event is triggered when the client sends an HTTP request with Upgrade: websocket
server.on('upgrade', (req, socket, head) => {
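  // 1. Verify the legitimacy of the upgrade request (Principle: Ensure it is a WebSocket protocol upgrade request)
  // The Upgrade header value is case-insensitive, and the key header is required for the handshake
  const upgradeHeader = (req.headers.upgrade || '').toLowerCase();
  if (upgradeHeader !== 'websocket' || !req.headers['sec-websocket-key']) {
    socket.write('HTTP/1.1 400 Bad Request\r\n\r\n');
    socket.destroy();
    return;
  }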

  // 2. Generate the handshake response value (Principle: proves that the server understood the WebSocket handshake; it is not authentication)
  const secWebSocketKey = req.headers['sec-websocket-key']; // Client's random Base64 nonce
  const magicString = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11'; // GUID fixed by RFC 6455
  const hash = crypto.createHash('sha1')
    .update(secWebSocketKey + magicString) // Concatenate the key with the fixed GUID
    .digest('base64'); // Base64-encoded SHA-1 digest, sent back for the client to verify

  // 3. Send the 101 response to complete the handshake upgrade (Principle: HTTP protocol switches to WebSocket protocol)
  const responseHeaders = [
    'HTTP/1.1 101 Switching Protocols',
    'Upgrade: websocket', // Confirm upgrade to WebSocket
    'Connection: Upgrade', // Signal that this connection is switching protocols
    `Sec-WebSocket-Accept: ${hash}`, // Send back the verification value; the connection is established if the client verifies it successfully
    '', // Two empty entries join into the blank line (\r\n\r\n) that ends the header block
    ''
  ];
  socket.write(responseHeaders.join('\r\n'));

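  // 4. Listen for socket data and parse WebSocket frames (corresponds to the [Frame Format Communication] principle)
  // After the handshake succeeds, the client sends data as frames, which must be parsed manually.
  // Simplification: this assumes each 'data' event carries exactly one complete frame. TCP does not
  // guarantee that, so production code must buffer chunks and reassemble fragmented (FIN = 0) messages.
  socket.on('data', (buffer) => {
    if (buffer.length < 2) return; // Not even a complete 2-byte header
    const fin = (buffer[0] & 0x80) === 0x80; // FIN: final fragment of the message
    const opcode = buffer[0] & 0x0F; // opcode: 0x1 text, 0x8 close, 0x9 ping, ...
    const hasMask = (buffer[1] & 0x80) === 0x80; // MASK: always 1 for client frames
    let payloadLen = buffer[1] & 0x7F; // 7-bit payload length
    let payloadStart = 2; // Default data start position (after the frame header)

    // Extended payload length: 126 means a 2-byte length follows, 127 means an 8-byte length follows
    const extBytes = payloadLen === 126 ? 2 : payloadLen === 127 ? 8 : 0;
    if (buffer.length < payloadStart + extBytes) return; // Incomplete header
    if (extBytes === 2) payloadLen = buffer.readUInt16BE(payloadStart);
    if (extBytes === 8) payloadLen = Number(buffer.readBigUInt64BE(payloadStart));
    payloadStart += extBytes;

    // Step 1: Extract the masking key (client frames must be masked; RFC 6455 requires closing otherwise)
    if (!hasMask) {
      socket.destroy();
      return;
    }
    const maskKey = buffer.subarray(payloadStart, payloadStart + 4);
    payloadStart += 4; // Move the data start position past the 4-byte masking key
    if (buffer.length < payloadStart + payloadLen) return; // Incomplete frame; dropped in this simplified version

    // Step 2: Unmask the data (XOR each byte with the key; this is masking, not encryption)
    const payloadBuffer = Buffer.from(buffer.subarray(payloadStart, payloadStart + payloadLen));
    for (let i = 0; i < payloadBuffer.length; i++) {
      payloadBuffer[i] ^= maskKey[i % 4];
    }

    // Close frame: reply with an empty close frame and end the connection
    if (opcode === 0x8) {
      socket.end(Buffer.from([0x88, 0x00]));
      return;
    }

    // Only process complete text frames
    if (opcode === 0x1 && fin) {
      const payload = payloadBuffer.toString('utf8');
      console.log('Received:', payload);
      // Build a response frame to send back (server-sent frames must not be masked)
      const body = Buffer.from(payload, 'utf8'); // Length is counted in bytes, not characters
      let header;
      if (body.length <= 125) {
        header = Buffer.from([0x81, body.length]); // FIN = 1, opcode = text, 7-bit length
      } else if (body.length <= 0xFFFF) {
        header = Buffer.from([0x81, 126, body.length >> 8, body.length & 0xFF]); // 16-bit length
      } else {
        return; // Payloads above 64 KiB need the 8-byte length field; omitted here
      }
      socket.write(Buffer.concat([header, body]));
    }
  });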

  // Connection close and error handling to avoid resource leaks
  socket.on('close', () => {
    console.log('WebSocket connection closed');
  });
  socket.on('error', (err) => {
    console.error('WebSocket error:', err);
  });
});

server.listen(8080, () => {
  console.log('WebSocket server running on ws://localhost:8080');
});
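The server above only echoes short text frames. As a supplement, the following is a minimal sketch of a general-purpose encoder for server-to-client frames, plus replies to the ping and close control frames. `encodeFrame` and `replyToControlFrame` are hypothetical helper names introduced here; the length tiers and opcodes follow the frame structure described in section 1.1.

```javascript
// encodeFrame is a hypothetical helper for this sketch. Server-to-client
// frames are sent unmasked, so the MASK bit in byte 2 stays 0.
function encodeFrame(payload, opcode = 0x1) {
  const body = Buffer.isBuffer(payload)
    ? payload
    : Buffer.from(String(payload), 'utf8'); // Length must be counted in bytes

  let header;
  if (body.length <= 125) {
    // The length fits in the 7-bit field.
    header = Buffer.from([0x80 | opcode, body.length]);
  } else if (body.length <= 0xffff) {
    // 126 announces a 2-byte big-endian length.
    header = Buffer.alloc(4);
    header[0] = 0x80 | opcode;
    header[1] = 126;
    header.writeUInt16BE(body.length, 2);
  } else {
    // 127 announces an 8-byte big-endian length.
    header = Buffer.alloc(10);
    header[0] = 0x80 | opcode;
    header[1] = 127;
    header.writeBigUInt64BE(BigInt(body.length), 2);
  }
  return Buffer.concat([header, body]);
}

// replyToControlFrame is also hypothetical. It returns the frame the server
// should send in response to a control frame, or null for data frames.
function replyToControlFrame(opcode, payload) {
  if (opcode === 0x9) return encodeFrame(payload, 0xa); // ping → pong with the same payload
  if (opcode === 0x8) return encodeFrame(payload.subarray(0, 2), 0x8); // close → echo the status code
  return null;
}

console.log(encodeFrame('Hi')); // → <Buffer 81 02 48 69>
```

Measuring the payload in bytes matters for multi-byte UTF-8 text: a string's `length` counts UTF-16 code units, so using it for the header would produce a wrong frame length for non-ASCII text.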

Client test (browser console):

const ws = new WebSocket('ws://localhost:8080');
ws.onopen = () => {
  console.log('Connected');
  ws.send('Hello WebSocket'); // Send only after the connection is open; calling send() earlier throws
};
ws.onmessage = (e) => console.log('Received:', e.data); // Receive server response

1.4 WebSocket Documentation and Protocol Standards

2. SSE Protocol: Server-to-Client One-Way Push

2.1 Core Protocol Principles (Simplified Explanation)

Server-Sent Events (SSE) is a lightweight, one-way communication mechanism built on a long-lived HTTP connection: the server pushes data and the client only receives it, as if the server had opened a real-time broadcast channel for the client. It suits scenarios that need no client feedback on the same channel, such as real-time notifications and market data updates, and its core logic is simpler than WebSocket's:

One-way continuous push: the client makes one ordinary HTTP request, and the server keeps the response open and writes messages to it as they occur.

Fixed data format: the response has Content-Type: text/event-stream, and each message is a group of "field: value" lines (data, id, event, retry) terminated by a blank line.

Automatic reconnection: if the connection drops, the browser's EventSource reconnects on its own and reports the ID of the last message it received in the Last-Event-ID header.
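The text/event-stream wire format is simple enough to show directly. Below is a minimal sketch of a formatter; `formatSseEvent` is a hypothetical helper name, and the field names (id, event, retry, data) are the ones defined for the event stream format. The one subtle point it illustrates is that a payload containing line breaks must be split into several data lines.

```javascript
// formatSseEvent is a hypothetical helper for this sketch.
// It serializes one event into the text/event-stream wire format.
function formatSseEvent({ id, event, retry, data }) {
  const lines = [];
  if (id !== undefined) lines.push(`id: ${id}`);
  if (event !== undefined) lines.push(`event: ${event}`);
  if (retry !== undefined) lines.push(`retry: ${retry}`); // Reconnection delay in milliseconds
  // Each line of the payload needs its own "data:" field; the client
  // joins them back together with "\n".
  for (const line of String(data).split(/\r\n|\r|\n/)) {
    lines.push(`data: ${line}`);
  }
  // The blank line (two consecutive newlines) terminates the event.
  return lines.join('\n') + '\n\n';
}

process.stdout.write(
  formatSseEvent({ id: 7, event: 'token', data: 'line one\nline two' })
);
// Output:
// id: 7
// event: token
// data: line one
// data: line two
// (followed by a blank line)
```

An event without an event field is delivered to the client's onmessage handler; a named event such as "token" is delivered only to listeners registered with addEventListener('token', ...).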

2.2 Implementing SSE with Node.js http Module

SSE does not require third-party libraries and can be directly implemented using the Node.js http module. The core is to set the correct response headers and continuously push formatted data.

const http = require('http');

const server = http.createServer((req, res) => {
  // Only handle requests to the /sse path as the SSE connection entry point
  if (req.url === '/sse') {
    // Step 1: Set the core SSE response headers (corresponds to the "Fixed Data Format" principle)
    res.writeHead(200, {
      'Content-Type': 'text/event-stream', // Must be set to this type for the client to recognize it as SSE
      'Cache-Control': 'no-cache', // Disable caching to prevent the client from receiving old data repeatedly
      'Connection': 'keep-alive', // Keep the HTTP long connection, do not close immediately
      'Access-Control-Allow-Origin': '*' // Cross-origin support (restrict domains as needed in actual projects)
    });

    // Step 2: Handle resumable transmission (corresponds to the "Automatic Reconnection" principle)
    // When the client reconnects, it carries the Last-Event-ID header, recording the ID of the last received message
    const lastEventId = req.headers['last-event-id'] || '0';
    console.log('Last Event ID:', lastEventId);
    // Continue generating message IDs from the point of disconnection; fall back to 1 if the header is not a number
    let eventId = (Number.parseInt(lastEventId, 10) || 0) + 1;

    // Step 3: Push messages at regular intervals (simulating real-time data, embodying "one-way continuous push")
    const interval = setInterval(() => {
      const data = {
        time: new Date().toISOString(),
        content: `SSE message #${eventId}`
      };

      // Build the SSE message format: id (optional) + data (required) + blank line ending
      const message = [
        `id: ${eventId}`, // Message ID, used for resumable transmission
        `data: ${JSON.stringify(data)}`, // Message content, must start with data:
        '', // Two empty entries join into "\n\n": the blank line that marks the end of a message
        ''
      ].join('\n');

      res.write(message); // Push the message to the client
      eventId++;

      // Simulate connection closure (optional, can be based on business logic in actual scenarios)
      if (eventId > 10) {
        clearInterval(interval);
        res.write('event: close\ndata: Connection closed\n\n'); // Custom close event
        res.end();
      }
    }, 1000);

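    // Step 4: Clean up resources when the client disconnects (to avoid memory leaks)
    // Listening on res is the more reliable choice: since Node.js 16, 'close' on req signals
    // that the request has completed rather than that the underlying socket has closed
    res.on('close', () => {
      clearInterval(interval);
      console.log('SSE connection closed');
    });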
  } else {
    // Non-SSE request, return a test page (including client-side EventSource logic)
    res.writeHead(200, { 'Content-Type': 'text/html' });
    res.end(`
      <!DOCTYPE html>
      <html>
      <body>
        <div id="messages"></div>
        <script>
          const eventSource = new EventSource('/sse');
          eventSource.onmessage = (e) => {
            document.getElementById('messages').innerHTML += '<p>' + e.data + '</p>';
          };
          eventSource.addEventListener('close', (e) => {
            document.getElementById('messages').innerHTML += '<p>Connection closed by server</p>';
            eventSource.close();
          });
        </script>
      </body>
      </html>
    `);
  }
});

server.listen(8081, () => {
  console.log('SSE server running on http://localhost:8081');
});

Test method: Visit http://localhost:8081, and you can see a message pushed by the server every second. The connection will automatically close after 10 messages.
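EventSource parses the stream for you, but it can only issue GET requests, so clients that read the response body themselves need their own parser. The following is a minimal sketch of an incremental parser for the messages produced by the server above. `createSseParser` is a hypothetical helper name; the sketch assumes LF or CRLF line endings and ignores the retry field.

```javascript
// createSseParser is a hypothetical helper for this sketch. It returns a
// feed(chunk) function that accepts text chunks of any size and calls
// onEvent once per complete event.
function createSseParser(onEvent) {
  let buffered = '';    // Text received so far that does not end in a newline
  let lastEventId = ''; // Persists across events, like EventSource's lastEventId
  let eventType = '';
  let dataLines = [];

  function handleLine(line) {
    if (line === '') {
      // A blank line dispatches the event collected so far.
      if (dataLines.length > 0) {
        onEvent({
          type: eventType || 'message',
          data: dataLines.join('\n'),
          lastEventId,
        });
      }
      eventType = '';
      dataLines = [];
      return;
    }
    if (line.startsWith(':')) return; // Comment line, often used as a keep-alive

    const colon = line.indexOf(':');
    const field = colon === -1 ? line : line.slice(0, colon);
    let value = colon === -1 ? '' : line.slice(colon + 1);
    if (value.startsWith(' ')) value = value.slice(1); // Strip one leading space

    if (field === 'data') dataLines.push(value);
    else if (field === 'event') eventType = value;
    else if (field === 'id') lastEventId = value;
    // 'retry' and unknown fields are ignored in this sketch.
  }

  return function feed(chunk) {
    buffered += chunk;
    const lines = buffered.split(/\r?\n/);
    buffered = lines.pop(); // The last piece may be an incomplete line
    lines.forEach(handleLine);
  };
}

// Usage: chunks may split a message at any position.
const events = [];
const feed = createSseParser((e) => events.push(e));
feed('id: 1\ndata: {"content":"SSE mess');
feed('age #1"}\n\n');
feed('event: close\ndata: Connection closed\n\n');
console.log(events.length); // → 2
```

Keeping the unterminated tail in `buffered` is what makes the parser safe against network chunk boundaries, which rarely line up with message boundaries.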

2.3 SSE Documentation and Protocol Standards

3. WebSocket vs. SSE: Comparison and Applicable Scenarios

| Feature | WebSocket | SSE |
| --- | --- | --- |
| Communication Direction | Full-duplex (bidirectional) | One-way (server → client) |
| Protocol Basis | HTTP handshake upgrade to independent protocol | HTTP long connection, no protocol upgrade |
| Reconnection Mechanism | Requires manual implementation (e.g., heartbeat detection) | Client EventSource automatic reconnection |
| Data Format | Binary/text frames, flexible and efficient | Text only (text/event-stream) |
| Applicable Scenarios | Real-time chat, collaborative editing, gaming | Real-time notifications, market data push, log streams |
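The "requires manual implementation" entry for WebSocket reconnection can be sketched as follows. This is one common approach (exponential backoff), not something prescribed by the protocol. `backoffDelay` and `connectWithRetry` are hypothetical names, the 1-second base and 30-second cap are arbitrary assumptions, and `connectWithRetry` assumes an environment that provides a global WebSocket constructor, such as a browser.

```javascript
// backoffDelay is a hypothetical helper: the wait before reconnection
// attempt number `attempt` (0-based), doubling from baseMs up to maxMs.
function backoffDelay(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// connectWithRetry is a hypothetical wrapper around the WebSocket API.
// It reopens the connection whenever it closes, waiting longer after each
// consecutive failure and resetting the counter once a connection succeeds.
function connectWithRetry(url, onMessage, attempt = 0) {
  const ws = new WebSocket(url);
  ws.onopen = () => {
    attempt = 0; // A successful connection resets the backoff
  };
  ws.onmessage = (e) => onMessage(e.data);
  ws.onclose = () => {
    const delay = backoffDelay(attempt);
    setTimeout(() => connectWithRetry(url, onMessage, attempt + 1), delay);
  };
  return ws;
}

console.log([0, 1, 2, 3, 4, 5].map((n) => backoffDelay(n)));
// → [ 1000, 2000, 4000, 8000, 16000, 30000 ]
```

Unlike EventSource, this does not resume from the last received message; an application that needs that would have to send its own "last seen ID" after reconnecting.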

4. Notes

Team Introduction

The "Smart Home Technology Platform - Application Software Framework Development" team is primarily responsible for researching and developing design tools, including marketing design tools, home appliance VR design and display, and water, electricity, HVAC, and pre-design capabilities. The team also develops material libraries: it builds home furnishing material libraries and integrates unit libraries, full-category product libraries, design plan libraries, and production process models, creating AI design capabilities based on unit types and styles that rapidly generate quantity takeoffs and quotations. In addition, it develops the store designer center and project center, which cover designer management and project manager management. Together these provide full lifecycle management of scenarios, along with business opportunity management tools for industries such as water, air, and kitchen, forming a scenario-centered, full-process system that runs from B-end to C-end.