← Back to the summary

Building a Tiny JavaScript VM in C++: Lexer, Parser, AST Interpreter, and the Road to Bytecode

After coding for over a week, I'm finally starting to write the text, haha.

Many thanks to senior student wjy for guidance and recommendations. I hope that starting from here, I can better understand CS and begin some real growth.

Project link -> https://github.com/NIIIIIIIIka/MiniJS

Welcome all experts to comment and give feedback (bows).

The goal of this project is to implement a Tiny JavaScript VM from scratch using C++.

It is not meant to replicate a complete V8, SpiderMonkey, or QuickJS, but to run through the core pipeline of a language engine using a controllable JavaScript subset:

Source Code
  -> Lexer
  -> Parser
  -> AST
  -> AST Interpreter
  -> Runtime

Later, it will evolve into:

AST
  -> Bytecode Compiler
  -> Stack-based VM
  -> Object System
  -> Garbage Collector

If I put this project on the front page of my resume, what I most want to express is not "I wrote a toy interpreter," but:

I am breaking down a JavaScript engine into understandable, testable, and evolvable modules, and gradually implementing the compilation frontend, interpreted execution, runtime semantics, virtual machine, and memory management.

1. Why build a Tiny JavaScript VM?

A complete JavaScript engine is extremely complex.

It must not only support variables, functions, objects, arrays, closures, prototype chains, exceptions, and modules, but also handle a vast number of engineering problems like JIT, hidden classes, inline caches, garbage collection, event loops, standard libraries, and host environments.

If you try to implement full JS from the start, it's easy to fall into two traps:

There are too many syntax rules and edge cases; the core pipeline gets overwhelmed by details before it's even connected.
There is no clear evolution order between modules, and it ends up as a pile of unverifiable half-finished products.

So this project chooses to implement a Tiny JS subset first.

Not because full JS is unimportant, but because the learning path for a language engine should first grasp the main trunk:

How does source code become Tokens?
How do Tokens become an AST?
How is the AST interpreted and executed?
How do variables and scopes work?
How does a function call create an execution environment?
How are reference values like arrays and objects represented?
How to later evolve from an AST Interpreter to a Bytecode VM?

Once these questions are resolved, extending to full language features has a stable foundation.

2. Overall Architecture

The project can be broken down into five core modules:

Lexer        Lexical analysis, slicing source code into Tokens
Parser       Syntax analysis, organizing the Token stream into an AST
AST          Abstract Syntax Tree, expressing program structure
Interpreter  Interpreter, directly executing the AST
Runtime      Runtime values, environments, errors, and built-in capabilities

The execution flow is roughly:

let x = 1 + 2 * 3;

First, it goes through the Lexer:

Let Identifier(x) Equal Number(1) Plus Number(2) Star Number(3) Semicolon Eof

Then through the Parser:

LetStmt
  name: x
  initializer:
    BinaryExpr(+)
      left: NumberExpr(1)
      right:
        BinaryExpr(*)
          left: NumberExpr(2)
          right: NumberExpr(3)

Finally, the Interpreter executes this AST, binding the variable x to the runtime value 7.

Although small, this flow already contains the basic skeleton of a language engine.

3. Lexer: Turning Source Code into a Token Stream

The Lexer is responsible for lexical analysis.

It only cares about "how characters form lexical units," not about grammatical structure.

For example:

while (i < 3) {
  i = i + 1;
}

The Lexer will recognize:

While LeftParen Identifier(i) Less Number(3) RightParen
LeftBrace Identifier(i) Equal Identifier(i) Plus Number(1) Semicolon RightBrace

The Lexer's responsibilities include:

Skipping whitespace characters.
Recognizing literals like numbers and strings.
Recognizing identifiers and keywords.
Recognizing operators and delimiters.
Recording source code positions.
Producing lexical diagnostics.

Again, it does not care about grammatical structure. So the Lexer should not judge whether while is followed by (, nor should it judge whether 1 + ; is a syntax error. These belong to the Parser.

The Lexer is like a kindergarten literacy teacher.

Her only job is:

Seeing while, recognizing it's a word (keyword)
Seeing (, recognizing it's a left parenthesis (symbol)
Seeing 1, recognizing it's a number
Seeing ;, recognizing it's a period/terminator

A good module boundary is: the Lexer is only responsible for Tokens, and the Parser is responsible for structure.

4. Parser: Turning the Token Stream into an AST

The Parser is responsible for syntax analysis.

It consumes the Token stream produced by the Lexer and constructs an AST according to grammar rules.

The project uses a recursive descent Parser. Each type of grammatical structure corresponds to a parsing function:

statement()
letDeclaration()
ifStatement()
whileStatement()
functionDeclaration()
...

The Parser divides source code into two grammatical categories:

Statement → Describes "what the program does" (control flow + side effects)
Expression → Describes "how values are calculated" (operations + evaluation)

Statements are dispatched by statement():

let       -> letDeclaration()
if        -> ifStatement()
while     -> whileStatement()
function  -> functionDeclaration()
return    -> returnStatement()
{         -> blockStatement()
otherwise -> expressionStatement()

Expressions handle precedence through function hierarchy (more detailed explanation on this part later):

assignment          → top level, parsed last (assignment)
  equality          → wraps next (equality)
    comparison      → wraps next (comparison)
      term          → wraps next (addition/subtraction)
        factor      → wraps next (multiplication/division)
          call      → wraps one level up (function call)
            primary → bottom level, parsed first (e.g., 123, "abc", (expr))

So:

1 + 2 * 3

Will not be parsed as:

(* (+ 1 2) 3)

But will be parsed as:

(+ 1 (* 2 3))

Because term() handles addition/subtraction, and factor() handles multiplication/division; when the addition layer takes the right operand, it first lets the multiplication layer collect 2 * 3 into a subtree.

The Parser is also responsible for recording syntax errors, for example:

let = 10;
1 + ;
print(1;

These errors won't directly crash the process but enter a diagnostic list, convenient for testing, command-line output, or future editor integration.

5. AST: The Structured Intermediate Representation of the Program

The AST is the core product after the Lexer and Parser.

Source code is linear; the AST is structured.

For example:

function fact(n) {
  if (n <= 1) {
    return 1;
  }
  return n * fact(n - 1);
}

In the AST, it's not a string of characters, but a tree composed of nodes:

FunctionStmt
  name: fact
  params: n
  body:
    IfStmt
      condition: BinaryExpr(<=)
      thenBranch: ReturnStmt(1)
    ReturnStmt
      BinaryExpr(*)
        left: VariableExpr(n)
        right: CallExpr(fact)

The value of the AST is: subsequent modules no longer need to understand source code characters, nor care about the Token stream; they only need to process nodes.

Correspondingly, the AST in the project is roughly divided into two categories:

Expr expression nodes
Stmt statement nodes

This layer of design determines whether the interpreter and subsequent compilers are easy to write.

6. Interpreter: Directly Executing the AST

The current phase chooses to implement an AST Interpreter first.

That is, after the Parser produces the AST, the interpreter directly recursively visits AST nodes and executes them.

For example:

let x = 10;
x = x + 2;
x;

The interpreter execution flow is roughly:

Execute LetStmt: define x = 10 in the environment
Execute AssignExpr: read x, calculate x + 2, update x = 12
Execute ExprStmt: read x, get the final result 12

Another example:

while (i < 3) {
  i = i + 1;
}

The interpreter will repeatedly:

Calculate condition
If truthy, execute body
Calculate condition again
Until condition is false

The advantage of the AST Interpreter is intuitive implementation:

Each AST node corresponds to a piece of execution logic.
It's easy to verify language semantics.
It's suitable for first getting scopes, functions, arrays, and error handling working.
It's easy to locate problems when tests fail.

Its disadvantage is also obvious: execution efficiency is not high, as it recursively interprets the tree each time.

But this is exactly the right first phase. Get the semantics right first, then consider compiling the AST into bytecode.

7. Runtime: Values, Environments, Functions, and Errors

The Runtime is the part that truly carries state during interpreted execution.

Key runtime components include:

Value        Represents runtime values
Environment  Maintains variable bindings and scope chains
RuntimeError Represents runtime errors
Builtin      Provides built-in functions like print

Runtime values currently support:

number
boolean
null
array
function

The environment model supports block-level scope:

let x = 1;
{
  let x = 2;
  x;
}
x;

The x inside the block will shadow the outer x, but will not pollute the outer scope.

Function calls need to create a new call environment:

function add(a, b) {
  return a + b;
}
add(1, 2);

When executing add(1, 2), the interpreter must:

Find the function definition
Check the number of parameters
Create a function call environment
Bind formal parameters a = 1, b = 2
Execute the function body
Handle return
Return the result

Recursive functions also rely on this call model:

function fact(n) {
  if (n <= 1) {
    return 1;
  }
  return n * fact(n - 1);
}

fact(5);

Each call to fact has its own parameter bindings and execution environment.

8. Why not build full JS from the start?

The complexity of full JS is not in a single point, but in the superposition of all semantics.

For example, once the object system enters, it brings out:

Property lookup
Prototype chain
this binding
Method calls
Constructors
new
Dynamic property addition/deletion
Property descriptors

Once closures enter, they bring out:

Lexical environment capture
Variable lifetime extension
Function objects saving outer environments
How escaped variables are stored

Once GC enters, it brings out:

Object graph traversal
Root set
Reference relationships
Circular references
Pause timing
Allocation strategies

These are all very important, but if done together when the Lexer, Parser, and Interpreter are not yet stable, the project can easily spiral out of control.

So this project adopts a phased strategy:

First implement a runnable language core
Then expand runtime data structures
Then introduce bytecode and virtual machine
Finally handle the object system and memory management

This is closer to the evolution of real engineering: establish a closed loop first, then gradually enhance it.

9. Why build an AST Interpreter first?

The AST Interpreter is the most suitable execution model for the first phase.

The reason is simple: it is closest to the syntax tree.

When the Parser produces:

BinaryExpr(+)
  left: NumberExpr(1)
  right: BinaryExpr(*)

The interpreter can directly recursively evaluate:

evaluate left
evaluate right
apply operator

The focus of this phase is verifying language semantics:

Is expression precedence correct?
Is variable lookup correct?
Is block scoping correct?
Are if / while correct?
Are function calls and return correct?
Is array reference semantics correct?

If a Bytecode VM is built from the start, two problems are faced simultaneously:

Is the semantics correct?
Is the bytecode design correct?

Debugging cost becomes significantly higher.

So a more stable path is:

AST Interpreter first serves as the semantic baseline
Bytecode VM later aligns its behavior with the AST Interpreter

This means when introducing the VM in the future, the same batch of language tests can simultaneously run on two backends:

source -> parser -> AST Interpreter -> result
source -> parser -> compiler -> bytecode VM -> result

If results on both sides are consistent, the VM semantics are basically correct.

10. How to evolve to a Bytecode VM?

The AST Interpreter executes the tree directly.

A Bytecode VM will first compile the AST into a linear sequence of instructions:

1 + 2 * 3;

Might compile to:

OP_CONSTANT 1
OP_CONSTANT 2
OP_CONSTANT 3
OP_MUL
OP_ADD
OP_POP

The VM maintains an operand stack during execution:

push 1
push 2
push 3
mul -> push 6
add -> push 7

Subsequent evolution can be divided into several steps:

Define the Bytecode instruction set.
Implement Chunk / Constant Pool.
Write a Compiler from AST to Bytecode.
Implement a Stack-based VM.
Have existing tests cover both the AST Interpreter and the VM.
Gradually compile function calls, local variables, jumps, loops, arrays, and objects to bytecode.

Control flow will become jump instructions:

OP_JUMP_IF_FALSE
OP_JUMP
OP_LOOP

Function calls will become call frames:

CallFrame
  function
  instruction pointer
  stack base

Once this step is complete, the project moves from a "tree interpreter" towards a true virtual machine.

11. How to evolve to an Object System?

When arrays introduce reference semantics:

let a = [1, 2, 3];
let b = a;
b[0] = 99;
a[0]; // 99

This is the prelude to the object system.

Later, runtime values can be expanded to:

Number
Boolean
Null
ArrayObject
FunctionObject
PlainObject
StringObject

Problems the object system needs to solve include:

Object property storage
Property reading and writing
Method calls
this binding
Prototype chain lookup
Constructors and new

The Parser layer needs to support object literals and member access, the Runtime layer needs to support property tables, and the Interpreter or VM layer needs to support property reading and writing.

12. How to evolve to GC later?

As long as the language supports arrays, objects, functions, and closures, it will encounter memory management problems.

Early on, C++ smart pointers can be used to quickly express ownership relationships, but to get closer to a real engine, you need to implement your own object heap and GC.

A controllable evolution path is:

First put all reference types into a unified Heap
Store object references in Value
Use Environment / Stack / CallFrame as GC Roots
Implement mark-sweep
Then consider reference counting or generational optimization

The basic process of Mark-Sweep is:

1. Start from root objects
2. Mark all reachable objects
3. Traverse the heap, free unmarked objects

Root objects include:

Values on the current execution stack
Global variable environment
Active function call frames
Environments captured by closures
References in the VM constant pool

This step pushes the project into the most core problem of a language runtime: how to manage object lifecycles.

Summary

The value of a Tiny JavaScript VM is not in supporting full JS from the start, but in breaking down and gradually implementing the main pipeline of a language engine.

The core closed loop of the current phase is:

Lexer -> Parser -> AST -> Interpreter -> Runtime

It can already execute variables, expressions, control flow, functions, recursion, arrays, and built-in functions.

The next phase will upgrade the AST execution path to:

AST -> Bytecode Compiler -> Stack-based VM

And then continue to add:

Object System -> Closure -> Garbage Collector

The advantage of this route is that each step has a clear goal and testable results. It's not about piling on features all at once, but about deconstructing a language engine in an engineering way.