Alibaba's Open Code Review Turns AI Code Review into a Configurable Pipeline

Alibaba's Open-Source AI Code Review Tool: Execution Chain Analysis of `ocr review`

Open Code Review is an open-source AI code review CLI from Alibaba. Its core entry point is the ocr command. It reads a Git diff, passes the changed files to a Review Agent with tool-calling capabilities, and generates structured review comments with file paths and line numbers.

The Review Agent can be understood as a review execution unit that works on a single file's diff: it reads the current diff, calls tools to supplement context when necessary, and submits structured comments via code_comment.

Compared to directly using a coding agent's built-in code review, Open Code Review is more engineering-oriented:

The review scope comes from a clear diff; review rules can be maintained alongside the repository via rule.json; results can be output as text or JSON for developers, CI, or other Agents.
Rules, input scope, and output format are all fixed, making it easier to maintain consistent behavior when integrating with different coding agents.

Its goal is not to "let the model glance at the code" but to turn code review into a configurable, reusable, and integrable process.

Usage: Installation, Execution, and Rule Configuration

The recommended way to install Open Code Review is via npm:

npm install -g @alibaba-group/open-code-review

After installation, you can run a review directly in a Git repository:

# Review uncommitted changes in the current workspace
ocr review

# Review changes in feature/pay relative to main
ocr review --from main --to feature/pay

# Review changes introduced by a specific commit
ocr review --commit abc123

# Output for CI or other Agents to consume
ocr review --format json

To customize review rules, place a .opencodereview/rule.json file in your project:

{
  "rules": [
    {
      "path": "src/pay/**/*.go",
      "rule": "Focus on checking amount calculations, error handling, and concurrency safety"
    },
    {
      "path": "**/*mapper*.xml",
      "rule": "Check for SQL injection risks, parameter errors, and missing closing tags"
    }
  ],
  "exclude": ["**/generated/**", "vendor/**"]
}

This rule configuration affects two things:

rules determines what review rules the Review Agent receives when a file enters review.
include / exclude determines which files enter the review queue. The example above only configures exclude, used to exclude generated code and third-party directories.

Rule sources have a priority order:

The rule file specified by the command line --rule has the highest priority.
Next is the project-level .opencodereview/rule.json.
Then the user global ~/.opencodereview/rule.json.
Finally, OCR's built-in rules.

When no custom rules are configured, OCR still has rules to apply. It ships with built-in review rules matched by file type. For example, *.ts, *.tsx, *.js, and *.jsx share one set of frontend rules covering TypeScript types, React Hooks, side effects, async error handling, and common security issues. Other common file types have corresponding rules. When no specific type matches, OCR falls back to default rules.

In project collaboration, project-level rules are usually committed to the repository; for temporary validation of a rule set, --rule can point to a separate JSON file.

Custom configuration doesn't directly replace the entire built-in rule set. rules are matched layer by layer based on file path:

Only when the current file matches a custom rule is the custom rule used;
If it doesn't match, it continues to the next layer;
Ultimately, it may still use built-in rules.
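
The layered lookup above can be sketched in Go as follows. This is a minimal illustration, not OCR's actual code: the Layer and Rule types, the resolveRule function, and the glob semantics ("**/" matches zero or more directories, "*" never crosses "/") are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Rule mirrors one entry of "rules" in rule.json: a path glob plus rule text.
type Rule struct {
	Path string
	Rule string
}

// Layer is one rule source (hypothetical type for this sketch).
type Layer struct {
	Name  string
	Rules []Rule
}

// globToRegexp converts a glob using *, ** and ? into an anchored regexp.
// Assumed semantics: "**/" matches zero or more directories, "*" stays within
// one path segment. Patterns are treated as ASCII.
func globToRegexp(glob string) *regexp.Regexp {
	var b strings.Builder
	b.WriteString("^")
	for i := 0; i < len(glob); i++ {
		switch {
		case strings.HasPrefix(glob[i:], "**/"):
			b.WriteString("(?:.*/)?")
			i += 2
		case strings.HasPrefix(glob[i:], "**"):
			b.WriteString(".*")
			i++
		case glob[i] == '*':
			b.WriteString("[^/]*")
		case glob[i] == '?':
			b.WriteString("[^/]")
		default:
			b.WriteString(regexp.QuoteMeta(string(glob[i])))
		}
	}
	b.WriteString("$")
	return regexp.MustCompile(b.String())
}

// resolveRule walks the layers from highest to lowest priority and returns the
// first rule whose glob matches the file. Lower layers are not merged in.
func resolveRule(layers []Layer, file, fallback string) (source, rule string) {
	for _, layer := range layers {
		for _, r := range layer.Rules {
			if globToRegexp(r.Path).MatchString(file) {
				return layer.Name, r.Rule
			}
		}
	}
	return "default", fallback
}

func main() {
	layers := []Layer{
		{Name: "--rule"}, // no rule file passed on the command line
		{Name: "project", Rules: []Rule{{Path: "src/pay/**/*.go", Rule: "Focus on amount calculations"}}},
		{Name: "global", Rules: []Rule{{Path: "**/*.go", Rule: "General Go checks"}}},
		{Name: "built-in", Rules: []Rule{{Path: "**/*.ts", Rule: "Frontend checks"}}},
	}
	for _, f := range []string{"src/pay/service.go", "cmd/main.go", "web/app.ts", "README.md"} {
		src, rule := resolveRule(layers, f, "Default review rules")
		fmt.Printf("%-20s %-9s %s\n", f, src, rule)
	}
}
```

Returning on the first match is what keeps a custom configuration from replacing the whole built-in set: a file that no custom glob covers simply falls through to the next layer.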

include / exclude are handled differently. The program picks the highest-priority layer that configures any filter conditions and applies that layer as a whole; filter conditions from multiple layers are not merged. exclude always takes precedence. include is not an allowlist meaning "only review these files"; it explicitly brings back files that would otherwise be excluded by default, such as test files. Ordinary business files that do not match include still go through the default filtering.
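
The filter semantics can be sketched as below. Everything here is illustrative: the Filter and Match types, pickFilter, shouldReview, and the stand-in defaultExcluded (which only models Go test files) are hypothetical names, and plain string predicates replace real glob matching to keep the example short.

```go
package main

import (
	"fmt"
	"strings"
)

// Match reports whether a path is selected. Real rule.json entries are globs;
// plain predicates keep this sketch free of any glob library.
type Match func(path string) bool

func contains(s string) Match { return func(p string) bool { return strings.Contains(p, s) } }
func suffix(s string) Match   { return func(p string) bool { return strings.HasSuffix(p, s) } }

// Filter is the include / exclude pair of one configuration layer (hypothetical type).
type Filter struct {
	Include []Match
	Exclude []Match
}

func anyMatch(ms []Match, path string) bool {
	for _, m := range ms {
		if m(path) {
			return true
		}
	}
	return false
}

// pickFilter returns the first layer, highest priority first, that configures
// any filter condition. Lower layers are ignored rather than merged.
func pickFilter(layers []Filter) Filter {
	for _, f := range layers {
		if len(f.Include) > 0 || len(f.Exclude) > 0 {
			return f
		}
	}
	return Filter{}
}

// defaultExcluded stands in for the built-in default filtering. Only Go test
// files are modeled; the real default list is not covered here.
func defaultExcluded(path string) bool { return strings.HasSuffix(path, "_test.go") }

// shouldReview applies the chosen layer: exclude wins, include rescues files
// the defaults would drop, and everything else goes through the defaults.
func shouldReview(f Filter, path string) bool {
	if anyMatch(f.Exclude, path) {
		return false
	}
	if anyMatch(f.Include, path) {
		return true
	}
	return !defaultExcluded(path)
}

func main() {
	project := Filter{Exclude: []Match{contains("/generated/")}}
	global := Filter{Include: []Match{suffix("_test.go")}}
	// The first (command-line) layer sets no filters, so the project layer wins
	// and the global include is ignored entirely.
	active := pickFilter([]Filter{{}, project, global})
	for _, p := range []string{"src/pay/service.go", "src/pay/service_test.go", "src/pay/generated/client.go"} {
		fmt.Println(p, shouldReview(active, p))
	}
}
```

In this run the global layer's include for test files has no effect, because the project layer already configured a filter and layers are not merged.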

Principles: From Command Entry to Structured Comments

The main flow can be broken down into 7 nodes:

1. From npm Command to Go CLI

This entry point appears to be an npm package, but the core review logic runs in a Go program.

The npm package is mainly responsible for installation and exposing the ocr command, and passing user input parameters to the Go CLI executable for the current platform.

The output of this node is simple: the Go CLI receives a set of command parameters. Subsequently, it determines whether the user wants to execute the review subcommand and enters the review configuration assembly for this session.

2. Normalize the Review Configuration for This Session

After entering the review subcommand, the program doesn't immediately call the model. Instead, it first assembles the review configuration for this session. This handles "pre-run preparation": merging command-line arguments, project configuration, user global configuration, and built-in defaults into an executable review configuration.

The output of this node is not comments or a final prompt, but a set of materials needed to assemble the Review Agent's runtime context, such as review scope, rule sources, file filters, output format, concurrency level, and optional requirement background. How rules are injected, files are filtered, and prompts are filled will be expanded in subsequent nodes.
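
One way to picture this normalization is a "first non-zero value wins" overlay across the four sources. The sketch below is an assumption about the shape of that merge, not OCR's real configuration code: the Options struct, its fields, and the numeric defaults are invented for illustration.

```go
package main

import "fmt"

// Options is a hypothetical subset of the per-session settings.
// A zero value means "not set in this layer".
type Options struct {
	Format      string
	Concurrency int
	RuleFile    string
}

// merge overlays layers given from highest to lowest priority:
// for each field, the first layer that sets it wins.
func merge(layers ...Options) Options {
	var out Options
	for _, l := range layers {
		if out.Format == "" {
			out.Format = l.Format
		}
		if out.Concurrency == 0 {
			out.Concurrency = l.Concurrency
		}
		if out.RuleFile == "" {
			out.RuleFile = l.RuleFile
		}
	}
	return out
}

func main() {
	cli := Options{Format: "json"} // e.g. from --format json
	project := Options{RuleFile: ".opencodereview/rule.json"}
	global := Options{Concurrency: 8, RuleFile: "~/.opencodereview/rule.json"}
	defaults := Options{Format: "text", Concurrency: 4} // made-up defaults
	fmt.Printf("%+v\n", merge(cli, project, global, defaults))
}
```

Here the format comes from the command line, the rule file from the project layer, and the concurrency level from the global layer, with the defaults used only for fields nobody set.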

3. Generate Review Queue Based on Git Diff

Once the Agent starts, the first thing it does is read the current code differences. Different input scopes use different diff sources:

ocr review: Reads changes in the current workspace.
ocr review --from main --to feature/pay: Reads differences between two references.
ocr review --commit abc123: Reads differences introduced by a single commit.

Regardless of the source, the data read is organized into a list of diffs split by file. The Agent first loads these diffs into a read-only DiffMap that later tools can query by path, then checks the files one by one to build the queue of files to review.

Why filter? Because a single diff might contain binary files, generated code, default excluded paths, user-specified excluded paths, or purely deleted files. They can exist as context in the DiffMap, but not all are suitable as primary review files to initiate a model review.

For example, suppose there are four changed files, and the rules exclude src/pay/generated/**:

File	Result	Reason
`src/pay/service.go`	Enters review queue	Business code change, suitable as primary review object
`src/pay/coupon.go`	Enters review queue	Business code change, suitable as primary review object
`src/pay/service_test.go`	Does not enter review queue	Test files are not primary review objects by default
`src/pay/generated/client.go`	Does not enter review queue	Matches user excluded path

After filtering, each file in the review queue forms a task. The task for src/pay/service.go can be understood as:

[]ReviewTask{
  {
    File: "src/pay/service.go",
    Diff: `
      @@ -42,6 +42,9 @@ func (s *Service) Pay(req PayRequest) error {
        amount := req.Amount
      + if req.CouponID != "" {
      +     amount = s.coupon.Apply(req.CouponID, amount)
      + }
        return s.gateway.Charge(amount)
      }
    `,
  },
}

The output of this node is the "review queue". Subsequently, only the files in the queue will initiate single-file review tasks; filtered-out files won't become primary review tasks, but their diffs can still be queried by path via file_read_diff.
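
The split between "available as context" and "queued for review" can be sketched like this. The FileDiff fields, the DiffMap methods, and buildQueue are hypothetical; only the DiffMap name and the ReviewTask shape come from the description above.

```go
package main

import (
	"fmt"
	"strings"
)

// FileDiff is one file's entry in the parsed diff (hypothetical fields).
type FileDiff struct {
	Path    string
	Patch   string
	Binary  bool
	Deleted bool
}

// DiffMap is a lookup by path. It exposes no mutating method, so tools such as
// file_read_diff can only read from it.
type DiffMap struct{ byPath map[string]FileDiff }

func NewDiffMap(diffs []FileDiff) DiffMap {
	m := make(map[string]FileDiff, len(diffs))
	for _, d := range diffs {
		m[d.Path] = d
	}
	return DiffMap{byPath: m}
}

func (m DiffMap) Get(path string) (FileDiff, bool) {
	d, ok := m.byPath[path]
	return d, ok
}

// ReviewTask is one queued single-file review.
type ReviewTask struct {
	File string
	Diff string
}

// buildQueue keeps only files suitable as primary review objects.
func buildQueue(diffs []FileDiff, excluded func(string) bool) []ReviewTask {
	var queue []ReviewTask
	for _, d := range diffs {
		if d.Binary || d.Deleted || excluded(d.Path) {
			continue
		}
		queue = append(queue, ReviewTask{File: d.Path, Diff: d.Patch})
	}
	return queue
}

func main() {
	diffs := []FileDiff{
		{Path: "src/pay/service.go", Patch: "+ coupon branch"},
		{Path: "src/pay/coupon.go", Patch: "+ Apply"},
		{Path: "src/pay/service_test.go", Patch: "+ test"},
		{Path: "src/pay/generated/client.go", Patch: "+ generated"},
	}
	excluded := func(p string) bool {
		return strings.HasSuffix(p, "_test.go") || strings.Contains(p, "/generated/")
	}
	all := NewDiffMap(diffs)
	for _, t := range buildQueue(diffs, excluded) {
		fmt.Println("queued:", t.File)
	}
	// Filtered-out files are still reachable as context.
	if d, ok := all.Get("src/pay/generated/client.go"); ok {
		fmt.Println("context only:", d.Path)
	}
}
```

Keeping the full diff list in the map while queueing only a subset is what lets a review of service.go still look up the diff of a file that never becomes a task of its own.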

4. Complete the Review Agent Runtime Context

Once the review queue is determined, subsequent reviews proceed file by file for the files in the queue. Before each file enters review, the program assembles a complete runtime context: current file path, current file diff, list of other changed files still in the filtered review queue, requirement background, available tools, and the review rules corresponding to the current file.

Review rules are not a fixed piece of text determined in advance; they are selected based on the current file. When entering a single-file review, the program matches rule.json or built-in rules based on the file path and file type, and passes the matched rule text into the Plan and Review Agent prompts. This way, Go files, frontend files, and XML files in the same review can get different review focuses.

Let's first look at the structure of the Review Agent prompt. It's not sent to the model at this moment, but serves as an assembly template for subsequent per-file reviews: when it's a specific file's turn, the current file path, diff, matched rules, and optional Plan results are filled in.

Expressed in plain language, the Review Agent prompt before assembly looks roughly like this:

System prompt template (simplified version, not the actual prompt):

You are a code review assistant.
Only review newly added or modified code in the current file's diff.
If you need more context, you can call tools.
If you confirm a problem, you must submit a structured comment via `code_comment`.
If the current file review is complete, call `task_done` to end the task.

User prompt template (simplified version, not the actual prompt):

Other changed files:
{{change_files}}

Current file:
{{current_file_path}}

Current file diff:
{{diff}}

Requirement background:
{{requirement_background}}

Review rules:
{{system_rule}}

Review plan (optional, from the Plan phase in large file scenarios):
{{plan_guidance}}

Please review the current file diff.

The placeholders in this template are replaced with actual content only during the single-file review phase. After the runtime context is completed, the Agent begins distributing file subtasks according to the review queue.
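
Filling such a template could look like the sketch below, which uses Go's standard text/template package. It is an illustration under assumptions: the PromptInput struct and render function are invented names, the placeholders are written in text/template's own {{.Field}} syntax rather than the simplified form shown above, and the optional plan section is dropped when no Plan result exists.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// PromptInput carries the per-file values for the user prompt (hypothetical type).
type PromptInput struct {
	ChangeFiles           []string
	CurrentFilePath       string
	Diff                  string
	RequirementBackground string
	SystemRule            string
	PlanGuidance          string // empty when the Plan phase was skipped
}

const userPromptTmpl = `Other changed files:
{{join .ChangeFiles "\n"}}

Current file:
{{.CurrentFilePath}}

Current file diff:
{{.Diff}}

Requirement background:
{{.RequirementBackground}}

Review rules:
{{.SystemRule}}
{{if .PlanGuidance}}
Review plan:
{{.PlanGuidance}}
{{end}}
Please review the current file diff.
`

// The template is parsed once and reused for every file.
var userPrompt = template.Must(
	template.New("user").
		Funcs(template.FuncMap{"join": strings.Join}).
		Parse(userPromptTmpl),
)

// render fills the template for one file.
func render(in PromptInput) (string, error) {
	var b strings.Builder
	if err := userPrompt.Execute(&b, in); err != nil {
		return "", err
	}
	return b.String(), nil
}

func main() {
	prompt, err := render(PromptInput{
		ChangeFiles:           []string{"src/pay/coupon.go"},
		CurrentFilePath:       "src/pay/service.go",
		Diff:                  "+ amount = s.coupon.Apply(req.CouponID, amount)",
		RequirementBackground: "Add coupon deduction to the payment flow.",
		SystemRule:            "Prioritize amount calculations and error handling.",
	})
	if err != nil {
		panic(err)
	}
	fmt.Print(prompt)
}
```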

5. Distribute File Subtasks by Concurrency Level

After loading the diff, the Agent already has a review queue. The job of the concurrent distribution node is straightforward: distribute the files in the queue according to the concurrency level.

Before distribution, a large diff pre-filter is also performed to prevent a single file's diff from being too large and directly overflowing the subsequent prompt. Files exceeding the threshold do not start a complete single-file review task; instead, they are skipped and a warning is recorded. Then the program controls the number of concurrent file tasks based on --concurrency.

Each file task fills the previously seen prompt template into an actual input:

Current file path
Current file diff
Other changed files
Requirement background
Matched review rules
Optional Plan results

The subsequent single-file review revolves around this input.
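
A bounded-concurrency dispatcher with a large-diff pre-filter might be sketched as follows. The dispatch function, the byte-based threshold, and the errModel sentinel are assumptions for illustration; the article does not state how the threshold is measured.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ReviewTask is one queued single-file review.
type ReviewTask struct {
	File string
	Diff string
}

// errModel is a made-up failure used by the demo below.
var errModel = errors.New("model call failed")

// dispatch runs review for every task with at most `concurrency` reviews in
// flight. Tasks whose diff exceeds maxDiffBytes are skipped with a warning;
// a failed review is also recorded as a warning and does not stop the others.
func dispatch(tasks []ReviewTask, concurrency, maxDiffBytes int, review func(ReviewTask) error) []string {
	if concurrency < 1 {
		concurrency = 1
	}
	var (
		mu       sync.Mutex
		wg       sync.WaitGroup
		warnings []string
		sem      = make(chan struct{}, concurrency)
	)
	warn := func(msg string) {
		mu.Lock()
		defer mu.Unlock()
		warnings = append(warnings, msg)
	}
	for _, t := range tasks {
		if len(t.Diff) > maxDiffBytes {
			warn("skipped " + t.File + ": diff too large")
			continue
		}
		sem <- struct{}{} // blocks while `concurrency` reviews are running
		wg.Add(1)
		go func(t ReviewTask) {
			defer wg.Done()
			defer func() { <-sem }()
			if err := review(t); err != nil {
				warn("failed " + t.File + ": " + err.Error())
			}
		}(t)
	}
	wg.Wait()
	return warnings
}

func main() {
	tasks := []ReviewTask{
		{File: "src/pay/service.go", Diff: "+ small change"},
		{File: "src/pay/coupon.go", Diff: "+ another change"},
		{File: "src/pay/huge.go", Diff: string(make([]byte, 4096))},
	}
	warnings := dispatch(tasks, 2, 1024, func(t ReviewTask) error {
		if t.File == "src/pay/coupon.go" {
			return errModel
		}
		return nil
	})
	for _, w := range warnings {
		fmt.Println("warning:", w)
	}
}
```

The buffered channel acts as a counting semaphore, so the loop itself stalls once the limit is reached instead of spawning a goroutine per file up front.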

6. Complete Single-File Agent Review

executeSubtask is the review unit for a single file. It receives a file diff, is responsible for completing the review of this file, and writes the confirmed structured comments into the comment collector.

The input to executeSubtask comes from the queue item handed over by the concurrent distribution node.

Taking src/pay/service.go as an example, when entering executeSubtask, the current file context is roughly:

ReviewTask{
  File: "src/pay/service.go",
  OtherChangedFiles: []string{
    "src/pay/coupon.go",
  },
  RequirementBackground: "This requirement adds coupon deduction capability to the payment flow.",
  ReviewRule: "Only review newly added and modified logic in Go business code; prioritize amount calculations, error handling, and concurrency safety.",
}

Upon entering single-file review, rule selection is based on the current file path. src/pay/service.go is first checked against the custom rules; if it matches src/pay/**/*.go, that rule is used. If no custom rule matches, the lookup continues through the global rules and then the built-in rules. The rule text finally selected fills the review rules section when the prompt is assembled.

Note that OtherChangedFiles here only comes from the filtered review queue, so the example doesn't list service_test.go or generated/client.go. If the Review Agent later needs to confirm the diffs of these files, it can query the DiffMap by path via file_read_diff.

The design constraint here is that a single file gets only one final rule; once a higher-priority rule is matched, subsequent rules are not merged in.

This rule is placed in two places:

Plan input: Lets the Plan know what standards should be used to review the current file when analyzing risk points.
Review Agent input: Fills the "Review Rules" section later, constraining which issues to focus on during the formal review.

If the file change is relatively large, executeSubtask runs a Plan first. The Plan can be understood as a "roadmap before formal review": it first sorts out the main changes and potential risks of the current file, and suggests which files or diffs the Review Agent should prioritize reading to aid judgment.

Plan system prompt (simplified version, not the actual prompt):

You are a code review planning assistant.
Your responsibility is to analyze the current file diff, identify potential risk points, and plan the context that needs to be read for each risk point.

Available tools:

- file_read: Read file content.
- file_read_diff: View diffs of other changed files.
- code_search: Search for call sites, methods with the same name, or similar implementations.
- file_find: Find files by filename clues.

Output requirements:
Output only JSON, no additional explanation.
JSON fields include change_summary, issues, severity, description, tool_guidance.

Analysis rules:

1. Only analyze newly added and modified code, ignore deleted code.
2. issues are sorted by high, medium, low.
3. Tools are only used for planning, not actually called during the Plan phase.
4. description needs to explain the problem location, nature of the problem, and potential impact.

Plan user prompt (simplified version, not the actual prompt):

Other changed files:
src/pay/coupon.go

Current file:
src/pay/service.go

Current file diff:
@@ -42,6 +42,9 @@ func (s *Service) Pay(req PayRequest) error {
  amount := req.Amount
+ if req.CouponID != "" {
+     amount = s.coupon.Apply(req.CouponID, amount)
+ }
  return s.gateway.Charge(amount)
}

Requirement background:
This requirement adds coupon deduction capability to the payment flow.

Review rules:
Only review newly added and modified logic in Go business code; prioritize amount calculations, error handling, and concurrency safety.

Task:
Analyze the above code changes and output a structured review plan using JSON.

The Plan output becomes the subsequent plan_guidance, roughly like this:

{
  "change_summary": "Adjust deduction amount based on CouponID before payment",
  "issues": [
    {
      "severity": "medium",
      "description": "Need to confirm the amount boundary after coupon deduction to avoid negative or zero amount charges",
      "tool_guidance": [
        {
          "name": "file_read",
          "reason": "View the return value constraints of coupon.Apply",
          "arguments": "src/pay/coupon.go"
        }
      ]
    }
  ]
}

This JSON is filled as plan_guidance into the Review Agent's user prompt. After seeing these suggestions, the Review Agent can prioritize reading relevant files and querying relevant diffs, then decide whether to submit a code_comment. If the file change is small, the Plan is skipped, and the review prompt doesn't retain an empty review plan section.
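
For a consumer that wants to work with the Plan output rather than pass it through as text, the JSON above maps onto a few small structs. This is a sketch based only on the example fields shown; the type names, parsePlan, and the decision to re-sort issues by severity on the client side are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// ToolGuidance is one suggested tool call attached to an issue.
type ToolGuidance struct {
	Name      string `json:"name"`
	Reason    string `json:"reason"`
	Arguments string `json:"arguments"`
}

// Issue is one potential risk point found during planning.
type Issue struct {
	Severity     string         `json:"severity"`
	Description  string         `json:"description"`
	ToolGuidance []ToolGuidance `json:"tool_guidance"`
}

// Plan mirrors the top-level Plan JSON.
type Plan struct {
	ChangeSummary string  `json:"change_summary"`
	Issues        []Issue `json:"issues"`
}

var severityRank = map[string]int{"high": 0, "medium": 1, "low": 2}

// rank orders unknown severities after the known ones.
func rank(s string) int {
	if r, ok := severityRank[s]; ok {
		return r
	}
	return len(severityRank)
}

// parsePlan decodes the model output and enforces high, medium, low ordering
// in case the model did not follow the sorting instruction.
func parsePlan(raw string) (Plan, error) {
	var p Plan
	if err := json.Unmarshal([]byte(raw), &p); err != nil {
		return Plan{}, err
	}
	sort.SliceStable(p.Issues, func(i, j int) bool {
		return rank(p.Issues[i].Severity) < rank(p.Issues[j].Severity)
	})
	return p, nil
}

func main() {
	raw := `{
	  "change_summary": "Adjust deduction amount based on CouponID before payment",
	  "issues": [{
	    "severity": "medium",
	    "description": "Confirm the amount boundary after coupon deduction",
	    "tool_guidance": [{"name": "file_read", "reason": "Check coupon.Apply", "arguments": "src/pay/coupon.go"}]
	  }]
	}`
	plan, err := parsePlan(raw)
	if err != nil {
		panic(err)
	}
	for _, is := range plan.Issues {
		fmt.Println(is.Severity, "->", is.ToolGuidance[0].Name, is.ToolGuidance[0].Arguments)
	}
}
```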

Then executeSubtask generates the Review Agent prompt. It also consists of a system prompt and a user prompt.

Review Agent system prompt (simplified version, not the actual prompt):

You are a code review assistant.
Only review newly added or modified code in the current file's diff.
If you need more context, you can call tools.
If you confirm a problem, you must submit a structured comment via `code_comment`.
If the current file review is complete, call `task_done` to end the task.

Review Agent user prompt (simplified version, not the actual prompt):

Other changed files:
src/pay/coupon.go

Current file:
src/pay/service.go

Current file diff:
@@ -42,6 +42,9 @@ func (s *Service) Pay(req PayRequest) error {
  amount := req.Amount
+ if req.CouponID != "" {
+     amount = s.coupon.Apply(req.CouponID, amount)
+ }
  return s.gateway.Charge(amount)
}

Requirement background:
This requirement adds coupon deduction capability to the payment flow.

Review rules:
Only review newly added and modified logic in Go business code; prioritize amount calculations, error handling, and concurrency safety.

Review plan:
{
  "change_summary": "Adjust deduction amount based on CouponID before payment",
  "issues": [
    {
      "severity": "medium",
      "description": "Need to confirm the amount boundary after coupon deduction to avoid negative or zero amount charges",
      "tool_guidance": [
        {
          "name": "file_read",
          "reason": "View the return value constraints of coupon.Apply",
          "arguments": "src/pay/coupon.go"
        }
      ]
    }
  ]
}

Please review the current file diff.

Next comes the Review Agent's review loop. The model gets the current messages and tool definitions in each round, and can perform three main types of actions:

Read context: e.g., view related files or other diffs.
Submit comments: Call code_comment after confirming an issue.
End task: Call task_done after the current file review is complete.

The roles of several core tools are as follows:

Tool	Role	Typical Scenario
`file_read`	Read the content of the changed file	Need to view the context of the current file or related files
`file_read_diff`	View diffs of other files in this change	Plan suggests confirming related file changes, or the current issue requires cross-file judgment
`code_search`	Search the codebase by text or regex	Need to find call sites, methods with the same name, configuration items, or similar implementations
`file_find`	Find files by filename keyword	Only know filename clues, but not the exact path
`code_comment`	Submit a structured review comment	Issue confirmed, needs to enter the final comment list
`task_done`	End the current file review	No more issues in the current file, or comments have been submitted

Ordinary assistant text doesn't directly become the final result. Only structured comments submitted via code_comment enter the comment collector. After the single-file main loop ends, a comment filtering step is performed to delete comments that can be proven wrong by the current diff alone. For example, if the model says "a newly added variable is not used", but the subsequent newly added code in the current diff does use this variable, such comments can be filtered out.
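
The loop and the collector can be pictured with the sketch below, where a scripted function stands in for the model. None of this is OCR's real code: ToolCall, Model, Tool, Collector, reviewLoop, and the maxRounds cap are hypothetical, and the transcript is reduced to plain strings.

```go
package main

import (
	"fmt"
	"sync"
)

// ToolCall is one tool invocation requested by the model.
type ToolCall struct {
	Name string
	Args map[string]string
}

// Comment is a structured review comment.
type Comment struct {
	File string
	Body string
}

// Collector gathers comments from concurrently running file reviews.
type Collector struct {
	mu       sync.Mutex
	comments []Comment
}

func (c *Collector) Add(cm Comment) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.comments = append(c.comments, cm)
}

// All returns a copy so callers cannot mutate the collected list.
func (c *Collector) All() []Comment {
	c.mu.Lock()
	defer c.mu.Unlock()
	return append([]Comment(nil), c.comments...)
}

// Model stands in for the LLM: given the transcript so far it returns the next
// tool call, or ok=false when it only produced plain text.
type Model func(transcript []string) (call ToolCall, ok bool)

// Tool is a context-reading tool such as file_read or code_search.
type Tool func(args map[string]string) string

// reviewLoop runs one file's review until task_done or maxRounds (an assumed cap).
func reviewLoop(file string, next Model, tools map[string]Tool, out *Collector, maxRounds int) bool {
	var transcript []string
	for round := 0; round < maxRounds; round++ {
		call, ok := next(transcript)
		if !ok {
			// Plain assistant text never becomes a result.
			transcript = append(transcript, "assistant text (ignored)")
			continue
		}
		switch call.Name {
		case "task_done":
			return true
		case "code_comment":
			out.Add(Comment{File: file, Body: call.Args["body"]})
			transcript = append(transcript, "code_comment: recorded")
		default:
			tool, known := tools[call.Name]
			if !known {
				transcript = append(transcript, "unknown tool: "+call.Name)
				continue
			}
			transcript = append(transcript, call.Name+": "+tool(call.Args))
		}
	}
	return false
}

func main() {
	script := []ToolCall{
		{Name: "file_read", Args: map[string]string{"path": "src/pay/coupon.go"}},
		{Name: "code_comment", Args: map[string]string{"body": "Apply may return a negative amount"}},
		{Name: "task_done"},
	}
	model := func(transcript []string) (ToolCall, bool) {
		if len(transcript) >= len(script) {
			return ToolCall{}, false
		}
		return script[len(transcript)], true
	}
	tools := map[string]Tool{
		"file_read": func(args map[string]string) string { return "contents of " + args["path"] },
	}
	var out Collector
	finished := reviewLoop("src/pay/service.go", model, tools, &out, 10)
	fmt.Println(finished, out.All())
}
```

Routing results through a single code_comment case is what makes the output structured: whatever else the model writes stays in the transcript and never reaches the collector.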

At this point, the single-file review ends. The output of executeSubtask is not the final output, but comments written to the comment collector; if this file fails, it is recorded as a warning, and other file tasks can still continue.

7. Aggregate Structured Comments and Output

Concurrent distribution starts multiple file subtasks, and single-file reviews write comments into the comment collector. The aggregation output node is responsible for global wrap-up: wait for all file reviews to finish, then organize the comments into results visible to the user.

This wrap-up process mainly does four things:

Wait for all file reviews to end, ensuring all comments have been written.
If all file reviews fail, this review directly fails.
Retrieve the comment list from the comment collector.
Perform final organization of comments, then output to the user or other systems.

Before output, another layer of result organization is applied:

Fill in missing comment line numbers based on the code change location.
Collect statistics on review duration, file count, comment count, and model usage.
If there are no reviewable files, output a "skipped" status in JSON mode.
Choose text or JSON output based on --format.

Text output is for terminal users, focusing on comment location, comment content, and suggested fixes.
JSON output is for CI or other Agents, including status, summary, comments, warnings, and statistics.
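
The wrap-up rules can be sketched as a small aggregation function. The top-level sections follow the list above, but the exact JSON key names, the Comment fields, the "completed" status value, and the errAllFailed sentinel are assumptions; only the "skipped" status is named in this article.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Comment is one structured review comment (field names assumed).
type Comment struct {
	File string `json:"file"`
	Line int    `json:"line"`
	Body string `json:"body"`
}

// Result is a hypothetical shape for the JSON output.
type Result struct {
	Status     string         `json:"status"`
	Summary    string         `json:"summary"`
	Comments   []Comment      `json:"comments"`
	Warnings   []string       `json:"warnings"`
	Statistics map[string]int `json:"statistics"`
}

var errAllFailed = errors.New("all file reviews failed")

// aggregate applies the wrap-up rules: no reviewable files yields "skipped",
// every file failing fails the whole review, anything else is reported.
func aggregate(reviewed, failed int, comments []Comment, warnings []string) (Result, error) {
	if comments == nil {
		comments = []Comment{} // encode as [] rather than null
	}
	if warnings == nil {
		warnings = []string{}
	}
	res := Result{
		Comments:   comments,
		Warnings:   warnings,
		Statistics: map[string]int{"files": reviewed, "comments": len(comments)},
	}
	switch {
	case reviewed == 0:
		res.Status, res.Summary = "skipped", "no reviewable files"
	case failed == reviewed:
		return Result{}, errAllFailed
	default:
		res.Status = "completed" // assumed status name
		res.Summary = fmt.Sprintf("%d comment(s) across %d file(s)", len(comments), reviewed)
	}
	return res, nil
}

func main() {
	res, err := aggregate(2, 1,
		[]Comment{{File: "src/pay/service.go", Line: 44, Body: "Check for negative amount"}},
		[]string{"failed src/pay/coupon.go: model call failed"},
	)
	if err != nil {
		panic(err)
	}
	out, err := json.MarshalIndent(res, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

A CI job consuming this output would typically branch on the status first and only then iterate over the comments.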

The entire chain ends here.

Summary: The Key to Engineering is Controllable Review

From an engineering perspective, the difficulty of AI Code Review is not just "whether the model can understand the code", but whether a review can be stably organized. The review scope must be clear, rules must be configurable, the execution process must be able to supplement context, results must be output structurally, and failures must be recorded and handled.

The value of ocr review lies here. It doesn't throw the diff directly at a general-purpose Agent; instead, it breaks the review into a fixed chain: use configuration and rules to constrain input, use the review queue to control scope, use Plan to reduce uncertainty in large file reviews, use tools to supplement context, and use structured comments to carry the final result.

This is also its main difference from "let the Agent look at my code for me". The former is more like an engineering task that can be plugged into a pipeline; the latter is more like an ad-hoc Q&A. What truly affects the effectiveness of deployment is not just what the model ultimately says, but whether the review process can be constrained by rules, supported by tools, and consumed by systems. For an AI Code Review tool, these engineering boundaries are often more important than the quality of a single response.
