RapidOCR Drops PaddleOCR's 5-Second Lag to Near-Instant by Switching to ONNX

My original website article: https://www.javayong.cn/ocr/01-RapidOcr-introduction.html

While learning to use PaddleOCR, I tested it on my local Mac M1 and found its performance unsatisfactory—processing each image took about 5 seconds. So I started looking for alternatives and unexpectedly discovered that a large number of developers in the community recommend RapidOCR.

This made me wonder: Why is RapidOCR so fast?

In today's article, we will dive into the core advantages of RapidOCR, analyze the technical secret behind its high performance—ONNX—and finally introduce how to use it.

Introduction to RapidOCR

RapidOCR is a fully open-source, free, multi-platform, multi-language OCR tool that supports fast offline deployment, with extreme speed and broad compatibility as its core strengths.

PaddleOCR's engineering and deployment experience still has room for improvement, so RapidOCR focuses on simplifying and accelerating OCR model inference on a wide range of devices. The project converts PaddleOCR models to the widely supported ONNX format and provides implementations in Python, C++, Java, and C#, so developers can get started quickly and integrate it on most platforms.

At this point, you might be wondering: What is ONNX?

What is ONNX

ONNX (Open Neural Network Exchange) is an open format for representing deep learning models. It defines the data structure and computation graph of a model, enabling models trained with different deep learning frameworks to be converted and used interchangeably.

Its main features are:

Cross-framework compatibility: Supports model export from mainstream frameworks like PyTorch, TensorFlow, and MXNet
Cross-platform deployment: Can run seamlessly on different hardware platforms and operating systems
High-performance inference: ONNX-compatible runtimes such as ONNX Runtime apply inference optimizations (for example, graph-level optimizations) that can significantly improve runtime efficiency
Extensible architecture: Supports custom operators for easy functional expansion

Developers can train models using any supported framework, then export them to ONNX format, and deploy them using any ONNX-compatible inference engine.
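To make the "model as a data structure plus computation graph" idea concrete, here is a toy sketch in plain Python. It is not the real ONNX format or API: the dictionary layout, the operator table, and the `run_graph` function are all hypothetical names invented for illustration. The point is only that once a model is described as plain data (named tensors plus a list of operator nodes), any runtime that implements those operators can execute it.

```python
# A toy, framework-neutral model description: plain data, no framework objects.
# This layout is hypothetical and far simpler than a real ONNX file.
TOY_MODEL = {
    "inputs": ["x"],
    "outputs": ["y"],
    "weights": {"w": [2.0, -1.0, 0.5], "b": [0.0, 1.0, -1.0]},
    "nodes": [
        {"op": "Mul", "inputs": ["x", "w"], "output": "xw"},
        {"op": "Add", "inputs": ["xw", "b"], "output": "z"},
        {"op": "Relu", "inputs": ["z"], "output": "y"},
    ],
}

# A "runtime" only needs one implementation per operator type.
OPS = {
    "Mul": lambda a, b: [p * q for p, q in zip(a, b)],
    "Add": lambda a, b: [p + q for p, q in zip(a, b)],
    "Relu": lambda a: [max(0.0, p) for p in a],
}


def run_graph(model, feeds):
    """Execute the nodes in listed order, looking tensors up by name."""
    tensors = dict(model["weights"])
    tensors.update(feeds)
    for node in model["nodes"]:
        args = [tensors[name] for name in node["inputs"]]
        tensors[node["output"]] = OPS[node["op"]](*args)
    return {name: tensors[name] for name in model["outputs"]}


result = run_graph(TOY_MODEL, {"x": [1.0, 2.0, 3.0]})
print(result)  # {'y': [2.0, 0.0, 0.5]}
```

Because `TOY_MODEL` is ordinary data, it could be written to a file by one program and executed by another written in a different language, which is roughly the role ONNX plays between training frameworks and inference engines.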

Take machine vision as an example: C# is a mainstream language for application development in that field, so the ONNX format is especially popular there. Developers do not need to install the bulky PaddlePaddle framework or write recognition code in Python; the ONNX Runtime library alone is enough to run the models, which greatly improves development efficiency.

In short, ONNX is like a "universal language" for deep learning models, allowing models to migrate freely between different environments and platforms.

Quick Start

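First, create a Python project and add the dependencies to the requirements file. RapidOCR runs on top of an inference backend; this example assumes onnxruntime, listed here in case it is not already installed:

# OCR
rapidocr==3.9.0
onnxruntime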

Next, encapsulate a RapidOCR example:

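import numpy as np
from PIL import Image
from rapidocr import RapidOCR

# Path to the test image (placeholder; point this at your own file)
TEST_IMAGE_1 = "test_image_1.png"


def example_3_chinese_english_mixed():
    # Create a RapidOCR instance (default parameters; the default models handle mixed Chinese and English text)
    ocr = RapidOCR()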

    # Read the image
    img = Image.open(TEST_IMAGE_1)
    img_array = np.array(img)

    print(f"Image info: {img.size} {img.mode}")

    # Execute OCR
    result = ocr(img_array)

    # Convert result format (RapidOCR 3.8+ returns a RapidOCROutput object)
    result = convert_result(result)

    # result format: [[[x1,y1],[x2,y1],[x2,y2],[x1,y2]], (text, confidence)]
    print(f"\nDetected {len(result)} text boxes\n")

    for i, (box, (text, score)) in enumerate(result, 1):
        print(f"{i}. Text: {text}")
        print(f"   Confidence: {score:.4f}")
        print(f"   Coordinates: {box}\n")


def convert_result(result):
    """
    Convert the RapidOCROutput object from RapidOCR 3.8+ to the old format

    Args:
        result: The result returned by RapidOCR (could be a RapidOCROutput object or a list)

    Returns:
        Old format list: [([[x1,y1],[x2,y1],[x2,y2],[x1,y2]], (text, confidence)), ...]
    """
    if hasattr(result, 'boxes'):
        # RapidOCROutput object
        if result.boxes is None:
            return []  # No text detected; return an empty list so callers can still iterate
        return [[box.tolist(), (txt, float(score))]
                for box, txt, score in zip(result.boxes, result.txts, result.scores)]
    return result  # Already in old format

[Figure: printed result]

After conversion with convert_result, the recognition results use the following format:

[
    [box_points, (text, score)],
    [box_points, (text, score)],
    ...
]

Field descriptions:

box_points: The four vertex coordinates of the text box
text: The recognized text content
score: Confidence score (between 0-1, closer to 1 indicates more reliable recognition)
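As a small illustration of how these fields can be used, the sketch below drops low-confidence entries and joins the remaining text in rough reading order. It assumes each entry has the `[box_points, (text, score)]` shape produced by `convert_result` above; the helper names `box_bounds` and `extract_text`, the 0.5 threshold, and the sample data are made up for this example and are not part of RapidOCR.

```python
def box_bounds(box):
    """Return (left, top, right, bottom) of a four-point text box."""
    xs = [point[0] for point in box]
    ys = [point[1] for point in box]
    return min(xs), min(ys), max(xs), max(ys)


def extract_text(results, min_score=0.5):
    """Keep confident entries and join them top-to-bottom, then left-to-right."""
    kept = [(box, text) for box, (text, score) in results if score >= min_score]
    # Sort by the top edge first, then by the left edge.
    kept.sort(key=lambda item: (box_bounds(item[0])[1], box_bounds(item[0])[0]))
    return "\n".join(text for _, text in kept)


# Hand-written sample data in the converted format; not real OCR output.
sample = [
    [[[10, 60], [200, 60], [200, 90], [10, 90]], ("second line", 0.97)],
    [[[10, 10], [180, 10], [180, 40], [10, 40]], ("first line", 0.99)],
    [[[220, 62], [260, 62], [260, 88], [220, 88]], ("~", 0.21)],
]

print(extract_text(sample))
```

Sorting by the top edge is a deliberately naive ordering: boxes on the same visual line that differ by a pixel or two in height may come out in the wrong order, so real layouts usually need some tolerance when grouping boxes into lines.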

[Figure: example of the OCR result rendered on the image]

Summary

By adopting the ONNX format and supporting multiple inference engines, RapidOCR delivers fast and accurate OCR. Because it is open source and cross-platform, it is an excellent choice in the OCR field.

If you also hit performance bottlenecks with PaddleOCR on a machine without a GPU, give RapidOCR a try; it may well surprise you.
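One way to check whether the switch pays off on your own machine is to time both engines on the same image. The helper below is a minimal sketch using only the standard library; `time_ocr` and `fake_ocr` are hypothetical names, and the stand-in workload exists only so the snippet runs without any OCR library installed.

```python
import statistics
import time


def time_ocr(run_once, repeats=5, warmup=1):
    """Return the median wall-clock seconds of `run_once` over `repeats` calls."""
    for _ in range(warmup):
        run_once()  # warm-up calls are excluded; the first call may include one-time setup
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_once()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


# Stand-in workload; replace it with a real OCR call when measuring.
def fake_ocr():
    time.sleep(0.01)


median_seconds = time_ocr(fake_ocr, repeats=3)
print(f"median: {median_seconds:.3f} s per image")
```

With the objects from the Quick Start example, the call would look something like `time_ocr(lambda: ocr(img_array))`. The median is used instead of the mean so that a single slow run does not dominate the result.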