---
comments: true
---

# Text Image Rectification Module Usage Tutorial

## 1. Overview

The primary purpose of text image rectification is to apply geometric transformations to document images in order to correct distortion, skew, perspective deformation, and similar issues, so that subsequent text recognition is more accurate.

## 2. Supported Model List
Model | Model Download Link | CER | Model Storage Size (M) | Description |
---|---|---|---|---|
UVDoc | Inference Model / Training Model | 0.179 | 30.3 M | High-accuracy text image rectification model |
Mode | GPU Configuration | CPU Configuration | Acceleration Technology Combination |
---|---|---|---|
Regular Mode | FP32 precision / no TRT acceleration | FP32 precision / 8 threads | PaddleInference |
High-Performance Mode | Optimal combination of precision type and acceleration strategy selected in advance | FP32 precision / 8 threads | Optimal backend selected in advance (Paddle/OpenVINO/TRT, etc.) |
Parameter | Description | Type | Default |
---|---|---|---|
`model_name` | Name of the model. | `str` | `None` |
`model_dir` | Model storage path. | `str` | `None` |
`device` | Device(s) to use for inference. Examples: `cpu`, `gpu`, `npu`, `gpu:0`, `gpu:0,1`. If multiple devices are specified, inference is performed in parallel. Note that parallel inference is not always supported. By default, GPU 0 is used if available; otherwise, the CPU is used. | `str` | `None` |
`enable_hpi` | Whether to use high-performance inference. | `bool` | `False` |
`use_tensorrt` | Whether to use the Paddle Inference TensorRT subgraph engine. | `bool` | `False` |
`min_subgraph_size` | Minimum subgraph size for TensorRT when using the Paddle Inference TensorRT subgraph engine. | `int` | `3` |
`precision` | Precision for TensorRT when using the Paddle Inference TensorRT subgraph engine. Options: `fp32`, `fp16`, etc. | `str` | `fp32` |
`enable_mkldnn` | Whether to enable MKL-DNN acceleration for CPU inference. | `bool` | `True` |
`cpu_threads` | Number of threads to use for inference on CPUs. | `int` | `10` |
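As a quick illustration of how these instantiation parameters fit together, here is a minimal sketch. The `TextImageUnwarping` class name and the `paddleocr` import path are assumptions inferred from the parameter names above; adjust them to the actual package and class in your installation.

```python
# Sketch: instantiating the text image rectification model with the
# parameters documented above. Class name and import path are assumed.
from paddleocr import TextImageUnwarping

model = TextImageUnwarping(
    model_name="UVDoc",    # model from the supported model list
    device="gpu:0",        # use "cpu" if no GPU is available
    enable_mkldnn=True,    # CPU acceleration (default: True)
    cpu_threads=10,        # CPU inference threads (default: 10)
)
```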
Parameter | Description | Type | Default |
---|---|---|---|
`input` | Input data to be predicted. Required. Supports multiple input types: an in-memory Python variable, a path given as a string, or a list of such inputs. | `Python Var\|str\|list` | |
`batch_size` | Batch size, a positive integer. | `int` | `1` |
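With a model instance in hand, prediction is a single call. The snippet below is a sketch under the same assumed class name and import path as above; the input file name is a placeholder.

```python
# Sketch: running prediction on one image (placeholder file name).
from paddleocr import TextImageUnwarping  # assumed import path

model = TextImageUnwarping(model_name="UVDoc")
# `input` also accepts a list of paths or in-memory variables, per the
# table above; `batch_size` must be a positive integer.
output = model.predict(input="doc_test.jpg", batch_size=1)
```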
Method | Description | Parameter | Type | Parameter Description | Default Value |
---|---|---|---|---|---|
`print()` | Print the result to the terminal | `format_json` | `bool` | Whether to format the output content with `JSON` indentation | `True` |
 | | `indent` | `int` | Indentation level used to beautify the output `JSON` data for readability; effective only when `format_json` is `True` | `4` |
 | | `ensure_ascii` | `bool` | Controls whether to escape non-`ASCII` characters to `Unicode`. When set to `True`, all non-`ASCII` characters are escaped; `False` retains the original characters; effective only when `format_json` is `True` | `False` |
`save_to_json()` | Save the result as a `JSON`-format file | `save_path` | `str` | Path to save the file. When a directory is specified, the saved file is named consistently with the input file. | `None` |
 | | `indent` | `int` | Indentation level used to beautify the output `JSON` data for readability; effective only when `format_json` is `True` | `4` |
 | | `ensure_ascii` | `bool` | Controls whether to escape non-`ASCII` characters to `Unicode`. When set to `True`, all non-`ASCII` characters are escaped; `False` retains the original characters; effective only when `format_json` is `True` | `False` |
`save_to_img()` | Save the result as an image file | `save_path` | `str` | Path to save the file. When a directory is specified, the saved file is named consistently with the input file. | `None` |
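The result objects returned by `predict()` expose the methods above. A hedged usage sketch, with the same assumed class name and a placeholder input path:

```python
# Sketch: printing and saving each prediction result.
from paddleocr import TextImageUnwarping  # assumed import path

model = TextImageUnwarping(model_name="UVDoc")
for res in model.predict(input="doc_test.jpg", batch_size=1):
    # Pretty-print to the terminal; indent/ensure_ascii only take effect
    # when format_json is True.
    res.print(format_json=True, indent=4, ensure_ascii=False)
    # Passing a directory makes the saved file name follow the input file.
    res.save_to_json(save_path="./output/")
    res.save_to_img(save_path="./output/")
```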
Attribute | Description |
---|---|
`json` | Get the prediction result in `json` format |
`img` | Get the visualized image in `dict` format |
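The same result objects also expose the attributes above for programmatic access. A brief sketch under the same assumptions:

```python
# Sketch: reading the prediction result and visualization in memory.
from paddleocr import TextImageUnwarping  # assumed import path

model = TextImageUnwarping(model_name="UVDoc")
for res in model.predict(input="doc_test.jpg", batch_size=1):
    data = res.json   # prediction result in JSON-style form
    images = res.img  # visualized image(s), returned as a dict
    for name, image in images.items():
        print(name, type(image))
```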