CN114267046A - Method and device for orientation correction of document image - Google Patents

Method and device for orientation correction of document image

Info

Publication number
CN114267046A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111679610.5A
Other languages
Chinese (zh)
Inventor
刘鹏伟
郭丰俊
龙腾
丁凯
张彬
镇立新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Linguan Data Technology Co ltd
Shanghai Shengteng Data Technology Co ltd
Shanghai Yingwuchu Data Technology Co ltd
Shanghai Hehe Information Technology Development Co Ltd
Original Assignee
Shanghai Linguan Data Technology Co ltd
Shanghai Shengteng Data Technology Co ltd
Shanghai Yingwuchu Data Technology Co ltd
Shanghai Hehe Information Technology Development Co Ltd
Application filed by Shanghai Linguan Data Technology Co ltd, Shanghai Shengteng Data Technology Co ltd, Shanghai Yingwuchu Data Technology Co ltd, Shanghai Hehe Information Technology Development Co Ltd
Priority to CN202111679610.5A
Publication of CN114267046A
Priority to PCT/CN2022/088550 (WO2023123763A1)
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/08 Learning methods
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G06V10/24 Aligning, centring, orientation detection or correction of the image
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition

Abstract

The present application discloses a method for correcting the orientation of a document image. The edges and four corner points of the document area in the input image are located, and a perspective transformation is used to trim the input image and correct its small-angle orientation. The trimmed and small-angle-corrected image is passed through an angle classification model to obtain a deviation angle detection value, which takes only one of four values: 0 degrees, 90 degrees, 180 degrees, or 270 degrees. The orientation of the trimmed and small-angle-corrected image is then corrected according to the deviation angle detection value. The application recasts the problem of computing the orientation of a document image as a classification problem over background images and four large-angle orientations, so computation is fast. The application does not process each small angle individually, which reduces the complexity of document image orientation correction and makes it easier for a neural network to learn.

Description

Method and device for correcting direction of document image
Technical Field
The application relates to a method for correcting the direction of a document image.
Background
The document image refers to a document in an image format, and is usually a document converted from a paper document into the image format by means of photographing, scanning, and the like. The direction in which a document can be read correctly is generally considered to be the correct direction, and some document images are not oriented correctly, for example, inverted by 180 degrees. In order to perform operations such as browsing and reading, OCR (optical character recognition), and the like, the orientation of the document image needs to be corrected to a correct orientation.
Chinese patent application publication No. CN112329777A, "Character recognition method, apparatus, device, and medium based on direction detection", published on February 5, 2021, discloses: rotating sliced samples to obtain a first training sample; training a MobileNet-v2 network with the first training sample to obtain a text direction detection model; when a picture to be detected is received, performing text position detection on the picture to obtain at least one character slice; and inputting each preprocessed character slice into the text direction detection model and taking the model's output as the text direction of that character slice. That document mainly concerns detecting the reading direction of a single line of text in a document image, rather than the direction of the entire document image.
Disclosure of Invention
The technical problem to be solved by the application is to provide a method for correcting the direction of a document image, which utilizes the information of a document area in the document image to judge the direction of the image and correct the image quickly and accurately.
In order to solve the above technical problem, the present application provides a method for correcting the orientation of a document image, including the following steps.
Step S10: searching for the edges and four corner points of a document area in an input image, and performing edge trimming and small-angle direction correction on the input image by using a perspective transformation method. If the input image is a document image, the trimmed image is the document area of the input image. The small-angle direction correction corrects the document area of the input image to one of four forms whose deviation angle from the correct direction is 0 degrees, 90 degrees, 180 degrees, or 270 degrees, namely whichever of the four forms the document area is closest to.
Step S20: passing the trimmed and small-angle-corrected image through an angle classification model to obtain a deviation angle detection value. The deviation angle detection value takes only one of four values: 0 degrees, 90 degrees, 180 degrees, or 270 degrees.
The angle classification model is obtained as follows: (1) collecting a plurality of trimmed document images, non-document background images, and corresponding direction labels as a training data set, where the deviation angle between the actual direction of each trimmed document image and the correct direction is 0 degrees, 90 degrees, 180 degrees, or 270 degrees, the direction label of a trimmed document image records that deviation angle, and the direction label of a background image indicates that the image is a background image; (2) randomly rotating some or all of the images in the training data set in units of 90 degrees, and changing the direction labels of the rotated trimmed document images accordingly, to obtain an enhanced training data set; (3) training an angle classification model with the enhanced training data set, where the angle classification model distinguishes document images from background images and identifies which of the four values is the deviation angle between the actual direction and the correct direction of each document image.
Step S30: correcting the direction of the trimmed and small-angle-corrected image according to the deviation angle detection value.
Further, in step S10, if the edge and the four corner points of the document region in the input image cannot be found, it indicates that the input image is not a document image, and the whole process is exited.
Further, in step S10, the deviation angle α is defined as the angle measured clockwise from the correct direction of the trimmed image to the actual direction of the trimmed image, with 0 degrees ≤ α < 360 degrees. The small-angle direction correction comprises: correcting trimmed images with 0 degrees < α < 45 degrees or 315 degrees < α < 360 degrees into trimmed images with α = 0 degrees; correcting trimmed images with 45 degrees < α < 135 degrees into trimmed images with α = 90 degrees; correcting trimmed images with 135 degrees < α < 225 degrees into trimmed images with α = 180 degrees; and correcting trimmed images with 225 degrees < α < 315 degrees into trimmed images with α = 270 degrees.
Further, the small-angle direction correction also includes the following boundary cases: when α = 0 degrees, no small-angle correction is performed; when α = 45 degrees, the trimmed image is corrected to either α = 0 degrees or α = 90 degrees; when α = 135 degrees, it is corrected to either α = 90 degrees or α = 180 degrees; when α = 225 degrees, it is corrected to either α = 180 degrees or α = 270 degrees; when α = 315 degrees, it is corrected to either α = 270 degrees or α = 0 degrees.
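The snapping rule above can be illustrated with a short Python sketch. The function name `snap_deviation_angle` is hypothetical, and sending the boundary angles (45, 135, 225, 315 degrees) to the next target clockwise is only one of the two choices the text permits; the application itself performs the correction by perspective transformation rather than by computing α explicitly.

```python
def snap_deviation_angle(alpha: float) -> int:
    """Return the nearest of 0, 90, 180, 270 for a clockwise deviation angle in degrees."""
    alpha = alpha % 360.0  # normalise into [0, 360)
    # Adding 45 before the floor division rounds to the nearest multiple of 90.
    # Exact ties (45, 135, 225, 315) go to the next target clockwise, which is an
    # assumption; the text allows either neighbour at those angles.
    return (int((alpha + 45.0) // 90.0) % 4) * 90


if __name__ == "__main__":
    for angle in (10, 100, 200, 300, 350):
        print(angle, "->", snap_deviation_angle(angle))
```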
Further, in step S20, the angle classification model is obtained by training using a lightweight neural network.
Preferably, in step S20, all images in the enhanced training data set are uniformly scaled to a fixed input size when the angle classification model is trained.
Preferably, in step S20, the trimmed and small-angle-corrected image is first scaled to the fixed input size used when training the angle classification model, and the scaled image is then fed to the angle classification model.
Further, in step S20, if the angle classification model determines that the trimmed and small-angle-corrected image is a background image, the whole process is exited.
Further, in step S20, if the angle classification model determines that the deviation angle detection value of the trimmed and small-angle-corrected image is 0 degrees, the whole process is exited.
The application also provides a device for correcting the direction of a document image, which comprises a trimming and small-angle direction correcting unit, a deviation angle detecting unit, and a large-angle direction correcting unit.
The trimming and small-angle direction correcting unit is used for searching for the edges and four corner points of a document area in an input image and performing edge trimming and small-angle direction correction on the input image by using a perspective transformation method. If the input image is a document image, the trimmed image is the document area of the input image. The small-angle direction correction corrects the document area of the input image to one of four forms whose deviation angle from the correct direction is 0 degrees, 90 degrees, 180 degrees, or 270 degrees, namely whichever of the four forms the document area is closest to.
The deviation angle detecting unit is used for passing the trimmed and small-angle-corrected image through an angle classification model to obtain a deviation angle detection value, which takes only one of four values: 0 degrees, 90 degrees, 180 degrees, or 270 degrees. The angle classification model is obtained as follows: (1) collecting a plurality of trimmed document images, non-document background images, and corresponding direction labels as a training data set, where the deviation angle between the actual direction of each trimmed document image and the correct direction is 0 degrees, 90 degrees, 180 degrees, or 270 degrees, the direction label of a trimmed document image records that deviation angle, and the direction label of a background image indicates that the image is a background image; (2) randomly rotating some or all of the images in the training data set in units of 90 degrees, and changing the direction labels of the rotated trimmed document images accordingly, to obtain an enhanced training data set; (3) training an angle classification model with the enhanced training data set, where the angle classification model distinguishes document images from background images and identifies which of the four values is the deviation angle between the actual direction and the correct direction of each document image.
The large-angle direction correcting unit is used for correcting the direction of the trimmed and small-angle-corrected image according to the deviation angle detection value.
The technical effect achieved by this application is a solution for quickly and accurately trimming document images and correcting their direction. After a picture is input, the system automatically detects the document area in the picture using a detection algorithm and outputs the four corner points of the document area, cuts out the document area by perspective transformation while performing small-angle direction correction, and then performs large-angle direction correction based on the output of the angle classification model, which makes it easier to browse the document image or process it further. The present application addresses direction detection and correction of the entire image, rather than the direction of a single line of text within it. Trimming is performed before the direction of the document image is corrected, which improves the accuracy of the direction correction. Background images are included in the direction classification of document images, which improves the accuracy of that classification.
Drawings
Fig. 1 is a schematic flow chart of a method for correcting the orientation of a document image according to the present application.
Figs. 2 to 5 are schematic diagrams of several trimmed images before small-angle direction correction by perspective transformation.
Figs. 6 to 9 are schematic diagrams of several trimmed images after small-angle direction correction by perspective transformation.
Fig. 10 is a schematic structural diagram of an orientation correction apparatus for document images proposed in the present application.
The reference numbers in the figures illustrate: 10 is a trimming and small-angle direction correcting unit, 20 is a deviation angle detecting unit, and 30 is a large-angle direction correcting unit.
Detailed Description
Referring to fig. 1, the method for correcting the orientation of a document image according to the present application includes the following steps.
Step S10: searching for the edges and four corner points (corners) of a document region in an input image, and performing edge trimming and small-angle direction correction on the input image by using a perspective transformation method to obtain a trimmed and small-angle-corrected image. If the input image is a document image, the trimmed image is the document region of the input image. The small-angle direction correction corrects the document area of the input image to one of four forms whose deviation angle from the correct direction is 0 degrees, 90 degrees, 180 degrees, or 270 degrees, namely whichever of the four forms the document area is closest to.
If the edges and four corner points of the document area in the input image cannot be found in this step, the input image is treated as a background image and the whole process is exited.
Figs. 2 to 5 show trimmed images before small-angle direction correction by perspective transformation, and Figs. 6 to 9 show the trimmed images after that correction. In these figures the dotted line represents the correct direction of the trimmed image and the solid line represents its actual direction. The deviation angle α is defined as the angle measured clockwise from the correct direction of the trimmed image to its actual direction, with 0 degrees ≤ α < 360 degrees. The small-angle direction correction specifically means the following.
(1) Trimmed images with 0 degrees < α < 45 degrees or 315 degrees < α < 360 degrees are corrected into trimmed images with α = 0 degrees. The trimmed image shown in Fig. 2 is corrected in this way to obtain the trimmed image shown in Fig. 6.
(2) Trimmed images with 45 degrees < α < 135 degrees are corrected into trimmed images with α = 90 degrees. The trimmed image shown in Fig. 3 is corrected in this way to obtain the trimmed image shown in Fig. 7.
(3) Trimmed images with 135 degrees < α < 225 degrees are corrected into trimmed images with α = 180 degrees. The trimmed image shown in Fig. 4 is corrected in this way to obtain the trimmed image shown in Fig. 8.
(4) Trimmed images with 225 degrees < α < 315 degrees are corrected into trimmed images with α = 270 degrees. The trimmed image shown in Fig. 5 is corrected in this way to obtain the trimmed image shown in Fig. 9.
There are several special cases. When α is 0 degrees, no small-angle correction is necessary. When α is 45 degrees, the trimmed image may be corrected to either α = 0 degrees or α = 90 degrees. When α is 135 degrees, it may be corrected to either α = 90 degrees or α = 180 degrees. When α is 225 degrees, it may be corrected to either α = 180 degrees or α = 270 degrees. When α is 315 degrees, it may be corrected to either α = 270 degrees or α = 0 degrees. After the small-angle direction correction, the deviation angle α of the trimmed image takes only four values, namely 0 degrees, 90 degrees, 180 degrees, and 270 degrees, as shown in Figs. 6 to 9 respectively.
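One way to prepare the inputs of the perspective transformation in step S10 is sketched below in Python. This is an illustration under stated assumptions, not the application's disclosed implementation: the helper names `order_corners_for_warp` and `destination_rectangle` are hypothetical, and choosing as the output's top edge the detected edge that points most nearly rightwards is an assumed rule that reproduces the "closest of the four forms" behaviour. The resulting source and destination point lists could then be handed to a library routine such as OpenCV's `cv2.getPerspectiveTransform` and `cv2.warpPerspective`.

```python
import math


def order_corners_for_warp(corners):
    """Order four (x, y) corners as top-left, top-right, bottom-right, bottom-left.

    Image coordinates are assumed (x to the right, y downwards).
    """
    cx = sum(x for x, _ in corners) / 4.0
    cy = sum(y for _, y in corners) / 4.0
    # With y pointing down, increasing atan2 angle walks the corners clockwise on screen.
    ring = sorted(corners, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))

    def rightward(i):
        # Cosine of the angle between edge i and the +x axis.
        (x0, y0), (x1, y1) = ring[i], ring[(i + 1) % 4]
        return (x1 - x0) / math.hypot(x1 - x0, y1 - y0)

    # The edge pointing most nearly rightwards becomes the top edge of the output,
    # so the document is turned by less than 45 degrees (assumed snapping rule).
    start = max(range(4), key=rightward)
    return [ring[(start + k) % 4] for k in range(4)]


def destination_rectangle(ordered):
    """Return the target corners and (width, height) of the trimmed output image."""
    tl, tr, br, bl = ordered
    width = round((math.dist(tl, tr) + math.dist(bl, br)) / 2)
    height = round((math.dist(tl, bl) + math.dist(tr, br)) / 2)
    target = [(0, 0), (width - 1, 0), (width - 1, height - 1), (0, height - 1)]
    return target, (width, height)
```

Averaging opposite edge lengths to pick the output size is a design choice of this sketch; taking the maximum of each pair is an equally plausible alternative.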
Step S20: passing the trimmed and small-angle-corrected image through an angle classification model to obtain a deviation angle detection value. The deviation angle detection value takes only one of four values: 0 degrees, 90 degrees, 180 degrees, or 270 degrees.
The angle classification model is obtained in the following manner. (1) A plurality of images and corresponding direction labels are collected as a training data set. Most of the images are trimmed document images, and the rest are background images. A trimmed document image is an image from which the blank edge area has been removed so that only the document area remains; the deviation angle between the actual direction of each trimmed document image and the correct direction is 0 degrees, 90 degrees, 180 degrees, or 270 degrees. A background image is an image that contains no document. The direction label of a trimmed document image records the deviation angle between its actual direction and the correct direction. All background images share a single fixed direction label, which indicates that the image is a background image. (2) Some or all of the images in the training data set are randomly rotated in units of 90 degrees, and the direction labels of the rotated trimmed document images are changed accordingly. The training data set processed in this way is called the enhanced training data set; its purpose is to make the distribution of trimmed document images across the different directions as uniform as possible. (3) An angle classification model is trained with the enhanced training data set. The model distinguishes document images from background images and identifies which of the four values is the deviation angle between the actual direction and the correct direction of each document image.
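The augmentation in item (2) can be sketched in Python as follows. The sketch assumes images are small row-major grids (lists of rows) and uses hypothetical names (`BACKGROUND`, `rotate_cw_90`, `rotate_sample`, `augment`); a real pipeline would operate on image arrays, but the label arithmetic is the same. Because α is measured clockwise, rotating a document image clockwise by k quarter turns adds 90k degrees to its label, while the background label stays unchanged.

```python
import random

BACKGROUND = "background"  # hypothetical label value for non-document images


def rotate_cw_90(image):
    """Rotate a row-major 2-D grid by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def rotate_sample(sample, quarter_turns):
    """Rotate one (image, label) pair clockwise and update its direction label."""
    image, label = sample
    for _ in range(quarter_turns % 4):
        image = rotate_cw_90(image)
    if label != BACKGROUND:
        # Clockwise rotation increases the clockwise deviation angle.
        label = (label + 90 * quarter_turns) % 360
    return image, label


def augment(dataset, rng=None):
    """Return a copy of the data set with every sample rotated by a random multiple of 90 degrees."""
    rng = rng or random.Random()
    return [rotate_sample(sample, rng.randrange(4)) for sample in dataset]
```

Rotating every sample, rather than only some, is one of the options the text allows ("some or all").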
Preferably, the angle classification model is obtained by training a lightweight neural network (NN), such as SqueezeNet, MobileNet, ShuffleNet, or EffNet, so that it can be conveniently deployed on mobile terminals such as smartphones. Preferably, all images in the enhanced training data set are uniformly scaled to a fixed input size during training to achieve a better training effect.
Preferably, in this step, the trimmed and small-angle-corrected image is first scaled to the fixed input size used when training the angle classification model, and the scaled image is then fed to the angle classification model.
In this step, if the angle classification model determines that the trimmed and small-angle-corrected image is a background image, no deviation angle detection value can be given, and the whole process is exited.
In this step, if the angle classification model determines that the deviation angle detection value of the trimmed and small-angle-corrected image is 0 degrees, the direction of the image does not need to be corrected, and the whole process is exited.
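As an illustration of how the model's output might be interpreted, the Python sketch below maps a vector of five class scores to either a deviation angle or the background class. The class order in `CLASS_LABELS` and the function name `decode_prediction` are assumptions; the source does not specify how the classifier's outputs are indexed.

```python
# Hypothetical class order; None stands for the background class.
CLASS_LABELS = (0, 90, 180, 270, None)


def decode_prediction(scores):
    """Return the deviation angle of the highest-scoring class, or None for background."""
    if len(scores) != len(CLASS_LABELS):
        raise ValueError("expected one score per class")
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return CLASS_LABELS[best_index]
```

A result of `None` corresponds to the background exit described above, and a result of `0` to the case where no further rotation is needed.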
Step S30: if the angle classification model gives a deviation angle detection value for the trimmed and small-angle-corrected image, that image is a document image; its direction is corrected according to the deviation angle detection value, that is, the image is rotated into the correct direction. This step applies a rotation correction to the trimmed and small-angle-corrected image so that the direction-corrected image can be conveniently read and printed.
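The rotation in step S30 can be sketched in Python as below, again on small row-major grids. The function name `correct_orientation` is hypothetical. Since α is measured clockwise from the correct direction to the actual direction, undoing it means rotating counterclockwise by α, which is the same as rotating clockwise by (360 − α) mod 360 degrees.

```python
def rotate_cw_90(image):
    """Rotate a row-major 2-D grid by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def correct_orientation(image, deviation):
    """Undo a clockwise deviation of 0, 90, 180, or 270 degrees."""
    if deviation not in (0, 90, 180, 270):
        raise ValueError("deviation must be 0, 90, 180, or 270")
    # Rotating clockwise by (360 - deviation) equals rotating counterclockwise by deviation.
    quarter_turns = ((360 - deviation) % 360) // 90
    for _ in range(quarter_turns):
        image = rotate_cw_90(image)
    return image
```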
Referring to fig. 10, the apparatus for correcting the direction of a document image according to the present application includes a trimming and small-angle direction correcting unit 10, a deviation angle detecting unit 20, and a large-angle direction correcting unit 30.
The trimming and small-angle direction correcting unit 10 is configured to search edges and four corner points of a document region in an input image, and perform trimming and small-angle direction correction on the input image by using a perspective transformation method to obtain an image after trimming and small-angle direction correction.
The deviation angle detection unit 20 is configured to pass the trimmed and small-angle-corrected image through an angle classification model to obtain a deviation angle detection value. The deviation angle detection value takes only one of four values: 0 degrees, 90 degrees, 180 degrees, or 270 degrees. The angle classification model is obtained in the following manner. (1) A plurality of images and corresponding direction labels are collected as a training data set. Most of the images are trimmed document images, and the rest are background images. The deviation angle between the actual direction of each trimmed document image and the correct direction is 0 degrees, 90 degrees, 180 degrees, or 270 degrees. A background image is an image that contains no document. The direction label of a trimmed document image records the deviation angle between its actual direction and the correct direction. All background images share a single fixed direction label, which indicates that the image is a background image. (2) Some or all of the images in the training data set are randomly rotated in units of 90 degrees, and the direction labels of the rotated trimmed document images are changed accordingly. The training data set processed in this way is called the enhanced training data set. (3) An angle classification model is trained with the enhanced training data set. The model distinguishes document images from background images and identifies which of the four values is the deviation angle between the actual direction and the correct direction of each document image.
The large-angle direction correcting unit 30 is configured to correct the direction of the trimmed and small-angle-corrected image according to the deviation angle detection value, rotating the image into the correct direction.
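The way the three units cooperate, including the early exits described for the method, might be organised as in the following Python sketch. The class `OrientationCorrector` and its field names are hypothetical, each unit is represented by a plain callable supplied by the caller, and returning the trimmed image unchanged when the deviation is 0 degrees is an assumption (the text only says the process is exited in that case).

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class OrientationCorrector:
    # Unit 10: returns the trimmed, small-angle-corrected image, or None if no document region is found.
    trim_and_snap: Callable[[Any], Optional[Any]]
    # Unit 20: returns 0, 90, 180, or 270, or None for a background image.
    classify: Callable[[Any], Optional[int]]
    # Unit 30: rotates the image back by the detected deviation angle.
    rotate_back: Callable[[Any, int], Any]

    def run(self, image: Any) -> Optional[Any]:
        trimmed = self.trim_and_snap(image)
        if trimmed is None:
            return None  # no edges or corner points found: not a document image
        deviation = self.classify(trimmed)
        if deviation is None:
            return None  # classified as background
        if deviation == 0:
            return trimmed  # already in the correct direction (assumed return value)
        return self.rotate_back(trimmed, deviation)
```

Passing the units in as callables keeps the sketch independent of any particular corner detector, classifier, or image library.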
Compared with the prior art, the method and the device for correcting the direction of the document image have the following beneficial effects.
First, the present application is based on deep learning technology and is highly robust.
Second, the application recasts the problem of computing the direction of a document image as a classification problem over background images and four large-angle directions (0 degrees, 90 degrees, 180 degrees, and 270 degrees), so computation is fast. The application does not process each small angle individually, which reduces the complexity of document image direction correction and makes it easier for a neural network to learn.
Third, the angle classification model is obtained by training a lightweight neural network, so it runs fast, is small in size, and is particularly suitable for deployment on mobile terminals.
The above are merely preferred embodiments of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (10)

1. A method for correcting the orientation of a document image, comprising the following steps:

Step S10: finding the edges and four corner points of the document region in an input image, and using a perspective transformation to perform edge trimming and small-angle orientation correction on the input image; if the input image is a document image, the trimmed image is the document region of the input image; the small-angle orientation correction means correcting the document region of the input image into one of four forms whose deviation angle from the correct orientation is 0 degrees, 90 degrees, 180 degrees, or 270 degrees, the document region being corrected into whichever of the four forms it is closest to;

Step S20: passing the trimmed and small-angle-corrected image through an angle classification model to obtain a detected deviation angle; the detected deviation angle takes only four values: 0 degrees, 90 degrees, 180 degrees, or 270 degrees; the angle classification model is obtained as follows: (1) collecting a plurality of trimmed document images, background images containing no document, and the corresponding orientation labels as a training data set; the deviation angle between the actual orientation of each trimmed document image and the correct orientation is 0 degrees, 90 degrees, 180 degrees, or 270 degrees; the orientation label of a trimmed document image records the deviation angle between the actual orientation of that image and the correct orientation; the orientation label of a background image indicates that the image is a background image; (2) randomly rotating some or all of the images in the training data set in units of 90 degrees, and changing the orientation labels of the rotated trimmed document images accordingly, to obtain an augmented training data set; (3) using the augmented training data set to train an angle classification model, the angle classification model being used to distinguish document images from background images and also to identify which of the four values the deviation angle between the actual orientation of each document image and the correct orientation takes;

Step S30: correcting the orientation of the trimmed and small-angle-corrected image according to the detected deviation angle.

2. The method for correcting the orientation of a document image according to claim 1, wherein in step S10, if the edges and four corner points of the document region in the input image cannot be found, the input image is not a document image and the entire process is exited.

3. The method for correcting the orientation of a document image according to claim 1, wherein in step S10, the deviation angle α is defined as the angle measured clockwise from the correct orientation of the trimmed image to the actual orientation of the trimmed image, with 0 degrees ≤ α < 360 degrees; the small-angle orientation correction comprises: correcting trimmed images with 0 degrees < α < 45 degrees or 315 degrees < α < 360 degrees into trimmed images with α = 0 degrees; correcting trimmed images with 45 degrees < α < 135 degrees into trimmed images with α = 90 degrees; correcting trimmed images with 135 degrees < α < 225 degrees into trimmed images with α = 180 degrees; and correcting trimmed images with 225 degrees < α < 315 degrees into trimmed images with α = 270 degrees.

4. The method for correcting the orientation of a document image according to claim 1, wherein the small-angle orientation correction further comprises: when α = 0 degrees, performing no small-angle correction; when α = 45 degrees, correcting to a trimmed image with either α = 0 degrees or α = 90 degrees; when α = 135 degrees, correcting to a trimmed image with either α = 90 degrees or α = 180 degrees; when α = 225 degrees, correcting to a trimmed image with either α = 180 degrees or α = 270 degrees; and when α = 315 degrees, correcting to a trimmed image with either α = 270 degrees or α = 0 degrees.

5. The method for correcting the orientation of a document image according to claim 1, wherein in step S20, the angle classification model is obtained by training a lightweight neural network.

6. The method for correcting the orientation of a document image according to claim 1, wherein in step S20, when the angle classification model is trained, all images in the augmented training data set are scaled to one fixed input size.

7. The method for correcting the orientation of a document image according to claim 6, wherein in step S20, the trimmed and small-angle-corrected image is first scaled to the fixed input size used when training the angle classification model, and the scaled image is then fed into the angle classification model.

8. The method for correcting the orientation of a document image according to claim 1, wherein in step S20, if the angle classification model determines that the trimmed and small-angle-corrected image is a background image, the entire process is exited.

9. The method for correcting the orientation of a document image according to claim 1, wherein in step S20, if the angle classification model determines that the detected deviation angle of the trimmed and small-angle-corrected image is 0 degrees, the entire process is exited.

10. A device for correcting the orientation of a document image, comprising a trimming and small-angle orientation correction unit, a deviation angle detection unit, and a large-angle orientation correction unit;

the trimming and small-angle orientation correction unit is configured to find the edges and four corner points of the document region in an input image, and to use a perspective transformation to perform edge trimming and small-angle orientation correction on the input image; if the input image is a document image, the trimmed image is the document region of the input image; the small-angle orientation correction means correcting the document region of the input image into one of four forms whose deviation angle from the correct orientation is 0 degrees, 90 degrees, 180 degrees, or 270 degrees, the document region being corrected into whichever of the four forms it is closest to;

the deviation angle detection unit is configured to pass the trimmed and small-angle-corrected image through an angle classification model to obtain a detected deviation angle; the detected deviation angle takes only four values: 0 degrees, 90 degrees, 180 degrees, or 270 degrees; the angle classification model is obtained as follows: (1) collecting a plurality of trimmed document images, background images containing no document, and the corresponding orientation labels as a training data set; the deviation angle between the actual orientation of each trimmed document image and the correct orientation is 0 degrees, 90 degrees, 180 degrees, or 270 degrees; the orientation label of a trimmed document image records the deviation angle between the actual orientation of that image and the correct orientation; the orientation label of a background image indicates that the image is a background image; (2) randomly rotating some or all of the images in the training data set in units of 90 degrees, and changing the orientation labels of the rotated trimmed document images accordingly, to obtain an augmented training data set; (3) using the augmented training data set to train an angle classification model, the angle classification model being used to distinguish document images from background images and also to identify which of the four values the deviation angle between the actual orientation of each document image and the correct orientation takes;

the large-angle orientation correction unit is configured to correct the orientation of the trimmed and small-angle-corrected image according to the detected deviation angle.
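The rule by which small-angle correction chooses among the four forms (claims 3 and 4) can be sketched as rounding the deviation angle to the nearest multiple of 90 degrees. This is an illustrative sketch only: the claims perform the correction through a perspective transformation on the detected corner points rather than on an explicit angle, and the function name `snap_deviation` is hypothetical. At the boundary angles (45, 135, 225 and 315 degrees) claim 4 permits either neighbour; this sketch arbitrarily picks the larger one, wrapping 360 back to 0.

```python
def snap_deviation(alpha):
    """Map a clockwise deviation angle in degrees to 0, 90, 180 or 270.

    The angle is first reduced to [0, 360). Adding 45 and flooring by 90
    rounds to the nearest quarter turn; ties go to the larger angle, which
    is one of the two choices allowed at the boundary values.
    """
    alpha %= 360
    return int((alpha + 45) // 90) * 90 % 360
```

For instance, an image whose deviation is 350 degrees is brought to the 0-degree form, while one at 100 degrees is brought to the 90-degree form; the remaining large-angle error is left to the classification step.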
CN202111679610.5A 2021-12-31 2021-12-31 Method and device for orientation correction of document image Pending CN114267046A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202111679610.5A CN114267046A (en) 2021-12-31 2021-12-31 Method and device for orientation correction of document image
PCT/CN2022/088550 WO2023123763A1 (en) 2021-12-31 2022-04-22 Direction correction method and apparatus for document image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111679610.5A CN114267046A (en) 2021-12-31 2021-12-31 Method and device for orientation correction of document image

Publications (1)

Publication Number Publication Date
CN114267046A true CN114267046A (en) 2022-04-01

Family

ID=80832566

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111679610.5A Pending CN114267046A (en) 2021-12-31 2021-12-31 Method and device for orientation correction of document image

Country Status (2)

Country Link
CN (1) CN114267046A (en)
WO (1) WO2023123763A1 (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20190091101A (en) * 2018-01-26 2019-08-05 지의소프트 주식회사 Automatic classification apparatus and method of document type using deep learning
CN110188747A (en) * 2019-04-28 2019-08-30 广州华多网络科技有限公司 A kind of sloped correcting method of text image, device and image processing equipment
CN110378249A (en) * 2019-06-27 2019-10-25 腾讯科技(深圳)有限公司 The recognition methods of text image tilt angle, device and equipment
CN111260569A (en) * 2020-01-10 2020-06-09 百度在线网络技术(北京)有限公司 Method and device for correcting image inclination, electronic equipment and storage medium
US20200250415A1 (en) * 2019-02-01 2020-08-06 Intuit Inc. Supervised machine learning algorithm application for image cropping and skew rectification
CN111767859A (en) * 2020-06-30 2020-10-13 北京百度网讯科技有限公司 Image correction method and device, electronic equipment and computer-readable storage medium
CN112419207A (en) * 2020-11-17 2021-02-26 苏宁金融科技(南京)有限公司 Image correction method, device and system

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108681729B (en) * 2018-05-08 2023-06-23 腾讯科技(深圳)有限公司 Text image correction method, device, storage medium and equipment
WO2021221614A1 (en) * 2020-04-28 2021-11-04 Hewlett-Packard Development Company, L.P. Document orientation detection and correction
CN112101367A (en) * 2020-09-15 2020-12-18 杭州睿琪软件有限公司 Text recognition method, image recognition and classification method and document recognition processing method
CN114267046A (en) * 2021-12-31 2022-04-01 上海合合信息科技股份有限公司 Method and device for orientation correction of document image

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023123763A1 (en) * 2021-12-31 2023-07-06 上海合合信息科技股份有限公司 Direction correction method and apparatus for document image
CN115457559A (en) * 2022-08-19 2022-12-09 上海通办信息服务有限公司 Method, device and equipment for intelligently correcting text and license pictures
CN115457559B (en) * 2022-08-19 2024-01-16 上海通办信息服务有限公司 Method, device and equipment for intelligently correcting texts and license pictures

Also Published As

Publication number Publication date
WO2023123763A1 (en) 2023-07-06

Similar Documents

Publication Publication Date Title
CN114299528B (en) Information extraction and structuring method for scanned document
CN110569832B (en) Text real-time positioning and identifying method based on deep learning attention mechanism
US11663817B2 (en) Automated signature extraction and verification
CN108805076B (en) Method and system for extracting table characters of environmental impact evaluation report
CN110135411A (en) Business card recognition method and device
US20150199568A1 (en) Automated document recognition, identification, and data extraction
CN101719142B (en) Method for detecting picture characters by sparse representation based on classifying dictionary
CN114267046A (en) Method and device for orientation correction of document image
CN104484643A (en) Intelligent identification method and system for hand-written table
JP6323437B2 (en) 10-finger fingerprint card input device, 10-finger fingerprint card input method, and storage medium
CN111967286A (en) Method and device for identifying information bearing medium, computer equipment and medium
CN114897872B (en) A method, device and electronic device for identifying cells in cell clusters
CN114463767B (en) Letter of credit identification method, device, computer equipment and storage medium
CN116824135A (en) Atmospheric natural environment test industrial product identification and segmentation method based on machine vision
CN109508712A (en) A kind of Chinese written language recognition methods based on image
CN114445843A (en) Fixed-format card image text recognition method and device
CN116597466A (en) Engineering drawing text detection and recognition method and system based on improved YOLOv5s
CN115512381A (en) Text recognition method, text recognition device, text recognition equipment, storage medium and working machine
CN112149654B (en) Invoice text information identification method based on deep learning
CN115984859B (en) Image character recognition method, device and storage medium
CN113780116A (en) Invoice classification method, apparatus, computer equipment and storage medium
CN102819739B (en) A kind of type page localization method and device
CN109741273A (en) A kind of mobile phone photograph low-quality images automatically process and methods of marking
CN115457585A (en) Processing method and device for homework correction, computer equipment and readable storage medium
CN113920434A (en) Image reproduction detection method, device and medium based on target

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination