CN114267046A

CN114267046A - Method and device for orientation correction of document image

Info

Publication number: CN114267046A
Application number: CN202111679610.5A
Authority: CN
Inventors: 刘鹏伟; 郭丰俊; 龙腾; 丁凯; 张彬; 镇立新
Original assignee: Shanghai Linguan Data Technology Co ltd; Shanghai Shengteng Data Technology Co ltd; Shanghai Yingwuchu Data Technology Co ltd; Shanghai Hehe Information Technology Development Co Ltd
Current assignee: Shanghai Linguan Data Technology Co ltd; Shanghai Shengteng Data Technology Co ltd; Shanghai Yingwuchu Data Technology Co ltd; Shanghai Hehe Information Technology Development Co Ltd
Priority date: 2021-12-31
Filing date: 2021-12-31
Publication date: 2022-04-01
Also published as: WO2023123763A1

Abstract

The present application discloses a method for correcting the orientation of a document image. The edges and four corner points of the document area in an input image are located, and a perspective transformation is used to trim the input image and perform small-angle direction correction. The image after trimming and small-angle direction correction is passed through an angle classification model to obtain a deviation angle detection value, which takes only one of four values: 0 degrees, 90 degrees, 180 degrees, or 270 degrees. The orientation of the image after trimming and small-angle direction correction is then corrected according to the deviation angle detection value. The application converts the problem of computing the orientation of a document image into a classification problem over background images and four large-angle orientations, which is fast to compute. The application does not process each small angle individually, which simplifies orientation correction of the document image and makes the neural network easier to train.

Description

Method and device for correcting direction of document image

Technical Field

The application relates to a method for correcting the direction of a document image.

Background

The document image refers to a document in an image format, and is usually a document converted from a paper document into the image format by means of photographing, scanning, and the like. The direction in which a document can be read correctly is generally considered to be the correct direction, and some document images are not oriented correctly, for example, inverted by 180 degrees. In order to perform operations such as browsing and reading, OCR (optical character recognition), and the like, the orientation of the document image needs to be corrected to a correct orientation.

Chinese patent application publication No. CN112329777A, entitled "Character recognition method, apparatus, device, and medium based on direction detection" and published on February 5, 2021, discloses: rotating sliced samples to obtain first training samples; training a MobileNet-v2 network with the first training samples to obtain a text direction detection model; when a picture to be detected is received, performing text position detection on the picture to obtain at least one character slice; and inputting each preprocessed character slice into the text direction detection model and taking the model output as the text direction of that character slice. That document relates primarily to detecting the reading direction of a single line of text in a document image, rather than the direction of the entire document image.

Disclosure of Invention

The technical problem to be solved by the application is to provide a method for correcting the direction of a document image, which utilizes the information of a document area in the document image to judge the direction of the image and correct the image quickly and accurately.

In order to solve the above technical problem, the present application provides a method for correcting the orientation of a document image, including the following steps.

Step S10: searching for the edges and four corner points of the document area in an input image, and performing trimming and small-angle direction correction on the input image by a perspective transformation method; if the input image is a document image, the trimmed image is the document area of the input image; the small-angle direction correction corrects the document area of the input image to one of four forms whose deviation angle from the correct direction is 0 degrees, 90 degrees, 180 degrees, or 270 degrees, namely whichever of the four forms the document area is closest to.

Step S20: passing the image after trimming and small-angle direction correction through an angle classification model to obtain a deviation angle detection value; the deviation angle detection value takes only one of four values: 0 degrees, 90 degrees, 180 degrees, or 270 degrees. The angle classification model is obtained as follows: (1) collecting a plurality of trimmed document images, non-document background images, and corresponding direction labels as a training data set; the deviation angle between the actual direction and the correct direction of each trimmed document image is 0 degrees, 90 degrees, 180 degrees, or 270 degrees; the direction label of a trimmed document image records the deviation angle between its actual direction and the correct direction; the direction label of a background image indicates that the image is a background image; (2) randomly rotating some or all of the images in the training data set in units of 90 degrees, and changing the direction labels of the rotated trimmed document images accordingly, to obtain an enhanced training data set; (3) training the angle classification model with the enhanced training data set, the angle classification model being used to distinguish document images from background images and to identify which of the four values the deviation angle between the actual direction and the correct direction of each document image takes.

Step S30: correcting the direction of the image after trimming and small-angle direction correction according to the deviation angle detection value.

Further, in step S10, if the edge and the four corner points of the document region in the input image cannot be found, it indicates that the input image is not a document image, and the whole process is exited.

Further, in step S10, the deviation angle α is defined as the angle measured clockwise from the correct direction of the trimmed image to the actual direction of the trimmed image, with 0 degrees ≤ α < 360 degrees. The small-angle direction correction comprises: correcting trimmed images with 0 degrees < α < 45 degrees or 315 degrees < α < 360 degrees into trimmed images with α = 0 degrees; correcting trimmed images with 45 degrees < α < 135 degrees into trimmed images with α = 90 degrees; correcting trimmed images with 135 degrees < α < 225 degrees into trimmed images with α = 180 degrees; and correcting trimmed images with 225 degrees < α < 315 degrees into trimmed images with α = 270 degrees.

Further, the small-angle direction correction further includes: when α = 0 degrees, no small-angle correction is performed; when α = 45 degrees, the trimmed image is corrected either to α = 0 degrees or to α = 90 degrees; when α = 135 degrees, the trimmed image is corrected either to α = 90 degrees or to α = 180 degrees; when α = 225 degrees, the trimmed image is corrected either to α = 180 degrees or to α = 270 degrees; when α = 315 degrees, the trimmed image is corrected either to α = 270 degrees or to α = 0 degrees.

Further, in step S20, the angle classification model is obtained by training using a lightweight neural network.

Preferably, in step S20, when the angle classification model is trained, all images in the enhanced training data set are uniformly scaled to a fixed input size.

Preferably, in step S20, the image after trimming and small-angle direction correction is first scaled to the fixed input size used when training the angle classification model, and the scaled image is then fed to the angle classification model.

Further, in step S20, if the angle classification model determines that the image after the trimming and the small angle direction correction belongs to the background image, the whole process is exited.

Further, in step S20, if the angle classification model determines that the deviation angle detection value of the image after trimming and small-angle direction correction is 0 degrees, the whole process is exited.

The application also provides a direction correcting device for a document image, which comprises a trimming and small-angle direction correcting unit, a deviation angle detecting unit, and a large-angle direction correcting unit.

The trimming and small-angle direction correcting unit is used for searching for the edges and four corner points of the document area in an input image and performing trimming and small-angle direction correction on the input image by a perspective transformation method; if the input image is a document image, the trimmed image is the document area of the input image; the small-angle direction correction corrects the document area of the input image to one of four forms whose deviation angle from the correct direction is 0 degrees, 90 degrees, 180 degrees, or 270 degrees, namely whichever of the four forms the document area is closest to.

The deviation angle detecting unit is used for passing the image after trimming and small-angle direction correction through an angle classification model to obtain a deviation angle detection value; the deviation angle detection value takes only one of four values: 0 degrees, 90 degrees, 180 degrees, or 270 degrees. The angle classification model is obtained as follows: (1) collecting a plurality of trimmed document images, non-document background images, and corresponding direction labels as a training data set; the deviation angle between the actual direction and the correct direction of each trimmed document image is 0 degrees, 90 degrees, 180 degrees, or 270 degrees; the direction label of a trimmed document image records the deviation angle between its actual direction and the correct direction; the direction label of a background image indicates that the image is a background image; (2) randomly rotating some or all of the images in the training data set in units of 90 degrees, and changing the direction labels of the rotated trimmed document images accordingly, to obtain an enhanced training data set; (3) training the angle classification model with the enhanced training data set, the angle classification model being used to distinguish document images from background images and to identify which of the four values the deviation angle between the actual direction and the correct direction of each document image takes.

The large-angle direction correcting unit is used for correcting the direction of the image after trimming and small-angle direction correction according to the deviation angle detection value.

The technical effects achieved by the present application are as follows. It provides a complete solution for quickly and accurately trimming document images and correcting their direction. After a picture is input, the system automatically detects the document area in the picture using a detection algorithm and outputs its four corner points, cuts out the document area by perspective transformation while performing small-angle direction correction, and then performs large-angle direction correction based on the output of the angle classification model, which facilitates browsing of the document image and other subsequent processing. The present application addresses orientation detection and correction of an entire image, rather than the orientation of a single line of text in it. Trimming is performed before the direction of the document image is corrected, which improves the accuracy of the direction correction. Background images are included in the direction classification of document images, which improves the accuracy of the direction classification.

Drawings

Fig. 1 is a schematic flow chart of a method for correcting the orientation of a document image according to the present application.

Figs. 2 to 5 are schematic diagrams of several trimmed images before small-angle direction correction by perspective transformation.

Figs. 6 to 9 are schematic diagrams of several trimmed images after small-angle direction correction by perspective transformation.

Fig. 10 is a schematic structural diagram of an orientation correction apparatus for document images proposed in the present application.

Reference numerals in the figures: 10 is the trimming and small-angle direction correcting unit, 20 is the deviation angle detecting unit, and 30 is the large-angle direction correcting unit.

Detailed Description

Referring to fig. 1, the method for correcting the orientation of a document image according to the present application includes the following steps.

Step S10: the edges and four corner points (corners) of the document region in the input image are located, and a perspective transformation is used to trim the input image and perform small-angle direction correction, yielding the image after trimming and small-angle direction correction. If the input image is a document image, the trimmed image is the document region of the input image. The small-angle direction correction corrects the document region of the input image to one of four forms whose deviation angle from the correct direction is 0 degrees, 90 degrees, 180 degrees, or 270 degrees, namely whichever of the four forms the document region is closest to.

If the edges and four corner points of the document area in the input image cannot be found in this step, the input image is treated as a background image and the whole process is exited.
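The following is a minimal sketch of how four detected corner points might be put into correspondence with an upright rectangle before the perspective transformation is computed. The source does not specify a corner detection or corner ordering algorithm; the function names `order_corners` and `target_rectangle`, and the heuristic of starting at the corner with the smallest x + y, are assumptions made for illustration. For a rectangle tilted by less than 45 degrees, this heuristic selects the corner that is already nearest the top-left, so the warp removes only the small tilt and leaves any 90-, 180-, or 270-degree deviation to step S20.

```python
import math


def order_corners(corners):
    """Order four (x, y) points as top-left, top-right, bottom-right, bottom-left.

    Image coordinates are assumed: x grows rightward, y grows downward.
    """
    cx = sum(x for x, _ in corners) / 4.0
    cy = sum(y for _, y in corners) / 4.0
    # With y pointing down, increasing atan2 angle is clockwise on screen.
    ring = sorted(corners, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))
    # Assumed heuristic: begin at the corner nearest the image origin.
    start = min(range(4), key=lambda i: ring[i][0] + ring[i][1])
    return ring[start:] + ring[:start]


def _distance(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])


def target_rectangle(ordered):
    """Return destination corners of an upright rectangle for the warp."""
    tl, tr, br, bl = ordered
    width = round(max(_distance(tl, tr), _distance(bl, br)))
    height = round(max(_distance(tl, bl), _distance(tr, br)))
    return [(0, 0), (width - 1, 0), (width - 1, height - 1), (0, height - 1)]


# Corners of a slightly tilted document, given in arbitrary order.
detected = [(100, 150), (12, 8), (5, 140), (108, 20)]
source = order_corners(detected)
destination = target_rectangle(source)
```

The `source` and `destination` point lists would then be handed to a perspective-warp routine, for example OpenCV's `cv2.getPerspectiveTransform` and `cv2.warpPerspective`, to produce the trimmed image.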

Figs. 2 to 5 show trimmed images before small-angle direction correction by perspective transformation, and figs. 6 to 9 show trimmed images after small-angle direction correction by perspective transformation. In these figures the dotted line represents the correct direction of the trimmed image and the solid line represents its actual direction. The deviation angle α is defined as the angle measured clockwise from the correct direction of the trimmed image to its actual direction, with 0 degrees ≤ α < 360 degrees. The small-angle direction correction specifically refers to the following.

(1) Trimmed images with 0 degrees < α < 45 degrees or 315 degrees < α < 360 degrees are corrected to trimmed images with α = 0 degrees. The trimmed image shown in fig. 2 becomes the trimmed image shown in fig. 6 after small-angle direction correction by perspective transformation.

(2) Trimmed images with 45 degrees < α < 135 degrees are corrected to trimmed images with α = 90 degrees. The trimmed image shown in fig. 3 becomes the trimmed image shown in fig. 7 after small-angle direction correction by perspective transformation.

(3) Trimmed images with 135 degrees < α < 225 degrees are corrected to trimmed images with α = 180 degrees. The trimmed image shown in fig. 4 becomes the trimmed image shown in fig. 8 after small-angle direction correction by perspective transformation.

(4) Trimmed images with 225 degrees < α < 315 degrees are corrected to trimmed images with α = 270 degrees. The trimmed image shown in fig. 5 becomes the trimmed image shown in fig. 9 after small-angle direction correction by perspective transformation.

There are several special cases. When α = 0 degrees, no small-angle correction is necessary. When α = 45 degrees, the trimmed image may be corrected either to α = 0 degrees or to α = 90 degrees. When α = 135 degrees, it may be corrected either to α = 90 degrees or to α = 180 degrees. When α = 225 degrees, it may be corrected either to α = 180 degrees or to α = 270 degrees. When α = 315 degrees, it may be corrected either to α = 270 degrees or to α = 0 degrees. After small-angle direction correction, the deviation angle α of the trimmed image takes only four values, 0 degrees, 90 degrees, 180 degrees, and 270 degrees, as shown in figs. 6 to 9 respectively.
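The interval rules above amount to snapping α to the nearest multiple of 90 degrees. A minimal sketch follows; the function name `snap_deviation` is hypothetical, and sending the tie angles (45, 135, 225, 315 degrees) to the larger neighbour is just one of the two choices the text permits.

```python
def snap_deviation(alpha):
    """Return whichever of 0, 90, 180, 270 degrees is nearest to alpha.

    alpha is the clockwise deviation angle in degrees. Exact ties
    (45, 135, 225, 315) are sent to the larger neighbour, with 315
    wrapping around to 0; the text allows either neighbour.
    """
    alpha = alpha % 360.0
    # Shifting by 45 before dividing by 90 rounds to the nearest quadrant.
    quadrant = int((alpha + 45.0) // 90.0) % 4
    return quadrant * 90


for angle in (10, 100, 200, 300, 350):
    print(angle, "->", snap_deviation(angle))
```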

Step S20: the image after trimming and small-angle direction correction is passed through an angle classification model to obtain a deviation angle detection value. The deviation angle detection value takes only one of four values: 0 degrees, 90 degrees, 180 degrees, or 270 degrees.
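As a rough illustration, the classifier output can be read as a score for each of five classes, the four deviation angles plus a background class, with the highest score giving the result. The class order, the use of `None` for background, and the name `decode_prediction` are assumptions; the source does not describe the model's output layout.

```python
# Assumed class order; the source does not specify one.
# None stands for the background (non-document) class.
CLASSES = [0, 90, 180, 270, None]


def decode_prediction(scores):
    """Map five per-class scores to a deviation angle, or None for background."""
    if len(scores) != len(CLASSES):
        raise ValueError("expected one score per class")
    best = max(range(len(scores)), key=lambda i: scores[i])
    return CLASSES[best]


print(decode_prediction([0.05, 0.80, 0.05, 0.05, 0.05]))
```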

The angle classification model is obtained as follows.

(1) A plurality of images and corresponding direction labels are collected as a training data set. Most of the images are trimmed document images, and the rest are background images. A trimmed document image is an image from which the blank border area has been removed so that only the document area remains; the deviation angle between the actual direction and the correct direction of each trimmed document image is 0 degrees, 90 degrees, 180 degrees, or 270 degrees. A background image is an image that contains no document. The direction label of a trimmed document image records the deviation angle between its actual direction and the correct direction. All background images share a single fixed direction label, which indicates that the image is a background image.

(2) Some or all of the images in the training data set are randomly rotated in units of 90 degrees, and the direction labels of the rotated trimmed document images are changed accordingly. The data set processed in this way is called the enhanced training data set; its purpose is to make the trimmed document images as evenly distributed as possible over the different directions.

(3) The angle classification model is trained with the enhanced training data set. The model distinguishes document images from background images and identifies which of the four values the deviation angle between the actual direction and the correct direction of each document image takes.

Preferably, the angle classification model is obtained by training a lightweight neural network (NN), such as SqueezeNet, MobileNet, ShuffleNet, or EffNet, so that it can be conveniently deployed on mobile terminals such as smartphones. Preferably, all images in the enhanced training data set are uniformly scaled to a fixed input size during training to achieve a better training result.
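The rotation-based enhancement in item (2) can be sketched as follows, using nested lists as stand-in images. The names `rotate_cw`, `augment`, and `BACKGROUND` are hypothetical. The label update assumes the clockwise definition of α from step S10: rotating an image clockwise by k quarter turns adds 90k degrees to its deviation angle, while a background label is left unchanged.

```python
import random

BACKGROUND = "background"  # hypothetical label for non-document images


def rotate_cw(image, quarter_turns):
    """Rotate a row-major nested-list image clockwise by 90-degree steps."""
    image = [list(row) for row in image]
    for _ in range(quarter_turns % 4):
        # Reversing the rows and then transposing is a 90-degree clockwise turn.
        image = [list(row) for row in zip(*image[::-1])]
    return image


def augment(dataset, rng):
    """Randomly rotate each (image, label) pair and update the label."""
    augmented = []
    for image, label in dataset:
        k = rng.randrange(4)
        rotated = rotate_cw(image, k)
        if label == BACKGROUND:
            new_label = BACKGROUND
        else:
            new_label = (label + 90 * k) % 360
        augmented.append((rotated, new_label))
    return augmented


page = [[1, 2, 3], [4, 5, 6]]
samples = [(page, 0), (page, 90), (page, 270), (page, BACKGROUND)]
enhanced = augment(samples, random.Random(0))
```

Drawing the number of quarter turns uniformly is one simple way to spread the document images over the four directions, which is the stated purpose of the enhanced training data set.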

Preferably, in this step, the image after trimming and small-angle direction correction is first scaled to the fixed input size used when training the angle classification model, and the scaled image is then fed to the angle classification model.

In this step, if the angle classification model determines that the image after the trimming and the small angle direction correction belongs to the background image, a deviation angle detection value cannot be given, and the whole process is exited.

In this step, if the angle classification model determines that the deviation angle detection value of the image after the trimming and the small angle direction correction is 0 degree, the direction of the image after the trimming and the small angle direction correction is not required to be corrected, and the whole process is exited.

Step S30: if the angle classification model outputs a deviation angle detection value for the image after trimming and small-angle direction correction, that image is a document image. Its direction is corrected according to the deviation angle detection value, that is, the image is rotated into the correct direction. This rotation correction makes the direction-corrected image convenient to read and print.
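A minimal sketch of the large-angle correction, together with the two exit conditions of step S20, is given below. Nested lists stand in for images, `None` stands for the background class, and the names `rotate_ccw` and `correct_orientation` are hypothetical. Because α is measured clockwise from the correct direction to the actual direction, the sketch undoes it with a counter-clockwise rotation by the same amount.

```python
def rotate_ccw(image, quarter_turns):
    """Rotate a row-major nested-list image counter-clockwise by 90-degree steps."""
    image = [list(row) for row in image]
    for _ in range(quarter_turns % 4):
        # Transposing and then reversing the rows is a 90-degree
        # counter-clockwise turn.
        image = [list(row) for row in zip(*image)][::-1]
    return image


def correct_orientation(image, detection):
    """Apply step S30 given the detection value from step S20.

    detection is 0, 90, 180, or 270, or None when the classifier
    reports a background image.
    """
    if detection is None:
        return None  # background image: exit without producing output
    if detection == 0:
        return image  # already upright: nothing to correct
    return rotate_ccw(image, detection // 90)


upright = [[1, 2, 3], [4, 5, 6]]
turned_90_cw = [[4, 1], [5, 2], [6, 3]]  # upright rotated 90 degrees clockwise
restored = correct_orientation(turned_90_cw, 90)
```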

Referring to fig. 10, the apparatus for correcting the direction of a document image according to the present application includes a trimming and small-angle direction correcting unit 10, a deviation angle detecting unit 20, and a large-angle direction correcting unit 30.

The trimming and small-angle direction correcting unit 10 is configured to search edges and four corner points of a document region in an input image, and perform trimming and small-angle direction correction on the input image by using a perspective transformation method to obtain an image after trimming and small-angle direction correction.

The deviation angle detection unit 20 is configured to pass the image after trimming and small-angle direction correction through an angle classification model to obtain a deviation angle detection value. The deviation angle detection value takes only one of four values: 0 degrees, 90 degrees, 180 degrees, or 270 degrees. The angle classification model is obtained as follows. (1) A plurality of images and corresponding direction labels are collected as a training data set. Most of the images are trimmed document images, and the rest are background images. The deviation angle between the actual direction and the correct direction of each trimmed document image is 0 degrees, 90 degrees, 180 degrees, or 270 degrees. A background image is an image that contains no document. The direction label of a trimmed document image records the deviation angle between its actual direction and the correct direction. All background images share a single fixed direction label, which indicates that the image is a background image. (2) Some or all of the images in the training data set are randomly rotated in units of 90 degrees, and the direction labels of the rotated trimmed document images are changed accordingly. The data set processed in this way is called the enhanced training data set. (3) The angle classification model is trained with the enhanced training data set; the model distinguishes document images from background images and identifies which of the four values the deviation angle between the actual direction and the correct direction of each document image takes.

The large-angle direction correcting unit 30 is configured to correct the directions of the images after the trimming and the small-angle direction correction according to the deviation angle detection value, and rotate the images after the trimming and the small-angle direction correction into a correct direction.

Compared with the prior art, the method and the device for correcting the direction of the document image have the following beneficial effects.

First, the present application is based on deep learning technology and is highly robust.

Secondly, the application converts the problem of computing the direction of a document image into a classification problem over background images and four large-angle directions (0 degrees, 90 degrees, 180 degrees, and 270 degrees), which is fast to compute. Because each small angle is not processed individually, direction correction of the document image is simplified and the neural network is easier to train.

Thirdly, the angle classification model is obtained by training the lightweight neural network, the operation speed is high, the size is small, and the method is particularly suitable for being deployed at a mobile terminal.

The above are merely preferred embodiments of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims

1. A method for correcting the orientation of a document image, comprising the following steps:

Step S10: finding the edges and four corner points of the document area in an input image, and performing trimming and small-angle direction correction on the input image by a perspective transformation method; if the input image is a document image, the trimmed image is the document area of the input image; the small-angle direction correction refers to correcting the document area of the input image to one of four forms whose deviation angle from the correct direction is 0 degrees, 90 degrees, 180 degrees, or 270 degrees, namely the form to which the document area of the input image is closest;

Step S20: passing the image after trimming and small-angle direction correction through an angle classification model to obtain a deviation angle detection value; the deviation angle detection value has only four possible values: 0 degrees, 90 degrees, 180 degrees, or 270 degrees;

the angle classification model is obtained in the following manner: (1) collecting a plurality of trimmed document images, background images without documents, and corresponding direction labels as a training data set; the deviation angle between the actual direction and the correct direction of each trimmed document image is 0 degrees, 90 degrees, 180 degrees, or 270 degrees; the direction label of a trimmed document image is used to record the deviation angle between the actual direction and the correct direction of the trimmed document image; the direction label of a background image indicates that the image is a background image; (2) randomly rotating some or all of the images in the training data set in units of 90 degrees, and correspondingly changing the direction labels of the rotated trimmed document images, to obtain an enhanced training data set; (3) training an angle classification model with the enhanced training data set, the angle classification model being used to distinguish document images from background images and also to identify which of the four values the deviation angle between the actual direction and the correct direction of each document image is;

Step S30: correcting the direction of the image after trimming and small-angle direction correction according to the deviation angle detection value.

2. The method for correcting the orientation of a document image according to claim 1, wherein, in the step S10, if the edges and four corner points of the document area in the input image cannot be found, the input image is not a document image and the entire process is exited.

3. The method for correcting the orientation of a document image according to claim 1, wherein, in the step S10, the deviation angle α is defined as the angle measured clockwise from the correct direction of the trimmed image to the actual direction of the trimmed image, and the value range of the deviation angle α is 0 degrees ≤ α < 360 degrees; the small-angle direction correction includes: trimmed images with 0 degrees < α < 45 degrees or 315 degrees < α < 360 degrees are corrected to trimmed images with α = 0 degrees; trimmed images with 45 degrees < α < 135 degrees are corrected to trimmed images with α = 90 degrees; trimmed images with 135 degrees < α < 225 degrees are corrected to trimmed images with α = 180 degrees; trimmed images with 225 degrees < α < 315 degrees are corrected to trimmed images with α = 270 degrees.

4. The method for correcting the orientation of a document image according to claim 1, wherein the small-angle direction correction further comprises: when α = 0 degrees, no small-angle correction is performed; when α = 45 degrees, the trimmed image is corrected either to a trimmed image with α = 0 degrees or to a trimmed image with α = 90 degrees; when α = 135 degrees, it is corrected either to a trimmed image with α = 90 degrees or to a trimmed image with α = 180 degrees; when α = 225 degrees, it is corrected either to a trimmed image with α = 180 degrees or to a trimmed image with α = 270 degrees; when α = 315 degrees, it is corrected either to a trimmed image with α = 270 degrees or to a trimmed image with α = 0 degrees.

5. The method for correcting the orientation of a document image according to claim 1, wherein, in the step S20, the angle classification model is obtained by training a lightweight neural network.

6. The method for correcting the orientation of a document image according to claim 1, wherein, in the step S20, when the angle classification model is trained, all images in the enhanced training data set are uniformly scaled to a fixed input size.

7. The method for correcting the orientation of a document image according to claim 6, wherein, in the step S20, the image after trimming and small-angle direction correction is first scaled to the fixed input size used during training of the angle classification model, and the scaled image after trimming and small-angle direction correction is then sent to the angle classification model.

8. The method for correcting the orientation of a document image according to claim 1, wherein, in the step S20, if the angle classification model determines that the trimmed and small-angle orientation corrected images belong to background images, then exit the entire process.

9. The method for correcting the orientation of a document image according to claim 1, wherein, in the step S20, if the angle classification model determines that the deviation angle detection value of the image after trimming and small-angle direction correction is 0 degrees, the entire process is exited.

10. An orientation correction device for a document image, characterized by comprising a trimming and small-angle orientation correction unit, a deviation angle detection unit, and a large-angle orientation correction unit;

The trimming and small-angle direction correction unit is used to find the edges and four corner points of the document area in an input image, and to perform trimming and small-angle direction correction on the input image by a perspective transformation method; if the input image is a document image, the trimmed image is the document area of the input image; the small-angle direction correction refers to correcting the document area of the input image to one of four forms whose deviation angle from the correct direction is 0 degrees, 90 degrees, 180 degrees, or 270 degrees, namely the form to which the document area of the input image is closest;

The deviation angle detection unit is used to pass the image after trimming and small-angle direction correction through an angle classification model to obtain a deviation angle detection value; the deviation angle detection value has only four possible values: 0 degrees, 90 degrees, 180 degrees, or 270 degrees; the angle classification model is obtained in the following manner: (1) collecting a plurality of trimmed document images, background images without documents, and corresponding direction labels as a training data set; the deviation angle between the actual direction and the correct direction of each trimmed document image is 0 degrees, 90 degrees, 180 degrees, or 270 degrees; the direction label of a trimmed document image is used to record the deviation angle between the actual direction and the correct direction of the trimmed document image; the direction label of a background image indicates that the image is a background image; (2) randomly rotating some or all of the images in the training data set in units of 90 degrees, and correspondingly changing the direction labels of the rotated trimmed document images, to obtain an enhanced training data set; (3) training an angle classification model with the enhanced training data set, the angle classification model being used to distinguish document images from background images and also to identify which of the four values the deviation angle between the actual direction and the correct direction of each document image is;

The large-angle direction correction unit is used for correcting the direction of the image after trimming and small-angle direction correction according to the deviation angle detection value.