CN114267046A - Method and device for correcting direction of document image - Google Patents
- Publication number
- CN114267046A; CN202111679610.5A
- Authority
- CN
- China
- Prior art keywords
- degrees
- angle
- image
- document
- trimmed
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/22—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/24—Aligning, centring, orientation detection or correction of the image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computing Systems (AREA)
- General Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Databases & Information Systems (AREA)
- Molecular Biology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Medical Informatics (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Image Processing (AREA)
- Image Analysis (AREA)
- Image Input (AREA)
- Editing Of Facsimile Originals (AREA)
Abstract
The application discloses a method for correcting the direction of a document image. The edges and four corner points of a document area in the input image are searched for, and trimming and small-angle direction correction are performed on the input image by perspective transformation. A deviation angle detection value is obtained for the trimmed and small-angle-corrected image through an angle classification model; the deviation angle detection value can take only four values: 0 degrees, 90 degrees, 180 degrees, or 270 degrees. The direction of the trimmed and small-angle-corrected image is then corrected according to the deviation angle detection value. The method and the device turn the direction calculation problem for a document image into a classification problem over background images and four large-angle directions, which gives a high calculation speed. Because each small angle is not handled separately, the complexity of direction correction is reduced and neural network learning is made easier.
Description
Technical Field
The application relates to a method for correcting the direction of a document image.
Background
The document image refers to a document in an image format, and is usually a document converted from a paper document into the image format by means of photographing, scanning, and the like. The direction in which a document can be read correctly is generally considered to be the correct direction, and some document images are not oriented correctly, for example, inverted by 180 degrees. In order to perform operations such as browsing and reading, OCR (optical character recognition), and the like, the orientation of the document image needs to be corrected to a correct orientation.
The Chinese patent application "Character recognition method, apparatus, device, and medium based on direction detection", with application publication No. CN112329777A and an application publication date of February 5, 2021, discloses: rotating sliced samples to obtain a first training sample; training a MobileNet-v2 network with the first training sample to obtain a text direction detection model; when a picture to be detected is received, performing text position detection on the picture to obtain at least one character slice; and inputting each preprocessed character slice into the text direction detection model and taking the output of the model as the text direction of that character slice. That document is concerned primarily with detecting the reading direction of a single line of text in a document image, rather than the direction of the entire document image.
Disclosure of Invention
The technical problem to be solved by the application is to provide a method for correcting the direction of a document image, which utilizes the information of a document area in the document image to judge the direction of the image and correct the image quickly and accurately.
In order to solve the above technical problem, the present application provides a method for correcting the orientation of a document image, including the following steps.
Step S10: searching for the edges and four corner points of a document area in an input image, and performing trimming and small-angle direction correction on the input image by perspective transformation; if the input image is a document image, the trimmed image is the document area of the input image; the small-angle direction correction corrects the document area of the input image to one of four forms whose deviation angle from the correct direction is 0 degrees, 90 degrees, 180 degrees, or 270 degrees, namely the form to which the document area is closest.
Step S20: obtaining a deviation angle detection value for the trimmed and small-angle-corrected image through an angle classification model; the deviation angle detection value can take only four values: 0 degrees, 90 degrees, 180 degrees, or 270 degrees.
The angle classification model is obtained as follows: (1) collecting a plurality of trimmed document images, non-document background images, and corresponding direction labels as a training data set; the deviation angle between the actual direction of each trimmed document image and the correct direction is 0 degrees, 90 degrees, 180 degrees, or 270 degrees; the direction label of a trimmed document image records the deviation angle between its actual direction and the correct direction; the direction label of a background image indicates that the image is a background image; (2) randomly rotating some or all of the images in the training data set in units of 90 degrees, and changing the direction labels of the rotated trimmed document images accordingly, to obtain an enhanced training data set; (3) training an angle classification model with the enhanced training data set, the angle classification model being used to distinguish document images from background images and to identify which of the four values the deviation angle between the actual direction and the correct direction of each document image takes.
Step S30: correcting the direction of the trimmed and small-angle-corrected image according to the deviation angle detection value.
Further, in step S10, if the edge and the four corner points of the document region in the input image cannot be found, it indicates that the input image is not a document image, and the whole process is exited.
Further, in step S10, the deviation angle α is defined as the angle measured clockwise from the correct direction of the trimmed image to the actual direction of the trimmed image, with 0 degrees ≤ α < 360 degrees; the small-angle direction correction comprises: correcting trimmed images with 0 degrees < α < 45 degrees or 315 degrees < α < 360 degrees into trimmed images with α = 0 degrees; correcting trimmed images with 45 degrees < α < 135 degrees into trimmed images with α = 90 degrees; correcting trimmed images with 135 degrees < α < 225 degrees into trimmed images with α = 180 degrees; and correcting trimmed images with 225 degrees < α < 315 degrees into trimmed images with α = 270 degrees.
Further, the small-angle direction correction further includes: when α is 0 degrees, no small-angle correction is performed; when α is 45 degrees, the trimmed image is corrected to either α = 0 degrees or α = 90 degrees; when α is 135 degrees, the trimmed image is corrected to either α = 90 degrees or α = 180 degrees; when α is 225 degrees, the trimmed image is corrected to either α = 180 degrees or α = 270 degrees; when α is 315 degrees, the trimmed image is corrected to either α = 270 degrees or α = 0 degrees.
Further, in step S20, the angle classification model is obtained by training using a lightweight neural network.
Preferably, in step S20, when the angle classification model is trained, all images in the enhanced training data set are uniformly scaled to a fixed input size.
Preferably, in step S20, the trimmed and small-angle-corrected image is first scaled to the fixed input size used when training the angle classification model, and the scaled image is then fed to the angle classification model.
Further, in step S20, if the angle classification model determines that the image after the trimming and the small angle direction correction belongs to the background image, the whole process is exited.
Further, in step S20, if the angle classification model determines that the deviation angle detection value of the trimmed and small-angle-corrected image is 0 degrees, the whole process is exited.
The application also provides a direction correcting device for a document image, which comprises a trimming and small-angle direction correcting unit, a deviation angle detecting unit, and a large-angle direction correcting unit.
The trimming and small-angle direction correcting unit is used for searching for the edges and four corner points of a document area in an input image and performing trimming and small-angle direction correction on the input image by perspective transformation; if the input image is a document image, the trimmed image is the document area of the input image; the small-angle direction correction corrects the document area of the input image to one of four forms whose deviation angle from the correct direction is 0 degrees, 90 degrees, 180 degrees, or 270 degrees, namely the form to which the document area is closest.
The deviation angle detecting unit is used for obtaining a deviation angle detection value for the trimmed and small-angle-corrected image through an angle classification model; the deviation angle detection value can take only four values: 0 degrees, 90 degrees, 180 degrees, or 270 degrees. The angle classification model is obtained as follows: (1) collecting a plurality of trimmed document images, non-document background images, and corresponding direction labels as a training data set; the deviation angle between the actual direction of each trimmed document image and the correct direction is 0 degrees, 90 degrees, 180 degrees, or 270 degrees; the direction label of a trimmed document image records the deviation angle between its actual direction and the correct direction; the direction label of a background image indicates that the image is a background image; (2) randomly rotating some or all of the images in the training data set in units of 90 degrees, and changing the direction labels of the rotated trimmed document images accordingly, to obtain an enhanced training data set; (3) training an angle classification model with the enhanced training data set, the angle classification model being used to distinguish document images from background images and to identify which of the four values the deviation angle between the actual direction and the correct direction of each document image takes.
The large-angle direction correcting unit is used for correcting the direction of the trimmed and small-angle-corrected image according to the deviation angle detection value.
The technical effect that this application obtained is: a set of solution for quickly and accurately trimming edges and correcting directions is provided for document images. After a picture is input, the system can automatically detect a document area in the picture according to a detection algorithm and give four corner points of the document area, cut out the document area through perspective transformation and simultaneously carry out small-angle direction correction, and then carry out large-angle direction correction through detection of an angle classification model, so that convenience is provided for browsing document images or other subsequent processing operations. The present application addresses orientation detection and correction of an entire image, rather than the orientation of a single line of text therein. According to the method and the device, the trimming processing is carried out before the direction of the document image is corrected, and the accuracy of the direction correction of the document image is improved. According to the method and the device, the background images are added in the direction classification of the document images, and the accuracy of the direction classification of the document images is improved.
Drawings
Fig. 1 is a schematic flow chart of a method for correcting the orientation of a document image according to the present application.
Figs. 2 to 5 are schematic diagrams of several trimmed images before small-angle direction correction by perspective transformation.
Figs. 6 to 9 are schematic diagrams of several trimmed images after small-angle direction correction by perspective transformation.
Fig. 10 is a schematic structural diagram of an orientation correction apparatus for document images proposed in the present application.
The reference numbers in the figures illustrate: 10 is a trimming and small-angle direction correcting unit, 20 is a deviation angle detecting unit, and 30 is a large-angle direction correcting unit.
Detailed Description
Referring to fig. 1, the method for correcting the orientation of a document image according to the present application includes the following steps.
Step S10: searching for the edges and four corner points (corners) of a document region in an input image, and performing trimming and small-angle direction correction on the input image by perspective transformation to obtain a trimmed and small-angle-corrected image. If the input image is a document image, the trimmed image is the document region of the input image. The small-angle direction correction corrects the document area of the input image to one of four forms whose deviation angle from the correct direction is 0 degrees, 90 degrees, 180 degrees, or 270 degrees, namely the form to which the document area is closest.
If the edges and four corner points of the document area in the input image cannot be found in this step, this indicates that the input image is a background image, and the whole process is exited.
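By way of illustration only, the perspective transformation used in step S10 can be sketched as solving for the 3x3 matrix that maps the four detected corner points onto the corners of an upright rectangle. The sketch below is in plain Python and is not taken from the application: the function names `solve_linear`, `solve_homography`, and `apply_homography`, the normalisation of the last matrix entry to 1, and the example coordinates are all assumptions. Resampling the pixels of the trimmed image is omitted.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def solve_homography(src: Sequence[Point], dst: Sequence[Point]) -> List[List[float]]:
    """Return a 3x3 matrix H (with H[2][2] = 1) mapping each src point to its dst point."""
    a: List[List[float]] = []
    b: List[float] = []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0 x + h1 y + h2) / (h6 x + h7 y + 1), and likewise for v.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h: List[List[float]], p: Point) -> Point:
    """Map one point through H, including the perspective divide."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (
        (h[0][0] * x + h[0][1] * y + h[0][2]) / w,
        (h[1][0] * x + h[1][1] * y + h[1][2]) / w,
    )
```

In this sketch the corner points are assumed to be listed in the same order in `src` and `dst`, for example top-left, top-right, bottom-right, bottom-left.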
Figs. 2 to 5 show trimmed images before small-angle direction correction by perspective transformation, and Figs. 6 to 9 show the corresponding trimmed images after small-angle direction correction by perspective transformation. In these figures, the dotted line represents the correct direction of the trimmed image and the solid line represents its actual direction. The deviation angle α is defined as the angle measured clockwise from the correct direction of the trimmed image to its actual direction, with 0 degrees ≤ α < 360 degrees. The small-angle direction correction specifically refers to the following. (1) Trimmed images with 0 degrees < α < 45 degrees or 315 degrees < α < 360 degrees are corrected into trimmed images with α = 0 degrees; the trimmed image shown in Fig. 2 is corrected in this way to obtain the trimmed image shown in Fig. 6. (2) Trimmed images with 45 degrees < α < 135 degrees are corrected into trimmed images with α = 90 degrees; the trimmed image shown in Fig. 3 is corrected in this way to obtain the trimmed image shown in Fig. 7. (3) Trimmed images with 135 degrees < α < 225 degrees are corrected into trimmed images with α = 180 degrees; the trimmed image shown in Fig. 4 is corrected in this way to obtain the trimmed image shown in Fig. 8. (4) Trimmed images with 225 degrees < α < 315 degrees are corrected into trimmed images with α = 270 degrees; the trimmed image shown in Fig. 5 is corrected in this way to obtain the trimmed image shown in Fig. 9.
Several special cases are described below. When α is 0 degrees, no small-angle correction is necessary. When α is 45 degrees, the trimmed image may be corrected to either α = 0 degrees or α = 90 degrees. When α is 135 degrees, the trimmed image may be corrected to either α = 90 degrees or α = 180 degrees. When α is 225 degrees, the trimmed image may be corrected to either α = 180 degrees or α = 270 degrees. When α is 315 degrees, the trimmed image may be corrected to either α = 270 degrees or α = 0 degrees. After the small-angle direction correction, the deviation angle α of the trimmed image can take only four values, namely 0 degrees, 90 degrees, 180 degrees, and 270 degrees, which are shown in Figs. 6 to 9 respectively.
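The mapping from a deviation angle α to the nearest of the four forms can be written as a short function. This is a minimal sketch rather than the application's implementation; the name `snap_deviation_angle` is hypothetical, and the choice to resolve the tie angles (45, 135, 225, and 315 degrees) upward is an assumption, since the text permits either neighbour.

```python
def snap_deviation_angle(alpha: float) -> int:
    """Return the nearest of 0, 90, 180, 270 for a clockwise deviation angle in degrees.

    Tie angles (45, 135, 225, 315) are resolved upward here; the text allows
    either neighbouring value, so this is only one admissible choice.
    """
    alpha = alpha % 360.0  # normalise into [0, 360)
    quarter = int((alpha + 45.0) // 90.0) % 4
    return quarter * 90
```

For example, `snap_deviation_angle(350)` falls in the range 315 degrees < α < 360 degrees and therefore returns 0.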
Step S20: a deviation angle detection value is obtained for the trimmed and small-angle-corrected image through an angle classification model. The deviation angle detection value can take only four values: 0 degrees, 90 degrees, 180 degrees, or 270 degrees.
The angle classification model is obtained in the following manner. (1) A plurality of images and corresponding direction labels are collected as a training data set. Most of the images are trimmed document images, and the rest are background images. A trimmed document image is an image in which the blank area at the edges has been removed and only the document area is retained; the deviation angle between the actual direction of each trimmed document image and the correct direction is 0 degrees, 90 degrees, 180 degrees, or 270 degrees. A background image is an image that contains no document. The direction label of a trimmed document image records the deviation angle between its actual direction and the correct direction. All background images share a single fixed direction label, which indicates that the image is a background image. (2) Some or all of the images in the training data set are randomly rotated in units of 90 degrees, and the direction labels of the rotated trimmed document images are changed accordingly. The training data set processed in this way is called an enhanced training data set; its purpose is to make the distribution of trimmed document images across the different directions as uniform as possible. (3) An angle classification model is trained with the enhanced training data set; the model is used to distinguish document images from background images and to identify which of the four values the deviation angle between the actual direction and the correct direction of each document image takes.
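The rotation-based enhancement in item (2) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the application's code: images are represented as row-major lists of pixel rows, the background label is the hypothetical string `"background"`, rotation is taken to be clockwise so that it adds to the clockwise deviation angle α, and the names `rotate_cw_90`, `augment`, and `augment_dataset` are invented for this example.

```python
import random
from typing import List, Tuple, Union

Image = List[List[int]]
Label = Union[int, str]  # 0, 90, 180, 270, or BACKGROUND

BACKGROUND = "background"  # hypothetical label value for non-document images


def rotate_cw_90(img: Image) -> Image:
    """Rotate a row-major image by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def augment(img: Image, label: Label, quarter_turns: int) -> Tuple[Image, Label]:
    """Rotate clockwise by quarter_turns * 90 degrees and update the label."""
    for _ in range(quarter_turns % 4):
        img = rotate_cw_90(img)
    if label != BACKGROUND:
        # A clockwise rotation adds to the clockwise deviation angle.
        label = (label + 90 * quarter_turns) % 360
    return img, label


def augment_dataset(
    samples: List[Tuple[Image, Label]], rng: random.Random
) -> List[Tuple[Image, Label]]:
    """Apply a random number of quarter turns to every sample."""
    return [augment(img, label, rng.randrange(4)) for img, label in samples]
```

Background images are rotated as well, but their label is left unchanged, which matches the statement that they share one fixed label.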
Preferably, the angle classification model is obtained by training a lightweight neural network (NN), such as SqueezeNet, MobileNet, ShuffleNet, or EffNet, so that it can be conveniently deployed on a mobile terminal such as a smartphone. Preferably, all images in the enhanced training data set are uniformly scaled to a fixed input size during training to achieve a better training effect.
Preferably, in this step, the trimmed and small-angle-corrected image is first scaled to the fixed input size used when training the angle classification model, and the scaled image is then fed to the angle classification model.
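The point of the two preceding paragraphs is that the same fixed input size is used both for training and for detection. A minimal sketch of such scaling, using nearest-neighbour sampling on a row-major image, is given below. The application does not state a size or a resampling method; the constant `MODEL_INPUT_SIZE`, its value, and the function name `resize_nearest` are assumptions for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]

# Hypothetical fixed input size (height, width); the application does not state one.
MODEL_INPUT_SIZE: Tuple[int, int] = (224, 224)


def resize_nearest(img: Image, size: Tuple[int, int] = MODEL_INPUT_SIZE) -> Image:
    """Scale a row-major image to `size` using nearest-neighbour sampling."""
    out_h, out_w = size
    in_h, in_w = len(img), len(img[0])
    return [
        [img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]
```

Calling the same function with the same size on training images and on the image produced by step S10 keeps the two inputs consistent.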
In this step, if the angle classification model determines that the image after the trimming and the small angle direction correction belongs to the background image, a deviation angle detection value cannot be given, and the whole process is exited.
In this step, if the angle classification model determines that the deviation angle detection value of the image after the trimming and the small angle direction correction is 0 degree, the direction of the image after the trimming and the small angle direction correction is not required to be corrected, and the whole process is exited.
Step S30: if the angle classification model gives a deviation angle detection value for the trimmed and small-angle-corrected image, the image is a document image; its direction is corrected according to the deviation angle detection value, that is, the image is rotated into the correct direction. This step performs rotation correction on the trimmed and small-angle-corrected image so that the direction-corrected image can be conveniently read and printed.
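Because the deviation angle is measured clockwise from the correct direction to the actual direction, the rotation in step S30 can be pictured as turning the image counter-clockwise by the detected value. The sketch below assumes a row-major list-of-rows image; the names `rotate_ccw_90` and `correct_orientation` are hypothetical and not part of the application.

```python
from typing import List

Image = List[List[int]]


def rotate_ccw_90(img: Image) -> Image:
    """Rotate a row-major image by 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*img)][::-1]


def correct_orientation(img: Image, deviation: int) -> Image:
    """Undo a clockwise deviation of 0, 90, 180, or 270 degrees."""
    if deviation not in (0, 90, 180, 270):
        raise ValueError("deviation must be 0, 90, 180, or 270 degrees")
    for _ in range(deviation // 90):
        img = rotate_ccw_90(img)
    return img
```

For instance, an upright image `[[1, 2], [3, 4]]` that has been turned 90 degrees clockwise appears as `[[3, 1], [4, 2]]`, and `correct_orientation([[3, 1], [4, 2]], 90)` restores the upright image.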
Referring to fig. 10, the apparatus for correcting the direction of a document image according to the present application includes a trimming and small-angle direction correcting unit 10, a deviation angle detecting unit 20, and a large-angle direction correcting unit 30.
The trimming and small-angle direction correcting unit 10 is configured to search edges and four corner points of a document region in an input image, and perform trimming and small-angle direction correction on the input image by using a perspective transformation method to obtain an image after trimming and small-angle direction correction.
The deviation angle detection unit 20 is configured to obtain a deviation angle detection value for the trimmed and small-angle-corrected image through an angle classification model. The deviation angle detection value can take only four values: 0 degrees, 90 degrees, 180 degrees, or 270 degrees. The angle classification model is obtained in the following manner. (1) A plurality of images and corresponding direction labels are collected as a training data set. Most of the images are trimmed document images, and the rest are background images. The deviation angle between the actual direction of each trimmed document image and the correct direction is 0 degrees, 90 degrees, 180 degrees, or 270 degrees. A background image is an image that contains no document. The direction label of a trimmed document image records the deviation angle between its actual direction and the correct direction. All background images share a single fixed direction label, which indicates that the image is a background image. (2) Some or all of the images in the training data set are randomly rotated in units of 90 degrees, and the direction labels of the rotated trimmed document images are changed accordingly. The training data set processed in this way is referred to as an enhanced training data set. (3) An angle classification model is trained with the enhanced training data set; the model is used to distinguish document images from background images and to identify which of the four values the deviation angle between the actual direction and the correct direction of each document image takes.
The large-angle direction correcting unit 30 is configured to correct the directions of the images after the trimming and the small-angle direction correction according to the deviation angle detection value, and rotate the images after the trimming and the small-angle direction correction into a correct direction.
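The way the three units cooperate, including the early exits described for steps S10 and S20, can be summarised in a small control-flow sketch. It is an assumption-laden illustration, not the device itself: the three units are passed in as callables, the function name `correct_document`, the `"background"` label, and the use of `None` to signal an exit are all hypothetical, and returning the trimmed image unchanged when the detected value is 0 is a design choice made for this example.

```python
from typing import Any, Callable, Optional, Union

BACKGROUND = "background"  # hypothetical label for the background class


def correct_document(
    image: Any,
    trim_and_snap: Callable[[Any], Optional[Any]],
    classify: Callable[[Any], Union[int, str]],
    rotate: Callable[[Any, int], Any],
) -> Optional[Any]:
    """Run units 10, 20, and 30 in order, exiting early where the text says so."""
    # Unit 10: trimming and small-angle correction; None means no document region found.
    trimmed = trim_and_snap(image)
    if trimmed is None:
        return None
    # Unit 20: the classifier returns 0, 90, 180, 270, or BACKGROUND.
    detected = classify(trimmed)
    if detected == BACKGROUND:
        return None
    if detected == 0:
        return trimmed  # already in the correct direction
    # Unit 30: large-angle correction by the detected deviation angle.
    return rotate(trimmed, detected)
```

Passing the units as callables keeps the sketch independent of any particular corner detector or neural network.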
Compared with the prior art, the method and the device for correcting the direction of the document image have the following beneficial effects.
First, the present application is based on deep learning technology and is highly robust.
Secondly, the method turns the direction calculation problem for a document image into a classification problem over background images and four large-angle directions (0 degrees, 90 degrees, 180 degrees, and 270 degrees), and therefore runs fast. Because each small angle is not handled separately, the complexity of direction correction is reduced and neural network learning is made easier.
Thirdly, the angle classification model is obtained by training a lightweight neural network; it runs fast, is small in size, and is particularly suitable for deployment on a mobile terminal.
The above are merely preferred embodiments of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.
Claims (10)
1. A method for correcting the direction of a document image, characterized by comprising the following steps:
step S10: searching for the edges and four corner points of a document area in an input image, and performing trimming and small-angle direction correction on the input image by a perspective transformation method; if the input image is a document image, the trimmed image is the document area of the input image; the small-angle direction correction refers to correcting the document area of the input image into one of four forms whose deviation angle from the correct direction is 0 degrees, 90 degrees, 180 degrees, or 270 degrees, namely the form to which the document area is closest;
step S20: obtaining a deviation angle detection value of the image after the trimming and the small angle direction correction through an angle classification model; the deviation angle detection value only has four values of 0 degree, or 90 degrees, or 180 degrees, or 270 degrees;
the angle classification model is obtained by adopting the following method: (1) collecting a plurality of trimmed document images, non-document background images and corresponding direction labels as a training data set; the deviation angle between the actual direction of the document image after trimming and the correct direction is either 0 degree, or 90 degrees, or 180 degrees, or 270 degrees; the direction label of the trimmed document image is used for recording the deviation angle between the actual direction and the correct direction of the trimmed document image; the direction label of the background image indicates that the image is a background image; (2) randomly rotating part or all images in the training data set by taking 90 degrees as a unit, and correspondingly changing the direction labels of the rotated trimmed document images to obtain an enhanced training data set; (3) training an angle classification model by using the enhanced training data set, wherein the angle classification model is used for distinguishing the document images from the background images and identifying which of the four values is the deviation angle between the actual direction and the correct direction of each document image;
step S30: correcting the direction of the image after the trimming and the small-angle direction correction according to the deviation angle detection value.
2. The method for correcting the orientation of a document image according to claim 1, wherein in said step S10, if the edges and four corner points of the document region in the input image cannot be found, it indicates that the input image is not a document image, and the whole process is exited.
3. The method of correcting the orientation of a document image according to claim 1, wherein in said step S10, a deviation angle α is defined as the angle measured clockwise from the correct orientation of the trimmed image to the actual orientation of the trimmed image, the deviation angle α having a value in the range of 0 degrees ≦ α < 360 degrees; the small-angle direction correction comprises: correcting trimmed images with 0 degrees < α < 45 degrees or 315 degrees < α < 360 degrees into trimmed images with α = 0 degrees; correcting trimmed images with 45 degrees < α < 135 degrees into trimmed images with α = 90 degrees; correcting trimmed images with 135 degrees < α < 225 degrees into trimmed images with α = 180 degrees; and correcting trimmed images with 225 degrees < α < 315 degrees into trimmed images with α = 270 degrees.
4. The method of correcting the orientation of a document image according to claim 1, wherein the small-angle direction correction further comprises: when α = 0 degrees, no small-angle correction is performed; when α = 45 degrees, the trimmed image is corrected to either α = 0 degrees or α = 90 degrees; when α = 135 degrees, the trimmed image is corrected to either α = 90 degrees or α = 180 degrees; when α = 225 degrees, the trimmed image is corrected to either α = 180 degrees or α = 270 degrees; when α = 315 degrees, the trimmed image is corrected to either α = 270 degrees or α = 0 degrees.
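The angle ranges in claims 3 and 4 amount to snapping α to the nearest multiple of 90 degrees. The sketch below only illustrates that bucketing; it does not perform the perspective transformation of step S10. The function names `snap_angle` and `residual_rotation` are hypothetical, and sending the boundary angles (45, 135, 225, 315 degrees) to the higher bucket is an arbitrary choice among the two options claim 4 allows.

```python
def snap_angle(alpha):
    """Return the multiple of 90 degrees closest to alpha (0 <= result < 360).

    Boundary angles such as 45 degrees go to the higher bucket; claim 4
    permits either neighbour, so this tie-break is an assumption.
    """
    return int((alpha % 360 + 45) // 90) * 90 % 360


def residual_rotation(alpha):
    """Signed clockwise rotation, in (-45, 45], that moves alpha onto its bucket."""
    return (snap_angle(alpha) - alpha + 180) % 360 - 180


targets = [snap_angle(a) for a in (10, 100, 200, 300, 350)]
```

Here `targets` holds one bucket per input angle, and `residual_rotation(350)` gives the 10 degrees of clockwise rotation needed to reach α = 0 degrees.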
5. The method for correcting the orientation of a document image according to claim 1, wherein in step S20, the angle classification model is obtained by training using a lightweight neural network.
6. The method for correcting the orientation of a document image according to claim 1, wherein in step S20, during training of the angle classification model, all images in the enhanced training data set are uniformly scaled to a fixed input size.
7. The method of claim 6, wherein in step S20, the image after the trimming and small-angle direction correction is first scaled to the fixed input size used during training of the angle classification model, and the scaled image is then sent to the angle classification model.
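Claims 6 and 7 require the same fixed input size at training time and at inference time. A minimal sketch of one way to do this is shown below, using nearest-neighbour scaling on a list-of-rows image. The scaling method, the function names `resize_nearest` and `preprocess`, and the value of `FIXED_SIZE` are all assumptions; the source does not state the size or the interpolation used.

```python
FIXED_SIZE = (4, 4)  # hypothetical (height, width); the source gives no value


def resize_nearest(image, out_h, out_w):
    """Scale a list-of-rows image to out_h x out_w by nearest-neighbour sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def preprocess(image, size=FIXED_SIZE):
    """Shared by training and inference so both see the same input size."""
    return resize_nearest(image, size[0], size[1])


scaled = preprocess([[1, 2], [3, 4]])
```

Routing both the training images and the trimmed inference image through one `preprocess` function is a simple way to keep the two sizes from drifting apart.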
8. The method for correcting the orientation of a document image according to claim 1, wherein in step S20, if the angle classification model determines that the image after the trimming and small-angle orientation correction belongs to a background image, the whole process is exited.
9. The method of claim 1, wherein in step S20, if the angle classification model determines that the deviation angle detection value of the image after the trimming and small-angle direction correction is 0 degrees, the whole process is exited.
10. A direction correction device for a document image, characterized by comprising a trimming and small-angle direction correction unit, a deviation angle detection unit and a large-angle direction correction unit;
the trimming and small-angle direction correction unit is used for searching for the edges and four corner points of a document area in an input image and performing trimming and small-angle direction correction on the input image by using a perspective transformation method; if the input image is a document image, the trimmed image is the document area of the input image; the small-angle direction correction refers to correcting the document area of the input image into one of four forms whose deviation angle from the correct direction is 0 degrees, 90 degrees, 180 degrees or 270 degrees, namely whichever of the four forms the document area is closest to;
the deviation angle detection unit is used for obtaining, through an angle classification model, a deviation angle detection value for the image after the trimming and small-angle direction correction; the deviation angle detection value takes only one of four values: 0 degrees, 90 degrees, 180 degrees or 270 degrees; the angle classification model is obtained by the following method: (1) collecting a plurality of trimmed document images, non-document background images and their corresponding direction labels as a training data set; the deviation angle between the actual direction and the correct direction of each trimmed document image is 0 degrees, 90 degrees, 180 degrees or 270 degrees; the direction label of a trimmed document image records the deviation angle between the actual direction and the correct direction of that trimmed document image; the direction label of a background image indicates that the image is a background image; (2) randomly rotating some or all of the images in the training data set in units of 90 degrees, and changing the direction labels of the rotated trimmed document images accordingly, to obtain an enhanced training data set; (3) training the angle classification model with the enhanced training data set, wherein the angle classification model is used for distinguishing document images from background images and for identifying which of the four values is the deviation angle between the actual direction and the correct direction of each document image;
and the large-angle direction correction unit is used for correcting the direction of the image after the trimming and small-angle direction correction according to the deviation angle detection value.
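How the three units and the early exits of claims 2, 8 and 9 could fit together is sketched below in Python. This is an illustrative composition under stated assumptions, not the claimed device: `trim_and_snap` and `classify_angle` are hypothetical callables standing in for the first two units, an image is a list of pixel rows, and "exiting the whole process" is modelled as returning `None` (or the unrotated trimmed image when the detected angle is 0 degrees).

```python
BACKGROUND = "background"  # hypothetical class label for non-document images


def rotate_cw(image, quarter_turns):
    """Rotate a list-of-rows image clockwise by quarter_turns * 90 degrees."""
    for _ in range(quarter_turns % 4):
        image = [list(row) for row in zip(*image[::-1])]
    return image


def correct_document(image, trim_and_snap, classify_angle):
    """Run trimming, angle detection and large-angle correction in sequence."""
    trimmed = trim_and_snap(image)  # trimming and small-angle correction unit
    if trimmed is None:
        return None  # claim 2: no edges or corner points found
    angle = classify_angle(trimmed)  # deviation angle detection unit
    if angle == BACKGROUND:
        return None  # claim 8: classified as a background image
    if angle == 0:
        return trimmed  # claim 9: already upright, nothing left to rotate
    # Large-angle direction correction unit: undo the clockwise deviation.
    return rotate_cw(trimmed, (360 - angle) // 90)


upright = [[1, 2], [3, 4]]
sideways = rotate_cw(upright, 1)  # simulated 90-degree clockwise deviation
fixed = correct_document(sideways, lambda im: im, lambda im: 90)
```

Passing the units in as callables keeps the control flow separate from any particular corner detector or neural network.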
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111679610.5A CN114267046A (en) | 2021-12-31 | 2021-12-31 | Method and device for correcting direction of document image |
PCT/CN2022/088550 WO2023123763A1 (en) | 2021-12-31 | 2022-04-22 | Direction correction method and apparatus for document image |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111679610.5A CN114267046A (en) | 2021-12-31 | 2021-12-31 | Method and device for correcting direction of document image |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114267046A (en) | 2022-04-01 |
Family
ID=80832566
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111679610.5A Pending CN114267046A (en) | 2021-12-31 | 2021-12-31 | Method and device for correcting direction of document image |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN114267046A (en) |
WO (1) | WO2023123763A1 (en) |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108681729B (en) * | 2018-05-08 | 2023-06-23 | 腾讯科技(深圳)有限公司 | Text image correction method, device, storage medium and equipment |
WO2021221614A1 (en) * | 2020-04-28 | 2021-11-04 | Hewlett-Packard Development Company, L.P. | Document orientation detection and correction |
CN112101367A (en) * | 2020-09-15 | 2020-12-18 | 杭州睿琪软件有限公司 | Text recognition method, image recognition and classification method and document recognition processing method |
CN112419207A (en) * | 2020-11-17 | 2021-02-26 | 苏宁金融科技(南京)有限公司 | Image correction method, device and system |
CN114267046A (en) * | 2021-12-31 | 2022-04-01 | 上海合合信息科技股份有限公司 | Method and device for correcting direction of document image |
- 2021-12-31: CN application CN202111679610.5A (CN114267046A), active, pending
- 2022-04-22: WO application PCT/CN2022/088550 (WO2023123763A1), status unknown
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023123763A1 (en) * | 2021-12-31 | 2023-07-06 | 上海合合信息科技股份有限公司 | Direction correction method and apparatus for document image |
CN115457559A (en) * | 2022-08-19 | 2022-12-09 | 上海通办信息服务有限公司 | Method, device and equipment for intelligently correcting text and license pictures |
CN115457559B (en) * | 2022-08-19 | 2024-01-16 | 上海通办信息服务有限公司 | Method, device and equipment for intelligently correcting texts and license pictures |
Also Published As
Publication number | Publication date |
---|---|
WO2023123763A1 (en) | 2023-07-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11790641B2 (en) | Answer evaluation method, answer evaluation system, electronic device, and medium | |
CN108171297B (en) | Answer sheet identification method | |
CN110569832B (en) | Text real-time positioning and identifying method based on deep learning attention mechanism | |
CN107067006B (en) | Verification code identification method and system serving for data acquisition | |
US6778703B1 (en) | Form recognition using reference areas | |
CN104143094A (en) | Test paper automatic test paper marking processing method and system without answer sheet | |
CN114267046A (en) | Method and device for correcting direction of document image | |
CN114299528B (en) | Information extraction and structuring method for scanned document | |
CN110942074A (en) | Character segmentation recognition method and device, electronic equipment and storage medium | |
CN110807454B (en) | Text positioning method, device, equipment and storage medium based on image segmentation | |
CN105046200B (en) | Electronic paper marking method based on straight line detection | |
CN107590494B (en) | Answer sheet picture positioning method and device, readable storage medium and electronic equipment | |
CN111091124B (en) | Spine character recognition method | |
CN109740473B (en) | Picture content automatic marking method and system based on paper marking system | |
CN110942063B (en) | Certificate text information acquisition method and device and electronic equipment | |
CN107067399A (en) | A kind of paper image segmentation processing method | |
CN110717492A (en) | Method for correcting direction of character string in drawing based on joint features | |
CN113901933B (en) | Electronic invoice information extraction method, device and equipment based on artificial intelligence | |
CN115984859B (en) | Image character recognition method, device and storage medium | |
CN113901952A (en) | Print form and handwritten form separated character recognition method based on deep learning | |
CN109741273A (en) | A kind of mobile phone photograph low-quality images automatically process and methods of marking | |
CN114445843A (en) | Card image character recognition method and device of fixed format | |
CN116597466A (en) | Engineering drawing text detection and recognition method and system based on improved YOLOv5s | |
CN107066939A (en) | A kind of paper cutting process method of online paper-marking system | |
CN114882204A (en) | Automatic ship name recognition method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||