WO2021042509A1

WO2021042509A1 - Method and apparatus for rectifying deflection of angle of text image, and computer-readable storage medium

Info

Publication number: WO2021042509A1
Application number: PCT/CN2019/116549
Authority: WO
Inventors: 王博
Original assignee: 平安科技（深圳）有限公司
Priority date: 2019-09-06
Filing date: 2019-11-08
Publication date: 2021-03-11
Also published as: CN110705546B; CN110705546A

Abstract

The present application relates to artificial intelligence technologies. Disclosed is a method for rectifying the deflection of an angle of a text image, the method comprising: acquiring a text image, and performing a pre-processing operation on the text image to obtain a binarized text image; detecting deflected text in the binarized text image by means of an iterative algorithm in order to obtain a deflected text image, and cutting the deflected text image to obtain a binary copy image; incrementally rotating the binary copy image, and converting the incrementally rotated binary copy image into a set of frequency projection histograms; and calculating the standard deviation between a peak point and a peak-valley point of each histogram in the set of frequency projection histograms to obtain a set of standard deviations, and using the rotation angle corresponding to the maximum standard deviation in the set of standard deviations as a deflection rectification angle for the text image, thus completing the rectification of the deflection of the angle of the text image. Further provided are an apparatus for rectifying the deflection of an angle of a text image, and a computer-readable storage medium. According to the present application, accurate rectification of the deflection of the angle of the text image is realized.

Description

Method and device for correcting text image angle deviation and computer readable storage medium

This application claims priority to Chinese patent application No. 201910846892.X, titled "Text image angle correction method, device and computer readable storage medium", filed with the Chinese Patent Office on September 6, 2019, the entire content of which is incorporated herein by reference.

Technical field

This application relates to the field of artificial intelligence technology, and in particular to a method, device and computer-readable storage medium for correcting the angle of a text image based on projection.

Background art

Optical character recognition (OCR) has an extremely wide range of application scenarios today. OCR refers to the process of recognizing optical characters in pictures through image processing and pattern recognition technology and translating them into computer characters. The main process is: image input, preprocessing, binarization, denoising, character segmentation, and character recognition. Most current OCR algorithms are based on decision trees and support vector machines (SVM), and their recognition accuracy is very sensitive to character skew. However, it is difficult to capture text images with zero skew, and it is also difficult to calculate the correction angle accurately.

Summary of the invention

The present application provides a method and device for correcting the angle of a text image, and a computer-readable storage medium, the main purpose of which is to present the user with an accurate correction result when the user performs the angle correction of the text image in the knowledge base.

In order to achieve the above objective, a method for correcting the angle of a text image provided by this application includes:

Acquiring a text image, and performing a preprocessing operation on the text image to obtain a binarized text image;

Detecting skewed text in the binarized text image by an iterative algorithm to obtain a skewed text image, and cropping the skewed text image to obtain a binary copy image;

Performing progressive rotation on the binary copy image, converting each progressively rotated binary copy image into a frequency projection histogram, and obtaining the frequency projection histogram set of the binary copy image according to the progressive rotation angles;

Calculating the standard deviation of the peak vertices and valley points of each histogram in the frequency projection histogram set to obtain a standard deviation set, and using the rotation angle corresponding to the maximum standard deviation in the standard deviation set as the correction angle of the text image, to complete the angle correction of the text image.

In addition, in order to achieve the above object, the present application also provides a text image angle correction device, which includes a memory and a processor. The memory stores a text image angle correction program that can be run on the processor, and when the text image angle correction program is executed by the processor, the steps of the method for correcting the angle of a text image described above are implemented.

In addition, in order to achieve the above object, the present application also provides a computer-readable storage medium on which a text image angle correction program is stored. The text image angle correction program can be executed by one or more processors to implement the steps of the method for correcting the angle of a text image described above.

With the text image angle correction method, device, and computer-readable storage medium proposed in this application, when the user performs text image angle correction, the acquired text image is preprocessed, the skewed text image within it is detected and processed to obtain a frequency projection histogram set, the standard deviation of the peak vertices and valley points of each histogram in the set is calculated, and the rotation angle corresponding to the maximum standard deviation is used as the correction angle of the text image, so that an accurate angle correction result can be presented to the user.

Description of the drawings

FIG. 1 is a schematic flowchart of a method for correcting the angle of a text image according to an embodiment of the application;

FIG. 2 is a schematic diagram of the internal structure of a text image angle correction device provided by an embodiment of the application;

FIG. 3 is a schematic diagram of modules of a text image angle correction program in a text image angle correction device provided by an embodiment of the application.

The realization of the purpose, the functional characteristics, and the advantages of this application will be further described in conjunction with the embodiments and with reference to the accompanying drawings.

Detailed description

It should be understood that the specific embodiments described here are only used to explain the application, and not used to limit the application.

This application provides a method for correcting the angle of a text image. Referring to FIG. 1, it is a schematic flowchart of a method for correcting the angle of a text image according to an embodiment of this application. The method can be executed by a device, and the device can be implemented by software and/or hardware.

In this embodiment, the method for correcting the angle of the text image includes:

S1. Acquire a text image, and perform a preprocessing operation on the text image to obtain a binary text image.

In a preferred embodiment of the present application, the text image may be image data such as certificates and invoices. The preprocessing operation is: denoising the text image through an adaptive image denoising filter, enhancing the contrast of the denoised text image using a contrast stretching method, and thresholding the contrast-enhanced text image using the OTSU algorithm to obtain the binarized text image. The specific implementation steps of the preprocessing operation are as follows:

a. Noise reduction:

This application uses an adaptive image noise reduction filter to reduce the noise of the text image, which is used to filter out the salt and pepper noise of the text image, and can protect the details of the text image to a large extent. Wherein, the salt and pepper noise is a white point or black point that randomly appears in the image, and the adaptive image noise reduction filter is a signal extractor, which is used to extract the original signal from the signal contaminated by noise.

In the preferred embodiment of the present application, the text image is denoted f(x, y). Under the action of the degradation function H and the influence of the salt-and-pepper noise η(x, y), a degraded image g(x, y) is obtained. The image degradation formula is therefore g(x, y) = η(x, y) + f(x, y), and the adaptive filter method is used to reduce the noise of the text image. The noise reduction formula is:

$$\hat{f}(x, y) = g(x, y) - \frac{\sigma_\eta^2}{\sigma_L^2}\left[g(x, y) - m_L\right]$$

where $\hat{f}(x, y)$ is the denoised text image, $\sigma_\eta^2$ is the noise variance of the text image, $m_L$ is the average gray value of the pixels in a window near the point (x, y), and $\sigma_L^2$ is the variance of the gray levels of the pixels in a window near the point (x, y).
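The noise reduction step can be sketched as follows. This is a minimal pure-Python illustration, assuming the standard adaptive local noise reduction rule (output = g minus the ratio of noise variance to local variance, times the difference between g and the local mean). The function name `adaptive_local_filter`, the parameters `noise_var` and `radius`, and the clamping of the variance ratio to 1 are assumptions made for this example and are not specified in the application.

```python
def adaptive_local_filter(g, noise_var, radius=1):
    """Adaptive local noise reduction on a 2-D list of gray values (illustrative)."""
    rows, cols = len(g), len(g[0])
    out = [[0.0] * cols for _ in range(rows)]
    for x in range(rows):
        for y in range(cols):
            # Collect the window around (x, y), clipped at the image border.
            window = [
                g[i][j]
                for i in range(max(0, x - radius), min(rows, x + radius + 1))
                for j in range(max(0, y - radius), min(cols, y + radius + 1))
            ]
            local_mean = sum(window) / len(window)
            local_var = sum((v - local_mean) ** 2 for v in window) / len(window)
            # Assumption: clamp the ratio to 1 so the result never overshoots
            # the local mean when the noise variance exceeds the local variance.
            ratio = 1.0 if local_var <= noise_var else noise_var / local_var
            out[x][y] = g[x][y] - ratio * (g[x][y] - local_mean)
    return out
```

With `noise_var = 0` the image is returned unchanged; when the noise variance is at least as large as the local variance, each pixel is replaced by the mean of its window.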

b. Contrast enhancement:

Contrast refers to the difference between the maximum and minimum brightness values in the imaging system; low contrast makes image processing more difficult. In the preferred embodiment of the present application, a contrast stretching method is adopted, which enhances the contrast of the text image by increasing the dynamic range of its gray levels. Contrast stretching is also called gray scale stretching.

Further, the present application performs gray scale stretching on a specific area according to the piecewise linear transformation function of the contrast stretching method, so as to further improve the contrast of the output image. Contrast stretching is essentially a gray value transformation. This application implements the gray value transformation through linear stretching, which is a pixel-level operation with a linear relationship between the input and output gray values. The gray value transformation formula is:

$$D_b = f(D_a) = a \cdot D_a + b$$

where a is the slope of the line and b is its intercept on the Y axis, $D_a$ is the gray value of the input image, and $D_b$ is the gray value of the output image. When a > 1, the contrast of the output image is enhanced relative to the original image; when a < 1, the contrast of the output image is weaker than that of the original image.
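The linear gray value transformation can be illustrated with a short sketch. The helper names `linear_stretch` and `stretch_to_full_range`, the clipping to the 8-bit range [0, 255], and the choice of a and b that maps the darkest pixel to 0 and the brightest to 255 are assumptions for this example; the application itself only specifies the form a·D_a + b.

```python
def linear_stretch(image, a, b):
    """Apply D_b = a * D_a + b to every pixel, clipped to the 8-bit range."""
    return [[min(255, max(0, round(a * d + b))) for d in row] for row in image]


def stretch_to_full_range(image):
    """Pick a and b so that the darkest pixel maps to 0 and the brightest to 255."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:
        # A constant image has no dynamic range to stretch.
        return [row[:] for row in image]
    a = 255 / (hi - lo)   # a > 1 whenever the input range is narrower than 255
    b = -a * lo
    return linear_stretch(image, a, b)
```

For an image whose gray values span 100 to 200, `stretch_to_full_range` uses a = 2.55, so the contrast of the output is enhanced.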

c. Image thresholding operation:

This application uses the OTSU algorithm to binarize the contrast-enhanced text image and obtain a binarized image. Further, in a preferred embodiment of the present application, the gray level t is taken as the segmentation threshold between the foreground and background of the contrast-enhanced text image. The proportion of foreground points in the contrast-enhanced text image is $w_0$ and their average gray level is $u_0$; the proportion of background points is $w_1$ and their average gray level is $u_1$. The total average gray level of the contrast-enhanced text image is then:

$$u = w_0 u_0 + w_1 u_1$$

The variance between the foreground and background of the contrast-enhanced text image is:

$$g = w_0 (u_0 - u)^2 + w_1 (u_1 - u)^2 = w_0 w_1 (u_0 - u_1)^2$$

When the variance g is at its maximum, the difference between the foreground and the background is largest, and the gray level t at that point is the optimal threshold. Gray values greater than t in the contrast-enhanced text image are set to 255 and gray values smaller than t are set to 0, which yields the binarized text image.
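The threshold search can be sketched as an exhaustive scan over the 256 gray levels that keeps the level maximizing the between-class variance w₀·w₁·(u₀ − u₁)². The function names `otsu_threshold` and `binarize` are illustrative, and the sketch assumes integer gray values in [0, 255] with brighter pixels treated as foreground.

```python
def otsu_threshold(image):
    """Return the gray level t that maximizes the between-class variance g."""
    hist = [0] * 256
    for row in image:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    total_sum = sum(level * count for level, count in enumerate(hist))

    best_t, best_g = 0, -1.0
    count_bg, sum_bg = 0, 0
    for t in range(256):
        # Pixels with gray value <= t form one class, the remaining pixels the other.
        count_bg += hist[t]
        sum_bg += t * hist[t]
        count_fg = total - count_bg
        if count_bg == 0 or count_fg == 0:
            continue
        w0, w1 = count_fg / total, count_bg / total
        u0, u1 = (total_sum - sum_bg) / count_fg, sum_bg / count_bg
        g = w0 * w1 * (u0 - u1) ** 2
        if g > best_g:
            best_t, best_g = t, g
    return best_t


def binarize(image, t):
    """Set gray values above t to 255 and all others to 0."""
    return [[255 if v > t else 0 for v in row] for row in image]
```

A typical use is `binarize(image, otsu_threshold(image))`.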

Further, the preprocessing operation described in the present application may further include reducing the dimension of the binarized text image through a principal component analysis method, so that the binarized text image can be processed more efficiently. Wherein, the principal component analysis method is a method of transforming a group of potentially correlated variables into a group of linearly uncorrelated variables through orthogonal transformation.

S2. Detect skewed text in the binary text image by an iterative algorithm to obtain a skewed text image, and extract the skewed text image to obtain a binary copy image.

In the preferred embodiment of the present application, the skewed text in the binarized text image is detected by the AdaBoost iterative algorithm to obtain the skewed text image. AdaBoost is a detection algorithm whose core is iteration: it trains a weak classifier on each of several training sets and combines the weak classifiers into a final strong classifier. The algorithm works by adjusting the data distribution: the weight of each sample is set according to whether that sample was classified correctly in the current training round and according to the overall classification accuracy of the previous round. The data set with the updated weights is used to train the next classifier, and the classifiers obtained in each round are then combined to form the final decision classifier.

This application divides the binarized text image into regions to obtain training samples $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, where negative samples (background) are labeled $y_i = 0$ and positive samples (foreground, that is, regions containing skewed text) are labeled $y_i = 1$. Preferably, the weak classifier constructed in this application is:

$$h(x, f, p, \theta) = \begin{cases} 1, & p\,f(x) < p\,\theta \\ 0, & \text{otherwise} \end{cases}$$

where f is the feature, θ is the threshold, p indicates the direction of the inequality sign, and x is a detection sub-window. From the constructed weak classifiers, the best weak classifier $h_t(x)$, that is, the one with the smallest classification error rate $\varepsilon_t$, is selected. $\varepsilon_t$ is calculated as:

$$\varepsilon_t = \min_{f, p, \theta} \sum_i \frac{w_i}{\sum_k w_k} \left| h(x_i, f, p, \theta) - y_i \right|$$

where $w_i$ is the weight of the i-th sample. The selected weak classifiers are combined into the final strong classifier, with:

$$\beta_t = \frac{\varepsilon_t}{1 - \varepsilon_t}$$
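One round of weak classifier selection can be sketched as below. The sketch assumes the threshold form "output 1 when p·f(x) < p·θ, otherwise 0", takes candidate thresholds from the observed feature values, and uses precomputed feature values instead of image sub-windows. The names `weak_classify`, `select_best_weak_classifier`, and `beta` are hypothetical and chosen for this example only.

```python
def weak_classify(value, p, theta):
    """Threshold classifier: 1 when p * f(x) < p * theta, otherwise 0."""
    return 1 if p * value < p * theta else 0


def select_best_weak_classifier(feature_values, labels, weights):
    """Search features, polarities and thresholds for the lowest weighted error.

    feature_values[k][i] is the value of feature k on sample i.
    Returns (error, feature_index, p, theta).
    """
    total = sum(weights)
    best = None
    for k, values in enumerate(feature_values):
        for theta in sorted(set(values)):
            for p in (1, -1):
                err = sum(
                    (w / total) * abs(weak_classify(v, p, theta) - y)
                    for w, v, y in zip(weights, values, labels)
                )
                if best is None or err < best[0]:
                    best = (err, k, p, theta)
    return best


def beta(eps):
    """beta_t = eps_t / (1 - eps_t); smaller values indicate a better classifier."""
    return eps / (1 - eps)
```

In a full training loop the sample weights would be updated after each round before the next weak classifier is selected; that update is omitted here.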

Further, the present application detects the skewed text in the binarized text image by means of a cascade classifier. The cascade classifier is a text detection classifier formed by cascading the strong classifiers obtained from training, and it is a degenerate decision tree. In the cascade classifier, the second-level classifier is triggered by the positive samples obtained from the first-level classification, the third-level classifier is triggered by the positive samples obtained from the second-level classification, and so on. In this way, all the skewed text images in the binarized text image are detected, and the skewed text images are cropped to obtain the binary copy image.
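The cascade behaviour described above, where each level only sees the windows accepted by the previous level, can be sketched as an early-exit loop. The function name `cascade_accepts` and the representation of each stage as a callable returning True or False are assumptions for illustration.

```python
def cascade_accepts(window, stages):
    """Run the stages in order; reject as soon as one stage says 'no text'.

    stages is a list of callables, each taking a window and returning
    True when the window may still contain skewed text.
    """
    for stage in stages:
        if not stage(window):
            return False  # later stages are never evaluated for this window
    return True
```

Because most background windows are rejected by the first stages, the later and typically more expensive stages run only on a small fraction of the windows.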

S3. Perform progressive rotation on the binary copy image, convert each progressively rotated binary copy image into a frequency projection histogram, and obtain the frequency projection histogram set of the binary copy image according to the progressive rotation angles.

In the preferred embodiment of this application, the binary copy image is progressively rotated by a preset angle step. Preferably, the binary copy image is rotated between -45° and 45° in steps of 2°, and the length and width in pixels of the binary copy image are calculated after each rotation.
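The progressive rotation can be sketched as follows. The sketch assumes rotation about the image center with nearest-neighbor sampling on a canvas of unchanged size, so content rotated outside the canvas is dropped; a practical implementation would more likely enlarge the canvas or use an image library. The names `rotation_angles` and `rotate_binary` are illustrative.

```python
import math


def rotation_angles(start=-45, stop=45, step=2):
    """Candidate angles in degrees: -45, -43, ..., 43, 45."""
    return list(range(start, stop + 1, step))


def rotate_binary(image, angle_deg):
    """Rotate a 2-D list about its center by angle_deg (nearest neighbor)."""
    rows, cols = len(image), len(image[0])
    cx, cy = (rows - 1) / 2, (cols - 1) / 2
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    out = [[0] * cols for _ in range(rows)]
    for x in range(rows):
        for y in range(cols):
            # Inverse mapping: locate the source pixel that lands on (x, y).
            sx = c * (x - cx) + s * (y - cy) + cx
            sy = -s * (x - cx) + c * (y - cy) + cy
            i, j = round(sx), round(sy)
            if 0 <= i < rows and 0 <= j < cols:
                out[x][y] = image[i][j]
    return out
```

Calling `rotate_binary(image, a)` for every `a` in `rotation_angles()` gives the 46 rotated copies from which the histograms are built.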

Further, the present application converts the progressively rotated binary copy image into a frequency projection histogram through a Fourier transform algorithm. Specifically, the two-dimensional discrete Fourier transform of the binary copy image is (up to the choice of normalization constant):

$$F(u, v) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y)\, e^{-j 2\pi \left(\frac{ux}{M} + \frac{vy}{N}\right)}$$

and the corresponding inverse transform is:

$$f(x, y) = \frac{1}{MN} \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} F(u, v)\, e^{j 2\pi \left(\frac{ux}{M} + \frac{vy}{N}\right)}$$

where u = 0, 1, 2, ..., M-1; v = 0, 1, 2, ..., N-1; x = 0, 1, 2, ..., M-1; y = 0, 1, 2, ..., N-1; M and N are the length and width of the binary copy image in pixels; x and y are the spatial-domain coordinates; f(x, y) is the spatial-domain sample value of the binary copy image; F(u, v) is the Fourier-transform-domain sample value of the binary copy image; and u and v are the transform-domain coordinates. When the pixel array of the binary copy image is square, M = N. F(u, v) is called the frequency spectrum of the binary copy image signal f(x, y), and the amplitude spectrum and phase spectrum of the Fourier-transformed binary copy image are calculated as:

$$|F(u, v)| = \sqrt{R^2(u, v) + I^2(u, v)}, \qquad \varphi(u, v) = \arctan\frac{I(u, v)}{R(u, v)}$$

where $F(u, v) = R(u, v) + jI(u, v) = |F(u, v)|\,e^{j\varphi(u, v)}$, $|F(u, v)|$ is the amplitude spectrum of the binary copy image, and $\varphi(u, v)$ is the phase spectrum of the binary copy image.
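The transform and the two spectra can be illustrated with a naive implementation. This sketch assumes the unnormalized forward transform (no 1/MN factor) and evaluates the double sum directly, which is only practical for very small images; a real system would use a fast Fourier transform. The names `dft2` and `amplitude_and_phase` are chosen for this example.

```python
import cmath


def dft2(f):
    """Naive 2-D DFT: F(u, v) = sum_x sum_y f(x, y) * exp(-2j*pi*(u*x/M + v*y/N))."""
    M, N = len(f), len(f[0])
    return [
        [
            sum(
                f[x][y] * cmath.exp(-2j * cmath.pi * (u * x / M + v * y / N))
                for x in range(M)
                for y in range(N)
            )
            for v in range(N)
        ]
        for u in range(M)
    ]


def amplitude_and_phase(F):
    """Return (|F(u, v)|, phi(u, v)) for a 2-D list of complex values."""
    amplitude = [[abs(c) for c in row] for row in F]
    # cmath.phase uses atan2(imag, real), which resolves the quadrant of arctan(I/R).
    phase = [[cmath.phase(c) for c in row] for row in F]
    return amplitude, phase
```

For a constant image all of the energy ends up in F(0, 0), which is a quick sanity check for the implementation.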

Further, the present application constructs a frequency projection histogram from the calculated amplitude spectrum and phase spectrum of the binary copy image. Different rotation angles of the binary copy image yield different frequency projection histograms, which together form the frequency projection histogram set of the binary copy image.

S4. Calculate the standard deviation of the peak vertices and valley points of each histogram in the frequency projection histogram set to obtain a standard deviation set, and use the rotation angle corresponding to the maximum standard deviation in the standard deviation set as the correction angle of the text image, to complete the angle correction of the text image.

In a preferred embodiment of the present application, the standard deviation of the peak vertices and valley points of a frequency projection histogram is calculated as:

$$\sigma = \sqrt{\frac{1}{n + m}\left[\sum_{i=1}^{n} (x_i - \mu)^2 + \sum_{j=1}^{m} (y_j - \mu)^2\right]}$$

where σ is the standard deviation of the frequency projection histogram, $x_i$ is the i-th peak vertex in the frequency projection histogram, n is the number of peak vertices, $y_j$ is the j-th valley point, m is the number of valley points, and μ is the mean value of all peak vertices and valley points. The resulting standard deviation reflects the degree of dispersion between the valley points and the peak vertices.

Further, the present application calculates the standard deviation of every histogram in the frequency projection histogram set to obtain the standard deviation set. According to the structural characteristics of text images, the rotation angle at which the standard deviation is largest corresponds to the best orientation of the corrected text image. This angle is taken as the correction angle of the text image, and the original image is rotated and corrected according to it.
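The angle selection can be sketched as below. The sketch assumes that peak vertices and valley points are the strict local maxima and minima of a one-dimensional histogram, that the standard deviation is the population form (division by n + m), and that the histograms are already available as a mapping from rotation angle to histogram values. The names `peaks_and_valleys`, `peak_valley_std`, and `best_angle` are hypothetical.

```python
import math


def peaks_and_valleys(hist):
    """Strict local maxima (peak vertices) and minima (valley points) of a histogram."""
    peaks, valleys = [], []
    for k in range(1, len(hist) - 1):
        if hist[k] > hist[k - 1] and hist[k] > hist[k + 1]:
            peaks.append(hist[k])
        elif hist[k] < hist[k - 1] and hist[k] < hist[k + 1]:
            valleys.append(hist[k])
    return peaks, valleys


def peak_valley_std(hist):
    """Standard deviation of all peak and valley values about their common mean."""
    peaks, valleys = peaks_and_valleys(hist)
    points = peaks + valleys
    if not points:
        return 0.0
    mu = sum(points) / len(points)
    return math.sqrt(sum((v - mu) ** 2 for v in points) / len(points))


def best_angle(histograms):
    """histograms maps rotation angle -> histogram; return the angle with the largest sigma."""
    return max(histograms, key=lambda angle: peak_valley_std(histograms[angle]))
```

A histogram with sharply separated peaks and valleys, as expected when the text lines are well aligned, gives a larger value than a nearly flat one, so its angle is selected.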

This application also provides a text image angle correction device. FIG. 2 is a schematic diagram of the internal structure of a text image angle correction device provided by an embodiment of this application.

In this embodiment, the text image angle correction device 1 may be a personal computer (PC), a terminal device such as a smart phone, a tablet computer, or a portable computer, or a server. The text image angle correction device 1 includes at least a memory 11, a processor 12, a communication bus 13, and a network interface 14.

The memory 11 includes at least one type of readable storage medium, and the readable storage medium includes flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory, etc.), magnetic memory, magnetic disk, optical disk, and the like. In some embodiments, the memory 11 may be an internal storage unit of the text image angle correction device 1, for example, a hard disk of the text image angle correction device 1. In other embodiments, the memory 11 may also be an external storage device of the text image angle correction device 1, such as a plug-in hard disk equipped on the text image angle correction device 1, a smart media card (SMC), and a secure digital (Secure Digital, SD) card, flash card (Flash Card), etc. Further, the memory 11 may also include both an internal storage unit of the text image angle correction device 1 and an external storage device. The memory 11 can be used not only to store application software and various data installed in the text image angle correction device 1, such as the code of the text image angle correction program 01, etc., but also to temporarily store data that has been output or will be output.

In some embodiments, the processor 12 may be a central processing unit (CPU), controller, microcontroller, microprocessor, or other data processing chip, and is used to run program code or process data stored in the memory 11, for example to execute the text image angle correction program 01.

The communication bus 13 is used to realize the connection and communication between these components.

The network interface 14 may optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface), and is usually used to establish a communication connection between the device 1 and other electronic devices.

Optionally, the device 1 may also include a user interface. The user interface may include a display and an input unit such as a keyboard, and may optionally also include a standard wired interface and a wireless interface. Optionally, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an organic light-emitting diode (OLED) touch device, or the like. The display may also be called a display screen or display unit, and is used to display the information processed in the text image angle correction device 1 and to display a visualized user interface.

FIG. 2 only shows the text image angle correction device 1 with components 11-14 and the text image angle correction program 01. Those skilled in the art will understand that the structure shown in FIG. 2 does not constitute a limitation on the text image angle correction device 1, which may include fewer or more components than shown, a combination of certain components, or a different arrangement of components.

In the embodiment of the apparatus 1 shown in FIG. 2, the memory 11 stores a text image angle correction program 01; when the processor 12 executes the text image angle correction program 01 stored in the memory 11, the following steps are implemented:

Step 1: Obtain a text image, and perform a preprocessing operation on the text image to obtain a binary text image.

Step 2: Detect skewed text in the binarized text image by an iterative algorithm to obtain a skewed text image, and extract the skewed text image to obtain a binary copy image.

Step 3: Perform progressive rotation on the binary copy image, convert each progressively rotated binary copy image into a frequency projection histogram, and obtain the frequency projection histogram set of the binary copy image according to the progressive rotation angles.

Step 4: Calculate the standard deviation of the peak vertices and valley points of each histogram in the frequency projection histogram set to obtain a standard deviation set, and use the rotation angle corresponding to the maximum standard deviation in the standard deviation set as the correction angle of the text image, to complete the angle correction of the text image.

Optionally, in other embodiments, the text image angle correction program can also be divided into one or more modules, which are stored in the memory 11 and executed by one or more processors (in this embodiment, the processor 12) to implement this application. A module in this application refers to a series of computer program instruction segments capable of performing a specific function, and is used to describe the execution process of the text image angle correction program in the text image angle correction device.

For example, FIG. 3 is a schematic diagram of the program modules of the text image angle correction program in an embodiment of the text image angle correction device of this application. In this embodiment, the text image angle correction program can be divided into a text image preprocessing module 10, a text image detection module 20, an image conversion module 30, and a calculation module 40. Exemplarily:

The text image preprocessing module 10 is used to obtain a text image, and perform a preprocessing operation on the text image to obtain a binary text image.

The text image detection module 20 is configured to detect skewed text in the binarized text image by an iterative algorithm to obtain a skewed text image, and crop the skewed text image to obtain a binary copy image.

The image conversion module 30 is configured to: perform progressive rotation on the binary copy image, convert each progressively rotated binary copy image into a frequency projection histogram, and obtain the frequency projection histogram set of the binary copy image according to the progressive rotation angles.

The calculation module 40 is configured to: calculate the standard deviation of the peak vertices and valley points of each histogram in the frequency projection histogram set to obtain a standard deviation set, and use the rotation angle corresponding to the maximum standard deviation in the standard deviation set as the correction angle of the text image, thereby completing the angle correction of the text image.

The functions or operation steps implemented by the program modules such as the text image preprocessing module 10, the text image detection module 20, the image conversion module 30, and the calculation module 40 when executed are substantially the same as those in the foregoing embodiment, and will not be repeated here.

In addition, an embodiment of the present application also proposes a computer-readable storage medium that stores a text image angle correction program, and the text image angle correction program can be executed by one or more processors to implement the following operations:

Calculate the standard deviation of the peak vertices and valley points of each histogram in the frequency projection histogram set to obtain a standard deviation set, and use the rotation angle corresponding to the maximum standard deviation in the standard deviation set as the correction angle of the text image, thereby completing the angle correction of the text image.

The specific implementation of the computer-readable storage medium of the present application is basically the same as the foregoing embodiments of the text image angle correction device and method, and will not be repeated here.

It should be noted that the serial numbers of the foregoing embodiments of the present application are for description only and do not represent the relative merits of the embodiments. The terms "include", "comprise", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, device, article, or method that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to the process, device, article, or method. Without further restrictions, an element defined by the phrase "including a..." does not exclude the existence of other identical elements in the process, device, article, or method that includes the element.

Through the description of the above implementations, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus the necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or in the part that contributes to the existing technology, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as ROM/RAM, a magnetic disk, or an optical disk) and includes a number of instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device, etc.) to execute the methods described in the embodiments of this application.

The above are only preferred embodiments of this application and do not limit the scope of its patent protection. Any equivalent structure or equivalent process transformation made using the content of the description and drawings of this application, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of this application.

Claims

1. A method for correcting the angle of a text image, characterized in that the method includes:

Acquiring a text image, and performing a preprocessing operation on the text image to obtain a binarized text image;

Detecting skewed text in the binarized text image by an iterative algorithm to obtain a skewed text image, and cropping the skewed text image to obtain a binary copy image;

Performing progressive rotation on the binary copy image, converting each progressively rotated binary copy image into a frequency projection histogram, and obtaining the frequency projection histogram set of the binary copy image according to the progressive rotation angles;

Calculating the standard deviation of the peak vertices and valley points of each histogram in the frequency projection histogram set to obtain a standard deviation set, and using the rotation angle corresponding to the maximum standard deviation in the standard deviation set as the correction angle of the text image, to complete the angle correction of the text image.
The method for correcting the angle of a text image according to claim 1, wherein performing the preprocessing operation on the text image to obtain the binarized text image comprises:

Denoising the text image with an adaptive image denoising filter, enhancing the contrast of the denoised text image using a contrast stretching method, and performing a thresholding operation on the contrast-enhanced text image according to the OTSU algorithm to obtain the binarized text image.
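
As an illustration of the thresholding step only, the following Python sketch computes an Otsu threshold by maximizing the between-class variance over an 8-bit histogram. It assumes a `uint8` grayscale input and maps brighter pixels to 1; the function names `otsu_threshold` and `binarize`, and the polarity of the output, are assumptions rather than details from the claims.

```python
import numpy as np


def otsu_threshold(gray):
    """Return the 8-bit threshold that maximizes between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    levels = np.arange(256)
    w0 = np.cumsum(prob)                  # weight of the class "<= t"
    w1 = 1.0 - w0                         # weight of the class "> t"
    cum_mean = np.cumsum(prob * levels)
    total_mean = cum_mean[-1]
    denom = w0 * w1
    # Between-class variance for every candidate t; empty classes score 0.
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (total_mean * w0 - cum_mean) ** 2 / denom
    sigma_b[denom == 0] = 0.0
    return int(np.argmax(sigma_b))


def binarize(gray):
    """Map pixels above the Otsu threshold to 1 and the rest to 0."""
    return (gray > otsu_threshold(gray)).astype(np.uint8)
```

For dark text on a light background, the result would typically be inverted before projection so that text pixels count as foreground.
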
The method for correcting the angle of a text image according to claim 1, wherein converting the progressively rotated binary copy image into a frequency projection histogram comprises:

Performing a Fourier transform on the progressively rotated binary copy image;

Calculating the amplitude spectrum and the phase spectrum of the Fourier-transformed binary copy image;

Constructing the frequency projection histogram according to the amplitude spectrum and the phase spectrum.
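
The first two of these steps can be sketched with NumPy's FFT routines. The sketch stops at the two spectra, because the claims do not spell out how the histogram is built from them; the function name `amplitude_and_phase` and the use of `fftshift` to center the zero frequency are assumptions.

```python
import numpy as np


def amplitude_and_phase(binary):
    """Return the centered amplitude and phase spectra of a 2-D image."""
    spectrum = np.fft.fftshift(np.fft.fft2(binary.astype(float)))
    amplitude = np.abs(spectrum)      # |F(u, v)|
    phase = np.angle(spectrum)        # phi(u, v), in radians
    return amplitude, phase
```
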
The method for correcting the angle of a text image according to claim 3, wherein the Fourier transform method comprises:

The binary copy image f(x, y) is transformed into F(u, v):

where u = 0, 1, 2, ..., M-1; v = 0, 1, 2, ..., N-1; x = 0, 1, 2, ..., M-1; y = 0, 1, 2, ..., N-1; M and N are the numbers of pixels along the length and width of the binary copy image, respectively; x and y are spatial-domain coordinates; f(x, y) is the spatial-domain sample value of the binary copy image; F(u, v) is the Fourier-transform-domain sample value of the binary copy image; and u and v are transform-domain coordinates.
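
For reference, with the symbols defined above, the standard two-dimensional discrete Fourier transform takes the following form. The unnormalized forward convention is assumed here; some texts place a factor of 1/(MN) on the forward transform instead.

```latex
F(u,v) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x,y)\,
         e^{-j 2\pi \left( \frac{ux}{M} + \frac{vy}{N} \right)}
```
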
The method for correcting the angle of a text image according to claim 4, wherein the amplitude spectrum and the phase spectrum of the Fourier-transformed binary copy image are calculated respectively, and the frequency projection histogram is constructed according to the calculated amplitude spectrum and phase spectrum of the binary copy image:

where $F(u,v) = R(u,v) + jI(u,v) = |F(u,v)|\,e^{j\varphi(u,v)}$, $|F(u,v)|$ denotes the amplitude spectrum of the binary copy image, and $\varphi(u,v)$ denotes the phase spectrum of the binary copy image.
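
Given the decomposition of F(u, v) into a real part R(u, v) and an imaginary part I(u, v), the amplitude and phase spectra follow from the usual polar form of a complex number. The expressions below are the standard ones and are offered as an assumption about what the claim intends:

```latex
|F(u,v)| = \sqrt{R^{2}(u,v) + I^{2}(u,v)}, \qquad
\varphi(u,v) = \arctan\!\left( \frac{I(u,v)}{R(u,v)} \right)
```

In practice the phase is evaluated with a quadrant-aware arctangent so that the sign of R(u, v) is respected.
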
The method for correcting the angle of a text image according to any one of claims 1 to 5, wherein the method of calculating the standard deviation of the peak vertices and the peak valley points in the frequency projection histogram set comprises:

where σ denotes the standard deviation of the frequency projection histogram, x_i denotes the i-th peak vertex in the frequency projection histogram, n denotes the number of peak vertices in the frequency projection histogram, y_j denotes the j-th peak valley point in the frequency projection histogram, m denotes the number of peak valley points in the frequency projection histogram, and μ is the mean value of all peak vertices and peak valley points.
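
One expression consistent with these symbol definitions is the population standard deviation taken over the pooled peak vertices and peak valley points. This is an assumed reading; the original formula may normalize differently, for example by n + m - 1.

```latex
\mu = \frac{1}{n+m} \left( \sum_{i=1}^{n} x_i + \sum_{j=1}^{m} y_j \right), \qquad
\sigma = \sqrt{ \frac{1}{n+m} \left[ \sum_{i=1}^{n} (x_i - \mu)^2
         + \sum_{j=1}^{m} (y_j - \mu)^2 \right] }
```
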
The method for correcting the angle of a text image according to claim 2, wherein the contrast stretching is a conversion of gray values, and the calculation formula of the gray value is:

$D_b = f(D_a) = a \cdot D_a + b$

where a is the linear slope and b is the intercept on the Y axis. When a > 1, the contrast of the output image is enhanced compared with the original image; when a < 1, the contrast of the output image is weakened compared with the original image. $D_a$ denotes the gray value of the input image, and $D_b$ denotes the gray value of the output image.
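
The linear gray-value mapping can be sketched in a few lines of Python. Rounding and clipping the result to the 8-bit range [0, 255] is an assumption added for this illustration, as is the function name `contrast_stretch`; the claim itself specifies only the linear formula.

```python
import numpy as np


def contrast_stretch(gray, a, b):
    """Apply D_b = a * D_a + b, then round and clip to the 8-bit range."""
    stretched = a * gray.astype(float) + b
    return np.clip(np.round(stretched), 0, 255).astype(np.uint8)
```

With a = 2 and b = -100, input gray values 100 and 150 map to 100 and 200, so the difference between them doubles, which is the contrast enhancement described for a > 1.
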
A text image angle correction device, characterized in that the device includes a memory and a processor, the memory stores a text image angle correction program that can be run on the processor, and when the text image angle correction program is executed by the processor, the following steps are implemented:

Acquiring a text image, and performing a preprocessing operation on the text image to obtain a binarized text image;

Detecting skewed text in the binarized text image by an iterative algorithm to obtain a skewed text image, and cropping the skewed text image to obtain a binary copy image;

Performing progressive rotation on the binary copy image, converting the progressively rotated binary copy image into a frequency projection histogram, and obtaining a frequency projection histogram set of the binary copy image according to the progressive rotation angles of the binary copy image;

Calculating the standard deviation of the peak vertices and peak valley points of the frequency projection histogram set to obtain a standard deviation set, and using the maximum standard deviation in the standard deviation set as the correction angle of the text image, thereby completing the angle correction of the text image.
The text image angle correction device according to claim 8, wherein performing the preprocessing operation on the text image to obtain the binarized text image comprises:

Denoising the text image with an adaptive image denoising filter, enhancing the contrast of the denoised text image using a contrast stretching method, and performing a thresholding operation on the contrast-enhanced text image according to the OTSU algorithm to obtain the binarized text image.
The text image angle correction device according to claim 8, wherein converting the progressively rotated binary copy image into a frequency projection histogram comprises:

Performing a Fourier transform on the progressively rotated binary copy image;

Calculating the amplitude spectrum and the phase spectrum of the Fourier-transformed binary copy image;

Constructing the frequency projection histogram according to the amplitude spectrum and the phase spectrum.
The text image angle correction device according to claim 10, wherein the Fourier transform method comprises:

The binary copy image f(x, y) is transformed into F(u, v):

where u = 0, 1, 2, ..., M-1; v = 0, 1, 2, ..., N-1; x = 0, 1, 2, ..., M-1; y = 0, 1, 2, ..., N-1; M and N are the numbers of pixels along the length and width of the binary copy image, respectively; x and y are spatial-domain coordinates; f(x, y) is the spatial-domain sample value of the binary copy image; F(u, v) is the Fourier-transform-domain sample value of the binary copy image; and u and v are transform-domain coordinates.
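
To make the index ranges above concrete, the following Python sketch evaluates a two-dimensional discrete Fourier transform directly from its double-sum definition, written as two matrix products. It assumes the standard unnormalized forward transform with kernel exp(-j2π(ux/M + vy/N)); the function name `dft2` is hypothetical.

```python
import numpy as np


def dft2(f):
    """Evaluate the unnormalized 2-D DFT of f by direct matrix products."""
    M, N = f.shape
    x = np.arange(M)
    y = np.arange(N)
    # W_M[u, x] = exp(-j 2 pi u x / M); W_N[y, v] = exp(-j 2 pi v y / N).
    W_M = np.exp(-2j * np.pi * np.outer(x, x) / M)
    W_N = np.exp(-2j * np.pi * np.outer(y, y) / N)
    # F[u, v] = sum over x and y of W_M[u, x] * f[x, y] * W_N[y, v].
    return W_M @ f @ W_N
```

Under that assumption the result agrees with `np.fft.fft2(f)` up to floating-point error, which is the faster choice for real images.
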
The text image angle correction device according to claim 11, wherein the amplitude spectrum and the phase spectrum of the Fourier-transformed binary copy image are calculated respectively, and the frequency projection histogram is constructed according to the calculated amplitude spectrum and phase spectrum of the binary copy image:

where $F(u,v) = R(u,v) + jI(u,v) = |F(u,v)|\,e^{j\varphi(u,v)}$, $|F(u,v)|$ denotes the amplitude spectrum of the binary copy image, and $\varphi(u,v)$ denotes the phase spectrum of the binary copy image.
The text image angle correction device according to any one of claims 8 to 12, wherein the method of calculating the standard deviation of the peak vertices and the peak valley points in the frequency projection histogram set comprises:

where σ denotes the standard deviation of the frequency projection histogram, x_i denotes the i-th peak vertex in the frequency projection histogram, n denotes the number of peak vertices in the frequency projection histogram, y_j denotes the j-th peak valley point in the frequency projection histogram, m denotes the number of peak valley points in the frequency projection histogram, and μ is the mean value of all peak vertices and peak valley points.
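
A possible reading of this computation is sketched below in Python. It treats strict local maxima of a one-dimensional histogram as peak vertices and strict local minima as peak valley points, then takes the population standard deviation of the pooled values about their common mean. The detection rule, the normalization by n + m, and the name `peak_valley_std` are all assumptions.

```python
import numpy as np


def peak_valley_std(hist):
    """Standard deviation over the local maxima and minima of a histogram."""
    h = np.asarray(hist, dtype=float)
    left, mid, right = h[:-2], h[1:-1], h[2:]
    peaks = mid[(mid > left) & (mid > right)]      # strict local maxima
    valleys = mid[(mid < left) & (mid < right)]    # strict local minima
    points = np.concatenate([peaks, valleys])
    if points.size == 0:
        return 0.0
    mu = points.mean()                             # mean of all n + m points
    return float(np.sqrt(np.mean((points - mu) ** 2)))
```

A histogram with tall peaks and deep valleys, such as `[0, 10, 0, 10, 0]`, scores higher than a flatter one such as `[4, 6, 4, 6, 4]`, which is why a larger value indicates better-aligned text lines.
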
The text image angle correction device according to claim 9, wherein the contrast stretching is a conversion of gray values, and the calculation formula of the gray value is:

$D_b = f(D_a) = a \cdot D_a + b$

where a is the linear slope and b is the intercept on the Y axis. When a > 1, the contrast of the output image is enhanced compared with the original image; when a < 1, the contrast of the output image is weakened compared with the original image. $D_a$ denotes the gray value of the input image, and $D_b$ denotes the gray value of the output image.
A computer-readable storage medium, characterized in that a text image angle correction program is stored on the computer-readable storage medium, and the text image angle correction program can be executed by one or more processors to implement the steps of the method for correcting the angle of a text image according to any one of claims 1 to 5:

Acquiring a text image, and performing a preprocessing operation on the text image to obtain a binarized text image;

Detecting skewed text in the binarized text image by an iterative algorithm to obtain a skewed text image, and cropping the skewed text image to obtain a binary copy image;

Performing progressive rotation on the binary copy image, converting the progressively rotated binary copy image into a frequency projection histogram, and obtaining a frequency projection histogram set of the binary copy image according to the progressive rotation angles of the binary copy image;

Calculating the standard deviation of the peak vertices and peak valley points of the frequency projection histogram set to obtain a standard deviation set, and using the maximum standard deviation in the standard deviation set as the correction angle of the text image, thereby completing the angle correction of the text image.
The computer-readable storage medium according to claim 15, wherein performing the preprocessing operation on the text image to obtain the binarized text image comprises:

Denoising the text image with an adaptive image denoising filter, enhancing the contrast of the denoised text image using a contrast stretching method, and performing a thresholding operation on the contrast-enhanced text image according to the OTSU algorithm to obtain the binarized text image.
The computer-readable storage medium according to claim 15, wherein converting the progressively rotated binary copy image into a frequency projection histogram comprises:

Performing a Fourier transform on the progressively rotated binary copy image;

Calculating the amplitude spectrum and the phase spectrum of the Fourier-transformed binary copy image;

Constructing the frequency projection histogram according to the amplitude spectrum and the phase spectrum.
The computer-readable storage medium according to claim 17, wherein the Fourier transform method comprises:

The binary copy image f(x, y) is transformed into F(u, v):

where u = 0, 1, 2, ..., M-1; v = 0, 1, 2, ..., N-1; x = 0, 1, 2, ..., M-1; y = 0, 1, 2, ..., N-1; M and N are the numbers of pixels along the length and width of the binary copy image, respectively; x and y are spatial-domain coordinates; f(x, y) is the spatial-domain sample value of the binary copy image; F(u, v) is the Fourier-transform-domain sample value of the binary copy image; and u and v are transform-domain coordinates.
The computer-readable storage medium according to claim 18, wherein the amplitude spectrum and the phase spectrum of the Fourier-transformed binary copy image are calculated respectively, and the frequency projection histogram is constructed according to the calculated amplitude spectrum and phase spectrum of the binary copy image:

where $F(u,v) = R(u,v) + jI(u,v) = |F(u,v)|\,e^{j\varphi(u,v)}$, $|F(u,v)|$ denotes the amplitude spectrum of the binary copy image, and $\varphi(u,v)$ denotes the phase spectrum of the binary copy image.
The computer-readable storage medium according to claim 19, wherein the method of calculating the standard deviation of the peak vertices and the peak valley points in the frequency projection histogram set comprises:

where σ denotes the standard deviation of the frequency projection histogram, x_i denotes the i-th peak vertex in the frequency projection histogram, n denotes the number of peak vertices in the frequency projection histogram, y_j denotes the j-th peak valley point in the frequency projection histogram, m denotes the number of peak valley points in the frequency projection histogram, and μ is the mean value of all peak vertices and peak valley points.