CN112257710A - Method and device for detecting inclination of picture with character plane - Google Patents
- Publication number
- CN112257710A (application number CN202011156715.8A)
- Authority
- CN
- China
- Prior art keywords
- picture
- value
- clustering
- slope
- straight line
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/24—Aligning, centring, orientation detection or correction of the image
- G06V10/243—Aligning, centring, orientation detection or correction of the image by compensating for image skew or non-uniform image deformations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
- Character Input (AREA)
Abstract
The application provides a method and a device for detecting the inclination of a picture with a text plane. The device comprises: a text box detection module, used for acquiring a text box corresponding to each text line in the picture to be detected and representing each text box by its four vertices; a straight line fitting module, used for fitting a straight line to the four vertices of each text box; a clustering module, used for clustering the slopes of the fitted straight lines; a central slope value acquisition module, used for averaging the line slopes within each cluster to obtain a plurality of central slope values; and a judging module, used for judging whether the picture to be detected is inclined according to the result of comparing the central slope values with a threshold. By judging picture inclination through clustering of line slopes, the application on the one hand makes inclination recognition more robust and on the other hand distinguishes in-plane inclination from depth inclination, yielding a more comprehensive detection result.
Description
Technical Field
The application belongs to the technical field of data processing, and particularly relates to a method and a device for detecting the inclination of a picture with a text plane.
Background
Fresh-food e-commerce businesses want the text on merchant menus to be recognized automatically from menu photos taken by sales staff. However, because the photos vary in quality and shooting angle, they pose a great challenge to the subsequent character recognition algorithm and also waste manpower. It is therefore desirable to use a pre-judgment algorithm that tells the salesperson in real time, at the moment of shooting, whether a picture is acceptable, that is, whether it is suitable for the character recognition algorithm, before recognition is attempted. If the picture is judged unsuitable, it can be retaken on the spot, avoiding unnecessary labor. This also improves the subsequent character recognition rate and reduces the cost of manual recognition.
Among all picture quality problems, one of the most serious is an excessively oblique shooting angle, which severely deforms the characters or makes the characters at the far end too small to recognize. Existing methods for judging the inclination of a single picture either identify all straight lines in the picture with algorithms such as the Hough transform and judge the inclination of the picture from the inclination of those lines, or detect the inclination from the blank gaps between text lines. Such methods can only recognize inclination parallel to the picture plane; they cannot recognize inclination in the depth direction, perpendicular to the picture plane.
Disclosure of Invention
In order to solve at least one of the above technical problems, the present application provides a method and a device capable of automatically determining whether a menu picture taken in a natural scene is excessively inclined. By using algorithms such as clustering, the method and the device exploit higher-level statistical information to make recognition more robust. The method is not limited to menu pictures; it is suitable for judging the inclination of pictures with a text plane in various natural scenes, such as billboards.
The application provides a method for detecting the inclination of a picture with a text plane in a first aspect, which comprises the following steps: acquiring a text box corresponding to each text line in a picture to be detected, and expressing the text box by using four vertexes of the text box where the text line is located; performing straight line fitting on four vertexes of each text box; clustering the slope of the fitted straight line; averaging the linear slopes in the clustered categories to obtain a plurality of central slope values; and judging whether the picture to be detected is inclined or not according to the comparison result of the plurality of central slope values and the threshold value.
Preferably, before clustering the slope of the fitted straight line, the method further includes: setting a first threshold value, and filtering out straight lines with slopes exceeding the first threshold value, wherein the first threshold value is selected from any value of 4-6.
Preferably, clustering the slope of the fitted straight line includes: setting not less than two cluster categories, and giving an initial central value in each category; performing initial clustering on all the straight line slopes according to Euclidean distances between all the straight line slopes and all the initial central values; and recalculating the average value of the slope of the straight line in each clustering category after the initial clustering, taking the average value as a new central value, clustering all the slope of the straight line again, and iterating for multiple times until convergence.
Preferably, the determining whether the picture to be detected is inclined includes: setting a second threshold, and judging the picture to be detected to be an inclined picture when the central slope value of the cluster category containing the most line slopes after clustering is larger than the second threshold, wherein the second threshold is selected from any value of 0.8-1.2.
Preferably, the determining whether the picture to be detected is inclined includes: setting a third threshold, selecting the maximum value and the minimum value of a plurality of central slope values obtained after clustering, obtaining a difference value, and if the difference value is greater than the third threshold, judging that the picture to be detected is a depth-gradient picture, wherein the third threshold is selected from any value of 0.2-0.3.
The second aspect of the present application provides a device for detecting an inclination of a picture with a text plane, which corresponds to the above method, and the device includes: the text box detection module is used for acquiring a text box corresponding to each text line in the picture to be detected and expressing the text box by four vertexes of the text box in which the text line is positioned; the straight line fitting module is used for performing straight line fitting on four vertexes of each text box; the clustering module is used for clustering the slope of the fitted straight line; the central slope value acquisition module is used for averaging the linear slopes in the clustered categories to obtain a plurality of central slope values; and the judging module is used for judging whether the picture to be detected inclines or not according to the comparison result of the plurality of central slope values and the threshold value.
Preferably, the device further comprises a filtering module, used for setting a first threshold and, before clustering, filtering out fitted straight lines whose slopes exceed the first threshold, wherein the first threshold is selected from any value from 4 to 6.
Preferably, the clustering module includes: a clustering parameter setting unit, used for setting at least two cluster categories and giving an initial central value for each category; an initial clustering unit, used for initially clustering all line slopes according to the Euclidean distance between each line slope and each initial central value; and an iteration unit, used for recalculating the average line slope in each cluster category after the initial clustering, taking the recalculated average as the new central value, reclustering all line slopes, and iterating until convergence.
Preferably, the determination module includes: a first judging unit, used for setting a second threshold and judging the picture to be detected to be an inclined picture when the central slope value of the cluster category containing the most line slopes after clustering is larger than the second threshold, wherein the second threshold is selected from any value of 0.8-1.2.
Preferably, the determination module includes: and the second judging unit is used for setting a third threshold, selecting the maximum value and the minimum value of the central slope values obtained after clustering, obtaining a difference value, and judging that the picture to be detected is a depth-gradient picture if the difference value is greater than the third threshold, wherein the third threshold is selected from any value of 0.2-0.3.
According to the method and the device, the image inclination is judged through the clustering algorithm of the slope of the straight lines, on one hand, the robustness of inclination identification can be enhanced by combining the clustering algorithm, on the other hand, plane inclination and depth inclination can be distinguished, and more comprehensive detection results are obtained.
Drawings
Fig. 1 is a flowchart of a method for detecting inclination of a picture with a text plane according to a preferred embodiment of the present invention.
Fig. 2 is a schematic diagram of acquiring a text box according to the embodiment shown in fig. 1 of the present application.
Detailed Description
In order to make the implementation objects, technical solutions and advantages of the present application clearer, the technical solutions in the embodiments of the present application will be described in more detail below with reference to the accompanying drawings in the embodiments of the present application. In the drawings, the same or similar reference numerals denote the same or similar elements or elements having the same or similar functions throughout. The described embodiments are some, but not all embodiments of the present application. The embodiments described below with reference to the drawings are exemplary and intended to be used for explaining the present application, and should not be construed as limiting the present application. All other embodiments obtained by a person of ordinary skill in the art without any inventive work based on the embodiments in the present application are within the scope of protection of the present application. Embodiments of the present application will be described in detail below with reference to the drawings.
The present application provides a method for detecting an inclination of a picture with a text plane, as shown in fig. 1, the method mainly includes:
Step S1, acquiring a text box corresponding to each text line in the picture to be detected, and representing each text box by its four vertices.
Step S2, fitting a straight line to the four vertices of each text box.
Step S3, clustering the slopes of the fitted straight lines.
Step S4, averaging the line slopes in each cluster category to obtain a plurality of central slope values.
Step S5, judging whether the picture to be detected is inclined according to the result of comparing the central slope values with a threshold.
In step S1, the text boxes are preferably detected using deep learning. Many algorithms for detecting text box positions exist in the machine vision field, and the invention is not limited to any particular one. For example, the EAST network, a text detection network based on deep learning, may be used. Its input is a single picture and its output is a series of polygon coordinates, each polygon corresponding to one text line in the picture. EAST is a fully convolutional network with three main parts: a feature extraction layer, a feature fusion layer and an output layer. The feature extraction layer can use any backbone network; PVANet is used in the EAST paper. The feature fusion layer up-samples and concatenates adjacent layers to obtain richer semantic and position information.
The output layer is divided into three parts: score map, RBOX, and QUAD. The score map gives the probability that each pixel belongs to a text region. RBOX outputs 5 values for each pixel: the 4 distances from the pixel to the top, right, bottom and left boundaries of the rectangle, and the rotation angle of the rectangle. QUAD outputs 8 values for each pixel: the coordinate offsets from that pixel to the four corner vertices of the quadrilateral.
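As an illustration of how the five RBOX values describe a rotated rectangle, the following minimal Python sketch converts one pixel's RBOX output into four vertices. The function name `rbox_to_vertices`, the vertex order and the sign convention of the rotation angle are assumptions made for this sketch; actual EAST implementations may use a different convention.

```python
import math

def rbox_to_vertices(px, py, top, right, bottom, left, angle):
    """Convert one pixel's RBOX output into four rectangle vertices.

    Assumed convention (not taken from the source): image coordinates with
    y pointing down, `angle` in radians, rotation applied about the pixel.
    Returns vertices in the order top-left, top-right, bottom-right,
    bottom-left.
    """
    # Corner offsets relative to the pixel before rotation.
    local = [(-left, -top), (right, -top), (right, bottom), (-left, bottom)]
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    # Rotate each offset about the pixel and translate back.
    return [
        (px + dx * cos_a - dy * sin_a, py + dx * sin_a + dy * cos_a)
        for dx, dy in local
    ]

# Unrotated example: pixel (10, 10) with distances top=2, right=5,
# bottom=3, left=4.
vertices = rbox_to_vertices(10, 10, 2, 5, 3, 4, 0.0)
```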
Finally, non-maximum suppression (NMS) is performed on all the obtained text boxes to obtain the final result, as shown in fig. 2. It should be understood that in fig. 2 the text box detection is performed by machine learning and generally relies on the spacing between characters, so a compact text line is generally recognized as one text box, whereas a text line separated by spaces is generally recognized as several text boxes.
Usually a text box contains several characters and forms an elongated rectangle, but because of other factors in the detection algorithm a compact text line is sometimes split into several text boxes, so some text boxes contain a single character and are roughly square. In every case, however, the text box is substantially a rectangle with four vertices, and therefore the text box is represented by four vertices in step S1.
In step S2, a straight line is fitted to each text box, for example by applying the least squares method to its four vertices, and the slopes of all the fitted straight lines are calculated.
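The NMS step can be sketched as follows. This is a simplified greedy NMS on axis-aligned boxes, given only to illustrate the idea of suppressing overlapping detections; EAST itself works on rotated quadrilaterals, and the function names, the box format `(x1, y1, x2, y2)` and the IoU threshold of 0.5 are assumptions of this sketch.

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it.

    Returns the indices of the kept boxes, highest score first.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep box i only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```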
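A minimal sketch of the least squares fit in step S2, written in plain Python. The function name `fit_line_slope` and the choice to return infinity for a vertical fit are assumptions of this sketch, not details given in the application.

```python
def fit_line_slope(vertices):
    """Least squares slope of the line y = k*x + b through the given points.

    `vertices` is a list of (x, y) pairs, e.g. the four text box vertices.
    Returns float('inf') when all x values coincide (vertical line), which
    is an assumption of this sketch.
    """
    n = len(vertices)
    mean_x = sum(x for x, _ in vertices) / n
    mean_y = sum(y for _, y in vertices) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in vertices)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in vertices)
    if sxx == 0:
        return float("inf")
    return sxy / sxx

# A horizontal text box and a box sheared upward to the right.
flat_slope = fit_line_slope([(0, 0), (10, 0), (10, 2), (0, 2)])
tilted_slope = fit_line_slope([(0, 0), (10, 5), (10, 7), (0, 2)])
```

In this example the four vertices of the sheared box yield a slope equal to the slope of its long edges, which is why the fitted line is a reasonable proxy for the direction of the text line.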
In step S3, the calculated slope of the straight line is clustered.
In some optional embodiments, before performing clustering, further comprising: setting a first threshold value, and filtering out straight lines with slopes exceeding the first threshold value, wherein the first threshold value is selected from any value of 4-6.
The purpose of this embodiment is to remove all lines whose slope is greater than a threshold, which is set empirically, for example to 5. Steep slopes are filtered out for two reasons. First, current text detection algorithms handle vertically arranged text poorly and have a high false detection rate on it. Second, some single characters are detected as text lines and are fitted as nearly vertical lines, which degrades the result. Both kinds of outlier are therefore removed by the threshold.
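The filtering step can be sketched as below. Comparing the absolute value of the slope with the threshold, so that steep negative slopes are also removed, and discarding non-finite slopes are assumptions of this sketch; the application only states that slopes exceeding the first threshold are filtered out.

```python
import math

def filter_slopes(slopes, first_threshold=5.0):
    """Drop near-vertical lines before clustering.

    Assumption: the magnitude of the slope is compared with the threshold,
    and infinite or NaN slopes are treated as outliers as well.
    """
    return [
        k for k in slopes
        if math.isfinite(k) and abs(k) <= first_threshold
    ]

filtered = filter_slopes([0.1, -0.2, 7.5, float("inf"), -6.0, 4.9])
```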
In some alternative embodiments, step S3 further includes:
Step S31, setting no fewer than two cluster categories, and giving an initial central value for each category.
Step S32, initially clustering all line slopes according to the Euclidean distance between each line slope and each initial central value.
Step S33, recalculating the average line slope in each cluster category after the initial clustering, taking the average as the new central value, clustering all line slopes again, and iterating until convergence.
The implementation of the application has the following beneficial effects:
(1) The inclination is judged directly by an algorithm, so the standard is uniform and uneven quality of the shot pictures is avoided.
(2) The judgment can be made in real time at the moment of shooting, so the photographer can decide whether to retake the picture.
(3) The method can be applied to natural scenes. For pictures with complex backgrounds, algorithms that rely on detecting straight lines have a high misjudgment rate, because straight lines in the background are all counted as table lines. The invention avoids such misjudgment and is not affected by lines or patterns in the background.
(4) Inclination in the depth direction, perpendicular to the picture plane, can be judged. Existing methods are limited to inclination caused by rotation parallel to the image plane. In menu pictures, however, the shooting angle is often so oblique that the menu is strongly inclined in the direction perpendicular to the picture: the side of the menu nearer the camera has clear characters, while the side farther from the camera has very small, blurred characters.
It can be understood that clustering gives the slopes more statistical significance, so the result is more robust. Here the k-means algorithm is used, and the number of categories n is set to the number of straight lines divided by 5; for example, if 20 straight lines remain in the picture after filtering, n is set to 4. k-means randomly sets n central values and assigns every straight line to one of the n classes according to its Euclidean distance to each central value. A new central value is then calculated from the slopes of the straight lines in each class, and all straight lines are reassigned. This is iterated until convergence. Eventually every line is assigned to a class, and each class has a central value, called the central slope, which is the average slope of all lines in that class.
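The clustering described above can be sketched as a one-dimensional k-means in plain Python. The function name `kmeans_slopes`, the use of integer division for "number of lines divided by 5", the lower bound of two categories (taken from the optional embodiment), the fixed random seed and the handling of empty clusters are all assumptions of this sketch.

```python
import random

def kmeans_slopes(slopes, n_clusters=None, max_iter=100, seed=0):
    """Cluster line slopes with 1-D k-means.

    Returns (centers, labels): the central slope of each category and the
    category index assigned to each input slope.
    """
    if n_clusters is None:
        # Assumption: integer division, with at least two categories.
        n_clusters = max(2, len(slopes) // 5)
    distinct = sorted(set(slopes))
    n_clusters = min(n_clusters, len(distinct))
    rng = random.Random(seed)
    # Random initial central values, drawn from the observed slopes.
    centers = rng.sample(distinct, n_clusters)
    labels = []
    for _ in range(max_iter):
        # Assign each slope to the nearest center (1-D Euclidean distance).
        labels = [
            min(range(n_clusters), key=lambda c: abs(k - centers[c]))
            for k in slopes
        ]
        # Recompute each center as the mean slope of its members.
        new_centers = []
        for c in range(n_clusters):
            members = [k for k, lab in zip(slopes, labels) if lab == c]
            new_centers.append(
                sum(members) / len(members) if members else centers[c]
            )
        if new_centers == centers:  # converged: centers no longer move
            break
        centers = new_centers
    return centers, labels

slopes = [0.0, 0.02, -0.01, 0.01, 0.5, 0.52, 0.49, 0.51, 0.5, 0.0]
centers, labels = kmeans_slopes(slopes)  # 10 lines, so 2 categories
```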
Step S4 actually obtains the center slope of each class after the iterative convergence of step S3. In step S5, the degree of inclination of the picture is predicted based on these slopes.
In some optional embodiments, determining whether the picture to be detected is tilted comprises:
setting a second threshold, and judging the picture to be detected to be an inclined picture when the central slope value of the cluster category containing the most line slopes after clustering is larger than the second threshold, wherein the second threshold is selected from any value of 0.8-1.2.
This embodiment mainly judges whether the picture is inclined within the picture plane. A second threshold is set, and when the clustered line slope exceeds the second threshold the picture is judged to be inclined within the plane. The central slope value of the cluster category containing the most line slopes is preferably used for the comparison, because it covers the largest share of the lines and therefore gives a more accurate result. In an alternative embodiment, the average of all line slopes before clustering may instead be compared with the threshold.
In this embodiment, the second threshold is generally set to 1, which means that when the central slope value c of the cluster category containing the most line slopes is greater than 1, the text plane is judged to be inclined in the direction parallel to the picture.
It can be understood that another purpose of the present application is to use the inclination judgment as a pre-screening step for a character recognition algorithm, guiding the front end to obtain higher-quality pictures. The value of the second threshold can therefore be derived from the subsequent character recognition step: the degree of inclination beyond which characters become unrecognizable can be measured statistically and used as the basis for selecting the second threshold.
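The in-plane judgment can be sketched as follows. The function name `is_plane_tilted` is hypothetical, and comparing the absolute value of the central slope with the threshold (so that a picture rotated the other way is also flagged) is an assumption of this sketch; the application states only that the central slope value is compared with the second threshold.

```python
from collections import Counter

def is_plane_tilted(centers, labels, second_threshold=1.0):
    """Judge in-plane inclination from the dominant cluster.

    `centers` holds the central slope of each category and `labels` holds
    the category index of each line, as produced by a clustering step.
    """
    if not labels:
        return False
    # Category that contains the most line slopes.
    dominant = Counter(labels).most_common(1)[0][0]
    # Assumption: use the magnitude of the central slope.
    return abs(centers[dominant]) > second_threshold

# Three of four lines fall into the steep category (central slope 1.4).
tilted = is_plane_tilted([0.05, 1.4], [1, 1, 1, 0])
# Three of four lines fall into the nearly horizontal category.
not_tilted = is_plane_tilted([0.05, 1.4], [0, 0, 0, 1])
```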
Besides identifying the inclination of the picture to be detected in the plane, the following steps can be used for judging whether the picture to be detected has the inclination in the depth direction. That is, in some optional embodiments, determining whether the picture to be detected is tilted includes:
setting a third threshold, selecting the maximum value and the minimum value of a plurality of central slope values obtained after clustering, obtaining a difference value, and if the difference value is greater than the third threshold, judging that the picture to be detected is a depth-gradient picture, wherein the third threshold is selected from any value of 0.2-0.3.
In this embodiment, when the text plane is inclined in the depth direction, perpendicular to the picture plane, parallel lines tend to converge toward a vanishing point. That is, lines that share one slope in the real world acquire different slopes in the picture when there is inclination in the depth direction. Whether the plane is inclined in the direction perpendicular to the picture can therefore be determined from the largest difference between central slopes: if the difference between the maximum and minimum central slope values over all categories is greater than the third threshold, the picture is judged to be a depth-inclined picture.
As mentioned above, since the invention serves as a picture pre-screening step for a character recognition algorithm, a suitable third threshold can likewise be derived from the recognition rate of the character recognition step; in general, the third threshold can be set to 0.25.
It can be understood that detection based on a single text line is subject to chance and easily causes misjudgment: for example, a single character may be inclined because of an artistic typeface or for other reasons unrelated to the inclination of the picture. Judging the inclination of the picture from many inclined text lines is more stable and more accurate and avoids such chance effects. By processing the slopes of the lines fitted to the text boxes with a clustering algorithm, the application improves the accuracy of picture inclination detection, distinguishes in-plane inclination from depth inclination, and obtains a more comprehensive detection result.
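The vanishing-point effect can be illustrated with a small simulation. The sketch below assumes an ideal pinhole camera and a text plane rotated about the vertical axis; the function name `projected_slope`, the camera distance and the focal length are illustrative assumptions, and the numeric spread obtained depends entirely on those assumed values.

```python
import math

def projected_slope(row_y, tilt_deg, distance=5.0, focal=1.0):
    """Image slope of a horizontal text line at height `row_y` on a plane
    rotated by `tilt_deg` about the vertical axis (pinhole camera model).
    """
    t = math.radians(tilt_deg)

    def project(x):
        # Point (x, row_y, 0) on the plane, rotated and pushed back by
        # `distance`, then projected with focal length `focal`.
        depth = x * math.sin(t) + distance
        return focal * x * math.cos(t) / depth, focal * row_y / depth

    (x1, y1), (x2, y2) = project(-1.0), project(1.0)
    return (y2 - y1) / (x2 - x1)

rows = (-1.0, 0.0, 1.0)
slopes_flat = [projected_slope(y, 0.0) for y in rows]     # no depth tilt
slopes_tilted = [projected_slope(y, 30.0) for y in rows]  # 30 degree tilt
spread = max(slopes_tilted) - min(slopes_tilted)
```

Without depth tilt all three rows keep slope 0, while with depth tilt the slopes fan out in proportion to the row height, which is the spread that the difference between the maximum and minimum central slopes measures.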
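The depth judgment itself reduces to one comparison, sketched below. The function name `is_depth_tilted` and the behaviour for fewer than two central slopes are assumptions of this sketch.

```python
def is_depth_tilted(centers, third_threshold=0.25):
    """Judge depth inclination from the spread of the central slopes."""
    if len(centers) < 2:
        # Assumption: with fewer than two categories no spread is defined.
        return False
    return max(centers) - min(centers) > third_threshold

depth_tilted = is_depth_tilted([0.0, 0.1, 0.4])      # spread 0.4
not_depth_tilted = is_depth_tilted([0.0, 0.1, 0.2])  # spread 0.2
```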
In a second aspect of the present application, a device for detecting an inclination of a picture with a text plane corresponding to the above method is provided, which mainly includes:
the text box detection module is used for acquiring a text box corresponding to each text line in the picture to be detected and expressing the text box by four vertexes of the text box in which the text line is positioned;
the straight line fitting module is used for performing straight line fitting on four vertexes of each text box;
the clustering module is used for clustering the slope of the fitted straight line;
the central slope value acquisition module is used for averaging the linear slopes in the clustered categories to obtain a plurality of central slope values;
and the judging module is used for judging whether the picture to be detected inclines or not according to the comparison result of the plurality of central slope values and the threshold value.
In some optional embodiments, the device further comprises a filtering module, used for setting a first threshold and, before clustering, filtering out fitted straight lines whose slopes exceed the first threshold, wherein the first threshold is selected from any value from 4 to 6.
In some optional embodiments, the clustering module comprises:
a clustering parameter setting unit, used for setting at least two cluster categories and giving an initial central value for each category;
the initial clustering unit is used for initially clustering all the straight line slopes according to Euclidean distances between all the straight line slopes and all the initial central values;
and the iteration unit is used for recalculating the average value of the slope of the straight line in each clustering category after the initial clustering, taking the recalculated average value as a new central value, reclustering all the slopes of the straight line, and iterating for multiple times until convergence.
In some optional embodiments, the determining module comprises:
a first judging unit, used for setting a second threshold and judging the picture to be detected to be an inclined picture when the central slope value of the cluster category containing the most line slopes after clustering is larger than the second threshold, wherein the second threshold is selected from any value of 0.8-1.2.
In some optional embodiments, the determining module comprises:
and the second judging unit is used for setting a third threshold, selecting the maximum value and the minimum value of the central slope values obtained after clustering, obtaining a difference value, and judging that the picture to be detected is a depth-gradient picture if the difference value is greater than the third threshold, wherein the third threshold is selected from any value of 0.2-0.3.
In other aspects of the present application, a computer device is provided, which includes a processor, a memory, and a computer program stored on the memory and executable on the processor, wherein the processor executes the computer program to implement the method for detecting inclination of a picture with a text plane as described above.
In this embodiment, the method can run on a GPU or directly on a CPU; the difference in running time depends mainly on the speed of the text box detection part. Currently, the EAST algorithm takes 0.3 s/frame on the GPU and 1.07 s/frame on the CPU. The development labor cost is about 2 person-days. The benefit is that when a salesperson visiting a merchant takes a menu picture, a judgment of whether the picture is acceptable can be given on the spot. If it is not, the salesperson can retake it immediately, saving the labor cost of a second visit. In addition, picture quality control improves overall picture quality and safeguards the accuracy and efficiency of subsequent information extraction from the pictures. At present, the character recognition rate for menu pictures is about 40%, mainly because picture quality is poor and excessively inclined characters cannot be recognized. With automatic inclination checking to control picture quality, the recognition rate can be raised to about 70%.
In other aspects of the present application, a readable storage medium is provided, which stores a computer program, and the computer program is used for implementing the method for detecting inclination of a picture with a text plane as described above when being executed by a processor.
In particular, according to embodiments of the present application, the processes described above with reference to the flow diagrams may be implemented as a computer software program, in particular a computer program installed on a mobile phone terminal, which is capable of interacting with a server. For example, embodiments of the present application include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated by the flow chart. The computer storage media of the present application may be computer-readable signal media or computer-readable storage media or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
In the present application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules or units described in the embodiments of the present application may be implemented by software or hardware. The described modules or units may also be provided in a processor, and in some cases the names of these modules or units do not limit the modules or units themselves.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (10)
1. A method for detecting the inclination of a picture with a text plane is characterized by comprising the following steps:
acquiring a text box corresponding to each text line in a picture to be detected, and expressing the text box by using four vertexes of the text box where the text line is located;
performing straight line fitting on four vertexes of each text box;
clustering the slopes of the fitted straight lines;
averaging the straight line slopes in each clustered category to obtain a plurality of central slope values;
and judging whether the picture to be detected is inclined or not according to the comparison result of the plurality of central slope values and the threshold value.
2. The method for detecting the inclination of the picture with the text plane according to claim 1, wherein before clustering the slopes of the fitted straight lines, the method further comprises:
setting a first threshold, and filtering out straight lines whose slopes exceed the first threshold, wherein the first threshold is any value from 4 to 6.
3. The method for detecting the inclination of the picture with the text plane according to claim 1, wherein clustering the slopes of the fitted straight lines comprises:
setting not less than two cluster categories, and giving an initial central value in each category;
performing initial clustering on all the straight line slopes according to Euclidean distances between all the straight line slopes and all the initial central values;
and recalculating the average value of the straight line slopes in each cluster category after the initial clustering, taking the average value as a new central value, clustering all the straight line slopes again, and iterating multiple times until convergence.
4. The method according to claim 1, wherein determining whether the picture to be detected is tilted comprises:
setting a second threshold, and judging the picture to be detected to be an inclined picture when the central slope value of the category that contains the most straight line slopes after clustering is greater than the second threshold, wherein the second threshold is any value from 0.8 to 1.2.
5. The method according to claim 1, wherein determining whether the picture to be detected is tilted comprises:
setting a third threshold, taking the difference between the maximum value and the minimum value of the plurality of central slope values obtained after clustering, and judging the picture to be detected to be a depth-gradient picture if the difference is greater than the third threshold, wherein the third threshold is any value from 0.2 to 0.3.
6. A device for detecting the inclination of a picture with a text plane, characterized by comprising:
the text box detection module is used for acquiring a text box corresponding to each text line in the picture to be detected and expressing the text box by four vertexes of the text box in which the text line is positioned;
the straight line fitting module is used for performing straight line fitting on four vertexes of each text box;
the clustering module is used for clustering the slopes of the fitted straight lines;
the central slope value acquisition module is used for averaging the straight line slopes in each clustered category to obtain a plurality of central slope values;
and the judging module is used for judging whether the picture to be detected inclines or not according to the comparison result of the plurality of central slope values and the threshold value.
7. The device for detecting the inclination of a picture with a text plane according to claim 6, further comprising a filtering module for filtering out, before clustering, the fitted straight lines whose slopes exceed a set first threshold, wherein the first threshold is any value from 4 to 6.
8. The device for detecting the inclination of a picture with a text plane according to claim 6, wherein the clustering module comprises:
the clustering parameter setting unit is used for setting at least two cluster categories and giving an initial central value in each category;
the initial clustering unit is used for initially clustering all the straight line slopes according to Euclidean distances between all the straight line slopes and all the initial central values;
and the iteration unit is used for recalculating the average value of the straight line slopes in each cluster category after the initial clustering, taking the recalculated average value as a new central value, clustering all the straight line slopes again, and iterating multiple times until convergence.
9. The device for detecting the inclination of a picture with a text plane according to claim 6, wherein the judging module comprises:
the first judging unit is used for setting a second threshold, and judging the picture to be detected to be an inclined picture when the central slope value of the category that contains the most straight line slopes after clustering is greater than the second threshold, wherein the second threshold is any value from 0.8 to 1.2.
10. The device for detecting the inclination of a picture with a text plane according to claim 6, wherein the judging module comprises:
the second judging unit is used for setting a third threshold, taking the difference between the maximum value and the minimum value of the central slope values obtained after clustering, and judging the picture to be detected to be a depth-gradient picture if the difference is greater than the third threshold, wherein the third threshold is any value from 0.2 to 0.3.
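For illustration only, the steps recited in claims 1 to 5 can be sketched in Python with NumPy. This is a minimal sketch under stated assumptions, not the patented implementation: the function names (`fit_slope`, `kmeans_1d`, `judge_picture`), the default initial central values `(0.0, 0.5)`, the use of absolute slope values in the comparisons, and the exclusion of empty categories from the difference calculation are all hypothetical choices. The default thresholds are picked from inside the claimed ranges (4 to 6, 0.8 to 1.2, 0.2 to 0.3). Text boxes are assumed to arrive as four `(x, y)` vertices from an upstream text detector.

```python
import numpy as np


def fit_slope(box):
    """Least-squares slope of the line fitted to the four vertices of a text box."""
    pts = np.asarray(box, dtype=float)           # shape (4, 2): one (x, y) per vertex
    xs, ys = pts[:, 0], pts[:, 1]
    if xs.max() - xs.min() < 1e-9:               # all x equal: treat as a vertical line
        return float("inf")
    slope, _intercept = np.polyfit(xs, ys, 1)
    return float(slope)


def kmeans_1d(values, init_centers, max_iter=100, tol=1e-6):
    """Cluster scalar slopes around the given initial central values."""
    values = np.asarray(values, dtype=float)
    centers = np.asarray(init_centers, dtype=float).copy()
    for _ in range(max_iter):
        # In one dimension the Euclidean distance is the absolute difference.
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        new_centers = centers.copy()
        for k in range(len(centers)):
            members = values[labels == k]
            if members.size:                     # an empty category keeps its old center
                new_centers[k] = members.mean()
        converged = np.allclose(new_centers, centers, atol=tol)
        centers = new_centers
        if converged:
            break
    return centers, labels


def judge_picture(boxes, t1=5.0, t2=1.0, t3=0.25, init_centers=(0.0, 0.5)):
    """Return the central slope values and the two judgments for one picture."""
    slopes = np.array([fit_slope(b) for b in boxes], dtype=float)
    # Assumption: the first threshold is applied to the absolute slope.
    slopes = slopes[np.abs(slopes) <= t1]
    if slopes.size == 0:
        return {"centers": [], "tilted": False, "depth_gradient": False}
    centers, labels = kmeans_1d(slopes, init_centers)
    counts = np.bincount(labels, minlength=len(centers))
    occupied = centers[counts > 0]               # assumption: ignore empty categories
    dominant = centers[np.argmax(counts)]        # category holding the most slopes
    return {
        "centers": occupied.tolist(),
        "tilted": bool(abs(dominant) > t2),
        "depth_gradient": bool(occupied.max() - occupied.min() > t3),
    }
```

As a usage example, calling `judge_picture` on a list of boxes such as `[(0, 0), (10, 0), (10, 2), (0, 2)]` (an axis-aligned text line) returns a dictionary with the keys `centers`, `tilted`, and `depth_gradient`. Because this clustering converges to a local optimum, the result depends on the chosen initial central values, so in practice they would need to be tuned to the expected slope range.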
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011156715.8A CN112257710B (en) | 2020-10-26 | 2020-10-26 | Picture gradient detection method and device with text plane |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112257710A | 2021-01-22 |
CN112257710B | 2024-09-24 |
Family
ID=74261249
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011156715.8A Active CN112257710B (en) | 2020-10-26 | 2020-10-26 | Picture gradient detection method and device with text plane |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112257710B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118397242A (en) * | 2024-05-23 | 2024-07-26 | 南京云阶电力科技有限公司 | Table picture inclination judging and correcting method and device based on k-means clustering |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2005056346A (en) * | 2003-08-07 | 2005-03-03 | Ricoh Co Ltd | Skew-detecting method, skew-detecting device and program |
CN108805131A (en) * | 2018-05-22 | 2018-11-13 | 北京旷视科技有限公司 | Text line detection method, apparatus and system |
CN110705233A (en) * | 2019-09-03 | 2020-01-17 | 平安科技(深圳)有限公司 | Note generation method and device based on character recognition technology and computer equipment |
CN111325199A (en) * | 2018-12-14 | 2020-06-23 | 中移(杭州)信息技术有限公司 | Character inclination angle detection method and device |
CN111553344A (en) * | 2020-04-17 | 2020-08-18 | 携程旅游信息技术(上海)有限公司 | Method, system, device and storage medium for correcting inclination of text image |
Non-Patent Citations (3)
Title |
---|
DEHGHAN, M: "Unconstrained Farsi handwritten word recognition using fuzzy vector quantization and hidden Markov models", PATTERN RECOGNITION LETTERS, vol. 22, no. 2, 1 February 2001 (2001-02-01), pages 209 - 214, XP004315122, DOI: 10.1016/S0167-8655(00)00090-8 * |
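Lei Chaoyang; Liu Junhua: "SOM-based skew correction of license plate numbers", Journal of Changsha Communications University, no. 04, 15 December 2007 (2007-12-15) * |
Wei Hongxi; Gao Guanglai: "Skew detection method for Mongolian document images", Journal of Inner Mongolia University (Natural Science Edition), no. 04, 15 July 2007 (2007-07-15) * |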
Also Published As
Publication number | Publication date |
---|---|
CN112257710B (en) | 2024-09-24 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||