CN117095418A - Table comparison method and device based on picture pixel difference - Google Patents

Table comparison method and device based on picture pixel difference

Info

Publication number
CN117095418A
Authority
CN
China
Prior art keywords
picture
font
corrected
interference
imported
Prior art date
Legal status
Granted
Application number
CN202311347976.1A
Other languages
Chinese (zh)
Other versions
CN117095418B (en)
Inventor
苑亚龙
王俊凯
周保忠
柳林祥
李思涯
刘志坚
Current Assignee
Shenzhen Xunce Technology Co ltd
Original Assignee
Shenzhen Xunce Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Xunce Technology Co ltd filed Critical Shenzhen Xunce Technology Co ltd
Priority to CN202311347976.1A priority Critical patent/CN117095418B/en
Publication of CN117095418A publication Critical patent/CN117095418A/en
Application granted granted Critical
Publication of CN117095418B publication Critical patent/CN117095418B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06V 30/412: Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
    • G06V 30/155: Removing patterns interfering with the pattern to be recognised, such as ruled lines or underlines
    • G06V 30/1801: Detecting partial patterns, e.g. edges or contours, or configurations, e.g. loops, corners, strokes or intersections
    • G06V 30/19173: Classification techniques
    • G06V 10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
    • G06V 10/75: Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; coarse-fine approaches, e.g. multi-scale approaches; using context analysis; selection of dictionaries
    • G06V 10/764: Image or video recognition or understanding using classification, e.g. of video objects
    • G06V 10/82: Image or video recognition or understanding using neural networks
    • G06N 3/0464: Convolutional networks [CNN, ConvNet]
    • G06N 3/048: Activation functions
    • G06N 3/08: Learning methods
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to the field of picture detection and discloses a table comparison method and device based on picture pixel differences. The method comprises the following steps: detecting whether the imported table picture meets a preset normalization; performing table picture correction on the imported table picture, removing the interference table frame from the corrected table picture to obtain an interference-removed corrected picture, and detecting whether the interference-removed corrected picture contains overlapping fonts; identifying the corrected font gaps in the corrected picture after the overlapping fonts are deleted, and determining the corrected font parts in that picture; performing pixel-level enhancement processing on the corrected font parts and removing the interference pixel points from the corrected enhanced parts; performing font substance detection on the interference-removed corrected parts to obtain a detected font substance, and performing font form detection on the interference-removed corrected parts to obtain a detected font form; and comparing the table format of the imported table picture with the table format of the standard table picture to obtain a second format comparison result. The invention can improve table comparison efficiency.

Description

Table comparison method and device based on picture pixel difference
Technical Field
The invention relates to the field of image detection, in particular to a table comparison method and device based on picture pixel differences.
Background
In the existing approach, the original form template sample is used to obtain the current system data and numerical parameters, and the form template is filled in manually or by an automated collection function while parameters such as template fonts, background colours, oblique lines, bold text and positions are preserved; the resulting document is converted into a picture by a document-to-picture conversion tool and serves as the expected verification standard. After the system under test produces its form output, the document generated by the system under test is converted into a picture by the same document-to-picture conversion tool and serves as the content to be verified; this content is then compared with the expected standard by means of a picture pixel-position comparison tool, so that the current defects of the system under test can be identified.
Currently, in a typical software testing process, table-format verification has the following defects:
1. The system under test is modified across many versions, so the output table format may vary, and the tables must be compared repeatedly by visual inspection. Minute details in the table, such as small positional differences, font type differences and font size differences, are highly similar and hard for the human eye to distinguish; with so many elements and contents to compare, manual inspection is prone to omissions and consumes a great deal of time.
2. Comparison between the automatically generated table format of the system and a manually generated table format is often realised by comparing the table frames. However, because the size of the font content may differ, the table frame of the same font area is sometimes enlarged or reduced; such a frame change merely adapts to the cell content and should not be judged as an inconsistency. Many current methods nevertheless detect the table frame directly, which obviously fails when the table frames of one picture have been enlarged or reduced to fit the font content, so that the table frames of the two pictures are not exactly consistent.
3. The table data are first located and then compared, and the font content is often located from the pixel points of the font content itself; but the pixel distribution of font content is complex, which makes the localisation unstable.
4. Comparison between two tables comprises both a comparison of formats and a comparison of font contents, yet the two aspects are usually compared separately. In practice, a disordered format causes huge interference to the comparison of font contents; if the disorder of the format is not corrected and the interference of the table frame is not removed, the comparison of the font contents becomes difficult.
In summary, the conventional comparison between two tables is inefficient.
Disclosure of Invention
In order to solve the above problems, the present invention provides a table comparison method and apparatus based on the pixel difference of a picture, which can improve the table comparison efficiency.
In a first aspect, the present invention provides a table comparing method based on pixel differences of a picture, including:
acquiring data to be imported and standard table data, importing the data to be imported into a preset table template to obtain imported table data, converting the imported table data and the standard table data into imported table pictures and standard table pictures, and detecting whether the imported table pictures accord with preset normalization;
when the imported table picture does not accord with the preset normalization, a first format comparison result between the imported table picture and the standard table picture is obtained, table picture correction is carried out on the imported table picture to obtain a corrected table picture, an interference table frame in the corrected table picture is removed, an interference-removed corrected picture is obtained, and whether an overlapped font exists in the interference-removed corrected picture is detected;
when the overlapped fonts exist in the modified pictures with the interference removed, deleting the overlapped fonts to obtain modified pictures with the overlapped deleted fonts, identifying modified font gaps in the modified pictures with the overlapped deleted fonts, and determining modified font positions in the modified pictures with the overlapped deleted fonts based on the modified font gaps;
Performing pixel-level enhancement processing on the corrected font part to obtain a corrected enhanced part, and removing interference pixel points in the corrected enhanced part to obtain a corrected part with interference removed;
performing font substance detection on the interference-removed correction part to obtain a detection font substance, comparing the detection font substance with the detection font substance corresponding to the standard table data to obtain a substance comparison result, performing font form detection on the interference-removed correction part to obtain a detection font form, comparing the detection font form with the detection font form corresponding to the standard table data to obtain a form comparison result, and taking the substance comparison result and the form comparison result as a first data comparison result between the imported table picture and the standard table picture;
and when the imported table picture accords with the preset normalization, comparing the table format between the imported table picture and the standard table picture to obtain a second format comparison result, and comparing the table data between the imported table picture and the standard table picture to obtain a second data comparison result.
In a possible implementation manner of the first aspect, the detecting whether the imported table picture meets the preset normalization includes:
Converting the imported form picture into an edge binary picture;
extracting pixel points positioned at the edge from the edge binary image to obtain edge pixel points;
judging whether the edge where the edge pixel point is located is a table straight line or not;
when the edge where the edge pixel point is located is a table straight line, fitting the table straight line corresponding to the edge pixel point to obtain a fitted table straight line;
identifying the horizontal and vertical lines in the fitted form straight line using the following formula:
$x_u - x_{u+1} > x_0 \Rightarrow$ Vertical; $\quad y_u - y_{u+1} > y_0 \Rightarrow$ Horizontal
wherein Horizontal denotes a transverse (horizontal) line among the fitted table straight lines and Vertical denotes a vertical line among the fitted table straight lines; $x_u$, $y_u$ denote the horizontal and vertical coordinates of the u-th pixel point of the fitted table straight line; $x_0$, $y_0$ denote the preset thresholds on the difference between a coordinate value and its neighbouring coordinate value; $x_{u+1}$, $y_{u+1}$ denote the neighbouring coordinate values of $x_u$, $y_u$; $x_u - x_{u+1} > x_0$ indicates that when the difference between the abscissa of a pixel point of the fitted table straight line and the abscissa of its neighbouring pixel point is larger than the threshold $x_0$, the fitted table straight line is a vertical line; $y_u - y_{u+1} > y_0$ indicates that when the difference between the ordinate of a pixel point and the ordinate of its neighbouring pixel point is larger than the threshold $y_0$, the fitted table straight line is a transverse line;
detecting whether every two of the transverse lines are parallel or not by identifying the slope of the straight line of the fitting table, and detecting whether every two of the vertical lines are parallel or not to obtain a parallel detection result;
detecting whether the included angle between the transverse line and the vertical line is a right angle or not by identifying the slope of the straight line of the fitting table, so as to obtain an included angle detection result;
detecting whether the number of transverse lines which accord with the length of the preset transverse line in the transverse lines is even or not, and detecting whether the number of vertical lines which accord with the length of the preset vertical line in the vertical lines is even or not, so as to obtain a length detection result;
and determining whether the imported form picture accords with the preset standardization or not based on the parallel detection result, the included angle detection result and the length detection result.
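As a rough illustration of how the parallelism, right-angle and even-length checks above could be scripted, the following Python sketch classifies fitted line segments by their dominant extent and applies the three checks; the function name, the endpoint representation and the tolerance values are assumptions made for illustration only, not values taken from the patent.

import numpy as np

def check_table_normalization(lines, angle_tol_deg=1.0, len_tol=2.0):
    # lines: list of ((x1, y1), (x2, y2)) endpoints of fitted table straight lines.
    # Returns (parallel_ok, right_angle_ok, even_length_ok).
    horizontals, verticals = [], []
    for (x1, y1), (x2, y2) in lines:
        # classify by dominant extent (a simplification of the coordinate-difference rule)
        (horizontals if abs(x2 - x1) >= abs(y2 - y1) else verticals).append(((x1, y1), (x2, y2)))

    def angle(seg):
        (x1, y1), (x2, y2) = seg
        return np.degrees(np.arctan2(y2 - y1, x2 - x1)) % 180.0

    def all_parallel(group):
        angles = [angle(s) for s in group]
        return all(abs(a - angles[0]) < angle_tol_deg for a in angles) if angles else True

    def length(seg):
        (x1, y1), (x2, y2) = seg
        return float(np.hypot(x2 - x1, y2 - y1))

    def even_length_matches(group):
        # for every reference length, the number of lines sharing that length must be even
        lens = [length(s) for s in group]
        return all(sum(abs(l - ref) < len_tol for l in lens) % 2 == 0 for ref in lens)

    parallel_ok = all_parallel(horizontals) and all_parallel(verticals)
    right_angle_ok = all(
        abs(abs(angle(h) - angle(v)) - 90.0) < angle_tol_deg
        for h in horizontals for v in verticals)
    even_length_ok = even_length_matches(horizontals) and even_length_matches(verticals)
    return parallel_ok, right_angle_ok, even_length_ok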
In one possible implementation manner of the first aspect, the determining whether the edge where the edge pixel point is located is a table straight line includes:
constructing a curve function for judging whether the edge where the edge pixel point is located is a table straight line or not by using the following formula:
$\rho = x\cos\theta + y\sin\theta$
wherein $\rho = x\cos\theta + y\sin\theta$ is the curve function; $x$, $y$ denote the horizontal and vertical coordinates of the edge pixel point in the edge binary picture; $\theta$ denotes the angle, in the polar coordinate system established in the edge binary picture, between the horizontal axis and the perpendicular drawn from the origin to the curve function; and $\rho$ denotes the vertical distance from the origin to the curve function;
and acquiring an included angle and a vertical distance from the curve function, constructing an included angle-distance combination between the included angle and the vertical distance, and judging that the edge where the edge pixel point is located is the straight line of the table when the number of the included angle-distance combination accords with a preset number.
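The angle-distance accumulation described here is, in effect, a Hough line transform. A minimal sketch of how it might be performed with OpenCV follows; the vote threshold and the function name are illustrative assumptions rather than parameters from the patent.

import cv2
import numpy as np

def detect_table_lines(edge_binary, min_votes=200):
    # edge_binary: 8-bit single-channel picture, edge pixels 255, background 0.
    # Each returned (rho, theta) pair is an angle-distance combination whose
    # number of supporting edge pixels reached min_votes, i.e. one table straight line.
    lines = cv2.HoughLines(edge_binary, rho=1, theta=np.pi / 180, threshold=min_votes)
    return [] if lines is None else [tuple(l[0]) for l in lines]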
In a possible implementation manner of the first aspect, the performing table picture correction on the imported table picture to obtain a corrected table picture includes:
obtaining a parallel detection result, an included angle detection result, a length detection result and a fitting form straight line;
when the parallel detection result is non-parallel, carrying out first form straight line correction on the fitting form straight line to obtain a first corrected form straight line;
when the included angle detection result is not vertical, carrying out second form straight line correction on the fitting form straight line to obtain a second corrected form straight line;
when the length detection result is not even, carrying out third table straight line correction on the fitting table straight line to obtain a third corrected table straight line;
And determining the correction table picture based on the first correction table straight line, the second correction table straight line and the third correction table straight line.
In a possible implementation manner of the first aspect, the performing a second table straight line correction on the fitted table straight line to obtain a second corrected table straight line includes:
acquiring an included angle between a horizontal line and a vertical line in the fitting table straight line, and identifying an acute angle in the included angle between the horizontal line and the vertical line and an intersection point between the corresponding horizontal line and the vertical line;
calculating the offset distance of the intersection point by using the following formula:
wherein $d$ denotes the offset distance; $\alpha$ denotes the acute angle among the angles between the horizontal line and the vertical line; and $L$ denotes the side of that acute angle which lies on the vertical line;
and based on the offset distance, performing offset in the horizontal direction on an end point which coincides with the intersection point in the end points of the vertical lines, performing offset in the vertical direction on an end point which does not coincide with the intersection point in the end points of the vertical lines, performing offset in the horizontal direction on the vertical lines which are parallel to the vertical lines and meet the acute angle direction, obtaining corrected vertical lines, and taking the corrected vertical lines as the second correction table straight lines.
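The offset formula itself appears in the patent only as an image. Purely for illustration, the sketch below assumes that the offset equals the horizontal projection L·cos(α) of the edge lying on the skewed vertical line; the function name and this assumption are ours, not the patent's.

import numpy as np

def straighten_vertical(p_near, p_far, alpha_deg):
    # p_near: endpoint of the skewed vertical line coinciding with the intersection point.
    # p_far:  the other endpoint of the skewed vertical line.
    # alpha_deg: acute angle between the horizontal line and the skewed vertical line.
    p_near = np.asarray(p_near, dtype=float)
    p_far = np.asarray(p_far, dtype=float)
    L = np.linalg.norm(p_far - p_near)
    d = L * np.cos(np.radians(alpha_deg))                                    # assumed offset distance
    new_near = p_near + np.array([np.sign(p_far[0] - p_near[0]) * d, 0.0])   # horizontal offset of the near endpoint
    new_far = new_near + np.array([0.0, np.sign(p_far[1] - p_near[1]) * L])  # vertical offset keeps the length
    return new_near, new_far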
In one possible implementation manner of the first aspect, the removing the interference table frame in the correction table picture to obtain a correction picture with interference removed includes:
dividing a rectangular area in the correction table picture;
calculating the association degree between the non-central pixel point and the central pixel point in the rectangular area by using the following formula:
wherein the terms of the formula denote, respectively: the degree of association between a non-centre pixel point and the centre pixel point of the rectangular region; $v_0$, the centre pixel point of the rectangular region; $v_i$, the i-th non-centre pixel point of the rectangular region; $I$, the pixel grey value; the initial degree of association; the degree of association after screening; and $n$, the number of minimum values (min) selected while computing the degree of association;
when the association degree is greater than a preset association degree, identifying an interference table frame in the correction table picture by using the following formula:
wherein the terms of the formula denote, respectively: the interference table frame; the pixel matrix of the correction table picture; the structural element in the horizontal direction formed by the pixel points selected from the non-centre pixel points whose degree of association is greater than the preset degree of association; and the structural element in the vertical direction formed in the same way;
and deleting the pixel points of the interference table frame in the correction table picture to obtain the correction picture without the interference.
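Extracting and deleting long horizontal and vertical strokes with directional structuring elements can be sketched with OpenCV morphology as below; the structuring-element lengths and threshold settings are illustrative assumptions, and the routine is not the patent's own association-degree computation.

import cv2

def remove_table_frame(corrected_gray, h_size=25, v_size=25):
    # corrected_gray: 8-bit grayscale correction table picture (dark strokes on light background).
    binary = cv2.adaptiveThreshold(cv2.bitwise_not(corrected_gray), 255,
                                   cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 15, -2)
    h_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (h_size, 1))   # horizontal structural element
    v_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, v_size))   # vertical structural element
    frame = cv2.bitwise_or(cv2.morphologyEx(binary, cv2.MORPH_OPEN, h_kernel),
                           cv2.morphologyEx(binary, cv2.MORPH_OPEN, v_kernel))
    cleaned = corrected_gray.copy()
    cleaned[frame > 0] = 255          # delete the frame pixels by painting them background white
    return cleaned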
In a possible implementation manner of the first aspect, the performing pixel-level enhancement processing on the corrected font part to obtain a corrected enhanced part includes:
and comparing the global variance with the local variance of the corrected font part by using the following formula to obtain a variance comparison result:
wherein the terms of the formula denote, respectively: the variance comparison result; the local variance of a local region that is centred on the pixel at the given coordinates of the corrected font part and whose area is smaller than that of the corrected font part; the global variance; and a parameter smaller than 0.5;
selecting a region to be enhanced from the corrected font part based on the variance comparison result;
and carrying out pixel-level enhancement processing on the region to be enhanced by using the following formula to obtain a pixel-level enhancement region:
wherein the terms of the formula denote, respectively: the pixel-level enhancement region; the region to be enhanced; the pixel coordinates in the region to be enhanced; and a preset pixel-level enhancement coefficient;
and taking the corrected font part containing the pixel-level enhancement region as the corrected enhancement part.
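One way to realise a local-variance-driven enhancement of this kind is sketched below; the window size, the fraction of the global variance and the enhancement gain are illustrative stand-ins for the preset parameters mentioned above, not the patent's values.

import cv2
import numpy as np

def enhance_low_contrast(font_region, win=7, ratio=0.4, gain=2.0):
    # font_region: grayscale corrected font part. Pixels whose local variance falls below
    # ratio * global variance are treated as the region to be enhanced.
    img = font_region.astype(np.float32)
    mean = cv2.blur(img, (win, win))
    local_var = cv2.blur(img * img, (win, win)) - mean * mean
    to_enhance = local_var < ratio * img.var()          # variance comparison result
    enhanced = img.copy()
    enhanced[to_enhance] = mean[to_enhance] + gain * (img[to_enhance] - mean[to_enhance])
    return np.clip(enhanced, 0, 255).astype(np.uint8)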
In one possible implementation manner of the first aspect, the removing the interference pixel point in the correction enhancing portion, to obtain a correction portion with interference removed, includes:
constructing a pixel point matrix for detecting interference pixels in the modified enhancement region:
wherein the four matrices are the pixel point matrices used to detect interference pixels in the correction enhancing part; one symbol denotes an arbitrary pixel value, and the other denotes the pixel value of the central pixel of the pixel point matrix;
and when the nine-square grid matrix formed by taking the pixel points in the correction enhancing part as the central pixel points accords with the pixel point matrix, deleting the central pixel points of the nine-square grid matrix to obtain the correction part for removing the interference.
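Read as an isolated-point test, the nine-square-grid check could be sketched as follows; since the four pixel point matrices are given only as images, treating the interference pixel as a foreground pixel with no foreground neighbours is an assumption made here for illustration.

import numpy as np
from scipy.ndimage import convolve

def remove_isolated_pixels(binary_font):
    # binary_font: 0/1 array of the correction enhancing part, font pixels set to 1.
    kernel = np.array([[1, 1, 1],
                       [1, 0, 1],
                       [1, 1, 1]])
    neighbour_count = convolve(binary_font.astype(np.uint8), kernel, mode="constant", cval=0)
    isolated = (binary_font == 1) & (neighbour_count == 0)   # centre pixel with an empty 3x3 surround
    cleaned = binary_font.copy()
    cleaned[isolated] = 0                                    # delete the centre pixel
    return cleaned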
In one possible implementation manner of the first aspect, the detecting the font form of the modified portion with the interference removed to obtain a detected font form includes:
Inputting the correction part with the interference removed into a font form detection model;
in the font form detection model, extracting the font characteristics of the correction part without the interference by using the following formula to obtain extracted font characteristics:
wherein the terms of the formula denote, respectively: the extracted font features; the interference-removed correction part of the given input size; the convolution layer of the font form detection model; the pooling layer of the font form detection model; the normalization layer of the font form detection model; the activation function of the font form detection model; and the wavelet filter of the font form detection model;
and identifying the font form category corresponding to the extracted font characteristic by using a classifier in the font form detection model, and taking the font form category as the detected font form.
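A font form detection model combining convolution, pooling, normalization, an activation function and a classifier, as listed above, could look like the PyTorch sketch below; the layer sizes and the number of form classes are assumptions, and the wavelet filter mentioned in the description is not modelled here.

import torch.nn as nn

class FontFormNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),   # convolution layer
            nn.BatchNorm2d(16),                           # normalization layer
            nn.ReLU(),                                    # activation function
            nn.MaxPool2d(2),                              # pooling layer
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        self.classifier = nn.Linear(32 * 4 * 4, num_classes)   # font form classifier

    def forward(self, x):   # x: (N, 1, H, W) crops of the interference-removed correction part
        return self.classifier(self.features(x).flatten(1))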
In a second aspect, the present invention provides a table comparing device based on pixel differences of a picture, the device comprising:
the standard detection module is used for acquiring data to be imported and standard table data, importing the data to be imported into a preset table template to obtain imported table data, converting the imported table data and the standard table data into imported table pictures and standard table pictures, and detecting whether the imported table pictures accord with preset standardization or not;
The overlapping detection module is used for obtaining a first format comparison result between the imported form picture and the standard form picture when the imported form picture does not accord with the preset normalization, carrying out form picture correction on the imported form picture to obtain a corrected form picture, removing an interference form frame in the corrected form picture to obtain an interference-removed corrected picture, and detecting whether an overlapped font exists in the interference-removed corrected picture;
the position determining module is used for deleting the overlapped fonts when the overlapped fonts exist in the corrected pictures without the interference, obtaining corrected pictures with the overlapped deleted fonts, identifying corrected font gaps in the corrected pictures with the overlapped deleted fonts, and determining corrected font positions in the corrected pictures with the overlapped deleted fonts based on the corrected font gaps;
the interference removing module is used for carrying out pixel-level enhancement processing on the corrected font part to obtain a corrected enhanced part, removing interference pixel points in the corrected enhanced part and obtaining a corrected part from which interference is removed;
the first data comparison module is used for carrying out font substance detection on the correction part without interference to obtain a detection font substance, comparing the detection font substance with the detection font substance corresponding to the standard table data to obtain a substance comparison result, carrying out font form detection on the correction part without interference to obtain a detection font form, comparing the detection font form with the detection font form corresponding to the standard table data to obtain a form comparison result, and taking the substance comparison result and the form comparison result as a first data comparison result between the imported table picture and the standard table picture;
And the second data comparison module is used for comparing the table format between the imported table picture and the standard table picture to obtain a second format comparison result when the imported table picture accords with the preset normalization, and comparing the table data between the imported table picture and the standard table picture to obtain a second data comparison result.
Compared with the prior art, the technical principles and beneficial effects of this scheme are as follows:
It can be seen that, in the embodiment of the present invention, detecting whether the imported table picture meets the preset normalization avoids the problem that, when the table frame of one picture has been enlarged or reduced to fit its font content, the table frames of the two pictures are not exactly consistent: when the imported table picture does not meet the normalization, the table format can immediately be judged inconsistent with the standard table picture, which simplifies the flow of comparing the table format of the imported table picture with the table format of the standard table picture and improves the efficiency of the format comparison. Further, the embodiment of the present invention encloses the table so as to avoid an unenclosed table having an odd number of parallel lines, thereby avoiding the adverse effect of an odd parallel-line count on the efficiency of straight-line length detection. Further, the imported table picture is corrected so as to reduce the adverse effect of a disordered table format on the subsequent comparison, and the first and second table straight line corrections are applied to the fitted table straight lines so that the corrected lines neither overlap the fonts lying between them nor disturb the font areas. Secondly, the interference table frame in the corrected table picture is removed so as to eliminate the adverse effect of the table frame on character detection inside the table, thereby improving character detection efficiency; and by identifying the corrected font gaps in the corrected picture after the overlapping fonts are deleted, the font positions are identified quickly and directly from the blank-area pixel points, thereby improving font detection efficiency. Further, the embodiment of the invention performs pixel-level enhancement processing on the corrected font part so as to strengthen the contrast of the pixel points in the picture, reduce feature-recognition errors caused by inconspicuous pixel points and improve the subsequent recognition accuracy of the fonts and styles formed by those pixel points, so that fonts and styles can be recognised quickly and cleanly, thereby improving table comparison efficiency. Therefore, the table comparison method and device based on picture pixel differences can improve table comparison efficiency.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
In order to more clearly illustrate the embodiments of the invention or the technical solutions of the prior art, the drawings which are used in the description of the embodiments or the prior art will be briefly described, and it will be obvious to a person skilled in the art that other drawings can be obtained from these drawings without inventive effort.
Fig. 1 is a flow chart of a table comparing method based on pixel differences of a picture according to an embodiment of the invention;
FIG. 2 is a schematic diagram of the relationship between the acute angle, the intersection point of the corresponding horizontal line and vertical line, and the offset distance in the table comparison method based on picture pixel differences according to an embodiment of the present invention;
FIG. 3 is a flowchart illustrating another step of a table comparing method based on pixel differences of the picture according to an embodiment of the present invention;
fig. 4 is a schematic block diagram of a table comparing device based on pixel differences of pictures according to an embodiment of the invention.
Detailed Description
It should be understood that the detailed description is presented by way of example only and is not intended to limit the invention.
The embodiment of the invention provides a table comparison method based on picture pixel difference, and an execution subject of the table comparison method based on picture pixel difference comprises, but is not limited to, at least one of a server, a terminal and the like which can be configured to execute the method provided by the embodiment of the invention. In other words, the table comparing method based on the picture pixel difference may be performed by software or hardware installed in a terminal device or a server device, and the software may be a blockchain platform. The service end includes but is not limited to: a single server, a server cluster, a cloud server or a cloud server cluster, and the like. The server may be an independent server, or may be a cloud server that provides cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), and basic cloud computing services such as big data and artificial intelligence platforms.
Fig. 1 is a flowchart illustrating a table comparing method based on pixel differences of a picture according to an embodiment of the invention. The table comparison method based on the picture pixel difference depicted in fig. 1 comprises the following steps:
s1, acquiring data to be imported and standard table data, importing the data to be imported into a preset table template to obtain imported table data, converting the imported table data and the standard table data into imported table pictures and standard table pictures, and detecting whether the imported table pictures accord with preset normative or not.
In the embodiment of the invention, the data to be imported refers to non-form data generated in different business scenes, for example, in a financial business scene, including product newly-added data, distribution object information, sponsor information, external mechanism information, associated party data, asset scale data and the like; the standard form data refers to standard form data obtained by manually converting the data to be imported, such as standard Excel form data.
In the embodiment of the present invention, the preset table template refers to a blank table consistent with the table format of the standard table data, for example, a blank Excel table consistent with the table header of the standard table data.
It should be noted that, the process of importing the data to be imported into a preset form template to obtain imported form data is implemented by a tested system, where the tested system is a system for automatically generating a form according to the input data.
Optionally, the process of converting the import table data and the standard table data into the import table picture and the standard table picture refers to a process of converting a table into a picture, and may be implemented by a PDF converter.
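For instance, if the table documents have already been exported to PDF, the page-to-picture step could be done with a converter such as pdf2image, as in the hedged sketch below; the patent does not name a specific tool, so the library choice and dpi value are assumptions.

from pdf2image import convert_from_path

def table_pdf_to_pictures(pdf_path, dpi=200):
    # Render every page of the table document as a PIL image for pixel comparison.
    return convert_from_path(pdf_path, dpi=dpi)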
Further, in the embodiment of the invention, whether the imported table picture meets the preset normalization is detected in order to check whether the table format of the imported table picture is normalized in terms of parallelism, perpendicularity and length, instead of detecting the table frame directly; this avoids the problem that, when the table frame of one picture has been enlarged or reduced to fit its font content, the table frames of the two pictures are not exactly consistent. If the normalization is not met, it can be judged directly that the table data at this moment are inconsistent with the standard table picture, which simplifies the flow of comparing the table format of the imported table picture with the table format of the standard table picture and improves the efficiency of the table-format comparison.
In an embodiment of the present invention, the detecting whether the imported table picture meets a preset normalization includes: converting the imported form picture into an edge binary picture; extracting pixel points positioned at the edge from the edge binary image to obtain edge pixel points; judging whether the edge where the edge pixel point is located is a table straight line or not; when the edge where the edge pixel point is located is a table straight line, fitting the table straight line corresponding to the edge pixel point to obtain a fitted table straight line; identifying the horizontal and vertical lines in the fitted form straight line using the following formula:
$x_u - x_{u+1} > x_0 \Rightarrow$ Vertical; $\quad y_u - y_{u+1} > y_0 \Rightarrow$ Horizontal
wherein Horizontal denotes a transverse (horizontal) line among the fitted table straight lines and Vertical denotes a vertical line among the fitted table straight lines; $x_u$, $y_u$ denote the horizontal and vertical coordinates of the u-th pixel point of the fitted table straight line; $x_0$, $y_0$ denote the preset thresholds on the difference between a coordinate value and its neighbouring coordinate value; $x_{u+1}$, $y_{u+1}$ denote the neighbouring coordinate values of $x_u$, $y_u$; $x_u - x_{u+1} > x_0$ indicates that when the difference between the abscissa of a pixel point of the fitted table straight line and the abscissa of its neighbouring pixel point is larger than the threshold $x_0$, the fitted table straight line is a vertical line; $y_u - y_{u+1} > y_0$ indicates that when the difference between the ordinate of a pixel point and the ordinate of its neighbouring pixel point is larger than the threshold $y_0$, the fitted table straight line is a transverse line;
detecting whether every two of the transverse lines are parallel or not by identifying the slope of the straight line of the fitting table, and detecting whether every two of the vertical lines are parallel or not to obtain a parallel detection result; detecting whether the included angle between the transverse line and the vertical line is a right angle or not by identifying the slope of the straight line of the fitting table, so as to obtain an included angle detection result; detecting whether the number of transverse lines which accord with the length of the preset transverse line in the transverse lines is even or not, and detecting whether the number of vertical lines which accord with the length of the preset vertical line in the vertical lines is even or not, so as to obtain a length detection result; and determining whether the imported form picture accords with the preset standardization or not based on the parallel detection result, the included angle detection result and the length detection result.
The edge Binary Image refers to a Binary Image (Binary Image), and each pixel on the Binary Image has only two possible values or gray scale states, and is usually represented by black and white, B & W, a monochrome Image, and the like.
Optionally, the fitting the table straight line corresponding to the edge pixel point, and the process of obtaining the fitted table straight line refers to a curve fitting process of fitting a plurality of table straight lines into one table straight line; the process of detecting whether the number of the vertical lines, which accords with the preset vertical line length, is even is similar to the principle of detecting whether the number of the horizontal lines, which accords with the preset horizontal line length, is even, and further description is omitted here.
Further, the embodiment of the invention converts the imported table picture into an edge binary picture in order to obtain the highlighted pixel points whose brightness changes markedly, and to identify, among the edges formed where such pixel points are adjacent and similarly oriented, the special kind of edge that is a straight line segment.
In yet another embodiment of the present invention, the converting the import table picture into an edge binary picture includes: calculating a threshold value for distinguishing whether the pixel point in the imported table picture is an edge pixel point by using the following formula:
wherein $T(x,y)$ denotes the threshold used to distinguish whether the pixel point at $(x,y)$ in the imported table picture is an edge pixel point; $m(x,y)$ denotes the mean of the picture region of size $r \times r$ centred on the pixel point at coordinates $(x,y)$ in the imported table picture; $s(x,y)$ denotes the standard deviation of the picture region of size $r$ centred on that pixel point; $k$ denotes a preset parameter with $0 < k < 1$; $R$ denotes the power whose base is 2 and whose exponent is the number of grey bits of the edge binary picture; and the grey values of the pixels within the $r$-sized picture region centred on $(x,y)$ are the values from which these statistics are computed;
calculating the difference between the pixel points in the imported form picture and the corresponding average value by using the following formula:
$D(x,y) = f(x,y) - m(x,y)$
wherein $D(x,y)$ denotes the difference between the pixel point at $(x,y)$ in the imported table picture and its corresponding mean; $f(x,y)$ denotes the grey value of the pixel point at coordinates $(x,y)$ in the imported table picture; and $m(x,y)$ denotes the mean of the picture region of size $r$ centred on the pixel point at coordinates $(x,y)$ in the imported table picture;
when the difference value between the pixel point in the imported form picture and the average value corresponding to the pixel point is larger than the threshold value for distinguishing whether the pixel point in the imported form picture is an edge pixel point or not, judging that the pixel point in the imported form picture is the edge pixel point; and in the imported table picture, performing binary marking on the edge pixel points and the non-edge pixel points to obtain the edge binary picture.
Optionally, in the importing table picture, performing binary marking on the edge pixel point and the non-edge pixel point, and obtaining the edge binary picture includes: and replacing the pixel value of the edge pixel point with a binary 1, and replacing the pixel value of the non-edge pixel point with a binary 0.
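A local-threshold edge binarization along the lines just described could be sketched as follows; the Sauvola-style threshold expression, the window size r and the dynamic-range constant R are assumptions made here for illustration, since the patent's exact formula is given only as an image.

import cv2
import numpy as np

def edge_binary_picture(gray, r=15, k=0.25):
    # gray: grayscale imported table picture. Returns a 0/1 picture, 1 = edge pixel.
    img = gray.astype(np.float32)
    mean = cv2.blur(img, (r, r))                                                # local mean m(x, y)
    std = np.sqrt(np.maximum(cv2.blur(img * img, (r, r)) - mean * mean, 0.0))   # local std s(x, y)
    R = 256.0                                                                   # assumed dynamic range
    threshold = mean * (1.0 + k * (std / R - 1.0))                              # assumed Sauvola-style T(x, y)
    diff = np.abs(img - mean)                                                   # difference from the local mean
    return (diff > threshold).astype(np.uint8)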
In another embodiment of the present invention, the determining whether the edge where the edge pixel point is located is a table line includes: constructing a curve function for judging whether the edge where the edge pixel point is located is a table straight line or not by using the following formula:
$\rho = x\cos\theta + y\sin\theta$
wherein $\rho = x\cos\theta + y\sin\theta$ is the curve function; $x$, $y$ denote the horizontal and vertical coordinates of the edge pixel point in the edge binary picture; $\theta$ denotes the angle, in the polar coordinate system established in the edge binary picture, between the horizontal axis and the perpendicular drawn from the origin to the curve function; and $\rho$ denotes the vertical distance from the origin to the curve function;
and acquiring an included angle and a vertical distance from the curve function, constructing an included angle-distance combination between the included angle and the vertical distance, and judging that the edge where the edge pixel point is located is the straight line of the table when the number of the included angle-distance combination accords with a preset number.
The preset number is set according to the actual scene. With the $x$, $y$ parameters of the curve function held fixed, the angle parameters and distance parameters that satisfy the curve function are identified; each angle-distance combination satisfying the function corresponds to one straight line, so m such combinations correspond to m straight lines passing through that point. The $x$, $y$ parameters are then varied, and the angle-distance combination supported by the largest number of $(x, y)$ values is found, i.e. the straight line passing through the most edge pixel points; the preset number is set to this maximum number.
Further, in the embodiment of the present invention, before detecting whether the number of transverse lines matching the preset transverse-line length is even, the table is enclosed, so as to avoid the situation in which an unenclosed table has an odd number of parallel lines. For example, when the unenclosed table is a three-line table, it has 3 transverse lines, an odd number. Because every cell of a table is represented by a rectangle, it is possible to check whether the two parallel straight lines above and below the rectangle (an even number of straight lines) have exactly the same length; whether a table straight line has an abnormal length can therefore be detected by checking whether two lines of exactly equal length exist among the parallel straight lines. If the number of parallel straight lines in the table can be odd as well as even, the odd case adversely affects the length detection; enclosing the table so that the parallel straight lines come in even numbers avoids this adverse effect of odd parallel-line counts on the efficiency of straight-line length detection.
In another embodiment of the present invention, before detecting whether the number of transverse lines of the transverse lines, which meet a preset transverse line length, is an even number, the method further includes: when the total number of the transverse lines is not even, acquiring vertical lines in the fitting table straight line, and identifying vertical line-transverse line intersection points of the vertical lines and the transverse lines; and carrying out endpoint virtual connection on the vertical line-horizontal line intersection point in the horizontal direction.
The preset transverse line length refers to the length of a current transverse line when a certain transverse line is detected in length, and is used for inquiring whether the transverse line consistent with the length of the currently detected transverse line exists in the table.
Alternatively, the process of virtually connecting the end points of the vertical-horizontal line intersections from the horizontal direction refers to a process of connecting a plurality of vertical-horizontal line intersections on the same horizontal line by a broken line to generate one straight line.
S2, when the imported table picture does not accord with the preset normalization, a first format comparison result between the imported table picture and the standard table picture is obtained, table picture correction is carried out on the imported table picture, a corrected table picture is obtained, an interference table frame in the corrected table picture is removed, an interference-removed corrected picture is obtained, and whether overlapping fonts exist in the interference-removed corrected picture is detected.
In the embodiment of the present invention, the first format comparison result refers to a result of failure in format comparison.
Further, in the embodiment of the invention, the imported table picture is subjected to table picture correction so as to be used for converting the table format which does not accord with the specification into the table format which accords with the specification, so that adverse effects of the disordered table format on comparison efficiency of subsequent table data are reduced.
In an embodiment of the present invention, the performing table picture correction on the imported table picture to obtain a corrected table picture includes: obtaining a parallel detection result, an included angle detection result, a length detection result and a fitting form straight line; when the parallel detection result is non-parallel, carrying out first form straight line correction on the fitting form straight line to obtain a first corrected form straight line; when the included angle detection result is not vertical, carrying out second form straight line correction on the fitting form straight line to obtain a second corrected form straight line; when the length detection result is not even, carrying out third table straight line correction on the fitting table straight line to obtain a third corrected table straight line; and determining the correction table picture based on the first correction table straight line, the second correction table straight line and the third correction table straight line.
Further, in the embodiment of the invention, the first table straight line correction is applied to the fitted table straight lines so that two straight lines that are required to be parallel are corrected to be parallel according to the maximum distance between them; the two corrected parallel straight lines then do not overlap the fonts lying between them, which reduces the interference of the straight lines with font detection and improves font detection efficiency.
In yet another embodiment of the present invention, the performing a first table line correction on the fitted table line to obtain a first corrected table line includes: acquiring non-parallel fitting table straight lines and neighbor table straight lines in the fitting table straight lines; starting from the non-parallel fitting table straight line, making a vertical line perpendicular to the neighbor table straight line, selecting a vertical line with the longest length from the perpendicular lines, obtaining the longest vertical line, and obtaining a vertical line intersection point between the longest vertical line and the non-parallel fitting table straight line; and taking a vertical line which is perpendicular to the longest vertical line and passes through the intersection point of the vertical lines as a first correction table straight line after correcting the non-parallel fitting table straight line.
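A geometric reading of this construction, using only the two endpoints of the skewed line rather than every pixel, is sketched below; the simplification and the function name are ours, for illustration only.

import numpy as np

def first_line_correction(line_pts, neighbour_pts):
    # line_pts:      two endpoints of the non-parallel fitted table straight line.
    # neighbour_pts: two endpoints of its neighbour table straight line.
    p, q = (np.asarray(e, dtype=float) for e in neighbour_pts)
    d = (q - p) / np.linalg.norm(q - p)              # neighbour direction (unit vector)
    normal = np.array([-d[1], d[0]])                 # direction of the perpendiculars

    # perpendicular distance of each endpoint of the skewed line to the neighbour line
    dists = [abs(np.dot(np.asarray(e, dtype=float) - p, normal)) for e in line_pts]
    anchor = np.asarray(line_pts[int(np.argmax(dists))], dtype=float)  # endpoint of the longest perpendicular

    # corrected line: passes through 'anchor' and is parallel to the neighbour line
    return anchor, d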
Further, in the embodiment of the present invention, the second table straight line correction of the fitted table straight line straightens the line while enlarging the area on the obtuse-angle side, in the direction of the obtuse angle. An obtuse angle is usually caused by a table frame being squeezed because it contains too much content, so the area on the obtuse-angle side, i.e. the over-full table frame, needs to be enlarged. In this way, after the correction of the table straight line, the font area is guaranteed to be unaffected, and disturbance of the fonts by the format correction is avoided.
In yet another embodiment of the present invention, the performing a second table straight line correction on the fitted table straight line to obtain a second corrected table straight line includes: acquiring an included angle between a horizontal line and a vertical line in the fitting table straight line, and identifying an acute angle in the included angle between the horizontal line and the vertical line and an intersection point between the corresponding horizontal line and the vertical line; calculating the offset distance of the intersection point by using the following formula:
wherein $d$ denotes the offset distance; $\alpha$ denotes the acute angle among the angles between the horizontal line and the vertical line; and $L$ denotes the side of that acute angle which lies on the vertical line;
And based on the offset distance, performing offset in the horizontal direction on an end point which coincides with the intersection point in the end points of the vertical lines, performing offset in the vertical direction on an end point which does not coincide with the intersection point in the end points of the vertical lines, performing offset in the horizontal direction on the vertical lines which are parallel to the vertical lines and meet the acute angle direction, obtaining corrected vertical lines, and taking the corrected vertical lines as the second correction table straight lines.
Referring to fig. 2, a schematic diagram of the relationship between the acute angle of the table comparing method based on the difference of pixels of the picture and the intersection point and the offset distance between the corresponding horizontal line and the vertical line according to the embodiment of the invention is shown.
In yet another embodiment of the present invention, the performing a third table line correction on the fitted table line to obtain a third corrected table line includes: acquiring target transverse lines, the number of which is not even, of the transverse lines which accord with the preset transverse line length, and acquiring target vertical lines, the number of which is not even, of the vertical lines which accord with the preset vertical line length; inquiring transverse lines with the length difference value smaller than a preset length difference value between the target transverse line and other transverse lines in the transverse lines to obtain corrected transverse lines; the length of the target transverse line is adjusted to be the same as the length of the correction transverse line, and a final transverse line is obtained; correcting the target vertical line to obtain a final vertical line; and taking the final transverse line and the final vertical line as the third correction table straight line.
Optionally, the correction of the target vertical line to obtain the final vertical line follows the same principle as querying, among the transverse lines, a transverse line whose length differs from that of the target transverse line by less than the preset length difference to obtain the correction transverse line, and then adjusting the length of the target transverse line to equal that of the correction transverse line to obtain the final transverse line; it is therefore not described further here.
Further, the embodiment of the invention removes the adverse effect of the table frame on the character detection in the table by removing the interference table frame in the corrected table picture, thereby improving the character detection efficiency.
In an embodiment of the present invention, the removing the interference table frame in the correction table picture to obtain the correction picture with interference removed includes: dividing a rectangular area in the correction table picture; calculating the association degree between the non-central pixel point and the central pixel point in the rectangular area by using the following formula:
wherein the left-hand side of the formula represents the degree of association between a non-center pixel point and the center pixel point in the rectangular region, v_0 represents the center pixel point in the rectangular region, v_i represents the i-th non-center pixel point in the rectangular region, I represents the pixel gray value, an initial degree of association is first computed and then screened to give the final degree of association, and n represents the number of minimum values (min) selected during the calculation;
when the association degree is greater than a preset association degree, identifying an interference table frame in the correction table picture by using the following formula:
wherein X_kf represents the interference table frame, X represents the matrix formed by the pixel points of the correction table picture, Y_k represents the structural element in the horizontal direction formed by the pixel points selected from the non-center pixel points when the degree of association is greater than the preset degree of association, and Y_f represents the structural element in the vertical direction formed by the pixel points selected from the non-center pixel points when the degree of association is greater than the preset degree of association;
and deleting the pixel points of the interference table frame in the correction table picture to obtain the correction picture without the interference.
In an embodiment of the present invention, the detecting whether an overlapped font exists in the interference-removed corrected picture includes: acquiring an overlapped-picture template of the fonts in the interference-removed corrected picture; and calculating the occupation ratio of the fonts in the interference-removed corrected picture by using the following formula:
wherein the occupation ratio of the fonts in the interference-removed corrected picture is the ratio of the area occupied by the fonts in the interference-removed corrected picture to the total area of the table frame in which the fonts in the interference-removed corrected picture are located;
when the occupation ratio of the fonts in the interference-removed corrected picture is consistent with the occupation ratio of the fonts in the overlapped picture template, judging that overlapped fonts exist in the interference-removed corrected picture; and when the occupation ratio of the fonts in the interference-removed corrected picture is inconsistent with the occupation ratio of the fonts in the overlapped picture template, judging that the overlapped fonts do not exist in the interference-removed corrected picture.
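As a non-limiting illustration, the overlap check based on the occupation ratio may be sketched as follows for a binarised table cell; the comparison tolerance is an assumption, since the text above only requires the two ratios to be consistent:

    import numpy as np

    def has_overlapped_fonts(cell_img, template_ratio, tol=0.02):
        """Compare the font occupation ratio of a binarised cell image with
        the ratio of the overlapped-picture template."""
        font_area = np.count_nonzero(cell_img)     # area occupied by the fonts
        total_area = cell_img.size                 # total area of the table frame
        ratio = font_area / total_area
        return abs(ratio - template_ratio) <= tol  # consistent => overlapped fonts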
And S3, when the overlapped fonts exist in the corrected picture without the interference, deleting the overlapped fonts to obtain a corrected picture with the deleted overlapped fonts, identifying the corrected font gap in the corrected picture with the deleted overlapped fonts, and determining the corrected font position in the corrected picture with the deleted overlapped fonts based on the corrected font gap.
The embodiment of the invention identifies the corrected font gap in the corrected picture after the overlapped fonts are deleted, exploiting the fact that the pixel distribution of blank gaps is stable while the pixel distribution inside font areas is complex and unstable; the coordinates of the fonts are then determined from the identified gaps, so that interference from the font pixels themselves is avoided and the font positions can be identified quickly directly from the pixel points of the blank areas, thereby improving the font detection efficiency.
In an embodiment of the present invention, the identifying the corrected font gap in the corrected image after deleting the overlapping includes: inquiring a non-pixel point area and a pixel point area from the deleted and overlapped corrected picture; performing edge detection on the pixel point area to obtain a pixel point edge, and taking the pixel point edge as the edge of the non-pixel point area; identifying parallel and adjacent pixel point edges in the pixel point edges to obtain paired pixel point edges; calculating the edge gap of the edges of the paired pixel points by using the following formula:
Wherein, when the paired pixel point edges are two edge lines parallel in the vertical direction, the gap between the two edge lines is obtained from x_1 and x_2, the abscissas of the intersection points of a perpendicular, drawn across the two edge lines, with the 1st and 2nd edge lines respectively; when the paired pixel point edges are two edge lines parallel in the horizontal direction, the gap between the two edge lines is obtained from y_1 and y_2, the ordinates of the intersection points of a perpendicular, drawn across the two edge lines, with the 1st and 2nd edge lines respectively; the subscripts 1 and 2 denote the serial numbers of the two edge lines among the paired pixel point edges;
and taking the edge gaps of the edges of the paired pixel points as the corrected font gaps in the corrected images after deleting and overlapping.
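As a non-limiting illustration, the edge gap relation described above may be sketched as follows, where the gap is taken as the coordinate difference of the two intersection points along the common perpendicular (an interpretation of the formula, whose expression is reproduced as an image in the original):

    def edge_gap(pt_on_edge1, pt_on_edge2, parallel_in_vertical_direction):
        """Gap between two parallel pixel point edges, measured between the
        intersection points of a common perpendicular with edge 1 and edge 2."""
        (x1, y1), (x2, y2) = pt_on_edge1, pt_on_edge2
        if parallel_in_vertical_direction:
            return abs(x1 - x2)   # edges parallel in the vertical direction
        return abs(y1 - y2)       # edges parallel in the horizontal direction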
In an embodiment of the present invention, referring to fig. 3, the determining, based on the corrected font gap, the corrected font part in the corrected image after deleting the overlapping includes:
S301, constructing a gap coordinate of the corrected font gap based on the column number and the row number of the corrected font gap;
s302, acquiring the width and the height of the corrected font gap, and identifying a rectangular structure of the corrected font gap by utilizing the width and the height;
s303, determining the corrected font part in the corrected picture after deleting and overlapping according to the rectangular structure and the clearance coordinate.
Wherein the gap coordinates are characterized by a row number and a column number, for example a corrected font gap in the second row and third column; the rectangular structure includes a horizontal rectangle and a vertical rectangle, the structure being a horizontal rectangle when the width is greater than the height and a vertical rectangle when the width is not greater than the height; the position of the corrected font part is characterized by the rectangular structure together with the gap coordinates, for example, if the rectangular structure is a horizontal rectangle and the gap coordinates of that horizontal rectangle are the second row and third column, the corrected font part is located above the horizontal rectangle and can be expressed as being located above the horizontal rectangle in the second row and third column.
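As a non-limiting illustration, this positioning rule may be sketched as follows; the handling of vertical rectangles ("beside") is an assumption, since the passage above only gives the horizontal-rectangle example:

    def corrected_font_position(gap_row, gap_col, width, height):
        """Describe where the corrected font lies relative to a gap rectangle."""
        structure = "horizontal" if width > height else "vertical"
        side = "above" if structure == "horizontal" else "beside"  # "beside" is assumed
        return f"{side} the {structure} rectangle at row {gap_row}, column {gap_col}"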
And S4, performing pixel-level enhancement processing on the corrected font part to obtain a corrected enhanced part, and removing interference pixel points in the corrected enhanced part to obtain a corrected part with interference removed.
By performing pixel-level enhancement processing on the corrected font part, the embodiment of the invention enhances the contrast of the pixel points in the picture when that contrast is not obvious, thereby reducing feature recognition errors caused by indistinct pixel points, improving the accuracy of subsequently recognizing the fonts, styles and the like formed by those pixel points, and allowing the fonts and styles to be recognized quickly and cleanly, which improves the table comparison efficiency.
In an embodiment of the present invention, the performing pixel-level enhancement processing on the corrected font part to obtain a corrected enhanced part includes: and comparing the global variance with the local variance of the corrected font part by using the following formula to obtain a variance comparison result:
wherein the variance comparison result is a yes/no decision, ε(x, y) represents the local variance of a local area, centered on the pixel point with coordinates (x, y) in the corrected font part, whose size is smaller than the area of the corrected font part, the global variance is the variance over the whole corrected font part, and s represents a parameter less than 0.5;
selecting a region to be enhanced from the corrected font part based on the variance comparison result; and carrying out pixel-level enhancement processing on the region to be enhanced by using the following formula to obtain a pixel-level enhancement region:
Wherein the pixel-level enhancement region is obtained by applying a preset pixel-level enhancement coefficient to the region to be enhanced at each pixel coordinate in the region to be enhanced;
and taking the corrected font part containing the pixel-level enhancement region as the corrected enhancement part.
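As a non-limiting illustration, the variance comparison and the subsequent pixel-level enhancement may be sketched as follows; the window size, the value of s, the gain k and the way the enhancement coefficient is applied (stretching deviations from the local mean) are all assumptions of the sketch:

    import numpy as np
    from scipy.ndimage import uniform_filter

    def pixel_level_enhance(font_part, s=0.3, k=1.5, win=7):
        """Enhance the low-contrast regions of the corrected font part."""
        img = font_part.astype(np.float64)
        global_var = img.var()

        # Local variance epsilon(x, y) over a win x win neighbourhood.
        local_mean = uniform_filter(img, win)
        local_var = uniform_filter(img * img, win) - local_mean ** 2

        # Region to be enhanced: local contrast clearly below the global level.
        mask = local_var < s * global_var

        # Pixel-level enhancement with a preset coefficient k.
        enhanced = img.copy()
        enhanced[mask] = local_mean[mask] + k * (img[mask] - local_mean[mask])
        return np.clip(enhanced, 0, 255).astype(np.uint8)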
Further, by removing the interference pixel points in the correction enhancing part, the embodiment of the invention removes the redundant parts, such as the corners and tails of strokes (left-falling, right-falling, horizontal and vertical strokes), that the font style introduces into the correction enhancing part, thereby improving the efficiency of the font substance detection. The interference pixel points refer to the pixel points of such redundant parts, namely the corners and tails of strokes such as left-falling, right-falling, horizontal and vertical strokes, caused by the font style.
In an embodiment of the present invention, the removing the interference pixel points in the correction enhancing part to obtain a correction part with interference removed includes: constructing a pixel point matrix for detecting the interference pixel points in the correction enhancing part:
wherein each of the four matrices represents a candidate pixel point matrix for detecting the interference pixel points in the correction enhancing part, one entry symbol represents an arbitrary pixel value, and the other entry symbol represents the pixel value of the central pixel point in the pixel point matrix;
and when the nine-square grid matrix formed by taking the pixel points in the correction enhancing part as the central pixel points accords with the pixel point matrix, deleting the central pixel points of the nine-square grid matrix to obtain the correction part for removing the interference.
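As a non-limiting illustration, the nine-square grid matching can be sketched as follows; the 3x3 templates are represented with None meaning "arbitrary pixel value", and any concrete template contents are assumptions of the sketch:

    import numpy as np

    def remove_interference_pixels(part, templates):
        """Delete every centre pixel whose 3x3 neighbourhood matches one of
        the interference pixel point matrices (templates)."""
        out = part.copy()
        h, w = part.shape
        for y in range(1, h - 1):
            for x in range(1, w - 1):
                block = part[y - 1:y + 2, x - 1:x + 2]
                for tpl in templates:
                    if all(tpl[i][j] is None or tpl[i][j] == block[i, j]
                           for i in range(3) for j in range(3)):
                        out[y, x] = 0   # delete the centre pixel point
                        break
        return out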
S5, performing font substance detection on the interference-removed correction part to obtain a detection font substance, comparing the detection font substance with the detection font substance corresponding to the standard table data to obtain a substance comparison result, performing font form detection on the interference-removed correction part to obtain a detection font form, comparing the detection font form with the detection font form corresponding to the standard table data to obtain a form comparison result, and taking the substance comparison result and the form comparison result as a first data comparison result between the imported table picture and the standard table picture.
By performing font substance detection on the interference-removed correction part, the embodiment of the invention identifies the font category of the interference-removed correction part, which facilitates comparison with the font category in the standard table picture.
In an embodiment of the present invention, the performing font substance detection on the interference-removed correction part to obtain a detected font substance includes: inputting the interference-removed correction part into a font substance detection model; and, in the font substance detection model, calculating the font substance category probability of the interference-removed correction part by using the following formula:
wherein the font substance category probability is obtained by passing the interference-removed correction part through a single convolution layer, a single pooling layer, multiple convolution layers, multiple pooling layers and a full connection layer of the font substance detection model;
and identifying the font substance category corresponding to the font substance category probability, and taking the font substance category as the detected font substance.
The font substance category refers to the character category, that is, which character the font represents, for example the categories "have", "difficult" and "good".
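As a non-limiting illustration, a font substance detection model of the kind described above (convolution and pooling stages followed by a full connection layer that outputs class probabilities) may be sketched as follows; the input size, channel widths and number of character classes are assumptions:

    import torch
    import torch.nn as nn

    class FontSubstanceNet(nn.Module):
        """Sketch of a character-class (font substance) detector."""
        def __init__(self, num_classes=3755):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            )
            self.fc = nn.Linear(64 * 8 * 8, num_classes)

        def forward(self, x):                 # x: (N, 1, 64, 64) correction part
            x = self.features(x).flatten(1)
            return torch.softmax(self.fc(x), dim=1)   # font substance category probability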
Further, the embodiment of the invention is used for identifying the style of the fonts by detecting the font form of the correction part without the interference.
In an embodiment of the present invention, the detecting the font form of the correction portion with interference removed to obtain a detected font form includes: inputting the correction part with the interference removed into a font form detection model; in the font form detection model, extracting the font characteristics of the correction part without the interference by using the following formula to obtain extracted font characteristics:
wherein x'_{C*D} represents the extracted font features, x_{C*D} represents the interference-removed correction part of size C*D, CONV represents a convolution layer in the font form detection model, POOL represents a pooling layer in the font form detection model, BN represents a normalization layer in the font form detection model, RELU represents an activation function in the font form detection model, and HAAR represents a wavelet filter in the font form detection model;
and identifying the font form category corresponding to the extracted font characteristic by using a classifier in the font form detection model, and taking the font form category as the detected font form.
The font type category refers to style category of fonts, such as Song Ti style, seal style, regular script style and the like.
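As a non-limiting illustration, a font form detection model combining convolution, pooling, normalization, an activation function and a fixed Haar wavelet filter bank may be sketched as follows; the layer sizes, input resolution and number of style classes (e.g. Song, seal script, regular script) are assumptions:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class FontFormNet(nn.Module):
        """Sketch of a font style (font form) detector with a Haar filter bank."""
        def __init__(self, num_styles=4):
            super().__init__()
            self.conv = nn.Conv2d(1, 16, 3, padding=1)
            self.bn = nn.BatchNorm2d(16)
            # Fixed 2x2 Haar kernels (LL, LH, HL, HH), applied per channel.
            haar = torch.tensor([[[1., 1.], [1., 1.]],
                                 [[1., 1.], [-1., -1.]],
                                 [[1., -1.], [1., -1.]],
                                 [[1., -1.], [-1., 1.]]]) * 0.5
            self.register_buffer("haar", haar.unsqueeze(1))   # shape (4, 1, 2, 2)
            self.fc = nn.Linear(16 * 4 * 16 * 16, num_styles)

        def forward(self, x):                 # x: (N, 1, 64, 64) correction part
            x = F.relu(self.bn(F.max_pool2d(self.conv(x), 2)))    # (N, 16, 32, 32)
            n, c, h, w = x.shape
            x = F.conv2d(x.reshape(n * c, 1, h, w), self.haar, stride=2)
            x = x.reshape(n, c * 4, h // 2, w // 2).flatten(1)
            return self.fc(x)                 # font form (style) logits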
S6, when the imported table picture accords with the preset standardization, comparing the table format between the imported table picture and the standard table picture to obtain a second format comparison result, and comparing the table data between the imported table picture and the standard table picture to obtain a second data comparison result.
The embodiment of the invention is used for checking whether the table format of the generated import table is consistent with the standard table by comparing the table format between the import table picture and the standard table picture.
In an embodiment of the present invention, the comparing the table format between the imported table picture and the standard table picture to obtain a second format comparison result includes: performing picture blocking processing on the imported table picture to obtain a block picture; calculating multispectral vectors of the block pictures by using the following formula:
wherein the multispectral vector of a block picture is obtained by applying the discrete cosine transform (DCT) to the block picture, t represents the serial number of one block picture among the plurality of block pictures obtained by performing picture blocking processing on the imported table picture, T represents the total number of block pictures obtained by performing picture blocking processing on the imported table picture, and the per-block results are joined by a splice (concatenation) symbol;
the attention map of the multispectral vector is calculated using the following formula:
wherein the attention map is obtained by passing the multispectral vector through a fully connected network followed by an activation function;
determining the position of the table frame corresponding to the attention map; and comparing the table format between the imported table picture and the standard table picture based on the table frame position to obtain the second format comparison result.
The table frame position refers to the coordinates of the four corners of the table frame in the imported table picture. Optionally, the process of comparing the table format between the imported table picture and the standard table picture based on the table frame position to obtain the second format comparison result is: determining the second format comparison result by comparing the table frame positions between the imported table picture and the standard table picture.
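As a non-limiting illustration, computing block-wise DCT "multispectral" vectors and a simple fully connected attention over them may be sketched as follows; the block size, the number of retained low-frequency coefficients and the sigmoid activation are assumptions:

    import cv2
    import numpy as np

    def block_multispectral_vector(table_img, block=32):
        """Split the imported table picture into block pictures, take the DCT
        of each block and splice the low-frequency coefficients together."""
        h, w = table_img.shape
        parts = []
        for y in range(0, h - block + 1, block):
            for x in range(0, w - block + 1, block):
                patch = table_img[y:y + block, x:x + block].astype(np.float32)
                parts.append(cv2.dct(patch)[:4, :4].flatten())
        return np.concatenate(parts)          # spliced multispectral vector

    def attention_map(vec, weights, bias):
        """One fully connected layer plus an activation function, producing
        an attention value per block (weights and bias assumed trained)."""
        return 1.0 / (1.0 + np.exp(-(weights @ vec + bias)))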
In an embodiment of the present invention, the principle of comparing the table data between the imported table picture and the standard table picture to obtain the second data comparison result is similar to that of performing font substance detection on the interference-removed correction part to obtain a detected font substance, comparing the detected font substance with the detected font substance corresponding to the standard table data to obtain a substance comparison result, performing font form detection on the interference-removed correction part to obtain a detected font form, and comparing the detected font form with the detected font form corresponding to the standard table data to obtain a form comparison result, and is therefore not described further herein.
It can be seen that the embodiment of the present invention detects whether the imported table picture accords with the preset normalization, so as to avoid the problem that the table frames of two pictures are not completely consistent when the table frame of one picture is enlarged or reduced to fit its font content; when the imported table picture does not accord with the normalization, it can be directly judged that the table data is inconsistent with the standard table picture, which simplifies the process of comparing the table format of the imported table picture with that of the standard table picture and improves the comparison efficiency of the table format. Further, the embodiment of the present invention encapsulates the table so as to avoid the situation that the number of parallel lines of an unencapsulated table is odd, thereby avoiding the adverse effect of an odd number of parallel lines on the straight line length detection efficiency; the embodiment of the present invention also corrects the imported table picture so as to reduce the adverse effect of a disordered table format on the subsequent straight line detection efficiency, and performs the table straight line corrections on the fitted table straight lines so that subsequent processing operates on regular table lines. Secondly, the embodiment of the present invention removes the interference table frame in the corrected table picture so as to remove the adverse effect of the table frame on the detection of the characters in the table, thereby improving the character detection efficiency; by identifying the corrected font gaps in the corrected picture after the overlapped fonts are deleted, the font positions can be identified quickly directly from the blank-area pixel points, which improves the font detection efficiency. Further, the embodiment of the present invention performs pixel-level enhancement processing on the corrected font part so as to enhance the contrast of the pixel points in the picture, reduce feature recognition errors caused by indistinct pixel points and improve the accuracy of subsequently recognizing the fonts, styles and the like formed by those pixel points, so that the fonts and styles can be recognized quickly and cleanly, thereby improving the table comparison efficiency. Therefore, the table comparison method based on picture pixel differences can improve the table comparison efficiency.
FIG. 4 is a functional block diagram of a table comparing device based on pixel differences of pictures according to the present invention.
The table comparing device 400 based on picture pixel differences can be installed in an electronic device. Depending on the implemented functions, the table comparing device based on picture pixel differences may include a standard detection module 401, an overlap detection module 402, a location determination module 403, an interference removal module 404, a first data comparing module 405, and a second data comparing module 406. A module of the invention, which may also be referred to as a unit, refers to a series of computer program segments that are stored in the memory of the electronic device, can be executed by the processor of the electronic device, and perform a fixed function.
In the embodiment of the present invention, the functions of each module/unit are as follows:
the standard detection module 401 is configured to obtain data to be imported and standard table data, import the data to be imported into a preset table template to obtain imported table data, convert the imported table data and the standard table data into an imported table picture and a standard table picture, and detect whether the imported table picture accords with a preset standardization;
The overlap detection module 402 is configured to obtain a first format comparison result between the imported table picture and the standard table picture when the imported table picture does not conform to the preset normalization, perform table picture correction on the imported table picture to obtain a corrected table picture, remove an interference table frame in the corrected table picture, obtain an interference-removed corrected picture, and detect whether an overlapped font exists in the interference-removed corrected picture;
the location determining module 403 is configured to, when the overlapping fonts exist in the modified image with interference removed, delete the overlapping fonts to obtain a modified image with overlapping deleted, identify a modified font gap in the modified image with overlapping deleted, and determine a modified font location in the modified image with overlapping deleted based on the modified font gap;
the interference removing module 404 is configured to perform pixel level enhancement processing on the corrected font part to obtain a corrected enhanced part, remove interference pixels in the corrected enhanced part, and obtain a corrected part from which interference is removed;
the first data comparing module 405 is configured to perform font parenchyma detection on the modified location with interference removed to obtain a detected font parenchyma, compare the detected font parenchyma with the detected font parenchyma corresponding to the standard table data to obtain a parenchyma comparison result, perform font form detection on the modified location with interference removed to obtain a detected font form, compare the detected font form with the detected font form corresponding to the standard table data to obtain a form comparison result, and use the parenchyma comparison result and the form comparison result as a first data comparison result between the imported table picture and the standard table picture;
The second data comparison module 406 is configured to compare the table format between the imported table picture and the standard table picture to obtain a second format comparison result when the imported table picture meets the preset normalization, and compare the table data between the imported table picture and the standard table picture to obtain a second data comparison result.
In detail, the modules in the table comparing device 400 based on picture pixel difference in the embodiment of the present invention use the same technical means as the table comparing method based on picture pixel difference described in fig. 1 to 3 and can generate the same technical effects, which are not described herein.
The foregoing is only a specific embodiment of the invention to enable those skilled in the art to understand or practice the invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A table comparison method based on picture pixel differences, the method comprising:
acquiring data to be imported and standard table data, importing the data to be imported into a preset table template to obtain imported table data, converting the imported table data and the standard table data into imported table pictures and standard table pictures, and detecting whether the imported table pictures accord with preset normalization;
when the imported table picture does not accord with the preset normalization, a first format comparison result between the imported table picture and the standard table picture is obtained, table picture correction is carried out on the imported table picture to obtain a corrected table picture, an interference table frame in the corrected table picture is removed, an interference-removed corrected picture is obtained, and whether an overlapped font exists in the interference-removed corrected picture is detected;
when the overlapped fonts exist in the modified pictures with the interference removed, deleting the overlapped fonts to obtain modified pictures with the overlapped deleted fonts, identifying modified font gaps in the modified pictures with the overlapped deleted fonts, and determining modified font positions in the modified pictures with the overlapped deleted fonts based on the modified font gaps;
Performing pixel-level enhancement processing on the corrected font part to obtain a corrected enhanced part, and removing interference pixel points in the corrected enhanced part to obtain a corrected part with interference removed;
performing font substance detection on the interference-removed correction part to obtain a detection font substance, comparing the detection font substance with the detection font substance corresponding to the standard table data to obtain a substance comparison result, performing font form detection on the interference-removed correction part to obtain a detection font form, comparing the detection font form with the detection font form corresponding to the standard table data to obtain a form comparison result, and taking the substance comparison result and the form comparison result as a first data comparison result between the imported table picture and the standard table picture;
and when the imported table picture accords with the preset normalization, comparing the table format between the imported table picture and the standard table picture to obtain a second format comparison result, and comparing the table data between the imported table picture and the standard table picture to obtain a second data comparison result.
2. The method of claim 1, wherein the detecting whether the import table picture meets a preset normative comprises:
Converting the imported form picture into an edge binary picture;
extracting pixel points positioned at the edge from the edge binary image to obtain edge pixel points;
judging whether the edge where the edge pixel point is located is a table straight line or not;
when the edge where the edge pixel point is located is a table straight line, fitting the table straight line corresponding to the edge pixel point to obtain a fitted table straight line;
identifying the horizontal and vertical lines in the fitted form straight line using the following formula:
wherein Horizontal line represents a horizontal line in the fitted table straight line, Vertical line represents a vertical line in the fitted table straight line, x_u and y_u represent the horizontal and vertical coordinates of the u-th pixel point in the fitted table straight line, x_0 and y_0 represent preset thresholds for the difference between a coordinate value and its neighbor coordinate value, x_{u+1} and y_{u+1} represent the neighbor coordinate values of x_u and y_u, x_u - x_{u+1} > x_0 indicates that, when the difference between the abscissa of a pixel point in the fitted table straight line and the abscissa of its neighbor pixel point is greater than the threshold x_0, the fitted table straight line at that point is a vertical line, and y_u - y_{u+1} > y_0 indicates that, when the difference between the ordinate of a pixel point in the fitted table straight line and the ordinate of its neighbor pixel point is greater than the threshold y_0, the fitted table straight line at that point is a horizontal line;
Detecting whether every two of the transverse lines are parallel or not by identifying the slope of the straight line of the fitting table, and detecting whether every two of the vertical lines are parallel or not to obtain a parallel detection result;
detecting whether the included angle between the transverse line and the vertical line is a right angle or not by identifying the slope of the straight line of the fitting table, so as to obtain an included angle detection result;
detecting whether the number of transverse lines which accord with the length of the preset transverse line in the transverse lines is even or not, and detecting whether the number of vertical lines which accord with the length of the preset vertical line in the vertical lines is even or not, so as to obtain a length detection result;
and determining whether the imported form picture accords with the preset standardization or not based on the parallel detection result, the included angle detection result and the length detection result.
3. The method of claim 2, wherein the determining whether the edge at which the edge pixel point is located is a table line comprises:
constructing a curve function for judging whether the edge where the edge pixel point is located is a table straight line or not by using the following formula:
wherein the curve function is defined in terms of X and Y, the horizontal and vertical coordinates of the edge pixel point in the edge binary image, θ represents the angle, in the polar coordinate system established in the edge binary image, between the horizontal axis and the perpendicular that passes through the origin and is perpendicular to the curve function, and r represents the perpendicular distance from the origin to the curve function;
And acquiring an included angle and a vertical distance from the curve function, constructing an included angle-distance combination between the included angle and the vertical distance, and judging that the edge where the edge pixel point is located is the straight line of the table when the number of the included angle-distance combination accords with a preset number.
4. The method of claim 2, wherein performing table picture correction on the imported table picture to obtain a corrected table picture comprises:
obtaining a parallel detection result, an included angle detection result, a length detection result and a fitting form straight line;
when the parallel detection result is non-parallel, carrying out first form straight line correction on the fitting form straight line to obtain a first corrected form straight line;
when the included angle detection result is not vertical, carrying out second form straight line correction on the fitting form straight line to obtain a second corrected form straight line;
when the length detection result is not even, carrying out third table straight line correction on the fitting table straight line to obtain a third corrected table straight line;
and determining the correction table picture based on the first correction table straight line, the second correction table straight line and the third correction table straight line.
5. The method of claim 4, wherein performing a second table line correction on the fitted table line to obtain a second corrected table line comprises:
acquiring an included angle between a horizontal line and a vertical line in the fitting table straight line, and identifying an acute angle in the included angle between the horizontal line and the vertical line and an intersection point between the corresponding horizontal line and the vertical line;
calculating the offset distance of the intersection point by using the following formula:
wherein l lin Representing the offset distance, θ represents the acute angle in the angle between the horizontal line and the vertical line, l xie A clip edge located on the vertical line among clip edges representing an acute angle in an angle between the horizontal line and the vertical line;
and based on the offset distance, performing offset in the horizontal direction on an end point which coincides with the intersection point in the end points of the vertical lines, performing offset in the vertical direction on an end point which does not coincide with the intersection point in the end points of the vertical lines, performing offset in the horizontal direction on the vertical lines which are parallel to the vertical lines and meet the acute angle direction, obtaining corrected vertical lines, and taking the corrected vertical lines as the second correction table straight lines.
6. The method of claim 1, wherein the removing the interference table frame in the correction table picture to obtain the correction picture with interference removed comprises:
dividing a rectangular area in the correction table picture;
calculating the association degree between the non-central pixel point and the central pixel point in the rectangular area by using the following formula:
wherein the left-hand side of the formula represents the degree of association between a non-center pixel point and the center pixel point in the rectangular region, v_0 represents the center pixel point in the rectangular region, v_i represents the i-th non-center pixel point in the rectangular region, I represents the pixel gray value, an initial degree of association is first computed and then screened to give the final degree of association, and n represents the number of minimum values (min) selected during the calculation;
when the association degree is greater than a preset association degree, identifying an interference table frame in the correction table picture by using the following formula:
wherein X_kf represents the interference table frame, X represents the matrix formed by the pixel points of the correction table picture, Y_k represents the structural element in the horizontal direction formed by the pixel points selected from the non-center pixel points when the degree of association is greater than the preset degree of association, and Y_f represents the structural element in the vertical direction formed by the pixel points selected from the non-center pixel points when the degree of association is greater than the preset degree of association;
and deleting the pixel points of the interference table frame in the correction table picture to obtain the correction picture without the interference.
7. The method of claim 1, wherein performing pixel-level enhancement processing on the corrected font part to obtain a corrected enhanced part comprises:
and comparing the global variance with the local variance of the corrected font part by using the following formula to obtain a variance comparison result:
wherein (yes/no) represents the variance comparison result, ε(x, y) represents the local variance of a local area, centered on the pixel point with coordinates (x, y) in the corrected font part, whose size is smaller than the area of the corrected font part, the global variance is the variance over the whole corrected font part, and s represents a parameter less than 0.5;
selecting a region to be enhanced from the corrected font part based on the variance comparison result;
and carrying out pixel-level enhancement processing on the region to be enhanced by using the following formula to obtain a pixel-level enhancement region:
wherein the pixel-level enhancement region is obtained by applying a preset pixel-level enhancement coefficient to the region to be enhanced at each pixel coordinate in the region to be enhanced;
and taking the corrected font part containing the pixel-level enhancement region as the corrected enhancement part.
8. The method of claim 1, wherein the removing the interfering pixels in the correction enhancing region to obtain a correction region from which the interference is removed comprises:
constructing a pixel point matrix for detecting interference pixels in the modified enhancement region:
wherein each of the four matrices represents a candidate pixel point matrix for detecting the interference pixel points in the correction enhancing part, one entry symbol represents an arbitrary pixel value, and the other entry symbol represents the pixel value of the central pixel point in the pixel point matrix;
and when the nine-square grid matrix formed by taking the pixel points in the correction enhancing part as the central pixel points accords with the pixel point matrix, deleting the central pixel points of the nine-square grid matrix to obtain the correction part for removing the interference.
9. The method according to claim 1, wherein the detecting the font style of the modified portion with the interference removed to obtain the detected font style includes:
Inputting the correction part with the interference removed into a font form detection model;
in the font form detection model, extracting the font characteristics of the correction part without the interference by using the following formula to obtain extracted font characteristics:
wherein x'_{C*D} represents the extracted font features, x_{C*D} represents the interference-removed correction part of size C*D, CONV represents a convolution layer in the font form detection model, POOL represents a pooling layer in the font form detection model, BN represents a normalization layer in the font form detection model, RELU represents an activation function in the font form detection model, and HAAR represents a wavelet filter in the font form detection model;
and identifying the font form category corresponding to the extracted font characteristic by using a classifier in the font form detection model, and taking the font form category as the detected font form.
10. A table contrast device based on picture pixel differences, the device comprising:
the standard detection module is used for acquiring data to be imported and standard table data, importing the data to be imported into a preset table template to obtain imported table data, converting the imported table data and the standard table data into imported table pictures and standard table pictures, and detecting whether the imported table pictures accord with preset standardization or not;
The overlapping detection module is used for obtaining a first format comparison result between the imported form picture and the standard form picture when the imported form picture does not accord with the preset normalization, carrying out form picture correction on the imported form picture to obtain a corrected form picture, removing an interference form frame in the corrected form picture to obtain an interference-removed corrected picture, and detecting whether an overlapped font exists in the interference-removed corrected picture;
the position determining module is used for deleting the overlapped fonts when the overlapped fonts exist in the corrected pictures without the interference, obtaining corrected pictures with the overlapped deleted fonts, identifying corrected font gaps in the corrected pictures with the overlapped deleted fonts, and determining corrected font positions in the corrected pictures with the overlapped deleted fonts based on the corrected font gaps;
the interference removing module is used for carrying out pixel-level enhancement processing on the corrected font part to obtain a corrected enhanced part, removing interference pixel points in the corrected enhanced part and obtaining a corrected part from which interference is removed;
the first data comparison module is used for carrying out font substance detection on the correction part without interference to obtain a detection font substance, comparing the detection font substance with the detection font substance corresponding to the standard table data to obtain a substance comparison result, carrying out font form detection on the correction part without interference to obtain a detection font form, comparing the detection font form with the detection font form corresponding to the standard table data to obtain a form comparison result, and taking the substance comparison result and the form comparison result as a first data comparison result between the imported table picture and the standard table picture;
And the second data comparison module is used for comparing the table format between the imported table picture and the standard table picture to obtain a second format comparison result when the imported table picture accords with the preset normalization, and comparing the table data between the imported table picture and the standard table picture to obtain a second data comparison result.
CN202311347976.1A 2023-10-18 2023-10-18 Table comparison method and device based on picture pixel difference Active CN117095418B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311347976.1A CN117095418B (en) 2023-10-18 2023-10-18 Table comparison method and device based on picture pixel difference

Publications (2)

Publication Number Publication Date
CN117095418A true CN117095418A (en) 2023-11-21
CN117095418B CN117095418B (en) 2024-03-01

Family

ID=88780616

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311347976.1A Active CN117095418B (en) 2023-10-18 2023-10-18 Table comparison method and device based on picture pixel difference

Country Status (1)

Country Link
CN (1) CN117095418B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120177295A1 (en) * 2011-01-07 2012-07-12 Yuval Gronau Document comparison and analysis for improved OCR
CN110610170A (en) * 2019-09-24 2019-12-24 南京环印防伪科技有限公司 Document comparison method based on image accurate correction
CN110955603A (en) * 2019-12-03 2020-04-03 望海康信(北京)科技股份公司 Automatic testing method and device, electronic equipment and computer readable storage medium
CN114896175A (en) * 2022-07-14 2022-08-12 深圳市明源云科技有限公司 Automatic test method, device, equipment and medium for report export function

Also Published As

Publication number Publication date
CN117095418B (en) 2024-03-01

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CP03 Change of name, title or address

Address after: 518000, C6 Floor, Building 1, Shenzhen Software Industry Base, No. 81, 83, and 85 Gaoxin South Tenth Road, Binhai Community, Yuehai Street, Nanshan District, Shenzhen, Guangdong Province

Patentee after: Shenzhen Xunce Technology Co.,Ltd.

Country or region after: China

Address before: Room 118a, industry university research building, Hong Kong University of science and technology, 9 Yuexing 1st Road, Gaoxin Park, Yuehai street, Nanshan District, Shenzhen, Guangdong 518000

Patentee before: SHENZHEN XUNCE TECHNOLOGY Co.,Ltd.

Country or region before: China