CN112668566A - Form processing method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN112668566A
CN112668566A
Authority
CN
China
Prior art keywords
target
image
line segment
feature vector
segment detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011538336.5A
Other languages
Chinese (zh)
Inventor
高超 (Gao Chao)
徐国强 (Xu Guoqiang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
OneConnect Smart Technology Co Ltd
OneConnect Financial Technology Co Ltd Shanghai
Original Assignee
OneConnect Financial Technology Co Ltd Shanghai
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by OneConnect Financial Technology Co Ltd Shanghai filed Critical OneConnect Financial Technology Co Ltd Shanghai
Priority to CN202011538336.5A priority Critical patent/CN112668566A/en
Publication of CN112668566A publication Critical patent/CN112668566A/en
Priority to PCT/CN2021/124416 priority patent/WO2022134771A1/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/194 Segmentation; Edge detection involving foreground-background segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/60 Analysis of geometric attributes
    • G06T7/62 Analysis of geometric attributes of area, perimeter, diameter or volume

Abstract

The present application relates to image processing technologies, and in particular, to a form processing method, apparatus, electronic device, and storage medium, where the method includes: acquiring a form image of a first form, recognizing the form image to obtain a recognition result, and performing feature extraction on the recognition result to obtain a first feature vector; performing line segment detection on the form image to obtain a line segment detection result, and performing feature extraction on the line segment detection result to obtain a second feature vector; splicing the first feature vector and the second feature vector to obtain a third feature vector; inputting the third feature vector into a preset model to obtain vertex features, and determining a relational adjacency matrix based on the vertex features; and performing restoration processing according to the relational adjacency matrix to obtain a second table, where the second table is the table structure of the first table. By adopting the embodiments of the present application, table restoration precision can be improved.

Description

Form processing method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a form processing method and apparatus, an electronic device, and a storage medium.
Background
Optical Character Recognition (OCR) technology can convert printed characters in an image into a text format that can be processed by a computer. It is widely applied in scenarios such as data entry and verification comparison, and has become a key link in the informatization and digitization of various industries of the national economy. With the continuous development of big data and deep learning technology, OCR technology has made breakthroughs; in the recognition of scanned printed documents, character recognition accuracy can exceed 99%.
In addition to solving the problems of text position detection and content recognition, OCR systems typically also need to parse and restore the layout structure of a document. The table is the most common and important layout structure, and table parsing technology has become a key link in the electronization of paper documents.
Currently, table detection and structure recognition in the industry are usually based on table frame line detection: the row and column positions of texts are obtained from the horizontal and vertical lines in an image, so as to obtain the structure information of the whole table. This method works well on conventional tables with complete frame lines, but cannot accurately restore the structure of tables whose frame lines are partially or wholly invisible, such as borderless tables, three-line tables, and four-line tables. Therefore, how to improve table restoration precision is a problem that urgently needs to be solved.
Disclosure of Invention
The embodiment of the application provides a form processing method and device, an electronic device and a storage medium, and can improve form restoration precision.
In a first aspect, an embodiment of the present application provides a table processing method, where the method includes:
acquiring a form image of a first form, identifying the form image to obtain an identification result, and performing feature extraction on the identification result to obtain a first feature vector;
performing line segment detection on the form image to obtain a line segment detection result, and performing feature extraction on the line segment detection result to obtain a second feature vector;
splicing the first feature vector and the second feature vector to obtain a third feature vector;
inputting the third feature vector into a preset model to obtain vertex features, and determining a relational adjacency matrix based on the vertex features;
and carrying out reduction processing according to the relational adjacency matrix to obtain a second table, wherein the second table is the table structure of the first table.
In a second aspect, an embodiment of the present application provides a form processing apparatus, including: an acquisition unit, a detection unit, a splicing unit, an input unit and a recovery unit, wherein,
the acquiring unit is used for acquiring a form image of a first form, identifying the form image to obtain an identification result, and extracting the characteristics of the identification result to obtain a first characteristic vector;
the detection unit is used for carrying out line segment detection on the form image to obtain a line segment detection result, and carrying out feature extraction on the line segment detection result to obtain a second feature vector;
the splicing unit is used for splicing the first feature vector and the second feature vector to obtain a third feature vector;
the input unit is used for inputting the third feature vector into a preset model to obtain vertex features, and determining a relational adjacency matrix based on the vertex features;
and the restoring unit is used for carrying out restoring processing according to the relational adjacency matrix to obtain a second table, and the second table is the table structure of the first table.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor, a memory, a communication interface, and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the processor, and the program includes instructions for executing the steps in the first aspect of the embodiment of the present application.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program for electronic data exchange, where the computer program enables a computer to perform some or all of the steps described in the first aspect of the embodiment of the present application.
In a fifth aspect, embodiments of the present application provide a computer program product, where the computer program product includes a non-transitory computer-readable storage medium storing a computer program, where the computer program is operable to cause a computer to perform some or all of the steps as described in the first aspect of the embodiments of the present application. The computer program product may be a software installation package.
The embodiment of the application has the following beneficial effects:
It can be seen that the form processing method, apparatus, electronic device, and storage medium described in the embodiments of the present application acquire a form image of a first form, recognize the form image to obtain a recognition result, perform feature extraction on the recognition result to obtain a first feature vector, perform line segment detection on the form image to obtain a line segment detection result, perform feature extraction on the line segment detection result to obtain a second feature vector, splice the first feature vector and the second feature vector into a third feature vector, input the third feature vector into a preset model to obtain vertex features, determine a relational adjacency matrix based on the vertex features, and perform restoration processing according to the relational adjacency matrix to obtain a second table, which is the table structure of the first table. On one hand, overall recognition of the form image yields a global recognition result, such as text box content and content information; on the other hand, line segment detection amounts to local image recognition. Fusing the features of both effectively utilizes global vertex information, with deeper layers and stronger feature extraction capability, so that the table can be restored in depth and table restoration accuracy is improved.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a schematic flowchart of a form processing method according to an embodiment of the present application;
FIG. 2 is a schematic flow chart diagram of another form processing method provided in the embodiments of the present application;
fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
fig. 4 is a block diagram of functional units of a form processing apparatus according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "first," "second," and the like in the description and claims of the present application and in the above-described drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
The electronic device according to the embodiment of the present application may include various handheld devices (such as a mobile phone, a tablet computer, a POS machine, and the like) having an image processing function, a scanner, a micro printer, a desktop, a vehicle-mounted device, a wearable device (a smart watch, a smart bracelet, a wireless headset, an augmented reality/virtual reality device, smart glasses), an AI robot, a computing device, or other processing devices connected to a wireless modem, and various forms of User Equipment (UE), a Mobile Station (MS), a terminal device (terminal device), and the like. For convenience of description, the above-mentioned devices are collectively referred to as electronic devices.
The following describes embodiments of the present application in detail.
Referring to fig. 1, fig. 1 is a schematic flow chart diagram of a form processing method according to an embodiment of the present application, and as shown in the figure, the form processing method is applied to an electronic device, and includes:
101. the method comprises the steps of obtaining a form image of a first form, identifying the form image to obtain an identification result, and extracting features of the identification result to obtain a first feature vector.
In this embodiment of the application, the first table may be any table, for example, an Excel table, a table hand-drawn by a user, or a table in a Word file. In a specific implementation, the electronic device may shoot the first table to obtain a table image of the first table and recognize the table image, mainly using OCR technology to obtain a recognition result, where the recognition result may include text box content and content information; feature extraction is then performed on the recognition result to obtain a first feature vector. The extracted features may be at least one of the following: vertex coordinates, center coordinates, width, height, color, text character type statistics, text word vectors, sentence vectors, foreground color, font, texture, and the like of the text box, without limitation. The algorithm corresponding to feature extraction may be at least one of: Harris corner detection, the scale-invariant feature transform (SIFT) algorithm, a neural network algorithm, wavelet transform, and the like, which are not limited herein. Further, each feature may be represented in vector form to obtain the first feature vector, where the first feature vector may include features of the text-box dimension of the table as well as features of the content dimension of the table.
Specifically, the electronic device may perform OCR recognition on the entire document to obtain the positions of the text boxes and content information, and extract features to obtain an Nt × D feature matrix, where Nt is the number of text boxes and D is the feature dimension. The features may include at least one of: position features (e.g., the vertex coordinates, center coordinates, width, and height of a text box), language features (e.g., text character type statistics, text word vectors, sentence vectors, etc.), image features (e.g., foreground color, font, texture, etc.), and the like, without limitation.
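The position-feature extraction just described can be sketched as follows. This is a hedged illustration: the (x1, y1, x2, y2) box format and the D = 8 feature set are assumptions for demonstration, and a real system would append the language and image features as further columns.

```python
# Sketch of building the Nt x D text-box feature matrix. Box format and
# feature set are illustrative assumptions, not the patent's implementation.

def box_features(box):
    """Map one OCR text box (x1, y1, x2, y2) to a position-feature row."""
    x1, y1, x2, y2 = box
    width, height = x2 - x1, y2 - y1
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    return [x1, y1, x2, y2, cx, cy, width, height]

def text_feature_matrix(boxes):
    """Stack per-box rows into an Nt x D matrix (here D = 8)."""
    return [box_features(b) for b in boxes]

feats = text_feature_matrix([(0, 0, 40, 10), (50, 0, 90, 10)])
```

Each row is one text-box vertex for the later graph model; language and image features would simply extend the row before stacking.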
Optionally, in the step 101, acquiring the form image of the first form may include the following steps:
11. acquiring target environment parameters;
12. determining target shooting parameters corresponding to the target environment parameters according to a mapping relation between preset environment parameters and the shooting parameters;
13. and shooting the first table according to the target shooting parameters to obtain the table image.
The environmental parameter may be at least one of: ambient light brightness, ambient color temperature, dithering parameters, temperature, humidity, weather, and the like, without limitation. The electronic device may acquire the environmental parameter through a sensor, which may be at least one of: an ambient light sensor, a color temperature sensor, a motion sensor, a temperature sensor, a humidity sensor, and the like, without limitation; the electronic device may detect the environmental parameters by integrating the above sensors. The shooting parameter may be at least one of: sensitivity (ISO), exposure time, flash brightness, flash operating frequency, flash color, anti-shake parameters, white balance parameters, and the like, which are not limited herein. The electronic device may store the mapping relationship between preset environmental parameters and shooting parameters in advance.
In specific implementation, the electronic device can obtain the target environment parameters, further determine the target shooting parameters corresponding to the target environment parameters according to the mapping relation between the preset environment parameters and the shooting parameters, and shoot the first form according to the target shooting parameters to obtain the form image.
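Steps 11 to 13 amount to a table lookup from a measured environment parameter to camera settings. A minimal sketch, assuming brightness-bracketed presets (the lux brackets, ISO, exposure, and flash values here are invented for illustration):

```python
# Hedged sketch of the preset environment-to-shooting-parameter mapping.
# All numeric values are made up; a real device would store calibrated ones.

PRESET_MAPPING = [
    # (max ambient brightness in lux, shooting parameters)
    (50,    {"iso": 1600, "exposure_ms": 60, "flash": True}),
    (500,   {"iso": 400,  "exposure_ms": 20, "flash": False}),
    (10000, {"iso": 100,  "exposure_ms": 5,  "flash": False}),
]

def shooting_params(ambient_lux):
    """Return the first preset whose brightness bracket covers the reading."""
    for max_lux, params in PRESET_MAPPING:
        if ambient_lux <= max_lux:
            return params
    return PRESET_MAPPING[-1][1]  # brightest bracket as fallback
```

Other environmental parameters (color temperature, shake) would be extra keys in the lookup rather than a different mechanism.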
102. And performing line segment detection on the form image to obtain a line segment detection result, and performing feature extraction on the line segment detection result to obtain a second feature vector.
The electronic device may perform line segment detection on the form image, and the specific algorithm may be at least one of the following algorithms: hough transform, neural network algorithm, ecological algorithm, etc., without limitation. The algorithm corresponding to feature extraction may be at least one of: harris corner detection, scale invariant feature transform algorithms, neural network algorithms, wavelet transforms, etc., without limitation.
In a specific implementation, for example, the electronic device may perform line segment detection on the form image to obtain the coordinate information of all line segments in the form image, and extract features to obtain an Nl × D feature matrix, where Nl is the number of line segments and D is the feature dimension. The features may include at least one of: position features (e.g., line segment start point, end point, midpoint coordinates, length, angle, etc.), image features (e.g., foreground color, line shape, thickness, etc.), and the like, which are not limited herein. Further, the electronic device may process these features into the second feature vector, for example, converting different features into the same dimension and then splicing them. The line segment detection algorithm may be at least one of: Hough transform, the Line Segment Detector (LSD) algorithm, a neural network algorithm, etc., without limitation.
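Once a detector such as the Hough transform or LSD has produced segment endpoints, turning each segment into an Nl x D feature row reduces to simple geometry. A dependency-free sketch of the position features named above (start, end, midpoint, length, angle); the segment coordinates are illustrative:

```python
import math

# Sketch: map a detected segment ((x1, y1), (x2, y2)) to a feature row of
# start, end, midpoint, length, and angle, matching the listed features.

def segment_features(seg):
    (x1, y1), (x2, y2) = seg
    length = math.hypot(x2 - x1, y2 - y1)
    angle = math.degrees(math.atan2(y2 - y1, x2 - x1))
    mx, my = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    return [x1, y1, x2, y2, mx, my, length, angle]

row = segment_features(((0, 0), (3, 4)))  # a 3-4-5 segment
```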
Optionally, in the step 102, performing line segment detection on the form image to obtain a line segment detection result, the method may include the following steps:
21. determining a boundary contour of the form image;
22. performing texture extraction on the image within the boundary contour to obtain P texture paths, where P is an integer greater than 1;
23. screening the P texture paths to obtain Q texture paths, where Q is an integer greater than 1 and less than P;
24. performing line segment detection on the Q texture paths to obtain a line segment detection result.
In a specific implementation, the electronic device may determine the boundary contour of the form image. The boundary contour may be set by a user, or the form image may be roughly recognized (for example, recognizing from the periphery inward, recognizing only the peripheral contour, and stopping the recognition operation once it is recognized), with the outermost contour used as the boundary contour. Furthermore, the electronic device may perform texture extraction on the image within the boundary contour to obtain P texture paths, and then screen the P texture paths to obtain Q texture paths; for example, it may detect whether the length of each texture path is within a preset length range, whether its width is within a preset width range, or whether its bending angle is within a preset angle range, where the preset length range, preset width range, and preset angle range may all be set by the user or default to the system. Finally, line segment detection is performed on the Q texture paths to obtain a line segment detection result, so that the false recognition probability may be reduced.
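Step 23's screening can be sketched as a filter over the candidate paths; the range bounds and path fields below are illustrative assumptions, not values from the patent:

```python
# Sketch of screening P texture paths down to Q by preset ranges.
# Range bounds and the path dictionary fields are invented for illustration.

LENGTH_RANGE = (10.0, 2000.0)   # preset length range (pixels)
WIDTH_RANGE = (0.5, 6.0)        # preset width range (pixels)
ANGLE_RANGE = (0.0, 5.0)        # preset bending-angle range (degrees)

def screen_paths(paths):
    """Keep only paths passing every length/width/bend check."""
    def ok(p):
        return (LENGTH_RANGE[0] <= p["length"] <= LENGTH_RANGE[1]
                and WIDTH_RANGE[0] <= p["width"] <= WIDTH_RANGE[1]
                and ANGLE_RANGE[0] <= p["bend_deg"] <= ANGLE_RANGE[1])
    return [p for p in paths if ok(p)]

kept = screen_paths([
    {"length": 120, "width": 1.5, "bend_deg": 1.0},   # plausible ruling line
    {"length": 3,   "width": 1.0, "bend_deg": 0.0},   # too short: noise
    {"length": 80,  "width": 2.0, "bend_deg": 40.0},  # too bent: not a line
])
```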
103. And splicing the first feature vector and the second feature vector to obtain a third feature vector.
In a specific implementation, the electronic device may process the first feature vector and the second feature vector into the same dimension, and then splice them to obtain the third feature vector.
For example, the electronic device may splice the first feature vector and the second feature vector to obtain the third feature vector: feature splicing yields an N × D feature matrix E, where N = Nt + Nl.
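The splicing step itself can be sketched in a few lines, assuming both feature sets have already been projected to a common dimension D:

```python
# Sketch of step 103: row-concatenate the Nt x D text-box features and the
# Nl x D line-segment features into one N x D matrix E, N = Nt + Nl.

def splice(text_feats, line_feats):
    assert all(len(r) == len(text_feats[0]) for r in text_feats + line_feats), \
        "both feature sets must share the same dimension D"
    return text_feats + line_feats

E = splice([[1.0, 2.0], [3.0, 4.0]],      # Nt = 2 text-box rows, D = 2
           [[5.0, 6.0]])                  # Nl = 1 line-segment row
```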
104. And inputting the third feature vector into a preset model to obtain vertex features, and determining a relational adjacency matrix based on the vertex features.
The preset model can be set by a user or defaulted by a system, and the preset model can be at least one of the following models: a Transformer model, a neural network model, etc., and is not limited herein. The vertex feature may be at least one of: vertex position, number of vertices, vertex vector, etc., without limitation.
In a specific implementation, the electronic device may input the feature matrix E into the Transformer model to obtain a feature representation X of the vertices, where X is N × D. Based on the vertex features X, the relational adjacency matrix A may be obtained by bilinear multiplication, A = XWX^T, where the shape of W is R × D × D and the shape of A is R × N × N, R being the number of relations. The relations predicted here may include 3 (R = 3): whether two vertices are in the same table, the same row, and the same column, which may be represented by A0, A1, and A2, respectively. The target relational adjacency matrix Agt may be manually labeled; the vertex feature matrix E is input to obtain the predicted relational adjacency matrix A. The prediction result A is compared with the labeled answer Agt, and the model parameters may be optimized using gradient descent, iterating continuously until model training is completed.
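The bilinear step A = XWX^T can be sketched with plain nested-list matrix products; the toy shapes (N = 2, D = 2, R = 1) and identity weights are chosen only so the output is easy to verify by hand:

```python
# Sketch of the bilinear relational scoring: for each relation r,
# A_r = X @ W_r @ X^T, giving an R x N x N adjacency tensor.

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def bilinear_adjacency(X, W):
    """X: N x D vertex features, W: R x D x D weights -> A: R x N x N."""
    return [matmul(matmul(X, Wr), transpose(X)) for Wr in W]

# Toy example: N = 2 vertices, D = 2 features, R = 1 relation, identity W.
A = bilinear_adjacency([[1.0, 0.0], [0.0, 1.0]],
                       [[[1.0, 0.0], [0.0, 1.0]]])
```

In practice the raw scores would pass through a sigmoid and threshold to give the 0/1 relations used by the post-processing, and a framework's batched einsum would replace the hand-rolled matmul.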
Optionally, in the step 104, inputting the third feature vector into a preset model to obtain a vertex feature, and determining a relational adjacency matrix based on the vertex feature, includes the following steps:
41. inputting the third feature vector into a preset model to obtain vertex features;
42. based on the vertex characteristics, carrying out operation through bilinear multiplication to obtain an initial relation adjacency matrix;
43. acquiring a preset relation adjacency matrix obtained based on the vertex characteristic prediction;
44. optimizing the model parameters of the preset model according to the initial relation adjacency matrix and the comparison result between the preset relation adjacency matrices;
45. and outputting a relational adjacency matrix based on the optimized preset model when the model parameters meet preset conditions.
The preset condition may be set by the user or default to the system; for example, the preset condition may be that the preset model converges, or that the recognition accuracy of the preset model reaches a specified threshold, where the specified threshold may be set by the user or default to the system. The electronic device may input the third feature vector into the preset model to obtain vertex features, perform an operation by bilinear multiplication based on the vertex features to obtain an initial relational adjacency matrix, and acquire a preset relational adjacency matrix predicted based on the vertex features, where the preset relational adjacency matrix may be drawn by the user by hand, or may be obtained by connecting vertices according to a preset rule; the preset rule may be preset or default to the system, for example, straight-line connection or connection along a certain curved arc, which is not limited herein. Furthermore, the electronic device may optimize the model parameters of the preset model according to the comparison result between the initial relational adjacency matrix and the preset relational adjacency matrix; specifically, it may compare the difference between the two matrices to obtain a comparison result, and then feed the comparison result back to the preset model to optimize the model parameters. The model parameters may be control parameters of each module of the preset model, and may be at least one of: convolution kernel size, bias, threshold, batch size, and the like, which are not limited herein. When the model parameters meet the preset condition, the relational adjacency matrix is output based on the optimized preset model; specifically, the third feature vector is input into the optimized preset model to obtain the final relational adjacency matrix.
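Steps 43 to 45 can be miniaturized to a scalar toy problem so the gradient-descent update is explicit. This is a sketch under heavy simplification (N = D = R = 1, squared loss); a real implementation would rely on an autograd framework:

```python
# Toy sketch of predict-compare-update: A is predicted from the vertex
# feature, compared with the labeled target Agt, and the parameter w is
# updated by one hand-derived gradient step. Shapes collapsed to scalars.

def predict(x, w):
    return x * w * x          # scalar form of A = X W X^T

def loss(x, w, a_gt):
    return (predict(x, w) - a_gt) ** 2

def gd_step(x, w, a_gt, lr=0.01):
    grad = 2.0 * (predict(x, w) - a_gt) * x * x   # dL/dw
    return w - lr * grad

x, a_gt = 2.0, 1.0            # feature and hand-labeled target relation
w = 0.0                       # initial parameter
before = loss(x, w, a_gt)
w = gd_step(x, w, a_gt)
after = loss(x, w, a_gt)
```

Iterating the update until the loss stops improving corresponds to the "iterate and optimize until training is completed" condition in the text.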
105. And carrying out reduction processing according to the relational adjacency matrix to obtain a second table, wherein the second table is the table structure of the first table.
After obtaining the relational adjacency matrix A, the electronic device may restore the table structure through a post-processing algorithm. A specific implementation may be: (1) solving connected components for A0, where each connected component corresponds to one table; (2) for the vertices within each connected component, constructing subgraphs from A1 and A2 and solving their maximal cliques, where each clique corresponds to one row or one column of the table; (3) sorting the rows and columns within the same table according to their position order; (4) restoring the cells according to the sorted rows and columns, where if multiple vertices belong to the same row and the same column, they belong to the same cell. The electronic device may thus restore the relational adjacency matrix through a table restoration algorithm to obtain the second table, where the second table is the table structure of the first table. The table restoration algorithm may be at least one of: a neural network algorithm, an inverse transformation algorithm corresponding to the relational adjacency matrix, and the like, which are not limited herein.
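Post-processing step (1), extracting one vertex group per table from A0, is an ordinary connected-components computation. A union-find sketch (the 0/1 matrix below is a made-up example):

```python
# Sketch of post-processing step (1): treat A0 as an undirected "same-table"
# graph over vertices; each connected component groups one table's vertices.

def connected_components(a0):
    """a0: N x N 0/1 symmetric matrix -> list of vertex-index components."""
    n = len(a0)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if a0[i][j]:
                parent[find(i)] = find(j)   # union same-table vertices

    comps = {}
    for i in range(n):
        comps.setdefault(find(i), []).append(i)
    return sorted(comps.values())

# Two tables: vertices {0, 1} linked, vertex {2} alone.
tables = connected_components([[0, 1, 0],
                               [1, 0, 0],
                               [0, 0, 0]])
```

Steps (2) to (4) would run the same idea per component on A1 and A2, grouping vertices into rows and columns before intersecting them into cells.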
In a specific implementation, in the embodiment of the present application, the text boxes and line segments in a document can be input into a graph model as vertices, and the relationships between the vertices can be predicted, yielding whether two text boxes are in the same table, the same row, and the same column, from which the table structure is restored. In this way, information such as the positional relationships and textual logical relationships between text boxes can be effectively utilized, table recognition with incomplete frame lines can be supported, and better table detection and structure recognition effects can be obtained. In addition, a Transformer structure can be used for vertex feature representation, which effectively utilizes global vertex information and offers more layers and stronger feature extraction capability, thereby improving document entry efficiency and the electronization efficiency of paper text, and better promoting the informatization and digitization of various industries.
Optionally, between the above steps 101 to 102, the following steps may be further included:
a1, determining a target image quality evaluation value of the form image;
a2, when the target image quality evaluation value is larger than a preset image quality evaluation value, executing the step of performing line segment detection on the form image to obtain a line segment detection result, and performing feature extraction on the line segment detection result to obtain a second feature vector;
alternatively,
a3, when the target image quality evaluation value is less than or equal to the preset image quality evaluation value, performing image enhancement processing on the form image to obtain a target form image;
then, in step 102, the table image is subjected to line segment detection to obtain a line segment detection result, and the line segment detection result is subjected to feature extraction to obtain a second feature vector, which may be implemented as follows:
performing line segment detection on the target form image to obtain a line segment detection result, and performing feature extraction on the line segment detection result to obtain a second feature vector.
In the embodiment of the present application, the preset image quality evaluation value may be preset or default to the system. The electronic device may perform image quality evaluation on the form image to obtain a target image quality evaluation value. Specifically, the electronic device may evaluate the form image using at least one image quality evaluation index, which may be at least one of the following: information entropy, definition, mean square error, average gradient, edge preservation, average gray level, and the like, which are not limited herein, thereby obtaining the target image quality evaluation value. Step 102 is performed when the target image quality evaluation value is greater than the preset image quality evaluation value; otherwise, when the target image quality evaluation value is less than or equal to the preset image quality evaluation value, image enhancement processing is performed on the form image to obtain a target form image, and step 102 is then performed on the target form image. The image enhancement algorithm corresponding to the image enhancement processing may be at least one of: gray-scale stretching, histogram equalization, wavelet transform, a neural network algorithm, and the like, without limitation; in this way, line detection accuracy can be ensured.
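Information entropy, one of the evaluation indices listed above, is straightforward to compute; the sketch below assumes a flat list of grayscale values and an invented threshold standing in for the preset image quality evaluation value:

```python
import math

# Sketch of step A1 using information entropy: higher entropy roughly means
# more usable detail. PRESET_QUALITY is an illustrative stand-in threshold.

PRESET_QUALITY = 1.0   # assumed preset evaluation value (bits)

def entropy(gray):
    """Shannon entropy (bits) of a flat list of 0-255 grayscale values."""
    counts = {}
    for v in gray:
        counts[v] = counts.get(v, 0) + 1
    n = float(len(gray))
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def needs_enhancement(gray):
    return entropy(gray) <= PRESET_QUALITY

flat = [0] * 8                      # featureless patch: entropy 0
varied = [0, 64, 128, 255] * 2      # varied patch: entropy 2 bits
```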
Further, optionally, in the step a3, performing image enhancement processing on the form image to obtain the target form image, the method may include the following steps:
A31, acquiring the text box content in the recognition result;
A32, determining a target attribute parameter of the text box content;
A33, acquiring a reference attribute parameter of the first table;
A34, determining a target deviation degree between the target attribute parameter and the reference attribute parameter;
A35, determining a target image enhancement parameter corresponding to the target deviation degree according to a mapping relation between a preset deviation degree and an image enhancement parameter;
and A36, performing image enhancement processing on the form image according to the target image enhancement parameters to obtain the target form image.
In this embodiment, the recognition result may include text box content and content information, so the electronic device may obtain the text box content in the recognition result and determine a target attribute parameter of that text box content, where the target attribute parameter may be at least one of the following: the average line width of the text box, the number of lines of the text box, the vertex positions of the text box, the number of vertices of the text box, the area of the text box, and the like, which are not limited herein. The reference attribute parameter may likewise be at least one of the following: the average line width of the text box, the number of lines of the text box, the vertex positions of the text box, the number of vertices of the text box, the area of the text box, and the like, which are not limited herein. The reference attribute parameter may be a fixed attribute of the table: since the first table pre-exists (for example, as a printed table) and its table attributes are inherent settings, the electronic device may directly obtain the reference attribute parameter of the first table. Of course, the reference attribute parameter may also be set by the user based on experience. The electronic device may further pre-store a mapping relationship between preset deviation degrees and image enhancement parameters, where an image enhancement parameter may be at least one of the following: an image enhancement algorithm and its control parameters, which are not limited herein; the control parameters differ between image enhancement algorithms and are used to control the strength of the enhancement.
Furthermore, the electronic device may obtain the reference attribute parameter of the first table and determine a target deviation degree between the target attribute parameter and the reference attribute parameter, where the target deviation degree may be obtained as follows:
target deviation degree = (reference attribute parameter − target attribute parameter) / reference attribute parameter
Further, the electronic device determines the target image enhancement parameter corresponding to the target deviation degree according to the mapping relationship between preset deviation degrees and image enhancement parameters, and finally performs image enhancement processing on the form image according to the target image enhancement parameter to obtain the target form image. In this way, the image enhancement parameter is adjusted according to the deviation between the form image and the real form, so the image is enhanced in a targeted manner, and over-enhancement or under-enhancement of the form image is avoided.
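The deviation computation and the deviation-to-enhancement mapping of steps A34–A36 can be sketched as follows. The deviation ranges, algorithm names, and strength values are hypothetical placeholders, since the application leaves the concrete mapping to the implementation:

```python
def target_deviation(reference, target):
    # Relative deviation of a measured attribute from its reference value.
    return (reference - target) / reference

# Hypothetical mapping from deviation ranges to image enhancement parameters:
# each entry is (upper bound of |deviation|, (algorithm name, strength)).
DEVIATION_TO_ENHANCEMENT = [
    (0.05, ("none", 0.0)),
    (0.20, ("gray_stretch", 0.5)),
    (0.50, ("histogram_equalization", 1.0)),
    (float("inf"), ("wavelet_denoise", 1.5)),
]

def pick_enhancement(deviation):
    # Look up the first range that covers the absolute deviation.
    d = abs(deviation)
    for bound, params in DEVIATION_TO_ENHANCEMENT:
        if d <= bound:
            return params
```

A small deviation between the recognized text box attributes and the table's reference attributes thus selects no or mild enhancement, while a large deviation selects a stronger algorithm.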
Further, optionally, the step a1, determining the target image quality evaluation value of the form image, may include the steps of:
A11, performing a background removing operation on the form image to obtain a target foreground image;
A12, determining the target region area of the target foreground image;
A13, extracting feature points from the target foreground image to obtain a target feature point set, and determining the number of target feature points according to the target feature point set;
A14, determining the target feature point distribution density according to the number of target feature points and the target region area;
and A15, determining the target image quality evaluation value corresponding to the target feature point distribution density according to the preset mapping relation between the feature point distribution density and the image quality evaluation value.
In a specific implementation, the mapping relationship between preset feature point distribution densities and image quality evaluation values may be stored in the electronic device in advance; the greater the feature point distribution density, the better the image quality.
In a specific implementation, the electronic device may remove the background from the form image to obtain a target foreground image, where the background is a blank portion or a background selected for the table in advance (e.g., a watermark or a background image). It may then determine the target region area of the target foreground image and extract feature points from the target foreground image to obtain a target feature point set, where the feature point extraction algorithm may be at least one of the following: the Harris corner detection algorithm, a scale-invariant feature extraction algorithm, a neural network algorithm, and the like, which are not limited herein. Furthermore, the electronic device may determine the number of target feature points according to the target feature point set, and determine the target feature point distribution density according to the number of target feature points and the target region area, that is:
target feature point distribution density = number of target feature points / target region area
Furthermore, the electronic device can determine the target image quality evaluation value corresponding to the target feature point distribution density according to the mapping relationship between preset feature point distribution densities and image quality evaluation values. Since the background is removed and the image quality evaluation is performed only on the foreground portion of the image, and since a table contains a large amount of blank content, this approach combines the table's characteristics and evaluates quality from the feature points within the relevant contours, so an accurate image quality evaluation can be performed on the form image.
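Steps A13–A15 can be sketched as below, assuming a piecewise mapping from density thresholds to quality values; the thresholds and quality values are illustrative placeholders, not figures from the application:

```python
import bisect

def feature_point_density(num_points, region_area):
    # Target feature point distribution density = points / foreground region area.
    return num_points / region_area

# Hypothetical mapping: density thresholds -> image quality evaluation values
# (denser feature points => higher quality, as described in the embodiment).
DENSITY_THRESHOLDS = [0.001, 0.005, 0.02]
QUALITY_VALUES = [0.2, 0.5, 0.8, 1.0]

def quality_from_density(density):
    # Find which density band the value falls in and return its quality value.
    return QUALITY_VALUES[bisect.bisect_right(DENSITY_THRESHOLDS, density)]
```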
It can be seen that the form processing method described in this embodiment of the application obtains a form image of a first form, identifies the form image to obtain an identification result, performs feature extraction on the identification result to obtain a first feature vector, performs line segment detection on the form image to obtain a line segment detection result, performs feature extraction on the line segment detection result to obtain a second feature vector, splices the first feature vector and the second feature vector to obtain a third feature vector, inputs the third feature vector into a preset model to obtain vertex features, determines a relational adjacency matrix based on the vertex features, and performs restoration processing according to the relational adjacency matrix to obtain a second form, where the second form is the table structure of the first form. In this way, on the one hand, the form image can be recognized as a whole to obtain an overall identification result, for example the text box content and content information; on the other hand, line segments can be detected, which amounts to local image recognition. The features of the two are fused, global vertex information is effectively utilized, and the deeper layers provide stronger feature extraction capability, so the table can be deeply restored and the table restoration accuracy is improved.
Referring to fig. 2, fig. 2 is a schematic flowchart of a form processing method provided in an embodiment of the present application, applied to an electronic device, where as shown in the figure, the form processing method includes:
201. Acquire a form image of a first form, identify the form image to obtain an identification result, and perform feature extraction on the identification result to obtain a first feature vector.
202. Determine a target image quality evaluation value of the form image.
203. When the target image quality evaluation value is greater than a preset image quality evaluation value, perform line segment detection on the form image to obtain a line segment detection result, and perform feature extraction on the line segment detection result to obtain a second feature vector.
204. Splice the first feature vector and the second feature vector to obtain a third feature vector.
205. Input the third feature vector into a preset model to obtain vertex features, and determine a relational adjacency matrix based on the vertex features.
206. Perform restoration processing according to the relational adjacency matrix to obtain a second form, where the second form is the table structure of the first form.
For a detailed description of steps 201 to 206, reference may be made to the corresponding steps described above with respect to fig. 1; details are not repeated herein.
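The splicing step (204) and the restoration step (206) can be illustrated with the sketch below. The restoration shown is a deliberately simplified reading that treats the relational adjacency matrix as an undirected graph over table vertices and recovers connected groups of vertices, each group standing in for one restored cell; the actual restoration procedure is not fixed by the application:

```python
import numpy as np

def splice_features(first_vec, second_vec):
    # Step 204: concatenate the recognition and line-detection feature vectors.
    return np.concatenate([first_vec, second_vec])

def restore_cells(adjacency):
    # Step 206 (simplified): depth-first search over the adjacency matrix,
    # collecting connected vertex groups.
    n = len(adjacency)
    seen, groups = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, group = [start], []
        while stack:
            v = stack.pop()
            if v in seen:
                continue
            seen.add(v)
            group.append(v)
            stack.extend(u for u in range(n) if adjacency[v][u] and u not in seen)
        groups.append(sorted(group))
    return groups
```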
It can be seen that the form processing method described in this embodiment of the present application obtains a form image of a first form, identifies the form image to obtain an identification result, performs feature extraction on the identification result to obtain a first feature vector, determines a target image quality evaluation value of the form image, performs line segment detection on the form image to obtain a line segment detection result, performs feature extraction on the line segment detection result to obtain a second feature vector, splices the first feature vector and the second feature vector to obtain a third feature vector, inputs the third feature vector into a preset model to obtain vertex features, determines a relational adjacency matrix based on the vertex features, and performs restoration processing according to the relational adjacency matrix to obtain a second form, where the second form is the table structure of the first form. In this way, first, the form image can be recognized as a whole to obtain an overall identification result, such as the text box content and content information; second, line detection can be performed, which amounts to local image recognition. The features of the two are fused, global vertex information is effectively utilized, and the deeper layers provide stronger feature extraction capability, so the table can be deeply restored and the table restoration accuracy is improved.
In accordance with the foregoing embodiments, referring to fig. 3, fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in the figure, the electronic device includes a processor, a memory, a communication interface, and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the processor. In this embodiment of the present application, the programs include instructions for performing the following steps:
acquiring a form image of a first form, identifying the form image to obtain an identification result, and performing feature extraction on the identification result to obtain a first feature vector;
performing line segment detection on the form image to obtain a line segment detection result, and performing feature extraction on the line segment detection result to obtain a second feature vector;
splicing the first feature vector and the second feature vector to obtain a third feature vector;
inputting the third feature vector into a preset model to obtain vertex features, and determining a relational adjacency matrix based on the vertex features;
and carrying out reduction processing according to the relational adjacency matrix to obtain a second table, wherein the second table is the table structure of the first table.
It can be seen that the electronic device described in this embodiment of the present application obtains a form image of a first form, identifies the form image to obtain an identification result, performs feature extraction on the identification result to obtain a first feature vector, performs line segment detection on the form image to obtain a line segment detection result, performs feature extraction on the line segment detection result to obtain a second feature vector, splices the first feature vector and the second feature vector to obtain a third feature vector, inputs the third feature vector into a preset model to obtain vertex features, determines a relational adjacency matrix based on the vertex features, and performs restoration processing according to the relational adjacency matrix to obtain a second form, where the second form is the table structure of the first form. In addition, on the one hand, the form image can be recognized as a whole to obtain an overall identification result, such as the text box content and content information; on the other hand, the lines can be detected, which amounts to local image recognition. The features of the two are fused, global vertex information is effectively utilized, and the deeper layers provide stronger feature extraction capability, so the table can be deeply restored and the table restoration accuracy is improved.
Optionally, in the aspect of performing line segment detection on the form image to obtain a line segment detection result, the program includes instructions for performing the following steps:
determining a boundary contour of the form image;
performing texture extraction on the image within the boundary contour to obtain P stripe paths, where P is an integer greater than 1;
screening the P stripe paths to obtain Q stripe paths, where Q is an integer greater than 1 and less than P;
and performing line segment detection on the Q stripe paths to obtain the line segment detection result.
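As a toy stand-in for the line segment detection described above (the application does not fix a particular algorithm), the sketch below scans a binarized table image for horizontal and vertical runs of foreground pixels of a minimum length; the run-scanning approach and the `min_len` parameter are assumptions of this illustration:

```python
import numpy as np

def detect_segments(binary, min_len=3):
    # Report each horizontal/vertical run of foreground pixels at least
    # `min_len` long as ((row0, col0), (row1, col1)).
    segments = []
    h, w = binary.shape
    for r in range(h):                      # horizontal runs
        c = 0
        while c < w:
            if binary[r, c]:
                start = c
                while c < w and binary[r, c]:
                    c += 1
                if c - start >= min_len:
                    segments.append(((r, start), (r, c - 1)))
            else:
                c += 1
    for c in range(w):                      # vertical runs
        r = 0
        while r < h:
            if binary[r, c]:
                start = r
                while r < h and binary[r, c]:
                    r += 1
                if r - start >= min_len:
                    segments.append(((start, c), (r - 1, c)))
            else:
                r += 1
    return segments
```

In practice a probabilistic Hough transform or a dedicated line segment detector would play this role; the point here is only the shape of the result consumed by the later feature extraction.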
Optionally, in the aspect that the third feature vector is input to a preset model to obtain a vertex feature, and a relational adjacency matrix is determined based on the vertex feature, the program includes instructions for performing the following steps:
inputting the third feature vector into a preset model to obtain vertex features;
based on the vertex features, performing an operation by bilinear multiplication to obtain an initial relational adjacency matrix;
acquiring a preset relational adjacency matrix obtained by prediction based on the vertex features;
optimizing the model parameters of the preset model according to a comparison result between the initial relational adjacency matrix and the preset relational adjacency matrix;
and outputting the relational adjacency matrix based on the optimized preset model when the model parameters meet a preset condition.
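The bilinear multiplication in the second step can be sketched as follows, scoring every vertex pair with a bilinear form and thresholding the scores into an initial relational adjacency matrix; the weight matrix and threshold are assumptions of this illustration rather than parameters given by the application:

```python
import numpy as np

def bilinear_adjacency(vertex_features, weight, threshold=0.0):
    # Score pair (i, j) as f_i^T W f_j, then threshold into {0, 1}.
    scores = vertex_features @ weight @ vertex_features.T
    return (scores > threshold).astype(int)
```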
Optionally, in the aspect of obtaining the form image of the first form, the program includes instructions for performing the following steps:
acquiring target environment parameters;
determining target shooting parameters corresponding to the target environment parameters according to a mapping relation between preset environment parameters and the shooting parameters;
and shooting the first table according to the target shooting parameters to obtain the table image.
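The environment-to-shooting-parameter mapping described above can be sketched as a lookup table keyed on an environment parameter such as ambient light; the lux bounds and parameter values below are entirely hypothetical, since the application leaves the concrete mapping unspecified:

```python
# Hypothetical mapping from ambient light (lux) to shooting parameters.
SHOOTING_TABLE = [
    (50,   {"iso": 800, "exposure_ms": 40, "flash": True}),
    (300,  {"iso": 400, "exposure_ms": 20, "flash": False}),
    (1000, {"iso": 100, "exposure_ms": 8,  "flash": False}),
]

def shooting_params(ambient_lux):
    # Return the parameters of the first row whose upper bound covers the
    # measured ambient light; fall back to the last row for bright scenes.
    for upper, params in SHOOTING_TABLE:
        if ambient_lux <= upper:
            return params
    return SHOOTING_TABLE[-1][1]
```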
Optionally, after the table image of the first table is obtained, the table image is identified to obtain an identification result, the identification result is subjected to feature extraction to obtain a first feature vector, and before the table image is subjected to line segment detection to obtain a line segment detection result, and the line segment detection result is subjected to feature extraction to obtain a second feature vector, the program further includes instructions for executing the following steps:
determining a target image quality evaluation value of the table image;
when the target image quality evaluation value is larger than a preset image quality evaluation value, executing the step of performing line segment detection on the form image to obtain a line segment detection result, and performing feature extraction on the line segment detection result to obtain a second feature vector;
or,
when the target image quality evaluation value is less than or equal to the preset image quality evaluation value, performing image enhancement processing on the form image to obtain a target form image, performing line segment detection on the form image to obtain a line segment detection result, and performing feature extraction on the line segment detection result to obtain a second feature vector, including:
performing line segment detection on the target form image to obtain the line segment detection result, and performing feature extraction on the line segment detection result to obtain the second feature vector.
Optionally, in the aspect of performing image enhancement processing on the form image to obtain the target form image, the program includes instructions for performing the following steps:
acquiring text box content in the recognition result;
determining a target attribute parameter of the text box content;
acquiring a reference attribute parameter of the first table;
determining a target degree of deviation between the target attribute parameter and the reference attribute parameter;
determining a target image enhancement parameter corresponding to the target deviation degree according to a mapping relation between a preset deviation degree and an image enhancement parameter;
and carrying out image enhancement processing on the form image according to the target image enhancement parameters to obtain the target form image.
Optionally, in the aspect of determining the target image quality evaluation value of the form image, the program includes instructions for:
performing background removal operation on the form image to obtain a target foreground image;
determining the area of a target region of the target foreground image;
extracting characteristic points of the target foreground image to obtain a target characteristic point set, and determining the number of the target characteristic points according to the target characteristic point set;
determining the distribution density of the target characteristic points according to the number of the target characteristic points and the area of the target area;
and determining the target image quality evaluation value corresponding to the target feature point distribution density according to a preset mapping relation between the feature point distribution density and the image quality evaluation value.
The above description has introduced the solution of the embodiments of the present application mainly from the perspective of the method-side implementation process. It can be understood that, in order to implement the above functions, the electronic device includes corresponding hardware structures and/or software modules for performing the respective functions. Those skilled in the art will readily appreciate that the units and algorithm steps of the examples described in connection with the embodiments provided herein can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends upon the particular application and the design constraints of the technical solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiment of the present application, the electronic device may be divided into the functional units according to the method example, for example, each functional unit may be divided corresponding to each function, or two or more functions may be integrated into one processing unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit. It should be noted that the division of the unit in the embodiment of the present application is schematic, and is only a logic function division, and there may be another division manner in actual implementation.
Fig. 4 is a block diagram showing functional units of a table processing apparatus 400 according to an embodiment of the present application. The apparatus 400 includes: an acquisition unit 401, a detection unit 402, a splicing unit 403, an input unit 404, and a restoration unit 405, where,
the obtaining unit 401 is configured to obtain a form image of a first form, identify the form image to obtain an identification result, and perform feature extraction on the identification result to obtain a first feature vector;
the detection unit 402 is configured to perform line segment detection on the form image to obtain a line segment detection result, and perform feature extraction on the line segment detection result to obtain a second feature vector;
the splicing unit 403 is configured to splice the first feature vector and the second feature vector to obtain a third feature vector;
the input unit 404 is configured to input the third feature vector to a preset model to obtain a vertex feature, and determine a relational adjacency matrix based on the vertex feature;
the restoring unit 405 is configured to perform restoring processing according to the relational adjacency matrix to obtain a second table, where the second table is a table structure of the first table.
It can be seen that the form processing apparatus described in this embodiment of the present application obtains a form image of a first form, identifies the form image to obtain an identification result, performs feature extraction on the identification result to obtain a first feature vector, performs line segment detection on the form image to obtain a line segment detection result, performs feature extraction on the line segment detection result to obtain a second feature vector, splices the first feature vector and the second feature vector to obtain a third feature vector, inputs the third feature vector into a preset model to obtain vertex features, determines a relational adjacency matrix based on the vertex features, and performs restoration processing according to the relational adjacency matrix to obtain a second form, where the second form is the table structure of the first form. In this way, on the one hand, the form image can be recognized as a whole to obtain an overall identification result, for example the text box content and content information; on the other hand, line segments can be detected, which amounts to local image recognition. The features of the two are fused, global vertex information is effectively utilized, and the deeper layers provide stronger feature extraction capability, so the table can be deeply restored and the table restoration accuracy is improved.
In one possible example, in the aspect of performing line segment detection on the form image to obtain a line segment detection result, the detecting unit 402 is specifically configured to:
determining a boundary contour of the form image;
performing texture extraction on the image within the boundary contour to obtain P stripe paths, where P is an integer greater than 1;
screening the P stripe paths to obtain Q stripe paths, where Q is an integer greater than 1 and less than P;
and performing line segment detection on the Q stripe paths to obtain the line segment detection result.
Optionally, in the aspect that the third feature vector is input to a preset model to obtain a vertex feature, and a relational adjacency matrix is determined based on the vertex feature, the input unit 404 is specifically configured to:
inputting the third feature vector into a preset model to obtain vertex features;
based on the vertex features, performing an operation by bilinear multiplication to obtain an initial relational adjacency matrix;
acquiring a preset relational adjacency matrix obtained by prediction based on the vertex features;
optimizing the model parameters of the preset model according to a comparison result between the initial relational adjacency matrix and the preset relational adjacency matrix;
and outputting the relational adjacency matrix based on the optimized preset model when the model parameters meet a preset condition.
Optionally, in terms of acquiring the form image of the first form, the acquiring unit 401 is specifically configured to:
acquiring target environment parameters;
determining target shooting parameters corresponding to the target environment parameters according to a mapping relation between preset environment parameters and the shooting parameters;
and shooting the first table according to the target shooting parameters to obtain the table image.
Optionally, after the table image of the first table is obtained, the table image is identified to obtain an identification result, and the identification result is subjected to feature extraction to obtain a first feature vector, and before the table image is subjected to line segment detection to obtain a line segment detection result and the line segment detection result is subjected to feature extraction to obtain a second feature vector, the apparatus 400 is further specifically configured to:
determining a target image quality evaluation value of the table image;
when the target image quality evaluation value is larger than a preset image quality evaluation value, executing the step of performing line segment detection on the form image to obtain a line segment detection result, and performing feature extraction on the line segment detection result to obtain a second feature vector;
or,
when the target image quality evaluation value is less than or equal to the preset image quality evaluation value, performing image enhancement processing on the form image to obtain a target form image, performing line segment detection on the form image to obtain a line segment detection result, and performing feature extraction on the line segment detection result to obtain a second feature vector, including:
performing line segment detection on the target form image to obtain the line segment detection result, and performing feature extraction on the line segment detection result to obtain the second feature vector.
Optionally, in the aspect of performing image enhancement processing on the form image to obtain a target form image, the apparatus 400 is specifically configured to:
acquiring text box content in the recognition result;
determining a target attribute parameter of the text box content;
acquiring a reference attribute parameter of the first table;
determining a target degree of deviation between the target attribute parameter and the reference attribute parameter;
determining a target image enhancement parameter corresponding to the target deviation degree according to a mapping relation between a preset deviation degree and an image enhancement parameter;
and carrying out image enhancement processing on the form image according to the target image enhancement parameters to obtain the target form image.
Optionally, in the aspect of determining the target image quality evaluation value of the form image, the apparatus 400 is specifically configured to:
performing background removal operation on the form image to obtain a target foreground image;
determining the area of a target region of the target foreground image;
extracting characteristic points of the target foreground image to obtain a target characteristic point set, and determining the number of the target characteristic points according to the target characteristic point set;
determining the distribution density of the target characteristic points according to the number of the target characteristic points and the area of the target area;
and determining the target image quality evaluation value corresponding to the target feature point distribution density according to a preset mapping relation between the feature point distribution density and the image quality evaluation value.
It can be understood that the functions of each program module of the table processing apparatus in this embodiment may be specifically implemented according to the method in the foregoing method embodiment, and the specific implementation process may refer to the related description of the foregoing method embodiment, which is not described herein again.
Embodiments of the present application also provide a computer storage medium, where the computer storage medium stores a computer program for electronic data exchange, the computer program enabling a computer to execute part or all of the steps of any one of the methods described in the above method embodiments, and the computer includes an electronic device.
Embodiments of the present application also provide a computer program product comprising a non-transitory computer readable storage medium storing a computer program operable to cause a computer to perform some or all of the steps of any of the methods as described in the above method embodiments. The computer program product may be a software installation package, the computer comprising an electronic device.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative; for instance, the above division of units is only a division by logical function, and other divisions may be used in actual implementation — for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the shown or discussed mutual coupling, direct coupling, or communication connection may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be in electrical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable memory. Based on such understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a memory and including several instructions for causing a computer device (which may be a personal computer, an electronic device, a network device, or the like) to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned memory includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, and other media capable of storing program code.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing related hardware, where the program may be stored in a computer-readable memory, which may include: a flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or the like.
The foregoing embodiments of the present application have been described in detail to illustrate the principles and implementations of the present application; the above description of the embodiments is only intended to help understand the method and core concept of the present application. Meanwhile, a person skilled in the art may, according to the idea of the present application, make changes to the specific embodiments and the application scope. In summary, the content of this specification should not be construed as limiting the present application.

Claims (10)

1. A method of form processing, the method comprising:
acquiring a form image of a first form, identifying the form image to obtain an identification result, and performing feature extraction on the identification result to obtain a first feature vector;
performing line segment detection on the form image to obtain a line segment detection result, and performing feature extraction on the line segment detection result to obtain a second feature vector;
splicing the first feature vector and the second feature vector to obtain a third feature vector;
inputting the third feature vector into a preset model to obtain vertex features, and determining a relational adjacency matrix based on the vertex features;
and carrying out reduction processing according to the relational adjacency matrix to obtain a second table, wherein the second table is the table structure of the first table.
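The overall pipeline of claim 1 (features from the recognition result and from line-segment detection are spliced into a third feature vector, from which a relational adjacency matrix over table cells is derived) can be sketched as follows. This is a minimal illustrative sketch, not the patented implementation: the concatenation scheme, the inner-product similarity standing in for the preset model, and all dimensions are assumptions.

```python
import numpy as np

def concat_features(first_vec: np.ndarray, second_vec: np.ndarray) -> np.ndarray:
    """Splice the recognition-result feature vector (first) and the
    line-segment-detection feature vector (second) into a third feature
    vector, as recited in claim 1 (plain concatenation assumed)."""
    return np.concatenate([first_vec, second_vec], axis=-1)

def relational_adjacency(vertex_feats: np.ndarray) -> np.ndarray:
    """Derive a relational adjacency matrix from per-cell vertex features;
    an inner-product similarity thresholded at zero stands in here for
    the trained preset model."""
    scores = vertex_feats @ vertex_feats.T   # pairwise cell similarity
    return (scores > 0).astype(int)          # 1 = cells are related

# toy example: 3 table cells, 4-dimensional features from each source
first = np.ones((3, 4))    # features from the recognition result
second = np.zeros((3, 4))  # features from line-segment detection
third = concat_features(first, second)
adj = relational_adjacency(third)
```

The table structure of the first table is then restored by reading off, for each pair of cells, whether the adjacency matrix marks them as belonging to the same row or column.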
2. The method of claim 1, wherein the performing line segment detection on the form image to obtain a line segment detection result comprises:
determining a boundary contour of the form image;
performing texture extraction on the image within the boundary contour to obtain P texture lines, wherein P is an integer greater than 1;
screening the P texture lines to obtain Q texture lines, wherein Q is an integer greater than 1 and less than P;
and performing line segment detection on the Q texture lines to obtain the line segment detection result.
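The screening step of claim 2 (keeping only Q of the P extracted lines) can be sketched as a simple geometric filter. The length criterion and threshold below are illustrative assumptions; the patent does not specify the screening rule.

```python
# each extracted line: (x0, y0, x1, y1) endpoints in image coordinates
LINES_P = [
    (0, 0, 100, 0),    # long horizontal line  -> likely a table rule
    (0, 50, 100, 52),  # near-horizontal line  -> likely a table rule
    (10, 10, 14, 13),  # short stroke          -> likely text, discard
    (5, 0, 5, 80),     # long vertical line    -> likely a table rule
]

def screen_lines(lines, min_length=30.0):
    """Screen the P extracted lines down to Q candidates by discarding
    segments too short to be table rules (threshold assumed)."""
    kept = []
    for x0, y0, x1, y1 in lines:
        if ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5 >= min_length:
            kept.append((x0, y0, x1, y1))
    return kept

lines_q = screen_lines(LINES_P)  # 3 of the 4 candidates survive
```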
3. The method according to claim 1 or 2, wherein the inputting the third feature vector into a preset model to obtain a vertex feature, and determining a relational adjacency matrix based on the vertex feature comprises:
inputting the third feature vector into a preset model to obtain vertex features;
based on the vertex features, performing an operation through bilinear multiplication to obtain an initial relational adjacency matrix;
acquiring a preset relational adjacency matrix predicted based on the vertex features;
optimizing model parameters of the preset model according to a comparison result between the initial relational adjacency matrix and the preset relational adjacency matrix;
and when the model parameters meet a preset condition, outputting the relational adjacency matrix based on the optimized preset model.
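The bilinear multiplication of claim 3 can be sketched as the standard bilinear score A = σ(V·W·Vᵀ), where V holds the per-cell vertex features and W is a learnable weight matrix. This is an assumed form: the patent names only "bilinear multiplication", and the random weights and sigmoid squashing below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def bilinear_adjacency(vertex_feats: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """Initial relational adjacency via bilinear multiplication:
    score(i, j) = v_i · W · v_j, squashed to (0, 1) with a sigmoid so each
    entry reads as the probability that cells i and j are related."""
    scores = vertex_feats @ weight @ vertex_feats.T
    return 1.0 / (1.0 + np.exp(-scores))

n_cells, dim = 4, 6
V = rng.standard_normal((n_cells, dim))  # vertex features from the preset model
W = rng.standard_normal((dim, dim))      # learnable bilinear weight (assumed)
A = bilinear_adjacency(V, W)
```

During training, A would be compared against the preset (target) relational adjacency matrix and the loss back-propagated into W and the preset model, matching the optimization step of claim 3.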
4. The method of claim 1 or 2, wherein obtaining the form image of the first form comprises:
acquiring target environment parameters;
determining target shooting parameters corresponding to the target environment parameters according to a mapping relation between preset environment parameters and the shooting parameters;
and shooting the first table according to the target shooting parameters to obtain the table image.
5. The method according to claim 1 or 2, wherein before the obtaining a form image of a first form, recognizing the form image to obtain a recognition result, performing feature extraction on the recognition result to obtain a first feature vector, performing line segment detection on the form image to obtain a line segment detection result, and performing feature extraction on the line segment detection result to obtain a second feature vector, the method further comprises:
determining a target image quality evaluation value of the table image;
when the target image quality evaluation value is larger than a preset image quality evaluation value, executing the step of performing line segment detection on the form image to obtain a line segment detection result, and performing feature extraction on the line segment detection result to obtain a second feature vector;
or,
when the target image quality evaluation value is less than or equal to the preset image quality evaluation value, performing image enhancement processing on the form image to obtain a target form image, wherein the performing line segment detection on the form image to obtain a line segment detection result and performing feature extraction on the line segment detection result to obtain a second feature vector comprises:
and performing line segment detection on the target form image to obtain a line segment detection result, and performing feature extraction on the line segment detection result to obtain a second feature vector.
6. The method of claim 5, wherein the performing image enhancement processing on the form image to obtain the target form image comprises:
acquiring text box content in the recognition result;
determining a target attribute parameter of the text box content;
acquiring a reference attribute parameter of the first table;
determining a target degree of deviation between the target attribute parameter and the reference attribute parameter;
determining a target image enhancement parameter corresponding to the target deviation degree according to a mapping relation between a preset deviation degree and an image enhancement parameter;
and carrying out image enhancement processing on the form image according to the target image enhancement parameters to obtain the target form image.
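The mapping of claim 6 (deviation degree between a measured text-box attribute and its reference value, looked up in a preset table of image-enhancement parameters) can be sketched as follows. The attribute (mean brightness), the deviation formula, and the range/gamma table are all illustrative assumptions; the patent specifies only that such a preset mapping exists.

```python
def target_deviation(target_attr: float, reference_attr: float) -> float:
    """Deviation degree between the measured attribute of the text-box
    content (e.g. mean brightness, assumed) and its reference value."""
    return abs(target_attr - reference_attr) / reference_attr

# preset mapping: deviation-degree upper bounds -> enhancement parameter
# (the ranges and gamma values below are illustrative assumptions)
PRESET_MAPPING = [
    (0.10, 1.0),   # deviation <= 0.10 -> no correction
    (0.30, 1.2),   # deviation <= 0.30 -> mild gamma boost
    (1.00, 1.5),   # otherwise        -> strong gamma boost
]

def enhancement_parameter(deviation: float) -> float:
    for upper_bound, gamma in PRESET_MAPPING:
        if deviation <= upper_bound:
            return gamma
    return PRESET_MAPPING[-1][1]

dev = target_deviation(target_attr=90.0, reference_attr=120.0)  # 0.25
gamma = enhancement_parameter(dev)                              # 1.2
```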
7. The method of claim 5, wherein the determining a target image quality evaluation value of the form image comprises:
performing background removal operation on the form image to obtain a target foreground image;
determining the area of a target region of the target foreground image;
extracting characteristic points of the target foreground image to obtain a target characteristic point set, and determining the number of the target characteristic points according to the target characteristic point set;
determining the distribution density of the target characteristic points according to the number of the target characteristic points and the area of the target area;
and determining the target image quality evaluation value corresponding to the target feature point distribution density according to a preset mapping relation between the feature point distribution density and the image quality evaluation value.
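Claim 7's quality measure is the feature-point distribution density: the number of feature points in the foreground divided by the foreground area, looked up in a preset density-to-quality mapping. A minimal sketch, with the density ranges and quality scores assumed (the patent does not give concrete values):

```python
def distribution_density(num_points: int, area: float) -> float:
    """Feature-point distribution density per claim 7: number of target
    feature points divided by the target-region area of the foreground."""
    return num_points / area

def quality_value(density: float) -> float:
    """Preset mapping from density ranges to a quality evaluation value
    (ranges and scores are illustrative assumptions)."""
    if density >= 0.01:
        return 0.9   # richly textured foreground -> high quality
    if density >= 0.001:
        return 0.6
    return 0.3       # sparse feature points -> likely blurred image

d = distribution_density(num_points=500, area=40_000.0)  # 0.0125
q = quality_value(d)                                     # 0.9
```

The intuition is that a sharp, well-exposed form image yields many detectable corner/edge feature points per unit of foreground area, while blur or low contrast suppresses them.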
8. A form processing apparatus, the apparatus comprising: an acquisition unit, a detection unit, a splicing unit, an input unit and a recovery unit, wherein,
the acquiring unit is used for acquiring a form image of a first form, identifying the form image to obtain an identification result, and extracting the characteristics of the identification result to obtain a first characteristic vector;
the detection unit is used for carrying out line segment detection on the form image to obtain a line segment detection result, and carrying out feature extraction on the line segment detection result to obtain a second feature vector;
the splicing unit is used for splicing the first feature vector and the second feature vector to obtain a third feature vector;
the input unit is used for inputting the third feature vector into a preset model to obtain vertex features, and determining a relational adjacency matrix based on the vertex features;
and the restoring unit is used for carrying out restoring processing according to the relational adjacency matrix to obtain a second table, and the second table is the table structure of the first table.
9. An electronic device comprising a processor, a memory for storing one or more programs and configured for execution by the processor, the programs comprising instructions for performing the steps of the method of any of claims 1-7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program comprising program instructions that, when executed by a processor, cause the processor to carry out the method according to any one of claims 1-7.
CN202011538336.5A 2020-12-23 2020-12-23 Form processing method and device, electronic equipment and storage medium Pending CN112668566A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202011538336.5A CN112668566A (en) 2020-12-23 2020-12-23 Form processing method and device, electronic equipment and storage medium
PCT/CN2021/124416 WO2022134771A1 (en) 2020-12-23 2021-10-18 Table processing method and apparatus, and electronic device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011538336.5A CN112668566A (en) 2020-12-23 2020-12-23 Form processing method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112668566A true CN112668566A (en) 2021-04-16

Family

ID=75408641

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011538336.5A Pending CN112668566A (en) 2020-12-23 2020-12-23 Form processing method and device, electronic equipment and storage medium

Country Status (2)

Country Link
CN (1) CN112668566A (en)
WO (1) WO2022134771A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113378789A (en) * 2021-07-08 2021-09-10 京东数科海益信息科技有限公司 Cell position detection method and device and electronic equipment
CN113536951A (en) * 2021-06-22 2021-10-22 科大讯飞股份有限公司 Form recognition method and related device, electronic equipment and storage medium
WO2022134771A1 (en) * 2020-12-23 2022-06-30 深圳壹账通智能科技有限公司 Table processing method and apparatus, and electronic device and storage medium
CN114780645A (en) * 2022-03-29 2022-07-22 杨胜良 Data classification processing method and system based on artificial intelligence and cloud platform
CN115983237A (en) * 2023-03-21 2023-04-18 北京亚信数据有限公司 Form type recognition model training, predicting and form data recommending method and device

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6102156B2 (en) * 2012-09-28 2017-03-29 オムロン株式会社 Image processing system and image processing method
CN109858325B (en) * 2018-12-11 2021-07-02 科大讯飞股份有限公司 Table detection method and device
CN110084117B (en) * 2019-03-22 2021-07-20 中国科学院自动化研究所 Document table line detection method and system based on binary image segmentation projection
CN110210409B (en) * 2019-06-04 2021-04-20 南昌市微轲联信息技术有限公司 Method and system for detecting form frame lines in form document
CN111079697A (en) * 2019-12-27 2020-04-28 湖南特能博世科技有限公司 Table extraction method and device and electronic equipment
CN111860257B (en) * 2020-07-10 2022-11-11 上海交通大学 Table identification method and system fusing multiple text features and geometric information
CN112668566A (en) * 2020-12-23 2021-04-16 深圳壹账通智能科技有限公司 Form processing method and device, electronic equipment and storage medium

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022134771A1 (en) * 2020-12-23 2022-06-30 深圳壹账通智能科技有限公司 Table processing method and apparatus, and electronic device and storage medium
CN113536951A (en) * 2021-06-22 2021-10-22 科大讯飞股份有限公司 Form recognition method and related device, electronic equipment and storage medium
CN113536951B (en) * 2021-06-22 2023-11-24 科大讯飞股份有限公司 Form identification method, related device, electronic equipment and storage medium
CN113378789A (en) * 2021-07-08 2021-09-10 京东数科海益信息科技有限公司 Cell position detection method and device and electronic equipment
CN113378789B (en) * 2021-07-08 2023-09-26 京东科技信息技术有限公司 Cell position detection method and device and electronic equipment
CN114780645A (en) * 2022-03-29 2022-07-22 杨胜良 Data classification processing method and system based on artificial intelligence and cloud platform
CN114780645B (en) * 2022-03-29 2022-10-25 广东科能工程管理有限公司 Data classification processing method and system based on artificial intelligence and cloud platform
CN115983237A (en) * 2023-03-21 2023-04-18 北京亚信数据有限公司 Form type recognition model training, predicting and form data recommending method and device

Also Published As

Publication number Publication date
WO2022134771A1 (en) 2022-06-30

Similar Documents

Publication Publication Date Title
CN112668566A (en) Form processing method and device, electronic equipment and storage medium
CN112016438B (en) Method and system for identifying certificate based on graph neural network
CN111640130A (en) Table reduction method and device
CN109343920B (en) Image processing method and device, equipment and storage medium thereof
CN113139445A (en) Table recognition method, apparatus and computer-readable storage medium
CN110807110B (en) Image searching method and device combining local and global features and electronic equipment
CN110647882A (en) Image correction method, device, equipment and storage medium
CN110765891B (en) Engineering drawing identification method, electronic equipment and related product
CN111985465A (en) Text recognition method, device, equipment and storage medium
CN109165654B (en) Training method of target positioning model and target positioning method and device
CN112988557A (en) Search box positioning method, data acquisition device and medium
CN113436222A (en) Image processing method, image processing apparatus, electronic device, and storage medium
CN115082941A (en) Form information acquisition method and device for form document image
CN113673528B (en) Text processing method, text processing device, electronic equipment and readable storage medium
CN111461070A (en) Text recognition method and device, electronic equipment and storage medium
CN112597940B (en) Certificate image recognition method and device and storage medium
CN110765893B (en) Drawing file identification method, electronic equipment and related product
CN111507181A (en) Bill image correction method and device and computer equipment
CN111179287A (en) Portrait instance segmentation method, device, equipment and storage medium
CN112634382B (en) Method and device for identifying and replacing images of unnatural objects
CN115457581A (en) Table extraction method and device and computer equipment
CN114926829A (en) Certificate detection method and device, electronic equipment and storage medium
CN114495108A (en) Character detection method and device, electronic equipment and readable medium
CN109977937B (en) Image processing method, device and equipment
CN113033325A (en) Image processing method and device, intelligent invoice recognition equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40049892

Country of ref document: HK

SE01 Entry into force of request for substantive examination