CN113887441A - Table data processing method, device, equipment and storage medium - Google Patents

Table data processing method, device, equipment and storage medium

Info

Publication number
CN113887441A
CN113887441A (application CN202111168456.5A)
Authority
CN
China
Prior art keywords: sample, picture, training, line, target table
Prior art date
Legal status
Pending
Application number
CN202111168456.5A
Other languages
Chinese (zh)
Inventor
孙铁
朱运明
王琳婧
苏沁宁
田鸥
Current Assignee
Ping An Bank Co Ltd
Original Assignee
Ping An Bank Co Ltd
Priority date
Filing date
Publication date
Application filed by Ping An Bank Co Ltd
Priority claimed from CN202111168456.5A
Publication of CN113887441A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Abstract

The embodiments of the invention relate to the field of artificial intelligence and disclose a table data processing method, device, equipment, and storage medium. The method comprises the following steps: acquiring sample table training pictures of one or more non-standard tables; adding line labels to the sample table training pictures; inputting each labeled sample table training picture into a preset deep neural network model for training to obtain a table data processing model; inputting a target table picture into the table data processing model to obtain the line information of the target table picture; and detecting and recognizing, through the preset deep neural network model, the text in the target table whose line information has been determined, thereby obtaining the text content of the target table. This realizes extraction of non-standard table content and reproduction of the table, and improves the efficiency and accuracy of non-standard table data processing. The invention also relates to blockchain technology; for example, the data may be written to a blockchain for use in scenarios such as data forensics.

Description

Table data processing method, device, equipment and storage medium
Technical Field
The present invention relates to the field of artificial intelligence, and in particular, to a method, an apparatus, a device, and a storage medium for processing table data.
Background
Many images contain non-standard tables that are not complete "field-character" grid tables (i.e., fully ruled tables) but instead have missing lines, yet all of their information must be recognized for structured output. Prior-art approaches cannot achieve good results in recognizing and outputting merged cells or in recognizing tables with missing lines. A common approach uses OpenCV: convert the image to grayscale, perform edge detection, apply the Hough line transform, filter overlapping lines, select a region of interest (ROI), and extract content after cropping. Line detection has also been implemented in industry with deep-learning object-detection algorithms, but detection of missing lines has not. Moreover, when reproducing table content, current industry practice lacks the one-to-one correspondence of merged cells and the semantic relations among the cells of a table. How to more effectively detect non-standard tables, such as those with missing lines, and recognize their content has therefore become important.
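As a minimal, library-free illustration of ruling-line detection (a simplified stand-in for the grayscale/edge-detection/Hough pipeline described above; the fill threshold is an assumption, not from the patent), rows of a binarized table image whose ink fill exceeds a threshold can be treated as horizontal ruling lines:

```python
import numpy as np

def ruling_line_rows(binary_img, min_fill=0.8):
    # binary_img: 2-D array, truthy where a pixel is "ink".
    # A row that is almost entirely ink is taken to be a horizontal
    # ruling line; min_fill is an illustrative threshold.
    img = np.asarray(binary_img, dtype=bool)
    fill = img.mean(axis=1)                      # fraction of ink per row
    return np.flatnonzero(fill >= min_fill).tolist()

# Transposing the image yields the vertical ruling lines the same way.
```

This projection heuristic only finds visible lines; the hidden (missing) lines the patent targets are what the learned model below recovers.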
Disclosure of Invention
Embodiments of the present invention provide a method, an apparatus, a device, and a medium for processing table data, which implement extraction of content information of a non-standard table and reproduction of a complete table, and improve efficiency and accuracy of processing non-standard table data.
In a first aspect, an embodiment of the present invention provides a table data processing method, including:
obtaining a sample training set, wherein the sample training set comprises one or more sample table training pictures, the one or more sample table training pictures comprise pictures of non-standard tables, and the non-standard tables refer to tables with incomplete table lines;
adding a line label to each sample table training picture in the sample training set, wherein the line label is used for indicating line information of a sample table corresponding to each sample table training picture;
inputting each sample form training picture added with the line label in the sample training set into a preset deep neural network model for training to obtain a form data processing model;
inputting a target table picture to be processed into the table data processing model to obtain line information corresponding to the target table picture;
and detecting and identifying texts in a target table corresponding to the target table picture with the determined line information through a preset deep neural network model to obtain text contents in the target table.
Further, the adding a line label to each sample table training picture in the sample training set includes:
determining line information of sample tables in training pictures of the sample tables in the sample training set, wherein the line information comprises one or more types of lines and position coordinates of each type of line, and the types of the lines comprise a first hidden line parallel to a horizontal plane, a second hidden line vertical to the horizontal plane, a first display line parallel to the horizontal plane and a second display line vertical to the horizontal plane;
and adding corresponding line labels to the sample table training pictures according to the line types of the sample tables in the sample table training pictures and the position information of each line.
Further, the step of inputting each sample form training picture added with the line label in the sample training set into a preset deep neural network model for training to obtain a form data processing model includes:
extracting form feature vectors from each sample form training picture added with the line label, and inputting the form feature vectors into the preset deep neural network model to obtain a loss function value;
when the loss function value does not meet a preset condition, adjusting the model parameters of the preset deep neural network model according to the loss function value, and inputting the training pictures of each sample table added with the line label into the deep neural network model after the model parameters are adjusted for retraining;
and when the loss function value obtained by retraining meets a preset condition, determining to obtain the table data processing model.
Further, the inputting the table feature vector into the preset deep neural network model to obtain a loss function value includes:
inputting the table feature vectors into the preset deep neural network model, and obtaining a loss function value for each category of line in the sample table corresponding to each sample table training picture through a ResNet backbone network module and the TensorFlow slim module in the preset deep neural network model;
and determining the average value of the loss function values of the lines of each category as the loss function value of each sample table training picture according to the loss function value of the line of each category of each sample table training picture.
Further, before the detecting and identifying the text in the target table corresponding to the target table picture of the determined line information through a preset deep neural network model to obtain the text content in the target table, the method further includes:
determining the category and position information of lines in a target table corresponding to the target table picture according to the line information of the target table picture;
and determining each cell in the target table according to the category and the position information of the line in the target table corresponding to the target table picture.
Further, the preset deep neural network model comprises a character detection model and a character recognition model; the detecting and identifying the text in the target table corresponding to the target table picture with the determined line information through a preset deep neural network model to obtain the text content in the target table comprises the following steps:
inputting the line information corresponding to the target table picture into the character detection model to obtain the position information of the text in each cell in the target table corresponding to the target table picture;
and inputting the line information corresponding to the target table picture and the position information of the text in each cell in the target table into the character recognition model to obtain the text content in each cell in the target table.
Further, after the text in the target table corresponding to the target table picture for determining the line information is detected and identified through a preset deep neural network model to obtain the text content in the target table, the method further includes:
inputting the position information and the text content of the text in each cell in the target table into a PtrNet model to obtain a tree structure among the cells in the target table, wherein the tree structure comprises a plurality of nodes, each cell is a node, and the upper-lower relation among the cells is a parent-child node;
and displaying the target table according to the tree structure among the cells in the target table.
In a second aspect, an embodiment of the present invention provides a table data processing apparatus, which is applied to a data management platform, where the data management platform is in communication connection with a big data platform, and the apparatus includes:
the device comprises an acquisition unit, a processing unit and a processing unit, wherein the acquisition unit is used for acquiring a sample training set, the sample training set comprises one or more sample table training pictures, the one or more sample table training pictures comprise pictures of non-standard tables, and the non-standard tables refer to tables with incomplete table lines;
an adding unit, configured to add a line label to each sample table training picture in the sample training set, where the line label is used to indicate line information of a sample table corresponding to each sample table training picture;
the training unit is used for inputting each sample form training picture added with the line label in the sample training set into a preset deep neural network model for training to obtain a form data processing model;
the processing unit is used for inputting a target table picture to be processed into the table data processing model to obtain line information corresponding to the target table picture;
and the recognition unit is used for detecting and recognizing the text in the target table corresponding to the target table picture with the determined line information through a preset deep neural network model to obtain the text content in the target table.
In a third aspect, an embodiment of the present invention provides a computer device comprising a processor and a memory, where the memory is used to store a computer program, and the processor is configured to call the computer program to execute the method of the first aspect.
In a fourth aspect, the present invention provides a computer-readable storage medium, which stores a computer program, where the computer program is executed by a processor to implement the method of the first aspect.
The embodiment of the invention can obtain a sample training set, wherein the sample training set comprises one or more sample form training pictures, the one or more sample form training pictures comprise pictures of non-standard forms, and the non-standard forms refer to forms with incomplete form lines; adding a line label to each sample table training picture in the sample training set, wherein the line label is used for indicating line information of a sample table corresponding to each sample table training picture; inputting each sample form training picture added with the line label in the sample training set into a preset deep neural network model for training to obtain a form data processing model; inputting a target table picture to be processed into the table data processing model to obtain line information corresponding to the target table picture; and detecting and identifying texts in a target table corresponding to the target table picture with the determined line information through a preset deep neural network model to obtain text contents in the target table. In the embodiment of the invention, the extraction of the content information of the non-standard table and the reproduction of the complete table are realized through the mode, and the efficiency and the accuracy of the data processing of the non-standard table are improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present invention; other drawings can be obtained from them by those skilled in the art without creative effort.
FIG. 1 is a schematic flow chart diagram of a table data processing method according to an embodiment of the present invention;
FIG. 2 is a schematic block diagram of a table data processing apparatus according to an embodiment of the present invention;
fig. 3 is a schematic block diagram of a computer device provided by an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The form data processing method provided by the embodiment of the invention can be applied to a form data processing device, in some embodiments, the form data processing device is disposed in a computer device, and in some embodiments, the computer device includes, but is not limited to, one or more of a smartphone, a tablet computer, a laptop computer, and the like.
The embodiment of the invention can obtain a sample training set, wherein the sample training set comprises one or more sample form training pictures, the one or more sample form training pictures comprise pictures of non-standard forms, and the non-standard forms refer to forms with incomplete form lines; adding a line label to each sample table training picture in the sample training set, wherein the line label is used for indicating line information of a sample table corresponding to each sample table training picture; inputting each sample form training picture added with the line label in the sample training set into a preset deep neural network model for training to obtain a form data processing model; inputting a target table picture to be processed into the table data processing model to obtain line information corresponding to the target table picture; and detecting and identifying texts in a target table corresponding to the target table picture with the determined line information through a preset deep neural network model to obtain text contents in the target table.
The method obtains a sample training set comprising one or more sample table training pictures, adds line labels to the sample table training pictures in the sample training set, and inputs each labeled sample table training picture into a preset deep neural network model for training to obtain a table data processing model; inputting a target table picture to be processed into the table data processing model then yields the line information corresponding to the target table picture, realizing classification of the non-standard table data. The text in the target table whose line information has been determined is detected and recognized through a preset deep neural network model to obtain the text content in the target table, realizing extraction of the content information and the cell relations of the non-standard table data as well as reproduction of the table and its content, and improving the extraction efficiency and accuracy for non-standard table data.
The embodiments of the present application can acquire and process related data (such as sample table training pictures and target table pictures to be processed) based on artificial intelligence technology. Artificial Intelligence (AI) is the theory, method, technique, and application system that uses digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain optimal results.
The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
The embodiment of the application can be applied to various fields, such as the field of medical services, the field of financial services and the like.
In one possible implementation, in the field of medical business, the data may be medical form data associated with the medical business, such as medical device form data associated with the medical business, functional usage form data of the medical business, and the like.
The following describes schematically a table data processing method according to an embodiment of the present invention with reference to fig. 1.
Referring to fig. 1, fig. 1 is a schematic flow chart of a table data processing method according to an embodiment of the present invention, and as shown in fig. 1, the method may be executed by a table data processing apparatus, and the table data processing apparatus is disposed in a computer device. Specifically, the method of the embodiment of the present invention includes the following steps.
S101: obtaining a sample training set, wherein the sample training set comprises one or more sample table training pictures, the one or more sample table training pictures comprise pictures of non-standard tables, and the non-standard tables refer to tables with incomplete table lines.
In an embodiment of the present invention, a form data processing apparatus may obtain a sample training set, where the sample training set includes one or more sample form training pictures, and the one or more sample form training pictures include pictures of non-standard forms, where the non-standard forms refer to forms with incomplete form lines.
In some embodiments, tables with incomplete table lines include, but are not limited to, tables with missing lines, disordered lines, and free (unconnected) line segments.
S102: and adding a line label to each sample table training picture in the sample training set, wherein the line label is used for indicating line information of the sample table corresponding to each sample table training picture.
In this embodiment of the present invention, the form data processing apparatus may add a line label to each sample form training picture in the sample training set, where the line label is used to indicate line information of a sample form corresponding to each sample form training picture.
In one embodiment, when adding a line label to each sample table training picture in the sample training set, the table data processing apparatus may determine the line information of the sample table in each sample table training picture, where the line information includes one or more categories of lines and the position coordinates of each line, the categories including a first hidden line parallel to the horizontal plane, a second hidden line perpendicular to the horizontal plane, a first displayed line parallel to the horizontal plane, and a second displayed line perpendicular to the horizontal plane; and then add a corresponding line label to each sample table training picture according to the category of each line of the sample table and its position information.
In one embodiment, when adding a corresponding line label to each sample table training picture according to the type of the line of the sample table in each sample table training picture and the position information of each line, the table data processing apparatus may add a corresponding line label to each sample table training picture by using lines of different colors according to the type of the line of the sample table in each sample table training picture and the position information of each line.
For example, a purple line is used as the line label of the first hidden line in the sample table training picture, a blue line is used as the line label of the second hidden line in the sample table training picture, a red line is used as the line label of the first displayed line in the sample table training picture, and an orange line is used as the line label of the second displayed line in the sample table training picture.
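The colour coding above can be represented as a simple mapping; the category names and BGR colour triples below are illustrative assumptions following the example, not values fixed by the patent:

```python
# Hypothetical category names; BGR colour triples follow the example above.
LINE_LABEL_COLORS = {
    "hidden_horizontal":  (128, 0, 128),  # purple: first hidden line
    "hidden_vertical":    (255, 0, 0),    # blue:   second hidden line
    "visible_horizontal": (0, 0, 255),    # red:    first displayed line
    "visible_vertical":   (0, 165, 255),  # orange: second displayed line
}

def add_line_labels(annotations):
    # annotations: iterable of (category, x1, y1, x2, y2) tuples.
    # Returns one labelled record per line, pairing the position
    # coordinates with the colour used to draw the label.
    return [
        {"category": cat, "coords": (x1, y1, x2, y2),
         "color": LINE_LABEL_COLORS[cat]}
        for cat, x1, y1, x2, y2 in annotations
    ]
```

The same records could equally carry words, letters, or numbers as labels, as the next paragraph notes.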
In some embodiments, the line label may also be represented by different characters such as words, letters, numbers, and the like, and the embodiments of the present invention are not particularly limited.
In one embodiment, when the form data processing device detects that there is a free line segment in the sample form training picture, the free line segment may be merged according to the distance and the inclination angle of the free line segment to obtain a line segment, a straight line, or a wire frame.
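A sketch of such a merge step, assuming pixel-distance and angle thresholds (the patent does not specify values):

```python
import math

def _angle_deg(seg):
    # Inclination of a segment (x1, y1, x2, y2) in [0, 180).
    x1, y1, x2, y2 = seg
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0

def can_merge(seg_a, seg_b, max_gap=5.0, max_angle=3.0):
    # Free segments merge when their inclination angles are close and
    # their nearest endpoints are within max_gap pixels (assumed thresholds).
    diff = abs(_angle_deg(seg_a) - _angle_deg(seg_b))
    if min(diff, 180.0 - diff) > max_angle:
        return False
    ends_a = [seg_a[:2], seg_a[2:]]
    ends_b = [seg_b[:2], seg_b[2:]]
    gap = min(math.dist(p, q) for p in ends_a for q in ends_b)
    return gap <= max_gap

def merge(seg_a, seg_b):
    # Replace two mergeable segments by the span between their
    # two most distant endpoints.
    pts = [seg_a[:2], seg_a[2:], seg_b[:2], seg_b[2:]]
    p, q = max(((p, q) for p in pts for q in pts),
               key=lambda pq: math.dist(*pq))
    return (*p, *q)
```

Repeating this pairwise merge until no pair qualifies turns free segments into the longer line segments, straight lines, or wire frames the embodiment describes.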
S103: and inputting each sample form training picture added with the line label in the sample training set into a preset deep neural network model for training to obtain a form data processing model.
In the embodiment of the invention, the table data processing device can input each sample table training picture added with the line label in the sample training set into a preset deep neural network model for training to obtain the table data processing model. In some embodiments, the preset deep neural network model may be a model structure fine-tuned from ResNet50.
In an embodiment, when each sample table training picture added with a line label in the sample training set is input into a preset deep neural network model for training to obtain a table data processing model, the table data processing device may extract a table feature vector from each sample table training picture added with a line label, and input the table feature vector into the preset deep neural network model to obtain a loss function value; when the loss function value does not meet a preset condition, adjusting the model parameters of the preset deep neural network model according to the loss function value, and inputting the training pictures of each sample table added with the line label into the deep neural network model after the model parameters are adjusted for retraining; and when the loss function value obtained by retraining meets a preset condition, determining to obtain the table data processing model.
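The train-until-the-loss-meets-a-preset-condition loop above can be sketched abstractly; the threshold, epoch cap, and update rule here are illustrative assumptions:

```python
def train_until_converged(model, samples, loss_fn, update_fn,
                          threshold=0.05, max_epochs=100):
    # loss_fn(model, sample) -> float; update_fn(model, loss) -> new model.
    # Mirrors the loop above: compute the loss over the labelled samples;
    # while it fails the preset condition, adjust the model parameters
    # and retrain on the same samples.
    loss = float("inf")
    for epoch in range(max_epochs):
        loss = sum(loss_fn(model, s) for s in samples) / len(samples)
        if loss <= threshold:            # preset condition satisfied
            return model, loss, epoch
        model = update_fn(model, loss)   # adjust model parameters
    return model, loss, max_epochs
```

In practice `model` would be the ResNet-based network and `update_fn` a gradient step; here they are stand-ins so the control flow is explicit.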
In one embodiment, the preset deep neural network model comprises a ResNet backbone network module and the TensorFlow slim module. When inputting the table feature vector into the preset deep neural network model to obtain a loss function value, the table data processing device inputs the table feature vector into the model and obtains, through the ResNet backbone network module and the TensorFlow slim module, the loss function value of each category of line in the sample table corresponding to each sample table training picture; the average of the per-category loss function values is then determined as the loss function value of each sample table training picture.
In one embodiment, the loss function value of each category of line in a sample table training picture is calculated by the following formula (1):

d = 1 − 2|X ∩ Y| / (|X| + |Y|)    (1)

where X ∩ Y denotes the intersection of sets X and Y, |X| and |Y| denote their numbers of elements, and, for this task, X and Y represent the line label and the predicted value of each line category, respectively.
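Equation (1) has the form of the standard Dice loss over binary sets; a direct NumPy rendering, together with the per-picture average described above:

```python
import numpy as np

def category_dice_loss(label_mask, pred_mask):
    # d = 1 - 2|X ∩ Y| / (|X| + |Y|), equation (1), for one line category.
    x = np.asarray(label_mask, dtype=bool)   # X: line label
    y = np.asarray(pred_mask, dtype=bool)    # Y: predicted value
    inter = np.logical_and(x, y).sum()
    return 1.0 - 2.0 * inter / (x.sum() + y.sum())

def picture_loss(per_category_losses):
    # A picture's loss is the average of its per-category line losses.
    return sum(per_category_losses) / len(per_category_losses)
```

The loss is 0 for a perfect prediction and 1 when label and prediction are disjoint.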
The embodiment of the invention fine-tunes ResNet50 using the ResNet backbone network and the TensorFlow slim module, which makes it convenient to load layers of the network structure for adjustment and modification, and enables complete classification on the basis of the ResNet structure.
S104: and inputting a target table picture to be processed into the table data processing model to obtain line information corresponding to the target table picture.
In the embodiment of the present invention, the table data processing apparatus may input a target table picture to be processed into the table data processing model, so as to obtain line information corresponding to the target table picture.
S105: and detecting and identifying texts in a target table corresponding to the target table picture with the determined line information through a preset deep neural network model to obtain text contents in the target table.
In the embodiment of the invention, the table data processing device can detect and identify the text in the target table corresponding to the target table picture with the determined line information through a preset deep neural network model to obtain the text content in the target table.
In one embodiment, before detecting and identifying a text in a target table corresponding to the target table picture for determining line information through a preset deep neural network model to obtain text content in the target table, the table data processing device may determine category and position information of a line in the target table corresponding to the target table picture according to the line information of the target table picture; and determining each cell in the target table according to the category and the position information of the line in the target table corresponding to the target table picture.
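Once the line categories and positions are known, the cells of a fully recovered grid follow from consecutive line coordinates. A simplified sketch (it assumes a plain grid and ignores merged cells, which the patent handles separately):

```python
def cells_from_lines(horizontal_ys, vertical_xs):
    # horizontal_ys / vertical_xs: sorted coordinates of the (displayed or
    # recovered hidden) horizontal and vertical lines of the table.
    # Each adjacent pair of lines in both directions bounds one cell.
    cells = []
    for r in range(len(horizontal_ys) - 1):
        for c in range(len(vertical_xs) - 1):
            cells.append({
                "row": r,
                "col": c,
                "bbox": (vertical_xs[c], horizontal_ys[r],
                         vertical_xs[c + 1], horizontal_ys[r + 1]),
            })
    return cells
```

A table bounded by 3 horizontal and 3 vertical lines thus yields a 2×2 grid of cell bounding boxes.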
In one embodiment, the preset deep neural network model comprises a character detection model and a character recognition model; when the form data processing device detects and identifies the text in the target form corresponding to the target form picture for determining the line information through a preset deep neural network model to obtain the text content in the target form, the form data processing device can input the line information corresponding to the target form picture into the character detection model to obtain the position information of the text in each cell in the target form corresponding to the target form picture; and inputting the line information corresponding to the target table picture and the position information of the text in each cell in the target table into the character recognition model to obtain the text content in each cell in the target table.
In certain embodiments, the character detection model includes, but is not limited to, a Differentiable Binarization Network (DBNet) model, and the character recognition model includes, but is not limited to, a Convolutional Recurrent Neural Network (CRNN) model. The DBNet model is a segmentation-based text detection model; in this task it locates the position of the text within each cell. The CRNN model is a convolutional recurrent neural network structure for image-based sequence recognition; it recognizes text sequences of indefinite length without first segmenting individual characters, converting text recognition into a sequence-learning problem with temporal dependence, i.e., image-based line-text sequence recognition. Recognition of the text is achieved by feeding the detected text positions within each cell into the CRNN model.
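The two-stage detect-then-recognize flow can be orchestrated as below; `detect_fn` and `recognize_fn` are injected stand-ins for the DBNet and CRNN models, since the patent does not specify their APIs:

```python
def extract_cell_texts(cells, detect_fn, recognize_fn):
    # cells: list of {"id": ..., "bbox": ...} derived from the line information.
    # detect_fn(bbox) -> list of text boxes inside the cell (detection stage).
    # recognize_fn(box) -> text string for one box (recognition stage).
    texts = {}
    for cell in cells:
        boxes = detect_fn(cell["bbox"])
        texts[cell["id"]] = [recognize_fn(b) for b in boxes]
    return texts
```

Keeping the two models behind function parameters lets either stage be swapped (e.g., a different detector) without changing the orchestration.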
In one embodiment, after detecting and identifying the text in the target table corresponding to the target table picture whose line information has been determined, thereby obtaining the text content in the target table, the table data processing device may input the position information and text content of the text in each cell of the target table into a PtrNet model to obtain a tree structure among the cells, where the tree structure comprises a plurality of nodes: each cell is a node, and the hierarchical (upper-lower) relation between cells is represented as parent-child nodes. The target table is then displayed according to this tree structure. In some embodiments, the PtrNet model is a pointer-network (Pointer Net) model. In some embodiments, the parent node of a cell may be empty, or there may be one or more; likewise, the child nodes of a cell may be empty, or there may be one or more.
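Displaying the table from the derived tree reduces to walking parent-child links. A minimal sketch with an assumed `{cell_id: parent_id}` encoding (the patent does not fix a representation of the tree):

```python
def render_tree(cell_text, parent):
    # cell_text: {cell_id: text}; parent: {cell_id: parent_id or None}.
    # Cells whose parent is None are roots; indentation shows the
    # upper-lower (parent-child) relation between cells.
    children = {}
    roots = []
    for cid, pid in parent.items():
        if pid is None:
            roots.append(cid)
        else:
            children.setdefault(pid, []).append(cid)
    lines = []
    def walk(cid, depth):
        lines.append("  " * depth + cell_text[cid])
        for child in children.get(cid, []):
            walk(child, depth + 1)
    for root in roots:
        walk(root, 0)
    return "\n".join(lines)
```

For example, a header cell "Assets" with child cells "Cash" and "Receivables" renders as an indented two-level outline.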
In the embodiment of the invention, the table data processing device obtains a sample training set comprising one or more sample table training pictures, adds a line label to each sample table training picture in the sample training set, and inputs each labeled sample table training picture into a preset deep neural network model for training to obtain a table data processing model; inputting a target table picture to be processed into the table data processing model yields the line information corresponding to the target table picture, realizing classification of the non-standard table data. The text in the target table whose line information has been determined is detected and recognized through a preset deep neural network model to obtain the text content in the target table, realizing extraction of the content information and the cell relations of the non-standard table data as well as reproduction of the table and its content, and improving the extraction efficiency and accuracy for non-standard table data.
The embodiment of the invention also provides a table data processing apparatus, which includes units for executing the method of any of the foregoing embodiments. Specifically, referring to fig. 2, fig. 2 is a schematic block diagram of a table data processing apparatus according to an embodiment of the present invention. The table data processing apparatus of this embodiment includes: an obtaining unit 201, an adding unit 202, a training unit 203, a processing unit 204, and a recognition unit 205.
An obtaining unit 201, configured to obtain a sample training set, where the sample training set includes one or more sample table training pictures, and the one or more sample table training pictures include pictures of a non-standard table, where the non-standard table refers to a table with incomplete table lines;
an adding unit 202, configured to add a line label to each sample table training picture in the sample training set, where the line label is used to indicate line information of a sample table corresponding to each sample table training picture;
the training unit 203 is configured to input each sample form training picture added with the line label in the sample training set into a preset deep neural network model for training, so as to obtain a form data processing model;
the processing unit 204 is configured to input a target form picture to be processed into the form data processing model, so as to obtain line information corresponding to the target form picture;
the recognition unit 205 is configured to detect and recognize a text in a target table corresponding to the target table picture with the determined line information through a preset deep neural network model, so as to obtain the text content in the target table.
Further, when the adding unit 202 adds a line label to each sample table training picture in the sample training set, it is specifically configured to:
determining line information of the sample table in each sample table training picture in the sample training set, wherein the line information comprises one or more categories of lines and the position coordinates of each category of line, and the categories of lines comprise a first hidden line parallel to a horizontal plane, a second hidden line perpendicular to the horizontal plane, a first display line parallel to the horizontal plane, and a second display line perpendicular to the horizontal plane;
and adding corresponding line labels to the sample table training pictures according to the line types of the sample tables in the sample table training pictures and the position information of each line.
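The line-label scheme described above (four line categories plus position coordinates per line) might be represented as follows. The dict-based annotation format and the category names are illustrative assumptions, not the patent's actual schema:

```python
# Illustrative line-label annotation for one sample table training
# picture. The four category names mirror the description above; the
# dict-based schema is an assumption, not the patent's actual format.
LINE_CATEGORIES = (
    "hidden_horizontal",   # first hidden line, parallel to the horizontal plane
    "hidden_vertical",     # second hidden line, perpendicular to it
    "visible_horizontal",  # first display line
    "visible_vertical",    # second display line
)

def make_line_label(category, start_xy, end_xy):
    """Build one line label: a line category plus its position coordinates."""
    if category not in LINE_CATEGORIES:
        raise ValueError(f"unknown line category: {category}")
    return {"category": category, "start": start_xy, "end": end_xy}

# One picture's label set: every table line, with category and coordinates.
sample_label = [
    make_line_label("visible_horizontal", (0, 0), (400, 0)),
    make_line_label("hidden_vertical", (200, 0), (200, 120)),
]
```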
Further, the training unit 203 inputs each sample form training picture added with a line label in the sample training set into a preset deep neural network model for training, and when a form data processing model is obtained, the training unit is specifically configured to:
extracting form feature vectors from each sample form training picture added with the line label, and inputting the form feature vectors into the preset deep neural network model to obtain a loss function value;
when the loss function value does not meet a preset condition, adjusting the model parameters of the preset deep neural network model according to the loss function value, and inputting the training pictures of each sample table added with the line label into the deep neural network model after the model parameters are adjusted for retraining;
and when the loss function value obtained by retraining meets a preset condition, determining to obtain the table data processing model.
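The train/adjust/retrain loop above can be sketched as follows. `ToyModel` and the loss threshold are placeholders standing in for the preset deep neural network model and the preset condition; no real feature extraction or gradient computation is shown:

```python
# Sketch of the train/adjust/retrain loop described above. ToyModel is a
# placeholder whose loss simply halves on each parameter adjustment; the
# real model, feature extraction, and preset condition are not specified
# by this illustration.
class ToyModel:
    def __init__(self):
        self.loss_value = 1.0

    def extract_features(self, sample_picture):
        return sample_picture  # stand-in for table feature-vector extraction

    def loss(self, features):
        return self.loss_value

    def adjust_parameters(self, loss):
        self.loss_value = loss / 2  # stand-in for a gradient update

def train_until_converged(model, samples, loss_threshold, max_rounds=100):
    current = None
    for _ in range(max_rounds):
        features = [model.extract_features(s) for s in samples]
        current = model.loss(features)
        if current <= loss_threshold:     # preset condition met:
            break                         # model is the table data processing model
        model.adjust_parameters(current)  # adjust parameters, then retrain
    return model, current
```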
Further, the training unit 203 inputs the table feature vector into the preset deep neural network model, and when obtaining the loss function value, is specifically configured to:
inputting the table feature vectors into the preset deep neural network model, and obtaining a loss function value for each category of line in the sample table corresponding to each sample table training picture through a ResNet backbone network module and the TF-Slim module of TensorFlow in the preset deep neural network model;
and determining the average value of the loss function values of the lines of each category as the loss function value of each sample table training picture according to the loss function value of the line of each category of each sample table training picture.
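Averaging the per-category line losses into a per-picture loss, as described above, reduces to a simple mean. The category names and loss values here are illustrative:

```python
# The loss function value of one sample table training picture is the
# mean of the loss values of its line categories, per the description
# above. Category names and numbers are illustrative.
def picture_loss(per_category_losses):
    values = list(per_category_losses.values())
    return sum(values) / len(values)

category_losses = {
    "hidden_horizontal": 0.2,
    "hidden_vertical": 0.4,
    "visible_horizontal": 0.1,
    "visible_vertical": 0.3,
}
```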
Further, the recognition unit 205 detects and recognizes the text in the target table corresponding to the target table picture with the determined line information through a preset deep neural network model, and before obtaining the text content in the target table, is further configured to:
determining the category and position information of lines in a target table corresponding to the target table picture according to the line information of the target table picture;
and determining each cell in the target table according to the category and the position information of the line in the target table corresponding to the target table picture.
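Determining cells from the category and position information of lines can be sketched as below for the simple case of a plain grid: each cell is the rectangle between two adjacent horizontal lines and two adjacent vertical lines (hidden or display lines alike). Merged cells would need extra handling; this is an assumption-laden illustration, not the patent's algorithm:

```python
# Sketch: derive cells from line positions for a plain grid. Each cell
# is the rectangle between two adjacent horizontal lines and two
# adjacent vertical lines; merged cells are not handled here.
def cells_from_lines(horizontal_ys, vertical_xs):
    ys = sorted(horizontal_ys)
    xs = sorted(vertical_xs)
    cells = []
    for row, (y0, y1) in enumerate(zip(ys, ys[1:])):
        for col, (x0, x1) in enumerate(zip(xs, xs[1:])):
            cells.append({"row": row, "col": col, "box": (x0, y0, x1, y1)})
    return cells
```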
Further, the preset deep neural network model comprises a character detection model and a character recognition model; the recognition unit 205 detects and recognizes a text in a target table corresponding to the target table picture of the determined line information through a preset deep neural network model, and when obtaining a text content in the target table, is specifically configured to:
inputting the line information corresponding to the target table picture into the character detection model to obtain the position information of the text in each cell in the target table corresponding to the target table picture;
and inputting the line information corresponding to the target table picture and the position information of the text in each cell in the target table into the character recognition model to obtain the text content in each cell in the target table.
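The two-stage detect-then-recognize flow above might look like this, with both models stubbed out as callables; the cell format and stub outputs are hypothetical:

```python
# Sketch of the two-stage flow: a detection step yields the position of
# the text in each cell, then a recognition step reads each position.
# Both models are stubbed as callables; the cell format is hypothetical.
def read_table_text(cells, detect, recognize):
    contents = {}
    for cell in cells:
        text_box = detect(cell)                     # position of the text in the cell
        contents[cell["id"]] = recognize(text_box)  # text content of the cell
    return contents

demo_cells = [{"id": "r0c0", "box": (0, 0, 200, 50)}]
detect_stub = lambda cell: cell["box"]
recognize_stub = lambda box: f"text@{box}"
```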
Further, the recognition unit 205 detects and recognizes a text in the target table corresponding to the target table picture with the determined line information through a preset deep neural network model, and after obtaining the text content in the target table, is further configured to:
inputting the position information and the text content of the text in each cell in the target table into a PtrNet model to obtain a tree structure among the cells in the target table, wherein the tree structure comprises a plurality of nodes, each cell is a node, and the upper-lower relation among the cells is a parent-child node;
and displaying the target table according to the tree structure among the cells in the target table.
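Displaying the target table according to the tree structure could, in the simplest rendering, indent each cell's content by its depth in the tree. The adjacency-list tree shape assumed here is an illustration, not the patent's display method:

```python
# Sketch of one way to display the target table from the cell tree:
# walk the parent-child structure and indent each cell's text by its
# depth. The adjacency-list tree shape is an illustrative assumption.
def render_tree(children, node, depth=0, lines=None):
    if lines is None:
        lines = []
    lines.append("  " * depth + node)
    for child in children.get(node, []):
        render_tree(children, child, depth + 1, lines)
    return lines

demo_tree = {"header": ["row1", "row2"], "row1": [], "row2": []}
```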
In the embodiment of the invention, the table data processing device acquires a sample training set including one or more sample table training pictures, adds a line label to each sample table training picture in the sample training set, inputs each sample table training picture with the added line label into a preset deep neural network model for training to obtain a table data processing model, and inputs a target table picture to be processed into the table data processing model to obtain line information corresponding to the target table picture, thereby realizing classification of non-standard table data. The text in the target table corresponding to the target table picture with the determined line information is then detected and recognized through the preset deep neural network model to obtain the text content in the target table, which realizes extraction of the content information and cell relationships of non-standard table data and reproduction of the table and its content, and improves the extraction efficiency and accuracy for non-standard table data.
Referring to fig. 3, fig. 3 is a schematic block diagram of a computer device provided in an embodiment of the present invention. In some embodiments, the computer device in the embodiment shown in fig. 3 may include: one or more processors 301, one or more input devices 302, one or more output devices 303, and a memory 304. The processor 301, the input device 302, the output device 303, and the memory 304 are connected by a bus 305. The memory 304 is used for storing a computer program, and the processor 301 is used for executing the program stored in the memory 304. The processor 301 is configured to invoke the program to perform:
obtaining a sample training set, wherein the sample training set comprises one or more sample table training pictures, the one or more sample table training pictures comprise pictures of non-standard tables, and the non-standard tables refer to tables with incomplete table lines;
adding a line label to each sample table training picture in the sample training set, wherein the line label is used for indicating line information of a sample table corresponding to each sample table training picture;
inputting each sample form training picture added with the line label in the sample training set into a preset deep neural network model for training to obtain a form data processing model;
inputting a target table picture to be processed into the table data processing model to obtain line information corresponding to the target table picture;
and detecting and identifying texts in a target table corresponding to the target table picture with the determined line information through a preset deep neural network model to obtain text contents in the target table.
Further, when the processor 301 adds a line label to each sample table training picture in the sample training set, it is specifically configured to:
determining line information of the sample table in each sample table training picture in the sample training set, wherein the line information comprises one or more categories of lines and the position coordinates of each category of line, and the categories of lines comprise a first hidden line parallel to a horizontal plane, a second hidden line perpendicular to the horizontal plane, a first display line parallel to the horizontal plane, and a second display line perpendicular to the horizontal plane;
and adding corresponding line labels to the sample table training pictures according to the line types of the sample tables in the sample table training pictures and the position information of each line.
Further, the processor 301 inputs each sample form training picture added with a line label in the sample training set into a preset deep neural network model for training, and when a form data processing model is obtained, the processor is specifically configured to:
extracting form feature vectors from each sample form training picture added with the line label, and inputting the form feature vectors into the preset deep neural network model to obtain a loss function value;
when the loss function value does not meet a preset condition, adjusting the model parameters of the preset deep neural network model according to the loss function value, and inputting the training pictures of each sample table added with the line label into the deep neural network model after the model parameters are adjusted for retraining;
and when the loss function value obtained by retraining meets a preset condition, determining to obtain the table data processing model.
Further, the processor 301 inputs the table feature vector into the preset deep neural network model, and when obtaining the loss function value, is specifically configured to:
inputting the table feature vectors into the preset deep neural network model, and obtaining a loss function value for each category of line in the sample table corresponding to each sample table training picture through a ResNet backbone network module and the TF-Slim module of TensorFlow in the preset deep neural network model;
and determining the average value of the loss function values of the lines of each category as the loss function value of each sample table training picture according to the loss function value of the line of each category of each sample table training picture.
Further, the processor 301 detects and identifies a text in the target table corresponding to the target table picture with the determined line information through a preset deep neural network model, and before obtaining a text content in the target table, is further configured to:
determining the category and position information of lines in a target table corresponding to the target table picture according to the line information of the target table picture;
and determining each cell in the target table according to the category and the position information of the line in the target table corresponding to the target table picture.
Further, the preset deep neural network model comprises a character detection model and a character recognition model; the processor 301 detects and identifies a text in a target table corresponding to the target table picture with line information determined through a preset deep neural network model, and when obtaining text content in the target table, is specifically configured to:
inputting the line information corresponding to the target table picture into the character detection model to obtain the position information of the text in each cell in the target table corresponding to the target table picture;
and inputting the line information corresponding to the target table picture and the position information of the text in each cell in the target table into the character recognition model to obtain the text content in each cell in the target table.
Further, the processor 301 detects and identifies a text in a target table corresponding to the target table picture with the determined line information through a preset deep neural network model, and after obtaining a text content in the target table, is further configured to:
inputting the position information and the text content of the text in each cell in the target table into a PtrNet model to obtain a tree structure among the cells in the target table, wherein the tree structure comprises a plurality of nodes, each cell is a node, and the upper-lower relation among the cells is a parent-child node;
and displaying the target table according to the tree structure among the cells in the target table.
In the embodiment of the invention, the computer device acquires a sample training set including one or more sample table training pictures, where the one or more sample table training pictures include pictures of non-standard tables, adds a line label to each sample table training picture in the sample training set, inputs each sample table training picture with the added line label into a preset deep neural network model for training to obtain a table data processing model, and inputs a target table picture to be processed into the table data processing model to obtain line information corresponding to the target table picture, thereby realizing classification of non-standard table data. The text in the target table corresponding to the target table picture with the determined line information is then detected and recognized through the preset deep neural network model to obtain the text content in the target table, which realizes extraction of the content information and cell relationships of non-standard table data and reproduction of the table and its content, and improves the extraction efficiency and accuracy for non-standard table data.
It should be understood that, in the embodiment of the present invention, the processor 301 may be a central processing unit (CPU); the processor may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The input device 302 may include a touch pad, a microphone, etc., and the output device 303 may include a display (LCD, etc.), a speaker, etc.
The memory 304 may include a read-only memory and a random access memory, and provides instructions and data to the processor 301. A portion of the memory 304 may also include non-volatile random access memory. For example, the memory 304 may also store device type information.
In a specific implementation, the processor 301, the input device 302, and the output device 303 described in this embodiment of the present invention may execute the implementation described in the method embodiment shown in fig. 1 provided in this embodiment of the present invention, and may also execute the implementation of the table data processing apparatus described in fig. 2 in this embodiment of the present invention, which is not described herein again.
The embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the method for processing table data described in the embodiment corresponding to fig. 1 may be implemented, or a table data processing apparatus according to the embodiment corresponding to fig. 2 may also be implemented, which is not described herein again.
The computer readable storage medium may be an internal storage unit of the table data processing apparatus according to any of the foregoing embodiments, for example, a hard disk or a memory of the table data processing apparatus. The computer readable storage medium may also be an external storage device of the table data processing apparatus, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like provided on the table data processing apparatus. Further, the computer-readable storage medium may also include both an internal storage unit and an external storage device of the table data processing device. The computer-readable storage medium is used to store the computer program and other programs and data required by the table data processing apparatus. The computer readable storage medium may also be used to temporarily store data that has been output or is to be output.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention essentially or partially contributes to the prior art, or all or part of the technical solution can be embodied in the form of a software product stored in a computer-readable storage medium, which includes several instructions for causing a computer device (which may be a personal computer, a terminal, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned computer-readable storage media comprise: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes. The computer-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to the use of the blockchain node, and the like.
It is emphasized that, to further ensure the privacy and security of the data, the data may also be stored in a node of a blockchain. A blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks associated using cryptographic methods, where each data block contains information about a batch of network transactions, used to verify the validity (anti-counterfeiting) of the information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
The above description is only a part of the embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the present invention, and these modifications or substitutions should be covered within the scope of the present invention.

Claims (10)

1. A method for processing tabular data, comprising:
obtaining a sample training set, wherein the sample training set comprises one or more sample table training pictures, the one or more sample table training pictures comprise pictures of non-standard tables, and the non-standard tables refer to tables with incomplete table lines;
adding a line label to each sample table training picture in the sample training set, wherein the line label is used for indicating line information of a sample table corresponding to each sample table training picture;
inputting each sample form training picture added with the line label in the sample training set into a preset deep neural network model for training to obtain a form data processing model;
inputting a target table picture to be processed into the table data processing model to obtain line information corresponding to the target table picture;
and detecting and identifying texts in a target table corresponding to the target table picture with the determined line information through a preset deep neural network model to obtain text contents in the target table.
2. The method of claim 1, wherein the adding a line label to each sample form training picture in the sample training set comprises:
determining line information of the sample table in each sample table training picture in the sample training set, wherein the line information comprises one or more categories of lines and the position coordinates of each category of line, and the categories of lines comprise a first hidden line parallel to a horizontal plane, a second hidden line perpendicular to the horizontal plane, a first display line parallel to the horizontal plane, and a second display line perpendicular to the horizontal plane;
and adding corresponding line labels to the sample table training pictures according to the line types of the sample tables in the sample table training pictures and the position information of each line.
3. The method according to claim 2, wherein the inputting of each sample table training picture added with a line label in the sample training set into a preset deep neural network model for training to obtain a table data processing model comprises:
extracting form feature vectors from each sample form training picture added with the line label, and inputting the form feature vectors into the preset deep neural network model to obtain a loss function value;
when the loss function value does not meet a preset condition, adjusting the model parameters of the preset deep neural network model according to the loss function value, and inputting the training pictures of each sample table added with the line label into the deep neural network model after the model parameters are adjusted for retraining;
and when the loss function value obtained by retraining meets a preset condition, determining to obtain the table data processing model.
4. The method of claim 3, wherein inputting the table feature vector into the pre-defined deep neural network model to obtain a loss function value comprises:
inputting the table feature vectors into the preset deep neural network model, and obtaining a loss function value for each category of line in the sample table corresponding to each sample table training picture through a ResNet backbone network module and a TF-Slim module of TensorFlow in the preset deep neural network model;
and determining the average value of the loss function values of the lines of each category as the loss function value of each sample table training picture according to the loss function value of the line of each category of each sample table training picture.
5. The method according to claim 1, wherein before detecting and recognizing the text in the target table corresponding to the target table picture for determining the line information through a preset deep neural network model to obtain the text content in the target table, the method further comprises:
determining the category and position information of lines in a target table corresponding to the target table picture according to the line information of the target table picture;
and determining each cell in the target table according to the category and the position information of the line in the target table corresponding to the target table picture.
6. The method of claim 5, wherein the preset deep neural network model comprises a character detection model and a character recognition model; the detecting and identifying the text in the target table corresponding to the target table picture with the determined line information through a preset deep neural network model to obtain the text content in the target table comprises the following steps:
inputting the line information corresponding to the target table picture into the character detection model to obtain the position information of the text in each cell in the target table corresponding to the target table picture;
and inputting the line information corresponding to the target table picture and the position information of the text in each cell in the target table into the character recognition model to obtain the text content in each cell in the target table.
7. The method according to claim 6, wherein after detecting and recognizing the text in the target table corresponding to the target table picture for determining the line information through a preset deep neural network model to obtain the text content in the target table, the method further comprises:
inputting the position information and the text content of the text in each cell in the target table into a PtrNet model to obtain a tree structure among the cells in the target table, wherein the tree structure comprises a plurality of nodes, each cell is a node, and the upper-lower relation among the cells is a parent-child node;
and displaying the target table according to the tree structure among the cells in the target table.
8. A form data processing apparatus, comprising:
the device comprises an acquisition unit, a processing unit and a processing unit, wherein the acquisition unit is used for acquiring a sample training set, the sample training set comprises one or more sample table training pictures, the one or more sample table training pictures comprise pictures of non-standard tables, and the non-standard tables refer to tables with incomplete table lines;
an adding unit, configured to add a line label to each sample table training picture in the sample training set, where the line label is used to indicate line information of a sample table corresponding to each sample table training picture;
the training unit is used for inputting each sample form training picture added with the line label in the sample training set into a preset deep neural network model for training to obtain a form data processing model;
the processing unit is used for inputting a target table picture to be processed into the table data processing model to obtain line information corresponding to the target table picture;
and the recognition unit is used for detecting and recognizing the text in the target table corresponding to the target table picture with the determined line information through a preset deep neural network model to obtain the text content in the target table.
9. A computer device comprising a processor and a memory, wherein the memory is configured to store a computer program and the processor is configured to invoke the computer program to perform the method of any of claims 1-7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which is executed by a processor to implement the method of any one of claims 1-7.
CN202111168456.5A 2021-09-30 2021-09-30 Table data processing method, device, equipment and storage medium Pending CN113887441A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111168456.5A CN113887441A (en) 2021-09-30 2021-09-30 Table data processing method, device, equipment and storage medium


Publications (1)

Publication Number Publication Date
CN113887441A true CN113887441A (en) 2022-01-04

Family

ID=79005495


Country Status (1)

Country Link
CN (1) CN113887441A (en)

Similar Documents

Publication Publication Date Title
US10140511B2 (en) Building classification and extraction models based on electronic forms
CN110197146B (en) Face image analysis method based on deep learning, electronic device and storage medium
CN113837151B (en) Table image processing method and device, computer equipment and readable storage medium
CN112883926B (en) Identification method and device for form medical images
CN110795714A (en) Identity authentication method and device, computer equipment and storage medium
WO2023231380A1 (en) Electrode plate defect recognition method and apparatus, and electrode plate defect recognition model training method and apparatus, and electronic device
CN109784339A (en) Picture recognition test method, device, computer equipment and storage medium
CN115758451A (en) Data labeling method, device, equipment and storage medium based on artificial intelligence
CN114005126A (en) Table reconstruction method and device, computer equipment and readable storage medium
CN112232336A (en) Certificate identification method, device, equipment and storage medium
CN114022891A (en) Method, device and equipment for extracting key information of scanned text and storage medium
CN112613367A (en) Bill information text box acquisition method, system, equipment and storage medium
CN115578736A (en) Certificate information extraction method, device, storage medium and equipment
CN113887441A (en) Table data processing method, device, equipment and storage medium
CN115690819A (en) Big data-based identification method and system
CN112395834B (en) Brain graph generation method, device and equipment based on picture input and storage medium
CN112149523B (en) Method and device for identifying and extracting pictures based on deep learning and parallel-searching algorithm
CN114818627A (en) Form information extraction method, device, equipment and medium
CN114529933A (en) Contract data difference comparison method, device, equipment and medium
CN114495146A (en) Image text detection method and device, computer equipment and storage medium
CN114120305A (en) Training method of text classification model, and recognition method and device of text content
CN114049686A (en) Signature recognition model training method and device and electronic equipment
CN113111882B (en) Card identification method and device, electronic equipment and storage medium
CN112949450B (en) Bill processing method, device, electronic equipment and storage medium
CN113435331B (en) Image character recognition method, system, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination