CN113887441B - Table data processing method, device, equipment and storage medium - Google Patents

Table data processing method, device, equipment and storage medium

Info

Publication number
CN113887441B
CN113887441B
Authority
CN
China
Prior art keywords
sample
training
line
picture
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111168456.5A
Other languages
Chinese (zh)
Other versions
CN113887441A (en)
Inventor
孙铁
朱运明
王琳婧
苏沁宁
田鸥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Bank Co Ltd
Original Assignee
Ping An Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Bank Co Ltd filed Critical Ping An Bank Co Ltd
Priority to CN202111168456.5A priority Critical patent/CN113887441B/en
Publication of CN113887441A publication Critical patent/CN113887441A/en
Application granted granted Critical
Publication of CN113887441B publication Critical patent/CN113887441B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 — Pattern recognition
    • G06F18/20 — Analysing
    • G06F18/21 — Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the invention relates to the field of artificial intelligence and discloses a table data processing method, apparatus, device and storage medium, wherein the method comprises the following steps: acquiring sample table training pictures of one or more non-standard tables; adding line labels to each sample table training picture; inputting each sample table training picture with line labels added into a preset deep neural network model for training to obtain a table data processing model; inputting a target table picture into the table data processing model to obtain line information of the target table picture; and detecting and identifying, through a preset deep neural network model, the text in the target table corresponding to the target table picture whose line information has been determined, so as to obtain the text content of the target table. This realizes extraction of the content of non-standard tables and reproduction of the tables, and improves the efficiency and accuracy of non-standard table data processing. The invention also relates to blockchain technology; for example, the data involved may be written into a blockchain for use in scenarios such as data forensics.

Description

Table data processing method, device, equipment and storage medium
Technical Field
The present invention relates to the field of artificial intelligence, and in particular, to a method, an apparatus, a device, and a storage medium for processing table data.
Background
In various images there are often non-standard tables, that is, tables other than fully ruled grids: some lines are missing, yet each piece of information still needs to be recognized for structured output. At present, the industry cannot achieve good results in recognizing merged cells or in recognition when table lines are missing. A common practice is to convert the image into a grayscale image with OpenCV, perform edge detection, apply a Hough line transform and overlap filtering, then select a region of interest (ROI) and crop the image for content extraction. There are also techniques in the industry that detect lines through deep-learning object detection algorithms, but these lack detection of missing lines. In the reproduction of table content, the industry also lacks a one-to-one reproduction of in-table semantic relationships such as merged cells and the relationships between cells. Therefore, how to more effectively detect non-standard tables with missing lines and recognize their content has become an important research topic.
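The classical OpenCV pipeline described above can be illustrated in miniature. The sketch below is a pure-Python, dependency-free stand-in for the Hough line transform step (the function name and the synthetic edge pixels are illustrative assumptions, not part of the patent): it votes in (rho, theta) space and reports the best-supported line.

```python
import math

def hough_peak(points, theta_steps=180):
    # Accumulate votes in (rho, theta) space for a set of edge pixels,
    # then return the best-supported (rho, theta_degrees) bin and its votes.
    acc = {}
    for x, y in points:
        for t in range(theta_steps):  # theta in whole degrees
            theta = math.pi * t / theta_steps
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            acc[(rho, t)] = acc.get((rho, t), 0) + 1
    (rho, t), votes = max(acc.items(), key=lambda kv: kv[1])
    return rho, t, votes

# A synthetic horizontal table line at y = 5 (20 edge pixels):
rho, theta_deg, votes = hough_peak([(x, 5) for x in range(20)])
# All 20 pixels agree on a line near rho = 5, theta ~ 90 degrees (horizontal).
```

In a real pipeline this accumulator would run over the edge map produced by an edge detector such as `cv2.Canny`, and nearby overlapping detections would then be filtered, as the background section describes.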
Disclosure of Invention
The embodiment of the invention provides a method, a device, equipment and a medium for processing table data, which realize extraction of content information of a nonstandard table and reproduction of a complete table and improve the efficiency and accuracy of processing the nonstandard table data.
In a first aspect, an embodiment of the present invention provides a table data processing method, including:
acquiring a sample training set, wherein the sample training set comprises one or more sample table training pictures, the one or more sample table training pictures comprise pictures of non-standard tables, and the non-standard tables refer to tables with incomplete table lines;
adding line labels to each sample table training picture in the sample training set, wherein the line labels are used for indicating line information of sample tables corresponding to each sample table training picture;
inputting each sample form training picture added with line labels in the sample training set into a preset deep neural network model for training to obtain a form data processing model;
inputting a target table picture to be processed into the table data processing model to obtain line information corresponding to the target table picture;
And detecting and identifying, through a preset deep neural network model, the text in the target table corresponding to the target table picture whose line information has been determined, to obtain the text content in the target table.
Further, the adding line labels to each sample table training picture in the sample training set includes:
Determining line information of sample tables in each sample table training picture in the sample training set, wherein the line information comprises one or more types of lines and position coordinates of each line, and the types of the lines comprise a first hidden line parallel to a horizontal plane, a second hidden line perpendicular to the horizontal plane, a first display line parallel to the horizontal plane and a second display line perpendicular to the horizontal plane;
And adding corresponding line labels to the sample table training pictures according to the types of the lines of the sample table in the sample table training pictures and the position information of each line.
Further, inputting each sample table training picture added with the line label in the sample training set into a preset deep neural network model for training to obtain a table data processing model, which comprises the following steps:
Extracting form feature vectors from each sample form training picture added with the line labels, and inputting the form feature vectors into the preset deep neural network model to obtain loss function values;
When the loss function value does not meet a preset condition, adjusting model parameters of the preset deep neural network model according to the loss function value, and inputting training pictures of each sample form added with the line labels into the deep neural network model with the model parameters adjusted for retraining;
and when the loss function value obtained through retraining meets a preset condition, determining to obtain the table data processing model.
Further, the inputting the table feature vector into the preset deep neural network model to obtain a loss function value includes:
Inputting the table feature vector into the preset deep neural network model, and obtaining a loss function value for each category of line in the sample table corresponding to each sample table training picture through the ResNet backbone network module and the slim module of TensorFlow in the preset deep neural network model;
and determining the average value of the loss function values of the lines of each category as the loss function value of each sample table training picture according to the loss function value of the lines of each category of each sample table training picture.
Further, before the text in the target table corresponding to the target table picture whose line information has been determined is detected and identified through the preset deep neural network model to obtain the text content in the target table, the method further comprises:
determining the category and position information of lines in a target table corresponding to the target table picture according to the line information of the target table picture;
And determining each cell in the target table according to the category and the position information of the line in the target table corresponding to the target table picture.
Further, the preset deep neural network model comprises a text detection model and a text recognition model; the detecting and identifying, through a preset deep neural network model, of the text in the target table corresponding to the target table picture whose line information has been determined, to obtain the text content in the target table, comprises:
inputting line information corresponding to the target table picture into the text detection model to obtain position information of texts in each cell in a target table corresponding to the target table picture;
and inputting line information corresponding to the target table picture and position information of the text in each cell in the target table into the text recognition model to obtain text content in each cell in the target table.
Further, after the text in the target table corresponding to the target table picture whose line information has been determined is detected and identified through the preset deep neural network model to obtain the text content in the target table, the method further comprises:
Inputting the position information and text content of the text in each cell in the target table into a PtrNet model to obtain a tree structure among the cells in the target table, wherein the tree structure comprises a plurality of nodes, each cell is a node, and hierarchical relationships between cells are represented as parent-child nodes;
And displaying the target table according to the tree structure among all the cells in the target table.
In a second aspect, an embodiment of the present invention provides a table data processing apparatus, applied to a data management platform, where the data management platform is communicatively connected to a big data platform, the apparatus comprising:
the system comprises an acquisition unit, a processing unit and a processing unit, wherein the acquisition unit is used for acquiring a sample training set, the sample training set comprises one or more sample table training pictures, the one or more sample table training pictures comprise pictures of non-standard tables, and the non-standard tables refer to tables with incomplete table lines;
The adding unit is used for adding line labels to each sample table training picture in the sample training set, and the line labels are used for indicating line information of sample tables corresponding to each sample table training picture;
The training unit is used for inputting each sample form training picture added with the line label in the sample training set into a preset deep neural network model for training to obtain a form data processing model;
The processing unit is used for inputting a target table picture to be processed into the table data processing model to obtain line information corresponding to the target table picture;
the identification unit is used for detecting and identifying, through a preset deep neural network model, the text in the target table corresponding to the target table picture whose line information has been determined, to obtain the text content in the target table.
In a third aspect, an embodiment of the present invention provides a computer device, including a processor and a memory, where the memory is configured to store a computer program, the computer program including a program, and the processor is configured to invoke the computer program to perform the method of the first aspect.
In a fourth aspect, embodiments of the present invention provide a computer-readable storage medium storing a computer program for execution by a processor to implement the method of the first aspect.
According to the embodiment of the invention, a sample training set can be obtained, wherein the sample training set comprises one or more sample table training pictures, the one or more sample table training pictures comprise pictures of non-standard tables, and the non-standard tables refer to tables with incomplete table lines; line labels are added to each sample table training picture in the sample training set, wherein the line labels are used for indicating line information of the sample tables corresponding to each sample table training picture; each sample table training picture with line labels added in the sample training set is input into a preset deep neural network model for training to obtain a table data processing model; a target table picture to be processed is input into the table data processing model to obtain line information corresponding to the target table picture; and the text in the target table corresponding to the target table picture whose line information has been determined is detected and identified through a preset deep neural network model to obtain the text content in the target table. The embodiment of the invention realizes the extraction of the content information of non-standard tables and the reproduction of the complete table, and improves the efficiency and accuracy of non-standard table data processing.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a table data processing method according to an embodiment of the present invention;
FIG. 2 is a schematic block diagram of a form data processing apparatus provided in an embodiment of the present invention;
Fig. 3 is a schematic block diagram of a computer device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The table data processing method provided by the embodiment of the invention can be applied to a table data processing device, in some embodiments, the table data processing device is arranged in a computer device, and in some embodiments, the computer device comprises one or more of a smart phone, a tablet computer, a laptop computer and the like.
According to the embodiment of the invention, a sample training set can be obtained, wherein the sample training set comprises one or more sample table training pictures, the one or more sample table training pictures comprise pictures of non-standard tables, and the non-standard tables refer to tables with incomplete table lines; line labels are added to each sample table training picture in the sample training set, wherein the line labels are used for indicating line information of the sample tables corresponding to each sample table training picture; each sample table training picture with line labels added in the sample training set is input into a preset deep neural network model for training to obtain a table data processing model; a target table picture to be processed is input into the table data processing model to obtain line information corresponding to the target table picture; and the text in the target table corresponding to the target table picture whose line information has been determined is detected and identified through a preset deep neural network model to obtain the text content in the target table.
According to the embodiment of the invention, a sample training set comprising one or more sample table training pictures is obtained, where the one or more sample table training pictures include pictures of non-standard tables; line labels are added to each sample table training picture in the sample training set; each sample table training picture with line labels added is input into a preset deep neural network model for training to obtain a table data processing model; and a target table picture to be processed is input into the table data processing model to obtain line information corresponding to the target table picture, thereby realizing classification of non-standard table data. The text in the target table corresponding to the target table picture whose line information has been determined is then detected and identified through a preset deep neural network model to obtain the text content in the target table, which realizes extraction of the content information of non-standard table data and of the relationships between table cells, as well as reproduction of the table and its content, thereby improving the efficiency and accuracy of non-standard table data extraction.
The embodiment of the application can acquire and process related data (such as sample table training pictures, target table pictures to be processed, and the like) based on artificial intelligence technology. Artificial Intelligence (AI) is the theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain optimal results.
Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and other directions.
The embodiment of the application can be applied to various fields, such as the medical service field, the financial service field and the like.
In one possible implementation, the data may be medical form data associated with a medical business, such as medical device form data associated with a medical business, functional usage form data of a medical business, and the like, in the medical business arts.
The table data processing method provided by the embodiment of the invention is schematically described below with reference to fig. 1.
Referring to fig. 1, fig. 1 is a schematic flowchart of a table data processing method according to an embodiment of the present invention, and as shown in fig. 1, the method may be performed by a table data processing apparatus, where the table data processing apparatus is disposed in a computer device. Specifically, the method of the embodiment of the invention comprises the following steps.
S101: a sample training set is obtained, wherein the sample training set comprises one or more sample table training pictures, the one or more sample table training pictures comprise pictures of non-standard tables, and the non-standard tables refer to tables with incomplete table lines.
In the embodiment of the invention, the table data processing device may acquire a sample training set, where the sample training set includes one or more sample table training pictures, and the one or more sample table training pictures include pictures of a non-standard table, where the non-standard table refers to a table with incomplete table lines.
In some embodiments, incomplete table lines include, but are not limited to, missing table lines, disordered table lines, free line segments, and the like.
S102: and adding line labels to each sample table training picture in the sample training set, wherein the line labels are used for indicating line information of sample tables corresponding to each sample table training picture.
In the embodiment of the invention, the table data processing device can add line labels to each sample table training picture in the sample training set, wherein the line labels are used for indicating line information of sample tables corresponding to each sample table training picture.
In one embodiment, when adding a line label to each sample table training picture in the sample training set, the table data processing device may determine line information of sample tables in each sample table training picture in the sample training set, where the line information includes one or more types of lines and position coordinates of each line, and the types of lines include a first hidden line parallel to a horizontal plane, a second hidden line perpendicular to the horizontal plane, a first display line parallel to the horizontal plane, and a second display line perpendicular to the horizontal plane; and adding corresponding line labels to the sample table training pictures according to the types of the lines of the sample table in the sample table training pictures and the position information of each line.
In one embodiment, when adding corresponding line labels to the sample table training pictures according to the types of the lines of the sample table in the sample table training pictures and the position information of each line, the table data processing device may add corresponding line labels to the sample table training pictures by using lines with different colors according to the types of the lines of the sample table in the sample table training pictures and the position information of each line.
For example, a purple line is used as a line label for a first hidden line in a sample form training picture, a blue line is used as a line label for a second hidden line in the sample form training picture, a red line is used as a line label for a first displayed line in the sample form training picture, and an orange line is used as a line label for a second displayed line in the sample form training picture.
In some embodiments, the line tag may be represented by different characters, such as letters, numbers, and the like, which are not particularly limited in the embodiments of the present invention.
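A minimal sketch of such a labelling scheme, assuming the four line categories and the example colors given above (the dictionary keys and function name are illustrative, not the patent's actual annotation format):

```python
# Four line categories from the embodiment: hidden/visible x horizontal/vertical,
# each mapped to the example annotation color mentioned above.
LINE_CLASSES = {
    "hidden_horizontal": "purple",
    "hidden_vertical": "blue",
    "visible_horizontal": "red",
    "visible_vertical": "orange",
}

def make_line_label(category, x1, y1, x2, y2):
    # A line label couples a category (hence a color) with position coordinates.
    if category not in LINE_CLASSES:
        raise ValueError(f"unknown line category: {category}")
    return {"category": category,
            "color": LINE_CLASSES[category],
            "coords": (x1, y1, x2, y2)}
```

As the text notes, the labels could equally be represented by characters such as letters or numbers instead of colors.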
In one embodiment, when the table data processing device detects that free line segments exist in a sample table training picture, the free line segments can be merged according to their distances and inclination angles to obtain a line segment, a straight line, or a wire frame.
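A sketch of that merging rule, under the assumption that two free segments are merged when their endpoints are close and their inclination angles nearly agree (the thresholds and function names are illustrative):

```python
import math

def seg_angle(seg):
    # Inclination angle of a segment ((x1, y1), (x2, y2)) in [0, 180) degrees.
    (x1, y1), (x2, y2) = seg
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0

def can_merge(seg_a, seg_b, max_gap=5.0, max_angle_deg=3.0):
    # Merge only if the inclination angles nearly agree and the gap between
    # the end of seg_a and the start of seg_b is small.
    da = abs(seg_angle(seg_a) - seg_angle(seg_b))
    da = min(da, 180.0 - da)  # angles wrap around at 180 degrees
    gap = math.dist(seg_a[1], seg_b[0])
    return da <= max_angle_deg and gap <= max_gap

def merge(seg_a, seg_b):
    # The merged segment spans from the start of seg_a to the end of seg_b.
    return (seg_a[0], seg_b[1])
```

Applying this rule repeatedly over collinear fragments yields the longer line segments, straight lines, or wire frames mentioned above.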
S103: and inputting each sample table training picture added with the line label in the sample training set into a preset deep neural network model for training to obtain a table data processing model.
In the embodiment of the invention, the table data processing device can input each sample table training picture added with the line label in the sample training set into a preset deep neural network model for training to obtain a table data processing model. In some embodiments, the preset deep neural network model may be a model structure based on ResNet after fine tuning.
In one embodiment, when the table data processing device inputs each sample table training picture added with the line tag in the sample training set into a preset deep neural network model to train to obtain a table data processing model, table feature vectors can be extracted from each sample table training picture added with the line tag, and the table feature vectors are input into the preset deep neural network model to obtain a loss function value; when the loss function value does not meet a preset condition, adjusting model parameters of the preset deep neural network model according to the loss function value, and inputting training pictures of each sample form added with the line labels into the deep neural network model with the model parameters adjusted for retraining; and when the loss function value obtained through retraining meets a preset condition, determining to obtain the table data processing model.
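The train-adjust-retrain cycle described here can be sketched as follows. The step function and threshold below are stand-ins (assumptions), not the patent's actual ResNet-based model or loss:

```python
def train_until_converged(step_fn, params, threshold=0.05, max_epochs=100):
    # Repeatedly: compute the loss, adjust the model parameters, and retrain,
    # stopping once the loss function value satisfies the preset condition.
    loss = float("inf")
    for epoch in range(max_epochs):
        loss, params = step_fn(params)
        if loss < threshold:          # preset condition met: model obtained
            return params, loss, epoch
    return params, loss, max_epochs   # budget exhausted without converging

# Toy step function standing in for one forward pass plus parameter update:
# here the "loss" simply halves every epoch.
def toy_step(p):
    new_p = p * 0.5
    return new_p, new_p

params, loss, epochs = train_until_converged(toy_step, 1.0)
```

The real training loop would extract table feature vectors from the labelled pictures and backpropagate the loss of formula (1) through the network; only the control flow is shown here.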
In one embodiment, the preset deep neural network model includes a ResNet backbone network module and the slim module of TensorFlow; when the table feature vector is input into the preset deep neural network model to obtain a loss function value, the table data processing device inputs the table feature vector into the preset deep neural network model, and obtains the loss function value for each category of line in the sample table corresponding to each sample table training picture through the ResNet backbone network module and the slim module of TensorFlow in the preset deep neural network model; and determines, according to the loss function values of the line categories of each sample table training picture, the average of those loss function values as the loss function value of that sample table training picture.
In one embodiment, the loss function value for each category of line of each sample table training picture is calculated as shown in the following formula (1):
d = 1 - 2|X ∩ Y| / (|X| + |Y|)    (1)
where |X ∩ Y| denotes the number of elements in the intersection of sets X and Y, |X| and |Y| denote the numbers of elements in X and Y respectively, and for this task X and Y represent the line labels and the predicted results for each line category, respectively.
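Formula (1) is the Dice loss. A minimal sketch, representing the labelled and predicted pixels of one line category as sets (the set representation is an illustrative assumption; in the network the loss would be computed over mask tensors):

```python
def dice_loss(labels, preds):
    # d = 1 - 2|X & Y| / (|X| + |Y|): 0 for a perfect match, 1 for disjoint sets.
    X, Y = set(labels), set(preds)
    return 1.0 - 2.0 * len(X & Y) / (len(X) + len(Y))

def picture_loss(per_class_losses):
    # The loss of one training picture is the mean of its per-category losses.
    return sum(per_class_losses) / len(per_class_losses)
```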
In the embodiment of the invention, the ResNet backbone network and the slim module of TensorFlow can be used to fine-tune ResNet, so that the layers of the network structure can be conveniently loaded for adjustment and modification, and classification can be performed well on the basis of the ResNet structure.
S104: and inputting the target table picture to be processed into the table data processing model to obtain line information corresponding to the target table picture.
In the embodiment of the invention, the table data processing device can input the target table picture to be processed into the table data processing model to obtain the line information corresponding to the target table picture.
S105: and detecting and identifying texts in the target table corresponding to the target table picture for determining the line information through a preset deep neural network model, and obtaining text contents in the target table.
In the embodiment of the invention, the table data processing device can detect and identify the text in the target table corresponding to the target table picture for determining the line information through the preset deep neural network model, and the text content in the target table is obtained.
In one embodiment, before detecting and identifying, through the preset deep neural network model, the text in the target table corresponding to the target table picture whose line information has been determined to obtain the text content in the target table, the table data processing device may determine, according to the line information of the target table picture, the category and position information of the lines in the target table corresponding to the target table picture; and determine each cell in the target table according to the category and position information of those lines.
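Once the categories and positions of the horizontal and vertical lines (hidden or visible) are known, the cells follow from adjacent line pairs. A simplified sketch for an axis-aligned grid (real tables with merged cells need extra bookkeeping; the function name is illustrative):

```python
def cells_from_grid(horizontal_ys, vertical_xs):
    # Each cell is bounded by two adjacent horizontal lines and two adjacent
    # vertical lines; return (x1, y1, x2, y2) boxes in row-major order.
    cells = []
    for r in range(len(horizontal_ys) - 1):
        for c in range(len(vertical_xs) - 1):
            cells.append((vertical_xs[c], horizontal_ys[r],
                          vertical_xs[c + 1], horizontal_ys[r + 1]))
    return cells
```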
In one embodiment, the preset deep neural network model includes a text detection model and a text recognition model; when detecting and identifying, through the preset deep neural network model, the text in the target table corresponding to the target table picture whose line information has been determined to obtain the text content in the target table, the table data processing device may input the line information corresponding to the target table picture into the text detection model to obtain the position information of the text in each cell in the target table corresponding to the target table picture; and input the line information corresponding to the target table picture and the position information of the text in each cell in the target table into the text recognition model to obtain the text content in each cell in the target table.
In certain embodiments, the text detection model includes, but is not limited to, a DBNet (Differentiable Binarization Network) model, and the text recognition model includes, but is not limited to, a CRNN (Convolutional Recurrent Neural Network) model. The DBNet model is a segmentation-based text detection model; in this task it finds the position of the text within each cell. The CRNN model is a convolutional recurrent neural network structure for image-based sequence recognition: it recognizes text sequences of indefinite length without first cutting out single characters, converting text recognition into a time-series-dependent sequence learning problem, that is, image-based line-text sequence recognition. In this task, text recognition is realized by finding the position of the text in the current cell and then inputting it into the CRNN model.
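The hand-off between the two models (DBNet locating text within each cell, CRNN then recognizing it) relies on mapping detected text boxes back to cells. A sketch of that mapping step, assigning each text box to the cell containing its center (a simplifying assumption for illustration; the real models operate on image crops):

```python
def box_center(box):
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def assign_text_to_cells(cell_boxes, text_boxes):
    # Map cell index -> list of text-box indices whose centers fall inside it.
    mapping = {i: [] for i in range(len(cell_boxes))}
    for t, tbox in enumerate(text_boxes):
        cx, cy = box_center(tbox)
        for i, (x1, y1, x2, y2) in enumerate(cell_boxes):
            if x1 <= cx <= x2 and y1 <= cy <= y2:
                mapping[i].append(t)
                break
    return mapping
```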
In one embodiment, after detecting and identifying, through the preset deep neural network model, the text in the target table corresponding to the target table picture whose line information has been determined to obtain the text content in the target table, the table data processing device may input the position information and text content of the text in each cell in the target table into a PtrNet model to obtain a tree structure among the cells in the target table, wherein the tree structure comprises a plurality of nodes, each cell is a node, and hierarchical relationships between cells are represented as parent-child nodes; and display the target table according to the tree structure among the cells in the target table. In some embodiments, the PtrNet model is a pointer network (Pointer Networks) model. In some embodiments, the parent node of a cell may be empty, or there may be one or more; likewise, the child nodes of a cell may be empty, or there may be one or more.
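In the patent the parent-child pointers come from the PtrNet model; the sketch below is only a geometric heuristic stand-in (an assumption, not the learned model) showing what the resulting tree looks like: each cell's parent is the nearest cell above it whose horizontal extent covers the cell's center, and roots have no parent.

```python
def build_cell_tree(cells):
    # cells: list of (x1, y1, x2, y2) boxes. Returns parents[i] = index of the
    # parent cell of cell i, or None for a root node.
    parents = []
    for i, (x1, y1, x2, y2) in enumerate(cells):
        cx = (x1 + x2) / 2
        best, best_bottom = None, float("-inf")
        for j, (px1, py1, px2, py2) in enumerate(cells):
            # Candidate parents sit fully above cell i and span its center.
            if j != i and py2 <= y1 and px1 <= cx <= px2 and py2 > best_bottom:
                best, best_bottom = j, py2
        parents.append(best)
    return parents

# A header cell spanning two child cells beneath it:
tree = build_cell_tree([(0, 0, 100, 10), (0, 10, 50, 20), (50, 10, 100, 20)])
# tree == [None, 0, 0]: the header is the root, both lower cells are its children.
```

Rendering the table then amounts to walking this parent-child structure, which also captures merged header cells naturally.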
In the embodiment of the invention, the table data processing device acquires a sample training set comprising one or more sample table training pictures, where the one or more sample table training pictures include pictures of non-standard tables; adds line labels to each sample table training picture in the sample training set; inputs each sample table training picture with line labels added into a preset deep neural network model for training to obtain a table data processing model; and inputs a target table picture to be processed into the table data processing model to obtain line information corresponding to the target table picture, thereby realizing classification of non-standard table data. The text in the target table corresponding to the target table picture whose line information has been determined is then detected and identified through a preset deep neural network model to obtain the text content in the target table, which realizes extraction of the content information of non-standard table data and of the relationships between table cells, as well as reproduction of the table and its content, thereby improving the efficiency and accuracy of non-standard table data extraction.
The embodiment of the invention also provides a table data processing device, which comprises units for executing the foregoing method. Specifically, referring to fig. 2, fig. 2 is a schematic block diagram of a table data processing apparatus according to an embodiment of the present invention. The table data processing apparatus of the present embodiment includes: an acquisition unit 201, an adding unit 202, a training unit 203, a processing unit 204, and an identification unit 205.
An obtaining unit 201, configured to obtain a sample training set, where the sample training set includes one or more sample table training pictures, and the one or more sample table training pictures include pictures of a non-standard table, where the non-standard table refers to a table with incomplete table lines;
An adding unit 202, configured to add a line tag to each sample table training picture in the sample training set, where the line tag is used to indicate line information of a sample table corresponding to each sample table training picture;
the training unit 203 is configured to input each sample table training picture added with a line label in the sample training set into a preset deep neural network model for training, so as to obtain a table data processing model;
the processing unit 204 is configured to input a target table picture to be processed into the table data processing model, so as to obtain line information corresponding to the target table picture;
And the identifying unit 205 is configured to detect and identify, through a preset deep neural network model, the text in the target table corresponding to the target table picture whose line information has been determined, so as to obtain the text content in the target table.
Further, when the adding unit 202 adds a line label to each sample table training picture in the sample training set, the adding unit is specifically configured to:
Determining line information of sample tables in each sample table training picture in the sample training set, wherein the line information comprises one or more types of lines and position coordinates of each line, and the types of the lines comprise a first hidden line parallel to a horizontal plane, a second hidden line perpendicular to the horizontal plane, a first display line parallel to the horizontal plane and a second display line perpendicular to the horizontal plane;
And adding corresponding line labels to the sample table training pictures according to the types of the lines of the sample table in the sample table training pictures and the position information of each line.
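As a concrete illustration of the labeling step above, a line label could be structured as follows. The patent specifies the four line categories (hidden vs. displayed, horizontal vs. vertical) and position coordinates, but not a serialization format, so this layout and all names in it are assumptions.

```python
# Hypothetical line-label structure: four categories plus endpoint
# coordinates per line, one list of labels per sample table training picture.

H_HIDDEN, V_HIDDEN, H_SHOWN, V_SHOWN = range(4)  # the four line categories

def make_line_label(category, x1, y1, x2, y2):
    """One label: a line category and the line's position coordinates."""
    assert category in (H_HIDDEN, V_HIDDEN, H_SHOWN, V_SHOWN)
    return {"category": category, "coords": (x1, y1, x2, y2)}

labels = [
    make_line_label(H_SHOWN, 0, 0, 400, 0),     # visible top border
    make_line_label(H_HIDDEN, 0, 40, 400, 40),  # implied (hidden) row separator
    make_line_label(V_SHOWN, 0, 0, 0, 120),     # visible left border
]
```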
Further, the training unit 203 inputs each sample table training picture added with the line tag in the sample training set into a preset deep neural network model for training, and is specifically configured to:
Extracting form feature vectors from each sample form training picture added with the line labels, and inputting the form feature vectors into the preset deep neural network model to obtain loss function values;
When the loss function value does not meet a preset condition, adjusting model parameters of the preset deep neural network model according to the loss function value, and inputting each sample table training picture added with the line labels into the deep neural network model with the adjusted model parameters for retraining;
and when the loss function value obtained through retraining meets a preset condition, determining to obtain the table data processing model.
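The train-check-retrain loop described above can be sketched in a few lines. This is a toy sketch under assumed names: the feature extractor, model, and "preset condition" (taken here as loss below a threshold) are stand-ins, since the patent names the backbone but no hyperparameters.

```python
# Hypothetical training loop: compute the loss; if the preset condition is
# not met, adjust the model parameters and retrain on the same labeled
# pictures; when it is met, the result is the table data processing model.

def train_until_converged(feature_vectors, loss_fn, step_fn, params,
                          threshold=0.01, max_iters=1000):
    for _ in range(max_iters):
        loss = loss_fn(params, feature_vectors)
        if loss <= threshold:       # assumed form of the preset condition
            return params, loss
        params = step_fn(params, feature_vectors)  # adjust model parameters
    return params, loss

# Toy stand-in problem: minimize (w - 3)^2 by gradient descent.
loss_fn = lambda w, xs: (w - 3.0) ** 2
step_fn = lambda w, xs: w - 0.1 * 2 * (w - 3.0)
params, loss = train_until_converged([], loss_fn, step_fn, params=0.0)
```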
Further, when the training unit 203 inputs the table feature vector into the preset deep neural network model to obtain the loss function value, the training unit is specifically configured to:
Inputting the table feature vector into the preset deep neural network model, and obtaining a loss function value of each type of line in the sample table corresponding to each sample table training picture through a ResNet backbone network module and the slim module of tensorflow in the preset deep neural network model;
and determining the average value of the loss function values of the lines of each category as the loss function value of each sample table training picture according to the loss function value of the lines of each category of each sample table training picture.
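The per-picture loss described above is simply the mean over the four line categories. A minimal sketch, assuming the per-category losses arrive as a mapping from category name to value:

```python
# Average the per-category line losses into one loss value per sample
# table training picture, as described in the passage above.

def picture_loss(category_losses):
    return sum(category_losses.values()) / len(category_losses)

loss = picture_loss({
    "h_hidden": 0.40, "v_hidden": 0.20,
    "h_shown": 0.10, "v_shown": 0.30,
})
# mean of the four category losses
```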
Further, before the identifying unit 205 detects and identifies, through the preset deep neural network model, the text in the target table corresponding to the target table picture whose line information has been determined, and obtains the text content in the target table, the identifying unit is further configured to:
determining the category and position information of lines in a target table corresponding to the target table picture according to the line information of the target table picture;
And determining each cell in the target table according to the category and the position information of the line in the target table corresponding to the target table picture.
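The cell-determination step above can be sketched as follows. This is an assumed simplification: hidden and displayed lines are treated alike, horizontal lines give row boundaries, vertical lines give column boundaries, and adjacent boundary pairs define cells; the input and output shapes are hypothetical.

```python
# Hypothetical derivation of cells from line information: intersect the
# sorted horizontal and vertical line positions into a grid of cell boxes.

def cells_from_lines(lines):
    """lines: iterable of (orientation, coord), where orientation is
    'h' (the y of a horizontal line) or 'v' (the x of a vertical line)."""
    ys = sorted({c for o, c in lines if o == "h"})
    xs = sorted({c for o, c in lines if o == "v"})
    return [
        {"row": r, "col": c,
         "box": (xs[c], ys[r], xs[c + 1], ys[r + 1])}
        for r in range(len(ys) - 1)
        for c in range(len(xs) - 1)
    ]

lines = [("h", 0), ("h", 40), ("h", 80), ("v", 0), ("v", 200), ("v", 400)]
cells = cells_from_lines(lines)  # 2 rows x 2 columns = 4 cells
```

Merged cells would need extra handling (suppressing boundaries where no line of either category exists), which this sketch omits.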
Further, the preset deep neural network model comprises a text detection model and a text recognition model; when the identifying unit 205 detects and identifies, through the preset deep neural network model, the text in the target table corresponding to the target table picture whose line information has been determined, and obtains the text content in the target table, the identifying unit is specifically configured to:
inputting line information corresponding to the target table picture into the text detection model to obtain position information of texts in each cell in a target table corresponding to the target table picture;
and inputting line information corresponding to the target table picture and position information of the text in each cell in the target table into the text recognition model to obtain text content in each cell in the target table.
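The two-stage flow above (detection, then recognition) can be sketched as a small pipeline. Both stages are stubbed with placeholder callables, since the patent does not name concrete architectures for either model; every identifier here is an assumption.

```python
# Hypothetical two-stage pipeline: the detection stage returns a text box
# per cell; the recognition stage reads the text content of each box.

def run_pipeline(table_lines, detect, recognize):
    """detect(table_lines) -> {cell_id: box};
    recognize(table_lines, box) -> text content for that box."""
    boxes = detect(table_lines)
    return {cell_id: recognize(table_lines, box)
            for cell_id, box in boxes.items()}

# Stub stages standing in for the text detection and recognition models.
fake_ocr = {(0, 0, 50, 20): "Name", (0, 20, 50, 40): "Alice"}
detect = lambda lines: {"A1": (0, 0, 50, 20), "A2": (0, 20, 50, 40)}
recognize = lambda lines, box: fake_ocr[box]

contents = run_pipeline([], detect, recognize)
```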
Further, after the identifying unit 205 detects and identifies, through the preset deep neural network model, the text in the target table corresponding to the target table picture whose line information has been determined, and obtains the text content in the target table, the identifying unit is further configured to:
Inputting the position information and text content of the text in each cell of the target table into the PtrNet model to obtain a tree structure among the cells of the target table, wherein the tree structure comprises a plurality of nodes, each cell is a node, and the hierarchical relationships among the cells are represented as parent-child node relationships;
And displaying the target table according to the tree structure among all the cells in the target table.
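The display step can be sketched by walking the cell tree and emitting an indented rendering, parents above children. The node shape (`{"text", "children"}`) is an assumption carried over for illustration, not a format defined by the patent.

```python
# Hypothetical rendering of the cell tree: depth-first walk, one line per
# node, indented two spaces per tree level.

def render(nodes, depth=0):
    lines = []
    for node in nodes:
        lines.append("  " * depth + node["text"])
        lines.extend(render(node["children"], depth + 1))
    return lines

tree = [{"text": "Revenue", "children": [
            {"text": "Q1", "children": []},
            {"text": "Q2", "children": []},
        ]}]
printed = "\n".join(render(tree))
```

A production device would render back into an actual table grid; the indented text form here only makes the parent-child structure visible.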
In the embodiment of the invention, the table data processing device acquires a sample training set comprising one or more sample table training pictures, adds line labels to each sample table training picture in the sample training set, inputs each sample table training picture with the added line labels into a preset deep neural network model for training to obtain a table data processing model, and inputs a target table picture to be processed into the table data processing model to obtain line information corresponding to the target table picture, thereby classifying the lines of the non-standard table data; the text in the target table corresponding to the target table picture whose line information has been determined is then detected and identified through the preset deep neural network model to obtain the text content in the target table. This achieves extraction of the content information and cell relationships of non-standard table data, as well as reproduction of the table and its content, thereby improving the efficiency and accuracy of non-standard table data extraction.
Referring to fig. 3, fig. 3 is a schematic block diagram of a computer device according to an embodiment of the present invention, and in some embodiments, the computer device according to the present embodiment shown in fig. 3 may include: one or more processors 301; one or more input devices 302, one or more output devices 303, and a memory 304. The processor 301, the input device 302, the output device 303, and the memory 304 are connected via a bus 305. The memory 304 is used for storing a computer program comprising a program, and the processor 301 is used for executing the program stored in the memory 304. Wherein the processor 301 is configured to invoke the program execution:
acquiring a sample training set, wherein the sample training set comprises one or more sample table training pictures, the one or more sample table training pictures comprise pictures of non-standard tables, and the non-standard tables refer to tables with incomplete table lines;
adding line labels to each sample table training picture in the sample training set, wherein the line labels are used for indicating line information of sample tables corresponding to each sample table training picture;
inputting each sample form training picture added with line labels in the sample training set into a preset deep neural network model for training to obtain a form data processing model;
inputting a target table picture to be processed into the table data processing model to obtain line information corresponding to the target table picture;
And detecting and identifying, through a preset deep neural network model, the text in the target table corresponding to the target table picture whose line information has been determined, to obtain the text content in the target table.
Further, when the processor 301 adds a line label to each sample table training picture in the sample training set, the method is specifically used for:
Determining line information of sample tables in each sample table training picture in the sample training set, wherein the line information comprises one or more types of lines and position coordinates of each line, and the types of the lines comprise a first hidden line parallel to a horizontal plane, a second hidden line perpendicular to the horizontal plane, a first display line parallel to the horizontal plane and a second display line perpendicular to the horizontal plane;
And adding corresponding line labels to the sample table training pictures according to the types of the lines of the sample table in the sample table training pictures and the position information of each line.
Further, the processor 301 inputs each sample table training picture added with the line tag in the sample training set into a preset deep neural network model for training, and is specifically configured to:
Extracting form feature vectors from each sample form training picture added with the line labels, and inputting the form feature vectors into the preset deep neural network model to obtain loss function values;
When the loss function value does not meet a preset condition, adjusting model parameters of the preset deep neural network model according to the loss function value, and inputting each sample table training picture added with the line labels into the deep neural network model with the adjusted model parameters for retraining;
and when the loss function value obtained through retraining meets a preset condition, determining to obtain the table data processing model.
Further, when the processor 301 inputs the table feature vector into the preset deep neural network model to obtain a loss function value, the method is specifically used for:
Inputting the table feature vector into the preset deep neural network model, and obtaining a loss function value of each type of line in the sample table corresponding to each sample table training picture through a ResNet backbone network module and the slim module of tensorflow in the preset deep neural network model;
and determining the average value of the loss function values of the lines of each category as the loss function value of each sample table training picture according to the loss function value of the lines of each category of each sample table training picture.
Further, before the processor 301 detects and identifies, through the preset deep neural network model, the text in the target table corresponding to the target table picture whose line information has been determined, and obtains the text content in the target table, the processor is further configured to:
determining the category and position information of lines in a target table corresponding to the target table picture according to the line information of the target table picture;
And determining each cell in the target table according to the category and the position information of the line in the target table corresponding to the target table picture.
Further, the preset deep neural network model comprises a text detection model and a text recognition model; when the processor 301 detects and identifies, through the preset deep neural network model, the text in the target table corresponding to the target table picture whose line information has been determined, and obtains the text content in the target table, the processor is specifically configured to:
inputting line information corresponding to the target table picture into the text detection model to obtain position information of texts in each cell in a target table corresponding to the target table picture;
and inputting line information corresponding to the target table picture and position information of the text in each cell in the target table into the text recognition model to obtain text content in each cell in the target table.
Further, after the processor 301 detects and identifies, through the preset deep neural network model, the text in the target table corresponding to the target table picture whose line information has been determined, and obtains the text content in the target table, the processor is further configured to:
Inputting the position information and text content of the text in each cell of the target table into the PtrNet model to obtain a tree structure among the cells of the target table, wherein the tree structure comprises a plurality of nodes, each cell is a node, and the hierarchical relationships among the cells are represented as parent-child node relationships;
And displaying the target table according to the tree structure among all the cells in the target table.
In the embodiment of the invention, a computer device acquires a sample training set comprising one or more sample table training pictures, adds line labels to each of the one or more sample table training pictures, inputs each sample table training picture with the added line labels into a preset deep neural network model for training to obtain a table data processing model, and inputs a target table picture to be processed into the table data processing model to obtain line information corresponding to the target table picture, thereby classifying the lines of the non-standard table data; the text in the target table corresponding to the target table picture whose line information has been determined is then detected and identified through the preset deep neural network model to obtain the text content in the target table. This achieves extraction of the content information and cell relationships of non-standard table data, as well as reproduction of the table and its content, thereby improving the efficiency and accuracy of non-standard table data extraction.
It should be appreciated that in embodiments of the present invention, the processor 301 may be a Central Processing Unit (CPU), and may also be another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
The input device 302 may include a touch pad, a microphone, etc., and the output device 303 may include a display (LCD, etc.), a speaker, etc.
The memory 304 may include read only memory and random access memory and provides instructions and data to the processor 301. A portion of memory 304 may also include non-volatile random access memory. For example, the memory 304 may also store information of device type.
In a specific implementation, the processor 301, the input device 302, and the output device 303 described in the embodiments of the present invention may execute the implementation described in the embodiment of the method described in fig. 1 provided in the embodiments of the present invention, and may also execute the implementation of the table data processing apparatus described in fig. 2 in the embodiments of the present invention, which is not described herein again.
The embodiment of the present invention further provides a computer readable storage medium, where the computer readable storage medium stores a computer program, where the computer program when executed by a processor implements a table data processing method described in the embodiment corresponding to fig. 1, and may also implement a table data processing apparatus in the embodiment corresponding to fig. 2 of the present invention, which is not described herein again.
The computer readable storage medium may be an internal storage unit of the table data processing apparatus according to any one of the foregoing embodiments, for example, a hard disk or a memory of the table data processing apparatus. The computer readable storage medium may also be an external storage device of the table data processing apparatus, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card provided on the table data processing apparatus. Further, the computer readable storage medium may also include both an internal storage unit and an external storage device of the table data processing apparatus. The computer readable storage medium is used for storing the computer program and other programs and data required by the table data processing apparatus. The computer-readable storage medium may also be used to temporarily store data that has been output or is to be output.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a computer-readable storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a terminal, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned computer-readable storage medium includes: a USB flash disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code. The computer readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created from the use of blockchain nodes, and the like.
It is emphasized that, to further guarantee the privacy and security of the data, the data may also be stored in a blockchain node. The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. The blockchain (Blockchain) is essentially a decentralized database: a chain of data blocks generated in association using cryptographic methods, each of which contains information from a batch of network transactions, used for verifying the validity (anti-counterfeiting) of its information and for generating the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.
While the invention has been described with reference to certain preferred embodiments, it will be understood by those skilled in the art that various changes and equivalent substitutions may be made without departing from the scope of the invention.

Claims (7)

1. A form data processing method, comprising:
acquiring a sample training set, wherein the sample training set comprises one or more sample table training pictures, the one or more sample table training pictures comprise pictures of non-standard tables, and the non-standard tables refer to tables with incomplete table lines;
adding line labels to each sample table training picture in the sample training set, wherein the line labels are used for indicating line information of sample tables corresponding to each sample table training picture;
inputting each sample form training picture added with line labels in the sample training set into a preset deep neural network model for training to obtain a form data processing model;
inputting a target table picture to be processed into the table data processing model to obtain line information corresponding to the target table picture;
Detecting and identifying, through a preset deep neural network model, text in the target table corresponding to the target table picture whose line information has been determined, to obtain text content in the target table;
the adding line labels to each sample table training picture in the sample training set comprises:
Determining line information of sample tables in each sample table training picture in the sample training set, wherein the line information comprises one or more types of lines and position coordinates of each line, and the types of the lines comprise a first hidden line parallel to a horizontal plane, a second hidden line perpendicular to the horizontal plane, a first display line parallel to the horizontal plane and a second display line perpendicular to the horizontal plane;
adding corresponding line labels to the sample table training pictures according to the types of the lines of the sample table in the sample table training pictures and the position information of each line;
Inputting each sample table training picture added with line labels in the sample training set into a preset deep neural network model for training to obtain a table data processing model, wherein the method comprises the following steps of:
Extracting form feature vectors from each sample form training picture added with the line labels, and inputting the form feature vectors into the preset deep neural network model to obtain loss function values;
When the loss function value does not meet a preset condition, adjusting model parameters of the preset deep neural network model according to the loss function value, and inputting each sample table training picture added with the line labels into the deep neural network model with the adjusted model parameters for retraining;
When the loss function value obtained through retraining meets a preset condition, determining to obtain the form data processing model;
Inputting the table feature vector into the preset deep neural network model to obtain a loss function value, including:
Inputting the table feature vector into the preset deep neural network model, and obtaining a loss function value of each type of line in the sample table corresponding to each sample table training picture through a ResNet backbone network module and the slim module of tensorflow in the preset deep neural network model;
and determining the average value of the loss function values of the lines of each category as the loss function value of each sample table training picture according to the loss function value of the lines of each category of each sample table training picture.
2. The method according to claim 1, wherein before the detecting and identifying, through the preset deep neural network model, the text in the target table corresponding to the target table picture whose line information has been determined, and obtaining the text content in the target table, the method further comprises:
determining the category and position information of lines in a target table corresponding to the target table picture according to the line information of the target table picture;
And determining each cell in the target table according to the category and the position information of the line in the target table corresponding to the target table picture.
3. The method of claim 2, wherein the preset deep neural network model comprises a text detection model and a text recognition model; and the detecting and identifying, through the preset deep neural network model, the text in the target table corresponding to the target table picture whose line information has been determined, to obtain the text content in the target table, comprises:
inputting line information corresponding to the target table picture into the text detection model to obtain position information of texts in each cell in a target table corresponding to the target table picture;
and inputting line information corresponding to the target table picture and position information of the text in each cell in the target table into the text recognition model to obtain text content in each cell in the target table.
4. The method according to claim 3, wherein the detecting and identifying, by the preset deep neural network model, the text in the target form corresponding to the target form picture for which the line information is determined, and after obtaining the text content in the target form, further includes:
Inputting the position information and text content of the text in each cell of the target table into the PtrNet model to obtain a tree structure among the cells of the target table, wherein the tree structure comprises a plurality of nodes, each cell is a node, and the hierarchical relationships among the cells are represented as parent-child node relationships;
And displaying the target table according to the tree structure among all the cells in the target table.
5. A form data processing apparatus, comprising:
the system comprises an acquisition unit, a processing unit and a processing unit, wherein the acquisition unit is used for acquiring a sample training set, the sample training set comprises one or more sample table training pictures, the one or more sample table training pictures comprise pictures of non-standard tables, and the non-standard tables refer to tables with incomplete table lines;
The adding unit is used for adding line labels to each sample table training picture in the sample training set, and the line labels are used for indicating line information of sample tables corresponding to each sample table training picture;
The training unit is used for inputting each sample form training picture added with the line label in the sample training set into a preset deep neural network model for training to obtain a form data processing model;
The processing unit is used for inputting a target table picture to be processed into the table data processing model to obtain line information corresponding to the target table picture;
The identification unit is used for detecting and identifying, through a preset deep neural network model, text in the target table corresponding to the target table picture whose line information has been determined, to obtain text content in the target table;
The adding unit is specifically configured to determine line information of a sample table in each sample table training picture in the sample training set when adding line labels to each sample table training picture in the sample training set, where the line information includes one or more types of lines and position coordinates of each line, and the types of the lines include a first hidden line parallel to a horizontal plane, a second hidden line perpendicular to the horizontal plane, a first display line parallel to the horizontal plane, and a second display line perpendicular to the horizontal plane; adding corresponding line labels to the sample table training pictures according to the types of the lines of the sample table in the sample table training pictures and the position information of each line;
The training unit, when inputting each sample table training picture added with the line labels in the sample training set into a preset deep neural network model for training to obtain a table data processing model, is specifically configured to: extract a table feature vector from each sample table training picture added with the line labels, and input the table feature vector into the preset deep neural network model to obtain a loss function value; when the loss function value does not meet a preset condition, adjust model parameters of the preset deep neural network model according to the loss function value, and input each sample table training picture added with the line labels into the deep neural network model with the adjusted model parameters for retraining; and when the loss function value obtained through retraining meets the preset condition, determine to obtain the table data processing model;
The training unit, when inputting the table feature vector into the preset deep neural network model to obtain a loss function value, is specifically configured to: input the table feature vector into the preset deep neural network model, and obtain the loss function value of each type of line in the sample table corresponding to each sample table training picture through a ResNet backbone network module and the slim module of tensorflow in the preset deep neural network model; and determine the average value of the loss function values of the lines of each category as the loss function value of each sample table training picture according to the loss function value of the lines of each category of each sample table training picture.
6. A computer device comprising a processor and a memory, wherein the memory is for storing a computer program, the processor being configured to invoke the computer program to perform the method of any of claims 1-4.
7. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program, which is executed by a processor to implement the method of any of claims 1-4.
CN202111168456.5A 2021-09-30 2021-09-30 Table data processing method, device, equipment and storage medium Active CN113887441B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111168456.5A CN113887441B (en) 2021-09-30 2021-09-30 Table data processing method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111168456.5A CN113887441B (en) 2021-09-30 2021-09-30 Table data processing method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113887441A CN113887441A (en) 2022-01-04
CN113887441B true CN113887441B (en) 2024-09-10

Family

ID=79005495

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111168456.5A Active CN113887441B (en) 2021-09-30 2021-09-30 Table data processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113887441B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109993112A (en) * 2019-03-29 2019-07-09 杭州睿琪软件有限公司 The recognition methods of table and device in a kind of picture
CN112949443A (en) * 2021-02-24 2021-06-11 平安科技(深圳)有限公司 Table structure identification method and device, electronic equipment and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110390269B (en) * 2019-06-26 2023-08-01 平安科技(深圳)有限公司 PDF document table extraction method, device, equipment and computer readable storage medium



Similar Documents

Publication Publication Date Title
CN109886928B (en) Target cell marking method, device, storage medium and terminal equipment
CN112699775B (en) Certificate identification method, device, equipment and storage medium based on deep learning
US9626555B2 (en) Content-based document image classification
US20170109610A1 (en) Building classification and extraction models based on electronic forms
EP3358476A1 (en) Method and apparatus for constructing decision model, computer device and storage device
CN112200081A (en) Abnormal behavior identification method, device, electronic device and storage medium
CN107944020A (en) Facial image lookup method and device, computer installation and storage medium
CN114005126B (en) Table reconstruction method, device, computer equipment and readable storage medium
CN114241499B (en) Table image recognition method, device, equipment and readable storage medium
CN108345641A (en) Method for crawling website data, storage medium and server
CN113705749B (en) Two-dimensional code recognition method, device, equipment and storage medium based on deep learning
CN106327546B (en) Method and device for testing face detection algorithm
CN112801099B (en) Image processing method, device, terminal equipment and medium
CN112883926B (en) Identification method and device for form medical images
CN113850260B (en) Key information extraction method and device, electronic equipment and readable storage medium
CN112434555B (en) Key value pair region identification method and device, storage medium and electronic equipment
CN112232336B (en) A certificate identification method, device, equipment and storage medium
CN113837151A (en) Table image processing method and device, computer equipment and readable storage medium
CN113963147A (en) A method and system for extracting key information based on semantic segmentation
CN113971726A (en) Character recognition method, system and storage medium based on industrial equipment label
WO2023231380A1 (en) Electrode plate defect recognition method and apparatus, and electrode plate defect recognition model training method and apparatus, and electronic device
CN113936286A (en) Image text recognition method and device, computer equipment and storage medium
CN113780116A (en) Invoice classification method, apparatus, computer equipment and storage medium
CN111414889B (en) Financial statement identification method and device based on character identification
CN114612919B (en) Bill information processing system, method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant