CN112949443B - Table structure identification method and device, electronic equipment and storage medium - Google Patents

Publication number
CN112949443B
Authority
CN
China
Prior art keywords
table structure
relation
text
text box
constructing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110206569.3A
Other languages
Chinese (zh)
Other versions
CN112949443A (en)
Inventor
王文浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN202110206569.3A (CN112949443B)
Priority to PCT/CN2021/096534 (WO2022178994A1)
Publication of CN112949443A
Application granted
Publication of CN112949443B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/413 Classification of content, e.g. text, photographs or tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/414 Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/28 Character recognition specially adapted to the type of the alphabet, e.g. Latin alphabet
    • G06V30/287 Character recognition specially adapted to the type of the alphabet, e.g. Latin alphabet of Kanji, Hiragana or Katakana characters
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to the technical field of data analysis and discloses a table structure identification method comprising the following steps: acquiring a training data set and constructing labels for it; training an original table structure recognition model with the training data set and the labels to obtain a standard table structure recognition model; acquiring a table page to be identified and constructing its document node features and table line features; performing table detection and recognition on the document node features and the table line features with the standard table structure recognition model to obtain a predicted table structure relation; and restoring the table page to be identified according to the predicted table structure relation to obtain the table structure. The invention also relates to blockchain technology: the table page to be identified can be stored in a node of a blockchain. The invention further provides a table structure identification device, electronic equipment, and a computer-readable storage medium. The invention addresses the heavy dependence of existing methods on image quality and the resulting poor table recognition.

Description

Table structure identification method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of data analysis technologies, and in particular, to a method and apparatus for identifying a table structure, an electronic device, and a computer readable storage medium.
Background
With the advent of the big data era, obtaining key, valuable information from massive data is becoming increasingly important. For example, extracting information from patient bills, laboratory sheets, physical examination reports, and similar documents issued by hospitals and physical examination institutions can improve the efficiency of subsequent diagnosis by doctors. A table clearly shows the logical and quantitative relationships in the original document data, and much information is presented in tabular form, so the table structure must be restored before information can be extracted from the table.
Conventional table structure identification relies on image processing, using detection or segmentation on the image to identify and restore the table structure. Such methods depend heavily on image quality: when the image quality is low, the background is complex, or the table has pronounced colored shading, detection and recognition of the table structure are poor, and the methods generalize badly.
Disclosure of Invention
The invention provides a table structure identification method and device and a computer-readable storage medium, and mainly aims to solve the problems of heavy dependence on image quality and poor table recognition.
In order to achieve the above object, the present invention provides a method for identifying a table structure, including:
acquiring a training data set and constructing a label of the training data set;
training the pre-constructed original table structure recognition model by utilizing the training data set and the label to obtain a standard table structure recognition model;
acquiring a form page to be identified, and constructing document node characteristics and form line characteristics of the form page to be identified;
performing table detection and recognition on the document node characteristics and the table line characteristics by using the standard table structure recognition model to obtain a predicted table structure relation;
and restoring the table page to be identified according to the predicted table structure relation to obtain a table structure.
Optionally, the acquiring a training data set includes:
crawling a plurality of PDF documents from a webpage, and analyzing and screening the PDF documents to obtain a plurality of form pages;
converting each form page into a page picture, and performing text detection and recognition on the page picture to obtain a recognition result;
and deleting the page pictures which do not accord with the preset rule in the page pictures according to the identification result to obtain a training data set.
Optionally, the constructing the label of the training data set includes:
performing text box detection and recognition on the training data set to obtain a plurality of text boxes;
taking each text box in the plurality of text boxes as a node, and judging the adjacency relation between any two nodes according to preset relation conditions to obtain a table structure relation;
and constructing an adjacency matrix according to the table structure relation to obtain the label.
Optionally, training the pre-constructed original table structure recognition model by using the training data set and the label to obtain a standard table structure recognition model, including:
preprocessing the training data set to obtain training characteristics;
performing table recognition on the training features through the original table structure recognition model to obtain a relation prediction matrix;
calculating a loss value of the relation prediction matrix according to the label and a preset loss function;
and adjusting the parameters of the original table structure recognition model according to the loss value, then returning to the step of performing table recognition on the training features through the original table structure recognition model to obtain a relation prediction matrix, until the loss value no longer decreases, to obtain the standard table structure recognition model.
Optionally, the constructing the document node feature and the form line feature of the form page to be identified includes:
performing text box detection and recognition on the form page to be recognized to obtain a text box, wherein the text box comprises a plurality of text strips and corresponding text box coordinates;
constructing the position characteristics of the text box according to the text box coordinates of the text box;
constructing text features of the text box according to the text strips of the text box;
constructing line type characteristics of the text box according to a preset line rule;
collecting the position characteristics of the text box, the text characteristics of the text box and the line type characteristics of the text box to obtain document node characteristics;
performing table line detection on the to-be-identified table page to obtain table grid lines;
constructing the position characteristics of the table grid lines according to the endpoint coordinates of the table grid lines;
constructing text features of the table grid lines according to preset text conditions;
constructing line type characteristics of the table lines according to the types of the table lines;
and collecting the position features of the table lines, the text features of the table lines and the type features of the table lines to obtain the table line features.
Optionally, the performing table detection and recognition on the document node feature and the table line feature by using the standard table structure recognition model to obtain a predicted table structure relationship includes:
integrating the document node characteristics and the form line characteristics to obtain input characteristics;
extracting features from the input features with the translation layer of the standard table structure recognition model to obtain node features;
performing a bilinear transformation on the node features with the transformation layer of the standard table structure recognition model to obtain edge features;
and performing relation prediction on the edge features with the fully connected layer of the standard table structure recognition model to obtain the predicted table structure relation, wherein the predicted table structure relation comprises a table relation, a row relation, and a column relation.
Optionally, the restoring of the table page to be identified according to the predicted table structure relation to obtain a table structure includes:
performing text box detection and recognition on the table page to be identified to obtain a plurality of text boxes;
taking each text box as a node and constructing an undirected graph according to the table relation in the predicted table structure relation to obtain a table relation graph;
dividing the nodes into a plurality of table sets by solving the connected components of the table relation graph;
constructing a row relation graph and a column relation graph for each table set according to the row relation and the column relation in the predicted table structure relation;
solving the row maximal cliques in the row relation graph with a maximal clique algorithm, and sorting the row maximal cliques by ordinate from large to small to obtain a row set;
solving the column maximal cliques in the column relation graph with a maximal clique algorithm, and sorting the column maximal cliques by abscissa from small to large to obtain a column set;
and integrating the row set and the column set to obtain the table structure.
In order to solve the above problems, the present invention also provides a table structure identifying apparatus, the apparatus comprising:
the data acquisition module is used for acquiring a training data set and constructing a label of the training data set;
the model training module is used for training the pre-constructed original table structure recognition model by utilizing the training data set and the label to obtain a standard table structure recognition model;
the characteristic construction module is used for acquiring a form page to be identified and constructing document node characteristics and form line characteristics of the form page to be identified;
The table detection module is used for carrying out table detection and identification on the document node characteristics and the table line characteristics by using the standard table structure identification model to obtain a predicted table structure relation;
and the table restoration module is used for restoring the table page to be identified according to the predicted table structure relation to obtain a table structure.
In order to solve the above-mentioned problems, the present invention also provides an electronic apparatus including:
a memory storing at least one instruction; and
a processor that executes the instructions stored in the memory to implement the table structure identification method described above.
In order to solve the above-mentioned problems, the present invention also provides a computer-readable storage medium having stored therein at least one instruction that is executed by a processor in an electronic device to implement the above-mentioned table structure identification method.
The invention trains a pre-constructed original table structure recognition model with the training data set and the labels. Because the training data set is a general-domain table data set containing a large number of table structures, the accuracy of the table recognition result can be improved. Moreover, the input of the table structure recognition model is the document node features and the table line features, and its output is the predicted table structure relation, from which the table structure is restored. Because the table is restored by a non-image-processing method, dependence on image quality is effectively avoided, recognition errors caused by low image quality, complex backgrounds, or pronounced table shading are reduced, and recognition accuracy is improved. The table structure identification method, device, electronic equipment, and computer-readable storage medium of the invention can therefore solve the problems of heavy dependence on image quality and poor table recognition.
Drawings
FIG. 1 is a flowchart illustrating a method for identifying a table structure according to an embodiment of the present invention;
FIG. 2 is a functional block diagram of a table structure recognition device according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of an electronic device for implementing the method for identifying a table structure according to an embodiment of the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
An embodiment of the present application provides a table structure identification method. The execution subject of the method includes, but is not limited to, at least one of a server, a terminal, or another electronic device that can be configured to execute the method provided in the embodiments of the present application. In other words, the method may be performed by software or hardware installed in a terminal device or a server device, and the software may be a blockchain platform. The server includes, but is not limited to, a single server, a server cluster, a cloud server, or a cloud server cluster.
Referring to fig. 1, a flow chart of a table structure recognition method according to an embodiment of the invention is shown. In this embodiment, the table structure identifying method includes:
S1, acquiring a training data set and constructing a label of the training data set.
In the embodiment of the invention, the training data set is a picture set containing a table structure, such as a picture converted from a PDF document containing table contents.
In detail, the acquiring the training data set includes:
crawling a plurality of PDF documents from a webpage, and analyzing and screening the PDF documents to obtain a plurality of form pages;
converting each form page into a page picture, and performing text detection and recognition on the page picture to obtain a recognition result;
and deleting the page pictures which do not accord with the preset rule in the page pictures according to the identification result to obtain a training data set.
For example, a large number of general-domain PDF documents are crawled from the internet and parsed with the pdfplumber library to obtain PDF parsing results. Pages containing tables are screened from the PDF documents according to the parsing results to obtain a plurality of table pages; each table page is converted into a picture with the poppler tool, and OCR text box detection and recognition is performed on the picture to obtain an OCR result. The PDF parsing result and the OCR result are then cross-checked against predefined rules (for example, the number of tables in the PDF parsing result of a page must equal the number of tables in its OCR result), and pictures that fail the rules are deleted from the set of table pictures, yielding the training data set.
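The cross-check between the PDF parsing result and the OCR result can be sketched as a small filter. This is only an illustration of the single rule the text names (equal table counts per page); the function name `filter_pages`, the page identifiers, and the dictionary layout are assumptions, and the table counts are taken as already computed by the PDF parser and the OCR step.

```python
def filter_pages(pdf_table_counts, ocr_table_counts):
    """Keep page ids whose PDF-parsed and OCR-derived table counts agree.

    Both arguments map a (hypothetical) page id to the number of tables
    found on that page by the respective analysis.
    """
    kept = []
    for page_id, n_pdf in pdf_table_counts.items():
        n_ocr = ocr_table_counts.get(page_id)
        # Rule from the text: the table counts of the same page must match.
        # Pages without any table are also dropped.
        if n_pdf > 0 and n_ocr == n_pdf:
            kept.append(page_id)
    return kept


pdf_counts = {"doc1_p3": 1, "doc1_p7": 2, "doc2_p1": 1}
ocr_counts = {"doc1_p3": 1, "doc1_p7": 1, "doc2_p1": 1}
training_pages = filter_pages(pdf_counts, ocr_counts)
```

Here `doc1_p7` is discarded because the two analyses disagree on its table count. A real pipeline would presumably apply several such rules.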
In detail, the constructing the tag of the training data set includes:
performing text box detection and recognition on the training data set to obtain a plurality of text boxes;
taking each text box in the plurality of text boxes as a node, and judging the adjacency relation between any two nodes according to preset relation conditions to obtain a table structure relation;
and constructing an adjacency matrix according to the table structure relation to obtain the label.
Further, the embodiment of the invention uses OCR to detect and recognize the text boxes of the training data set. The preset relation conditions are: whether two nodes belong to the same table (table), whether two nodes belong to the same row (row), and whether two nodes belong to the same column (col). Accordingly, the adjacency relations comprise a table relation, a row relation, and a column relation, and the table structure relation comprises a plurality of adjacency relations.
To construct the adjacency matrices according to the table structure relation, the text box of each cell in a table of the training data set is treated as a node in a graph. If an adjacency relation exists between two nodes, the element at the corresponding position of the adjacency matrix is 1; otherwise it is 0. Three adjacency matrices, one per adjacency relation, are constructed in this way and serve as labels for subsequent model training.
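A minimal sketch of this label construction follows. It assumes each node already carries ground-truth `table`, `row`, and `col` identifiers (hypothetical field names, for instance derived from the PDF parsing result), and it assumes the diagonal is left at 0, which the text does not specify.

```python
def build_labels(nodes):
    """Build the table, row and column adjacency matrices for a list of nodes.

    Each node is a dict with hypothetical keys 'table', 'row' and 'col'.
    """
    n = len(nodes)
    labels = {key: [[0] * n for _ in range(n)] for key in ("table", "row", "col")}
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # assumption: no self-relation on the diagonal
            if nodes[i]["table"] != nodes[j]["table"]:
                continue  # nodes of different tables share no relation
            labels["table"][i][j] = 1
            if nodes[i]["row"] == nodes[j]["row"]:
                labels["row"][i][j] = 1
            if nodes[i]["col"] == nodes[j]["col"]:
                labels["col"][i][j] = 1
    return labels


# A 2 x 2 table: nodes 0 and 1 form the first row, nodes 2 and 3 the second.
cells = [
    {"table": 0, "row": 0, "col": 0},
    {"table": 0, "row": 0, "col": 1},
    {"table": 0, "row": 1, "col": 0},
    {"table": 0, "row": 1, "col": 1},
]
labels = build_labels(cells)
```

All three matrices are symmetric, since each relation is defined on unordered pairs of nodes.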
S2, training the pre-constructed original table structure recognition model by utilizing the training data set and the label to obtain a standard table structure recognition model.
The original table structure recognition model in the embodiment of the invention is a BERT model based on the Transformer architecture, which predicts the adjacency relations between the nodes of a page from the input features.
In detail, training the pre-constructed original table structure recognition model by using the training data set and the label to obtain a standard table structure recognition model, including:
preprocessing the training data set to obtain training characteristics;
performing table recognition on the training features through the original table structure recognition model to obtain a relation prediction matrix;
calculating a loss value of the relation prediction matrix according to the label and a preset loss function;
and adjusting the parameters of the original table structure recognition model according to the loss value, then returning to the step of performing table recognition on the training features through the original table structure recognition model to obtain a relation prediction matrix, until the loss value no longer decreases, to obtain the standard table structure recognition model.
The preset loss function may be a mean square error loss function or a cross-entropy loss function.
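The two candidate loss functions and the stopping criterion can be sketched as follows for a single relation prediction matrix. The sketch assumes the predictions are probabilities in [0, 1], that the loss is averaged over all matrix entries, and that "no longer decreases" is tested with a small tolerance `min_delta`; none of these details is stated in the text.

```python
import math


def mse_loss(pred, label):
    """Mean square error between a relation prediction matrix and its label."""
    n = len(pred)
    total = sum((pred[i][j] - label[i][j]) ** 2 for i in range(n) for j in range(n))
    return total / (n * n)


def bce_loss(pred, label, eps=1e-7):
    """Binary cross entropy, averaged over all entries of the matrix."""
    n = len(pred)
    total = 0.0
    for i in range(n):
        for j in range(n):
            p = min(max(pred[i][j], eps), 1 - eps)  # clamp to avoid log(0)
            y = label[i][j]
            total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / (n * n)


def loss_stopped_decreasing(history, min_delta=1e-4):
    """True when the last step improved the loss by less than min_delta."""
    return len(history) >= 2 and history[-2] - history[-1] < min_delta


label = [[0, 1], [1, 0]]
good = [[0.1, 0.9], [0.8, 0.2]]  # close to the label
bad = [[0.9, 0.1], [0.2, 0.8]]   # nearly inverted
```

In training, the loss of each iteration would be appended to `history`, and parameter updates would stop once `loss_stopped_decreasing(history)` returns true.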
Further, the preprocessing the training data set includes:
performing text box detection and recognition on the training data set to obtain a text detection result, and constructing document node features according to the text detection result;
performing table line detection on the training data set to obtain a line detection result, and constructing table line features according to the line detection result;
and merging the document node features and the table line features to obtain the training features.
S3, acquiring a form page to be identified, and constructing document node characteristics and form line characteristics of the form page to be identified.
The form page to be identified in the embodiment of the invention can be an actual bill picture in the medical field or a picture converted from a document containing a form in the medical field. The form page to be identified may be obtained from a pre-built database. To further ensure the privacy and security of the form page to be identified, the form page to be identified may also be obtained from a node of a blockchain.
In detail, the construction of the document node characteristics and the form line characteristics of the form page to be identified comprises the following steps:
performing text box detection and recognition on the form page to be recognized to obtain a text box, wherein the text box comprises a plurality of text strips and corresponding text box coordinates;
constructing the position characteristics of the text box according to the text box coordinates of the text box;
constructing text features of the text box according to the text strips of the text box;
constructing line type characteristics of the text box according to a preset line rule;
collecting the position characteristics of the text box, the text characteristics of the text box and the line type characteristics of the text box to obtain document node characteristics;
performing table line detection on the to-be-identified table page to obtain table grid lines;
constructing the position characteristics of the table grid lines according to the endpoint coordinates of the table grid lines;
constructing text features of the table grid lines according to preset text conditions;
constructing line type characteristics of the table lines according to the types of the table lines;
and collecting the position characteristics of the table grid lines, the text characteristics of the table lines and the line type characteristics of the table grid lines to obtain the table line characteristics.
For example, each text box detected in the table page to be identified is taken as a node. Assuming there are N nodes, the document node features are D_F(N) = (F(1), F(2), ..., F(N)), and the feature of each node is F(x) = [x1, y1, x2, y2, (x1+x2)/2, (y1+y2)/2, x2-x1, y2-y1, num, char, space, other, 0, 0]. Here [x1, y1, x2, y2, (x1+x2)/2, (y1+y2)/2, x2-x1, y2-y1] is the position feature: (x1, y1) is the upper-left corner of the text box, (x2, y2) is the lower-right corner, ((x1+x2)/2, (y1+y2)/2) is the center point, x2-x1 is the width, and y2-y1 is the height. [num, char, space, other] is the text feature: num, char, space, and other are the frequencies of digits, letters, spaces, and other characters in the text box, computed from the content of the text strip. The last two entries [0, 0] are the line type feature, set to 0 according to the preset line rule to indicate that the node is a text box, neither a horizontal line nor a vertical line.
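As an illustration, the 14-dimensional node feature F(x) could be assembled as below. The sketch assumes that "frequency" means the share of each character class in the text (the text could also mean raw counts), and the function name `text_box_feature` is hypothetical. Note that Python's `str.isalpha` also returns true for CJK characters, so a production version might classify characters differently.

```python
def text_box_feature(x1, y1, x2, y2, text):
    """Return [position (8), text (4), line type (2)] for one text box."""
    n = len(text)
    num = sum(c.isdigit() for c in text)
    char = sum(c.isalpha() for c in text)
    space = sum(c.isspace() for c in text)
    other = n - num - char - space
    # Assumption: frequencies are relative shares of the text length.
    text_feature = [k / n if n else 0.0 for k in (num, char, space, other)]
    position = [x1, y1, x2, y2, (x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1]
    line_type = [0, 0]  # a text box is neither a horizontal nor a vertical line
    return position + text_feature + line_type


feature = text_box_feature(10, 20, 110, 40, "Fee 12.5")
```

For the text "Fee 12.5" there are three digits, three letters, one space, and one other character out of eight, so the text feature is [0.375, 0.375, 0.125, 0.125].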
For example, assuming that a total of M horizontal and vertical lines are detected in the table page to be identified, the table line features are L_F(M) = (F'(1), F'(2), ..., F'(M)), and the feature of each line is F'(x) = [x1', y1', x2', y2', (x1'+x2')/2, (y1'+y2')/2, x2'-x1', y2'-y1', 0, 0, 0, 0, 0, 1]. Here [x1', y1', x2', y2', (x1'+x2')/2, (y1'+y2')/2, x2'-x1', y2'-y1'] is the position feature: (x1', y1') is the left endpoint of a horizontal line or the upper endpoint of a vertical line, (x2', y2') is the right endpoint of a horizontal line or the lower endpoint of a vertical line, ((x1'+x2')/2, (y1'+y2')/2) is the midpoint of the table line, and x2'-x1' and y2'-y1' are the lengths of a horizontal line and a vertical line, respectively. The middle four entries [0, 0, 0, 0] are the text feature, set to 0 according to the preset text condition to indicate a non-text node. The last two entries are the line type feature: [0, 1] indicates a horizontal line and [1, 0] indicates a vertical line.
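A matching sketch for the line feature F'(x) is shown below. It assumes that [0, 1] marks a horizontal line and [1, 0] a vertical line, as the text implies, and it infers the orientation from the endpoint coordinates, which is a simplification: the source takes the orientation from the line detection step. The name `table_line_feature` is hypothetical.

```python
def table_line_feature(x1, y1, x2, y2):
    """Return [position (8), text (4 zeros), line type (2)] for one table line."""
    position = [x1, y1, x2, y2, (x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1]
    text_feature = [0, 0, 0, 0]  # lines carry no text
    # Assumption: orientation is inferred from the longer extent of the line.
    is_horizontal = abs(x2 - x1) >= abs(y2 - y1)
    line_type = [0, 1] if is_horizontal else [1, 0]
    return position + text_feature + line_type


horizontal = table_line_feature(0, 50, 200, 50)
vertical = table_line_feature(30, 0, 30, 120)

# Text boxes and lines share one 14-dimensional layout, so both kinds of
# feature vector can be concatenated into a single input sequence.
input_features = [horizontal, vertical]
```

Because both feature types have the same length, the document node features and the table line features can be merged into one sequence of N + M vectors for the model.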
In the embodiment of the invention, the document node features carry the information of the text strips in the table page to be identified, that is, the content of the table; the table line features carry the information of the table lines, that is, the frame of the table. Together, the document node features and the table line features represent the table page to be identified.
S4, carrying out table detection and recognition on the document node characteristics and the table line characteristics by using the standard table structure recognition model to obtain a predicted table structure relation.
In detail, the S4 includes:
integrating the document node characteristics and the form line characteristics to obtain input characteristics;
extracting features from the input features with the translation layer of the standard table structure recognition model to obtain node features;
performing a bilinear transformation on the node features with the transformation layer of the standard table structure recognition model to obtain edge features;
and performing relation prediction on the edge features with the fully connected layer of the standard table structure recognition model to obtain the predicted table structure relation, wherein the predicted table structure relation comprises a table relation, a row relation, and a column relation.
Further, the table relationship, the row relationship and the column relationship in the predicted table structure relationship refer to the table relationship, the row relationship and the column relationship between any two document nodes in the table page to be identified.
Optionally, in the embodiment of the invention, the standard table structure recognition model comprises a translation layer, a transformation layer, and a fully connected layer. The translation layer obtains the node features of the table page to be identified by encoding and decoding the input features; the transformation layer obtains the feature between any two nodes, that is, the edge feature, through a bilinear transformation; and the fully connected layer computes the relation (adjacency relation) between any two nodes with preset parameters to obtain the predicted table structure relation.
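The last two layers can be illustrated with a toy, dependency-free sketch. It assumes the edge feature of a node pair (i, j) is a vector of K bilinear forms h_i^T W_k h_j and that the fully connected layer maps it to three sigmoid probabilities (table, row, column). The layer sizes, the sigmoid output, and all parameter values are assumptions for illustration, not the patented model.

```python
import math


def bilinear(h_i, W, h_j):
    """Return the bilinear form h_i^T W h_j."""
    return sum(h_i[a] * W[a][b] * h_j[b]
               for a in range(len(h_i)) for b in range(len(h_j)))


def edge_features(nodes, weights):
    """One bilinear form per weight matrix gives a K-dim feature per node pair."""
    n = len(nodes)
    return [[[bilinear(nodes[i], W, nodes[j]) for W in weights]
             for j in range(n)] for i in range(n)]


def predict_relations(edges, fc_weight, fc_bias):
    """Fully connected layer plus sigmoid: three probabilities per node pair."""
    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    return [[[sigmoid(sum(w * x for w, x in zip(w_row, e)) + b)
              for w_row, b in zip(fc_weight, fc_bias)]
             for e in row] for row in edges]


# Toy values: two nodes with 2-dim features, K = 2 bilinear weight matrices.
node_feats = [[1.0, 0.0], [0.0, 1.0]]
W_list = [[[1, 0], [0, 1]], [[0, 1], [1, 0]]]
edges = edge_features(node_feats, W_list)
probs = predict_relations(edges, fc_weight=[[1, 0], [0, 1], [1, 1]], fc_bias=[0, 0, 0])
```

Thresholding `probs[i][j]` (for example at 0.5) would give the predicted table, row, and column relations between nodes i and j.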
S5, restoring the table page to be identified according to the predicted table structure relation to obtain a table structure.
In detail, the restoring of the table page to be identified according to the predicted table structure relation to obtain a table structure includes:
performing text box detection and recognition on the table page to be identified to obtain a plurality of text boxes;
taking each text box as a node and constructing an undirected graph according to the table relation in the predicted table structure relation to obtain a table relation graph;
dividing the nodes into a plurality of table sets by solving the connected components of the table relation graph;
constructing a row relation graph and a column relation graph for each table set according to the row relation and the column relation in the predicted table structure relation;
solving the row maximal cliques in the row relation graph with a maximal clique algorithm, and sorting the row maximal cliques by ordinate from large to small to obtain a row set;
solving the column maximal cliques in the column relation graph with a maximal clique algorithm, and sorting the column maximal cliques by abscissa from small to large to obtain a column set;
and integrating the row set and the column set to obtain the table structure.
Further, a maximal connected subgraph of an undirected graph G is called a connected component of G. A connected graph has exactly one connected component, namely itself, whereas a disconnected undirected graph has several. Each table can be regarded as a connected graph, so the multiple tables in the table page to be identified can be separated by solving for connected components.
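Splitting nodes into tables by connected components can be sketched with a breadth-first search. The edge list stands for the node pairs whose predicted table relation is 1; the function name and the sample graph are illustrative.

```python
from collections import deque


def connected_components(n, edges):
    """Return the connected components of an undirected graph on nodes 0..n-1."""
    adj = {i: set() for i in range(n)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    seen = set()
    components = []
    for start in range(n):
        if start in seen:
            continue
        # Breadth-first search collects every node reachable from `start`.
        queue = deque([start])
        seen.add(start)
        component = []
        while queue:
            u = queue.popleft()
            component.append(u)
            for v in sorted(adj[u]):
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        components.append(sorted(component))
    return components


# Nodes 0-2 are linked by the table relation, as are nodes 3-4.
table_sets = connected_components(5, [(0, 1), (1, 2), (3, 4)])
```

Each returned component is one table set; here the page contains two tables.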
A clique is a complete subgraph of an undirected graph. A maximal clique is a locally largest clique: a clique is called a maximal clique of the graph if it is not contained in any other clique, i.e., it is not a proper subset of any other clique.
The maximal clique algorithm solves for all maximal cliques of an undirected graph, and specifically comprises: generating all subgraphs of the undirected graph; judging whether each subgraph is a clique, and deleting the subgraphs that are not cliques to obtain the cliques; judging whether each clique is a maximal clique, and deleting the cliques that are not maximal to obtain the maximal cliques.
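The clique enumeration described above can be sketched directly as a brute-force procedure: enumerate node subsets, keep those that are cliques, then keep those not strictly contained in another clique. The names `is_clique` and `maximal_cliques` are assumptions for this example. The approach is exponential in the number of nodes, so it is only suitable for small graphs; the Bron–Kerbosch algorithm is a commonly used alternative, which the source does not mention.

```python
from itertools import combinations


def is_clique(nodes, adj):
    """True if every pair of the given nodes is connected."""
    return all(adj[a][b] for a, b in combinations(nodes, 2))


def maximal_cliques(adj):
    """Enumerate all maximal cliques of a 0/1 adjacency matrix by brute force."""
    n = len(adj)
    # Step 1 and 2: generate all node subsets and keep the cliques.
    cliques = [
        frozenset(subset)
        for size in range(1, n + 1)
        for subset in combinations(range(n), size)
        if is_clique(subset, adj)
    ]
    # Step 3: drop every clique that is a proper subset of another clique.
    maximal = [c for c in cliques if not any(c < other for other in cliques)]
    return sorted(sorted(c) for c in maximal)


# Hypothetical row relation graph: boxes 0, 1, 2 share a row, box 3 is alone.
row_adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]
print(maximal_cliques(row_adj))  # [[0, 1, 2], [3]]
```

A node may appear in more than one maximal clique, which is what allows a cell spanning several rows or columns to be assigned to each of them.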
According to the embodiment of the invention, the row information and the column information of a table can be obtained by solving for the maximal cliques of the row relation graph and of the column relation graph, respectively.
The embodiment of the invention realizes the restoration of the table structure by a non-image processing method, and can greatly reduce the dependence on the image quality.
The invention trains the pre-constructed original table structure recognition model using the training data set and the labels. Because the training data set is a general-domain table data set containing a large number of table structures, the accuracy of the table recognition result can be improved. Meanwhile, the input of the table structure recognition model is the document node features and the table line features, and its output is the predicted table structure relation; the table structure is restored from this relation by a non-image-processing method, which reduces the dependence on image quality, reduces recognition errors caused by low image quality, complex backgrounds or pronounced table color shading, and improves recognition accuracy. Therefore, the table structure recognition method, device, electronic equipment and computer-readable storage medium provided by the invention can solve the problems of strong dependence on image quality and poor table recognition results.
Fig. 2 is a functional block diagram of a table structure recognition device according to an embodiment of the present invention.
The table structure recognition apparatus 100 of the present invention may be installed in an electronic device. Depending on the implemented functions, the table structure identifying apparatus 100 may include a data acquisition module 101, a model training module 102, a feature construction module 103, a table detection module 104, and a table restoration module 105. The module of the invention, which may also be referred to as a unit, refers to a series of computer program segments, which are stored in the memory of the electronic device, capable of being executed by the processor of the electronic device and of performing a fixed function.
In the present embodiment, the functions concerning the respective modules/units are as follows:
the data acquisition module 101 is configured to acquire a training data set, and construct a label of the training data set.
In the embodiment of the invention, the training data set is a picture set containing a table structure, such as a picture converted from a PDF document containing table contents.
In detail, when acquiring the training data set, the data acquisition module 101 specifically performs the following operations:
crawling a plurality of PDF documents from a webpage, and analyzing and screening the PDF documents to obtain a plurality of form pages;
converting each form page into a page picture, and performing text detection and recognition on the page picture to obtain a recognition result;
and deleting the page pictures which do not accord with the preset rule in the page pictures according to the identification result to obtain a training data set.
For example, a large number of general-domain PDF documents are crawled from the internet and parsed with the pdfplumber library to obtain PDF parsing results. Pages containing tables are screened from the PDF documents according to the parsing results to obtain a plurality of table pages; each table page is converted into a picture with the poppler tool, and OCR text box detection and recognition is performed on the picture to obtain an OCR parsing result. The PDF parsing result and the OCR parsing result are then compared and verified against predefined rules (for example, the number of tables in the PDF parsing result of a page must equal the number of tables in its OCR parsing result), and pictures that do not satisfy the rules are deleted from the pictures containing tables, yielding the training data set.
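The consistency rule in the example above can be sketched as a simple filter. The dictionary keys `pdf_table_count` and `ocr_table_count` and the function name `filter_table_pages` are hypothetical; they stand in for whatever the PDF parser and the OCR pipeline actually report, and only the table-count rule named in the text is implemented.

```python
def filter_table_pages(pages):
    """Keep pages whose PDF parse and OCR parse agree on the number of tables."""
    kept = []
    for page in pages:
        pdf_count = page["pdf_table_count"]
        ocr_count = page["ocr_table_count"]
        # Pages without tables are not table pages; mismatches are discarded.
        if pdf_count > 0 and pdf_count == ocr_count:
            kept.append(page)
    return kept


# Hypothetical parse summaries for three page pictures.
pages = [
    {"name": "a.png", "pdf_table_count": 2, "ocr_table_count": 2},
    {"name": "b.png", "pdf_table_count": 1, "ocr_table_count": 3},
    {"name": "c.png", "pdf_table_count": 0, "ocr_table_count": 0},
]
print([p["name"] for p in filter_table_pages(pages)])  # ['a.png']
```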
In detail, in constructing the tag of the training data set, the data acquisition module 101 specifically performs the following operations:
performing text box detection and recognition on the training data set to obtain a plurality of text boxes;
taking each text box in the plurality of text boxes as a node, and judging the adjacent relation between any two nodes according to a preset relation condition to obtain a table structure relation;
and constructing an adjacency matrix according to the table structure relationship to obtain the label.
Further, the embodiment of the invention utilizes the OCR technology to detect and identify the text box of the training data set, and the preset relation condition comprises: whether two nodes belong to the same table (table), whether two nodes belong to the same row (row), and whether two nodes belong to the same column (col); the adjacency relationship comprises a table relationship, a row relationship and a column relationship; the table structure relationship includes a plurality of adjacency relationships.
To construct the adjacency matrices according to the table structure relation, the text box of each cell in a table of the training data set is regarded as a node in a graph. If a given adjacency relation exists between two nodes, the element at the corresponding position in that relation's adjacency matrix is 1; otherwise it is 0. Three adjacency matrices, one each for the table relation, the row relation and the column relation, are constructed in this way and serve as labels for subsequent model training.
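A minimal sketch of this label construction follows, assuming each text box already carries a ground-truth table, row and column index. The keys `table`, `row` and `col` and the function name `build_labels` are hypothetical, and setting the diagonal to 0 is a choice made for this example that the source does not specify.

```python
def build_labels(cells):
    """Build the table, row and column adjacency matrices for a list of cells."""
    n = len(cells)

    def matrix(related):
        # 1 where two distinct nodes satisfy the relation, 0 elsewhere.
        return [
            [1 if i != j and related(cells[i], cells[j]) else 0 for j in range(n)]
            for i in range(n)
        ]

    def same_table(a, b):
        return a["table"] == b["table"]

    def same_row(a, b):
        return same_table(a, b) and a["row"] == b["row"]

    def same_col(a, b):
        return same_table(a, b) and a["col"] == b["col"]

    return matrix(same_table), matrix(same_row), matrix(same_col)


# Hypothetical page: a 2x2 table (nodes 0-3) plus one cell of a second table.
cells = [
    {"table": 0, "row": 0, "col": 0},
    {"table": 0, "row": 0, "col": 1},
    {"table": 0, "row": 1, "col": 0},
    {"table": 0, "row": 1, "col": 1},
    {"table": 1, "row": 0, "col": 0},
]
table_m, row_m, col_m = build_labels(cells)
print(table_m[0], row_m[0], col_m[0])
```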
The model training module 102 is configured to train the pre-constructed original table structure recognition model by using the training data set and the label, so as to obtain a standard table structure recognition model.
The original table structure recognition model in the embodiment of the invention is a BERT model based on the Transformer architecture, and can predict the adjacency relation of each node in the page according to the input features.
In detail, the model training module 102 is specifically configured to:
preprocessing the training data set to obtain training characteristics;
performing table recognition on the training features through the original table structure recognition model to obtain a relation prediction matrix;
calculating a loss value of the relation prediction matrix according to the label and a preset loss function;
and adjusting parameters of the original table structure identification model according to the loss value, and returning to the step of carrying out table identification on the training features through the original table structure identification model to obtain a relation prediction matrix until the loss value is not reduced any more to obtain a standard table structure identification model.
Wherein, the preset loss function can use a mean square error loss function or a cross entropy loss function.
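Both candidate loss functions can be sketched for a relation prediction matrix and its 0/1 label matrix as follows. The function names, the averaging over all matrix entries and the clipping constant `eps` are assumptions for this example, and the cross-entropy is taken in its binary form because each label entry is 0 or 1.

```python
import math


def _pairs(pred, label):
    """Flatten two equally shaped matrices into (prediction, target) pairs."""
    return [(p, y) for p_row, y_row in zip(pred, label) for p, y in zip(p_row, y_row)]


def mse_loss(pred, label):
    """Mean squared error over all entries of the relation matrix."""
    pairs = _pairs(pred, label)
    return sum((p - y) ** 2 for p, y in pairs) / len(pairs)


def bce_loss(pred, label, eps=1e-7):
    """Binary cross-entropy; predictions are clipped to avoid log(0)."""
    pairs = _pairs(pred, label)
    total = 0.0
    for p, y in pairs:
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(pairs)


# Hypothetical 2-node example: label matrix and predicted probabilities.
label = [[0, 1], [1, 0]]
pred = [[0.1, 0.8], [0.7, 0.2]]
print(mse_loss(pred, label), bce_loss(pred, label))
```

In training, one such loss value would be computed for each of the three relation matrices and combined; how they are combined is not stated in the source.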
Further, the preprocessing the training data set includes:
text box detection and recognition are carried out on the training data set to obtain a text detection result, and document node characteristics are constructed according to the text detection result;
performing table line detection on the training data set to obtain a detection result, and constructing table line characteristics according to the detection result;
and merging the document node characteristics and the table line characteristics to obtain training characteristics.
The feature construction module 103 is configured to obtain a form page to be identified, and construct document node features and form line features of the form page to be identified.
The form page to be identified in the embodiment of the invention can be an actual bill picture in the medical field or a picture converted from a document containing a form in the medical field. The form page to be identified may be obtained from a pre-built database. To further ensure the privacy and security of the form page to be identified, the form page to be identified may also be obtained from a node of a blockchain.
In detail, when the document node feature and the form line feature of the form page to be identified are constructed, the feature construction module 103 specifically performs the following operations:
Performing text box detection and recognition on the form page to be recognized to obtain a text box, wherein the text box comprises a plurality of text strips and corresponding text box coordinates;
constructing the position characteristics of the text box according to the text box coordinates of the text box;
constructing text features of the text box according to the text bars of the text box;
constructing line type characteristics of the text box according to a preset line rule;
collecting the position characteristics of the text box, the text characteristics of the text box and the line type characteristics of the text box to obtain document node characteristics;
performing table line detection on the to-be-identified table page to obtain table grid lines;
constructing the position characteristics of the table grid lines according to the endpoint coordinates of the table grid lines;
constructing text features of the table grid lines according to preset text conditions;
constructing line type characteristics of the table lines according to the types of the table lines;
and collecting the position characteristics of the table grid lines, the text characteristics of the table lines and the line type characteristics of the table grid lines to obtain the table line characteristics.
For example: each text box detected in the form page to be identified is taken as a node. Assuming there are N nodes, the document node features are D_F(N) = (F(1), F(2), ..., F(N)), and the feature of each node is F(x) = [x1, y1, x2, y2, (x1+x2)/2, (y1+y2)/2, x2-x1, y2-y1, num, char, space, other, 0, 0]. Here, [x1, y1, x2, y2, (x1+x2)/2, (y1+y2)/2, x2-x1, y2-y1] is the position feature: (x1, y1) is the coordinate of the upper-left corner of the text box, (x2, y2) is the coordinate of the lower-right corner of the text box, ((x1+x2)/2, (y1+y2)/2) is the center point of the text box, x2-x1 is the width of the text box, and y2-y1 is the height of the text box; [num, char, space, other] is the text feature, representing respectively the frequency of digits, letters, spaces and other character types in the text box, calculated from the content of the text bar; the last two bits [0, 0] are the line type feature, set to 0 according to the preset line rule, indicating that the node is a text box, neither a horizontal line nor a vertical line.
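The 14-element node feature can be sketched as below. The function name `text_box_feature` is hypothetical, and two choices are assumptions for this example: "frequency" is implemented as a raw character count, and Python's `str.isdigit`, `str.isalpha` and `str.isspace` are used to classify characters (`isalpha` also accepts non-Latin letters).

```python
def text_box_feature(x1, y1, x2, y2, text):
    """Build the node feature of one text box: 8 position, 4 text, 2 line-type bits."""
    position = [x1, y1, x2, y2, (x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1]
    num = sum(ch.isdigit() for ch in text)
    char = sum(ch.isalpha() for ch in text)
    space = sum(ch.isspace() for ch in text)
    other = len(text) - num - char - space
    # The last two bits are 0, 0 because a text box is neither kind of line.
    return position + [num, char, space, other] + [0, 0]


# Hypothetical text box with upper-left (10, 20) and lower-right (110, 40).
feature = text_box_feature(10, 20, 110, 40, "Total 42%")
print(feature)
```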
For example: assuming that a total of M horizontal and vertical lines are detected in the form page to be identified, the table line features are L_F(M) = (F'(1), F'(2), ..., F'(M)), where the feature F'(x) of each line consists of eight position bits, four text bits and two line type bits. Here, [x1', y1', x2', y2', (x1'+x2')/2, (y1'+y2')/2, x2'-x1', y2'-y1'] is the position feature: (x1', y1') is the left endpoint of a horizontal line or the upper endpoint of a vertical line, (x2', y2') is the right endpoint of a horizontal line or the lower endpoint of a vertical line, ((x1'+x2')/2, (y1'+y2')/2) is the midpoint of the table grid line, and x2'-x1' and y2'-y1' are the lengths of a horizontal line and a vertical line, respectively; the middle four bits [0, 0, 0, 0] are the text feature, set to 0 according to the preset text condition to indicate a non-text node; the last two bits are the line type feature, where [0, 1] indicates a horizontal line and [1, 0] indicates a vertical line.
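A corresponding sketch for one table line is given below. The function name `table_line_feature` is hypothetical. Two points are assumptions: the line is classified as vertical when its vertical extent exceeds its horizontal extent, and [0, 1] is read as the horizontal-line code, since the text explicitly assigns only [1, 0] to vertical lines.

```python
def table_line_feature(x1, y1, x2, y2):
    """Build the feature of one table line: 8 position, 4 text, 2 line-type bits."""
    position = [x1, y1, x2, y2, (x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1]
    is_vertical = abs(y2 - y1) > abs(x2 - x1)
    line_type = [1, 0] if is_vertical else [0, 1]
    # The four text bits are all 0 because a line carries no text.
    return position + [0, 0, 0, 0] + line_type


# Hypothetical lines: one horizontal, one vertical.
horizontal = table_line_feature(0, 50, 200, 50)
vertical = table_line_feature(30, 0, 30, 120)
print(horizontal)
print(vertical)
```

Because text boxes and lines share the same 14-element layout, the two feature lists can be concatenated into a single input sequence for the model.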
In the embodiment of the invention, the document node characteristics are information of text strips in the form page to be identified, namely, content information of the form, the form line characteristics are information of form lines in the form page to be identified, namely, frame information of the form, and the form page to be identified is represented by constructing the document node characteristics and the form line characteristics.
The table detection module 104 is configured to perform table detection and recognition on the document node feature and the table line feature by using the standard table structure recognition model, so as to obtain a predicted table structure relationship.
In detail, the table detection module 104 is specifically configured to:
integrating the document node characteristics and the form line characteristics to obtain input characteristics;
extracting the characteristics of the input characteristics by utilizing a translation layer of the standard table structure recognition model to obtain node characteristics;
performing bilinear transformation on the node characteristic input by utilizing a transformation layer of the standard table structure recognition model to obtain edge characteristics;
and carrying out relation prediction on the edge characteristics by utilizing the fully connected layer of the standard table structure recognition model to obtain a predicted table structure relation, wherein the predicted table structure relation comprises a table relation, a row relation and a column relation.
Further, the table relationship, the row relationship and the column relationship in the predicted table structure relationship refer to the table relationship, the row relationship and the column relationship between any two document nodes in the table page to be identified.
Optionally, in the embodiment of the present invention, the standard table structure recognition model includes a translation layer, a transformation layer and a fully connected layer. The translation layer obtains the node feature of each node in the form page to be identified by performing operations such as encoding and decoding on the input features; the transformation layer obtains the feature between any two nodes, namely the edge feature, through a bilinear transformation; and the fully connected layer calculates the relation (adjacency relation) between any two nodes through its parameters to obtain the predicted table structure relation.
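The last two layers can be illustrated with a small pure-Python sketch that takes node features as given (standing in for the translation layer's output), forms a bilinear edge feature for every node pair, and applies one fully connected unit per relation. The function names, the sigmoid output and all weight values are assumptions for this example, not parameters of the trained model.

```python
import math


def bilinear_edge(h_i, h_j, weights):
    """Edge feature between two nodes: one value h_i^T W_k h_j per matrix W_k."""
    return [
        sum(h_i[a] * w[a][b] * h_j[b] for a in range(len(h_i)) for b in range(len(h_j)))
        for w in weights
    ]


def predict_relations(nodes, weights, fc_weights, fc_bias):
    """Return probs[r][i][j], the predicted probability of relation r between i and j."""
    n = len(nodes)
    probs = [[[0.0] * n for _ in range(n)] for _ in fc_weights]
    for i in range(n):
        for j in range(n):
            edge = bilinear_edge(nodes[i], nodes[j], weights)
            for r, (w_r, b_r) in enumerate(zip(fc_weights, fc_bias)):
                logit = sum(w * e for w, e in zip(w_r, edge)) + b_r
                probs[r][i][j] = 1.0 / (1.0 + math.exp(-logit))
    return probs


# Hypothetical 2-dimensional node features and hand-picked weights.
nodes = [[1.0, 0.0], [0.0, 1.0]]
weights = [[[1.0, 0.0], [0.0, 1.0]]]   # a single identity bilinear matrix
fc_weights = [[1.0], [0.0], [-1.0]]    # one unit each for table, row, column
fc_bias = [0.0, 0.0, 0.0]
probs = predict_relations(nodes, weights, fc_weights, fc_bias)
print(probs[0])
```

Thresholding each of the three probability matrices (for example at 0.5, an assumed value) would give the table, row and column adjacency matrices used in the restoration step.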
The table restoration module 105 is configured to restore the to-be-identified table page according to the predicted table structure relationship, so as to obtain a table structure.
In detail, the table restoration module 105 is specifically configured to:
performing text box detection and recognition on the form page to be recognized to obtain a plurality of text boxes;
taking each text box as a node according to the table relation in the predicted table structural relation, and constructing an undirected graph to obtain a table relation graph;
dividing the nodes into a plurality of table sets by solving connected components of the table relation graph;
respectively constructing a row relation graph and a column relation graph for each table set according to the row relation and the column relation of the predicted table structure relation;
solving the row maximal cliques in the row relation graph by using a maximal clique algorithm, and sorting the row maximal cliques in descending order of their ordinates to obtain a row set;
solving the column maximal cliques in the column relation graph by using a maximal clique algorithm, and sorting the column maximal cliques in ascending order of their abscissas to obtain a column set;
and integrating the row set and the column set to obtain a table structure.
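The final sorting and integration steps of the list above can be sketched as follows, given row and column cliques that have already been computed. The function names, the use of each clique's mean center coordinate as its ordinate or abscissa, and the intersection-based cell assignment are assumptions for this example. Rows are sorted in descending ordinate as the text states, which places the top row first if the y axis points upward.

```python
def order_cliques(cliques, centers, axis, reverse):
    """Sort cliques by the mean center coordinate of their nodes along one axis."""
    def mean_coord(clique):
        return sum(centers[node][axis] for node in clique) / len(clique)
    return sorted(cliques, key=mean_coord, reverse=reverse)


def restore_grid(row_cliques, col_cliques, centers):
    """Return grid[r][c], the list of text-box nodes in row r and column c."""
    rows = order_cliques(row_cliques, centers, axis=1, reverse=True)
    cols = order_cliques(col_cliques, centers, axis=0, reverse=False)
    return [[sorted(set(row) & set(col)) for col in cols] for row in rows]


# Hypothetical 2x2 table: node -> (x, y) center of its text box.
centers = {0: (10, 90), 1: (50, 90), 2: (10, 50), 3: (50, 50)}
row_cliques = [[2, 3], [0, 1]]
col_cliques = [[1, 3], [0, 2]]
print(restore_grid(row_cliques, col_cliques, centers))  # [[[0], [1]], [[2], [3]]]
```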
Further, a maximal connected subgraph of an undirected graph G is called a connected component of G. A connected graph has exactly one connected component, namely itself, while an undirected graph that is not connected has several connected components. A table can be regarded as a connected graph, so the multiple tables in the form page to be identified can be separated by solving for connected components.
A clique is a complete subgraph of an undirected graph. A maximal clique is a locally largest clique: a clique is called a maximal clique of the graph if it is not contained in any other clique, i.e., it is not a proper subset of any other clique.
The maximal clique algorithm solves for all maximal cliques of an undirected graph, and specifically comprises: generating all subgraphs of the undirected graph; judging whether each subgraph is a clique, and deleting the subgraphs that are not cliques to obtain the cliques; judging whether each clique is a maximal clique, and deleting the cliques that are not maximal to obtain the maximal cliques.
According to the embodiment of the invention, the row information and the column information of a table can be obtained by solving for the maximal cliques of the row relation graph and of the column relation graph, respectively.
The embodiment of the invention realizes the restoration of the table structure by a non-image processing method, and can greatly reduce the dependence on the image quality.
Fig. 3 is a schematic structural diagram of an electronic device for implementing a table structure recognition method according to an embodiment of the present invention.
The electronic device 1 may comprise a processor 10, a memory 11 and a bus, and may further comprise a computer program, such as a table structure identification program 12, stored in the memory 11 and executable on the processor 10.
The memory 11 includes at least one type of readable storage medium, including flash memory, a mobile hard disk, a multimedia card, a card memory (e.g., SD or DX memory, etc.), a magnetic memory, a magnetic disk, an optical disk, etc. The memory 11 may in some embodiments be an internal storage unit of the electronic device 1, such as a removable hard disk of the electronic device 1. The memory 11 may in other embodiments also be an external storage device of the electronic device 1, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the electronic device 1. Further, the memory 11 may also include both an internal storage unit and an external storage device of the electronic device 1. The memory 11 may be used not only for storing application software installed in the electronic device 1 and various types of data, such as codes of the table structure recognition program 12, but also for temporarily storing data that has been output or is to be output.
The processor 10 may be comprised of integrated circuits in some embodiments, for example, a single packaged integrated circuit, or may be comprised of multiple integrated circuits packaged with the same or different functions, including one or more central processing units (Central Processing unit, CPU), microprocessors, digital processing chips, graphics processors, combinations of various control chips, and the like. The processor 10 is a Control Unit (Control Unit) of the electronic device, connects respective components of the entire electronic device using various interfaces and lines, and executes various functions of the electronic device 1 and processes data by running or executing programs or modules (e.g., a table structure recognition program, etc.) stored in the memory 11, and calling data stored in the memory 11.
The bus may be a peripheral component interconnect standard (peripheral component interconnect, PCI) bus or an extended industry standard architecture (extended industry standard architecture, EISA) bus, among others. The bus may be classified as an address bus, a data bus, a control bus, etc. The bus is arranged to enable a connection communication between the memory 11 and at least one processor 10 etc.
Fig. 3 shows only an electronic device with components, it being understood by a person skilled in the art that the structure shown in fig. 3 does not constitute a limitation of the electronic device 1, and may comprise fewer or more components than shown, or may combine certain components, or may be arranged in different components.
For example, although not shown, the electronic device 1 may further include a power source (such as a battery) for supplying power to each component, and preferably, the power source may be logically connected to the at least one processor 10 through a power management device, so that functions of charge management, discharge management, power consumption management, and the like are implemented through the power management device. The power supply may also include one or more of any of a direct current or alternating current power supply, recharging device, power failure detection circuit, power converter or inverter, power status indicator, etc. The electronic device 1 may further include various sensors, bluetooth modules, wi-Fi modules, etc., which will not be described herein.
Further, the electronic device 1 may also comprise a network interface, optionally the network interface may comprise a wired interface and/or a wireless interface (e.g. WI-FI interface, bluetooth interface, etc.), typically used for establishing a communication connection between the electronic device 1 and other electronic devices.
The electronic device 1 may optionally further comprise a user interface, which may be a Display, an input unit, such as a Keyboard (Keyboard), or a standard wired interface, a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch, or the like. The display may also be referred to as a display screen or display unit, as appropriate, for displaying information processed in the electronic device 1 and for displaying a visual user interface.
It should be understood that the embodiments described are for illustrative purposes only, and that the scope of the patent application is not limited to this configuration.
The table structure identification program 12 stored in the memory 11 in the electronic device 1 is a combination of a plurality of instructions, which when run in the processor 10, can realize:
Acquiring a training data set and constructing a label of the training data set;
training the pre-constructed original table structure recognition model by utilizing the training data set and the label to obtain a standard table structure recognition model;
acquiring a form page to be identified, and constructing document node characteristics and form line characteristics of the form page to be identified;
performing table detection and recognition on the document node characteristics and the table line characteristics by using the standard table structure recognition model to obtain a predicted table structure relation;
and performing restoration processing on the form page to be identified according to the predicted table structure relationship to obtain a table structure.
Specifically, the specific implementation method of the above instructions by the processor 10 may refer to descriptions of related steps in the corresponding embodiments of fig. 1 to 3, which are not repeated herein.
Further, the modules/units integrated in the electronic device 1 may be stored in a computer readable storage medium if implemented in the form of software functional units and sold or used as separate products. The computer readable storage medium may be volatile or nonvolatile. For example, the computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a U disk, a removable hard disk, a magnetic disk, an optical disk, a computer Memory, a Read-Only Memory (ROM).
The present invention also provides a computer readable storage medium storing a computer program which, when executed by a processor of an electronic device, can implement:
acquiring a training data set and constructing a label of the training data set;
training the pre-constructed original table structure recognition model by utilizing the training data set and the label to obtain a standard table structure recognition model;
acquiring a form page to be identified, and constructing document node characteristics and form line characteristics of the form page to be identified;
performing table detection and recognition on the document node characteristics and the table line characteristics by using the standard table structure recognition model to obtain a predicted table structure relation;
and performing restoration processing on the form page to be identified according to the predicted table structure relationship to obtain a table structure.
In the several embodiments provided in the present invention, it should be understood that the disclosed apparatus, device and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical function division, and there may be other manners of division when actually implemented.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional modules in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in the form of hardware, or in the form of hardware plus software functional modules.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof.
The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanism, encryption algorithm and the like. The Blockchain (Blockchain), which is essentially a decentralised database, is a string of data blocks that are generated by cryptographic means in association, each data block containing a batch of information of network transactions for verifying the validity of the information (anti-counterfeiting) and generating the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.
Furthermore, it is evident that the word "comprising" does not exclude other elements or steps, and that the singular does not exclude a plurality. A plurality of units or means recited in the system claims can also be implemented by one unit or means through software or hardware. The terms first, second, etc. are used to denote names and do not denote any particular order.
Finally, it should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made to the technical solution of the present invention without departing from the spirit and scope of the technical solution of the present invention.

Claims (8)

1. A method of identifying a table structure, the method comprising:
acquiring a training data set, carrying out text box detection and recognition on the training data set to obtain a plurality of text boxes, taking each text box in the plurality of text boxes as a node, judging the adjacency relation between any two nodes according to a preset relation condition to obtain a table structure relation, and constructing an adjacency matrix according to the table structure relation to obtain a label;
training the pre-constructed original table structure recognition model by utilizing the training data set and the label to obtain a standard table structure recognition model, wherein the standard table structure recognition model comprises a translation layer, a transformation layer and a fully connected layer;
acquiring a form page to be identified, and constructing document node characteristics and form line characteristics of the form page to be identified;
integrating the document node characteristics and the table line characteristics to obtain input characteristics, encoding and decoding the input characteristics by utilizing the translation layer to obtain node characteristics, performing bilinear transformation on the node characteristics of any two nodes by utilizing the transformation layer to obtain edge characteristics, and obtaining the adjacency relation between any two nodes by utilizing the fully connected layer, so as to obtain a predicted table structure relationship, wherein the predicted table structure relationship comprises a table relationship, a row relationship and a column relationship;
and performing restoration processing on the form page to be identified according to the predicted table structure relationship to obtain a table structure.
2. The method of claim 1, wherein the acquiring a training dataset comprises:
crawling a plurality of PDF documents from a webpage, and analyzing and screening the PDF documents to obtain a plurality of form pages;
converting each form page into a page picture, and performing text detection and recognition on the page picture to obtain a recognition result;
and deleting the page pictures which do not accord with the preset rule in the page pictures according to the identification result to obtain a training data set.
3. The method for identifying a table structure according to claim 1, wherein training the pre-constructed original table structure identification model by using the training data set and the tag to obtain a standard table structure identification model comprises:
preprocessing the training data set to obtain training characteristics;
performing table recognition on the training features through the original table structure recognition model to obtain a relation prediction matrix;
calculating a loss value of the relation prediction matrix according to the label and a preset loss function;
and adjusting parameters of the original table structure identification model according to the loss value, and returning to the step of carrying out table identification on the training features through the original table structure identification model to obtain a relation prediction matrix until the loss value is not reduced any more to obtain a standard table structure identification model.
4. The table structure identification method according to claim 1, wherein the constructing document node features and table line features of the table page to be identified comprises:
performing text box detection and recognition on the table page to be identified to obtain a text box, wherein the text box comprises a plurality of text strips and corresponding text box coordinates;
constructing position features of the text box according to the text box coordinates of the text box;
constructing text features of the text box according to the text strips of the text box;
constructing line type features of the text box according to a preset line rule;
aggregating the position features, the text features and the line type features of the text box to obtain the document node features;
performing table line detection on the table page to be identified to obtain table lines;
constructing position features of the table lines according to the endpoint coordinates of the table lines;
constructing text features of the table lines according to a preset text condition;
constructing line type features of the table lines according to the types of the table lines;
and aggregating the position features, the text features and the line type features of the table lines to obtain the table line features.
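As a rough sketch of the feature construction in claim 4, the code below builds one fixed-length vector per text box and per table line so that both kinds of node can later be stacked into a single input. The concrete choices are assumptions, since the claim leaves the preset line rule and the preset text condition open: positions are normalized by the page size, the text features are the string length and digit ratio, table lines receive an all-zero text feature, and the line type is a one-hot code over text, horizontal line and vertical line.

```python
from dataclasses import dataclass


@dataclass
class TextBox:
    """Hypothetical container for one detected text box."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float


TYPE_TEXT, TYPE_HLINE, TYPE_VLINE = 0, 1, 2
EMPTY_TEXT = [0.0, 0.0]  # assumed text feature for table lines


def position_feature(x0, y0, x1, y1, page_w, page_h):
    """Coordinates scaled to the range [0, 1] by the page size."""
    return [x0 / page_w, y0 / page_h, x1 / page_w, y1 / page_h]


def text_feature(text):
    """Length and digit ratio of the recognized text (illustrative only)."""
    n = len(text)
    digits = sum(ch.isdigit() for ch in text)
    return [float(n), digits / n if n else 0.0]


def one_hot(index, size=3):
    return [1.0 if k == index else 0.0 for k in range(size)]


def text_box_node(box, page_w, page_h):
    """Document node feature: position + text + line type."""
    return (position_feature(box.x0, box.y0, box.x1, box.y1, page_w, page_h)
            + text_feature(box.text) + one_hot(TYPE_TEXT))


def table_line_node(line, page_w, page_h):
    """Table line feature built from the two endpoints of a detected line."""
    (x0, y0), (x1, y1) = line
    kind = TYPE_HLINE if abs(y1 - y0) <= abs(x1 - x0) else TYPE_VLINE
    return (position_feature(x0, y0, x1, y1, page_w, page_h)
            + EMPTY_TEXT + one_hot(kind))


# Both node kinds share one layout, so they can be concatenated into one input.
inputs = [text_box_node(TextBox("A1", 10, 20, 30, 40), 100, 200),
          table_line_node(((0, 50), (100, 50)), 100, 200)]
```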
5. The table structure identification method according to claim 1, wherein the performing restoration processing on the table page to be identified according to the predicted table structure relationship to obtain a table structure comprises:
performing text box detection and recognition on the table page to be identified to obtain a plurality of text boxes;
taking each text box as a node and constructing an undirected graph according to the table relationship in the predicted table structure relationship to obtain a table relationship graph;
dividing the nodes into a plurality of table sets by solving the connected components of the table relationship graph;
constructing a row relationship graph and a column relationship graph for each table set according to the row relationship and the column relationship in the predicted table structure relationship, respectively;
solving the row maximal cliques in the row relationship graph by using a maximal clique algorithm, and sorting the row maximal cliques in descending order of their ordinates to obtain a row set;
solving the column maximal cliques in the column relationship graph by using a maximal clique algorithm, and sorting the column maximal cliques in ascending order of their abscissas to obtain a column set;
and integrating the row set and the column set to obtain the table structure.
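The restoration steps of claim 5 can be sketched with standard graph routines. The claim does not name a specific clique algorithm, so the sketch assumes basic Bron-Kerbosch enumeration of maximal cliques; it also assumes that each clique is ordered by the mean center coordinate of its text boxes. All function and parameter names are hypothetical.

```python
def connected_components(n, edges):
    """Split nodes 0..n-1 into connected components of an undirected graph."""
    adj = {i: set() for i in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, components = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        components.append(sorted(comp))
    return components


def maximal_cliques(nodes, edges):
    """Enumerate maximal cliques among `nodes` (Bron-Kerbosch without pivoting)."""
    adj = {u: set() for u in nodes}
    for a, b in edges:
        if a in adj and b in adj:  # ignore edges that belong to other tables
            adj[a].add(b)
            adj[b].add(a)
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(sorted(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(nodes), set())
    return cliques


def restore_tables(centers, table_edges, row_edges, col_edges):
    """Group text boxes into tables, then into ordered rows and columns.

    centers[i] is the (x, y) center of text box i. Rows are sorted by
    descending mean y and columns by ascending mean x, as in the claim.
    """
    tables = []
    for comp in connected_components(len(centers), table_edges):
        rows = maximal_cliques(comp, row_edges)
        cols = maximal_cliques(comp, col_edges)
        rows.sort(key=lambda c: -sum(centers[i][1] for i in c) / len(c))
        cols.sort(key=lambda c: sum(centers[i][0] for i in c) / len(c))
        tables.append({"rows": rows, "cols": cols})
    return tables
```

Using cliques rather than connected components for rows and columns lets a cell that spans several columns belong to more than one column clique, which is one plausible reason for the choice made in the claim.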
6. A table structure identification device, the device comprising:
a data acquisition module, configured to acquire a training data set, perform text box detection and recognition on the training data set to obtain a plurality of text boxes, take each of the plurality of text boxes as a node, determine the adjacency relationship between any two nodes according to a preset relation condition to obtain a table structure relationship, and construct an adjacency matrix according to the table structure relationship to obtain a label;
a model training module, configured to train a pre-constructed original table structure identification model by using the training data set and the label to obtain a standard table structure identification model, wherein the standard table structure identification model comprises a translation layer, a transformation layer and a fully connected layer;
a feature construction module, configured to acquire a table page to be identified and construct document node features and table line features of the table page to be identified;
a table detection module, configured to integrate the document node features and the table line features to obtain input features, encode and decode the input features by using the translation layer to obtain node features, perform a bilinear transformation on the node features of any two nodes by using the transformation layer to obtain edge features, and determine the adjacency relationship between any two nodes by using the fully connected layer to obtain a predicted table structure relationship, wherein the predicted table structure relationship comprises a table relationship, a row relationship and a column relationship;
and a table restoration module, configured to perform restoration processing on the table page to be identified according to the predicted table structure relationship to obtain a table structure.
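The label construction performed by the data acquisition module (one node per text box, pairwise adjacency judged by a preset relation condition, result stored as an adjacency matrix) might look like the sketch below. The relation condition is not spelled out in the claim, so the sketch assumes that every text box carries a hypothetical annotation `(table_id, row, col)` and that two boxes are related when the corresponding fields match.

```python
def build_labels(cells):
    """Build 0/1 adjacency matrices for the table, row and column relations.

    cells[i] is an assumed (table_id, row, col) annotation for text box i.
    """
    n = len(cells)
    table = [[0] * n for _ in range(n)]
    row = [[0] * n for _ in range(n)]
    col = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # no self-loops in the relation graphs
            same_table = cells[i][0] == cells[j][0]
            table[i][j] = int(same_table)
            row[i][j] = int(same_table and cells[i][1] == cells[j][1])
            col[i][j] = int(same_table and cells[i][2] == cells[j][2])
    return {"table": table, "row": row, "col": col}


# Three boxes of one table plus one box of a second table.
labels = build_labels([(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 0)])
```

The three matrices correspond to the table, row and column relations that the model is later trained to predict.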
7. An electronic device, the electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the table structure identification method of any one of claims 1 to 5.
8. A computer readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the table structure identification method according to any one of claims 1 to 5.
CN202110206569.3A 2021-02-24 2021-02-24 Table structure identification method and device, electronic equipment and storage medium Active CN112949443B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110206569.3A CN112949443B (en) 2021-02-24 2021-02-24 Table structure identification method and device, electronic equipment and storage medium
PCT/CN2021/096534 WO2022178994A1 (en) 2021-02-24 2021-05-27 Table structure recognition method and apparatus, electronic device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110206569.3A CN112949443B (en) 2021-02-24 2021-02-24 Table structure identification method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112949443A (en) 2021-06-11
CN112949443B (en) 2023-07-25

Family

ID=76245817

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110206569.3A Active CN112949443B (en) 2021-02-24 2021-02-24 Table structure identification method and device, electronic equipment and storage medium

Country Status (2)

Country Link
CN (1) CN112949443B (en)
WO (1) WO2022178994A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113610043A (en) * 2021-08-19 2021-11-05 海默潘多拉数据科技(深圳)有限公司 Industrial drawing table structured recognition method and system
CN113762158A (en) * 2021-09-08 2021-12-07 平安资产管理有限责任公司 Borderless table recovery model training method, device, computer equipment and medium
CN113849552A (en) * 2021-09-27 2021-12-28 中国平安财产保险股份有限公司 Structured data conversion method and device, electronic equipment and medium
CN115116060B (en) * 2022-08-25 2023-01-24 深圳前海环融联易信息科技服务有限公司 Key value file processing method, device, equipment and medium
CN116127927B (en) * 2023-04-04 2023-06-16 北京智麟科技有限公司 Method for converting webpage form into PDF file

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11093740B2 (en) * 2018-11-09 2021-08-17 Microsoft Technology Licensing, Llc Supervised OCR training for custom forms
CN110334585B (en) * 2019-05-22 2023-10-24 平安科技(深圳)有限公司 Table identification method, apparatus, computer device and storage medium
CN111382717B (en) * 2020-03-17 2022-09-09 腾讯科技(深圳)有限公司 Table identification method and device and computer readable storage medium
CN111860257B (en) * 2020-07-10 2022-11-11 上海交通大学 Table identification method and system fusing multiple text features and geometric information
CN112381010A (en) * 2020-11-17 2021-02-19 深圳壹账通智能科技有限公司 Table structure restoration method, system, computer equipment and storage medium

Also Published As

Publication number Publication date
WO2022178994A1 (en) 2022-09-01
CN112949443A (en) 2021-06-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant