CN115527223A - Complex diagram extraction method and system based on computer vision and graph convolution network - Google Patents

Complex diagram extraction method and system based on computer vision and graph convolution network

Info

Publication number
CN115527223A
Authority
CN
China
Prior art keywords
convolution
image
picture
computer vision
graph
Prior art date
Legal status
Pending
Application number
CN202210667214.9A
Other languages
Chinese (zh)
Inventor
江秀
伍惠英
翁晓锋
曹凯
谢登峰
方声财
林晋瑶
陈榕城
Current Assignee
Fujian Yili Information Technology Co ltd
Original Assignee
Fujian Yili Information Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Fujian Yili Information Technology Co ltd filed Critical Fujian Yili Information Technology Co ltd
Priority to CN202210667214.9A
Publication of CN115527223A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40: Document-oriented image-based pattern recognition
    • G06V30/41: Analysis of document content
    • G06V30/412: Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/10: Text processing
    • G06F40/166: Editing, e.g. inserting or deleting
    • G06F40/174: Form filling; Merging
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10: Character recognition
    • G06V30/16: Image preprocessing
    • G06V30/1607: Correcting image deformation, e.g. trapezoidal deformation caused by perspective

Abstract

The invention relates to a complex diagram extraction method based on computer vision and a graph convolution network, comprising the following steps: step S1, rendering a document into an image and performing layout segmentation with computer vision and deep learning techniques; step S2, preprocessing the segmented image; step S3, analyzing the topological structure of the preprocessed image with a model based on the graph convolution network, and detecting and extracting the tables. The invention realizes end-to-end table detection and effectively improves detection efficiency and accuracy.

Description

Complex diagram extraction method and system based on computer vision and graph convolution network
Technical Field
The invention relates to the technical field of chart data analysis and extraction, in particular to a complex diagram extraction method and system based on computer vision and a graph convolution network.
Background
As applications deepen and data volumes grow, core data is scattered across the text, tables, and charts of company annual reports, financial reports, audit reports, and IPO reports, often in unstructured formats such as scanned documents. Locating and extracting this data by manual reading is time-consuming, and merely finding the core chart data takes considerable effort. Copying data from an original report and processing it until it finally enters an analysis model involves many steps; manual operation is error-prone, the algorithmic threshold is high, sample diversity is complex, and an enterprise IT department can rarely handle the task alone. The purpose of table recognition is to locate tables in images and access their data; it is an important branch of the field of document analysis and recognition. How to use this technology effectively, and how to find table regions in documents or images efficiently by intelligent means so as to realize intelligent analysis and intelligent extraction of data, is the pain point and challenge currently faced.
Disclosure of Invention
In view of this, the present invention provides a complex diagram extraction method based on computer vision and a graph convolution network, which realizes end-to-end table detection and effectively improves detection efficiency and accuracy.
In order to achieve the purpose, the invention adopts the following technical scheme:
a complex diagram extraction method based on computer vision and graph convolution network comprises the following steps:
step S1, rendering a document into an image, and performing layout segmentation by adopting a computer vision and deep learning technology;
step S2, preprocessing the segmented image;
and step S3, analyzing the topological structure of the preprocessed image with a model based on the graph convolution network, and detecting and extracting the tables.
Further, in step S1 a fully convolutional neural network is adopted to identify each independent region in the document page, including the title, paragraph, table, illustration, and data-chart layouts.
Further, the fully convolutional neural network performs image semantic segmentation through convolution, deconvolution, and a skip-level structure, specifically as follows:
the image is input into the convolutional neural network, and a series of feature maps are obtained through repeated convolution and pooling;
the resolution is then raised through upsampling; once the picture resolution matches the original picture, the regions with high weight are the regions where the targets lie;
finally, the image is restored by combining the upsampled data with the feature maps from the earlier convolution-pooling layers.
Furthermore, the fully convolutional neural network adopts a skip-level connection method: the feature maps extracted by the first few convolution layers are connected to the corresponding later upsampling layers and added to them, and upsampling then continues.
Further, the preprocessing comprises:
(1) Red seal occlusion
Where a red seal occludes the content, a seal-removal operation is performed on the document, and character recognition is performed afterwards;
(2) Wrinkles
For a scanned document or picture with wrinkles, the wrinkle condition is classified as resolvable, partially resolvable, or unresolvable. Unresolvable content is not parsed; an alarm is raised on the parsing result and it is handled by manual intervention. For resolvable content, the degree of wrinkling is assessed first: lightly wrinkled content remains clear and parsable, while the parsing accuracy of heavily wrinkled content falls below the average level. Clear, lightly wrinkled content is first corrected for tilt and inversion; the table purpose is then identified and compared against the labeled sample data;
(3) Tilted image
Where the scanned document or picture is tilted, the image is deskewed before analysis and then parsed by the scan and picture processing algorithm;
(4) Side-standing image
Where the scanned document or picture stands on its side, the image is rotated upright before analysis and then parsed by the scan and picture processing algorithm;
(5) Inverted image
Where the scanned document or picture is upside down, the image is rotated upright before analysis and then parsed by the scan and picture processing algorithm;
(6) Cross-page table merging
Where a table in the scanned document or picture is split across pages, the headers are compared first if headers exist, and the tables are merged according to the header content; if no header exists, the tables are merged according to table length and the number of segments;
(7) Table restoration
Cases of incomplete tables in the scanned document or picture include: both the beginning and the end are present, only the beginning is present, only the end is present, or no table is present at all. The table purpose is identified from the title of the text; after identification, matching and analysis are performed against the labeled sample results and the table is restored. Without a title, sample data matching is performed according to the labeling results, and the table is restored after matching. Without sample data, an early warning is raised for the table and manual intervention in the algorithm is performed.
Further, step S3 specifically comprises:
first, the information of the table structure is abstracted into row-column relationships between nodes: the character strings in the same column of the table constitute nodes with a 'same column' relationship, the character strings in the same row constitute nodes with a 'same row' relationship, and the digitally structured table is finally restored through the row-column relationships between the nodes;
next, a spatial relationship graph is constructed as an ε-neighbor graph: given the text information, position information, and picture information in a sample table data set, the ε nearest neighbor samples of each node x_i are found by Euclidean distance, and x_i is connected to each of them to form ε directed edges; every node in the space is processed in this way;
finally, a diffusion convolutional neural network is constructed: for the two text boxes indicated by each edge of the spatial relationship graph, text features, position features, and image features are acquired and modeled jointly, and a structural-position prediction is given for the two text boxes, from which the table is identified and extracted.
Further, the diffusion convolutional neural network regards graph convolution as a diffusion process: information is transferred from a node to its adjacent nodes with a certain transition probability, so that the information distribution reaches equilibrium after several rounds. The convolution operation of each layer is then expressed as:
H^(k) = f(W^(k) ⊙ P^k X)
where k denotes the layer index, P = D^(-1)A is the transition matrix, D is the node degree matrix, and A is the adjacency matrix; P^k determines the neighbor range observed by the convolution: k = 1 convolves over neighbor nodes at distance 1, and k = 2 over neighbor nodes at distance 2.
A system for the complex diagram extraction method based on computer vision and a graph convolution network comprises a labeling module, a training module, and an extraction module;
the labeling module realizes multi-event labeling of the same document through user-defined templates: first an index and a labeling template are created; then a labeling set is created and the files to be labeled are uploaded, with PDF and plain-text files supported; finally visual labeling is performed;
the training module dynamically updates and displays training progress and loss in a visual chart, and logs generated during training are fed back to a Web page in real time, making it convenient for algorithm staff to analyze and locate problems; after training, the precision, recall, and F1-Score of the overall and sub-indicators are output, and an indicator confusion matrix is generated;
the extraction module can immediately publish a successfully trained model as a directly callable HTTP model service, and the model service can be quickly verified by inputting text.
Compared with the prior art, the invention has the following beneficial effects:
1. the invention realizes end-to-end table detection, and the recall of its table detection is far higher than that of traditional table detection algorithms;
2. the method can generate a base model from a small number of annotations and provide algorithm-pre-labeled results; subsequent labeling tasks can select the trained model, and once a certain number of samples have accumulated they are fed back into algorithm training, further improving the generalization ability of the model; finally the document is rendered into an image, layout segmentation is performed visually, and the electronic document tables, table labels, and table extraction are displayed visually.
Drawings
FIG. 1 is a diagram of a full convolution neural network in accordance with an embodiment of the present invention;
FIG. 2 is a schematic diagram of an image with red seal occlusion according to an embodiment of the invention;
FIG. 3 illustrates the relationship between nodes after table abstraction according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a Diffusion Convolutional Neural Network (DCNN) according to an embodiment of the present invention;
FIG. 5 is a schematic illustration of visualization tagging in an embodiment of the present invention;
FIG. 6 is a graphical illustration of the dynamically updated visualization of training progress and loss in an embodiment of the invention.
Detailed Description
The invention is further explained below with reference to the drawings and the embodiments.
The invention provides a complex diagram extraction method based on computer vision and a diagram convolution network, which comprises the following steps:
step S1, rendering a document into an image, and performing layout segmentation by adopting a computer vision and deep learning technology;
referring to fig. 1, in the present embodiment, a full convolution neural network is used to identify each independent region in a document page, including a title, a paragraph, a table, an illustration, and a data diagram layout, and the full convolution neural network is trained using a pixel-level cross entropy loss function.
The fully convolutional network performs image semantic segmentation through convolution, deconvolution, and a skip-level structure, thereby realizing the layout segmentation. The main steps are as follows:
First, the RGB image is input into the convolutional neural network, and a series of feature maps are obtained through repeated convolution and pooling.
Next, the resolution is raised through upsampling; once the picture resolution matches the original picture, the regions with high weight are the regions where the targets lie.
Finally, the image is restored by combining the upsampled data with the feature maps from the earlier convolution-pooling layers (the skip-level structure).
While the feature map is still large during the early convolutions, the extracted image information is very rich, and the information loss in subsequent layers becomes more and more obvious. The last-layer feature map needs 32x upsampling to reach the original size, but upsampling the last layer alone still yields imprecise results, with some details inaccurate.
Therefore a skip-level connection method is adopted: the feature maps extracted by the first few convolution layers are connected to the corresponding later upsampling layers, added to them, and upsampling continues; after several rounds of upsampling a feature map of the same size as the original image is obtained.
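The skip-level decoding just described can be sketched in a few lines of PyTorch. The block below is a minimal illustration, not the patented network itself: the layer widths, the depth, and the five-class output (title, paragraph, table, illustration, data chart) are assumptions chosen to match the embodiment's description, and the pixel-level cross-entropy loss from this embodiment is shown at the end.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FCNSkip(nn.Module):
    """FCN-style layout segmenter: conv/pool encoder plus an upsampling
    decoder with skip connections from earlier feature maps (sizes illustrative)."""
    def __init__(self, num_classes=5):  # title/paragraph/table/illustration/data chart
        super().__init__()
        self.block1 = self._block(3, 64)     # 1/2 resolution
        self.block2 = self._block(64, 128)   # 1/4 resolution
        self.block3 = self._block(128, 256)  # 1/8 resolution
        self.score3 = nn.Conv2d(256, num_classes, 1)
        self.score2 = nn.Conv2d(128, num_classes, 1)
        self.score1 = nn.Conv2d(64, num_classes, 1)

    def _block(self, cin, cout):
        return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1),
                             nn.ReLU(inplace=True), nn.MaxPool2d(2))

    def forward(self, x):
        h, w = x.shape[2:]
        f1, f2, f3 = self.block1(x), None, None
        f2 = self.block2(f1)
        f3 = self.block3(f2)
        s = self.score3(f3)
        # upsample, then add the skip from the previous stage, twice
        s = F.interpolate(s, size=f2.shape[2:], mode='bilinear',
                          align_corners=False) + self.score2(f2)
        s = F.interpolate(s, size=f1.shape[2:], mode='bilinear',
                          align_corners=False) + self.score1(f1)
        # final upsampling back to the input resolution
        return F.interpolate(s, size=(h, w), mode='bilinear', align_corners=False)

# training uses a pixel-level cross-entropy loss, as the embodiment states
model = FCNSkip()
logits = model(torch.randn(1, 3, 256, 256))   # (1, 5, 256, 256)
target = torch.randint(0, 5, (1, 256, 256))   # per-pixel class labels
loss = nn.CrossEntropyLoss()(logits, target)
```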
Step S2: the segmented image is preprocessed. The preprocessing handles the following cases; a minimal OpenCV sketch follows the list:
(1) Red seal occlusion
Where a red seal occludes the content, a seal-removal operation is performed on the document, and character recognition is performed afterwards;
(2) Wrinkles
For a scanned document or picture with wrinkles, the wrinkle condition is classified as resolvable, partially resolvable, or unresolvable. Unresolvable content is not parsed; an alarm is raised on the parsing result and it is handled by manual intervention. For resolvable content, the degree of wrinkling is assessed first: lightly wrinkled content remains clear and parsable, while the parsing accuracy of heavily wrinkled content falls below the average level. Clear, lightly wrinkled content is first corrected for tilt and inversion; the table purpose is then identified and compared against the labeled sample data;
(3) Tilted image
Where the scanned document or picture is tilted, the image is deskewed before analysis and then parsed by the scan and picture processing algorithm;
(4) Side-standing image
Where the scanned document or picture stands on its side, the image is rotated upright before analysis and then parsed by the scan and picture processing algorithm;
(5) Inverted image
Where the scanned document or picture is upside down, the image is rotated upright before analysis and then parsed by the scan and picture processing algorithm;
(6) Cross-page table merging
Where a table in the scanned document or picture is split across pages, the headers are compared first if headers exist, and the tables are merged according to the header content; if no header exists, the tables are merged according to table length and the number of segments;
(7) Table restoration
Cases of incomplete tables in the scanned document or picture include: both the beginning and the end are present, only the beginning is present, only the end is present, or no table is present at all. The table purpose is identified from the title of the text; after identification, matching and analysis are performed against the labeled sample results and the table is restored. Without a title, sample data matching is performed according to the labeling results, and the table is restored after matching. Without sample data, an early warning is raised for the table and manual intervention in the algorithm is performed.
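The sketch below covers cases (1) and (3) through (6) with OpenCV. It is an illustration under assumptions, not the patent's algorithm: the HSV thresholds for the red seal, the ink-contour skew estimate, and the header-equality merge rule are stand-ins for whatever the embodiment actually uses.

```python
import cv2
import numpy as np

def remove_red_seal(bgr):
    """Case (1): mask red hues in HSV and inpaint them away before OCR.
    Two hue bands cover red's wrap-around; thresholds are illustrative."""
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, (0, 80, 80), (10, 255, 255)) | \
           cv2.inRange(hsv, (170, 80, 80), (180, 255, 255))
    return cv2.inpaint(bgr, mask, 3, cv2.INPAINT_TELEA)

def deskew(gray):
    """Case (3): estimate the dominant skew from a minimum-area rectangle
    around the ink pixels and rotate the page upright."""
    thresh = cv2.threshold(gray, 0, 255,
                           cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
    coords = np.column_stack(np.where(thresh > 0)).astype(np.float32)
    angle = cv2.minAreaRect(coords)[-1]
    if angle > 45:  # OpenCV >= 4.5 reports the angle in (0, 90]
        angle -= 90
    h, w = gray.shape
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    return cv2.warpAffine(gray, M, (w, h), flags=cv2.INTER_CUBIC,
                          borderMode=cv2.BORDER_REPLICATE)

# Cases (4) and (5): a side-standing or inverted page is squared up with
# cv2.rotate(img, cv2.ROTATE_90_CLOCKWISE) or cv2.rotate(img, cv2.ROTATE_180).

def merge_cross_page(part1, part2):
    """Case (6), header branch: if the second fragment (a list of rows)
    repeats the header row of the first, drop it and append the rest."""
    if part1 and part2 and part2[0] == part1[0]:
        part2 = part2[1:]
    return part1 + part2
```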
Step S3: the topological structure of the preprocessed image is analyzed with a model based on the graph convolution network, and the tables are detected and extracted.
In this embodiment, the information of the table structure is abstracted into row-column relationships between nodes: the character strings in the same column of the table constitute nodes with a 'same column' relationship, and the character strings in the same row constitute nodes with a 'same row' relationship. The digitally structured table can finally be restored through the row-column relationships between the nodes, as shown in Table 1 and FIG. 3.
Next, a spatial relationship graph is constructed as an ε-neighbor graph: given the text information, position information, and picture information in a sample table data set, the ε nearest neighbor samples of each node x_i are found by Euclidean distance, and x_i is connected to each of them to form ε directed edges; every node in the space is processed in this way.
Finally, a diffusion convolutional neural network is constructed: for the two text boxes indicated by each edge of the spatial relationship graph, text features, position features, and image features are acquired and modeled jointly, and a structural-position prediction is given for the two text boxes, from which the table is identified and extracted.
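The ε-neighbor graph construction above can be sketched with a k-d tree. This assumes each text box has already been fused into a fixed-length vector of text, position, and picture features (the fusion itself is not shown), and uses eps = 8 purely as an example:

```python
import numpy as np
from scipy.spatial import cKDTree

def build_eps_neighbor_graph(node_feats, eps=8):
    """node_feats: (N, d) array, one fused text/position/picture vector per
    text box. Returns directed edges (i, j): each node points to its `eps`
    nearest neighbors by Euclidean distance."""
    tree = cKDTree(node_feats)
    # query k = eps + 1 because each node's nearest hit is itself
    _, idx = tree.query(node_feats, k=eps + 1)
    return [(i, int(j)) for i, row in enumerate(idx) for j in row[1:]]

feats = np.random.rand(40, 16)            # 40 text boxes, 16-d fused features
edges = build_eps_neighbor_graph(feats)   # 40 * 8 directed edges
```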
The diffusion convolutional neural network regards graph convolution as a diffusion process: information is transferred from a node to its adjacent nodes with a certain transition probability, so that the information distribution reaches equilibrium after several rounds. The convolution operation of each layer is then expressed as:
H^(k) = f(W^(k) ⊙ P^k X)
where k denotes the layer index, P = D^(-1)A is the transition matrix, D is the node degree matrix, and A is the adjacency matrix; P^k determines the neighbor range observed by the convolution: k = 1 convolves over neighbor nodes at distance 1, and k = 2 over neighbor nodes at distance 2. When computing the node features of each layer, the node features of every hop are concatenated into a matrix, and the features of each node at the different hops are then transformed through linear transformations, finally yielding the feature matrix of the whole graph or of each node, as shown in FIG. 4.
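A numpy sketch of the forward pass of the layer H^(k) = f(W^(k) ⊙ P^k X) defined above, with ReLU standing in for f and random vectors standing in for the learned weights W^(k); following the diffusion-convolution formulation, the per-hop features are concatenated at the end, as the text describes.

```python
import numpy as np

def diffusion_conv_forward(X, A, K, seed=0):
    """X: (N, F) node features, A: (N, N) adjacency matrix, K: number of
    diffusion hops. Hop k applies an elementwise weight to P^k X, where
    P = D^{-1} A is the transition matrix from the equation above."""
    rng = np.random.default_rng(seed)
    deg = np.maximum(A.sum(axis=1), 1e-12)   # node degrees (avoid div by 0)
    P = A / deg[:, None]                     # P = D^{-1} A
    PkX, hops = X, []
    for k in range(1, K + 1):
        PkX = P @ PkX                          # P^k X: k-hop diffused features
        W_k = rng.standard_normal(X.shape[1])  # learnable per-hop weights (random here)
        hops.append(np.maximum(W_k * PkX, 0.0))  # f = ReLU, ⊙ = elementwise product
    # concatenate the per-hop node features into one matrix, as in the text
    return np.concatenate(hops, axis=1)        # (N, K * F)

A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # 3-node path graph
H = diffusion_conv_forward(np.eye(3), A, K=2)                 # (3, 6)
```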
In this embodiment, referring to FIG. 5 and FIG. 6, a system for the complex diagram extraction method based on computer vision and a graph convolution network is further provided, comprising a labeling module, a training module, and an extraction module;
the labeling module realizes multi-event labeling of the same document through user-defined templates: first an index and a labeling template are created; then a labeling set is created and the files to be labeled are uploaded, with PDF and plain-text files supported; finally visual labeling is performed;
the training module dynamically updates and displays training progress and loss in a visual chart, and logs generated during training are fed back to a Web page in real time, making it convenient for algorithm staff to analyze and locate problems; after training, the precision, recall, and F1-Score of the overall and sub-indicators are output, and an indicator confusion matrix is generated;
the extraction module can immediately publish a successfully trained model as a directly callable HTTP model service, and the model service can be quickly verified by inputting text.
The above description is only a preferred embodiment of the present invention; all equivalent changes and modifications made within the scope of the claims of the present invention shall be covered by the present invention.

Claims (8)

1. A complex diagram extraction method based on computer vision and graph convolution network is characterized by comprising the following steps:
step S1, rendering a document into an image, and performing layout segmentation by adopting a computer vision and deep learning technology;
step S2, preprocessing the segmented image;
and step S3, analyzing the topological structure of the preprocessed image with a model based on the graph convolution network, and detecting and extracting the tables.
2. The complex diagram extraction method based on computer vision and a graph convolution network according to claim 1, wherein in step S1 a fully convolutional neural network is adopted to identify each independent region in the document page, including the title, paragraph, table, illustration, and data-chart layouts.
3. The complex diagram extraction method based on computer vision and a graph convolution network according to claim 2, wherein the fully convolutional neural network performs image semantic segmentation through convolution, deconvolution, and a skip-level structure, specifically as follows:
the image is input into the convolutional neural network, and a series of feature maps are obtained through repeated convolution and pooling;
the resolution is then raised through upsampling; once the picture resolution matches the original picture, the regions with high weight are the regions where the targets lie;
finally, the image is restored by combining the upsampled data with the feature maps from the earlier convolution-pooling layers.
4. The complex diagram extraction method based on computer vision and a graph convolution network according to claim 3, wherein the fully convolutional neural network adopts a skip-level connection method: the feature maps extracted by the first few convolution layers are connected to the corresponding later upsampling layers and added to them, and upsampling then continues.
5. The complex diagram extraction method based on computer vision and a graph convolution network according to claim 1, wherein the preprocessing comprises:
(1) Red seal occlusion
Where a red seal occludes the content, a seal-removal operation is performed on the document, and character recognition is performed afterwards;
(2) Wrinkles
For a scanned document or picture with wrinkles, the wrinkle condition is classified as resolvable, partially resolvable, or unresolvable. Unresolvable content is not parsed; an alarm is raised on the parsing result and it is handled by manual intervention. For resolvable content, the degree of wrinkling is assessed first: lightly wrinkled content remains clear and parsable, while the parsing accuracy of heavily wrinkled content falls below the average level. Clear, lightly wrinkled content is first corrected for tilt and inversion; the table purpose is then identified and compared against the labeled sample data;
(3) Tilted image
Where the scanned document or picture is tilted, the image is deskewed before analysis and then parsed by the scan and picture processing algorithm;
(4) Side-standing image
Where the scanned document or picture stands on its side, the image is rotated upright before analysis and then parsed by the scan and picture processing algorithm;
(5) Inverted image
Where the scanned document or picture is upside down, the image is rotated upright before analysis and then parsed by the scan and picture processing algorithm;
(6) Cross-page table merging
Where a table in the scanned document or picture is split across pages, the headers are compared first if headers exist, and the tables are merged according to the header content; if no header exists, the tables are merged according to table length and the number of segments;
(7) Table restoration
Cases of incomplete tables in the scanned document or picture include: both the beginning and the end are present, only the beginning is present, only the end is present, or no table is present at all. The table purpose is identified from the title of the text; after identification, matching and analysis are performed against the labeled sample results and the table is restored. Without a title, sample data matching is performed according to the labeling results, and the table is restored after matching. Without sample data, an early warning is raised for the table and manual intervention in the algorithm is performed.
6. The complex diagram extraction method based on computer vision and a graph convolution network according to claim 1, wherein step S3 specifically comprises:
first, abstracting the information of the table structure into row-column relationships between nodes: the character strings in the same column of the table constitute nodes with a 'same column' relationship, the character strings in the same row constitute nodes with a 'same row' relationship, and the digitally structured table is finally restored through the row-column relationships between the nodes;
next, constructing a spatial relationship graph as an ε-neighbor graph: given the text information, position information, and picture information in a sample table data set, the ε nearest neighbor samples of each node x_i are found by Euclidean distance, and x_i is connected to each of them to form ε directed edges; every node in the space is processed in this way;
finally, constructing a diffusion convolutional neural network: for the two text boxes indicated by each edge of the spatial relationship graph, text features, position features, and image features are acquired and modeled jointly, and a structural-position prediction is given for the two text boxes, from which the table is identified and extracted.
7. The complex diagram extraction method based on computer vision and a graph convolution network according to claim 6, wherein the diffusion convolutional neural network regards graph convolution as a diffusion process: information is transferred from a node to its adjacent nodes with a certain transition probability, so that the information distribution reaches equilibrium after several rounds, and the convolution operation of each layer is expressed as:
H^(k) = f(W^(k) ⊙ P^k X)
where k denotes the layer index, P = D^(-1)A is the transition matrix, D is the node degree matrix, and A is the adjacency matrix; P^k determines the neighbor range observed by the convolution: k = 1 convolves over neighbor nodes at distance 1, and k = 2 over neighbor nodes at distance 2.
8. A system for the complex diagram extraction method based on computer vision and a graph convolution network, characterized by comprising a labeling module, a training module, and an extraction module;
the labeling module realizes multi-event labeling of the same document through user-defined templates: first an index and a labeling template are created; then a labeling set is created and the files to be labeled are uploaded, with PDF and plain-text files supported; finally visual labeling is performed;
the training module dynamically updates and displays training progress and loss in a visual chart, and logs generated during training are fed back to a Web page in real time, making it convenient for algorithm staff to analyze and locate problems; after training, the precision, recall, and F1-Score of the overall and sub-indicators are output, and an indicator confusion matrix is generated;
the extraction module can immediately publish a successfully trained model as a directly callable HTTP model service, and the model service can be quickly verified by inputting text.
CN202210667214.9A (filed 2022-06-14, priority 2022-06-14) | CN115527223A (pending) | Complex diagram extraction method and system based on computer vision and graph convolution network

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202210667214.9A | 2022-06-14 | 2022-06-14 | Complex diagram extraction method and system based on computer vision and graph convolution network

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202210667214.9A | 2022-06-14 | 2022-06-14 | Complex diagram extraction method and system based on computer vision and graph convolution network

Publications (1)

Publication Number | Publication Date
CN115527223A | 2022-12-27

Family

ID=84696314

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202210667214.9A | Complex diagram extraction method and system based on computer vision and graph convolution network (pending) | 2022-06-14 | 2022-06-14

Country Status (1)

Country Link
CN (1) CN115527223A (en)

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination