WO2022062798A1

WO2022062798A1 - RPA and AI-based table information extraction method and apparatus, device and medium

Info

Publication number: WO2022062798A1
Application number: PCT/CN2021/114068
Authority: WO
Inventors: 汪冠春; 胡一川; 褚瑞; 李玮; 张海雷; 白龙飞
Original assignee: 北京来也网络科技有限公司; 来也科技(北京)有限公司
Priority date: 2020-09-25
Filing date: 2021-08-23
Publication date: 2022-03-31
Also published as: CN112149399A

Abstract

An RPA and AI-based table information extraction method and apparatus, a device and a medium. Said method comprises: S1, converting a file containing a table into a picture; S2, recognizing the table in the picture, and according to a recognition result, generating an information extraction template corresponding to a table type, the information extraction template containing keys of key-value pairs in the table and position information thereof, and position information of values of the key-value pairs to be extracted; and S3, extracting table content from the recognition result according to the information extraction template. Said method reduces labor costs, improves the universality of the information extraction template, and increases the accuracy of table content extraction.

Description

Table information extraction method, apparatus, device and medium based on RPA and AI

Technical field

The present invention relates to the technical field of table processing, and in particular, to a method, apparatus, device and medium for extracting table information based on RPA and AI.

Background

RPA (Robotic Process Automation) uses dedicated "robot software" to simulate human operations on a computer and automatically execute process tasks according to rules.

AI (Artificial Intelligence) is a new technical science that studies and develops theories, methods, technologies and application systems for simulating, extending and expanding human intelligence.

RPA has distinctive advantages: it is low-code and non-intrusive. Low-code means that operating RPA does not require a high level of IT skill, so business personnel who do not understand programming can develop processes; non-intrusive means that RPA can simulate human operations without the software systems having to open their interfaces. However, traditional RPA has certain limitations: it can only work from fixed rules, and its application scenarios are limited. With the continuous development of AI technology, the deep integration of RPA and AI has overcome these limitations. RPA + AI = hand work + head work, and this combination is greatly changing the value of labor.

While processing tasks, RPA encounters a large amount of tabular data. Enterprises and institutions in particular may face a large amount of tabular data every day, so it is particularly valuable to correctly extract useful information from this data and enter it into the designated system. At present, this is generally done in one of two ways. The first is to manually screen the information in the table, select the useful information, and then manually enter it into the system. The second is to summarize, with manual intervention, the matching rules of the various tables, that is, to specify a rule template according to the structure information of each table, extract the table information with a program or algorithm, adapt it to the structure of the target system as needed, and then enter the extracted information into the system either by program or manually.

However, with the first method, errors may occur during manual screening and entry because of lapses or habitual patterns in human thinking, and the labor cost is high. The second method has the following defects: (1) Table structures are inconsistent, so different rules need to be summarized manually, and generality is insufficient. (2) System architectures are inconsistent, which places higher demands on the designer's programming ability when designing programs or algorithms, and the resulting programs are not general enough. For example, when the system architecture changes, the designer has to make relatively large changes to the program, which is time-consuming and labor-intensive and results in low work efficiency.

SUMMARY OF THE INVENTION

The present invention provides a table information extraction method, apparatus, device and medium based on RPA and AI, so as to overcome at least one technical problem existing in the prior art.

In a first aspect of the embodiments of the present invention, a method for extracting table information based on RPA and AI is provided, and the method includes:

S1. Convert the file containing the table into a picture;

S2. Identify the table in the picture, and generate an information extraction template corresponding to the table type according to the identification result, where the information extraction template includes the key of each key-value pair in the table and the position information of that key, and the position information of the value of each key-value pair to be extracted;

S3. Extract table content from the recognition result according to the information extraction template.

In a second aspect of the embodiments of the present invention, a table information extraction device based on RPA and AI is provided, and the device includes:

An image conversion module, configured to convert a file containing a table into a picture;

A template generation module, configured to identify the table in the picture, and generate an information extraction template corresponding to the table type according to the identification result, where the information extraction template includes the key of each key-value pair in the table and the position information of that key, and the position information of the value of each key-value pair to be extracted;

The content extraction module is configured to extract table content from the recognition result according to the information extraction template.

In a third aspect, an embodiment of the present invention further provides a computing device, including:

a memory in which executable program code is stored;

a processor coupled to the memory;

The processor invokes the executable program code stored in the memory to execute part or all of the steps of the table information extraction method based on RPA and AI provided by any embodiment of the present invention.

In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium storing a computer program, where the computer program includes instructions for executing some or all of the steps of the table information extraction method based on RPA and AI provided by any embodiment of the present invention.

In the technical solution provided by the embodiments of the present invention, when extracting table information, a file containing a table can be converted into a picture, so as to associate the content of the cells in the table with the table. By identifying the table in the picture, an information extraction template corresponding to the table type can be generated according to the identification result. The information extraction template includes the key of each key-value pair in the table and the position information of that key, as well as the position information of the value of each key-value pair to be extracted. According to the information extraction template, the table content can be extracted from the recognition result. By adopting the above technical solution, the problems of high labor cost and poor accuracy when manually extracting table information are avoided. Moreover, compared with the method of summarizing the matching rules of various tables by manual intervention, the method provided by this embodiment does not require developers to summarize different rules, and is more general.

The innovative points of the embodiments of the present invention include:

1. By converting the file containing the table into a picture, and by identifying the table in the picture, an information extraction template corresponding to the table type can be generated according to the recognition result. According to the information extraction template, the table content can be extracted from the recognition result, which avoids the problems of high labor cost and poor accuracy when manually extracting table information. Moreover, compared with the method of summarizing matching rules of various forms by manual intervention, the method provided by this implementation does not require developers to summarize different rules, and is more versatile. This is one of the innovative points of the embodiments of the present invention.

2. First converting the file containing the table into a picture, and then recognizing the table in the picture, improves the reliability of the table data and helps to improve the generality of the information extraction template, which is one of the innovative points of the embodiments of the present invention.

3. The generated information extraction template contains some special syntax identifiers, such as square brackets and angle brackets. These identifiers are determined based on table properties, which are related to the content and position of cells in the table. When the information extraction template is used to extract table content, the content in the template must be matched against the recognition result of the picture according to the preset meaning of each syntax identifier: for example, angle brackets indicate that the enclosed content is to be fuzzy matched, and square brackets indicate that it is to be strictly matched. This setting helps to improve the accuracy of table content extraction, which is one of the innovative points of the embodiments of the present invention.

4. The position information of each key-value pair in the information extraction template is represented in the form of a regular expression. This setting can avoid the problem that the table content cannot be accurately extracted due to disordered row and column information of the cells in the OCR identification result, which is one of the innovative points of the embodiments of the present invention.

5. The first information extraction template, corresponding to a left-right one-to-one table, is generated from the content of each row in the table. That is, one first information extraction template is generated for each row, so the number of rows in the table equals the number of first information extraction templates. Compared with generating a template for each key-value pair in the table, this embodiment reduces the number of information extraction templates and improves the speed of template generation, which is one of the innovative points of the embodiments of the present invention.

6. For a table in the upper-lower one-to-many format or the left-right one-to-many format, if the first column of the values to be extracted is not enumerable or is irregular, then when generating the information extraction template corresponding to this type of table, an auxiliary variable can be added before the cells in the first column of the table to distinguish the contents of different rows. This avoids extracting the content of the next row as the content of the current row during information extraction, which is one of the innovative points of the embodiments of the present invention.

7. The information extraction template corresponds to the table type and is highly general. That is, if there are multiple tables of the same type in the picture, the method provided by the embodiment of the present invention can generate a single information extraction template for all of them. With this template, the contents of multiple tables of the same type can be extracted, which improves the speed of extracting table contents, which is one of the innovative points of the embodiments of the present invention.

Description of drawings

In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the following briefly introduces the accompanying drawings that need to be used in the description of the embodiments or the prior art. Obviously, the accompanying drawings in the following description show only some embodiments of the present invention. For those of ordinary skill in the art, other drawings can also be obtained from these drawings without creative effort.

FIG. 1a is a flowchart of table information extraction and input based on the combination of RPA+AI provided by Embodiment 1 of the present invention;

FIG. 1b is a schematic diagram of a field establishment interface according to Embodiment 1 of the present invention;

FIG. 1c is a schematic interface diagram of a pre-release information extraction template provided by Embodiment 1 of the present invention;

FIG. 1d is a schematic interface diagram of a post-release information extraction template provided by Embodiment 1 of the present invention;

FIG. 1e is a schematic interface diagram of an information extraction template corresponding to a one-to-many type table after publishing provided by Embodiment 1 of the present invention;

FIG. 2 is a schematic flowchart of a method for extracting table information based on RPA and AI according to Embodiment 2 of the present invention;

FIG. 3 is a flowchart of a preferred method for extracting table information based on RPA and AI provided by Embodiment 3 of the present invention;

FIG. 4 is a flowchart of a preferred method for extracting table information based on RPA and AI provided by Embodiment 4 of the present invention;

FIG. 5 is a flowchart of a preferred method for extracting table information based on RPA and AI provided by Embodiment 5 of the present invention;

FIG. 6 is a flowchart of a preferred method for extracting table information based on RPA and AI provided by Embodiment 6 of the present invention;

FIG. 7 is a flowchart of a preferred method for extracting table information based on RPA and AI provided by Embodiment 7 of the present invention;

FIG. 8 is a schematic structural diagram of an apparatus for extracting table information based on RPA and AI according to Embodiment 8 of the present invention;

FIG. 9 is a schematic structural diagram of a computing device according to Embodiment 9 of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, but not all of the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative efforts shall fall within the protection scope of the present invention.

It should be noted that the terms "comprising" and "having" and any variations thereof in the embodiments of the present invention and the accompanying drawings are intended to cover non-exclusive inclusion. For example, a process, method, system, product or device comprising a series of steps or units is not limited to the listed steps or units, but optionally also includes unlisted steps or units, or optionally also includes other steps or units inherent to these processes, methods, products or devices.

In the description of the present invention, "template" is a text expression provided by the developer for the "information extraction" function. Use this expression to match several fragments of text and extract information. For the template provided by the embodiment of the present invention, you need to understand the following necessary syntax:

1. The "square brackets" [] represent strict matching. The matching content can be the "vocabulary", "regular expression" defined in the resource in advance, or the phrase that needs to be matched.

2. "Angle brackets" <> represent fuzzy matching. Fuzzy matching is a concept corresponding to strict matching. Strict matching requires that the text to be matched must be exactly the same as the specified matching content. Fuzzy matching only needs to be semantically close to each other, that is, the similarity needs to be greater than the set threshold.

3. The symbol <*> represents matching text fragments of any length.

4. To match any of the characters {, }, [, ], <, >, | or * literally in the template, escape it with "\".

5. ^: can only appear at the beginning of the template, and is used to define that the template must match from the start of the text.

6. $: can only appear at the end of the template, and is used to define that the template must match to the end of the text.
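The syntax rules above can be illustrated with a minimal Python sketch. The helper names (escape_literal, strict_match, fuzzy_match) and the 0.8 threshold are hypothetical and do not come from the embodiment. The description speaks of semantic closeness without naming a similarity measure, so the character-level ratio from Python's difflib is used here purely as a stand-in.

```python
import difflib

# Characters that rule 4 says must be escaped with a backslash.
SPECIAL = set("{}[]<>|*")


def escape_literal(text: str) -> str:
    """Backslash-escape template metacharacters so they match literally."""
    return "".join("\\" + ch if ch in SPECIAL else ch for ch in text)


def strict_match(expected: str, actual: str) -> bool:
    """Rule 1: square brackets require the text to be exactly the same."""
    return expected == actual


def fuzzy_match(expected: str, actual: str, threshold: float = 0.8) -> bool:
    """Rule 2: angle brackets accept text whose similarity exceeds a threshold.

    Assumption: similarity is approximated by difflib's character-level ratio.
    """
    ratio = difflib.SequenceMatcher(None, expected, actual).ratio()
    return ratio > threshold
```

With these helpers, a key that OCR has misread by one character, such as "Birthp1ace" for "Birthplace", still passes the fuzzy check but fails the strict one.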

In the description of the present invention, "field" refers to the key information extracted from the template, which is a name specific to the current information extraction task, and the name is generally designated by the user.

In the description of the present invention, a "vocabulary" is an information structure composed of <vocabulary name, vocabulary value, and various expressions of vocabulary value>. A vocabulary describes a relatively fixed class of "external knowledge" in lexical form that is strongly related to the developer's field.
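As a rough illustration, the three-part vocabulary structure could be modeled as follows. The class name VocabularyEntry, its matches method and the sample city values are all hypothetical, chosen only to show the shape of the structure.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VocabularyEntry:
    name: str                                             # vocabulary name
    value: str                                            # canonical vocabulary value
    expressions: List[str] = field(default_factory=list)  # other ways of writing the value

    def matches(self, text: str) -> bool:
        """True when the text is the value itself or one of its expressions."""
        return text == self.value or text in self.expressions


# Hypothetical entry: one value with two alternative expressions.
city = VocabularyEntry(name="city", value="Beijing", expressions=["Peking", "BJ"])
```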

In the description of the present invention, a "regular expression" is a logical formula for operating on strings that describes a string-matching pattern. It can be used to check whether a string contains a certain substring, to replace a matched substring, or to extract from a string a substring that meets a certain condition.

The embodiment of the present invention discloses a table information extraction method, apparatus, device and medium based on RPA and AI. Each of them will be described in detail below.
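The three uses named above (checking, replacing and extracting) can be shown with Python's standard re module. The sample text and the date pattern are invented for illustration only.

```python
import re

text = "Date of birth: 1990-05-17"           # invented sample text
date_pattern = re.compile(r"\d{4}-\d{2}-\d{2}")

contains_date = date_pattern.search(text) is not None  # check for a matching substring
masked = date_pattern.sub("<date>", text)               # replace the matched substring
extracted = date_pattern.findall(text)                  # extract matching substrings
```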

Embodiment 1

Robotic Process Automation, RPA for short, uses dedicated "robot software" to simulate human operations on a computer and automatically perform process tasks according to rules.

AI is the abbreviation of Artificial Intelligence. It is a new technical science that studies and develops theories, methods, technologies and application systems for simulating, extending and expanding human intelligence.

With the continuous development of Internet technology, massive amounts of data are accumulated, including unstructured data such as text, pictures and videos, and structured data such as tabular data. Extracting useful information from these massive data sets takes a lot of manpower and material resources.

Enterprises and institutions may face massive amounts of tabular data every day. Relying solely on manpower to correctly extract useful information from this tabular data and enter it into the designated system is not only very expensive but also, in many cases, very error-prone, which can cause incalculable losses. Based on this consideration, this embodiment proposes a table information extraction method based on RPA+AI, so as to realize table information extraction and automatic entry of table information. Fig. 1a is a flow chart of table information extraction and input based on the combination of RPA+AI provided by Embodiment 1 of the present invention, and each step in Fig. 1a is introduced below:

110. Convert table files into pictures with the help of RPA technology.

Conventionally, the file could be parsed by writing a large amount of code and rules. However, because table forms are so diverse, the parsing programs and rules written in this way often cannot be reused, which increases the development cost. To solve this problem, in this embodiment an automated service platform, such as the Uibot software, can be used to build a process that automatically converts the table file into a picture.

120. Perform OCR identification on the generated picture.

In this embodiment, OCR (Optical Character Recognition) technology can be used to recognize the picture. After OCR identification, the content of each cell and the position information of each cell in the table are returned. The position information includes information such as the start row index, start column index, end row index and end column index.
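The shape of such a recognition result can be sketched as a small data structure. The class name Cell, the is_merged property and the assumption that the end indices are inclusive are illustrative choices; the embodiment does not specify the exact format returned by the OCR service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    text: str        # recognized cell content
    start_row: int   # 0-based index of the first row the cell occupies
    start_col: int   # 0-based index of the first column the cell occupies
    end_row: int     # assumed inclusive
    end_col: int     # assumed inclusive

    @property
    def is_merged(self) -> bool:
        """A cell spanning more than one row or column is a merged cell."""
        return self.end_row > self.start_row or self.end_col > self.start_col


# Two cells as they might be returned for the first row of a personal information form.
ocr_result = [
    Cell("name", 0, 0, 0, 0),
    Cell("Zhang San", 0, 1, 0, 1),
]
```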

130. Automatically generate a form information extraction template.

Step 130 is the key point of the embodiment of the present invention, that is, a template for extracting table information is automatically generated according to the OCR recognition result of the table image. In this embodiment, the form information extraction template corresponds to the form type. For different types of tables, information extraction templates corresponding to the table types can be generated by calling different template interfaces. Next, the step of automatically generating the form information extraction template will be analyzed from the following two aspects:

a. The table is in left-right key-value form, as shown in Table 1 below. The table type of Table 1 can be regarded as one-to-one: the field to be extracted is on the left, such as name, while the corresponding value is on the right, such as Zhang San. After OCR is performed on the picture containing Table 1, the row and column indices corresponding to each cell can be obtained (row and column indices are counted from 0). For example, the row and column indices of the cell "name" are 0 and 0 respectively, and the row and column indices of the cell "Hefei City, Anhui Province" are 1 and 5.

Table 1 Personal Information Form

In this embodiment, before generating the information extraction template, the user can create the fields to be extracted according to the key-value pairs in the table. For example, for the above Table 1, the fields that the user needs to extract may include: name, age, gender, ethnicity, Birthplace, date of birth, place of birth, education, master's degree and current city of residence, etc. FIG. 1b is a schematic diagram of a field establishment interface according to Embodiment 1 of the present invention. As shown in Figure 1b, the interface shows the following partial fields created by the user: "place of birth", "name", "educational education", "age" and "place of origin".

In this embodiment, for a one-to-one left-right table, the corresponding information extraction template is generated by row. When generating the information extraction template, the contents of each cell in the table need to be spliced according to the position information of the row and column in the table. For each line of content in the spliced content, the location information and content of each cell are included.
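The splicing step can be sketched as follows. The function name splice_cells and the (row, col, text) tuple layout are hypothetical, and the "@R{row}@C{col}-" marker format is inferred from the position markers that appear in the templates of this embodiment rather than stated explicitly for the spliced content.

```python
from itertools import groupby


def splice_cells(cells):
    """Splice OCR cells into one text line per table row.

    cells: iterable of (row, col, text) tuples in any order.
    Each cell is written as "@R{row}@C{col}-{text}" (assumed format).
    """
    ordered = sorted(cells, key=lambda c: (c[0], c[1]))
    lines = []
    for _, group in groupby(ordered, key=lambda c: c[0]):
        line = "".join(f"@R{r}@C{c}-{text}" for r, c, text in group)
        lines.append(line + "\n")
    return lines
```

Sorting by row and column before splicing means the result does not depend on the order in which the OCR step happens to return the cells.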

The information extraction template corresponding to each line of content can be generated as follows: the position information of each key in the spliced content is used as the position information of the key in the information extraction template; each key in the spliced content is used as a key in the information extraction template; and the position information of the value of each key-value pair in the spliced content is used as the position information of the value to be extracted in the information extraction template. In addition, corresponding syntax identifiers need to be added to the key of each key-value pair and its position information, and to the position information of the value of each key-value pair to be extracted, for use in the subsequent extraction of table content.

Specifically, for the first line of content in Table 1 above, the corresponding information extraction template is:

[@R0@C0-]<name>[@R0@C1-]{name:<*:0,>}[@R0@C2-]<age>[@R0@C3-]{age:<*:0,>}[@R0@C4-]<gender>[@R0@C5-]<*:0,>[@R0@C6-]<ethnicity>[@R0@C7-]<*:0,>[\n].

For the second line of content in Table 1 above, the corresponding information extraction template is:

[@R1@C0-]<Hometown>[@R1@C1-]{Hometown:<*:0,>}[@R1@C2-]<Birthplace>[@R1@C3-]<*:0,>[@R1@C4-]<Birthplace>[@R1@C5-]{Birthplace:<*:0,>}[\n].

For the content of the third row in Table 1 above, the corresponding information extraction template is:

[@R2@C0-]<educational education>[@R2@C1-]{educational education:<*:0,>}[@R2@C2-]<current residence city>[@R2@C3-]<*:0,>[\n].
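The row templates above follow a regular pattern that can be reproduced with a short sketch. The function name build_row_template and its parameters are hypothetical, and the sketch assumes that keys sit in even columns with their values in the following odd columns, as in the examples for Table 1.

```python
def build_row_template(row, keys, wanted_fields):
    """Build a template for one row of a left-right one-to-one table.

    row: 0-based row index.
    keys: key texts in column order; each key is followed by its value cell.
    wanted_fields: keys whose values the user chose to output to a field.
    """
    parts = []
    for i, key in enumerate(keys):
        key_col, value_col = 2 * i, 2 * i + 1
        parts.append(f"[@R{row}@C{key_col}-]<{key}>")  # strict position, fuzzy key
        parts.append(f"[@R{row}@C{value_col}-]")       # strict position of the value
        if key in wanted_fields:
            parts.append(f"{{{key}:<*:0,>}}")          # capture the value into a field
        else:
            parts.append("<*:0,>")                     # match the value without output
    parts.append("[\\n]")                              # row terminator, written as [\n]
    return "".join(parts)


first_row = build_row_template(0, ["name", "age", "gender", "ethnicity"], {"name", "age"})
```

Called with the keys of the first row and the fields name and age selected, this returns the first-row template text, without the sentence-final period.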

Specifically, FIG. 1c is a schematic interface diagram of an information extraction template before publishing according to Embodiment 1 of the present invention. As shown in Figure 1c, for the content of the third row in the above Table 1, a column of matching text corresponds to each node in the table, that is, each key-value pair in the table. The user can choose whether to output to the specified field for each generated node. For example, in the first row of the table, if the user wants to extract the name and age, then select the name and age in the output to field below. For the table content that the user does not want to extract, select "Do not output" in the output to field.

FIG. 1d is a schematic interface diagram of a post-release information extraction template provided by Embodiment 1 of the present invention. In this embodiment, after the information extraction template corresponding to each line of content is generated, if a template publishing instruction triggered by the user is received, the finally displayed information template corresponding to Table 1 above is displayed to the user. In addition, the user can edit, copy and delete any one of the templates in Figure 1d.

b. The table format is in the form of upper and lower key-value, as shown in Table 2 below.

Table 2 Project Information Sheet

Project	Notes	2020 half year - amount	2019 half year - amount
I. Total operating income		1545105967.98	1559860357.35
Of which: Operating income		1545105967.98	1559860357.35
Interest income
Premiums earned
Fee and commission income
II. Total operating costs		1509316804.58	1509825477.67
Of which: Operating costs		1476660727.54	1463505521.82
Interest expense
Fee and commission expenses
Surrender payments
Net claims paid
Net provision for insurance liability reserves
Policy dividend expenses
Reinsurance expenses
Taxes and surcharges		3239576.54	3747299.05
Selling expenses		4707080.57	14884188.78
Administrative expenses		8800099.47	7604838.02
R&D expenses		13470896.35	18269596.88

For the above-mentioned upper and lower table forms, two cases will be considered:

(1) The first column of the values to be extracted in the upper-lower table is not enumerable or is irregular, that is, it cannot be expressed by establishing a vocabulary or by using regular expressions.

For the above situation, a preset standard template first needs to be specified to match the keys in Table 2 above, that is, to match "project", "notes", "2020 half year - amount" and "2019 half year - amount" in the table. Taking Table 2 as an example, the preset standard template is:

[@R0@C0-]<project>[@R0@C1-]<note>[@R0@C2-]<2020 half year>[@R0@C3-]<2019 half year>

Then, the content of each cell in the table is spliced according to its row position information, and the spliced content is matched against the preset standard template. If the match succeeds, the start and end position information in the table of the keys and of the values to be extracted can be determined, and the number of matching columns, cols, can be recorded. The OCR recognition results are then traversed row by row; if the number of columns in the traversed row is also cols, an auxiliary variable @Frow_n- is introduced, where row_n represents the row number, and a template is established. The auxiliary variable is used to distinguish the content of different rows when the table content is extracted.
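The row traversal with the auxiliary variable might look roughly like the sketch below. The exact layout of the published template is shown only in FIG. 1e, so this is a guess at its shape, not the embodiment's actual output. The placement of [@F{n}-] before the first-column cell follows the statement that the auxiliary variable is added before the cells in the first column. The function name build_aux_templates, the numbering of data rows from 0 and the skipping of the header row are all assumptions.

```python
def build_aux_templates(rows, header_keys, wanted_fields):
    """Build one template per data row of an upper-lower one-to-many table.

    rows: list of rows, each a list of cell texts; rows[0] is the header row.
    header_keys: keys matched by the preset standard template.
    wanted_fields: header keys whose column values should be output.
    """
    cols = len(header_keys)              # number of matching columns
    templates = []
    row_n = 0                            # assumed: counts data rows from 0
    for r, cells in enumerate(rows):
        if r == 0 or len(cells) != cols:
            continue                     # skip the header and rows of a different width
        parts = [f"[@F{row_n}-]"]        # auxiliary variable marking this row
        for c, key in enumerate(header_keys):
            parts.append(f"[@R{r}@C{c}-]")
            parts.append(f"{{{key}:<*:0,>}}" if key in wanted_fields else "<*:0,>")
        parts.append("[\\n]")
        templates.append("".join(parts))
        row_n += 1
    return templates
```

Because every generated template starts with a distinct auxiliary marker, the text of one row cannot be consumed by the template of its neighbor, which is the purpose the description gives for the variable.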

Specifically, FIG. 1e is a schematic diagram of an interface of an information extraction template corresponding to a one-to-many type table after publishing according to Embodiment 1 of the present invention. For the above Table 2, if the user presets the extracted fields as item, 2019 semi-annual-amount, and 2020 semi-annual-amount, the generated information extraction template is shown in Figure 1e. Among them, F0 is an auxiliary variable. The user can choose whether to output the extracted table content to the field on the display interface before publishing. In addition, the user can also perform operations such as editing, copying and deleting on the generated information extraction template on the interface shown in FIG. 1e.

(2) The first column of the value to be extracted in the upper and lower form tables is enumerable or can be represented by a regular expression.

In this case, it is necessary to judge whether the content to be extracted in the table belongs to the preset vocabulary. If it does not, the operation of generating the information extraction template is stopped; if it does, the spliced content is matched against the preset standard template. If the matching succeeds, the number of matching columns can be obtained, and an information extraction template can then be generated based on the keys matching the preset standard template and the position information of those keys in the table.

Specifically, the information extraction template corresponding to the above Table 2 is:

[@R1@C0-]{project:[@V_D]}[@R1@C1-]{notes:<*:0,>}[@R1@C2-]{2020 half year:<*:0,>}[@R1@C3-]{2019 half year:<*:0,>}[\n].

In the above information extraction template, since the first column can be represented according to the regular expression, the above vocabulary can be replaced with the regular expression V_D.
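A minimal sketch of this enumerable case is given below, assuming the vocabulary is available as a plain set of strings. The function name build_vocab_template and its parameters are hypothetical, and the [@V_D] reference is simply copied from the template text above.

```python
def build_vocab_template(row, header_keys, first_col_values, vocabulary, vocab_name="V_D"):
    """Build a template whose first column is matched by a vocabulary or regex reference.

    Returns None when any first-column value falls outside the preset vocabulary,
    which corresponds to stopping template generation.
    """
    if not all(value in vocabulary for value in first_col_values):
        return None
    parts = [f"[@R{row}@C0-]{{{header_keys[0]}:[@{vocab_name}]}}"]
    for col, key in enumerate(header_keys[1:], start=1):
        parts.append(f"[@R{row}@C{col}-]{{{key}:<*:0,>}}")
    parts.append("[\\n]")
    return "".join(parts)
```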

The method for generating a form information extraction template provided by this embodiment avoids the problems of high labor cost and poor accuracy when manually extracting form information, and compared with the method of summarizing the matching rules of various forms by manual intervention , the method provided by this implementation does not require developers to summarize different rules, and has strong generality.

140. Perform table information extraction based on the generated template.
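One way to picture this step is to translate a generated template into an ordinary regular expression and apply it to the spliced OCR text of a row. The sketch below is an assumption-laden simplification, not the embodiment's matcher. It handles only the four token kinds seen in the one-to-one templates, and it treats a fuzzy key such as <name> as arbitrary text instead of checking similarity. The function names, the sample values other than "Zhang San", and the spliced line format are all hypothetical.

```python
import re

TOKEN = re.compile(
    r"\[(?P<strict>[^\]]*)\]"            # [literal]       strict match
    r"|\{(?P<field>[^:{}]+):<\*:0,>\}"   # {field:<*:0,>}  value captured into a field
    r"|<\*:0,>"                          # <*:0,>          any text, not captured
    r"|<(?P<fuzzy>[^<>]+)>"              # <key>           fuzzy key (simplified here)
)


def template_to_regex(template):
    """Translate a template into a compiled regex plus the ordered field names."""
    pattern, fields = [], []
    for m in TOKEN.finditer(template):
        if m.group("strict") is not None:
            literal = "\n" if m.group("strict") == "\\n" else m.group("strict")
            pattern.append(re.escape(literal))
        elif m.group("field") is not None:
            fields.append(m.group("field"))
            pattern.append("(.*?)")
        else:
            pattern.append(".*?")        # wildcard, or fuzzy key treated as any text
    return re.compile("".join(pattern), re.S), fields


def extract(template, spliced_line):
    """Return {field: value} for a matching line, or None when it does not match."""
    regex, fields = template_to_regex(template)
    m = regex.fullmatch(spliced_line)
    return dict(zip(fields, m.groups())) if m else None
```

The strict position markers such as @R0@C1- act as anchors between the non-greedy wildcards, which is why the row and column information in the template keeps values from bleeding into neighboring cells.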

150. Use RPA technology to automatically enter the extracted information into the system.

In this embodiment, an automated service platform, such as Uibot software, can be used to implement automatic input of information by building a process. Compared with the traditional method of entering information manually or through programming, the input method provided by this embodiment has higher versatility, and reduces labor costs and maintenance costs to a great extent.

Embodiment 2

FIG. 2 is a schematic flowchart of a method for extracting table information based on RPA and AI according to Embodiment 2 of the present invention. The method can be applied to application scenarios such as table data screening and entry systems, and can be performed by a table information extraction device based on RPA and AI, which can be implemented by software and/or hardware. As shown in Figure 2, the method provided by this embodiment specifically includes:

210. Convert the file containing the table into a picture.

The file containing the table may be a Word document, an Excel document, a PDF document, or the like. In this embodiment, RPA technology can be used to convert the file containing the table into a picture. In this way, the table content and its position information in the table are fixed together. If the information extraction template were generated by directly recognizing the table in the file, the content of the table could easily be recognized as ordinary text in the file, causing loss of the data information in the table. In addition, because table forms are diverse, directly recognizing the table content often means that the parsing programs and rules used cannot be reused, which increases development costs. In this embodiment, the file containing the table is first converted into a picture and the table in the picture is then recognized, which improves the reliability of the table data and helps to improve the generality of the information extraction template.

220. Identify the table in the picture, and generate an information extraction template corresponding to the table type according to the identification result.

Exemplarily, OCR (Optical Character Recognition) technology can be used to recognize the picture. The recognition result includes the content of each cell in the table and the position information of each cell in the table. The position information of each cell in the table includes a start row index, a start column index, an end row index, an end column index, and the like.
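As a rough illustration, the recognition result described above could be held in a structure like the following. The class name `Cell`, its field names and the helper method are assumptions made for this sketch; the source specifies only that each cell carries its content together with start and end row and column indices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """One recognized table cell (hypothetical structure, not from the source)."""
    content: str
    start_row: int
    start_col: int
    end_row: int
    end_col: int

    def spans_multiple_cells(self) -> bool:
        # A merged cell covers more than one row or more than one column.
        return self.end_row > self.start_row or self.end_col > self.start_col


# Illustrative data only: a key cell and its value cell in the same row.
recognition_result = [
    Cell("name", 0, 0, 0, 0),
    Cell("Zhang San", 0, 1, 0, 1),
]
```

Keeping the indices next to the content is what later allows a template to address a value by position rather than by text alone.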

In this embodiment, the table type can be determined by the positional relationship and the corresponding relationship between each key-value pair in the table. For different types of tables, information extraction templates corresponding to the table types can be generated by calling different template interfaces. Before calling different template interfaces, users can specify the fields they want to extract according to the key-value pair information in the table. After the information extraction template is generated, the user can also select whether to output the extracted table content by triggering the field output instruction.

In this embodiment, for any type of table, an information extraction template corresponding to the table type can be generated according to the content of each cell in the table and the position information of each cell in the table. The information extraction template includes the key and position information of each key-value pair in the table, and the position information of the value of each key-value pair to be extracted.

Specifically, taking the above Table 1 as an example, for the user preset extracted fields as name, age, gender and nationality information, the constructed information extraction template is as follows:

[@R0@C0-]<name>[@R0@C1-]{name:<*:0,>}[@R0@C2-]<age>[@R0@C3-]{age:<*: 0,>}[@R0@C4-]<gender>[@R0@C5-]<*:0,>[@R0@C6-]<ethnicity>[@R0@C7-]<*:0,> [\n]

In the above information extraction template, [@R0@C0-]<name> means that the row and column information of the key "name" in the table is the zeroth row and zeroth column; [@R0@C1-]{name:<*>} means that the row and column information of the value corresponding to "name" is the zeroth row and the first column. Other fields to be extracted, such as age, gender, and ethnicity, are represented in the information extraction template in the same manner as "name", and will not be repeated here.

It should be noted that, for the generated information extraction template, some special syntax identifiers can be added to it, and these identifiers are determined according to the table attributes. For example, for the location information of the cells in the table, add square brackets [ ], such as [@R0@C0-] in the template above. Add angle brackets <> to the keys of key-value pairs in the table, such as <name> in the template above. For the value of the key-value pair to be extracted in the table, express it in the form of asterisks in angle brackets, such as <*>, and separate the value to be extracted and its corresponding key with a colon ":". If the value to be extracted needs to be output to a field, add curly brackets to each key-value pair, such as {name:<*:0,>} in the above template. If the user has set that the value to be extracted does not need to be output to the field, there is no need to add the above curly brackets.
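The syntax identifiers described above can be sketched as a small template builder. This is only an illustration under stated assumptions: the function names are hypothetical, the `:0,` suffix inside the value placeholder is copied from the example template without interpreting it, and key and value cells are assumed to alternate by column within one row.

```python
def position_token(row: int, col: int) -> str:
    # Cell position in square brackets, e.g. [@R0@C0-].
    return f"[@R{row}@C{col}-]"


def key_token(key: str) -> str:
    # Key of a key-value pair in angle brackets, e.g. <name>.
    return f"<{key}>"


def value_token(key: str, output: bool) -> str:
    # Value placeholder; wrapped in curly brackets together with its key
    # only when the value is to be output to a field.
    placeholder = "<*:0,>"
    return f"{{{key}:{placeholder}}}" if output else placeholder


def build_row_template(pairs, row=0):
    """pairs: list of (key, output_flag) tuples for one table row."""
    parts = []
    for i, (key, output) in enumerate(pairs):
        parts.append(position_token(row, 2 * i) + key_token(key))
        parts.append(position_token(row, 2 * i + 1) + value_token(key, output))
    return "".join(parts) + "[\\n]"


# "name" is output to a field, "gender" is matched but not output.
template = build_row_template([("name", True), ("gender", False)])
print(template)
```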

In addition, in this embodiment, the grammatical identifiers in the information extraction template have certain preset meanings. For example, square brackets represent strict matching, that is, it is determined whether the strings to be matched are identical; angle brackets represent fuzzy matching, that is, it is determined whether the similarity of the content to be matched is greater than the set threshold. When using the information extraction template to extract the table content, it is necessary to match the content in the information extraction template with the recognition result of the picture according to the preset meaning represented by each identifier.

It should also be noted that, in order to ensure the accuracy of the information extraction template and the accuracy of subsequent table content extraction, in this embodiment, the position information of each key-value pair in the information extraction template can be represented in the form of regular expressions. This setting can avoid the problem that the table content cannot be accurately extracted due to disordered row and column information of the cells in the OCR recognition result.

Further, after the information extraction template is generated, the user can perform related debugging according to the automatically generated template, for example, the template can be edited, copied and deleted.

230. Extract table content from the recognition result according to the information extraction template.

After the information extraction template is generated, the user can call the information extraction engine interface to extract information.

Specifically, when the table information is extracted according to the information extraction template, all contents in the information extraction template may be matched with the OCR identification result until the matching is successful.

Specifically, in the matching process, it is necessary to perform matching according to the preset meaning corresponding to each grammatical identifier in the information extraction template. For example, it is determined whether the string in square brackets is the same as the string corresponding to the cell position information in the OCR recognition result; or it is determined whether the similarity between the content in the angle brackets and the key of the key-value pair in the OCR recognition result is greater than the set threshold. If the strings are equal or the similarity of the text is greater than the set threshold, the match is successful. After the matching is successful, the table content to be extracted can be extracted from the recognition result.
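The two matching modes can be sketched as follows. The source does not name a similarity measure or a threshold value, so `difflib.SequenceMatcher` from the Python standard library and the value `0.8` are stand-ins chosen for illustration, and the function names are hypothetical.

```python
from difflib import SequenceMatcher

# Assumed value; the source only refers to a "set threshold".
SIMILARITY_THRESHOLD = 0.8


def strict_match(expected: str, actual: str) -> bool:
    # Square-bracket content: the two strings must be identical.
    return expected == actual


def fuzzy_match(expected: str, actual: str,
                threshold: float = SIMILARITY_THRESHOLD) -> bool:
    # Angle-bracket content: the similarity must exceed the threshold.
    similarity = SequenceMatcher(None, expected, actual).ratio()
    return similarity > threshold


# A position string must match exactly.
print(strict_match("@R0@C0-", "@R0@C0-"))
# A key with one misrecognized character can still match fuzzily.
print(fuzzy_match("ethnicity", "ethnlcity"))
# Unrelated keys do not match.
print(fuzzy_match("name", "age"))
```

Fuzzy matching on keys tolerates small OCR errors in the key text, while strict matching on positions keeps values tied to the intended cell.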

In the technical solution provided by this embodiment, when extracting table information, a file containing a table can be converted into a picture, so that the content of the cells in the table is bound to their positions in the table. By recognizing the table in the picture, an information extraction template corresponding to the table type can be generated according to the recognition result. The information extraction template includes the key and position information of each key-value pair in the table, as well as the position information of the value of each key-value pair to be extracted. According to the information extraction template, the table content can be extracted from the recognition result. By adopting the above technical solution, the problems of high labor cost and poor accuracy when manually extracting table information are avoided. Moreover, compared with the approach of manually summarizing matching rules for various tables, the method provided by this embodiment does not require developers to summarize different rules and is more versatile.

Embodiment 3

FIG. 3 is a flowchart of a preferred method for extracting table information based on RPA and AI provided by Embodiment 3 of the present invention. On the basis of the above-mentioned embodiments, this embodiment describes in detail the generation process of the information extraction template corresponding to a table whose type is the left-right one-to-one format. The key and value of each key-value pair in a table in the left-right one-to-one format have a left-right positional relationship, and the key and the value have a one-to-one relationship. As shown in Figure 3, the method includes:

310. Convert the file containing the table into a picture.

320. Perform optical character recognition (OCR) on the picture to obtain a recognition result, where the recognition result includes the content of each cell in each table and the position information of each cell in the table.

330. Splice the content of each cell in the table according to the position information of its row and column in the table.

For the table in the picture, after OCR identification, the row and column index of each cell and the correspondence between the cells have been determined. In this embodiment, after the content of each cell in the table is spliced according to the position information of the row and column in the table, the spliced content is embodied in the form of a character string.
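A minimal sketch of this splicing step is given below. The input layout (tuples of row, column and content) and the function name are assumptions for illustration; the position prefix reuses the `[@R…@C…-]` notation seen in the templates, and the sample cells are illustrative data rather than the contents of Table 1.

```python
from collections import defaultdict


def splice_by_row(cells):
    """cells: iterable of (row, col, content) tuples (hypothetical layout).

    Returns one string per table row, with each cell rendered as its
    position information followed by its content.
    """
    rows = defaultdict(list)
    for row, col, content in cells:
        rows[row].append((col, content))
    spliced = []
    for row in sorted(rows):
        # Sort cells left to right so the OCR output order does not matter.
        parts = [f"[@R{row}@C{col}-]{content}"
                 for col, content in sorted(rows[row])]
        spliced.append("".join(parts))
    return spliced


# Cells deliberately given out of order, as OCR output might be.
cells = [
    (0, 1, "Zhang San"),
    (1, 1, "value_b"),
    (0, 0, "name"),
    (1, 0, "key_b"),
]
spliced = splice_by_row(cells)
for line in spliced:
    print(line)
```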

340. For each row of content in the table, based on the spliced content, generate a first information extraction template corresponding to the table type.

In this embodiment, for the one-to-one left-right form, the field to be extracted is on the left, such as "name" in Table 1 above, and the corresponding value is on the right, such as "Zhang San".

It should be noted that, in this embodiment, the first information extraction template corresponding to the left-right one-to-one table is generated based on the content of each row in the table. That is, one first information extraction template is generated for each row of content, so the number of first information extraction templates equals the number of rows in the table. Compared with the method of generating a template for each key-value pair in the table, this embodiment can reduce the number of templates for information extraction and improve the speed of template generation.

Specifically, after the content of each cell in the table is spliced according to the position information of its row and column in the table, each row of the spliced content includes the position information and content of each cell. The generation process of the first information extraction template corresponding to each row of content can be as follows: the position information of each key in the spliced content is used as the position information of the key in the first information extraction template; each key in the spliced content is used as the key in the first information extraction template; and the position information of the value of each key-value pair in the spliced content is used as the position information of the value to be extracted in the first information extraction template. In addition, corresponding grammatical identifiers need to be added to the keys of each key-value pair in the first information extraction template and their position information, as well as to the position information of the value of each key-value pair to be extracted, for use in subsequent table content extraction.

Specifically, for the content of the first row in the above Table 1, the corresponding first extraction template is:

For the content of the second row in the above Table 1, the corresponding first extraction template is:

For the content of the third row in the above table 1, the corresponding first extraction template is:

350. Extract table content from the recognition result according to the first information extraction template.

On the basis of the above-mentioned embodiments, this embodiment refines the generation process of the first information extraction template corresponding to a table whose type is the left-right one-to-one format. The content of each cell is spliced according to the position information of its row and column in the table, and the first information extraction template corresponding to the content of each row in the table is generated based on the spliced content. This avoids the problems of high labor cost and poor accuracy when manually extracting table information. Compared with the approach of manually summarizing matching rules for various tables, the method provided by this embodiment does not require developers to summarize different rules and is more versatile.

Embodiment 4

FIG. 4 is a flowchart of a preferred method for extracting table information based on RPA and AI according to Embodiment 4 of the present invention. On the basis of the above embodiments, this embodiment describes in detail the generation process of the information extraction template corresponding to a table whose type is the upper-lower one-to-many format. The key and value of each key-value pair in a table in the upper-lower one-to-many format have an upper-lower positional relationship, and the key and the value have a one-to-many relationship. It should be noted that the upper-lower one-to-many table type includes the following two cases: (1) the first column of the values to be extracted in the table is not enumerable or is irregular, that is, it cannot be represented by establishing a vocabulary or by a regular expression; (2) the first column of the values to be extracted in the table is enumerable or can be represented by a regular expression. This embodiment first describes the first case in detail. As shown in Figure 4, the table information extraction method based on RPA and AI provided by this embodiment includes:

410. Convert the file containing the table into a picture.

420. Perform optical character recognition (OCR) on the picture to obtain a recognition result, where the recognition result includes the content of each cell in each table and the position information of each cell in the table.

430. If the preset vocabulary is not detected, splice the content of each cell in the table according to the position information of the row, and match the spliced content with the content of the preset standard template.

In this embodiment, the preset vocabulary includes all the table contents that the user has preset for extraction. If the preset vocabulary is not detected, it means that the first column of the values to be extracted in the table is not enumerable or is irregular.

In this embodiment, the information extraction template corresponds to the table type and has strong generality. That is, if there are multiple tables of the same type in the picture, the method provided in this embodiment can generate a single information extraction template for those tables. According to this template, the content in multiple tables of the same type can be extracted, which improves the speed of subsequent table content extraction.

In this embodiment, the preset standard template includes the key of the extracted key-value pair preset by the user. The grammatical identifiers of the key and its location information in the key-value pair in the preset standard template are the same as the grammatical identifiers of the key and its location information in the information extraction template in the embodiment of the present invention. Specifically, for the above Table 2, the corresponding preset standard templates are as follows:

In this embodiment, the content of each cell in the table is spliced according to the position information of the row, and the spliced content is matched with the content of the preset standard template. The purpose is to determine, from the recognition result, the keys corresponding to the content that the user has preset for extraction, and to determine the start position information and end position information of the values to be extracted in the table.

440. If the matching is successful, use the number of columns corresponding to the keys matching the preset standard template in the table as the first target number.

Specifically, taking the preset standard template corresponding to Table 2 above as an example, if the matching is successful, the first target number is 4.

In addition, once the spliced content has been successfully matched with the content of the preset standard template, the value corresponding to each key in the table, as well as the start position information and end position information of the rows where the content to be extracted is located, are also determined.

450. Traverse the table row by row, and use the number of columns in the table as the first standard number.

460. If the first standard number matches the first target number, add an auxiliary variable before the cells in the first column in the table.

In this embodiment, auxiliary variables are added before the cells in the first column of the table to distinguish the contents of different rows in the table, so as to avoid extracting the contents of the next row as the contents of the current row during information extraction.
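Steps 440 to 460 can be sketched as follows. This is an illustration under assumptions: the function name and input layout are hypothetical, the sample rows are made-up data, and the marker `[Fi]` is borrowed from the start of the second information extraction template shown in this embodiment; the source does not state how the auxiliary variable is rendered.

```python
# Assumed rendering of the auxiliary variable.
AUX_MARKER = "[Fi]"


def mark_rows(rows, matched_keys):
    """rows: list of table rows, each a list of cell strings.

    Prefixes the first cell of every row whose column count equals the
    number of matched keys; other rows are returned unchanged.
    """
    target_number = len(matched_keys)        # step 440: first target number
    marked = []
    for row in rows:                         # step 450: traverse row by row
        standard_number = len(row)           # first standard number
        if row and standard_number == target_number:   # step 460
            row = [AUX_MARKER + row[0]] + row[1:]
        marked.append(row)
    return marked


keys = ["items", "notes", "2020 half year", "2019 half year"]
rows = [
    ["Balance sheet"],                # title row: column count differs
    ["cash", "1", "100", "90"],       # data row: column count matches
]
marked = mark_rows(rows, keys)
print(marked)
```

Only rows whose column count matches the key count are marked, so a title or footer row is not mistaken for a data row.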

470. Generate a second information extraction template corresponding to the table type based on the auxiliary variable, the key matching the preset standard template, and the position information of the key in the table.

The second information extraction template includes auxiliary variables, matching keys and their location information, and location information of the values of each key-value pair to be extracted in the table.

Specifically, the generation process of the second information extraction template may be as follows: auxiliary variables are added at the starting position of the second information extraction template; the position information of each matched key is used as the position information of the key in the second information extraction template; the matched keys are used as the keys in the second information extraction template; and the position information of the values to be extracted corresponding to each key is used as the position information of the values to be extracted in the second information extraction template. In addition, for the generated second information extraction template, corresponding grammatical identifiers are added to the key of each key-value pair and its position information, as well as to the position information of the value of each key-value pair to be extracted, for use in subsequent table content extraction. The syntax identifiers involved in the second information extraction template have the same meaning as those mentioned for the first information extraction template, which is not repeated in this embodiment.

Specifically, for the above table 2, if the fields that the user wants to extract are items, notes, semi-annual-amount in 2019 and semi-annual-amount in 2020, the generated second information extraction template is:

[Fi][@R1@C0-]{items:<*:0,>}[@R1@C1-]{notes<*:0>}[@R1@C2-]{2020 half-year-amount: <*:0,>}[@R1@C3-]{2019 half year:<*:0,>}[\n].

480. Extract table content from the recognition result according to the second information extraction template.

On the basis of the above-mentioned embodiments, this embodiment refines the generation process of the second information extraction template, which corresponds to a table in the upper-lower one-to-many format in which the first column of the values to be extracted is not enumerable or is irregular. The content of each cell in the table is spliced according to the position information of the row, and the spliced content is matched with the content of the preset standard template; if the matching is successful, the start position information and end position information of the values to be extracted in the table can be obtained. The table is then traversed row by row. If the number of columns in the table matches the number of keys in the preset standard template, auxiliary variables are added before the cells in the first column of the table to distinguish the contents of each row. Based on the auxiliary variables and the matched keys and their position information, the second information extraction template corresponding to the table type can be generated, which avoids introducing excessive manual intervention. The method provided by this embodiment does not require developers to summarize different rules and has strong generality.

Embodiment 5

FIG. 5 is a flowchart of a preferred method for extracting table information based on RPA and AI provided by Embodiment 5 of the present invention. On the basis of the above embodiments, this embodiment describes in detail the case in which the first column of the values to be extracted in the table can be enumerated or can be represented by a regular expression. As shown in Figure 5, the table information extraction method based on RPA and AI provided by this embodiment includes:

510. Convert the file containing the table into a picture.

520. Perform optical character recognition (OCR) on the picture to obtain a recognition result, where the recognition result includes the content of each cell in each table and the position information of each cell in the table.

530. If the preset vocabulary is detected, match the value of each key-value pair in the table with the content of the preset vocabulary.

In this embodiment, for the case where the values can be enumerated, it is necessary to judge whether the content to be extracted in the table belongs to the preset vocabulary. If it does, the splicing operation on the cell content is performed; if it does not, the operation of generating the information extraction template is stopped.
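The vocabulary check can be sketched as below. The function name, the sample word list and the sample pattern are all hypothetical, and requiring every value to match is an assumption of this sketch; the source states only that the content must belong to the preset vocabulary or be representable by a regular expression.

```python
import re


def belongs_to_vocabulary(values, vocabulary=None, pattern=None):
    """Return True if every value is in the word list or fully matches
    the regular expression; otherwise template generation should stop."""
    for value in values:
        in_list = vocabulary is not None and value in vocabulary
        in_pattern = (pattern is not None
                      and re.fullmatch(pattern, value) is not None)
        if not (in_list or in_pattern):
            return False
    return True


# Illustrative word list and first-column values.
vocabulary = {"cash", "inventory"}
ok = belongs_to_vocabulary(["cash", "inventory"], vocabulary=vocabulary)
stopped = belongs_to_vocabulary(["cash", "goodwill"], vocabulary=vocabulary)

# A regular expression can stand in for the word list when the values
# follow a pattern instead of a closed set.
by_pattern = belongs_to_vocabulary(["2020-06"], pattern=r"\d{4}-\d{2}")
print(ok, stopped, by_pattern)
```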

540. If the matching is successful, the content of each cell in the table is spliced according to the position information of the row, and the spliced content is matched with the content of the preset standard template.

The standard template includes the key of the extracted key-value pair preset by the user. The specific matching method is the same as the matching method mentioned in the above-mentioned embodiment, and will not be repeated here.

550. If the matching is successful, use the number of columns corresponding to the keys whose table content matches the preset standard template as the second target number.

560. Traverse the table row by row, and use the number of columns in the table as the second standard number.

570. If the second standard number matches the second target number, generate a third information extraction template corresponding to the table type based on the key matching the preset standard template and the position information of the key in the table.

The third information extraction template includes the keys matching the preset standard template, the position information of the keys in the table, and the position information of the value of each key-value pair to be extracted in the table. Unlike the second information extraction template, the third information extraction template does not require auxiliary variables. In addition, the method of generating the third information extraction template is similar to that of the second information extraction template, which will not be repeated here.

Specifically, for the above Table 2, the generated third information extraction template is:

[@R0@C0-]{project:[@V_D]}[@R0@C1-]{notes<*:0,>}[@R0@C2-]{2020 half year:<*:0,> }[@R0@C3-]{Semi-annual 2019:<*:0,>}[\n].

In the above third information extraction template, since the first column can be represented according to the regular expression, the above vocabulary can be replaced with the regular expression V_D.

580. Extract table content from the recognition result according to the third information extraction template.

On the basis of the above-mentioned embodiments, this embodiment refines the generation process of the third information extraction template, which corresponds to a table in the upper-lower one-to-many format in which the first column of the values to be extracted is enumerable, that is, can be expressed in the form of a vocabulary. Unlike the second information extraction template, the generation of the third information extraction template does not require auxiliary variables, but it does require judging whether the values to be extracted belong to the preset vocabulary. If they do, the third information extraction template corresponding to the table type can be generated based on the matched keys and their position information. This avoids introducing too much manual intervention, and, compared with the approach of manually summarizing matching rules for various tables, the method provided by this embodiment does not require developers to summarize different rules and is more versatile.

Embodiment 6

FIG. 6 is a flowchart of a preferred method for extracting table information based on RPA and AI according to Embodiment 6 of the present invention. This embodiment introduces in detail the generation of an information extraction template corresponding to a left-right one-to-many format. The key and value of each key-value pair in a table in the left-right one-to-many format have a left-right positional relationship, and the key and the value have a one-to-many relationship. In this embodiment, the first column of the values to be extracted is not enumerable or is irregular. The generation method of the fourth information extraction template provided in this embodiment is similar to that of the second information extraction template, which corresponds to the upper-lower one-to-many format with a non-enumerable first column of values to be extracted. The difference is that, because of the positional relationship between key-value pairs in the table, in this embodiment the cell content is spliced by column and the table is traversed by column so as to determine the number of rows of table content. As shown in Figure 6, the table information extraction method based on RPA and AI provided by this embodiment includes:

610. Convert the file containing the table into a picture.

620. Perform optical character recognition (OCR) on the picture to obtain a recognition result, where the recognition result includes the content of each cell in each table and the position information of each cell in the table.

630. If the preset vocabulary is not detected, splice the content of each cell in the table according to the position information of the column, and match the spliced content with the keys in the preset standard template.

Wherein, the standard template includes pre-set keys of the extracted key-value pairs.

640. If the matching is successful, use the number of rows corresponding to the key whose table content matches the preset standard template as the third target number.

650. Traverse the table by column to determine the third standard number of rows in the table.

660. If the third standard number matches the third target number, add an auxiliary variable before each cell in the table.

670. Generate a fourth information extraction template corresponding to the table type based on the auxiliary variable, the key matching the preset standard template, and the position information of the key in the table.

The fourth information extraction template includes auxiliary variables, the matched keys and their location information, and location information of the values of each key-value pair in the table. The specific generation process of the fourth information extraction template is similar to the generation process of the second information extraction template. For details, please refer to the above-mentioned generation process of the second information extraction template, which will not be repeated here.

680. Extract table content from the recognition result according to the fourth information extraction template.

In this embodiment, for a table in the left-right one-to-many format whose first column is non-enumerable, the content of each cell in the table is spliced according to the position information of the column, and the spliced content is matched with the content of the preset standard template. If the match is successful, the start position information and end position information of the values to be extracted in the table can be obtained. The table is then traversed by column. If the number of rows in the table matches the number of keys in the preset standard template, auxiliary variables are added before the cells in the first column of the table to distinguish the contents of each row in the table. A fourth information extraction template corresponding to the table type may be generated based on the auxiliary variables, the keys whose table content matches the preset standard template, and their position information. By adopting the above technical solution, the introduction of excessive manual intervention is avoided, and, compared with the approach of manually summarizing matching rules for various tables, the method provided by this embodiment does not require developers to summarize different rules and is more versatile.
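The column-wise variant used for left-right tables can be sketched as follows. The input layout, the function names and the sample cells are assumptions for illustration; the sketch shows only splicing by column and counting the rows found in each column, not template generation itself.

```python
from collections import defaultdict


def splice_by_column(cells):
    """cells: iterable of (row, col, content) tuples (hypothetical layout).

    Returns one string per table column, cells ordered top to bottom.
    """
    columns = defaultdict(list)
    for row, col, content in cells:
        columns[col].append((row, content))
    spliced = []
    for col in sorted(columns):
        parts = [f"[@R{row}@C{col}-]{content}"
                 for row, content in sorted(columns[col])]
        spliced.append("".join(parts))
    return spliced


def rows_per_column(cells):
    # Number of cells (rows) found in each column while traversing by column.
    counts = defaultdict(int)
    for _row, col, _content in cells:
        counts[col] += 1
    return dict(counts)


# Illustrative left-right layout: keys in column 0, values to the right.
cells = [
    (0, 0, "item"), (1, 0, "amount"),
    (0, 1, "cash"), (1, 1, "100"),
    (0, 2, "inventory"), (1, 2, "250"),
]
spliced = splice_by_column(cells)
counts = rows_per_column(cells)
print(spliced)
print(counts)
```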

Embodiment 7

FIG. 7 is a flowchart of a preferred method for extracting table information based on RPA and AI according to Embodiment 7 of the present invention. This embodiment introduces in detail the generation of an information extraction template corresponding to a left-right one-to-many format. The key and value of each key-value pair in a table in the left-right one-to-many format have a left-right positional relationship, and the key and the value have a one-to-many relationship. In this embodiment, the first column of the values to be extracted can be enumerated, that is, it can be expressed in the form of a vocabulary. The generation method of the fifth information extraction template provided by this embodiment is similar to that of the third information extraction template, which corresponds to the upper-lower one-to-many format with an enumerable first column of values to be extracted. The difference is that, because of the positional relationship between the key-value pairs in the table, in this embodiment the cell content is spliced by column and the table is traversed by column to determine the number of rows of table content. As shown in Figure 7, the table information extraction method based on RPA and AI provided by this embodiment includes:

710. Convert the file containing the table into a picture.

720. Perform optical character recognition (OCR) on the picture to obtain a recognition result, where the recognition result includes the content of each cell in each table and the position information of each cell in the table.

730. If the preset vocabulary is detected, match the value of each key-value pair in the table with the content of the preset vocabulary.

740. If the matching is successful, the content of each cell in the table is spliced according to the position information of the column, and the spliced content is matched with the content of the preset standard template.

The preset standard template includes preset keys of the extracted key-value pairs.

750. If the matching is successful, use the number of rows corresponding to the keys whose table content matches the preset standard template as the fourth target number.

760. Traverse the table by column, and use the number of rows in the table as the fourth standard number.

770. If the fourth standard number matches the fourth target number, generate a fifth information extraction template corresponding to the table type based on the key matching the preset standard template and the position information of the key in the table.

Wherein, the fifth information extraction template includes the keys matching the preset standard template and their location information, and the location information of the values of each key-value pair to be extracted in the table. The specific generation process of the fifth information extraction template is similar to the generation process of the third information extraction template. For details, please refer to the above-mentioned generation process of the third information extraction template, which will not be repeated here.

780. Extract table content from the recognition result according to the fifth information extraction template.
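Steps 730 to 780 can be sketched as follows. This is a hedged illustration under stated assumptions rather than the method of the embodiment: the cell layout, the function names `build_fifth_template` and `extract`, and the choice of column 0 for keys and column 1 for the enumerable first value column are hypothetical.

```python
from typing import Any, Dict, List, Optional, Set

# Hypothetical OCR cell layout: {"text": str, "row": int, "col": int}
Cell = Dict[str, Any]


def build_fifth_template(
    cells: List[Cell], standard_keys: List[str], vocabulary: Set[str]
) -> Optional[dict]:
    # Step 730: the first value column (assumed to be column 1) must be enumerable.
    first_values = [c["text"] for c in cells if c["col"] == 1]
    if not first_values or any(v not in vocabulary for v in first_values):
        return None
    # Steps 740-750: match the spliced key column against the preset standard template.
    key_cells = sorted(
        (c for c in cells if c["col"] == 0 and c["text"] in standard_keys),
        key=lambda c: c["row"],
    )
    if not key_cells:
        return None
    fourth_target_number = len(key_cells)
    # Step 760: traverse by column and count the rows of the table.
    fourth_standard_number = len({c["row"] for c in cells})
    # Step 770: generate the template only if both counts agree.
    if fourth_standard_number != fourth_target_number:
        return None
    key_rows = {c["row"] for c in key_cells}
    value_cells = sorted(
        (c for c in cells if c["col"] > 0 and c["row"] in key_rows),
        key=lambda c: (c["row"], c["col"]),
    )
    return {
        "keys": [{"key": c["text"], "row": c["row"], "col": c["col"]} for c in key_cells],
        "values": [{"row": c["row"], "col": c["col"]} for c in value_cells],
    }


def extract(cells: List[Cell], template: dict) -> Dict[str, List[str]]:
    # Step 780: read the values at the positions recorded in the template.
    text_at = {(c["row"], c["col"]): c["text"] for c in cells}
    key_of_row = {k["row"]: k["key"] for k in template["keys"]}
    result: Dict[str, List[str]] = {}
    for pos in template["values"]:
        value = text_at.get((pos["row"], pos["col"]), "")
        result.setdefault(key_of_row[pos["row"]], []).append(value)
    return result
```

No auxiliary variable appears here, in line with the distinction this embodiment draws from the fourth information extraction template.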

On the basis of the above embodiments, this embodiment refines the generation process of the fifth information extraction template, which corresponds to a table in the left-right one-to-many format in which the first column of the values to be extracted is enumerable, that is, can be expressed in the form of a vocabulary. Unlike the fourth information extraction template, the fifth information extraction template is generated without adding auxiliary variables; instead, it must be determined whether the values to be extracted belong to the preset vocabulary. If they do, the fifth information extraction template corresponding to the table type can be generated based on the matched keys and their position information. This avoids excessive manual intervention: unlike approaches in which matching rules for each kind of table are summarized by hand, the method provided by this embodiment does not require developers to summarize different rules, and is therefore more versatile.

Embodiment 8

FIG. 8 is a schematic structural diagram of an apparatus for extracting table information based on RPA and AI provided in Embodiment 8 of the present invention. As shown in FIG. 8, the apparatus includes a picture conversion template 810, a template generation module 820, and a content extraction module 830, wherein:

The picture conversion template 810 is configured to convert a file containing a table into a picture;

The template generation module 820 is configured to recognize the table in the picture, and generate an information extraction template corresponding to the table type according to the recognition result, where the information extraction template includes the keys of the key-value pairs in the table and their position information, and the position information of the values of the key-value pairs to be extracted;

The content extraction module 830 is configured to extract table content from the recognition result according to the information extraction template.
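The cooperation of the three components can be pictured as a simple pipeline. The sketch below is only illustrative: the class `TableExtractor`, its field names, and the placeholder functions `fake_convert`, `fake_generate` and `simple_extract` are hypothetical stand-ins, and no real file conversion or OCR is performed.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical OCR cell layout: {"text": str, "row": int, "col": int}
Cell = Dict[str, Any]
Template = Dict[str, Tuple[int, int]]  # key -> (row, col) of the value to extract


@dataclass
class TableExtractor:
    convert: Callable[[str], bytes]                           # role of 810
    generate: Callable[[bytes], Tuple[List[Cell], Template]]  # role of 820
    extract: Callable[[List[Cell], Template], Dict[str, str]] # role of 830

    def run(self, path: str) -> Dict[str, str]:
        picture = self.convert(path)             # file -> picture
        cells, template = self.generate(picture) # recognition result + template
        return self.extract(cells, template)     # table content


def fake_convert(path: str) -> bytes:
    # Placeholder: a real system would render the file to a picture.
    return b"picture:" + path.encode()


def fake_generate(picture: bytes) -> Tuple[List[Cell], Template]:
    # Placeholder: a real system would run OCR and build the template.
    cells = [
        {"text": "Name", "row": 0, "col": 0},
        {"text": "Ann", "row": 0, "col": 1},
    ]
    return cells, {"Name": (0, 1)}


def simple_extract(cells: List[Cell], template: Template) -> Dict[str, str]:
    text_at = {(c["row"], c["col"]): c["text"] for c in cells}
    return {key: text_at.get(pos, "") for key, pos in template.items()}
```

Passing the three stages in as callables keeps each one replaceable, which matches the later remark that the modules may be combined or split.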

Optionally, the template generation module 820 includes:

The picture recognition unit is configured to perform optical character recognition (OCR) on the picture to obtain a recognition result, where the recognition result includes the content of each cell in each table and the position information of each cell in the table;

The template generating unit is configured to, for any type of table, generate an information extraction template corresponding to the table type according to the content of each cell in the table and the position information of each cell in the table.

Optionally, the table type includes a left-right one-to-one format, the key and value of each key-value pair in the left-right one-to-one format table are in a left-right positional relationship, and the key and value are in a one-to-one relationship;

Correspondingly, the template generation unit is specifically configured as:

Splicing the content of each cell in the table according to the position information of the row and column in the table;

For each row of content in the table, based on the spliced content, generate a first information extraction template corresponding to the table type;

Wherein, the first information extraction template includes the key and position information of each key-value pair in each row in the table, and the position information of the value of each key-value pair to be extracted.
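For the left-right one-to-one format, the row-wise generation of the first information extraction template can be sketched as follows. The sketch assumes, purely for illustration, that each row alternates key cell and value cell; the cell layout and the function names `build_first_template` and `extract_with_first_template` are hypothetical.

```python
from typing import Any, Dict, List

# Hypothetical OCR cell layout: {"text": str, "row": int, "col": int}
Cell = Dict[str, Any]


def build_first_template(cells: List[Cell]) -> List[dict]:
    # Splice the cells by row and column position.
    rows: Dict[int, List[Cell]] = {}
    for c in sorted(cells, key=lambda c: (c["row"], c["col"])):
        rows.setdefault(c["row"], []).append(c)

    template = []
    for row_cells in rows.values():
        # Assumption: cells alternate key, value, key, value within a row.
        for key_cell, value_cell in zip(row_cells[0::2], row_cells[1::2]):
            template.append(
                {
                    "key": key_cell["text"],
                    "key_pos": (key_cell["row"], key_cell["col"]),
                    "value_pos": (value_cell["row"], value_cell["col"]),
                }
            )
    return template


def extract_with_first_template(cells: List[Cell], template: List[dict]) -> Dict[str, str]:
    # Read each value from the position stored next to its key.
    text_at = {(c["row"], c["col"]): c["text"] for c in cells}
    return {entry["key"]: text_at.get(entry["value_pos"], "") for entry in template}
```

Because the template stores positions rather than values, it can be reused on another table with the same layout.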

Optionally, the table type includes a top-bottom one-to-many format, the key and value of each key-value pair in a table in the top-bottom one-to-many format have a top-bottom positional relationship, and the key and the value have a one-to-many relationship;

Correspondingly, the template generation unit is specifically configured as:

If the preset vocabulary is not detected, the content of each cell in the table is spliced according to the position information of the rows, and the spliced content is matched with the content of a preset standard template, where the preset standard template includes the preset keys of the key-value pairs to be extracted;

If the match is successful, the number of columns corresponding to the matched key is taken as the first target number;

Traverse the table row by row, and take the number of columns in the table as the first standard number;

If the first standard number matches the first target number, an auxiliary variable is added before the cells in the first column in the table, and the auxiliary variable is used to distinguish the content of each row in the table when the table content is extracted;

Based on the auxiliary variable and the matched key and its position information, a second information extraction template corresponding to the table type is generated;

Wherein, the second information extraction template includes the auxiliary variable, the matching key and its location information, and the location information of the value of each key-value pair to be extracted in the table.
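The configuration above for the top-bottom one-to-many format without a preset vocabulary can be sketched as follows. This is an illustration under stated assumptions: the header keys are assumed to sit in row 0, and the cell layout, the function names `build_second_template` and `extract_records`, and the `record_N` naming of the auxiliary variables are hypothetical.

```python
from typing import Any, Dict, List, Optional

# Hypothetical OCR cell layout: {"text": str, "row": int, "col": int}
Cell = Dict[str, Any]


def build_second_template(cells: List[Cell], standard_keys: List[str]) -> Optional[dict]:
    # Splice by row: row 0 is assumed to hold the candidate keys.
    header = sorted((c for c in cells if c["row"] == 0), key=lambda c: c["col"])
    matched = [c for c in header if c["text"] in standard_keys]
    if not matched:
        return None
    first_target_number = len(matched)                      # columns with a matched key
    first_standard_number = len({c["col"] for c in cells})  # columns found row by row
    if first_standard_number != first_target_number:
        return None

    data_rows = sorted({c["row"] for c in cells if c["row"] > 0})
    # One auxiliary variable per data row keeps the rows apart at extraction time.
    aux = {row: f"record_{i}" for i, row in enumerate(data_rows)}
    return {
        "keys": [{"key": c["text"], "row": 0, "col": c["col"]} for c in matched],
        "values": [
            {"aux": aux[row], "row": row, "col": k["col"]}
            for row in data_rows
            for k in matched
        ],
    }


def extract_records(cells: List[Cell], template: dict) -> List[Dict[str, str]]:
    text_at = {(c["row"], c["col"]): c["text"] for c in cells}
    key_of_col = {k["col"]: k["key"] for k in template["keys"]}
    records: Dict[str, Dict[str, str]] = {}
    for pos in template["values"]:
        value = text_at.get((pos["row"], pos["col"]), "")
        records.setdefault(pos["aux"], {})[key_of_col[pos["col"]]] = value
    return list(records.values())
```

Grouping by the auxiliary variable is what lets several values under the same key be returned as separate records instead of overwriting one another.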

Correspondingly, the template generation unit is specifically configured as:

If the preset vocabulary is detected, the value of each key-value pair in the table is matched with the content of the preset vocabulary;

If the matching is successful, the content of each cell in the table is spliced according to the position information of the rows, and the spliced content is matched with the content of a preset standard template, where the preset standard template includes the preset keys of the key-value pairs to be extracted;

If the match is successful, the number of columns corresponding to the matched key is taken as the second target number;

Traverse the table row by row, and use the number of columns in the table as the second standard number;

If the second standard number matches the second target number, then based on the matched keys and their location information, a third information extraction template corresponding to the table type is generated;

Wherein, the third information extraction template includes the matched key and its location information, and the location information of the value of each key-value pair to be extracted in the table.

Optionally, the table type includes a left-right one-to-many format, the key and value of each key-value pair in the left-right one-to-many format table have a left-right positional relationship, and the key and the value have a one-to-many relationship;

Correspondingly, the template generation unit is specifically configured as:

If the preset vocabulary is not detected, the content of each cell in the table is spliced according to the position information of the columns, and the spliced content is matched with the keys in a preset standard template, where the preset standard template includes the preset keys of the key-value pairs to be extracted;

If the match is successful, the number of rows corresponding to the matched key is taken as the third target number;

Traverse the table by column to determine the third standard number of rows in the table;

If the third standard number matches the third target number, an auxiliary variable is added before each cell in the table, and the auxiliary variable is used to distinguish the content of each row in the table when the table content is extracted;

Based on the auxiliary variable and the matched key and its position information, a fourth information extraction template corresponding to the table type is generated;

Wherein, the fourth information extraction template includes the auxiliary variable, the matching key and its position information, and the position information of the value of each key-value pair in the table.

Correspondingly, the template generation unit is specifically configured as:

If the matching is successful, the content of each cell in the table is spliced according to the position information of the columns, and the spliced content is matched with the content of a preset standard template, where the preset standard template includes the preset keys of the key-value pairs to be extracted;

If the match is successful, the number of rows corresponding to the matched key is taken as the fourth target number;

Traverse the table by column, and use the number of rows in the table as the fourth standard number;

If the fourth standard number matches the fourth target number, then based on the matched keys and their location information, a fifth information extraction template corresponding to the table type is generated;

Wherein, the fifth information extraction template includes the matched key and its location information, and location information of the value of each key-value pair to be extracted in the table.

The apparatus for extracting table information based on RPA and AI provided by the embodiment of the present invention can execute the table information extraction method based on RPA and AI provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to the executed method. For technical details not described in detail in this embodiment, reference may be made to the table information extraction method based on RPA and AI provided by any embodiment of the present invention.

Embodiment 9

Please refer to FIG. 9, which is a schematic structural diagram of a computing device according to Embodiment 9 of the present invention. As shown in FIG. 9, the computing device may include:

a memory 901 storing executable program code;

a processor 902 coupled to the memory 901;

The processor 902 invokes the executable program code stored in the memory 901 to execute the table information extraction method based on RPA and AI provided by any embodiment of the present invention.

The embodiment of the present invention also discloses a computer-readable storage medium storing a computer program, wherein the computer program enables a computer to execute the RPA- and AI-based table information extraction method provided by any embodiment of the present invention.

Those of ordinary skill in the art can understand that the accompanying drawings are only schematic diagrams of an embodiment, and that the modules or processes in the accompanying drawings are not necessarily required to implement the present invention.

Those skilled in the art may understand that: the modules in the apparatus in the embodiment may be distributed in the apparatus in the embodiment according to the description of the embodiment, and may also be located in one or more apparatuses different from this embodiment with corresponding changes. The modules in the foregoing embodiments may be combined into one module, or may be further split into multiple sub-modules.

Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some of their technical features can be equivalently replaced, and that these modifications or replacements do not make the essence of the corresponding technical solutions deviate from the spirit and scope of the technical solutions in the embodiments of the present invention.

Claims

A table information extraction method based on RPA and AI, characterized by comprising:

S1. Convert the file containing the table into a picture;

S2. Recognize the table in the picture, and generate an information extraction template corresponding to the table type according to the recognition result, where the information extraction template includes the keys of the key-value pairs in the table and their position information, and the position information of the values of the key-value pairs to be extracted;

S3. Extract table content from the recognition result according to the information extraction template.
The method according to claim 1, wherein step S2 specifically comprises:

S21, performing optical character recognition (OCR) on the picture to obtain a recognition result, where the recognition result includes the content of each cell in each table and the position information of each cell in the table;

S22. For any type of table, generate an information extraction template corresponding to the table type according to the content of each cell in the table and the position information of each cell in the table.
The method according to claim 1 or 2, wherein the table type includes a left-right one-to-one format, the key and value of each key-value pair in a table in the left-right one-to-one format have a left-right positional relationship, and the key and the value have a one-to-one relationship;

Correspondingly, step S22 specifically includes:

S221, splicing the content of each cell in the table according to the position information of the row and column in the table;

S222, for each row of content in the table, based on the content after splicing, generate a first information extraction template corresponding to the table type;

Wherein, the first information extraction template includes the key and position information of each key-value pair in each row in the table, and the position information of the value of each key-value pair to be extracted.
The method according to claim 2, wherein the table type includes a top-bottom one-to-many format, the key and value of each key-value pair in a table in the top-bottom one-to-many format have a top-bottom positional relationship, and the key and the value have a one-to-many relationship;

Correspondingly, step S22 specifically includes:

S221. If the preset vocabulary is not detected, splicing the content of each cell in the table according to the position information of the rows, and matching the spliced content with the content of a preset standard template, where the preset standard template includes the preset keys of the key-value pairs to be extracted;

S222, if the matching is successful, the number of columns corresponding to the matched keys is taken as the first target number;

S223, traverse the table row by row, and use the number of columns in the table as the first standard number;

S224, if the first standard number matches the first target number, then add an auxiliary variable before the first column of cells in the table, and the auxiliary variable is used to distinguish the content of each row in the table when the table content is extracted;

S225, based on the auxiliary variable and the matched key and its position information, generate a second information extraction template corresponding to the table type;

Wherein, the second information extraction template includes the auxiliary variable, the matching key and its location information, and the location information of the value of each key-value pair to be extracted in the table.
The method according to claim 2, wherein the table type includes a top-bottom one-to-many format, the key and value of each key-value pair in a table in the top-bottom one-to-many format have a top-bottom positional relationship, and the key and the value have a one-to-many relationship;

Correspondingly, step S22 specifically includes:

S221, if a preset vocabulary is detected, then the value of each key-value pair in the table is matched with the content of the preset vocabulary;

S222. If the matching is successful, splicing the content of each cell in the table according to the position information of the rows, and matching the spliced content with the content of a preset standard template, where the preset standard template includes the preset keys of the key-value pairs to be extracted;

S223, if the matching is successful, the number of columns corresponding to the matched keys is taken as the second target number;

S224, traverse the table row by row, and use the number of columns in the table as the second standard number;

S225, if the second standard number matches the second target number, then based on the matched key and its position information, generate a third information extraction template corresponding to the table type;

Wherein, the third information extraction template includes the matched key and its location information, and the location information of the value of each key-value pair to be extracted in the table.
The method according to claim 2, wherein the table type includes a left-right one-to-many format, the key and value of each key-value pair in a table in the left-right one-to-many format have a left-right positional relationship, and the key and the value have a one-to-many relationship;

Correspondingly, step S22 specifically includes:

S221. If the preset vocabulary is not detected, splicing the content of each cell in the table according to the position information of the columns, and matching the spliced content with the keys in a preset standard template, where the preset standard template includes the preset keys of the key-value pairs to be extracted;

S222, if the matching is successful, the number of rows corresponding to the matched keys is taken as the third target number;

S223, traverse the table by column, and determine the third standard number of rows in the table;

S224. If the third standard number matches the third target number, add an auxiliary variable before each cell in the table;

S225, based on the auxiliary variable and the matching key and its position information, generate a fourth information extraction template corresponding to the table type, and the auxiliary variable is used to distinguish the content of each row in the table when the table content is extracted;

Wherein, the fourth information extraction template includes the auxiliary variable, the matched key and its location information, and the location information of the value of each key-value pair in the table.
The method according to claim 2, wherein the table type includes a left-right one-to-many format, the key and value of each key-value pair in a table in the left-right one-to-many format have a left-right positional relationship, and the key and the value have a one-to-many relationship;

Correspondingly, step S22 specifically includes:

S221, if a preset vocabulary is detected, then the value of each key-value pair in the table is matched with the content of the preset vocabulary;

S222. If the matching is successful, splicing the content of each cell in the table according to the position information of the columns, and matching the spliced content with the content of a preset standard template, where the preset standard template includes the preset keys of the key-value pairs to be extracted;

S223, if the matching is successful, the number of rows corresponding to the matched keys is taken as the fourth target number;

S224, traverse the table by column, and use the number of rows in the table as the fourth standard number;

S225, if the fourth standard number matches the fourth target number, then based on the matched key and its position information, generate the fifth information extraction template corresponding to the table type;

Wherein, the fifth information extraction template includes the matched key and its location information, and location information of the value of each key-value pair to be extracted in the table.
A table information extraction device based on RPA and AI, characterized in that, comprising:

a picture conversion template, configured to convert a file containing a table into a picture;

a template generation module, configured to recognize the table in the picture, and generate an information extraction template corresponding to the table type according to the recognition result, where the information extraction template includes the keys of the key-value pairs in the table and their position information, and the position information of the values of the key-value pairs to be extracted;

The content extraction module is configured to extract table content from the recognition result according to the information extraction template.
A computing device, comprising:

memory in which executable program code is stored;

a processor coupled to the memory;

The processor invokes the executable program code stored in the memory to execute the RPA and AI-based table information extraction method according to any one of claims 1-7.
A computer-readable storage medium on which a computer program is stored, characterized in that, when the program is executed by a processor, the method for extracting table information based on RPA and AI according to any one of claims 1-7 is implemented.