Disclosure of Invention
The invention aims to solve at least one technical problem in the prior art, and therefore provides a template-based key-value pair extraction method and system.
To achieve the above purpose, the invention adopts the following technical solutions:
according to an embodiment of the first aspect of the present invention, a template-based key-value pair extraction method includes: S10, receiving an image, finding text information in the image using the text detection algorithm DB (Differentiable Binarization), and forming bounding boxes around the text information; S20, finding keyword information in the image according to a pre-entered template, and calculating fixed anchor points from the keyword information and the bounding box coordinates; S30, calculating variable anchor points from the bounding box coordinates and the fixed anchor points; S40, performing projective transformation correction on the image according to the fixed anchor points, the variable anchor points, and the template; and S50, recognizing and extracting the text information in the corrected bounding boxes.
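The five steps above can be sketched as a minimal Python pipeline. All function bodies here are illustrative placeholders and not the patented implementation: a real S10 would invoke a DB detector and a real S50 would invoke an OCR engine.

```python
# Hypothetical step functions; all names and data shapes are illustrative only.
def s10_detect(image):
    """S10: DB text detection -> list of text boxes (stub result)."""
    return [{"text": "Name", "box": [(0, 0), (40, 0), (40, 10), (0, 10)]}]

def s20_fixed_anchors(boxes, template):
    """S20: keep boxes matching template keywords (anchor calc elided)."""
    return [b for b in boxes if b["text"] in template["keywords"]]

def s30_variable_anchors(boxes, fixed):
    """S30: placeholder; would derive extra anchors from box corners."""
    return fixed

def s40_rectify(image, fixed, variable, template):
    """S40: placeholder; would apply a projective (homography) warp."""
    return image

def s50_recognize(boxes):
    """S50: placeholder OCR; returns empty strings as values."""
    return {b["text"]: "" for b in boxes}

def extract_key_value_pairs(image, template):
    boxes = s10_detect(image)
    fixed = s20_fixed_anchors(boxes, template)
    variable = s30_variable_anchors(boxes, fixed)
    s40_rectify(image, fixed, variable, template)
    return s50_recognize(boxes)
```

The point of the skeleton is the data flow: detection output feeds anchor generation, anchors feed rectification, and recognition runs only on the corrected boxes.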
According to some embodiments of the present invention, the step S10 is preceded by a step S1, and the step S1 includes entering or deleting a template, wherein the template includes key layout and keyword information.
According to some embodiments of the invention, the step S20 further includes calculating coordinates of the anchor point according to a regular expression.
According to some embodiments of the invention, the step S20 further includes customizing the function of the anchor point, and optimizing the coordinates of the anchor point according to the coordinates of the bounding box.
According to some embodiments of the invention, the step S40 is further followed by a step S41, and the step S41 includes forming an enclosing bounding box around all the bounding boxes, and sequentially dividing the enclosing box into a plurality of regions from top to bottom according to the template, wherein each region includes at least one key-value pair.
According to an embodiment of the second aspect of the present invention, a template-based key-value pair extraction system includes: a positioning module for receiving an image, finding text information in the image using the text detection algorithm DB, and forming bounding boxes around the text information; a fixed anchor point generating module for finding keyword information in the image according to a pre-entered template and calculating fixed anchor points from the keyword information and the bounding box coordinates; a variable anchor point calculating module for calculating variable anchor points from the bounding box coordinates and the fixed anchor points; a correction module for performing projective transformation correction on the image according to the fixed anchor points, the variable anchor points, and the template; and an information extraction module for recognizing and extracting the text information in the corrected bounding boxes.
According to some embodiments of the invention, the system further comprises an entry module, wherein the entry module is used for entering or deleting a template, and the template comprises key layout and keyword information.
According to some embodiments of the present invention, the fixed anchor point generating module further calculates the coordinates of the anchor point according to a regular expression.
According to some embodiments of the invention, the anchor point generating module further comprises customizing a function of the anchor point to optimize coordinates of the anchor point according to coordinates of the bounding box.
According to some embodiments of the present invention, the system further comprises a key-value pair dividing module, wherein the key-value pair dividing module forms an enclosing bounding box around all the bounding boxes and sequentially divides it into a plurality of regions from top to bottom according to the template, each region comprising at least one key-value pair.
The template-based key-value pair extraction method and system provided by the embodiments of the invention have at least the following beneficial effects: the method is suitable for correcting and extracting various kinds of text information, improves the accuracy of image character recognition, and achieves accuracy and robustness superior to existing template-based key-value pair matching methods.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention are further described below with reference to the accompanying drawings.
The technical solutions in the embodiments of the present invention will be fully described below, and it should be apparent that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention provides a template-based key-value pair extraction method, as shown in fig. 1, which is a flow chart of the invention. According to some embodiments, the method includes: S10, receiving an image, finding text information in the image using the text detection algorithm DB, and forming bounding boxes around the text information; S20, finding keyword information in the image according to a pre-entered template, and calculating fixed anchor points from the keyword information and the bounding box coordinates; S30, calculating variable anchor points from the bounding box coordinates and the fixed anchor points; S40, performing projective transformation correction on the image according to the fixed anchor points, the variable anchor points, and the template; and S50, recognizing and extracting the text information in the corrected bounding boxes.
Based on the above embodiment, an image is received, text information is found in the image by the text detection algorithm DB, a bounding box is formed around the text information, and keyword information is then found in the text of the bounding box. The bounding box is rectangular; the coordinates of its four corner points are extracted, and the coordinate position of the fixed anchor point is calculated near the keyword information from these coordinates. As shown in fig. 2, the fixed anchor point is the black dot placed at the midpoint of the short side on the left of the keyword information; because it lies on the left, it is not affected by the length of the value text, such as a specific name or place name. The white dot on the right is the variable anchor point, positioned by calculation from the coordinates of the fixed anchor point and of the bounding box; adding it increases the number of anchor points and improves the accuracy of the projective transformation. The variable anchor point shifts its coordinates with the specific value information, such as the length of a specific name or place name. A projective transformation matrix is then calculated from the fixed and variable anchor points, and the bounding boxes of the original picture are mapped onto the template picture for correction. After the image is corrected, the text information in the bounding boxes is recognized; if the recognized text contains errors such as missing or misrecognized characters, it can be corrected manually, or automatically by means of regular expressions and previously collected text information.
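Assuming a corner order of top-left, top-right, bottom-right, bottom-left (an assumption for illustration; the text does not fix an order), the fixed anchor at the midpoint of the left short side and the variable anchor at the midpoint of the right short side can be computed as:

```python
def anchors(corners):
    """corners: [top-left, top-right, bottom-right, bottom-left] of a
    rectangular bounding box. Returns (fixed_anchor, variable_anchor):
    the midpoints of the left and right short sides respectively."""
    (tlx, tly), (trx, try_), (brx, bry), (blx, bly) = corners
    fixed = ((tlx + blx) / 2, (tly + bly) / 2)      # left short-side midpoint
    variable = ((trx + brx) / 2, (try_ + bry) / 2)  # right short-side midpoint
    return fixed, variable
```

For a box from (0, 0) to (100, 20), this places the fixed anchor at (0, 10) and the variable anchor at (100, 10); if the value text grows, only the right-hand (variable) anchor moves.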
According to some embodiments, the step S10 is preceded by a step S1, and the step S1 includes entering or deleting a template, where the template includes key layout and keyword information.
Based on the above embodiment, when unimportant text information exists in the image (for example, notes handwritten onto a copy), then after the copy is photographed by a high-speed document camera, the keyword information can be located more accurately according to the keyword information and the key layout, and the value information can be found relative to the keywords. The keyword information comprises terms such as "name", "gender", "birth", "year", "month", and "day", while the value information is the specific name, place name, and so on appearing before or after the keyword.
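Locating keyword boxes among the recognized text can be sketched with a simple regular expression built from the template's keyword list. The code below is illustrative only; the patent does not specify this implementation, and the keyword list is taken from the examples in the text.

```python
import re

# Keyword list taken from the examples in the text.
KEYWORDS = ["name", "gender", "birth", "year", "month", "day"]

def locate_keywords(recognized):
    """recognized: list of (text, box) pairs from detection/recognition.
    Returns the entries whose text contains a template keyword, so the
    surrounding value text (a specific name, place, etc.) can then be
    located relative to them."""
    pattern = re.compile("|".join(map(re.escape, KEYWORDS)), re.IGNORECASE)
    return [(text, box) for text, box in recognized if pattern.search(text)]
```

Matching against the keyword list rather than exact strings tolerates punctuation and minor OCR noise around the keyword.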
According to some embodiments, the step S20 further includes calculating coordinates of the anchor points according to a regular expression.
According to some embodiments, the step S20 further includes customizing the function of the anchor point, and optimizing the coordinates of the anchor point according to the coordinates of the bounding box.
Based on the above embodiment, the calculation of the fixed anchor coordinates further includes calculation according to a regular expression: the coordinates of fixed anchor points that have not been located are computed from the collected coordinates of a plurality of located fixed anchor points by means of the regular expression. The fixed anchor point function can also be customized by the user to calculate the anchor coordinates from the coordinates of the four corner points of the bounding box. Obtaining anchor coordinates in multiple ways makes the method more practical.
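One way such a user-customized anchor function might look is sketched below: the anchor is parameterized by bilinear coordinates (u, v) over the box's four corners, so any point of the box (corner, side midpoint, center) can be selected. The parameterization is an assumption for illustration; the patent does not define the function's form.

```python
def make_anchor_fn(u, v):
    """Return an anchor function placing the anchor at bilinear position
    (u, v) inside a quadrilateral, where (0, 0) is the top-left corner
    and (1, 1) the bottom-right. u and v are illustrative parameters."""
    def anchor(corners):
        # corners: [top-left, top-right, bottom-right, bottom-left]
        (x0, y0), (x1, y1), (x2, y2), (x3, y3) = corners
        top = (x0 + u * (x1 - x0), y0 + u * (y1 - y0))  # point on top edge
        bot = (x3 + u * (x2 - x3), y3 + u * (y2 - y3))  # point on bottom edge
        return (top[0] + v * (bot[0] - top[0]),
                top[1] + v * (bot[1] - top[1]))
    return anchor

left_mid = make_anchor_fn(0.0, 0.5)   # reproduces the left-side fixed anchor
```

Because the function receives all four corner coordinates, it also works on quadrilaterals that are not axis-aligned, which is what makes it useful before rectification.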
According to some embodiments, the step S40 is further followed by a step S41, and the step S41 includes forming an enclosing bounding box around all the bounding boxes, and sequentially dividing the enclosing box into a plurality of regions from top to bottom according to the template, wherein each region includes at least one key-value pair.
Based on the above embodiment, as shown in fig. 3, the enclosing bounding box is divided from top to bottom into a plurality of key-value pair regions; after recognition, the key-value pair of each region is extracted, and each region is treated as one row, which facilitates distinguishing them. The extraction result is as follows: the name keyword and the specific name form one row; the gender keyword and the specific gender, together with the ethnicity keyword and the specific ethnicity, form another row; and so on. The data are extracted into computer text for storage.
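The top-to-bottom division into rows can be sketched as follows, assuming each recognized box is reduced to a (text, vertical-center) pair; the `row_tolerance` threshold is an illustrative parameter, not from the patent.

```python
def group_into_rows(boxes, row_tolerance=10):
    """Group corrected boxes into rows (regions) from top to bottom.
    boxes: list of (text, y_center) pairs. Boxes whose vertical centers
    lie within row_tolerance pixels of the row's first box share a row."""
    rows = []  # each entry: [anchor_y, [texts...]]
    for text, yc in sorted(boxes, key=lambda b: b[1]):
        if rows and abs(yc - rows[-1][0]) <= row_tolerance:
            rows[-1][1].append(text)   # same region as previous box
        else:
            rows.append([yc, [text]])  # start a new region
    return [texts for _, texts in rows]
```

Because the image has already been rectified in S40, boxes belonging to one key-value pair share nearly the same vertical center, so a simple threshold suffices.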
On the basis of the above embodiments, the present embodiment provides a template-based key-value pair extraction system that performs bounding-box-based text correction. According to some embodiments, the system comprises: a positioning module for receiving an image, finding text information in the image using the text detection algorithm DB, and forming bounding boxes around the text information; a fixed anchor point generating module for finding keyword information in the image according to a pre-entered template and calculating fixed anchor points from the keyword information and the bounding box coordinates; a variable anchor point calculating module for calculating variable anchor points from the bounding box coordinates and the fixed anchor points; a correction module for performing projective transformation correction on the image according to the fixed anchor points, the variable anchor points, and the template; and an information extraction module for recognizing and extracting the text information in the corrected bounding boxes.
Based on the above embodiment, an image is received, text information is found in the image by the text detection algorithm DB, a bounding box is formed around the text information, and keyword information is then found in the text of the bounding box. The bounding box is rectangular; the coordinates of its four corner points are extracted, and the coordinate position of the fixed anchor point is calculated near the keyword information from these coordinates. As shown in fig. 2, the fixed anchor point is the black dot placed at the midpoint of the short side on the left of the keyword information; because it lies on the left, it is not affected by the length of the value text, such as a specific name or place name. The white dot on the right is the variable anchor point, positioned by calculation from the coordinates of the fixed anchor point and of the bounding box; adding it increases the number of anchor points and improves the accuracy of the projective transformation. The variable anchor point shifts its coordinates with the specific value information, such as the length of a specific name or place name. A projective transformation matrix is then calculated from the fixed and variable anchor points, and the bounding boxes of the original picture are mapped onto the template picture for correction. After the image is corrected, the text information in the bounding boxes is recognized; if the recognized text contains errors such as missing or misrecognized characters, it can be corrected manually, or automatically by means of regular expressions and previously collected text information.
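The projective correction amounts to estimating a homography from the anchor correspondences (image anchors to template anchors) and warping the image with it. A minimal NumPy sketch of the estimation step, using the standard direct linear transform (DLT), is shown below as an illustration; a production system would typically call `cv2.findHomography` instead.

```python
import numpy as np

def homography(src, dst):
    """Estimate the 3x3 projective transform H mapping src points to dst
    points (at least 4 correspondences) by solving the DLT system: for
    each pair, two linear equations in the 9 entries of H; the solution
    is the right singular vector of the stacked system."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.array(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]   # normalize so the bottom-right entry is 1
```

Once H is estimated from the fixed and variable anchors, applying it (for instance via `cv2.warpPerspective`) maps the original picture's bounding boxes onto the template picture, which is the correction performed in this embodiment. More anchors than four simply make the least-squares estimate more robust, which is why adding variable anchors improves accuracy.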
According to some embodiments, the system further comprises an entry module, wherein the entry module is used for entering or deleting a template, and the template comprises key layout and keyword information.
Based on the above embodiment, when unimportant text information exists in the image (for example, notes handwritten onto a copy), then after the copy is photographed by a high-speed document camera, the keyword information can be located more accurately according to the keyword information and the key layout, and the value information can be found relative to the keywords. The keyword information comprises terms such as "name", "gender", "birth", "year", "month", and "day", while the value information is the specific name, place name, and so on appearing before or after the keyword.
According to some embodiments, the fixed anchor point generating module further calculates the coordinates of the anchor point according to a regular expression.
According to some embodiments, the anchor point generation module further comprises customizing a function of the anchor point to optimize coordinates of the anchor point according to coordinates of the bounding box.
Based on the above embodiment, the calculation of the fixed anchor coordinates further includes calculation according to a regular expression, and the fixed anchor point function can also be customized by the user to calculate the anchor coordinates from the coordinates of the four corner points of the bounding box. Obtaining anchor coordinates in multiple ways makes the method more practical.
According to some embodiments, the system further comprises a key-value pair dividing module, wherein the key-value pair dividing module forms an enclosing bounding box around all the bounding boxes and sequentially divides it into a plurality of regions from top to bottom according to the template, each region comprising at least one key-value pair.
Based on the above embodiment, as shown in fig. 3, the enclosing bounding box is divided from top to bottom into a plurality of key-value pair regions; after recognition, the key-value pair of each region is extracted, and each region is treated as one row, which facilitates distinguishing them. The extraction result is as follows: the name keyword and the specific name form one row; the gender keyword and the specific gender, together with the ethnicity keyword and the specific ethnicity, form another row; and so on. The data are extracted into computer text for storage.
It will be evident to those skilled in the art that the invention is not limited to the exemplary embodiments described above and can be embodied in other specific forms without departing from its essential characteristics. Accordingly, the embodiments should be regarded as exemplary and non-limiting.