CN113869320A - Template-based key value pair extraction method and system - Google Patents
Info
- Publication number
- CN113869320A (application CN202111191056.6A)
- Authority
- CN
- China
- Prior art keywords
- template
- anchor point
- bounding box
- value pair
- coordinates
- Prior art date: 2021-10-13
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/80—Geometric correction
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Processing Or Creating Images (AREA)
- Character Input (AREA)
Abstract
The invention discloses a template-based key-value pair extraction method and system. An image is received, text information is found in the image with the DB text detection algorithm, and a bounding box is formed around each piece of text; keyword information is found in the image according to a pre-entered template, and fixed anchor points are calculated from the keyword information and the bounding-box coordinates; variable anchor points are calculated from the bounding-box coordinates and the fixed anchor points; the image is corrected by projective transformation according to the fixed anchor points, the variable anchor points and the template; and the text inside the corrected bounding boxes is recognized and extracted. The method is suitable for correcting and extracting many kinds of text information, improves the accuracy of image character recognition, and in practice is more accurate and robust than existing template key-value pair matching methods.
Description
Technical Field
The invention relates to the technical field of image processing, and in particular to a template-based key-value pair extraction method and system.
Background
At present, after the characters in image data are recognized, key information must be extracted from them according to business requirements. Much of this key information is tied to the layout of the document, and its relative position is usually fixed, so a document template can be defined that specifies the bounding box in which each piece of key information lies; by locating the corresponding anchor points in an actual image and applying a perspective transformation, the transformed image can be aligned with the template. Existing template-matching approaches to image character recognition, however, apply very strict rules to anchors: an anchor must be a completely fixed, invariant string, with no tolerance for recognition errors in the actual image, otherwise the corresponding anchor cannot be matched. The present method relaxes the anchor matching rules and refines the post-matching recognition step, greatly improving both the template matching rate and the character recognition accuracy of the model.
Disclosure of Invention
The invention aims to solve at least one of the technical problems in the prior art, and to this end provides a template-based key-value pair extraction method and system.
To achieve this aim, the invention adopts the following technical solution:
according to an embodiment of the first aspect of the present invention, a template-based key-value pair extraction method includes: s10, receiving an image, finding text information in the image by using a text detection algorithm DB, and forming a bounding box around the text information; s20, finding keyword information in the image according to a pre-entered template, and calculating a fixed anchor point according to the keyword information and the coordinates of the bounding box; s30, calculating a variable anchor point according to the bounding box coordinates and the fixed anchor point; s40, performing projective transformation correction on the image according to the fixed anchor point, the variable anchor point and the template; and S50, recognizing and extracting the text information in the corrected bounding box.
According to some embodiments of the present invention, step S10 is preceded by a step S1 of entering or deleting a template, the template containing the key layout and the keyword information.
According to some embodiments of the invention, step S20 further comprises calculating the coordinates of the anchor points according to a regular expression.
According to some embodiments of the invention, step S20 further comprises customizing the anchor function and optimizing the coordinates of the anchor points according to the coordinates of the bounding box.
According to some embodiments of the invention, step S40 is followed by a step S41 of forming an enclosing frame around all the bounding boxes and dividing it from top to bottom into a plurality of regions according to the template, each region containing at least one key-value pair.
According to an embodiment of the second aspect of the present invention, a template-based key-value pair extraction system comprises: a positioning module, which receives an image, finds text information in the image with the DB text detection algorithm, and forms a bounding box around each piece of text; a fixed anchor point generation module, which finds keyword information in the image according to a pre-entered template and calculates fixed anchor points from the keyword information and the bounding-box coordinates; a variable anchor point calculation module, which calculates variable anchor points from the bounding-box coordinates and the fixed anchor points; a correction module, which corrects the image by projective transformation according to the fixed anchor points, the variable anchor points and the template; and an information extraction module, which recognizes and extracts the text information inside the corrected bounding boxes.
According to some embodiments of the invention, the system further comprises an entry module for entering or deleting a template, the template containing the key layout and the keyword information.
According to some embodiments of the present invention, the fixed anchor point generation module further calculates the coordinates of the anchor points according to a regular expression.
According to some embodiments of the invention, the fixed anchor point generation module further supports a customizable anchor function that optimizes the coordinates of the anchor points according to the coordinates of the bounding box.
According to some embodiments of the present invention, the system further comprises a key-value pair partitioning module, which forms an enclosing frame around all the bounding boxes and divides it from top to bottom into a plurality of regions according to the template, each region containing at least one key-value pair.
The template-based key-value pair extraction method and system provided by the embodiments of the invention have at least the following beneficial effects: they are suitable for correcting and extracting many kinds of text information, they improve the accuracy of image character recognition, and in practice their accuracy and robustness exceed those of existing template key-value pair matching methods.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a block flow diagram of the present invention;
FIG. 2 is a schematic diagram of the bounding boxes and anchor points formed by the invention;
FIG. 3 is a schematic diagram of the enclosing frame formed by the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention are further described below with reference to the accompanying drawings.
The technical solutions in the embodiments of the present invention will be fully described below, and it should be apparent that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention provides a template-based key-value pair extraction method, whose flow is shown in FIG. 1. According to some embodiments, the method comprises: S10, receiving an image, finding text information in the image with the DB text detection algorithm, and forming a bounding box around each piece of text; S20, finding keyword information in the image according to a pre-entered template, and calculating fixed anchor points from the keyword information and the bounding-box coordinates; S30, calculating variable anchor points from the bounding-box coordinates and the fixed anchor points; S40, correcting the image by projective transformation according to the fixed anchor points, the variable anchor points and the template; and S50, recognizing and extracting the text information inside the corrected bounding boxes.
Based on the above embodiment, an image is received, text information is found in it by the DB text detection algorithm, a bounding box is formed around each piece of text, and keyword information is located within the text of the bounding boxes. Each bounding box is rectangular; the coordinates of its four corners are extracted, and the position of the fixed anchor point is calculated near the keyword information from these coordinates. As shown in FIG. 2, the fixed anchor point is the black dot placed to the left of the keyword information, at the midpoint of the short side of its bounding box; because it sits on the left, it is unaffected by the length of the key information, such as the length of a particular name or place name. The white dot on the right is the variable anchor point, which is positioned by calculation from the coordinates of the fixed anchor point and of the bounding box; it increases the number of anchor points and therefore the accuracy of the projective transformation. The variable anchor point shifts its coordinates with the specific key information, for example with the length of a particular name or place name. A projective transformation matrix is then computed from the fixed and variable anchor points, and the bounding boxes of the original picture are mapped onto the template picture for correction. After the image has been corrected, the text inside the bounding boxes is recognized; if the recognized text contains errors such as missing or misrecognized characters, it can be corrected manually or corrected automatically by means of a regular expression and the collected text information.
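The patent gives no implementation, but the anchor construction and projective correction described above can be sketched with OpenCV roughly as follows. The corner ordering (top-left, top-right, bottom-right, bottom-left), the helper names and the use of findHomography with RANSAC are assumptions made for this sketch, not details taken from the patent.

```python
# Rough sketch only: corner ordering (TL, TR, BR, BL), helper names and the
# RANSAC-based homography estimate are assumptions, not the patent's method.
import cv2
import numpy as np


def fixed_anchor(box: np.ndarray) -> np.ndarray:
    """Midpoint of the left (short) side of a keyword bounding box."""
    tl, _, _, bl = box
    return (tl + bl) / 2.0


def variable_anchor(box: np.ndarray) -> np.ndarray:
    """Midpoint of the right side of the same box; shifts with the text length."""
    _, tr, br, _ = box
    return (tr + br) / 2.0


def rectify(image: np.ndarray,
            image_anchors: np.ndarray,      # N x 2 anchor points found in the input image
            template_anchors: np.ndarray,   # the matching N x 2 points in the template
            template_size: tuple) -> np.ndarray:
    """Warp the input image onto the template using the anchor correspondences."""
    # With more than four correspondences, RANSAC tolerates a few bad anchors.
    H, _ = cv2.findHomography(image_anchors, template_anchors, cv2.RANSAC, 3.0)
    return cv2.warpPerspective(image, H, template_size)   # template_size = (width, height)


if __name__ == "__main__":
    # Toy keyword box (slightly skewed) just to show where the two anchors land.
    box = np.array([[105, 210], [260, 214], [258, 252], [103, 248]], dtype=np.float32)
    print(fixed_anchor(box), variable_anchor(box))
```

Using a robust homography estimate rather than an exact four-point transform is one way to benefit from the extra variable anchors the text mentions; the patent itself only states that more anchors improve the accuracy of the projective transformation.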
According to some embodiments, step S10 is preceded by a step S1 of entering or deleting a template, the template containing the key layout and the keyword information.
Based on the above embodiment, when unimportant text is present in the image, for example handwriting added to a photocopy that is then captured with a high-speed document camera, the keyword information can still be located accurately from the keywords and the key layout, and the key information is then found relative to the keywords. The keyword information consists of fixed labels such as "name", "gender", "birth", "year", "month" and "day", while the key information is the specific content, such as an actual person's name or place name, appearing before or after the keywords.
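The Background section notes that keyword matching here is meant to tolerate recognition errors, but the patent does not say how that tolerance is achieved. One simple possibility, shown purely as an illustration, is fuzzy string matching with a similarity threshold; the difflib choice and the 0.8 threshold are assumptions.

```python
# Illustrative only: fuzzy similarity as one way to accept a slightly
# misrecognized text as a template keyword; threshold is an assumption.
from difflib import SequenceMatcher


def match_keyword(detected_text, keywords, threshold=0.8):
    """Return the closest template keyword, or None if nothing is similar enough."""
    best, best_score = None, 0.0
    for kw in keywords:
        score = SequenceMatcher(None, detected_text, kw).ratio()
        if score > best_score:
            best, best_score = kw, score
    return best if best_score >= threshold else None


# An OCR result with one wrong character still anchors to the keyword "gender".
print(match_keyword("gendor", ["name", "gender", "birth", "year", "month", "day"]))
```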
According to some embodiments, step S20 further comprises calculating the coordinates of the anchor points according to a regular expression.
According to some embodiments, step S20 further comprises customizing the anchor function and optimizing the coordinates of the anchor points according to the coordinates of the bounding box.
Based on the above embodiment, the coordinates of a fixed anchor point can also be computed by means of a regular expression: the coordinates of the fixed anchor points that have already been located are collected, and the regular expression together with these collected coordinates is used to compute the coordinates of fixed anchor points that could not be located directly. The anchor function can also be customized by the user and computes the coordinates of the fixed anchor point from the four corner coordinates of the bounding box. Obtaining anchor coordinates in several ways makes the method more practical.
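A minimal sketch of the user-customizable anchor function mentioned above, assuming a registry of callbacks that map the four corner coordinates of a bounding box to an anchor point. The registry design and all names below are illustrative, not the patent's API; the regular-expression route for unlocated anchors is not shown because the patent does not describe it in enough detail.

```python
# Sketch of a user-defined anchor rule: each callback maps a 4x2 corner array
# to an (x, y) anchor point. Registry design and names are assumptions.
from typing import Callable, Dict

import numpy as np

AnchorFn = Callable[[np.ndarray], np.ndarray]   # 4x2 corner array -> (x, y)

ANCHOR_FUNCTIONS: Dict[str, AnchorFn] = {
    "left_mid": lambda box: (box[0] + box[3]) / 2.0,    # default fixed anchor
    "right_mid": lambda box: (box[1] + box[2]) / 2.0,   # default variable anchor
}


def register_anchor_fn(name: str, fn: AnchorFn) -> None:
    """Let the user add an anchor rule of their own."""
    ANCHOR_FUNCTIONS[name] = fn


# Example: anchor a field at the centre of the top edge of its bounding box.
register_anchor_fn("top_mid", lambda box: (box[0] + box[1]) / 2.0)

box = np.array([[100, 200], [250, 200], [250, 240], [100, 240]], dtype=float)
print(ANCHOR_FUNCTIONS["top_mid"](box))   # -> [175. 200.]
```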
According to some embodiments, step S40 is followed by a step S41 of forming an enclosing frame around all the bounding boxes and dividing it from top to bottom into a plurality of regions according to the template, each region containing at least one key-value pair.
Based on the above embodiment, as shown in FIG. 3, the enclosing frame is divided from top to bottom into several regions, the key-value pairs of each region are recognized and extracted, and each region is treated as one row, which makes the pairs easy to tell apart. The extraction result is, for example: "name" and the specific name in one row; "gender" and the specific gender, together with "ethnicity" and the specific ethnicity, in the next row; and so on. The data are then extracted into computer text for storage.
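A rough sketch of this top-to-bottom partitioning into rows of key-value pairs: detected boxes are grouped by the vertical position of their centres and each row is read left to right, so a keyword and the value next to it end up in the same row. The y tolerance and the (text, box) tuple layout are assumptions for this sketch.

```python
# Group detected (text, box) items into rows by vertical centre, then order
# each row left to right. Tolerance and data layout are assumptions.
import numpy as np


def group_into_rows(items, y_tol=10.0):
    """items: list of (text, box), box a 4x2 corner array; returns rows of texts, top to bottom."""
    def centre_y(box):
        return float(np.mean(box[:, 1]))

    rows = []
    for text, box in sorted(items, key=lambda it: centre_y(it[1])):
        if rows and abs(centre_y(box) - rows[-1]["y"]) <= y_tol:
            rows[-1]["items"].append((text, box))
        else:
            rows.append({"y": centre_y(box), "items": [(text, box)]})
    # Within a row, order left to right so the keyword precedes its value.
    for row in rows:
        row["items"].sort(key=lambda it: float(np.min(it[1][:, 0])))
    return [[t for t, _ in row["items"]] for row in rows]


boxes = {
    "name":   np.array([[10, 10], [60, 10], [60, 30], [10, 30]]),
    "Zhang":  np.array([[70, 12], [120, 12], [120, 32], [70, 32]]),
    "gender": np.array([[10, 50], [60, 50], [60, 70], [10, 70]]),
    "female": np.array([[70, 52], [120, 52], [120, 72], [70, 72]]),
}
print(group_into_rows(list(boxes.items())))   # -> [['name', 'Zhang'], ['gender', 'female']]
```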
Building on the above embodiments, this embodiment provides a template-based key-value pair extraction system. According to some embodiments, it comprises: a positioning module, which receives an image, finds text information in the image with the DB text detection algorithm, and forms a bounding box around each piece of text; a fixed anchor point generation module, which finds keyword information in the image according to a pre-entered template and calculates fixed anchor points from the keyword information and the bounding-box coordinates; a variable anchor point calculation module, which calculates variable anchor points from the bounding-box coordinates and the fixed anchor points; a correction module, which corrects the image by projective transformation according to the fixed anchor points, the variable anchor points and the template; and an information extraction module, which recognizes and extracts the text information inside the corrected bounding boxes.
Based on the above embodiment, an image is received, text information is found in it by the DB text detection algorithm, a bounding box is formed around each piece of text, and keyword information is located within the text of the bounding boxes. Each bounding box is rectangular; the coordinates of its four corners are extracted, and the position of the fixed anchor point is calculated near the keyword information from these coordinates. As shown in FIG. 2, the fixed anchor point is the black dot placed to the left of the keyword information, at the midpoint of the short side of its bounding box; because it sits on the left, it is unaffected by the length of the key information, such as the length of a particular name or place name. The white dot on the right is the variable anchor point, which is positioned by calculation from the coordinates of the fixed anchor point and of the bounding box; it increases the number of anchor points and therefore the accuracy of the projective transformation. The variable anchor point shifts its coordinates with the specific key information, for example with the length of a particular name or place name. A projective transformation matrix is then computed from the fixed and variable anchor points, and the bounding boxes of the original picture are mapped onto the template picture for correction. After the image has been corrected, the text inside the bounding boxes is recognized; if the recognized text contains errors such as missing or misrecognized characters, it can be corrected manually or corrected automatically by means of a regular expression and the collected text information.
According to some embodiments, the system further comprises an entry module for entering or deleting a template, the template containing the key layout and the keyword information.
Based on the above embodiment, when unimportant text is present in the image, for example handwriting added to a photocopy that is then captured with a high-speed document camera, the keyword information can still be located accurately from the keywords and the key layout, and the key information is then found relative to the keywords. The keyword information consists of fixed labels such as "name", "gender", "birth", "year", "month" and "day", while the key information is the specific content, such as an actual person's name or place name, appearing before or after the keywords.
According to some embodiments, the fixed anchor point generation module further calculates the coordinates of the anchor points according to a regular expression.
According to some embodiments, the fixed anchor point generation module further supports a customizable anchor function that optimizes the coordinates of the anchor points according to the coordinates of the bounding box.
Based on the above embodiment, the coordinates of a fixed anchor point can also be computed by means of a regular expression, and the anchor function can likewise be customized by the user to compute the coordinates of the fixed anchor point from the four corner coordinates of the bounding box. Obtaining anchor coordinates in several ways makes the system more practical.
According to some embodiments, the system further comprises a key-value pair partitioning module, which forms an enclosing frame around all the bounding boxes and divides it from top to bottom into a plurality of regions according to the template, each region containing at least one key-value pair.
Based on the above embodiment, as shown in FIG. 3, the enclosing frame is divided from top to bottom into several regions, the key-value pairs of each region are recognized and extracted, and each region is treated as one row, which makes the pairs easy to tell apart. The extraction result is, for example: "name" and the specific name in one row; "gender" and the specific gender, together with "ethnicity" and the specific ethnicity, in the next row; and so on. The data are then extracted into computer text for storage.
It will be evident to those skilled in the art that the invention is not limited to the exemplary embodiments described above and may be embodied in other specific forms without departing from its essential characteristics. The embodiments should therefore be regarded as exemplary and non-limiting.
Claims (10)
1. A template-based key-value pair extraction method is characterized by comprising the following steps:
s10, receiving an image, finding text information in the image by using a text detection algorithm DB, and forming a bounding box around the text information;
s20, finding keyword information in the image according to a pre-entered template, and calculating a fixed anchor point according to the keyword information and the coordinates of the bounding box;
s30, calculating a variable anchor point according to the bounding box coordinates and the fixed anchor point;
s40, performing projective transformation correction on the image according to the fixed anchor point, the variable anchor point and the template;
and S50, recognizing and extracting the text information in the corrected bounding box.
2. The template-based key-value pair extraction method of claim 1, wherein step S10 is preceded by a step S1 of entering or deleting a template, the template comprising key layout and keyword information.
3. The template-based key-value pair extraction method of claim 1, wherein the step S20 further comprises calculating coordinates of a fixed anchor point according to a regular expression.
4. The template-based key-value pair extraction method of claim 1 or 3, wherein step S20 further comprises customizing an anchor function and optimizing the coordinates of the anchor point according to the coordinates of the bounding box.
5. The template-based key-value pair extraction method of claim 1, wherein step S40 is followed by a step S41 of forming an enclosing frame around all the bounding boxes and dividing it from top to bottom into a plurality of regions according to the template, each region comprising at least one key-value pair.
6. A template-based key-value pair extraction system, comprising:
the positioning module is used for receiving an image, finding text information in the image by utilizing a text detection algorithm DB, and forming a bounding box around the text information;
the fixed anchor point generation module is used for finding keyword information in the image according to a pre-entered template and calculating a fixed anchor point according to the keyword information and the coordinates of the bounding box;
the variable anchor point calculating module is used for calculating a variable anchor point according to the bounding box coordinates and the fixed anchor point;
the correction module is used for carrying out projection transformation correction on the image according to the fixed anchor points, the variable anchor points and the template;
and the information extraction module is used for identifying and extracting the corrected text information in the bounding box.
7. The template-based key-value pair extraction system according to claim 6, further comprising an entry module for entering or deleting a template, the template comprising key layout and keyword information.
8. The template-based key-value pair extraction system of claim 6, wherein the fixed anchor point generation module further calculates the coordinates of anchor points according to a regular expression.
9. The template-based key-value pair extraction system of claim 8, wherein the fixed anchor point generation module further comprises a customizable anchor function for optimizing the coordinates of the anchor point according to the coordinates of the bounding box.
10. The template-based key-value pair extraction system of claim 6, further comprising a key-value pair partitioning module, which forms an enclosing frame around all the bounding boxes and divides it from top to bottom into a plurality of regions according to the template, each region comprising at least one key-value pair.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202111191056.6A | 2021-10-13 | 2021-10-13 | Template-based key value pair extraction method and system |

Applications Claiming Priority (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202111191056.6A | 2021-10-13 | 2021-10-13 | Template-based key value pair extraction method and system |
Publications (1)

| Publication Number | Publication Date |
|---|---|
| CN113869320A | 2021-12-31 |
Family
ID=78998912
Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202111191056.6A (pending) | Template-based key value pair extraction method and system | 2021-10-13 | 2021-10-13 |

Country Status (1)

| Country | Link |
|---|---|
| CN (1) | CN113869320A (en) |
Citations (8)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20180285676A1 * | 2015-09-11 | 2018-10-04 | Junyu Han | Method and apparatus for processing image information |
| WO2019238063A1 * | 2018-06-15 | 2019-12-19 | 众安信息技术服务有限公司 | Text detection and analysis method and apparatus, and device |
| CN111126125A * | 2019-10-15 | 2020-05-08 | 平安科技(深圳)有限公司 | Method, device and equipment for extracting target text in certificate and readable storage medium |
| CN111353492A * | 2020-03-12 | 2020-06-30 | 上海合合信息科技发展有限公司 | Image identification and information extraction method and device for standardized document |
| CN111783770A * | 2020-01-16 | 2020-10-16 | 北京沃东天骏信息技术有限公司 | Image rectification method, device and computer readable storage medium |
| US10896357B1 * | 2017-12-29 | 2021-01-19 | Automation Anywhere, Inc. | Automatic key/value pair extraction from document images using deep learning |
| WO2021057138A1 * | 2019-09-27 | 2021-04-01 | 支付宝(杭州)信息技术有限公司 | Certificate recognition method and apparatus |
| CN113269126A * | 2021-06-10 | 2021-08-17 | 上海云扩信息科技有限公司 | Key information extraction method based on coordinate transformation |
- 2021-10-13: application CN202111191056.6A filed in China (CN); published as CN113869320A; status: pending
Legal Events
| Code | Title |
|---|---|
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |