Disclosure of Invention
The invention aims to solve at least one technical problem in the prior art, and therefore provides a template-based key-value pair extraction method and system.
To achieve the above purpose, the invention adopts the following technical solutions:
according to an embodiment of the first aspect of the present invention, a template-based key-value pair extraction method includes: S10, receiving an image, finding text information in the image using the text detection algorithm DB (Differentiable Binarization), and forming bounding boxes around the text information; S20, finding keyword information in the image according to a pre-entered template, and calculating fixed anchor points from the keyword information and the bounding box coordinates; S30, calculating variable anchor points from the bounding box coordinates and the fixed anchor points; S40, performing projective transformation correction on the image according to the fixed anchor points, the variable anchor points, and the template; and S50, recognizing and extracting the text information in the corrected bounding boxes.
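The five steps above can be sketched as a minimal Python pipeline. All function bodies here are illustrative placeholders and not the patented implementation: a real S10 would invoke a DB detector and a real S50 would invoke an OCR engine.

```python
# Hypothetical step functions; all names and data shapes are illustrative only.
def s10_detect(image):
    """S10: DB text detection -> list of text boxes (stub result)."""
    return [{"text": "Name", "box": [(0, 0), (40, 0), (40, 10), (0, 10)]}]

def s20_fixed_anchors(boxes, template):
    """S20: keep boxes matching template keywords (anchor calc elided)."""
    return [b for b in boxes if b["text"] in template["keywords"]]

def s30_variable_anchors(boxes, fixed):
    """S30: placeholder; would derive extra anchors from box corners."""
    return fixed

def s40_rectify(image, fixed, variable, template):
    """S40: placeholder; would apply a projective (homography) warp."""
    return image

def s50_recognize(boxes):
    """S50: placeholder OCR; returns empty strings as values."""
    return {b["text"]: "" for b in boxes}

def extract_key_value_pairs(image, template):
    boxes = s10_detect(image)
    fixed = s20_fixed_anchors(boxes, template)
    variable = s30_variable_anchors(boxes, fixed)
    s40_rectify(image, fixed, variable, template)
    return s50_recognize(boxes)
```

The point of the skeleton is the data flow: detection output feeds anchor generation, anchors feed rectification, and recognition runs only on the corrected boxes.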
According to some embodiments of the present invention, the step S10 is preceded by a step S1, and the step S1 includes entering or deleting a template, wherein the template includes key layout and keyword information.
According to some embodiments of the invention, the step S20 further includes calculating coordinates of the anchor point according to a regular expression.
According to some embodiments of the invention, the step S20 further includes customizing the function of the anchor point, and optimizing the coordinates of the anchor point according to the coordinates of the bounding box.
According to some embodiments of the invention, the step S40 is further followed by a step S41, and the step S41 includes forming an enclosing bounding box around all the bounding boxes, and sequentially dividing the enclosing box into a plurality of regions from top to bottom according to the template, wherein each region includes at least one key-value pair.
According to an embodiment of the second aspect of the present invention, a template-based key-value pair extraction system includes: a positioning module for receiving an image, finding text information in the image using the text detection algorithm DB, and forming bounding boxes around the text information; a fixed anchor point generating module for finding keyword information in the image according to a pre-entered template and calculating fixed anchor points from the keyword information and the bounding box coordinates; a variable anchor point calculating module for calculating variable anchor points from the bounding box coordinates and the fixed anchor points; a correction module for performing projective transformation correction on the image according to the fixed anchor points, the variable anchor points, and the template; and an information extraction module for recognizing and extracting the text information in the corrected bounding boxes.
According to some embodiments of the invention, the system further comprises an entry module, wherein the entry module is used for entering or deleting a template, and the template comprises key layout and keyword information.
According to some embodiments of the present invention, the fixed anchor point generating module further calculates the coordinates of the anchor point according to a regular expression.
According to some embodiments of the invention, the anchor point generating module further comprises customizing a function of the anchor point to optimize coordinates of the anchor point according to coordinates of the bounding box.
According to some embodiments of the present invention, the system further comprises a key-value pair dividing module, wherein the key-value pair dividing module forms an enclosing bounding box around all the bounding boxes and sequentially divides it into a plurality of regions from top to bottom according to the template, each region comprising at least one key-value pair.
The template-based key-value pair extraction method and system provided by the embodiments of the invention have at least the following beneficial effects: the method is suitable for correcting and extracting various kinds of text information, improves the accuracy of image character recognition, and achieves accuracy and robustness superior to existing template-based key-value pair matching methods.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention are further described below with reference to the accompanying drawings.
The technical solutions in the embodiments of the present invention will be fully described below, and it should be apparent that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention provides a template-based key-value pair extraction method, as shown in fig. 1, which is a flow chart of the invention. According to some embodiments, the method includes: S10, receiving an image, finding text information in the image using the text detection algorithm DB, and forming bounding boxes around the text information; S20, finding keyword information in the image according to a pre-entered template, and calculating fixed anchor points from the keyword information and the bounding box coordinates; S30, calculating variable anchor points from the bounding box coordinates and the fixed anchor points; S40, performing projective transformation correction on the image according to the fixed anchor points, the variable anchor points, and the template; and S50, recognizing and extracting the text information in the corrected bounding boxes.
Based on the above embodiment, an image is received, text information is found in the image by the text detection algorithm DB, a bounding box is formed around the text information, and keyword information is then found in the text of the bounding box. The bounding box is rectangular; the coordinates of its four corner points are extracted, and the coordinate position of the fixed anchor point is calculated near the keyword information from these coordinates. As shown in fig. 2, the fixed anchor point is the black dot placed at the midpoint of the short side on the left of the keyword information; because it lies on the left, it is not affected by the length of the value text, such as a specific name or place name. The white dot on the right is the variable anchor point, positioned by calculation from the coordinates of the fixed anchor point and of the bounding box; adding it increases the number of anchor points and improves the accuracy of the projective transformation. The variable anchor point shifts its coordinates with the specific value information, such as the length of a specific name or place name. A projective transformation matrix is then calculated from the fixed and variable anchor points, and the bounding boxes of the original picture are mapped onto the template picture for correction. After the image is corrected, the text information in the bounding boxes is recognized; if the recognized text contains errors such as missing or misrecognized characters, it can be corrected manually, or automatically by means of regular expressions and previously collected text information.
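Assuming a corner order of top-left, top-right, bottom-right, bottom-left (an assumption for illustration; the text does not fix an order), the fixed anchor at the midpoint of the left short side and the variable anchor at the midpoint of the right short side can be computed as:

```python
def anchors(corners):
    """corners: [top-left, top-right, bottom-right, bottom-left] of a
    rectangular bounding box. Returns (fixed_anchor, variable_anchor):
    the midpoints of the left and right short sides respectively."""
    (tlx, tly), (trx, try_), (brx, bry), (blx, bly) = corners
    fixed = ((tlx + blx) / 2, (tly + bly) / 2)      # left short-side midpoint
    variable = ((trx + brx) / 2, (try_ + bry) / 2)  # right short-side midpoint
    return fixed, variable
```

For a box from (0, 0) to (100, 20), this places the fixed anchor at (0, 10) and the variable anchor at (100, 10); if the value text grows, only the right-hand (variable) anchor moves.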
According to some embodiments, the step S10 is preceded by a step S1, and the step S1 includes entering or deleting a template, where the template includes key layout and keyword information.
Based on the above embodiment, when unimportant text information exists in the image (for example, notes handwritten onto a copy), then after the copy is photographed by a high-speed document camera, the keyword information can be located more accurately according to the keyword information and the key layout, and the value information can be found relative to the keywords. The keyword information comprises terms such as "name", "gender", "birth", "year", "month", and "day", while the value information is the specific name, place name, and so on appearing before or after the keyword.
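Locating keyword boxes among the recognized text can be sketched with a simple regular expression built from the template's keyword list. The code below is illustrative only; the patent does not specify this implementation, and the keyword list is taken from the examples in the text.

```python
import re

# Keyword list taken from the examples in the text.
KEYWORDS = ["name", "gender", "birth", "year", "month", "day"]

def locate_keywords(recognized):
    """recognized: list of (text, box) pairs from detection/recognition.
    Returns the entries whose text contains a template keyword, so the
    surrounding value text (a specific name, place, etc.) can then be
    located relative to them."""
    pattern = re.compile("|".join(map(re.escape, KEYWORDS)), re.IGNORECASE)
    return [(text, box) for text, box in recognized if pattern.search(text)]
```

Matching against the keyword list rather than exact strings tolerates punctuation and minor OCR noise around the keyword.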
According to some embodiments, the step S20 further includes calculating coordinates of the anchor points according to a regular expression.
According to some embodiments, the step S20 further includes customizing the function of the anchor point, and optimizing the coordinates of the anchor point according to the coordinates of the bounding box.
Based on the above embodiment, the calculation of the fixed anchor coordinates further includes calculation according to a regular expression: the coordinates of fixed anchor points that have not been located are computed from the collected coordinates of a plurality of located fixed anchor points by means of the regular expression. The fixed anchor point function can also be customized by the user to calculate the anchor coordinates from the coordinates of the four corner points of the bounding box. Obtaining anchor coordinates in multiple ways makes the method more practical.
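One way such a user-customized anchor function might look is sketched below: the anchor is parameterized by bilinear coordinates (u, v) over the box's four corners, so any point of the box (corner, side midpoint, center) can be selected. The parameterization is an assumption for illustration; the patent does not define the function's form.

```python
def make_anchor_fn(u, v):
    """Return an anchor function placing the anchor at bilinear position
    (u, v) inside a quadrilateral, where (0, 0) is the top-left corner
    and (1, 1) the bottom-right. u and v are illustrative parameters."""
    def anchor(corners):
        # corners: [top-left, top-right, bottom-right, bottom-left]
        (x0, y0), (x1, y1), (x2, y2), (x3, y3) = corners
        top = (x0 + u * (x1 - x0), y0 + u * (y1 - y0))  # point on top edge
        bot = (x3 + u * (x2 - x3), y3 + u * (y2 - y3))  # point on bottom edge
        return (top[0] + v * (bot[0] - top[0]),
                top[1] + v * (bot[1] - top[1]))
    return anchor

left_mid = make_anchor_fn(0.0, 0.5)   # reproduces the left-side fixed anchor
```

Because the function receives all four corner coordinates, it also works on quadrilaterals that are not axis-aligned, which is what makes it useful before rectification.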
According to some embodiments, the step S40 is further followed by a step S41, and the step S41 includes forming an enclosing bounding box around all the bounding boxes, and sequentially dividing the enclosing box into a plurality of regions from top to bottom according to the template, wherein each region includes at least one key-value pair.
Based on the above embodiment, as shown in fig. 3, the enclosing bounding box is divided from top to bottom into a plurality of key-value pair regions; after recognition, the key-value pair of each region is extracted, and each region is treated as one row, which facilitates distinguishing them. The extraction result is as follows: the name keyword and the specific name form one row; the gender keyword and the specific gender, together with the ethnicity keyword and the specific ethnicity, form another row; and so on. The data are extracted into computer text for storage.
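The top-to-bottom division into rows can be sketched as follows, assuming each recognized box is reduced to a (text, vertical-center) pair; the `row_tolerance` threshold is an illustrative parameter, not from the patent.

```python
def group_into_rows(boxes, row_tolerance=10):
    """Group corrected boxes into rows (regions) from top to bottom.
    boxes: list of (text, y_center) pairs. Boxes whose vertical centers
    lie within row_tolerance pixels of the row's first box share a row."""
    rows = []  # each entry: [anchor_y, [texts...]]
    for text, yc in sorted(boxes, key=lambda b: b[1]):
        if rows and abs(yc - rows[-1][0]) <= row_tolerance:
            rows[-1][1].append(text)   # same region as previous box
        else:
            rows.append([yc, [text]])  # start a new region
    return [texts for _, texts in rows]
```

Because the image has already been rectified in S40, boxes belonging to one key-value pair share nearly the same vertical center, so a simple threshold suffices.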
On the basis of the above embodiments, the present embodiment provides a template-based key-value pair extraction system that performs bounding-box-based text correction. According to some embodiments, the system comprises: a positioning module for receiving an image, finding text information in the image using the text detection algorithm DB, and forming bounding boxes around the text information; a fixed anchor point generating module for finding keyword information in the image according to a pre-entered template and calculating fixed anchor points from the keyword information and the bounding box coordinates; a variable anchor point calculating module for calculating variable anchor points from the bounding box coordinates and the fixed anchor points; a correction module for performing projective transformation correction on the image according to the fixed anchor points, the variable anchor points, and the template; and an information extraction module for recognizing and extracting the text information in the corrected bounding boxes.
Based on the above embodiment, an image is received, text information is found in the image by the text detection algorithm DB, a bounding box is formed around the text information, and keyword information is then found in the text of the bounding box. The bounding box is rectangular; the coordinates of its four corner points are extracted, and the coordinate position of the fixed anchor point is calculated near the keyword information from these coordinates. As shown in fig. 2, the fixed anchor point is the black dot placed at the midpoint of the short side on the left of the keyword information; because it lies on the left, it is not affected by the length of the value text, such as a specific name or place name. The white dot on the right is the variable anchor point, positioned by calculation from the coordinates of the fixed anchor point and of the bounding box; adding it increases the number of anchor points and improves the accuracy of the projective transformation. The variable anchor point shifts its coordinates with the specific value information, such as the length of a specific name or place name. A projective transformation matrix is then calculated from the fixed and variable anchor points, and the bounding boxes of the original picture are mapped onto the template picture for correction. After the image is corrected, the text information in the bounding boxes is recognized; if the recognized text contains errors such as missing or misrecognized characters, it can be corrected manually, or automatically by means of regular expressions and previously collected text information.
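The projective correction amounts to estimating a homography from the anchor correspondences (image anchors to template anchors) and warping the image with it. A minimal NumPy sketch of the estimation step, using the standard direct linear transform (DLT), is shown below as an illustration; a production system would typically call `cv2.findHomography` instead.

```python
import numpy as np

def homography(src, dst):
    """Estimate the 3x3 projective transform H mapping src points to dst
    points (at least 4 correspondences) by solving the DLT system: for
    each pair, two linear equations in the 9 entries of H; the solution
    is the right singular vector of the stacked system."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.array(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]   # normalize so the bottom-right entry is 1
```

Once H is estimated from the fixed and variable anchors, applying it (for instance via `cv2.warpPerspective`) maps the original picture's bounding boxes onto the template picture, which is the correction performed in this embodiment. More anchors than four simply make the least-squares estimate more robust, which is why adding variable anchors improves accuracy.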
According to some embodiments, the system further comprises an entry module, wherein the entry module is used for entering or deleting a template, and the template comprises key layout and keyword information.
Based on the above embodiment, when unimportant text information exists in the image (for example, notes handwritten onto a copy), then after the copy is photographed by a high-speed document camera, the keyword information can be located more accurately according to the keyword information and the key layout, and the value information can be found relative to the keywords. The keyword information comprises terms such as "name", "gender", "birth", "year", "month", and "day", while the value information is the specific name, place name, and so on appearing before or after the keyword.
According to some embodiments, the fixed anchor point generating module further calculates the coordinates of the anchor point according to a regular expression.
According to some embodiments, the anchor point generation module further comprises customizing a function of the anchor point to optimize coordinates of the anchor point according to coordinates of the bounding box.
Based on the above embodiment, the calculation of the fixed anchor coordinates further includes calculation according to a regular expression, and the fixed anchor point function can also be customized by the user to calculate the anchor coordinates from the coordinates of the four corner points of the bounding box. Obtaining anchor coordinates in multiple ways makes the method more practical.
According to some embodiments, the system further comprises a key-value pair dividing module, wherein the key-value pair dividing module forms an enclosing bounding box around all the bounding boxes and sequentially divides it into a plurality of regions from top to bottom according to the template, each region comprising at least one key-value pair.
Based on the above embodiment, as shown in fig. 3, the enclosing bounding box is divided from top to bottom into a plurality of key-value pair regions; after recognition, the key-value pair of each region is extracted, and each region is treated as one row, which facilitates distinguishing them. The extraction result is as follows: the name keyword and the specific name form one row; the gender keyword and the specific gender, together with the ethnicity keyword and the specific ethnicity, form another row; and so on. The data are extracted into computer text for storage.
It will be evident to those skilled in the art that the invention is not limited to the exemplary embodiments described above and can be embodied in other specific forms without departing from its essential characteristics. Accordingly, the embodiments should be regarded as exemplary and non-limiting.