CN114463767A - Letter of credit identification method, device, computer equipment and storage medium - Google Patents

Letter of credit identification method, device, computer equipment and storage medium Download PDF

Info

Publication number
CN114463767A
Authority
CN
China
Prior art keywords
text
letter of credit
credit
field
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111632352.5A
Other languages
Chinese (zh)
Inventor
Wang Di
Li Jie
Wang Wei
Xu Min
Xiang Dong
Wang Hui
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Pudong Development Bank Co Ltd
Original Assignee
Shanghai Pudong Development Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Pudong Development Bank Co Ltd filed Critical Shanghai Pudong Development Bank Co Ltd
Priority to CN202111632352.5A priority Critical patent/CN114463767A/en
Publication of CN114463767A publication Critical patent/CN114463767A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Character Input (AREA)

Abstract

The application relates to a letter of credit identification method and apparatus, a computer device, and a storage medium. The method comprises the following steps: acquiring a letter of credit image to be identified; segmenting the letter of credit image to be identified to obtain a plurality of text regions; performing character recognition on the text regions to obtain recognition results for the text regions; and post-processing the recognition results of the text regions according to a preset rule to obtain structured target information. By adopting the method, the letter of credit image can be accurately identified.

Description

Letter of credit identification method, device, computer equipment and storage medium
Technical Field
The present application relates to the field of image recognition technologies, and in particular, to a method and an apparatus for identifying a letter of credit, a computer device, and a storage medium.
Background
Because the application scenario of the letter of credit document is special, the industry has few dedicated recognition engines for letters of credit. Existing letter of credit recognition usually relies on manual entry and manual checking, on a general character recognition engine such as the Tesseract engine, or on a general OCR recognition model based on deep learning.
However, a letter of credit contains complex information and dense text, requires judging whether check boxes are checked or blacked out, and appears in both table and non-table layouts. As a result, existing recognition schemes cannot accurately extract information from letters of credit with complex structures and cannot accurately locate the long text fields in a letter of credit.
Disclosure of Invention
In view of the above technical problems, it is necessary to provide a letter of credit identification method, apparatus, computer device, computer-readable storage medium and computer program product capable of accurately identifying a letter of credit image.
In a first aspect, the present application provides a method for identifying a letter of credit, the method comprising:
acquiring a letter of credit image to be identified;
segmenting the letter of credit image to be identified to obtain a plurality of text regions;
performing character recognition on the text area to obtain a recognition result of the text area;
and carrying out post-processing on the recognition result of the text region according to a preset rule to obtain structured target information.
In one embodiment, after acquiring the letter of credit image to be identified, the method further includes:
and carrying out angle correction on the letter of credit image to be identified.
In one embodiment, the angle rectification of the letter of credit image to be recognized comprises the following steps:
carrying out binarization processing on the letter of credit image to be identified to obtain a binarized image;
extracting all table contours in the binarized image;
processing all table contours to obtain a target contour of the table with the largest area;
calculating the included angle of adjacent straight lines in the target contour of the table with the largest area to obtain a rotation angle;
and carrying out angle correction on the letter of credit image to be identified according to the rotation angle.
In one embodiment, processing all table contours to obtain a target contour of a table with a largest area includes:
filtering according to the outline area to obtain a table with the minimum outline area as a current table;
acquiring a table to be processed adjacent to a current table;
and merging the table to be processed with the current table to form a new current table, continuing to acquire tables to be processed adjacent to the current table until the table with the largest area is obtained, and extracting the target contour of the table with the largest area.
In one embodiment, before performing text recognition on the text region, the method further comprises:
repairing the text region.
In one embodiment, repairing the text region includes:
acquiring a format corresponding to the letter of credit image to be identified;
and repairing the text area according to the format.
In one embodiment, repairing the text area according to the layout comprises:
when the letter of credit image to be identified is in a table format, extracting a boundary to be processed in a target direction in the text region;
acquiring a boundary to be processed closest to a text region of a target type in the text region as a target boundary;
the boundary of the text region of the target type is moved to the target boundary.
In one embodiment, repairing text regions according to different layouts includes:
when the letter of credit image to be identified is in a non-table format, acquiring coordinates of a text region of a target type in the text region;
segmenting the letter of credit image to be identified according to the coordinates and the width of the letter of credit image to be identified to obtain a text slice corresponding to the text region of the target type;
respectively carrying out corresponding image operation on different directions of the text slices to obtain content outlines;
and repairing the text area of the target type according to the abscissa of the content outline.
In one embodiment, the post-processing the recognition result of the text region according to the preset rule to obtain the structured target information includes:
extracting a candidate field set in the recognition result according to a preset layout structure;
matching according to a preset configuration rule corresponding to a preset format to obtain a field position and field content corresponding to each field in the candidate field set;
acquiring a field information candidate set according to the position relation between the field and the field information;
and filtering the field information candidate set to obtain target information.
In one embodiment, before extracting the candidate field set in the recognition result according to the preset layout structure, the method further includes:
acquiring a format distinguishing field;
and matching the recognition result with the format distinguishing field to determine the format of the letter of credit image to be recognized.
In one embodiment, before obtaining the field information candidate set according to the position relationship between the field and the field information, the method further includes:
and when the letter of credit images to be identified are a plurality of letter of credit images, merging the plurality of letter of credit images according to their order of appearance.
In one embodiment, the filtering the field information candidate set to obtain the target information further includes:
when the field information is the target field information, inputting the field information into the paragraph classification model; and extracting the target field information belonging to the same paragraph through the paragraph classification model.
In one embodiment, the paragraph classification model is obtained by pre-training, and the training process of the paragraph classification model includes:
acquiring a letter of credit paragraph text;
combining and marking the letter of credit paragraph texts according to a preset method to obtain letter of credit training data;
preprocessing the letter of credit training data to obtain preprocessed letter of credit training data;
inputting the preprocessed letter of credit training data into a first machine learning model for training to obtain the paragraph classification model.
In a second aspect, the present application further provides a letter of credit identification apparatus, comprising:
the acquisition module is used for acquiring a letter of credit image to be identified;
the text region detection module is used for segmenting the letter of credit image to be identified to obtain a plurality of text regions;
the recognition module is used for carrying out character recognition on the text area to obtain a recognition result of the text area;
and the post-processing module is used for post-processing the recognition result of the text region according to a preset rule to obtain the structured target information.
In a third aspect, the present application further provides a computer device, which includes a memory and a processor, where the memory stores a computer program, and the processor implements the method in any one of the above embodiments when executing the computer program.
In a fourth aspect, the present application further provides a computer-readable storage medium. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method of any one of the above-mentioned embodiments.
In a fifth aspect, the present application further provides a computer program product. Computer program product comprising a computer program which, when executed by a processor, carries out the method of any of the above embodiments.
According to the letter of credit identification method, apparatus, device, computer storage medium and program product, the letter of credit image to be identified is first obtained; the image is then segmented to obtain a plurality of text regions; character recognition is performed on the text regions to obtain recognition results; and finally the recognition results of the text regions are post-processed according to a preset rule to obtain structured target information. The purpose of accurately, efficiently and automatically extracting the structured target information in a letter of credit is thereby achieved, and privacy disclosure during manual entry is avoided.
Drawings
FIG. 1 is a diagram of an exemplary environment in which the letter of credit identification method may be implemented;
FIG. 2 is a flow diagram illustrating a letter of credit identification method in one embodiment;
FIG. 3 is a diagram illustrating an example of a letter of credit image to be recognized;
FIG. 4 is a diagram illustrating rectification of a letter of credit image according to one embodiment;
FIG. 5 is a schematic flow chart illustrating angle rectification of a letter of credit according to one embodiment;
FIG. 6 is a diagram illustrating inaccuracy in text region detection in one embodiment;
FIG. 7 is a diagram illustrating boundaries to be processed in the target direction according to one embodiment;
FIG. 8 is a diagram illustrating repair of text regions of a table-format letter of credit, according to one embodiment;
FIG. 9 is a diagram illustrating a process for repairing text regions of a table-format letter of credit, according to one embodiment;
FIG. 10 is a diagram illustrating vertical binarization in one embodiment;
FIG. 11 is a diagram that illustrates a text slice of a text region of a target type in one embodiment;
FIG. 12 is a schematic view of a lateral closing operation in one embodiment;
FIG. 13 is a flow diagram illustrating a process for repairing text regions of a non-table-format letter of credit, according to one embodiment;
FIG. 14 is a diagram illustrating extraction of a candidate field set from a recognition result according to one embodiment;
FIG. 15 is a diagram illustrating a rule configuration for fields in one embodiment;
FIG. 16 is a diagram illustrating the collection of letter of credit corpora in one embodiment;
FIG. 17 is a diagram of letter of credit training set data in one embodiment;
FIG. 18 is a diagram illustrating training data preprocessing in one embodiment;
FIG. 19 is a diagram illustrating model training of a paragraph classification model according to one embodiment;
FIG. 20 is a flow diagram illustrating a letter of credit identification method according to one embodiment;
FIG. 21 is a schematic post-processing flow diagram in one embodiment;
FIG. 22 is a block diagram showing the structure of a letter of credit identification apparatus according to one embodiment;
FIG. 23 is a diagram illustrating the internal structure of a computer device according to one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more clearly understood, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The letter of credit identification method provided by the embodiments of the application can be applied to the application environment shown in fig. 1, wherein the terminal 102 communicates with the server 104 via a network. The data storage system may store data that the server 104 needs to process; it may be integrated on the server 104, or located on the cloud or another network server. The server acquires a letter of credit image to be identified; segments the image to obtain a plurality of text regions; performs character recognition on the segmented text regions to obtain recognition results; and finally post-processes the recognition results of the text regions according to a preset rule to obtain structured target information. The terminal 102 may be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, internet of things devices and portable wearable devices; the internet of things devices may be smart speakers, smart televisions, smart air conditioners, smart vehicle-mounted devices, and the like, and the portable wearable devices may be smart watches, smart bracelets, head-mounted devices, and the like. The server 104 may be implemented as a stand-alone server or as a server cluster comprising multiple servers.
In one embodiment, as shown in fig. 2, a letter of credit identification method is provided. The method is described below as applied to the server 104 in fig. 1, and includes the following steps:
s202, acquiring a credit card image to be identified.
The credit card image to be recognized is a credit card image which needs character recognition and can be acquired by scanning, photographing and the like and uploaded to the server. The letter of credit is a special bill, which includes two long fields, description _ goods and document _ required, and is specifically shown in fig. 3, where fig. 3 is a schematic diagram of an embodiment of an image of a letter of credit to be recognized, and the long field, document _ required, in the diagram includes a plurality of lines of text and includes a plurality of check boxes. Optionally, the server can select part or all of the to-be-identified bill images from the terminal as required and upload the to-be-identified bill images to the server, and thus the server can identify the to-be-identified bill images according to the instruction, so that the workload of manual entry and checking can be reduced, the efficiency is improved, and meanwhile, the data security is favorably ensured.
S204, segmenting the letter of credit image to be recognized to obtain a plurality of text regions.
The text region refers to the region corresponding to a text detection box in the letter of credit image to be recognized; specifically, as shown in fig. 3, the box around each line of text in the drawing is a text region.
Specifically, after the server acquires the letter of credit image to be recognized, the server detects text regions in the image and segments the detected regions to obtain a plurality of text regions.
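The application does not name a concrete text detection model, so purely as an illustration, the sketch below splits a binarized page into line-level text bands using a simple row-projection profile; `split_text_lines` and its parameters are hypothetical names, not part of the patent.

```python
def split_text_lines(binary, min_height=2):
    """Split a binarized image (rows of 0/1 values, text=1) into
    horizontal text bands using a row-projection profile."""
    profile = [sum(row) for row in binary]   # amount of "ink" per row
    regions, start = [], None
    for y, ink in enumerate(profile):
        if ink > 0 and start is None:
            start = y                        # entering a text band
        elif ink == 0 and start is not None:
            if y - start >= min_height:      # ignore speckle noise
                regions.append((start, y))
            start = None
    if start is not None:                    # band touching the bottom edge
        regions.append((start, len(profile)))
    return regions
```

Each returned `(top, bottom)` pair delimits one candidate text region to pass on to recognition.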
And S206, performing character recognition on the text area to obtain a recognition result of the text area.
The recognition result refers to the text content obtained by performing character recognition on the text region; for example, if a text region contains the words "letter of credit", the recognition result of that text region is "letter of credit".
Specifically, after obtaining a plurality of text regions, the server identifies the text regions to obtain an identification result of the text regions. In other embodiments, the server may input the text region into a pre-trained character recognition model, and the recognition result corresponding to the text region may be obtained by recognizing the text region through the character recognition model.
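The recognition engine itself is left open by the application (a pre-trained character recognition model is one option), so the step can be sketched as a thin wrapper that applies any recognizer callable to each text band; `recognize_regions` is a hypothetical helper, not the patent's API.

```python
def recognize_regions(image, regions, recognizer):
    """Apply a character-recognition engine to each detected text region.

    `image` is indexable by row (a list of rows or similar);
    `regions` holds (top, bottom) row ranges from text detection;
    `recognizer` is any callable mapping an image crop to a string,
    e.g. a deep-learning OCR model or a general engine like Tesseract.
    """
    results = []
    for top, bottom in regions:
        crop = image[top:bottom]     # slice out one text band
        results.append(recognizer(crop))
    return results
```

Swapping recognizers only requires passing a different callable, which keeps the detection and recognition stages decoupled.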
And S208, post-processing the recognition result of the text region according to a preset rule to obtain structured target information.
The preset rule is a rule, set in advance according to the specific usage scenario, for post-processing the recognition results of the text regions; structured target information is obtained by post-processing the recognition results according to the preset rule. The target information is the information to be extracted, for example the content corresponding to the document_required field in the letter of credit image to be recognized. Structured target information refers to target information conforming to a preset rule format, such as "name: Zhang San".
Specifically, after obtaining the recognition result of each text region, the server post-processes the results according to the preset rule. For example, the post-processing may merge lines of text belonging to the same content in the letter of credit, that is, extract a long field of the letter of credit as a single paragraph, and finally output the structured target information. In other embodiments, the post-processing may also include verification of the structured target information, filtering out unnecessary information before the structured target information is finally output.
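A minimal sketch of rule-based post-processing, assuming hypothetical field labels and regular expressions; the actual preset rules and field names are defined per layout in the application and are not reproduced here.

```python
import re

# Hypothetical field rules: field name -> regex anchored on its label.
# Real rules would come from the preset per-layout configuration.
FIELD_RULES = {
    "lc_number": re.compile(r"Documentary Credit Number[:\s]+(\S+)"),
    "applicant": re.compile(r"Applicant[:\s]+(.+)"),
}

def post_process(lines, rules=FIELD_RULES):
    """Match each recognised line against the configured rules and
    emit structured {field: value} target information."""
    info = {}
    for line in lines:
        for field, pattern in rules.items():
            m = pattern.search(line)
            if m and field not in info:      # keep first occurrence
                info[field] = m.group(1).strip()
    return info
```

Filtering a candidate set down to target information then amounts to keeping only the fields that matched a rule.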
According to the letter of credit identification method, the letter of credit image to be identified is first obtained; the image is then segmented to obtain a plurality of text regions; character recognition is performed on the text regions to obtain recognition results; and finally the recognition results of the text regions are post-processed according to a preset rule to obtain structured target information. The purpose of accurately, efficiently and automatically extracting the structured target information in a letter of credit is thereby achieved, and privacy disclosure during manual entry is avoided.
In one embodiment, after acquiring the letter of credit image to be identified, the method further comprises: carrying out angle correction on the letter of credit image to be identified.
The angle correction rotates a skewed letter of credit image to be identified so that it meets the standard for processing such images.
Specifically, after the letter of credit image to be recognized is acquired, image rectification is further performed so that the rotation angle of the image meets the processing standard. Optionally, the angle rectification may be performed using an affine transformation based on the corner coordinates of the largest table in the image. Specifically, referring to fig. 4, fig. 4 is a schematic diagram of rectification of a letter of credit image in one embodiment; the left side of the figure shows the letter of credit image before angle rectification, and the right side shows the image after angle rectification.
In this embodiment, angle correction of the letter of credit image to be recognized allows subsequent processing of the image to be performed more effectively.
In one embodiment, the angle rectification of the letter of credit image to be recognized comprises the following steps: carrying out binarization processing on the letter of credit image to be identified to obtain a binarized image; extracting all table contours in the binarized image; processing all table contours to obtain the target contour of the table with the largest area; calculating the included angle of adjacent straight lines in the target contour of the table with the largest area to obtain a rotation angle; and carrying out angle correction on the letter of credit image to be identified according to the rotation angle.
Binarization sets the gray value of each pixel to 0 or 255, so that the whole image presents a clear black-and-white effect; the binarized image is the image obtained after binarization of the letter of credit image to be identified. The target contour refers to the table contour obtained by processing the extracted table contours, for example the contour obtained by merging adjacent table frames.
Specifically, the binarized image corresponding to the letter of credit image to be identified is first obtained through binarization, which facilitates the extraction of table contours. All table contours in the binarized image are then extracted. After the contours are obtained, their areas are computed; optionally, the contour area of a table can be calculated from its contour, for example using the length and width of the table contour. Further, all the contours are processed to obtain the target contour of the table with the largest area, where optionally the target contour can be obtained by merging adjacent tables. The included angle of adjacent straight lines in the target contour of the largest table is then calculated to obtain the rotation angle; optionally, this angle can be calculated trigonometrically from the lines in the target contour. Finally, angle correction is carried out on the letter of credit image to be identified according to the calculated rotation angle.
In this embodiment, after the target contour of the largest table is obtained by processing all table contours, the included angle between adjacent straight lines in that contour is calculated to obtain the rotation angle, and angle correction is finally performed on the letter of credit image to be identified according to the rotation angle. A skewed letter of credit image from a real scene can thus be corrected, avoiding inaccurate regression in the subsequent processing of the image.
In one embodiment, processing all table contours to obtain a target contour of a table with a largest area includes: filtering according to the outline area to obtain a table with the minimum outline area as a current table; acquiring a table to be processed adjacent to a current table; and merging the table to be processed and the current table to be used as a new current table, continuously acquiring the table to be processed adjacent to the current table until the table with the largest area is acquired, and extracting the target contour of the table with the largest area.
The contour area is the area calculated from a table contour. Specifically, after all table contours in the binarized image are obtained, the contour area is calculated from each contour; optionally, the contour area of each table can be calculated from the length and width of its contour. The current table refers to the table with the smallest contour area; the table to be processed is a table adjacent to the current table that needs to be merged.
Specifically, after all table contours in the binarized image are obtained, the contour areas are calculated and filtered, and the table with the smallest contour area is taken as the current table. A table to be processed adjacent to the current table is then obtained; optionally, the contours can be filtered by parallelism against the upper, lower, left and right table lines to find the tables adjacent to the current table. The current table and the table to be processed are merged into a new current table, and tables to be processed adjacent to the current table are repeatedly acquired and merged until the table with the largest area is obtained; finally the target contour of that largest table is extracted.
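The merge-adjacent-cells procedure above can be sketched on axis-aligned bounding boxes; the adjacency test here (approximately shared edge within a tolerance) is a simplification of the parallelism-based filtering described in the application, and all function names are hypothetical.

```python
def union(a, b):
    """Union of two (x0, y0, x1, y1) boxes."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))

def adjacent(a, b, tol=2):
    """True if two boxes approximately share a vertical or horizontal edge."""
    horiz = abs(a[2] - b[0]) <= tol or abs(b[2] - a[0]) <= tol
    vert = abs(a[3] - b[1]) <= tol or abs(b[3] - a[1]) <= tol
    overlap_y = a[1] <= b[3] and b[1] <= a[3]
    overlap_x = a[0] <= b[2] and b[0] <= a[2]
    return (horiz and overlap_y) or (vert and overlap_x)

def grow_largest_table(boxes):
    """Start from the smallest cell and repeatedly absorb adjacent
    cells until no neighbour remains; returns the merged box."""
    boxes = sorted(boxes, key=lambda b: (b[2] - b[0]) * (b[3] - b[1]))
    current, rest = boxes[0], boxes[1:]
    changed = True
    while changed:
        changed = False
        for b in list(rest):
            if adjacent(current, b):
                current = union(current, b)
                rest.remove(b)
                changed = True
    return current
```

Cells belonging to the same table grid chain together through shared edges, while unrelated boxes elsewhere on the page are left out of the merge.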
In other embodiments, referring specifically to fig. 5, which is a schematic flow chart of letter of credit angle correction in one embodiment: first, the letter of credit image to be recognized is input and binarized to obtain a binarized image; the binarized image is filtered to obtain all table contours, contour areas are calculated from them, and filtering by contour area yields the contour with the smallest area; filtering by parallelism against the upper, lower, left and right table lines then yields the current table and an adjacent table to be processed, which are merged into a new current table; filtering and merging continue until the table with the largest area is obtained; finally, the rotation angle is calculated from the included angle of that table's adjacent straight lines, and the letter of credit image to be identified is angle-corrected according to the rotation angle.
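The final step, deriving a rotation angle from the largest table's contour, can be illustrated with plain trigonometry. This sketch takes the angle of the contour's longest edge relative to the horizontal as the skew estimate, which is one possible reading of the angle computation; `rotation_angle` is a hypothetical helper.

```python
import math

def rotation_angle(contour):
    """Estimate the skew angle (degrees) of a table contour as the
    angle of its longest edge relative to the horizontal axis.
    `contour` is a list of (x, y) corner points in order."""
    pts = list(contour) + [contour[0]]    # close the polygon
    best_len, angle = 0.0, 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        length = math.hypot(x1 - x0, y1 - y0)
        if length > best_len:             # keep the longest edge's angle
            best_len = length
            angle = math.degrees(math.atan2(y1 - y0, x1 - x0))
    return angle
```

Rotating the image by the negative of this angle (e.g. with a standard rotation transform) would then level the table.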
In the above embodiment, the rotation angle is calculated from the included angle between adjacent straight lines of the largest table, so that the rotation angle of the letter of credit image to be processed can be obtained and the image can be angle-corrected.
In one embodiment, before performing text recognition on the text region, the method further comprises repairing the text region.
Repairing refers to adjusting the width or height of a text region so that it covers the complete content to be recognized, avoiding extraction errors in subsequent text recognition of the region.
Specifically, because a letter of credit image contains many long and dense text lines, detection of text regions in the image to be recognized may be inaccurate: for example, the head or tail of a text line may be missed or a line may be incorrectly detected, causing errors in subsequent recognition of the region. As shown in fig. 6, which illustrates inaccurate text region detection in one embodiment, the head and tail of a text line are missed at the places indicated by the arrows. The text region therefore needs to be repaired before recognition so that the subsequent recognition result is more accurate. Optionally, different repair modes can be adopted for table and non-table letters of credit to ensure that the text region repair is more accurate.
In the above embodiment, the result of the subsequent character recognition on the text region can be more accurate by repairing the text region.
In one embodiment, repairing the text region includes: acquiring the format corresponding to the letter of credit image to be identified; and repairing the text region according to the format.
The format is the layout according to which the letter of credit image to be recognized was generated; every letter of credit is generated according to a corresponding format, which can be a table format or a non-table format.
Specifically, the server first obtains the format corresponding to the letter of credit image to be recognized and repairs the image in different ways according to that format: for example, a table-format letter of credit image can be repaired according to the table line boundaries, and a non-table-format image can be repaired by a contour-searching method based on text lines. The server judges the format of the letter of credit image before repairing it, so that the text regions to be repaired are handled with the repair mode corresponding to that format. Optionally, the format can be distinguished from within the image according to the long fields and whether a table is present. In other embodiments, whether the image is a table-format letter of credit can be judged by whether it contains a large number of table boundary lines; otherwise it is a non-table-format letter of credit.
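The table/non-table decision described above (a large number of table boundary lines implies a table format) can be sketched as a heuristic over a binarized image; the thresholds below are illustrative assumptions, not values from the application.

```python
def longest_run(bits):
    """Length of the longest run of 1s in a sequence."""
    best = cur = 0
    for b in bits:
        cur = cur + 1 if b else 0
        best = max(best, cur)
    return best

def is_table_layout(binary, line_ratio=0.8, min_lines=3):
    """Heuristic format check: a binarized image containing several
    near-full-width horizontal rules is treated as a table layout."""
    width = len(binary[0])
    rules = sum(1 for row in binary if longest_run(row) >= line_ratio * width)
    return rules >= min_lines
```

A symmetric check over columns would count vertical rules as well; either result then selects the table or non-table repair path.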
In the above embodiment, the text region is repaired in the manner corresponding to the format of the letter of credit image to be recognized, so that the text region is repaired accurately.
In one embodiment, repairing the text region according to the format includes: when the letter of credit image to be recognized is in a table format, extracting the boundaries to be processed in a target direction in the text region; acquiring the boundary to be processed closest to a text region of a target type as a target boundary; and moving the boundary of the text region of the target type to the target boundary.
In this embodiment, the vertical direction is selected as the target direction: because many long and dense text lines exist in the letter of credit image to be recognized, the head and tail of a line may be missed, and taking the vertical direction as the target direction avoids this. In other embodiments, the target direction may instead be the horizontal direction, or both the horizontal and vertical directions, according to the actual situation.
Here, a boundary to be processed refers to any table frame line in the letter of credit image to be processed. A text region of the target type is a text region meeting a preset requirement, for example a text region whose width is greater than a preset threshold, that is, a long text region. The target boundary is the boundary, meeting a preset requirement, to which the boundary line of the target-type region is moved, for example the nearest vertical table line of a long field.
Specifically, when the letter of credit image to be recognized is in a table format, the boundaries to be processed in the target direction in the text region are extracted; as shown in fig. 7, which is a schematic diagram of the boundaries to be processed in the target direction in one embodiment, the continuous darker vertical straight lines in fig. 7 are the boundaries to be processed. Before the boundary to be processed closest to a text region of the target type is taken as the target boundary, the text region of the target type is determined: optionally, a threshold may be preset, and if the width of the current text region is greater than the preset threshold, the current text region is judged to be of the target type. Then the closest boundary to be processed of the target-type text region is taken as the target boundary, and the boundary of the target-type text region is moved to it, as shown in fig. 8, a schematic diagram of repairing a table-format letter of credit text region in one embodiment. In other embodiments, the boundaries to be processed in the target direction may be extracted after binarization of the letter of credit image to be recognized.
In other embodiments, referring to fig. 9, which is a schematic diagram of the process of repairing a table-format letter of credit text region in one embodiment, the letter of credit image to be recognized is first binarized, and then dilation and erosion operations are performed with a vertical-direction kernel structure to obtain a vertical-direction binarized image, as shown in fig. 10, a schematic diagram of vertical-direction binarization in one embodiment. Whether a text region is a long text region is then judged by its width; optionally, a threshold may be preset, and if the width is greater than the threshold, the region is judged to be a long text region. Finally, the coordinates at the two ends of the long field are repaired to the positions of the nearest vertical table line boundaries of the long-field text region.
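The boundary-moving step can be sketched as follows, assuming axis-aligned boxes (x0, y0, x1, y1) and a precomputed list of x-coordinates of the detected vertical table lines; the width threshold and the "nearest line" choice at each end are illustrative, not the scheme's exact rule:

```python
def repair_long_field_box(box, line_xs, width_threshold):
    """Snap the left/right ends of a long-field text box to the nearest
    vertical table lines; boxes narrower than width_threshold are not
    target-type regions and are returned unchanged."""
    x0, y0, x1, y1 = box
    if x1 - x0 <= width_threshold or not line_xs:
        return box
    left = min(line_xs, key=lambda x: abs(x - x0))   # nearest line to left end
    right = min(line_xs, key=lambda x: abs(x - x1))  # nearest line to right end
    return (left, y0, right, y1)
```

For example, a box whose detected ends fall just short of the table rulings is pulled out to them, while a short field is left alone.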
In the above embodiment, the boundary of the text region of the target type is moved to the target boundary, thereby repairing the table-format letter of credit image to be recognized; this avoids missing the head and tail of a text line or recognizing a text line incorrectly, so that the text region can be recognized accurately afterwards.
In one embodiment, repairing the text region according to the format includes: when the letter of credit image to be recognized is in a non-table format, acquiring the coordinates of the text region of the target type; segmenting the letter of credit image to be recognized according to those coordinates and the width of the image to obtain a text slice corresponding to the target-type text region; performing corresponding image operations on the text slice in different directions to obtain the content contour; and repairing the target-type text region according to the abscissa of the content contour.
Specifically, when the letter of credit image to be recognized is a non-table-format letter of credit, the coordinates of the text region of the target type are first acquired; the judgment of the target-type text region can refer to the specific implementation in the above embodiment and is not repeated here. The letter of credit image to be recognized is then segmented according to those coordinates and the width of the image, to obtain a text slice corresponding to the target-type text region, as shown in fig. 11, a schematic diagram of a text slice of a target-type text region in one embodiment. Corresponding image operations are then performed on the text slice in different directions; optionally, a vertical-direction opening operation and a horizontal-direction closing operation may be performed on the slice to obtain the content contour in it. In other embodiments, the text slice may be binarized before the image operations are performed in the different directions. Referring to fig. 12, a schematic diagram of the horizontal closing operation in one embodiment, the white area is the content contour. Finally, the target-type text region is repaired according to the abscissa of the content contour, for example according to its minimum and maximum horizontal coordinates.
In other embodiments, referring to fig. 13, which is a schematic diagram of the process of repairing a non-table-format letter of credit text region in one embodiment: first, the coordinate position of the target-type text region is obtained and recorded as (location[0], location[1], location[2], location[3]), and slice data is cut out according to the width of the original picture and the height of the target-type text region. A vertical-direction opening operation and a horizontal-direction closing operation are then performed on the slice: the vertical opening operation handles the situation where the text is connected with vertical table lines and removes those lines; the horizontal closing operation, performed with a large horizontal-direction kernel structure, connects the content of the text region into blocks for contour detection, making it easy to obtain the text line of the region, as shown in fig. 12. After the content contour is obtained, the minimum and maximum horizontal coordinates are found as the boundary of the long text, and the region is finally corrected to (min_x, location[1], max_x, location[2]) according to the computed minimum and maximum horizontal coordinates of the contour.
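The final correction step, finding the min/max horizontal coordinates of the content contour, can be sketched as follows; the coordinate convention for location (x0, y0, x1, y1) is an assumption, and the binary slice stands in for the output of the opening/closing operations above:

```python
def repair_by_content_contour(slice_binary, location):
    """slice_binary: 2D list of 0/1 pixels of the text slice after the
    vertical opening / horizontal closing operations. location: the
    original box of the long-field text region (convention assumed).
    Returns the box corrected to the min/max foreground x coordinates."""
    xs = [x for row in slice_binary for x, v in enumerate(row) if v]
    if not xs:
        return tuple(location)  # nothing detected: leave the box as-is
    return (min(xs), location[1], max(xs), location[3])
```

The vertical coordinates are kept from the original detection; only the horizontal extent is corrected, matching the "min_x / max_x as the boundary of the long text" rule.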
In the above embodiment, the content contour in the text slice is obtained by performing image operations in different directions on the text slice corresponding to the target type, and the target-type text region is then repaired according to the abscissa of the content contour, so that the text region can be recognized accurately afterwards.
In one embodiment, post-processing the recognition result of the text region according to the preset rule to obtain the structured target information includes: extracting a candidate field set from the recognition result according to a preset layout structure; matching according to a preset configuration rule corresponding to the preset format to obtain the field position and field content corresponding to each field in the candidate field set; acquiring a field information candidate set according to the position relation between fields and field information; and filtering the field information candidate set to obtain the target information.
The preset format refers to a preset letter of credit format, which may be a table format or a non-table format; the candidate field set is the set of all fields in the letter of credit image to be recognized; the preset configuration rule is a preset field matching rule through which the field position and field content corresponding to each field in the candidate field set can be obtained. Field information is the content corresponding to a field; for example, in "name: Zhang San", the field is "name" and the field information is "Zhang San". The field information candidate set is the set of candidate field information corresponding to a certain field in the letter of credit image to be recognized, and the target information is the piece of field information that actually needs to be extracted from the candidates.
Specifically, the server first extracts the candidate field set from the recognition result according to the preset layout structure. For example, in most table-format letters of credit the field is on the left and the corresponding field information on the right, so the candidate field set can be extracted according to the left-right parallel relationship. Optionally, before the candidate field set is extracted according to the preset layout structure, the recognition result is matched against the format-distinguishing field to determine the format of the letter of credit image to be recognized, and the candidate field set is then extracted. Referring to fig. 14, a schematic diagram of extracting the candidate field set from the recognition result in one embodiment: since the letter of credit in fig. 14 is a table-format letter of credit, the candidate field set can be extracted according to the left-right parallel relationship.
Specifically, after the candidate field set is obtained, matching is performed according to the preset configuration rule corresponding to the preset format to obtain the field position and field content corresponding to each field in the candidate field set; for example, if the letter of credit image to be recognized is in a table format, the preset configuration rule corresponding to the table format is used for matching, and otherwise the one corresponding to the non-table format. In one embodiment, the preset configuration rule may be a regular search based on the field configuration, performed according to the scores corresponding to the rule contents, where the rule contents are all the fields used in the letter of credit and their scores are preset. Optionally, the rule content with the highest score is matched preferentially: when traversing the rule contents, the highest-scoring one is tried first, and only if it does not match a field in the candidate field set are the lower-scoring rule contents tried, as shown in fig. 15, a schematic diagram of the rule configuration of a field in one embodiment. In other embodiments, the preset configuration rule may be a matching method based on the full field name over the full text: if the regular search based on the field configuration does not match the corresponding field, a text similarity search is performed based on the full field name, and the most likely field is taken as the reference.
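The score-ordered regular search can be sketched as follows; the rule patterns and scores here are hypothetical placeholders, not the actual configuration of the scheme:

```python
import re

# Hypothetical rule configuration for one field: (regex pattern, score).
DATE_OF_ISSUE_RULES = [
    (r"ISSUE\s*DATE", 5),
    (r"DATE\s+OF\s+ISSUE", 10),
]

def match_field(line, rules):
    """Try the rules from highest score down; return (span, score) of the
    first hit, or (None, 0) if no rule matches the text line."""
    for pattern, score in sorted(rules, key=lambda r: -r[1]):
        m = re.search(pattern, line, re.IGNORECASE)
        if m:
            return m.span(), score
    return None, 0
```

Because rules are tried in descending score order, a hit on the highest-scoring rule short-circuits the search, matching the "highest score matched preferentially" behavior described above.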
Specifically, after the field position and field content are obtained, the field information candidate set corresponding to a field can be obtained through the position relation between the field and its field information. The distribution of fields and field information in a letter of credit can be divided into an up-down structure, a left-right structure, and neither; accordingly, the field information candidate set can be searched for by vertical search, horizontal search, or value-content search based on the candidate field position information.
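The vertical and horizontal searches over the position relation can be sketched as follows, assuming axis-aligned boxes (x0, y0, x1, y1); the overlap test and tie-breaking are illustrative choices, not the scheme's exact rules:

```python
def find_field_info(field_box, candidates, structure="left-right"):
    """For a left-right structure, pick the nearest candidate box to the
    right of the field that overlaps it vertically (same row); for an
    up-down structure, the nearest candidate below that overlaps it
    horizontally (same column). Returns None if nothing qualifies."""
    fx0, fy0, fx1, fy1 = field_box
    def v_overlap(b):
        return min(fy1, b[3]) > max(fy0, b[1])
    def h_overlap(b):
        return min(fx1, b[2]) > max(fx0, b[0])
    if structure == "left-right":
        hits = [b for b in candidates if v_overlap(b) and b[0] >= fx1]
        return min(hits, key=lambda b: b[0]) if hits else None
    hits = [b for b in candidates if h_overlap(b) and b[1] >= fy1]
    return min(hits, key=lambda b: b[1]) if hits else None
```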
Specifically, after the field information candidate set is obtained, it is filtered to obtain the target information. Optionally, the filtering may be rule-based: fields such as date and amount follow fixed rules, so filtering can be performed according to the rules of those fields. In one embodiment, filtering may also be performed on field values: the field information candidate set may erroneously contain other field keywords, which obviously should not be extracted, so it can be checked whether such fields exist in the candidate set and, if so, they are filtered out. Optionally, before filtering the field information candidate set, the long fields belonging to the same content in it may be extracted as paragraphs.
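Both filtering steps can be sketched as follows; the field patterns and keyword list are hypothetical examples, not the actual configuration:

```python
import re

# Hypothetical fixed-format rules for some fields, and examples of other
# field keywords that should never survive as field information.
FIELD_PATTERNS = {
    "date": re.compile(r"^\d{6}$|^\d{4}[-/.]\d{2}[-/.]\d{2}$"),
    "amount": re.compile(r"^[A-Z]{3}\s?[\d,]+(\.\d+)?$"),
}
OTHER_FIELD_KEYWORDS = {"DESCRIPTION OF GOODS", "DOCUMENTS REQUIRED"}

def filter_candidates(field, candidates):
    """Drop candidates that are stray field keywords, then drop those
    violating the fixed rule of this field (if one is configured)."""
    out = []
    for cand in candidates:
        text = cand.strip()
        if text.upper() in OTHER_FIELD_KEYWORDS:
            continue  # another field's keyword, not field information
        pattern = FIELD_PATTERNS.get(field)
        if pattern and not pattern.match(text):
            continue  # violates the fixed rule of this field
        out.append(text)
    return out
```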
In the above embodiment, the candidate field set is extracted from the recognition result through the preset layout structure, matching is performed according to the preset configuration rule corresponding to the preset format to obtain the field position and field content of each field, the field information candidate set is then obtained through the position relation between fields and field information, and finally the candidate set is filtered to obtain the target information. This realizes accurate extraction of the target information and handles problems such as the multiple formats of letters of credit and paragraph distinction.
In one embodiment, before extracting the candidate field set from the recognition result according to the preset layout structure, the method further includes: acquiring a format-distinguishing field; and matching the recognition result with the format-distinguishing field to determine the format of the letter of credit image to be recognized.
The format-distinguishing field is used to distinguish table-format letters of credit from non-table-format ones. Specifically, the format-distinguishing field is acquired first, and the recognition result obtained by character recognition of the text region is matched against it to determine the format of the letter of credit image to be recognized. For example, a field that does not appear in non-table-format letters of credit can serve as the distinguishing field: if the recognition result contains that field, the letter of credit image to be recognized is judged to be a table-format letter of credit, and otherwise a non-table-format one.
In the above embodiment, the format of the letter of credit image to be recognized can be determined by matching the format-distinguishing field, so that the corresponding preset configuration rule can be selected according to the format and matched to obtain the field position and field content corresponding to each field in the candidate field set.
In one embodiment, before obtaining the field information candidate set according to the position relation between fields and field information, the method further includes: when the letter of credit image to be recognized comprises a plurality of letter of credit images, merging the plurality of letter of credit images according to their order of appearance.
Specifically, a letter of credit image to be recognized, particularly one in a non-table format, may consist of a plurality of pages. When the letter of credit image to be recognized comprises a plurality of images, that is, a plurality of pages, the images need to be merged according to their order of appearance before the field information candidate set is obtained through the position relation between fields and field information.
In the above embodiment, structured extraction across pages is achieved by merging the plurality of letter of credit images.
In one embodiment, filtering the field information candidate set to obtain the target information further includes: when the field information is target field information, inputting the field information into the paragraph classification model; and extracting the target field information belonging to the same paragraph through the paragraph classification model.
The target field information is the field information in the candidate set that meets a preset condition; for example, when the preset condition is whether the information corresponds to fields such as description_goods and document_required, the target field information is a long field.
Specifically, when the field information is target field information, it is input into the paragraph classification model, which is a pre-trained machine learning model that can extract line-wrapped text belonging to the same content as one paragraph; through the paragraph classification model, the target field information in the field information candidate set can be extracted as paragraphs.
In the above embodiment, the target field information in the field information candidate set can be extracted as paragraphs by the paragraph classification model, solving the problem that paragraph distinction cannot be handled in the related art.
In one embodiment, the paragraph classification model is obtained by pre-training, and its training process includes: acquiring letter of credit paragraph texts; combining and labeling the paragraph texts according to a preset method to obtain letter of credit training data; preprocessing and expanding the letter of credit training data; and inputting the preprocessed training data into a first machine learning model for training to obtain the paragraph classification model.
The letter of credit paragraph texts are collected, on the basis of the corpora of the two long fields of real letters of credit, as words belonging to paragraphs, with a line feed character distinguishing whether text belongs to the same paragraph, as shown in fig. 16, a schematic diagram of collecting the letter of credit corpus in one embodiment.
Specifically, the letter of credit paragraph texts are acquired first and then combined according to a preset method; optionally, the paragraph texts may be combined randomly and all of them labeled, for example a pair of lines belonging to the same paragraph is labeled 1 and otherwise 0, to obtain the letter of credit training data, as shown in fig. 17, a schematic diagram of letter of credit training set data in one embodiment.
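The pairing-and-labeling step can be sketched as follows; the choice of which cross-paragraph pairs to generate as negatives is an assumption (the text only says combinations are random):

```python
import itertools

def build_training_pairs(paragraphs):
    """paragraphs: list of paragraphs, each a list of text lines.
    Adjacent lines inside one paragraph form positive pairs (label 1);
    line pairs drawn across two different paragraphs form negative
    pairs (label 0)."""
    pairs = []
    for para in paragraphs:
        for a, b in zip(para, para[1:]):
            pairs.append((a, b, 1))
    for p1, p2 in itertools.combinations(paragraphs, 2):
        pairs.append((p1[-1], p2[0], 0))  # paragraph boundary: label 0
    return pairs
```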
Specifically, the letter of credit training data is preprocessed; for example, the preprocessing may adopt a random replacement strategy to improve the generalization of the model, because training data built only from the two long-field corpora of real letters of credit is highly likely to be overfitted due to the specialized terminology of letter of credit text. In other embodiments, the random replacement strategy also automatically expands the corpus: of the replaced positions, 80% are randomly replaced with [mask], 10% are randomly replaced with other words, and 10% remain unchanged, as shown in fig. 18, a schematic diagram of training data preprocessing in one embodiment.
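The 80/10/10 replacement strategy can be sketched as follows; the selection rate for which positions get replaced is an assumed parameter (the text only specifies the 80/10/10 split among selected positions), and the replacement vocabulary here is simply the tokens of the sequence itself:

```python
import random

def augment_tokens(tokens, rng, select_rate=0.15):
    """Randomly select a fraction of token positions (select_rate is an
    assumption); of the selected positions, 80% become [MASK], 10% become
    a random token from the sequence, and 10% stay unchanged."""
    out = list(tokens)
    for i in range(len(out)):
        if rng.random() >= select_rate:
            continue            # position not selected for replacement
        r = rng.random()
        if r < 0.8:
            out[i] = "[MASK]"   # 80%: mask
        elif r < 0.9:
            out[i] = rng.choice(tokens)  # 10%: random word
        # else: 10% keep the original token
    return out
```

Passing an explicit `random.Random` instance keeps the augmentation reproducible across runs.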
Specifically, the preprocessed letter of credit training data is input into the first machine learning model for training, where the first machine learning model is a machine learning model capable of judging whether the texts of adjacent lines are in the same paragraph, optionally a Bert language model; by learning a large amount of letter of credit training data, it yields a paragraph classification model that can classify whether adjacent text lines in the recognition result belong to the same paragraph. Referring to fig. 19, a schematic diagram of model training of the paragraph classification model in one embodiment, where the first machine learning model is the Bert language model.
In the above embodiment, preprocessing the letter of credit training data improves the generalization of the model, and training on the letter of credit training data yields a paragraph classification model that classifies whether adjacent text lines in the recognition result belong to the same paragraph, so that line-wrapped text belonging to the same content can be extracted as one paragraph.
In an embodiment, a letter of credit identification method is provided, referring to fig. 20, which is a schematic flow chart of the letter of credit identification method in one embodiment; it can be divided into three parts: angle correction, detection and recognition, and post-processing.
Specifically, since the letter of credit image uploaded by the business scan may have a small-angle inclination, which would make the subsequent recognition of text regions by the text detection model inaccurate, angle correction may be performed after the letter of credit image to be recognized is acquired. Specifically, in this embodiment, the largest table of the letter of credit is retained through image processing operations, and angle correction is performed by affine transformation according to the corner point coordinates of that largest table, solving the small-angle correction problem of the letter of credit, as shown in fig. 5.
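The correction from corner points can be illustrated by computing the rotation matrix that levels the top edge of the largest table. This pure-Python sketch builds the same kind of 2x3 affine matrix one would pass to e.g. cv2.warpAffine; the corner ordering (top-left, top-right, bottom-right, bottom-left) is an assumption:

```python
import math

def deskew_matrix(corners):
    """corners: corner points of the largest table, ordered top-left,
    top-right, bottom-right, bottom-left. Returns a 2x3 affine matrix
    rotating the image about the top-left corner so the table's top
    edge becomes horizontal."""
    (x0, y0), (x1, y1) = corners[0], corners[1]
    theta = math.atan2(y1 - y0, x1 - x0)   # current tilt of the top edge
    c, s = math.cos(-theta), math.sin(-theta)
    # rotation by -theta about (x0, y0)
    return [[c, -s, x0 - c * x0 + s * y0],
            [s,  c, y0 - s * x0 - c * y0]]

def apply_affine(m, point):
    """Apply a 2x3 affine matrix to a point (x, y)."""
    x, y = point
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])
```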
Specifically, single-class character detection is performed on the pictures using a pre-trained character detection model: the picture sizes are unified, the required fields are detected, and the coordinate points of the character detection boxes are output. The training process of the character detection model is as follows: 1. label a real letter of credit data set, where the foreground characters are labeled as text; 2. expand the real data set, enriching the training samples while preserving the distribution of the real data as far as possible, and ensure that the ratio of real data to expanded data is not less than 1/10; 3. input the training sets from steps 1 and 2 into a second machine learning model, a machine learning model capable of identifying character regions. In this embodiment a RetinaNet model is selected and trained on the training set to obtain a RetinaNet-based character detection model, for which the anchor aspect ratios are computed statistically from the real data of step 1 and the synthetic data of step 2. Considering that text has a large width-to-height ratio, when the feature map generates anchors, only boxes with a width-to-height ratio greater than or equal to 1 are generated according to the character width. The RetinaNet detection model is trained with the statistics of the anchor width-to-height ratios together with training parameters such as the image enhancement strategy, and the training parameters are continuously optimized, the data enhancement strategy adjusted, and the data increased according to the quality of the model inference results until the model reaches the desired precision.
The characters in the letter of credit image to be recognized can be detected accurately by the RetinaNet-based character detection model, and the image is then segmented according to the detection result to obtain a plurality of text regions. It should be noted that in this embodiment the second machine learning model is a RetinaNet model; in other embodiments, any machine learning model may be trained on the training set to obtain a character detection model that can accurately identify the text regions in the letter of credit image to be recognized.
Specifically, the long text regions are repaired because the letter of credit is a text-dense document with many long and dense text lines, and an anchor-based detection model regresses inaccurately on long text, which can miss the head or tail of a text line or cause erroneous recognition and thus extraction errors; the detection results therefore need to be corrected after detection. The letter of credit has two formats, table and non-table, and the repair process for the table format is as follows: delimit according to the table line boundaries; when the regression of a long-field text region is inaccurate, that is, the two ends of the text region do not fully enclose the text, the long text region is identified before recognition (judging whether a text region is a long text region) and its ends are forced to the vicinity of the table lines. The key to the method is obtaining the vertical table lines; the position of the vertical table lines in the image then serves as the boundary, and for text lines belonging to long fields, the left and right ends after post-processing are the table line positions. Specifically, as shown in fig. 9, dilation and erosion operations are first performed on the binarized image with a vertical-direction kernel structure to obtain a vertical-direction binarized image, the contour coordinates of the binarized image are obtained, and the coordinates at the two ends of the long field are then repaired to the vertical boundary positions according to the nearest vertical table line boundary of the long-field text region.
Specifically, for the non-table format, the repair process is as follows. Non-table-format letter of credit data uses the contour-searching method based on text lines, obtaining boundaries from rule statistics: first, the coordinate position of the long-field text region is obtained and recorded as (location[0], location[1], location[2], location[3]), and slice data is cut out according to the width of the original picture and the height of the long-field text region; finally, the contour is searched and the minimum and maximum horizontal coordinates are found as the boundary of the long text field. In detail: obtain the long-field text region and record (location[0], location[1], location[2], location[3]); cut the picture according to the vertical coordinates of the long text region and then according to the original picture width, ensuring that the content of the long text region is inside the cut picture; binarize to obtain a binary image; perform a vertical-direction opening operation, used to remove vertical table lines when the text is connected with them; perform a horizontal-direction closing operation with a large horizontal kernel structure, connecting the content of the text region into blocks for contour detection and making it easy to obtain the text line of the region; and repair the long field, correcting it to (min_x, location[1], max_x, location[2]) according to the minimum and maximum horizontal coordinates of the contour. Optionally, the long-field text regions among the segmented text regions are identified before repair, and only those regions are repaired.
It should be noted that, in other embodiments, all the text regions obtained by segmentation may be repaired, not only the long-field text regions.
Specifically, after the long text regions are repaired, the text regions are recognized using a pre-trained character recognition model. The training process of the character recognition model is as follows: 1. based on the coordinates of the real data and the synthetic data, obtain text slices of all the data as the training set of the letter of credit recognition model; the synthetic data is generated by filling text content into pre-generated letter of credit templates, which reduces the manual cost of labeling data; 2. adjust the data enhancement strategy and the training set of step 1, and input the training set into a third machine learning model, a machine learning model capable of recognizing characters in an image; in this embodiment, Seq2Seq is adopted as the third machine learning model; 3. continuously optimize the training parameters, adjust the data enhancement strategy, and increase the data according to the quality of the model inference results until the model reaches the desired precision, obtaining a Seq2Seq-based text recognition model that recognizes text accurately. Character recognition is performed on the text regions by this Seq2Seq-based character recognition model to obtain the recognition results. It should be noted that in other embodiments, any machine learning model may be trained on the training set to obtain a character recognition model that accurately recognizes the characters in the letter of credit image to be recognized.
Specifically, after the recognition result is obtained, it is post-processed; see fig. 21, a schematic diagram of the post-processing flow in an embodiment. The process is as follows. Format classification: the table format and the non-table format are distinguished, because the two formats differ in extraction mode, extracted fields, and loaded configuration content; since extraction differs between table and non-table layouts, classification is needed before extraction to distinguish the format difference and then perform structured extraction. In this embodiment, the pre-configured key fields specific to the table class are fuzzy-matched before extraction to decide whether the current data belongs to the table class: all text-line contents are traversed, and if any line matches a table-class description, the data is treated as table format, otherwise as non-table format. Within a format, extraction modes are further distinguished by long versus short fields and by whether a table is present; for example, for most short fields in a table the field name is on the left and the corresponding field information on the right, so short fields can be extracted from this left-right side-by-side relation. Matching the field key: the letter of credit layout is special — the number of pages is not fixed and field content sprawls and runs long, so matching cannot follow a template-matching approach; before extracting field information, the positions of the candidate fields and their text content are found (by regular expressions and by full-name text similarity matching).
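The fuzzy table/non-table decision can be sketched as below. The key strings in `TABLE_KEYS` are made-up placeholders for the configured table-class fields, and `difflib` stands in for whatever fuzzy matcher the implementation actually uses.

```python
from difflib import SequenceMatcher

# Placeholder key fields assumed to mark the table class; the real
# configuration would come from the loaded table-format config files.
TABLE_KEYS = ("documentary credit number", "date of issue")

def is_table_format(text_lines, keys=TABLE_KEYS, threshold=0.8):
    """Traverse all recognized text lines; one fuzzy hit on any
    configured table key classifies the document as table format."""
    for line in text_lines:
        low = line.lower()
        for key in keys:
            if key in low:
                return True
            if SequenceMatcher(None, key, low).ratio() >= threshold:
                return True
    return False
```

The fuzzy ratio matters because OCR errors are expected: a line like "Documentary Credit Numbar" still matches its key even though exact containment fails.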
In this embodiment, the field position and field content corresponding to each extracted field are found first. Because picture quality varies, the recognized characters more or less contain individual character recognition errors, which interferes with the field search. To improve the accuracy of the field search, two methods are used in this embodiment: 1) regular-expression search based on the key configuration: before matching fields, the field rule configuration corresponding to the table and non-table formats is loaded, where each txt file holds the configuration rules of one field (each rule carries a score, and the highest-scoring rule match is preferred); 2) text similarity search based on the full name in the key configuration: if a field is not found by the regular-expression search of 1), a full-text similarity search against the field's full name is performed, and the most likely field is taken as the reference.
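The two-stage field search can be sketched as follows — scored regex rules first, full-name similarity as fallback. The rule patterns, scores, and the 0.6 cut-off are illustrative assumptions, not values from the patent.

```python
import re
from difflib import SequenceMatcher

def find_field(lines, rules, full_name, threshold=0.6):
    """lines: recognized text lines; rules: [(regex, score), ...] for one
    field (one txt config file); full_name: the field's configured full name.

    Returns (line_index, score) of the best hit, or None."""
    best = None
    for i, line in enumerate(lines):
        for pattern, score in rules:
            if re.search(pattern, line) and (best is None or score > best[1]):
                best = (i, score)   # highest-scoring rule wins
    if best is not None:
        return best
    # Fallback: full-name text similarity over all lines.
    i, sim = max(
        ((i, SequenceMatcher(None, full_name.lower(), l.lower()).ratio())
         for i, l in enumerate(lines)),
        key=lambda t: t[1])
    return (i, sim) if sim >= threshold else None
```

When both a cheap pattern (e.g. the field tag) and a precise one (the full label) hit the same line, the higher-scoring rule decides, mirroring the "highest score as the matching" preference in the text.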
Matching field information: letter of credit data may be multi-page or single-page; most table-class documents are single pages, though multi-page cases exist, while most non-table documents run to at least two pages, so extraction and recognition cannot assume the single-page case. In this embodiment the whole letter of credit image to be recognized is therefore structurally extracted as a whole. Certain rules and regularities exist between the fixed field names and the non-fixed field information (i.e. the content to be extracted) in the letter of credit image to be recognized; the field-to-field-information regularities can be summarized in three structures: an up-down structure, a left-right structure, and a structure that is neither up-down nor left-right; differentiated structured extraction is carried out based on these regularities. It should be noted that, before extraction, the pages of a multi-page letter of credit are merged in their order of appearance, which includes merging the pictures and merging the detection and recognition results. Long-field paragraph classification: the semantic continuity of sentences cannot be derived from the positional regularities between the sentences of a long field, so a Bert-based language model is used to judge whether adjacent sentences of the long field belong to the same paragraph, and to extract them accordingly. Specifically, the Bert-based approach first obtains all candidates of the long field and sorts them by y coordinate; every two adjacent lines of the sorted text content are fed into the Bert model, which judges whether the two input lines form one sentence (the same paragraph); the long-field information is then output according to these judgments.
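The pairwise same-paragraph pass can be sketched independently of the model itself: `same_paragraph` below is a pluggable predicate standing in for the Bert judgment, and `naive_same_paragraph` is a deliberately naive rule (a new numbered item opens a new paragraph) used here purely for illustration.

```python
def group_paragraphs(lines, same_paragraph):
    """lines: [(y, text), ...] long-field candidates; sort by y, then ask
    the predicate for each adjacent pair and merge accordingly."""
    ordered = [text for _, text in sorted(lines, key=lambda p: p[0])]
    paragraphs = []
    for text in ordered:
        if paragraphs and same_paragraph(paragraphs[-1][-1], text):
            paragraphs[-1].append(text)     # same paragraph: continue it
        else:
            paragraphs.append([text])       # new paragraph starts here
    return [" ".join(p) for p in paragraphs]

def naive_same_paragraph(prev_line, next_line):
    """Stand-in for the Bert next-line judgment: a line opening with an
    item number like '2.' is treated as a new paragraph."""
    return not next_line[:2].rstrip(".").isdigit()
```

In the real system the predicate would call the fine-tuned Bert model on the two lines; everything else — the y-sort and the adjacent-pair merge — stays the same.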
Before the Bert-based language model obtains all candidates of the long field, the candidate field information must be checked: if it is long-field text it is input into the Bert-based language model; otherwise it goes directly to post-processing verification. Optionally, whether candidate field information belongs to a long field can be decided by checking whether its corresponding field is one of the two long fields, description_goods and document_required. Post-processing verification: the content of the candidate field information set is screened and filtered before the final structured output, in two ways: 1. filtering by rules: some fields follow definite rules, such as dates; 2. filtering by field content: candidate field information may mistakenly contain keywords of other fields' content, which is obviously unwanted, so the candidate set is checked for such field keywords and, if present, they are filtered out.
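Both verification filters can be sketched together. The compact YYMMDD date rule is an assumption (it matches the date format common on SWIFT-style letter of credit messages), and the stray-keyword list is a made-up example of other fields' markers.

```python
import re

DATE_RULE = re.compile(r"^\d{2}[01]\d[0-3]\d$")   # assumed YYMMDD-style date
# Hypothetical keywords of *other* fields; their presence flags mis-extraction.
STRAY_KEYS = ("DOCUMENTS REQUIRED", "DESCRIPTION OF GOODS")

def verify_candidates(field, candidates):
    """Rule filter (e.g. dates) plus field-content keyword filter."""
    kept = []
    for cand in candidates:
        if field == "date" and not DATE_RULE.fullmatch(cand):
            continue                        # violates the field's rule
        if any(key in cand.upper() for key in STRAY_KEYS):
            continue                        # contains another field's keyword
        kept.append(cand)
    return kept
```

A production system would load one rule per configured field instead of hard-coding the date case, but the screening logic is the same.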
In this embodiment, for the characteristics of the letter of credit — abundant and complex text content — an image-processing-based letter of credit angle detection method is provided that is simple and efficient; secondly, for the requirement that the content of the two long fields of a letter of credit must be split into paragraphs, a Bert-language-model-based long-field paragraph classification strategy is provided; and for the abundant text content and complex business logic of the letter of credit, the post-processing flow comprises format classification, field matching, field-information matching, long-field paragraph classification, and post-processing verification, which together address the problems of multiple letter of credit formats, cross-page recognition of information, background interference, paragraph distinction, and the like.
It should be understood that, although the steps in the flowcharts of the embodiments described above are displayed in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise herein, the steps are not strictly ordered and may be performed in other orders. Moreover, at least some of the steps in these flowcharts may comprise multiple sub-steps or stages, which need not be completed at the same moment but may be performed at different times, and whose execution order need not be sequential; they may instead be performed in turns or alternately with other steps or with at least part of the sub-steps or stages of other steps.
Based on the same inventive concept, an embodiment of the present application further provides a letter of credit identification apparatus for implementing the letter of credit identification method described above. The solution provided by this apparatus is similar to that described for the method above, so for the specific limitations in the one or more embodiments of the letter of credit identification apparatus provided below, reference may be made to the limitations on the letter of credit identification method described above, which are not repeated here.
In one embodiment, as shown in fig. 22, there is provided a letter of credit recognition apparatus including: an acquisition module 100, a text region detection module 200, a recognition module 300, and a post-processing module 400, wherein:
The acquiring module 100 is used for acquiring a letter of credit image to be identified.
The text region detection module 200 is configured to segment the letter of credit image to be identified to obtain a plurality of text regions.
The recognition module 300 is configured to perform character recognition on the text regions to obtain the recognition result of each text region.
The post-processing module 400 is configured to post-process the recognition result of the text regions according to a preset rule to obtain structured target information.
In one embodiment, the letter of credit identification apparatus further includes:
an angle correction module, used for performing angle correction on the letter of credit image to be identified.
In one embodiment, the angle correcting module includes:
a binarization unit, used for performing binarization processing on the letter of credit image to be identified to obtain a binarized image;
a table contour extraction unit, used for extracting all table contours in the binarized image;
a table contour processing unit, used for processing all the table contours to obtain the target contour of the table with the largest area;
a rotation angle calculation unit, used for calculating the included angle of adjacent straight lines in the target contour of the table with the largest area to obtain a rotation angle; and
a correction unit, used for performing angle correction on the letter of credit image to be recognized according to the rotation angle.
In one embodiment, the table contour extracting unit includes:
a filtering subunit, used for filtering according to contour area to obtain the table with the smallest contour area as the current table;
a to-be-processed table acquiring unit, used for acquiring a to-be-processed table adjacent to the current table; and
a merging unit, used for merging the to-be-processed table with the current table as the new current table, continuing to acquire the to-be-processed table adjacent to the current table until the table with the largest area is obtained, and extracting the target contour of the table with the largest area.
In one embodiment, the letter of credit identification apparatus further includes:
a repairing module, used for repairing the text regions.
In one embodiment, the repair module further includes:
a format acquiring unit, used for acquiring the format corresponding to the letter of credit image to be identified; and
a text region repairing unit, used for repairing the text regions according to the format.
In one embodiment, the text region repairing unit includes:
a boundary acquiring subunit, used for extracting the to-be-processed boundaries in the target direction in the text regions when the letter of credit image to be identified is in the table format;
a target boundary acquiring subunit, used for acquiring, as the target boundary, the to-be-processed boundary closest to the text region of the target type among the text regions; and
a boundary moving subunit, used for moving the boundary of the text region of the target type to the target boundary.
In one embodiment, the text region repairing unit further includes:
a coordinate acquiring subunit, used for acquiring the coordinates of the text region of the target type among the text regions when the letter of credit image to be identified is in a non-table format;
a text slice acquiring subunit, used for segmenting the letter of credit image to be identified according to the coordinates and the width of the image to obtain the text slice corresponding to the text region of the target type;
a content contour acquiring subunit, used for performing the corresponding image operations on the text slice in different directions to obtain the content contours; and
a region repairing subunit, used for repairing the text region of the target type according to the abscissa of the content contours.
In one embodiment, the post-processing module 400 further comprises:
a candidate field set acquiring unit, used for extracting the candidate field set from the recognition result according to a preset layout structure;
a field matching unit, used for matching according to the preset configuration rule corresponding to the preset format to obtain the field position and field content corresponding to each field in the candidate field set;
a field information candidate set acquiring unit, used for acquiring the field information candidate set according to the positional relation between fields and field information; and
a target information acquiring unit, used for filtering the field information candidate set to obtain the target information.
In one embodiment, the post-processing module 400 further comprises:
a distinguishing field acquiring unit, used for acquiring the format distinguishing fields; and
a format matching unit, used for matching the recognition result against the format distinguishing fields to determine the format of the letter of credit image to be recognized.
In one embodiment, the post-processing module 400 further comprises:
an image merging unit, used for merging the plurality of letter of credit images according to their order of appearance when the letter of credit image to be identified consists of a plurality of letter of credit images.
In one embodiment, the post-processing module 400 further comprises:
the paragraph classification unit is used for inputting the field information into the paragraph classification model when the field information is the target field information; and extracting the target field information belonging to the same paragraph through the paragraph classification model.
In one embodiment, the post-processing module 400 further comprises:
a paragraph text acquiring unit, used for acquiring letter of credit paragraph texts;
a training data acquiring unit, used for combining and labeling the letter of credit paragraph texts according to a preset method to obtain letter of credit training data;
a preprocessing unit, used for preprocessing the letter of credit training data to obtain preprocessed letter of credit training data; and
a model training unit, used for inputting the preprocessed letter of credit training data into the first machine learning model for training to obtain the paragraph classification model.
The various modules in the letter of credit identification apparatus described above may be implemented in whole or in part by software, hardware, or combinations thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or stored in a memory in the computer device in software form, so that the processor can call and execute the operations corresponding to each module.
In one embodiment, a computer device is provided, which may be a server, and whose internal structure may be as shown in fig. 23. The computer device includes a processor, a memory, and a network interface connected by a system bus, wherein the processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database; the internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The database of the computer device is used for storing the data of letter of credit images to be identified. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements a letter of credit identification method.
Those skilled in the art will appreciate that the structure shown in fig. 23 is merely a block diagram of part of the structure related to the solution of the present application and does not limit the computer device to which the solution applies; a particular computer device may include more or fewer components than shown, combine certain components, or arrange the components differently.
In one embodiment, a computer device is further provided, which includes a memory and a processor, the memory stores a computer program, and the processor implements the steps of the above method embodiments when executing the computer program.
In an embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
In an embodiment, a computer program product is provided, comprising a computer program which, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above may be implemented by a computer program instructing the relevant hardware; the computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, and the like. Volatile memory may include random access memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM may take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM). The databases referred to in the embodiments provided herein may include at least one of relational and non-relational databases; non-relational databases may include, but are not limited to, blockchain-based distributed databases and the like. The processors referred to in the embodiments provided herein may be general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic devices, data processing logic devices based on quantum computing, and the like, without limitation.
The technical features of the above embodiments may be combined arbitrarily; for brevity, not all possible combinations of these technical features are described, but all such combinations should be considered within the scope of this specification as long as they are not contradictory.
The above embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the application. It should be noted that several variations and improvements can be made by those of ordinary skill in the art without departing from the concept of the present application, and these all fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the appended claims.

Claims (17)

1. A method for identifying a letter of credit, the method comprising:
acquiring a letter of credit image to be identified;
segmenting the letter of credit image to be identified to obtain a plurality of text regions;
performing character recognition on the text regions to obtain a recognition result of the text regions;
and post-processing the recognition result of the text regions according to a preset rule to obtain structured target information.
2. The method of claim 1, wherein after acquiring the image of the letter of credit to be recognized, the method further comprises:
performing angle correction on the letter of credit image to be identified.
3. The method of claim 2, wherein the performing angle correction on the letter of credit image to be identified comprises:
performing binarization processing on the letter of credit image to be identified to obtain a binarized image;
extracting all table contours in the binarized image;
processing all the table contours to obtain a target contour of a table with the largest area;
calculating the included angle of adjacent straight lines in the target contour of the table with the largest area to obtain a rotation angle;
and performing angle correction on the letter of credit image to be identified according to the rotation angle.
4. The method according to claim 3, wherein the processing all the table contours to obtain the target contour of the table with the largest area comprises:
filtering according to the outline area to obtain a table with the minimum outline area as a current table;
acquiring a table to be processed adjacent to the current table;
and merging the table to be processed and the current table to be used as a new current table, continuously acquiring the table to be processed adjacent to the current table until the table with the largest area is acquired, and extracting the target contour of the table with the largest area.
5. The method of claim 1, wherein before the performing character recognition on the text regions, the method further comprises:
repairing the text regions.
6. The method of claim 5, wherein the repairing the text region comprises:
acquiring a format corresponding to the letter of credit image to be identified;
and repairing the text regions according to the format.
7. The method of claim 6, wherein the repairing the text region according to the layout comprises:
when the letter of credit image to be identified is in a table format, extracting to-be-processed boundaries in a target direction in the text regions;
acquiring a boundary to be processed closest to a text region of a target type in the text region as a target boundary;
moving the boundary of the text region of the target type to the target boundary.
8. The method of claim 6, wherein the repairing the text region according to the layout comprises:
when the letter of credit image to be identified is in a non-table format, acquiring coordinates of a text region of a target type among the text regions;
segmenting the letter of credit image to be identified according to the coordinates and the width of the letter of credit image to be identified to obtain a text slice corresponding to the text region of the target type;
respectively carrying out corresponding image operation on different directions of the text slices to obtain content outlines;
and repairing the text area of the target type according to the abscissa of the content outline.
9. The method according to claim 1, wherein the post-processing the recognition result of the text region according to a preset rule to obtain structured target information comprises:
extracting a candidate field set in the recognition result according to a preset layout structure;
matching according to a preset configuration rule corresponding to the preset format to obtain a field position and field content corresponding to each field in the candidate field set;
acquiring a field information candidate set according to the position relation between the field and the field information;
and filtering the field information candidate set to obtain target information.
10. The method according to claim 9, wherein before extracting the set of candidate fields in the recognition result according to a preset layout structure, the method further comprises:
acquiring a format distinguishing field;
and matching the recognition result with the format distinguishing field to determine the format of the letter of credit image to be recognized.
11. The method of claim 9, wherein before obtaining the field information candidate set according to the position relationship between the field and the field information, the method further comprises:
and when the letter of credit image to be identified comprises a plurality of letter of credit images, merging the plurality of letter of credit images according to their order of appearance.
12. The method of claim 9, wherein filtering the field information candidate set to obtain target information further comprises:
when the field information is the target field information, inputting the field information into a paragraph classification model; and extracting the target field information belonging to the same paragraph through the paragraph classification model.
13. The method of claim 12, wherein the paragraph classification model is obtained by pre-training, and the training process of the paragraph classification model comprises:
acquiring a letter of credit paragraph text;
combining and marking the letter of credit paragraph texts according to a preset method to obtain letter of credit training data;
preprocessing the letter of credit training data to obtain preprocessed letter of credit training data;
inputting the preprocessed letter of credit training data into a first machine learning model for training to obtain the paragraph classification model.
14. A letter of credit identification device, the device comprising:
the acquisition module is used for acquiring a letter of credit image to be identified;
the text area detection module is used for segmenting the letter of credit image to be identified to obtain a plurality of text areas;
the recognition module is used for carrying out character recognition on the text area to obtain a recognition result of the text area;
and the post-processing module is used for post-processing the recognition result of the text region according to a preset rule to obtain the structured target information.
15. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor realizes the steps of the method of any one of claims 1 to 13 when executing the computer program.
16. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 13.
17. A computer program product comprising a computer program, characterized in that the computer program realizes the steps of the method of any one of claims 1 to 13 when executed by a processor.
CN202111632352.5A 2021-12-28 2021-12-28 Credit card identification method, device, computer equipment and storage medium Pending CN114463767A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111632352.5A CN114463767A (en) 2021-12-28 2021-12-28 Credit card identification method, device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111632352.5A CN114463767A (en) 2021-12-28 2021-12-28 Credit card identification method, device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114463767A true CN114463767A (en) 2022-05-10

Family

ID=81407293

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111632352.5A Pending CN114463767A (en) 2021-12-28 2021-12-28 Credit card identification method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114463767A (en)


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116129456A (en) * 2023-02-09 2023-05-16 广西壮族自治区自然资源遥感院 Method and system for identifying and inputting property rights and interests information
CN116129456B (en) * 2023-02-09 2023-07-25 广西壮族自治区自然资源遥感院 Method and system for identifying and inputting property rights and interests information
CN116702024A (en) * 2023-05-16 2023-09-05 见知数据科技(上海)有限公司 Method, device, computer equipment and storage medium for identifying type of stream data
CN116702024B (en) * 2023-05-16 2024-05-28 见知数据科技(上海)有限公司 Method, device, computer equipment and storage medium for identifying type of stream data

Similar Documents

Publication Publication Date Title
US10943105B2 (en) Document field detection and parsing
US10621727B1 (en) Label and field identification without optical character recognition (OCR)
Kleber et al. Cvl-database: An off-line database for writer retrieval, writer identification and word spotting
JP5492205B2 (en) Segment print pages into articles
Yang et al. A framework for improved video text detection and recognition
Wilkinson et al. Neural Ctrl-F: segmentation-free query-by-string word spotting in handwritten manuscript collections
CN109685065B (en) Layout analysis method and system for automatically classifying test paper contents
CN111460927B (en) Method for extracting structured information of house property evidence image
CN109389115B (en) Text recognition method, device, storage medium and computer equipment
CN111353491B (en) Text direction determining method, device, equipment and storage medium
CN114463767A (en) Credit card identification method, device, computer equipment and storage medium
CN112949455B (en) Value-added tax invoice recognition system and method
CN113158895A (en) Bill identification method and device, electronic equipment and storage medium
CN112766246A (en) Document title identification method, system, terminal and medium based on deep learning
CN114120345A (en) Information extraction method, device, equipment and storage medium
CN111738979A (en) Automatic certificate image quality inspection method and system
CN114998905A (en) Method, device and equipment for verifying complex structured document content
CN113780116A (en) Invoice classification method and device, computer equipment and storage medium
CN114511857A (en) OCR recognition result processing method, device, equipment and storage medium
CN114581928A (en) Form identification method and system
Karanje et al. Survey on text detection, segmentation and recognition from a natural scene images
CN112200789A (en) Image identification method and device, electronic equipment and storage medium
CN109101973B (en) Character recognition method, electronic device and storage medium
Prommas et al. CNN-based Thai handwritten OCR: an application for automated mail sorting
CN116363655A (en) Financial bill identification method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination