CN117373042A - Card image structuring processing method and device - Google Patents

Card image structuring processing method and device

Info

Publication number
CN117373042A
Authority
CN
China
Prior art keywords
result
card image
matching
keyword
target
Prior art date
Legal status
Pending
Application number
CN202311295228.3A
Other languages
Chinese (zh)
Inventor
刘昱彤
Current Assignee
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd filed Critical Alipay Hangzhou Information Technology Co Ltd
Priority to CN202311295228.3A priority Critical patent/CN117373042A/en
Publication of CN117373042A publication Critical patent/CN117373042A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition
    • G06V30/26 - Techniques for post-processing, e.g. correcting the recognition result
    • G06V30/262 - Techniques for post-processing, e.g. correcting the recognition result using context analysis, e.g. lexical, syntactic or semantic context
    • G06V30/274 - Syntactic or semantic context, e.g. balancing
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition
    • G06V30/24 - Character recognition characterised by the processing or recognition method
    • G06V30/248 - Character recognition characterised by the processing or recognition method involving plural approaches, e.g. verification by template match; Resolving confusion among similar patterns, e.g. "O" versus "Q"
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition
    • G06V30/30 - Character recognition based on the type of data

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Character Discrimination (AREA)

Abstract

The embodiments of this specification disclose a card image structuring processing method, comprising: acquiring an OCR recognition result of a target card image; and extracting semantic information from the OCR recognition result through a pre-configured text extraction template. The text extraction template is used for describing the association between a preset keyword and a candidate region of the target field corresponding to the keyword, matching the field content in the candidate region with a preset regular expression to obtain a first matching result, and determining the semantic information based on the first matching result and the keyword. Correspondingly, the embodiments also disclose a card image structuring processing device.

Description

Card image structuring processing method and device
Technical Field
The invention relates to the technical field of computers, in particular to a card image structuring processing method and device.
Background
eKYC (Electronic Know Your Customer), i.e. online real-name authentication, requires a user to photograph and upload his or her own identity document and to cooperate with means such as face capture and information verification to prove his or her identity. During this process, the system performs OCR (optical character recognition) on the certificate picture uploaded by the user, converting the image into text. However, OCR yields only a string of text; to extract its semantics, customized logic rules must be written or a specific model must be trained to structure the OCR recognition result. For a new type of card, both structuring schemes require re-editing the logic rules or training a new model, which is costly.
Disclosure of Invention
One or more embodiments of the present disclosure provide a card image structuring processing method and apparatus, which use a structured card recognition template to complete recognition tasks for different types of cards and simplify the card image recognition flow.
According to a first aspect, there is provided a card image structuring method, including:
acquiring an OCR recognition result of the target card image;
extracting semantic information from the OCR recognition result through a pre-configured text extraction template;
the text extraction template is used for describing the association between a preset keyword and a candidate region of the target field corresponding to the keyword, matching the field content in the candidate region with a preset regular expression to obtain a first matching result, and determining the semantic information based on the first matching result and the keyword.
As an optional implementation manner of the method of the first aspect, the semantic information is a key-value pair, where the key is determined by the keyword and the value is determined by the matching result.
As an optional implementation manner of the method of the first aspect, the text extraction template is further used for inputting the field content in the candidate region into a pre-trained text classification model to obtain a classification result corresponding to that field content; matching the classification result with the keyword to determine a second matching result; screening the target field out of the first matching result based on the second matching result; and determining the semantic information based on the keyword and the target field.
As an optional implementation manner of the method of the first aspect, the text extraction template is preconfigured in the following manner:
setting a keyword and a first regular expression for extracting a target field corresponding to the keyword based on a text extraction task;
determining candidate areas of target fields corresponding to the keywords according to the card types of the target card images;
setting a matching range of the first regular expression and a return condition of a matching result of the first regular expression.
Specifically, the configuration of the text extraction template further includes:
setting a second, "does-not-contain" regular expression, which is used to screen unwanted character strings out of the matching result of the first regular expression.
According to a second aspect, there is provided an identity authentication method comprising:
acquiring a card image to be identified;
OCR recognition is carried out on the card image, and an OCR recognition result is obtained;
structuring the OCR recognition result with the above card image structuring processing method to obtain semantic information of the card image;
and matching the semantic information of the card image with pre-stored target identity information, and determining an identity authentication result based on a matching result.
According to a third aspect, there is provided a card image structuring processing device comprising:
the first data acquisition module is configured to acquire an OCR recognition result of the target card image;
the template generation module is configured to generate a text extraction template based on configuration parameters of a user; the text extraction template is used for describing the association between a preset keyword and a candidate region of the target field corresponding to the keyword, matching the field content in the candidate region with a preset regular expression to obtain a first matching result, and determining semantic information based on the first matching result and the keyword;
and the semantic information extraction module is configured to extract the semantic information from the OCR recognition result based on the text extraction template.
According to a fourth aspect, there is provided an identity authentication device comprising:
the second data acquisition module is configured to acquire card images to be identified;
the OCR recognition module is configured to perform OCR recognition on the card image to obtain an OCR recognition result;
the structuring processing module is configured to structure the OCR recognition result with any of the above card image structuring processing methods to obtain semantic information of the card image;
and the identity authentication module is configured to match the semantic information of the card image with pre-stored target identity information and determine an identity authentication result based on the matching result.
According to a fifth aspect, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the card image structuring processing method of any one of the above.
According to a sixth aspect, there is provided an electronic device comprising
one or more processors; and a memory associated with the one or more processors, the memory for storing program instructions that, when read and executed by the one or more processors, perform the steps of the card image structuring processing method of any one of the above.
According to a seventh aspect, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the identity authentication method as claimed in any one of the preceding claims.
According to an eighth aspect, there is provided an electronic device comprising
One or more processors; and
a memory associated with the one or more processors, the memory for storing program instructions that, when read and executed by the one or more processors, perform the steps of the identity authentication method of any one of the above.
The card image structuring processing method has the beneficial effect of providing a configurable, generalized text extraction template: for different types of cards, semantic information can be extracted from the OCR recognition result of the target card image merely by configuring the text extraction template, so that the OCR recognition result is structured without customizing logic rules or training a specific model. The method reduces the cost of structuring OCR recognition results while ensuring high recognition accuracy and speed, thereby improving the accuracy and efficiency of card recognition.
The card image structuring processing device and the identity authentication device disclosed in the embodiments of this specification have corresponding beneficial effects.
Drawings
In order to more clearly illustrate the embodiments of the present specification or the technical solutions in the prior art, the drawings required in the description of the embodiments or the prior art are briefly introduced below. It is apparent that the following drawings show only some embodiments of the present specification, and that a person skilled in the art may obtain other drawings from them without inventive effort.
Fig. 1 schematically illustrates a flowchart of a card image structuring method according to one or more embodiments of the present disclosure in an implementation manner.
Fig. 2 schematically illustrates a schematic diagram of a card image structuring process according to one or more embodiments of the present disclosure in a specific scenario.
Fig. 3 schematically illustrates a flow diagram of an identity authentication method according to one or more embodiments of the present disclosure, in one implementation.
Fig. 4 is a block diagram schematically illustrating a card image structuring apparatus according to one or more embodiments of the present disclosure in one embodiment.
Fig. 5 is a block diagram illustrating an identity authentication device according to one or more embodiments of the present disclosure in one implementation.
Fig. 6 schematically illustrates an identification system according to an embodiment of the present disclosure.
Fig. 7 schematically illustrates another identification system according to an embodiment of the present disclosure.
Fig. 8 exemplarily shows a block diagram of an electronic device provided in an embodiment of the present specification.
Detailed Description
First, it will be understood by those skilled in the art that the terminology used in the embodiments of the present invention is for the purpose of describing particular embodiments only, and is not intended to be limiting of the invention. As used in this application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
eKYC (Electronic Know Your Customer), i.e. online real-name authentication, requires a user to photograph and upload an identity document and to cooperate with means such as face capture and information verification so that it can be checked whether the user is consistent with the document. In this process, a certificate quality detection task and a certificate anti-counterfeiting detection task are usually executed after the certificate picture is uploaded, to ensure the validity and authenticity of the certificate. If the picture passes these checks, the information in it can be used for identity authentication: the system performs OCR (optical character recognition) on the certificate picture uploaded by the user and converts the image into text in order to extract the textual information from the certificate picture. However, OCR yields only a string of text; if the semantics in the text are to be extracted for identity-information comparison, the OCR recognition result needs to be structured.
Common structuring methods include rule-based methods, which require writing customized rule logic for each type of certificate, and model-based methods, which require training a specific structured model for each type of document. A simpler and more efficient structuring method is therefore needed, one that can extract text from card images quickly and at limited cost and thus support large-scale development of new card types.
In view of the above, one or more embodiments of the present disclosure provide a card image structuring processing method that uses a structured card recognition template to complete recognition tasks for different types of cards, adapts quickly to new types of certificates, and simplifies the card image recognition process.
In order to make the technical solutions in the present specification better understood by those skilled in the art, the technical solutions in the embodiments of the present specification will be clearly and completely described below with reference to the drawings in the embodiments of the present specification, and it is obvious that the described embodiments are only some embodiments of the present specification, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are intended to be within the scope of the present disclosure.
It should be noted that: in other embodiments, the steps of the corresponding method are not necessarily performed in the order shown and described in this specification. In some other embodiments, the method may include more or fewer steps than described in this specification. Furthermore, individual steps described in this specification, in other embodiments, may be described as being split into multiple steps; while various steps described in this specification may be combined into a single step in other embodiments.
In some embodiments, the present disclosure provides a method for structuring a card image, please refer to fig. 1, the method comprising the steps of:
s100: and obtaining an OCR recognition result of the target card image.
OCR (optical character recognition) is used to extract the characters in the target card image and to output the recognition result in text form. Specifically, text line detection first determines the approximate position and area of the characters in the target card image. During text line recognition, the image can be preprocessed with operations such as graying, image noise reduction and character normalization to filter out useless information and facilitate subsequent processing. Features of the text lines in the target card image are then extracted and input into a pre-trained classifier for character recognition, which yields the OCR recognition result. Optionally, a language model can be used to correct the semantics of the OCR recognition result, or the format of the result can be optimized according to the text layout in the target card image.
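By way of a purely illustrative sketch (not part of the disclosed embodiments), the step of obtaining an OCR recognition result with position information could look as follows, assuming OpenCV for the graying and noise-reduction preprocessing and Tesseract (via pytesseract) as the character recognizer; any OCR engine that reports text together with its region would serve equally well.

```python
import cv2
import pytesseract

def ocr_card_image(path):
    """Return a list of (text, box) pairs for a card image.

    Illustrative only: the embodiments do not mandate OpenCV or Tesseract.
    """
    img = cv2.imread(path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)         # graying
    denoised = cv2.fastNlMeansDenoising(gray, None, 10)   # image noise reduction
    # Tesseract reports each recognized word with left/top/width/height columns.
    data = pytesseract.image_to_data(denoised, output_type=pytesseract.Output.DICT)
    results = []
    for text, left, top, w, h in zip(data["text"], data["left"], data["top"],
                                     data["width"], data["height"]):
        if text.strip():
            results.append((text, (left, top, left + w, top + h)))
    return results
```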
The classifier can be obtained through supervised learning: a text sample image and its corresponding label are acquired, the sample image is input into the classifier to obtain a text recognition result, and the classifier is updated based on the difference between the recognition result and the label.
It should be noted that the OCR recognition result contains a series of texts distributed over different areas of the card image, such as the card holder's name, gender and date of birth. To facilitate identity-information comparison, the area in which each text is located needs to be analyzed so that the semantics corresponding to each text can be sorted out, i.e. the OCR recognition result is structured. For example, "xxxx year xx month xx day" in the extracted text corresponds to the keyword "date of birth", and "male" or "female" corresponds to the keyword "gender".
S102: semantic information is extracted from the OCR recognition result through a pre-configured text extraction template.
The text extraction template is used for describing the association between a preset keyword and a candidate region of the target field corresponding to the keyword, matching the field content in the candidate region with a preset regular expression to obtain a first matching result, and determining the semantic information based on the first matching result and the keyword.
In some embodiments, the semantic information is a key-value pair, where the key is determined by the keyword and the value is determined by the matching result.
For example, if the keyword is gender and the matching result is "male" or "female", a key-value pair with the key "gender" and the value "male" or "female" can be formed. By extracting the key-value pairs in the OCR recognition result, the semantics of each text recognized by OCR are clearly identified, so that when identity information is checked, the stored true value and the value recognized from the card image can both be selected by key. This screens out exactly the information to be compared, simplifies the comparison step and improves identity-authentication efficiency.
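As a minimal sketch of this idea (the helper names are invented for illustration), assembling and comparing such key-value pairs amounts to:

```python
def build_semantic_info(keyword, match_result):
    # The key comes from the configured keyword, the value from the matching result.
    return {keyword: match_result}

def compare_by_key(extracted, stored_identity, key):
    """Select the stored true value and the recognized value by key and compare them."""
    return extracted.get(key) == stored_identity.get(key)

info = build_semantic_info("gender", "male")
print(compare_by_key(info, {"gender": "male", "name": "..."}, "gender"))  # True
```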
The text extraction template can be presented as a graphical interface; for different card types, the parameters in the template only need to be configured manually, after which semantic information is extracted automatically from the OCR recognition result. The text extraction template thus highly summarizes the structuring rules of cards: different types of cards can be processed by configuring different parameters, which improves structuring efficiency and makes it easy to adapt to new card types.
In some embodiments, the text extraction template is preconfigured in the following manner, including:
setting a keyword and a first regular expression for extracting a target field corresponding to the keyword based on a text extraction task;
determining candidate areas of target fields corresponding to keywords according to the card types of the target card images;
setting a matching range of the first regular expression and a return condition of a matching result of the first regular expression.
In general, the target fields corresponding to different keywords follow different rules, such as a limit on the number of characters, a specific character form, or specific sub-fields within the target field. Taking an identity card as an example: the target field corresponding to the keyword "name" is limited to 2 to 4 characters, all of them Chinese characters; the target field corresponding to the keyword "certificate number" consists of digits, or digits plus an uppercase letter, and is 18 characters long; the target field corresponding to the keyword "address" contains sub-fields such as "province" and "city". These rules can be expressed as regular expressions for extracting the target fields that match the keywords. It should be noted that, for keywords with the same meaning, the rules of the corresponding target fields are not necessarily the same across card types (for example, certificate-number formats differ between cards), so the first regular expression needs to be set according to the specific card type.
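Purely as an illustration of the rules above (the exact patterns are assumptions, not part of the disclosure), such first regular expressions might be written as:

```python
import re

# Illustrative first regular expressions for a hypothetical identity-card template.
FIRST_REGEX = {
    # name: 2-4 Chinese characters
    "name": re.compile(r"[\u4e00-\u9fa5]{2,4}"),
    # certificate number: 18 characters, digits possibly ending with an uppercase X
    "certificate number": re.compile(r"\d{17}[\dX]"),
    # address: contains the sub-fields "province" (省) and "city" (市)
    "address": re.compile(r".+省.+市.+"),
}

print(bool(FIRST_REGEX["certificate number"].fullmatch("11010519900101123X")))  # True
```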
Within the whole card image, the field matched by the first regular expression may not be unique. For a given card type, however, the area of the target field corresponding to each keyword is fixed, so the candidate region of the target field corresponding to a keyword can be determined from the card type of the target card image. This limits the matching range of the first regular expression, narrows the search and improves matching efficiency. Specifically, the candidate region of the target field corresponding to a keyword can be represented by coordinates.
The matching range of the first regular expression can be set to the candidate region of the target field corresponding to the keyword. After the target field is searched for within this range, if the return condition is satisfied, the target field is determined to be the matching result; otherwise, matching within this region fails and the matching range needs to be reset.
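A minimal sketch of how one template entry could drive this constrained matching (the entry fields follow the column names used later in Fig. 2; the return condition of exactly one match is an assumption):

```python
import re

def in_rect(rect, box):
    """True if an OCR text box (x1, y1, x2, y2) lies inside the candidate region rect."""
    x1, y1, x2, y2 = rect
    bx1, by1, bx2, by2 = box
    return bx1 >= x1 and by1 >= y1 and bx2 <= x2 and by2 <= y2

def match_entry(ocr_results, entry):
    """Apply a single template entry to the OCR result.

    ocr_results: list of (text, box) pairs; entry: dict with "key", "rect" and "re".
    Returns (keyword, target_field) or None when the return condition is not met.
    """
    candidates = []
    for text, box in ocr_results:
        if not in_rect(entry["rect"], box):
            continue                                   # restrict the matching range
        m = re.search(entry["re"], text)
        if m:
            candidates.append(m.group())
    if len(candidates) == 1:                           # assumed return condition
        return entry["key"], candidates[0]
    return None                                        # matching failed; reset the range
```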
Specifically, the configuration of the text extraction template further includes:
setting a second, "does-not-contain" regular expression, which is used to screen unwanted character strings out of the matching result of the first regular expression.
The second regular expression is useful when the matching result of the first regular expression contains unnecessary strings, for example when the matching range of the first regular expression is too large and the result includes similarly formatted strings in addition to the required one; if those unwanted strings follow a certain rule, the second regular expression can be used to remove them from the matching result.
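For instance (a sketch, with re_no following the column name used in Fig. 2), the second regular expression can be applied to the candidates of the first one like this:

```python
import re

def filter_matches(first_matches, re_no):
    """Screen out every candidate hit by the 'does-not-contain' expression re_no."""
    if not re_no:
        return first_matches
    return [m for m in first_matches if not re.search(re_no, m)]

# e.g. drop date-like strings from a list of candidate fields
print(filter_matches(["11010519900101123X", "1990-01-01"], r"\d{4}-\d{2}-\d{2}"))
# ['11010519900101123X']
```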
In other embodiments, supplemental configuration conditions may also be set in the text extraction template to flexibly extend the functionality of the text extraction template.
Optionally, format normalization of the matching result can be achieved by additionally configuring a beginning character (string) or an ending character (string) of the matching result, replacing a sub-character (string) in the matching result, or specifying whether the matching result is returned in units of boxes. The matching result can also be further filtered through supplemental configuration conditions, for example by judging whether the text category of the target field matches the corresponding keyword. In some more specific embodiments, when a global-scope search is performed, i.e. the first regular expression is matched over the full image, several candidate fields may match the first regular expression, so the supplemental configuration conditions can also specify which match to return.
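A sketch of such supplemental post-processing, covering format normalization and selection of the n-th match of a global search (all parameter names here are invented for illustration):

```python
def postprocess(matches, order=0, strip_prefix="", strip_suffix="", replace=None):
    """Apply supplemental configuration conditions to the raw matches.

    order:        which match to return when a global-scope search found several
    strip_prefix: a beginning string removed for format normalization
    strip_suffix: an ending string removed for format normalization
    replace:      optional (old, new) pair replacing a sub-string in the result
    """
    if not matches or order >= len(matches):
        return None
    value = matches[order]
    if strip_prefix and value.startswith(strip_prefix):
        value = value[len(strip_prefix):]
    if strip_suffix and value.endswith(strip_suffix):
        value = value[:-len(strip_suffix)]
    if replace:
        value = value.replace(*replace)
    return value

print(postprocess(["No. 12345", "No. 67890"], order=1, strip_prefix="No. "))  # 67890
```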
In some embodiments, the text extraction template is further used for inputting the field content in the candidate region into a pre-trained text classification model to obtain a classification result corresponding to that field content; matching the classification result with the keyword to determine a second matching result; screening the target field out of the first matching result based on the second matching result; and determining the semantic information based on the keyword and the target field.
On many cards, a field is delimited by its own key and the key of the next field. If the printed key is much smaller than the font of the value, and is perhaps even difficult to recognize with the naked eye, the OCR result for the key is even harder to guarantee; in that case, if the field range is still delimited by the key content, the usefulness of the text extraction template is limited.
Therefore, a text classification function is added so that keywords can instead be matched based on the field content in the candidate region. Optionally, the text classification model can define field types that cover most cards, such as name, address, date, id and gender, to be matched against keywords such as "name", "address", "date of birth" and "certificate number", so as to determine the key corresponding to the candidate region. With the assistance of the text classification model, the above limitation of the text extraction template can be overcome.
Specifically, the text classification model can be obtained through supervised learning: a sample text and its category label are acquired, the sample text is input into the text classification model to obtain a text classification result, and the model is updated based on the difference between the classification result and the label.
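The embodiments leave the classifier itself open; one hedged possibility is a character-level model from scikit-learn, which is enough to sketch the supervised procedure (the sample texts and labels below are invented):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented sample texts and category labels, only to show the shape of the training step.
samples = ["1990-01-01", "John Smith", "42 Example Road, Springfield", "X1234567", "male"]
labels  = ["date",        "name",       "address",                      "id",       "gender"]

text_classifier = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(1, 3)),  # character n-grams
    LogisticRegression(max_iter=1000),
)
text_classifier.fit(samples, labels)                 # update the model from the labels
print(text_classifier.predict(["1988-12-31"]))       # ideally -> ['date']
```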
On the one hand, several fields in the card may satisfy the configured regular expression, and the text classification function makes it possible to select the one that is needed; on the other hand, the text classification may be wrong, and the limited range of the text extraction template can then be used to select the required field. Adding the text classification function therefore improves both the success rate and the accuracy of structuring.
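Combining the two signals, a sketch of the screening step keeps only those candidates of the first regular expression whose predicted field type agrees with the keyword (the keyword-to-type mapping is an assumption):

```python
# Assumed mapping from template keywords to the classifier's field types.
KEYWORD_TO_TYPE = {
    "name": "name",
    "date of birth": "date",
    "certificate number": "id",
    "address": "address",
    "gender": "gender",
}

def screen_by_classification(keyword, first_matches, classifier):
    """Second matching result: keep candidates whose predicted class agrees with the keyword."""
    wanted = KEYWORD_TO_TYPE.get(keyword)
    kept = [m for m in first_matches if classifier.predict([m])[0] == wanted]
    # Semantic information: key from the keyword, value from the screened target field.
    return {keyword: kept[0]} if kept else None
```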
Fig. 2 schematically illustrates a schematic diagram of a card image structuring process according to one or more embodiments of the present disclosure in a specific scenario.
As shown in fig. 2, the text extraction template is an editable graphical interface. The structure of the text extraction template comprises:
key: the keys in the card image, such as a Chinese title or an English name, are listed in the leftmost column, and a checkbox selects whether the value corresponding to each key is extracted;
rect: used to set the coordinates of the candidate region of the target field corresponding to the key; the candidate range is shown as a box on the card image;
re: used to configure the first regular expression corresponding to the key, so that a matching result is extracted from the candidate region;
re_no: used to configure the second, "does-not-contain" regular expression, so that unwanted character strings are screened out of the matching result of the first regular expression; if the unwanted strings follow a certain rule, they can be configured in this column;
order: used to set which match is returned when a global search is performed with the first regular expression;
supplement: used to add supplemental configuration conditions, for example requiring the result to start or end with a given string, returning the matching result in units of boxes, or replacing a given string with an empty string. Optionally, the text classification model described in the above embodiments can also be configured in this column, and functions added later can be configured here flexibly.
When a new card type needs to be structured, the parameters of the text extraction template can be configured manually, and different requirements can be implemented flexibly through the supplement column. The text extraction template therefore highly summarizes and abstracts the structuring rules of cards; different cards can share this set of graphical development tools, structuring becomes simpler and more orderly, and large-scale development becomes more convenient.
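As a concrete, invented example (the card type and every value are hypothetical), one entry of such a template, serialized with the column names of Fig. 2, could look like:

```python
# One entry of a text extraction template for a hypothetical card type.
TEMPLATE = [
    {
        "key": "certificate number",           # keyword printed on the card; checkbox ticked
        "extract_value": True,                  # whether the value for this key is extracted
        "rect": (40, 420, 620, 470),            # candidate region of the target field
        "re": r"\d{17}[\dX]",                   # first regular expression
        "re_no": r"\d{4}-\d{2}-\d{2}",          # second regular expression (strings to exclude)
        "order": 0,                             # which global match to return
        "supplement": {"replace": (" ", "")},   # supplemental configuration conditions
    },
]
```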
One or more embodiments of the present disclosure further provide an identity authentication method, as shown in fig. 3, including:
s200: and acquiring a card image to be identified.
S202: and carrying out OCR (optical character recognition) on the card image to obtain an OCR recognition result.
S204: the OCR recognition result is structured by the card image structuring processing method described above to obtain semantic information of the card image.
S206: and matching the semantic information of the card image with the pre-stored target identity information, and determining an identity authentication result based on the matching result.
Specifically, the card image to be recognized can be acquired with an image acquisition device, and text line detection and text line recognition are then performed with an OCR recognition algorithm to obtain an OCR recognition result in text form. A text extraction template is configured in advance for the card type to be recognized, and the configured template is used to structure the OCR recognition result into semantic information, such as key-value pairs, in the same format as the pre-stored target identity information. Finally, the matching result is determined from the comparison between the semantic information of the card image to be recognized and the pre-stored target identity information, which gives the identity authentication result.
Optionally, the semantic features of the card image to be recognized and the pre-stored target identity features can each be extracted with a feature extraction network, and the similarity between them can then be calculated. If the similarity is higher than a preset similarity threshold, the matching succeeds and identity authentication passes; otherwise, the matching fails and identity authentication does not pass.
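Where this optional feature-based comparison is used, the decision reduces to a thresholded similarity; a sketch with NumPy follows (the feature extraction network itself is not shown and the threshold value is an assumption):

```python
import numpy as np

def cosine_similarity(a, b):
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def authenticate(semantic_feature, target_identity_feature, threshold=0.85):
    """Authentication passes when the similarity exceeds the preset threshold."""
    return cosine_similarity(semantic_feature, target_identity_feature) >= threshold
```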
Using the card image structuring processing method disclosed in one or more embodiments of this specification to structure the OCR recognition result of the card image to be recognized makes structuring simple and fast with a high success rate, which improves the accuracy and efficiency of identity authentication and adapts to the structuring requirements of various card types.
In some embodiments, there is provided a card image structuring processing apparatus, as shown in fig. 4, including:
a first data acquisition module 30 configured to acquire OCR recognition results of the target card image;
a template generation module 32 configured to generate a text extraction template based on configuration parameters of a user; the text extraction template is used for describing the association between a preset keyword and a candidate region of the target field corresponding to the keyword, matching the field content in the candidate region with a preset regular expression to obtain a first matching result, and determining semantic information based on the first matching result and the keyword;
the semantic information extraction module 34 is configured to extract semantic information from the OCR recognition result based on the text extraction template.
It should be noted that the OCR recognition result obtained by the first data acquisition module contains a series of texts distributed over different areas of the card image, such as the card holder's name, gender and date of birth. To facilitate identity-information comparison, the area in which each text is located needs to be analyzed so that the semantics corresponding to each text can be sorted out, i.e. the OCR recognition result is structured. For example, "xxxx year xx month xx day" in the extracted text corresponds to the keyword "date of birth", and "male" or "female" corresponds to the keyword "gender".
In some embodiments, the semantic information extracted by the semantic information extraction module is a key-value pair, where the key is determined by the keyword and the value is determined by the matching result.
For example, if the keyword is gender and the matching result is "male" or "female", the semantic information extraction module can form a key-value pair with the key "gender" and the value "male" or "female". By extracting the key-value pairs in the OCR recognition result, the module clearly identifies the semantics of each text recognized by OCR, so that when identity information is checked, the stored true value and the value recognized from the card image can both be selected by key, screening out the information to be compared, simplifying the comparison step and improving identity-authentication efficiency.
The template generation module can present the text extraction template as a graphical interface; for different card types, the parameters in the template only need to be configured manually, after which semantic information is extracted automatically from the OCR recognition result. The text extraction template thus highly summarizes the structuring rules of cards: different types of cards can be processed by configuring different parameters, which improves structuring efficiency and makes it easy to adapt to new card types.
In some embodiments, the template generation module pre-configures the text extraction template in a manner that includes:
the template generation module sets a keyword and a first regular expression for extracting a target field corresponding to the keyword based on a text extraction task;
determining candidate areas of target fields corresponding to keywords according to the card types of the target card images;
setting a matching range of the first regular expression and a return condition of a matching result of the first regular expression.
In general, the target fields corresponding to different keywords follow different rules, such as a limit on the number of characters, a specific character form, or specific sub-fields within the target field. These rules can be expressed as regular expressions for extracting the target fields that match the keywords. It should be noted that, for keywords with the same meaning, the rules of the corresponding target fields are not necessarily the same across card types (for example, certificate-number formats differ between cards), so the first regular expression needs to be set according to the specific card type.
Within the whole card image, the field matched by the first regular expression may not be unique. For a given card type, however, the area of the target field corresponding to each keyword is fixed, so the template generation module can determine the candidate region of the target field from the card type of the target card image, thereby limiting the matching range of the first regular expression, narrowing the search and improving matching efficiency. Specifically, the template generation module can represent the candidate region of the target field corresponding to a keyword by coordinates.
The template generation module sets the matching range of the first regular expression to the candidate region of the target field corresponding to the keyword. After the target field is searched for within this range, if the return condition is satisfied, the target field is determined to be the matching result; otherwise, matching within this region fails and the matching range needs to be reset.
Specifically, the configuration of the text extraction template by the template generation module further includes:
the template generation module sets a second, "does-not-contain" regular expression, which is used to screen unwanted character strings out of the matching result of the first regular expression.
The second regular expression is useful when the matching result of the first regular expression contains unnecessary strings, for example when the matching range of the first regular expression is too large and the result includes similarly formatted strings in addition to the required one; if those unwanted strings follow a certain rule, the second regular expression can be used to remove them from the matching result.
In other embodiments, the template generation module may also set supplemental configuration conditions in the text extraction template to flexibly extend the functionality of the text extraction template.
Optionally, the template generation module can achieve format normalization of the matching result by additionally configuring a beginning character (string) or an ending character (string) of the matching result, replacing a sub-character (string) in the matching result, or specifying whether the matching result is returned in units of boxes. The matching result can also be further filtered through supplemental configuration conditions, for example by judging whether the text category of the target field matches the corresponding keyword. In some more specific embodiments, when a global-scope search is performed, i.e. the first regular expression is matched over the full image, several candidate fields may match the first regular expression, so the template generation module can also specify in the supplemental configuration conditions which match to return.
In some embodiments, the template generation module is further configured to input the field content in the candidate region into a pre-trained text classification model to obtain a classification result corresponding to that field content; match the classification result with the keyword to determine a second matching result; screen the target field out of the first matching result based on the second matching result; and determine semantic information based on the keyword and the target field.
Specifically, the text classification model can be obtained through supervised learning: a sample text and its category label are acquired, the sample text is input into the text classification model to obtain a text classification result, and the model is updated based on the difference between the classification result and the label.
On the one hand, several fields in the card may satisfy the configured regular expression, and the text classification function makes it possible to select the one that is needed; on the other hand, the text classification may be wrong, and the limited range of the text extraction template can then be used to select the required field. Adding the text classification function therefore improves both the success rate and the accuracy of structuring.
In some embodiments, there is also provided an identity authentication device, as shown in fig. 5, including:
a second data acquisition module 40 configured to acquire a card image to be recognized;
an OCR recognition module 42 configured to perform OCR recognition on the card image to obtain an OCR recognition result;
a structuring processing module 44 configured to perform structuring processing on the OCR recognition result by using the card image structuring processing method as described in any one of the above, so as to obtain semantic information of the card image;
The identity authentication module 46 is configured to match the semantic information of the card image with the pre-stored target identity information, and determine an identity authentication result based on the matching result.
Specifically, the second data acquisition module can acquire the card image to be recognized with an image acquisition device, and the OCR recognition module detects and recognizes the text lines with an OCR recognition algorithm to obtain an OCR recognition result in text form. The structuring processing module then configures a text extraction template in advance for the card type to be recognized and uses the configured template to structure the OCR recognition result into semantic information, such as key-value pairs, in the same format as the pre-stored target identity information. Finally, the identity authentication module determines the matching result from the comparison between the semantic information of the card image to be recognized and the pre-stored target identity information, which gives the identity authentication result.
Optionally, the identity authentication module can extract the semantic features of the card image to be recognized and the pre-stored target identity features with a feature extraction network, and then calculate the similarity between them. If the similarity is higher than a preset similarity threshold, the matching succeeds and identity authentication passes; otherwise, the matching fails and identity authentication does not pass.
Fig. 6 schematically illustrates an identity authentication system according to an embodiment of the present disclosure. It should be noted that the card image structuring processing method and the identity authentication method described in one or more embodiments of the present disclosure can rely on this identity authentication system for implementation, but are not limited to it.
Referring to fig. 6, the identity authentication system includes an acquisition end 50 and a recognition end 52, which may be deployed in two separate terminal devices. The acquisition end 50 is connected to the recognition end 52 via a communication link, which may be a wired or a wireless network. For example, the acquisition end 50 may establish a communication connection with the recognition end 52 using WIFI, Bluetooth, infrared or similar communication means. Alternatively, the acquisition end 50 may establish a communication connection with the recognition end 52 through a mobile network, whose network system may be any of 2G (GSM), 2.5G (GPRS), 3G (WCDMA, TD-SCDMA, CDMA2000, UMTS), 4G (LTE), 4G+ (LTE+), WiMAX and the like.
The acquisition end 50 may be a terminal device with an image acquisition function, such as a mobile phone, a tablet computer, a notebook computer or a smart watch, and is configured to obtain the card image to be recognized.
The recognition end 52 is provided with a pre-configured text extraction template for structuring, together with an OCR recognition algorithm and an identity-information matching algorithm. It acquires the card image collected by the acquisition end over the communication link, extracts the identity information to be recognized after OCR recognition and structuring, and obtains the identity authentication result through identity-information matching. The recognition end may be any apparatus, device, platform or device cluster with computing and processing capabilities. This embodiment does not limit the implementation form of the recognition end: it may be a single server, a server cluster composed of several servers, or a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system. The text extraction template can be configured manually at the recognition end or the acquisition end, or on one or more other servers.
In other embodiments, the acquisition end 50 and the recognition end 52 of the identity authentication system may also be deployed as an acquisition module and a recognition module on the same terminal device. As shown in fig. 7, the terminal devices may include user device 54, user device 56 and user device 58, each of which can perform card image acquisition, OCR recognition, structuring and identity authentication under the user's operation. Specifically, each user device may configure a text extraction template to structure the OCR recognition result, or the template may be configured with another server or terminal device; the configured text extraction template is stored in the data storage system 60 in the form of program code, and the user device invokes the program code in the data storage system 60 to implement the card image structuring processing method and the identity authentication method provided in the embodiments of this specification.
An embodiment in the present specification further provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the card image structuring processing method as described in any one of the above.
One embodiment in the present specification also provides an electronic device, including
One or more processors; and
a memory associated with the one or more processors, the memory for storing program instructions that, when read and executed by the one or more processors, perform the steps of the card image structuring processing method of any one of the above.
An embodiment in the present specification further provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the identity authentication method as described in any one of the above.
One embodiment in the present specification also provides an electronic device, including
One or more processors; and
a memory associated with the one or more processors, the memory for storing program instructions that, when read and executed by the one or more processors, perform the steps of the identity authentication method of any one of the above.
Fig. 8 exemplarily shows a block diagram of an electronic device provided in an embodiment of the present disclosure, namely a schematic structural diagram of a computer system 700 of a terminal device or server suitable for implementing an embodiment of the present invention. The terminal device or server shown in fig. 8 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.
In a typical configuration, computer 700 includes one or more processors (CPUs) 702, a memory 704, a network interface 706, an input interface 708, and an output interface 710.
Memory 704 may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash RAM. Memory is an example of a computer-readable medium.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. The above-described functions defined in the method of the present invention are performed when the computer program is executed by a Central Processing Unit (CPU) 702.
The computer readable medium shown in the present invention may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous. It will also be noted that each block of the figures, and combinations of blocks in the figures, can be implemented by special purpose hardware-based systems which perform the specified functions or operations, or combinations of special purpose hardware and computer instructions.
It should be noted that the above-mentioned embodiments are merely examples of the present invention, and it is obvious that the present invention is not limited to the above-mentioned embodiments, and many similar variations are possible. All modifications attainable or obvious from the present disclosure set forth herein should be deemed to be within the scope of the present disclosure.

Claims (12)

1. A card image structuring processing method comprises the following steps:
Acquiring an OCR recognition result of the target card image;
extracting semantic information from the OCR recognition result through a pre-configured text extraction template;
the text extraction template is used for describing the association between a preset keyword and a candidate region of the target field corresponding to the keyword, matching the field content in the candidate region with a preset regular expression to obtain a first matching result, and determining the semantic information based on the first matching result and the keyword.
2. The method of claim 1, wherein the semantic information is a key-value pair, wherein
the key is determined by the keyword and the value is determined by the matching result.
3. The method of claim 1, wherein the text extraction template is further used for inputting the field content in the candidate region into a pre-trained text classification model to obtain a classification result corresponding to that field content; matching the classification result with the keyword to determine a second matching result; screening the target field out of the first matching result based on the second matching result; and determining the semantic information based on the keyword and the target field.
4. The method of claim 1, the text extraction template being preconfigured in a manner comprising:
setting a keyword and a first regular expression for extracting a target field corresponding to the keyword based on a text extraction task;
determining candidate areas of target fields corresponding to the keywords according to the card types of the target card images;
setting a matching range of the first regular expression and a return condition of a matching result of the first regular expression.
5. The method of claim 4, the configuration of the text extraction template further comprising:
setting a second, "does-not-contain" regular expression, which is used to screen unwanted character strings out of the matching result of the first regular expression.
6. An identity authentication method, comprising:
acquiring a card image to be identified;
OCR recognition is carried out on the card image, and an OCR recognition result is obtained;
structuring the OCR result by the method according to any one of claims 1 to 5 to obtain semantic information of the card image;
and matching the semantic information of the card image with pre-stored target identity information, and determining an identity authentication result based on a matching result.
7. A card image structuring processing device comprising:
the first data acquisition module is configured to acquire an OCR recognition result of the target card image;
the template generation module is configured to generate a text extraction template based on configuration parameters of a user; the text extraction template is used for describing the association between a preset keyword and a candidate region of the target field corresponding to the keyword, matching the field content in the candidate region with a preset regular expression to obtain a first matching result, and determining semantic information based on the first matching result and the keyword;
and the semantic information extraction module is configured to extract the semantic information from the OCR recognition result based on the text extraction template.
8. An identity authentication device comprising:
the second data acquisition module is configured to acquire card images to be identified;
the OCR recognition module is configured to perform OCR recognition on the card image to obtain an OCR recognition result;
a structuring processing module configured to perform structuring processing on the OCR recognition result by using the method according to any one of claims 1 to 5, to obtain semantic information of the card image;
and the identity authentication module is configured to match the semantic information of the card image with pre-stored target identity information and determine an identity authentication result based on the matching result.
9. A computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method of any of claims 1 to 5.
10. An electronic device, comprising:
one or more processors; and a memory associated with the one or more processors, the memory for storing program instructions that, when read and executed by the one or more processors, perform the steps of the method of any of claims 1 to 5.
11. A computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method of claim 6.
12. An electronic device, comprising:
one or more processors; and a memory associated with the one or more processors, the memory for storing program instructions that, when read and executed by the one or more processors, perform the steps of the method of claim 6.
CN202311295228.3A 2023-10-08 2023-10-08 Card image structuring processing method and device Pending CN117373042A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311295228.3A CN117373042A (en) 2023-10-08 2023-10-08 Card image structuring processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311295228.3A CN117373042A (en) 2023-10-08 2023-10-08 Card image structuring processing method and device

Publications (1)

Publication Number Publication Date
CN117373042A true CN117373042A (en) 2024-01-09

Family

ID=89390331

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311295228.3A Pending CN117373042A (en) 2023-10-08 2023-10-08 Card image structuring processing method and device

Country Status (1)

Country Link
CN (1) CN117373042A (en)

Similar Documents

Publication Publication Date Title
US11195006B2 (en) Multi-modal document feature extraction
US10963685B2 (en) Generating variations of a known shred
WO2021151270A1 (en) Method and apparatus for extracting structured data from image, and device and storage medium
CN111783471B (en) Semantic recognition method, device, equipment and storage medium for natural language
US20170076152A1 (en) Determining a text string based on visual features of a shred
CN111858843B (en) Text classification method and device
CN113205047B (en) Medicine name identification method, device, computer equipment and storage medium
CN113611405A (en) Physical examination item recommendation method, device, equipment and medium
CN112257446A (en) Named entity recognition method and device, computer equipment and readable storage medium
CN114612921B (en) Form recognition method and device, electronic equipment and computer readable medium
CN113722438A (en) Sentence vector generation method and device based on sentence vector model and computer equipment
CN106156794B (en) Character recognition method and device based on character style recognition
CN115130613B (en) False news identification model construction method, false news identification method and device
CA3140455A1 (en) Information extraction method, apparatus, and system
AU2021371167B2 (en) Improving handwriting recognition with language modeling
US20210406451A1 (en) Systems and Methods for Extracting Information from a Physical Document
CN112199954A (en) Disease entity matching method and device based on voice semantics and computer equipment
CN114860667B (en) File classification method, device, electronic equipment and computer readable storage medium
CN114528851B (en) Reply sentence determination method, reply sentence determination device, electronic equipment and storage medium
CN113988223B (en) Certificate image recognition method, device, computer equipment and storage medium
CN115294593A (en) Image information extraction method and device, computer equipment and storage medium
CN117373042A (en) Card image structuring processing method and device
CN114913320A (en) Template-based certificate universal structuring method and system
US11335108B2 (en) System and method to recognise characters from an image
CN113869398A (en) Unbalanced text classification method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination