CN111259894B - Certificate information identification method and device and computer equipment - Google Patents

Info

Publication number
CN111259894B
Authority
CN
China
Prior art keywords
target
picture
coding
page
code
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010063773.XA
Other languages
Chinese (zh)
Other versions
CN111259894A (en)
Inventor
夏雅楠
张晗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Puxin Hengye Technology Development Beijing Co ltd
Original Assignee
Puxin Hengye Technology Development Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Puxin Hengye Technology Development Beijing Co ltd
Priority to CN202010063773.XA
Publication of CN111259894A
Application granted
Publication of CN111259894B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Editing Of Facsimile Originals (AREA)
  • Image Processing (AREA)

Abstract

The application relates to a certificate information identification method, a device and computer equipment. When the authenticity of a target code set (such as a vehicle certificate number) in a certificate is identified, a whole code region picture corresponding to the target code set and at least one single code region picture corresponding to at least one single code contained in the target code set are extracted from a picture of a target code page. Feature extraction is then performed on the whole code region picture and the single code region picture to obtain the global features and the local features of the target code set, respectively, and the authenticity of the target code set is identified by combining the global and local features, so that fake certificates are identified. Because the method is based on the characteristics of the target code set in the certificate, fake certificates involving code forgery, alteration and the like can be identified automatically and efficiently; in addition, because authenticity is identified from features of different dimensions of the code set, global and local, the accuracy of identifying fake certificate information can be further improved.

Description

Certificate information identification method and device and computer equipment
Technical Field
The application belongs to the technical field of information identification and anti-counterfeiting identification, and particularly relates to a certificate information identification method, a certificate information identification device and computer equipment.
Background
Certificates such as automobile certificates, identity cards and academic/seniority certificates are important proof of a user's identity or qualifications. Certificate forgery is a serious form of fraud and often brings risks or harmful consequences to the social or economic activities in which the certificates are used.
At present, certificates such as automobile certificates are mainly identified by manual examination or by automatic OCR (Optical Character Recognition) identification. In manual examination, a worker checks whether the certificate information is false based on working experience; this is inefficient, is limited by the worker's experience, and can hardly guarantee a highly accurate result. Automatic OCR identification mainly judges whether a certificate really exists by automatically recognizing whether the certificate code (such as the automobile certificate number) is a code that really exists; it cannot identify fake certificates whose codes have been forged, altered and the like.
In summary, existing certificate information identification methods have limited efficiency and insufficient accuracy and leave considerable room for improvement, so an effective method for identifying the authenticity of certificate information is needed.
Disclosure of Invention
In view of the foregoing, the present application provides a certificate information authentication method, apparatus and computer device, so as to automatically, efficiently and accurately identify the authenticity of certificate information that has been forged, altered and the like.
Therefore, the application discloses the following technical scheme:
a credential information authentication method comprising:
obtaining a picture of a target coding page in the certificate;
extracting an integral coding region picture corresponding to a target coding set and at least one single coding region picture corresponding to at least one single coding from the picture of the target coding page; the target set of codes includes at least one single code;
extracting the characteristics of the whole coding region picture to obtain global characteristics of a target coding set;
extracting the characteristics of at least one single coding region picture to obtain the local characteristics of a target coding set;
and identifying the authenticity of the target coding set based on the global characteristic and the local characteristic.
Preferably, in the above method, obtaining the picture of the target coding page in the certificate comprises:
obtaining at least one picture of at least one page of the document;
and identifying a picture of a target coding page from the at least one picture by using a first processing model, wherein the picture of the target coding page comprises a target coding set.
Preferably, in the above method, extracting, from the picture of the target coding page, the whole coding region picture corresponding to the target coding set and the at least one single coding region picture corresponding to at least one single code comprises:
identifying a first region corresponding to a target coding set in a picture of the target coding page;
extracting the picture of the first region from the picture of the target coding page to obtain an integral coding region picture of a target coding set;
identifying each second region corresponding to each single code in the whole coding region picture;
and based on the identified second areas, segmenting the whole coding area picture to obtain at least one single coding area picture corresponding to at least one single coding.
Preferably, in the above method, extracting the features of the whole coding region picture to obtain the global features of the target coding set, and extracting the features of the at least one single coding region picture to obtain the local features of the target coding set, comprises:
inputting the integral coding region picture into a first feature extraction sub-network of a feature extraction layer of a second processing model to obtain global features of a target coding set output by the first feature extraction sub-network;
and inputting at least one single coding region picture into a second characteristic extraction sub-network of a characteristic extraction layer of a second processing model to obtain local characteristics of a target coding set output by the second characteristic extraction sub-network.
Preferably, in the above method, identifying the authenticity of the target coding set based on the global features and the local features comprises:
performing true and false identification processing on the global features and the local features by using other functional layers except the feature extraction layer in the second processing model to obtain a first identification result of the overall code of the target code set and at least one second identification result of at least one single code;
and determining the authenticity of the target coding set based on the first identification result and the at least one second identification result.
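The final determination step can be illustrated with a minimal sketch. The text at this point does not state how the first identification result and the second identification results are combined, so the strict rule below (the set is genuine only if the whole code and every single code are judged genuine) is an assumption chosen for illustration, and the function name is hypothetical.

```python
from typing import Sequence


def decide_authenticity(whole_is_genuine: bool,
                        single_is_genuine: Sequence[bool]) -> bool:
    """Combine one whole-code result with per-code results.

    Assumed rule (not taken from the source): the target coding set is
    accepted as genuine only if the whole code passes and every single
    code passes.
    """
    return whole_is_genuine and all(single_is_genuine)
```

Under this assumed rule a single suspicious character is enough to flag the whole certificate code; a more tolerant strategy could instead compare the number of failed single codes against a threshold.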
Preferably, the above method further comprises:
obtaining a picture of a target coding page with identification errors;
and performing iterative training on the second processing model by using the pictures of the target coding page with the identification errors so as to improve the accuracy of the second processing model.
A credential information authentication apparatus comprising:
the acquisition unit is used for acquiring pictures of target coding pages in the certificates;
a first extracting unit, configured to extract, from the pictures of the target encoding page, an overall encoding region picture corresponding to the target encoding set and at least one single encoding region picture corresponding to at least one single encoding; the target set of codes includes at least one single code;
the second extraction unit is used for extracting the characteristics of the whole coding region picture to obtain the global characteristics of the target coding set; extracting the characteristics of at least one single coding region picture to obtain the local characteristics of a target coding set;
and the identification unit is used for identifying the authenticity of the target coding set based on the global characteristic and the local characteristic.
Preferably, in the above device, the first extraction unit is specifically configured to:
identifying a first region corresponding to a target coding set in a picture of the target coding page;
extracting the picture of the first region from the picture of the target coding page to obtain an integral coding region picture of a target coding set;
identifying each second region corresponding to each single code in the whole coding region picture;
and based on the identified second areas, segmenting the whole coding area picture to obtain at least one single coding area picture corresponding to at least one single coding.
Preferably, in the above device, the second extraction unit is specifically configured to:
inputting the integral coding region picture into a first feature extraction sub-network of a feature extraction layer of a second processing model to obtain global features of a target coding set output by the first feature extraction sub-network;
inputting at least one single coding region picture into a second feature extraction sub-network of a feature extraction layer of a second processing model to obtain local features of a target coding set output by the second feature extraction sub-network;
the authentication unit is specifically configured to:
performing true and false identification processing on the global features and the local features by using other functional layers except the feature extraction layer in the second processing model to obtain a first identification result of the overall code of the target code set and at least one second identification result of at least one single code;
and determining the authenticity of the target coding set based on the first identification result and the at least one second identification result.
A computer device, comprising:
a memory for storing at least one set of instructions;
a processor for invoking and executing the set of instructions in the memory, by executing the set of instructions, performing the credential information authentication method as defined in any one of the preceding claims.
According to the above scheme, when the certificate information identification method provided by the application identifies the authenticity of a target code set (such as an automobile certificate number) in a certificate, a whole code region picture corresponding to the target code set and a single code region picture of at least one single code included in the target code set are extracted from the picture of the target code page. Feature extraction is then performed on the whole code region picture and the single code region picture to obtain the global features and the local features of the target code set, respectively, and the authenticity of the target code set is identified by combining the two, so that counterfeit certificates are identified. Because the method is based on the characteristics of the target code set in the certificate, fake certificates involving code forgery, alteration and the like can be identified automatically and efficiently; in addition, because authenticity is identified from features of different dimensions of the code set, global and local, the accuracy of identifying fake certificate information can be further improved.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present application, and that other drawings may be obtained according to the provided drawings without inventive effort to a person skilled in the art.
FIG. 1 is a schematic flow chart of a method for authenticating document information according to an embodiment of the present application;
FIG. 2 is a schematic flow chart of another method for authenticating document information according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a picture identifying a target encoded page from at least one picture of a document provided in an embodiment of the present application;
FIG. 4 is an exemplary diagram of identifying an overall coding area corresponding to a license number from a license information page according to an embodiment of the present application;
FIG. 5 is an exemplary diagram of slicing each single encoded region picture from the whole encoded region picture according to the embodiment of the present application;
FIG. 6 is a block diagram of a multi-Densenet model provided by an embodiment of the present application;
FIG. 7 is an exemplary diagram of a process for determining authenticity of a target encoding set based on a corresponding decision strategy according to an embodiment of the present application;
FIG. 8 is a schematic flow chart of a method for authenticating document information according to an embodiment of the present application;
FIG. 9 is a schematic diagram of a certificate information authentication apparatus according to an embodiment of the present application;
FIG. 10 is a schematic diagram of another configuration of a certificate information authentication apparatus provided in an embodiment of the present application;
FIG. 11 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
The application provides a certificate information identification method, a certificate information identification device and computer equipment, and aims to automatically, efficiently and accurately identify fake certificates of types such as code forgery and alteration. In reality, there are many kinds of certificate forgery; the embodiments of the application mainly aim at identifying forgery of the certificate code (such as a vehicle certificate number), which is the most common kind and has the most serious impact. The certificate information authentication method, apparatus and computer device of the present application are described below through various embodiments.
In an alternative embodiment of the present application, a certificate information authentication method is disclosed, and the method may be applied to a computer device, where the computer device may be, but is not limited to, a mobile phone, a tablet computer, a personal digital assistant, a handheld scanning terminal, or a portable terminal in a general/special purpose computing or configuration environment (such as a notebook), a desktop computer, a large or medium size computer, or even a server.
Referring to fig. 1, a schematic flow chart of a method for identifying certificate information according to the present embodiment is provided, where the method for identifying certificate information includes:
Step 101, obtaining a picture of a target coding page in the certificate.
The certificates can be, but are not limited to, any type of certificates such as car certificates, identity cards, academic/seniority certificates and the like. The target code page may be an information page having a document code (referred to herein as a target code set) such as a car license number, an identification card number, or an academic certificate number among the documents.
In the embodiments of the application, the certificate information identification method is mainly described by taking as an example the authentication of a typical target code set, the vehicle certificate number, against forgery, alteration and other faking behaviors.
When certificate information such as the vehicle certificate number needs to be identified, at least one picture corresponding to at least one information page of the certificate to be identified can first be obtained by photographing, copying, scanning or similar techniques. The target code page carrying the target code set, such as the vehicle certificate number, is then identified from the at least one picture of the certificate based on a corresponding information identification technique such as OCR (Optical Character Recognition), in preparation for identifying the authenticity of the target code set.
Step 102, extracting an integral coding region picture corresponding to the target coding set and at least one single coding region picture corresponding to at least one single coding from the pictures of the target coding page.
It will be readily appreciated that the target code set, such as a license number, includes at least one single code, which may be, but is not limited to, numbers, letters, or other special characters.
Taken as a whole, a target code set such as the vehicle certificate number has global (overall) features, while each single code in the set also has features of its own, which together form the local features of the target code set. To accurately identify the authenticity of a target code set such as the vehicle certificate number, and thus effectively identify faking behaviors such as forgery and alteration of certificate information, the application combines the global features and the local features of the target code set.
Based on the idea, after obtaining the picture of the target coding page in the certificate, the whole coding region picture corresponding to the target coding set can be further cut out from the picture of the target coding page for extracting global features of the target coding set, and at least one single coding region picture corresponding to at least one single code is cut out for extracting local features of the target coding set.
Step 103, extracting the features of the whole coding region picture to obtain the global features of the target coding set.
Step 104, extracting the features of at least one single coding region picture to obtain the local features of the target coding set.
The inventors found that, for a forged certificate, on the one hand, overall characteristics of the target code set such as the vehicle certificate number, for example the overall image quality, the character spacing and the shape in which the numbers are arranged, differ from those of a real vehicle certificate; this embodiment takes these characteristics as the global (overall) features of the target code set. On the other hand, the printing and drawing details of single codes on a fake certificate are often inconsistent with those on a real certificate: the font, font size or style (stroke weight, for example whether the strokes are bold) of the codes may differ subtly from those of a real certificate, or one or a few short horizontal lines may be added or missing under certain characters. This embodiment takes any one or more of these detail characteristics as the local features of the target code set.
In implementation, the global features of the target code set can be obtained by extracting features such as image quality, character spacing and number arrangement shape from the whole coding region picture corresponding to the target code set. Correspondingly, features such as font, font size, style (whether the strokes are bold) and/or special added or missing symbols (such as underlines) are extracted from the single coding region pictures corresponding to the single codes to obtain the local features of the target code set.
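To make the distinction between global and local features concrete, the sketch below computes one hand-crafted example of each: character spacing statistics over the whole code (a global feature) and the ink ratio of one character as a rough proxy for stroke weight (a local feature). These particular measures, the `(x, y, width, height)` box layout and the function names are assumptions for illustration only; they are not the extraction method of the embodiments.

```python
from statistics import mean, pstdev
from typing import List, Tuple

# Assumed box layout: (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


def spacing_features(boxes: List[Box]) -> Tuple[float, float]:
    """Global feature: mean and spread of the horizontal gaps between
    neighbouring characters. Requires at least two boxes."""
    ordered = sorted(boxes, key=lambda b: b[0])
    gaps = [nxt[0] - (cur[0] + cur[2])
            for cur, nxt in zip(ordered, ordered[1:])]
    return mean(gaps), pstdev(gaps)


def ink_ratio(binary_char: List[List[int]]) -> float:
    """Local feature: fraction of foreground (1) pixels in one
    binarized character picture, a rough proxy for stroke weight."""
    total = sum(len(row) for row in binary_char)
    return sum(map(sum, binary_char)) / total
```

Evenly printed codes give a gap spread close to zero, whereas a pasted-in or redrawn character would tend to change both the gap spread and its own ink ratio.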
Step 105, identifying the authenticity of the target coding set based on the global features and the local features.
After the global features and local features of the target code set of the certificate to be identified are extracted, the authenticity of the target code set is identified by combining its global features and local features, to determine whether faking behaviors such as forgery or alteration exist.
As one alternative implementation, the authenticity of the target code set can be identified by feature matching. In this approach, the global features and local features of the target code set (such as the vehicle certificate number) of real certificates are extracted and stored in advance as reference features, which provide the basis for matching. The authenticity of the target code set is then identified by matching the global features of the target code set of the certificate to be identified against the preset global reference features, and matching its local features against the preset local reference features.
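The feature-matching alternative can be sketched as follows. The text does not specify a matching criterion, so the Euclidean distance, the threshold value and all function names here are assumptions for illustration.

```python
import math
from typing import Sequence


def matches_reference(features: Sequence[float],
                      reference: Sequence[float],
                      max_distance: float) -> bool:
    """A feature vector matches when its Euclidean distance to the
    stored reference vector is within the (assumed) threshold."""
    return math.dist(features, reference) <= max_distance


def authenticate_by_matching(global_feat: Sequence[float],
                             global_ref: Sequence[float],
                             local_feats: Sequence[Sequence[float]],
                             local_refs: Sequence[Sequence[float]],
                             max_distance: float = 0.5) -> bool:
    """Genuine only if the global features and every local feature
    vector match their pre-stored reference features."""
    if not matches_reference(global_feat, global_ref, max_distance):
        return False
    return all(matches_reference(feat, ref, max_distance)
               for feat, ref in zip(local_feats, local_refs))
```

For example, `authenticate_by_matching([1.0, 2.0], [1.0, 2.1], [[0.3], [0.4]], [[0.3], [0.5]])` accepts the code set, because every distance is at most 0.5.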
As another alternative implementation manner, a recognition model can be trained in advance based on batches of true and false certificate samples, and global feature and local feature extraction of a target coding set and true and false identification based on feature extraction can be performed by utilizing the pre-trained model. The present embodiment is not limited herein to the implementation of the authentication of the target code set.
As can be seen from the above, in the certificate information authentication method provided in this embodiment, when the authenticity of a target code set (such as a vehicle certificate number) in a certificate is identified, a whole code region picture corresponding to the target code set and at least one single code region picture corresponding to at least one single code contained in the target code set are extracted from the picture of the target code page. Feature extraction is then performed on the whole code region picture and the single code region picture to obtain the global features and local features of the target code set, and the two are combined to identify the authenticity of the target code set, thereby identifying counterfeit certificates. Because the method is based on the characteristics of the target code set in the certificate, fake certificates involving code forgery, alteration and the like can be identified automatically and efficiently; in addition, identifying authenticity from features of different dimensions of the code set, global and local, further improves the accuracy of identifying fake certificate information.
In a further embodiment of the present application, a refinement of the certificate information authentication method is provided, as shown in fig. 2, and the certificate information authentication method may be further implemented by the following procedures:
Step 201, at least one picture of at least one page of the document is obtained.
At least one picture corresponding to at least one page of the certificate to be identified can be obtained through means of photographing, copying or scanning; optionally, each page corresponds to a picture, or only a partial page may be selectively photographed to obtain a picture corresponding to the partial page.
Step 202, identifying a picture of a target coding page from the at least one picture by using a first processing model, wherein the picture of the target coding page comprises a target coding set.
After obtaining at least one picture of at least one page of the document to be identified, as shown in fig. 3, a target code page with a target code set such as a car license number is identified from at least one picture of the document, so as to prepare for authentication of the target code set such as the car license number.
In this embodiment, for the identification of the target code page, a first processing model, which may be, but is not limited to, a Densenet model, is trained in advance.
In practice, taking the authentication of the vehicle certificate number as an example, photographs of the different pages of a large number of vehicle certificates can be used as input for model training, and the output of the model is the type/category information of the page shown in the photograph (such as whether the page is a "target code page", or the confidence that the page belongs to the "target code page"/"non-target code page" class). In other words, in practical implementation the identification of the target code page can be abstracted as a multi-class classification problem.
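Treating page identification as classification can be sketched as below: given the raw class scores that a trained model might output for each page picture, a softmax turns them into confidences, and the picture most confidently classified as the target code page is selected. The score format, the class index and the function names are assumptions; the trained model itself is not shown.

```python
import math
from typing import Dict, List


def softmax(logits: List[float]) -> List[float]:
    """Convert raw class scores into confidences that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pick_target_page(page_logits: Dict[str, List[float]],
                     target_class: int = 0) -> str:
    """Return the picture whose confidence for the (assumed)
    target-code-page class index is highest."""
    confidences = {name: softmax(logits)[target_class]
                   for name, logits in page_logits.items()}
    return max(confidences, key=confidences.get)
```

A production system would probably also require the winning confidence to exceed a minimum value, so that a certificate photographed without its code page is rejected rather than misrouted.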
In this embodiment, a Densenet model is preferably used when training the first processing model. Densenet is a dense convolutional neural network (Dense Convolutional Network) in which each layer receives the outputs of all preceding layers as input, which gives it the following advantages over conventional convolutional networks: it alleviates the vanishing-gradient problem, strengthens feature propagation, encourages feature reuse and reduces the amount of computation. The inventors verified that the accuracy can reach 99% when the Densenet model is applied to identifying the target coding page picture in the application.
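The dense connectivity described above has a simple arithmetic consequence: because every layer in a dense block receives the concatenated outputs of all earlier layers, its input width grows by a fixed amount (the growth rate) per layer. The sketch below only counts channels and does not build a network; the numbers used in the example are illustrative, not the configuration of the embodiments.

```python
from typing import List


def dense_block_input_channels(initial_channels: int,
                               growth_rate: int,
                               num_layers: int) -> List[int]:
    """Input channel count seen by each layer of one dense block.

    Each layer adds `growth_rate` feature maps, and every later layer
    receives the concatenation of the block input and all earlier
    layer outputs.
    """
    channels = []
    available = initial_channels
    for _ in range(num_layers):
        channels.append(available)
        available += growth_rate
    return channels
```

With a block input of 64 channels and a growth rate of 32, four layers see 64, 96, 128 and 160 input channels respectively, which is how earlier features remain directly reusable by later layers.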
Step 203, identifying a first area corresponding to the target coding set in the picture of the target coding page.
The whole coding region corresponding to the target code set, such as the license number, may be first identified from the pictures of the target code page by using, but not limited to, OCR or the like, where the whole coding region is referred to as the first region, and referring to fig. 4, an exemplary diagram for identifying the whole coding region corresponding to the license number from the license information page is provided.
In practical applications, when the whole coding region of the vehicle certificate number is identified, the pictures provided by vehicle owners are usually of many kinds, such as color photographs, photocopies, photographs of photographs and scanned files, and their background content is cluttered and varied. These uncertain interfering factors greatly reduce recognition accuracy. Therefore, this embodiment preprocesses the picture before using OCR to identify the whole coding region, so as to reduce the adverse effect of these factors on recognition accuracy as much as possible. The preprocessing includes, but is not limited to, any one or more of the following: grayscale conversion, binarization, dilation, HSV (Hue, Saturation, Value) extraction, screening of region blocks, and the like.
The picture preprocessing can be done with the OpenCV library. Experiments show that the OCR recognition accuracy is 68% for pictures that are not preprocessed and can reach more than 95% after preprocessing.
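Three of the preprocessing operations named above (grayscale conversion, binarization and dilation) are sketched here in plain Python on nested lists, so that what each step does is visible without any dependency. The fixed threshold of 128, the 3x3 neighbourhood and the function names are assumptions; in practice the OpenCV library mentioned in the text provides counterparts such as `cv2.cvtColor`, `cv2.threshold` and `cv2.dilate`.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # assumed (R, G, B) order, values 0..255


def to_gray(image: List[List[Pixel]]) -> List[List[int]]:
    """Weighted sum of the colour channels (ITU-R BT.601 luma weights)."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in image]


def binarize(gray: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Dark pixels (ink) become 1, light pixels (paper) become 0."""
    return [[1 if value < threshold else 0 for value in row] for row in gray]


def dilate(binary: List[List[int]]) -> List[List[int]]:
    """3x3 dilation: a pixel becomes 1 if it or any neighbour is 1,
    which thickens strokes and merges nearby ink into region blocks."""
    height, width = len(binary), len(binary[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            out[y][x] = int(any(
                binary[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ))
    return out
```

Dilation is what makes the characters of a code run merge into one connected block, which is convenient for the subsequent screening of candidate regions.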
Step 204, extracting the picture of the first region from the picture of the target coding page to obtain the whole coding region picture of the target coding set.
And then, based on the techniques of matting, picture cutting and the like, extracting the whole coding region corresponding to the target coding set from the picture of the target coding page, namely extracting the first region, and obtaining the whole coding region picture corresponding to the target coding set.
Step 205, identifying each second region corresponding to each single code in the whole coding region picture.
Step 206, based on the identified second regions, segmenting the whole coding region picture to obtain at least one single coding region picture corresponding to at least one single code.
After the first region of the target coding set is identified and the corresponding whole coding region picture is extracted, in order to more accurately determine the local characteristics (such as fonts, word sizes and styles) of single codes of the target coding set such as license numbers, single characters (single codes) can be further identified and segmented according to the whole coding region picture.
The process still uses an OCR character recognition technology, that is, the coordinates of each single code are specifically recognized based on the OCR technology, the second area corresponding to each single code is determined, and then a plurality of single code area pictures are segmented to be used as the basis for extracting the local features of the subsequent target code set, specifically referring to fig. 5, in fig. 5, after single code recognition and segmentation are performed on the whole code area picture with the target code set of "320702207203", a series of single code area pictures corresponding to single codes such as "3", "2", "0" … are specifically obtained.
It should be noted that, in the example of fig. 5, although all single codes are identified and cut from the whole coding region picture is shown, this is only a preferred embodiment of the present application, but the embodiment may not be limited thereto, and only single code pictures exceeding the number threshold (but not all) may be selectively cut from the whole coding region picture to be used as local feature extraction in combination with actual processing complexity requirements and discrimination performance requirements, which is not limited thereto.
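Steps 204 to 206 can be sketched as two cropping operations. The example below assumes that the OCR stage has already returned bounding boxes as `(left, top, right, bottom)` tuples with exclusive right and bottom edges, and that the second-region boxes are expressed relative to the whole coding region picture. The function names and the optional `count` parameter (used to keep only a subset of the single codes) are hypothetical.

```python
# Illustrative sketch only; box format and function names are assumptions.
# An image is a list of pixel rows.

def crop(image, box):
    """Return the sub-image inside box = (left, top, right, bottom)."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def segment_codes(page, first_region, second_regions, count=None):
    """Cut the whole coding region out of the page, then one picture per single code.

    second_regions are given relative to the whole coding region picture.
    If count is set, only the first `count` single codes (left to right) are cut.
    """
    whole = crop(page, first_region)
    boxes = sorted(second_regions, key=lambda b: b[0])  # order left to right
    if count is not None:
        boxes = boxes[:count]
    return whole, [crop(whole, b) for b in boxes]
```

For a twelve-character number such as "320702207203", `second_regions` would hold twelve boxes and the function would return the whole coding region picture together with up to twelve single coding region pictures.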
Step 207, inputting the whole coding region picture into a first feature extraction sub-network of a feature extraction layer of a second processing model, and obtaining global features of a target coding set output by the first feature extraction sub-network.
After the global features and local features of the target coding set of the certificate to be identified have been extracted, the authenticity of the target coding set is identified by combining its global features and local features, to determine whether forgery such as counterfeiting or alteration has occurred.
In this embodiment, a second processing model is trained in advance for authenticating the target code set. The second processing model may be, but is not limited to, a DenseNet model.
A conventional DenseNet model comprises a feature extraction layer. The picture to be identified is input into this layer, which extracts the corresponding picture features, and the required classification is then performed on the picture based on the extracted features.
In this embodiment of the present application, in order to authenticate a target encoding set such as a license number more accurately, the conventional DenseNet model is improved by splitting its feature extraction layer into two sub-network structures. Specifically, as shown in fig. 6, the improved feature extraction layer includes a first feature extraction sub-network for global feature extraction and a second feature extraction sub-network for local feature extraction (the feature extraction layer of the conventional DenseNet model is a single whole that is not subdivided). In the embodiment of the application, the proposed DenseNet model whose front-end feature extraction layer is composed of two sub-networks is called a Multi-DenseNet model.
Referring to fig. 6, the first feature extraction sub-network, which extracts global features, may be composed of a connection layer, a pooling layer, and a Dense Block layer.
In addition to a feature extraction layer with two feature extraction sub-networks, the Multi-DenseNet model in this application includes several connection layers and several Dense Block layers. The connection layers mainly perform feature fusion and dimension-reducing sampling. Each Dense Block layer contains several convolution layers that extract abstract features from the picture, and the input of each convolution layer in a Dense Block layer includes the outputs of all preceding convolution layers.
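The dense connectivity described above, in which each convolution layer of a Dense Block receives the outputs of all preceding layers, can be illustrated with a small dependency-free sketch. Feature maps are reduced to flat lists of numbers, and `toy_layer` is a hypothetical stand-in for a real convolution layer; the sketch only shows how inputs are concatenated, not the patent's actual network.

```python
# Illustrative sketch only; features are flat lists and toy_layer is a stand-in.

def dense_block(x, layers):
    """Apply layers with dense connectivity.

    Each layer receives the concatenation of the block input and the outputs
    of all earlier layers; the block returns the concatenation of everything.
    """
    features = [list(x)]
    for layer in layers:
        concatenated = [v for f in features for v in f]
        features.append(list(layer(concatenated)))
    return [v for f in features for v in f]


def toy_layer(inputs):
    """Stand-in for a convolution layer: emits one feature, ReLU of the input sum."""
    return [max(0.0, sum(inputs))]
```

Because every layer sees all earlier outputs, the number of input features grows with depth, which is why the connection layers between Dense Blocks are used for dimension reduction.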
Step 208, inputting at least one single coding region picture into a second feature extraction sub-network of a feature extraction layer of a second processing model to obtain local features of a target coding set output by the second feature extraction sub-network.
Correspondingly, each single coding region picture corresponding to each single code (for example, the series of single coding region pictures in fig. 5 corresponding to the single codes "3", "2", "0", …) is input into the second feature extraction sub-network of the feature extraction layer of the second processing model. The second feature extraction sub-network outputs the local features of the target coding set, for example the characteristics of its single codes in terms of font, font size, style (such as stroke weight, for example whether the strokes are bold), and/or special added or omitted marks (such as an underline).
Similarly, after the second feature extraction sub-network performs feature extraction processing on the single-coding region picture of the target coding set, the obtained single-coding local feature is further used as an input of a next functional layer connected with the feature extraction layer, and as shown in fig. 6, the extracted local feature is continuously input into a connection layer connected with the second feature extraction sub-network.
Thus, the overall features input to the network can be expressed as:
$$F = H\left(h_{all}(x_{all}),\ h_{part}(x_{part})\right)$$
where $F$ denotes all features entering the network; $x_{all}$ and $x_{part}$ denote the global features and the local features, respectively; $H$ is a transform function (corresponding to one or two sets of Batch Normalization, ReLU and Convolution operations); $h_{all}$ is the transform function for the global features; and $h_{part}$ is the transform function for the local features.
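As a rough numerical illustration of the expression above, the sketch below applies one Batch Normalization, ReLU and convolution-like step to each branch and again to the merged result. Several points are assumptions rather than details from the patent: the two branches are merged by concatenation, the convolution is reduced to a plain linear map with caller-supplied weights, and normalization is computed across a single feature vector instead of across a batch. All function and parameter names are hypothetical.

```python
import math

# Illustrative sketch only; the merge by concatenation, the weights and the
# per-vector normalization are assumptions, not details from the patent.

def bn_relu_conv(v, weights, eps=1e-5):
    """Normalize v to zero mean and unit variance, apply ReLU, then a linear map."""
    mean = sum(v) / len(v)
    var = sum((a - mean) ** 2 for a in v) / len(v)
    activated = [max(0.0, (a - mean) / math.sqrt(var + eps)) for a in v]
    # Each weight row produces one output feature (a 1x1 convolution analogue).
    return [sum(w * a for w, a in zip(row, activated)) for row in weights]


def fuse(x_all, x_part, w_all, w_part, w_fuse):
    """Compute F = H(h_all(x_all), h_part(x_part)) for flat feature vectors."""
    global_branch = bn_relu_conv(x_all, w_all)    # h_all
    local_branch = bn_relu_conv(x_part, w_part)   # h_part
    return bn_relu_conv(global_branch + local_branch, w_fuse)  # H on the merged vector
```

The point of the sketch is only the data flow: each branch is transformed separately before a shared transform sees both.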
Step 209, performing true and false identification processing on the global features and the local features by using other functional layers except the feature extraction layer in the second processing model to obtain a first identification result of the overall code of the target code set and at least one second identification result of at least one single code.
Specifically, as shown in fig. 6, after the global features and local features of the target encoding set have been extracted using the second processing model of the embodiment of the present application (that is, the Multi-DenseNet model described above), the at least one connection layer and the at least one Dense Block layer of the Multi-DenseNet model are further used to perform authenticity identification on the extracted global and local features.
Corresponding to the two-part input of the Multi-DenseNet model, the output of the model can also be divided into two parts: a first recognition result corresponding to the overall code of the target encoding set, and a second recognition result for each single code.
The first recognition result and the second recognition results may each be a category assigned to the corresponding object to be recognized (the overall code or a single code), for example category 0 or category 1. Optionally, category 0 may indicate that the object belongs to a genuine license number, and category 1 that it belongs to a non-genuine license number produced by forgery, alteration, or similar fraud. Alternatively, the recognition results may be confidences that the object belongs to each category, for example a confidence of 10% for category 0 and 90% for category 1. The category of the object can then be determined using a preset recognition strategy; for example, the object is assigned to category 1 because its 90% confidence exceeds a set confidence threshold of 60%.
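The confidence-based variant can be sketched as a simple threshold rule. The 60% threshold and the 10% / 90% confidences are taken from the example above; the function name and the choice to return `None` when no category exceeds the threshold are assumptions.

```python
# Illustrative sketch only; returning None for "undecided" is an assumption.

REAL, FAKE = 0, 1  # category 0: genuine, category 1: forged or altered


def classify(confidences, threshold=0.60):
    """Map {category: confidence} to a category.

    Returns the most confident category if its confidence exceeds the
    threshold, otherwise None.
    """
    best = max(confidences, key=confidences.get)
    return best if confidences[best] > threshold else None
```

With the confidences from the example, `classify({REAL: 0.10, FAKE: 0.90})` selects category 1.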
Step 210, determining authenticity of the target encoding set based on the first identification result and the at least one second identification result.
Finally, the first identification result corresponding to the integral code of the target code set and the second identification result corresponding to each single code can be combined, and the authenticity of the target code set can be determined based on a preset identification strategy.
Optionally, when the first recognition result corresponding to the overall code indicates that the target code set is false, and/or the number of single codes whose second recognition results indicate that they are false exceeds a set value (for example, 3), the target code set is determined to be false, and the certificate to be recognized is accordingly a counterfeit certificate. Fig. 7 shows an example of the process of determining whether the target code set is a counterfeit certificate code set under this determination policy. In practice, other determination policies may also be adopted, such as directly determining that the target code set is false whenever any single code is identified as false; these are not enumerated here.
This embodiment provides a Multi-DenseNet model whose front-end feature extraction layer has two sub-networks and applies it to authenticating certificate codes. The model retains the overall features of the certificate code in the certificate picture while also capturing the detailed features of each single code, so that forged certificate codes can be identified with high accuracy and forgery of certificates such as vehicle certificates can be detected accurately and effectively.
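The determination policies mentioned here can be written as small predicates. The set value of 3 comes from the example in the text; the function names, the `require_both` flag (covering the "and" reading of "and/or"), and the encoding of categories as 0 and 1 are assumptions for illustration.

```python
# Illustrative sketch only; names and the require_both flag are assumptions.

REAL, FAKE = 0, 1  # category 0: genuine, category 1: forged or altered


def is_fake(first_result, second_results, max_fake_singles=3, require_both=False):
    """Decide whether the target code set is false.

    first_result: category of the overall code.
    second_results: categories of the single codes.
    The set is false if the overall code is false and/or more than
    max_fake_singles single codes are false.
    """
    overall_fake = first_result == FAKE
    too_many_fake = sum(1 for r in second_results if r == FAKE) > max_fake_singles
    if require_both:
        return overall_fake and too_many_fake
    return overall_fake or too_many_fake


def is_fake_strict(first_result, second_results):
    """Stricter alternative policy: any single false code is enough."""
    return first_result == FAKE or any(r == FAKE for r in second_results)
```

The stricter policy trades more false alarms for fewer missed forgeries, so the choice depends on how costly each kind of error is in the application.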
In an alternative embodiment of the present application, referring to the flowchart of the credential information identification method shown in fig. 8, the credential information identification method may further include the following processes:
Step 211, obtaining a picture of a target coding page with identification errors;
Step 212, performing iterative training on the second processing model by using the pictures of the target coding page with the identification errors so as to improve the accuracy of the second processing model.
To improve the accuracy of the second processing model, this embodiment collects the pictures of target code pages that were misjudged and uses them as input to iteratively train the second processing model (such as the Multi-DenseNet model described above), thereby improving its accuracy in identifying certificate information.
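Steps 211 and 212 amount to a feedback loop: collect misjudged pictures, retrain on them, and repeat. The sketch below shows that loop with the model abstracted as a `predict` callable and training abstracted as a `retrain` callable. Both callables, the availability of true labels for the collected pictures, and the stopping rule are assumptions; the patent does not describe the training procedure itself.

```python
# Illustrative sketch only; predict, retrain and the stopping rule are assumptions.

def collect_misjudged(samples, predict):
    """Return the (picture, true_label) pairs that the model classified wrongly."""
    return [(pic, label) for pic, label in samples if predict(pic) != label]


def feedback_rounds(samples, predict, retrain, max_rounds=5):
    """Iteratively retrain on misjudged pictures.

    retrain(predict, errors) must return an updated predict callable.
    Returns the final predict callable and the number of retraining rounds run.
    """
    for round_no in range(max_rounds):
        errors = collect_misjudged(samples, predict)
        if not errors:
            return predict, round_no
        predict = retrain(predict, errors)
    return predict, max_rounds
```

In a real system `retrain` would fine-tune the network on the collected pictures, usually mixed with earlier training data so that previously learned cases are not forgotten.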
Corresponding to the above-mentioned certificate information authentication method, the embodiment of the present application further provides a certificate information authentication device, referring to the schematic structural diagram of the certificate information authentication device shown in fig. 9, the certificate information authentication device may include:
an obtaining unit 901, configured to obtain a picture of a target code page in a certificate;
a first extracting unit 902, configured to extract, from the pictures of the target encoded page, an overall encoded region picture corresponding to the target encoded set and at least one single encoded region picture corresponding to at least one single encoding; the target set of codes includes at least one single code;
a second extracting unit 903, configured to extract features of the overall coding region picture, to obtain global features of a target coding set; extracting the characteristics of at least one single coding region picture to obtain the local characteristics of a target coding set;
an authentication unit 904, configured to authenticate the authenticity of the target encoding set based on the global feature and the local feature.
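The division into units 901 to 904 can be pictured as a pipeline of four pluggable stages. The sketch below is one possible software arrangement, not the apparatus as claimed: the class name, field names and the use of plain callables for each unit are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Illustrative sketch only; class and field names are hypothetical.


@dataclass
class CertificateAuthenticator:
    obtain: Callable[[Any], Any]                 # role of obtaining unit 901
    extract_pictures: Callable[[Any], tuple]     # role of first extracting unit 902
    extract_features: Callable[[Any, Any], tuple]  # role of second extracting unit 903
    authenticate: Callable[[Any, Any], Any]      # role of authentication unit 904

    def run(self, certificate):
        """Pass a certificate through the four stages in order."""
        page = self.obtain(certificate)
        whole, singles = self.extract_pictures(page)
        global_features, local_features = self.extract_features(whole, singles)
        return self.authenticate(global_features, local_features)
```

Keeping each unit behind a callable makes it straightforward to swap, for example, the OCR backend used by the first extracting unit without touching the other stages.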
In an optional implementation manner of the embodiment of the present application, the acquiring unit 901 is specifically configured to: obtaining at least one picture of at least one page of the document; and identifying a picture of a target coding page from the at least one picture by using a first processing model, wherein the picture of the target coding page comprises a target coding set.
In an alternative implementation manner of the embodiment of the present application, the first extracting unit 902 is specifically configured to: identifying a first region corresponding to a target coding set in a picture of the target coding page; extracting the picture of the first region from the picture of the target coding page to obtain an integral coding region picture of a target coding set; identifying each second region corresponding to each single code in the whole coding region picture; and based on the identified second areas, segmenting the whole coding area picture to obtain at least one single coding area picture corresponding to at least one single coding.
In an alternative implementation manner of the embodiment of the present application, the second extraction unit 903 is specifically configured to: inputting the integral coding region picture into a first feature extraction sub-network of a feature extraction layer of a second processing model to obtain global features of a target coding set output by the first feature extraction sub-network; inputting at least one single coding region picture into a second feature extraction sub-network of a feature extraction layer of a second processing model to obtain local features of a target coding set output by the second feature extraction sub-network;
the authentication unit 904 is specifically configured to: performing true and false identification processing on the global features and the local features by using other functional layers except the feature extraction layer in the second processing model to obtain a first identification result of the overall code of the target code set and at least one second identification result of at least one single code; and determining the authenticity of the target coding set based on the first identification result and the at least one second identification result.
In an alternative implementation of the embodiment of the present application, referring to fig. 10, the certificate information authentication apparatus may further include:
a feedback unit 905 for obtaining a picture of the target encoded page of which the discrimination is erroneous; and performing iterative training on the second processing model by using the pictures of the target coding page with the identification errors so as to improve the accuracy of the second processing model.
Corresponding to the credential information identification method of the present application, the credential information identification apparatus of the embodiments of the present application is equally applicable to, but not limited to, any of the types of computer devices listed above.
Since the certificate information authentication apparatus disclosed in the embodiments of the present application corresponds to the certificate information authentication method disclosed in the above embodiments, it is described relatively briefly; for the relevant details, refer to the description of the method in the above embodiments, which is not repeated here.
Corresponding to the above method and apparatus for authenticating certificate information, the embodiments of the present application further disclose a computer device, which may be, but not limited to, any of the types of computer devices listed above, as shown in fig. 11, and at least includes:
a memory 1101 for storing at least a set of instructions;
a processor 1102 for calling and executing the instruction set in the memory, performing the processing procedure of the credential information authentication method as shown in any of the above embodiments, or the functions of the credential information authentication apparatus as shown in any of the embodiments, by executing the instruction set.
It should be noted that the embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the others, and the identical or similar parts of the embodiments may be referred to one another.
For convenience of description, the above system or apparatus is described as being functionally divided into various modules or units, respectively. Of course, the functions of each element may be implemented in one or more software and/or hardware elements when implemented in the present application.
From the above description of embodiments, it will be apparent to those skilled in the art that the present application may be implemented in software plus a necessary general purpose hardware platform. Based on such understanding, the technical solutions of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to perform the methods described in the embodiments or some parts of the embodiments of the present application.
Finally, it is further noted that relational terms such as first, second, third, fourth, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The foregoing is merely a preferred embodiment of the present application. It should be noted that those skilled in the art may make modifications and adaptations without departing from the principles of the present application, and such modifications and adaptations are intended to fall within the scope of the present application.

Claims (7)

1. A method for authenticating document information, comprising:
obtaining a picture of a target coding page in the certificate;
extracting an integral coding region picture corresponding to a target coding set and at least one single coding region picture corresponding to at least one single coding from the picture of the target coding page; the target set of codes includes at least one single code;
inputting the integral coding region picture into a first feature extraction sub-network of a feature extraction layer of a second processing model to obtain global features of a target coding set output by the first feature extraction sub-network;
inputting at least one single coding region picture into a second feature extraction sub-network of a feature extraction layer of a second processing model to obtain local features of a target coding set output by the second feature extraction sub-network;
performing true and false identification processing on the global features and the local features by using other functional layers except the feature extraction layer in the second processing model to obtain a first identification result of the overall code of the target code set and at least one second identification result of at least one single code;
and determining the authenticity of the target coding set based on the first identification result and the at least one second identification result.
2. The method of claim 1, wherein obtaining a picture of a target encoded page in the document comprises:
obtaining at least one picture of at least one page of the document;
and identifying a picture of a target coding page from the at least one picture by using a first processing model, wherein the picture of the target coding page comprises a target coding set.
3. The method according to claim 1, wherein the extracting, from the pictures of the target encoded page, the whole encoded region picture corresponding to the target encoded set and the at least one single encoded region picture corresponding to the at least one single encoding, comprises:
identifying a first region corresponding to a target coding set in a picture of the target coding page;
extracting the picture of the first region from the picture of the target coding page to obtain an integral coding region picture of a target coding set;
identifying each second region corresponding to each single code in the whole coding region picture;
and based on the identified second areas, segmenting the whole coding area picture to obtain at least one single coding area picture corresponding to at least one single coding.
4. The method as recited in claim 1, further comprising:
obtaining a picture of a target coding page with identification errors;
and performing iterative training on the second processing model by using the pictures of the target coding page with the identification errors so as to improve the accuracy of the second processing model.
5. A certificate information authentication apparatus, comprising:
the acquisition unit is used for acquiring pictures of target coding pages in the certificates;
a first extracting unit, configured to extract, from the pictures of the target encoding page, an overall encoding region picture corresponding to the target encoding set and at least one single encoding region picture corresponding to at least one single encoding; the target set of codes includes at least one single code;
the second extraction unit is used for inputting the whole coding region picture into a first characteristic extraction sub-network of a characteristic extraction layer of a second processing model to obtain global characteristics of a target coding set output by the first characteristic extraction sub-network; inputting at least one single coding region picture into a second characteristic extraction sub-network of a characteristic extraction layer of a second processing model to obtain local characteristics of a target coding set output by the second characteristic extraction sub-network;
the identification unit is used for carrying out true and false identification processing on the global features and the local features by utilizing other functional layers except the feature extraction layer in the second processing model to obtain a first identification result of the overall code of the target code set and at least one second identification result of at least one single code; and determining authenticity of the target code set based on the first identification result and the at least one second identification result.
6. The apparatus according to claim 5, wherein the first extraction unit is specifically configured to:
identifying a first region corresponding to a target coding set in a picture of the target coding page;
extracting the picture of the first region from the picture of the target coding page to obtain an integral coding region picture of a target coding set;
identifying each second region corresponding to each single code in the whole coding region picture;
and based on the identified second areas, segmenting the whole coding area picture to obtain at least one single coding area picture corresponding to at least one single coding.
7. A computer device, comprising:
a memory for storing at least one set of instructions;
a processor for invoking and executing said set of instructions in said memory, by executing said set of instructions, performing a credential information authentication method as defined in any one of claims 1-4.
CN202010063773.XA 2020-01-20 2020-01-20 Certificate information identification method and device and computer equipment Active CN111259894B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010063773.XA CN111259894B (en) 2020-01-20 2020-01-20 Certificate information identification method and device and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010063773.XA CN111259894B (en) 2020-01-20 2020-01-20 Certificate information identification method and device and computer equipment

Publications (2)

Publication Number Publication Date
CN111259894A CN111259894A (en) 2020-06-09
CN111259894B true CN111259894B (en) 2023-07-07

Family

ID=70949134

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010063773.XA Active CN111259894B (en) 2020-01-20 2020-01-20 Certificate information identification method and device and computer equipment

Country Status (1)

Country Link
CN (1) CN111259894B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112115960A (en) * 2020-06-15 2020-12-22 曹辉 Method and system for identifying collection
CN116758564B (en) * 2023-08-15 2023-11-10 山东履信思源防伪技术有限公司 Method and system for comparing OCR character recognition results

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008310691A (en) * 2007-06-15 2008-12-25 Internatl Currency Technologies Corp Method for identifying photo identification
CN108229457A (en) * 2017-12-14 2018-06-29 深圳市商汤科技有限公司 Verification method, device, electronic equipment and the storage medium of certificate
CN108229499A (en) * 2017-10-30 2018-06-29 北京市商汤科技开发有限公司 Certificate recognition methods and device, electronic equipment and storage medium
CN108229339A (en) * 2017-12-14 2018-06-29 深圳市商汤科技有限公司 Identification apparatus, certificate personal identification method and storage medium
CN108288073A (en) * 2018-01-30 2018-07-17 北京小米移动软件有限公司 Picture authenticity identification method and device, computer readable storage medium
CN108573202A (en) * 2017-03-17 2018-09-25 北京旷视科技有限公司 Identity identifying method, device and system and terminal, server and storage medium
CN109409349A (en) * 2018-02-02 2019-03-01 深圳壹账通智能科技有限公司 Credit certificate discrimination method, device, terminal and computer readable storage medium
CN109492643A (en) * 2018-10-11 2019-03-19 平安科技(深圳)有限公司 Certificate recognition methods, device, computer equipment and storage medium based on OCR
CN109543551A (en) * 2018-10-26 2019-03-29 平安科技(深圳)有限公司 Identity card identifies processing method, device, computer equipment and storage medium
CN110046644A (en) * 2019-02-26 2019-07-23 阿里巴巴集团控股有限公司 A kind of method and device of certificate false proof calculates equipment and storage medium
CN110570209A (en) * 2019-07-30 2019-12-13 平安科技(深圳)有限公司 Certificate authenticity verification method and device, computer equipment and storage medium
CN110598699A (en) * 2019-09-16 2019-12-20 华中科技大学 Anti-counterfeiting bill authenticity distinguishing system and method based on multispectral image
CN110598710A (en) * 2019-08-21 2019-12-20 阿里巴巴集团控股有限公司 Certificate identification method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20010045451A1 (en) * 2000-02-28 2001-11-29 Tan Warren Yung-Hang Method and system for token-based authentication

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
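Michael Ryan et al. An Examination of Character Recognition on ID card using Template Matching Approach. Procedia Computer Science. 2015, 521-529. *
李政. Research on an ID Card Character Recognition Algorithm Based on Image Matching, and Software Design. China Excellent Master's Theses Full-text Database, Information Science and Technology Series. 2018, I138-1628. *
李欢. Research on a Signature Authenticity Identification Method Based on Sparse Representation. China Excellent Master's Theses Full-text Database, Information Science and Technology Series. 2018, I138-1063. *
章小兵 et al. Analysis and Regulation of Counterfeit Road Transport Practitioner Qualification Certificates. Journal of Insurance Professional College. 2019, 78-80. *
罗涛. Design and Implementation of a Software System for Comprehensive ID Card Information Collection and Recognition. China Excellent Master's Theses Full-text Database, Information Science and Technology Series. 2019, I138-834. *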

Also Published As

Publication number Publication date
CN111259894A (en) 2020-06-09

Similar Documents

Publication Publication Date Title
CN106951832B (en) Verification method and device based on handwritten character recognition
CN111191568A (en) Method, device, equipment and medium for identifying copied image
CN111259894B (en) Certificate information identification method and device and computer equipment
CN111178290A (en) Signature verification method and device
CN109635625B (en) Intelligent identity verification method, equipment, storage medium and device
US10867170B2 (en) System and method of identifying an image containing an identification document
Hilles et al. Latent fingerprint enhancement and segmentation technique based on hybrid edge adaptive dtv model
Mahale et al. Image inconsistency detection using local binary pattern (LBP)
Sirajudeen et al. Forgery document detection in information management system using cognitive techniques
Raigonda Signature Verification System Using SSIM In Image Processing
CN113792659B (en) Document identification method and device and electronic equipment
CN112232336A (en) Certificate identification method, device, equipment and storage medium
CN108921006B (en) Method for establishing handwritten signature image authenticity identification model and authenticity identification method
CN109583463B (en) System and method for training a classifier for determining a category of a document
CN114241463A (en) Signature verification method and device, computer equipment and storage medium
CN113378609B (en) Agent proxy signature identification method and device
CN113221696A (en) Image recognition method, system, equipment and storage medium
CN113033562A (en) Image processing method, device, equipment and storage medium
Piekarczyk et al. Hierarchical Graph-Grammar Model for Secure and Efficient Handwritten Signatures Classification.
CN111860314B (en) Electronic license verification method, device and system based on image recognition
Thaiparnit et al. Tracking vehicles system based on license plate recognition
US11823521B2 (en) Image processing method for an identity document
CN114820476A (en) Identification card identification method based on compliance detection
CN117373030B (en) OCR-based user material identification method, system, device and medium
CN116959075B (en) Deep learning-based iterative optimization method for identity recognition robot

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant