CN111259894A - Certificate information identification method and device and computer equipment - Google Patents

Info

Publication number
CN111259894A
Authority
CN
China
Prior art keywords
target
picture
coding
code
page
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010063773.XA
Other languages
Chinese (zh)
Other versions
CN111259894B (en)
Inventor
夏雅楠
张晗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Puxin Hengye Technology Development Beijing Co ltd
Original Assignee
Puxin Hengye Technology Development Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Puxin Hengye Technology Development Beijing Co ltd
Priority to CN202010063773.XA
Publication of CN111259894A
Application granted
Publication of CN111259894B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Processing (AREA)
  • Editing Of Facsimile Originals (AREA)

Abstract

When identifying the authenticity of a target code set in a certificate (such as a vehicle certificate number), the method extracts, from a picture of the target coding page, an overall coding region picture corresponding to the target code set and at least one single coding region picture corresponding to a single code included in the target code set. Features are then extracted from the overall coding region picture and from the single coding region pictures to obtain, respectively, the global features and the local features of the target code set, and the authenticity of the target code set is identified by combining the two, so that counterfeit certificates are identified. Because the application relies on the characteristics of the target code set in the certificate, counterfeit certificates involving forged or altered codes can be identified automatically and efficiently. In addition, authenticity is identified by combining features from different dimensions of the code set, such as the global dimension and the local dimension, which can further improve the accuracy with which counterfeit certificate information is identified.

Description

Certificate information identification method and device and computer equipment
Technical Field
The application belongs to the technical field of information identification and anti-counterfeiting identification, and particularly relates to a certificate information identification method, a certificate information identification device and computer equipment.
Background
Certificates such as vehicle certificates, identity cards and academic or qualification certificates are important proof of a user's identity or qualifications. Certificate counterfeiting is a very serious form of fraud and often brings risks or adverse consequences to the social or economic activities that rely on the certificate.
At present, the authenticity of certificates such as vehicle certificates is mainly checked either by manual review or by automatic recognition based on OCR (Optical Character Recognition). In manual review, staff check whether the certificate information is false based on their work experience; this approach is inefficient, is limited by the experience of the staff, and can hardly guarantee a highly accurate result. Automatic OCR recognition mainly judges whether a certificate really exists by recognizing whether the certificate code (such as a vehicle certificate number) is a code that really exists, and it cannot identify counterfeit certificates in which the code has been forged or altered.
In summary, current certificate information identification methods are limited in efficiency and insufficient in accuracy, leaving considerable room for improvement; an effective method for identifying the authenticity of certificate information is needed.
Disclosure of Invention
In view of the above, the present application provides a certificate information identification method, a certificate information identification device and computer equipment, so as to automatically, efficiently and accurately identify the authenticity of certificate information that may have been forged, altered or otherwise counterfeited.
To this end, the application discloses the following technical solutions:
A certificate information identification method, comprising:
obtaining a picture of a target coding page in the certificate;
extracting, from the picture of the target coding page, an overall coding region picture corresponding to a target code set and at least one single coding region picture corresponding to at least one single code, wherein the target code set comprises at least one single code;
extracting features of the overall coding region picture to obtain global features of the target code set;
extracting features of the at least one single coding region picture to obtain local features of the target code set;
and identifying the authenticity of the target code set based on the global features and the local features.
In the above method, preferably, obtaining the picture of the target coding page in the certificate comprises:
obtaining at least one picture of at least one page of the certificate;
and identifying the picture of the target coding page from the at least one picture by using a first processing model, wherein the picture of the target coding page contains the target code set.
In the above method, preferably, extracting, from the picture of the target coding page, the overall coding region picture corresponding to the target code set and the at least one single coding region picture corresponding to at least one single code comprises:
identifying a first region corresponding to the target code set in the picture of the target coding page;
extracting the picture of the first region from the picture of the target coding page to obtain the overall coding region picture of the target code set;
identifying, in the overall coding region picture, the second region corresponding to each single code;
and segmenting the overall coding region picture based on the identified second regions to obtain the at least one single coding region picture corresponding to at least one single code.
In the above method, preferably, extracting features of the overall coding region picture to obtain the global features of the target code set, and extracting features of the at least one single coding region picture to obtain the local features of the target code set, comprises:
inputting the overall coding region picture into a first feature extraction sub-network of a feature extraction layer of a second processing model to obtain the global features of the target code set output by the first feature extraction sub-network;
and inputting the at least one single coding region picture into a second feature extraction sub-network of the feature extraction layer of the second processing model to obtain the local features of the target code set output by the second feature extraction sub-network.
In the above method, preferably, identifying the authenticity of the target code set based on the global features and the local features comprises:
performing authenticity identification processing on the global features and the local features by using the functional layers of the second processing model other than the feature extraction layer, to obtain a first identification result for the overall code of the target code set and at least one second identification result for at least one single code;
and determining the authenticity of the target code set based on the first identification result and the at least one second identification result.
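The text above does not fix how the first identification result and the second identification results are combined. As a minimal sketch only, the Python below assumes a strict policy in which the overall code and every single code must each be judged genuine; the `Result` type, the `genuine_prob` field and the 0.5 threshold are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Result:
    """One identification result: estimated probability that the input is genuine (hypothetical)."""
    genuine_prob: float


def decide_authenticity(overall: Result, singles: List[Result], threshold: float = 0.5) -> bool:
    """Return True when the target code set is judged genuine.

    Assumed policy: the overall code and every single code must each reach
    the threshold; any failing part marks the whole set as counterfeit.
    """
    if overall.genuine_prob < threshold:
        return False
    return all(s.genuine_prob >= threshold for s in singles)
```

Other policies, such as weighted voting over the single-code results, would fit the same interface; the strict form is used here only because it is the simplest to state.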
In the above method, preferably, the method further comprises:
obtaining a picture of a target coding page that was identified incorrectly;
and iteratively training the second processing model with the incorrectly identified picture of the target coding page, so as to improve the accuracy of the second processing model.
A certificate information identification device, comprising:
an acquisition unit, configured to obtain a picture of a target coding page in the certificate;
a first extraction unit, configured to extract, from the picture of the target coding page, an overall coding region picture corresponding to a target code set and at least one single coding region picture corresponding to at least one single code, wherein the target code set comprises at least one single code;
a second extraction unit, configured to extract features of the overall coding region picture to obtain global features of the target code set, and to extract features of the at least one single coding region picture to obtain local features of the target code set;
and an identification unit, configured to identify the authenticity of the target code set based on the global features and the local features.
In the above device, preferably, the first extraction unit is specifically configured to perform:
identifying a first region corresponding to the target code set in the picture of the target coding page;
extracting the picture of the first region from the picture of the target coding page to obtain the overall coding region picture of the target code set;
identifying, in the overall coding region picture, the second region corresponding to each single code;
and segmenting the overall coding region picture based on the identified second regions to obtain the at least one single coding region picture corresponding to at least one single code.
In the above device, preferably, the second extraction unit is specifically configured to perform:
inputting the overall coding region picture into a first feature extraction sub-network of a feature extraction layer of a second processing model to obtain the global features of the target code set output by the first feature extraction sub-network;
inputting the at least one single coding region picture into a second feature extraction sub-network of the feature extraction layer of the second processing model to obtain the local features of the target code set output by the second feature extraction sub-network;
and the identification unit is specifically configured to perform:
performing authenticity identification processing on the global features and the local features by using the functional layers of the second processing model other than the feature extraction layer, to obtain a first identification result for the overall code of the target code set and at least one second identification result for at least one single code;
and determining the authenticity of the target code set based on the first identification result and the at least one second identification result.
A computer device, comprising:
a memory for storing at least one instruction set;
a processor for calling and executing the instruction set in the memory, so as to carry out the certificate information identification method according to any one of the above by executing the instruction set.
According to the above solution, when the authenticity of a target code set in a certificate (such as a vehicle certificate number) is identified, an overall coding region picture corresponding to the target code set and at least one single coding region picture corresponding to a single code included in the target code set are extracted from the picture of the target coding page. Features are then extracted from the overall coding region picture and from the single coding region pictures to obtain, respectively, the global features and the local features of the target code set, and the authenticity of the target code set is identified by combining the two, so that counterfeit certificates are identified. Because the application relies on the characteristics of the target code set in the certificate, counterfeit certificates involving forged or altered codes can be identified automatically and efficiently. In addition, authenticity is identified by combining features from different dimensions of the code set, such as the global dimension and the local dimension, which can further improve the accuracy with which counterfeit certificate information is identified.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only embodiments of the present application, and those skilled in the art can obtain other drawings from the provided drawings without creative effort.
FIG. 1 is a schematic flow chart of a certificate information identification method provided in an embodiment of the present application;
FIG. 2 is a schematic flow chart of another certificate information identification method provided in an embodiment of the present application;
FIG. 3 is a schematic diagram of identifying the picture of a target coding page from at least one picture of a certificate, provided in an embodiment of the present application;
FIG. 4 is an exemplary diagram of identifying the overall coding region corresponding to a vehicle certificate number from a vehicle certificate information page, provided in an embodiment of the present application;
FIG. 5 is an exemplary diagram of single coding region pictures cut from an overall coding region picture, provided in an embodiment of the present application;
FIG. 6 is a structural diagram of a multi-DenseNet model provided in an embodiment of the present application;
FIG. 7 is an exemplary diagram of a process for determining the authenticity of a target code set based on a corresponding decision policy, provided in an embodiment of the present application;
FIG. 8 is a schematic flow chart of a certificate information identification method provided in an embodiment of the present application;
FIG. 9 is a schematic structural diagram of a certificate information identification device provided in an embodiment of the present application;
FIG. 10 is another schematic structural diagram of a certificate information identification device provided in an embodiment of the present application;
FIG. 11 is a schematic structural diagram of a computer device provided in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person skilled in the art from the embodiments given herein without creative effort fall within the protection scope of the present application.
The application provides a certificate information identification method, a certificate information identification device and computer equipment, and aims to automatically, efficiently and accurately identify counterfeit certificates involving forged or altered codes. In practice, certificate counterfeiting takes many forms; the embodiments of the application mainly address the most common and most harmful one, namely counterfeiting of the certificate code (such as a vehicle certificate number). The certificate information identification method, device and computer equipment of the present application are described below through several embodiments.
In an optional embodiment of the present application, a certificate information identification method is disclosed. The method can be applied to computer equipment, which may be, but is not limited to, a portable terminal such as a mobile phone, a tablet computer, a personal digital assistant or a handheld scanning terminal, or a portable computer (such as a notebook), a desktop computer, a medium or large computer, or even a server, in a general-purpose or special-purpose computing or configuration environment.
Referring to FIG. 1, which shows a schematic flow chart of the certificate information identification method provided in this embodiment, the method includes:
Step 101, obtaining a picture of a target coding page in the certificate.
The certificate may be of any type, such as, but not limited to, a vehicle certificate, an identity card, or an academic or qualification certificate. The target coding page may be an information page of the certificate that carries a certificate code (referred to as the target code set), such as a vehicle certificate number, an identity card number or an academic certificate number.
The embodiments of the present application describe the certificate information identification method mainly by taking as an example the authenticity identification (detecting counterfeiting such as forgery and alteration) of a typical target code set, namely a vehicle certificate number.
When the authenticity of certificate information such as a vehicle certificate number needs to be identified, at least one picture corresponding to at least one information page of the certificate to be identified can be obtained by photographing, copying, scanning or similar technologies. The target coding page containing the target code set, such as the vehicle certificate number, is then identified from the at least one picture of the certificate using a corresponding information recognition technology such as Optical Character Recognition (OCR), in preparation for identifying the authenticity of the target code set.
Step 102, extracting, from the picture of the target coding page, an overall coding region picture corresponding to the target code set and at least one single coding region picture corresponding to at least one single code.
It is easy to understand that a target code set, such as a vehicle certificate number, includes at least one single code, which may be, but is not limited to, a digit, a letter or another special character.
The present application proposes the technical idea of identifying the authenticity of the target code set by combining its global features and its local features.
Based on this idea, after the picture of the target coding page in the certificate is obtained, the overall coding region picture corresponding to the target code set can be cut out of the picture of the target coding page for extracting the global features of the target code set, and at least one single coding region picture corresponding to at least one single code can be cut out for extracting the local features of the target code set.
Step 103, extracting features of the overall coding region picture to obtain the global features of the target code set.
Step 104, extracting features of the at least one single coding region picture to obtain the local features of the target code set.
The inventors found through research that counterfeit certificates differ from real ones in two respects. On the one hand, the target code set of a counterfeit certificate, such as its vehicle certificate number, differs from that of a real vehicle certificate in characteristics such as overall image quality, character pitch and the arrangement shape of the digits; this embodiment takes the image quality, character pitch and/or digit arrangement shape of the target code set as its global features (overall features). On the other hand, the printing and drawing details of the single codes of a counterfeit certificate are often inconsistent with those of a real certificate: the font, font size and style (stroke weight, for example whether the strokes are bold) may differ slightly, and one or more short horizontal lines may appear below some characters. This embodiment takes any one or more of these detail characteristics as the local features of the target code set.
In implementation, the global features of the target code set can be obtained by extracting characteristics such as image quality, character pitch and/or digit arrangement shape from the overall coding region picture corresponding to the target code set. Correspondingly, the local features of the target code set can be obtained by extracting characteristics such as font, font size, style (stroke weight, for example whether the strokes are bold) and/or special additional/default symbols (such as underlines) from the single coding region pictures corresponding to the single codes.
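As an illustration of one hand-crafted global feature, character pitch and arrangement shape can be summarised from per-character bounding boxes. The sketch below is an assumption-laden example rather than the applicant's implementation: the `(x, y, width, height)` box format, the function name `pitch_features` and the three statistics it returns are all hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height); hypothetical box format


def pitch_features(boxes: List[Box]) -> Dict[str, float]:
    """Summarise horizontal spacing and baseline alignment of character boxes."""
    ordered = sorted(boxes, key=lambda b: b[0])  # left to right
    # Gap between the right edge of one character and the left edge of the next.
    gaps = [nxt[0] - (cur[0] + cur[2]) for cur, nxt in zip(ordered, ordered[1:])]
    # Bottom edge of each character, a rough stand-in for its baseline.
    bottoms = [b[1] + b[3] for b in ordered]
    return {
        "mean_gap": mean(gaps) if gaps else 0.0,
        "gap_spread": pstdev(gaps) if gaps else 0.0,
        "baseline_spread": pstdev(bottoms) if bottoms else 0.0,
    }
```

An evenly printed number gives a gap spread near zero, while a character pasted in afterwards tends to raise the gap spread or the baseline spread.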
Step 105, identifying the authenticity of the target code set based on the global features and the local features.
After the global features and the local features of the target code set of the certificate to be identified have been extracted, the authenticity of the target code set is identified by combining the global features and the local features, so as to determine whether counterfeiting such as forgery or alteration has occurred.
As an optional implementation, the authenticity of the target code set can be identified by feature matching. In this approach, the global features and local features of the target code set (such as the vehicle certificate number) of real certificates are extracted and stored in advance as reference features, providing the basis for matching. The authenticity of the target code set is then identified by matching the global features of the target code set of the certificate to be identified against the preset global reference features, and matching its local features against the preset local reference features.
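The feature-matching variant can be sketched as a similarity check against stored reference vectors. The example below assumes features are plain numeric vectors compared by cosine similarity with a fixed threshold; the similarity measure, the 0.9 threshold and the function names are assumptions for illustration and are not specified in the application.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def matches_reference(global_feat, local_feats, global_ref, local_ref, threshold=0.9) -> bool:
    """Judge the code set genuine only if the global feature and every
    local feature are close enough to the stored reference features."""
    if cosine_similarity(global_feat, global_ref) < threshold:
        return False
    return all(cosine_similarity(f, local_ref) >= threshold for f in local_feats)
```

In practice a separate local reference per character class would probably be needed; a single shared `local_ref` is used here only to keep the sketch short.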
As another optional implementation, a recognition model may be trained in advance on a batch of genuine and counterfeit certificate samples, and the pre-trained model may be used both to extract the global and local features of the target code set and to identify authenticity based on the extracted features. This embodiment does not limit how the authenticity of the target code set is identified.
As can be seen from the above, in the certificate information identification method provided in this embodiment, when the authenticity of a target code set in a certificate (such as a vehicle certificate number) is identified, an overall coding region picture corresponding to the target code set and at least one single coding region picture corresponding to a single code included in the target code set are extracted from the picture of the target coding page. Features are then extracted from the overall coding region picture and from the single coding region pictures to obtain, respectively, the global features and the local features of the target code set, and the authenticity of the target code set is identified by combining the two, so that counterfeit certificates are identified. Because the method relies on the characteristics of the target code set in the certificate, counterfeit certificates involving forged or altered codes can be identified automatically and efficiently. In addition, authenticity is identified by combining features from different dimensions of the code set, such as the global dimension and the local dimension, which can further improve the accuracy with which counterfeit certificate information is identified.
The next embodiment of the present application provides a detailed implementation of the certificate information identification method. As shown in FIG. 2, the method can be implemented through the following process:
Step 201, obtaining at least one picture of at least one page of the certificate.
At least one picture corresponding to at least one page of the certificate to be identified can be obtained by photographing, copying, scanning or similar means. Optionally, each page corresponds to one picture, or only some of the pages may be photographed to obtain pictures of those pages.
Step 202, identifying the picture of the target coding page from the at least one picture by using a first processing model, wherein the picture of the target coding page contains the target code set.
After at least one picture of at least one page of the certificate to be identified is obtained, as shown in FIG. 3, the target coding page containing the target code set, such as the vehicle certificate number, is identified from the at least one picture, in preparation for identifying the authenticity of the target code set.
In this embodiment, a first processing model is trained in advance for identifying the target coding page; the first processing model may be, but is not limited to, a DenseNet model.
In implementation, taking authenticity identification of the vehicle certificate number as an example, a large number of photos of different pages of vehicle certificates can be used as training input, and the model outputs the type/category information of the page shown in a photo (for example, whether the photo shows the target coding page, or the confidence that it shows the target coding page or a non-target coding page). That is, in actual implementation the problem of recognizing the target coding page can be abstracted as a multi-class classification problem.
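To make the multi-class framing concrete, the sketch below turns raw per-class scores into confidences with a softmax and picks the most likely page type. The class labels and score values are hypothetical, and the network that would produce the scores is not shown.

```python
import math
from typing import Dict, Tuple


def softmax(scores: Dict[str, float]) -> Dict[str, float]:
    """Convert raw class scores into probabilities that sum to 1."""
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {label: math.exp(value - top) for label, value in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def pick_page_type(scores: Dict[str, float]) -> Tuple[str, float]:
    """Return the most likely page type and its confidence."""
    probs = softmax(scores)
    label = max(probs, key=probs.get)
    return label, probs[label]


# Hypothetical scores for one photo of a certificate page.
label, confidence = pick_page_type(
    {"target_coding_page": 3.0, "other_page": 0.5, "cover": -1.0}
)
```

A picture would be passed on to the later steps only when the predicted label is the target coding page, optionally subject to a minimum confidence.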
In this embodiment, a DenseNet model is preferred when training the first processing model. DenseNet is a dense convolutional network (Dense Convolutional Network) in which each layer receives the outputs of all preceding layers, which gives it advantages over conventional convolutional networks: it alleviates the vanishing-gradient problem, strengthens feature propagation, encourages feature reuse, reduces the amount of computation, and so on. The inventors verified that applying the DenseNet model to the identification of the target coding page picture in the present application can reach an accuracy of 99%.
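The dense connectivity can be illustrated by counting channels: because each layer reads the concatenation of everything produced before it, the input width of layer l in a dense block is the initial channel count plus (l - 1) times the growth rate. The sketch below only does this bookkeeping; the function name and the example sizes (64 initial channels, growth rate 32) are illustrative choices, not values from the application.

```python
from typing import List, Tuple


def dense_block_input_channels(
    initial_channels: int, growth_rate: int, num_layers: int
) -> Tuple[List[int], int]:
    """Return the input channel count seen by each layer of one dense block,
    together with the channel count leaving the block."""
    channels = initial_channels
    seen = []
    for _ in range(num_layers):
        seen.append(channels)    # this layer reads all feature maps produced so far
        channels += growth_rate  # its own output is concatenated for later layers
    return seen, channels


per_layer, block_output = dense_block_input_channels(64, 32, 4)
```

Each layer adds only `growth_rate` new feature maps while reusing all earlier ones, which is the feature reuse referred to above.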
Step 203, identifying a first region corresponding to the target code set in the picture of the target coding page.
Referring to FIG. 4, an example of identifying the overall coding region corresponding to a vehicle certificate number from a vehicle certificate information page is provided. In this example, the overall coding region corresponding to the target code set, such as the vehicle certificate number, may first be identified from the picture of the target coding page using, but not limited to, OCR or similar technologies.
In practical applications, the pictures provided by vehicle owners for identifying the overall coding region of the vehicle certificate number are usually diverse, including for example color photographs, photocopies and scanned documents, and their background content is cluttered and varied. Because such uncertain interference factors greatly reduce recognition accuracy, this embodiment preprocesses the picture before using OCR to recognize the overall coding region, so as to reduce the adverse effect of the interference on recognition accuracy as much as possible. The preprocessing includes, but is not limited to, any one or more of the following: graying, binarization, dilation, HSV (Hue-Saturation-Value) extraction, region block screening, and the like.
The image preprocessing operations can be performed with the OpenCV library. Experiments show that the OCR recognition accuracy is 68% on pictures without preprocessing and can exceed 95% after preprocessing.
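Three of the preprocessing operations named above (graying, binarization and dilation) can be sketched in plain Python on nested lists so that the logic is visible without any dependency. This is an illustration only: the fixed threshold of 128, the 3x3 neighbourhood and the function names are assumptions, and a real pipeline would more likely call the equivalent OpenCV routines.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each in 0..255


def to_gray(image: List[List[Pixel]]) -> List[List[int]]:
    """Graying: weighted sum of the colour channels (common luma weights)."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row] for row in image]


def binarize(gray: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Binarization: pixels at or above the threshold become 255, others 0."""
    return [[255 if v >= threshold else 0 for v in row] for row in gray]


def dilate(binary: List[List[int]]) -> List[List[int]]:
    """Dilation with a 3x3 neighbourhood: a pixel becomes 255 if it or any
    of its eight neighbours is 255."""
    if not binary:
        return []
    h, w = len(binary), len(binary[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] == 255:
                        out[y][x] = 255
    return out
```

Dilation thickens foreground strokes so that the characters of one number merge into a single block, which makes the region easier to locate before OCR.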
And 204, extracting the picture of the first area from the pictures of the target coding page to obtain an integral coding area picture of the target coding set.
Then, based on technologies such as image matting and image clipping, an overall coding region corresponding to the target coding set is extracted from the image of the target coding page, that is, the first region is extracted, so as to obtain an overall coding region image corresponding to the target coding set.
And step 205, identifying each second area corresponding to each single code in the whole coding area picture.
And step 206, based on the identified second regions, segmenting the overall coding region picture to obtain at least one single coding region picture corresponding to at least one single code.
After the first region of the target coding set is identified and the corresponding overall coding region picture is extracted, single characters (single codes) can be further identified and cut out of the overall coding region picture, so that the local features (such as font, font size, and style) of the single codes of the target coding set, such as the vehicle license number, can be determined more accurately.
OCR character recognition is again used in this process: the coordinates of each single code are recognized based on OCR, the second region corresponding to each single code is determined, and a plurality of single coding region pictures are then segmented out to serve as the basis for the subsequent extraction of local features of the target coding set. As shown in fig. 5, after single-code recognition and segmentation are performed on the overall coding region picture of the target coding set "320702207203", a series of single coding region pictures corresponding to the single codes "3", "2", "0", … is obtained.
It should be noted that, in the example of fig. 5, all the single codes are identified and segmented from the overall coding region picture. This is only a preferred embodiment of the present application, and the specific implementation is not limited thereto: depending on the actual processing complexity and discrimination performance requirements, only some of the single codes (more than a number threshold, but not all) may be segmented from the overall coding region picture for local feature extraction, which is not limited in this embodiment.
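Steps 204 to 206 amount to two rounds of cropping: first the first region out of the page picture, then each second region out of the overall coding region picture. A minimal sketch follows, assuming that the OCR stage has already produced bounding boxes in the form (left, top, right, bottom) with exclusive right and bottom edges. The box format, the function names, and the left-to-right ordering are assumptions made for illustration.

```python
def crop(picture, box):
    """Return the sub-picture inside box = (left, top, right, bottom)."""
    left, top, right, bottom = box
    return [row[left:right] for row in picture[top:bottom]]


def segment_single_codes(overall_picture, second_regions):
    """Cut one picture per single code, ordered from left to right."""
    ordered = sorted(second_regions, key=lambda box: box[0])
    return [crop(overall_picture, box) for box in ordered]


# Toy 4x6 "page" whose pixel values encode their own coordinates (10*y + x).
page = [[10 * y + x for x in range(6)] for y in range(4)]
first_region = (1, 1, 5, 3)                      # hypothetical OCR result for the code set
overall = crop(page, first_region)               # overall coding region picture
singles = segment_single_codes(overall, [(2, 0, 4, 2), (0, 0, 2, 2)])
```

Note that the second regions are expressed relative to the overall coding region picture, not to the original page.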
Step 207, inputting the whole coding region picture into a first feature extraction sub-network of a feature extraction layer of a second processing model, so as to obtain global features of a target coding set output by the first feature extraction sub-network.
After the global features and the local features of the target code set of the certificate to be identified are extracted, the authenticity of the target code set is identified by combining its global features and local features, so as to determine whether counterfeiting behaviors such as forgery and alteration exist.
In this embodiment, a second processing model is trained in advance for the authenticity identification of the target code set; the second processing model may be, but is not limited to, a Densenet model.
The traditional Densenet model comprises a feature extraction layer. The picture to be recognized is input into the feature extraction layer of the Densenet model, the feature extraction layer extracts the corresponding image features, and the picture is then classified based on the extracted image features.
In the embodiment of the application, in order to identify the authenticity of a target code set such as a vehicle license number more accurately, the traditional Densenet model is improved: its feature extraction layer is restructured into two sub-networks. Specifically, as shown in fig. 6, the improved feature extraction layer includes a first feature extraction sub-network for global feature extraction and a second feature extraction sub-network for local feature extraction (the feature extraction layer of the traditional Densenet model is a single integral part and is not subdivided). In the embodiment of the present application, the proposed Densenet model whose front-end feature extraction layer is composed of two sub-networks is referred to as a Multi-densenet model.
Referring to fig. 6, the first feature extraction sub-network, which performs global feature extraction, may be composed of a connection layer, a pooling layer, and a Dense Block layer.
In addition to the feature extraction layer with its two feature extraction sub-networks, the Multi-densenet model in this application also comprises several connection layers and several Dense Block layers. The connection layer mainly performs feature fusion and down-sampling. Each Dense Block layer contains multiple convolutional layers used to extract abstract features from the picture, and the input of each convolutional layer in a Dense Block layer comprises the outputs of all preceding convolutional layers in that block.
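The dense connectivity inside a Dense Block layer can be illustrated with a toy sketch in which feature maps are replaced by plain lists of numbers and each "convolutional layer" is an arbitrary function. This is only an illustration of how inputs are concatenated; the function name `dense_block` and the toy layer are assumptions, and no learned weights are involved.

```python
def dense_block(block_input, layers):
    """Apply layers so that each one sees the block input plus all earlier outputs."""
    features = [list(block_input)]
    for layer in layers:
        # Concatenate the block input and the outputs of all preceding layers.
        concatenated = [value for feature in features for value in feature]
        features.append(layer(concatenated))
    # The block output is the concatenation of everything produced so far.
    return [value for feature in features for value in feature]


def toy_layer(inputs):
    """Stand-in for a convolutional layer: emits one value, the sum of its inputs."""
    return [sum(inputs)]


output = dense_block([1.0, 2.0], [toy_layer, toy_layer, toy_layer])
```

With three layers, the inputs seen by the layers have lengths 2, 3, and 4, which shows how the input of each layer grows with the number of preceding layers.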
Based on the Multi-densenet model provided in this embodiment of the present application, the overall coding region picture corresponding to the target coding set (for example, the overall coding region picture of the target coding set "320702207203" in fig. 5) is input into the first feature extraction sub-network of the feature extraction layer of the second processing model, to obtain the global features of the target coding set output by the first feature extraction sub-network, for example, features of the target coding set in terms of image quality, character spacing, arrangement shape of the numbers, and the like.
In a specific implementation, after the first feature extraction sub-network performs feature extraction processing on the whole coding region picture of the target coding set, the obtained global features/whole features of the target coding set are further used as input of a next functional layer connected to the feature extraction layer, and as shown in fig. 6, the extracted global features/whole features of the target coding set are continuously input into a connection layer connected to the first feature extraction sub-network.
Step 208, inputting at least one single coding region picture into a second feature extraction sub-network of a feature extraction layer of a second processing model, so as to obtain local features of the target coding set output by the second feature extraction sub-network.
Correspondingly, each single coding region picture corresponding to each single code (for example, the series of single coding region pictures corresponding to the single codes "3", "2", "0", … in fig. 5) is input into the second feature extraction sub-network of the feature extraction layer of the second processing model, to obtain the local features of the target coding set output by the second feature extraction sub-network, for example, features of the single codes of the target coding set in terms of font, font size, style (stroke weight, whether bold) and/or special additional or omitted symbols (such as following dashed lines).
Similarly, after the second feature extraction sub-network performs feature extraction processing on the single coding region picture of the target coding set, the obtained local features of the single coding may further serve as input of a next functional layer connected to the feature extraction layer, and as shown in fig. 6, the extracted local features may be continuously input into a connection layer connected to the second feature extraction sub-network.
Thus, the overall features input to the network can be expressed as:
$F = H\big(h_{all}(x_{all}),\ h_{part}(x_{part})\big)$
where $F$ denotes all features entering the network, $x_{all}$ and $x_{part}$ correspond to the global and local inputs respectively, $H$ denotes the transformation function (corresponding to one or two groups of Batch-Normalization, ReLU and Convolution operations), $h_{all}$ corresponds to the global features, and $h_{part}$ corresponds to the local features.
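The composition F = H(h_all(x_all), h_part(x_part)) can be made concrete with a toy sketch. The stand-ins below are not the learned sub-networks of the source: here `h_all` and `h_part` simply average pixel values, and `H` only concatenates and applies ReLU, whereas the real transformation would consist of Batch-Normalization, ReLU and Convolution groups with trained weights. All function bodies are assumptions chosen to keep the example runnable.

```python
def relu(values):
    """ReLU: negative values are clamped to zero."""
    return [max(0.0, value) for value in values]


def h_all(overall_picture):
    """Stand-in for the first sub-network: one global feature (mean intensity)."""
    flat = [pixel for row in overall_picture for pixel in row]
    return [sum(flat) / len(flat)]


def h_part(single_pictures):
    """Stand-in for the second sub-network: one local feature per single code."""
    return [h_all(picture)[0] for picture in single_pictures]


def H(global_features, local_features):
    """Placeholder transform: concatenate both feature groups, then apply ReLU."""
    return relu(global_features + local_features)


x_all = [[0, 2], [4, 6]]               # toy overall coding region picture
x_part = [[[1, 1]], [[5, 5]]]          # two toy single coding region pictures
F = H(h_all(x_all), h_part(x_part))
```

The point of the sketch is only the data flow: two separately extracted feature groups are merged into a single feature vector before the remaining layers of the model see them.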
Step 209, performing authenticity identification processing on the global features and the local features by using other functional layers except the feature extraction layer in the second processing model to obtain a first identification result of the overall code of the target code set and at least one second identification result of at least one single code.
Specifically, as shown in fig. 6, after the global features and the local features of the target code set are extracted by the second processing model of this embodiment, namely the Multi-densenet model, the extracted global features and local features may be subjected to authenticity identification processing by at least one connection layer and at least one Dense Block layer of the Multi-densenet model.
The output of the model may also be split into two parts, corresponding to the input of the two parts of the Multi-densenet model, i.e. a first recognition result corresponding to the overall encoding of the target encoding set, and a respective second recognition result corresponding to the respective single encoding.
The first identification result and the second identification result may each be a category result given for the corresponding object to be identified (e.g., the overall code or a single code), such as category 0 or category 1. Optionally, category 0 may indicate that the object to be identified is a genuine vehicle license number, and category 1 that it is a non-genuine vehicle license number obtained by counterfeiting, alteration, or the like. Alternatively, the first identification result and the second identification result may be confidence levels that the corresponding object to be identified belongs to each category, for example, a confidence level of 10% for category 0 and 90% for category 1. The category of the object to be identified can then be determined based on a preset identification policy; for example, the category is determined to be 1 because the 90% confidence level for category 1 exceeds a set confidence threshold.
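Turning confidence levels into a category can be sketched as a small function. The source does not state the value of the confidence threshold or what happens when no category exceeds it, so the default threshold of 0.8 and the `None` return value for an undecided case are assumptions.

```python
GENUINE, FAKE = 0, 1   # category 0 and category 1 as described above


def category_from_confidences(confidences, threshold=0.8):
    """Return the category whose confidence exceeds the threshold, else None."""
    best = max(confidences, key=confidences.get)
    return best if confidences[best] > threshold else None


result = category_from_confidences({GENUINE: 0.10, FAKE: 0.90})
```

With the 10% / 90% example from the text, the function returns category 1.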
Step 210, determining the authenticity of the target code set based on the first recognition result and the at least one second recognition result.
Finally, the authenticity of the target coding set can be determined based on a preset identification strategy by combining the first identification result corresponding to the whole coding of the target coding set and each second identification result corresponding to each single coding.
Optionally, when the first recognition result corresponding to the overall code of the target code set indicates that the target code set is false, and/or the second recognition results corresponding to the single codes indicate that the number of false single codes exceeds a set value (e.g., 3), the target code set is determined to be false, and accordingly the certificate to be identified is a counterfeit certificate. Fig. 7 provides an example of the processing procedure for determining, based on a corresponding determination policy, whether the target code set is a counterfeit certificate code set.
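One possible reading of this determination policy is sketched below. The text allows "and/or"; the sketch picks the "or" variant, and the function name and the encoding of categories as 0 and 1 are assumptions.

```python
GENUINE, FAKE = 0, 1


def is_code_set_fake(first_result, second_results, max_fake_singles=3):
    """Fake if the overall code is fake, or if more than max_fake_singles single codes are fake."""
    fake_singles = sum(1 for result in second_results if result == FAKE)
    return first_result == FAKE or fake_singles > max_fake_singles


# Overall code judged genuine, but four of twelve single codes judged fake.
verdict = is_code_set_fake(GENUINE, [FAKE] * 4 + [GENUINE] * 8)
```

Because the set value must be exceeded, exactly three fake single codes do not by themselves mark the code set as fake under this reading.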
In this embodiment, a Multi-densenet model whose front-end feature extraction layer has two sub-networks is provided and applied to the authenticity identification of certificate codes. The overall features of the certificate code in the certificate picture are retained while the detailed features of each single code are also obtained, so that counterfeit certificate codes can be identified with high accuracy, and counterfeiting of certificates such as vehicle certificates can be identified accurately and effectively.
In an alternative embodiment of the present application, referring to the flowchart of the certificate information authentication method shown in fig. 8, the certificate information authentication method may further include the following processes:
step 211, obtaining the picture of the target coding page with the identification error;
and step 212, performing iterative training on the second processing model by using the picture of the target coding page with the identified error so as to improve the accuracy of the second processing model.
In order to improve the model accuracy of the second processing model, in this embodiment, the pictures of the target encoded page with a misjudgment are collected and used as input information to perform iterative training on the second processing model (such as the Multi-densenet model mentioned above) so as to improve the identification accuracy of the second processing model on the certificate information.
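The collection step of this feedback loop can be sketched as a filter over reviewed samples. The source does not say how a misjudgment is detected; the sketch assumes that a verified label (for example from manual review) is available for each picture, and `collect_misjudged` and `predict` are hypothetical names. The retraining itself depends on the training framework and is not shown.

```python
def collect_misjudged(samples, predict):
    """Return the (picture, verified_label) pairs that the model got wrong."""
    return [(picture, label) for picture, label in samples if predict(picture) != label]


# Toy model: calls every picture genuine (category 0).
def always_genuine(picture):
    return 0


reviewed = [("page_a", 0), ("page_b", 1), ("page_c", 1)]
retraining_set = collect_misjudged(reviewed, always_genuine)
```

The resulting list is what would be fed back as additional training input to the second processing model.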
In correspondence to the above-mentioned certificate information authentication method, an embodiment of the present application further provides a certificate information authentication apparatus, and referring to a schematic structural diagram of the certificate information authentication apparatus shown in fig. 9, the certificate information authentication apparatus may include:
an acquiring unit 901, configured to acquire a picture of a target encoded page in a certificate;
a first extracting unit 902, configured to extract, from the pictures of the target coding page, an entire coding region picture corresponding to a target coding set and at least one single coding region picture corresponding to at least one single coding; the target encoding set comprises at least one single encoding;
a second extraction unit 903, configured to extract features of the entire coding region picture to obtain global features of a target coding set; extracting the characteristics of at least one single coding region picture to obtain the local characteristics of the target coding set;
and an identifying unit 904, configured to identify, based on the global feature and the local feature, whether the target code set is true or false.
In an optional implementation manner of the embodiment of the present application, the acquiring unit 901 is specifically configured to: obtaining at least one picture of at least one page of the certificate; and identifying a picture of a target coding page from the at least one picture by using a first processing model, wherein the picture of the target coding page comprises a target coding set.
In an optional implementation manner of the embodiment of the present application, the first extracting unit 902 is specifically configured to: identifying a first area corresponding to a target coding set in a picture of the target coding page; extracting the picture of the first area from the picture of the target coding page to obtain an integral coding area picture of a target coding set; identifying each second area corresponding to each single code in the whole coding area picture; and segmenting the whole coding region picture based on the identified second regions to obtain at least one single coding region picture corresponding to at least one single code.
In an optional implementation manner of the embodiment of the present application, the second extraction unit 903 is specifically configured to: inputting the whole coding region picture into a first feature extraction sub-network of a feature extraction layer of a second processing model to obtain global features of a target coding set output by the first feature extraction sub-network; inputting at least one single coding region picture into a second feature extraction sub-network of a feature extraction layer of a second processing model to obtain local features of a target coding set output by the second feature extraction sub-network;
the authentication unit 904 is specifically configured to: performing authenticity identification processing on the global features and the local features by using other functional layers except the feature extraction layer in the second processing model to obtain a first identification result of the overall code of the target code set and at least one second identification result of at least one single code; and determining the authenticity of the target code set based on the first recognition result and the at least one second recognition result.
In an alternative implementation manner of the embodiment of the present application, referring to fig. 10, the certificate information authentication apparatus may further include:
a feedback unit 905 for obtaining a picture of a target encoded page for which an error is identified; and performing iterative training on the second processing model by using the picture of the target coding page with the identified error so as to improve the accuracy of the second processing model.
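The division of the apparatus into units 901 to 904 can be sketched as a small pipeline object in which each unit is a pluggable callable. The class name, field names, and stub implementations below are hypothetical; in a real system each callable would wrap the corresponding model or OCR step.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class CertificateAuthenticator:
    acquire: Callable[[List[Any]], Any]                             # acquiring unit 901
    extract_regions: Callable[[Any], Tuple[Any, List[Any]]]         # first extracting unit 902
    extract_features: Callable[[Any, List[Any]], Tuple[Any, Any]]   # second extraction unit 903
    identify: Callable[[Any, Any], bool]                            # identifying unit 904

    def run(self, pictures: List[Any]) -> bool:
        """Chain the four units; returns the result of the identifying unit."""
        page = self.acquire(pictures)
        overall, singles = self.extract_regions(page)
        global_features, local_features = self.extract_features(overall, singles)
        return self.identify(global_features, local_features)


# Stub units operating on strings instead of pictures, for illustration only.
authenticator = CertificateAuthenticator(
    acquire=lambda pictures: pictures[0],
    extract_regions=lambda page: (page, list(page)),
    extract_features=lambda overall, singles: (len(overall), singles),
    identify=lambda global_f, local_f: global_f == len(local_f),
)
is_consistent = authenticator.run(["320702207203"])
```

Keeping the units as separate callables mirrors the functional division described above and lets the feedback unit 905 retrain the model behind `extract_features` and `identify` without touching the other units.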
The certificate information authentication apparatus of the embodiments of the present application corresponds to the certificate information authentication method of the present application and can likewise be applied to any type of computer device, not limited to those listed above.
Because the certificate information authentication apparatus disclosed in the embodiments of the present application corresponds to the certificate information authentication method disclosed in the above embodiments, its description is relatively brief; for the relevant similarities, refer to the description of the certificate information authentication method in the above embodiments, which is not repeated here.
Corresponding to the above method and apparatus for authenticating certificate information, the embodiment of the present application further discloses a computer device, which may be, but is not limited to, any type of computer device listed above, as shown in fig. 11, where the computer device at least includes:
a memory 1101 for storing at least one set of instructions;
a processor 1102 for calling and executing the instruction set in the memory, and executing the instruction set to perform the processes of the certificate information authentication method as shown in any one of the above embodiments or the functions of the certificate information authentication apparatus as shown in any one of the above embodiments.
It should be noted that, in the present specification, the embodiments are all described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments may be referred to each other.
For convenience of description, the above system or apparatus is described as being divided by function into various modules or units. Of course, when implementing the present application, the functions of the units may be implemented in one or more pieces of software and/or hardware.
From the above description of the embodiments, it is clear to those skilled in the art that the present application can be implemented by software plus necessary general hardware platform. Based on such understanding, the technical solutions of the present application may be essentially or partially implemented in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the embodiments or some parts of the embodiments of the present application.
Finally, it is further noted that, herein, relational terms such as first, second, third, fourth, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The foregoing is only a preferred embodiment of the present application and it should be noted that those skilled in the art can make several improvements and modifications without departing from the principle of the present application, and these improvements and modifications should also be considered as the protection scope of the present application.

Claims (10)

1. A certificate information authentication method, comprising:
obtaining a picture of a target coding page in the certificate;
extracting an integral coding region picture corresponding to a target coding set and at least one single coding region picture corresponding to at least one single code from the pictures of the target coding page; the target encoding set comprises at least one single encoding;
extracting the characteristics of the overall coding region picture to obtain the global characteristics of a target coding set;
extracting the characteristics of at least one single coding region picture to obtain the local characteristics of a target coding set;
and identifying the authenticity of the target code set based on the global features and the local features.
2. The method of claim 1, wherein the obtaining a picture of a target coding page in the certificate comprises:
obtaining at least one picture of at least one page of the certificate;
and identifying a picture of a target coding page from the at least one picture by using a first processing model, wherein the picture of the target coding page comprises a target coding set.
3. The method of claim 1, wherein the extracting, from the pictures of the target encoded page, an entire encoded region picture corresponding to a target encoding set and at least one single encoded region picture corresponding to at least one single encoding comprises:
identifying a first area corresponding to a target coding set in a picture of the target coding page;
extracting the picture of the first area from the picture of the target coding page to obtain an integral coding area picture of a target coding set;
identifying each second area corresponding to each single code in the whole coding area picture;
and segmenting the whole coding region picture based on the identified second regions to obtain at least one single coding region picture corresponding to at least one single code.
4. The method according to claim 1, wherein the extracting features of the entire coding region picture to obtain global features of a target coding set, and extracting features of at least one single coding region picture to obtain local features of the target coding set comprises:
inputting the whole coding region picture into a first feature extraction sub-network of a feature extraction layer of a second processing model to obtain global features of a target coding set output by the first feature extraction sub-network;
and inputting at least one single coding region picture into a second feature extraction sub-network of a feature extraction layer of a second processing model to obtain the local features of the target coding set output by the second feature extraction sub-network.
5. The method of claim 4, wherein the identifying the authenticity of the target code set based on the global features and the local features comprises:
performing authenticity identification processing on the global features and the local features by using other functional layers except the feature extraction layer in the second processing model to obtain a first identification result of the overall code of the target code set and at least one second identification result of at least one single code;
and determining the authenticity of the target code set based on the first recognition result and the at least one second recognition result.
6. The method of claim 4, further comprising:
obtaining a picture of a target coding page with an identified error;
and performing iterative training on the second processing model by using the picture of the target coding page with the identified error so as to improve the accuracy of the second processing model.
7. A certificate information authentication apparatus, comprising:
the acquisition unit is used for acquiring the picture of the target coding page in the certificate;
the first extraction unit is used for extracting an integral coding region picture corresponding to a target coding set and at least one single coding region picture corresponding to at least one single code from the pictures of the target coding page; the target encoding set comprises at least one single encoding;
the second extraction unit is used for extracting the characteristics of the overall coding region picture to obtain the global characteristics of the target coding set; extracting the characteristics of at least one single coding region picture to obtain the local characteristics of the target coding set;
and the identification unit is used for identifying the authenticity of the target code set based on the global characteristics and the local characteristics.
8. The apparatus according to claim 7, wherein the first extraction unit is specifically configured to:
identifying a first area corresponding to a target coding set in a picture of the target coding page;
extracting the picture of the first area from the picture of the target coding page to obtain an integral coding area picture of a target coding set;
identifying each second area corresponding to each single code in the whole coding area picture;
and segmenting the whole coding region picture based on the identified second regions to obtain at least one single coding region picture corresponding to at least one single code.
9. The apparatus according to claim 7, wherein the second extraction unit is specifically configured to:
inputting the whole coding region picture into a first feature extraction sub-network of a feature extraction layer of a second processing model to obtain global features of a target coding set output by the first feature extraction sub-network;
inputting at least one single coding region picture into a second feature extraction sub-network of a feature extraction layer of a second processing model to obtain local features of a target coding set output by the second feature extraction sub-network;
the identification unit is specifically configured to:
performing authenticity identification processing on the global features and the local features by using other functional layers except the feature extraction layer in the second processing model to obtain a first identification result of the overall code of the target code set and at least one second identification result of at least one single code;
and determining the authenticity of the target code set based on the first recognition result and the at least one second recognition result.
10. A computer device, comprising:
a memory for storing at least one set of instructions;
a processor for invoking and executing the set of instructions in the memory, the certificate information authentication method of any of claims 1-6 being performed by executing the set of instructions.
CN202010063773.XA 2020-01-20 2020-01-20 Certificate information identification method and device and computer equipment Active CN111259894B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010063773.XA CN111259894B (en) 2020-01-20 2020-01-20 Certificate information identification method and device and computer equipment

Publications (2)

Publication Number Publication Date
CN111259894A true CN111259894A (en) 2020-06-09
CN111259894B CN111259894B (en) 2023-07-07

Family

ID=70949134

Country Status (1)

Country Link
CN (1) CN111259894B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112115960A (en) * 2020-06-15 2020-12-22 曹辉 Method and system for identifying collection
CN116758564A (en) * 2023-08-15 2023-09-15 山东履信思源防伪技术有限公司 Method and system for comparing OCR character recognition results

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20010045451A1 (en) * 2000-02-28 2001-11-29 Tan Warren Yung-Hang Method and system for token-based authentication
JP2008310691A (en) * 2007-06-15 2008-12-25 Internatl Currency Technologies Corp Method for identifying photo identification
CN108229457A (en) * 2017-12-14 2018-06-29 深圳市商汤科技有限公司 Verification method, device, electronic equipment and the storage medium of certificate
CN108229339A (en) * 2017-12-14 2018-06-29 深圳市商汤科技有限公司 Identification apparatus, certificate personal identification method and storage medium
CN108229499A (en) * 2017-10-30 2018-06-29 北京市商汤科技开发有限公司 Certificate recognition methods and device, electronic equipment and storage medium
CN108288073A (en) * 2018-01-30 2018-07-17 北京小米移动软件有限公司 Picture authenticity identification method and device, computer readable storage medium
CN108573202A (en) * 2017-03-17 2018-09-25 北京旷视科技有限公司 Identity identifying method, device and system and terminal, server and storage medium
CN109409349A (en) * 2018-02-02 2019-03-01 深圳壹账通智能科技有限公司 Credit certificate discrimination method, device, terminal and computer readable storage medium
CN109492643A (en) * 2018-10-11 2019-03-19 平安科技(深圳)有限公司 Certificate recognition methods, device, computer equipment and storage medium based on OCR
CN109543551A (en) * 2018-10-26 2019-03-29 平安科技(深圳)有限公司 Identity card identifies processing method, device, computer equipment and storage medium
CN110046644A (en) * 2019-02-26 2019-07-23 阿里巴巴集团控股有限公司 A kind of method and device of certificate false proof calculates equipment and storage medium
CN110570209A (en) * 2019-07-30 2019-12-13 平安科技(深圳)有限公司 Certificate authenticity verification method and device, computer equipment and storage medium
CN110598699A (en) * 2019-09-16 2019-12-20 华中科技大学 Anti-counterfeiting bill authenticity distinguishing system and method based on multispectral image
CN110598710A (en) * 2019-08-21 2019-12-20 阿里巴巴集团控股有限公司 Certificate identification method and device

Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20010045451A1 (en) * 2000-02-28 2001-11-29 Tan Warren Yung-Hang Method and system for token-based authentication
JP2008310691A (en) * 2007-06-15 2008-12-25 Internatl Currency Technologies Corp Method for identifying photo identification
CN108573202A (en) * 2017-03-17 2018-09-25 北京旷视科技有限公司 Identity identifying method, device and system and terminal, server and storage medium
CN108229499A (en) * 2017-10-30 2018-06-29 北京市商汤科技开发有限公司 Certificate recognition methods and device, electronic equipment and storage medium
CN108229339A (en) * 2017-12-14 2018-06-29 深圳市商汤科技有限公司 Identification apparatus, certificate personal identification method and storage medium
CN108229457A (en) * 2017-12-14 2018-06-29 深圳市商汤科技有限公司 Verification method, device, electronic equipment and the storage medium of certificate
CN108288073A (en) * 2018-01-30 2018-07-17 北京小米移动软件有限公司 Picture authenticity identification method and device, computer readable storage medium
CN109409349A (en) * 2018-02-02 2019-03-01 深圳壹账通智能科技有限公司 Credit certificate discrimination method, device, terminal and computer readable storage medium
CN109492643A (en) * 2018-10-11 2019-03-19 平安科技(深圳)有限公司 Certificate recognition methods, device, computer equipment and storage medium based on OCR
CN109543551A (en) * 2018-10-26 2019-03-29 平安科技(深圳)有限公司 Identity card identifies processing method, device, computer equipment and storage medium
CN110046644A (en) * 2019-02-26 2019-07-23 阿里巴巴集团控股有限公司 A kind of method and device of certificate false proof calculates equipment and storage medium
CN110570209A (en) * 2019-07-30 2019-12-13 平安科技(深圳)有限公司 Certificate authenticity verification method and device, computer equipment and storage medium
CN110598710A (en) * 2019-08-21 2019-12-20 阿里巴巴集团控股有限公司 Certificate identification method and device
CN110598699A (en) * 2019-09-16 2019-12-20 华中科技大学 Anti-counterfeiting bill authenticity distinguishing system and method based on multispectral image

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
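MICHAEL RYAN et al.: "An Examination of Character Recognition on ID card using Template Matching Approach" *
李政: "Research on an ID card character recognition algorithm based on image matching and its software design" *
李欢: "Research on signature authenticity identification methods based on sparse representation" *
章小兵 et al.: "Analysis and regulation of counterfeit road transport practitioner qualification certificates" *
罗涛: "Design and implementation of a software system for comprehensive ID card information collection and recognition" *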

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112115960A (en) * 2020-06-15 2020-12-22 曹辉 Method and system for identifying collection
CN116758564A (en) * 2023-08-15 2023-09-15 山东履信思源防伪技术有限公司 Method and system for comparing OCR character recognition results
CN116758564B (en) * 2023-08-15 2023-11-10 山东履信思源防伪技术有限公司 Method and system for comparing OCR character recognition results

Also Published As

Publication number Publication date
CN111259894B (en) 2023-07-07

Similar Documents

Publication Publication Date Title
Asghar et al. Copy-move and splicing image forgery detection and localization techniques: a review
CN106951832B (en) Verification method and device based on handwritten character recognition
CN112508011A (en) OCR (optical character recognition) method and device based on neural network
EP2360619A1 (en) Fast fingerprint searching method and fast fingerprint searching system
CN111191568A (en) Method, device, equipment and medium for identifying copied image
CN108830275B (en) Method and device for identifying dot matrix characters and dot matrix numbers
CN113792659B (en) Document identification method and device and electronic equipment
US10867170B2 (en) System and method of identifying an image containing an identification document
Mahale et al. Image inconsistency detection using local binary pattern (LBP)
CN113111880A (en) Certificate image correction method and device, electronic equipment and storage medium
US11823521B2 (en) Image processing method for an identity document
CN111259894B (en) Certificate information identification method and device and computer equipment
Fu et al. Robust GAN-face detection based on dual-channel CNN network
Sirajudeen et al. Forgery document detection in information management system using cognitive techniques
Raigonda Signature Verification System Using SSIM In Image Processing
Das et al. A robust method for detecting copy-move image forgery using stationary wavelet transform and scale invariant feature transform
CN112232336A (en) Certificate identification method, device, equipment and storage medium
Mahdi et al. Detection of Copy-Move Forgery in Digital Image Based on SIFT Features and Automatic Matching Thresholds
Irimia et al. Official Document Identification and Data Extraction using Templates and OCR
CN113888675A (en) Method, system, apparatus, and medium for generating a document image
CN110197140B (en) Material auditing method and equipment based on character recognition
Thaiparnit et al. Tracking vehicles system based on license plate recognition
CN113158745A (en) Disorder code document picture identification method and system based on multi-feature operator
CN114005131A (en) Certificate character recognition method and device
Bibi et al. Document forgery detection using source printer identification: A comparative study of text‐dependent versus text‐independent analysis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant