CN113989548A - Certificate classification model training method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN113989548A
Authority
CN
China
Prior art keywords
certificate
certificate image
image set
enhanced
similar
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111219802.8A
Other languages
Chinese (zh)
Other versions
CN113989548B (en
Inventor
董伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Bank Co Ltd
Original Assignee
Ping An Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Bank Co Ltd filed Critical Ping An Bank Co Ltd
Priority to CN202111219802.8A priority Critical patent/CN113989548B/en
Publication of CN113989548A publication Critical patent/CN113989548A/en
Application granted granted Critical
Publication of CN113989548B publication Critical patent/CN113989548B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to an artificial intelligence technology, and discloses a certificate classification model training method, which comprises the following steps: performing data enhancement operation on an original certificate image set to obtain an enhanced certificate image set, generating similar certificate images of corresponding enhanced certificate images according to pixel distribution information of each enhanced certificate image in the enhanced certificate image set, and selecting the similar certificate images with effective information content meeting a first preset condition to form an effective similar certificate image set; and carrying out image classification prediction training on the pre-constructed certificate classification model by utilizing the original certificate image set, the enhanced certificate image set and the effective similar certificate image set to obtain a trained certificate classification model, and carrying out classification judgment on the certificate image to be detected by utilizing the trained certificate classification model to obtain a classification result of the certificate image to be detected. The invention also provides a device, equipment and medium for training the certificate classification model. The invention can improve the accuracy of the certificate classification model training.

Description

Certificate classification model training method and device, electronic equipment and storage medium
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a certificate classification model training method and device, electronic equipment and a computer readable storage medium.
Background
When a user handles services such as finance, government affairs, or medical treatment online through a mobile phone app or a mini program, the corresponding certificate images must be uploaded as required. For regulatory or compliance risk management reasons, each uploaded certificate image must be checked to identify whether its type meets the requirements. For example, to guarantee the clarity and integrity of uploaded certificate images, a business system may forbid users from uploading a recaptured photo or a screenshot of a certificate.
Common certificate image types include certificate photocopies, electronic certificates, copies, and screenshots. A common way to identify the type of a certificate image uploaded by a user is to use a deep-learning-based neural network model to extract the features of certificate images of each type and to classify the image according to the extracted features.
Classifying certificate images with a neural network model relies on a large number of certificate image samples: classification accuracy can be ensured only if the model is trained on features extracted from many samples. However, certificate images contain users' personal private data, and because that data is protected, a large number of certificate sample images generally cannot be obtained directly. The accuracy of current certificate classification model training based on neural network models therefore needs to be improved.
Disclosure of Invention
The invention provides a method and a device for training a certificate classification model and a computer readable storage medium, and mainly aims to improve the accuracy of training the certificate classification model.
In order to achieve the above object, the present invention provides a method for training a certificate classification model, comprising:
acquiring an original certificate image set, and performing data enhancement operation on the original certificate image set to obtain an enhanced certificate image set;
extracting pixel distribution information of each enhanced certificate image in the enhanced certificate image set, and generating a similar certificate image of the corresponding enhanced certificate image according to the pixel distribution information;
calculating the effective information content of each similar certificate image, and selecting the similar certificate images with the effective information content meeting a first preset condition to form an effective similar certificate image set;
and performing predictive training of image classification on the pre-constructed certificate classification model by using the original certificate image set, the enhanced certificate image set and the effective similar certificate image set until the predictive training meets a second preset condition, and quitting the predictive training to obtain the trained certificate classification model.
Optionally, the performing a data enhancement operation on the original certificate image set includes:
rotating the original certificate image set according to a preset rotation angle to obtain a rotated image set;
carrying out zooming operation on the original certificate image set according to a preset zooming proportion to obtain a zoomed image set;
performing at least one noise enhancement operation of noise addition on the original certificate image set to obtain an enhanced noise image set;
and collecting the rotated image set, the zoomed image set and the enhanced noise image set into an enhanced certificate image set.
Optionally, the performing at least one noise-adding enhancement operation on the original certificate image set to obtain an enhanced noise image set includes:
carrying out noise dyeing on each original certificate image in the original certificate image set to obtain a first noise-added image set;
and carrying out local masking on each first noise-added image in the first noise-added image set to obtain an enhanced noise image set.
Optionally, the generating a similar certificate image of a corresponding enhanced certificate image according to the pixel distribution information includes:
generating an initial similar certificate image corresponding to each enhanced certificate image according to the pixel distribution information of each enhanced certificate image by using a pre-constructed image generation model;
calculating the difference between each initial similar certificate image and the corresponding enhanced certificate image, and counting the generation proportion between the number of the initial similar certificate images corresponding to the difference smaller than a preset difference threshold value and the number of all the initial similar certificate images;
when the generation proportion is smaller than a preset generation proportion threshold value, adjusting parameters of the image generation model, returning to the step of generating the initial similar certificate image corresponding to each enhanced certificate image according to the pixel distribution information of each enhanced certificate image by using the pre-constructed image generation model until the generation proportion is larger than or equal to the preset generation proportion threshold value;
and selecting the initial similar certificate image corresponding to the difference smaller than the preset difference threshold value from the difference as the similar certificate image.
Optionally, the calculating the effective information amount of each similar certificate image includes:
counting the number of pixel points of effective information contained in the similar certificate image and the total number of the pixel points in the similar certificate image;
calculating the ratio of the number of pixel points containing effective information in the similar certificate image to the total number of the pixel points in the similar certificate image, and taking the ratio as the effective information content of each similar certificate image.
Optionally, before counting the number of pixel points of the effective information contained in the similar certificate image, the method further includes:
carrying out binarization processing on pixel points in each similar certificate image to obtain the gray value of each pixel point;
and taking the pixel points with the gray values larger than the preset pixel threshold value as the pixel points of the effective information.
Optionally, the performing, by using the original certificate image set, the enhanced certificate image set, and the effectively similar certificate image set, prediction training of image classification on a pre-constructed certificate classification model until the prediction training meets a second preset condition, exiting the prediction training, and obtaining a trained certificate classification model includes:
distributing a same number for each original certificate image, the corresponding enhanced certificate image and the corresponding effective similar certificate image in the original certificate image set, the enhanced certificate image set and the effective similar certificate image set;
carrying out prediction training of image classification on a pre-constructed certificate classification model by utilizing the original certificate image set, the enhanced certificate image set and the effective similar certificate image set to obtain a classification prediction result;
counting the occupation ratio between the number of the images with the same classification result and the total number of the images with the same number in the classification prediction results, and averaging the occupation ratios of all numbers to obtain an average occupation ratio;
judging whether the average occupation ratio value meets a second preset condition or not;
if the average occupation ratio does not meet the second preset condition, adjusting parameters of the pre-constructed certificate classification model, and returning to the step of performing prediction training of image classification on the pre-constructed certificate classification model by using the original certificate image set, the enhanced certificate image set and the effective similar certificate image set;
and if the average ratio meets the second preset condition, quitting the prediction training to obtain the trained certificate classification model.
In order to solve the above problem, the present invention further provides a certificate classification model training device, the device comprising:
the enhanced sample generation module is used for acquiring an original certificate image set and performing data enhancement operation on the original certificate image set to obtain an enhanced certificate image set;
the effective similar sample generation module is used for extracting the pixel distribution information of each enhanced certificate image in the enhanced certificate image set and generating a similar certificate image of the corresponding enhanced certificate image according to the pixel distribution information; calculating the effective information content of each similar certificate image, and selecting the similar certificate images with the effective information content meeting a first preset condition to form an effective similar certificate image set;
and the classification model training module is used for carrying out prediction training of image classification on the pre-constructed certificate classification model by utilizing the original certificate image set, the enhanced certificate image set and the effective similar certificate image set until the prediction training meets a second preset condition, and quitting the prediction training to obtain the trained certificate classification model.
In order to solve the above problem, the present invention also provides an electronic device, including:
a memory storing at least one instruction; and
and the processor executes the instructions stored in the memory to realize the certificate classification model training method.
In order to solve the above problem, the present invention further provides a computer-readable storage medium, in which at least one instruction is stored, and the at least one instruction is executed by a processor in an electronic device to implement the above certificate classification model training method.
The method comprises the steps of obtaining an enhanced certificate image set by performing data enhancement operation on an original certificate image set, further generating a similar certificate image set of the enhanced certificate image set, obtaining an effective similar certificate image set by screening effective information content of the similar certificate image set, and performing image classification prediction training on a pre-constructed certificate classification model by using the original certificate image set, the enhanced certificate image set and the effective similar certificate image set, so that the sample number of certificate images is expanded, the classification model is guaranteed to be trained fully, and the accuracy of certificate classification model training is improved.
Drawings
FIG. 1 is a schematic flow chart of a method for training a certificate classification model according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a detailed implementation of one of the steps in the method for training the certificate classification model shown in FIG. 1;
FIG. 3 is a flowchart illustrating a detailed implementation of one of the steps in the method for training the certificate classification model shown in FIG. 1;
FIG. 4 is a flowchart illustrating a detailed implementation of one of the steps in the method for training the certificate classification model shown in FIG. 1;
FIG. 5 is a functional block diagram of a certificate classification model training apparatus according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of an electronic device for implementing the certificate classification model training method according to an embodiment of the present invention.
Fig. 7 is a schematic structural diagram of an electronic device for implementing the certificate classification model training method according to an embodiment of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further described with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The embodiment of the application provides a certificate classification model training method. The execution subject of the certificate classification model training method includes, but is not limited to, at least one electronic device, such as a server or a terminal, that can be configured to execute the method provided by the embodiments of the present application. In other words, the certificate classification model training method can be executed by software or hardware installed in a terminal device or a server device, and the software can be a blockchain platform. The server can be an independent server, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a Content Delivery Network (CDN), big data, and artificial intelligence platforms.
Fig. 1 is a schematic flow chart of a certificate classification model training method according to an embodiment of the present invention. In this embodiment, the method for training the certificate classification model includes:
S1, acquiring an original certificate image set, and performing data enhancement operation on the original certificate image set to obtain an enhanced certificate image set;
In an embodiment of the present invention, the original certificate image set includes, but is not limited to, certificate images such as a user's identity card, driving license, academic degree certificate, and work certificate, wherein the types of images in the original certificate image set include, but are not limited to, screen capture images, reproduction images, black and white copy images, and electronic certificate images.
In the embodiment of the invention, the original certificate image uploaded by the user can be acquired from a specified system or a database according to authorization.
In the embodiment of the invention, the data enhancement operation applies operations such as rotation, scaling and random masking to the original certificate image while keeping its type label unchanged, so that the data distribution of the enhanced images conforms to the real data distribution of the original certificate images.
In detail, referring to fig. 2, the S1 includes:
S11, performing rotation operation on the original certificate image set according to a preset rotation angle to obtain a rotated image set;
S12, carrying out zooming operation on the original certificate image set according to a preset zooming proportion to obtain a zoomed image set;
S13, performing at least one noise-adding enhancement operation on the original certificate image set to obtain an enhanced noise image set;
S14, collecting the rotated image set, the zoomed image set and the enhanced noise image set into an enhanced certificate image set.
In the embodiment of the present invention, the rotation and scaling operations of the image can be implemented by using common image processing functions provided in the open-source TensorFlow deep learning library, for example, the scaling function of the image can be implemented by using a tf.
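As a dependency-light illustration of S11 and S12, the following sketch uses NumPy rather than the TensorFlow functions mentioned above. Restricting rotation to multiples of 90 degrees, using nearest-neighbor sampling for scaling, and the helper names `rotate_set`, `scale_nearest` and `scale_set` are all assumptions made for illustration and do not come from the source.

```python
import numpy as np

def rotate_set(images, quarter_turns=1):
    """Rotate every image counter-clockwise by quarter_turns * 90 degrees."""
    return [np.rot90(img, k=quarter_turns) for img in images]

def scale_nearest(img, factor):
    """Resize an H x W (x C) image by `factor` with nearest-neighbor sampling."""
    h, w = img.shape[:2]
    new_h = max(1, int(round(h * factor)))
    new_w = max(1, int(round(w * factor)))
    # Map each output row/column back to the nearest source row/column.
    rows = np.minimum((np.arange(new_h) / factor).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / factor).astype(int), w - 1)
    return img[rows][:, cols]

def scale_set(images, factor=0.5):
    """Apply the same preset scaling ratio to every image in the set."""
    return [scale_nearest(img, factor) for img in images]

# Toy "original certificate image set" with a single 4 x 6 grayscale image.
original_set = [np.arange(24, dtype=np.uint8).reshape(4, 6)]
rotated_set = rotate_set(original_set, quarter_turns=1)
scaled_set = scale_set(original_set, factor=0.5)
```

In practice an arbitrary preset rotation angle would need an interpolating rotation routine; the quarter-turn version is used here only because it needs no extra library.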
Further, the performing at least one noise-adding enhancement operation on the original certificate image set to obtain an enhanced noise image set comprises: carrying out noise dyeing on each original certificate image in the original certificate image set to obtain a first noise-added image set; and carrying out local masking on each first noise-added image in the first noise-added image set to obtain an enhanced noise image set.
In the embodiment of the present invention, a Python statement having a data capture function may be used to obtain one or more color parameters from a pre-constructed database. For example, if the red color parameter is r, the red color range is (q, p), and the pixel value k of a target pixel is not in the range (q, p), the pixel value of the target pixel is numerically adjusted by using the color parameter r so that it falls in the range (q, p). The pixel value of each pixel point in the original image is adjusted in this way with each of the color parameters, thereby achieving the effect of noise dyeing on the original certificate image.
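The noise dyeing and local masking steps can be sketched as follows. The source does not give the exact adjustment formula, so reading "adjusted by using the color parameter r" as "shift by r, then clip into [q, p]" is an assumption, as are the rectangular zero-filled mask and the function names `noise_dye` and `local_mask`.

```python
import numpy as np

def noise_dye(img, channel, r, q, p):
    """Shift out-of-range values of one color channel by r, then clip into [q, p].

    Assumption: shift-and-clip is only one possible reading of the
    adjustment described in the text.
    """
    out = img.astype(np.int32)            # widen to avoid uint8 overflow
    values = out[..., channel]            # view onto the chosen channel
    outside = (values < q) | (values > p)
    values[outside] = np.clip(values[outside] + r, q, p)
    return out.astype(np.uint8)

def local_mask(img, top, left, height, width, fill=0):
    """Cover a rectangular region with a constant value (local masking)."""
    out = img.copy()
    out[top:top + height, left:left + width] = fill
    return out

# Toy 4 x 4 RGB image: every value is 10 except one bright red pixel.
image = np.full((4, 4, 3), 10, dtype=np.uint8)
image[0, 0, 0] = 250

dyed = noise_dye(image, channel=0, r=40, q=100, p=200)                 # first noise-added image
enhanced_noise = local_mask(dyed, top=1, left=1, height=2, width=2)    # enhanced noise image
```

Only the red channel is touched in this toy run; repeating `noise_dye` with other channels and parameters would correspond to using several color parameters.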
According to the embodiment of the invention, at least one type of noise is added to the original certificate image, so that various types of noise images can be obtained, and the subsequent training of an image classification model with higher universality by utilizing various types of noise images is facilitated.
In the embodiment of the invention, data enhancement operations such as translation, cutting, visual angle change and the like can be carried out on the original certificate image in practical application.
S2, extracting the pixel distribution information of each enhanced certificate image in the enhanced certificate image set, and generating a similar certificate image of the corresponding enhanced certificate image according to the pixel distribution information;
In the embodiment of the present invention, the pixel distribution information of the pixel points in each enhanced certificate image may be obtained by using a tool such as NumPy (Numerical Python).
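The source does not define what "pixel distribution information" contains. One plausible reading, assumed here purely for illustration, is a normalized intensity histogram computed with NumPy; the function name `pixel_distribution` is hypothetical.

```python
import numpy as np

def pixel_distribution(img, bins=256):
    """Return the normalized intensity histogram of an 8-bit image."""
    counts, _ = np.histogram(img, bins=bins, range=(0, 256))
    return counts / counts.sum()

# Toy enhanced certificate image with six pixels.
enhanced = np.array([[0, 0, 255], [128, 255, 255]], dtype=np.uint8)
dist = pixel_distribution(enhanced)
```

Each entry of `dist` is the fraction of pixels with that gray level, so the vector sums to 1 and could serve as the conditioning input of a generation model.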
In the embodiment of the invention, the similar certificate image can be generated through a pre-constructed image generation model, and the pre-constructed image generation model is a convolutional neural network model constructed based on a Conditional Generative Adversarial Network (CGAN).
In detail, referring to fig. 3, the S2 includes:
S21, generating an initial similar certificate image corresponding to each enhanced certificate image according to the pixel distribution information of each enhanced certificate image by using the pre-constructed image generation model;
S22, calculating the difference between each initial similar certificate image and the corresponding enhanced certificate image, and counting the generation proportion between the number of the initial similar certificate images whose difference is smaller than a preset difference threshold value and the number of all the initial similar certificate images;
S23, when the generation proportion is smaller than a preset generation proportion threshold value, adjusting parameters of the image generation model, and returning to the step of generating, by using the pre-constructed image generation model, an initial similar certificate image corresponding to each enhanced certificate image according to the pixel distribution information of each enhanced certificate image, until the generation proportion is larger than or equal to the preset generation proportion threshold value;
S24, selecting the initial similar certificate images whose difference is smaller than the preset difference threshold value as the similar certificate images.
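The control flow of S21 to S24 can be sketched independently of any particular generation model. In the sketch below, `generate`, `difference` and `adjust` are hypothetical callables standing in for the image generation model, the difference function and the parameter update; `max_rounds` is an added safety limit that the source does not mention.

```python
def generate_similar_images(enhanced_images, generate, difference, adjust,
                            diff_threshold, ratio_threshold, max_rounds=10):
    """Repeat S21-S23 until enough candidates are close enough, then apply S24."""
    if not enhanced_images:
        return [], 0.0
    for _ in range(max_rounds):
        # S21: one candidate per enhanced certificate image.
        candidates = [generate(img) for img in enhanced_images]
        # S22: difference to the source image and share of accepted candidates.
        diffs = [difference(c, img) for c, img in zip(candidates, enhanced_images)]
        accepted = [c for c, d in zip(candidates, diffs) if d < diff_threshold]
        ratio = len(accepted) / len(candidates)
        if ratio >= ratio_threshold:
            return accepted, ratio        # S24: keep only the close candidates
        adjust()                          # S23: update the generator and retry
    raise RuntimeError("generation ratio threshold not reached")

# Toy stand-ins: "images" are numbers, the generator adds a shrinking offset.
state = {"offset": 4.0}

def toy_generate(x):
    return x + state["offset"]

def toy_difference(a, b):
    return abs(a - b)

def toy_adjust():
    state["offset"] /= 2

similar, ratio = generate_similar_images(
    [1.0, 2.0, 3.0], toy_generate, toy_difference, toy_adjust,
    diff_threshold=1.5, ratio_threshold=0.8)
```

In the toy run the offset is halved twice before every candidate falls under the difference threshold, which mirrors the "adjust parameters and regenerate" loop of S23.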
In the embodiment of the present invention, the difference between the initial similar certificate image and the corresponding enhanced certificate image may be calculated by a difference function, where the difference function includes:
D = Lc + Ls
Lc = E[log P(C | Xreal)] + E[log P(C | Xfake)]
Ls = E[log P(S | Xreal)] + E[log P(S | Xfake)]
wherein E[ ] denotes the expected value, Lc is the expected value of the similarity between the initial similar certificate image and the enhanced certificate image, Ls is the expected value of the effective information amount in the enhanced certificate image, Xreal is the enhanced certificate image, and Xfake is the initial similar certificate image; C is the effective information content in the initial similar certificate image, and S is the effective information content of the enhanced certificate image.
In the embodiment of the present invention, the preset generation ratio threshold may be determined according to an actual situation, and for example, may be set to 80% or 90%.
In another embodiment of the invention, a similar certificate image may also be generated for each enhanced certificate image in the enhanced certificate image set by using an adversarial sample generation method based on the open-source TensorFlow deep learning library.
S3, calculating the effective information content of each similar certificate image, and selecting the similar certificate images with the effective information content meeting a first preset condition to form an effective similar certificate image set;
In the embodiment of the present invention, the effective information is information that can be used to assist image classification. It can be understood that each enhanced certificate image contains only a part of the effective information that contributes to image classification, and accordingly each similar certificate image also contains only a part of the effective information. Therefore, the similar certificate images need to be further screened to obtain an effective sample set that can be used for image classification.
In the embodiment of the present invention, before counting the number of pixels containing valid information in the similar certificate image, the method further includes: performing binarization processing on pixel points in each similar certificate image to obtain a gray value of each pixel point; and taking the pixel points with the gray values larger than the preset pixel threshold value as the pixel points of the effective information.
In detail, referring to fig. 4, the S3 includes:
S31, counting the number of the pixel points of the effective information contained in the similar certificate image and the total number of the pixel points in the similar certificate image;
S32, calculating the ratio of the number of pixels containing effective information in the similar certificate image to the total number of pixels in the similar certificate image, and taking the ratio as the effective information content of each similar certificate image;
S33, selecting the similar certificate images with the effective information quantity meeting the first preset condition from all the similar certificate images to form an effective similar certificate image set.
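The effective information content of S31 and S32 reduces to a thresholded pixel count divided by the total pixel count. The sketch below assumes 8-bit images, a simple channel average as the gray value, and an illustrative threshold of 128; none of these values is specified in the source, and `effective_information` is a hypothetical name.

```python
import numpy as np

def effective_information(img, pixel_threshold=128):
    """Share of pixels whose gray value exceeds pixel_threshold (S31-S32)."""
    # Assumption: color images are reduced to gray by averaging the channels.
    gray = img if img.ndim == 2 else img.mean(axis=-1)
    effective = np.count_nonzero(gray > pixel_threshold)   # S31: effective pixels
    return effective / gray.size                           # S32: ratio to all pixels

# Toy similar certificate image: 3 of the 6 pixels exceed the threshold.
similar = np.array([[0, 200, 255], [10, 130, 90]], dtype=np.uint8)
amount = effective_information(similar)
```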
In this embodiment of the present invention, the first preset condition may be a maximum number of valid samples. For example, the similar certificate images are sorted in descending order of effective information amount, and the N similar certificate images with the largest effective information amount are selected to obtain the effective similar certificate image set, where N is the maximum number of valid samples specified by the first preset condition.
In another embodiment of the present invention, the first preset condition may be an effective information amount threshold. For example, the similar certificate images whose effective information amount is greater than or equal to the threshold specified by the first preset condition are selected from the similar certificate image set to form the effective similar certificate image set.
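Both variants of the first preset condition can be written as small selection helpers. The function names and the toy data below are illustrative assumptions; the images are represented by string labels to keep the example short.

```python
def select_top_n(images, amounts, n):
    """Keep the n images with the largest effective information amount."""
    order = sorted(range(len(images)), key=lambda i: amounts[i], reverse=True)
    return [images[i] for i in order[:n]]

def select_by_threshold(images, amounts, threshold):
    """Keep every image whose effective information amount reaches the threshold."""
    return [img for img, amount in zip(images, amounts) if amount >= threshold]

# Toy similar certificate images with precomputed effective information amounts.
names = ["a", "b", "c", "d"]
amounts = [0.2, 0.9, 0.5, 0.7]

top_two = select_top_n(names, amounts, 2)               # first variant, N = 2
above_half = select_by_threshold(names, amounts, 0.5)   # second variant
```

The top-N variant bounds the size of the effective set, while the threshold variant bounds its quality; which one is preferable depends on how many similar images the generator produces.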
S4, carrying out image classification prediction training on the pre-constructed certificate classification model by using the original certificate image set, the enhanced certificate image set and the effective similar certificate image set until the prediction training meets a second preset condition, and quitting the prediction training to obtain a trained certificate classification model.
In an embodiment of the present invention, the original certificate image set, the enhanced certificate image set, and the valid similar certificate image set constitute a training sample set of the pre-constructed certificate classification model, where the pre-constructed certificate classification model may be a neural network model based on deep learning.
In detail, referring to fig. 5, the S4 includes:
S41, distributing a same number for each original certificate image, the corresponding enhanced certificate image and the corresponding effective similar certificate image in the original certificate image set, the enhanced certificate image set and the effective similar certificate image set;
S42, carrying out prediction training of image classification on the pre-constructed certificate classification model by using the original certificate image set, the enhanced certificate image set and the effective similar certificate image set to obtain a classification prediction result;
S43, counting the ratio of the number of images with the same classification result to the total number of the images with the same number in the classification prediction results, and averaging the ratios of all numbers to obtain an average ratio;
S44, judging whether the average ratio value meets a second preset condition;
if the average ratio does not meet the second preset condition, executing S45, adjusting the parameters of the pre-constructed classification model, and returning to S42;
and if the average ratio meets the second preset condition, executing S46 and quitting the prediction training to obtain the trained certificate classification model.
In the embodiment of the present invention, it can be understood that an original certificate image and the enhanced certificate images and effective similar certificate images derived from it have the same actual classification result. Therefore, the more images under the same number that receive the same classification result in the classification prediction results output by the prediction training, the higher the classification accuracy of the pre-constructed certificate classification model.
For example, if there are 4 images under the same number and 3 of them receive the same classification result, the ratio of the number of images with the same classification result to the total number of images under that number is 3/4.
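By way of illustration only, the per-number ratio and the average ratio might be computed as in the following Python sketch. It assumes that "the number of images with the same classification result" means the size of the largest group of identical predictions under one number; the function name and the class labels are hypothetical and do not appear in the embodiment.

```python
from collections import Counter, defaultdict


def average_consistency_ratio(numbers, predictions):
    """Average, over all numbers, of the share of images that agree on one class."""
    # Group the predicted classes by the number assigned to each image.
    groups = defaultdict(list)
    for number, predicted_class in zip(numbers, predictions):
        groups[number].append(predicted_class)
    if not groups:
        raise ValueError("no predictions supplied")

    ratios = []
    for predicted_classes in groups.values():
        # Assumption: agreement is measured by the most frequent prediction.
        most_common_count = Counter(predicted_classes).most_common(1)[0][1]
        ratios.append(most_common_count / len(predicted_classes))
    return sum(ratios) / len(ratios)


# Number 1 has 4 images, 3 of which agree (3/4); number 2 has 2 images that agree (2/2).
numbers = [1, 1, 1, 1, 2, 2]
predictions = ["id_card", "id_card", "id_card", "passport", "passport", "passport"]
print(average_consistency_ratio(numbers, predictions))  # (0.75 + 1.0) / 2 = 0.875
```

Under this reading, the 4-image example above contributes a ratio of 0.75 for its number.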
In this embodiment of the present invention, the second preset condition may be a specified average ratio threshold, which may be set according to the actual situation, for example, 80%. That is, when the average of the ratios over all numbers is greater than or equal to 80%, the pre-constructed certificate classification model is considered to have achieved the expected effect.
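The loop formed by S42 to S46 can be sketched as follows, purely as an illustration. The three callables, the function name and the `max_rounds` safeguard are assumptions introduced here and are not part of the embodiment; the 0.8 default mirrors the 80% example threshold.

```python
def train_until_condition_met(predict, compute_average_ratio, adjust_parameters,
                              ratio_threshold=0.8, max_rounds=100):
    """Repeat prediction and parameter adjustment until the average ratio is high enough."""
    for round_index in range(1, max_rounds + 1):
        predictions = predict()                             # S42: prediction training
        average_ratio = compute_average_ratio(predictions)  # S43: average ratio
        if average_ratio >= ratio_threshold:                # S44: second preset condition
            return round_index, average_ratio               # S46: exit training
        adjust_parameters()                                 # S45: adjust, then back to S42
    # Assumption: a round limit guards against a model that never converges.
    raise RuntimeError("second preset condition not met within max_rounds")


# Toy stand-ins: each round simply reports a pre-set average ratio.
simulated_ratios = iter([0.5, 0.7, 0.85])
adjustments = []
result = train_until_condition_met(
    predict=lambda: next(simulated_ratios),
    compute_average_ratio=lambda ratio: ratio,
    adjust_parameters=lambda: adjustments.append("adjusted"),
)
print(result, len(adjustments))  # (3, 0.85) 2
```

In a real system, `predict` would run the classification model over the numbered training samples and `adjust_parameters` would perform an optimizer step.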
In the embodiment of the invention, the trained classification model is used for extracting features of the certificate image to be detected, the probability of each classification type for the certificate image to be detected is calculated from the features, and the classification type with the highest probability is selected as the classification result of the certificate image to be detected.
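As a minimal sketch of this final selection step, the following Python code converts per-class scores into probabilities and picks the most probable classification type. The use of softmax, the function names and the class names are assumptions; the embodiment does not specify how the probabilities are calculated.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the maximum for numerical stability
    exponentials = [math.exp(score - highest) for score in scores]
    total = sum(exponentials)
    return [value / total for value in exponentials]


def classify(scores, class_names):
    """Return the classification type with the highest probability."""
    probabilities = softmax(scores)
    best_index = max(range(len(class_names)), key=lambda i: probabilities[i])
    return class_names[best_index], probabilities[best_index]


# Hypothetical scores produced by the model from the extracted features.
label, probability = classify([0.1, 2.0, -1.0], ["id_card", "passport", "driver_license"])
print(label)
```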
In the embodiment of the present invention, an enhanced certificate image set is obtained by performing a data enhancement operation on an original certificate image set; a similar certificate image set is then generated from the enhanced certificate image set; an effective similar certificate image set is obtained by screening the similar certificate image set according to effective information content; and image classification prediction training is performed on a pre-constructed certificate classification model by using the original certificate image set, the enhanced certificate image set and the effective similar certificate image set. This expands the number of certificate image samples, ensures that the classification model is fully trained, and improves the accuracy of certificate classification model training.
Fig. 6 is a functional block diagram of a certificate classification model training device according to an embodiment of the present invention.
The certificate classification model training device 100 of the present invention can be installed in an electronic device. According to the implemented functions, the certificate classification model training device 100 may include an enhanced sample generation module 101, an effective similar sample generation module 102, and a classification model training module 103. A module of the present invention, which may also be referred to as a unit, refers to a series of computer program segments that are stored in a memory of an electronic device, can be executed by a processor of the electronic device, and perform a fixed function.
In the present embodiment, the functions regarding the respective modules/units are as follows:
the enhanced sample generation module 101 is configured to acquire an original certificate image set, and perform data enhancement operation on the original certificate image set to obtain an enhanced certificate image set;
the effective similar sample generation module 102 is configured to extract pixel distribution information of each enhanced certificate image in the enhanced certificate image set, and generate a similar certificate image of a corresponding enhanced certificate image according to the pixel distribution information; calculating the effective information quantity of each similar certificate image, and selecting the similar certificate images with the effective information quantity meeting a first preset condition to form an effective similar certificate image set;
the classification model training module 103 is configured to perform prediction training of image classification on a pre-constructed certificate classification model by using the original certificate image set, the enhanced certificate image set and the effective similar certificate image set, and exit the prediction training when the prediction training meets a second preset condition, so as to obtain a trained certificate classification model.
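The cooperation of the three modules can be sketched as follows, as one possible arrangement only. The class name, the field names and the toy callables are hypothetical; each callable stands in for the corresponding module 101, 102 or 103.

```python
from dataclasses import dataclass
from typing import Any, Callable, List

Image = Any  # placeholder type for a certificate image


@dataclass
class CertificateClassificationTrainingDevice:
    enhance: Callable[[List[Image]], List[Image]]                     # module 101
    generate_effective_similar: Callable[[List[Image]], List[Image]]  # module 102
    train: Callable[[List[Image], List[Image], List[Image]], Any]     # module 103

    def run(self, original_images: List[Image]) -> Any:
        # Module 101: data enhancement of the original certificate image set.
        enhanced = self.enhance(original_images)
        # Module 102: similar image generation plus effective information screening.
        effective_similar = self.generate_effective_similar(enhanced)
        # Module 103: prediction training on all three image sets.
        return self.train(original_images, enhanced, effective_similar)


# Toy stand-ins that operate on file names instead of real images.
device = CertificateClassificationTrainingDevice(
    enhance=lambda images: [f"{name}_rotated" for name in images],
    generate_effective_similar=lambda images: [f"{name}_similar" for name in images],
    train=lambda original, enhanced, similar: len(original) + len(enhanced) + len(similar),
)
sample_count = device.run(["card_a", "card_b"])
print(sample_count)  # 6 training samples: 2 original, 2 enhanced, 2 effective similar
```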
In detail, when the modules in the certificate classification model training device 100 in the embodiment of the present invention are used, the same technical means as the certificate classification model training method described in fig. 1 to 5 are used, and the same technical effects can be produced, which is not described herein again.
Fig. 7 is a schematic structural diagram of an electronic device for implementing a method for training a certificate classification model according to an embodiment of the present invention.
The electronic device 1 may include a processor 10, a memory 11, and a bus, and may further include a computer program, such as a certificate classification model training program, stored in the memory 11 and executable on the processor 10.
The memory 11 includes at least one type of readable storage medium, which includes flash memory, removable hard disk, multimedia card, card-type memory (e.g., SD or DX memory, etc.), magnetic memory, magnetic disk, optical disk, etc. The memory 11 may in some embodiments be an internal storage unit of the electronic device 1, for example a removable hard disk of the electronic device 1. The memory 11 may also be an external storage device of the electronic device 1 in other embodiments, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like equipped on the electronic device 1. Further, the memory 11 may also include both an internal storage unit and an external storage device of the electronic device 1. The memory 11 can be used not only to store application software installed in the electronic device 1 and various types of data, such as codes of a certificate classification model training program, but also to temporarily store data that has been output or is to be output.
The processor 10 may be composed of an integrated circuit in some embodiments, for example, a single packaged integrated circuit, or may be composed of a plurality of integrated circuits packaged with the same or different functions, including one or more Central Processing Units (CPUs), micro processors, digital Processing chips, graphics processors, and combinations of various control chips. The processor 10 is a Control Unit (Control Unit) of the electronic device, connects various components of the electronic device by using various interfaces and lines, and executes various functions and processes data of the electronic device 1 by running or executing programs or modules (e.g., certificate classification model training programs, etc.) stored in the memory 11 and calling data stored in the memory 11.
The bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. The bus is arranged to enable connection communication between the memory 11 and at least one processor 10 or the like.
Fig. 7 only shows an electronic device with components, and it will be understood by a person skilled in the art that the structure shown in fig. 7 does not constitute a limitation of the electronic device 1, and may comprise fewer or more components than shown, or a combination of certain components, or a different arrangement of components.
For example, although not shown, the electronic device 1 may further include a power supply (such as a battery) for supplying power to each component, and preferably, the power supply may be logically connected to the at least one processor 10 through a power management device, so as to implement functions of charge management, discharge management, power consumption management, and the like through the power management device. The power supply may also include any component such as one or more dc or ac power sources, recharging devices, power failure detection circuits, power converters or inverters, power status indicators, and the like. The electronic device 1 may further include various sensors, a bluetooth module, a Wi-Fi module, and the like, which are not described herein again.
Further, the electronic device 1 may further include a network interface. Optionally, the network interface may include a wired interface and/or a wireless interface (such as a Wi-Fi interface, a Bluetooth interface, etc.), which is generally used for establishing a communication connection between the electronic device 1 and other electronic devices.
Optionally, the electronic device 1 may further comprise a user interface, which may be a Display (Display), an input unit (such as a Keyboard), and optionally a standard wired interface, a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is suitable for displaying information processed in the electronic device 1 and for displaying a visual user interface, among other things.
It is to be understood that the described embodiments are for purposes of illustration only and that the scope of the appended claims is not limited to such structures.
The certificate classification model training program stored in the memory 11 of the electronic device 1 is a combination of instructions that, when executed in the processor 10, can implement:
acquiring an original certificate image set, and performing data enhancement operation on the original certificate image set to obtain an enhanced certificate image set;
extracting pixel distribution information of each enhanced certificate image in the enhanced certificate image set, and generating a similar certificate image of the corresponding enhanced certificate image according to the pixel distribution information;
calculating the effective information content of each similar certificate image, and selecting the similar certificate images with the effective information content meeting a first preset condition to form an effective similar certificate image set;
and performing predictive training of image classification on the pre-constructed certificate classification model by using the original certificate image set, the enhanced certificate image set and the effective similar certificate image set until the predictive training meets a second preset condition, and quitting the predictive training to obtain the trained certificate classification model.
Specifically, the specific implementation method of the processor 10 for the above instruction may refer to the description of the relevant steps in the embodiment corresponding to fig. 1, which is not repeated herein.
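For illustration, the step of calculating the effective information content and selecting the effective similar certificate images might look like the following Python sketch. It treats each image as a grid of gray values and counts pixels whose gray value exceeds a pixel threshold. The function names, the threshold of 128 and the reading of the first preset condition as a minimum ratio of 0.5 are assumptions.

```python
def effective_information_content(gray_image, pixel_threshold=128):
    """Ratio of pixels whose gray value exceeds the threshold to all pixels."""
    total_pixels = sum(len(row) for row in gray_image)
    if total_pixels == 0:
        raise ValueError("image contains no pixels")
    effective_pixels = sum(
        1 for row in gray_image for gray_value in row if gray_value > pixel_threshold
    )
    return effective_pixels / total_pixels


def select_effective_similar_images(similar_images, pixel_threshold=128, min_content=0.5):
    """Keep the similar images whose effective information content is high enough."""
    # Assumption: the first preset condition is a minimum effective information content.
    return [
        image for image in similar_images
        if effective_information_content(image, pixel_threshold) >= min_content
    ]


mostly_dark = [[0, 10], [20, 200]]        # 1 of 4 pixels is effective
half_effective = [[0, 200], [255, 10]]    # 2 of 4 pixels are effective
kept = select_effective_similar_images([mostly_dark, half_effective])
print(len(kept))  # 1
```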
Further, the integrated modules/units of the electronic device 1, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer readable storage medium. The computer readable storage medium may be volatile or non-volatile. For example, the computer-readable medium may include: any entity or device capable of carrying said computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, or a Read-Only Memory (ROM).
The present invention also provides a computer-readable storage medium, storing a computer program which, when executed by a processor of an electronic device, may implement:
acquiring an original certificate image set, and performing data enhancement operation on the original certificate image set to obtain an enhanced certificate image set;
extracting pixel distribution information of each enhanced certificate image in the enhanced certificate image set, and generating a similar certificate image of the corresponding enhanced certificate image according to the pixel distribution information;
calculating the effective information content of each similar certificate image, and selecting the similar certificate images with the effective information content meeting a first preset condition to form an effective similar certificate image set;
and performing predictive training of image classification on the pre-constructed certificate classification model by using the original certificate image set, the enhanced certificate image set and the effective similar certificate image set until the predictive training meets a second preset condition, and quitting the predictive training to obtain the trained certificate classification model.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus, device and method can be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative; the division of the modules is only one logical functional division, and other divisions are possible in actual implementation.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof.
The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
The block chain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, a consensus mechanism, an encryption algorithm and the like. A block chain (Blockchain), which is essentially a decentralized database, is a series of data blocks associated by using a cryptographic method, and each data block contains information of a batch of network transactions, which is used for verifying the validity (anti-counterfeiting) of the information and generating a next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
The embodiment of the application can acquire and process related data based on an artificial intelligence technology. Among them, Artificial Intelligence (AI) is a theory, method, technique and application system that simulates, extends and expands human Intelligence using a digital computer or a machine controlled by a digital computer, senses the environment, acquires knowledge and uses the knowledge to obtain the best result.
Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by one unit or means through software or hardware. Terms such as first and second are used to denote names and do not indicate any particular order.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims (10)

1. A method for training a certificate classification model, the method comprising:
acquiring an original certificate image set, and performing data enhancement operation on the original certificate image set to obtain an enhanced certificate image set;
extracting pixel distribution information of each enhanced certificate image in the enhanced certificate image set, and generating a similar certificate image of the corresponding enhanced certificate image according to the pixel distribution information;
calculating the effective information content of each similar certificate image, and selecting the similar certificate images with the effective information content meeting a first preset condition to form an effective similar certificate image set;
and performing predictive training of image classification on the pre-constructed certificate classification model by using the original certificate image set, the enhanced certificate image set and the effective similar certificate image set until the predictive training meets a second preset condition, and quitting the predictive training to obtain the trained certificate classification model.
2. The certificate classification model training method as claimed in claim 1, wherein said performing a data enhancement operation on said original certificate image set comprises:
rotating the original certificate image set according to a preset rotation angle to obtain a rotated image set;
carrying out a zooming operation on the original certificate image set according to a preset zooming proportion to obtain a zoomed image set;
performing at least one noise-adding enhancement operation on the original certificate image set to obtain an enhanced noise image set;
and collecting the rotated image set, the zoomed image set and the enhanced noise image set into an enhanced certificate image set.
3. The certificate classification model training method as claimed in claim 2, wherein said performing at least one noise-adding enhancement operation on the original certificate image set to obtain an enhanced noise image set comprises:
carrying out noise dyeing on each original certificate image in the original certificate image set to obtain a first noise-added image set;
and carrying out local masking on each first noise-added image in the first noise-added image set to obtain an enhanced noise image set.
4. The certificate classification model training method as claimed in claim 1, wherein said generating a similar certificate image of the corresponding enhanced certificate image according to said pixel distribution information comprises:
generating an initial similar certificate image corresponding to each enhanced certificate image according to the pixel distribution information of each enhanced certificate image by using a pre-constructed image generation model;
calculating the difference between each initial similar certificate image and the corresponding enhanced certificate image, and counting the generation proportion between the number of the initial similar certificate images corresponding to the difference smaller than a preset difference threshold value and the number of all the initial similar certificate images;
when the generation proportion is smaller than a preset generation proportion threshold value, adjusting parameters of the image generation model, returning to the step of generating the initial similar certificate image corresponding to each enhanced certificate image according to the pixel distribution information of each enhanced certificate image by using the pre-constructed image generation model until the generation proportion is larger than or equal to the preset generation proportion threshold value;
and selecting, as the similar certificate images, the initial similar certificate images whose difference is smaller than the preset difference threshold value.
5. The method of claim 1, wherein calculating the effective information content of each of the similar document images comprises:
counting the number of pixel points of effective information contained in the similar certificate image and the total number of the pixel points in the similar certificate image;
and calculating the ratio of the number of the pixel points containing the effective information in the similar certificate image to the total number of the pixel points in the similar certificate image, and taking the ratio as the effective information content of each similar certificate image.
6. The method for training the certificate classification model as claimed in claim 5, wherein before counting the number of the pixel points of the valid information contained in the similar certificate image, the method further comprises:
carrying out binarization processing on pixel points in each similar certificate image to obtain the gray value of each pixel point;
and taking the pixel points with the gray values larger than the preset pixel threshold value as the pixel points of the effective information.
7. The method for training the certificate classification model as claimed in claim 1, wherein the step of performing predictive training of image classification on the pre-constructed certificate classification model by using the original certificate image set, the enhanced certificate image set and the effective similar certificate image set until the predictive training satisfies a second preset condition, and exiting the predictive training to obtain the trained certificate classification model comprises the steps of:
distributing a same number for each original certificate image, the corresponding enhanced certificate image and the corresponding effective similar certificate image in the original certificate image set, the enhanced certificate image set and the effective similar certificate image set;
carrying out prediction training of image classification on a pre-constructed certificate classification model by utilizing the original certificate image set, the enhanced certificate image set and the effective similar certificate image set to obtain a classification prediction result;
counting, for each number, the ratio of the number of images with the same classification result to the total number of images with that number in the classification prediction results, and averaging the ratios of all numbers to obtain an average ratio;
judging whether the average ratio meets a second preset condition or not;
if the average ratio does not meet the second preset condition, adjusting parameters of the pre-constructed certificate classification model, and returning to the step of performing prediction training of image classification on the pre-constructed certificate classification model by using the original certificate image set, the enhanced certificate image set and the effective similar certificate image set;
and if the average ratio meets the second preset condition, quitting the prediction training to obtain the trained certificate classification model.
8. A certificate classification model training device, the device comprising:
the enhanced sample generation module is used for acquiring an original certificate image set and performing data enhancement operation on the original certificate image set to obtain an enhanced certificate image set;
the effective similar sample generation module is used for extracting the pixel distribution information of each enhanced certificate image in the enhanced certificate image set and generating a corresponding similar certificate image of the enhanced certificate image according to the pixel distribution information; calculating the effective information content of each similar certificate image, and selecting the similar certificate images with the effective information content meeting a first preset condition to form an effective similar certificate image set;
and the classification model training module is used for carrying out prediction training of image classification on the pre-constructed certificate classification model by utilizing the original certificate image set, the enhanced certificate image set and the effective similar certificate image set until the prediction training meets a second preset condition, and quitting the prediction training to obtain the trained certificate classification model.
9. An electronic device, characterized in that the electronic device comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor, the computer program being executed by the at least one processor to enable the at least one processor to perform the certificate classification model training method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the certificate classification model training method according to any one of claims 1 to 7.
CN202111219802.8A 2021-10-20 2021-10-20 Certificate classification model training method and device, electronic equipment and storage medium Active CN113989548B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111219802.8A CN113989548B (en) 2021-10-20 2021-10-20 Certificate classification model training method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113989548A true CN113989548A (en) 2022-01-28
CN113989548B CN113989548B (en) 2024-07-02

Family

ID=79739573

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111219802.8A Active CN113989548B (en) 2021-10-20 2021-10-20 Certificate classification model training method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113989548B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108090465A (en) * 2017-12-29 2018-05-29 国信优易数据有限公司 A kind of dressing effect process model training method and dressing effect processing method
CA3070817A1 (en) * 2020-01-31 2021-07-31 Element Ai Inc. Method of and system for joint data augmentation and classification learning
CN111476760A (en) * 2020-03-17 2020-07-31 平安科技(深圳)有限公司 Medical image generation method and device, electronic equipment and medium
CN111832745A (en) * 2020-06-12 2020-10-27 北京百度网讯科技有限公司 Data augmentation method and device and electronic equipment
CN112163638A (en) * 2020-10-20 2021-01-01 腾讯科技(深圳)有限公司 Defense method, device, equipment and medium for image classification model backdoor attack

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
杨昌东等: "基于AT-PGGAN的增强数据车辆型号精细识别", 中国图象图形学报, no. 03, 16 March 2020 (2020-03-16), pages 593 - 604 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115065504A (en) * 2022-05-16 2022-09-16 国家广播电视总局广播电视科学研究院 Target detection model-oriented security assessment method and system and electronic equipment
CN115065504B (en) * 2022-05-16 2024-04-09 国家广播电视总局广播电视科学研究院 Safety evaluation method and system for target detection model and electronic equipment

Also Published As

Publication number Publication date
CN113989548B (en) 2024-07-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant