CN113221842A - Model training method, image recognition method, device, equipment and medium - Google Patents

Model training method, image recognition method, device, equipment and medium

Info

Publication number
CN113221842A
CN113221842A (application CN202110623053.9A)
Authority
CN
China
Prior art keywords
model
training
image recognition
recognition model
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110623053.9A
Other languages
Chinese (zh)
Other versions
CN113221842B (en)
Inventor
刘有亮
刘闯
叶雨桐
胡峻毅
陈诗昱
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Glasssix Technology Beijing Co ltd
Original Assignee
Glasssix Technology Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Glasssix Technology Beijing Co ltd filed Critical Glasssix Technology Beijing Co ltd
Priority to CN202110623053.9A priority Critical patent/CN113221842B/en
Publication of CN113221842A publication Critical patent/CN113221842A/en
Application granted granted Critical
Publication of CN113221842B publication Critical patent/CN113221842B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/56Extraction of image or video features relating to colour
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • G06V40/171Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/40Spoof detection, e.g. liveness detection
    • G06V40/45Detection of the body part being alive
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20048Transform domain processing
    • G06T2207/20052Discrete cosine transform [DCT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/467Encoded features or binary features, e.g. local binary patterns [LBP]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

In the model training method, image recognition method, apparatus, device and medium, an auxiliary model is introduced during training of the image recognition model to assist it in feature extraction. Compared with an unassisted deep learning model that extracts image features entirely on its own, the auxiliary model biases the image recognition model toward extracting texture features, so that the trained recognition model can exploit the texture information in an image and thereby improve recognition accuracy when judging whether the face in the image is a living body face.

Description

Model training method, image recognition method, device, equipment and medium
Technical Field
The present application relates to the field of image recognition, and in particular, to a model training method, an image recognition method, an apparatus, a device, and a medium.
Background
Human face living body detection technology plays an important role in face recognition systems. As a key link in a face recognition system, face living body detection needs to effectively prevent attacks by non-living faces, such as printed 2D face photos, electronic screen replays and 3D face masks, so as to guarantee the security and reliability of the whole system. Existing living body detection technology mainly comprises traditional living body detection methods and deep-learning-based living body detection methods.
The inventors have found that existing deep-learning-based living body detection methods do not make full use of the feature information contained in non-living body faces, so their recognition accuracy still leaves room for improvement.
Disclosure of Invention
In order to overcome at least one of the deficiencies in the prior art, in a first aspect, the present application provides a model training method applied to a training device configured with an auxiliary model and an image recognition model to be trained, the method comprising:
acquiring a human face sample image;
inputting the face sample image into the image recognition model for processing;
iteratively adjusting model parameters of the image recognition model according to training losses to obtain a recognition model, wherein the training losses include a first training loss and a second training loss, the first training loss is obtained through texture features provided by the auxiliary model and shallow texture features output by the image recognition model, and the second training loss is obtained through the image recognition model.
In a second aspect, the present application provides an image recognition method applied to an image recognition apparatus configured with a recognition model obtained by a model training method, the method including:
acquiring a face image to be recognized;
and obtaining a living body face recognition result of the face image to be recognized through the recognition model.
In a third aspect, the present application provides a model training apparatus, which is applied to a training device configured with an auxiliary model and an image recognition model to be trained, and includes:
the data acquisition module is used for acquiring a human face sample image;
the data processing module is used for inputting the human face sample image into the image recognition model for processing;
and the model training module is used for iteratively adjusting the model parameters of the image recognition model according to training losses to obtain a recognition model, wherein the training losses comprise a first training loss and a second training loss, the first training loss is obtained through the texture features provided by the auxiliary model and the shallow texture features output by the image recognition model, and the second training loss is obtained through the image recognition model.
In a fourth aspect, the present application provides an electronic device, which includes a memory and a processor, where the memory stores a computer program, and the computer program, when executed by the processor, implements the model training method or the image recognition method.
In a fifth aspect, the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the model training method or the image recognition method.
Compared with the prior art, the method has the following beneficial effects:
In the model training method, image recognition method, apparatus, device and medium provided by the embodiments of the present application, an auxiliary model is introduced during training of the image recognition model to assist it in feature extraction. Compared with an unassisted deep learning model that extracts image features entirely on its own, the auxiliary model biases the image recognition model toward extracting texture features, so that the trained recognition model can exploit the texture information in an image and thereby improve recognition accuracy when judging whether the face in the image is a living body face.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required by the embodiments are briefly described below. It should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be regarded as limiting the scope; those skilled in the art can obtain other related drawings from these drawings without inventive effort.
FIG. 1 is a schematic diagram illustrating steps of a model training method according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a model provided in an embodiment of the present application;
FIG. 3 is a second schematic diagram of a model provided in an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a model training apparatus according to an embodiment of the present disclosure;
FIG. 5 is a schematic diagram illustrating steps of an image recognition method according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Reference numerals: 120 - memory; 130 - processor; 201 - data acquisition module; 202 - data processing module; 203 - model training module.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
In the description of the present application, it should be noted that the terms "first", "second", "third" and the like are used merely to distinguish between descriptions and are not intended to indicate or imply relative importance. The terms "comprises", "comprising" or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article or apparatus that comprises the element.
It should be understood that the operations of the flow diagrams may be performed out of order, and steps without logical context may be performed in reverse order or simultaneously. One skilled in the art, under the guidance of this application, may add one or more other operations to, or remove one or more operations from, the flowchart.
As a key link in a face recognition system, face living body detection needs to effectively prevent attacks by non-living faces. In the related art, deep-learning-based living body detection methods are applied in various production scenarios to perform living body detection on face images. However, the inventors have found that these methods do not make full use of the feature information in non-living body faces and therefore exhibit several problems in practical use.
For example, existing deep-learning-based living body detection methods are usually trained on a large number of samples. Cameras in real usage scenarios come in different specifications, so the images they collect differ from one another and images outside the sample space inevitably occur; as a result, differences in camera specifications affect recognition accuracy to a certain extent.
Alternatively, in order to improve model accuracy, consecutive frames are used as the input of the deep learning model. However, this approach may require a large amount of computation, may not adapt well to the hardware capability of the terminal device (e.g., a camera), and still suffers from limited recognition accuracy.
In view of this, the embodiments of the present application provide a model training method applied to a training device. By introducing auxiliary supervision information during training of the image recognition model to be trained, the training method enables the trained recognition model to specifically detect non-living body face features in the face image to be recognized, thereby improving the accuracy of living body face detection.
The inventors have found that a reproduced face photo or video differs from a real face in local texture. For example, in a face video replayed on a mobile phone, the phone screen reflects light differently from human skin, so moire fringes may appear in the camera's image of the phone screen.
Based on this finding, the present embodiment introduces an auxiliary model during model training. Each step of the model training method provided in this embodiment is described in detail below with reference to the step diagram shown in fig. 1. As shown in fig. 1, the model training method includes:
in step S101A, a face sample image is acquired.
Step S102A, the face sample image is input into the image recognition model for processing.
Step S103A, iteratively adjusting the model parameters of the image recognition model according to the training loss to obtain the recognition model.
The training loss comprises a first training loss and a second training loss, the first training loss is obtained through texture features provided by the auxiliary model and shallow texture features output by the image recognition model, and the second training loss is obtained through the image recognition model.
Optionally, iteratively adjusting the model parameters of the image recognition model using the texture features provided by the auxiliary model to obtain the recognition model may include:
and step S103A-1, obtaining the shallow texture feature output by the image recognition model and the texture feature provided by the auxiliary model.
The shallow texture features may be the features output by a first network layer in the image recognition model. Those skilled in the art may select a network layer from the image recognition model as the first network layer according to the needs of the scene; this embodiment is not particularly limited in this respect.
The model structure diagram shown in fig. 2 includes an image recognition model and an auxiliary model. For convenience of description, a model providing texture features in the auxiliary model will be referred to as a texture auxiliary model hereinafter.
As shown in fig. 2, an image in the HSV color space is more amenable to extracting texture information than the face sample image in the original RGB color space. The training device therefore first converts the face sample image from the RGB color space to the HSV color space through the texture auxiliary model to obtain a color-converted image; the color-converted image is then processed with a Local Binary Pattern (LBP) operator to obtain the texture features of the face sample image.
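By way of illustration, such a texture auxiliary branch could be sketched as follows in Python, assuming OpenCV and scikit-image are available; the LBP parameters and the per-channel stacking are illustrative choices rather than details prescribed by this embodiment.

```python
import cv2
import numpy as np
from skimage.feature import local_binary_pattern

def texture_features(face_bgr: np.ndarray, points: int = 8, radius: int = 1) -> np.ndarray:
    """Per-channel LBP map of the face image after conversion to the HSV color space."""
    hsv = cv2.cvtColor(face_bgr, cv2.COLOR_BGR2HSV)          # RGB/BGR -> HSV color conversion
    lbp_channels = [
        local_binary_pattern(hsv[:, :, c], points, radius, method="uniform")
        for c in range(3)                                      # LBP on each HSV channel
    ]
    return np.stack(lbp_channels, axis=-1).astype(np.float32)
```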
Referring to fig. 2 again, in the present embodiment the second network layer of the image recognition model serves as the first network layer described above, and its output features serve as the shallow texture features.
Step S103A-2, the difference between the texture feature and the shallow texture feature is taken as the first training loss.
Taking fig. 2 as an example again, in order to further reduce the introduction of interference information, before comparing the shallow texture features with the texture features, the image recognition model further performs feature extraction on the shallow texture features output by the first network layer by using a convolution kernel of 3 × 3.
In addition, in order to compare the shallow texture features with the texture features, the training device further scales the size of the texture features through the texture assistant model before the comparison, so that the scaled texture features and the shallow texture features have the same size.
Finally, the training device computes the difference between the texture features and the shallow texture features and uses this difference as the first training loss.
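A possible form of this first training loss is sketched below in PyTorch; the use of an L1 distance and bilinear resizing are assumptions, since the embodiment only speaks of a difference value.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextureLoss(nn.Module):
    """First training loss: compare shallow features with the auxiliary texture features."""
    def __init__(self, shallow_channels: int, texture_channels: int):
        super().__init__()
        # 3 x 3 convolution applied to the shallow features before the comparison
        self.refine = nn.Conv2d(shallow_channels, texture_channels, kernel_size=3, padding=1)

    def forward(self, shallow_feat: torch.Tensor, texture_feat: torch.Tensor) -> torch.Tensor:
        refined = self.refine(shallow_feat)
        # Scale the texture features so they match the shallow feature map size
        texture_resized = F.interpolate(
            texture_feat, size=refined.shape[-2:], mode="bilinear", align_corners=False
        )
        return F.l1_loss(refined, texture_resized)
```

In this sketch the texture features from the auxiliary branch act as a fixed target, consistent with the description that only the image recognition model's parameters are adjusted.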
Step S103A-3, a second training loss from the image recognition model recognizing the sample image is obtained.
Taking fig. 2 as an example again, since the image recognition model does not reach the convergence condition during training, there is a second training loss when the image recognition model recognizes the sample image.
And step S103A-4, when the image recognition model meets the preset convergence condition, obtaining the recognition model.
And step S103A-5, when the image recognition model does not meet the preset convergence condition, adjusting the model parameters of the image recognition model according to the first training loss and the second training loss.
Optionally, the first training loss and the second training loss may be combined by means of a weighted sum. For example, assuming that the weight of the first training loss is 0.5 and the weight of the second training loss is also 0.5, the training device performs a weighted summation according to the respective weights of the first training loss and the second training loss to obtain a weighted loss; finally, the model parameters of the image recognition model are adjusted through a gradient back-propagation algorithm according to the weighted loss.
It should be noted that the above weight is only an example provided in this embodiment, and a person skilled in the art may perform adaptive adjustment according to needs, and this embodiment is not limited in particular.
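As a sketch of this update step in PyTorch, the weighted sum and back-propagation could look as follows; the helper name update_step and the weights are illustrative, not taken from the embodiment.

```python
import torch

def update_step(optimizer: torch.optim.Optimizer,
                first_loss: torch.Tensor,
                second_loss: torch.Tensor,
                w_first: float = 0.5,
                w_second: float = 0.5) -> float:
    """One parameter adjustment using the weighted sum of the two training losses."""
    weighted_loss = w_first * first_loss + w_second * second_loss
    optimizer.zero_grad()
    weighted_loss.backward()   # gradient back-propagation through the image recognition model
    optimizer.step()           # adjust the model parameters
    return weighted_loss.item()
```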
Step S103A-6, the procedure returns to the step of obtaining the shallow texture features output by the image recognition model and the texture features provided by the auxiliary model, until the image recognition model meets the preset convergence condition and the recognition model is obtained.
Thus, compared with an unassisted deep learning model that extracts image features on its own, the method introduces an auxiliary model during training to assist the image recognition model in feature extraction. The auxiliary model biases the image recognition model toward extracting texture features, so that the trained recognition model can exploit texture information in the image and thereby improve recognition accuracy when judging whether the face in the image is a living body face.
Further, the inventors have also found that a living body face image contains more high-frequency information than a non-living body face image.
Thus, as another possible implementation, the training loss includes a third training loss in addition to the first training loss and the second training loss. The third training loss is obtained through the high-frequency features provided by the auxiliary model and the deep high-frequency features output by the image recognition model.
Therefore, before adjusting the model parameters of the image recognition model, the training device obtains the shallow texture features and the deep high-frequency features output by the image recognition model, and the texture features and the high-frequency features provided by the auxiliary model.
Illustratively, the training device may process the face sample image by means of a Discrete Cosine Transform (DCT) to obtain the high-frequency features.
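As an illustration, the high-frequency features could be derived from the DCT coefficients as sketched below; converting to grayscale and zeroing an 8 x 8 low-frequency block are assumptions, since the embodiment does not fix how the high-frequency part is separated.

```python
import cv2
import numpy as np
from scipy.fft import dctn, idctn

def high_frequency_features(face_bgr: np.ndarray, cutoff: int = 8) -> np.ndarray:
    """High-frequency map of the face image obtained by suppressing low DCT coefficients."""
    gray = cv2.cvtColor(face_bgr, cv2.COLOR_BGR2GRAY).astype(np.float32)
    coeffs = dctn(gray, norm="ortho")          # 2-D discrete cosine transform
    coeffs[:cutoff, :cutoff] = 0.0             # zero out the low-frequency block
    return idctn(coeffs, norm="ortho").astype(np.float32)
```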
The deep high-frequency features may be the features output by a second network layer in the image recognition model. Those skilled in the art may select a network layer from the image recognition model as the second network layer according to the needs of the scene; this embodiment is not particularly limited in this respect.
For example, a network layer preceding the fully connected layer may be selected as the second network layer.
Then, the training device takes the difference between the texture features and the shallow texture features as the first training loss, obtains the second training loss from the image recognition model recognizing the sample image, and takes the difference between the high-frequency features and the deep high-frequency features as the third training loss.
And when the image recognition model meets the preset convergence condition, the training equipment obtains the trained recognition model.
And when the image recognition model does not meet the preset convergence condition, the training equipment adjusts the model parameters of the image recognition model according to the first training loss, the second training loss and the third training loss.
Exemplarily, the training equipment obtains a weighted training loss among the first training loss, the second training loss and the third training loss according to a preset weight; and adjusting the model parameters of the image recognition model to be trained according to the weighted training loss.
The weighted training loss satisfies the following relationship with the first training loss, the second training loss and the third training loss:
L_final = α·L_FL + β·L_DCT + γ·L_LBP
where L_LBP is the first (texture) training loss, L_FL is the second (recognition) training loss, L_DCT is the third (high-frequency) training loss, α + β + γ = 1, α, β, γ > 0, α > β and α > γ.
For example, the weight of the first training loss may be 0.3, the weight of the second training loss may be 0.3, and the weight of the third training loss may be 0.4. Of course, those skilled in the art can adjust these weights as needed.
The procedure then returns to the step of obtaining the shallow texture features and deep high-frequency features output by the image recognition model and the texture features and high-frequency features provided by the auxiliary model, until the image recognition model meets the preset convergence condition and the recognition model is obtained.
It should be understood that the preset convergence condition may be, but is not limited to: stopping iteration when the model loss value no longer decreases, stopping iteration when the number of iterations reaches a set number, or stopping iteration when the model loss value falls below a set threshold.
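A simple check covering these three alternatives might look like the following sketch; the patience, iteration cap and threshold values are placeholders, not values given by the embodiment.

```python
def converged(losses: list, max_iters: int = 10000,
              threshold: float = 1e-3, patience: int = 5) -> bool:
    """Return True if any of the three stopping criteria above is met."""
    if len(losses) >= max_iters:                       # iteration count reached a set number
        return True
    if losses and losses[-1] < threshold:              # loss value below a set threshold
        return True
    if len(losses) > patience and min(losses[-patience:]) >= min(losses[:-patience]):
        return True                                    # loss value no longer decreasing
    return False
```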
For ease of understanding, the following description is made with reference to the model structure diagram shown in fig. 3, which includes an image recognition model and an auxiliary model. For convenience of description, the model in the auxiliary model that provides texture features is hereinafter referred to as the texture auxiliary model, and the model that provides high-frequency features is referred to as the high-frequency auxiliary model.
The texture auxiliary model is the same as that in fig. 2, so its description is not repeated here. In order to further reduce the introduction of interference information, before the deep high-frequency features are compared with the high-frequency features, the image recognition model performs further feature extraction on the deep high-frequency features output by the second network layer using a 3 x 3 convolution kernel.
In addition, in order to compare the deep high-frequency features with the high-frequency features, before the comparison, the training device further scales the sizes of the high-frequency features through the high-frequency auxiliary model, so that the scaled high-frequency features and the deep high-frequency features have the same sizes.
Therefore, with the assistance of the texture auxiliary model and the high-frequency auxiliary model, the trained recognition model extracts high-frequency features in addition to texture features, so that it can judge whether the face in an image is a living body face by combining texture features and high-frequency features.
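To make the structure in fig. 3 concrete, the following PyTorch skeleton shows a backbone that exposes a shallow feature map (compared against the LBP texture features) and a deep feature map (compared against the DCT high-frequency features) in addition to its classification output; the layer sizes and depths are placeholders and not the network actually used in the embodiment.

```python
import torch
import torch.nn as nn

class RecognitionBackbone(nn.Module):
    """Toy backbone exposing the two feature taps used for auxiliary supervision."""
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())            # "first network layer"
        self.stage2 = nn.Sequential(nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.stage3 = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())  # "second network layer"
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(64, num_classes)

    def forward(self, x: torch.Tensor):
        shallow = self.stage1(x)   # tap compared with the LBP texture features
        mid = self.stage2(shallow)
        deep = self.stage3(mid)    # tap compared with the DCT high-frequency features
        logits = self.fc(self.pool(deep).flatten(1))
        return logits, shallow, deep
```

At inference time only the classification output is needed, which matches the description that the auxiliary model is removed after training.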
The embodiment further provides a model training apparatus, which is applied to a training device, wherein the training device is configured with an auxiliary model and an image recognition model to be trained, and the model training apparatus includes at least one functional module which can be stored in the memory 120 in a software form. As shown in fig. 4, functionally divided, the model training apparatus may include:
and the data acquisition module 201 is used for acquiring a face sample image.
In this embodiment, the data obtaining module 201 is configured to implement step S101A in fig. 1, and for a detailed description of the data obtaining module 201, refer to a detailed description of step S101A.
The data processing module 202 is used for inputting the face sample image into the image recognition model for processing.
In this embodiment, the data processing module 202 is configured to implement step S102A in fig. 1, and for a detailed description of the data processing module 202, refer to a detailed description of step S102A.
And the model training module 203 is configured to iteratively adjust model parameters of the image recognition model according to training losses to obtain the recognition model, where the training losses include a first training loss and a second training loss, the first training loss is obtained through texture features provided by the auxiliary model and shallow texture features output by the image recognition model, and the second training loss is obtained through the image recognition model.
In this embodiment, the model training module 203 is configured to implement step S103A in fig. 1, and for a detailed description of the model training module 203, refer to a detailed description of step S103A.
It should be noted that the model training apparatus may further include other software functional modules, which are used to implement other steps or sub-steps of the above model training method; similarly, the data obtaining module 201, the data processing module 202, and the model training module 203 may also be used to implement other steps or sub-steps of the model training method, which is not specifically limited in this embodiment, and those skilled in the art may perform adaptive adjustment according to different module division standards.
This embodiment also provides an image recognition method, applied to an image recognition device configured with a recognition model obtained by the model training method described above. That is, in this embodiment, the image recognition model to be trained is trained with the aid of the auxiliary model; after the image recognition model satisfies the preset convergence condition, the auxiliary model is removed, and the remaining model is referred to as the recognition model.
As shown in fig. 5, the image recognition method includes:
step S101B, a face image to be recognized is acquired.
Step S102B, a living body face recognition result of the face image to be recognized is obtained through the recognition model.
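At inference time, with the auxiliary model removed, using the trained recognition model might look like the following sketch; the input size, normalization, decision threshold and class ordering are assumptions.

```python
import cv2
import numpy as np
import torch

def is_live_face(model: torch.nn.Module, face_bgr: np.ndarray, size: int = 112) -> bool:
    """Classify a face crop as living body (True) or attack (False) with the trained model."""
    model.eval()
    face = cv2.resize(face_bgr, (size, size)).astype(np.float32) / 255.0
    tensor = torch.from_numpy(face).permute(2, 0, 1).unsqueeze(0)   # NCHW batch of one
    with torch.no_grad():
        out = model(tensor)
        logits = out[0] if isinstance(out, tuple) else out          # drop auxiliary taps if present
    return logits.softmax(dim=1)[0, 1].item() > 0.5                 # class 1 assumed to be "living body"
```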
An embodiment of the present application further provides an electronic device, as shown in fig. 6, the electronic device includes a memory and a processor, and the memory stores a computer program. When the electronic device is a training device, the computer program is executed by the processor to implement the model training method.
When the electronic device is an image recognition device, the computer program is executed by the processor to implement the image recognition method described above.
The memory 120 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like.
An embodiment of the present application further provides a computer-readable storage medium, in which a computer program is stored, and when the computer program is executed by a processor, the model training method or the image recognition method is implemented.
The processor 130 may be an integrated circuit chip having signal processing capabilities. The Processor may be a general-purpose Processor, and includes a Central Processing Unit (CPU), a Network Processor (NP), and the like; but may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic device, discrete hardware components. The various methods, steps, and logic blocks disclosed in the embodiments of the present application may be implemented or performed. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
In summary, in the model training method, image recognition method, apparatus, device and medium provided by the embodiments of the present application, an auxiliary model is introduced during training of the image recognition model to assist it in feature extraction, in contrast to an unassisted deep learning model that extracts image features on its own. The auxiliary model biases the image recognition model toward extracting texture features, so that the trained recognition model can exploit texture information in the image and thereby improve recognition accuracy when judging whether the face in the image is a living body face.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above description is only for various embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of changes or substitutions within the technical scope of the present application, and all such changes or substitutions are included in the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A model training method is applied to a training device, the training device is provided with an auxiliary model and an image recognition model to be trained, and the method comprises the following steps:
acquiring a human face sample image;
inputting the face sample image into the image recognition model for processing;
iteratively adjusting model parameters of the image recognition model according to training losses to obtain a recognition model, wherein the training losses include a first training loss and a second training loss, the first training loss is obtained through texture features provided by the auxiliary model and shallow texture features output by the image recognition model, and the second training loss is obtained through the image recognition model.
2. The model training method of claim 1, wherein iteratively adjusting the model parameters of the image recognition model to obtain the recognition model according to the training loss comprises:
obtaining shallow texture features output by the image recognition model and texture features provided by the auxiliary model;
taking a difference between the textural features and the shallow textural features as a first training loss;
obtaining a second training loss of the image recognition model for recognizing the sample image;
when the image recognition model meets a preset convergence condition, obtaining the recognition model;
when the image recognition model does not meet the preset convergence condition, adjusting model parameters of the image recognition model according to the first training loss and the second training loss;
and returning to the step of obtaining the shallow texture feature output by the image recognition model and the texture feature provided by the auxiliary model until the image recognition model meets the preset convergence condition, and obtaining the recognition model.
3. The model training method of claim 1, wherein the training loss further comprises a third training loss obtained through the high-frequency features provided by the auxiliary model and the deep high-frequency features output by the image recognition model.
4. The model training method of claim 3, wherein iteratively adjusting the model parameters of the image recognition model to obtain the recognition model according to the training loss comprises:
obtaining a shallow texture feature and a deep high-frequency feature output by the image recognition model, and the texture feature and the high-frequency feature provided by the auxiliary model;
taking a difference between the textural features and the shallow textural features as a first training loss;
obtaining a second training loss of the image recognition model for recognizing the sample image;
taking the difference between the high frequency features and the deep high frequency features as a third training loss;
when the image recognition model meets a preset convergence condition, obtaining the recognition model;
when the image recognition model does not meet the preset convergence condition, adjusting model parameters of the image recognition model according to the first training loss, the second training loss and the third training loss;
and returning to the step of obtaining the shallow texture features output by the image recognition model and the texture features provided by the auxiliary model, until the image recognition model meets the preset convergence condition, and obtaining the recognition model.
5. The model training method of claim 4, wherein the adjusting the model parameters of the image recognition model according to the first training loss, the second training loss, and the third training loss comprises:
obtaining a weighted training loss among the first training loss, the second training loss and the third training loss according to a preset weight;
and adjusting the model parameters of the image recognition model to be trained according to the weighted training loss.
6. The model training method according to claim 4, wherein the obtaining high-frequency features provided by the auxiliary model comprises:
and processing the face sample image in a discrete cosine transform mode to obtain the high-frequency characteristics.
7. An image recognition method applied to an image recognition device configured with a recognition model obtained by the model training method according to any one of claims 1 to 6, the method comprising:
acquiring a face image to be recognized;
and obtaining a living body face recognition result of the face image to be recognized through the recognition model.
8. A model training apparatus applied to a training device provided with an auxiliary model and an image recognition model to be trained, the model training apparatus comprising:
the data acquisition module is used for acquiring a human face sample image;
the data processing module is used for inputting the human face sample image into the image recognition model for processing;
and the model training module is used for iteratively adjusting the model parameters of the image recognition model according to training losses to obtain a recognition model, wherein the training losses comprise a first training loss and a second training loss, the first training loss is obtained through the texture features provided by the auxiliary model and the shallow texture features output by the image recognition model, and the second training loss is obtained through the image recognition model.
9. An electronic device, characterized in that the electronic device comprises a memory and a processor, the memory storing a computer program which, when executed by the processor, implements the model training method of any one of claims 1-6 or the image recognition method of claim 7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, implements the model training method of any one of claims 1-6 or the image recognition method of claim 7.
CN202110623053.9A 2021-06-04 2021-06-04 Model training method, image recognition method, device, equipment and medium Active CN113221842B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110623053.9A CN113221842B (en) 2021-06-04 2021-06-04 Model training method, image recognition method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110623053.9A CN113221842B (en) 2021-06-04 2021-06-04 Model training method, image recognition method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN113221842A true CN113221842A (en) 2021-08-06
CN113221842B CN113221842B (en) 2023-12-29

Family

ID=77082798

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110623053.9A Active CN113221842B (en) 2021-06-04 2021-06-04 Model training method, image recognition method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN113221842B (en)

Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003196674A (en) * 2001-12-25 2003-07-11 Mega Chips Corp Image processing method, image processing device and storage medium readable by computer
CN106778525A (en) * 2016-11-25 2017-05-31 北京旷视科技有限公司 Identity identifying method and device
CN106951869A (en) * 2017-03-22 2017-07-14 腾讯科技(深圳)有限公司 A kind of live body verification method and equipment
CN107992842A (en) * 2017-12-13 2018-05-04 深圳云天励飞技术有限公司 Biopsy method, computer installation and computer-readable recording medium
CN108764091A (en) * 2018-05-18 2018-11-06 北京市商汤科技开发有限公司 Biopsy method and device, electronic equipment and storage medium
CN109446980A (en) * 2018-10-25 2019-03-08 华中师范大学 Expression recognition method and device
US20190147642A1 (en) * 2017-11-15 2019-05-16 Google Llc Learning to reconstruct 3d shapes by rendering many 3d views
CN109961396A (en) * 2017-12-25 2019-07-02 中国科学院沈阳自动化研究所 A kind of image super-resolution rebuilding method based on convolutional neural networks
CN110765880A (en) * 2019-09-24 2020-02-07 中国矿业大学 Light-weight video pedestrian heavy identification method
WO2020221990A1 (en) * 2019-04-30 2020-11-05 Huawei Technologies Co., Ltd. Facial localisation in images
KR20200132682A (en) * 2019-05-16 2020-11-25 삼성전자주식회사 Image optimization method, apparatus, device and storage medium
CN112215050A (en) * 2019-06-24 2021-01-12 北京眼神智能科技有限公司 Nonlinear 3DMM face reconstruction and posture normalization method, device, medium and equipment
CN112257665A (en) * 2020-11-12 2021-01-22 腾讯科技(深圳)有限公司 Image content recognition method, image recognition model training method, and medium
CN112329745A (en) * 2021-01-04 2021-02-05 北京沃东天骏信息技术有限公司 Training method, face anti-counterfeiting detection method, related equipment and storage medium
US20210082136A1 (en) * 2018-12-04 2021-03-18 Yoti Holding Limited Extracting information from images
CN112597885A (en) * 2020-12-22 2021-04-02 北京华捷艾米科技有限公司 Face living body detection method and device, electronic equipment and computer storage medium
CN112883941A (en) * 2021-04-16 2021-06-01 哈尔滨理工大学 Facial expression recognition method based on parallel neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
刘芾; 李茂军; 胡建文; 肖雨荷; 齐战: "Expression Recognition Based on Low-Pixel Face Images", Laser & Optoelectronics Progress, no. 10 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114724220A (en) * 2022-04-12 2022-07-08 广州广电卓识智能科技有限公司 Living body detection method, living body detection device, and readable medium
CN115631388A (en) * 2022-12-21 2023-01-20 第六镜科技(成都)有限公司 Image classification method and device, electronic equipment and storage medium
CN115631388B (en) * 2022-12-21 2023-03-17 第六镜科技(成都)有限公司 Image classification method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN113221842B (en) 2023-12-29

Similar Documents

Publication Publication Date Title
US11830230B2 (en) Living body detection method based on facial recognition, and electronic device and storage medium
CN109325954B (en) Image segmentation method and device and electronic equipment
CN110084135B (en) Face recognition method, device, computer equipment and storage medium
WO2019192121A1 (en) Dual-channel neural network model training and human face comparison method, and terminal and medium
CN110569731B (en) Face recognition method and device and electronic equipment
CN112052831B (en) Method, device and computer storage medium for face detection
CN111160313B (en) Face representation attack detection method based on LBP-VAE anomaly detection model
JP2002342756A (en) Method for detecting position of eye and mouth in digital image
CN108416291B (en) Face detection and recognition method, device and system
CN110826418B (en) Facial feature extraction method and device
CN110532746B (en) Face checking method, device, server and readable storage medium
CN113221842A (en) Model training method, image recognition method, device, equipment and medium
CN112836625A (en) Face living body detection method and device and electronic equipment
JP2019153092A (en) Position identifying device, position identifying method, and computer program
CN116798041A (en) Image recognition method and device and electronic equipment
CN116229528A (en) Living body palm vein detection method, device, equipment and storage medium
CN111723626A (en) Method, device and electronic equipment for living body detection
CN112733670B (en) Fingerprint feature extraction method and device, electronic equipment and storage medium
CN111986176B (en) Crack image identification method, system, terminal and readable storage medium
CN113807237A (en) Training of in vivo detection model, in vivo detection method, computer device, and medium
WO2020113563A1 (en) Facial image quality evaluation method, apparatus and device, and storage medium
CN112329606B (en) Living body detection method, living body detection device, electronic equipment and readable storage medium
CN116645525B (en) Game image recognition method and processing system
KR102488858B1 (en) Method, apparatus and program for digital restoration of damaged object
CN111144357B (en) Face recognition method and device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Room 202-60, building 6, courtyard 1, gaolizhang Road, Haidian District, Beijing 100082

Applicant after: Sixth mirror technology (Beijing) Group Co.,Ltd.

Address before: Room 202-60, building 6, courtyard 1, gaolizhang Road, Haidian District, Beijing 100082

Applicant before: GLASSSIX TECHNOLOGY (BEIJING) Co.,Ltd.

CB02 Change of applicant information
GR01 Patent grant
GR01 Patent grant