CN116486493A - Living body detection method, device and equipment

Info

Publication number
CN116486493A
CN116486493A (application CN202310466115.9A)
Authority
CN
China
Prior art keywords
source domain
domain
network
living body
body detection
Prior art date
Legal status
Pending
Application number
CN202310466115.9A
Other languages
Chinese (zh)
Inventor
李雅纯
王晶晶
陈玉辉
谢迪
浦世亮
Current Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd filed Critical Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN202310466115.9A
Publication of CN116486493A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G06N3/08 Learning methods
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G06V40/166 Detection; Localisation; Normalisation using acquisition arrangements
    • G06V40/40 Spoof detection, e.g. liveness detection
    • G06V40/45 Detection of the body part being alive
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A40/00 Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
    • Y02A40/70 Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in livestock or poultry

Abstract

The application provides a living body detection method, device and equipment. The method comprises: acquiring a source domain image and a natural image; performing frequency domain transformation on the source domain image to obtain a source domain spectrogram, performing frequency domain transformation on the natural image to obtain a natural spectrogram, generating a mixed spectrogram based on the source domain spectrogram and the natural spectrogram, and performing frequency domain inverse transformation on the mixed spectrogram to obtain a disturbance source domain image; and adjusting network parameters of an initial living body detection model based on the source domain image and the disturbance source domain image to obtain an adjusted living body detection model, and determining a target living body detection model based on the adjusted living body detection model, where the target living body detection model is used to perform living body detection on an image to be detected. With this technical scheme, whether the current operation is a real living body operation can be verified, and the accuracy of living body detection is high.

Description

Living body detection method, device and equipment
Technical Field
The application relates to the technical field of artificial intelligence, and in particular to a living body detection method, device and equipment.
Background
With the development of identity verification technology, identity verification systems are used for identity recognition in many application scenarios, such as banks, railway stations and airports. However, identity verification systems are vulnerable to attack, which brings potential safety hazards; to ensure the stability and security of identity verification systems, living body detection is becoming increasingly popular.
Living body detection is a method for determining the real physiological characteristics of a user. It can be performed through combined actions such as blinking, opening the mouth, shaking the head and nodding, using technologies such as key point positioning and key point detection to verify whether the current operation is a real living body operation. It can effectively resist attack means such as photos, videos, face changing, masks, occlusion, 3D animation and screen re-shooting, thereby helping to screen out attack behaviors and protect the user's interests.
However, when combined actions such as blinking, opening the mouth, shaking the head and nodding are used together with technologies such as key point positioning and key point detection to verify whether the current operation is a real living body operation, the accuracy of living body detection is relatively low; that is, whether the current operation is a real living body operation cannot be detected accurately.
Disclosure of Invention
The present application provides a living body detection method, the method comprising:
acquiring a source domain image and a natural image, wherein the source domain image comprises a living object and/or a non-living object;
performing frequency domain transformation on the source domain image to obtain a source domain spectrogram, performing frequency domain transformation on the natural image to obtain a natural spectrogram, generating a mixed spectrogram based on the source domain spectrogram and the natural spectrogram, and performing frequency domain inverse transformation on the mixed spectrogram to obtain a disturbance source domain image;
Adjusting network parameters of an initial living body detection model based on the source domain image and the disturbance source domain image to obtain an adjusted living body detection model, and determining a target living body detection model based on the adjusted living body detection model; the target living body detection model is used for carrying out living body detection on an image to be detected.
The present application provides a living body detection apparatus, the apparatus comprising:
the acquisition module is used for acquiring the source domain image and the natural image; wherein the source domain image includes a living object and/or a non-living object;
the processing module is used for carrying out frequency domain transformation on the source domain image to obtain a source domain spectrogram, carrying out frequency domain transformation on the natural image to obtain a natural spectrogram, generating a mixed spectrogram based on the source domain spectrogram and the natural spectrogram, and carrying out frequency domain inverse transformation on the mixed spectrogram to obtain a disturbance source domain image;
the training module is used for adjusting network parameters of the initial living body detection model based on the source domain image and the disturbance source domain image to obtain an adjusted living body detection model, and determining a target living body detection model based on the adjusted living body detection model; the target living body detection model is used for carrying out living body detection on the image to be detected.
The application provides an electronic device, comprising: a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor; the processor is configured to execute the machine-executable instructions to implement the living body detection method of the above example.
According to the above technical solution, in the embodiments of the application, a large number of simulated disturbance source domain images are constructed based on the source domain images and the natural images, and the initial living body detection model is then trained with the source domain images and the disturbance source domain images to obtain the target living body detection model. After an image to be detected is acquired, it is input to the target living body detection model, which performs living body detection on it; in this way, whether the current operation is a real living body operation can be verified, and the accuracy of living body detection is high, i.e., whether the current operation is a real living body operation can be detected accurately. When multi-scene data is limited, a small number of source domain images and a large number of natural images can be used to simulate disturbance source domain images covering many scenes, enriching the source domain data. Training the target living body detection model with the large number of simulated disturbance source domain images improves its domain generalization capability, achieving better domain generalization performance and effect. This addresses the cross-domain (cross-scene and cross-device) problem of living body detection: models easily degrade or even fail completely across domains, whereas the target living body detection model can maintain high accuracy even in an unknown target domain.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application; a person of ordinary skill in the art can also obtain other drawings from these drawings.
FIG. 1 is a flow diagram of a living body detection method in one embodiment of the present application;
FIG. 2 is a schematic diagram of a simulated generation of a disturbance source domain image in one embodiment of the present application;
FIG. 3 is a schematic diagram of the structure of a living body detection model in one embodiment of the present application;
FIG. 4 is a schematic diagram of the architecture of a dynamic module network in one embodiment of the present application;
FIG. 5 is a schematic diagram of the architecture of a dynamic module network in one embodiment of the present application;
FIG. 6 is a schematic diagram of a dynamic module network in one embodiment of the present application;
FIG. 7 is a schematic structural view of a living body detection apparatus in an embodiment of the present application;
fig. 8 is a hardware configuration diagram of an electronic device in an embodiment of the present application.
Detailed Description
The terminology used in the embodiments of the application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to any or all possible combinations including one or more of the associated listed items.
It should also be understood that although the terms first, second, third, etc. may be used in the embodiments of the present application to describe various information, the information should not be limited to these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of the present application, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. Furthermore, depending on the context, the word "if" as used herein may be interpreted as "when" or "upon" or "in response to determining".
In an embodiment of the present application, a living body detection method is provided, and referring to fig. 1, the method may include:
Step 101, acquiring a source domain image and a natural image; wherein the source domain image may comprise a living object and/or a non-living object.
Step 102, performing frequency domain transformation on the source domain image to obtain a source domain spectrogram, performing frequency domain transformation on the natural image to obtain a natural spectrogram, generating a mixed spectrogram based on the source domain spectrogram and the natural spectrogram, and performing frequency domain inverse transformation on the mixed spectrogram to obtain a disturbance source domain image.
Illustratively, generating a mixed spectrogram based on the source domain spectrogram and the natural spectrogram may include, but is not limited to, the following. If the source domain spectrogram comprises a source domain amplitude spectrum and a source domain phase spectrum, and the natural spectrogram comprises a natural amplitude spectrum and a natural phase spectrum, a disturbance amplitude spectrum can be generated based on the source domain amplitude spectrum, a first disturbance coefficient of the source domain amplitude spectrum, the natural amplitude spectrum, and a second disturbance coefficient of the natural amplitude spectrum, wherein the sum of the first disturbance coefficient and the second disturbance coefficient is a fixed value, and the second disturbance coefficient is determined based on the configured disturbance intensity. The mixed spectrogram is then generated based on the disturbance amplitude spectrum and the source domain phase spectrum.
Step 103, adjusting network parameters of the initial living body detection model based on the source domain image and the disturbance source domain image to obtain an adjusted living body detection model, and determining a target living body detection model based on the adjusted living body detection model; the target living body detection model is used for carrying out living body detection on the image to be detected.
Illustratively, the initial living body detection model may include a feature extraction network, a dynamic module network and a classifier network, and adjusting the network parameters of the initial living body detection model based on the source domain image and the disturbance source domain image to obtain the adjusted living body detection model may include, but is not limited to, the following. First, the source domain image is input to the feature extraction network to obtain source domain image features; the source domain image features are input to the dynamic module network to obtain source domain dynamic features; the source domain dynamic features are input to the classifier network to obtain source domain output features; a first loss value is determined based on the source domain output features, a target loss value is determined based on the first loss value, and the network parameters of the feature extraction network, the dynamic module network and the classifier network are adjusted based on the target loss value to obtain an intermediate living body detection model. Then, the disturbance source domain image is input to the feature extraction network of the intermediate living body detection model to obtain disturbance source domain image features; the disturbance source domain image features are input to the dynamic module network of the intermediate living body detection model to obtain disturbance source domain dynamic features; the disturbance source domain dynamic features are input to the classifier network of the intermediate living body detection model to obtain disturbance source domain output features; a second loss value is determined based on the disturbance source domain output features, and the network parameters of the feature extraction network, the dynamic module network and the classifier network of the intermediate living body detection model are adjusted based on the second loss value to obtain the adjusted living body detection model.
Illustratively, the dynamic module network may include a domain invariant branch network and a domain specific branch network, and inputting the source domain image features to the dynamic module network to obtain the source domain dynamic features may include, but is not limited to: inputting the source domain image features to the domain invariant branch network to obtain domain-invariant features, where the domain-invariant features are common features of multiple domains determined based on the source domain image features; inputting the source domain image features to the domain specific branch network to obtain domain-specific features, where the domain-specific features are features specific to one domain, determined based on the source domain image features; and generating the source domain dynamic features based on the domain-invariant features and the domain-specific features.
Illustratively, the domain-specific branch network may include a dynamic adapter and K convolutional layers, K being a positive integer greater than 1, and inputting the source domain image features to the domain-specific branch network results in domain-specific features, which may include, but are not limited to: inputting the source domain image characteristics to a dynamic adapter, and generating K weight values corresponding to K convolution layers by the dynamic adapter, wherein the K weight values are in one-to-one correspondence with the K convolution layers; inputting the source domain image characteristics to each convolution layer, and generating convolution characteristics corresponding to the convolution layers by the convolution layers based on the source domain image characteristics and weight values corresponding to the convolution layers; domain-specific features are generated based on K convolution features corresponding to the K convolution layers.
Illustratively, determining the target loss value based on the first loss value may include, but is not limited to: determining an entropy minimization loss value and a diversity regularization loss value based on the K weight values; determining a third loss value based on the entropy minimization loss value and the diversity regularization loss value; and determining the target loss value based on the first loss value and the third loss value.
Illustratively, the domain invariant branch network may include a convolution layer, an instance normalization layer, and an activation layer, and inputting the source domain image features to the domain invariant branch network results in domain invariant features, which may include, but are not limited to: the source domain image features are input to the convolution layer to obtain convolved features, the convolved features are input to the example normalization layer to obtain normalized features, the normalized features are input to the activation layer to obtain activated features, and on the basis, domain invariant features can be generated based on the activated features.
Illustratively, determining the target living body detection model based on the adjusted living body detection model may include, but is not limited to: if the adjusted living body detection model has converged, taking the adjusted living body detection model as the target living body detection model; and if the adjusted living body detection model has not converged, taking the adjusted living body detection model as the initial living body detection model and returning to the step of adjusting the network parameters of the initial living body detection model based on the source domain image and the disturbance source domain image to obtain an adjusted living body detection model.
For example, after the target living body detection model is obtained through training, the target living body detection model may be deployed to the terminal device, and after the terminal device collects the image to be detected, the image to be detected may be input to the target living body detection model, so as to perform living body detection on the image to be detected through the target living body detection model, thereby verifying whether the current operation is a real living body operation, and the living body detection process is not limited.
According to the above technical solution, in the embodiments of the application, a large number of simulated disturbance source domain images are constructed based on the source domain images and the natural images, and the initial living body detection model is then trained with the source domain images and the disturbance source domain images to obtain the target living body detection model. After an image to be detected is acquired, it is input to the target living body detection model, which performs living body detection on it; in this way, whether the current operation is a real living body operation can be verified, and the accuracy of living body detection is high, i.e., whether the current operation is a real living body operation can be detected accurately. When multi-scene data is limited, a small number of source domain images and a large number of natural images can be used to simulate disturbance source domain images covering many scenes, enriching the source domain data. Training the target living body detection model with the large number of simulated disturbance source domain images improves its domain generalization capability, achieving better domain generalization performance and effect. This addresses the cross-domain (cross-scene and cross-device) problem of living body detection: models easily degrade or even fail completely across domains, whereas the target living body detection model can maintain high accuracy even in an unknown target domain.
The above technical solutions of the embodiments of the present application are described below with reference to specific application scenarios.
Living body detection is a method of determining the real physiological characteristics of a user, and can verify whether the current operation is a real living body operation. To realize living body detection, a living body detection model can be trained, so that living body detection is performed based on the model and whether the current operation is a real living body operation can be verified.
To adapt the living body detection model to all scenes, a data set would need to be collected for every scene, and the model trained on the data sets of all scenes. However, because there are many actual scenes, acquiring a data set for each scene takes a lot of time; that is, training the living body detection model takes a long time, and the training efficiency is low. Moreover, collecting a data set for each scene consumes a large amount of resources. Even with a large amount of time and resources, data for all scenes cannot be acquired, so the living body detection model performs poorly and cannot be applied to all scenes; in some scenes, the accuracy of living body detection with such a model is low.
In view of the above, in the embodiments of the present application, when multi-scene data is limited, only a small number of source domain images (i.e., data sets from a small number of scenes) need to be acquired, together with a large number of natural images (e.g., from public data sets, which can cover sufficiently rich scenes). Disturbance source domain images under many scenes (i.e., data sets of a large number of scenes) can then be generated by simulation from the small number of source domain images and the large number of natural images, thereby enriching the source domain data. Because only a small number of source domain images need to be acquired, the living body detection model can be trained in a relatively short time, the training efficiency is relatively high, and only a small amount of resources is consumed. Trained on a large number of disturbance source domain images, the living body detection model has better performance, can be applied to all or most scenes, has improved domain generalization capability, achieves better domain generalization performance and effect, and attains higher living body detection accuracy.
Referring to FIG. 2, the process of generating a disturbance source domain image by simulation may include the following steps:
Step 201, acquiring a source domain image and a natural image; wherein the source domain image may comprise a living object and/or a non-living object.
For example, among a large number of scenes, data sets may be acquired for only a few scenes; these data sets are referred to as source domain data sets, and the data in the source domain data sets are referred to as source domain images. Source domain images may be labeled images, serving as sample images for training the living body detection model.
Illustratively, although living body samples from different domains (different scenes) are difficult to acquire, a large number of natural images can be obtained from public data sets, such as the ImageNet data set and/or the COCO data set. These natural images can cover sufficiently rich scenes and carry rich domain style information, which helps to improve the generalization ability of the living body detection model.
Step 202, performing frequency domain transformation on the source domain image to obtain a source domain spectrogram, where the source domain spectrogram may comprise a source domain amplitude spectrum and a source domain phase spectrum; and performing frequency domain transformation on the natural image to obtain a natural spectrogram, where the natural spectrogram may comprise a natural amplitude spectrum and a natural phase spectrum.
For example, some or all of the source domain images may be selected from the full set of source domain images. Each selected source domain image (denoted X_s in the following) is frequency-domain transformed; for example, the source domain image X_s is Fourier transformed using an FFT algorithm to obtain the source domain spectrogram corresponding to X_s, which may include the source domain amplitude spectrum A(X_s) and the source domain phase spectrum P(X_s).
Similarly, some or all of the natural images may be selected from the full set of natural images. Each selected natural image (denoted X_n in the following) is frequency-domain transformed; for example, the natural image X_n is Fourier transformed using an FFT algorithm to obtain the natural spectrogram corresponding to X_n, which may include the natural amplitude spectrum A(X_n) and the natural phase spectrum P(X_n).
Step 203, generating a disturbance amplitude spectrum based on the source domain amplitude spectrum, a first disturbance coefficient of the source domain amplitude spectrum, the natural amplitude spectrum and a second disturbance coefficient of the natural amplitude spectrum; wherein the sum of the first perturbation coefficient and the second perturbation coefficient may be a fixed value (e.g., 1), and the second perturbation coefficient may be determined based on the configured perturbation intensity, or the first perturbation coefficient may be determined based on the configured perturbation intensity.
For example, in the Fourier spectrum obtained by Fourier transform, the phase component retains the high-level semantics of the original signal, while the amplitude component contains low-level statistical information; that is, the phase component of the Fourier spectrum has a semantics-preserving property. To better exploit the data distribution of a large number of natural images, this semantics-preserving property of the frequency domain transform (such as the Fourier transform) can be used to perturb the source domain image, thereby simulating the distribution changes that may occur in different target domains, i.e., the domain shifts of different target domains. On this basis, the source domain image can be perturbed with the natural image; for example, the source domain amplitude spectrum can be perturbed with the natural amplitude spectrum.
Illustratively, to exploit the domain information carried by the natural amplitude spectrum A(X_n), linear interpolation between the natural amplitude spectrum A(X_n) and the source domain amplitude spectrum A(X_s) can be used to obtain the disturbance amplitude spectrum Â(X_s). Formula (1) is one example of determining the disturbance amplitude spectrum Â(X_s) and is not limiting:

    Â(X_s) = (1 − λ) · A(X_s) + λ · A(X_n)    (1)

In formula (1), λ represents the second disturbance coefficient, corresponding to the natural amplitude spectrum A(X_n), and 1 − λ represents the first disturbance coefficient, corresponding to the source domain amplitude spectrum A(X_s); obviously, the sum of the first disturbance coefficient and the second disturbance coefficient is 1.

In formula (1), λ may be greater than 0 and less than η, where η controls the disturbance intensity; the value of η, and hence of λ, may be configured empirically and is not limited here. After the second disturbance coefficient λ is obtained, the first disturbance coefficient 1 − λ follows. Alternatively, the first disturbance coefficient 1 − λ may be configured first, and the second disturbance coefficient λ obtained from it.
Illustratively, perturbing with the natural image modifies the low-level statistical information of the source domain image, making it richer, while the semantic information of the source domain image is not affected.
Step 204, generating a mixed spectrogram based on the disturbance amplitude spectrum and the source domain phase spectrum.
Illustratively, after the disturbance amplitude spectrum Â(X_s) is obtained, it can be recombined with the source domain phase spectrum P(X_s) to obtain the mixed spectrogram.
Step 205, performing frequency domain inverse transformation on the mixed spectrogram to obtain a disturbance source domain image.
For example, after the mixed spectrogram is obtained, a frequency domain inverse transform can be applied to it to generate, by simulation, a disturbance source domain image with a rich domain style, i.e., a disturbance source domain image whose statistical information has been changed. For example, an inverse Fourier transform (iFFT) is applied to the mixed spectrogram to obtain the perturbed source domain image X̂_s. Assuming the above perturbation process is performed for M source domain images and N natural images, M×N disturbance source domain images can be obtained, i.e., a large number of disturbance source domain images.
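The perturbation pipeline of steps 201 to 205 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name, the per-channel 2-D FFT, and the uniform sampling of λ from (0, η) are choices made for the example.

```python
import numpy as np

def perturb_source_image(x_s: np.ndarray, x_n: np.ndarray, eta: float = 0.5) -> np.ndarray:
    """Perturb a source domain image with the amplitude spectrum of a natural image.

    x_s, x_n: float arrays of shape (H, W, C) with matching sizes.
    eta: upper bound on the disturbance intensity; lambda is drawn from (0, eta).
    """
    lam = np.random.uniform(0.0, eta)                  # second disturbance coefficient
    out = np.empty_like(x_s)
    for c in range(x_s.shape[-1]):                     # per-channel 2-D FFT
        f_s = np.fft.fft2(x_s[..., c])                 # source domain spectrogram
        f_n = np.fft.fft2(x_n[..., c])                 # natural spectrogram
        amp_s, pha_s = np.abs(f_s), np.angle(f_s)      # A(X_s) and P(X_s)
        amp_n = np.abs(f_n)                            # A(X_n)
        amp_mix = (1.0 - lam) * amp_s + lam * amp_n    # formula (1)
        f_mix = amp_mix * np.exp(1j * pha_s)           # recombine with the source phase spectrum
        out[..., c] = np.real(np.fft.ifft2(f_mix))     # inverse FFT -> disturbance image
    return out
```

Applying such a routine to M source domain images and N natural images yields the M×N disturbance source domain images described above.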
For example, a preconfigured initial living body detection model may be acquired. After the disturbance source domain images are obtained, the network parameters of the initial living body detection model may be adjusted based on the source domain image and the disturbance source domain image to obtain an adjusted living body detection model, and the target living body detection model may be determined based on the adjusted living body detection model.
For example, the source domain image is input to an initial living body detection model to obtain a source domain output characteristic, a loss value is determined based on the source domain output characteristic, and network parameters of the initial living body detection model are adjusted based on the loss value to obtain an intermediate living body detection model. And inputting the disturbance source domain image into the intermediate living body detection model to obtain disturbance source domain output characteristics, determining a loss value based on the disturbance source domain output characteristics, and adjusting network parameters of the intermediate living body detection model based on the loss value to obtain an adjusted living body detection model.
For another example, the disturbance source domain image is input to the initial living body detection model to obtain a disturbance source domain output characteristic, a loss value is determined based on the disturbance source domain output characteristic, and network parameters of the initial living body detection model are adjusted based on the loss value to obtain an intermediate living body detection model. And then, inputting the source domain image into the intermediate living body detection model to obtain a source domain output characteristic, determining a loss value based on the source domain output characteristic, and adjusting network parameters of the intermediate living body detection model based on the loss value to obtain an adjusted living body detection model.
For example, if the adjusted living body detection model has converged, it may be taken as the target living body detection model; if it has not converged, it may be taken as the initial living body detection model, and the step of adjusting the network parameters of the initial living body detection model based on the source domain image and the disturbance source domain image to obtain an adjusted living body detection model is repeated.
In summary, a large number of disturbance source domain images are generated by simulation from a small number of source domain images and a large number of natural images: rich disturbance source domain images are simulated via frequency domain transformation, and the conditions of practical living body detection are simulated based on meta learning, realizing domain-generalized living body detection based on meta learning. Even when scene data is limited, the living body detection model can achieve good domain generalization effects, such as cross-device and cross-scene living body detection, and its domain generalization capability is improved. Domain generalization (DG) refers to using data from multiple source domains to generalize to unknown/unseen target domains; when only samples from a single domain are available during training, this may be referred to as single-domain generalization.
Referring to fig. 3, a schematic structure of a living body detection model (such as an initial living body detection model or a target living body detection model) is shown, and the living body detection model may include a feature extraction network, a dynamic module network and a classifier network. Of course, fig. 3 is only an example of a living body detection model, and is not limited thereto.
In the model training process, network parameters of a feature extraction network, a dynamic module network and a classifier network of the initial living body detection model can be adjusted based on the source domain image and the disturbance source domain image to obtain an adjusted living body detection model, and a target living body detection model is determined based on the adjusted living body detection model.
For example, inputting the source domain image to a feature extraction network to obtain the source domain image feature; inputting the source domain image characteristics into a dynamic module network to obtain source domain dynamic characteristics; inputting the dynamic characteristics of the source domain into a classifier network to obtain output characteristics of the source domain; and determining a first loss value based on the source domain output characteristics, determining a target loss value based on the first loss value, and adjusting network parameters of the characteristic extraction network, the dynamic module network and the classifier network based on the target loss value to obtain an intermediate living body detection model. Then, inputting the disturbance source domain image into a feature extraction network of the intermediate living body detection model to obtain disturbance source domain image features; inputting the disturbance source domain image characteristics into a dynamic module network of the intermediate living body detection model to obtain disturbance source domain dynamic characteristics; inputting the dynamic characteristics of the disturbance source domain into a classifier network of the intermediate living body detection model to obtain the output characteristics of the disturbance source domain; and determining a second loss value based on the disturbance source domain output characteristics, and adjusting network parameters of a characteristic extraction network, a dynamic module network and a classifier network of the intermediate living body detection model based on the second loss value to obtain an adjusted living body detection model.
For another example, the disturbance source domain image is input to a feature extraction network to obtain disturbance source domain image features; inputting the disturbance source domain image characteristics into a dynamic module network to obtain disturbance source domain dynamic characteristics; inputting the dynamic characteristics of the disturbance source domain into a classifier network to obtain the output characteristics of the disturbance source domain; and determining a first loss value based on the disturbance source domain output characteristics, determining a target loss value based on the first loss value, and adjusting network parameters of the characteristic extraction network, the dynamic module network and the classifier network based on the target loss value to obtain an intermediate living body detection model. Then, inputting the source domain image into a feature extraction network of the intermediate living body detection model to obtain source domain image features; inputting the source domain image characteristics into a dynamic module network of the intermediate living body detection model to obtain source domain dynamic characteristics; inputting the dynamic characteristics of the source domain into a classifier network of the intermediate living body detection model to obtain output characteristics of the source domain; and determining a second loss value based on the source domain output characteristics, and adjusting network parameters of a characteristic extraction network, a dynamic module network and a classifier network of the intermediate living body detection model based on the second loss value to obtain an adjusted living body detection model.
For example, if the adjusted living body detection model has converged, it may be taken as the target living body detection model; if it has not converged, it may be taken as the initial living body detection model, and the step of adjusting the network parameters of the initial living body detection model based on the source domain image and the disturbance source domain image to obtain an adjusted living body detection model is repeated.
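This two-phase alternation can be expressed compactly as a PyTorch sketch. The names below are illustrative rather than taken from the patent; it assumes the caller supplies the two loss functions described above.

```python
import torch

def adjustment_round(model: torch.nn.Module, optimizer: torch.optim.Optimizer,
                     x_src, y_src, x_pert, y_pert, src_loss_fn, pert_loss_fn):
    """One round of the two-phase parameter adjustment described above.

    src_loss_fn combines the first loss with the third (information
    maximization) loss, i.e. the target loss value; pert_loss_fn is the
    second loss. Function and argument names are illustrative.
    """
    # Phase 1: adjust on a source domain batch -> intermediate model.
    optimizer.zero_grad()
    src_loss_fn(model(x_src), y_src).backward()
    optimizer.step()

    # Phase 2: adjust the intermediate model on a disturbance source domain
    # batch -> adjusted model; the round repeats until convergence.
    optimizer.zero_grad()
    pert_loss_fn(model(x_pert), y_pert).backward()
    optimizer.step()
```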
In the use process of the model, living body detection can be carried out on the image to be detected based on the target living body detection model. For example, after the target living body detection model is obtained through training, the target living body detection model can be deployed to the terminal equipment, and after the image to be detected is acquired, the terminal equipment can input the image to be detected to the target living body detection model so as to carry out living body detection on the image to be detected through the target living body detection model, thereby verifying whether the current operation is a real living body operation or not, and the living body detection process is not limited.
For example, inputting the image to be detected into a feature extraction network to obtain the feature of the image to be detected; inputting the image characteristics to be detected into a dynamic module network to obtain dynamic characteristics to be detected; inputting the dynamic characteristics to be detected into a classifier network to obtain output characteristics to be detected; a living detection result is determined based on the output characteristic to be detected, the living detection result being used to indicate that the current operation is a real living operation or not.
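At inference time the deployed model is a single forward pass. A minimal sketch, assuming the model outputs two-class logits, that class index 1 is the living class, and a 0.5 decision threshold; none of these specifics are fixed by the patent.

```python
import torch

def detect_liveness(model: torch.nn.Module, image: torch.Tensor,
                    threshold: float = 0.5) -> bool:
    """Run the target living body detection model on an image to be detected."""
    model.eval()
    with torch.no_grad():
        logits = model(image.unsqueeze(0))         # add a batch dimension
        probs = torch.softmax(logits, dim=-1)
    return probs[0, 1].item() >= threshold         # True: real living operation
```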
Referring to fig. 4, a schematic diagram of a dynamic module network is shown, which may include a domain-invariant branch network and a domain-specific branch network. Of course, FIG. 4 is merely an example of a dynamic module network, and is not limited in this regard. Regarding the feature extraction network and the classifier network, there is no limitation in this embodiment as long as the feature extraction network can realize the feature extraction function and the classifier network can realize the classification function.
Illustratively, after obtaining the source domain image feature (or the disturbance source domain image feature), inputting the source domain image feature (or the disturbance source domain image feature) to a domain invariant branch network to obtain a domain invariant feature, wherein the domain invariant feature is a common feature of a plurality of domains determined based on the source domain image feature (or the disturbance source domain image feature). The domain-specific feature is obtained by inputting the source domain image feature (or the disturbance source domain image feature) into the domain-specific branch network, and the domain-specific feature is a personalized feature of one domain determined based on the source domain image feature (or the disturbance source domain image feature). A source domain dynamic feature (or perturbation source domain dynamic feature) is generated based on the domain invariant feature and the domain specific feature.
Referring to fig. 5, which is a schematic structural diagram of a dynamic module network, a domain-specific branch network may include a dynamic adapter (dynamic adapter) and K convolution layers, where K is a positive integer greater than 1, based on which source domain image features (or disturbance source domain image features) may be input to the dynamic adapter, and K weight values corresponding to the K convolution layers are generated by the dynamic adapter, where the K weight values are in one-to-one correspondence with the K convolution layers. The source domain image features (or perturbed source domain image features) may be input to each convolution layer, which generates the corresponding convolution features for the convolution layer based on the source domain image features (or perturbed source domain image features) and the corresponding weight values for the convolution layer. Domain-specific features may be generated based on K convolution features corresponding to K convolution layers.
Referring to fig. 6, which is a schematic structural diagram of a dynamic module network, the domain invariant branch network may include a convolution layer, an instance normalization layer, and an activation layer, based on which source domain image features (or disturbance source domain image features) may be input to the convolution layer to obtain a convolved feature, the convolved feature may be input to the instance normalization layer to obtain a normalized feature, and the normalized feature may be input to the activation layer to obtain an activated feature. After the post-activation feature is obtained, a domain invariant feature may be generated based on the post-activation feature.
The following describes the processing procedure of the source domain image in conjunction with the dynamic module network shown in fig. 6, and the processing procedure of the disturbance source domain image is similar to the processing procedure of the source domain image, and will not be repeated here.
With reference to fig. 3 and 6, the source domain image may be input to the feature extraction network to obtain the source domain image features. After the source domain image features are obtained, they can be input to the domain invariant branch network and the domain specific branch network respectively. The dynamic module network is thus divided into a domain invariant branch network and a domain specific branch network. For the domain invariant branch network, features can be aligned to a domain-agnostic space to learn domain-invariant features; such a shared feature space retains good generalization capability in unknown domains, and domain-invariant features are common across different domains. Meanwhile, domain-specific features still help to improve performance in the respective domains, i.e., each domain or each sample has unique characteristics. The domain-agnostic space can be regarded as a hidden space: the invariance constraint imposed on this space enhances the generalization capability of features in unknown domains, but also discards some discriminative information in the hidden space that is effective for the target task. To better exploit the domain-related complementary information in the data, the domain specific branch network is used to extract domain-specific features, which complement the domain-invariant features; that is, domain-invariant features and domain-specific features are extracted by the domain invariant branch network and the domain specific branch network respectively.
The domain invariant branch network consists of a convolution layer, an instance normalization layer (Instance Normalization, IN) and an activation layer. Sample-specific characteristics are removed by the convolution layer and the instance normalization layer: the instance normalization layer removes the style of each sample and reduces domain differences, thereby enhancing the generalization capability of the domain invariant branch. On this basis, after the source domain image features are obtained, the domain invariant branch network performs a convolution operation on them through the convolution layer to obtain the convolved features, performs an instance normalization operation on the convolved features through the instance normalization layer to obtain the normalized features, and performs an activation operation on the normalized features through the activation layer (for example, using a ReLU activation function) to obtain the activated features. The activated features can be used as the domain-invariant features, which are common features of multiple domains.
For example, the domain-invariant features may be learned as:

    F_inv = ReLU(IN(f_3×3(F)))

where F denotes the source domain image features, f_3×3 denotes a convolution operation with a kernel size of 3×3, IN denotes the instance normalization operation, ReLU denotes the activation operation, and F_inv denotes the domain-invariant features.
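The domain invariant branch maps directly onto a Conv-IN-ReLU block. A minimal PyTorch sketch, where the channel count and the 'same' padding are assumptions:

```python
import torch
import torch.nn as nn

class DomainInvariantBranch(nn.Module):
    """F_inv = ReLU(IN(f_3x3(F))): 3x3 convolution -> instance normalization -> ReLU."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.inorm = nn.InstanceNorm2d(channels)   # removes per-sample style, shrinking domain gaps
        self.relu = nn.ReLU(inplace=True)

    def forward(self, f: torch.Tensor) -> torch.Tensor:   # f: (B, C, H, W) image features
        return self.relu(self.inorm(self.conv(f)))
```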
The domain specific branch network consists of a dynamic adapter and K convolution layers. The dynamic adapter adjusts the weight value of each convolution layer, i.e., the convolution weights are adapted per sample; the parameters of the domain specific branch network are therefore not fixed, so it is less prone to degradation in unseen domains.
After the source domain image features are obtained, the domain specific branch network processes them through the dynamic adapter to obtain K weight values corresponding to the K convolution layers, in one-to-one correspondence. That is, the dynamic adapter predicts the weight of each convolution layer:

    W = d(F)

where d denotes the dynamic adapter processing the source domain image features F, and W denotes the weight values, i.e., the K weight values corresponding to the K convolution layers, where K may be a positive integer greater than 1.
The dynamic adapter structure can be Pooling-FC-ReLU-FC-Softmax, where Pooling denotes a pooling operation, FC a fully connected operation, ReLU an activation operation, and Softmax a softmax (logistic regression) operation; that is, the source domain image features F pass through Pooling, FC, ReLU, FC and Softmax to obtain the K weight values.
The source domain image feature F may be input to each convolution layer, which generates a convolution feature corresponding to the convolution layer based on the source domain image feature F and a weight value corresponding to the convolution layer. After deriving the convolution features for each convolution layer, domain-specific features may be generated based on the K convolution features for the K convolution layers.
For example, the domain-specific features may be learned as:

    F_spec = Σ_{k=1}^{K} w_k · f_k^{3×3}(F)

where F denotes the source domain image features, f_k^{3×3} denotes the convolution operation of the k-th convolution layer with a kernel size of 3×3, and w_k denotes the weight value corresponding to the k-th convolution layer. As k runs from 1 to K: when k is 1, w_1 is the 1st weight value output by the dynamic adapter, f_1^{3×3}(F) is the convolution of the source domain image features F by the 1st convolution layer, and w_1 · f_1^{3×3}(F) is the convolution feature of the 1st convolution layer; and so on. The K convolution features corresponding to the K convolution layers are then summed to obtain the domain-specific features F_spec, which may be features specific to one domain.
Based on the domain-invariant features and the domain-specific features, the source domain dynamic features may be generated; for example, the domain-invariant features and the domain-specific features may be added element-wise, or concatenated, to obtain the source domain dynamic features, without limitation. Because the source domain dynamic features are obtained from both the domain-invariant features and the domain-specific features, the instance normalization layer ensures domain invariance while the dynamic adapter adapts the features per sample; domain-invariant and domain-specific information are thus both well represented, improving generalization capability. A combined sketch of the two branches is given below.
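Continuing the sketch above, the domain specific branch and the full dynamic module might look as follows. The number of convolution layers K, the hidden width of the adapter, and the element-wise addition of the two branch outputs are assumptions consistent with the description, not specifics from the patent.

```python
import torch
import torch.nn as nn

class DynamicAdapter(nn.Module):
    """Pooling -> FC -> ReLU -> FC -> Softmax: predicts K weight values per sample."""

    def __init__(self, channels: int, k: int):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // 4),
            nn.ReLU(inplace=True),
            nn.Linear(channels // 4, k),
            nn.Softmax(dim=-1),
        )

    def forward(self, f: torch.Tensor) -> torch.Tensor:    # f: (B, C, H, W)
        return self.fc(self.pool(f).flatten(1))            # (B, K) weight values

class DomainSpecificBranch(nn.Module):
    """F_spec = sum_k w_k * f_k^{3x3}(F), with the weights W = d(F) from the adapter."""

    def __init__(self, channels: int, k: int = 4):
        super().__init__()
        self.adapter = DynamicAdapter(channels, k)
        self.convs = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1) for _ in range(k)
        )

    def forward(self, f: torch.Tensor):
        w = self.adapter(f)                                   # (B, K)
        feats = [conv(f) for conv in self.convs]              # K convolution features
        spec = sum(w[:, i, None, None, None].squeeze(1) * feats[i]
                   for i in range(len(feats)))                # weighted sum over the K layers
        return spec, w                                        # w is reused by the IM loss

class DynamicModule(nn.Module):
    """Combines the domain invariant and domain specific features (here by addition)."""

    def __init__(self, channels: int, k: int = 4):
        super().__init__()
        self.invariant = DomainInvariantBranch(channels)
        self.specific = DomainSpecificBranch(channels, k)

    def forward(self, f: torch.Tensor):
        f_spec, w = self.specific(f)
        return self.invariant(f) + f_spec, w
```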
With reference to fig. 3 and fig. 6, after the source domain dynamic feature is obtained, the source domain dynamic feature may be input to a classifier network, and the classifier network classifies the source domain dynamic feature to obtain a source domain output feature.
In one possible implementation manner, during the training process of the living body detection model, a first loss function, a second loss function and a third loss function can be constructed, wherein the input of the first loss function is a source domain output characteristic, the output of the first loss function is a first loss value, and the first loss function is not limited.
For example, the first loss function may be a cross entropy loss function. One example of the first loss function is shown in formula (2); of course, formula (2) is merely an example and is not limiting:

    L_Cls(S) = − Σ_i Σ_{c=1}^{C} Y_{i,c} · log( C(D(F(X_i)))_c )    (2)

In formula (2), X_i denotes the i-th source domain image; F denotes the feature extraction network, D the dynamic module network, and C(·) the classifier network, so that the source domain image X_i passes through F, D and C to produce the source domain output features; Y_i denotes the label corresponding to the i-th source domain image; c indexes the classification categories, of which there are C in total; and L_Cls(S) denotes the output of the first loss function, i.e., the first loss value.
The input of the second loss function is the disturbance source domain output feature and its output is the second loss value; the form of the second loss function is likewise not limited. For example, the second loss function may also be a cross entropy loss function, as in formula (3); formula (3) is merely exemplary, and the form below is reconstructed from the parameter definitions that follow:

L_Cls(S⁺) = −(1/N) Σ_{i=1}^{N} log [C(D(F(X_i)))]_{Y_i}    (3)

In formula (3), X_i represents the i-th disturbance source domain image, F represents the feature extraction network, D represents the dynamic module network, and C(·) represents the classifier network; the disturbance source domain image X_i passes through F, D and C to give the disturbance source domain output feature. Y_i represents the label corresponding to the i-th disturbance source domain image, C represents the classification category (i.e. there are C categories in total), and L_Cls(S⁺) represents the second loss value.
The input of the third loss function is the K weight values output by the dynamic adapter, and its output is the third loss value; the form of the third loss function is not limited. For example, to enhance the variability of the different convolutions, an information maximization (Information Maximization) loss function may be introduced as the third loss function. The information maximization loss function maximizes the mutual information between the input features and the dynamic weights, and may be composed of an entropy minimum (entropy minimization) loss function and a difference regular (diversity regularization) loss function. One example of the information maximization loss function is shown in formula (4), which is not limiting.
L_IM(θ_F, θ_D(X)) = L_ent + L_div    (4)

where, as a reconstruction from the parameter definitions below,

L_ent = −(1/N) Σ_{i=1}^{N} Σ_{k=1}^{K} W_i^k · log W_i^k,  L_div = Σ_{k=1}^{K} W̄^k · log W̄^k

In formula (4), θ_F represents the network parameters of the feature extraction network, and θ_D(X) represents the network parameters of the dynamic module network; L_ent represents the entropy minimum loss value under θ_F and θ_D(X), L_div represents the difference regular loss value under θ_F and θ_D(X), and L_IM represents the information maximization loss value under θ_F and θ_D(X), i.e. the third loss value. L_ent corresponds to the entropy minimum loss function, L_div corresponds to the difference regular loss function, and L_IM corresponds to the information maximization loss function.
In formula (4), W_i denotes the weight values of the i-th source domain image, i.e. the K weight values output by the dynamic adapter for the i-th source domain image. With k taking the values 1 to K in turn, W_i^k denotes the k-th weight value: for k = 1, W_i^1 denotes the 1st weight value output by the dynamic adapter, and so on.
In formula (4), W̄ denotes the average weight values of the N source domain images, i.e. the average of the weight values output by the dynamic adapter for the N source domain images. With k taking the values 1 to K in turn, W̄^k denotes the average of the k-th weight values: for k = 1, W̄^1 denotes the average of the 1st weight values of the N source domain images, and so on.
From the above, the entropy minimum loss value L_ent is determined based on the K weight values of each source domain image; the difference regular loss value L_div is determined based on the K average weight values over all source domain images (e.g. the average of the 1st weight values of all source domain images, the average of the 2nd weight values of all source domain images, and so on); and the sum of the entropy minimum loss value L_ent and the difference regular loss value L_div is taken as the third loss value L_IM.
The entropy minimum loss value L_ent makes the dynamically predicted weights relatively confident, while the difference regular loss value L_div makes the predicted weights diverse as a whole. In this way, the domain-specific branch network can dynamically adjust the network based on a single sample and learn richer meta-feature representations through information maximization.
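A minimal sketch of the third loss, assuming the adapter weights of a batch are stacked into a tensor W of shape (N, K); the expanded entropy and diversity terms follow the reconstruction of formula (4) above and are an assumption.

```python
# Hedged sketch of the information maximization loss (formula (4)):
# entropy minimization per sample plus a difference-regular (diversity)
# term on the batch-averaged weights.
import torch

def information_maximization_loss(W: torch.Tensor, eps: float = 1e-8):
    # Entropy minimum term: makes each sample's K weight values confident.
    l_ent = -(W * (W + eps).log()).sum(dim=1).mean()
    # Difference regular term: negative entropy of the mean weight vector;
    # minimizing it pushes the averaged weights toward overall diversity.
    w_bar = W.mean(dim=0)
    l_div = (w_bar * (w_bar + eps).log()).sum()
    return l_ent + l_div  # third loss value L_IM = L_ent + L_div
```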
In summary, the first loss value, the second loss value, and the third loss value may be obtained, and then, network parameters of the feature extraction network, the dynamic module network, and the classifier network may be adjusted based on the first loss value, the second loss value, and the third loss value, to obtain an adjusted living body detection model.
For example, a target loss value is determined based on the first loss value and the third loss value, and network parameters of the feature extraction network, the dynamic module network and the classifier network in the initial living body detection model are adjusted based on the target loss value, so as to obtain an intermediate living body detection model. And then, adjusting network parameters of the feature extraction network, the dynamic module network and the classifier network in the intermediate living detection model based on the second loss value to obtain an adjusted living detection model.
When the target loss value is adopted to adjust the network parameters of the feature extraction network, the dynamic module network and the classifier network in the initial living body detection model, a gradient descent method can be adopted to adjust, and other methods can be adopted to adjust, so that the method is not limited.
For example, the adjustment by the gradient descent method may be as shown in formula (5), reconstructed here from the parameter definitions that follow:

θ_D′ = θ_D − β · ∇_{θ_D} (L_Cls(S) + μ · L_IM)    (5)

In formula (5), β represents the learning rate and is a pre-configured known value; μ represents the weight of the information maximization loss value L_IM and is also a pre-configured known value; ∇_{θ_D} (L_Cls(S) + μ · L_IM) is the gradient computed from the first loss value L_Cls(S) and the third loss value L_IM; and θ_D′ represents the updated network parameters of the dynamic module network. See the above examples for the meanings of the other parameters.
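A sketch of the inner update of formula (5), assuming a plain SGD step on the dynamic module parameters; keeping the computation graph (create_graph=True) is an assumption that would allow the later loss on the disturbance source domain to backpropagate through this update in a meta-learning fashion.

```python
# Hedged sketch of the gradient-descent adjustment in formula (5):
# theta_D' = theta_D - beta * grad(L_Cls(S) + mu * L_IM). beta and mu are
# pre-configured values; names are assumptions.
import torch

def meta_train_update(dynamic_net, l_cls, l_im, beta: float = 1e-3, mu: float = 0.1):
    target = l_cls + mu * l_im  # target loss: first loss + weighted third loss
    grads = torch.autograd.grad(
        target, list(dynamic_net.parameters()), create_graph=True
    )
    # Manual SGD step that returns the updated parameters theta_D'.
    return [p - beta * g for p, g in zip(dynamic_net.parameters(), grads)]
```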
According to the technical scheme of the embodiments of the application, when multi-scene data are limited, a small number of source domain images and a large number of natural images can be used to simulate disturbance source domain images under multiple scenes, thereby enriching the source domain data; training the target living body detection model on a large number of disturbance source domain images improves its domain generalization capability and achieves better domain generalization performance and effect. By designing the dynamic module network, the network parameters can be dynamically adjusted based on the features of each sample, and domain invariant features and domain specific features are learned, so that the living body detection model adapts to the features of each sample and its generalization capability is improved. By optimizing based on meta learning, the conditions of living body detection in practical applications are simulated during optimization, giving the living body detection model good domain generalization. In this way, the living body detection model retains the ability to adaptively adjust its network parameters in unknown and diverse target domains; that is, the target living body detection model can still maintain high accuracy in the target domain, greatly improving the generalization capability of the living body detection model.
Based on the same application concept as the above method, a living body detection device is provided in an embodiment of the present application, and referring to fig. 7, a schematic structural diagram of the living body detection device is shown, where the device may include:
an acquisition module 71 for acquiring a source domain image and a natural image; wherein the source domain image includes a living object and/or a non-living object;
the processing module 72 is configured to perform frequency domain transformation on the source domain image to obtain a source domain spectrogram, perform frequency domain transformation on the natural image to obtain a natural spectrogram, generate a mixed spectrogram based on the source domain spectrogram and the natural spectrogram, and perform frequency domain inverse transformation on the mixed spectrogram to obtain a disturbance source domain image;
the training module 73 is configured to adjust network parameters of an initial living body detection model based on the source domain image and the disturbance source domain image, obtain an adjusted living body detection model, determine a target living body detection model based on the adjusted living body detection model, and perform living body detection on an image to be detected.
Illustratively, when generating a hybrid spectrogram based on the source domain spectrogram and the natural spectrogram, the processing module 72 is specifically configured to: if the source domain spectrogram comprises a source domain amplitude spectrum and a source domain phase spectrum, and the natural spectrogram comprises a natural amplitude spectrum and a natural phase spectrum, generate a disturbance amplitude spectrum based on the source domain amplitude spectrum, a first disturbance coefficient of the source domain amplitude spectrum, the natural amplitude spectrum and a second disturbance coefficient of the natural amplitude spectrum, wherein the sum of the first disturbance coefficient and the second disturbance coefficient is a fixed value and the second disturbance coefficient is determined based on the configured disturbance intensity; and generate the hybrid spectrogram based on the disturbance amplitude spectrum and the source domain phase spectrum. A sketch of this mixing step is shown below.
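The following NumPy sketch mixes the amplitude spectra of two grayscale images with coefficients summing to 1 while keeping the source phase; eta (standing in for the configured disturbance intensity) and the sampling of the second disturbance coefficient are assumptions.

```python
# Hedged sketch of hybrid-spectrogram generation for 2-D grayscale arrays
# (apply per channel for color images). Names and the sampling of the
# second disturbance coefficient are assumptions.
import numpy as np

def perturb_source_image(src: np.ndarray, nat: np.ndarray, eta: float = 0.5):
    src_fft = np.fft.fft2(src)                # frequency domain transform
    nat_fft = np.fft.fft2(nat)
    src_amp, src_phase = np.abs(src_fft), np.angle(src_fft)  # source amplitude/phase spectra
    nat_amp = np.abs(nat_fft)                 # natural amplitude spectrum
    lam2 = np.random.uniform(0.0, eta)        # second disturbance coefficient
    lam1 = 1.0 - lam2                         # coefficients sum to a fixed value (1)
    mixed_amp = lam1 * src_amp + lam2 * nat_amp   # disturbance amplitude spectrum
    hybrid = mixed_amp * np.exp(1j * src_phase)   # hybrid spectrogram
    return np.real(np.fft.ifft2(hybrid))      # disturbance source domain image
```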
Illustratively, the initial living body detection model includes a feature extraction network, a dynamic module network and a classifier network. When adjusting the network parameters of the initial living body detection model based on the source domain image and the disturbance source domain image to obtain the adjusted living body detection model, the training module 73 is specifically configured to: input the source domain image into the feature extraction network to obtain source domain image features; input the source domain image features to the dynamic module network to obtain source domain dynamic features; input the source domain dynamic features into the classifier network to obtain source domain output features; determine a first loss value based on the source domain output features, and determine a target loss value based on the first loss value; adjust the network parameters of the feature extraction network, the dynamic module network and the classifier network based on the target loss value to obtain an intermediate living body detection model; input the disturbance source domain image to the feature extraction network of the intermediate living body detection model to obtain disturbance source domain image features; input the disturbance source domain image features to the dynamic module network of the intermediate living body detection model to obtain disturbance source domain dynamic features; input the disturbance source domain dynamic features into the classifier network of the intermediate living body detection model to obtain disturbance source domain output features; determine a second loss value based on the disturbance source domain output features; and adjust the network parameters of the feature extraction network, the dynamic module network and the classifier network of the intermediate living body detection model based on the second loss value to obtain the adjusted living body detection model. A sketch of this two-stage adjustment is shown after this paragraph.
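A condensed sketch of the two-stage adjustment just described, reusing the first_loss-style cross-entropy and the information_maximization_loss sketch above; the single shared optimizer and the loss weight mu are assumptions.

```python
# Hedged sketch of one training iteration: stage 1 adjusts on the source
# domain batch with the target loss (first + third loss) to obtain the
# intermediate model; stage 2 adjusts the intermediate model on the
# disturbance source domain batch with the second loss.
import torch.nn as nn

ce = nn.CrossEntropyLoss()

def train_iteration(feature_net, dynamic_net, classifier, optimizer,
                    src_x, src_y, pert_x, pert_y, mu: float = 0.1):
    # Stage 1: source domain -> intermediate living body detection model.
    dyn, w = dynamic_net(feature_net(src_x))
    l1 = ce(classifier(dyn), src_y)            # first loss value
    l3 = information_maximization_loss(w)      # third loss value (sketch above)
    optimizer.zero_grad()
    (l1 + mu * l3).backward()                  # target loss value
    optimizer.step()
    # Stage 2: disturbance source domain -> adjusted living body detection model.
    dyn_p, _ = dynamic_net(feature_net(pert_x))
    l2 = ce(classifier(dyn_p), pert_y)         # second loss value
    optimizer.zero_grad()
    l2.backward()
    optimizer.step()
```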
Illustratively, the dynamic module network includes a domain invariant branch network and a domain specific branch network, and the training module 73 is specifically configured to, when the source domain image feature is input to the dynamic module network to obtain the source domain dynamic feature: inputting the source domain image characteristics into a domain invariant branch network to obtain domain invariant characteristics, wherein the domain invariant characteristics are common characteristics of a plurality of domains determined based on the source domain image characteristics; inputting the source domain image characteristics into a domain specific branch network to obtain domain specific characteristics, wherein the domain specific characteristics are personalized characteristics of one domain determined based on the source domain image characteristics; a source domain dynamic feature is generated based on the domain invariant feature and the domain specific feature.
Illustratively, the domain-specific branch network includes a dynamic adapter and K convolution layers, where K is a positive integer greater than 1, and the training module 73 is specifically configured to, when the source domain image feature is input to the domain-specific branch network to obtain the domain-specific feature: inputting the source domain image characteristics to the dynamic adapter, and generating K weight values corresponding to the K convolution layers by the dynamic adapter, wherein the K weight values are in one-to-one correspondence with the K convolution layers; inputting the source domain image features to each convolution layer, and generating convolution features corresponding to the convolution layers by the convolution layers based on the source domain image features and weight values corresponding to the convolution layers; and generating the domain specific features based on the K convolution features corresponding to the K convolution layers.
Illustratively, the training module 73 is specifically configured to, when determining the target loss value based on the first loss value: determining an entropy minimum loss value and a difference regular loss value based on the K weight values; determining a third loss value based on the entropy minimum loss value and the difference regular loss value; the target loss value is determined based on the first loss value and the third loss value.
Illustratively, based on the first loss value, the training module 73 determines the target loss value using the following formula:
L_IM(θ_F, θ_D(X)) = L_ent + L_div

wherein θ_F represents the network parameters of the feature extraction network, and θ_D(X) represents the network parameters of the dynamic module network; L_ent represents the entropy minimum loss value under θ_F and θ_D(X), L_div represents the difference regular loss value under θ_F and θ_D(X), and L_IM represents the third loss value under θ_F and θ_D(X); W_i represents the K weight values corresponding to the i-th source domain image, and W̄ represents the K average weight values corresponding to the N source domain images.
Illustratively, the domain invariant branch network includes a convolution layer, an instance normalization layer and an activation layer. When inputting the source domain image features to the domain invariant branch network to obtain the domain invariant features, the training module 73 is specifically configured to: input the source domain image features to the convolution layer to obtain convolved features; input the convolved features to the instance normalization layer to obtain normalized features; input the normalized features to the activation layer to obtain activated features; and generate the domain invariant feature based on the activated features. A sketch of this branch is shown below.
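A minimal sketch of the domain invariant branch, assuming a 3×3 convolution and ReLU activation; only the instance normalization layer is fixed by the description above, the other layer choices are assumptions.

```python
# Hedged sketch of the domain invariant branch: convolution, instance
# normalization (suppressing instance/domain-specific statistics), activation.
import torch.nn as nn

class DomainInvariantBranch(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.inorm = nn.InstanceNorm2d(channels, affine=True)
        self.act = nn.ReLU(inplace=True)

    def forward(self, feat):
        x = self.conv(feat)   # convolved features
        x = self.inorm(x)     # normalized features
        return self.act(x)    # activated features -> domain invariant feature
```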
Based on the same application concept as the above method, an electronic device is proposed in an embodiment of the present application, and referring to fig. 8, the electronic device includes a processor 81 and a machine-readable storage medium 82, where the machine-readable storage medium 82 stores machine-executable instructions that can be executed by the processor 81; the processor 81 is configured to execute machine executable instructions to implement the in vivo detection method disclosed in the above examples of the present application.
Based on the same application concept as the above method, the embodiments of the present application further provide a machine-readable storage medium, where a number of computer instructions are stored, where the computer instructions can implement the living body detection method disclosed in the above example of the present application when executed by a processor.
The machine-readable storage medium may be any electronic, magnetic, optical, or other physical storage device that can contain or store information, such as executable instructions, data, and the like. For example, the machine-readable storage medium may be: RAM (Random Access Memory), volatile memory, non-volatile memory, flash memory, a storage drive (e.g., a hard drive), a solid state drive, any type of storage disk (e.g., an optical disk or a DVD), or a similar storage medium, or a combination thereof.
The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer entity or by an article of manufacture having some functionality. A typical implementation device is a computer, which may be in the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email device, game console, tablet computer, wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being functionally divided into various units, respectively. Of course, the functions of each element may be implemented in one or more software and/or hardware elements when implemented in the present application.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Moreover, these computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and changes may be made to the present application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc. which are within the spirit and principles of the present application are intended to be included within the scope of the claims of the present application.

Claims (11)

1. A living body detection method, the method comprising:
acquiring a source domain image and a natural image, wherein the source domain image comprises a living object and/or a non-living object;
performing frequency domain transformation on the source domain image to obtain a source domain spectrogram, performing frequency domain transformation on the natural image to obtain a natural spectrogram, generating a mixed spectrogram based on the source domain spectrogram and the natural spectrogram, and performing frequency domain inverse transformation on the mixed spectrogram to obtain a disturbance source domain image;
Adjusting network parameters of an initial living body detection model based on the source domain image and the disturbance source domain image to obtain an adjusted living body detection model, and determining a target living body detection model based on the adjusted living body detection model; the target living body detection model is used for carrying out living body detection on an image to be detected.
2. The method of claim 1, wherein
the generating a hybrid spectrogram based on the source domain spectrogram and the natural spectrogram includes:
if the source domain spectrogram comprises a source domain amplitude spectrum and a source domain phase spectrum, and the natural spectrogram comprises a natural amplitude spectrum and a natural phase spectrum, generating a disturbance amplitude spectrum based on the source domain amplitude spectrum, a first disturbance coefficient of the source domain amplitude spectrum, the natural amplitude spectrum and a second disturbance coefficient of the natural amplitude spectrum; wherein the sum of the first disturbance coefficient and the second disturbance coefficient is a fixed value, and the second disturbance coefficient is determined based on the configured disturbance intensity;
the hybrid spectrogram is generated based on the disturbance amplitude spectrum and the source domain phase spectrum.
3. The method of claim 1, wherein the initial living detection model includes a feature extraction network, a dynamic module network, a classifier network, wherein adjusting network parameters of the initial living detection model based on the source domain image and the perturbation source domain image results in an adjusted living detection model, comprising:
Inputting the source domain image to a feature extraction network to obtain source domain image features; inputting the source domain image characteristics to a dynamic module network to obtain source domain dynamic characteristics; inputting the dynamic characteristics of the source domain into a classifier network to obtain output characteristics of the source domain; determining a first loss value based on the source domain output characteristic and determining a target loss value based on the first loss value; adjusting network parameters of a feature extraction network, a dynamic module network and a classifier network based on the target loss value to obtain an intermediate living body detection model;
inputting the disturbance source domain image to a feature extraction network of an intermediate living body detection model to obtain disturbance source domain image features; inputting the disturbance source domain image characteristics to a dynamic module network of an intermediate living body detection model to obtain disturbance source domain dynamic characteristics; inputting the dynamic characteristics of the disturbance source domain into a classifier network of an intermediate living body detection model to obtain output characteristics of the disturbance source domain; determining a second loss value based on the disturbance source domain output characteristic; and adjusting network parameters of the feature extraction network, the dynamic module network and the classifier network of the intermediate living body detection model based on the second loss value to obtain an adjusted living body detection model.
4. The method of claim 3, wherein
the dynamic module network comprises a domain invariable branch network and a domain specific branch network, and the step of inputting the source domain image characteristic into the dynamic module network to obtain the source domain dynamic characteristic comprises the following steps:
inputting the source domain image characteristics to the domain invariant branch network to obtain domain invariant characteristics; wherein the domain invariant feature is a commonality feature of a plurality of domains determined based on the source domain image features;
inputting the source domain image characteristics to the domain-specific branch network to obtain domain-specific characteristics; wherein the domain-specific feature is a personality feature of a domain determined based on the source domain image feature;
the source domain dynamic feature is generated based on the domain invariant feature and the domain specific feature.
5. The method of claim 4, wherein
the domain-specific branch network comprises a dynamic adapter and K convolution layers, K is a positive integer greater than 1, and the input of the source domain image feature to the domain-specific branch network obtains a domain-specific feature, comprising:
inputting the source domain image characteristics to the dynamic adapter, and generating K weight values corresponding to the K convolution layers by the dynamic adapter, wherein the K weight values are in one-to-one correspondence with the K convolution layers;
Inputting the source domain image features to each convolution layer, and generating convolution features corresponding to the convolution layers by the convolution layers based on the source domain image features and weight values corresponding to the convolution layers;
and generating the domain specific features based on the K convolution features corresponding to the K convolution layers.
6. The method of claim 5, wherein
the determining a target loss value based on the first loss value includes:
determining an entropy minimum loss value and a difference regular loss value based on the K weight values;
determining a third loss value based on the entropy minimum loss value and the difference regular loss value;
the target loss value is determined based on the first loss value and the third loss value.
7. The method of claim 6, wherein
based on the first loss value, the target loss value is determined using the following formula:
L_IM(θ_F, θ_D(X)) = L_ent + L_div

wherein θ_F represents the network parameters of the feature extraction network, and θ_D(X) represents the network parameters of the dynamic module network; L_ent represents the entropy minimum loss value under θ_F and θ_D(X), L_div represents the difference regular loss value under θ_F and θ_D(X), and L_IM represents the third loss value under θ_F and θ_D(X); W_i represents the K weight values corresponding to the i-th source domain image, and W̄ represents the K average weight values corresponding to the N source domain images.
8. The method of claim 4, wherein
the domain invariant branch network comprises a convolution layer, an instance normalization layer and an activation layer, the input of the source domain image characteristics to the domain invariant branch network obtains domain invariant characteristics, and the method comprises the following steps:
inputting the source domain image characteristics to the convolution layer to obtain convolved characteristics;
inputting the convolved features to the example normalization layer to obtain normalized features;
inputting the normalized features to the activation layer to obtain activated features;
the domain invariant feature is generated based on the post-activation feature.
9. A living body detection apparatus, characterized in that the apparatus comprises:
the acquisition module is used for acquiring the source domain image and the natural image; wherein the source domain image includes a living object and/or a non-living object;
the processing module is used for carrying out frequency domain transformation on the source domain image to obtain a source domain spectrogram, carrying out frequency domain transformation on the natural image to obtain a natural spectrogram, generating a mixed spectrogram based on the source domain spectrogram and the natural spectrogram, and carrying out frequency domain inverse transformation on the mixed spectrogram to obtain a disturbance source domain image;
The training module is used for adjusting network parameters of the initial living body detection model based on the source domain image and the disturbance source domain image to obtain an adjusted living body detection model, and determining a target living body detection model based on the adjusted living body detection model; the target living body detection model is used for carrying out living body detection on the image to be detected.
10. The apparatus of claim 9, wherein
the processing module is specifically configured to, when generating a hybrid spectrogram based on the source domain spectrogram and the natural spectrogram: if the source domain spectrogram comprises a source domain amplitude spectrum and a source domain phase spectrum, and the natural spectrogram comprises a natural amplitude spectrum and a natural phase spectrum, generate a disturbance amplitude spectrum based on the source domain amplitude spectrum, a first disturbance coefficient of the source domain amplitude spectrum, the natural amplitude spectrum and a second disturbance coefficient of the natural amplitude spectrum; wherein the sum of the first disturbance coefficient and the second disturbance coefficient is a fixed value, and the second disturbance coefficient is determined based on the configured disturbance intensity; and generate the hybrid spectrogram based on the disturbance amplitude spectrum and the source domain phase spectrum;
the training module adjusts network parameters of the initial living body detection model based on the source domain image and the disturbance source domain image, and is specifically used for obtaining the adjusted living body detection model: inputting the source domain image to a feature extraction network to obtain source domain image features; inputting the source domain image characteristics to a dynamic module network to obtain source domain dynamic characteristics; inputting the dynamic characteristics of the source domain into a classifier network to obtain output characteristics of the source domain; determining a first loss value based on the source domain output characteristic and determining a target loss value based on the first loss value; adjusting network parameters of a feature extraction network, a dynamic module network and a classifier network based on the target loss value to obtain an intermediate living body detection model; inputting the disturbance source domain image to a feature extraction network of an intermediate living body detection model to obtain disturbance source domain image features; inputting the disturbance source domain image characteristics to a dynamic module network of an intermediate living body detection model to obtain disturbance source domain dynamic characteristics; inputting the dynamic characteristics of the disturbance source domain into a classifier network of an intermediate living body detection model to obtain output characteristics of the disturbance source domain; determining a second loss value based on the disturbance source domain output characteristic; adjusting network parameters of a feature extraction network, a dynamic module network and a classifier network of the intermediate living body detection model based on the second loss value to obtain an adjusted living body detection model;
The training module is specifically used for inputting the source domain image characteristics to the dynamic module network to obtain the source domain dynamic characteristics when the dynamic module network comprises a domain invariant branch network and a domain specific branch network: inputting the source domain image characteristics into a domain invariant branch network to obtain domain invariant characteristics, wherein the domain invariant characteristics are common characteristics of a plurality of domains determined based on the source domain image characteristics; inputting the source domain image characteristics into a domain specific branch network to obtain domain specific characteristics, wherein the domain specific characteristics are personalized characteristics of one domain determined based on the source domain image characteristics; generating a source domain dynamic feature based on the domain invariant feature and the domain specific feature;
the training module is specifically configured to, when the domain-specific branch network obtains the domain-specific feature, input the source domain image feature to the domain-specific branch network, where K is a positive integer greater than 1, and the domain-specific branch network comprises a dynamic adapter and K convolution layers: inputting the source domain image characteristics to the dynamic adapter, and generating K weight values corresponding to the K convolution layers by the dynamic adapter, wherein the K weight values are in one-to-one correspondence with the K convolution layers; inputting the source domain image features to each convolution layer, and generating convolution features corresponding to the convolution layers by the convolution layers based on the source domain image features and weight values corresponding to the convolution layers; generating the domain-specific features based on the K convolution features corresponding to the K convolution layers;
The training module is specifically configured to, when determining a target loss value based on the first loss value: determining an entropy minimum loss value and a difference regular loss value based on the K weight values; determining a third loss value based on the entropy minimum loss value and the difference regular loss value; determining the target loss value based on the first loss value and the third loss value;
wherein, based on the first loss value, the training module determines the target loss value using the following formula:
L_IM(θ_F, θ_D(X)) = L_ent + L_div

wherein θ_F represents the network parameters of the feature extraction network, and θ_D(X) represents the network parameters of the dynamic module network; L_ent represents the entropy minimum loss value under θ_F and θ_D(X), L_div represents the difference regular loss value under θ_F and θ_D(X), and L_IM represents the third loss value under θ_F and θ_D(X); W_i represents the K weight values corresponding to the i-th source domain image, and W̄ represents the K average weight values corresponding to the N source domain images;
the training module inputs the source domain image characteristics to the domain invariant branch network to obtain domain invariant characteristics, and the training module is specifically used for: inputting the source domain image characteristics to the convolution layer to obtain convolved characteristics; inputting the convolved features to the example normalization layer to obtain normalized features; inputting the normalized features to the activation layer to obtain activated features; the domain invariant feature is generated based on the post-activation feature.
11. An electronic device, comprising: a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor; the processor is configured to execute machine executable instructions to implement the method of any of claims 1-8.
CN202310466115.9A 2023-04-26 2023-04-26 Living body detection method, device and equipment Pending CN116486493A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310466115.9A CN116486493A (en) 2023-04-26 2023-04-26 Living body detection method, device and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310466115.9A CN116486493A (en) 2023-04-26 2023-04-26 Living body detection method, device and equipment

Publications (1)

Publication Number Publication Date
CN116486493A true CN116486493A (en) 2023-07-25

Family

ID=87219188

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310466115.9A Pending CN116486493A (en) 2023-04-26 2023-04-26 Living body detection method, device and equipment

Country Status (1)

Country Link
CN (1) CN116486493A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117074783A (en) * 2023-10-12 2023-11-17 国网吉林省电力有限公司通化供电公司 Real-time monitoring and early warning method for overheat state of power equipment
CN117074783B (en) * 2023-10-12 2024-01-19 国网吉林省电力有限公司通化供电公司 Real-time monitoring and early warning method for overheat state of power equipment

Similar Documents

Publication Publication Date Title
US10474929B2 (en) Cyclic generative adversarial network for unsupervised cross-domain image generation
Lee et al. Learning debiased representation via disentangled feature augmentation
Zhang Deepfake generation and detection, a survey
Botha et al. Fake news and deepfakes: A dangerous threat for 21st century information security
Sun et al. LSTM for dynamic emotion and group emotion recognition in the wild
Yang et al. One-shot domain adaptation for face generation
US11423307B2 (en) Taxonomy construction via graph-based cross-domain knowledge transfer
Yu et al. An improved steganography without embedding based on attention GAN
CN116486493A (en) Living body detection method, device and equipment
Agarwal et al. Privacy preservation through facial de-identification with simultaneous emotion preservation
CN108062416B (en) Method and apparatus for generating label on map
Rehman et al. Federated self-supervised learning for video understanding
Zobaed et al. Deepfakes: Detecting forged and synthetic media content using machine learning
Arora et al. A review of techniques to detect the GAN-generated fake images
Song et al. Editing out-of-domain gan inversion via differential activations
Le et al. Exploring the asynchronous of the frequency spectra of gan-generated facial images
Shahreza et al. Comprehensive vulnerability evaluation of face recognition systems to template inversion attacks via 3d face reconstruction
Mathews et al. An explainable deepfake detection framework on a novel unconstrained dataset
Deb et al. Use of auxiliary classifier generative adversarial network in touchstroke authentication
CN115731620A (en) Method for detecting counter attack and method for training counter attack detection model
Boutadjine et al. A comprehensive study on multimedia DeepFakes
Singh et al. Deepfake images, videos generation, and detection techniques using deep learning
Berrahal et al. A comparative analysis of fake image detection in generative adversarial networks and variational autoencoders
Zhou et al. Detection-by-simulation: exposing DeepFake via simulating forgery using face reconstruction
CN113688944B (en) Image identification method based on meta-learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination