CN114444727B - Living body detection method and device, electronic device and storage medium - Google Patents

Living body detection method and device, electronic device and storage medium

Info

Publication number
CN114444727B
CN114444727B
Authority
CN
China
Prior art keywords
deep neural
kernel function
neural network
data set
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111680037.XA
Other languages
Chinese (zh)
Other versions
CN114444727A (en)
Inventor
(Name withheld at the inventor's request)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Real AI Technology Co Ltd
Original Assignee
Beijing Real AI Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Real AI Technology Co Ltd filed Critical Beijing Real AI Technology Co Ltd
Priority to CN202111680037.XA priority Critical patent/CN114444727B/en
Publication of CN114444727A publication Critical patent/CN114444727A/en
Application granted granted Critical
Publication of CN114444727B publication Critical patent/CN114444727B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • G06N20/10Machine learning using kernel methods, e.g. support vector machines [SVM]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the application relate to the technical field of machine learning, and provide a living body detection method and device, an electronic device and a storage medium. The method comprises the following steps: constructing a kernel function approximation model, wherein the kernel function approximation model comprises a plurality of deep neural networks, each with a corresponding characteristic value, and each deep neural network together with its characteristic value represents one feature of a target kernel function; the initial parameters and corresponding characteristic values of the deep neural networks differ from one another, and each deep neural network comprises a constraint layer; determining a loss function for each deep neural network according to the target kernel function and a training data set; and updating the parameters and corresponding characteristic value of each deep neural network according to the results of the loss functions computed on the training data set, to obtain a target kernel function approximation model. The method greatly accelerates kernel-method-related algorithms in artificial intelligence, realizes a more efficient and reliable kernel approximation method, and can be used as a feature extractor for various tasks.

Description

Living body detection method and device, electronic device and storage medium
Technical Field
The application relates to the technical field of machine learning, and in particular to a living body detection method and device, an electronic device and a storage medium.
Background
The kernel method is one of the most important and fundamental tools in Machine Learning (ML).
A certain amount of research has accumulated on approximating kernel functions, falling mainly into two areas: one approximates kernels using non-neural projection functions, such as the random Fourier features (RFF) method and the Nyström method; the other deconstructs the kernel with deep neural networks, such as the SpIN method.
However, the existing methods have many disadvantages. For example, the RFF method can handle translation-invariant kernels but not other types of kernels; the Nyström method involves an eigendecomposition whose complexity is cubic in the sample size, and its learned eigenfunctions make out-of-sample extension expensive; the SpIN method breaks the symmetry between eigenfunctions with a masked-gradient approach, and the resulting algorithm must explicitly estimate and store a Jacobian matrix and perform a Cholesky decomposition, so it suffers from inefficiency and instability.
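For context, the random Fourier features method mentioned above can be sketched as follows for an RBF kernel (a standard construction, not part of the patent's method; the sizes and the bandwidth are illustrative assumptions):

```python
import numpy as np

def rff_features(X, n_features=2000, gamma=1.0, seed=0):
    """Random Fourier features z(x) with z(x).z(y) ~ exp(-gamma * ||x - y||^2)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # For exp(-gamma ||x-y||^2) the frequencies are drawn from N(0, 2*gamma*I).
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, n_features))
    b = rng.uniform(0, 2 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

rng = np.random.default_rng(1)
X = rng.normal(size=(5, 3))
Z = rff_features(X)
K_approx = Z @ Z.T                                    # inner products approximate the kernel
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K_exact = np.exp(-1.0 * sq)
err = np.abs(K_approx - K_exact).max()
```

The Monte Carlo error shrinks as the number of random features grows, but the construction only exists for translation-invariant kernels, which is exactly the limitation the text describes.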
Disclosure of Invention
The embodiments of the application provide a living body detection method and device, an electronic device and a storage medium, which greatly accelerate kernel-approximation-related algorithms in artificial intelligence, realize a more efficient and reliable kernel approximation algorithm, and have wide application scenarios and strong practicability.
In a first aspect, an embodiment of the present application provides a living body detection method, including:
constructing a kernel function approximation model, wherein the kernel function approximation model comprises a plurality of deep neural networks and corresponding characteristic values respectively, and each deep neural network and corresponding characteristic value respectively represent one characteristic of the target kernel function; the initial parameters and the corresponding characteristic values of the plurality of deep neural networks are different, and each deep neural network comprises a constraint layer so that the kernel function approximation model meets the constraint condition of approximating a target kernel function;
respectively determining a loss function of each deep neural network according to the target kernel function and the training data set;
and determining the result of the loss function of each deep neural network based on the training data set, and updating the parameters and the corresponding characteristic values of each deep neural network according to the obtained result to obtain a target kernel function approximate model.
In a second aspect, an embodiment of the present application further provides a living body detection apparatus, including:
a construction unit, configured to construct a kernel function approximation model, wherein the kernel function approximation model comprises a plurality of deep neural networks and corresponding characteristic values, each deep neural network and its corresponding characteristic value respectively represent one feature of the target kernel function, the initial parameters and corresponding characteristic values of the deep neural networks differ from one another, and each deep neural network comprises a constraint layer so that the kernel function approximation model satisfies the constraint conditions for approximating the target kernel function;
an assignment unit, configured to respectively determine the loss function of each deep neural network according to the target kernel function and the training data set;
and the training unit is used for determining the result of the loss function of each deep neural network based on the training data set, and updating the parameters and the corresponding characteristic values of each deep neural network according to the obtained result to obtain a final kernel function approximation model.
In a third aspect, this application further provides a computer-readable storage medium, on which a computer-readable program is stored, where the computer-readable program is executed by a processor to implement the method according to any one of the foregoing descriptions.
In a fourth aspect, an embodiment of the present application further provides an electronic device, including: a processor; and a memory arranged to store computer executable instructions that, when executed, cause the processor to perform a method as any one of the preceding.
The embodiment of the application adopts at least one technical scheme which can achieve the following beneficial effects:
the application designs a kernel function approximation model which comprises a plurality of deep neural networks, wherein each deep neural network corresponds to a characteristic value, a constraint layer is additionally arranged in each deep neural network, so that the kernel function approximation model meets constraint conditions of kernel approximation, different model parameters, characteristic values and loss functions are distributed for each deep neural network, so that each deep neural network learns different characteristics of a kernel function respectively, and the kernel function approximation model is obtained through training. The kernel function approximation model obtained by the method can be widely applied to the field of machine learning, greatly accelerates algorithms related to a kernel method in artificial intelligence, and realizes more efficient and reliable kernel approximation; and the device can be used as a feature extractor for various tasks, and has wide application scenes and strong practicability.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 shows a schematic flow diagram of a living body detection method according to an embodiment of the present application;
FIG. 2 shows a schematic diagram of a correspondence of a kernel function approximation model to kernel functions according to an embodiment of the present application;
FIG. 3 shows the result of a classical kernel approximation using an objective kernel function approximation model according to one embodiment of the present application;
FIG. 4 shows the results of an MLP-GP kernel approximation using an objective kernel approximation model according to an embodiment of the present application;
FIG. 5 shows the result of a CNN-GP kernel approximation using a target kernel approximation model according to an embodiment of the present application;
FIG. 6 illustrates the results of an evaluation of the NTKs kernel function on the third test data set using a target kernel function approximation model according to one embodiment of the present application;
FIG. 7 shows a schematic structural diagram of a living body detection device according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of an electronic device in an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail and completely with reference to the following specific embodiments of the present application and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The technical solutions provided by the embodiments of the present application are described in detail below with reference to the accompanying drawings. It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.
To further explain the technical means and effects adopted by the present application to achieve the intended purpose, the embodiments, structures, features and effects according to the present application are described in detail below with reference to the accompanying drawings. In the following description, different references to "one embodiment" or "an embodiment" do not necessarily refer to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Further, although the steps in the embodiments are numbered sequentially, they are not necessarily performed in that order; unless explicitly stated otherwise, they may be performed in other orders. Moreover, at least some of the steps in each embodiment may comprise multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may alternate with other steps or with sub-steps or stages of other steps.
The application of artificial intelligence in daily life is increasingly widespread, for example in human-machine dialogue, image recognition, character recognition and living body detection. Machine learning is one way to implement artificial intelligence; its most basic practice is to use algorithms to parse data, learn from it, and then make decisions and predictions about events in the real world. The kernel method and Deep Neural Networks (DNNs) are the most important and fundamental tools in Machine Learning (ML), opening the door to processing complex nonlinear data.
Living body detection is usually based on feature extraction: for example, texture features of a human face are extracted, and the result of the living body detection is then determined from those texture features. The approximate kernel method is an effective means of processing nonlinear data in feature extraction algorithms.
In the prior art, certain research results have been accumulated on methods for approximating a kernel (also referred to as kernel approximation methods), which fall mainly into two areas: one approximates the kernel using non-neural projection functions, such as the random Fourier features (RFF) method and the Nyström method; the other deconstructs the kernel with deep neural networks, such as the SpIN method.
More research has been devoted to approximating kernels with non-neural projection functions, with random Fourier features (RFF) and the Nyström method being two popular approaches. RFF can handle translation-invariant kernels but not other types of kernels. The Nyström method performs an eigendecomposition on a partial kernel matrix obtained by subsampling the training data, and then constructs an approximation to the first k eigenfunctions of the kernel. However, the eigendecomposition has complexity cubic in the sample size, and the learned eigenfunctions make out-of-sample extension expensive, especially in the face of modern kernel functions such as neural network Gaussian process (NN-GP) kernels and neural tangent kernels (NTKs).
Owing to the strong expressive power and scalability of deep neural networks, the problems encountered by the RFF and Nyström methods can be alleviated by deconstructing the kernel with deep neural networks. SpIN is a prior-art method in this spirit. In detail, SpIN breaks the symmetry between eigenfunctions with a masked-gradient approach, and the resulting algorithm must explicitly estimate and store the Jacobian matrix and perform a Cholesky decomposition, so it suffers from inefficiency and instability and is difficult to deploy in practice.
In this regard, the present application finds that "the kernel can be feature-decomposed by simultaneously solving a series of asymmetric optimization problems", and based on this finding, the present application uses deep neural networks to approximate the eigenfunctions of the kernel.
Fig. 1 shows a schematic flow chart of a method for detecting a living body according to an embodiment of the present application, and as can be seen from fig. 1, the present application at least includes steps S110 to S130:
step S110: constructing a kernel function approximate model, wherein the kernel function approximate model comprises a plurality of deep neural networks and corresponding characteristic values respectively, and each deep neural network and the corresponding characteristic value respectively represent one characteristic of the target kernel function; the initial parameters and the corresponding characteristic values of the plurality of deep neural networks are different, and each deep neural network comprises a constraint layer so that the kernel function approximation model meets the constraint condition of approximating a target kernel function.
The application discovers, through research, that feature decomposition can be performed on the kernel κ(x, x') by simultaneously solving a series of asymmetric optimization problems; the specific mathematical expression is:

$$\max_{\psi_j} \; R_{jj} - \sum_{i=1}^{j-1} \frac{R_{ij}^2}{R_{ii}}, \qquad j = 1, \dots, k \tag{1}$$

where $\psi_j$ represents the introduced characteristic function, i.e., a deep neural network, and

$$R_{ij} = \mathbb{E}_{x}\,\mathbb{E}_{x'}\big[\psi_i(x)\,\kappa(x, x')\,\psi_j(x')\big] \tag{2}$$

$$C_j = \mathbb{E}_{x}\big[\psi_j(x)^2\big] = 1 \tag{3}$$

The pair $(\psi_j, R_{jj})$ will converge to the feature pair associated with the j-th largest eigenvalue of the kernel function κ(x, x'), where a feature pair refers to a deep neural network and the corresponding characteristic value. It should be noted that in the present application the set of real numbers to which the kernel function maps is denoted $\mathbb{R}$; the same applies below.
The proof of the above finding is not detailed here. The finding can be simply understood as follows: the application converts the kernel approximation problem into a parameter optimization problem, and constructs a kernel function approximation model into which deep neural networks are introduced to approximate the kernel function.
Specifically, FIG. 2 is a schematic diagram illustrating the correspondence between a kernel function approximation model and a kernel function according to an embodiment of the present application. As can be seen from FIG. 2, the target kernel function can be decomposed into a plurality of feature pairs: the application introduces k deep neural networks to capture the first k characteristic functions of the kernel function κ(x, x'). A characteristic function of the kernel represented by a deep neural network together with its corresponding characteristic value may be referred to as a Neural Eigenfunction, or simply NeuralEF. It can be simply understood that the kernel function κ(x, x') is decomposed into k characteristic functions, each of which comprises a characteristic value μ_j and a function ψ_j, where ψ_j is not given in the form of a closed mathematical expression but is learned by a deep neural network.
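The decomposition into the first k feature pairs can be checked numerically on a finite sample: eigendecomposing the Gram matrix of a kernel and truncating to the top-k eigenpairs gives a Mercer-style approximation κ(x, x') ≈ Σ_j μ_j ψ_j(x) ψ_j(x'). A small NumPy sketch (the RBF kernel, data sizes and k are arbitrary illustrative choices, not taken from the patent):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
sq = ((X[:, None] - X[None, :]) ** 2).sum(-1)
K = np.exp(-0.5 * sq)                      # RBF Gram matrix on the sample

mu, U = np.linalg.eigh(K)                  # eigh returns ascending eigenvalues
mu, U = mu[::-1], U[:, ::-1]               # sort descending: mu[0] is the largest
k = 20
K_k = (U[:, :k] * mu[:k]) @ U[:, :k].T     # rank-k reconstruction from top-k pairs
rel_err = np.linalg.norm(K - K_k) / np.linalg.norm(K)
```

Because the RBF spectrum decays quickly, a small k already reconstructs the Gram matrix closely; the patent's contribution is learning such eigenpairs with networks so they extend beyond the fixed sample.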
The number of deep neural networks is configurable: as many deep neural networks are set up as there are features of the kernel function to be captured.
The structure of each deep neural network is not limited in this application; deep neural networks of common architectures can realize the application. The structures of the deep neural networks may be the same or different, though using the same structure for all of them is recommended.
However, in order to learn different features of the target kernel function, the initial parameters and corresponding characteristic values of the deep neural networks differ from one another. When initializing the kernel function approximation model, the initial parameters and corresponding characteristic value of each deep neural network may be randomly generated, as long as they are guaranteed to differ across networks.
In order for the kernel function approximation model to satisfy the constraint conditions (i.e., equations (2) and (3)) for approximating the target kernel function, a constraint layer is added to each deep neural network so as to satisfy the above conditions.
Step S120: and respectively determining the loss function of each deep neural network according to the target kernel function and the training data set.
The loss function also differs for each deep neural network; before each round of training, the loss function of each deep neural network is determined in preparation for model optimization.
Specifically, an evaluation matrix of the target kernel function can be determined from the target kernel function on the training data set; the evaluation matrix is then substituted into the loss-function expression of each deep neural network to obtain its loss function. The loss function takes the form of a mathematical expression, and because the loss-function expression of each deep neural network is different, the resulting loss functions differ as well.
Step S130: and determining the result of the loss function of each deep neural network based on the training data set, and updating the parameters and the corresponding characteristic values of each deep neural network according to the obtained result to obtain a target kernel function approximate model.
The constructed kernel function approximation model is then trained on the training data set. The training process is similar to the prior art: data are input, the loss functions are calculated, and the parameters of each deep neural network are updated according to the results of the loss functions. In each iteration, target training data are selected from the training data set according to the configured sampling scale, and the kernel function approximation model is trained iteratively until a preset number of training rounds or a preset stopping condition is reached; training then ends, and the updated kernel function approximation model is taken as the target kernel function approximation model.
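The round-by-round procedure just described can be illustrated with a toy sketch. This is an assumption-laden simplification, not the patent's implementation: each "network" is reduced to a free vector of function values over one fixed batch, and the target kernel is an RBF Gram matrix, so the loop normalize → compute Rayleigh quotients → gradient step can run in plain NumPy:

```python
import numpy as np

def neuralef_toy(K, k=3, iters=3000, lr=2.0, seed=0):
    """Toy training loop: psi[:, j] plays the role of the j-th network's
    outputs over the B sample points behind the Gram matrix K."""
    B = K.shape[0]
    rng = np.random.default_rng(seed)
    psi = rng.normal(size=(B, k))
    for _ in range(iters):
        # constraint layer: enforce (1/B) * sum_b psi_j(x_b)^2 = 1
        psi = psi * np.sqrt(B) / np.linalg.norm(psi, axis=0, keepdims=True)
        R = psi.T @ K @ psi / B**2            # batched Rayleigh-quotient matrix
        grad = -2.0 * (K @ psi) / B**2        # gradient of -R_jj w.r.t. psi_j
        for j in range(k):
            for i in range(j):                # penalty pushes psi_j away from psi_i
                grad[:, j] += 2.0 * (R[i, j] / R[i, i]) * (K @ psi[:, i]) / B**2
        psi = psi - lr * grad
    psi = psi * np.sqrt(B) / np.linalg.norm(psi, axis=0, keepdims=True)
    eigvals = np.diag(psi.T @ K @ psi / B**2)
    return psi, eigvals
```

With an RBF Gram matrix, the first recovered vector aligns with the top eigenvector of K, matching the claim that each feature pair converges to an eigenpair; the real method replaces the free vectors with deep networks so the eigenfunctions generalize off the batch.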
It should be noted that, apart from specifying necessary parameters such as the sampling scale, each training run only requires one target kernel function and one training data set with a small number of samples.
The target kernel function may be any kernel function in the prior art. In some embodiments of the present application, the choice of target kernel function may be associated with the scenario in which the kernel function approximation model is to be applied; for example, if the model is to be used for image feature extraction, a classical kernel function or a currently popular NN-GP kernel function may be selected.
By the method shown in FIG. 1, a kernel function approximation model is designed that comprises a plurality of deep neural networks, where each deep neural network corresponds to a characteristic value and contains an additional constraint layer so that the model satisfies the constraint conditions of kernel approximation. Different model parameters, characteristic values and loss functions are assigned to each deep neural network so that each network learns a different feature of the kernel function, and the kernel function approximation model is obtained after training. The resulting model can be widely applied in the field of machine learning, greatly accelerates kernel-method-related algorithms in artificial intelligence, and realizes more efficient and reliable kernel approximation; it can also be used as a feature extractor for various tasks, with wide application scenarios and strong practicability.
In some embodiments of the present application, the constraint layer of each of the deep neural networks is disposed at the last layer of each of the deep neural networks as an output layer of each of the deep neural networks; and setting a constraint factor in the constraint layer of each deep neural network for constraining the output of each deep neural network, wherein the constraint factor is determined according to the sampling scale and the output of the previous layer of the constraint layer.
The basic structure of a deep neural network comprises an input layer, at least one hidden layer and an output layer connected in sequence. The constraint layer of each deep neural network is arranged after the output layer and serves as the final output layer: the result produced by the original output layer, such as a vector or a matrix, enters the constraint layer for further processing and is then emitted as the final output of the network. The constraint layer is provided so that the constructed kernel function approximation model satisfies the constraint conditions identified above, namely equations (2) and (3).
Specifically, a constraint factor σ is set in the constraint layer, whose specific form is:

$$\psi_j(x_b) = \frac{h_j(x_b)}{\sigma_j}, \qquad \sigma_j = \sqrt{\frac{1}{B} \sum_{b=1}^{B} h_j(x_b)^2} \tag{4}$$

where $h_j(x_b)$ represents the input to the constraint layer, i.e., the output of the original output layer of the deep neural network, and $\psi_j(x_b)$ represents the output of the constraint layer. Here σ_j is the constraint factor and B is the sampling scale, which is specified during training; the value of the constraint factor is determined from the sampling scale B and the outputs of the preceding layer, as in equation (4). The inputs and outputs are processed in batches, and this definition circumvents the obstacles to optimization under an explicit normalization constraint.
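A minimal sketch of such a constraint layer (a NumPy batch-level view; the shapes and the raw-output distribution are assumptions for illustration):

```python
import numpy as np

def constraint_layer(h):
    """Scale a network's raw batch output h (shape [B]) by
    sigma = sqrt((1/B) * sum_b h(x_b)^2), so that the constrained
    output psi satisfies (1/B) * sum_b psi(x_b)^2 = 1."""
    sigma = np.sqrt(np.mean(h ** 2))
    return h / sigma

rng = np.random.default_rng(0)
h = rng.normal(loc=3.0, scale=2.0, size=128)   # hypothetical raw outputs over a batch
psi = constraint_layer(h)                      # batch-normalized outputs
```

The division by a batch statistic makes the normalization constraint hold by construction, instead of enforcing it as an explicit optimization constraint.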
In some embodiments of the present application, determining the loss function of each deep neural network according to the target kernel function and the training data set includes: determining an evaluation matrix of the target kernel function according to the target kernel function and the training data set; and substituting the evaluation matrix into the loss-function expression of each deep neural network to determine its loss function, wherein the loss-function expression of each deep neural network is determined from the difference between the generalized Rayleigh quotient of the target kernel function and the penalty term of each deep neural network under different constraint conditions.
The loss function of each deep neural network is different: it is the difference between the generalized Rayleigh quotient of the target kernel function and the penalty term of each deep neural network under different constraint conditions, as shown in equation (5):

$$\ell_j = -R_{jj} + \sum_{i=1}^{j-1} \frac{R_{ij}^2}{\operatorname{sg}(R_{ii})} \tag{5}$$

where $R_{jj}$ is the generalized Rayleigh quotient of the target kernel function, the summation is the penalty term of each neural network, and $R_{ij}$ is as expressed in equation (7).

Expanding equation (5) yields a loss-function expression of the form of equation (6):

$$\ell(\omega_1, \dots, \omega_k) = \sum_{j=1}^{k} \left[ -\frac{1}{B^2}\, \psi_j^{X} \kappa_{X,X} (\psi_j^{X})^{T} + \sum_{i=1}^{j-1} \frac{\left(\frac{1}{B^2}\, \operatorname{sg}(\psi_i^{X})\, \kappa_{X,X} (\psi_j^{X})^{T}\right)^{2}}{\operatorname{sg}\!\left(\frac{1}{B^2}\, \psi_i^{X} \kappa_{X,X} (\psi_i^{X})^{T}\right)} \right] \tag{6}$$

where ℓ is the loss function of the networks; ω_1, …, ω_k represent the parameters of the deep neural networks; k represents the number of deep neural networks; sg denotes stopping the gradient; $\kappa_{X,X}$ represents the evaluation matrix of the target kernel function, computed on the training data as $\kappa_{X,X} = \kappa(X, X)$; $\psi_j^{X}$ denotes the vector of outputs of the j-th network on the training data X, as in equation (2), and $(\psi_j^{X})^{T}$ is its transpose.
For a given target kernel function and training data set, the evaluation matrix of the target kernel function is determined. The loss-function expressions of the deep neural networks differ, and as can be seen from equation (6), the loss function of each deep neural network is obtained by substituting the evaluation matrix of the target kernel function into its loss-function expression.
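Under batch notation, the loss described above can be sketched as follows. This is a NumPy illustration under assumed shapes; plain NumPy has no autodiff, so the stop-gradient is implicit, and the structure of the terms follows the difference "Rayleigh quotient minus penalty":

```python
import numpy as np

def neuralef_loss(psi, K):
    """Sum over j of [-R_jj + sum_{i<j} R_ij^2 / R_ii] for one batch.
    psi: [B, k] constrained network outputs; K: [B, B] kernel evaluations."""
    B, k = psi.shape
    R = psi.T @ K @ psi / B ** 2        # batched estimates of R_ij
    loss = 0.0
    for j in range(k):
        loss -= R[j, j]                  # maximize the Rayleigh quotient
        for i in range(j):
            loss += R[i, j] ** 2 / R[i, i]   # penalize overlap with earlier pairs
    return loss
```

A useful sanity check: if the columns of psi are exact (constraint-normalized) top eigenvectors of K, the penalty terms vanish and the loss reduces to minus the sum of the corresponding eigenvalues divided by B.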
In some embodiments of the application, the determining, based on the training data set, a result of the loss function of each deep neural network, and updating, according to the obtained result, a parameter and a corresponding feature value of each deep neural network includes: randomly selecting target training data from the training data set; inputting the target training data into the kernel function approximation model, and enabling each deep neural network to propagate forwards to determine a generalized Rayleigh quotient of each deep neural network; updating the characteristic value corresponding to each deep neural network according to the determined generalized Rayleigh quotient of each deep neural network; solving the derivative value of the loss function of each deep neural network according to the target training data; and updating the model parameters of each deep neural network according to the derivative values.
In the process of model training, the model of each deep neural network needs to be optimized, and the characteristic values corresponding to each deep neural network need to be optimized, and the approximate flow of the training process of one round can be simply stated as follows: selecting target training data, inputting a model to calculate to obtain a generalized Rayleigh quotient of each deep neural network, updating a characteristic value of each deep neural network, calculating a derivative of a loss function based on the target training data, updating model parameters of each deep neural network according to the derivative value, ending training of one round, and entering iteration of the next round.
Specifically, the target training data may be obtained by randomly sampling in the training data set according to a specified sampling scale B.
Then the sampled target training data is input into the kernel function approximation model so that each deep neural network propagates forward, and the generalized Rayleigh quotient of each deep neural network is calculated. It should be noted that the generalized Rayleigh quotient of each deep neural network is an approximate value obtained by an approximation means, and its specific expression is as follows:

R̂_j = (1/B²) · Σ_{b=1..B} Σ_{b'=1..B} f_j(x_b) · κ(x_b, x_{b'}) · f_j(x_{b'})  (7)

where R̂_j represents the approximation of the generalized Rayleigh quotient of the j-th deep neural network; x_b and x_{b'} represent the target training data, and the other parameters are as before. In formula (7), f_j(·) is the constrained output produced by the constraint layer of each deep neural network, which limits the scale of the outputs.
After the generalized Rayleigh quotient of each deep neural network is obtained, the characteristic value corresponding to each deep neural network is updated. Specifically, the characteristic value is approximated by an Exponential Moving Average (EMA) of the generalized Rayleigh quotient,

λ̂_j ← β · λ̂_j + (1 − β) · R̂_j,

thus obtaining an estimate λ̂_j of the j-th characteristic value; this estimated value is used to replace the original characteristic value.
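The exponential-moving-average update of the characteristic value can be sketched as follows; the decay rate and the per-round Rayleigh-quotient values are illustrative assumptions, not values taken from the patent.

```python
beta = 0.9              # assumed EMA decay rate; the text does not specify it
lam_hat = 0.0           # running estimate of the j-th characteristic value
rayleigh_per_round = [0.8, 0.9, 1.1, 1.0, 0.95]  # illustrative per-round quotients
for R in rayleigh_per_round:
    # each round's approximate Rayleigh quotient is folded into the estimate
    lam_hat = beta * lam_hat + (1 - beta) * R
print(round(lam_hat, 6))
```

The EMA smooths the batch-to-batch noise of the approximate Rayleigh quotients, so the stored characteristic value drifts toward their recent average rather than jumping with each batch.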
Then, the derivative of the loss function of each deep neural network is calculated. In the prior art, during iterative training, the value of the loss function is usually calculated first, and the model parameters are then updated according to that value. In the present application, the value of the loss function serves only as a reference: the derivative of the loss function is calculated directly based on the target training data, and the model parameters of each deep neural network are updated according to the derivative values, which achieves faster convergence. The derivative of the loss function is given by expression (8), where the meanings of the parameters are as described above.
The kernel function approximation model trained by the above method has wide applicability: it is suitable for classical kernel functions and also fits currently popular modern kernel functions well. The application effect of the kernel function approximation model obtained by the method of the present application on different target kernel functions is described below.
In some embodiments of the present application, the kernel function approximation model is applied to classical kernel functions and compared with the SpIN and Nyström methods. Specifically, referring to fig. 3, fig. 3 shows the result of approximating classical kernel functions with a target kernel function approximation model according to an embodiment of the present application. As can be seen from fig. 3, this embodiment approximates two different classical kernel functions with the three methods, where the first target kernel function is the polynomial kernel function k(x, x') = (xᵀx' + 1.5)⁴ and the second target kernel function is an RBF kernel function. It can be seen from the graphs that the present application achieves an effect consistent with the Nyström method, but the training time of the present application is almost constant across training data sets of different sizes, while that of the Nyström method increases significantly as the number of samples grows, so the Nyström method incurs a higher time cost. It can also be seen from fig. 3 that the present application is superior to the SpIN method in both approximation quality and efficiency.
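The two classical target kernels named above can be evaluated directly; a brief numpy sketch follows (the RBF bandwidth is an assumed value, as the text does not specify one):

```python
import numpy as np

def poly_kernel(X1, X2, c=1.5, degree=4):
    # the first target kernel from the text: k(x, x') = (x^T x' + 1.5)^4
    return (X1 @ X2.T + c) ** degree

def rbf_kernel(X1, X2, gamma=1.0):
    # the second target kernel; the bandwidth gamma is an assumed value
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

X = np.random.default_rng(0).normal(size=(5, 2))
Kp, Kr = poly_kernel(X, X), rbf_kernel(X, X)
print(np.allclose(Kp, Kp.T), np.allclose(Kr, Kr.T))  # kernel matrices are symmetric
```

Both evaluations produce symmetric Gram matrices, which is the property the approximation model must preserve through its eigenfunction expansion.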
The application is not only suitable for classical kernel functions but also performs well on modern kernel functions, such as the NN-GP and NTKs kernel functions, where the NN-GP kernel functions include the MLP-GP kernel function and the CNN-GP kernel function.
In some embodiments of the present application, the target kernel function is an MLP-GP kernel function; each of the deep neural networks has a first specified structure including a first specified number of layers and a first specified width; after obtaining the approximate model of the objective kernel function, the method further comprises: performing unsupervised feature extraction on the first test data set by using the obtained target kernel function approximation model; and projecting the first test data set to a multi-dimensional space according to a feature extraction result so as to classify or cluster the first test data set.
Referring to fig. 4, fig. 4 shows a result of approximating an MLP-GP kernel function by using a target kernel function approximation model according to an embodiment of the present application, where an NN-GP kernel function associated with an MLP architecture is an MLP-GP kernel function.
In this embodiment, the number of designated deep neural networks is 3, the number of layers of each deep neural network (the first designated number of layers) is 3, and the width of each deep neural network (the first designated width) is 32; the kernel function approximation model is constructed and trained accordingly. The first test data set includes 1000 two-dimensional data points from a "two-moons"- or "circles"-shaped data set.
Unsupervised feature extraction is performed on the first test data set with the trained target kernel function approximation model, and the first test data set is projected into a three-dimensional space according to the feature extraction result, giving the classification or clustering result shown in fig. 4. As can be seen from fig. 4, the target kernel function approximation model produces a result similar to that of the Nyström method, and the projected data becomes linearly separable.
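The pipeline of this experiment — label-free kernel features followed by a low-dimensional projection — can be sketched with a direct eigendecomposition standing in for the trained approximation model; the data generator and kernel bandwidth below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
# toy "two-moons"-like data standing in for the first test data set
t = rng.uniform(0.0, np.pi, 200)
upper = np.c_[np.cos(t), np.sin(t)]
lower = np.c_[1.0 - np.cos(t), 0.5 - np.sin(t)]
X = np.vstack([upper, lower]) + rng.normal(scale=0.05, size=(400, 2))

# kernel eigendecomposition as a stand-in for the trained approximation model
K = np.exp(-((X[:, None] - X[None]) ** 2).sum(-1) / 0.1)
vals, vecs = np.linalg.eigh(K)
# project each point onto the top-3 spectral directions (3-D feature space)
features = vecs[:, -3:] * np.sqrt(np.maximum(vals[-3:], 0.0))
print(features.shape)
```

After such a projection, points from the two moons typically become linearly separable, which is the qualitative behavior fig. 4 reports for the trained model.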
In some embodiments of the present application, the target kernel function is a CNN-GP kernel function; each of the deep neural networks has a second specified structure; after obtaining the approximation model of the objective kernel function, the method further comprises: performing unsupervised extraction of discriminant image features on the second test data set by using the obtained target kernel function approximation model; and projecting the second test data set to a multidimensional space according to the feature extraction result so as to classify or cluster the second test data set.
Referring to fig. 5, fig. 5 illustrates a result of approximating a CNN-GP kernel function with a target kernel function approximation model according to an embodiment of the present application, where an NN-GP kernel function associated with a CNN (convolutional neural network) is the CNN-GP kernel function.
In this embodiment, the number of designated deep neural networks is 10; kernel function approximation models are constructed using deep neural networks with two different structures, and the corresponding target kernel function approximation models are obtained through training. The second test data set is the MNIST test image set.
Wherein, the structure of the first deep neural network sequentially includes: a first conv & ReLU layer, a second conv & ReLU layer, a flatten layer, a linear & ReLU layer, and a linear layer, where the parameters of the first conv & ReLU layer are: convolution size 3 × 3, stride 2, padding 1; the parameters of the second conv & ReLU layer are: convolution size 3 × 3, stride 2, padding 0; and the parameter of the linear layer is: out_channels = 1.
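Assuming 28 × 28 MNIST inputs, the spatial sizes produced by this first structure follow from the standard convolution output-size formula:

```python
def conv_out(n, k=3, stride=1, padding=0):
    # standard convolution output-size formula: floor((n + 2p - k) / s) + 1
    return (n + 2 * padding - k) // stride + 1

s = conv_out(28, k=3, stride=2, padding=1)  # first conv & ReLU layer
s = conv_out(s, k=3, stride=2, padding=0)   # second conv & ReLU layer
print(s)  # spatial size entering the flatten layer -> 6
```

So each 28 × 28 image is reduced to a 6 × 6 feature map before the flatten, linear & ReLU, and linear (out_channels = 1) layers collapse it to a single eigenfunction output.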
The method adopts 10 deep neural networks of the first structure to construct a kernel function approximation model, and 2000 rounds of training are performed on the MNIST training image set to obtain the target kernel function approximation model.
Then, feature extraction is performed on the MNIST test image set with the target kernel function approximation model, the features are projected into a three-dimensional space, and logistic regression is used to evaluate the discriminative power of the complete (ten-dimensional) projection and report the test accuracy, giving the result shown in fig. 5-a. As can be seen from fig. 5-a, the logistic regression test accuracy reaches 77.60%.
The structure of the second deep neural network sequentially includes: a first conv & ReLU layer, a second conv & ReLU layer, a flatten layer, a linear & ReLU layer, and a linear layer, where the parameters of the first conv & ReLU layer are: convolution size 3 × 3, stride 1, padding 1, maxpool 2 × 2; the parameters of the second conv & ReLU layer are: convolution size 3 × 3, stride 1, padding 0, maxpool 2 × 2; and the parameter of the linear layer is: out_channels = 1.
The method adopts 10 deep neural networks of the second structure to construct a kernel function approximation model, and performs 2000 rounds of training on the MNIST training image set to obtain the target kernel function approximation model.
Then, feature extraction is performed on the MNIST test image set with the target kernel function approximation model, the features are projected into a three-dimensional space, and logistic regression is used to evaluate the discriminative power of the complete (ten-dimensional) projection and report the test accuracy, giving the result shown in fig. 5-b. As can be seen from fig. 5-b, the logistic regression test accuracy reaches 82.35%.
Meanwhile, the method is compared with the Nyström method; by calculation, the logistic regression test accuracy obtained by the Nyström method is 77.81%.
In some embodiments of the present application, the target kernel function is an NTKs kernel function; each of the deep neural networks has a third prescribed structure; after obtaining the approximation model of the objective kernel function, the method further comprises: performing feature extraction on the third test data set by using the obtained target kernel function approximate model; and according to the feature extraction result, approximating the evaluation result of the target kernel function on the third test data set.
Referring to fig. 6, fig. 6 illustrates the results of approximating NTKs kernel functions using a target kernel function approximation model on a third test data set according to one embodiment of the present application. The third test data set was 128 CIFAR-10 test images.
In this embodiment, 10 deep neural networks are adopted, the characteristic value corresponding to each deep neural network is set to the corresponding ground-truth eigenvalue, a kernel function approximation model is constructed, and the target kernel function approximation model is obtained through training.
Feature extraction is performed on the 128 CIFAR-10 test images with the target kernel function approximation model, and the evaluation result of the target kernel function on the third test data set is approximated according to the feature extraction result; that is, the kernel function is reconstructed and compared with the Random Feature approach (a prior-art method), giving the result shown in fig. 6. Fig. 6-(a) shows the evaluation result obtained by the present application, fig. 6-(b) shows the evaluation result obtained by the random feature method, and fig. 6-(c) shows the true value. Combining fig. 6-(a) to fig. 6-(c), it can be seen that the eigenfunctions of the NTKs kernel function can be approximated with the target kernel function approximation model, and the NTKs kernel function can be reconstructed efficiently and accurately.
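Reconstructing a kernel from (eigenvalue, eigenfunction) pairs follows the spectral expansion k ≈ Σ_j λ_j ψ_j ψ_j^T. A toy numpy sketch on a small synthetic positive semi-definite matrix (not the CIFAR-10 setup of the text):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(8, 8))
K = A @ A.T                       # a small PSD "ground truth" kernel matrix

vals, vecs = np.linalg.eigh(K)
lam, psi = vals[-5:], vecs[:, -5:]   # top-5 (eigenvalue, eigenfunction) pairs
K_hat = (psi * lam) @ psi.T          # reconstruction: K ≈ sum_j lam_j psi_j psi_j^T
rel_err = np.linalg.norm(K - K_hat) / np.linalg.norm(K)
print(0.0 <= rel_err < 1.0)
```

Truncating to the top eigenpairs always leaves a reconstruction error strictly smaller than the matrix norm itself, and the error shrinks as more (λ_j, ψ_j) pairs — here, more deep neural networks — are used.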
Fig. 7 shows a schematic structural diagram of a living body detection device according to an embodiment of the present application. The device 700 includes:
a constructing unit 710, configured to construct a kernel function approximation model, where the kernel function approximation model includes multiple deep neural networks and corresponding eigenvalues, each of the deep neural networks and the corresponding eigenvalue represents a feature of the target kernel function, initial parameters and corresponding eigenvalues of the multiple deep neural networks are different, and each of the deep neural networks includes a constraint layer, so that the kernel function approximation model satisfies a constraint condition of approximating the target kernel function;
an assigning unit 720, configured to determine a loss function of each deep neural network according to the target function and the training data set;
the training unit 730 is configured to determine a result of the loss function of each deep neural network based on the training data set, and update the parameter and the corresponding feature value of each deep neural network according to the obtained result, so as to obtain a final kernel function approximation model.
In some embodiments of the present application, in the apparatus, the constraint layer of each of the deep neural networks is disposed at a last layer of each of the deep neural networks as an output layer of each of the deep neural networks; and setting a constraint factor in a constraint layer of each deep neural network for constraining the output of each deep neural network, wherein the constraint factor is determined according to the sampling scale and the output of the previous layer of the constraint layer.
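One way to realize such a constraint factor — determined by the sampling scale B and the output of the previous layer — is an L2 rescaling of the batch output; the sketch below is an assumed form for illustration, not necessarily the exact expression used by the application.

```python
import numpy as np

def constraint_layer(h):
    # scale the previous layer's batch output h (shape [B]) so that
    # ||output||^2 = B; one possible realization of the constraint factor
    B = h.shape[0]
    return h * np.sqrt(B) / (np.linalg.norm(h) + 1e-12)

h = np.random.default_rng(1).normal(size=64)
out = constraint_layer(h)
print(round(float(out @ out), 4))  # -> 64.0, i.e. the sampling scale B
```

Placing this rescaling as the last layer keeps every network's output at a fixed batch norm, so the generalized Rayleigh quotients of the different networks remain comparable during joint training.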
In some embodiments of the present application, in the above apparatus, the assigning unit 720 is configured to determine an evaluation matrix of the target kernel function according to the target function and the training data set; and substituting the evaluation matrix into the loss function expression of each deep neural network to determine the loss function of each deep neural network, wherein the loss function expression of each deep neural network is determined according to the difference between the generalized Rayleigh quotient of the target kernel function and the penalty coefficient of each deep neural network under different constraint conditions.
In some embodiments of the present application, in the above apparatus, the training unit 730 is configured to randomly select target training data from the training data set; inputting the target training data into the kernel function approximation model, and enabling each deep neural network to propagate forwards to determine a generalized Rayleigh quotient of each deep neural network; updating the characteristic value corresponding to each deep neural network according to the determined generalized Rayleigh quotient of each deep neural network; solving the derivative value of the loss function of each deep neural network according to the target training data; and updating the model parameters of each deep neural network according to the derivative values.
In some embodiments of the present application, in the above apparatus, the target kernel function is an MLP-GP kernel function; each of the deep neural networks has a first specified structure including a specified number of layers and a first specified width; the device further comprises: the test unit is used for performing unsupervised feature extraction on the first test data set by adopting the obtained target kernel function approximate model after the target kernel function approximate model is obtained; and projecting the first test data set to a multi-dimensional space according to a feature extraction result so as to classify or cluster the first test data set.
In some embodiments of the present application, in the above apparatus, the target kernel function is a CNN-GP kernel function; each of the deep neural networks has a second specified structure; the test unit is also used for carrying out unsupervised extraction of the discriminative image features on the second test data set by adopting the obtained target kernel function approximate model after the target kernel function approximate model is obtained; and projecting the second test data set to a multi-dimensional space according to the feature extraction result so as to classify or cluster the second test data set.
In some embodiments of the present application, in the above apparatus, the target kernel function is an NTKs kernel function; each of the deep neural networks has a third prescribed structure; the test unit is also used for extracting the characteristics of the third test data set by adopting the obtained target kernel function approximate model after the target kernel function approximate model is obtained; and according to the feature extraction result, approximating the evaluation result of the target kernel function on the third test data set.
It can be understood that the above living body detection device can implement the above living body detection method, and the details are not repeated herein.
Fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application. Referring to fig. 8, at the hardware level, the electronic device includes a processor, and optionally further includes an internal bus, a network interface, and a memory. The memory may include an internal memory, such as Random-Access Memory (RAM), and may further include non-volatile memory, such as at least one disk memory. Of course, the electronic device may also include hardware required for other services.
The processor, the network interface, and the memory may be connected to each other via an internal bus, which may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one double-headed arrow is shown in FIG. 8, but that does not indicate only one bus or one type of bus.
And the memory is used for storing programs. In particular, the program may include program code comprising computer operating instructions. The memory may include both memory and non-volatile storage and provides instructions and data to the processor.
The processor reads a corresponding computer program from the nonvolatile memory into the memory and then runs the computer program to form the living body detection device on a logic level. The processor is used for executing the program stored in the memory and is specifically used for executing the following operations:
constructing a kernel function approximation model, wherein the kernel function approximation model comprises a plurality of deep neural networks and corresponding characteristic values respectively, and each deep neural network and corresponding characteristic value respectively represent one characteristic of the target kernel function; the initial parameters and the corresponding characteristic values of the plurality of deep neural networks are different, and each deep neural network comprises a constraint layer so that the kernel function approximation model meets the constraint condition of approximating a target kernel function;
respectively determining a loss function of each deep neural network according to the target kernel function and the training data set;
and determining the result of the loss function of each deep neural network based on the training data set, and updating the parameters and the corresponding characteristic values of each deep neural network according to the obtained result to obtain a target kernel function approximate model.
The method performed by the living body detection device disclosed in the embodiment of fig. 7 of the present application can be applied to or implemented by a processor. The processor may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in a processor or by instructions in the form of software. The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The various methods, steps, and logic blocks disclosed in the embodiments of the present application may be implemented or performed. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or a register. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
The electronic device may further execute the method executed by the apparatus for detecting a living body in fig. 7, and implement the functions of the apparatus for detecting a living body in the embodiment shown in fig. 6, which are not described herein again in this embodiment of the present application.
An embodiment of the present application further provides a computer-readable storage medium storing one or more programs, where the one or more programs include instructions, which when executed by an electronic device including a plurality of application programs, enable the electronic device to perform the method performed by the apparatus for living body detection in the embodiment shown in fig. 7, and are specifically configured to perform:
constructing a kernel function approximation model, wherein the kernel function approximation model comprises a plurality of deep neural networks and corresponding characteristic values respectively, and each deep neural network and corresponding characteristic value respectively represent one characteristic of the target kernel function; the initial parameters and the corresponding characteristic values of the plurality of deep neural networks are different, and each deep neural network comprises a constraint layer so that the kernel function approximation model meets the constraint condition of approximating a target kernel function;
respectively determining a loss function of each deep neural network according to the target kernel function and the training data set;
and determining the result of the loss function of each deep neural network based on the training data set, and updating the parameters and the corresponding characteristic values of each deep neural network according to the obtained result to obtain a target kernel function approximate model.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of volatile memory in a computer readable medium, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, Phase-change Memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, Compact Disc Read Only Memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer readable media do not include transitory computer readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art to which the present application pertains. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (16)

1. A method of in vivo testing, comprising:
constructing a kernel function approximation model, wherein the kernel function approximation model comprises a plurality of deep neural networks and corresponding characteristic values respectively, and each deep neural network and the corresponding characteristic value respectively represent one characteristic of a target kernel function; the initial parameters and the corresponding characteristic values of the plurality of deep neural networks are different, and each deep neural network comprises a constraint layer so that the kernel function approximation model meets the constraint condition of approximating a target kernel function, wherein the constraint layer of each deep neural network is arranged at the last layer of each deep neural network and serves as the output layer of each deep neural network;
respectively determining a loss function of each deep neural network according to the target kernel function and the training data set;
wherein, the determining the loss function of each deep neural network according to the target kernel function and the training data set respectively comprises:
determining an evaluation matrix of the target kernel function according to the target kernel function and the training data set;
substituting the evaluation matrix into the loss function expression of each deep neural network to determine the loss function of each deep neural network;
determining the result of the loss function of each deep neural network based on the training data set, and updating the parameters and the corresponding characteristic values of each deep neural network according to the obtained result to obtain a target kernel function approximate model;
and acquiring a face image, extracting the texture features of the face by using the target kernel function approximation model, and determining the result of the living body detection according to the texture features.
2. The method of claim 1, wherein a constraint factor is set in a constraint layer of each of the deep neural networks for constraining an output of each of the deep neural networks, wherein the constraint factor is determined according to a sampling size and an output of a previous layer of the constraint layer.
3. The method according to claim 1 or 2, wherein the loss function expression of each deep neural network is determined according to the difference between the generalized Rayleigh quotient of the target kernel function and the penalty coefficient of each deep neural network under different constraint conditions.
4. The method according to claim 1 or 2, wherein the determining a result of the loss function of each deep neural network based on the training data set and updating parameters and corresponding feature values of each deep neural network according to the obtained result comprises:
randomly selecting target training data from the training data set;
inputting the target training data into the kernel function approximation model, and enabling each deep neural network to propagate forwards to determine a generalized Rayleigh quotient of each deep neural network;
updating the characteristic value corresponding to each deep neural network according to the determined generalized Rayleigh quotient of each deep neural network;
solving the derivative value of the loss function of each deep neural network according to the target training data;
and updating the model parameters of each deep neural network according to the derivative values.
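The four steps of claim 4 (randomly sample, forward-propagate, update the characteristic value, update the parameters) can be sketched as follows. Each "deep neural network" is reduced to a linear map, the characteristic value is maintained as a running average of the Rayleigh quotient, and plain gradient ascent stands in for the optimizer; all of these specifics are simplifying assumptions beyond what the claim recites:

```python
import numpy as np

rng = np.random.default_rng(1)

def kernel(X):
    return X @ X.T                       # toy target kernel (linear)

train = rng.standard_normal((32, 5))     # training data set
w = rng.standard_normal(5)               # parameters of one "network"
eigenvalue, lr, momentum = 0.0, 0.01, 0.9

for step in range(50):
    # 1. Randomly select target training data.
    batch = train[rng.choice(len(train), 8, replace=False)]
    K = kernel(batch)
    # 2. Forward-propagate and determine the generalized Rayleigh quotient.
    f = batch @ w
    rayleigh = float(f @ K @ f) / float(f @ f)
    # 3. Update the characteristic value (running average, assumed form).
    eigenvalue = momentum * eigenvalue + (1 - momentum) * rayleigh
    # 4. Differentiate the loss (-rayleigh) and update the parameters.
    grad = (2 * (K @ f - rayleigh * f) / float(f @ f)) @ batch
    w += lr * grad
```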
5. The method according to claim 1 or 2, wherein the target kernel function is an MLP-GP kernel function; each of the deep neural networks has a first specified structure including a specified number of layers and a first specified width;
after obtaining the target kernel function approximation model, the method further comprises:
performing unsupervised feature extraction on a first test data set by using the obtained target kernel function approximation model;
and projecting the first test data set to a multi-dimensional space according to a feature extraction result so as to classify or cluster the first test data set.
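The projection step of claim 5 can be illustrated as follows: each test point x is embedded as φ(x) = (√λ₁·f₁(x), …, √λₘ·fₘ(x)), after which any classifier or clusterer operates in the embedded space. The linear stand-ins for the trained networks and the toy two-cluster data are assumptions:

```python
import numpy as np

eigenvalues = np.array([1.0, 0.4])       # learned characteristic values
W = np.array([[1.0, 0.0, 0.0],           # linear stand-ins for the
              [0.0, 1.0, 0.0]])          # trained networks f_1, f_2

def embed(X):
    # phi(x) = (sqrt(l_1) f_1(x), ..., sqrt(l_m) f_m(x))
    return (W @ X.T).T * np.sqrt(eigenvalues)

test_set = np.array([[ 5.0,  0.1,  9.0],
                     [ 5.1, -0.1, -3.0],
                     [-5.0,  0.2,  1.0],
                     [-5.2,  0.0,  7.0]])
Z = embed(test_set)                      # projection into feature space
# Trivial clustering in the embedded space: sign of the first coordinate.
labels = (Z[:, 0] > 0).astype(int)
```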
6. The method according to claim 1 or 2, wherein the target kernel function is a CNN-GP kernel function; each of the deep neural networks has a second specified structure;
after obtaining the target kernel function approximation model, the method further comprises:
performing unsupervised extraction of discriminative image features on a second test data set by using the obtained target kernel function approximation model;
and projecting the second test data set to a multi-dimensional space according to the feature extraction result so as to classify or cluster the second test data set.
7. The method of claim 1 or 2, wherein the target kernel function is an NTK kernel function; each of the deep neural networks has a third specified structure;
after obtaining the target kernel function approximation model, the method further comprises:
performing feature extraction on a third test data set by using the obtained target kernel function approximation model;
and according to the feature extraction result, approximating the evaluation result of the target kernel function on the third test data set.
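Claim 7's final step, approximating the kernel's evaluation from the extracted features, amounts to the finite-rank expansion k(x, y) ≈ Σᵢ λᵢ fᵢ(x) fᵢ(y). In the sketch below an exact eigendecomposition plays the role of the trained networks, so the approximation is exact and checkable; for a real NTK it would only be approximate:

```python
import numpy as np

X = np.array([[1.0,  2.0],
              [3.0, -1.0],
              [0.5,  0.5]])
K_true = X @ X.T                 # stand-in "target kernel" evaluation matrix

# An exact eigendecomposition plays the role of the trained networks f_i
# and characteristic values l_i on the third test data set:
l, U = np.linalg.eigh(K_true)
features = U * np.sqrt(np.clip(l, 0.0, None))   # columns scaled by sqrt(l_i)

# Finite-rank approximation k(x, y) ~ sum_i l_i f_i(x) f_i(y):
K_approx = features @ features.T
```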
8. A living body detection apparatus, comprising:
a kernel function approximation model, wherein the kernel function approximation model comprises a plurality of deep neural networks and corresponding characteristic values, each deep neural network and its corresponding characteristic value represent one characteristic of a target kernel function, the initial parameters and the corresponding characteristic values of the plurality of deep neural networks differ, and each deep neural network comprises a constraint layer so that the kernel function approximation model satisfies the constraint conditions for approximating the target kernel function, wherein the constraint layer of each deep neural network is arranged as the last layer of each deep neural network and serves as its output layer;
an assignment unit, used for respectively determining the loss function of each deep neural network according to the target kernel function and the training data set;
specifically, the assignment unit is used for determining an evaluation matrix of the target kernel function according to the target kernel function and the training data set, and substituting the evaluation matrix into the loss function expression of each deep neural network to determine the loss function of each deep neural network;
a training unit, used for determining the result of the loss function of each deep neural network based on the training data set, and updating the parameters and the corresponding characteristic values of each deep neural network according to the obtained result to obtain the target kernel function approximation model;
and the detection unit is used for acquiring a human face image, extracting the texture characteristics of the human face by using the target kernel function approximation model and determining the result of the living body detection according to the texture characteristics.
9. The apparatus of claim 8, wherein a constraint factor is provided in a constraint layer of each of the deep neural networks for constraining an output of each of the deep neural networks, wherein the constraint factor is determined according to a sampling size and an output of a previous layer of the constraint layer.
10. The apparatus of claim 8 or 9, wherein the loss function expression of each deep neural network is determined according to a difference between a generalized rayleigh quotient of the target kernel function and a penalty coefficient of each deep neural network under different constraint conditions.
11. The apparatus according to claim 8 or 9, wherein the training unit is used for randomly selecting target training data from the training data set; inputting the target training data into the kernel function approximation model, and performing forward propagation through each deep neural network to determine the generalized Rayleigh quotient of each deep neural network; updating the characteristic value corresponding to each deep neural network according to the determined generalized Rayleigh quotient of each deep neural network; solving the derivative value of the loss function of each deep neural network according to the target training data; and updating the model parameters of each deep neural network according to the derivative values.
12. The apparatus according to claim 8 or 9, wherein the target kernel function is an MLP-GP kernel function; each of the deep neural networks has a first specified structure including a specified number of layers and a first specified width; the apparatus further comprises: a test unit, used for performing, after the target kernel function approximation model is obtained, unsupervised feature extraction on a first test data set by using the obtained target kernel function approximation model, and projecting the first test data set to a multi-dimensional space according to the feature extraction result so as to classify or cluster the first test data set.
13. The apparatus according to claim 8 or 9, wherein the target kernel function is a CNN-GP kernel function; each of the deep neural networks has a second specified structure; the test unit is further used for performing, after the target kernel function approximation model is obtained, unsupervised extraction of discriminative image features on a second test data set by using the obtained target kernel function approximation model, and projecting the second test data set to a multi-dimensional space according to the feature extraction result so as to classify or cluster the second test data set.
14. The apparatus according to claim 8 or 9, wherein the target kernel function is an NTK kernel function; each of the deep neural networks has a third specified structure; the test unit is further used for performing, after the target kernel function approximation model is obtained, feature extraction on a third test data set by using the obtained target kernel function approximation model, and approximating, according to the feature extraction result, the evaluation result of the target kernel function on the third test data set.
15. A computer-readable storage medium on which a computer-readable program is stored,
the computer readable program when executed by a processor implements the method of any one of claims 1 to 7.
16. An electronic device, comprising:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to perform the method of any one of claims 1 to 7.
CN202111680037.XA 2021-12-31 2021-12-31 Living body detection method and device, electronic model and storage medium Active CN114444727B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111680037.XA CN114444727B (en) 2021-12-31 2021-12-31 Living body detection method and device, electronic model and storage medium

Publications (2)

Publication Number Publication Date
CN114444727A CN114444727A (en) 2022-05-06
CN114444727B true CN114444727B (en) 2023-04-07

Family

ID=81366034

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111680037.XA Active CN114444727B (en) 2021-12-31 2021-12-31 Living body detection method and device, electronic model and storage medium

Country Status (1)

Country Link
CN (1) CN114444727B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116542134B (en) * 2023-04-13 2024-02-09 南京大学 Method and device for learning to organize and search
CN118113487B (en) * 2024-04-30 2024-06-28 北京麟卓信息科技有限公司 Method for measuring and calculating size of shared memory of GPU computing unit based on register measurement

Citations (2)

Publication number Priority date Publication date Assignee Title
CN110443364A (en) * 2019-06-21 2019-11-12 深圳大学 A kind of deep neural network multitask hyperparameter optimization method and device
WO2021218470A1 (en) * 2020-04-30 2021-11-04 华为技术有限公司 Neural network optimization method and device

Family Cites Families (8)

Publication number Priority date Publication date Assignee Title
US11321613B2 (en) * 2016-11-17 2022-05-03 Irida Labs S.A. Parsimonious inference on convolutional neural networks
CN107703480B (en) * 2017-08-28 2021-03-23 南京邮电大学 Mixed kernel function indoor positioning method based on machine learning
US11586905B2 (en) * 2017-10-11 2023-02-21 Arizona Board Of Regents On Behalf Of Arizona State University Systems and methods for customizing kernel machines with deep neural networks
WO2019234156A1 (en) * 2018-06-06 2019-12-12 Deepmind Technologies Limited Training spectral inference neural networks using bilevel optimization
US11379726B2 (en) * 2018-11-02 2022-07-05 Intuit Inc. Finite rank deep kernel learning for robust time series forecasting and regression
US11176428B2 (en) * 2019-04-01 2021-11-16 Canon Medical Systems Corporation Apparatus and method for sinogram restoration in computed tomography (CT) using adaptive filtering with deep learning (DL)
US20210042625A1 (en) * 2019-08-07 2021-02-11 Adobe Inc. Performance of neural networks using learned specialized transformation functions
CN112613577B (en) * 2020-12-31 2024-06-11 上海商汤智能科技有限公司 Neural network training method and device, computer equipment and storage medium


Non-Patent Citations (1)

Title
Research and Application of Kernel Convolutional Neural Networks; Bao Zhiqiang et al.; Journal of Signal Processing; 2019-12-25 (No. 12); full text *

Similar Documents

Publication Publication Date Title
CN110633745B (en) Image classification training method and device based on artificial intelligence and storage medium
CN114444727B (en) Living body detection method and device, electronic model and storage medium
US20200027012A1 (en) Systems and methods for bayesian optimization using non-linear mapping of input
CN111079780B (en) Training method for space diagram convolution network, electronic equipment and storage medium
US8954365B2 (en) Density estimation and/or manifold learning
Raphan et al. Least squares estimation without priors or supervision
CN113536383B (en) Method and device for training graph neural network based on privacy protection
CN110766044A (en) Neural network training method based on Gaussian process prior guidance
Yu et al. Large-scale collaborative prediction using a nonparametric random effects model
KR20210093931A (en) Automated creation of machine learning models
US11093800B2 (en) Method and device for identifying object and computer readable storage medium
CN112884059B (en) Small sample radar working mode classification method fusing priori knowledge
CN109766557A (en) Sentiment analysis method and apparatus, storage medium and terminal device
CN112784954A (en) Method and device for determining neural network
CN110968734A (en) Pedestrian re-identification method and device based on depth measurement learning
US9292801B2 (en) Sparse variable optimization device, sparse variable optimization method, and sparse variable optimization program
CN110070116A (en) Segmented based on the tree-shaped Training strategy of depth selects integrated image classification method
CN110610143A (en) Crowd counting network method, system, medium and terminal for multi-task joint training
CN111724370A (en) Multi-task non-reference image quality evaluation method and system based on uncertainty and probability
Termritthikun et al. An improved residual network model for image recognition using a combination of snapshot ensembles and the cutout technique
Firouznia et al. Adaptive chaotic sampling particle filter to handle occlusion and fast motion in visual object tracking
Gundogdu et al. Extending correlation filter-based visual tracking by tree-structured ensemble and spatial windowing
CN113592008A (en) System, method, equipment and storage medium for solving small sample image classification based on graph neural network mechanism of self-encoder
CN115687912A (en) Method and system for predicting trajectory data of object and method and system for training machine learning method to predict trajectory data of object
CN114065901A (en) Method and device for training neural network model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant