CN115374950A - Sample detection method, sample detection device, electronic apparatus, and storage medium - Google Patents

Sample detection method, sample detection device, electronic apparatus, and storage medium

Info

Publication number
CN115374950A
CN115374950A CN202210820164.3A CN202210820164A CN115374950A CN 115374950 A CN115374950 A CN 115374950A CN 202210820164 A CN202210820164 A CN 202210820164A CN 115374950 A CN115374950 A CN 115374950A
Authority
CN
China
Prior art keywords
sample
target
detection
detected
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210820164.3A
Other languages
Chinese (zh)
Inventor
瞿晓阳
王健宗
邓宝平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN202210820164.3A
Publication of CN115374950A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the application relates to the field of artificial intelligence, in particular to a sample detection method, a sample detection device, an electronic apparatus, and a storage medium. The sample detection method comprises the following steps: obtaining a sample to be detected; inputting the sample to be detected into a pre-trained target detection model for class detection to obtain a target detection value; screening out a target mean value and a target sample from a preset sample reference set according to the target detection value; screening out a calibration category from the sample reference set according to the target sample; taking the calibration category as the sample category of the sample to be detected; acquiring a distance parameter between the sample to be detected and the target sample; calculating a confidence according to the distance parameter; and determining the target class according to the confidence and the sample class. The method and the device can optimize the model's capability to detect out-of-distribution samples without modifying the network structure of the model, thereby reducing the complexity of detecting out-of-distribution samples.

Description

Sample detection method, sample detection device, electronic apparatus, and storage medium
Technical Field
The present invention relates to the field of artificial intelligence technologies, and in particular, to a sample detection method, a sample detection apparatus, an electronic device, and a storage medium.
Background
Currently, out-of-distribution (OOD) detection is an important method for verifying the security and reliability of a model.
In the related art, the network structure of the model needs to be modified to realize the detection of the out-of-distribution samples, thereby increasing the complexity of detecting the out-of-distribution samples by the model.
Disclosure of Invention
The embodiment of the present disclosure provides a sample detection method, a sample detection apparatus, an electronic device, and a storage medium, which can optimize the detection capability of samples outside the model distribution without modifying the network structure of the model, thereby reducing the complexity of detecting samples outside the model distribution.
In order to achieve the above object, a first aspect of the embodiments of the present disclosure provides a sample detection method, including:
obtaining a sample to be detected;
inputting the sample to be detected into a pre-trained target detection model for class detection to obtain a target detection value; wherein the target detection model comprises N network layers, N is a positive integer greater than 1, the target detection value is a value output by the last network layer of the target detection model, the target detection value is used for representing the characteristic distribution of the sample to be detected, and the last network layer of the target detection model is a convolutional layer or a pooling layer;
screening a target mean value from a preset sample reference set according to the target detection value, and screening a target sample from the sample reference set according to the target mean value; wherein the sample to be detected is an out-of-distribution sample of the target sample;
screening out a calibration category from a sample reference set according to the target sample;
taking the calibration category as a sample category of the sample to be detected;
acquiring a distance parameter between the sample to be detected and the target sample; the distance parameter is used for representing the similarity between the sample to be detected and the target sample;
calculating confidence according to the distance parameters;
and determining a target class according to the confidence coefficient and the sample class.
In some embodiments, before the sample to be detected is input to a pre-trained target detection model for class detection to obtain a target detection value, the sample detection method further includes training the target detection model, specifically including:
generating image data according to a preset original generator;
inputting the image data into a pre-trained original detection model for processing to obtain a first training label;
inputting the image data into a preset calibration detection model for training to obtain a second training label;
and adjusting parameters of the calibration detection model according to the second training label and the first training label to obtain the target detection model.
In some embodiments, the training the target detection model further includes performing iterative training on the target detection model, which specifically includes:
adjusting parameters of the original generator according to the first training label to obtain a primary generator;
updating the image data according to the preliminary generator;
and performing iterative training on the target detection model according to the updated image data until the target detection model meets a preset convergence condition, and updating the preliminary generator into a target generator.
In some embodiments, before the selecting a target mean value from a preset sample reference set according to the target detection value and selecting a target sample from the sample reference set according to the target mean value, the sample detection method further includes constructing the sample reference set, specifically including:
inputting a preset random sample into the target generator for processing to obtain an original sample;
inputting the original sample into the original detection model for classification processing to obtain an original category;
taking a plurality of original samples with the same original category as a sample set;
inputting the sample set into the target detection model for class detection to obtain an original detection value; wherein the original detection value is a value output by each layer network layer of the target detection model;
obtaining the hierarchy of the network layer according to the original detection value to obtain the network hierarchy;
constructing the sample reference set according to the plurality of sample sets, the plurality of original classes, the plurality of original detection values, and the plurality of network levels.
In some embodiments, the raw detection values comprise raw mean values;
the screening of the target mean value from a preset sample reference set according to the target detection value comprises:
performing difference calculation on the target detection value and the calibration mean value to obtain a mean value difference; wherein the calibration mean is an original mean of the last layer of the network hierarchy in the sample reference set;
acquiring the minimum average value difference as a target difference value;
and screening out the target mean value from the calibration mean value according to the target difference value.
In some embodiments, the raw detection values further comprise a raw covariance matrix;
the obtaining of the distance parameter between the sample to be detected and the target sample includes:
inputting the sample to be detected into the target detection model for class detection to obtain a sample mean value and a sample covariance matrix; wherein the sample mean is a value output by the nth network layer of the target detection model, and the sample covariance matrix is a matrix output by the nth network layer of the target detection model;
and obtaining the distance parameter according to the sample mean, the sample covariance matrix and the sample reference set.
In some embodiments, the inputting the sample to be detected into a pre-trained target detection model for class detection includes:
adding random noise to the sample to be detected to obtain a calibration detection sample;
and inputting the calibration detection sample into the target detection model for class detection.
To achieve the above object, a second aspect of the embodiments of the present disclosure provides a sample detection apparatus, including:
the sample acquisition module is used for acquiring a sample to be detected;
the mean value calculation module is used for inputting the sample to be detected into a pre-trained target detection model for class detection to obtain a target detection value; the target detection value is a value output by the last network layer of the target detection model, the target detection value is used for representing the characteristic distribution of the sample to be detected, and the last network layer of the target detection model is a convolutional layer or a pooling layer;
the sample type confirmation module is used for screening a target mean value from a preset sample reference set according to the target detection value and screening a target sample from the sample reference set according to the target mean value; wherein the sample to be detected is an out-of-distribution sample of the target sample; screening out a calibration category from the sample reference set according to the target sample; taking the calibration category as a sample category of the sample to be detected;
the confidence coefficient calculation module is used for acquiring a distance parameter between the sample to be detected and the target sample; the distance parameter is used for representing the similarity between the sample to be detected and the target sample; calculating confidence according to the distance parameters;
and the result calculation module is used for determining the target category according to the confidence coefficient and the sample category.
To achieve the above object, a third aspect of the embodiments of the present disclosure proposes an electronic device, including at least one memory;
at least one processor;
at least one computer program;
the computer programs are stored in the memory, and the processor executes the at least one computer program to implement:
the method of any one of the embodiments of the first aspect.
To achieve the above object, a fourth aspect of the embodiments of the present disclosure also provides a computer-readable storage medium storing computer-executable instructions for causing a computer to perform:
the method of any one of the embodiments of the first aspect.
According to the sample detection method, the sample detection device, the electronic equipment and the storage medium, the target detection value of the sample to be detected is obtained through the target detection model, the target sample which is closest to the characteristic distribution trend of the sample to be detected is screened out from the sample reference set according to the target detection value, and the calibration type of the target sample is the sample type of the sample to be detected, namely, the sample type of the sample to be detected is subjected to preliminary detection. And then, determining the confidence coefficient of the initial detection according to the distance parameter between the sample to be detected and the target sample so as to finally determine the target class of the sample to be detected. Therefore, the sample detection method provided by the embodiment of the application realizes the optimization of the detection capability of the samples outside the distribution on the basis of ensuring the original classification capability of the target detection model, and avoids the method for achieving the optimization effect by adjusting the network structure of the target detection model in the related technology.
Drawings
FIG. 1 is a schematic flow chart of a sample detection method according to an embodiment of the present disclosure;
FIG. 2 is another schematic flow chart of a sample detection method provided in an embodiment of the present application;
FIG. 3 is another schematic flow chart of a sample detection method provided in an embodiment of the present application;
FIG. 4 is another schematic flow chart of a sample detection method provided in an embodiment of the present application;
FIG. 5 is another schematic flow chart of a sample detection method provided in an embodiment of the present application;
FIG. 6 is another schematic flow chart of a sample detection method provided in an embodiment of the present application;
FIG. 7 is another schematic flow chart of a sample detection method provided in an embodiment of the present application;
FIG. 8 is a block diagram of a sample testing device provided by an embodiment of the present application;
fig. 9 is a schematic hardware structure diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
It is noted that while functional block divisions are provided in device diagrams and logical sequences are shown in flowcharts, in some cases, steps shown or described may be performed in sequences other than block divisions within devices or flowcharts. The terms first, second and the like in the description and in the claims, and the drawings described above, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the application.
First, several terms referred to in the present application are resolved:
artificial Intelligence (AI): is a new technical science for researching and developing theories, methods, technologies and application systems for simulating, extending and expanding human intelligence; artificial intelligence is a branch of computer science, which attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence, and research in this field includes robotics, language recognition, image recognition, natural language processing, expert systems, and the like. The artificial intelligence can simulate the information process of human consciousness and thinking. Artificial intelligence is also a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results.
Out-of-Distribution (OOD) sample: a sample that does not conform to the distribution of the model's original training data. In the open world, classification is an important way to verify the security of a model, yet the classification models in the related art are all trained in a closed world, i.e., the test data and the training data come from the same distribution (called in-distribution). Suppose a classifier capable of distinguishing cats and dogs is trained with a set of cat and dog pictures; when this classifier model is deployed in practical applications, it will encounter pictures that do not belong to the closed-world categories, such as tiger pictures, or pictures that are visually different from the training pictures, such as cartoon cat pictures. In the open-world setting, the following training tasks are commonly used: Outlier Detection (OD), Anomaly Detection (AD), New Class Detection (ND), Open Set Recognition (OSR), and Out-of-Distribution Detection (OOD Detection). Out-of-distribution detection is similar to new-class detection in that it aims to find, from the test set, 'new-class' samples that do not belong to any class in the training set; however, out-of-distribution sample detection requires the model to reject samples with semantic deviation (i.e., out-of-distribution samples) while preserving its original classification performance, so as to ensure the reliability and safety of the model.
Generative Adversarial Network (GAN): a deep learning model belonging to unsupervised learning. A generative adversarial network obtains good output through the mutual game learning of at least two models in its framework, namely a generator (generative model) and a discriminator (discriminative model). At present, generative adversarial networks are mainly applied to sample data generation, image restoration, image conversion, text generation, and the like. The generator generates data, and its training goal is to make the generated data misjudged by the discriminator as far as possible, that is, to make the generated data as close to the real data as possible; the discriminator judges whether data is real data or data generated by the generator, and its training goal is to identify the data generated by the generator as far as possible. For example, assume the discriminator input is x, where x represents data, and the discriminator output D(x) represents the probability that x is real data. When D(x) = 1, x is 100% real data; when D(x) = 0, x cannot be real data. The generator and the discriminator thus form a dynamic adversarial process: as training proceeds, the data generated by the generator becomes closer to the real data, and the discriminating capability of the discriminator becomes stronger. Ideally, the generator can generate data approximating the real data, and the discriminator can hardly determine whether the data generated by the generator is real, at which point D(x) = 0.5. In practical applications, training a generative adversarial network includes the following two stages: in the first stage, the discriminator is fixed and the generator is trained; in the second stage, the generator is fixed and the discriminator is trained. In the first stage, a discriminator with good performance is used, and the generator continuously generates 'fake data' and inputs it into the discriminator for discrimination. Initially, the performance of the generator is weak, that is, the data it generates is easily identified as 'fake data' by the discriminator. However, with continuous cyclic training, the performance of the generator improves and it finally generates data close to the real data. After the first stage, continuing to train the generator yields little further improvement, so in the second stage the generator is fixed and the discriminator is trained further, so that the discriminator can accurately identify 'fake data'. The first stage and the second stage are repeated cyclically, the performance of the generator and the discriminator is continuously improved in the iterative training, and finally a generator whose generated data is consistent with the real data distribution is obtained.
Knowledge distillation: a model compression technique in which the knowledge of a complex, large-scale teacher model is 'distilled' into a small-scale student model, so that the student model acquires the capability of the teacher model. When the student model is deployed on a device, the requirements on device resources such as memory and CPU are reduced, thereby saving computing resources.
Mahalanobis Distance: a distance metric used, together with the Euclidean distance, Manhattan distance, Hamming distance, and the like, as an index for evaluating the similarity between data; unlike these, the Mahalanobis distance can handle the non-independent distribution between dimensions in high-dimensional, linearly distributed data. The Mahalanobis distance can be viewed as a modification of the Euclidean distance that corrects for inconsistent scales and correlated dimensions. It is an effective method for calculating the similarity between two unknown sample sets.
Confidence coefficient: in statistics, the Confidence interval (Confidence interval) of a probability sample is an interval estimate for some overall parameter of this sample. The confidence interval exhibits the extent to which the true value of this parameter has a certain probability of falling around the measurement. The confidence interval gives the range of confidence levels of the measured parameter measurement, i.e. the aforementioned "certain probability", which is the confidence, also referred to as confidence level.
In the related art, out-of-distribution sample detection methods include Softmax-based, uncertainty-based, generative-model-based, and classifier-based methods, among others. In these methods, in order for the original model to learn uncertainty, the network structure of the original model needs to be modified, which increases the complexity of detecting out-of-distribution samples with the original model.
Based on this, the embodiment of the application provides a sample detection method, a sample detection apparatus, an electronic device, and a storage medium, which can realize detection of samples outside distribution on the basis of not modifying a model network structure, thereby reducing the complexity of model detection of samples outside distribution.
The sample detection method, the sample detection apparatus, the electronic device, and the storage medium provided in the embodiments of the present application are specifically described in the following embodiments, and first, the sample detection method in the embodiments of the present application is described.
The embodiment of the application can acquire and process related data based on an artificial intelligence technology. Among them, artificial Intelligence (AI) is a theory, method, technique and application system that simulates, extends and expands human Intelligence using a digital computer or a machine controlled by a digital computer, senses the environment, acquires knowledge and uses the knowledge to obtain the best result.
The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
The embodiment of the application provides a sample detection method, a sample detection device, an electronic device and a storage medium, and relates to the technical field of artificial intelligence, in particular to the technical field of supervised learning. The sample detection method provided by the embodiment of the application can be applied to a terminal, a server side and software running in the terminal or the server side. In some embodiments, the terminal may be a smartphone, tablet, laptop, desktop computer, smart watch, or the like; the server can be an independent server, and can also be a cloud server providing basic cloud computing services such as cloud service, a cloud database, cloud computing, a cloud function, cloud storage, network service, cloud communication, middleware service, domain name service, security service, content Delivery Network (CDN), big data and artificial intelligence platform and the like; the software may be, but is not limited to, implementing a sample detection method, etc.
The application is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
Referring to fig. 1, the present application provides a sample detection method, which includes, but is not limited to, steps S110 to S180.
S110, obtaining a sample to be detected;
Specifically, a sample to be detected for optimizing the out-of-distribution sample detection capability of the target detection model is obtained. It can be understood that, since the sample to be detected is used for optimizing the out-of-distribution sample detection capability of the target detection model, the class of the sample to be detected is different from the sample class of the original training data of the target detection model; that is, the feature distribution of the sample to be detected is different from the feature distribution of the original training data, and the distribution of the sample set to be detected is also different from the distribution of the original training data set. The sample set to be detected comprises a plurality of samples to be detected, and the original training data set comprises a plurality of original training data. For example, assume that the target detection model is a model with cat and dog classification capability, and its original training data set includes a plurality of pictures of animal cats and animal dogs. Therefore, the obtained sample to be detected should be a picture that is neither an animal cat nor an animal dog, such as a picture of a rabbit, and/or a sample that is visually different from the original training data, such as a cartoon cat picture or a cartoon dog picture.
It is understood that, in the embodiment of the present application, a plurality of samples to be detected, or one sample to be detected may be obtained. When a sample to be detected is obtained, the number of the samples in the sample set to be detected is 1.
S120, inputting the sample to be detected into a pre-trained target detection model for class detection to obtain a target detection value; wherein the target detection model comprises N network layers, N is a positive integer greater than 1, the target detection value is a value output by the last network layer of the target detection model, the target detection value is used for representing the characteristic distribution of the sample to be detected, and the last network layer of the target detection model is a convolutional layer or a pooling layer;
It is understood that, in order to ensure the effectiveness of the out-of-distribution sample detection capability optimization, the target detection model should be a trained model, that is, the target detection model has basic model capability, such as classification capability for specific classes, and the model is in a converged state. The target detection model comprises N network layers, where 'network layer' is a general term covering convolutional layers and pooling layers; that is, the total number of convolutional layers and pooling layers of the target detection model is N, so N is a positive integer greater than 1. For example, when ResNet18 is used as the network structure of the target detection model, the target detection model includes seventeen convolutional layers and one fully connected layer. It can be understood that the network structure of the target detection model may be adaptively set according to the actual capability requirement of the model, and the embodiment of the present application is not particularly limited.
Specifically, a sample to be detected is used as input data of a target detection model, so that class detection is performed on the sample to be detected through the target detection model. And in the analog detection processing process, obtaining a value output by the last network layer of the target detection model, and taking the value as a target detection value of the sample to be detected. It can be understood that, since the last network layer is a convolutional layer or a pooling layer, and the target detection value is a value obtained by performing an average processing on the value output by the last network layer, the target detection value can reflect a characteristic distribution trend of the sample to be detected.
S130, screening a target mean value from a preset sample reference set according to a target detection value, and screening a target sample from the sample reference set according to the target mean value; wherein the sample to be detected is an out-of-distribution sample of the target sample;
it can be understood that a sample reference set is preset, and the sample reference set comprises a plurality of original samples, a plurality of original categories, a plurality of original detection values, and the like, wherein unique mapping relationships exist among the original samples, the original categories, and the original detection values. The original class is used for representing the sample class of the original sample, and the original detection value is the target detection model or the average value of the last network layer output of the model with the same capability and network structure as the target detection model. The original samples are samples which are consistent with the distribution of the training data of the target detection model, so the characteristic distribution trend of the original training data of the target detection model can be reflected through the sample reference set.
Specifically, a target detection value of a sample to be detected is compared with a plurality of original detection values in a sample reference set, so as to screen out an original detection value closest to the target detection value. And taking the screened original detection values as a target mean value, and taking original samples which have a mapping relation with the screened original detection values as target samples. It can be understood that the target sample is an original sample with the highest similarity to the distribution trend of the characteristics of the sample to be detected in the sample reference set. Therefore, any method capable of describing the similarity of the feature distribution trends can be used as the method for screening the original detection value closest to the target detection value in the embodiment of the present application, and the embodiment of the present application is not particularly limited.
S140, screening out a calibration category from the sample reference set according to the target sample;
specifically, an original category having a mapping relationship with the target sample is screened out from a preset sample reference set, and the original category is used as a calibration category, that is, an original category having a mapping relationship with the screened-out original detection value is used as a calibration category.
S150, taking the calibration category as the sample category of the sample to be detected;
specifically, the calibration category obtained by screening in step S140 is used as the preliminary detection result of the target detection model for the sample to be detected, that is, the calibration category is used as the sample category of the sample to be detected.
S160, obtaining a distance parameter between the sample to be detected and the target sample; the distance parameter is used for representing the similarity between the sample to be detected and the target sample;
it can be understood that, in step S130, although the target sample with the highest similarity to the distribution trend of the characteristics of the sample to be detected is screened, the similarity between each characteristic of the sample to be detected and the target sample needs to be determined in consideration of the relation between each characteristic. Therefore, the distance between the sample to be detected and the target sample is calculated to obtain the distance parameter.
S170, calculating confidence coefficient according to the distance parameters;
it can be understood that, when the sample to be detected is input to the target detection model for class detection according to the step S160, the similarity between the extracted features of each layer network layer and the target sample, that is, the corresponding distance parameter is calculated according to the output value of each layer network layer, so that the confidence is calculated according to the N distance parameters and the preset weight. The preset weight of each distance parameter may be adaptively set according to actual needs, and the embodiment of the present application is not specifically limited.
And S180, determining the target class according to the confidence coefficient and the sample class.
It can be understood that the confidence is used for characterizing the probability that the sample class is the target class of the sample to be detected. When the probability is lower than a certain threshold, it indicates that the probability that the target class is the sample class is low; at this time, it can be determined that the sample to be detected and the original samples of the target detection model are not the same type of sample, that is, the sample to be detected is a sample outside the distribution of the original samples. For example, assume that the target detection model is a model with cat and dog classification capability, the original samples include a plurality of pictures of animal cats and animal dogs, and the sample to be detected is a rabbit picture. According to the above steps, the preliminary detection result is that the sample category is cat and the calculated confidence is 0.1, so it is determined that the sample to be detected is an out-of-distribution sample, that is, neither a cat sample nor a dog sample.
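The decision in step S180 can be summarized by the following minimal sketch; the 0.5 threshold is an illustrative assumption, since the description only refers to 'a certain threshold'.

```python
def decide_target_class(sample_class: str, confidence: float, threshold: float = 0.5) -> str:
    """Keep the preliminary sample class only when the confidence is high enough;
    otherwise treat the sample to be detected as an out-of-distribution sample."""
    if confidence < threshold:
        return "out-of-distribution"
    return sample_class

# Example from the description: preliminary class "cat" with a confidence of 0.1.
print(decide_target_class("cat", 0.1))  # -> out-of-distribution
```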
It can be understood that, through steps S110 to S180, the optimization of the out-of-distribution sample detection capability of the target detection model, that is, the capability of the target detection model to learn uncertainty is improved, so as to ensure the safety and reliability of the target detection model.
According to the sample detection method provided by the embodiment of the application, the target detection value of the sample to be detected is obtained through the target detection model, the target sample which is closest to the characteristic distribution trend of the sample to be detected is screened out from the sample reference set according to the target detection value, and the calibration type of the target sample is the sample type of the sample to be detected, namely the sample type of the sample to be detected is subjected to preliminary detection. And then, determining the confidence coefficient of the initial detection according to the distance parameter between the sample to be detected and the target sample so as to finally determine the target class of the sample to be detected. Therefore, the sample detection method provided by the embodiment of the application realizes the optimization of the detection capability of the samples outside the distribution on the basis of ensuring the original classification capability of the target detection model, and avoids the method for achieving the optimization effect by adjusting the network structure of the target detection model in the related technology.
Referring to fig. 2, in some embodiments, before step S120, the sample detection method further includes: the target detection model is trained, specifically including but not limited to steps S210 to S240.
S210, generating image data according to a preset original generator;
s220, inputting image data into a pre-trained original detection model for processing to obtain a first training label;
s230, inputting the image data into a preset calibration detection model for training to obtain a second training label;
s240, parameter adjustment is carried out on the calibration detection model according to the second training label and the first training label, and a target detection model is obtained.
It can be understood that, in the related art, in addition to modifying the model network structure to realize out-of-distribution sample detection, some methods need to acquire the original training data of the model. In actual production and application environments, the original training data is often difficult to obtain. Based on this, the sample detection method provided by the embodiment of the application also provides a training method for the target detection model: the original detection model is used as a teacher model, and the target detection model is the trained student model, so that the network structure of the original detection model does not need to be acquired. Moreover, the original detection model is used as the discriminator in a generative adversarial network to discriminate the data generated by the original generator. After continuous cyclic training, the original generator can generate a data set consistent with the original training data distribution of the original detection model, which solves the problem in the related art that the original training data cannot be acquired. Hereinafter, the training of the target detection model is described in detail.
In step S210 of some embodiments, an original generator is preset, and the original generator is used to convert a group of vectors into a certain amount of image data of a certain size, for example, into M pieces of 32 × 32 image data. It is understood that the amount and size of the generated image data may be adaptively adjusted according to actual needs, and the embodiment of the present application is not particularly limited.
In step S220 of some embodiments, the image data generated by the raw generator is used as input data of the raw detection model, where the raw detection model is a trained model that needs to be optimized for the detection capability of the out-of-distribution sample originally, but because the raw training data and the network structure of the raw detection model are difficult to obtain, the target detection model obtained by knowledge distillation of the raw detection model is used as the target for optimizing the detection capability of the out-of-distribution sample. It can be seen that when the image data generated by the raw generator is used as input data of the raw detection model, the raw detection model outputs a first training label corresponding to the image data, wherein the first training label is used for representing the probability that the image data is true to a certain class, for example, the probability that the image data is true to a cat or a dog.
In step S230 of some embodiments, the image data generated by the raw generator is also used as input data of an initial model of the target detection model (i.e., a calibration detection model), and the calibration detection model outputs a corresponding second training label according to the image data, where the second training label is used to represent a probability that the image data is predicted to be in a certain class.
In step S240 of some embodiments, a loss value of the calibrated detection model is calculated according to the real value (i.e., the first training label) output by the original detection model and the predicted value (i.e., the second training label) output by the calibrated detection model. And carrying out parameter adjustment on the calibration detection model according to the loss value, thereby obtaining a target detection model with the same classification capability as the original detection model.
Specifically, the loss value H (p, q) is calculated according to the following formula (1).
H(p, q) = −∑_i p(x_i) log(q(x_i))
Wherein p(x_i) represents the second training label of the ith image data, and q(x_i) represents the first training label of the ith image data.
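A minimal sketch of formula (1), assuming PyTorch; representing the first and second training labels as softmax outputs of the original and calibration models is an assumption consistent with the description of the training labels.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor, teacher_logits: torch.Tensor) -> torch.Tensor:
    p = F.softmax(student_logits, dim=1)  # p(x_i): second training label (calibration model)
    q = F.softmax(teacher_logits, dim=1)  # q(x_i): first training label (original model)
    # H(p, q) = -sum_i p(x_i) log(q(x_i)), averaged over the batch.
    return -(p * q.clamp_min(1e-12).log()).sum(dim=1).mean()

# Toy usage on random logits for a batch of 4 images and 10 classes.
print(distillation_loss(torch.randn(4, 10), torch.randn(4, 10)).item())
```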
Referring to fig. 3, in some embodiments, training the target detection model further comprises the steps of: and performing iterative training on the target detection model, specifically including steps S310 to S330.
S310, adjusting parameters of the original generator according to the first training label to obtain a preliminary generator;
It is understood that the original detection model and the original generator form a generative adversarial network, wherein the original detection model serves as the discriminator and the original generator serves as the generator. Since the original detection model is a trained model, the training phase for the original generator corresponds to the first stage of generative adversarial training, namely fixing the discriminator and training the generator. The original generator is trained based on the discrimination capability of the original detection model (i.e., whether the type of the image data is the same as the type of the original training data), so that the original generator learns to generate data that can 'pass as real,' i.e., image data whose distribution is consistent with the original training data.
Specifically, the image data generated by the raw generator is used as input data of a raw detection model, and the raw detection model generates a first training label according to the image data, wherein the first training label is used for representing the probability that the image data is true to a certain class. It can be understood that the first training label is an output result obtained by using a Softmax function as an activation function of the output node, and a loss value of the original generator is calculated according to the output result and a cross entropy function, so that the original generator is reversely optimized to obtain a preliminary generator.
S320, updating the image data according to the preliminary generator;
it is understood that the preliminary generator can obtain the loss value for the reverse optimization from the output result of the original detection model in each iteration training of the target detection model. The preliminary generator adjusts parameters according to the loss value to generate image data closer to the distribution of the original training data in the next round of iterative training, that is, the image data generated in the previous round is updated in each round of iterative training. The updated image data is simultaneously used as input data of the original detection model and the target detection model, so that a new round of iterative training is started.
And S330, performing iterative training on the target detection model according to the updated image data until the target detection model meets a preset convergence condition, and updating the preliminary generator into a target generator.
It can be understood that, the target detection model obtains the updated image data input by the original generator in each new iteration training, and the target detection model performs model adjustment and optimization according to the updated image data and the method described in step S240 until the target detection model meets the preset convergence condition, for example, until the change of the loss value calculated according to the first training label and the second training label tends to be stable. At this time, the primitive generator in the current iteration training is saved and is taken as the target generator.
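The iterative training of S310-S330 can be sketched as below. This is a toy illustration assuming PyTorch; the tiny stand-in networks, the Adam optimizers, and the pseudo-label generator loss are all assumptions, since the patent does not fix these details.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

latent_dim, num_classes, batch = 100, 10, 64
generator = nn.Sequential(nn.Linear(latent_dim, 3 * 32 * 32), nn.Tanh())    # original generator (toy)
teacher = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, num_classes))  # original detection model (fixed)
student = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, num_classes))  # calibration -> target detection model

g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
s_opt = torch.optim.Adam(student.parameters(), lr=1e-3)

for step in range(100):  # until a preset convergence condition is met
    # S210: generate image data with the original/preliminary generator.
    images = generator(torch.randn(batch, latent_dim)).view(batch, 3, 32, 32)

    # S310: adjust the generator so the teacher classifies its images confidently
    # (a pseudo-label loss; an assumed stand-in for the first-training-label feedback).
    teacher_logits = teacher(images)
    g_loss = F.cross_entropy(teacher_logits, teacher_logits.argmax(dim=1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()

    # S320: update the image data with the preliminary generator.
    with torch.no_grad():
        images = generator(torch.randn(batch, latent_dim)).view(batch, 3, 32, 32)
        q = F.softmax(teacher(images), dim=1)  # first training label

    # S220-S240 / S330: adjust the calibration (student) model using formula (1).
    p = F.softmax(student(images), dim=1)      # second training label
    kd_loss = -(p * q.clamp_min(1e-12).log()).sum(dim=1).mean()
    s_opt.zero_grad(); kd_loss.backward(); s_opt.step()
```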
Referring to fig. 4, before step S130, the sample detection method provided in the embodiment of the present application further includes the steps of: constructing a sample reference set, specifically including but not limited to steps S410 to S460.
S410, inputting a preset random sample into a target generator for processing to obtain an original sample;
it will be appreciated that during the synchronous optimization training of the raw generator and the target detection model, a target generator will be obtained, and the image data generated by the target generator is consistent with the raw training data distribution of the raw detection model. A certain number of vectors (i.e., random samples) are randomly generated as input data to the target generator to obtain original samples that are consistent with the original training data distribution.
S420, inputting the original sample into an original detection model for classification processing to obtain an original category;
it can be understood that, the original detection model is used to perform class prediction on the original sample, and an original class corresponding to the original sample is obtained.
S430, taking a plurality of original samples with the same original category as a sample set;
it can be understood that, the plurality of original samples are classified according to the original categories obtained in step S420, and the plurality of original samples having the same original category are used as a set of sample sets.
S440, inputting the sample set into a target detection model for class detection to obtain an original detection value; the original detection value is a value output by each layer of network layer of the target detection model;
it is understood that the original samples are input to the target detection model by category, that is, a plurality of original samples having the same original category are used as a group of sample sets, and the input data of the target detection model is used in units of sample sets. The target detection model comprises N-layer network layers, and original detection values output by the sample set on each layer of the network layers are obtained. It will be appreciated that the type of raw detection values may be chosen based on the function of the sample reference set. For example, when an original sample (i.e., a target sample) closest to the distribution trend of the characteristics of the sample to be detected needs to be screened from the sample reference set, the type of the original detection value is an average value.
S450, acquiring the hierarchy of the network layer according to the original detection value to obtain the network hierarchy;
it is understood that a hierarchy level at which each network layer used for calculating the original detection value is located is obtained, the hierarchy level is taken as a network hierarchy level, and a mapping relationship is established between the hierarchy level and the corresponding original detection value.
And S460, constructing a sample reference set according to the multiple sample sets, the multiple original categories, the multiple original detection values and the multiple network levels.
It can be understood that a corresponding sample set, original category, original detection values, and network levels are mapped to one another to construct the sample reference set. Taking one set of mapping relationships as an example: the original category is the output result obtained by using the sample set as input data of the original detection model, and the original detection value X_N is the value output by the Nth network layer when the sample set is input into the target detection model, with N being the corresponding network level. Thus, each set of mapping relationships includes a sample set, an original category, N original detection values, and N network levels.
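A minimal NumPy sketch of how the per-class, per-level statistics of S410-S460 could be organized into a sample reference set. The extraction of per-layer features is abstracted away here (an assumption); each entry stores the original category together with the original mean and original covariance matrix for every network level.

```python
import numpy as np

def build_reference_set(layer_features_by_class):
    """layer_features_by_class: {original_class: [features_layer_1, ..., features_layer_N]},
    where each features_layer_n has shape [num_samples, feature_dim]."""
    reference_set = []
    for original_class, per_layer in layer_features_by_class.items():
        entry = {"original_class": original_class, "levels": []}
        for level, feats in enumerate(per_layer, start=1):  # network levels 1..N
            entry["levels"].append({
                "network_level": level,
                "original_mean": feats.mean(axis=0),                 # original mean
                "original_covariance": np.cov(feats, rowvar=False),  # original covariance matrix
            })
        reference_set.append(entry)
    return reference_set

# Toy usage: two original classes, N = 2 network levels, 8-dimensional features.
rng = np.random.default_rng(0)
reference_set = build_reference_set({
    "cat": [rng.normal(size=(50, 8)), rng.normal(size=(50, 8))],
    "dog": [rng.normal(size=(50, 8)) + 1.0, rng.normal(size=(50, 8)) + 1.0],
})
```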
Referring to fig. 5, in some embodiments, the raw detection values include raw mean values, and step S130 includes, but is not limited to, sub-steps S510-S530.
S510, carrying out difference calculation on the target detection value and the calibration mean value to obtain a mean value difference; the calibration mean value is an original mean value of the last layer of the network level in the sample reference set;
it can be understood that, in order to screen out the original sample (i.e. the target sample) closest to the distribution trend of the characteristics of the sample to be detected from the sample reference set, the type of the original detection value is taken as a mean value. Correspondingly, the type of the target detection value is also the mean value. Thus, in a set of mapping relationships, each network level maps an original mean. And taking the original mean value mapped by the last network layer, namely the network layer with the network level N as a calibration mean value. And performing difference calculation on the target detection value and the calibration mean value of each group of mapping relations to obtain a plurality of mean value differences.
S520, acquiring the minimum average value difference as a target difference value;
it is understood that the average difference having the smallest value among the plurality of average differences is taken as the target difference.
S530, screening out a target mean value from the calibration mean value according to the target difference value;
it is understood that a calibration mean value having a mapping relation with the target difference value is obtained, and the calibration mean value is taken as the target mean value.
It is understood that a sample set having a mapping relationship with the target mean is obtained, and the original samples included in the sample set are taken as the target samples. And taking the original category mapped by the sample set as a calibration category, namely as a preliminary detection result of the sample to be detected.
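The screening of S510-S530 can then be sketched as follows (continuing the toy reference set above); using a Euclidean/absolute difference for the 'difference calculation' is an assumption, as the patent does not specify the difference metric.

```python
import numpy as np

def screen_calibration_class(reference_set, target_detection_value):
    best = None
    for entry in reference_set:
        calibration_mean = entry["levels"][-1]["original_mean"]  # original mean of the last network level
        diff = np.linalg.norm(target_detection_value - calibration_mean)  # mean value difference
        if best is None or diff < best[0]:
            best = (diff, calibration_mean, entry)
    target_difference, target_mean, target_entry = best
    # The original class mapped to the target mean is the calibration class,
    # i.e. the preliminary sample class of the sample to be detected.
    return target_mean, target_entry, target_entry["original_class"]

# Usage (with the toy reference_set built in the previous sketch):
# target_mean, target_entry, sample_class = screen_calibration_class(reference_set, detection_value)
```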
Referring to fig. 6, in some embodiments, the raw detection values further include a raw covariance matrix, and step S150 includes, but is not limited to, sub-steps S610 through S620.
S610, inputting a sample to be detected into a target detection model for class detection to obtain a sample mean value and a sample covariance matrix; the sample mean is a value output by each of the N network layers of the target detection model, and the sample covariance matrix is a matrix output by each of the N network layers of the target detection model;
it can be understood that, on the basis of screening out the target sample closest to the characteristic distribution trend of the sample to be detected according to the sample reference set, in order to determine the relation between the characteristics, the mahalanobis distance can be selected for calculating the distance parameter between the sample to be detected and the target sample. Therefore, the original detection values in the sample reference set should also include an original covariance matrix, i.e., the original covariance matrix output by each layer of the network layer is obtained while the sample set is input to the target detection model to obtain the original mean output by each layer of the network layer. At this time, each set of mapping relationships includes a sample set, an original class, N original means, N original covariance matrices, and N network levels, where one network level corresponds to one original mean and one original covariance matrix.
It can be understood that the sample to be detected is used as input data of the target detection model to obtain a sample mean value and a sample covariance matrix of the sample to be detected output at each layer of the network layer of the target detection model.
And S620, obtaining distance parameters according to the sample mean value, the sample covariance matrix and the sample reference set.
It can be understood that, the N sample mean values and the N sample covariance matrices are calculated according to step S610, and the similarity between the feature extracted from the sample to be detected in one network layer and the corresponding sample set (i.e., the distance parameter D (x, y)) can be obtained by calculating according to the following formula (2).
D(x, y) = √((x − y)^T Σ^(−1) (x − y))
Wherein x represents the sample mean, y represents the original mean of the corresponding sample set, and Σ represents the original covariance matrix of the corresponding sample set.
And calculating N distance parameters according to the method, thereby calculating the confidence coefficient of taking the sample category as the target category of the sample to be detected according to the preset weight and the N distance parameters, and further realizing the detection of the sample outside the distribution.
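A small NumPy sketch of formula (2) and the confidence calculation; the equal preset weights and the exponential mapping from weighted distance to a confidence in (0, 1] are illustrative assumptions, since the patent leaves the preset weights to the implementer.

```python
import numpy as np

def mahalanobis(x, y, cov):
    # D(x, y) = sqrt((x - y)^T Sigma^{-1} (x - y))
    diff = x - y
    return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))

def confidence_from_distances(distances, weights=None):
    distances = np.asarray(distances, dtype=float)
    if weights is None:
        weights = np.full_like(distances, 1.0 / len(distances))  # equal preset weights (assumption)
    # Larger weighted distances yield lower confidence (assumption).
    return float(np.exp(-np.dot(weights, distances)))

# Toy usage with N = 2 network levels and 8-dimensional features.
rng = np.random.default_rng(1)
sample_means = [rng.normal(size=8), rng.normal(size=8)]  # per-layer sample means of the sample to be detected
levels = [{"original_mean": rng.normal(size=8), "original_covariance": np.eye(8)} for _ in range(2)]
distances = [mahalanobis(s, lv["original_mean"], lv["original_covariance"])
             for s, lv in zip(sample_means, levels)]
print(confidence_from_distances(distances))
```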
Referring to fig. 7, in some embodiments, step S120 includes, but is not limited to, sub-steps S710 to S720.
S710, adding random noise to a sample to be detected to obtain a calibration detection sample;
specifically, random noise is added to the sample to be detected, and the sample to be detected with the random noise added is used as the calibration detection sample. The random noise includes Gaussian noise, random variables, and the like. It can be understood that, in order to improve the robustness of the target detection model, it should be ensured that the random noise types or magnitudes applied to the multiple samples to be detected input into the target detection model are different.
And S720, inputting the calibration detection sample into a target detection model for class detection.
It can be understood that the calibration detection sample with the random noise added is used as input data of the target detection model, so as to obtain the target detection value output by the last network layer of the target detection model for the calibration detection sample, and the preliminary detection of the calibration detection sample is then performed according to the target detection value and the sample reference set. A sketch of this pre-processing is given below.
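The following is a minimal sketch of steps S710 to S720, assuming the sample to be detected is a NumPy array and the target detection model is a callable that returns the last-layer output; the noise scale and names are illustrative assumptions.

```python
import numpy as np

def add_random_noise(sample, noise_scale=0.01, rng=None):
    """Perturb the sample to be detected with Gaussian noise to obtain the
    calibration detection sample; other noise types may be substituted."""
    rng = rng or np.random.default_rng()
    return sample + rng.normal(0.0, noise_scale, size=sample.shape)

def detect_with_calibration(sample, target_model):
    """Run class detection on the calibration detection sample and return the
    target detection value output by the last network layer."""
    calibration_sample = add_random_noise(sample)
    return target_model(calibration_sample)
```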
According to the sample detection method provided by the embodiment of the application, a target detection model with the same detection capability as the original detection model is obtained through knowledge distillation, which avoids the situation in the related art where the network structure of the original detection model cannot be obtained. The original detection model is used as the discriminator for the original generator, so that a target generator capable of generating data consistent with the original training data of the original detection model is obtained, which avoids the situation in the related art where the original training data of the original detection model cannot be acquired. Secondly, a sample reference set is constructed so that the target detection model can make a preliminary detection judgment, and the confidence of the preliminary judgment result is obtained through the Mahalanobis distance, thereby realizing the detection of out-of-distribution samples. Therefore, the sample detection method provided by the embodiment of the application can optimize the detection capability for out-of-distribution samples without modifying the network structure of the target detection model, on the basis of preserving the original classification capability of the target detection model, thereby improving the safety and reliability of the target detection model. In addition, the target detection model serves as a student model of the original detection model, and its network structure is simpler than that of the original detection model, so the target detection model is better suited for deployment on devices with lower computing power. A sketch of the data-free distillation idea is given below.
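The following is a minimal sketch of the data-free distillation idea summarized above: the original detection model (teacher) is frozen and also guides the generator, while the lighter target detection model (student) is trained on the generated images. It assumes PyTorch models; the loss choices and the one-hot generator objective are illustrative assumptions, not the exact training scheme of the embodiment.

```python
import torch
import torch.nn.functional as F

def distillation_step(generator, student, teacher, g_opt, s_opt, batch_size, latent_dim):
    z = torch.randn(batch_size, latent_dim)        # random input to the generator
    images = generator(z)                          # synthetic training images
    with torch.no_grad():
        teacher_logits = teacher(images)           # first training label (teacher output)
    student_logits = student(images.detach())      # second training label (student output)

    # Update the student so its outputs match the teacher's on the generated data.
    s_loss = F.kl_div(F.log_softmax(student_logits, dim=1),
                      F.softmax(teacher_logits, dim=1),
                      reduction="batchmean")
    s_opt.zero_grad(); s_loss.backward(); s_opt.step()

    # Update the generator so the (frozen) teacher classifies its images confidently,
    # keeping the generated data close to the teacher's original training distribution.
    teacher_logits = teacher(generator(z))
    g_loss = F.cross_entropy(teacher_logits, teacher_logits.argmax(dim=1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return s_loss.item(), g_loss.item()
```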
Referring to fig. 8, an embodiment of the present application further provides a sample detection device, including:
a sample obtaining module 810, configured to obtain a sample to be detected;
the mean value calculation module 820 is used for inputting the sample to be detected into a pre-trained target detection model for class detection to obtain a target detection value; the target detection value is a value output by the last network layer of the target detection model, the target detection value is used for representing the characteristic distribution of the sample to be detected, and the last network layer of the target detection model is a convolutional layer or a pooling layer;
the sample type confirming module 830 is configured to screen out a target mean value from a preset sample reference set according to a target detection value, and screen out a target sample from the sample reference set according to the target mean value; wherein the sample to be detected is an out-of-distribution sample of the target sample; screening out a calibration category from the sample reference set according to the target sample; taking the calibration category as a sample category of the sample to be detected;
the confidence coefficient calculation module 840 is used for acquiring a distance parameter between the sample to be detected and the target sample; the distance parameter is used for representing the similarity between the sample to be detected and the target sample; calculating confidence coefficient according to the distance parameter;
and a result calculation module 850, configured to determine the target category according to the confidence and the sample category. A sketch of how these modules compose is given below.
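The following is a minimal sketch of how these modules could compose into the overall detection flow; it reuses the helper functions sketched earlier, and the per-layer feature accessor, reference-set fields, and confidence threshold are illustrative assumptions.

```python
class SampleDetectionDevice:
    def __init__(self, target_model, sample_reference_set, weights, threshold=0.5):
        self.target_model = target_model              # pre-trained target detection model
        self.sample_reference_set = sample_reference_set
        self.weights = weights                        # preset per-layer weights
        self.threshold = threshold                    # assumed confidence threshold

    def detect(self, sample):
        # Sample obtaining and mean value calculation (modules 810, 820)
        target_detection_value = self.target_model(sample)
        # Sample type confirmation (module 830)
        screened = screen_reference_set(target_detection_value, self.sample_reference_set)
        sample_category = screened["calibration_category"]
        entry = self.sample_reference_set[sample_category]
        # Confidence calculation (module 840); layer_means() is an assumed accessor
        confidence, _ = layer_wise_confidence(
            self.target_model.layer_means(sample),    # per-layer sample mean values
            entry["layer_means"],                     # N original means of the matched set
            entry["layer_covs"],                      # N original covariance matrices
            self.weights,
        )
        # Result calculation (module 850)
        return sample_category if confidence >= self.threshold else "out-of-distribution"
```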
It can be seen that the contents of the sample detection method embodiments are all applicable to the sample detection device embodiments; the functions specifically implemented by the sample detection device embodiments are the same as those of the sample detection method embodiments, and the beneficial effects achieved are also the same as those of the sample detection method embodiments.
An embodiment of the present application further provides an electronic device, including:
at least one memory;
at least one processor;
at least one program;
the computer programs are stored in the memory, and the processor executes at least one of the programs to implement the sample detection method described above. The electronic device may be any intelligent terminal, including a mobile phone, a tablet computer, a Personal Digital Assistant (PDA), a vehicle-mounted computer, and the like.
Referring to fig. 9, fig. 9 illustrates a hardware structure of an electronic device of another embodiment, the electronic device including:
the processor 910 may be implemented by a general-purpose Central Processing Unit (CPU), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits, and is configured to execute related programs to implement the technical solutions provided by the embodiments of the present disclosure;
the Memory 920 may be implemented in the form of a Read Only Memory (ROM), a static storage device, a dynamic storage device, or a Random Access Memory (RAM). The memory 920 may store an operating system and other application programs, and when the technical solution provided by the embodiments of the present disclosure is implemented by software or firmware, the relevant program codes are stored in the memory 920 and called by the processor 910 to execute the sample detection method according to the embodiments of the present disclosure;
an input/output interface 930 for realizing information input and output;
the communication interface 940 is configured to implement communication interaction between this device and other devices, and the communication may be implemented in a wired manner (e.g., USB, network cable, etc.) or in a wireless manner (e.g., mobile network, Wi-Fi, Bluetooth, etc.);
a bus 950 that transfers information between various components of the device (e.g., the processor 910, the memory 920, the input/output interface 930, and the communication interface 940);
wherein the processor 910, the memory 920, the input/output interface 930, and the communication interface 940 are communicatively coupled to each other within the device via a bus 950.
The embodiment of the present application also provides a storage medium, which is a computer-readable storage medium, and the computer-readable storage medium stores computer-executable instructions, and the computer-executable instructions are used for causing a computer to execute the sample detection method.
The memory, as a non-transitory computer-readable storage medium, may be used to store non-transitory software programs as well as non-transitory computer-executable programs. Further, the memory may include high speed random access memory, and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory located remotely from the processor, and these remote memories may be connected to the processor through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The embodiments described in the embodiments of the present disclosure are for more clearly illustrating the technical solutions of the embodiments of the present disclosure, and do not constitute a limitation on the technical solutions provided in the embodiments of the present disclosure, and it is obvious to a person skilled in the art that, with the evolution of the technology and the appearance of new application scenarios, the technical solutions provided in the embodiments of the present disclosure are also applicable to similar technical problems.
Those skilled in the art will appreciate that the solutions shown in the figures are not intended to limit embodiments of the present disclosure, and may include more or less steps than those shown, or some of the steps may be combined, or different steps.
The above described embodiments of the apparatus are merely illustrative, wherein the units illustrated as separate components may or may not be physically separate, may be located in one place, or may be distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
One of ordinary skill in the art will appreciate that all or some of the steps of the methods, systems, functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof.
The terms "first," "second," "third," "fourth," and the like in the description of the application and the above-described figures, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that, in this application, "at least one" means one or more, "a plurality" means two or more. "and/or" is used to describe the association relationship of the associated object, indicating that there may be three relationships, for example, "a and/or B" may indicate: only A, only B and both A and B are present, wherein A and B may be singular or plural. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. "at least one of the following" or similar expressions refer to any combination of these items, including any combination of single item(s) or plural items. For example, at least one (one) of a, b, or c, may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", wherein a, b, c may be single or plural.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, a division of a unit is only a logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a separate product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes multiple instructions for enabling an electronic device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes: various media capable of storing programs, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The preferred embodiments of the present disclosure have been described above with reference to the accompanying drawings, which does not thereby limit the scope of the claims of the embodiments of the present disclosure. Any modifications, equivalent substitutions, and improvements made by those skilled in the art within the scope and spirit of the embodiments of the present disclosure shall fall within the scope of the claims of the embodiments of the present disclosure.

Claims (10)

1. A method of detecting a sample, comprising:
obtaining a sample to be detected;
inputting the sample to be detected into a pre-trained target detection model for class detection to obtain a target detection value; wherein the target detection model comprises N network layers, N is a positive integer greater than 1, the target detection value is a value output by the last network layer of the target detection model, the target detection value is used for representing the characteristic distribution of the sample to be detected, and the last network layer of the target detection model is a convolutional layer or a pooling layer;
screening out a target mean value from a preset sample reference set according to the target detection value, and screening out a target sample from the sample reference set according to the target mean value; wherein the sample to be detected is an out-of-distribution sample of the target sample;
screening out a calibration category from the sample reference set according to the target sample;
taking the calibration category as a sample category of the sample to be detected;
acquiring a distance parameter between the sample to be detected and the target sample; the distance parameter is used for representing the similarity between the sample to be detected and the target sample;
calculating confidence according to the distance parameters;
and determining a target class according to the confidence coefficient and the sample class.
2. The sample detection method according to claim 1, wherein before the sample to be detected is input to a pre-trained target detection model for class detection to obtain a target detection value, the sample detection method further comprises training the target detection model, and specifically comprises:
generating image data according to a preset original generator;
inputting the image data into a pre-trained original detection model for processing to obtain a first training label;
inputting the image data into a preset calibration detection model for training to obtain a second training label;
and adjusting parameters of the calibration detection model according to the second training label and the first training label to obtain the target detection model.
3. The sample detection method according to claim 2, wherein the training of the target detection model further comprises iterative training of the target detection model, specifically comprising:
adjusting parameters of the original generator according to the first training label to obtain a preliminary generator;
updating the image data according to the preliminary generator;
and performing iterative training on the target detection model according to the updated image data until the target detection model meets a preset convergence condition, and updating the preliminary generator into a target generator.
4. The sample detection method according to claim 3, wherein before the selecting a target mean value from a preset sample reference set according to the target detection value and selecting a target sample from the sample reference set according to the target mean value, the sample detection method further includes constructing the sample reference set, including:
inputting a preset random sample into the target generator for processing to obtain an original sample;
inputting the original sample into the original detection model for classification processing to obtain an original category;
taking a plurality of original samples with the same original category as a sample set;
inputting the sample set into the target detection model for class detection to obtain an original detection value; wherein the original detection value is a value output by each network layer of the target detection model;
obtaining the hierarchy of the network layer according to the original detection value to obtain the network hierarchy;
constructing the sample reference set according to a plurality of the sample sets, a plurality of the original categories, a plurality of the original detection values, and a plurality of the network hierarchies.
5. The sample detection method of claim 4, wherein the original detection values comprise original mean values;
the screening of the target mean value from a preset sample reference set according to the target detection value comprises:
performing difference calculation on the target detection value and the calibration mean value to obtain a mean value difference; wherein the calibration mean is an original mean of the last layer of the network hierarchy in the sample reference set;
acquiring the minimum average value difference as a target difference value;
and screening out the target mean value from the calibration mean value according to the target difference value.
6. The sample detection method as recited in claim 4, wherein the original detection values further comprise an original covariance matrix;
the obtaining of the distance parameter between the sample to be detected and the target sample includes:
inputting the sample to be detected into the target detection model for class detection to obtain sample mean values and sample covariance matrices; wherein the sample mean values are values output by the N network layers of the target detection model, and the sample covariance matrices are matrices output by the N network layers of the target detection model;
and obtaining the distance parameter according to the sample mean, the sample covariance matrix and the sample reference set.
7. The sample detection method according to any one of claims 1 to 6, wherein the inputting the sample to be detected into a pre-trained target detection model for class detection comprises:
adding random noise to the sample to be detected to obtain a calibration detection sample;
and inputting the calibration detection sample into the target detection model for class detection.
8. A sample testing device, comprising:
the sample acquisition module is used for acquiring a sample to be detected;
the mean value calculation module is used for inputting the sample to be detected into a pre-trained target detection model for class detection to obtain a target detection value; the target detection value is a value output by the last network layer of the target detection model, the target detection value is used for representing the characteristic distribution of the sample to be detected, and the last network layer of the target detection model is a convolutional layer or a pooling layer;
the sample type confirmation module is used for screening a target mean value from a preset sample reference set according to the target detection value and screening a target sample from the sample reference set according to the target mean value; wherein the sample to be detected is an out-of-distribution sample of the target sample; screening out a calibration category from the sample reference set according to the target sample; taking the calibration category as a sample category of the sample to be detected;
the confidence coefficient calculation module is used for acquiring a distance parameter between the sample to be detected and the target sample; the distance parameter is used for representing the similarity between the sample to be detected and the target sample; calculating confidence according to the distance parameters;
and the result calculation module is used for determining the target category according to the confidence coefficient and the sample category.
9. An electronic device, comprising:
at least one memory;
at least one processor;
at least one computer program;
the computer programs are stored in the memory, and the processor executes the at least one computer program to implement:
the method for detecting a specimen according to any one of claims 1 to 7.
10. A computer-readable storage medium having computer-executable instructions stored thereon for causing a computer to perform:
the method for detecting a specimen according to any one of claims 1 to 7.
CN202210820164.3A 2022-07-13 2022-07-13 Sample detection method, sample detection device, electronic apparatus, and storage medium Pending CN115374950A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210820164.3A CN115374950A (en) 2022-07-13 2022-07-13 Sample detection method, sample detection device, electronic apparatus, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210820164.3A CN115374950A (en) 2022-07-13 2022-07-13 Sample detection method, sample detection device, electronic apparatus, and storage medium

Publications (1)

Publication Number Publication Date
CN115374950A true CN115374950A (en) 2022-11-22

Family

ID=84061625

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210820164.3A Pending CN115374950A (en) 2022-07-13 2022-07-13 Sample detection method, sample detection device, electronic apparatus, and storage medium

Country Status (1)

Country Link
CN (1) CN115374950A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116776248A (en) * 2023-06-21 2023-09-19 哈尔滨工业大学 Virtual logarithm-based out-of-distribution detection method
CN117235580A (en) * 2023-09-26 2023-12-15 复旦大学 Sample type detection and category confidence determination model generation method and device

Similar Documents

Publication Publication Date Title
CN109993102B (en) Similar face retrieval method, device and storage medium
CN115374950A (en) Sample detection method, sample detection device, electronic apparatus, and storage medium
US20220121934A1 (en) Identifying neural networks that generate disentangled representations
CN114332578A (en) Image anomaly detection model training method, image anomaly detection method and device
CN111738351A (en) Model training method and device, storage medium and electronic equipment
CN113095370A (en) Image recognition method and device, electronic equipment and storage medium
CN114048468A (en) Intrusion detection method, intrusion detection model training method, device and medium
CN114019370A (en) Motor fault detection method based on gray level image and lightweight CNN-SVM model
CN115187772A (en) Training method, device and equipment of target detection network and target detection method, device and equipment
CN114170484B (en) Picture attribute prediction method and device, electronic equipment and storage medium
CN115983497A (en) Time sequence data prediction method and device, computer equipment and storage medium
CN114998583A (en) Image processing method, image processing apparatus, device, and storage medium
CN117540829A (en) Knowledge sharing large language model collaborative optimization method and system
CN116975711A (en) Multi-view data classification method and related equipment
CN117033956A (en) Data processing method, system, electronic equipment and medium based on data driving
CN115269901A (en) Method, device and equipment for generating extended image
CN112861601A (en) Method for generating confrontation sample and related equipment
CN117150326B (en) New energy node output power prediction method, device, equipment and storage medium
CN117058498B (en) Training method of segmentation map evaluation model, and segmentation map evaluation method and device
CN116664958B (en) Image classification method based on binary neural network model and related equipment
CN117011616B (en) Image content auditing method and device, storage medium and electronic equipment
CN109871487B (en) News recall method and system
CN117132810A (en) Target detection method, model training method, device, equipment and storage medium
CN117437679A (en) Expression recognition method, expression recognition device, electronic equipment and storage medium
CN117197424A (en) Target detection method, sample enhancement method apparatus, device, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination