CN114863178A

CN114863178A - Image data input detection method and system for neural network vision system

Info

Publication number: CN114863178A
Application number: CN202210522508.2A
Authority: CN
Inventors: 徐经纬; 许畅; 朱思远
Original assignee: Nanjing University
Current assignee: Nanjing University
Priority date: 2022-05-13
Filing date: 2022-05-13
Publication date: 2022-08-05
Anticipated expiration: 2042-05-13
Also published as: CN114863178B

Abstract

The invention discloses an image data input detection method and system for a neural network vision system. Step 1: given a neural network model and its training image data set, the training image data set is input to the neural network model and intermediate results are collected to obtain the hidden features of the neural network. Step 2: a Gaussian mixture model is fitted to the intermediate results to obtain model parameters, and the path frequencies of the training image data set are collected to calculate probabilities. Step 3: the image data to be tested is input into the neural network model and its intermediate results are collected as in step 1; the Gaussian mixture model from step 2 is used to calculate the generation probability and the inter-layer transition probability of the intermediate results, and the joint probability estimation model performs fast probability estimation to verify whether the input image data to be tested is valid.

Description

Image data input detection method and system for neural network vision system

Technical Field

The invention relates to an image data input detection method and system for a neural network vision system, and belongs to the technical fields of deep neural network model input verification, intelligent software quality assurance, image data processing and the like.

Background

Software systems based on deep neural networks have been widely applied in many areas of production and daily life in recent years, bringing great convenience. An important application field of deep learning models is machine vision, which includes sub-tasks such as image classification, object detection, and semantic segmentation. Although software systems based on deep neural networks perform well on these tasks, neural network models generally do not reach 100% accuracy. Moreover, the prediction process of a deep neural network model is difficult to interpret, so whether a prediction of the vision model is correct cannot be judged by simple means.

Vision systems based on neural network models are already used in fields with high safety requirements, such as face recognition and video surveillance, where wrong predictions may have serious consequences. To ensure the safety of machine vision systems based on deep neural networks, abnormal input detection techniques for image data have begun to emerge. For a given image data sample, such a method estimates the probability that the sample causes an abnormal prediction in the neural network vision system. If the estimated probability exceeds a predetermined threshold, the sample is rejected and a warning is issued, which keeps the machine vision intelligent system running reliably.

Existing image input detection techniques remain limited in applicability and reliability. Some methods are based on discriminative models and treat input image detection as a binary classification problem between normal and abnormal data. Building these methods requires abnormal input images, and obtaining and labeling such samples increases the cost of application. In addition, because abnormal data takes part in training, discriminative methods may overfit to a specific data distribution, so the same generalization performance cannot be guaranteed for image data from different distributions. Other methods are characterized as model-independent, yet they retrain the deep neural network model with specific strategies. This not only increases the cost and difficulty of applying the detection technique, but the retrained model often performs worse, which is unacceptable in many application scenarios.

Summary of the Invention

Purpose of the invention: in view of the problems and deficiencies in the prior art, the invention provides an image data input detection method and system for a neural network vision system based on joint modeling of the intermediate layers. The invention can be applied to the machine vision intelligent systems based on deep neural networks that are common in real life: it collects and analyzes the intermediate results produced during inference of the deep neural network system and detects input image samples that the deep neural network model cannot predict correctly.

The advantages of the invention are as follows. First, its use is not strictly limited by the complexity of the neural network model or by the application scenario; in particular, it remains applicable in scenarios where abnormal image data is difficult to collect. Second, the method detects the validity of input image data with high accuracy and can effectively distinguish valid from invalid input images, thereby avoiding meaningless prediction results. Third, the time cost of checking an input image is small, so the method can meet real-time verification requirements and can be deployed in a running machine vision model for input detection.

Technical solution: an image data input detection method for a neural network vision system, comprising the following steps.

Step 1: neural network hidden feature extraction. Given a neural network model and its training image data set, input the training image data set to the neural network model and collect the intermediate results to obtain the hidden features of the neural network.

First, for the given neural network model to be tested, rewrite its forward propagation sub-process so that it exports intermediate results during inference. Then input the given training image data set into the neural network model and collect the hidden features of the intermediate layers.
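The rewriting of the forward propagation described above can be illustrated with a minimal sketch. The `TinyMLP` class, its `forward_with_features` method, and the use of ReLU are hypothetical choices made only for illustration; the source does not name a framework or a network architecture. The point shown is that the rewritten pass returns the ordinary prediction together with every hidden-layer output.

```python
def relu(values):
    """Element-wise ReLU activation."""
    return [max(0.0, v) for v in values]


def dense(weights, bias, x):
    """Fully connected layer; `weights` holds one row per output neuron."""
    return [sum(w * a for w, a in zip(row, x)) + b for row, b in zip(weights, bias)]


class TinyMLP:
    """Hypothetical toy network, used only to illustrate feature export."""

    def __init__(self, layers):
        self.layers = layers  # list of (weights, bias) pairs, last one is the output layer

    def forward(self, x):
        """Ordinary inference: returns only the final output."""
        return self.forward_with_features(x)[0]

    def forward_with_features(self, x):
        """Rewritten forward pass that also returns each hidden-layer output."""
        features = []
        h = x
        last = len(self.layers) - 1
        for idx, (weights, bias) in enumerate(self.layers):
            h = dense(weights, bias, h)
            if idx < last:  # hidden layers only; the output layer is not a hidden feature
                h = relu(h)
                features.append(list(h))
        return h, features
```

In a real framework the same effect is usually obtained with hooks or by subclassing the model; the sketch only makes the data flow explicit.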

Step 2: representation-space joint probability modeling. Fit the intermediate results with a Gaussian mixture model to obtain the model parameters, and collect the path frequencies of the training image data set to calculate probabilities.

Building the joint probability model takes two sub-steps. First, a generative model based on a probabilistic graph is established from the intermediate-layer hidden features produced in step 1. Then the intermediate-layer features are mapped into a discrete space to obtain the transition probabilities of image data samples between adjacent intermediate layers.

Step 3: joint probability estimation. Input the image data to be tested into the neural network model and collect its intermediate results as in step 1. Use the Gaussian mixture model from step 2 to calculate the generation probability and the inter-layer transition probability of the intermediate results, and use the joint probability estimation model to perform fast probability estimation and verify whether the input image data to be tested is valid.

The neural network is a deep neural network, that is, a machine learning model for image data feature extraction and prediction formed by connecting neurons in layers, comprising an input layer, hidden layers and an output layer. Each layer contains a large number of neurons, and neurons in different layers are connected to one another. Image data is passed from the input layer to the output layer, which outputs the prediction result, such as the category to which the image data belongs. The intermediate-layer result is the hidden feature data output by the neurons of the hidden layers between the input layer and the output layer. A neuron is a structure that applies a built-in function or similar operation to its input data and outputs the result. The input image data refers to a single image data sample or a batch of samples conforming to the input format of the deep neural network model.

In step 1, the neural network model to be tested is a pre-trained model for which complete model information (such as the model architecture and neuron parameters) is available and full operation permissions (such as collecting intermediate results and modifying model parameters) are granted.

In step 1, inputting the training image data set to the neural network model and collecting intermediate results means overriding the forward propagation of the neural network so that it retains the hidden features output by the intermediate-layer neurons, and collecting the intermediate-layer hidden features of the training image data set. Because the neurons of a neural network model are arranged in layers, the intermediate results are likewise divided into layers. In theory all intermediate-layer hidden features can be collected and used. In practice, however, hardware is limited and results obtained from a sampled subset already reach fairly high accuracy, so it is unnecessary to use all intermediate results. The intermediate results can be sampled with a uniform extraction strategy or with an extraction strategy that follows the model structure. If the amount of intermediate-layer feature data is still large, redundant information can be removed by dimensionality reduction. For example, global pooling achieves good results, and the PCA dimensionality reduction algorithm can also be used together with the invention.
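The uniform extraction strategy and the global pooling mentioned above might look as follows. This is a sketch under stated assumptions: the function names `uniform_layer_indices` and `global_average_pool` are hypothetical, average pooling is chosen as one possible form of global pooling, and a feature map is assumed to be a list of channels, each a two-dimensional list.

```python
def uniform_layer_indices(num_layers, num_selected):
    """Pick `num_selected` evenly spaced layer indices out of `num_layers`."""
    if num_selected <= 0:
        return []
    if num_selected >= num_layers:
        return list(range(num_layers))
    step = num_layers / num_selected
    return [int(i * step) for i in range(num_selected)]


def global_average_pool(feature_map):
    """Reduce a C x H x W feature map to a length-C vector of channel means."""
    pooled = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled
```

Pooling a convolutional feature map to one value per channel reduces the dimension from C·H·W to C, which keeps the later mixture model fitting tractable.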

In step 2, the Gaussian mixture model refers to a multi-dimensional Gaussian mixture clustering algorithm, which can fit complex multi-dimensional data distributions. The common expectation-maximization (EM) algorithm can be used to obtain an approximate solution of the Gaussian mixture model, yielding several Gaussian components and their weights, from which the sample probability is obtained generatively. These model parameters are the basis of the subsequent framework for evaluating the probability of new samples.

In step 2, representation-space joint probability modeling refers to joint probability modeling of the hidden features in the representation space of the deep neural network, and its purpose is to model the inference process of the neural network. The representation space is the data space of the intermediate-layer hidden features and contains the intermediate representations of the model input data.

In step 2, fitting the intermediate-layer hidden features layer by layer means using a Gaussian mixture model (GMM) to build a probability distribution model of the hidden features of each intermediate layer. The expectation-maximization (EM) algorithm is used to build a Gaussian mixture model on the output of the i-th intermediate layer, yielding the parameters Θ_i of the K_i Gaussian components and the weight Φ_i of each Gaussian component. The parameters Φ_i and Θ_i of the probabilistic graphical model are the basis for evaluating the anomaly probability of image data.
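Once the EM algorithm has produced Θ_i and Φ_i for a layer, evaluating a feature vector against the mixture is straightforward. The sketch below assumes diagonal covariance matrices, which the source does not specify, and the function names are hypothetical. Here `means` and `variances` play the role of Θ_i and `weights` the role of Φ_i.

```python
import math


def diag_gaussian_pdf(x, mean, var):
    """Density of a Gaussian with diagonal covariance (assumption) at point x."""
    log_p = 0.0
    for xi, mu, v in zip(x, mean, var):
        log_p += -0.5 * (math.log(2.0 * math.pi * v) + (xi - mu) ** 2 / v)
    return math.exp(log_p)


def component_likelihoods(x, means, variances):
    """P(x | z = k) for every Gaussian component k of one layer."""
    return [diag_gaussian_pdf(x, m, v) for m, v in zip(means, variances)]


def mixture_density(x, weights, means, variances):
    """Mixture density: sum over k of weights[k] * P(x | z = k)."""
    likelihoods = component_likelihoods(x, means, variances)
    return sum(w * p for w, p in zip(weights, likelihoods))
```

Fitting the parameters themselves is left to an existing EM implementation; only the evaluation side is sketched here.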

In step 2, the probabilities are calculated from the path frequencies of the training image data set as follows: all samples of the training set produce the path data {(z_i, z_{i-1})^m | m ∈ Z^m}, where m is the training set size. The probability P(z_i | z_{i-1}) of a transition from z_{i-1} to z_i is obtained by counting, which provides the basic parameters for the subsequent evaluation of sample probability.

In step 2, the transition probability between adjacent intermediate layers is the probability, measured on the training image data set, that the discrete component z_i corresponding to the intermediate-layer feature x_i changes from one layer to the next. The discrete component z_i is defined by the clustering result of the GMM of layer i: z_i is the cluster to which the layer-i feature x_i of the input image data belongs. The probability P(z_i | z_{i-1}) of a transition from z_{i-1} to z_i is then calculated on the training image data set; this is the transition probability from layer i-1 to layer i. Since z_i can take K_i values, the transition probability from layer i-1 to layer i is a matrix of size K_{i-1} × K_i.
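The frequency counting that yields the K_{i-1} × K_i transition matrix can be sketched as below. The function name and the uniform fallback for clusters that never occur in the training set are assumptions made for illustration; the source does not say how unseen clusters are handled.

```python
def transition_matrix(prev_labels, curr_labels, k_prev, k_curr):
    """Estimate P(z_i = b | z_{i-1} = a) from per-sample cluster labels.

    prev_labels[n] and curr_labels[n] are the clusters of training sample n
    in layer i-1 and layer i respectively.
    """
    counts = [[0] * k_curr for _ in range(k_prev)]
    for a, b in zip(prev_labels, curr_labels):
        counts[a][b] += 1

    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            # Assumption: a cluster with no training samples gets a uniform row.
            matrix.append([1.0 / k_curr] * k_curr)
        else:
            matrix.append([c / total for c in row])
    return matrix
```

Each row sums to 1, so row a is the conditional distribution of the layer-i cluster given that the sample was in cluster a at layer i-1.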

In step 3, the joint probability estimation model estimates the joint probability of the intermediate-layer output sequence {x₁, x₂, …, x_m} through the probabilistic graphical model. The time complexity of computing the joint probability P(x₁, x₂, …, x_m) directly grows exponentially with m, so the invention uses a fast forward algorithm based on dynamic programming. Let

α_i(z_i) ≡ P(z_i, x₁, x₂, …, x_i),

then α_i(·) can be generated by the following recursion:

α_i(z_i) = P(x_i | z_i) · Σ_{z_{i-1}=1}^{K_{i-1}} α_{i-1}(z_{i-1}) · P(z_i | z_{i-1}),

where K_{i-1} denotes the number of Gaussian components in layer i-1, P(x_i | z_i) is given by the GMM of layer i, and P(z_i | z_{i-1}) is the transition probability. The joint probability of the intermediate-layer outputs, obtained by summing α_m(z_m) over z_m, is thus computed with time complexity linear in m.
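A minimal sketch of the forward algorithm described above follows. Two points are assumptions rather than statements from the source: the recursion is started with α₁(z₁) = P(z₁)·P(x₁ | z₁), with P(z₁) taken as the mixture weights of the first layer, and the inputs are passed as plain nested lists. With zero-based indexing, `transitions[i-1][a][b]` holds the probability of moving from cluster a in layer i-1 to cluster b in layer i, and `emissions[i][k]` holds P(x_i | z_i = k).

```python
def forward_joint_probability(initial, transitions, emissions):
    """Joint probability P(x_1, ..., x_m) via the forward recursion.

    initial:     P(z_1 = k) for the first layer (assumed: GMM weights of layer 1).
    transitions: one K_{i-1} x K_i matrix per pair of adjacent layers.
    emissions:   per layer, the list of P(x_i | z_i = k) given by that layer's GMM.
    """
    # Base case: alpha_1(k) = P(z_1 = k) * P(x_1 | z_1 = k)
    alpha = [p * e for p, e in zip(initial, emissions[0])]

    # Recursion: alpha_i(b) = P(x_i | b) * sum_a alpha_{i-1}(a) * P(b | a)
    for i in range(1, len(emissions)):
        trans = transitions[i - 1]
        alpha = [
            emissions[i][b] * sum(alpha[a] * trans[a][b] for a in range(len(alpha)))
            for b in range(len(emissions[i]))
        ]

    # Marginalize the last hidden state.
    return sum(alpha)
```

Each layer costs K_{i-1}·K_i multiplications, so the total cost is linear in the number of layers instead of exponential.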

In step 3, verifying whether the input image data to be tested is valid means judging whether the joint probability of the image data is greater than a preset threshold. The threshold is a value between 0 and 1: the closer it is to 0, the more the method favors precision, and the closer it is to 1, the more the method favors recall. One feasible strategy in practice is to use the joint probabilities of the training image data set to set a threshold under which most of the image data (for example 90%) is judged normal.
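The threshold strategy described above can be sketched as a simple quantile over the joint probabilities of the training set. The function names, the default of 0.9, and the convention that a score equal to the threshold is accepted are assumptions for illustration.

```python
def threshold_for_acceptance(train_scores, accept_fraction=0.9):
    """Return a threshold under which about `accept_fraction` of the
    training joint probabilities are accepted (score >= threshold)."""
    if not train_scores:
        raise ValueError("train_scores must not be empty")
    ordered = sorted(train_scores)
    n = len(ordered)
    # Number of training samples that should pass, clamped to [1, n].
    n_accept = max(1, min(n, round(n * accept_fraction)))
    return ordered[n - n_accept]


def is_valid_input(joint_probability, threshold):
    """Accept the input when its joint probability reaches the threshold.

    Assumption: the boundary case (equal to the threshold) is accepted.
    """
    return joint_probability >= threshold
```

For example, with ten training scores and `accept_fraction=0.9`, the threshold is the second-smallest score, so nine of the ten training samples are judged normal.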

An image data input detection system for a neural network vision system, comprising:

Neural network hidden feature extractor: given a neural network model and its training image data set, inputs the training image data set to the neural network model and collects the intermediate results to obtain the hidden features of the neural network.

Representation-space joint probability modeling tool: fits the intermediate results with a Gaussian mixture model to obtain the model parameters, and collects the path frequencies of the training image data set to calculate probabilities.

Building the joint probability model takes two sub-steps. First, a generative model based on a probabilistic graph is established from the intermediate-layer hidden features produced by the neural network hidden feature extractor. Then the intermediate-layer features are mapped into a discrete space to obtain the transition probabilities of image data samples between adjacent intermediate layers.

Joint probability estimation model: inputs the image data to be tested into the neural network model and uses the neural network hidden feature extractor to collect its intermediate results. The Gaussian mixture model of the representation-space joint probability modeling tool is used to calculate the generation probability and the inter-layer transition probability of the intermediate results, and the joint probability estimation model performs fast probability estimation to verify whether the input image data to be tested is valid.

A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the image data input detection method for a neural network vision system described above.

A computer-readable storage medium storing a computer program for executing the image data input detection method for a neural network vision system described above.

Beneficial effects: the invention compensates for the difficulty that ordinary neural network vision models have in recognizing abnormal image data, thereby safeguarding the intelligent system. Compared with existing input data verification techniques for neural network models, the invention does not require abnormal data for training and is not prone to overfitting to specific data. The invention efficiently evaluates the validity of input image data and uses the evaluated validity to screen test images in real time, improving the effect of neural vision systems in actual deployment.

Description of Drawings

FIG. 1 is a schematic diagram of the method according to an embodiment of the invention;

FIG. 2 is a schematic diagram of the neural network hidden feature extractor according to an embodiment of the invention;

FIG. 3 is a flowchart of the representation-space joint probability modeling tool according to an embodiment of the invention;

FIG. 4 is a functional schematic diagram of the joint probability estimation model according to an embodiment of the invention;

FIG. 5 is a schematic diagram of the probabilistic graphical model according to an embodiment of the invention.

Detailed Description

The invention is further illustrated below with reference to specific embodiments. It should be understood that these embodiments only illustrate the invention and do not limit its scope. After reading the invention, modifications of various equivalent forms made by those skilled in the art all fall within the scope defined by the claims appended to this application.

As shown in FIG. 1, the image data input detection method for a neural network vision system consists of two stages: training and deployment. In the training stage, the hidden feature extractor first inputs the training image data into the neural network model and generates the hidden features produced by each intermediate layer. The representation-space joint modeling tool then trains Gaussian mixture models on the intermediate-layer features and uses their clustering results to calculate the transition probabilities between adjacent layers. In the deployment stage, any image data to be tested is first input into the neural network to obtain its intermediate-layer features, and the joint probability estimation model then calculates the joint probability of those features. Finally, the joint probability is compared with a predetermined threshold to decide whether to accept or reject the image data to be tested. The whole framework contains three modules corresponding to the three steps: the neural network hidden feature extractor, the representation-space joint probability modeling tool, and the joint probability estimation model.

The image data input detection system for a neural network vision system operates as follows.

Step 1: given a neural network model and its training image data set, input the training image data set to the neural network model and collect the intermediate results to obtain the hidden features of the neural network.

As shown in FIG. 2, the forward propagation of the neural network vision model is overridden so that it retains the hidden features output by the intermediate-layer neurons. The training image data set is input into the neural network, and the hidden features {x₁, x₂, …, x_m} produced by the intermediate layers of the model are extracted, where m is the number of intermediate layers. In theory all of these intermediate features can be collected and used. In practice, however, hardware is limited and results obtained from a sampled subset already reach fairly high accuracy, so it is unnecessary to use all intermediate results. Sampling can follow a uniform extraction strategy or an extraction strategy based on the model structure. If the amount of intermediate-layer feature data is still large, the redundant information in the intermediate-layer outputs can be removed by dimensionality reduction. For example, global pooling achieves good results, and the PCA dimensionality reduction algorithm can also be used together with the invention. The processed intermediate-layer results are saved for subsequent training.

This step requires that complete model information (such as the model architecture and neuron parameters) of the neural network vision system to be tested is available and that full operation permissions (such as collecting intermediate results and modifying model parameters) are granted. The method is not applicable to black-box models that only provide an API service.

Step 2: use representation-space joint probability modeling to model the inference process of the neural network vision model on the training set image data.

As shown in FIG. 3, Gaussian mixture models are trained layer by layer on the intermediate-layer data obtained in step 1. The parameters of the Gaussian mixture model of each layer, such as the number of Gaussian components, can be chosen according to the scale of the image data and the computing power of the device. The Gaussian mixture model of each intermediate layer is retained, and the training set image data is clustered to obtain the clustering result z_i of each intermediate-layer output. From this result, the transition probability P(z_i | z_{i-1}) between the clusters of adjacent intermediate layers is calculated and saved for use with image data to be tested online.

Representation-space joint probability modeling refers to joint probability modeling of the hidden features in the representation space of the deep neural network, and its purpose is to model the inference process of the neural vision network on image data. The representation space is the data space of the intermediate-layer hidden features and contains the intermediate representations of the model input data. The method uses the probabilistic graphical model shown in FIG. 5 to represent the joint probability of the intermediate layers. The intermediate layers of the neural network correspond to the layers of the probabilistic graphical model, where x_i is the output of the i-th intermediate layer of the neural network model, z_i is the hidden state of the i-th layer of the probabilistic graphical model, and Φ_i and Θ_i are the parameters of the method.

Fitting the intermediate-layer hidden features layer by layer means using a Gaussian mixture model (GMM) to build a probability distribution model of the hidden features of each intermediate layer. The expectation-maximization (EM) algorithm is used to build a Gaussian mixture model on the output of the i-th intermediate layer, yielding the parameters Θ_i of the K_i Gaussian components and the weight Φ_i of each Gaussian component. The parameters Φ_i and Θ_i of the probabilistic graphical model are the basis for evaluating the anomaly probability of image data.

The transition probability between adjacent intermediate layers is the probability, measured on the training image data set, that the discrete component z_i corresponding to the intermediate-layer feature x_i changes from one layer to the next. The discrete component z_i is defined by the clustering result of the GMM of layer i: z_i is the cluster to which the layer-i feature x_i of the input data belongs. The probability P(z_i | z_{i-1}) of a transition from z_{i-1} to z_i is calculated on the training data set; this is the transition probability from layer i-1 to layer i. Since z_i can take K_i values, the transition probability from layer i-1 to layer i is a matrix of size K_{i-1} × K_i.
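The mapping from a continuous feature x_i to its discrete component z_i can be sketched as a hard assignment to the component with the largest posterior responsibility. This is one common reading of "the cluster to which the feature belongs"; the source does not state the assignment rule, and the function names are hypothetical. The inputs are the component weights Φ_i and the per-component likelihoods P(x_i | z_i = k).

```python
def assign_cluster(weights, likelihoods):
    """Return argmax over k of weights[k] * likelihoods[k].

    The posterior P(z = k | x) is proportional to this product, so the
    normalizing constant can be skipped for the argmax.
    """
    scores = [w * p for w, p in zip(weights, likelihoods)]
    return max(range(len(scores)), key=scores.__getitem__)


def cluster_path(per_layer_weights, per_layer_likelihoods):
    """Cluster index of one sample in every selected intermediate layer."""
    return [
        assign_cluster(w, p)
        for w, p in zip(per_layer_weights, per_layer_likelihoods)
    ]
```

The per-sample paths produced this way are the input to the frequency counting that yields the transition matrices.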

Step 3: use the joint probability estimation model to analyze the validity of the image data to be tested and report the result.

As shown in FIG. 4, one image or a batch of images to be tested is input into the neural network model, inference is performed and the intermediate results are extracted, and the intermediate-layer results are reduced in dimension and saved. Note that the intermediate layers selected in this step must be the same as those used in step 1, and the dimensionality reduction must also be the same. The output of each intermediate layer is fed into the Gaussian mixture model of that layer to obtain the probability P(x_i | z_i) of the intermediate-layer result under each Gaussian component, where z_i denotes the Gaussian component of the image data at layer i. The joint probability of the intermediate-layer output sequence {x₁, x₂, …, x_m} is estimated through the probabilistic graphical model. The time complexity of computing the joint probability P(x₁, x₂, …, x_m) directly grows exponentially with m, so a fast forward algorithm based on dynamic programming is used. Let α_i(z_i) ≡ P(z_i, x₁, x₂, …, x_i), as defined in the summary above; α_i(·) is computed recursively from layer to layer.

This finally yields the joint probability of the intermediate-layer feature sequence {x₁, x₂, …, x_m} of the image data. The method then judges whether the joint probability of the image data to be tested is greater than a preset threshold. The threshold is a value between 0 and 1: the closer it is to 0, the more the method favors precision, and the closer it is to 1, the more the method favors recall. One feasible strategy in practice is to use the joint probabilities of the training image data set to set a threshold under which most image samples (for example 90%) are judged normal.
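In an implementation, products of many small densities can underflow in floating-point arithmetic, so the joint probability is often accumulated in the log domain. The sketch below is an implementation suggestion, not something stated in the source: it assumes the initial distribution, transition matrices, and component likelihoods are all supplied as logarithms, and the function names are hypothetical. The threshold would then be compared against the log joint probability instead.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_joint(log_initial, log_transitions, log_emissions):
    """Log of the joint probability log P(x_1, ..., x_m), forward recursion in log space.

    log_transitions[i-1][a][b] = log P(z_i = b | z_{i-1} = a)
    log_emissions[i][k]        = log P(x_i | z_i = k)
    """
    log_alpha = [p + e for p, e in zip(log_initial, log_emissions[0])]
    for i in range(1, len(log_emissions)):
        lt = log_transitions[i - 1]
        log_alpha = [
            log_emissions[i][b]
            + logsumexp([log_alpha[a] + lt[a][b] for a in range(len(log_alpha))])
            for b in range(len(log_emissions[i]))
        ]
    return logsumexp(log_alpha)
```

The result equals the logarithm of the ordinary forward-algorithm output whenever the latter is representable, and stays finite when the plain product would round to zero.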

Obviously, those skilled in the art should understand that the steps of the image data input detection method and the modules of the image data input detection system for a neural network vision system in the above embodiments of the invention can be implemented with a general-purpose computing device. They can be concentrated on a single computing device or distributed over a network of multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device. In some cases the steps shown or described can be performed in an order different from the one given here, or they can be made into individual integrated circuit modules, or several of the modules or steps can be made into a single integrated circuit module. Thus, the embodiments of the invention are not limited to any specific combination of hardware and software.

Claims

1. An image data input detection method for a neural network vision system, characterized by comprising the following steps:

step one: neural network hidden feature extraction; given a neural network model and its training image data set, inputting the training image data set into the neural network model and collecting the intermediate results to obtain the hidden features of the neural network;

step two: representation space joint probability modeling; fitting the intermediate results with a Gaussian mixture model to obtain model parameters, and collecting the path frequencies of the training image data set to calculate probabilities;

step three: joint probability estimation; inputting the image data to be tested into the neural network model, and collecting the intermediate results according to the method in step one; calculating the generation probability and the inter-layer transition probability of the intermediate results by using the Gaussian mixture model of step two, performing fast probability estimation by using a joint probability estimation model, and verifying whether the input image data to be tested is valid.

2. The image data input detection method for a neural network vision system as claimed in claim 1, wherein the neural network is a deep neural network, which is a machine learning model for image data feature extraction and prediction formed by hierarchically connected neurons, and comprises an input layer, hidden layers and an output layer; each layer comprises neurons, the neurons are connected between layers, and image data is propagated from the input layer to the output layer, where a prediction result is output; the intermediate-layer result is the hidden feature data output by the neurons of a hidden layer located between the input layer and the output layer; a neuron is a structure that applies a built-in function or the like to its input and outputs the result of the operation; the input image data refers to a single image data sample or a batch of image data samples conforming to the input format of the deep neural network model.

3. The image data input detection method for a neural network vision system as claimed in claim 1, wherein in step one, the forward propagation subprocess of the given neural network model under test is rewritten so that the model exports intermediate results during inference; the given training image data set is then input into the neural network model and the intermediate-layer hidden features are collected.

4. The image data input detection method for a neural network vision system as claimed in claim 1, wherein the intermediate results are sampled by using a uniform extraction strategy or a model-structure-based extraction strategy.

5. The image data input detection method for a neural network vision system as claimed in claim 1, wherein in step two, constructing the joint probability model requires two steps: first, a generative model based on a probabilistic graphical model is established by using the intermediate-layer hidden features generated in step one; then the intermediate-layer features are mapped into a discrete space to obtain the transition probabilities of image data samples between adjacent intermediate layers;

the representation space refers to the data space of the intermediate-layer hidden features, which contains the intermediate representations of the model's input data;

fitting the intermediate-layer hidden features layer by layer means establishing a probability distribution model of the intermediate-layer hidden features by using a Gaussian mixture model; the expectation-maximization algorithm is used to build a Gaussian mixture model on the output of the i-th intermediate layer, yielding the parameters Θ_i of the K_i Gaussian components and the weights Φ_i of the Gaussian components; the parameters Φ_i and Θ_i of the probabilistic graphical model are the basis for evaluating the anomaly probability of image data.
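
For reference, the per-layer mixture fitted in claim 5 can be written out as below. The symbols μ_{i,k} and Σ_{i,k} (mean and covariance of component k in layer i) and φ_{i,k} (its weight) are notation assumed here for illustration; the claim itself names only K_i, Θ_i and Φ_i, which under this assumption collect the (μ, Σ) pairs and the weights respectively.

```latex
p(x_i) \;=\; \sum_{k=1}^{K_i} \phi_{i,k}\,
  \mathcal{N}\!\left(x_i \mid \mu_{i,k}, \Sigma_{i,k}\right),
\qquad
\sum_{k=1}^{K_i} \phi_{i,k} = 1,
\qquad
\Theta_i = \{(\mu_{i,k}, \Sigma_{i,k})\}_{k=1}^{K_i},
\quad
\Phi_i = \{\phi_{i,k}\}_{k=1}^{K_i}.
```

Under the same notation, the generation probability of a feature given its component is P(x_i | z_i = k) = N(x_i | μ_{i,k}, Σ_{i,k}).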

6. The image data input detection method for a neural network vision system according to claim 1, wherein in step two, the transition probability between adjacent intermediate layers refers to the transition probability, between adjacent layers, of the discrete component z_i corresponding to the intermediate-layer feature x_i on the training image data set; the discrete component z_i means that, according to the clustering result of the GMM of the i-th layer, the cluster to which the i-th intermediate-layer feature x_i of the input image data belongs is z_i; the transition probability P(z_i | z_{i-1}) from z_{i-1} to z_i, i.e., the transition probability from layer i-1 to layer i, is computed on the training image data set; z_i has K_i possible values, so the transition probability from layer i-1 to layer i is a matrix of size K_{i-1} × K_i.
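
A minimal sketch of how such a K_{i-1} × K_i transition matrix could be estimated from path frequencies is given below. It assumes that each training sample has already been assigned a cluster index at layers i-1 and i by the respective GMMs. The function name `transition_matrix` and the uniform fallback for clusters never visited in training are assumptions of this example, not details stated in the claims.

```python
def transition_matrix(prev_labels, next_labels, k_prev, k_next):
    """Estimate P(z_i | z_{i-1}) from per-sample cluster labels of two layers.

    prev_labels[n] and next_labels[n] are the cluster indices of training
    sample n at layers i-1 and i. Returns a k_prev x k_next matrix (list of
    lists) whose rows each sum to 1.
    """
    if len(prev_labels) != len(next_labels):
        raise ValueError("label lists must have the same length")
    counts = [[0] * k_next for _ in range(k_prev)]
    for a, b in zip(prev_labels, next_labels):
        counts[a][b] += 1  # path frequency of the transition a -> b
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            # Assumption: a cluster never seen in training gets a uniform row.
            matrix.append([1.0 / k_next] * k_next)
        else:
            matrix.append([c / total for c in row])
    return matrix
```

Row j of the result holds the empirical distribution of layer-i clusters among the training samples that fell into cluster j at layer i-1.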

7. The image data input detection method for a neural network vision system as claimed in claim 1, wherein in step three, the joint probability estimation model refers to estimating the joint probability of the intermediate-layer output sequence {x₁, x₂, …, x_m} through the probabilistic graphical model; the time complexity of directly calculating the joint probability P(x₁, x₂, …, x_m) increases exponentially with m, and therefore a fast forward algorithm based on dynamic programming is used; let

α_i(z_i) ≡ P(z_i, x₁, x₂, …, x_i),

then α_i(·) is generated by the following recursive process:

α_i(z_i) = P(x_i | z_i) · Σ_{z_{i-1}=1}^{K_{i-1}} α_{i-1}(z_{i-1}) · P(z_i | z_{i-1}),

wherein K_{i-1} denotes the number of Gaussian components of the (i-1)-th layer, P(x_i | z_i) is given by the GMM of layer i, and P(z_i | z_{i-1}) is the transition probability.
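
The forward recursion of claim 7 can be sketched as below, together with a brute-force enumeration that makes the exponential cost of the direct computation visible. This is an illustrative sketch under stated assumptions: the function names, the 0-based layer indexing, and the `prior` argument (taken here to be the first layer's component weights, P(z = k) for layer 0) are introduced for the example. The emission values P(x_i | z_i = k) are assumed to have been computed beforehand from the per-layer GMMs.

```python
import itertools


def forward_joint_probability(prior, emissions, transitions):
    """Joint probability of the feature sequence via the forward recursion.

    Layers are indexed from 0 here:
      prior[k]                = P(z_0 = k)
      emissions[i][k]         = P(x_i | z_i = k)
      transitions[i - 1][j][k] = P(z_i = k | z_{i-1} = j), for i >= 1
    """
    # alpha[k] = P(z_i = k, x_0, ..., x_i) for the current layer i.
    alpha = [p * e for p, e in zip(prior, emissions[0])]
    for i in range(1, len(emissions)):
        trans = transitions[i - 1]
        alpha = [
            emissions[i][k] * sum(alpha[j] * trans[j][k] for j in range(len(alpha)))
            for k in range(len(emissions[i]))
        ]
    return sum(alpha)  # marginalize the last layer's component


def brute_force_joint_probability(prior, emissions, transitions):
    """Same quantity by summing over every component path (exponential in m)."""
    total = 0.0
    for path in itertools.product(*(range(len(e)) for e in emissions)):
        p = prior[path[0]] * emissions[0][path[0]]
        for i in range(1, len(path)):
            p *= transitions[i - 1][path[i - 1]][path[i]] * emissions[i][path[i]]
        total += p
    return total
```

The forward version costs on the order of the sum of K_{i-1} · K_i over layers, whereas the enumeration visits the product of all K_i paths. For deep networks the products of small probabilities may underflow, so working in log space is a common design choice that this sketch omits for brevity.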

8. An image data input detection system for a neural network vision system, comprising:

a neural network hidden feature extractor: given a neural network model and its training image data set, inputs the training image data set into the neural network model and collects the intermediate results to obtain the hidden features of the neural network;

first, the forward propagation subprocess of the given neural network model under test is rewritten so that intermediate results are exported during inference; then the given training image data set is input into the neural network model and the intermediate-layer hidden features are collected;

a representation space joint probability modeling tool: fits the intermediate results with a Gaussian mixture model to obtain model parameters, and collects the path frequencies of the training image data set to calculate probabilities;

constructing the joint probability model requires two steps: first, a generative model based on a probabilistic graphical model is established by using the intermediate-layer hidden features produced by the neural network hidden feature extractor; then the intermediate-layer features are mapped into a discrete space to obtain the transition probabilities of image data between adjacent intermediate layers;

a joint probability estimation model: inputs the image data to be tested into the neural network model and collects the intermediate results via the neural network hidden feature extractor; calculates the generation probability and the inter-layer transition probability of the intermediate results by using the Gaussian mixture model in the representation space joint probability modeling tool, performs fast probability estimation by using the joint probability estimation model, and verifies whether the input image data to be tested is valid.

9. A computer device, characterized in that the computer device comprises a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the image data input detection method for a neural network vision system as claimed in any one of claims 1 to 7.

10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program for executing the image data input detection method for a neural network vision system as claimed in any one of claims 1 to 7.