WO2021027052A1 - Input instance verification method for neural network models based on inter-layer profiling - Google Patents

Input instance verification method for neural network models based on inter-layer profiling

Info

Publication number
WO2021027052A1
Authority
WO
WIPO (PCT)
Prior art keywords
neural network
input
model
level
network model
Prior art date
Application number
PCT/CN2019/111612
Other languages
English (en)
French (fr)
Inventor
徐经纬
王慧妍
许畅
马晓星
吕建
Original Assignee
南京大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 南京大学 (Nanjing University)
Publication of WO2021027052A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Definitions

  • The invention relates to an input instance verification method for neural network models based on inter-layer profiling, for use in tasks such as neural network testing and input validation.
  • Neural network models are widely used in many real-world fields, such as image processing, object recognition, and autonomous driving.
  • However, because neural network models are structurally complex and hard to interpret, a trained model is often deployed in a wide variety of real-world scenarios, where it usually gives good results.
  • Owing to the nature of model training, a model is inherently applicable to inputs from some scenarios and not to others. If a model is deployed without distinguishing between them, it may behave abnormally (for example, an autonomous vehicle used in unsuitable scenarios, such as extreme weather or input images with extreme exposure, may fail to avoid obstacles and cause a serious traffic accident). It is therefore crucial to verify automatically whether a given input is applicable (that is, valid) for the neural network model.
  • Much existing work addresses the problem of invalid inputs, but with limitations in use. First, many approaches are built on distance evaluation: they measure the distance between an unknown input and the training data to judge validity. This is constrained by the scale of the training data and is hard to apply to real-world neural network models, which typically require large-scale training.
  • Second, because a neural network model naturally has some generalization ability, the inputs it can handle are not strictly equivalent to its training data, so using the training data directly for verification may introduce accuracy deviations.
  • Finally, current methods are mostly offline and too slow for real-time verification, which makes them hard to use in deployed real-world settings.
  • Purpose of the invention: to address these problems in the prior art, the present invention provides an input instance verification method for neural network models based on inter-layer profiling, which is easy to use, effective, and efficient.
  • Ease of use means that the method can be applied to the neural network models commonly found in practice that are trained on large-scale data, and that its use is not severely restricted by the scale of the training scenario or the complexity of the model.
  • Effectiveness means that the method verifies input instances with high accuracy and can reliably distinguish valid from invalid inputs.
  • Efficiency means that verifying an input instance takes little time, so the method can meet real-time requirements and be deployed alongside a running neural network model for input verification.
  • Technical solution: an input instance verification method for neural network models based on inter-layer profiling, comprising the following steps:
  • Step 1: Given a neural network model and its training data, feed the training data into the model, extract the intermediate information of the data at each intermediate layer of the model during training, and use it to train a sub-model for each layer. Each sub-model contains the knowledge of the given model from the input layer up to the corresponding intermediate layer and simulates the prediction behavior of the given model;
  • Step 2: Using the sub-models from Step 1, collect, in order of increasing layer, the prediction behavior snapshot of the input instance to be verified on each layer's sub-model, and aggregate the snapshots into the total behavior profile of the input instance across all sub-models;
  • Step 3: Based on the total behavior profile obtained in Step 2 for the given input instance, analyze the validity of each layer's prediction behavior snapshot and of the total behavior profile, produce a validity confidence score, and assess validity.
  • The neural network is a data structure formed by connecting neurons in layers, used for feature extraction and prediction on large-scale data. It comprises an input layer, hidden layers, and an output layer, each containing a large number of neurons. Layers are connected to each other through their neurons, and information flows from the input layer to the output layer. Commonly used examples include DNN, CNN, and RNN models;
  • A neuron is a data structure that applies a built-in function to its input data and outputs the result;
  • The built-in function is one of a few commonly used activation functions, such as ReLU, Sigmoid, or Softmax;
  • An input instance is a single input or a batch input to the neural network model. For example, for a neural network trained for image classification, an input instance is a single image file or a batch composed of several images.
  • In Step 1, the given neural network model and its training data set are provided, and the intermediate information of each layer is extracted. The intermediate information includes the model parameters obtained during training by the neurons of each intermediate layer (such as the weights and biases in a CNN model) and the input and output values of each neuron.
  • The parameter information records the knowledge the model has learned from the training data set during training, while the input and output values supply training data for the subsequent sub-model training.
  • In Step 1, the sub-model for each layer, say layer k, is a neural network model similar in structure to the given model and consists of two parts. The first part inherits, from the model obtained after the original training, all model parameters (weights, biases, etc.) from the input layer up to layer k, together with the corresponding model structure.
  • The second part uses a basic meta-model to connect the layer-k neurons to the prediction output neurons. It is retrained using the layer-k intermediate information recorded in Step 1 (the set of layer-k outputs obtained by feeding the original training data into the given network) and the corresponding labels of the original training set, which yields the trained parameters of this part. Merging the parameters of the two parts gives a parameterized sub-model. The basic meta-model is usually a linear regression model, but is not limited to it. Retraining usually updates only the parameters of the second part, but, depending on the application scenario (for example, fine-tuning the model parameters as a whole with the inter-layer profiling method), it may update the parameters of both parts.
  • In Step 2, the prediction behavior snapshot of the input instance on a layer's sub-model is the information obtained when the input instance is passed to that sub-model for prediction, such as the predicted probability distribution, though it is not limited to that;
  • The total behavior profile is the set of prediction behavior snapshots produced by all sub-models. It is the basic material for the verification of the input instance in Step 3.
  • In Step 3, the validity of each layer's prediction behavior snapshot in the total behavior profile from Step 2 may be analyzed with the following methods:
  • Method 1: Within the current layer's prediction behavior, consider the probability difference between the predicted maximum and the final predicted value, and use their relative ratio as the snapshot validity score of each layer;
  • Method 2: Consider the difference in prediction behavior between the current layer and the previous layer, and use the changes in the predicted probabilities together with the relative proportion of the change in the probability of the final predicted value as the snapshot validity score of each layer;
  • In Step 3, the validity of the total profile from Step 2 may be analyzed with the following methods:
  • Method 1: Use the actual prediction accuracy of each layer on the training set as weights. The snapshot validity results of all layers are the inputs of a linear model whose parameters are set from the training-set prediction accuracy, and the final total profile validity score is computed as the accuracy-weighted combination;
  • Method 2: Based on observation, set the weights with a common growth function curve (linear, logarithmic, or exponential). The snapshot validity results of all layers are the inputs of the chosen growth function, whose parameters are set manually, and the final total profile validity score is computed from it;
  • Method 3: Compute the per-layer snapshots and total behavior profiles for the training set data, take the snapshot validity results as input data and the corresponding verification results as labels, and train a machine learning model that computes the final profile validity score;
  • The corresponding verification results may be given manually or derived from the prediction accuracy of the given neural network model on the input, but are not limited to these;
  • The machine learning model may be a classic model such as linear regression, logistic regression, SVM, or a neural network, but is not limited to these.
  • In Step 3, the validity confidence score is the total profile validity score computed for a given input to be verified. It is a value between 0 and 1 that represents the confidence that the input is valid: the closer to 0, the less valid, and the closer to 1, the more valid.
  • The value range of the validity confidence score and the direction of this relationship are not limited to the above.
  • In Step 3, assessing validity means classifying the input as valid or invalid by comparing the computed validity confidence with a threshold.
  • The threshold may be given in advance or determined empirically. It depends mainly on how much tolerance the actual usage scenario of a model has regarding the validity of input instances. In general, the stricter the safety requirements of a scenario, the closer its threshold is to 1.
  • Beneficial effects: compared with the prior art, the present invention makes up for shortcomings of existing input instance verification techniques for neural network models. By profiling a specific input layer by layer inside the model, it efficiently evaluates the validity of the input instance, and the resulting assessment can be used to screen input instances in real time, improving the model's performance in actual deployment.
  • Figure 1 is a system structure diagram of the present invention;
  • Figure 2 is a detailed view of the sub-model structure provided by the present invention;
  • Figure 3 is a workflow diagram of the sub-model generation module provided by the present invention;
  • Figure 4 is a workflow diagram of the inter-layer behavior profiling module provided by the present invention;
  • Figure 5 is a workflow diagram of the validity verification analysis module provided by the present invention.
  • A neural network is a data structure formed by connecting neurons in layers, used for feature extraction and prediction on large-scale data. It comprises an input layer, hidden layers, and an output layer, each containing a large number of neurons. Layers are connected to each other through their neurons, and information flows from the input layer to the output layer; commonly used examples include DNN, CNN, and RNN models. A neuron is a data structure that applies a built-in function to its input data and outputs the result. The built-in function is one of a few commonly used activation functions, such as ReLU, Sigmoid, or Softmax. An input instance is a single input or a batch input to the neural network model; for example, for a neural network trained for image classification, an input instance is a single image file or a batch composed of several images.
  • The input instance verification method for neural network models based on inter-layer profiling comprises the following steps:
  • Step 1: Given a neural network model and its training data, feed the training data into the model, extract the intermediate information of the data at each intermediate layer of the model during training, and use it to train a sub-model for each layer. Each sub-model contains the knowledge of the given model from the input layer up to the corresponding intermediate layer and simulates its prediction behavior;
  • The intermediate information includes the model parameters obtained during training by the neurons of each intermediate layer (such as the weights and biases in a CNN model) and the input and output values of each neuron.
  • The parameter information records the knowledge the model has learned from the training data set during training, while the input and output values supply training data for the subsequent sub-model training.
  • The sub-model for each layer, say layer k, is a neural network model similar in structure to the given model and consists of two parts. The first part inherits, from the model obtained after the original training, all model parameters (weights, biases, etc.) from the input layer up to layer k, together with the corresponding model structure. The second part uses a basic meta-model to connect the layer-k neurons to the prediction output neurons and is retrained using the layer-k intermediate information recorded in Step 1 (the set of layer-k outputs obtained by feeding the original training data into the given network) and the corresponding labels of the original training set, which yields the trained parameters of this part. Merging the parameters of the two parts gives a parameterized sub-model. The basic meta-model is usually a linear regression model, but is not limited to it. Retraining usually updates only the parameters of the second part, but, depending on the application scenario (for example, fine-tuning the model parameters as a whole with the inter-layer profiling method), it may update the parameters of both parts.
  • Step 2: Using the sub-models from Step 1, collect, in order of increasing layer, the prediction behavior snapshot of the input instance to be verified on each layer's sub-model, and aggregate the snapshots into the total behavior profile of the input instance across all sub-models;
  • The prediction behavior snapshot of the input instance on a layer's sub-model is the information, such as the predicted probability distribution, obtained when the input instance is passed to that sub-model for prediction;
  • The total behavior profile is the set of prediction behavior snapshots produced by all sub-models. It is the basic material for the verification of the input instance in Step 3.
  • Step 3: Based on the total behavior profile obtained in Step 2 for the given input instance, analyze the validity of each layer's prediction behavior snapshot and of the total behavior profile, produce a validity confidence score, and assess validity.
  • The validity of each layer's prediction behavior snapshot in the total behavior profile from Step 2 may be analyzed with the following methods:
  • Method 1: Within the current layer's prediction behavior, consider the probability difference between the predicted maximum and the final predicted value, and use their relative ratio as the snapshot validity score of each layer;
  • Method 2: Consider the difference in prediction behavior between the current layer and the previous layer, and use the changes in the predicted probabilities together with the relative proportion of the change in the probability of the final predicted value as the snapshot validity score of each layer;
  • The validity of the total profile from Step 2 may be analyzed with the following methods:
  • Method 1: Use the actual prediction accuracy of each layer on the training set as weights, and combine the snapshot validity results of all layers to compute the final profile validity score;
  • Method 2: Based on observation, set the weights with a common growth function curve (linear, logarithmic, or exponential), and combine the snapshot validity results of all layers to compute the final profile validity score;
  • Method 3: Compute the per-layer snapshots and total behavior profiles for the training set data, take the snapshot validity results as input data and the corresponding verification results as labels, and train a machine learning model that computes the final profile validity score;
  • The verification results may be given manually or derived from the prediction accuracy of the given neural network model on the input, but are not limited to these. The machine learning model may be a classic model such as linear regression, logistic regression, SVM, or a neural network.
  • The validity confidence score is the total profile validity score computed for a given input to be verified. It is a value between 0 and 1 that represents the confidence that the input is valid: the closer to 0, the less valid, and the closer to 1, the more valid.
  • Assessing validity means classifying the input as valid or invalid by comparing the computed validity confidence with a threshold.
  • The threshold may be given in advance or determined empirically. It depends mainly on how much tolerance the actual usage scenario of a model has regarding the validity of input instances. In general, the stricter the safety requirements of a scenario, the closer its threshold is to 1.
  • As shown in Figure 1, the method first uses, in advance and offline, the original neural network model and its training data set to generate the sub-model for each layer and assemble the sub-models into a pool. Each sub-model in the pool corresponds to the knowledge contained up to a specific layer of the original model and can be used for prediction. Second, any given input instance to be verified is fed to each model in the pool for prediction; its inter-layer behavior under layer-by-layer prediction is profiled, and the total inter-layer behavior (profile) is output, which includes the inter-layer prediction behavior (snapshot) for each layer. Finally, the total profile and the snapshots it contains are passed to validity analysis, which outputs a validity analysis report.
  • The whole framework contains three modules corresponding to the three steps: the sub-model generation module, the inter-layer behavior profiling module, and the input instance validity analysis module.
  • Step 1: The sub-model generation module generates the sub-model for each layer.
  • As shown in Figure 2, the structure of the sub-model for a selected layer k has two parts.
  • The first part is a copy of the original neural network model structure from the input layer to the currently selected layer, including its structural information, parameter information, and so on;
  • The second part is a model, built with the meta-model structure, that is retrained on the output values of the input data at layer k and the final predicted values.
  • In the figure the meta-model is a single linear fully connected layer, that is, a linear regression model, but it is not limited to this.
  • Figure 3 shows the workflow of the sub-model generation module. It takes the original neural network model and the training data set as input and first saves all intermediate results needed later, such as the intermediate input and output values of the neurons in each layer. It then iterates over the selected layers: for layer k it generates the first and second parts of the sub-model separately and joins them to complete the sub-model. Finally, the sub-models generated for all selected layers are collected and output as a sub-model pool.
  • Step 2: The inter-layer behavior profiling module profiles the inter-layer behavior of the input instance to be verified.
  • Figure 4 shows the workflow of the inter-layer behavior profiling module.
  • Its inputs are the input instance to be verified and the sub-model pool obtained in Step 1. The input instance is fed to the sub-model for each layer in the pool to obtain the inter-layer behavior snapshots.
  • Each snapshot reflects the behavior information of the specific layer of the original model that its sub-model corresponds to.
  • Finally, the snapshots are aggregated into the total behavior profile of the input instance, which reflects how the instance behaves as it is passed from layer to layer through the original model.
  • Step 3: The input instance validity analysis module analyzes the validity of the input instance under test and reports it.
  • Figure 5 shows the input instance validity analysis module.
  • For the total inter-layer profile obtained in Step 2 for the input instance to be verified, the module selects an analysis method (weight-based or learning-based) and computes the degree of validity.
  • In detail, the method first selects a snapshot analysis method (the relative ratio of the probability difference between the predicted maximum and the final predicted value, or the probability changes together with the relative proportion of the change in the probability of the final predicted value, as described above) and scores each snapshot. It then selects a profile analysis method (one of the two weight-based methods or the learning-based method, as described above) and aggregates the snapshot scores into the validity assessment of the whole inter-layer profile.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

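The invention discloses an input instance verification method for neural network models based on inter-layer profiling. Given a neural network model and its training data set, intermediate information is extracted and a sub-model is generated for each layer. Any input instance to be verified is fed to the sub-models to obtain its total behavior profile after inter-layer profiling. The inter-layer profile of the input instance is then analyzed to verify whether the instance is valid, and a confidence score for its validity is given. Because the invention profiles the layers inside the trained model and uses the behavior of the input instance at each layer to analyze its validity, it avoids the very large verification time cost of existing techniques that rely on many different models assisting one another, and it verifies inputs more accurately. It thereby helps distinguish valid from invalid input instances of a given neural network and improves the accuracy and safety of neural networks in practical use.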

Description

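Input instance verification method for neural network models based on inter-layer profiling
Technical Field
The present invention relates to an input instance verification method for neural network models based on inter-layer profiling, for use in tasks such as neural network testing and input validation.
Background
Neural network models are widely used in many real-world fields, such as image processing, object recognition, and autonomous driving. However, because neural network models are structurally complex and hard to interpret, a trained model is often deployed in a wide variety of real-world scenarios, where it usually gives good results. Owing to the nature of model training, a model is inherently applicable to inputs from some scenarios and not to others. If a model is deployed without distinguishing between them, it may behave abnormally (for example, an autonomous vehicle used in unsuitable scenarios, such as extreme weather or input images with extreme exposure, may fail to avoid obstacles and cause a serious traffic accident). It is therefore crucial to verify automatically whether a given input is applicable (that is, valid) for the neural network model.
Much existing work addresses the problem of invalid inputs, but with limitations in use. First, many approaches are built on distance evaluation: they measure the distance between an unknown input and the training data to judge validity. This is constrained by the scale of the training data and is hard to apply to real-world neural network models, which typically require large-scale training. Second, because a neural network model naturally has some generalization ability, the inputs it can handle are not strictly equivalent to its training data, so using the training data directly for verification may introduce accuracy deviations. Finally, current methods are mostly offline and too slow for real-time verification, which makes them hard to use in deployed real-world settings.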
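Summary of the Invention
Purpose of the invention: to address the problems and shortcomings of the prior art, the present invention provides an input instance verification method for neural network models based on inter-layer profiling, which is easy to use, effective, and efficient. Ease of use means that the method can be applied to the neural network models commonly found in practice that are trained on large-scale data, and that its use is not severely restricted by the scale of the training scenario or the complexity of the model. Effectiveness means that the method verifies input instances with high accuracy and can reliably distinguish valid from invalid inputs. Efficiency means that verifying an input instance takes little time, so the method can meet real-time requirements and be deployed alongside a running neural network model for input verification.
Technical solution: an input instance verification method for neural network models based on inter-layer profiling, comprising the following steps:
Step 1: Given a neural network model and its training data, feed the training data into the model, extract the intermediate information of the data at each intermediate layer of the model during training, and use it to train a sub-model for each layer. Each sub-model contains the knowledge of the given model from the input layer up to the corresponding intermediate layer and simulates the prediction behavior of the given model;
Step 2: Using the sub-models from Step 1, collect, in order of increasing layer, the prediction behavior snapshot of the input instance to be verified on each layer's sub-model, and aggregate the snapshots into the total behavior profile of the input instance across all sub-models;
Step 3: Based on the total behavior profile obtained in Step 2 for the given input instance, analyze the validity of each layer's prediction behavior snapshot and of the total behavior profile, produce a validity confidence score, and assess validity.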
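To implement and refine the above technical solution, the following specific measures are also adopted:
Further, the neural network is a data structure formed by connecting neurons in layers, used for feature extraction and prediction on large-scale data. It comprises an input layer, hidden layers, and an output layer, each containing a large number of neurons. Layers are connected to each other through their neurons, and information flows from the input layer to the output layer; commonly used examples include DNN, CNN, and RNN models. A neuron is a data structure that applies a built-in function to its input data and outputs the result. The built-in function is one of a few commonly used activation functions, such as ReLU, Sigmoid, or Softmax. An input instance is a single input or a batch input to the neural network model; for example, for a neural network trained for image classification, an input instance is a single image file or a batch composed of several images.
Further, in Step 1, the given neural network model and its training data set are provided, and the intermediate information of each layer is extracted. The intermediate information includes the model parameters obtained during training by the neurons of each intermediate layer (such as the weights and biases in a CNN model) and the input and output values of each neuron. The parameter information records the knowledge the model has learned from the training data set during training, while the input and output values supply training data for the subsequent sub-model training.
Further, in Step 1, the sub-model for each layer, say layer k, is a neural network model similar in structure to the given model and consists of two parts. The first part inherits, from the model obtained after the original training, all model parameters (weights, biases, etc.) from the input layer up to layer k, together with the corresponding model structure. The second part uses a basic meta-model to connect the layer-k neurons to the prediction output neurons and is retrained using the layer-k intermediate information recorded in Step 1 (the set of layer-k outputs obtained by feeding the original training data into the given network) and the corresponding labels of the original training set, which yields the trained parameters of this part. Merging the parameters of the two parts gives a parameterized sub-model. The basic meta-model is usually a linear regression model, but is not limited to it. Retraining usually updates only the parameters of the second part, but, depending on the application scenario (for example, fine-tuning the model parameters as a whole with the inter-layer profiling method), it may update the parameters of both parts.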
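The two-part sub-model described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the original network is available as a list of per-layer functions, and it swaps in a single linear layer followed by softmax (trained with plain gradient descent on cross-entropy) as the meta-model so that each sub-model outputs a probability distribution. The names `run_prefix`, `train_linear_head`, and `make_submodel` are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Layer = Callable[[Vector], Vector]


def softmax(z: Sequence[float]) -> Vector:
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def run_prefix(layers: Sequence[Layer], k: int, x: Vector) -> Vector:
    """Part 1: the copied (frozen) layers 1..k of the original model."""
    for layer in layers[:k]:
        x = layer(x)
    return x


def train_linear_head(acts: Sequence[Vector], labels: Sequence[int],
                      n_classes: int, epochs: int = 200,
                      lr: float = 0.5) -> Tuple[List[Vector], Vector]:
    """Part 2: fit one linear layer + softmax on recorded layer-k outputs."""
    dim = len(acts[0])
    W = [[0.0] * dim for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        for a, y in zip(acts, labels):
            logits = [sum(w * v for w, v in zip(W[c], a)) + b[c]
                      for c in range(n_classes)]
            p = softmax(logits)
            for c in range(n_classes):
                g = p[c] - (1.0 if c == y else 0.0)  # cross-entropy gradient
                for j in range(dim):
                    W[c][j] -= lr * g * a[j]
                b[c] -= lr * g
    return W, b


def make_submodel(layers: Sequence[Layer], k: int,
                  W: List[Vector], b: Vector) -> Callable[[Vector], Vector]:
    """Join part 1 and part 2 into a sub-model that predicts probabilities."""
    def predict(x: Vector) -> Vector:
        a = run_prefix(layers, k, x)
        return softmax([sum(w * v for w, v in zip(row, a)) + bc
                        for row, bc in zip(W, b)])
    return predict
```

Only the head parameters `W` and `b` are trained here, which matches the usual case in the text where retraining touches the second part only.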
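Further, in Step 2, the prediction behavior snapshot of the input instance on a layer's sub-model is the information obtained when the input instance is passed to that sub-model for prediction, such as the predicted probability distribution, though it is not limited to that;
Further, the total behavior profile is the set of prediction behavior snapshots produced by all sub-models. It is the basic material for the verification of the input instance in Step 3.
Further, in Step 3, the validity of each layer's prediction behavior snapshot in the total behavior profile from Step 2 may be analyzed with the following methods:
Method 1: Within the current layer's prediction behavior, consider the probability difference between the predicted maximum and the final predicted value, and use their relative ratio as the snapshot validity score of each layer;
Method 2: Consider the difference in prediction behavior between the current layer and the previous layer, and use the changes in the predicted probabilities together with the relative proportion of the change in the probability of the final predicted value as the snapshot validity score of each layer;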
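The text does not give closed-form formulas for the two snapshot scores, so the following is only one possible reading, stated as an assumption. For Method 1 it takes the probability that a layer's snapshot assigns to the class finally predicted by the full model, divided by the largest probability in that snapshot. For Method 2 it takes the share of probability mass gained between two consecutive layers that went to the final predicted class. Function names are hypothetical.

```python
from typing import Sequence


def snapshot_score_ratio(snapshot: Sequence[float], final_class: int) -> float:
    """Method 1 (one reading): probability of the final predicted class
    relative to the largest probability in this layer's snapshot."""
    top = max(snapshot)
    return snapshot[final_class] / top if top > 0 else 0.0


def snapshot_score_change(prev: Sequence[float], cur: Sequence[float],
                          final_class: int) -> float:
    """Method 2 (one reading): fraction of the probability mass gained
    between two consecutive layers that went to the final predicted class."""
    deltas = [c - p for p, c in zip(prev, cur)]
    gained = sum(d for d in deltas if d > 0)
    if gained <= 0:
        return 0.0  # assumption: no movement between layers scores 0
    return max(0.0, deltas[final_class]) / gained
```

Both scores lie in [0, 1], and both equal 1 when the layer fully agrees with, or moves entirely toward, the final prediction, which keeps them compatible with the 0-to-1 confidence range used later.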
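Further, in Step 3, the validity of the total profile from Step 2 may be analyzed with the following methods:
Method 1: Use the actual prediction accuracy of each layer on the training set as weights. The snapshot validity results of all layers are the inputs of a linear model whose parameters are set from the training-set prediction accuracy, and the final total profile validity score is computed as the accuracy-weighted combination;
Method 2: Based on observation, set the weights with a common growth function curve (linear, logarithmic, or exponential). The snapshot validity results of all layers are the inputs of the chosen growth function, whose parameters are set manually, and the final total profile validity score is computed from it;
Method 3: Compute the per-layer snapshots and total behavior profiles for the training set data, take the snapshot validity results as input data and the corresponding verification results as labels, and train a machine learning model that computes the final profile validity score. The corresponding verification results may be given manually or derived from the prediction accuracy of the given neural network model on the input, but are not limited to these. The machine learning model may be a classic model such as linear regression, logistic regression, SVM, or a neural network, but is not limited to these.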
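The two weight-based methods above can be sketched as a normalized weighted average of the per-layer snapshot scores. This is an illustration under assumptions: normalizing by the sum of the weights (so the result stays in [0, 1]) and the concrete curve definitions in `growth_weights` are choices made here, not details given in the text.

```python
import math
from typing import List, Sequence


def weighted_profile_score(snapshot_scores: Sequence[float],
                           weights: Sequence[float]) -> float:
    """Weighted average of per-layer snapshot scores.

    For Method 1, pass each sub-model's training-set accuracy as its weight.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return sum(w * s for w, s in zip(weights, snapshot_scores)) / total


def growth_weights(n_layers: int, kind: str = "linear") -> List[float]:
    """Method 2: weights that grow with layer index k = 1..n_layers."""
    curves = {
        "linear": lambda k: float(k),
        "log": lambda k: math.log(1.0 + k),
        "exp": lambda k: math.exp(k),
    }
    return [curves[kind](k) for k in range(1, n_layers + 1)]
```

For example, `weighted_profile_score(scores, growth_weights(len(scores), "exp"))` lets the deepest layers dominate the final score, while accuracy weights let the most reliable sub-models dominate.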
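Further, in Step 3, the validity confidence score is the total profile validity score computed for a given input to be verified. It is a value between 0 and 1 that represents the confidence that the input is valid: the closer to 0, the less valid, and the closer to 1, the more valid. The value range of the validity confidence score and the direction of this relationship are not limited to the above.
Further, in Step 3, assessing validity means classifying the input as valid or invalid by comparing the computed validity confidence with a threshold. The threshold may be given in advance or determined empirically. It depends mainly on how much tolerance the actual usage scenario of a model has regarding the validity of input instances. In general, the stricter the safety requirements of a scenario, the closer its threshold is to 1.
Beneficial effects: compared with the prior art, the present invention makes up for shortcomings of existing input instance verification techniques for neural network models. By profiling a specific input layer by layer inside the model, it efficiently evaluates the validity of the input instance, and the resulting assessment can be used to screen input instances in real time, improving the model's performance in actual deployment.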
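Brief Description of the Drawings
Figure 1 is a system structure diagram of the present invention;
Figure 2 is a detailed view of the sub-model structure provided by the present invention;
Figure 3 is a workflow diagram of the sub-model generation module provided by the present invention;
Figure 4 is a workflow diagram of the inter-layer behavior profiling module provided by the present invention;
Figure 5 is a workflow diagram of the validity verification analysis module provided by the present invention.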
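Detailed Description
The invention is further illustrated below with reference to specific embodiments. It should be understood that these embodiments only illustrate the invention and do not limit its scope. After reading the invention, modifications of equivalent forms made by those skilled in the art all fall within the scope defined by the claims appended to this application.
A neural network is a data structure formed by connecting neurons in layers, used for feature extraction and prediction on large-scale data. It comprises an input layer, hidden layers, and an output layer, each containing a large number of neurons. Layers are connected to each other through their neurons, and information flows from the input layer to the output layer; commonly used examples include DNN, CNN, and RNN models. A neuron is a data structure that applies a built-in function to its input data and outputs the result. The built-in function is one of a few commonly used activation functions, such as ReLU, Sigmoid, or Softmax. An input instance is a single input or a batch input to the neural network model; for example, for a neural network trained for image classification, an input instance is a single image file or a batch composed of several images.
The input instance verification method for neural network models based on inter-layer profiling comprises the following steps:
Step 1: Given a neural network model and its training data, feed the training data into the model, extract the intermediate information of the data at each intermediate layer of the model during training, and use it to train a sub-model for each layer. Each sub-model contains the knowledge of the given model from the input layer up to the corresponding intermediate layer and simulates its prediction behavior;
The intermediate information includes the model parameters obtained during training by the neurons of each intermediate layer (such as the weights and biases in a CNN model) and the input and output values of each neuron. The parameter information records the knowledge the model has learned from the training data set during training, while the input and output values supply training data for the subsequent sub-model training.
The sub-model for each layer, say layer k, is a neural network model similar in structure to the given model and consists of two parts. The first part inherits, from the model obtained after the original training, all model parameters (weights, biases, etc.) from the input layer up to layer k, together with the corresponding model structure. The second part uses a basic meta-model to connect the layer-k neurons to the prediction output neurons and is retrained using the layer-k intermediate information recorded in Step 1 (the set of layer-k outputs obtained by feeding the original training data into the given network) and the corresponding labels of the original training set, which yields the trained parameters of this part. Merging the parameters of the two parts gives a parameterized sub-model. The basic meta-model is usually a linear regression model, but is not limited to it. Retraining usually updates only the parameters of the second part, but, depending on the application scenario (for example, fine-tuning the model parameters as a whole with the inter-layer profiling method), it may update the parameters of both parts.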
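Step 2: Using the sub-models from Step 1, collect, in order of increasing layer, the prediction behavior snapshot of the input instance to be verified on each layer's sub-model, and aggregate the snapshots into the total behavior profile of the input instance across all sub-models;
The prediction behavior snapshot of the input instance on a layer's sub-model is the information, such as the predicted probability distribution, obtained when the input instance is passed to that sub-model for prediction;
The total behavior profile is the set of prediction behavior snapshots produced by all sub-models. It is the basic material for the verification of the input instance in Step 3.
Step 3: Based on the total behavior profile obtained in Step 2 for the given input instance, analyze the validity of each layer's prediction behavior snapshot and of the total behavior profile, produce a validity confidence score, and assess validity.
The validity of each layer's prediction behavior snapshot in the total behavior profile from Step 2 may be analyzed with the following methods:
Method 1: Within the current layer's prediction behavior, consider the probability difference between the predicted maximum and the final predicted value, and use their relative ratio as the snapshot validity score of each layer;
Method 2: Consider the difference in prediction behavior between the current layer and the previous layer, and use the changes in the predicted probabilities together with the relative proportion of the change in the probability of the final predicted value as the snapshot validity score of each layer;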
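The validity of the total profile from Step 2 may be analyzed with the following methods:
Method 1: Use the actual prediction accuracy of each layer on the training set as weights, and combine the snapshot validity results of all layers to compute the final profile validity score;
Method 2: Based on observation, set the weights with a common growth function curve (linear, logarithmic, or exponential), and combine the snapshot validity results of all layers to compute the final profile validity score;
Method 3: Compute the per-layer snapshots and total behavior profiles for the training set data, take the snapshot validity results as input data and the corresponding verification results as labels, and train a machine learning model that computes the final profile validity score. The corresponding verification results may be given manually or derived from the prediction accuracy of the given neural network model on the input, but are not limited to these. The machine learning model may be a classic model such as linear regression, logistic regression, SVM, or a neural network.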
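As an illustration of the learning-based Method 3, the sketch below fits a logistic regression (one of the model families the text lists) that maps a vector of per-layer snapshot scores to a validity score in (0, 1). The training loop is plain stochastic gradient descent written out by hand. The function names, hyperparameters, and the 0/1 label encoding (1 for valid, 0 for invalid) are assumptions made for this example.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_profile_scorer(snapshot_scores: Sequence[Sequence[float]],
                       labels: Sequence[int], epochs: int = 500,
                       lr: float = 0.5) -> Tuple[List[float], float]:
    """Fit logistic regression: per-layer snapshot scores -> valid (1) / invalid (0)."""
    w = [0.0] * len(snapshot_scores[0])
    b = 0.0
    for _ in range(epochs):
        for s, y in zip(snapshot_scores, labels):
            err = sigmoid(sum(wi * si for wi, si in zip(w, s)) + b) - y
            w = [wi - lr * err * si for wi, si in zip(w, s)]
            b -= lr * err
    return w, b


def profile_validity(w: Sequence[float], b: float,
                     scores: Sequence[float]) -> float:
    """Validity confidence in (0, 1) for one profile's snapshot scores."""
    return sigmoid(sum(wi * si for wi, si in zip(w, scores)) + b)
```

The labels would come from the verification results described above, for instance whether the original model predicted a training input correctly.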
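The validity confidence score is the total profile validity score computed for a given input to be verified. It is a value between 0 and 1 that represents the confidence that the input is valid: the closer to 0, the less valid, and the closer to 1, the more valid.
Assessing validity means classifying the input as valid or invalid by comparing the computed validity confidence with a threshold. The threshold may be given in advance or determined empirically. It depends mainly on how much tolerance the actual usage scenario of a model has regarding the validity of input instances. In general, the stricter the safety requirements of a scenario, the closer its threshold is to 1.
As shown in Figure 1, the input instance verification method for neural network models based on inter-layer profiling provided by this embodiment first uses, in advance and offline, the original neural network model and its training data set to generate the sub-model for each layer and assemble the sub-models into a pool. Each sub-model in the pool corresponds to the knowledge contained up to a specific layer of the original model and can be used for prediction. Second, any given input instance to be verified is fed to each model in the pool for prediction; its inter-layer behavior under layer-by-layer prediction is profiled, and the total inter-layer behavior (profile) is output, which includes the inter-layer prediction behavior (snapshot) for each layer. Finally, the total profile and the snapshots it contains are passed to the validity analysis module, which outputs a validity analysis report. The whole framework contains three modules corresponding to the three steps: the sub-model generation module, the inter-layer behavior profiling module, and the input instance validity analysis module.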
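Step 1: The sub-model generation module generates the sub-model for each layer.
As shown in Figure 2, the structure of the sub-model for a selected layer k has two parts. The first part is a copy of the original neural network model structure from the input layer to the currently selected layer, including its structural information, parameter information, and so on. The second part is a model, built with the meta-model structure, that is retrained on the output values of the input data at layer k and the final predicted values. In the figure the meta-model is a single linear fully connected layer, that is, a linear regression model, but it is not limited to this.
Figure 3 shows the workflow of the sub-model generation module. It takes the original neural network model and the training data set as input and first saves all intermediate results needed later, such as the intermediate input and output values of the neurons in each layer. It then iterates over the selected layers: for layer k it generates the first and second parts of the sub-model separately and joins them to complete the sub-model. Finally, the sub-models generated for all selected layers are collected and output as a sub-model pool.
Step 2: The inter-layer behavior profiling module profiles the inter-layer behavior of the input instance to be verified.
Figure 4 shows the workflow of the inter-layer behavior profiling module. Its inputs are the input instance to be verified and the sub-model pool obtained in Step 1. The input instance is fed to the sub-model for each layer in the pool to obtain the inter-layer behavior snapshots; each snapshot reflects the behavior information of the specific layer of the original model that its sub-model corresponds to. Finally, the snapshots are aggregated into the total behavior profile of the input instance, which reflects how the instance behaves as it is passed from layer to layer through the original model.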
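The profiling step can be pictured with a few type aliases and one function. This is a minimal sketch that assumes a snapshot is just the list of class probabilities returned by one sub-model, and that the pool is ordered from the shallowest layer to the deepest. The names `Snapshot`, `Profile`, `SubModel`, and `collect_profile` are hypothetical.

```python
from typing import Callable, List, Sequence

Snapshot = List[float]   # predicted class probabilities from one sub-model
Profile = List[Snapshot]  # snapshots ordered by increasing layer index
SubModel = Callable[[Sequence[float]], Sequence[float]]


def collect_profile(pool: Sequence[SubModel], x: Sequence[float]) -> Profile:
    """Feed one input instance to every sub-model and gather the snapshots."""
    return [list(sub(x)) for sub in pool]
```

Because each sub-model is independent of the others, the calls in `collect_profile` could also run in parallel or reuse the shared prefix activations, which matters for the real-time goal stated earlier.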
步骤三:输入实例有效性分析模块,分析待测输入实例有效性并报告。
如图5所示,图5表示输入实例有效性分析模块,对于待验证输入实例在步骤二中获取的总层间剖析行为profile,进行剖析方法选择(权重-based或学习-based)和有效性程度计算。细节上,本方法首先选择snapshot分析方法(预测最大值与最终预测值的概率差异的相对大小比例,或概率变化情况以及最终预测值的概率变化相对比例,如上文所述),对于单个snapshot进行评分的,并选择特定profile分析方法(两种权重-based或一种学习-based的分析方法,如上文所述),汇总对于各snapshot的评分作为对于整个层间剖析总行为profile的有效性程度评估。通过对于与实际应用场景要求的安全性阈值判别,可以进一步获取对于待验证输入实例是否有效的决策,并报告。在实际场景中,安全性要求更高的场景通常伴随更高的阈值,使得相同情况下无效输入实例比重相对增加。本方法可以后续与过滤无效输入实例的手段总结,达到在实际场景中合理挑选输入实例输入神经网络模型进行判别的方式,增加实际使用神经网络模型的准确程度。
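The learning-based profile analysis can be illustrated with logistic regression, one of the classical models the description lists. The sketch below is an assumption-laden toy: the training vectors of snapshot scores and their valid/invalid labels are made up for illustration, the function names are hypothetical, and plain stochastic gradient descent is used instead of a library solver.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_profile_scorer(score_vectors, labels, epochs=500, lr=0.5):
    """Train logistic regression mapping per-layer snapshot scores to a confidence."""
    dim = len(score_vectors[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for s, y in zip(score_vectors, labels):
            # Gradient of the log loss with respect to the logit.
            g = sigmoid(sum(wi * si for wi, si in zip(w, s)) + b) - y
            b -= lr * g
            w = [wi - lr * g * si for wi, si in zip(w, s)]
    return lambda s: sigmoid(sum(wi * si for wi, si in zip(w, s)) + b)

# Made-up training data: one vector of snapshot scores per training instance,
# labeled 1 if the original model predicted that instance correctly, else 0.
train_scores = [[0.9, 0.95, 1.0], [0.8, 0.9, 1.0], [0.2, 0.3, 0.4], [0.1, 0.2, 0.5]]
train_labels = [1, 1, 0, 0]
scorer = fit_profile_scorer(train_scores, train_labels)
confidence = scorer([0.9, 0.95, 1.0])
```

Because the sigmoid output lies strictly between 0 and 1, it can be used directly as the validity confidence score and compared against the scenario threshold.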

Claims (9)

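  1. An input instance validation method for neural network models based on inter-layer profiling, characterized by comprising the following steps:
    Step 1: Using a given neural network model and its corresponding training data, feed the training data into the given neural network model, extract the intermediate information of the data at each intermediate layer of the model during training, and train a sub-model for each layer from the intermediate information, each sub-model containing the knowledge of the given neural network model from the input layer to the corresponding intermediate layer and simulating its prediction behavior;
    Step 2: Using the sub-models obtained in Step 1 for the intermediate layers, collect, in order of increasing layer depth, the prediction behavior snapshot of the input instance to be validated on each layer's sub-model, and aggregate the snapshots into the overall behavior profile of the input instance across all sub-models;
    Step 3: Based on the profile obtained in Step 2 by layer-wise profiling of the given input instance, analyze the validity of each layer's snapshot and of the profile as a whole, produce a validity confidence score, and evaluate validity.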
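  2. The input instance validation method for neural network models based on inter-layer profiling according to claim 1, characterized in that, in Step 1, the given neural network model and its training dataset are provided and the intermediate information of each layer during training is extracted, wherein the intermediate information comprises, for each intermediate layer, the model parameters obtained by its neurons during training and the input and output values of each neuron; the parameters record the knowledge that the current model has learned from the training dataset through training, and the input and output values provide training data for the subsequent training of the sub-models.
  3. The input instance validation method for neural network models based on inter-layer profiling according to claim 1, characterized in that, in Step 1, the sub-model for layer k is a neural network model whose structure resembles that of the given neural network model and consists of two parts: the first part inherits, from the given neural network as obtained after its original training, the model structure from the input layer to layer k together with all learned model parameters; the second part uses a base meta-model to connect the layer-k neurons to the prediction output neurons and is retrained on the layer-k intermediate information recorded in Step 1 and the prediction labels of the original training set, which yields the trained parameters of this part; merging the parameters of the two parts gives the parameterized sub-model; the base meta-model usually refers to a linear regression model.
  4. The input instance validation method for neural network models based on inter-layer profiling according to claim 1, characterized in that, in Step 2, the prediction behavior snapshot of the input instance on each layer's sub-model is the prediction information obtained after the input instance is passed to that layer's sub-model for prediction.
  5. The input instance validation method for neural network models based on inter-layer profiling according to claim 1, characterized in that the profile is the set of prediction behavior snapshots produced by all sub-models and is used in Step 3 to validate and evaluate the input instance to be validated.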
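  6. The input instance validation method for neural network models based on inter-layer profiling according to claim 1, characterized in that, in Step 3, the validity of each layer's snapshot is analyzed from the profile of Step 2 with either of the following methods:
    Method 1: Within the current layer's prediction, consider the difference between the maximum predicted probability and the probability assigned to the final predicted value, and use their relative ratio as the snapshot validity score for that layer;
    Method 2: Consider the difference in prediction behavior between the current layer and the preceding layer, and use the changes in the individual predicted probabilities together with the relative proportion of the change in the probability of the final predicted value as the snapshot validity score for that layer.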
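  7. The input instance validation method for neural network models based on inter-layer profiling according to claim 1, characterized in that, in Step 3, the validity of the profile as a whole is analyzed from the profile of Step 2 with any of the following methods:
    Method 1: Use the actual prediction accuracy of each layer's sub-model on the training set as weights, and combine them with the per-layer snapshot validity scores to compute the final profile validity score;
    Method 2: Set the weights empirically using a common growth function curve, and combine them with the per-layer snapshot validity scores to compute the final profile validity score;
    Method 3: Compute the per-layer snapshots and the profile for the training data, take the snapshot validity scores as input data and the corresponding validation results as labels, and train a machine learning model that computes the final profile validity score.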
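  8. The input instance validation method for neural network models based on inter-layer profiling according to claim 1, characterized in that, in Step 3, the validity confidence score is the profile validity score computed for the given input to be validated, is a value between 0 and 1, and represents the confidence that the input to be validated is valid.
  9. The input instance validation method for neural network models based on inter-layer profiling according to claim 1, characterized in that, in Step 3, evaluating validity means using the computed validity confidence to classify the input as valid or invalid against a set threshold.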
PCT/CN2019/111612 2019-08-14 2019-10-17 Input instance validation method for neural network models based on inter-layer profiling WO2021027052A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910747317.4 2019-08-14
CN201910747317.4A CN110633788A (zh) 2019-08-14 2019-08-14 Input instance validation method for neural network models based on inter-layer profiling

Publications (1)

Publication Number Publication Date
WO2021027052A1 true WO2021027052A1 (zh) 2021-02-18

Family

ID=68970306

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/111612 WO2021027052A1 (zh) 2019-08-14 2019-10-17 Input instance validation method for neural network models based on inter-layer profiling

Country Status (3)

Country Link
CN (1) CN110633788A (zh)
LU (1) LU102710B1 (zh)
WO (1) WO2021027052A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
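CN112884513A (zh) * 2021-02-19 2021-06-01 上海数鸣人工智能科技有限公司 Marketing campaign prediction model structure and prediction method based on a deep factorization machine
CN114254274A (zh) * 2021-11-16 2022-03-29 浙江大学 White-box deep learning model copyright protection method based on neuron outputs
CN114254274B (zh) * 2021-11-16 2024-05-31 浙江大学 White-box deep learning model copyright protection method based on neuron outputs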

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
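CN111475321B (zh) * 2020-05-08 2024-04-26 中国人民解放军国防科技大学 Neural network security property verification method based on iterative abstraction analysis
CN112632309B (zh) * 2020-12-15 2022-10-04 北京百度网讯科技有限公司 Image display method and apparatus, electronic device, and storage medium
CN114422185B (zh) * 2021-12-17 2024-03-15 广西壮族自治区公众信息产业有限公司 Method for generating and verifying signatures based on an asynchronous neural network
CN114861912B (zh) * 2022-07-06 2022-09-16 武汉山川软件有限公司 Big-data-based data verification method and apparatus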

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
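CN105184678A (zh) * 2015-09-18 2015-12-23 齐齐哈尔大学 Method for constructing a short-term photovoltaic power station generation forecasting model based on a combination of multiple neural network algorithms
CN109344806A (zh) * 2018-10-31 2019-02-15 第四范式(北京)技术有限公司 Method and system for performing object detection using a multi-task object detection model
US20190156933A1 (en) * 2017-11-21 2019-05-23 Verisim Life Inc. Systems and methods for full body circulation and drug concentration prediction
WO2019098418A1 (ko) * 2017-11-16 2019-05-23 삼성전자 주식회사 Neural network learning method and device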

Also Published As

Publication number Publication date
LU102710B1 (en) 2021-04-08
CN110633788A (zh) 2019-12-31

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19941468

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19941468

Country of ref document: EP

Kind code of ref document: A1