CN113963185A - Visualization and quantitative analysis method and system for layer feature expression capability in neural network

Visualization and quantitative analysis method and system for layer feature expression capability in neural network

Info

Publication number: CN113963185A
Application number: CN202111240906.7A
Authority: CN (China)
Prior art keywords: sample, region, low-dimensional, neural network
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Inventor: 张拳石
Original/Current Assignee: Shanghai Jiaotong University (the listed assignees may be inaccurate; Google has not performed a legal analysis)
Application filed by Shanghai Jiaotong University
Priority application: CN202111240906.7A
Related application: PCT/CN2022/127435 (WO2023072094A1)

Classifications

    • G06F18/22 Pattern recognition; analysing; matching criteria, e.g. proximity measures
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; summarisation; mappings, e.g. subspace methods
    • G06F18/2414 Classification techniques based on distances to training or reference patterns; smoothing the distance, e.g. radial basis function networks [RBFN]
    • G06N3/045 Computing arrangements based on biological models; neural networks; combinations of networks
    • G06N3/08 Neural networks; learning methods


Abstract

The application relates to the technical field of machine learning, and discloses a method and a system for visualizing and quantitatively analyzing the expression capability of intermediate-layer features in a neural network, which can realize automatic visualization and quantitative analysis of intermediate-layer feature expression capability under unsupervised conditions. The method comprises the following steps: providing a neural network to be analyzed, namely a deep neural network pre-trained on a given data set; providing a group of input samples, feeding them into the neural network, and extracting the sample-level features and region-level features corresponding to each sample; reducing the dimensionality of the sample-level features and the region-level features, respectively, to obtain visualizations in a low-dimensional space; and quantitatively analyzing the quantity and quality of knowledge points in the features based on the visualization of the region-level features.

Description

Visualization and quantitative analysis method and system for layer feature expression capability in neural network
Technical Field
The application relates to the technical field of machine learning, and in particular to a method and a system for visualizing and quantitatively analyzing the expression capability of intermediate-layer features in a neural network.
Background
At present, deep neural networks have exhibited powerful performance in various fields, but the black-box nature of neural networks makes it difficult to understand their internal behavior. Among existing techniques, visualization is the most widely applied method in the field of artificial-intelligence interpretability, but existing visualization methods cannot quantitatively analyze the expression capability of the intermediate-layer features of a neural network.
Therefore, combining the visual interpretation of a neural network with the quantitative analysis of the expression capability of its intermediate-layer features is a problem to be solved in the field of artificial-intelligence interpretability.
Disclosure of Invention
The invention aims to provide a method and a system for visualizing and quantitatively analyzing the expression capability of intermediate-layer features in a neural network, which can realize automatic visualization and quantitative analysis of the expression capability of intermediate-layer features under unsupervised conditions.
The application discloses a visualization and quantitative analysis method for the expression capability of intermediate-layer features in a neural network, comprising the following steps:
(1) Selecting a feature interpretation object:
selecting a model to be analyzed, wherein the model contains an intermediate-layer representation, including: a neural network or a hierarchical graph model;
(2) extracting neural network features:
providing a group of input samples, inputting the samples into the neural network, and extracting features of the samples, wherein the features comprise: sample-level features (sample-wise features) and region-level features (regional features);
(3) and (3) reducing the dimension of the features to obtain a visual result:
firstly, carrying out dimension reduction on sample level characteristics to obtain a visual result of the sample level characteristics in a low-dimensional space; secondly, reducing the dimension of the region level features based on the low-dimensional representation and the region level features of the sample level features to obtain a visual result of the region level features in a low-dimensional space;
(4) and (3) carrying out quantitative analysis on the characteristics according to the visualization result:
and quantitatively analyzing the quantity and quality of knowledge points (knowledge points) in the features based on the visualization result.
In a preferred example, the step (1) further comprises the following steps:
based on a certain data set, a neural network is trained as the neural network to be analyzed. Optionally, the neural network is a classification neural network.
In a preferred embodiment, the step (2) further comprises the following sub-steps:
(a) extracting sample level features: inputting a given group of samples into a neural network to be analyzed, and extracting the output characteristics of a middle layer of the neural network for each sample so as to obtain the sample-level characteristics corresponding to each input sample, namely the sample-level characteristics corresponding to the group of input samples.
(b) Extracting region level features: inputting a given group of input samples into a neural network to be analyzed, and extracting the output features of a certain convolution layer of the neural network for each sample so as to obtain a feature map (feature map) corresponding to each input sample, wherein the high-dimensional vector corresponding to each position of the feature map is the region-level feature of the sample in the region. When the height and width of this feature map are H and W, respectively, and there are K channels in total, then this feature map contains HW region-level features, where each region-level feature is a K-dimensional vector.
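By way of illustration, the following sketch shows one way to extract both kinds of features in PyTorch; the use of torchvision's VGG-16 and the particular layers tapped are assumptions for illustration, not requirements of the method.

    import torch
    import torchvision

    # A minimal sketch of step (2); torchvision's VGG-16 stands in for the
    # pre-trained network to be analyzed, and the tapped layers are assumptions.
    model = torchvision.models.vgg16(weights="IMAGENET1K_V1").eval()
    x = torch.rand(8, 3, 224, 224)                # a group of input samples

    with torch.no_grad():
        fmap = model.features(x)                  # conv feature map: [B, K, H, W]
        flat = torch.flatten(model.avgpool(fmap), 1)
        sample_feat = model.classifier[:4](flat)  # sample-level features: [B, D]

    B, K, H, W = fmap.shape
    # Each spatial position of the feature map is one region-level feature:
    # HW region-level features per sample, each a K-dimensional vector.
    region_feat = fmap.permute(0, 2, 3, 1).reshape(B, H * W, K)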
In a preferred embodiment, the step (3) further comprises the following sub-steps:
(a) reducing the dimension of the sample level features, and visualizing the sample level features in a low-dimensional space;
(b) and reducing the dimension of the region-level features and visualizing the region-level features in a low-dimensional space.
In the above sub-step (a), each sample x has a corresponding sample-level feature f ∈ R^D. A projection matrix M ∈ R^{d×D} maps it into a low-dimensional space, giving the low-dimensional characterization of the sample-level feature, g = Mf ∈ R^d. M is optimized so that the closeness between the low-dimensional characterization g and each class is as consistent as possible with the closeness between the sample x and each class.
Optionally, in the calculation of the "proximity of the low-dimensional characterization to each class", the distribution of the low-dimensional characterization g of the sample-level features in the low-dimensional space is modeled by radial distribution (radial distribution), and the proximity of the low-dimensional characterization to each class is calculated based on the distribution.
Based on the radial distribution, the probability density function of g in the low-dimensional space can be calculated as

$$p(g) = \sum_{y=1}^{C} \pi_y \, p(l_g \mid y) \, p_{\mathrm{vMF}}\big(o_g;\ \mu_y,\ \kappa(l_g)\big)$$

where y ∈ {1, 2, ..., C} indexes the different classes in the classification task; π_y is the prior probability of class y; l_g = ‖g‖ is the L2 norm of g, called the strength of g; o_g = g/l_g is the direction (orientation) of g; μ_y is the mean direction of the y-th class; κ(·) is a monotonically increasing function; p(l_g | y) is the prior probability of the strength l_g under class y; and p_vMF(o_g; μ_y, κ(l_g)) is the vMF distribution (von Mises-Fisher distribution) with mean direction μ_y and concentration parameter κ(l_g).
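For concreteness, the vMF factor in the density above can be evaluated as in the following sketch, using the standard vMF density p_vMF(o; μ, κ) = C_d(κ)·exp(κ μᵀo); this is a generic NumPy/SciPy implementation, not code taken from the patent.

    import numpy as np
    from scipy.special import ive

    def vmf_log_pdf(o, mu, kappa):
        # log p_vMF(o; mu, kappa) for unit vectors o, mu on S^{d-1}, with
        # C_d(k) = k^{d/2-1} / ((2*pi)^{d/2} * I_{d/2-1}(k)); the scaled
        # Bessel function ive(v, k) = iv(v, k) * exp(-k) keeps this stable.
        d = o.shape[-1]
        log_cd = ((d / 2 - 1) * np.log(kappa)
                  - (d / 2) * np.log(2 * np.pi)
                  - (np.log(ive(d / 2 - 1, kappa)) + kappa))
        return log_cd + kappa * np.sum(mu * o, axis=-1)

    # Density of a direction o_g under the mean direction mu_y of class y.
    o_g = np.array([0.0, 0.0, 1.0])
    mu_y = np.array([0.0, 0.6, 0.8])
    print(np.exp(vmf_log_pdf(o_g, mu_y, kappa=5.0)))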
Alternatively, κ(·) may be any monotonically increasing function. More preferably, the κ(·) function may be generated as follows: given a non-negative constant κ_m and the dimension d, sample N samples {ô_1, ..., ô_N} from a vMF distribution with a fixed mean direction μ ∈ S^{d-1} and concentration parameter κ = κ_m; scale these samples to length l without changing their direction, where l is an arbitrary non-negative number, and denote the scaled samples {l·ô_1, ..., l·ô_N}; sample N Gaussian noise samples {ε_1, ..., ε_N} from the standard normal distribution; and add the scaled samples and the Gaussian noise samples element-wise to obtain {u_i = l·ô_i + ε_i}. κ(l) is then defined as the concentration estimated from the directions of the noisy samples,

$$\kappa(l) = \frac{\bar{r}\,(d - \bar{r}^2)}{1 - \bar{r}^2}, \qquad \bar{r} = \Big\| \frac{1}{N} \sum_{i=1}^{N} \frac{u_i}{\|u_i\|} \Big\|.$$

N ranges from 1000 to 100000, and in one embodiment, N is 10000.
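This sampling procedure can be sketched as follows. The vMF sampler uses Wood's (1994) rejection method, and the closing concentration estimate is the standard approximation of Banerjee et al. (2005); the patent renders its exact definition of κ(l) as an image, so equating it with this estimator is an assumption.

    import numpy as np

    def sample_vmf(mu, kappa, n, rng):
        # Rejection sampler for the vMF distribution on S^{d-1} (Wood, 1994).
        d = mu.shape[0]
        b = (-2 * kappa + np.sqrt(4 * kappa**2 + (d - 1) ** 2)) / (d - 1)
        x0 = (1 - b) / (1 + b)
        c = kappa * x0 + (d - 1) * np.log(1 - x0**2)
        out = np.empty((n, d))
        for i in range(n):
            while True:
                z = rng.beta((d - 1) / 2, (d - 1) / 2)
                w = (1 - (1 + b) * z) / (1 - (1 - b) * z)
                if kappa * w + (d - 1) * np.log(1 - x0 * w) - c >= np.log(rng.uniform()):
                    break
            v = rng.normal(size=d)
            v -= (v @ mu) * mu                      # component orthogonal to mu
            v /= np.linalg.norm(v)
            out[i] = w * mu + np.sqrt(1.0 - w**2) * v
        return out

    def kappa_of_l(l, d, kappa_m, n_samples=10000, seed=0):
        rng = np.random.default_rng(seed)
        mu = np.eye(d)[0]                           # fixed mean direction
        o = sample_vmf(mu, kappa_m, n_samples, rng) # N vMF samples
        u = l * o + rng.normal(size=(n_samples, d)) # scale to length l, add noise
        dirs = u / np.linalg.norm(u, axis=1, keepdims=True)
        r_bar = np.linalg.norm(dirs.mean(axis=0))
        # Concentration of the noisy directions (Banerjee et al., 2005):
        return r_bar * (d - r_bar**2) / (1 - r_bar**2)

As expected of a κ(·) candidate, the returned value increases monotonically with l, since the Gaussian noise perturbs short vectors far more than long ones.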
Based on the radial distribution, and assuming that l_g is independent of the class y, the closeness Q_M(y|x) between the low-dimensional characterization g and class y can be calculated as

$$Q_M(y \mid x) = \frac{\pi_y \, p_{\mathrm{vMF}}\big(o_g;\ \mu_y,\ \kappa(l_g)\big)}{\sum_{y'=1}^{C} \pi_{y'} \, p_{\mathrm{vMF}}\big(o_g;\ \mu_{y'},\ \kappa(l_g)\big)}.$$
Alternatively, the "closeness of the sample to each class" is calculated as the output probability of the sample, i.e. the closeness P (y | x) of the sample x and the y-th class is the output probability value of the corresponding y-th class in the neural network output.
Optionally, based on the closeness Q_M(y|x) between the low-dimensional characterization g and each class and the closeness P(y|x) between the sample x and each class, the projection matrix M is optimized so that the KL divergence (Kullback-Leibler divergence) between P(y|x) and Q_M(y|x) is minimized:

$$\min_M \, \mathrm{KL}\big(P(Y \mid X) \,\|\, Q_M(Y \mid X)\big) = \min_M \, \mathbb{E}_x \sum_{y=1}^{C} P(y \mid x) \log \frac{P(y \mid x)}{Q_M(y \mid x)}.$$

Optionally, in obtaining the low-dimensional characterization of the sample-level features, the projection matrix M and the parameters {π, μ} = {π_y, μ_y}_{y∈Y} of the radial distribution are optimized alternately. When optimizing the projection matrix M, the parameters {π, μ} of the radial distribution are fixed and M is updated so that KL(P(Y|X) ‖ Q_M(Y|X)) is minimized; when optimizing the parameters {π, μ} of the radial distribution, the projection matrix M is fixed and {π, μ} are updated so that the likelihood

$$\prod_g p(g) = \prod_g \sum_{y'} \pi_{y'} \, p_{\mathrm{vMF}}\big(o_g;\ \mu_{y'},\ \kappa(l_g)\big)$$

is maximized.
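One M-step of this alternating scheme might look like the sketch below. It uses the fact that in Q_M(y|x) the vMF normalizer C_d(κ(l_g)) is shared by all classes and cancels, so Q_M reduces to a softmax over log π_y + κ(l_g)·μ_yᵀo_g; the optimizer, learning rate, iteration count, and the monotone placeholder for κ(·) are illustrative assumptions.

    import torch
    import torch.nn.functional as F

    def q_m(g, log_pi, mu, kappa_fn):
        # Q_M(y|x) = softmax_y(log pi_y + kappa(l_g) * mu_y . o_g); the vMF
        # normalizer C_d(kappa(l_g)) is identical across classes and cancels.
        l = g.norm(dim=1, keepdim=True)            # strength l_g
        o = g / l                                  # orientation o_g
        return F.softmax(log_pi + kappa_fn(l) * (o @ mu.T), dim=1)

    def m_step(M, f, p_yx, log_pi, mu, kappa_fn, lr=1e-3, n_iter=200):
        # Fix {pi, mu}; update M to minimize KL(P(Y|X) || Q_M(Y|X)).
        M = M.clone().requires_grad_(True)
        opt = torch.optim.Adam([M], lr=lr)
        for _ in range(n_iter):
            q = q_m(f @ M.T, log_pi, mu, kappa_fn)
            loss = F.kl_div(q.log(), p_yx, reduction="batchmean")
            opt.zero_grad(); loss.backward(); opt.step()
        return M.detach()

    # Usage with random stand-ins and a monotone placeholder for kappa(.):
    D, d, C, B = 4096, 3, 10, 64
    f = torch.randn(B, D)                          # sample-level features
    p_yx = F.softmax(torch.randn(B, C), dim=1)     # network output P(y|x)
    mu = F.normalize(torch.randn(C, d), dim=1)     # class mean directions
    log_pi = torch.full((C,), 1.0 / C).log()       # uniform class priors
    M = m_step(0.01 * torch.randn(d, D), f, p_yx, log_pi, mu, lambda l: 5.0 * l)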
In sub-step (b) above, each sample x has HW region-level features f^(1), ..., f^(HW) ∈ R^K. A projection matrix Λ ∈ R^{d×K} maps them into a low-dimensional space, giving the low-dimensional characterizations of the HW region-level features, h^(r) = Λ f^(r) ∈ R^d, r = 1, ..., HW. Λ is optimized so that the inter-sample similarity inferred from the low-dimensional characterizations h = {h^(1), h^(2), ..., h^(HW)} is as consistent as possible with the inter-sample similarity inferred from the network output; furthermore, the low-dimensional characterizations of the region-level features need to be aligned with the low-dimensional characterization of the sample-level features.
Optionally, in the calculation of "similarity between samples inferred based on low-dimensional features", the similarity between samples is split into weighted products of the similarities between low-dimensional features corresponding to each region based on the bag-of-words model; and based on vMF distribution, the similarity between the corresponding low-dimensional representations of each region is quantified.
Alternatively, in the above, let x_1 and x_2 be any two samples, and let h_1 = {h_1^(1), ..., h_1^(HW)} and h_2 = {h_2^(1), ..., h_2^(HW)} be the low-dimensional characterizations of their respective region-level features. Based on the bag-of-words model, the similarity Q_Λ(x_2 | x_1) between x_1 and x_2 is split into a weighted product of the similarities between the low-dimensional characterizations of the individual regions:

$$Q_\Lambda(x_2 \mid x_1) \propto \prod_{r=1}^{HW} p\big(h_2^{(r)} \mid h_1\big)^{w_2^{(r)}}$$

where w_2^(r) represents the importance of the r-th region-level feature of sample x_2 for classification; in a preferred embodiment, w_2^(r) is a non-negative number. The similarity p(h_2^(r) | h_1) is further quantified based on the vMF distribution as

$$p\big(h_2^{(r)} \mid h_1\big) = p_{\mathrm{vMF}}\big(o_{h_2^{(r)}};\ \mu_{h_1},\ \kappa(l_{h_2^{(r)}})\big)$$

i.e., a vMF distribution whose mean direction μ_{h_1} is the mean direction of the low-dimensional characterizations of the region-level features of x_1, and whose concentration parameter is κ(l_{h_2^{(r)}}); as stated in claim 6, κ(·) is a monotonically increasing function.
Alternatively, in the calculation of the "similarity between samples inferred based on the network output", let x_1 and x_2 be any two samples, and let ŷ_1 and ŷ_2 be the network output probability vectors corresponding to the two samples. The similarity P(x_2 | x_1) between the samples inferred from the network output can then be calculated as

$$P(x_2 \mid x_1) = \frac{\exp\big(\kappa_p \cos(\hat{y}_1, \hat{y}_2)\big)}{\sum_{x' \neq x_1} \exp\big(\kappa_p \cos(\hat{y}_1, \hat{y}_{x'})\big)}$$

where cos(·, ·) represents the cosine similarity between two vectors and κ_p is a non-negative constant.
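Reading the normalization as running over the other samples in a batch, P(x_2 | x_1) can be computed as in this sketch; κ_p = 10 is an arbitrary illustrative value.

    import torch
    import torch.nn.functional as F

    def p_sim(y_hat, kappa_p=10.0):
        # P(x2|x1) as a softmax over kappa_p * cosine similarity between
        # network output probability vectors, excluding x2 == x1.
        y = F.normalize(y_hat, dim=1)          # unit norm, so dot = cosine
        logits = kappa_p * (y @ y.T)
        logits.fill_diagonal_(float("-inf"))   # remove the diagonal term
        return F.softmax(logits, dim=1)        # row i holds P(. | x_i)

    probs = p_sim(F.softmax(torch.randn(16, 10), dim=1))
    assert torch.allclose(probs.sum(dim=1), torch.ones(16))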
Alternatively, making "the inter-sample similarity inferred based on the low-dimensional characterization as consistent as possible with the inter-sample similarity inferred based on the network output" is equivalent to minimizing the loss function

$$\mathcal{L}_1 = \sum_{x_1} \mathrm{KL}\big(P(X \mid x_1) \,\|\, Q_\Lambda(X \mid x_1)\big)$$

where P(x_2 | x_1) and Q_Λ(x_2 | x_1) are as defined above.
Optionally, aligning the low-dimensional characterizations of the region-level features with the low-dimensional characterization of the sample-level features is equivalent to minimizing the loss function

$$\mathcal{L}_2 = -\,\mathrm{MI}\Big(g;\ \sum_{r} w^{(r)} h^{(r)}\Big)$$

where MI(·;·) represents mutual information; g represents the low-dimensional characterization of the sample-level feature of sample x; h^(r) represents the low-dimensional characterization of the r-th region-level feature of sample x; and w^(r) represents the importance of the r-th region-level feature of sample x for classification.
Optionally, the optimization of the projection matrix Λ comprises the following steps: calculate the two loss functions $\mathcal{L}_1$ and $\mathcal{L}_2$ above, form the total loss function

$$\mathcal{L} = \mathcal{L}_1 + \alpha\,\mathcal{L}_2,$$

and optimize Λ so that the total loss function $\mathcal{L}$ is minimized. Here α is a positive constant; in a preferred embodiment, α ranges from 0.01 to 100. In one embodiment, the value of α is 0.1.
In a preferred example, the step (4) further comprises the following steps:
(a) quantifying the knowledge points in the region-level features;
(b) further quantifying the reliable knowledge points, as well as the proportion of reliable knowledge points.
Optionally, a knowledge point is defined as a region-level feature for which the following quantity is greater than a certain threshold:

$$\max_{c}\; p\big(y = c \mid h^{(r)}\big)$$

where h^(r) is the low-dimensional characterization of the r-th region-level feature corresponding to a sample x. The knowledge points are therefore the region-level features in the set {h^(r) : max_c p(y = c | h^(r)) > τ}, where τ is a positive constant; in a preferred embodiment, τ ranges from 0.3 to 0.8. In one embodiment, τ is 0.4.
Optionally, the reliable and unreliable knowledge points among the knowledge points can be further quantified, and thereby the proportion of reliable knowledge points among all knowledge points. The reliable knowledge points are the knowledge points that further satisfy

$$c_{\mathrm{truth}} = \arg\max_{c}\; p\big(y = c \mid h^{(r)}\big)$$

where h^(r) is the low-dimensional characterization of the r-th region-level feature corresponding to a sample x, and c_truth is the true class label of that sample. That is, the reliable knowledge points are the region-level features contained in the set {h^(r) : c_truth = arg max_c p(y = c | h^(r))}.
Further, the ratio of reliable knowledge points to total knowledge points measures the quality of the knowledge points and can be calculated as

$$\lambda = \frac{\#\{\text{reliable knowledge points}\}}{\#\{\text{knowledge points}\}}.$$
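Given the per-region class posteriors p(y = c | h^(r)), the knowledge points, reliable knowledge points, and the quality ratio λ can be computed as in this sketch; the array shapes and τ = 0.4 follow the embodiment described above.

    import numpy as np

    def knowledge_point_stats(p_region, labels, tau=0.4):
        # p_region: [n, HW, C] posteriors p(y=c | h^(r)); labels: [n] true classes.
        conf = p_region.max(axis=2)                  # max_c p(y=c | h^(r))
        pred = p_region.argmax(axis=2)               # arg max_c p(y=c | h^(r))
        is_kp = conf > tau                           # knowledge points
        is_reliable = is_kp & (pred == labels[:, None])
        n_kp = int(is_kp.sum())
        lam = is_reliable.sum() / n_kp if n_kp else 0.0  # quality ratio lambda
        return n_kp, lam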
In a second aspect of the present invention, there is provided a system for visualizing and quantitatively analyzing the expression capability of intermediate-layer features in a neural network, comprising:
(1) an input module, configured to provide a pre-trained classification neural network and input samples containing all possible classes;
(2) a feature extraction module, configured to extract the sample-level features and region-level features of the input samples;
(3) a visualization module, configured to reduce the dimensionality of the extracted sample-level features and region-level features to obtain low-dimensional characterizations, and to visualize them in a low-dimensional space;
(4) a quantitative analysis module, configured to quantitatively analyze the quantity and quality of the knowledge points in the features based on the visualization result of the region-level features.
It is to be understood that, within the scope of the present invention, the above-described features of the present invention and those specifically described below (e.g., in the examples) may be combined with each other to form new or preferred embodiments. Due to space limitations, these combinations are not described one by one here.
Other features, objects and advantages of the present invention will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, with reference to the accompanying drawings.
Drawings
FIG. 1 is a flow chart of a method for visualizing and quantitatively analyzing the expression capability of layer features in a neural network according to a first embodiment of the present invention;
FIG. 2 is a schematic illustration of a visualization of sample-level features in a low-dimensional space, obtained in accordance with the present invention;
FIG. 3 is a schematic diagram of the visualization of region-level features in a low-dimensional space obtained according to the present invention at different training stages of a neural network;
FIG. 4 is a schematic diagram of the visualization of region-level features in a low-dimensional space obtained according to the present invention at different forward propagation stages of a neural network;
FIG. 5 is a graph of the variation of knowledge points and reliable knowledge points in different interlayer features of a neural network with different training stages of the neural network, obtained according to the quantization of the number and quality of the knowledge points in the present invention;
FIG. 6 is a schematic view of the visualization of knowledge points in different inter-layer features of a neural network, obtained by quantifying the knowledge points according to the present invention;
fig. 7 is a schematic structural diagram of a system for visualizing and quantitatively analyzing the expression capability of layer features in a neural network according to a second embodiment of the present invention.
Detailed Description
Through careful and intensive research, the inventor has developed, for the first time, a method and system for visualizing and quantitatively analyzing the expression capability of intermediate-layer features in a neural network. With this method and system, the visual interpretation of a neural network is closely tied to the quantitative analysis of its intermediate-layer feature expression capability; visualization can clearly show how the expression capability of intermediate-layer features emerges over time and space; the quantity and quality of intermediate-layer knowledge points can be quantitatively analyzed, and the reliability of the model to be explained can be further analyzed; and, based on this method and system, an interpretation framework from a brand-new angle can be provided for existing deep learning algorithms, such as adversarial attack and knowledge distillation.
General procedure
Typically, the present invention comprises the steps of:
(1) providing a neural network to be analyzed, wherein the neural network is a deep neural network pre-trained on a certain data set;
(2) providing a group of input samples, inputting the samples into the neural network, and extracting the sample-level features (sample-wise features) and region-level features (regional features) corresponding to the samples;
(3) respectively reducing the dimensions of the sample level features and the region level features to obtain a visualization result in a low-dimensional space;
(4) and quantitatively analyzing the quantity and quality of knowledge points (knowledge points) in the features based on the visualization result of the region-level features.
The main advantages of the invention are:
(1) it provides a visualization and quantitative analysis method and system for the expression capability of intermediate-layer features in a neural network;
(2) with this method and system, the visual interpretation of a neural network is closely tied to the quantitative analysis of the intermediate-layer feature expression capability;
(3) through visualization, the process by which the expression capability of intermediate-layer features emerges over time and space can be clearly shown;
(4) the quantity and quality of intermediate-layer knowledge points in the neural network can be quantitatively analyzed, and the reliability of the model to be explained can be further analyzed;
(5) based on this method and system, an interpretation framework from a brand-new angle can be provided for existing deep learning algorithms, such as adversarial attack and knowledge distillation.
Examples
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
The first embodiment of the present invention relates to a method for visualizing and quantitatively analyzing the expression capability of a layer feature in a neural network, the flow of which is shown in fig. 1, and the method comprises the following steps:
in step 101: based on a certain data set, a neural network is trained as the neural network to be analyzed. Optionally, the neural network is a classification neural network.
Then, step 102 is entered, which may be further divided into the following two sub-steps:
(a) extracting sample level features: inputting a given group of samples into a neural network to be analyzed, and extracting the output characteristics of a middle layer of the neural network for each sample so as to obtain the sample-level characteristics corresponding to each input sample, namely the sample-level characteristics corresponding to the group of input samples.
(b) Extracting region level features: inputting a given group of input samples into a neural network to be analyzed, and extracting the output features of a certain convolution layer of the neural network for each sample so as to obtain a feature map (feature map) corresponding to each input sample, wherein the high-dimensional vector corresponding to each position of the feature map is the region-level feature of the sample in the region. When the height and width of this feature map are H and W, respectively, and there are K channels in total, then this feature map contains HW region-level features, where each region-level feature is a K-dimensional vector.
Then, step 103 is entered, which may be further divided into the following two sub-steps:
(a) reducing the dimension of the sample level features, and visualizing the sample level features in a low-dimensional space;
(b) and reducing the dimension of the region-level features and visualizing the region-level features in a low-dimensional space.
In the sub-step (a), each sample x has a corresponding sample-level feature f ∈ R^D. A projection matrix M ∈ R^{d×D} maps it into a low-dimensional space, giving the low-dimensional characterization of the sample-level feature, g = Mf ∈ R^d. M is optimized so that the closeness between the low-dimensional characterization g and each class is as consistent as possible with the closeness between the sample x and each class.
Optionally, in the calculation of the "proximity of the low-dimensional characterization to each class", the distribution of the low-dimensional characterization g of the sample-level features in the low-dimensional space is modeled by radial distribution (radial distribution), and the proximity of the low-dimensional characterization to each class is calculated based on the distribution.
Based on the radial distribution, the probability density function of g in the low-dimensional space can be calculated as

$$p(g) = \sum_{y=1}^{C} \pi_y \, p(l_g \mid y) \, p_{\mathrm{vMF}}\big(o_g;\ \mu_y,\ \kappa(l_g)\big)$$

where y ∈ {1, 2, ..., C} indexes the different classes in the classification task; π_y is the prior probability of class y; l_g = ‖g‖ is the L2 norm of g, called the strength of g; o_g = g/l_g is the direction (orientation) of g; μ_y is the mean direction of the y-th class; κ(·) is a monotonically increasing function; p(l_g | y) is the prior probability of the strength l_g under class y; and p_vMF(o_g; μ_y, κ(l_g)) is the vMF distribution (von Mises-Fisher distribution) with mean direction μ_y and concentration parameter κ(l_g).
Alternatively, κ(·) may be any monotonically increasing function. More preferably, the κ(·) function may be generated as follows: given a non-negative constant κ_m and the dimension d, sample N samples {ô_1, ..., ô_N} from a vMF distribution with a fixed mean direction μ ∈ S^{d-1} and concentration parameter κ = κ_m; scale these samples to length l without changing their direction, where l is an arbitrary non-negative number, and denote the scaled samples {l·ô_1, ..., l·ô_N}; sample N Gaussian noise samples {ε_1, ..., ε_N} from the standard normal distribution; and add the scaled samples and the Gaussian noise samples element-wise to obtain {u_i = l·ô_i + ε_i}. κ(l) is then defined as the concentration estimated from the directions of the noisy samples,

$$\kappa(l) = \frac{\bar{r}\,(d - \bar{r}^2)}{1 - \bar{r}^2}, \qquad \bar{r} = \Big\| \frac{1}{N} \sum_{i=1}^{N} \frac{u_i}{\|u_i\|} \Big\|.$$

N ranges from 1000 to 100000, and in one embodiment, N is 10000.
Based on the radial distribution, and assuming that l_g is independent of the class y, the closeness Q_M(y|x) between the low-dimensional characterization g and class y can be calculated as

$$Q_M(y \mid x) = \frac{\pi_y \, p_{\mathrm{vMF}}\big(o_g;\ \mu_y,\ \kappa(l_g)\big)}{\sum_{y'=1}^{C} \pi_{y'} \, p_{\mathrm{vMF}}\big(o_g;\ \mu_{y'},\ \kappa(l_g)\big)}.$$
Alternatively, the "closeness of the sample to each class" is calculated as the output probability of the sample, i.e. the closeness P (y | x) of the sample x and the y-th class is the output probability value of the corresponding y-th class in the neural network output.
Optionally, based on the closeness Q_M(y|x) between the low-dimensional characterization g and each class and the closeness P(y|x) between the sample x and each class, the projection matrix M is optimized so that the KL divergence (Kullback-Leibler divergence) between P(y|x) and Q_M(y|x) is minimized:

$$\min_M \, \mathrm{KL}\big(P(Y \mid X) \,\|\, Q_M(Y \mid X)\big) = \min_M \, \mathbb{E}_x \sum_{y=1}^{C} P(y \mid x) \log \frac{P(y \mid x)}{Q_M(y \mid x)}.$$

Optionally, in obtaining the low-dimensional characterization of the sample-level features, the projection matrix M and the parameters {π, μ} = {π_y, μ_y}_{y∈Y} of the radial distribution are optimized alternately. When optimizing the projection matrix M, the parameters {π, μ} of the radial distribution are fixed and M is updated so that KL(P(Y|X) ‖ Q_M(Y|X)) is minimized; when optimizing the parameters {π, μ} of the radial distribution, the projection matrix M is fixed and {π, μ} are updated so that the likelihood

$$\prod_g p(g) = \prod_g \sum_{y'} \pi_{y'} \, p_{\mathrm{vMF}}\big(o_g;\ \mu_{y'},\ \kappa(l_g)\big)$$

is maximized.
As shown in FIG. 2, in one embodiment of the present invention, a VGG-16 network pre-trained on the Tiny ImageNet image classification dataset is given, where during training the neural network uses only ten categories: steel arch bridge, school bus, sports car, tabby cat, desk, golden retriever, tall, iPod, life, and orange. The features output by the penultimate fully connected layer of the VGG-16 network are extracted as sample-level features, and the aforementioned method is used to perform dimension reduction, giving a scatter plot of the low-dimensional characterizations in a three-dimensional space. In the figure, different colors indicate the different categories shown in the legend, and the arrow in each category's color indicates the mean direction of that category.
In sub-step (b) above, each sample x has HW region-level features f^(1), ..., f^(HW) ∈ R^K. A projection matrix Λ ∈ R^{d×K} maps them into a low-dimensional space, giving the low-dimensional characterizations of the HW region-level features, h^(r) = Λ f^(r) ∈ R^d, r = 1, ..., HW. Λ is optimized so that the inter-sample similarity inferred from the low-dimensional characterizations h = {h^(1), h^(2), ..., h^(HW)} is as consistent as possible with the inter-sample similarity inferred from the network output; furthermore, the low-dimensional characterizations of the region-level features need to be aligned with the low-dimensional characterization of the sample-level features.
Optionally, in the calculation of "similarity between samples inferred based on low-dimensional features", the similarity between samples is split into weighted products of the similarities between low-dimensional features corresponding to each region based on the bag-of-words model; and based on vMF distribution, the similarity between the corresponding low-dimensional representations of each region is quantified.
Alternatively, in the above, let x_1 and x_2 be any two samples, and let h_1 = {h_1^(1), ..., h_1^(HW)} and h_2 = {h_2^(1), ..., h_2^(HW)} be the low-dimensional characterizations of their respective region-level features. Based on the bag-of-words model, the similarity Q_Λ(x_2 | x_1) between x_1 and x_2 is split into a weighted product of the similarities between the low-dimensional characterizations of the individual regions:

$$Q_\Lambda(x_2 \mid x_1) \propto \prod_{r=1}^{HW} p\big(h_2^{(r)} \mid h_1\big)^{w_2^{(r)}}$$

where w_2^(r) represents the importance of the r-th region-level feature of sample x_2 for classification; in a preferred embodiment, w_2^(r) is a non-negative number. The similarity p(h_2^(r) | h_1) is further quantified based on the vMF distribution as

$$p\big(h_2^{(r)} \mid h_1\big) = p_{\mathrm{vMF}}\big(o_{h_2^{(r)}};\ \mu_{h_1},\ \kappa(l_{h_2^{(r)}})\big)$$

i.e., a vMF distribution whose mean direction μ_{h_1} is the mean direction of the low-dimensional characterizations of the region-level features of x_1, and whose concentration parameter is κ(l_{h_2^{(r)}}); as stated in claim 6, κ(·) is a monotonically increasing function.
Alternatively, in the calculation of the "similarity between samples inferred based on the network output", let x_1 and x_2 be any two samples, and let ŷ_1 and ŷ_2 be the network output probability vectors corresponding to the two samples. The similarity P(x_2 | x_1) between the samples inferred from the network output can then be calculated as

$$P(x_2 \mid x_1) = \frac{\exp\big(\kappa_p \cos(\hat{y}_1, \hat{y}_2)\big)}{\sum_{x' \neq x_1} \exp\big(\kappa_p \cos(\hat{y}_1, \hat{y}_{x'})\big)}$$

where cos(·, ·) represents the cosine similarity between two vectors and κ_p is a non-negative constant.
Alternatively, making "the inter-sample similarity inferred based on the low-dimensional characterization as consistent as possible with the inter-sample similarity inferred based on the network output" is equivalent to minimizing the loss function

$$\mathcal{L}_1 = \sum_{x_1} \mathrm{KL}\big(P(X \mid x_1) \,\|\, Q_\Lambda(X \mid x_1)\big)$$

where P(x_2 | x_1) and Q_Λ(x_2 | x_1) are as defined above.
Optionally, aligning the low-dimensional characterizations of the region-level features with the low-dimensional characterization of the sample-level features is equivalent to minimizing the loss function

$$\mathcal{L}_2 = -\,\mathrm{MI}\Big(g;\ \sum_{r} w^{(r)} h^{(r)}\Big)$$

where MI(·;·) represents mutual information; g represents the low-dimensional characterization of the sample-level feature of sample x; h^(r) represents the low-dimensional characterization of the r-th region-level feature of sample x; and w^(r) represents the importance of the r-th region-level feature of sample x for classification.
Optionally, the optimization of the projection matrix Λ comprises the following steps: calculate the two loss functions $\mathcal{L}_1$ and $\mathcal{L}_2$ above, form the total loss function

$$\mathcal{L} = \mathcal{L}_1 + \alpha\,\mathcal{L}_2,$$

and optimize Λ so that the total loss function $\mathcal{L}$ is minimized. Here α is a positive constant; in a preferred embodiment, α ranges from 0.01 to 100. In one embodiment, the value of α is 0.1.
As shown in FIG. 3, given the same VGG-16 network pre-trained on the Tiny ImageNet dataset as before, the output features of the conv_53 layer are extracted as region-level features, and the method above is used to reduce them to three dimensions, giving the low-dimensional characterizations and their distribution in three-dimensional space. As before, scatter points of different colors represent the low-dimensional characterizations of the region-level features corresponding to samples of different classes, and the ellipses in the figure represent the approximate distributions of the low-dimensional characterizations of the region-level features for the different classes.
As shown in FIG. 4, given the same VGG-16 network pre-trained on the Tiny ImageNet dataset as before, the output features of the conv_12, conv_22, conv_33, conv_43, and conv_53 layers are extracted as region-level features and reduced to a three-dimensional space by the method described above. FIG. 4 shows how the distribution of the low-dimensional characterizations of the region-level features of different samples changes as the number of forward-propagation layers increases, where the vertically upward arrow represents the mean direction of the sample's correct class, and scatter points of different colors represent the distributions of the low-dimensional characterizations of the region-level features of the different layers in the three-dimensional space.
Then, step 104 is entered, which may be further divided into the following sub-steps:
(a) quantifying the knowledge points in the region-level features;
(b) further quantifying the reliable knowledge points, as well as the proportion of reliable knowledge points.
Optionally, a knowledge point is defined as a region-level feature for which the following quantity is greater than a certain threshold:

$$\max_{c}\; p\big(y = c \mid h^{(r)}\big)$$

where h^(r) is the low-dimensional characterization of the r-th region-level feature corresponding to a sample x. The knowledge points are therefore the region-level features in the set {h^(r) : max_c p(y = c | h^(r)) > τ}, where τ is a positive constant; in a preferred embodiment, τ ranges from 0.3 to 0.8. In one embodiment, τ is 0.4.
Optionally, the reliable and unreliable knowledge points among the knowledge points can be further quantified, and thereby the proportion of reliable knowledge points among all knowledge points. The reliable knowledge points are the knowledge points that further satisfy

$$c_{\mathrm{truth}} = \arg\max_{c}\; p\big(y = c \mid h^{(r)}\big)$$

where h^(r) is the low-dimensional characterization of the r-th region-level feature corresponding to a sample x, and c_truth is the true class label of that sample. That is, the reliable knowledge points are the region-level features contained in the set {h^(r) : c_truth = arg max_c p(y = c | h^(r))}.
Further, the ratio of reliable knowledge points to total knowledge points measures the quality of the knowledge points and can be calculated as

$$\lambda = \frac{\#\{\text{reliable knowledge points}\}}{\#\{\text{knowledge points}\}}.$$
As shown in fig. 5, given a VGG-16 network pre-trained on the Tiny ImageNet dataset as described above, the number of all knowledge points and the number of reliable knowledge points in the corresponding region-level features of the conv _33, conv _43 and conv _53 layers were calculated by the method described above. Fig. 5 shows the variation curve of the total amount of knowledge points and the number of reliable knowledge points of different layers of the neural network with the number of training iterations of the neural network.
As shown in FIG. 6, given the same VGG-16 network pre-trained on the Tiny ImageNet dataset as before, all knowledge points in the region-level features of the conv_33, conv_43, and conv_53 layers were obtained by the method described above. FIG. 6 highlights the image regions corresponding to the knowledge points of the different layers.
A second embodiment of the present invention relates to a system for visualizing and quantitatively analyzing the expression capability of a layer feature in a neural network, which has a structure shown in fig. 7 and includes:
(1) an input module, configured to provide a pre-trained classification neural network and input samples containing all possible classes;
(2) a feature extraction module, configured to extract the sample-level features and region-level features of the input samples;
(3) a visualization module, configured to reduce the dimensionality of the extracted sample-level features and region-level features to obtain low-dimensional characterizations, and to visualize them in a low-dimensional space;
(4) a quantitative analysis module, configured to quantitatively analyze the quantity and quality of the knowledge points in the features based on the visualization result of the region-level features.
It should be noted that, as will be understood by those skilled in the art, the implementation functions of the modules shown in the embodiment of the system for visualizing and quantitatively analyzing the expression capability of the layer features in the neural network described above can be understood by referring to the foregoing description of the method for visualizing and quantitatively analyzing the expression capability of the layer features in the neural network. The functions of the modules shown in the embodiment of the system for visualizing and quantitatively analyzing layer feature expression capability in a neural network can be realized by a program (executable instructions) running on a processor, and can also be realized by specific logic circuits. The visualization and quantitative analysis system for the layer feature expression capability in the neural network according to the embodiment of the present invention may be implemented in the form of a software functional module and may be stored in a computer-readable storage medium when the system is sold or used as an independent product. Based on such understanding, the technical solutions of the embodiments of the present invention may be essentially implemented or a part contributing to the prior art may be embodied in the form of a software product stored in a storage medium, and including several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the method of the embodiments of the present invention. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read Only Memory (ROM), a magnetic disk, or an optical disk. Thus, embodiments of the invention are not limited to any specific combination of hardware and software.
Accordingly, the embodiment of the present invention also provides a computer-readable storage medium in which computer-executable instructions are stored, and the computer-executable instructions, when executed by a processor, implement the method embodiments of the present invention. Computer-readable storage media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, a computer-readable storage medium does not include transitory computer-readable media such as modulated data signals and carrier waves.
In addition, the embodiment of the invention also provides a system for visualizing and quantitatively analyzing the expression capability of the layer characteristics in the neural network, which comprises a memory for storing computer executable instructions and a processor; the processor is configured to implement the steps of the method embodiments described above when executing the computer-executable instructions in the memory. The Processor may be a Central Processing Unit (CPU), other general-purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), or the like. The aforementioned memory may be a read-only memory (ROM), a Random Access Memory (RAM), a Flash memory (Flash), a hard disk, or a solid state disk. The steps of the method disclosed in the embodiments of the present invention may be directly implemented by a hardware processor, or implemented by a combination of hardware and software modules in the processor.
It is noted that, in this patent specification, relational terms such as first and second are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element. In this patent document, if it is mentioned that an action is performed according to a certain element, it means that the action is performed at least according to that element, and includes two cases: performing the action based only on that element, and performing the action based on that element and other elements. Expressions such as "a plurality of" include two and more than two.
All documents mentioned in this application are incorporated by reference in their entirety into the disclosure of the present invention, as if each had been individually incorporated by reference. It should be understood that the above description covers only preferred embodiments of the present disclosure and is not intended to limit its scope. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of one or more embodiments of the present disclosure should be included in the scope of protection of one or more embodiments of the present disclosure.

Claims (10)

1. A visualization and quantitative analysis method for the expression capability of intermediate-layer features in a neural network, characterized by comprising the following steps:
(1) Selecting a feature interpretation object:
selecting a model to be analyzed, wherein the model contains an intermediate-layer representation, including: a neural network or a hierarchical graph model;
(2) extracting neural network features:
providing a group of input samples, inputting the samples into the neural network, and extracting features of the samples, wherein the features comprise: sample-level features (sample-wise features) and region-level features (regional features);
(3) and (3) reducing the dimension of the features to obtain a visual result:
firstly, carrying out dimension reduction on sample level characteristics to obtain a visual result of the sample level characteristics in a low-dimensional space; secondly, reducing the dimension of the region level features based on the low-dimensional representation and the region level features of the sample level features to obtain a visual result of the region level features in a low-dimensional space;
(4) and (3) carrying out quantitative analysis on the characteristics according to the visualization result:
and quantitatively analyzing the quantity and quality of knowledge points (knowledge points) in the features based on the visualization result.
2. The method of claim 1, wherein the step (2) of extracting the sample-level features further comprises the steps of:
inputting a given group of samples into a neural network to be analyzed, and extracting the output characteristics of a middle layer of the neural network for each sample so as to obtain the sample-level characteristics corresponding to each input sample, namely the sample-level characteristics corresponding to the group of input samples.
3. The method of claim 1, wherein the extracting of the region-level features in step (2) further comprises the steps of:
inputting a given group of input samples into a neural network to be analyzed, and extracting output features of a certain convolution layer of the neural network for each sample so as to obtain a feature map (feature map) corresponding to each input sample, wherein a high-dimensional vector corresponding to each position of the feature map is the region-level feature of the sample in the region; when the height and width of this feature map are H and W, respectively, and there are K channels in total, then this feature map contains HW region-level features, where each region-level feature is a K-dimensional vector.
4. The method of claim 1, wherein the dimension reduction process for the sample-level features in step (3) comprises the steps of:
for each sample x, mapping the corresponding sample-level feature f ∈ R^D into a low-dimensional space by a projection matrix M ∈ R^{d×D} to obtain the low-dimensional characterization of the sample-level feature, g = Mf ∈ R^d; and optimizing M so that the closeness between the low-dimensional characterization g and each class is as consistent as possible with the closeness between the sample x and each class.
5. The method of claim 4, wherein the calculation of the closeness between the low-dimensional characterization and each class comprises:
(a) modeling the distribution of the low-dimensional characterizations g of the sample-level features in the low-dimensional space using a radial distribution;
(b) calculating the closeness between the low-dimensional characterization and each class based on this distribution.
6. The method of claim 5, wherein step (a) comprises:
based on the radial distribution, the probability density function of g in the low-dimensional space can be written as:

$$p(g) = \sum_{y=1}^{C} \pi_y \, p(l_g \mid y) \, p_{\mathrm{vMF}}\big(o_g;\ \mu_y,\ \kappa(l_g)\big)$$

wherein y ∈ {1, 2, ..., C} represents the different classes in the classification task; π_y represents the prior probability of class y; l_g = ‖g‖ represents the L2 norm of g, called the strength of g; o_g = g/l_g represents the direction (orientation) of g; μ_y represents the mean direction of the y-th class; κ(·) is a monotonically increasing function; p(l_g | y) represents the prior probability of l_g under class y; and p_vMF(o_g; μ_y, κ(l_g)) is the vMF distribution (von Mises-Fisher distribution) with mean direction μ_y and concentration parameter κ(l_g).
7. The method of claim 5, wherein step (b) comprises:
based on the radial distribution, and assuming that l_g is independent of the class y, the closeness Q_M(y|x) between the low-dimensional characterization g and class y is expressed as:

$$Q_M(y \mid x) = \frac{\pi_y \, p_{\mathrm{vMF}}\big(o_g;\ \mu_y,\ \kappa(l_g)\big)}{\sum_{y'=1}^{C} \pi_{y'} \, p_{\mathrm{vMF}}\big(o_g;\ \mu_{y'},\ \kappa(l_g)\big)}.$$
8. The method of claim 1, wherein the dimension reduction process for the region-level features in step (3) further comprises the steps of:
for each sample x, mapping the HW region-level features f^(1), ..., f^(HW) ∈ R^K into a low-dimensional space by a projection matrix Λ ∈ R^{d×K} to obtain the low-dimensional characterizations of the HW region-level features, h^(r) = Λ f^(r), r = 1, ..., HW; and optimizing Λ so that the inter-sample similarity inferred from the low-dimensional characterizations h = {h^(1), h^(2), ..., h^(HW)} is as consistent as possible with the inter-sample similarity inferred from the network output, and further, the low-dimensional characterizations of the region-level features are aligned with the low-dimensional characterization of the sample-level features.
9. The method of claim 1, wherein the knowledge points in step (4) are the region-level features making the following quantity greater than a certain threshold:

$$\max_{c}\; p\big(y = c \mid h^{(r)}\big)$$

wherein h^(r) is the low-dimensional characterization of the r-th region-level feature corresponding to a sample x; the knowledge points are therefore the region-level features in the set {h^(r) : max_c p(y = c | h^(r)) > τ}, where τ is a positive constant, and in a preferred embodiment, τ ranges from 0.3 to 0.8.
10. A system for visualizing and quantitatively analyzing the expression capability of layer features in a neural network is characterized by comprising the following modules:
(1) an input module, configured to provide a pre-trained classification neural network and input samples containing all possible classes;
(2) a feature extraction module configured to extract sample-level features and region-level features of the input sample;
(3) a visualization module, configured to reduce the dimensionality of the extracted sample-level features and region-level features to obtain low-dimensional characterizations and to visualize them in a low-dimensional space;
(4) a quantitative analysis module, configured to quantitatively analyze the quantity and quality of the knowledge points in the features based on the visualization result of the region-level features.
CN202111240906.7A 2021-10-25 2021-10-25 Visualization and quantitative analysis method and system for layer feature expression capability in neural network Pending CN113963185A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202111240906.7A CN113963185A (en) 2021-10-25 2021-10-25 Visualization and quantitative analysis method and system for layer feature expression capability in neural network
PCT/CN2022/127435 WO2023072094A1 (en) 2021-10-25 2022-10-25 Visualization and quantitative analysis method and system for expression capability of layer feature in neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111240906.7A CN113963185A (en) 2021-10-25 2021-10-25 Visualization and quantitative analysis method and system for layer feature expression capability in neural network

Publications (1)

Publication Number Publication Date
CN113963185A true CN113963185A (en) 2022-01-21

Family

ID=79466728

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111240906.7A Pending CN113963185A (en) 2021-10-25 2021-10-25 Visualization and quantitative analysis method and system for layer feature expression capability in neural network

Country Status (2)

Country Link
CN (1) CN113963185A (en)
WO (1) WO2023072094A1 (en)

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109460812B (en) * 2017-09-06 2021-09-14 富士通株式会社 Intermediate information analysis device, optimization device, and feature visualization device for neural network
EP3654248A1 (en) * 2018-11-19 2020-05-20 Siemens Aktiengesellschaft Verification of classification decisions in convolutional neural networks
EP3748540A1 (en) * 2019-06-06 2020-12-09 Koninklijke Philips N.V. Deep neural network visualisation
CN110781933B (en) * 2019-10-14 2022-08-05 杭州电子科技大学 Visual analysis method for understanding graph convolution neural network
CN111695590B (en) * 2020-04-24 2022-05-03 浙江大学 Deep neural network feature visualization method for constraint optimization class activation mapping
CN112884021B (en) * 2021-01-29 2022-09-02 之江实验室 Visual analysis system oriented to deep neural network interpretability
CN113963185A (en) * 2021-10-25 2022-01-21 上海交通大学 Visualization and quantitative analysis method and system for layer feature expression capability in neural network

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023072094A1 (en) * 2021-10-25 2023-05-04 上海交通大学 Visualization and quantitative analysis method and system for expression capability of layer feature in neural network
WO2024125063A1 (en) * 2022-12-13 2024-06-20 华为云计算技术有限公司 Feature visualization method and apparatus

Also Published As

Publication number Publication date
WO2023072094A1 (en) 2023-05-04

Similar Documents

Publication Publication Date Title
CN109934293B (en) Image recognition method, device, medium and confusion perception convolutional neural network
Yang Symbol recognition via statistical integration of pixel-level constraint histograms: A new descriptor
Al Maadeed et al. Automatic prediction of age, gender, and nationality in offline handwriting
Hung et al. Image texture analysis
US20120093411A1 (en) Active Segmentation for Groups of Images
WO2023072094A1 (en) Visualization and quantitative analysis method and system for expression capability of layer feature in neural network
Tralic et al. Combining cellular automata and local binary patterns for copy-move forgery detection
CN111369003A (en) Method and device for determining fidelity of quantum bit reading signal
CN113807073B (en) Text content anomaly detection method, device and storage medium
CN113343920A (en) Method and device for classifying face recognition photos, electronic equipment and storage medium
Fan et al. A hierarchical Dirichlet process mixture of generalized Dirichlet distributions for feature selection
CN109960730B (en) Short text classification method, device and equipment based on feature expansion
CN112163114A (en) Image retrieval method based on feature fusion
CN111475648A (en) Text classification model generation method, text classification method, device and equipment
CN113762294B (en) Feature vector dimension compression method, device, equipment and medium
Joren et al. Learning document graphs with attention for image manipulation detection
CN116521899B (en) Improved graph neural network-based document level relation extraction method and system
Cheng et al. Activity guided multi-scales collaboration based on scaled-CNN for saliency prediction
CN111930883A (en) Text clustering method and device, electronic equipment and computer storage medium
CN111340139A (en) Method and device for judging complexity of image content
CN115238645A (en) Asset data identification method and device, electronic equipment and computer storage medium
CN112785601B (en) Image segmentation method, system, medium and electronic terminal
Evangelou et al. PU learning-based recognition of structural elements in architectural floor plans
CN114358011A (en) Named entity extraction method and device and electronic equipment
CN113139382A (en) Named entity identification method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination