CN111652271A - Nonlinear feature selection method based on neural network - Google Patents
Nonlinear feature selection method based on neural network
- Publication number
- CN111652271A CN111652271A CN202010331361.XA CN202010331361A CN111652271A CN 111652271 A CN111652271 A CN 111652271A CN 202010331361 A CN202010331361 A CN 202010331361A CN 111652271 A CN111652271 A CN 111652271A
- Authority
- CN
- China
- Prior art keywords
- neural network
- function
- norm
- feature selection
- weight
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/211—Selection of the most significant subset of features
- G06F18/2113—Selection of the most significant subset of features by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06F18/2136—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on sparsity criteria, e.g. with an overcomplete basis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Abstract
The invention discloses a nonlinear feature selection method based on a neural network. It addresses two problems: unsupervised learning methods analyze only the relationships among features and cannot exploit the important information carried by sample labels; and industrial processes are nonlinear, strongly coupled, and subject to hysteresis, so accurate feature weights cannot be obtained from a linear error function. The method replaces the linear error function of the sparse regularization model with a neural network error function and imposes a group sparse constraint on the input-layer weights according to their complexity, improving the prediction accuracy of the sparse regularization model on nonlinear problems. In addition, when solving the neural network, the L2,1 norm is used as the error function, reducing the influence of outliers on the feature selection result.
Description
Technical Field
The invention relates to the field of feature selection, and in particular to a nonlinear feature selection method that enhances the robustness of the feature selection model through a neural-network regression function.
Background
With the continuous improvement of automation in industrial processes, more and more important tasks have shifted from manual control to computer monitoring, and machine vision has become an important component of industrial processes. Owing to its efficiency, accuracy, and objectivity, machine vision helps to address high labor costs, the difficulty of real-time monitoring, and low control precision, and is widely applied to soft measurement, fault diagnosis, product classification, and related fields across industrial processes. However, the high-dimensional samples produced when machine vision digitizes images inevitably suffer from the curse of dimensionality and data noise. To avoid the curse of dimensionality and reduce the influence of noise, researchers have proposed feature selection techniques, which remove irrelevant or redundant features to reduce model complexity and improve learning performance.
At present, the feature selection methods commonly used in industry are mainly filter methods: by analyzing the correlation and divergence among features, they retain most of the feature information but ignore the label information of the samples, making them unsupervised learning methods. The sparse regularization model is a simple and efficient feature selection method: it establishes a linear error function between the features and the label, imposes a sparse constraint on the feature weights, and obtains the optimal feature selection result by minimizing the error function under the constraint. However, industrial processes are nonlinear, strongly coupled, and subject to hysteresis, so the linear error function of the classical sparse regularization model cannot accurately describe their feature relationships, and the resulting feature selection is not accurate enough.
Disclosure of Invention
In order to overcome the defects of existing methods, the invention provides a nonlinear feature selection method based on a neural network, abbreviated RSBP.
The invention aims to solve two problems: unsupervised learning methods analyze only the relationships among features and cannot exploit the important information carried by sample labels; and industrial processes are nonlinear, strongly coupled, and subject to hysteresis, so accurate feature weights cannot be obtained from a linear error function. The neural network error function replaces the linear error function of the sparse regularization model, and a group sparse constraint is imposed on the input-layer weights of the neural network according to their complexity, improving the prediction accuracy of the sparse regularization model on nonlinear problems. In addition, when solving the neural network, the L2,1 norm is used as the error function, reducing the influence of outliers on the feature selection result.
The technical scheme of the invention is as follows:
a nonlinear feature selection method based on a neural network comprises the following steps: (1) neural network embedding: replace the linear error function of the sparse regularization model with a neural network error function, while imposing a group sparse constraint on the input-layer weights of the neural network, to establish a nonlinear feature selection model; (2) robustness optimization: use the robustness of the L2,1 norm to optimize the neural network and establish the RSBP objective function; (3) optimization strategy: introduce a smooth approximation of the L2,1 norm, solve the iterative update of the neural network by the projected gradient descent method to obtain the optimal weight matrix, and take the sum of the absolute values of the weight vector corresponding to each input-layer neuron as the feature importance index, thereby solving the stated problem.
In the nonlinear feature selection method based on the neural network, step (1) is as follows:
the classical sparse regularization model (Lasso) can be understood as solving for the optimal solution β of a linear regression problem under an L1 norm constraint, where y = (y1, …, yN) is an N-dimensional response vector and the input X is an N × p matrix. The Lagrangian can be constructed as:

L(β) = ||y − Xβ||² + λ||β||₁  (1)
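As a hedged illustration (not code from the patent), the Lasso objective above can be minimized by proximal gradient descent (ISTA); the sparsity pattern of β then acts as a linear feature selector. All names and the toy data below are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    # Proximal operator of the L1 norm: shrink each coefficient toward zero by t.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=1000):
    """Minimize ||y - X b||^2 + lam * ||b||_1 by proximal gradient descent (ISTA)."""
    beta = np.zeros(X.shape[1])
    # Step size = 1 / Lipschitz constant of the squared-error gradient (2 ||X||_2^2).
    step = 1.0 / (2.0 * np.linalg.norm(X, ord=2) ** 2)
    for _ in range(n_iter):
        grad = 2.0 * X.T @ (X @ beta - y)          # gradient of the squared error
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta

# Toy data: only features 0, 2 and 5 actually influence y.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))
y = X @ np.array([3.0, 0.0, -2.0, 0.0, 0.0, 1.5]) + 0.01 * rng.normal(size=100)
beta = lasso_ista(X, y, lam=5.0)
```

The soft-thresholding step is what produces exact zeros: coefficients of irrelevant features are driven to 0, while relevant ones survive with a small shrinkage bias.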
Replace the linear error function of the sparse regularization model with a neural network error function, and impose a group sparse constraint on the input-layer weights according to their complexity, giving the nonlinear feature selection model:

min_(W,b) E + λh(w)  (2)
a^m = f^(m-1)((W^(m-1))^T a^(m-1) + b^(m-1))  (3)
where E denotes the error function, h the constraint function, w the feature weights, f the activation function, λ the sparse coefficient, M the number of layers of the neural network, a^m the output of the m-th layer, and b^m the bias of the m-th layer;
gradient vanishing and gradient explosion are also phenomena that a deep neural network must avoid during solving, so attention must be paid to the hidden-layer activation function when setting the network parameters; for example, the ELU (Exponential Linear Unit) function can be adopted as the activation function: f(x) = x for x > 0 and f(x) = α(e^x − 1) for x ≤ 0.
The advantages of this function are that it avoids gradient vanishing except at the extreme negative end, while ensuring the continuity and differentiability of the function, which facilitates the subsequent solution.
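A minimal sketch of the ELU activation and its derivative, assuming the standard form with α = 1 (under which the two branches meet smoothly at 0):

```python
import numpy as np

def elu(x, alpha=1.0):
    # x for x > 0; alpha*(exp(x) - 1) for x <= 0. np.minimum avoids overflow in exp.
    return np.where(x > 0, x, alpha * (np.exp(np.minimum(x, 0.0)) - 1.0))

def elu_grad(x, alpha=1.0):
    # 1 for x > 0; alpha*exp(x) for x <= 0 -- strictly positive everywhere,
    # which is what mitigates gradient vanishing on the negative side.
    return np.where(x > 0, 1.0, alpha * np.exp(np.minimum(x, 0.0)))
```

Unlike ReLU, the gradient never becomes exactly zero for negative inputs; it decays toward zero only as x → −∞, while the output saturates at −α.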
The nonlinear feature selection method based on the neural network, wherein the step (2): l is2,1Norm is compared to L1The norm has better robustness, so that robustness constraint can be added on the basis of an original sparse neural network to improve the accuracy of feature selection:
the nonlinear feature selection method based on the neural network, wherein the step (3): due to L2,1Norm is a non-smooth function, and a class L needs to be established2,1And (3) a smooth function of the norm, solving an iterative function of the neural network according to a projection gradient descent method, taking the sum of absolute values of weight vectors corresponding to neurons of each input layer as a characteristic importance index, and carrying out the following process:
The L2,1 norm contains non-differentiable points, which increases the difficulty of the solution process. To solve this problem, we introduce the LR,1 norm (a smoothed L2,1 norm), which makes the whole function differentiable by adding a smoothing term at the non-differentiable points. The LR,1 norm is:

||W||R,1 = Σ_i sqrt(Σ_j W_ij² + ε)
where ε denotes a small positive constant; it is easy to see that when ε → 0 the LR,1 norm is equivalent to the L2,1 norm. At the same time, the added constant ε ensures that the derivative is defined everywhere, i.e. the function is differentiable over the whole domain;
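This smoothed norm and its gradient can be realized directly (a sketch, assuming the smoothing constant is added inside the row-wise square root as described above):

```python
import numpy as np

def l21_smooth(W, eps=1e-8):
    # Smoothed L2,1 norm: sum over rows of sqrt(||row||^2 + eps).
    # As eps -> 0 it approaches the exact L2,1 norm.
    return float(np.sum(np.sqrt(np.sum(W ** 2, axis=1) + eps)))

def l21_smooth_grad(W, eps=1e-8):
    # Row i of the gradient is W_i / sqrt(||W_i||^2 + eps); eps keeps the
    # denominator nonzero, so the gradient is defined even for all-zero rows.
    return W / np.sqrt(np.sum(W ** 2, axis=1) + eps)[:, None]

W = np.array([[3.0, 4.0],
              [0.0, 0.0]])   # second row is exactly where the L2,1 norm is non-smooth
```

For the example matrix the exact L2,1 norm is 5 (the first row's Euclidean norm), and the gradient stays finite on the all-zero row, which is the whole point of the ε term.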
To solve the expression, assume the neural network has 4 layers with the ELU activation function, denote each layer of the network as L–i–j, and replace the L2,1 norm with the LR,1 norm; this gives the following result;
Taking the derivative of the whole objective function gives the weight update formula.
Compute the neuron output o^m of each layer:
Then compute the partial derivatives of each layer backward from the error:
Let s^m denote the sensitivity:
Introduce the derivative matrix:
From these one obtains:
Finally, update the weights and bias values.
Iterate the above update and optimization formulas repeatedly until the stopping condition is met, giving the optimal input-layer weight matrix of the neural network. Finally, the sum of the absolute values of the weight vector corresponding to each input-layer neuron is taken as the feature importance index, yielding the feature selection result.
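The whole procedure can be sketched end-to-end on toy data. This is a hedged illustration only: a single hidden layer, tanh standing in for ELU, plain gradient descent rather than the projected variant, and all sizes and coefficients chosen arbitrarily.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: only features 0 and 2 influence the label, and nonlinearly.
X = rng.normal(size=(200, 5))
y = np.tanh(X[:, 0]) + 0.8 * np.tanh(X[:, 2])

n_hidden, lam, lr, eps = 8, 0.02, 0.1, 1e-8
W1 = rng.normal(scale=0.5, size=(5, n_hidden))   # input-layer weights (rows = features)
b1 = np.zeros(n_hidden)
w2 = rng.normal(scale=0.5, size=n_hidden)
b2 = 0.0

for _ in range(4000):
    h = np.tanh(X @ W1 + b1)                     # hidden layer (tanh stands in for ELU)
    err = h @ w2 + b2 - y
    # Backpropagate the mean squared error (constant factors absorbed into lr).
    g_w2 = h.T @ err / len(y)
    g_b2 = err.mean()
    s = (err[:, None] * w2) * (1.0 - h ** 2)     # sensitivity at the hidden layer
    g_W1 = X.T @ s / len(y)
    g_b1 = s.mean(axis=0)
    # Smoothed group-sparse (L2,1) penalty gradient on the input-layer rows.
    g_W1 += lam * W1 / np.sqrt(np.sum(W1 ** 2, axis=1) + eps)[:, None]
    W1 -= lr * g_W1
    b1 -= lr * g_b1
    w2 -= lr * g_w2
    b2 -= lr * g_b2

importance = np.abs(W1).sum(axis=1)              # feature importance index
final_mse = float(np.mean((np.tanh(X @ W1 + b1) @ w2 + b2 - y) ** 2))
```

The group penalty drives the weight rows of the irrelevant features toward zero while the rows of informative features keep a sustained loss gradient, so ranking features by `importance` recovers the active ones.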
In summary, aiming at the facts that existing industrial feature selection methods do not use the sample label information and that industrial processes are nonlinear and strongly coupled, the method replaces the linear error function of the sparse regularization model with a neural network error function and imposes a group sparse constraint on the input-layer weights according to their complexity, improving the prediction accuracy of the sparse regularization model on nonlinear problems. In addition, when solving the neural network, the L2,1 norm is used as the error function, reducing the influence of outliers on the feature selection result.
The method is suitable for supervised feature selection of nonlinear processes such as complex industrial processes.
Drawings
FIG. 1 shows the variation of each index of the nonlinear feature selection model under different sparse coefficients: (a) sparse-parameter sensitivity under the MSE index; (b) sparse-parameter sensitivity under the ARE index; (c) sparse-parameter sensitivity under the R² index;
FIG. 2 shows the regression results for different selected dimensions: (a) algorithm comparison under the MSE index; (b) algorithm comparison under the ARE index; (c) algorithm comparison under the R² index;
FIG. 3 shows the regression results in different dimensions after adding a perturbation: (a) algorithm comparison under the MSE index; (b) algorithm comparison under the ARE index; (c) algorithm comparison under the R² index;
Detailed Description
The present invention will be described in detail with reference to specific examples.
First, the neural network is combined with the sparse regularization model: the linear error function of the sparse regularization model is replaced with a neural network error function, a group sparse constraint is imposed on the input-layer weights of the neural network according to their complexity, and a nonlinear feature selection model is established. Then, the robustness of the L2,1 norm is used to optimize the neural network and establish the RSBP objective function. Finally, the iterative update of the neural network is solved by the projected gradient descent method, and the sum of the absolute values of the weight vector corresponding to each input-layer neuron is taken as the feature importance index, solving the stated problem. Furthermore, since the L2,1 norm is a non-smooth function, a smooth approximation of the L2,1 norm must be established to avoid non-differentiable points. The technical scheme is described in detail as follows:
(1) neural network embedding:
the classical sparse regularization model (Lasso) can be understood as solving for the optimal solution β of a linear regression problem under an L1 norm constraint, where y = (y1, …, yN) is an N-dimensional response vector and the input X is an N × p matrix. The Lagrangian can be constructed as:

L(β) = ||y − Xβ||² + λ||β||₁  (1)
Replace the linear error function of the sparse regularization model with a neural network error function, and impose a group sparse constraint on the input-layer weights according to their complexity, giving the nonlinear feature selection model:

min_(W,b) E + λh(w)  (2)
a^m = f^(m-1)((W^(m-1))^T a^(m-1) + b^(m-1))  (3)
where E denotes the error function, h the constraint function, w the feature weights, f the activation function, λ the sparse coefficient, M the number of layers of the neural network, a^m the output of the m-th layer, and b^m the bias of the m-th layer;
gradient vanishing and gradient explosion are also phenomena that a deep neural network must avoid during solving, so attention must be paid to the hidden-layer activation function when setting the network parameters; for example, the ELU (Exponential Linear Unit) function can be adopted as the activation function: f(x) = x for x > 0 and f(x) = α(e^x − 1) for x ≤ 0.
The advantages of this function are that it avoids gradient vanishing except at the extreme negative end, while ensuring the continuity and differentiability of the function, which facilitates the subsequent solution.
(2) And (3) robustness optimization:
Measurement disturbances and human errors in complex industrial processes hinder the establishment of the feature selection model, and mathematical experiments by researchers have shown that the L2,1 norm has better robustness than the L1 norm; therefore, a robustness constraint can be added on the basis of the original sparse neural network to improve the accuracy of feature selection:
(3) and (3) optimizing the strategy:
Since the L2,1 norm is a non-smooth function, a smooth approximation of the L2,1 norm must be established; the iterative update of the neural network is then solved by the projected gradient descent method, and the sum of the absolute values of the weight vector corresponding to each input-layer neuron is taken as the feature importance index, solving the stated practical industrial problem. The process is as follows:
The L2,1 norm contains non-differentiable points, which increases the difficulty of the solution process. To solve this problem, we introduce the LR,1 norm (a smoothed L2,1 norm), which makes the whole function differentiable by adding a smoothing term at the non-differentiable points. The LR,1 norm is:

||W||R,1 = Σ_i sqrt(Σ_j W_ij² + ε)
where ε denotes a small positive constant; it is easy to see that when ε → 0 the LR,1 norm is equivalent to the L2,1 norm. At the same time, the added constant ε ensures that the derivative is defined everywhere, i.e. the function is differentiable over the whole domain;
To solve the expression, assume the neural network has 4 layers with the ELU activation function, denote each layer of the network as L–i–j, and replace the L2,1 norm with the LR,1 norm; this gives the following result;
Taking the derivative of the whole objective function gives the weight update formula.
Compute the neuron output o^m of each layer:
Then compute the partial derivatives of each layer backward from the error:
Let s^m denote the sensitivity:
Introduce the derivative matrix:
From these one obtains:
Finally, update the weights and bias values.
Iterate the above update and optimization formulas repeatedly until the stopping condition is met, giving the optimal input-layer weight matrix of the neural network. Finally, the sum of the absolute values of the weight vector corresponding to each input-layer neuron is taken as the feature importance index, yielding the feature selection result.
The method was tested on real froth-image data from the first rougher cell of the copper-ore flotation process of a flotation plant. After preprocessing the data, removing outliers, and normalizing, 245 groups of copper flotation samples were obtained; each sample comprises 14 features (mean, peak value, standard deviation, skewness, R, G, B, hue, red component, yellow component, speed, stability, bearing rate, and gray level) and 1 label (mineral grade). According to the distribution of mineral grade, 195 groups were selected as training samples and 49 groups as test samples. To make the solution fast and effective, the number of hidden layers was set to 2, the number of neurons to 8, and the learning rate to η = 0.0003. The convergence criterion was: "if the change of the objective function does not exceed 0.01 for 20 consecutive iterations, it is considered converged", with a maximum of 4000 iterations. Solving the objective function yields the input-layer weights, and taking the absolute values of the weights as the importance basis gives the feature importance ranking for the froth image. Feature subsets of different dimensions were then generated according to this ranking, tested with a support vector regression (SVR) model, and compared to obtain the optimal combination under the fixed feature ranking.
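The three evaluation criteria used in the experiments (MSE, ARE, and R²) can be written out directly; the formulas below are the standard ones, with the caveat that the ARE form assumes nonzero labels such as mineral grade:

```python
import numpy as np

def mse(y_true, y_pred):
    # Mean square error.
    return float(np.mean((y_true - y_pred) ** 2))

def are(y_true, y_pred):
    # Average relative error; assumes y_true has no zero entries.
    return float(np.mean(np.abs((y_true - y_pred) / y_true)))

def r2(y_true, y_pred):
    # Coefficient of determination: 1 - SS_res / SS_tot.
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

A perfect predictor gives MSE = ARE = 0 and R² = 1, while predicting the label mean gives R² = 0, which is why R² complements the two error measures.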
First, to demonstrate the soundness of the selection results, the change of the feature importance index was obtained by varying the sparse coefficient, as shown in Tables 1-2. As the sparse coefficient gradually increases, each feature index gradually decreases, and sparse solutions appear one by one until all are zero (when the sparse coefficient is large enough). Throughout this process the importance ranking is fixed and consistent with the order in which the features become sparse; the result agrees with subjective analysis, so the algorithm is reasonable and interpretable.
TABLE 1: top-7 feature importance ranking
TABLE 2: bottom-7 feature importance ranking
Meanwhile, the influence of the sparse parameter λ on the performance of the technique was analyzed by selecting λ ∈ [0, 240], representing the process from no sparsity to full sparsity of the objective function. Using the mean square error (MSE), average relative error (ARE), and squared correlation coefficient R² as evaluation criteria, the influence of the sparse coefficient on algorithm performance under feature-subset inputs of different dimensions was obtained; the results are shown in FIG. 1, where (a), (b), and (c) show the changes of the three evaluation indices. As can be seen from FIG. 1, the most suitable sparse-coefficient interval for the copper flotation rougher-cell froth-image feature selection problem is λ ∈ [90, 120].
Then the method was compared with existing feature selection algorithms: Sequential Forward Selection (SFS), Principal Component Analysis (PCA), Sparse PCA (SPCA), Sparse Artificial Neural Network (SANN), and Robust Feature Selection (RFS); a new data set was obtained by the same steps for each. To ensure a fair comparison, an SVR model with the same parameters was used to test the feature selection results. The new data sets were used to train the SVR model, with the mean square error, average relative error, and coefficient of determination as evaluation criteria; the comparison results are shown in FIG. 2. Under the different evaluation criteria, RSBP is generally superior to all compared feature selection methods, and its error is lower than that of the 14-dimensional original feature data, so the method can effectively improve model prediction accuracy.
Finally, to compare the robustness of the algorithms, 25% of the training samples were randomly selected and a 5% disturbance was added. As seen in FIG. 3, the redundancy rate of the feature subset selected by RSBP is significantly lower than that of the other methods. At the same time, the error of RSBP remains consistently low across different dimensions, indicating that the proposed feature selection method is more robust. Comparing FIG. 2 and FIG. 3, the errors of the other methods fluctuate significantly when interference data is added to the training set, while the proposed method always keeps the error low. In the case of 25% sample interference, the proposed method is 5%-12% more accurate than the other methods.
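The disturbance test can be reproduced in outline as follows. This is a sketch with synthetic stand-in data; the exact form of the "5% disturbance" is not specified in the text, so a ±5% multiplicative perturbation on the selected rows is assumed.

```python
import numpy as np

rng = np.random.default_rng(7)
# Stand-in for the 195 training samples with 14 froth-image features each.
X_train = rng.normal(loc=5.0, scale=1.0, size=(195, 14))

n_perturb = int(0.25 * len(X_train))      # 25% of the training samples
idx = rng.choice(len(X_train), size=n_perturb, replace=False)

X_noisy = X_train.copy()
# Multiply each chosen row by per-entry factors in [0.95, 1.05]: a +/-5% disturbance.
X_noisy[idx] *= 1.0 + rng.uniform(-0.05, 0.05, size=(n_perturb, X_train.shape[1]))
```

The unselected 75% of the samples are left untouched, so a robust feature selector run on `X_noisy` should reproduce the ranking obtained on `X_train`.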
It will be understood that modifications and variations can be made by persons skilled in the art in light of the above teachings and all such modifications and variations are intended to be included within the scope of the invention as defined in the appended claims.
Claims (5)
1. A nonlinear feature selection method based on a neural network, characterized by comprising the following steps: (1) neural network embedding: replacing the linear error function of the sparse regularization model with a neural network error function, while imposing a group sparse constraint on the input-layer weights of the neural network, to establish a nonlinear feature selection model; (2) robustness optimization: using the robustness of the L2,1 norm to optimize the neural network and establish the RSBP objective function; (3) optimization strategy: introducing a smooth approximation of the L2,1 norm, solving the iterative update of the neural network by the projected gradient descent method to obtain the optimal weight matrix, and taking the sum of the absolute values of the weight vector corresponding to each input-layer neuron as the feature importance index, thereby solving the stated problem.
2. The neural network-based nonlinear feature selection method according to claim 1, wherein the step (1):
the classical sparse regularization model (Lasso) can be understood as solving for the optimal solution β of a linear regression problem under an L1 norm constraint, where y = (y1, …, yN) is an N-dimensional response vector and the input X is an N × p matrix. The Lagrangian can be constructed as:

L(β) = ||y − Xβ||² + λ||β||₁  (1)
replacing the linear error function of the sparse regularization model with a neural network error function, and imposing a group sparse constraint on the input-layer weights according to their complexity, giving the nonlinear feature selection model:

min_(W,b) E + λh(w)  (2)
a^m = f^(m-1)((W^(m-1))^T a^(m-1) + b^(m-1))  (3)
where E denotes the error function, h the constraint function, w the feature weights, f the activation function, λ the sparse coefficient, M the number of layers of the neural network, a^m the output of the m-th layer, and b^m the bias of the m-th layer.
5. The neural network-based nonlinear feature selection method according to claim 1, wherein in step (3): since the L2,1 norm is a non-smooth function, a smooth approximation of the L2,1 norm must be established; the iterative update of the neural network is then solved by the projected gradient descent method, and the sum of the absolute values of the weight vector corresponding to each input-layer neuron is taken as the feature importance index. The process is as follows:
introducing a smoothed L2,1 norm, the LR,1 norm, which makes the whole function differentiable by adding a smoothing term at the non-differentiable points; the LR,1 norm is:

||W||R,1 = Σ_i sqrt(Σ_j W_ij² + ε)
where ε denotes a small positive constant; when ε → 0 the LR,1 norm is equivalent to the L2,1 norm; at the same time, the added constant ε ensures that the derivative is defined everywhere, i.e. the function is differentiable over the whole domain;
assuming that the neural network has 4 layers with the ELU activation function, denoting each layer of the network as L–i–j, and replacing the L2,1 norm with the LR,1 norm, which gives the following result;
differentiating the objective function as a whole yields the weight update formula.
Compute the neuron outputs o^m of each layer:
Then propagate the error backwards, computing the partial derivatives of each layer:
Let s^m denote the sensitivity:
introducing a derivative matrix:
this gives:
Finally, update the weights and bias values.
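The update formulas referenced in the steps above appear as images in the original filing and are not reproduced in the extracted text. Under the stated setup (M layers, activation f, sensitivity s^m, derivative matrix Ḟ^m of the activations, learning rate α), they correspond to standard backpropagation and can be sketched as follows; this is a reconstruction, not the patent's own notation:

```latex
s^{M} = \dot{F}^{M}(n^{M})\,\frac{\partial E}{\partial a^{M}}, \qquad
s^{m} = \dot{F}^{m}(n^{m})\,W^{m+1}\,s^{m+1}, \quad m = M-1,\dots,1,
```

```latex
W^{m} \leftarrow W^{m} - \alpha\, s^{m}\,(a^{m-1})^{T}
       - \alpha\lambda\,\frac{\partial h}{\partial W^{m}}, \qquad
b^{m} \leftarrow b^{m} - \alpha\, s^{m},
```

where n^m is the net input of layer m and the regularization term ∂h/∂W^m is active only on the input-layer weights, matching the group-sparse constraint of the model.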
Iterate, update and optimize repeatedly through the above formulas until the stopping condition is met, yielding the optimal input-layer weight matrix of the neural network; finally, take the sum of the absolute values of the weight vector corresponding to each input-layer neuron as the feature importance index to obtain the feature selection result.
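The final scoring step can be sketched in a few lines of NumPy; the helper names and the top-k selection are illustrative assumptions, not part of the claim:

```python
import numpy as np

def feature_importance(W_in):
    # Importance of input feature i = sum of |w| over the weight vector
    # connecting input neuron i to the first hidden layer.
    return np.abs(W_in).sum(axis=1)

def select_features(W_in, k):
    # Indices of the k features with the largest importance scores.
    scores = feature_importance(W_in)
    return np.argsort(scores)[::-1][:k]

W_in = np.array([[1.5, -0.5],   # feature 0: importance 2.0
                 [0.0,  0.25],  # feature 1: importance 0.25
                 [2.0,  0.5]])  # feature 2: importance 2.5
print(feature_importance(W_in))  # [2.   0.25 2.5 ]
print(select_features(W_in, 2))  # [2 0]
```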
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010331361.XA CN111652271A (en) | 2020-04-24 | 2020-04-24 | Nonlinear feature selection method based on neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010331361.XA CN111652271A (en) | 2020-04-24 | 2020-04-24 | Nonlinear feature selection method based on neural network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111652271A true CN111652271A (en) | 2020-09-11 |
Family
ID=72349283
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010331361.XA Pending CN111652271A (en) | 2020-04-24 | 2020-04-24 | Nonlinear feature selection method based on neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111652271A (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113033416A (en) * | 2021-03-26 | 2021-06-25 | 深圳市华杰智通科技有限公司 | Millimeter wave radar gesture recognition method based on sparse function |
CN113064489A (en) * | 2021-04-02 | 2021-07-02 | 深圳市华杰智通科技有限公司 | Millimeter wave radar gesture recognition method based on L1-Norm |
CN112884136A (en) * | 2021-04-21 | 2021-06-01 | 江南大学 | Bounded clustering projection synchronous regulation control method and system for coupled neural network |
CN112884136B (en) * | 2021-04-21 | 2022-05-13 | 江南大学 | Bounded clustering projection synchronous regulation control method and system for coupled neural network |
CN113313175A (en) * | 2021-05-28 | 2021-08-27 | 北京大学 | Image classification method of sparse regularization neural network based on multivariate activation function |
CN113313175B (en) * | 2021-05-28 | 2024-02-27 | 北京大学 | Image classification method of sparse regularized neural network based on multi-element activation function |
CN115294406A (en) * | 2022-09-30 | 2022-11-04 | 华东交通大学 | Method and system for attribute-based multimodal interpretable classification |
CN115796244A (en) * | 2022-12-20 | 2023-03-14 | 广东石油化工学院 | CFF-based parameter identification method for super-nonlinear input/output system |
CN115796244B (en) * | 2022-12-20 | 2023-07-21 | 广东石油化工学院 | Parameter identification method based on CFF for ultra-nonlinear input/output system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111652271A (en) | Nonlinear feature selection method based on neural network | |
CN110503251B (en) | Non-holiday load prediction method based on Stacking algorithm | |
CN111429415B (en) | Method for constructing efficient detection model of product surface defects based on network collaborative pruning | |
CN108875933B (en) | Over-limit learning machine classification method and system for unsupervised sparse parameter learning | |
CN111768000A (en) | Industrial process data modeling method for online adaptive fine-tuning deep learning | |
CN110571792A (en) | Analysis and evaluation method and system for operation state of power grid regulation and control system | |
CN113486078A (en) | Distributed power distribution network operation monitoring method and system | |
CN113780420B (en) | GRU-GCN-based method for predicting concentration of dissolved gas in transformer oil | |
CN112613536A (en) | Near infrared spectrum diesel grade identification method based on SMOTE and deep learning | |
CN112308298B (en) | Multi-scenario performance index prediction method and system for semiconductor production line | |
CN112439794A (en) | Hot rolling bending force prediction method based on LSTM | |
CN104634265A (en) | Soft measurement method for thickness of mineral floating foam layer based on multivariate image feature fusion | |
CN114580934A (en) | Early warning method for food detection data risk based on unsupervised anomaly detection | |
Fuertes et al. | Visual dynamic model based on self-organizing maps for supervision and fault detection in industrial processes | |
CN116340726A (en) | Energy economy big data cleaning method, system, equipment and storage medium | |
CN115564983A (en) | Target detection method and device, electronic equipment, storage medium and application thereof | |
CN113221442A (en) | Construction method and device of health assessment model of power plant equipment | |
CN112241832A (en) | Product quality grading evaluation standard design method and system | |
CN108764583B (en) | Unbiased prediction method for forest accumulation | |
CN116842358A (en) | Soft measurement modeling method based on multi-scale convolution and self-adaptive feature fusion | |
CN115510740A (en) | Aero-engine residual life prediction method based on deep learning | |
CN114529063A (en) | Financial field data prediction method, device and medium based on machine learning | |
CN111882441A (en) | User prediction interpretation Treeshap method based on financial product recommendation scene | |
CN112330029A (en) | Fishing ground prediction calculation method based on multilayer convLSTM | |
Sallehuddin et al. | Forecasting small data set using hybrid cooperative feature selection |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20200911 ||