CN111652271A - Nonlinear feature selection method based on neural network - Google Patents

Nonlinear feature selection method based on neural network

Info

Publication number
CN111652271A
CN111652271A (application CN202010331361.XA)
Authority
CN
China
Prior art keywords
neural network
function
norm
feature selection
weight
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010331361.XA
Other languages
Chinese (zh)
Inventor
朱建勇
杨辉
黄鑫
聂飞平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
East China Jiaotong University
Original Assignee
East China Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by East China Jiaotong University
Priority to CN202010331361.XA
Publication of CN111652271A
Legal status: Pending

Classifications

    • G: PHYSICS
      • G06: COMPUTING; CALCULATING OR COUNTING
        • G06F: ELECTRIC DIGITAL DATA PROCESSING
          • G06F 18/00: Pattern recognition
            • G06F 18/20: Analysing
              • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
                • G06F 18/211: Selection of the most significant subset of features
                  • G06F 18/2113: Selection of the most significant subset of features by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation
                • G06F 18/213: Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
                  • G06F 18/2136: Feature extraction based on sparsity criteria, e.g. with an overcomplete basis
        • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
          • G06N 3/00: Computing arrangements based on biological models
            • G06N 3/02: Neural networks
              • G06N 3/04: Architecture, e.g. interconnection topology
                • G06N 3/045: Combinations of networks
                • G06N 3/048: Activation functions
              • G06N 3/08: Learning methods
                • G06N 3/084: Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a nonlinear feature selection method based on a neural network. It addresses two problems: an unsupervised learning method that only analyzes the relationships among features cannot exploit the important information carried by sample labels; and industrial processes are nonlinear, strongly coupled, and subject to time lags, so accurate feature weights cannot be obtained from a linear error function. A neural network error function is proposed to replace the linear error function of the sparse regularization model, and a group-sparse constraint is imposed on the input-layer weights of the network according to the complexity of the network weights, improving the prediction accuracy of the sparse regularization model on nonlinear problems. In addition, when solving the neural network, the L_{2,1} norm is used in the error function, reducing the influence of outliers on the feature selection result.

Description

Nonlinear feature selection method based on neural network
Technical Field
The invention relates to the field of feature selection in engineering, and particularly to a nonlinear feature selection method that enhances the robustness of a feature selection model by adjusting a neural network regression function.
Background
With the continuous improvement of the automation level of industrial processes, more and more control tasks have shifted from manual control to computer monitoring, and machine vision has become an important component of these processes. Owing to its efficiency, accuracy, and objectivity, machine vision helps address problems such as high labor costs, difficulty of real-time monitoring, and low control precision, and is widely used in soft sensing, fault diagnosis, product classification, and other industrial applications. However, during image digitization in machine vision, high-dimensional samples inevitably suffer from the curse of dimensionality and data noise. To avoid the curse of dimensionality and reduce the influence of noise, researchers have proposed feature selection techniques, which remove irrelevant or redundant features to reduce model complexity and improve learning performance.
At present, most feature selection methods commonly used in industry are filter methods: by analyzing the correlation and divergence among features they retain most of the feature information, but they ignore the label information of the samples and are therefore unsupervised learning methods. The sparse regularization model is a simple and efficient feature selection method: it establishes a linear error function between the features and the label quantity while imposing a sparse constraint on the feature weights, and obtains the optimal feature selection result by solving the weight error function under this constraint. However, industrial processes are nonlinear, strongly coupled, and subject to time lags, and the linear error function of the classical sparse regularization model cannot accurately describe their feature relationships, so the resulting feature selection is not accurate enough.
Disclosure of Invention
In order to overcome the defects of the existing method, the invention provides a nonlinear feature selection method based on a neural network, which is called RSBP for short.
The invention aims to solve two problems: an unsupervised learning method that only analyzes the relationships among features cannot exploit the important information carried by sample labels; and industrial processes are nonlinear, strongly coupled, and subject to time lags, so accurate feature weights cannot be obtained from a linear error function. A neural network error function is proposed to replace the linear error function of the sparse regularization model, and a group-sparse constraint is imposed on the input-layer weights of the network according to the complexity of the network weights, improving the prediction accuracy of the sparse regularization model on nonlinear problems. In addition, when solving the neural network, the L_{2,1} norm is used in the error function, reducing the influence of outliers on the feature selection result.
The technical scheme of the invention is as follows:
a nonlinear feature selection method based on a neural network comprises the following steps: (1) neural network embedding: replacing a linear error function of the sparse regularization model with a neural network error function, and simultaneously carrying out group sparse constraint on weights of input layers of the neural network to establish a nonlinear feature selection model; (2) and (3) robustness optimization: by means of L2,1Robustness of the norm, robustness optimization is carried out on the neural network, and an RSBP target function is established; (3) and (3) optimizing the strategy: introduction of class L2,1And (3) solving an iterative function of the neural network according to a projection gradient descent method to obtain an optimal weight matrix, and taking the sum of absolute values of weight vectors corresponding to each input layer neuron as a characteristic importance index to solve the provided problem.
In the above nonlinear feature selection method, step (1) proceeds as follows:
the classical sparse regularization model (Lasso) can be understood to satisfy L1Solving β the optimal solution of the linear regression problem under the norm constraint condition, where y ═ y1,…,yN) Representing an N-dimensional response vector, the input quantity X is an N × p matrix, and a lagrange function can be constructed as follows:
Figure BDA0002465054010000021
Replacing the linear error function of the sparse regularization model with a neural network error function, and imposing a group-sparse constraint on the input-layer weights according to the complexity of the network weights, yields the nonlinear feature selection model:
\min_{W,b} \; E(y, a^M) + \lambda \, h(W^1), \qquad h(W^1) = \sum_{j=1}^{p} \|w_j\|_2    (2)

a^m = f^{m-1}\big((W^{m-1})^T a^{m-1} + b^{m-1}\big)    (3)

where E denotes the error function, h the group-sparse constraint function, w_j the weight vector of the j-th input feature, f the activation function, λ the sparse coefficient, M the number of layers of the neural network, a^m the output of the m-th layer, and b^m the bias of the m-th layer.
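As an illustration of the layer recursion in equation (3), the following minimal sketch computes the network output for a batch of samples; it is an assumption-laden illustration rather than part of the patent text, and the names `forward`, `weights`, `biases`, and `activations` are ours:

```python
import numpy as np

def forward(a1, weights, biases, activations):
    """Forward pass per equation (3): a^m = f^{m-1}((W^{m-1})^T a^{m-1} + b^{m-1}).

    a1: input features, one column per sample (shape p x N).
    weights[k], biases[k]: parameters of layer k; activations[k]: its activation f.
    """
    a = a1
    for W, b, f in zip(weights, biases, activations):
        a = f(W.T @ a + b[:, None])  # net input of the next layer, then elementwise activation
    return a
```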
Gradient vanishing and gradient explosion are phenomena that a deep neural network must avoid during solving, so attention must be paid to the hidden-layer activation function when setting the network parameters; for example, the ELU (Exponential Linear Unit) function can be adopted as the activation function:

f(x) = x \ \text{for}\ x > 0, \qquad f(x) = \alpha\,(e^{x} - 1) \ \text{for}\ x \le 0    (4)

The advantage of this function is that it prevents the gradient from vanishing in extreme cases while keeping the function continuous and differentiable, which facilitates the subsequent solution.
In the above nonlinear feature selection method, step (2) proceeds as follows: the L_{2,1} norm is more robust than the L_1 norm, so a robustness constraint can be added to the original sparse neural network to improve the accuracy of feature selection:

\min_{W,b} \; \|y - a^M\|_{2,1} + \lambda \sum_{j=1}^{p} \|w_j\|_2    (5)
the nonlinear feature selection method based on the neural network, wherein the step (3): due to L2,1Norm is a non-smooth function, and a class L needs to be established2,1And (3) a smooth function of the norm, solving an iterative function of the neural network according to a projection gradient descent method, taking the sum of absolute values of weight vectors corresponding to neurons of each input layer as a characteristic importance index, and carrying out the following process:
due to, L2,1There is an immeasurable point in the norm, which increases the difficulty of the solution process. To solve this problem, we introduce an LR,1Norm (class L)2,1Norm) that makes the entire function differentiable by adding a smoothing term at points that are not differentiable. L isR,1The norm is as follows:
Figure BDA0002465054010000033
Figure BDA0002465054010000046
the minimum value is expressed, and the above formula is easy to see when
Figure BDA0002465054010000047
When L isR,1Norm and L2,1The norm is equivalent. At the same time, due to the addition of a minimum value
Figure BDA0002465054010000048
It can be ensured that the derivative of the function is not 0, i.e. differentiable within the whole domain of definition;
To solve this expression, assume the number of neural network layers is 4 with the ELU function as the activation function, and denote the network structure by L-i-j (input dimension L, hidden-layer sizes i and j). Replacing the L_{2,1} norm with the L_{R,1} norm leads to the following objective:

\min_{W,b} \; \|y - a^4\|_{R,1} + \lambda \|W^1\|_{R,1}    (7)
Taking the overall derivative of the objective function J yields the weight update formula:

W^m \leftarrow W^m - \eta \, \frac{\partial J}{\partial W^m}    (8)
Compute the neuron outputs o^m of each layer:

n^m = (W^{m-1})^T o^{m-1} + b^{m-1}, \qquad o^m = f^{m-1}(n^m)    (9)
Then compute the partial derivatives of each layer backward from the error:

\frac{\partial J}{\partial W^{m-1}} = o^{m-1} (s^m)^T, \qquad \frac{\partial J}{\partial b^{m-1}} = s^m    (10)
Let s^m denote the sensitivity:

s^m = \frac{\partial J}{\partial n^m}    (11)
Introduce the derivative matrix:

F^m(n^m) = \mathrm{diag}\big(f'(n^m_1), f'(n^m_2), \ldots, f'(n^m_{S_m})\big)    (12)
This yields the backward recursion for the sensitivities:

s^m = F^m(n^m)\, W^m\, s^{m+1}    (13)
Finally, update the weights and bias values:

W^1 \leftarrow W^1 - \eta \big( o^1 (s^2)^T + \lambda \, \partial \|W^1\|_{R,1} / \partial W^1 \big)    (14)

W^m \leftarrow W^m - \eta \, o^m (s^{m+1})^T, \quad m > 1    (15)

b^m \leftarrow b^m - \eta \, s^{m+1}    (16)
The above formulas are iterated, updated, and optimized repeatedly until the stopping condition is met, yielding the optimal input-layer weight matrix of the neural network. Finally, the sum of the absolute values of the weight vector associated with each input-layer neuron is taken as the feature importance index, giving the feature selection result.
In summary, addressing the facts that existing industrial feature selection methods do not utilize sample label information and that industrial processes are nonlinear and strongly coupled, the method replaces the linear error function of the sparse regularization model with a neural network error function and imposes a group-sparse constraint on the input-layer weights according to the complexity of the network weights, improving the prediction accuracy of the sparse regularization model on nonlinear problems. In addition, when solving the neural network, the L_{2,1} norm is used in the error function, reducing the influence of outliers on the feature selection result.
The method is suitable for supervised feature selection of nonlinear processes such as complex industrial processes.
Drawings
FIG. 1 shows the variation of each index of the nonlinear feature selection model under different sparse coefficients: (a) sparse-parameter sensitivity under the MSE index; (b) sparse-parameter sensitivity under the ARE index; (c) sparse-parameter sensitivity under the R^2 index;
FIG. 2 shows the regression results for different selected dimensions: (a) algorithm comparison under the MSE index; (b) algorithm comparison under the ARE index; (c) algorithm comparison under the R^2 index;
FIG. 3 shows the regression results in different dimensions after adding a perturbation: (a) algorithm comparison under the MSE index; (b) algorithm comparison under the ARE index; (c) algorithm comparison under the R^2 index;
Detailed Description
The present invention will be described in detail with reference to specific examples.
First, the neural network is combined with the sparse regularization model: the linear error function of the sparse regularization model is replaced with a neural network error function, a group-sparse constraint is imposed on the input-layer weights according to the complexity of the network weights, and the nonlinear feature selection model is established. Then, the robustness of the L_{2,1} norm is used to optimize the neural network and establish the RSBP objective function. Finally, the iterative function of the neural network is solved by projected gradient descent, and the sum of the absolute values of the weight vector associated with each input-layer neuron is taken as the feature importance index, thereby solving the stated problem. Furthermore, because the L_{2,1} norm is a non-smooth function, a smooth class-L_{2,1} function must be established to avoid non-differentiable points. The technical scheme is described in detail as follows:
(1) neural network embedding:
the classical sparse regularization model (Lasso) can be understood to satisfy L1Solving β the optimal solution of the linear regression problem under the norm constraint condition, where y ═ y1,…,yN) Representing an N-dimensional response vector, the input quantity X is an N × p matrix, and a lagrange function can be constructed as follows:
Figure BDA0002465054010000071
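As a point of reference for equation (1), the classical Lasso baseline can be reproduced in a few lines; the sketch below uses scikit-learn and synthetic data purely for illustration (the library choice, data, and alpha value are our assumptions), with the absolute coefficients |β_j| serving as the linear feature importances:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 14))                             # N x p input matrix, as in equation (1)
y = X[:, 0] - 2.0 * X[:, 3] + 0.1 * rng.normal(size=200)   # N-dimensional response vector

lasso = Lasso(alpha=0.1)                 # alpha plays the role of lambda in (1)
lasso.fit(X, y)
importance = np.abs(lasso.coef_)         # |beta_j| as linear feature importance
print(np.argsort(importance)[::-1][:5])  # indices of the five most important features
```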
Replacing the linear error function of the sparse regularization model with a neural network error function, and imposing a group-sparse constraint on the input-layer weights according to the complexity of the network weights, yields the nonlinear feature selection model:
\min_{W,b} \; E(y, a^M) + \lambda \, h(W^1), \qquad h(W^1) = \sum_{j=1}^{p} \|w_j\|_2    (2)

a^m = f^{m-1}\big((W^{m-1})^T a^{m-1} + b^{m-1}\big)    (3)

where E denotes the error function, h the group-sparse constraint function, w_j the weight vector of the j-th input feature, f the activation function, λ the sparse coefficient, M the number of layers of the neural network, a^m the output of the m-th layer, and b^m the bias of the m-th layer.
Gradient vanishing and gradient explosion are phenomena that a deep neural network must avoid during solving, so attention must be paid to the hidden-layer activation function when setting the network parameters; for example, the ELU (Exponential Linear Unit) function can be adopted as the activation function:

f(x) = x \ \text{for}\ x > 0, \qquad f(x) = \alpha\,(e^{x} - 1) \ \text{for}\ x \le 0    (4)

The advantage of this function is that it prevents the gradient from vanishing in extreme cases while keeping the function continuous and differentiable, which facilitates the subsequent solution.
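A minimal sketch of the ELU activation of equation (4) and its derivative follows (the function names and the common choice α = 1.0 are assumptions); the derivative stays strictly positive for x ≤ 0, which is what keeps the gradient from vanishing at the extremes:

```python
import numpy as np

def elu(x, alpha=1.0):
    """ELU per equation (4): x for x > 0, alpha*(exp(x) - 1) otherwise."""
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def elu_grad(x, alpha=1.0):
    """Derivative of ELU: 1 for x > 0, alpha*exp(x) otherwise (continuous at 0 when alpha = 1)."""
    return np.where(x > 0, 1.0, alpha * np.exp(x))
```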
(2) Robustness optimization:
Measurement disturbances and human errors in complex industrial processes hinder the establishment of a feature selection model, and mathematical experiments by researchers have shown that the L_{2,1} norm is more robust than the L_1 norm. Therefore, a robustness constraint can be added to the original sparse neural network to improve the accuracy of feature selection:

\min_{W,b} \; \|y - a^M\|_{2,1} + \lambda \sum_{j=1}^{p} \|w_j\|_2    (5)
(3) Optimization strategy:
Because the L_{2,1} norm is a non-smooth function, a smooth class-L_{2,1} function must be established; the iterative function of the neural network is then solved by projected gradient descent, and the sum of the absolute values of the weight vector associated with each input-layer neuron is taken as the feature importance index, thereby solving the stated practical industrial problem. The process is as follows:

The L_{2,1} norm contains non-differentiable points, which increases the difficulty of the solution process. To solve this problem, we introduce the L_{R,1} norm (a class-L_{2,1} norm), which makes the whole function differentiable by adding a smoothing term at the non-differentiable points. The L_{R,1} norm is defined as:

\|W\|_{R,1} = \sum_{i} \sqrt{\|w_i\|_2^2 + \varepsilon}    (6)

where ε denotes a small positive constant. From the formula it is easy to see that as ε → 0, the L_{R,1} norm becomes equivalent to the L_{2,1} norm. At the same time, the added small constant ε ensures that the denominator of the derivative is never zero, i.e., the function is differentiable over the whole domain.
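The smoothed norm of equation (6) and its gradient translate directly into code; the sketch below is an illustration under the reconstruction above (the row-wise grouping of W and the name `eps` for ε are assumptions) and checks numerically that the value approaches the L_{2,1} norm as ε → 0:

```python
import numpy as np

def lr1_norm(W, eps=1e-8):
    """L_{R,1} norm per equation (6): sum_i sqrt(||w_i||^2 + eps) over the rows of W."""
    return np.sum(np.sqrt(np.sum(W * W, axis=1) + eps))

def lr1_grad(W, eps=1e-8):
    """Gradient of the L_{R,1} norm: row i is w_i / sqrt(||w_i||^2 + eps).
    The eps term keeps the denominator nonzero, so the gradient exists even for all-zero rows."""
    denom = np.sqrt(np.sum(W * W, axis=1, keepdims=True) + eps)
    return W / denom

W = np.array([[3.0, 4.0], [0.0, 0.0]])
print(lr1_norm(W, eps=1e-12))  # ~5.0: close to the L_{2,1} value sum_i ||w_i||
print(lr1_grad(W))             # finite everywhere, including the zero row
```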
To solve this expression, assume the number of neural network layers is 4 with the ELU function as the activation function, and denote the network structure by L-i-j (input dimension L, hidden-layer sizes i and j). Replacing the L_{2,1} norm with the L_{R,1} norm leads to the following objective:

\min_{W,b} \; \|y - a^4\|_{R,1} + \lambda \|W^1\|_{R,1}    (7)
Taking the overall derivative of the objective function J yields the weight update formula:

W^m \leftarrow W^m - \eta \, \frac{\partial J}{\partial W^m}    (8)
Compute the neuron outputs o^m of each layer:

n^m = (W^{m-1})^T o^{m-1} + b^{m-1}, \qquad o^m = f^{m-1}(n^m)    (9)
Then compute the partial derivatives of each layer backward from the error:

\frac{\partial J}{\partial W^{m-1}} = o^{m-1} (s^m)^T, \qquad \frac{\partial J}{\partial b^{m-1}} = s^m    (10)
Let s^m denote the sensitivity:

s^m = \frac{\partial J}{\partial n^m}    (11)
Introduce the derivative matrix:

F^m(n^m) = \mathrm{diag}\big(f'(n^m_1), f'(n^m_2), \ldots, f'(n^m_{S_m})\big)    (12)
This yields the backward recursion for the sensitivities:

s^m = F^m(n^m)\, W^m\, s^{m+1}    (13)
Finally, update the weights and bias values:

W^1 \leftarrow W^1 - \eta \big( o^1 (s^2)^T + \lambda \, \partial \|W^1\|_{R,1} / \partial W^1 \big)    (14)

W^m \leftarrow W^m - \eta \, o^m (s^{m+1})^T, \quad m > 1    (15)

b^m \leftarrow b^m - \eta \, s^{m+1}    (16)
The above formulas are iterated, updated, and optimized repeatedly until the stopping condition is met, yielding the optimal input-layer weight matrix of the neural network. Finally, the sum of the absolute values of the weight vector associated with each input-layer neuron is taken as the feature importance index, giving the feature selection result.
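Putting the pieces together, the following NumPy sketch runs the whole loop as reconstructed above: a 4-layer ELU network trained by gradient descent on a squared-error term plus the λ·L_{R,1} group-sparse penalty on the input-layer weights, with feature importance taken as the sum of absolute input-layer weights per feature. It is our assumption-laden reading of the method (a plain squared error stands in for the robust error term, and all names are ours), not the patent's reference implementation:

```python
import numpy as np

def elu(x, a=1.0):  return np.where(x > 0, x, a * (np.exp(x) - 1.0))
def delu(x, a=1.0): return np.where(x > 0, 1.0, a * np.exp(x))

def lr1_grad(W, eps=1e-8):
    return W / np.sqrt(np.sum(W * W, axis=1, keepdims=True) + eps)

def rsbp_train(X, y, hidden=(8, 8), lam=1.0, eta=3e-4, iters=4000, tol=0.01, patience=20):
    """Sketch of RSBP: group-sparse input-layer weights trained by backpropagation."""
    rng = np.random.default_rng(0)
    sizes = [X.shape[1], *hidden, 1]
    Ws = [rng.normal(scale=0.1, size=(sizes[m], sizes[m + 1])) for m in range(3)]
    bs = [np.zeros(sizes[m + 1]) for m in range(3)]
    prev, still = np.inf, 0
    for _ in range(iters):
        # forward pass per equation (3); hidden layers use ELU, the output layer is linear
        ns, os = [], [X]
        for m, (W, b) in enumerate(zip(Ws, bs)):
            n = os[-1] @ W + b
            ns.append(n)
            os.append(elu(n) if m < 2 else n)
        err = os[-1].ravel() - y
        J = np.mean(err ** 2) + lam * np.sum(np.sqrt(np.sum(Ws[0] ** 2, axis=1) + 1e-8))
        # backward pass: sensitivities per equations (11)-(13)
        s = (2.0 / len(y)) * err[:, None]        # output-layer sensitivity (linear output)
        for m in range(2, -1, -1):
            gW = os[m].T @ s
            if m == 0:
                gW += lam * lr1_grad(Ws[0])      # group-sparse term, input layer only (eq. 14)
            gb = s.sum(axis=0)
            if m > 0:
                s = (s @ Ws[m].T) * delu(ns[m - 1])  # propagate through the ELU layer
            Ws[m] -= eta * gW                    # updates per equations (14)-(16)
            bs[m] -= eta * gb
        # stop if the objective changes by no more than tol for `patience` iterations
        still = still + 1 if abs(prev - J) <= tol else 0
        if still >= patience:
            break
        prev = J
    importance = np.abs(Ws[0]).sum(axis=1)       # sum of |w| per input-layer neuron
    return importance, Ws, bs
```

Because the group penalty is applied only to the input layer, a feature whose entire weight row is driven toward zero is effectively deselected, which is what makes the input-layer weights usable as an importance index.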
Real froth-image data from the first roughing tank of the copper flotation process of a flotation plant are selected for simulation experiments. After preprocessing, outlier removal, and normalization, 245 groups of copper flotation samples are obtained; each group contains 14 features (mean, peak value, standard deviation, skewness, R, G, B, hue, red component, yellow component, speed, stability, bearing rate, and gray level) and 1 label (mineral grade). According to the distribution characteristics of the mineral grade, 195 groups are selected as training samples and 49 groups as test samples. To make the solution fast and effective, the number of hidden layers is set to 2, the number of neurons to 8, and the learning rate to η = 0.0003. The convergence criterion is set as follows: if the change of the objective function does not exceed 0.01 for 20 consecutive iterations, the objective function is considered converged; the maximum number of iterations is 4000. Solving the objective function yields the input-layer weights, and taking the absolute values of the weights as the importance basis gives the feature importance ranking for the froth images. Feature subsets of different dimensions are generated according to the ranking, tested with an SVR (support vector regression) model, and compared to obtain the optimal combination under the fixed feature ranking.
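Under the setup just described, a usage sketch might wire in the stated hyperparameters (2 hidden layers of 8 neurons, η = 0.0003, λ in [90, 120], at most 4000 iterations) and check the resulting ranking with a support vector regression model; the data arrays below are random placeholders, and `rsbp_train` is the sketch above:

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error

# X_train: 195 x 14, X_test: 49 x 14, y_*: mineral grade -- placeholder arrays here
rng = np.random.default_rng(1)
X_train, y_train = rng.normal(size=(195, 14)), rng.normal(size=195)
X_test, y_test = rng.normal(size=(49, 14)), rng.normal(size=49)

importance, _, _ = rsbp_train(X_train, y_train, hidden=(8, 8),
                              lam=100.0, eta=3e-4, iters=4000)  # lambda in [90, 120]
order = np.argsort(importance)[::-1]  # feature ranking, most important first

for k in (3, 5, 7, 10, 14):           # test feature subsets of several dimensions
    cols = order[:k]
    svr = SVR().fit(X_train[:, cols], y_train)
    print(k, mean_squared_error(y_test, svr.predict(X_test[:, cols])))
```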
First, to demonstrate the soundness of the selection results, the change of the feature importance index is obtained by varying the sparse coefficient, as shown in Tables 1-2. As the sparse coefficient gradually increases, each feature index gradually decreases, and sparse solutions appear one by one until all are zero (when the sparse coefficient is large enough). Throughout this process the importance order is fixed and consistent with the order in which the sparse solutions appear; the result agrees with subjective analysis, and the algorithm is reasonable and interpretable.
Table 1: importance ranking of the top 7 features [table not reproduced in this text]

Table 2: importance ranking of the last 7 features [table not reproduced in this text]
Meanwhile, the influence of the sparse parameter λ on the performance of the proposed technique is analyzed, choosing λ ∈ [0, 240], which represents the process from no sparsity to full sparsity of the objective function. Using the mean square error (MSE), mean relative error (ARE), and squared correlation coefficient R^2 as evaluation criteria, the influence of the sparse coefficient on algorithm performance under feature-subset inputs of different dimensions is obtained; the results are shown in FIG. 1, where (a), (b), and (c) show the three evaluation indices. As can be seen from FIG. 1, the most suitable sparse-coefficient interval for the froth-image feature selection problem of the copper flotation roughing tank is λ ∈ [90, 120].
Then, existing feature selection algorithms are compared; the comparison algorithms comprise Sequential Forward Selection (SFS), Principal Component Analysis (PCA), Sparse PCA (SPCA), the Sparse Artificial Neural Network (SANN), and the Robust Feature Selection algorithm (RFS), and a new data set is obtained with the same steps for each. To ensure fair comparison conditions, an SVR model with the same parameters is adopted as the test model for the feature selection results. The new data sets are used to train the SVR model, with the mean square error, mean relative error, and coefficient of determination as evaluation criteria; the comparison results are shown in FIG. 2. Under the different evaluation criteria, RSBP is generally superior to all compared feature selection methods, and its error is lower than that of the 14-dimensional original feature data. The method can thus effectively improve model prediction accuracy.
Finally, to compare the anti-interference capability of the algorithms, 25% of the training samples are randomly selected and a 5% disturbance is added. As can be seen from FIG. 3, the redundancy rate of the feature subset selected by RSBP on this data set is significantly lower than that of the other methods. At the same time, the error of RSBP remains at a fairly low level across different dimensions, indicating that the proposed feature selection method is more robust. Comparing FIG. 2 and FIG. 3, the errors of the other methods fluctuate significantly when interference data are added to the training set, while the proposed method always keeps the error at a low level. With 25% of the samples disturbed, the proposed method is 5%-12% more accurate than the other methods.
It will be understood that modifications and variations can be made by persons skilled in the art in light of the above teachings and all such modifications and variations are intended to be included within the scope of the invention as defined in the appended claims.

Claims (5)

1. A nonlinear feature selection method based on a neural network, characterized by comprising the following steps: (1) neural network embedding: replacing the linear error function of the sparse regularization model with a neural network error function, imposing a group-sparse constraint on the input-layer weights of the network, and establishing a nonlinear feature selection model; (2) robustness optimization: using the robustness of the L_{2,1} norm to optimize the neural network and establish the RSBP objective function; (3) optimization strategy: introducing a smooth class-L_{2,1} function, solving the iterative function of the neural network by projected gradient descent to obtain the optimal weight matrix, and taking the sum of the absolute values of the weight vector associated with each input-layer neuron as the feature importance index, thereby solving the stated problem.
2. The neural network-based nonlinear feature selection method according to claim 1, wherein in step (1):
the classical sparse regularization model (Lasso) can be understood to satisfy L1Solving β the optimal solution of the linear regression problem under the norm constraint condition, where y ═ y1,…,yN) Representing an N-dimensional response vector, the input quantity X is an N × p matrix, and a lagrange function can be constructed as follows:
Figure FDA0002465050000000011
replacing the linear error function of the sparse regularization model with a neural network error function, and imposing a group-sparse constraint on the input-layer weights according to the complexity of the network weights, yields the nonlinear feature selection model:
\min_{W,b} \; E(y, a^M) + \lambda \, h(W^1), \qquad h(W^1) = \sum_{j=1}^{p} \|w_j\|_2    (2)

a^m = f^{m-1}\big((W^{m-1})^T a^{m-1} + b^{m-1}\big)    (3)

where E denotes the error function, h the group-sparse constraint function, w_j the weight vector of the j-th input feature, f the activation function, λ the sparse coefficient, M the number of layers of the neural network, a^m the output of the m-th layer, and b^m the bias of the m-th layer.
3. The neural network-based nonlinear feature selection method according to claim 2, wherein an ELU function is adopted as the activation function when setting the neural network parameters:

f(x) = x \ \text{for}\ x > 0, \qquad f(x) = \alpha\,(e^{x} - 1) \ \text{for}\ x \le 0    (4)
4. The neural network-based nonlinear feature selection method according to claim 1, wherein in step (2): the L_{2,1} norm is more robust than the L_1 norm, and a robustness constraint is added on the basis of the original sparse neural network:

\min_{W,b} \; \|y - a^M\|_{2,1} + \lambda \sum_{j=1}^{p} \|w_j\|_2    (5)
5. The neural network-based nonlinear feature selection method according to claim 1, wherein in step (3): because the L_{2,1} norm is a non-smooth function, a smooth class-L_{2,1} function must be established; the iterative function of the neural network is solved by projected gradient descent, and the sum of the absolute values of the weight vector associated with each input-layer neuron is taken as the feature importance index; the process is as follows:
a class-L_{2,1} norm, the L_{R,1} norm, is introduced, which makes the whole function differentiable by adding a smoothing term at the non-differentiable points; the L_{R,1} norm is defined as:

\|W\|_{R,1} = \sum_{i} \sqrt{\|w_i\|_2^2 + \varepsilon}    (6)

where ε denotes a small positive constant; as ε → 0, the L_{R,1} norm is equivalent to the L_{2,1} norm; at the same time, the added small constant ε ensures that the denominator of the derivative is never zero, i.e., the function is differentiable over the whole domain;
assuming the number of neural network layers is 4 with the ELU function as the activation function, and denoting the network structure by L-i-j, replacing the L_{2,1} norm with the L_{R,1} norm leads to the following objective:

\min_{W,b} \; \|y - a^4\|_{R,1} + \lambda \|W^1\|_{R,1}    (7)
taking the overall derivative of the objective function J yields the weight update formula:

W^m \leftarrow W^m - \eta \, \frac{\partial J}{\partial W^m}    (8)
computing the neuron outputs o^m of each layer:

n^m = (W^{m-1})^T o^{m-1} + b^{m-1}, \qquad o^m = f^{m-1}(n^m)    (9)
then computing the partial derivatives of each layer backward from the error:

\frac{\partial J}{\partial W^{m-1}} = o^{m-1} (s^m)^T, \qquad \frac{\partial J}{\partial b^{m-1}} = s^m    (10)
letting s^m denote the sensitivity:

s^m = \frac{\partial J}{\partial n^m}    (11)
introducing the derivative matrix:

F^m(n^m) = \mathrm{diag}\big(f'(n^m_1), f'(n^m_2), \ldots, f'(n^m_{S_m})\big)    (12)
this gives the backward recursion for the sensitivities:

s^m = F^m(n^m)\, W^m\, s^{m+1}    (13)
finally updating the weights and bias values:

W^1 \leftarrow W^1 - \eta \big( o^1 (s^2)^T + \lambda \, \partial \|W^1\|_{R,1} / \partial W^1 \big)    (14)

W^m \leftarrow W^m - \eta \, o^m (s^{m+1})^T, \quad m > 1    (15)

b^m \leftarrow b^m - \eta \, s^{m+1}    (16)
repeatedly iterating, updating, and optimizing through the above formulas until the stopping condition is met, to obtain the optimal input-layer weight matrix of the neural network; and finally taking the sum of the absolute values of the weight vector associated with each input-layer neuron as the feature importance index, to obtain the feature selection result.
CN202010331361.XA 2020-04-24 2020-04-24 Nonlinear feature selection method based on neural network Pending CN111652271A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010331361.XA CN111652271A (en) 2020-04-24 2020-04-24 Nonlinear feature selection method based on neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010331361.XA CN111652271A (en) 2020-04-24 2020-04-24 Nonlinear feature selection method based on neural network

Publications (1)

Publication Number Publication Date
CN111652271A true CN111652271A (en) 2020-09-11

Family

ID=72349283

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010331361.XA Pending CN111652271A (en) 2020-04-24 2020-04-24 Nonlinear feature selection method based on neural network

Country Status (1)

Country Link
CN (1) CN111652271A (en)


Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113033416A (en) * 2021-03-26 2021-06-25 深圳市华杰智通科技有限公司 Millimeter wave radar gesture recognition method based on sparse function
CN113064489A (en) * 2021-04-02 2021-07-02 深圳市华杰智通科技有限公司 Millimeter wave radar gesture recognition method based on L1-Norm
CN112884136A (en) * 2021-04-21 2021-06-01 江南大学 Bounded clustering projection synchronous regulation control method and system for coupled neural network
CN112884136B (en) * 2021-04-21 2022-05-13 江南大学 Bounded clustering projection synchronous regulation control method and system for coupled neural network
CN113313175A (en) * 2021-05-28 2021-08-27 北京大学 Image classification method of sparse regularization neural network based on multivariate activation function
CN113313175B (en) * 2021-05-28 2024-02-27 北京大学 Image classification method of sparse regularized neural network based on multi-element activation function
CN115294406A (en) * 2022-09-30 2022-11-04 华东交通大学 Method and system for attribute-based multimodal interpretable classification
CN115796244A (en) * 2022-12-20 2023-03-14 广东石油化工学院 CFF-based parameter identification method for super-nonlinear input/output system
CN115796244B (en) * 2022-12-20 2023-07-21 广东石油化工学院 Parameter identification method based on CFF for ultra-nonlinear input/output system


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
RJ01: Rejection of invention patent application after publication (application publication date: 20200911)