CN111931813A - CNN-based width learning classification method - Google Patents

CNN-based width learning classification method

Info

Publication number
CN111931813A
CN111931813A
Authority
CN
China
Prior art keywords
width learning
layer
cnn
feature
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010634517.1A
Other languages
Chinese (zh)
Inventor
Xia Yang
Che Haiying
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Technology BIT
Priority to CN202010634517.1A
Publication of CN111931813A
Legal status: Pending

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a CNN-based width learning classification method comprising the following specific steps: obtaining training data and test data; preprocessing the training data and test data; extracting features from the training data with a convolutional neural network (CNN) to obtain feature mappings of the training data and generate the feature node layer of a basic width learning model; transforming the mapped features with randomly generated weights into an enhancement matrix to generate the enhancement node layer of the basic width learning model; constructing an input matrix from the feature node layer and the enhancement node layer, inputting it to the width learning model for training, and constructing the basic width learning model; and classifying with the finally trained improved width learning model. The key point of the technical scheme is that, with a similar number of feature nodes, CNN_BLS achieves better classification results than BLS while, compared with an ELM model, retaining the original efficiency of the width learning system, giving a better overall effect.

Description

CNN-based width learning classification method
Technical Field
The invention relates to the field of intelligent learning, in particular to a CNN-based width learning classification method.
Background
At present, in the field of intelligent learning, deep neural network models are widely used to solve many kinds of intelligent classification problems. However, when facing more complex problems, it is generally necessary to deepen the network structure of the deep neural network model, adjust the number of neurons in each layer, and then train the connection weights between layers by iterative updates before an acceptable model effect is finally reached. With large amounts of experimental data, as depth models grow more complex and deeper, the number of parameters to be optimized multiplies, and their iterative optimization typically consumes large amounts of time and machine resources, which hinders practical application.
To address this problem, Chen Junlong (C. L. Philip Chen) et al. proposed, on the basis of the random vector functional-link neural network, a width learning model comparable to deep learning; the system is called a broad learning system (BLS) and is an efficient incremental learning system without a deep structure. BLS has already shown learning ability approaching deep neural network models in fields such as image recognition. Compared with a deep model, the width model structure is simple: there are no connections across many layers, and no gradient descent iterations are needed to update the weights; instead, ridge regression solves the weight matrix directly by matrix computation, so the width model has a great advantage in computation speed.
However, because BLS by default uses randomly generated weights and biases when deriving descriptive features of the samples and constructing the feature layer, complex samples may be insufficiently described by representative features, reducing learning ability and costing accuracy on classification tasks. It is therefore necessary to find a more reliable and efficient feature extraction procedure for the width learning model.
In order to solve the above problem, a CNN-based width learning classification method is proposed herein.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention aims to provide a CNN-based width learning classification method in which, with a similar number of feature nodes, CNN_BLS achieves better classification results than BLS while, compared with an ELM (extreme learning machine) model, retaining the original efficiency of the width learning system, giving a better overall effect.
The technical purpose of the invention is realized by the following technical scheme:
A CNN-based width learning classification method, illustrated in fig. 1 for a preferred embodiment of the present invention, is implemented by a technical solution comprising the following specific steps:
step 1, obtain training data and test data;
step 2, preprocess the training data and the test data;
step 3, extract features from the training data using the convolutional neural network CNN to obtain the feature mappings of the training data and generate the feature node layer of the basic width learning model;
step 4, transform the mapped features with randomly generated weights into an enhancement matrix, generating the enhancement node layer of the basic width learning model;
step 5, construct an input matrix from the feature node layer and the enhancement node layer, input it to the width learning model for training, and construct the basic width learning model;
step 6, use the finally trained improved width learning model: estimate the output corresponding to the test input from the weights and biases generated during training and the output weight W.
Further, in step 3, the process of generating the feature node layer includes:
step 3-1: for the input data X obtained after preprocessing, generate a feature mapping Z containing k feature windows using a convolutional neural network model;
step 3-2: likewise, apply the convolutional neural network model k times to generate the k feature windows Zi, each feature window Zi containing n feature nodes, where i = 1, 2, …, k;
step 3-3: splice the k feature windows Z1, Z2, …, Zk to form a feature node layer containing k×n feature nodes.
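The construction of the feature node layer in steps 3-1 to 3-3 can be sketched in NumPy. The CNN pass is abstracted here as a hypothetical `cnn_feature_window` helper (a random linear map plus relu), since only the shapes matter for the splicing step; the names and dimensions are illustrative, not prescribed by the patent:

```python
import numpy as np

rng = np.random.default_rng(0)

def cnn_feature_window(X, n, seed):
    # Stand-in for one CNN pass (steps 3-1/3-2): any mapping from the
    # raw inputs to n feature nodes per sample. Here a random linear
    # map followed by relu, purely to illustrate the shapes involved.
    local_rng = np.random.default_rng(seed)
    W = local_rng.standard_normal((X.shape[1], n))
    return np.maximum(X @ W, 0.0)

X = rng.standard_normal((100, 784))   # 100 preprocessed samples
k, n = 10, 10                         # k feature windows, n nodes each

# Step 3-3: splice the k windows Z1..Zk into one feature node layer
Z = np.hstack([cnn_feature_window(X, n, seed=i) for i in range(k)])
print(Z.shape)                        # (100, k*n) = (100, 100)
```

The splice is a plain horizontal concatenation, so each sample ends up described by k×n feature nodes.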
Further, the convolutional neural network model specifically includes:
a convolutional neural network built on the TensorFlow framework, comprising 1 convolutional layer, 1 pooling layer, and 1 fully connected layer, specifically:
step 3-1-1: the convolutional layer convolves the input data X with 16 single-channel 5×5 convolution kernels, uses the relu activation function, and outputs 16 feature maps of size 28×28;
step 3-1-2: the output of the convolutional layer is fed into the pooling layer, which uses max pooling with a 2×2 pooling window, a 2×2 stride, and the 'SAME' padding mode, and outputs 16 feature mappings of size 14×14;
step 3-1-3: the fully connected layer takes the output of the previous layer as input, uses the relu activation function, and outputs a feature mapping containing 10 neurons, namely Zi, where i = 1, 2, …, k;
step 3-1-4: apply this convolutional neural network model k times to obtain the final feature node layer and proceed to the next computation, where k denotes the number of feature windows and n denotes the number of feature nodes contained in each feature window.
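The shape arithmetic of steps 3-1-1 and 3-1-2 can be checked with a minimal NumPy sketch. The 5×5 convolution itself is elided (a random array stands in for its 16 output feature maps); only the 2×2, stride-2 max pooling is made explicit, and for an even-sized 28×28 input the 'SAME' padding of the patent adds nothing:

```python
import numpy as np

def max_pool_2x2(fm):
    # 2x2 max pooling with stride 2 (step 3-1-2): group rows and
    # columns into 2x2 blocks and take the block-wise maximum.
    h, w = fm.shape
    return fm.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

rng = np.random.default_rng(0)
# 16 feature maps of size 28x28, as produced by the 5x5 convolution
# with 'SAME'-style padding in step 3-1-1 (stand-in values)
conv_out = rng.standard_normal((16, 28, 28))

pooled = np.stack([max_pool_2x2(f) for f in conv_out])
print(pooled.shape)   # (16, 14, 14), matching step 3-1-2
```

The fully connected layer of step 3-1-3 then maps each pooled sample to the 10 feature nodes of one window Zi.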
Further, in step 4, step 4-1: use m groups of randomly generated weights and biases to transform the feature mapping generated in step 3 through ξj, where j = 1, 2, …, m, to construct the enhancement node matrix; here ξj is the hyperbolic tangent activation function, and r denotes the number of enhancement nodes corresponding to each group of enhancement transformations.
Further, in step 5, step 5-1: splice the feature node matrix and the enhancement node matrix into a whole to obtain the hidden layer output matrix, where the number of hidden layer nodes is L = k×n + m×r.
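Steps 4-1 and 5-1 amount to m random tanh projections spliced onto the feature nodes. A minimal NumPy sketch, with illustrative values of k, n, m, r (the transformation ξj is the hyperbolic tangent named in step 4-1):

```python
import numpy as np

rng = np.random.default_rng(0)

k, n = 10, 10      # feature windows x nodes per window
m, r = 5, 20       # enhancement groups x nodes per group
Z = rng.standard_normal((100, k * n))   # feature node layer (step 3)

# Step 4-1: m groups of random weights and biases, each passed
# through the hyperbolic tangent activation xi_j
H = np.hstack([
    np.tanh(Z @ rng.standard_normal((k * n, r)) + rng.standard_normal(r))
    for _ in range(m)
])

# Step 5-1: splice feature and enhancement nodes into the
# hidden layer output matrix; node count L = k*n + m*r
A = np.hstack([Z, H])
L = k * n + m * r
print(A.shape, L)   # (100, 200) 200
```

The matrix A is what the width learning model trains on in step 5-2.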
Further, in step 5-2, regularization is introduced and the weight matrix is solved by least squares, i.e., ridge regression, to obtain the final weight estimate

W = (AᵀA + λI)⁻¹AᵀY

where A is the hidden layer output matrix, Y is the label matrix, and λ is the regularization coefficient; solving for the output weight W gives the final trained model.
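A minimal NumPy sketch of the ridge-regression solve of step 5-2 and the classification output of step 6; the dimensions, λ, and the random one-hot labels are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

A = rng.standard_normal((100, 200))          # hidden layer output matrix
Y = np.eye(10)[rng.integers(0, 10, 100)]     # one-hot labels, 10 classes
lam = 1e-2                                   # regularization coefficient

# Step 5-2: regularized least squares (ridge regression),
# W = (A^T A + lam*I)^(-1) A^T Y, solved without explicit inversion
W = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ Y)
print(W.shape)          # (200, 10)

# Step 6: the classification output for (test) inputs is A_test @ W;
# the predicted class is the index of the largest output
Y_hat = A @ W
pred = Y_hat.argmax(axis=1)
print(pred.shape)       # (100,)
```

Because W has a closed form, no gradient descent iterations are needed, which is the source of the speed advantage claimed for the width model.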
Further, in step 6, step 6-1, using the finally trained improved width learning model, comprises: estimating the classification output corresponding to the test data input from the weights and biases generated during training and the output weight W.
The improved width learning system CNN_BLS can extract deeper sample features through the convolutional neural network model while retaining the efficient learning speed of the original width learning system BLS, and has stronger representation capability than the BLS model.
In conclusion, the invention has the following beneficial effects:
1. Building on BLS, the method exploits the advantages of the CNN convolutional neural network in sample feature extraction: convolution and pooling operations extract the features of the initially input sample data to the greatest extent, so the machine learning classification task can be completed efficiently even with complex samples. The improved width learning system CNN_BLS extracts deeper sample features through the convolutional neural network model while retaining the efficient learning speed of the original width learning system BLS, and has stronger representation capability than the BLS model.
2. With a similar number of feature nodes, CNN_BLS achieves better classification results than BLS while, compared with an ELM (extreme learning machine) model, retaining the original efficiency of the width learning system, giving a better overall effect.
Drawings
FIG. 1 is a diagram of an improved width learning model;
Detailed Description
The preferred embodiments of the present invention are described below in conjunction with the accompanying drawings; it will be understood that they are described for the purpose of illustration and explanation, not limitation.
As shown in fig. 1, a CNN-based width learning classification method in a preferred embodiment of the present invention is implemented by a technical solution including the following specific steps:
step 1, obtain training data and test data;
step 2, preprocess the training data and the test data;
step 3, extract features from the training data using the convolutional neural network CNN to obtain the feature mappings of the training data and generate the feature node layer of the basic width learning model;
step 3-1: for the input data X obtained after preprocessing, generate a feature mapping Z containing k feature windows using a convolutional neural network model;
step 3-2: likewise, apply the convolutional neural network model k times to generate the k feature windows Zi, each feature window Zi containing n feature nodes, where i = 1, 2, …, k;
step 3-3: splice the k feature windows Z1, Z2, …, Zk to form a feature node layer containing k×n feature nodes;
the convolutional neural network model specifically includes:
a convolutional neural network built on the TensorFlow framework, comprising 1 convolutional layer, 1 pooling layer, and 1 fully connected layer, specifically:
step 3-1-1: the convolutional layer convolves the input data X with 16 single-channel 5×5 convolution kernels, uses the relu activation function, and outputs 16 feature maps of size 28×28;
step 3-1-2: the output of the convolutional layer is fed into the pooling layer, which uses max pooling with a 2×2 pooling window, a 2×2 stride, and the 'SAME' padding mode, and outputs 16 feature mappings of size 14×14;
step 3-1-3: the fully connected layer takes the output of the previous layer as input, uses the relu activation function, and outputs a feature mapping containing 10 neurons, namely Zi, where i = 1, 2, …, k;
step 3-1-4: apply this convolutional neural network model k times to obtain the final feature node layer and proceed to the next computation, where k denotes the number of feature windows and n denotes the number of feature nodes contained in each feature window.
Step 4, transform the mapped features with randomly generated weights into an enhancement matrix, generating the enhancement node layer of the basic width learning model; in step 4, step 4-1: use m groups of randomly generated weights and biases to transform the feature mapping generated in step 3 through ξj, where j = 1, 2, …, m, to construct the enhancement node matrix; here ξj is the hyperbolic tangent activation function, and r denotes the number of enhancement nodes corresponding to each group of enhancement transformations.
Step 5, construct an input matrix from the feature node layer and the enhancement node layer, input it to the width learning model for training, and construct the basic width learning model; in step 5, step 5-1: splice the feature node matrix and the enhancement node matrix into a whole to obtain the hidden layer output matrix, where the number of hidden layer nodes is L = k×n + m×r;
In step 5-2, regularization is introduced and the weight matrix is solved by least squares, i.e., ridge regression, to obtain the final weight estimate

W = (AᵀA + λI)⁻¹AᵀY

where A is the hidden layer output matrix, Y is the label matrix, and λ is the regularization coefficient; solving for the output weight W gives the final trained model.
Step 6, use the finally trained improved width learning model; in step 6, step 6-1: estimate the classification output corresponding to the test data input from the weights and biases generated during training and the output weight W.
The improved width learning system CNN_BLS can extract deeper sample features through the convolutional neural network model while retaining the efficient learning speed of the original width learning system BLS, and has stronger representation capability than the BLS model.
The classification results on the standard MNIST data set, with the relevant parameters set, are shown in the comparative experiment:

[Comparative results table (CNN_BLS vs. BLS vs. ELM) was rendered as images in the source and is not reproduced here.]
With a similar number of feature nodes, CNN_BLS achieves better classification results than BLS. Meanwhile, compared with an ELM (extreme learning machine) model, it retains the original efficiency of the width learning system, giving a better overall effect.
The foregoing shows and describes the general principles and broad features of the present invention and its advantages. It will be understood by those skilled in the art that the invention is not limited to the embodiments described above, which are presented in the specification only to illustrate its principles; various changes and modifications may be made without departing from the spirit and scope of the invention, and such changes fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and their equivalents.

Claims (7)

1. A CNN-based width learning classification method, characterized in that the method comprises the following specific steps:
step 1, obtain training data and test data;
step 2, preprocess the training data and the test data;
step 3, extract features from the training data using the convolutional neural network CNN to obtain the feature mappings of the training data and generate the feature node layer of the basic width learning model;
step 4, transform the mapped features with randomly generated weights into an enhancement matrix, generating the enhancement node layer of the basic width learning model;
step 5, construct an input matrix from the feature node layer and the enhancement node layer, input it to the width learning model for training, and construct the basic width learning model;
step 6, use the finally trained improved width learning model.
2. The CNN-based width learning classification method according to claim 1, wherein: in step 3, the process of generating the feature node layer includes:
step 3-1: for the input data X obtained after preprocessing, generate a feature mapping Z containing k feature windows using a convolutional neural network model;
step 3-2: likewise, apply the convolutional neural network model k times to generate the k feature windows Zi, each feature window Zi containing n feature nodes, where i = 1, 2, …, k;
step 3-3: splice the k feature windows Z1, Z2, …, Zk to form a feature node layer containing k×n feature nodes.
3. The CNN-based width learning classification method according to claim 2, wherein: the convolutional neural network model specifically includes:
a convolutional neural network built on the TensorFlow framework, comprising 1 convolutional layer, 1 pooling layer, and 1 fully connected layer, specifically:
step 3-1-1: the convolutional layer convolves the input data X with 16 single-channel 5×5 convolution kernels, uses the relu activation function, and outputs 16 feature maps of size 28×28;
step 3-1-2: the output of the convolutional layer is fed into the pooling layer, which uses max pooling with a 2×2 pooling window, a 2×2 stride, and the 'SAME' padding mode, and outputs 16 feature mappings of size 14×14;
step 3-1-3: the fully connected layer takes the output of the previous layer as input, uses the relu activation function, and outputs a feature mapping containing 10 neurons, namely Zi, where i = 1, 2, …, k;
step 3-1-4: apply this convolutional neural network model k times to obtain the final feature node layer and proceed to the next computation, where k denotes the number of feature windows and n denotes the number of feature nodes contained in each feature window.
4. The CNN-based width learning classification method according to claim 2, wherein: in step 4, step 4-1: use m groups of randomly generated weights and biases to transform the feature mapping generated in step 3 through ξj, where j = 1, 2, …, m, to construct the enhancement node matrix; here ξj is the hyperbolic tangent activation function, and r denotes the number of enhancement nodes corresponding to each group of enhancement transformations.
5. The CNN-based width learning classification method according to claim 1, wherein: in step 5, step 5-1: splice the feature node matrix and the enhancement node matrix into a whole to obtain the hidden layer output matrix, where the number of hidden layer nodes is L = k×n + m×r.
6. The CNN-based width learning classification method according to claim 5, wherein: in step 5-2, regularization is introduced and the weight matrix is solved by least squares, i.e., ridge regression, to obtain the final weight estimate W = (AᵀA + λI)⁻¹AᵀY, where A is the hidden layer output matrix, Y is the label matrix, and λ is the regularization coefficient; solving for the output weight W gives the final trained model.
7. The CNN-based width learning classification method according to claim 3, wherein: in step 6, step 6-1, using the finally trained improved width learning model, comprises: estimating the classification output corresponding to the test data input from the weights and biases generated during training and the output weight W.
CN202010634517.1A 2020-07-02 2020-07-02 CNN-based width learning classification method Pending CN111931813A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010634517.1A CN111931813A (en) 2020-07-02 2020-07-02 CNN-based width learning classification method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010634517.1A CN111931813A (en) 2020-07-02 2020-07-02 CNN-based width learning classification method

Publications (1)

Publication Number Publication Date
CN111931813A true CN111931813A (en) 2020-11-13

Family

ID=73317301

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010634517.1A Pending CN111931813A (en) 2020-07-02 2020-07-02 CNN-based width learning classification method

Country Status (1)

Country Link
CN (1) CN111931813A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112802011A (en) * 2021-02-25 2021-05-14 上海电机学院 Fan blade defect detection method based on VGG-BLS
CN113011493A (en) * 2021-03-18 2021-06-22 华南理工大学 Electroencephalogram emotion classification method, device, medium and equipment based on multi-kernel width learning
CN113311035A (en) * 2021-05-17 2021-08-27 北京工业大学 Effluent total phosphorus prediction method based on width learning network
CN113311035B (en) * 2021-05-17 2022-05-03 北京工业大学 Effluent total phosphorus prediction method based on width learning network
CN113673554A (en) * 2021-07-07 2021-11-19 西安电子科技大学 Radar high-resolution range profile target identification method based on width learning

Similar Documents

Publication Publication Date Title
CN111489358B (en) Three-dimensional point cloud semantic segmentation method based on deep learning
CN111931813A (en) CNN-based width learning classification method
CN109543502B (en) Semantic segmentation method based on deep multi-scale neural network
CN109948029A (en) Based on the adaptive depth hashing image searching method of neural network
CN109829541A (en) Deep neural network incremental training method and system based on learning automaton
CN111695457B (en) Human body posture estimation method based on weak supervision mechanism
CN108171318B (en) Convolution neural network integration method based on simulated annealing-Gaussian function
CN112699247A (en) Knowledge representation learning framework based on multi-class cross entropy contrast completion coding
CN112750129B (en) Image semantic segmentation model based on feature enhancement position attention mechanism
CN106022363A (en) Method for recognizing Chinese characters in natural scene
CN110263855B (en) Method for classifying images by utilizing common-basis capsule projection
CN113177560A (en) Universal lightweight deep learning vehicle detection method
CN113591978B (en) Confidence penalty regularization-based self-knowledge distillation image classification method, device and storage medium
CN116503676B (en) Picture classification method and system based on knowledge distillation small sample increment learning
CN112686376A (en) Node representation method based on timing diagram neural network and incremental learning method
CN115049534A (en) Knowledge distillation-based real-time semantic segmentation method for fisheye image
CN113553918B (en) Machine ticket issuing character recognition method based on pulse active learning
CN111222534A (en) Single-shot multi-frame detector optimization method based on bidirectional feature fusion and more balanced L1 loss
CN111783688B (en) Remote sensing image scene classification method based on convolutional neural network
CN116188785A (en) Polar mask old man contour segmentation method using weak labels
CN116563636A (en) Synthetic aperture radar image generation method and system
CN114494284B (en) Scene analysis model and method based on explicit supervision area relation
CN114581789A (en) Hyperspectral image classification method and system
CN114332491A (en) Saliency target detection algorithm based on feature reconstruction
CN109409226A (en) A kind of finger vena plot quality appraisal procedure and its device based on cascade optimization CNN

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination