CN111931813A - CNN-based width learning classification method - Google Patents
- Publication number
- CN111931813A (application CN202010634517.1A)
- Authority
- CN
- China
- Prior art keywords
- width learning
- layer
- cnn
- feature
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a CNN-based width learning classification method comprising the following specific steps: obtaining training data and test data; preprocessing the training data and the test data; performing feature extraction on the training data with a convolutional neural network (CNN) to obtain feature maps of the training data and generate the feature node layer of the width learning basic model; applying enhancement transformations with randomly generated weights to the mapped features to generate the enhancement node layer of the width learning basic model; concatenating the feature node layer and the enhancement node layer into an input matrix, feeding it to the width learning model for training, and thereby constructing the width learning basic model; and classifying with the finally trained improved width learning model. The key point of the technical scheme is that, with a similar number of feature nodes, CNN_BLS achieves better classification results than BLS while, compared with an ELM model, retaining the original efficiency of the width learning system, so its overall effect is better.
Description
Technical Field
The invention relates to the field of intelligent learning, in particular to a CNN-based width learning classification method.
Background
At present, in the field of intelligent learning, deep neural network models are widely used to solve different kinds of intelligent classification problems. When facing a more complex problem, however, it is generally necessary to deepen the network structure of the deep neural network model, adjust the number of neurons in each layer, and then train the connection weights between layers by iterative updates until the model reaches the desired performance. With large amounts of experimental data, as depth models grow more complex and their layers deeper, the number of parameters to optimize multiplies, and iterative optimization of those parameters typically consumes large amounts of time and machine resources, which hinders practical application.
Aiming at this problem, and building on the random vector functional-link neural network, Chen Junlong (C. L. Philip Chen) et al. proposed a width learning model comparable to deep learning, called the Broad Learning System (BLS): an efficient incremental learning system without a deep structure. BLS has already shown learning ability approaching that of deep neural network models in fields such as image recognition. Compared with a deep model, the width model has a simple structure, avoids connections across many layers, and needs no gradient-descent iteration to update the weights; instead, ridge regression solves the weight matrix directly by matrix computation, giving it a great advantage in computation speed.
However, because BLS by default uses randomly generated weights and biases when deriving sample description features and constructing the feature layer, complex samples lead to insufficient description of their representative features, which reduces learning ability and costs accuracy on the classification task. It is therefore necessary to find a more reliable and efficient feature extraction procedure for the width learning model.
In order to solve the above problem, a CNN-based width learning classification method is proposed herein.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention aims to provide a CNN-based width learning classification method in which, with a similar number of feature nodes, CNN_BLS achieves better classification results than BLS. Meanwhile, compared with an ELM (extreme learning machine) model, it retains the original efficiency of the width learning system, so its overall effect is better.
The technical purpose of the invention is realized by the following technical scheme:
a CNN-based width learning classification method, illustrated in fig. 1 for a preferred embodiment of the present invention, implemented by a technical solution comprising the following specific steps:
step 1, obtaining training data and test data;
step 2, preprocessing the training data and the test data;
step 3, extracting the characteristics of the training data by using a convolutional neural network CNN to obtain the characteristic mapping of the training data and generate a characteristic node layer of a width learning basic model;
step 4, applying enhancement transformations with randomly generated weights to the mapped features to generate the enhancement node layer of the width learning basic model;
step 5, constructing an input matrix by the characteristic node layer and the enhancement node layer, inputting a width learning model for training, and constructing a width learning basic model;
and step 6, using the finally trained improved width learning model: the weights and biases generated during training, together with the output weight W, are used to estimate the output corresponding to the test input.
Further, in step 3, the process of generating the feature node layer includes:
step 3-1: for input data X obtained after preprocessing, a feature mapping Z containing k feature windows is generated by using a convolutional neural network modelk;
Step 3-2: as above, k feature windows Z are generated using a k-th convolutional neural network modeliEach of ZiEach feature window comprises n feature nodes, wherein i is 1,2, …, k;
step 3-3: k feature windows Z1,Z2,…,ZkAnd splicing to form a characteristic node layer containing k multiplied by n characteristic nodes.
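Steps 3-1 to 3-3 can be sketched in numpy as follows; the CNN extractor is replaced by a hypothetical random-projection stub (`cnn_extract` is not from the patent), since only the window-concatenation structure is illustrated here:

```python
import numpy as np

rng = np.random.default_rng(0)

def cnn_extract(X, n_nodes, rng):
    # Hypothetical stand-in for one pass of the patent's CNN extractor:
    # a random linear map followed by ReLU, yielding n_nodes features.
    W = rng.standard_normal((X.shape[1], n_nodes))
    return np.maximum(X @ W, 0.0)

X = rng.standard_normal((100, 784))   # 100 preprocessed samples (e.g. flattened 28x28 images)
k, n = 6, 10                          # k feature windows, n feature nodes each
windows = [cnn_extract(X, n, rng) for _ in range(k)]   # Z_1, ..., Z_k
Z = np.hstack(windows)                # feature node layer with k*n = 60 nodes
assert Z.shape == (100, k * n)
```

Each column block of `Z` plays the role of one feature window; in the patent each window would come from a trained CNN rather than a random projection.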
Further, the convolutional neural network model is specifically:
a convolutional neural network built on the TensorFlow framework, comprising 1 convolutional layer, 1 pooling layer and 1 fully connected layer, as follows:
step 3-1-1: the convolutional layer convolves the input data X with 16 single-channel 5×5 convolution kernels, uses the ReLU activation function, and outputs 16 feature maps of size 28×28;
step 3-1-2: the output of the convolutional layer is fed into the pooling layer, which applies max pooling with a 2×2 pooling window, a stride of 2×2 and 'SAME' padding, and outputs 16 feature maps of size 14×14;
step 3-1-3: the fully connected layer takes the output of the previous layer as input, uses the ReLU activation function, and outputs a feature map containing 10 neurons, i.e. one feature window Z_i, i = 1, 2, …, k;
step 3-1-4: the convolutional neural network model is run k times to obtain the final feature node layer, which enters the next stage of computation; here k denotes the number of feature windows and n the number of feature nodes in each feature window.
Further, in step 4, step 4-1: m groups of randomly generated weights and biases are used to transform the feature map generated in step 3 through ξ_j, j = 1, 2, …, m, constructing the enhancement node matrix, where ξ_j is the hyperbolic tangent activation function and r denotes the number of enhancement nodes produced by each enhancement transformation.
Further, in step 5, step 5-1: the feature node matrix and the enhancement node matrix are concatenated into a whole to obtain the hidden-layer output matrix, whose number of hidden-layer nodes is L = k×n + m×r.
Further, in step 5-2, by introducing regularization, the weight matrix is solved with the least-squares method, namely ridge regression, giving the closed-form weight estimate W = (A^T A + λI)^(-1) A^T Y, where A is the hidden-layer output matrix, Y the training labels and λ the regularization coefficient; solving for the output weight W in this way yields the final trained model.
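The ridge-regression solve of step 5-2 can be written directly in numpy; λ and the matrix sizes below are illustrative assumptions, not the patent's settings:

```python
import numpy as np

rng = np.random.default_rng(2)
L, C = 260, 10                           # L = k*n + m*r hidden nodes, C classes
A = rng.standard_normal((100, L))        # hidden-layer output matrix [Z | H]
Y = np.eye(C)[rng.integers(0, C, 100)]   # one-hot training labels
lam = 1e-2                               # regularization coefficient lambda

# Regularized least squares (ridge regression): W = (A^T A + lam*I)^-1 A^T Y
W = np.linalg.solve(A.T @ A + lam * np.eye(L), A.T @ Y)
assert W.shape == (L, C)
```

This single linear solve replaces the gradient-descent training loop of a deep model, which is the source of the speed advantage claimed for the width learning system.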
Further, in step 6, step 6-1: the finally trained improved width learning model is used as follows: the weights and biases generated during training, together with the output weight W, estimate the classification output corresponding to the test data input.
While retaining the efficient learning speed of the original width learning system BLS, the improved width learning system CNN_BLS can extract deeper sample features through the convolutional neural network model and has stronger representation capability than the BLS model.
In conclusion, the invention has the following beneficial effects:
1. Based on BLS, the method exploits the advantages of the CNN convolutional neural network in sample feature extraction, using convolution and pooling operations to extract features of the initially input sample data to the greatest extent, so that machine learning classification tasks can be completed efficiently even on complex samples. While retaining the efficient learning speed of the original width learning system BLS, the improved width learning system CNN_BLS can extract deeper sample features through the convolutional neural network model and has stronger representation capability than the BLS model.
2. With a similar number of feature nodes, CNN_BLS achieves better classification results than BLS while, compared with an ELM (extreme learning machine) model, retaining the original efficiency of the width learning system, so its overall effect is better.
Drawings
FIG. 1 is a diagram of an improved width learning model;
Detailed Description
The preferred embodiments of the present invention are described below in conjunction with the accompanying drawings; it will be understood that they are presented here for the purpose of illustration and explanation, not limitation.
As shown in fig. 1, a CNN-based width learning classification method in a preferred embodiment of the present invention is implemented by a technical solution including the following specific steps:
step 1, obtaining training data and test data;
step 2, preprocessing the training data and the test data;
step 3, extracting the characteristics of the training data by using a convolutional neural network CNN to obtain the characteristic mapping of the training data and generate a characteristic node layer of a width learning basic model;
step 3-1: for input data X obtained after preprocessing, a feature mapping Z containing k feature windows is generated by using a convolutional neural network modelk;
Step 3-2: as above, k feature windows Z are generated using a k-th convolutional neural network modeliEach of ZiEach feature window comprises n feature nodes, wherein i is 1,2, …, k;
step 3-3: k feature windows Z1,Z2,…,ZkSplicing to form a characteristic node layer containing kXn characteristic nodes;
the convolutional neural network model is specifically:
a convolutional neural network built on the TensorFlow framework, comprising 1 convolutional layer, 1 pooling layer and 1 fully connected layer, as follows:
step 3-1-1: the convolutional layer convolves the input data X with 16 single-channel 5×5 convolution kernels, uses the ReLU activation function, and outputs 16 feature maps of size 28×28;
step 3-1-2: the output of the convolutional layer is fed into the pooling layer, which applies max pooling with a 2×2 pooling window, a stride of 2×2 and 'SAME' padding, and outputs 16 feature maps of size 14×14;
step 3-1-3: the fully connected layer takes the output of the previous layer as input, uses the ReLU activation function, and outputs a feature map containing 10 neurons, i.e. one feature window Z_i, i = 1, 2, …, k;
step 3-1-4: the convolutional neural network model is run k times to obtain the final feature node layer, which enters the next stage of computation; here k denotes the number of feature windows and n the number of feature nodes in each feature window.
Step 4, applying enhancement transformations with randomly generated weights to the mapped features to generate the enhancement node layer of the width learning basic model; in this step, step 4-1: m groups of randomly generated weights and biases transform the feature map generated in step 3 through ξ_j, j = 1, 2, …, m, constructing the enhancement node matrix, where ξ_j is the hyperbolic tangent activation function and r denotes the number of enhancement nodes produced by each enhancement transformation.
Step 5, an input matrix is constructed from the feature node layer and the enhancement node layer and fed to the width learning model for training, constructing the width learning basic model; in this step, step 5-1: the feature node matrix and the enhancement node matrix are concatenated into a whole to obtain the hidden-layer output matrix, whose number of hidden-layer nodes is L = k×n + m×r;
in step 5-2, by introducing regularization, the weight matrix is solved with the least-squares method, namely ridge regression, giving the closed-form weight estimate W = (A^T A + λI)^(-1) A^T Y, where A is the hidden-layer output matrix, Y the training labels and λ the regularization coefficient; solving for the output weight W yields the final trained model.
Step 6, the finally trained improved width learning model is used: the weights and biases generated during training, together with the output weight W, estimate the output corresponding to the test input; specifically, in step 6-1, the finally trained improved width learning model estimates the classification output corresponding to the test data input from the weights and biases generated during training and the output weight W.
While retaining the efficient learning speed of the original width learning system BLS, the improved width learning system CNN_BLS can extract deeper sample features through the convolutional neural network model and has stronger representation capability than the BLS model.
In comparative experiments on the standard MNIST data set with the relevant parameters set, the classification results show that, with a similar number of feature nodes, CNN_BLS achieves better classification results than BLS. Meanwhile, compared with an ELM (extreme learning machine) model, it retains the original efficiency of the width learning system, so its overall effect is better.
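Putting the pieces together, the whole CNN_BLS training-and-prediction flow (steps 1 to 6) can be sketched end to end; as before, the CNN extractor is a hypothetical random-projection stub and all dimensions are illustrative, not the parameters used in the patent's MNIST experiment:

```python
import numpy as np

rng = np.random.default_rng(3)

def feature_windows(X, Ws):
    # Step 3 stand-in: each W plays the role of one trained CNN extractor pass.
    return np.hstack([np.maximum(X @ W, 0.0) for W in Ws])

def enhance(Z, Wbs):
    # Step 4: random tanh enhancement transformations.
    return np.hstack([np.tanh(Z @ W + b) for W, b in Wbs])

d, k, n, m, r, C = 784, 6, 10, 4, 50, 10
X_tr = rng.standard_normal((200, d))      # steps 1-2: preprocessed training data
y_tr = rng.integers(0, C, 200)
X_te = rng.standard_normal((20, d))       # preprocessed test data

Ws = [rng.standard_normal((d, n)) for _ in range(k)]
Wbs = [(rng.standard_normal((k * n, r)), rng.standard_normal(r)) for _ in range(m)]

# Step 5: concatenate feature and enhancement nodes, solve by ridge regression.
Z = feature_windows(X_tr, Ws)
A = np.hstack([Z, enhance(Z, Wbs)])       # hidden layer, L = k*n + m*r = 260 columns
W_out = np.linalg.solve(A.T @ A + 1e-2 * np.eye(A.shape[1]), A.T @ np.eye(C)[y_tr])

# Step 6: reuse the stored random weights/biases and W_out on the test input.
Z_te = feature_windows(X_te, Ws)
A_te = np.hstack([Z_te, enhance(Z_te, Wbs)])
pred = (A_te @ W_out).argmax(axis=1)      # estimated class labels
assert A.shape == (200, 260) and pred.shape == (20,)
```

Note that the same `Ws` and `Wbs` generated during training are reused at test time, matching the requirement in step 6 that the trained model's weights, biases and output weight W all be retained.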
The foregoing shows and describes the general principles, broad features and advantages of the present invention. It will be understood by those skilled in the art that the invention is not limited to the embodiments described above, which are set forth in the specification and drawings only to illustrate its principle; various changes and modifications may be made without departing from the spirit and scope of the invention, and such changes fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and their equivalents.
Claims (7)
1. A CNN-based width learning classification method is characterized in that: the method comprises the following specific steps:
step 1, obtaining training data and test data;
step 2, preprocessing the training data and the test data;
step 3, extracting the characteristics of the training data by using a convolutional neural network CNN to obtain the characteristic mapping of the training data and generate a characteristic node layer of a width learning basic model;
step 4, applying enhancement transformations with randomly generated weights to the mapped features to generate the enhancement node layer of the width learning basic model;
step 5, constructing an input matrix by the characteristic node layer and the enhancement node layer, inputting a width learning model for training, and constructing a width learning basic model;
and 6, utilizing the finally trained improved width learning model.
2. The CNN-based width learning classification method according to claim 1, wherein: in step 3, the process of generating the feature node layer includes:
step 3-1: for input data X obtained after preprocessing, a feature mapping Z containing k feature windows is generated by using a convolutional neural network modelk;
Step 3-2: as above, k feature windows Z are generated using a k-th convolutional neural network modeliEach of ZiEach feature window comprises n feature nodes, wherein i is 1,2, …, k;
step 3-3: k feature windows Z1,Z2,…,ZkAnd splicing to form a characteristic node layer containing k multiplied by n characteristic nodes.
3. The CNN-based width learning classification method according to claim 2, wherein: the convolutional neural network model is specifically:
a convolutional neural network built on the TensorFlow framework, comprising 1 convolutional layer, 1 pooling layer and 1 fully connected layer, as follows:
step 3-1-1: the convolutional layer convolves the input data X with 16 single-channel 5×5 convolution kernels, uses the ReLU activation function, and outputs 16 feature maps of size 28×28;
step 3-1-2: the output of the convolutional layer is fed into the pooling layer, which applies max pooling with a 2×2 pooling window, a stride of 2×2 and 'SAME' padding, and outputs 16 feature maps of size 14×14;
step 3-1-3: the fully connected layer takes the output of the previous layer as input, uses the ReLU activation function, and outputs a feature map containing 10 neurons, i.e. one feature window Z_i, i = 1, 2, …, k;
step 3-1-4: the convolutional neural network model is run k times to obtain the final feature node layer, which enters the next stage of computation; here k denotes the number of feature windows and n the number of feature nodes in each feature window.
4. The CNN-based width learning classification method according to claim 2, wherein: in step 4, step 4-1: m groups of randomly generated weights and biases are used to transform the feature map generated in step 3 through ξ_j, j = 1, 2, …, m, constructing the enhancement node matrix, where ξ_j is the hyperbolic tangent activation function and r denotes the number of enhancement nodes produced by each enhancement transformation.
5. The CNN-based width learning classification method according to claim 1, wherein: in step 5, step 5-1: the feature node matrix and the enhancement node matrix are concatenated into a whole to obtain the hidden-layer output matrix, whose number of hidden-layer nodes is L = k×n + m×r.
6. The CNN-based width learning classification method according to claim 5, wherein: in step 5-2, by introducing regularization, the weight matrix is solved with the least-squares method, namely ridge regression, giving the closed-form weight estimate W = (A^T A + λI)^(-1) A^T Y, where A is the hidden-layer output matrix, Y the training labels and λ the regularization coefficient; solving for the output weight W yields the final trained model.
7. The CNN-based width learning classification method according to claim 3, wherein: in step 6, step 6-1: the finally trained improved width learning model estimates the classification output corresponding to the test data input from the weights and biases generated during training and the output weight W.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010634517.1A CN111931813A (en) | 2020-07-02 | 2020-07-02 | CNN-based width learning classification method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010634517.1A CN111931813A (en) | 2020-07-02 | 2020-07-02 | CNN-based width learning classification method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111931813A true CN111931813A (en) | 2020-11-13 |
Family
ID=73317301
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010634517.1A Pending CN111931813A (en) | 2020-07-02 | 2020-07-02 | CNN-based width learning classification method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111931813A (en) |
Cited By (5)

Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN112802011A * | 2021-02-25 | 2021-05-14 | 上海电机学院 | Fan blade defect detection method based on VGG-BLS
CN113011493A * | 2021-03-18 | 2021-06-22 | 华南理工大学 | Electroencephalogram emotion classification method, device, medium and equipment based on multi-kernel width learning
CN113311035A * | 2021-05-17 | 2021-08-27 | 北京工业大学 | Effluent total phosphorus prediction method based on width learning network
CN113311035B * | 2021-05-17 | 2022-05-03 | 北京工业大学 | Effluent total phosphorus prediction method based on width learning network
CN113673554A * | 2021-07-07 | 2021-11-19 | 西安电子科技大学 | Radar high-resolution range profile target identification method based on width learning

- 2020
  - 2020-07-02 CN CN202010634517.1A patent/CN111931813A/en active Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111489358B (en) | Three-dimensional point cloud semantic segmentation method based on deep learning | |
CN111931813A (en) | CNN-based width learning classification method | |
CN109543502B (en) | Semantic segmentation method based on deep multi-scale neural network | |
CN109948029A (en) | Based on the adaptive depth hashing image searching method of neural network | |
CN109829541A (en) | Deep neural network incremental training method and system based on learning automaton | |
CN111695457B (en) | Human body posture estimation method based on weak supervision mechanism | |
CN108171318B (en) | Convolution neural network integration method based on simulated annealing-Gaussian function | |
CN112699247A (en) | Knowledge representation learning framework based on multi-class cross entropy contrast completion coding | |
CN112750129B (en) | Image semantic segmentation model based on feature enhancement position attention mechanism | |
CN106022363A (en) | Method for recognizing Chinese characters in natural scene | |
CN110263855B (en) | Method for classifying images by utilizing common-basis capsule projection | |
CN113177560A (en) | Universal lightweight deep learning vehicle detection method | |
CN113591978B (en) | Confidence penalty regularization-based self-knowledge distillation image classification method, device and storage medium | |
CN116503676B (en) | Picture classification method and system based on knowledge distillation small sample increment learning | |
CN112686376A (en) | Node representation method based on timing diagram neural network and incremental learning method | |
CN115049534A (en) | Knowledge distillation-based real-time semantic segmentation method for fisheye image | |
CN113553918B (en) | Machine ticket issuing character recognition method based on pulse active learning | |
CN111222534A (en) | Single-shot multi-frame detector optimization method based on bidirectional feature fusion and more balanced L1 loss | |
CN111783688B (en) | Remote sensing image scene classification method based on convolutional neural network | |
CN116188785A (en) | Polar mask old man contour segmentation method using weak labels | |
CN116563636A (en) | Synthetic aperture radar image generation method and system | |
CN114494284B (en) | Scene analysis model and method based on explicit supervision area relation | |
CN114581789A (en) | Hyperspectral image classification method and system | |
CN114332491A (en) | Saliency target detection algorithm based on feature reconstruction | |
CN109409226A (en) | A kind of finger vena plot quality appraisal procedure and its device based on cascade optimization CNN |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||