CN111931813A - CNN-based width learning classification method - Google Patents

CNN-based width learning classification method

Info

Publication number
CN111931813A
CN111931813A
Authority
CN
China
Prior art keywords
width learning
layer
cnn
feature
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010634517.1A
Other languages
Chinese (zh)
Inventor
Xia Yang
Che Haiying
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Technology BIT
Priority to CN202010634517.1A
Publication of CN111931813A
Legal status: Pending

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a CNN-based width learning classification method comprising the following specific steps: obtaining training data and test data; preprocessing the training data and test data; extracting features from the training data with a convolutional neural network (CNN) to obtain feature mappings of the training data and generate the feature node layer of a basic width learning model; transforming the mapped features with randomly generated weights into an enhancement matrix to generate the enhancement node layer of the basic width learning model; constructing an input matrix from the feature node layer and the enhancement node layer, inputting it to the width learning model for training, and constructing the basic width learning model; and classifying with the finally trained improved width learning model. The key point of the technical scheme is that, with a similar number of feature nodes, CNN_BLS achieves better classification results than BLS while, compared with an ELM model, retaining the original efficiency of the width learning system, giving a better overall effect.

Description

CNN-based width learning classification method
Technical Field
The invention relates to the field of intelligent learning, in particular to a CNN-based width learning classification method.
Background
At present, in the field of intelligent learning, deep neural network models are widely used to solve many kinds of intelligent classification problems. However, when facing more complex problems, it is generally necessary to deepen the network structure of the deep neural network model, adjust the number of neurons in each layer, and then train the connection weights between layers by iterative updates before an acceptable model effect is finally reached. With large amounts of experimental data, as depth models grow more complex and deeper, the number of parameters to be optimized multiplies, and their iterative optimization typically consumes large amounts of time and machine resources, which hinders practical application.
To address this problem, Chen Junlong (C. L. Philip Chen) et al. proposed, on the basis of the random vector functional-link neural network, a width learning model comparable to deep learning; the system is called a broad learning system (BLS) and is an efficient incremental learning system without a deep structure. BLS has already shown learning ability approaching deep neural network models in fields such as image recognition. Compared with a deep model, the width model structure is simple: there are no connections across many layers, and no gradient descent iterations are needed to update the weights; instead, ridge regression solves the weight matrix directly by matrix computation, so the width model has a great advantage in computation speed.
However, because BLS by default uses randomly generated weights and biases when deriving descriptive features of the samples and constructing the feature layer, complex samples may be insufficiently described by representative features, reducing learning ability and costing accuracy on classification tasks. It is therefore necessary to find a more reliable and efficient feature extraction procedure for the width learning model.
In order to solve the above problem, a CNN-based width learning classification method is proposed herein.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention aims to provide a CNN-based width learning classification method in which, with a similar number of feature nodes, CNN_BLS achieves better classification results than BLS while, compared with an ELM (extreme learning machine) model, retaining the original efficiency of the width learning system, giving a better overall effect.
The technical purpose of the invention is realized by the following technical scheme:
A CNN-based width learning classification method, illustrated in fig. 1 for a preferred embodiment of the present invention, is implemented by a technical solution comprising the following specific steps:
step 1, obtain training data and test data;
step 2, preprocess the training data and the test data;
step 3, extract features from the training data using the convolutional neural network CNN to obtain the feature mappings of the training data and generate the feature node layer of the basic width learning model;
step 4, transform the mapped features with randomly generated weights into an enhancement matrix, generating the enhancement node layer of the basic width learning model;
step 5, construct an input matrix from the feature node layer and the enhancement node layer, input it to the width learning model for training, and construct the basic width learning model;
step 6, use the finally trained improved width learning model: estimate the output corresponding to the test input from the weights and biases generated during training and the output weight W.
Further, in step 3, the process of generating the feature node layer includes:
step 3-1: for the input data X obtained after preprocessing, generate a feature mapping Z containing k feature windows using a convolutional neural network model;
step 3-2: likewise, apply the convolutional neural network model k times to generate the k feature windows Zi, each feature window Zi containing n feature nodes, where i = 1, 2, …, k;
step 3-3: splice the k feature windows Z1, Z2, …, Zk to form a feature node layer containing k×n feature nodes.
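The construction of the feature node layer in steps 3-1 to 3-3 can be sketched in NumPy. The CNN pass is abstracted here as a hypothetical `cnn_feature_window` helper (a random linear map plus relu), since only the shapes matter for the splicing step; the names and dimensions are illustrative, not prescribed by the patent:

```python
import numpy as np

rng = np.random.default_rng(0)

def cnn_feature_window(X, n, seed):
    # Stand-in for one CNN pass (steps 3-1/3-2): any mapping from the
    # raw inputs to n feature nodes per sample. Here a random linear
    # map followed by relu, purely to illustrate the shapes involved.
    local_rng = np.random.default_rng(seed)
    W = local_rng.standard_normal((X.shape[1], n))
    return np.maximum(X @ W, 0.0)

X = rng.standard_normal((100, 784))   # 100 preprocessed samples
k, n = 10, 10                         # k feature windows, n nodes each

# Step 3-3: splice the k windows Z1..Zk into one feature node layer
Z = np.hstack([cnn_feature_window(X, n, seed=i) for i in range(k)])
print(Z.shape)                        # (100, k*n) = (100, 100)
```

The splice is a plain horizontal concatenation, so each sample ends up described by k×n feature nodes.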
Further, the convolutional neural network model specifically includes:
a convolutional neural network built on the TensorFlow framework, comprising 1 convolutional layer, 1 pooling layer, and 1 fully connected layer, specifically:
step 3-1-1: the convolutional layer convolves the input data X with 16 single-channel 5×5 convolution kernels, uses the relu activation function, and outputs 16 feature maps of size 28×28;
step 3-1-2: the output of the convolutional layer is fed into the pooling layer, which uses max pooling with a 2×2 pooling window, a 2×2 stride, and the 'SAME' padding mode, and outputs 16 feature mappings of size 14×14;
step 3-1-3: the fully connected layer takes the output of the previous layer as input, uses the relu activation function, and outputs a feature mapping containing 10 neurons, namely Zi, where i = 1, 2, …, k;
step 3-1-4: apply this convolutional neural network model k times to obtain the final feature node layer and proceed to the next computation, where k denotes the number of feature windows and n denotes the number of feature nodes contained in each feature window.
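The shape arithmetic of steps 3-1-1 and 3-1-2 can be checked with a minimal NumPy sketch. The 5×5 convolution itself is elided (a random array stands in for its 16 output feature maps); only the 2×2, stride-2 max pooling is made explicit, and for an even-sized 28×28 input the 'SAME' padding of the patent adds nothing:

```python
import numpy as np

def max_pool_2x2(fm):
    # 2x2 max pooling with stride 2 (step 3-1-2): group rows and
    # columns into 2x2 blocks and take the block-wise maximum.
    h, w = fm.shape
    return fm.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

rng = np.random.default_rng(0)
# 16 feature maps of size 28x28, as produced by the 5x5 convolution
# with 'SAME'-style padding in step 3-1-1 (stand-in values)
conv_out = rng.standard_normal((16, 28, 28))

pooled = np.stack([max_pool_2x2(f) for f in conv_out])
print(pooled.shape)   # (16, 14, 14), matching step 3-1-2
```

The fully connected layer of step 3-1-3 then maps each pooled sample to the 10 feature nodes of one window Zi.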
Further, in step 4, step 4-1: use m groups of randomly generated weights and biases to transform the feature mapping generated in step 3 through ξj, where j = 1, 2, …, m, to construct the enhancement node matrix; here ξj is the hyperbolic tangent activation function, and r denotes the number of enhancement nodes corresponding to each group of enhancement transformations.
Further, in step 5, step 5-1: splice the feature node matrix and the enhancement node matrix into a whole to obtain the hidden layer output matrix, where the number of hidden layer nodes is L = k×n + m×r.
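Steps 4-1 and 5-1 amount to m random tanh projections spliced onto the feature nodes. A minimal NumPy sketch, with illustrative values of k, n, m, r (the transformation ξj is the hyperbolic tangent named in step 4-1):

```python
import numpy as np

rng = np.random.default_rng(0)

k, n = 10, 10      # feature windows x nodes per window
m, r = 5, 20       # enhancement groups x nodes per group
Z = rng.standard_normal((100, k * n))   # feature node layer (step 3)

# Step 4-1: m groups of random weights and biases, each passed
# through the hyperbolic tangent activation xi_j
H = np.hstack([
    np.tanh(Z @ rng.standard_normal((k * n, r)) + rng.standard_normal(r))
    for _ in range(m)
])

# Step 5-1: splice feature and enhancement nodes into the
# hidden layer output matrix; node count L = k*n + m*r
A = np.hstack([Z, H])
L = k * n + m * r
print(A.shape, L)   # (100, 200) 200
```

The matrix A is what the width learning model trains on in step 5-2.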
Further, in step 5-2, regularization is introduced and the weight matrix is solved by least squares, i.e., ridge regression, to obtain the final weight estimate

W = (AᵀA + λI)⁻¹AᵀY

where A is the hidden layer output matrix, Y is the label matrix, and λ is the regularization coefficient; solving for the output weight W gives the final trained model.
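A minimal NumPy sketch of the ridge-regression solve of step 5-2 and the classification output of step 6; the dimensions, λ, and the random one-hot labels are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

A = rng.standard_normal((100, 200))          # hidden layer output matrix
Y = np.eye(10)[rng.integers(0, 10, 100)]     # one-hot labels, 10 classes
lam = 1e-2                                   # regularization coefficient

# Step 5-2: regularized least squares (ridge regression),
# W = (A^T A + lam*I)^(-1) A^T Y, solved without explicit inversion
W = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ Y)
print(W.shape)          # (200, 10)

# Step 6: the classification output for (test) inputs is A_test @ W;
# the predicted class is the index of the largest output
Y_hat = A @ W
pred = Y_hat.argmax(axis=1)
print(pred.shape)       # (100,)
```

Because W has a closed form, no gradient descent iterations are needed, which is the source of the speed advantage claimed for the width model.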
Further, in step 6, step 6-1, using the finally trained improved width learning model, comprises: estimating the classification output corresponding to the test data input from the weights and biases generated during training and the output weight W.
The improved width learning system CNN_BLS can extract deeper sample features through the convolutional neural network model while retaining the efficient learning speed of the original width learning system BLS, and has stronger representation capability than the BLS model.
In conclusion, the invention has the following beneficial effects:
1. Building on BLS, the method exploits the advantages of the CNN convolutional neural network in sample feature extraction: convolution and pooling operations extract the features of the initially input sample data to the greatest extent, so the machine learning classification task can be completed efficiently even with complex samples. The improved width learning system CNN_BLS extracts deeper sample features through the convolutional neural network model while retaining the efficient learning speed of the original width learning system BLS, and has stronger representation capability than the BLS model.
2. With a similar number of feature nodes, CNN_BLS achieves better classification results than BLS while, compared with an ELM (extreme learning machine) model, retaining the original efficiency of the width learning system, giving a better overall effect.
Drawings
FIG. 1 is a diagram of an improved width learning model;
Detailed Description
The preferred embodiments of the present invention are described below in conjunction with the accompanying drawings; it will be understood that they are described for the purpose of illustration and explanation, not limitation.
As shown in fig. 1, a CNN-based width learning classification method in a preferred embodiment of the present invention is implemented by a technical solution including the following specific steps:
step 1, obtain training data and test data;
step 2, preprocess the training data and the test data;
step 3, extract features from the training data using the convolutional neural network CNN to obtain the feature mappings of the training data and generate the feature node layer of the basic width learning model;
step 3-1: for the input data X obtained after preprocessing, generate a feature mapping Z containing k feature windows using a convolutional neural network model;
step 3-2: likewise, apply the convolutional neural network model k times to generate the k feature windows Zi, each feature window Zi containing n feature nodes, where i = 1, 2, …, k;
step 3-3: splice the k feature windows Z1, Z2, …, Zk to form a feature node layer containing k×n feature nodes;
the convolutional neural network model specifically includes:
a convolutional neural network built on the TensorFlow framework, comprising 1 convolutional layer, 1 pooling layer, and 1 fully connected layer, specifically:
step 3-1-1: the convolutional layer convolves the input data X with 16 single-channel 5×5 convolution kernels, uses the relu activation function, and outputs 16 feature maps of size 28×28;
step 3-1-2: the output of the convolutional layer is fed into the pooling layer, which uses max pooling with a 2×2 pooling window, a 2×2 stride, and the 'SAME' padding mode, and outputs 16 feature mappings of size 14×14;
step 3-1-3: the fully connected layer takes the output of the previous layer as input, uses the relu activation function, and outputs a feature mapping containing 10 neurons, namely Zi, where i = 1, 2, …, k;
step 3-1-4: apply this convolutional neural network model k times to obtain the final feature node layer and proceed to the next computation, where k denotes the number of feature windows and n denotes the number of feature nodes contained in each feature window.
Step 4, transform the mapped features with randomly generated weights into an enhancement matrix, generating the enhancement node layer of the basic width learning model; in step 4, step 4-1: use m groups of randomly generated weights and biases to transform the feature mapping generated in step 3 through ξj, where j = 1, 2, …, m, to construct the enhancement node matrix; here ξj is the hyperbolic tangent activation function, and r denotes the number of enhancement nodes corresponding to each group of enhancement transformations.
Step 5, construct an input matrix from the feature node layer and the enhancement node layer, input it to the width learning model for training, and construct the basic width learning model; in step 5, step 5-1: splice the feature node matrix and the enhancement node matrix into a whole to obtain the hidden layer output matrix, where the number of hidden layer nodes is L = k×n + m×r;
In step 5-2, regularization is introduced and the weight matrix is solved by least squares, i.e., ridge regression, to obtain the final weight estimate

W = (AᵀA + λI)⁻¹AᵀY

where A is the hidden layer output matrix, Y is the label matrix, and λ is the regularization coefficient; solving for the output weight W gives the final trained model.
Step 6, use the finally trained improved width learning model; in step 6, step 6-1: estimate the classification output corresponding to the test data input from the weights and biases generated during training and the output weight W.
The improved width learning system CNN_BLS can extract deeper sample features through the convolutional neural network model while retaining the efficient learning speed of the original width learning system BLS, and has stronger representation capability than the BLS model.
The classification results on the standard MNIST data set, with the relevant parameters set, are shown in the comparative experiment:

[Comparative results table (CNN_BLS vs. BLS vs. ELM) was rendered as images in the source and is not reproduced here.]
With a similar number of feature nodes, CNN_BLS achieves better classification results than BLS. Meanwhile, compared with an ELM (extreme learning machine) model, it retains the original efficiency of the width learning system, giving a better overall effect.
The foregoing shows and describes the general principles and broad features of the present invention and its advantages. It will be understood by those skilled in the art that the invention is not limited to the embodiments described above, which are presented in the specification only to illustrate its principles; various changes and modifications may be made without departing from the spirit and scope of the invention, and such changes fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and their equivalents.

Claims (7)

1. A CNN-based width learning classification method, characterized in that the method comprises the following specific steps:
step 1, obtain training data and test data;
step 2, preprocess the training data and the test data;
step 3, extract features from the training data using the convolutional neural network CNN to obtain the feature mappings of the training data and generate the feature node layer of the basic width learning model;
step 4, transform the mapped features with randomly generated weights into an enhancement matrix, generating the enhancement node layer of the basic width learning model;
step 5, construct an input matrix from the feature node layer and the enhancement node layer, input it to the width learning model for training, and construct the basic width learning model;
step 6, use the finally trained improved width learning model.
2. The CNN-based width learning classification method according to claim 1, wherein: in step 3, the process of generating the feature node layer includes:
step 3-1: for the input data X obtained after preprocessing, generate a feature mapping Z containing k feature windows using a convolutional neural network model;
step 3-2: likewise, apply the convolutional neural network model k times to generate the k feature windows Zi, each feature window Zi containing n feature nodes, where i = 1, 2, …, k;
step 3-3: splice the k feature windows Z1, Z2, …, Zk to form a feature node layer containing k×n feature nodes.
3. The CNN-based width learning classification method according to claim 2, wherein: the convolutional neural network model specifically includes:
a convolutional neural network built on the TensorFlow framework, comprising 1 convolutional layer, 1 pooling layer, and 1 fully connected layer, specifically:
step 3-1-1: the convolutional layer convolves the input data X with 16 single-channel 5×5 convolution kernels, uses the relu activation function, and outputs 16 feature maps of size 28×28;
step 3-1-2: the output of the convolutional layer is fed into the pooling layer, which uses max pooling with a 2×2 pooling window, a 2×2 stride, and the 'SAME' padding mode, and outputs 16 feature mappings of size 14×14;
step 3-1-3: the fully connected layer takes the output of the previous layer as input, uses the relu activation function, and outputs a feature mapping containing 10 neurons, namely Zi, where i = 1, 2, …, k;
step 3-1-4: apply this convolutional neural network model k times to obtain the final feature node layer and proceed to the next computation, where k denotes the number of feature windows and n denotes the number of feature nodes contained in each feature window.
4. The CNN-based width learning classification method according to claim 2, wherein: in step 4, step 4-1: use m groups of randomly generated weights and biases to transform the feature mapping generated in step 3 through ξj, where j = 1, 2, …, m, to construct the enhancement node matrix; here ξj is the hyperbolic tangent activation function, and r denotes the number of enhancement nodes corresponding to each group of enhancement transformations.
5. The CNN-based width learning classification method according to claim 1, wherein: in step 5, step 5-1: splice the feature node matrix and the enhancement node matrix into a whole to obtain the hidden layer output matrix, where the number of hidden layer nodes is L = k×n + m×r.
6. The CNN-based width learning classification method according to claim 5, wherein: in step 5-2, regularization is introduced and the weight matrix is solved by least squares, i.e., ridge regression, to obtain the final weight estimate W = (AᵀA + λI)⁻¹AᵀY, where A is the hidden layer output matrix, Y is the label matrix, and λ is the regularization coefficient; solving for the output weight W gives the final trained model.
7. The CNN-based width learning classification method according to claim 3, wherein: in step 6, step 6-1, using the finally trained improved width learning model, comprises: estimating the classification output corresponding to the test data input from the weights and biases generated during training and the output weight W.
CN202010634517.1A 2020-07-02 2020-07-02 CNN-based width learning classification method Pending CN111931813A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010634517.1A CN111931813A (en) 2020-07-02 2020-07-02 CNN-based width learning classification method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010634517.1A CN111931813A (en) 2020-07-02 2020-07-02 CNN-based width learning classification method

Publications (1)

Publication Number Publication Date
CN111931813A true CN111931813A (en) 2020-11-13

Family

ID=73317301

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010634517.1A Pending CN111931813A (en) 2020-07-02 2020-07-02 CNN-based width learning classification method

Country Status (1)

Country Link
CN (1) CN111931813A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112802011A (en) * 2021-02-25 2021-05-14 上海电机学院 Fan blade defect detection method based on VGG-BLS
CN113011493A (en) * 2021-03-18 2021-06-22 华南理工大学 Electroencephalogram emotion classification method, device, medium and equipment based on multi-kernel width learning
CN113311035A (en) * 2021-05-17 2021-08-27 北京工业大学 Effluent total phosphorus prediction method based on width learning network
CN113311035B (en) * 2021-05-17 2022-05-03 北京工业大学 Effluent total phosphorus prediction method based on width learning network
CN113673554A (en) * 2021-07-07 2021-11-19 西安电子科技大学 Radar high-resolution range profile target identification method based on width learning

Similar Documents

Publication Publication Date Title
CN111489358B (en) Three-dimensional point cloud semantic segmentation method based on deep learning
CN111931813A (en) CNN-based width learning classification method
CN109543502B (en) Semantic segmentation method based on deep multi-scale neural network
CN109948029A (en) Based on the adaptive depth hashing image searching method of neural network
CN109829541A (en) Deep neural network incremental training method and system based on learning automaton
CN111695457B (en) Human body posture estimation method based on weak supervision mechanism
CN108171318B (en) Convolution neural network integration method based on simulated annealing-Gaussian function
CN112699247A (en) Knowledge representation learning framework based on multi-class cross entropy contrast completion coding
CN112750129B (en) Image semantic segmentation model based on feature enhancement position attention mechanism
CN106022363A (en) Method for recognizing Chinese characters in natural scene
CN110263855B (en) Method for classifying images by utilizing common-basis capsule projection
CN113177560A (en) Universal lightweight deep learning vehicle detection method
CN113591978B (en) Confidence penalty regularization-based self-knowledge distillation image classification method, device and storage medium
CN116503676B (en) Picture classification method and system based on knowledge distillation small sample increment learning
CN112686376A (en) Node representation method based on timing diagram neural network and incremental learning method
CN115049534A (en) Knowledge distillation-based real-time semantic segmentation method for fisheye image
CN113553918B (en) Machine ticket issuing character recognition method based on pulse active learning
CN111222534A (en) Single-shot multi-frame detector optimization method based on bidirectional feature fusion and more balanced L1 loss
CN111783688B (en) Remote sensing image scene classification method based on convolutional neural network
CN116188785A (en) Polar mask old man contour segmentation method using weak labels
CN116563636A (en) Synthetic aperture radar image generation method and system
CN114494284B (en) Scene analysis model and method based on explicit supervision area relation
CN114581789A (en) Hyperspectral image classification method and system
CN114332491A (en) Saliency target detection algorithm based on feature reconstruction
CN109409226A (en) A kind of finger vena plot quality appraisal procedure and its device based on cascade optimization CNN

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination