CN116246102A - Image classification method and system based on self-encoder and decision tree - Google Patents

Image classification method and system based on self-encoder and decision tree Download PDF

Info

Publication number
CN116246102A
CN116246102A CN202310070830.0A
Authority
CN
China
Prior art keywords
sample
encoder
self
nearest neighbor
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310070830.0A
Other languages
Chinese (zh)
Inventor
黄祎婧
王辉
黄宇廷
韩星宇
曹学儒
范自柱
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
East China Jiaotong University
Original Assignee
East China Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by East China Jiaotong University filed Critical East China Jiaotong University
Priority to CN202310070830.0A priority Critical patent/CN116246102A/en
Publication of CN116246102A publication Critical patent/CN116246102A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

An image classification method and system based on a self-encoder and a decision tree, wherein the method comprises the following steps: collecting image sample data and converting each image sample into a pixel information matrix/vector; learning the characterization information of the image samples with a self-encoder network model, and compressing and extracting the low-dimensional feature information of the image samples with the encoder; updating the nearest neighbor value corresponding to each sample while iteratively solving the optimal weight parameters of the self-encoder network; constructing a decision tree model from the low-dimensional sample feature information extracted by the trained self-encoder network model, using the iteratively obtained nearest neighbor value of each sample as its label; and obtaining the low-dimensional feature information of a new sample with the self-encoder, inputting it into the decision tree to obtain its nearest neighbor value, searching the nearest neighbor field in the training set, and taking the category with the largest number in the nearest neighbor field as the prediction result. The invention can obtain the low-dimensional feature information of the target, predicts the category of the sample, and the prediction result is interpretable.

Description

Image classification method and system based on self-encoder and decision tree
Technical Field
The invention relates to an image classification method and system based on a self-encoder and a decision tree, belonging to the technical field of machine learning and deep learning.
Background
A large number of image samples obtained in research are unlabeled, and manually labeling such large amounts of data is impractical; traditional machine learning methods and deep learning methods therefore aim to classify or identify unlabeled samples by means of a sample data set with label information. The classification task is a basic task of traditional machine learning, in which samples with unknown label information need to be classified by using the sample label information of a training set. Image classification is a popular field of research today and has given rise to many classical traditional machine learning methods and improved algorithms.
Classical traditional machine learning algorithms include decision trees, Bayesian classifiers, support vector machines, K-nearest neighbor classifiers and the like, and these methods perform well on small structured data sets. When complex data such as high-dimensional data are input, most machine learning algorithms face the curse of dimensionality and their classification performance degrades. On large sample data sets, deep learning networks can greatly improve both the speed of the algorithm and the classification accuracy.
The difference between traditional machine learning algorithms and deep learning algorithms is that a deep learning network does not require hand-crafted features or feature analysis; when the amount of data increases, the number of network layers can be deepened to obtain better learning performance, whereas the performance of a machine learning algorithm no longer improves beyond a certain limit. However, machine learning algorithms share the characteristic of interpretability: the process that produces a specific output can be observed intuitively. Current deep network methods achieve good classification results but lack interpretability, and the classification process of the network is opaque.
With the diversification of data collection paths, image classification often faces the challenge of high-dimensional complex data. To address the large computational complexity and long running time of basic machine learning methods for image classification when facing high-dimensional complex data, a neural network is used to process the sample information. By exploiting the interpretability of the decision tree model, the algorithm result retains a certain interpretability while the classification accuracy is improved; combining the self-encoder network with the decision tree improves both the interpretability of the classification and the generalization capability of the model.
Disclosure of Invention
The invention aims to solve the problems of image classification, and provides an image classification method based on a self-encoder and a decision tree.
The technical scheme of the invention is as follows: an image classification method based on a self-encoder and a decision tree comprises the following steps:
(1) Collecting data, acquiring original RGB image data, and converting an image sample into a pixel information matrix/vector;
(2) Inputting the collected image data into a self-encoder, performing self-encoder characterization learning on the image samples through the encoder and the decoder by using a feedforward neural network, and extracting low-dimensional structural feature information of the image samples by using an encoder part; constructing a network according to the sample image data, adding sparsity constraint and correlation constraint to a loss function of the network, and updating weight parameters of the self-coding network in an iterative mode;
(3) In the iterative process of solving the optimal weight parameters of the network, calculating the distance between the image samples based on the low-dimensional sample vector obtained by the encoder, and updating the nearest neighbor value corresponding to each image sample under the constraint of the nearest neighbor distance;
(4) Based on the trained encoder in the self-encoder network model, extracting low-dimensional characteristic information of a sample, combining the sample nearest neighbor value obtained through iteration as a corresponding sample label, constructing a decision tree with the nearest neighbor value as a leaf node by using a CART method, and simultaneously adjusting parameters of the self-encoder network;
(5) And obtaining low-dimensional characteristic information of the new sample by using the self-encoder, inputting a decision tree to obtain nearest neighbor values, searching the nearest neighbor field of the training sample by KNN, and taking the category with the largest number in the nearest neighbor field as a prediction result.
The self-encoder characterization learning step includes:
according to sample image data, constructing a network, adding sparsity constraint and correlation constraint to a loss function of the network, wherein the loss function of the self-coding network under the unconstrained condition is as follows:
X = (x_1, x_2, \ldots, x_n)
\hat{X} = (\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_n)
J_{ave}(W, b) = \frac{1}{n} \sum_{i=1}^{n} (\hat{x}_i - x_i)^2
wherein X is the input sample vector; x_i is the i-th feature of the input sample vector X; n is the dimension of the input vector (for an image sample of size 28 pixels × 28 pixels, the dimension of the corresponding input sample vector is n = 784); \hat{X} is the reconstructed sample vector output by the network; \hat{x}_i is the i-th feature of the reconstructed sample vector \hat{X}; J_{ave}(W, b) is the unconstrained loss function of the self-encoder network, which measures the average difference between the reconstructed sample \hat{X} and the original sample X; and W, b are the weights and biases of the self-encoder network, respectively;
the loss function of the self-encoding network after adding a sparse constraint to the hidden layer output of the self-encoder network is:
\hat{\rho}_i = \frac{1}{n} \sum_{j=1}^{n} a_i(x_j)
KL(\rho \| \hat{\rho}_i) = \rho \log \frac{\rho}{\hat{\rho}_i} + (1 - \rho) \log \frac{1 - \rho}{1 - \hat{\rho}_i}
J_{sparse}(W, b) = J_{ave}(W, b) + \gamma \sum_{i=1}^{h} KL(\rho \| \hat{\rho}_i)
wherein \hat{\rho}_i is the average activation of the i-th hidden-layer neuron in the self-encoder network; n is the dimension of the input vector; x_j is the j-th feature of the input sample vector; a_i(x_j) is the activation value of the i-th neuron for input x_j; KL(\rho \| \hat{\rho}_i) is the relative entropy, a penalty factor measuring the difference between the two distributions; h is the number of neurons in the hidden layer; ρ is the sparsity parameter; γ is the KL-divergence constraint parameter; and J_{sparse}(W, b) is the sparsity-constrained loss function of the self-encoder network;
self-encoder network loss function after similarity constraint is added to sparse self-encoding neural network:
Figure BDA0004064740080000043
wherein ,Jre (W, b) is a self-encoder network loss function incorporating sparsity constraints and similarity constraints, μ is a similarity parameter, n is the dimension of the input vector,
Figure BDA0004064740080000044
to reconstruct the ith feature of the sample vector, as a limitation to increase the sample-to-sample variance as much as possible;
The weight parameters of the self-encoding network are updated iteratively; the iteration of the self-encoding neural network uses the quasi-Newton method L-BFGS, and the maximum number of iterations is set to 300.
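For illustration only, the following is a minimal NumPy/SciPy sketch of a single-hidden-layer sparse reconstruction loss of the form J_sparse = J_ave + γ·ΣKL(ρ‖\hat{ρ}_i), optimized with L-BFGS as stated above. The sigmoid activations, the batch-wise average activation, the toy dimensions, and the omission of the similarity term Ω are assumptions of this sketch, not the patent's reference implementation.

import numpy as np
from scipy.optimize import minimize

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def unpack(theta, n_in, n_hid):
    # Split a flat parameter vector into encoder/decoder weights and biases.
    i = 0
    W1 = theta[i:i + n_hid * n_in].reshape(n_hid, n_in); i += n_hid * n_in
    b1 = theta[i:i + n_hid]; i += n_hid
    W2 = theta[i:i + n_in * n_hid].reshape(n_in, n_hid); i += n_in * n_hid
    b2 = theta[i:i + n_in]
    return W1, b1, W2, b2

def sparse_ae_loss(theta, X, n_hid, rho=0.05, gamma=0.5):
    n_in = X.shape[1]
    W1, b1, W2, b2 = unpack(theta, n_in, n_hid)
    H = sigmoid(X @ W1.T + b1)                          # hidden activations a_i(x)
    X_hat = sigmoid(H @ W2.T + b2)                      # reconstruction
    j_ave = np.mean((X_hat - X) ** 2)                   # average reconstruction error J_ave
    rho_hat = np.clip(H.mean(axis=0), 1e-6, 1 - 1e-6)   # average activation per hidden neuron
    kl = rho * np.log(rho / rho_hat) + (1 - rho) * np.log((1 - rho) / (1 - rho_hat))
    return j_ave + gamma * np.sum(kl)                   # J_sparse

# Toy usage with small dimensions; gradients are approximated numerically here,
# and initial weights are set to 0 as in the text.
X = np.random.rand(20, 16)
n_hid = 8
theta0 = np.zeros(2 * n_hid * X.shape[1] + n_hid + X.shape[1])
result = minimize(sparse_ae_loss, theta0, args=(X, n_hid),
                  method="L-BFGS-B", options={"maxiter": 300})

For realistic 784-dimensional inputs an analytic gradient would be supplied to L-BFGS instead of relying on finite differences.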
The calculating the distance between the image samples comprises:
(1) The nearest neighbor value of each sample is updated at every iteration of the self-encoder network; the distance parameter between samples is preset before training, and the maximum and minimum nearest neighbor values are limited; under this limitation, the minimum nearest neighbor value of a sample is 1 and the maximum nearest neighbor value is 10, that is, if the nearest neighbor value of a sample is 0 it is corrected to 1, and if it exceeds 10 it is corrected to 10;
In the iterative calculation process, a new low-dimensional sample vector can be obtained from the updated weights W and bias terms b; the distance between the compressed feature vectors of two samples after passing through the self-encoding neural network with two hidden layers is calculated as follows:
X_i' = h_1(X_i) = \sigma_1(W_1 X_i + b_1)
X_i'' = h_2(X_i') = \sigma_2(W_2 X_i' + b_2)
D(X_i'', X_j'') = \sqrt{\sum_{s=1}^{m} (x''_{is} - x''_{js})^2}
wherein W_1 and W_2 are the weights of the first hidden layer h_1 and the second hidden layer h_2 of the encoder network, respectively; b_1 and b_2 are the corresponding bias terms; σ_1 and σ_2 are the output functions of the hidden layers h_1 and h_2; X_i' denotes the vector of the i-th input sample X_i after passing through the hidden layer h_1; X_i'' denotes the vector of X_i' after passing through the hidden layer h_2; D(X_i'', X_j'') is the Euclidean distance between the extracted low-dimensional sample vectors X_i'' and X_j''; m is the dimension of the low-dimensional feature vector X''; x''_{is} is the s-th feature of the sample vector X_i''; and x''_{js} is the s-th feature of the sample vector X_j''; the distance parameter α between samples determines whether another sample can be a nearest neighbor of a given sample: when its distance to the sample is greater than α, i.e. D(X_i'', X_j'') > α, it cannot be a neighbor of that sample; based on the distance parameter, the nearest neighbor value of the i-th sample is the number of samples whose distance to it is less than α;
(2) The self-encoding neural network is provided with two hidden layers; for 784×1 sample data, the dimension of the first hidden layer d_1 is set to 196×1 and the dimension of the second hidden layer d_2 is set to 20×1;
(3) The parameters of the network are initialized: the initial values of the self-encoder network weight parameters are set to 0, the sparsity parameter ρ is set to 0.05, the coefficient γ of the sparsity penalty factor is set to 0.5, and the similarity constraint parameter μ is set to 3×10^{-3}.
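As an illustration of the neighbor-value update described above, the sketch below computes pairwise Euclidean distances between the encoded low-dimensional samples, counts the neighbors within the distance parameter α, and corrects the count into the stated range [1, 10]; the function and variable names are hypothetical.

import numpy as np

def nearest_neighbor_values(Z, alpha, k_min=1, k_max=10):
    # Z: (num_samples, m) matrix of low-dimensional encoded vectors X''.
    diff = Z[:, None, :] - Z[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))      # pairwise Euclidean distances D(X_i'', X_j'')
    counts = (D < alpha).sum(axis=1) - 1       # neighbors within alpha, excluding the sample itself
    return np.clip(counts, k_min, k_max)       # corrected nearest neighbor values

A call such as K = nearest_neighbor_values(encoded_samples, alpha=1.5) would yield one neighbor value per sample; the value 1.5 merely stands in for the preset distance parameter.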
The decision tree with the nearest neighbor value as the leaf node is constructed as follows:
taking the final output X' after iteration of the self-encoder as a sample vector for constructing a decision tree, and taking the nearest neighbor number of each sample obtained by iteration as a new label of the sample;
the decision tree is generated by adopting a CART algorithm, and a full binary tree is generated, and the method for calculating the base Ny index is as follows;
Figure BDA0004064740080000051
Figure BDA0004064740080000052
wherein Gini (B) represents the purity of sample set B in the decision tree; v i Represents the proportion of samples of class i (i=1,.); gini (B, q) represents the base index of attribute q; t represents the attribute q= { q 1 ,q 2 ,...,q T Number of values; b (B) t Indicating that all values at the t-th branch node are q t Is a sample set of the sample set.
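A minimal scikit-learn sketch of this step: the encoded features are the inputs and each sample's nearest neighbor value is its label, with DecisionTreeClassifier acting as a CART-style learner that splits on the Gini index. The synthetic arrays Z and K below are placeholders for the quantities produced in the previous steps.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

Z = np.random.rand(100, 20)                       # placeholder for encoded vectors X''
K = np.random.randint(1, 11, size=100)            # placeholder nearest neighbor values (1..10)

cart = DecisionTreeClassifier(criterion="gini")   # CART: binary splits on the Gini index
cart.fit(Z, K)                                    # neighbor value used as the sample label
K_new = cart.predict(np.random.rand(5, 20))       # predicted neighbor values for new samples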
The adjusting of the parameters of the self-encoder network includes:
for a test sample X_z, a new verification sample vector X_z'' is first generated by the trained self-encoding neural network, and the corresponding neighbor value K_z is obtained through the constructed CART decision tree;
calculating the distance between samples:
D(X_z'', X_i'') = \sqrt{\sum_{s=1}^{m} (x''_{zs} - x''_{is})^2}
wherein D(X_z'', X_i'') represents the Euclidean distance between the two samples; m is the dimension of the low-dimensional feature vector X''; x''_{zs} represents the s-th feature of the sample vector X_z''; and x''_{is} represents the s-th feature of the sample vector X_i'';
then searching a corresponding neighbor sample set in the training set through a KNN algorithm, and classifying the samples by using labels of the neighbor sample set; if the classification effect is excellent, namely the classification accuracy reaches 85% or more, the generated network model and decision tree are reserved; otherwise, the parameters of the self-encoder neural network are adjusted to achieve better classification effect.
The search of the nearest neighbor field of the training sample is as follows:
Z = (z_1, z_2, \ldots, z_n)
Z'' = (z''_1, z''_2, \ldots, z''_m)
D(Z'', X_i'') = \sqrt{\sum_{s=1}^{m} (z''_s - x''_{is})^2}
N_{K_z} = \{X_{K_1}, X_{K_2}, \ldots, X_{K_z} \mid D(Z'', X_i'') < \alpha, K_z = 1, \ldots, 10\}
P_z = \arg\max_{i \in \{1, \ldots, C\}} p_i
wherein Z is a training sample; z_i is the i-th feature of Z; Z'' is the low-dimensional feature vector extracted by the encoder; z''_i is the i-th feature of the low-dimensional feature vector Z''; D(Z'', X_i'') is the Euclidean distance metric; m is the dimension of the low-dimensional feature vectors X'' and Z''; K_z is the optimal nearest neighbor value corresponding to the training sample, output by the decision tree; N_{K_z} is the nearest neighbor sample set of the training sample; α is the distance parameter; X_{K_i} is the i-th neighbor sample in N_{K_z}; p_i is the probability of occurrence of the corresponding label in the nearest neighbor sample set N_{K_z}; P_z is the predicted label of the training sample Z; and C is the number of label categories of the samples.
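To make the final classification step concrete, the sketch below takes a new encoded sample, its neighbor value K_z from the decision tree, and the encoded training set with class labels, and returns the majority class among the K_z nearest training samples within distance α. The fallback to the single closest sample when no neighbor satisfies the distance constraint is an assumption not stated in the text.

import numpy as np
from collections import Counter

def predict_label(z_new, Z_train, y_train, k_z, alpha):
    d = np.sqrt(((Z_train - z_new) ** 2).sum(axis=1))      # D(Z'', X_i'') for all training samples
    order = np.argsort(d)
    neighbors = [i for i in order[:k_z] if d[i] < alpha]   # nearest neighbor field
    if not neighbors:
        neighbors = [int(order[0])]                        # assumed fallback
    votes = Counter(y_train[i] for i in neighbors)
    return votes.most_common(1)[0][0]                      # category with the largest count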
The invention discloses a system of an image classification method based on a self-encoder and a decision tree, which comprises an image input conversion module, a training module, a feature extraction module, a nearest neighbor module, a decision tree module and a classification module.
The image input conversion module is used for acquiring image sample data, acquiring original RGB image data and acquiring a matrix/vector of pixel information of an image sample.
The training module inputs the collected image data into the self-encoder, and solves the self-encoder network weight parameters through a back propagation algorithm by using a feedforward neural network.
The feature extraction module learns the characterization information of the image sample by using a self-encoder network model and extracts typical feature information of the image sample based on an encoder.
And the nearest neighbor module calculates the distance between the image samples based on the output result of the corresponding encoder in the iterative process of solving the optimal weight parameter of the network, and simultaneously updates the nearest neighbor value corresponding to each image sample under the constraint of the nearest neighbor distance.
The decision tree module extracts compression characteristic information of samples based on the trained encoder in the self-encoder network model, and combines the optimal neighbor number of each sample as a label to construct a decision tree based on the CART method.
The classification module compresses sample characteristic information by using a self-encoder and inputs the sample characteristic information into a decision tree to obtain the corresponding optimal nearest neighbor numerical value, searches the corresponding nearest neighbor field of the sample, and takes the category with the largest number in the nearest neighbor field in the KNN as a prediction result.
In the system of the image classification method based on the self-encoder and the decision tree, firstly, an image input conversion module is utilized to process data, then a training module is adopted to train the data, a feature extraction module is generated based on the training module to extract data features, the data features are input into the decision tree module to obtain nearest neighbor values, the nearest neighbor module is used for searching the nearest neighbor field, and finally, a classification module is adopted to output a prediction result.
The invention has the beneficial effects that the self-encoder network is utilized to process the image sample, compress the characteristics of the sample and extract the low-dimensional characteristics and structure of the sample as far as possible; during training of the self-encoder network, continuously searching for the neighbor number of the sample meeting the given distance according to the distance between the extracted low-dimensional feature vectors; and constructing a decision tree by using the nearest neighbor value of the sample and the extracted low-dimensional sample characteristics, acquiring the nearest neighbor value of a new image sample of the unknown label by using the decision tree, and judging the category of the image by adopting a nearest neighbor algorithm.
The method provided by the invention can obtain the low-dimensional feature information of the target, predict the category of a sample, and provide interpretability of the prediction result.
Drawings
FIG. 1 is a flow chart of a method for classifying images from an encoder and decision tree according to the present invention;
FIG. 2 is a schematic diagram of decision tree generation.
Detailed Description
As shown in fig. 1, the image classification method based on the self-encoder and the decision tree of the present embodiment includes:
s101, acquiring data, acquiring original RGB image data, and converting an image sample into a pixel information matrix/vector.
S102, inputting the collected image data into the self-encoder, performing characterization learning on the image samples through the encoder and the decoder by using a feedforward neural network, and extracting low-dimensional structural feature information of the image samples by utilizing the encoder part.
The loss function of the corresponding self-encoder network under unconstrained conditions is:
X = (x_1, x_2, \ldots, x_n)
\hat{X} = (\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_n)
J_{ave}(W, b) = \frac{1}{n} \sum_{i=1}^{n} (\hat{x}_i - x_i)^2
wherein X is the input sample vector, x_i is the i-th feature of the input sample vector X, n is the dimension of the input vector (for an image sample of size 28 pixels × 28 pixels, the dimension of the corresponding input sample vector is n = 784), \hat{X} is the output reconstructed sample vector, \hat{x}_i is the i-th feature of the reconstructed sample vector \hat{X}, J_{ave}(W, b) is the unconstrained loss function of the self-encoder network measuring the average difference between the reconstructed sample \hat{X} and the original sample X, and W, b are the weights and biases of the self-encoder network, respectively.
The loss function of the self-encoding network after adding the sparse constraint to the hidden layer output of the self-encoder network is:
\hat{\rho}_i = \frac{1}{n} \sum_{j=1}^{n} a_i(x_j)
KL(\rho \| \hat{\rho}_i) = \rho \log \frac{\rho}{\hat{\rho}_i} + (1 - \rho) \log \frac{1 - \rho}{1 - \hat{\rho}_i}
J_{sparse}(W, b) = J_{ave}(W, b) + \gamma \sum_{i=1}^{h} KL(\rho \| \hat{\rho}_i)
wherein \hat{\rho}_i is the average activation of the i-th hidden-layer neuron in the self-encoder network, n is the dimension of the input vector, x_j is the j-th feature of the input sample vector, a_i(x_j) is the activation value of the i-th neuron for input x_j, KL(\rho \| \hat{\rho}_i) is the relative entropy, h is the number of neurons of the hidden layer, ρ is the sparsity parameter, γ is the KL-divergence constraint parameter, and J_{sparse}(W, b) is the sparsity-constrained loss function of the self-encoder network;
adding the similarity constraint to the sparse self-encoding neural network gives:
J_{re}(W, b) = J_{sparse}(W, b) + \mu \, \Omega(\hat{X})
wherein J_{re}(W, b) is the self-encoder network loss function incorporating the sparsity constraint and the similarity constraint, μ is the similarity parameter, n is the dimension of the input vector, \hat{x}_i is the i-th feature of the reconstructed sample vector, and \Omega(\hat{X}) denotes the similarity-constraint term computed from the reconstructed features \hat{x}_i, which acts as a limitation to increase the difference between samples as much as possible.
S103, in the iterative process of solving the optimal weight parameters of the network, calculating the distance between the image samples based on the low-dimensional sample vector obtained by the encoder, and updating the nearest neighbor value corresponding to each image sample under the constraint of the nearest neighbor distance;
the corresponding distance calculation formula is expressed as:
X_i' = h_1(X_i) = \sigma_1(W_1 X_i + b_1)
X_i'' = h_2(X_i') = \sigma_2(W_2 X_i' + b_2)
D(X_i'', X_j'') = \sqrt{\sum_{s=1}^{m} (x''_{is} - x''_{js})^2}
wherein W_1 and W_2 are the weights of the first hidden layer h_1 and the second hidden layer h_2 of the encoder network, respectively, b_1 and b_2 are the corresponding bias terms, σ_1 and σ_2 are the output functions of the hidden layers h_1 and h_2, X_i' denotes the vector of the i-th input sample X_i after passing through the hidden layer h_1, X_i'' denotes the vector of X_i' after passing through the hidden layer h_2, D(X_i'', X_j'') is the Euclidean distance between the extracted low-dimensional sample vectors X_i'' and X_j'', m is the dimension of the low-dimensional feature vector X'', x''_{is} is the s-th feature of the sample vector X_i'', and x''_{js} is the s-th feature of the sample vector X_j''. The distance parameter α between samples determines whether another sample can be a nearest neighbor of a given sample: when its distance to the sample is greater than α, i.e. D(X_i'', X_j'') > α, it cannot be a neighbor of that sample. Based on the distance parameter, the nearest neighbor value of the i-th sample is the number of samples whose distance to it is less than α.
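A small sketch of the two-hidden-layer encoding defined by the formulas above, mapping a 784-dimensional input to the 20-dimensional vector X''; the sigmoid choice for the output functions σ_1 and σ_2 is an assumption.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def encode(X, W1, b1, W2, b2):
    # X: (num_samples, 784); W1: (196, 784); W2: (20, 196).
    X1 = sigmoid(X @ W1.T + b1)    # X' = h1(X) = sigma1(W1 X + b1), 196-dimensional
    X2 = sigmoid(X1 @ W2.T + b2)   # X'' = h2(X') = sigma2(W2 X' + b2), 20-dimensional
    return X2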
S104, extracting low-dimensional characteristic information of a sample based on an encoder in the trained self-encoder network model, combining a sample nearest neighbor numerical value obtained through iteration as a corresponding sample label, and constructing a decision tree model by using a CART method;
the corresponding calculation method of the base index is expressed as follows;
Figure BDA0004064740080000101
Figure BDA0004064740080000102
wherein Gini (B) represents the purity, v, of sample set B in the decision tree i Represents the proportion of samples of class i (i=1,., C) in sample set B, gini (B, q) represents the base index of attribute q, T represents attribute q= { q 1 ,q 2 ,...,q T Number of values of B t Indicating that all values at the t-th branch node are q t Is a sample set of the sample set.
The corresponding model fine-tuning includes:
for a test sample X_z, a new validation sample vector X_z'' is first generated by the generated self-encoding neural network, and the corresponding neighbor value K_z is obtained through the constructed CART decision tree;
calculating the distance between samples:
D(X_z'', X_i'') = \sqrt{\sum_{s=1}^{m} (x''_{zs} - x''_{is})^2}
wherein D(X_z'', X_i'') represents the Euclidean distance between the two samples, m is the dimension of the low-dimensional feature vector X'', x''_{zs} represents the s-th feature of the sample vector X_z'', and x''_{is} is the s-th feature of the sample vector X_i''.
Searching a corresponding neighbor sample set in the training set through a KNN algorithm, classifying the samples by using labels of the neighbor sample set, and if the classification effect is excellent, namely the classification accuracy reaches 85% or more, reserving the generated network model and decision tree; otherwise, the parameters of the self-encoder neural network are adjusted to achieve better classification effect.
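The acceptance check in S104 can be sketched as follows: the trained self-encoder and decision tree are kept if the KNN classification accuracy on the validation samples reaches the 85% threshold stated above, and otherwise the network parameters (for example ρ, γ, μ and α) would be adjusted and training repeated. How "classification effect" is measured as plain accuracy is an assumption of this sketch.

import numpy as np

def keep_model(y_true, y_pred, threshold=0.85):
    # y_true: ground-truth labels of the validation samples;
    # y_pred: labels predicted via the KNN vote over the neighbor sample set.
    accuracy = np.mean(np.asarray(y_true) == np.asarray(y_pred))
    return accuracy >= threshold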
S105, obtaining low-dimensional characteristic information of a new sample by using a self-encoder, inputting a decision tree to obtain nearest neighbor values, searching for corresponding nearest neighbors by KNN, and taking the category with the largest number in the nearest neighbor field as a prediction result;
the nearest neighbor field of the corresponding search training sample is expressed as:
Z = (z_1, z_2, \ldots, z_n)
Z'' = (z''_1, z''_2, \ldots, z''_m)
D(Z'', X_i'') = \sqrt{\sum_{s=1}^{m} (z''_s - x''_{is})^2}
N_{K_z} = \{X_{K_1}, X_{K_2}, \ldots, X_{K_z} \mid D(Z'', X_i'') < \alpha, K_z = 1, \ldots, 10\}
P_z = \arg\max_{i \in \{1, \ldots, C\}} p_i
wherein Z is a training sample, z_i is the i-th feature of Z, Z'' is the low-dimensional feature vector extracted by the encoder, z''_i is the i-th feature of the low-dimensional feature vector Z'', D(Z'', X_i'') is the Euclidean distance metric, m is the dimension of the low-dimensional feature vectors X'' and Z'', K_z is the optimal nearest neighbor value corresponding to the training sample output by the decision tree, N_{K_z} is the nearest neighbor sample set of the training sample, α is the distance parameter, X_{K_i} is the i-th neighbor sample in N_{K_z}, p_i is the probability of occurrence of the corresponding label in the nearest neighbor sample set N_{K_z}, P_z is the predicted label of the training sample Z, and C is the number of label categories of the samples.
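Putting S101 through S105 together, a hypothetical end-to-end prediction for one new image might look like the lines below, reusing the helpers from the earlier sketches (encode, the fitted cart tree, predict_label) together with trained weights W1, b1, W2, b2 and the encoded training set Z_train with labels y_train; all of these names, the grayscale flattening, and the normalization are illustrative assumptions rather than the patent's exact procedure.

import numpy as np

x = new_image.reshape(1, -1).astype(np.float64) / 255.0       # S101: image -> pixel vector
z_new = encode(x, W1, b1, W2, b2)[0]                          # S102: low-dimensional features
k_z = int(cart.predict(z_new[None, :])[0])                    # S104: neighbor value from the tree
label = predict_label(z_new, Z_train, y_train, k_z, alpha)    # S105: vote in the nearest neighbor field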
The embodiment of the system for realizing the image classification method based on the self-encoder and the decision tree comprises an image input conversion module, a training module, a feature extraction module, a nearest neighbor module, a decision tree module and a classification module; the image transfer-in conversion module is connected with the training module, the training module is connected with the feature extraction module, the feature extraction module is connected with the decision tree module, the decision tree module is connected with the nearest neighbor module, and the nearest neighbor module is connected with the classification module.
The image input conversion module of the system is used for collecting image sample data, acquiring original RGB image data, and obtaining the pixel information matrix/vector of each image sample.
The training module of the system inputs the collected image data into the self-encoder, and solves the self-encoder network weight parameters through a back propagation algorithm by using a feedforward neural network.
The characteristic extraction module of the system learns the characteristic information of the image sample by utilizing a self-encoder network model and extracts the typical characteristic information of the image sample based on an encoder.
In the iterative process of solving the optimal weight parameter of the network, the nearest neighbor module of the system calculates the distance between the image samples based on the output result of the corresponding encoder, and updates the nearest neighbor value corresponding to each image sample under the constraint of the nearest neighbor distance.
The decision tree module of the system extracts compression characteristic information of samples based on the trained encoder in the self-encoder network model, combines the optimal neighbor number of each sample as a label to construct a decision tree, and constructs based on a CART method.
The classification module of the system compresses sample characteristic information by utilizing a self-encoder and inputs the sample characteristic information into a decision tree to obtain the corresponding optimal nearest neighbor numerical value, searches the corresponding nearest neighbor field of the sample, and takes the category with the largest number in the nearest neighbor field in the KNN as a prediction result.
In the embodiment, the self-encoder network is utilized to process the image samples, compress the characteristics of the samples, and extract the low-dimensional characteristics and structures of the samples as far as possible; during training of the self-encoder network, continuously searching for the neighbor number of the sample meeting the given distance according to the distance between the extracted low-dimensional feature vectors; and constructing a decision tree by using the nearest neighbor value of the sample and the extracted low-dimensional sample characteristics, acquiring the nearest neighbor value of a new image sample of the unknown label by using the decision tree, and judging the category of the image by adopting a nearest neighbor algorithm.
The technical principles of the present invention have been described above in connection with specific embodiments, which are provided for the purpose of explaining the principles of the present invention and are not to be construed as limiting the scope of the present invention in any way. Other embodiments of the invention will be apparent to those skilled in the art from consideration of this specification without undue burden.

Claims (7)

1. A method of classifying images based on a self-encoder and a decision tree, the method comprising the steps of:
(1) Collecting data, acquiring original RGB image data, and converting an image sample into a pixel information matrix/vector;
(2) Inputting the collected image data into a self-encoder, performing self-encoder characterization learning on the image samples through the encoder and the decoder by using a feedforward neural network, and extracting low-dimensional structural feature information of the image samples by using an encoder part; constructing a network according to the sample image data, adding sparsity constraint and correlation constraint to a loss function of the network, and updating weight parameters of the self-coding network in an iterative mode;
(3) In the iterative process of solving the optimal weight parameters of the network, calculating the distance between the image samples based on the low-dimensional sample vector obtained by the encoder, and updating the nearest neighbor value corresponding to each image sample under the constraint of the nearest neighbor distance;
(4) Based on the trained encoder in the self-encoder network model, extracting low-dimensional characteristic information of a sample, combining the sample nearest neighbor value obtained through iteration as a corresponding sample label, constructing a decision tree with the nearest neighbor value as a leaf node by using a CART method, and simultaneously adjusting parameters of the self-encoder network;
(5) And obtaining low-dimensional characteristic information of the new sample by using the self-encoder, inputting a decision tree to obtain nearest neighbor values, searching the nearest neighbor field of the training sample by KNN, and taking the category with the largest number in the nearest neighbor field as a prediction result.
2. The method of image classification based on a self-encoder and decision tree according to claim 1, wherein the self-encoder characterization learning step comprises:
according to sample image data, constructing a network, adding sparsity constraint and correlation constraint to a loss function of the network, wherein the loss function of the self-coding network under the unconstrained condition is as follows:
X = (x_1, x_2, \ldots, x_n)
\hat{X} = (\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_n)
J_{ave}(W, b) = \frac{1}{n} \sum_{i=1}^{n} (\hat{x}_i - x_i)^2
wherein X is the input sample vector; x_i is the i-th feature of the input sample vector X; n is the dimension of the input vector (for an image sample of size 28 pixels × 28 pixels, the dimension of the corresponding input sample vector is n = 784); \hat{X} is the reconstructed sample vector output by the network; \hat{x}_i is the i-th feature of the reconstructed sample vector \hat{X}; J_{ave}(W, b) is the unconstrained loss function of the self-encoder network, which measures the average difference between the reconstructed sample \hat{X} and the original sample X; and W, b are the weights and biases of the self-encoder network, respectively;
the loss function of the self-encoding network after adding a sparse constraint to the hidden layer output of the self-encoder network is:
Figure FDA0004064740050000027
Figure FDA0004064740050000028
Figure FDA0004064740050000029
wherein ,
Figure FDA00040647400500000210
is the average activation of hidden layer neurons in the self-encoder network; n is the dimension of the input vector; x is x j A j-th feature that is an input sample vector; a, a i (x j ) Is the ith neuron at input x j A lower activation value; />
Figure FDA00040647400500000211
Is the relative entropy, which is a penalty factor for measuring the difference between two distributions; h is the number of neurons of the hidden layer; ρ is a sparsity parameter; gamma is a KL divergence constraint parameter; j (J) sparse (W, b) is a sparse loss function from the encoder network; />
Figure FDA00040647400500000212
Mean activation of neurons for the hidden layer; />
The loss function of the self-encoder network after adding the similarity constraint to the sparse self-encoding neural network is:
J_{re}(W, b) = J_{sparse}(W, b) + \mu \, \Omega(\hat{X})
wherein J_{re}(W, b) is the self-encoder network loss function incorporating the sparsity constraint and the similarity constraint; μ is the similarity parameter; n is the dimension of the input vector; \hat{x}_i is the i-th feature of the reconstructed sample vector; and \Omega(\hat{X}) denotes the similarity-constraint term computed from the reconstructed features \hat{x}_i, which acts as a limitation to increase the difference between samples as much as possible;
and the weight parameters of the self-encoding network are updated iteratively, wherein the iteration of the self-encoding neural network uses the quasi-Newton method L-BFGS and the maximum number of iterations is set to 300.
3. The method of image classification based on a self-encoder and decision tree according to claim 1, wherein said calculating the distance between the image samples comprises:
(1) The nearest neighbor value of each sample is updated at every iteration of the self-encoder network; the distance parameter between samples is preset before training, and the maximum and minimum nearest neighbor values are limited; under this limitation, the minimum nearest neighbor value of a sample is 1 and the maximum nearest neighbor value is 10, that is, if the nearest neighbor value of a sample is 0 it is corrected to 1, and if it exceeds 10 it is corrected to 10;
in the iterative calculation process, a new low-dimensional sample vector can be obtained from the updated weights W and bias terms b; the distance between the compressed feature vectors of two samples after passing through the self-encoding neural network with two hidden layers is calculated as follows:
X_i' = h_1(X_i) = \sigma_1(W_1 X_i + b_1)
X_i'' = h_2(X_i') = \sigma_2(W_2 X_i' + b_2)
D(X_i'', X_j'') = \sqrt{\sum_{s=1}^{m} (x''_{is} - x''_{js})^2}
wherein W_1 and W_2 are the weights of the first hidden layer h_1 and the second hidden layer h_2 of the encoder network, respectively; b_1 and b_2 are the corresponding bias terms; σ_1 and σ_2 are the output functions of the hidden layers h_1 and h_2; X_i' denotes the vector of the i-th input sample X_i after passing through the hidden layer h_1; X_i'' denotes the vector of X_i' after passing through the hidden layer h_2; D(X_i'', X_j'') is the Euclidean distance between the extracted low-dimensional sample vectors X_i'' and X_j''; m is the dimension of the low-dimensional feature vector X''; x''_{is} is the s-th feature of the sample vector X_i''; and x''_{js} is the s-th feature of the sample vector X_j''; the distance parameter α between samples determines whether another sample can be a nearest neighbor of a given sample: when its distance to the sample is greater than α, i.e. D(X_i'', X_j'') > α, it cannot be a neighbor of that sample; based on the distance parameter, the nearest neighbor value of the i-th sample is the number of samples whose distance to it is less than α;
(2) The self-encoding neural network is provided with two hidden layers; for 784×1 sample data, the dimension of the first hidden layer d_1 is set to 196×1 and the dimension of the second hidden layer d_2 is set to 20×1;
(3) The parameters of the network are initialized: the initial value of the self-encoder network weight parameters is set to 0, the sparsity parameter ρ is set to 0.05, the coefficient γ of the sparsity penalty factor is set to 0.5, and the similarity constraint parameter μ is set to 3×10^{-3}.
4. The method for classifying images based on a self-encoder and decision tree according to claim 1, wherein the construction of the decision tree with nearest neighbor values as leaf nodes is as follows:
taking the final output X' after iteration of the self-encoder as a sample vector for constructing a decision tree, and taking the nearest neighbor number of each sample obtained by iteration as a new label of the sample;
the decision tree is generated by adopting a CART algorithm, and a full binary tree is generated, and the method for calculating the base Ny index is as follows;
Figure FDA0004064740050000041
/>
Figure FDA0004064740050000042
wherein Gini (B) represents the purity of sample set B in the decision tree; v i Represents the i (i=i) th in sample set BThe proportion of samples of class C); gini (B, q) represents the base index of attribute q; t represents the attribute q= { q 1 ,q 2 ,...,q T Number of values; b (B) t Indicating that all values at the t-th branch node are q t Is a sample set of the sample set.
5. The method of image classification based on a self-encoder and decision tree according to claim 1, wherein said adjusting parameters of the self-encoder network comprises:
for a test sample X_z, first generating a new verification sample vector X_z'' by the generated self-encoding neural network, and obtaining the corresponding neighbor value K_z through the constructed CART decision tree;
calculating the distance between samples:
D(X_z'', X_i'') = \sqrt{\sum_{s=1}^{m} (x''_{zs} - x''_{is})^2}
wherein D(X_z'', X_i'') represents the Euclidean distance between the two samples; m is the dimension of the low-dimensional feature vector X''; x''_{zs} represents the s-th feature of the sample vector X_z''; and x''_{is} represents the s-th feature of the sample vector X_i'';
then searching a corresponding neighbor sample set in the training set through a KNN algorithm, and classifying the samples by using labels of the neighbor sample set; if the classification effect is excellent, namely the classification accuracy reaches 85% or more, the generated network model and decision tree are reserved.
6. The image classification method based on a self-encoder and decision tree according to claim 1, wherein the search of the nearest neighbor field of the training samples is as follows:
Z = (z_1, z_2, \ldots, z_n)
Z'' = (z''_1, z''_2, \ldots, z''_m)
D(Z'', X_i'') = \sqrt{\sum_{s=1}^{m} (z''_s - x''_{is})^2}
N_{K_z} = \{X_{K_1}, X_{K_2}, \ldots, X_{K_z} \mid D(Z'', X_i'') < \alpha, K_z = 1, \ldots, 10\}
P_z = \arg\max_{i \in \{1, \ldots, C\}} p_i
wherein Z is a training sample; z_i is the i-th feature of Z; Z'' is the low-dimensional feature vector extracted by the encoder; z''_i is the i-th feature of the low-dimensional feature vector Z''; D(Z'', X_i'') is the Euclidean distance metric; m is the dimension of the low-dimensional feature vectors X'' and Z''; K_z is the optimal nearest neighbor value corresponding to the training sample, output by the decision tree; N_{K_z} is the nearest neighbor sample set of the training sample; α is the distance parameter; X_{K_i} is the i-th neighbor sample in N_{K_z}; p_i is the probability of occurrence of the corresponding label in the nearest neighbor sample set N_{K_z}; P_z is the predicted label of the training sample Z; and C is the number of label categories of the samples.
7. A system for implementing a self-encoder and decision tree based image classification method according to any of claims 1-6, said system comprising an image input conversion module, a training module, a feature extraction module, a nearest neighbor module, a decision tree module and a classification module:
the image input conversion module is used for acquiring image sample data, acquiring original RGB image data and acquiring a matrix/vector of pixel information of an image sample;
the training module inputs the collected image data into the self-encoder, and solves the weight parameters of the self-encoder by using a feedforward neural network through a back propagation algorithm;
the characteristic extraction module is used for learning the characteristic information of the image sample by utilizing a self-encoder network model and extracting the typical characteristic information of the image sample based on an encoder;
the nearest neighbor module calculates the distance between the image samples based on the output result of the corresponding encoder in the iterative process of solving the optimal weight parameter of the network, and updates the nearest neighbor value corresponding to each image sample under the constraint of the nearest neighbor distance;
the decision tree module extracts compression characteristic information of samples based on the trained encoder in the self-encoder network model, and constructs a decision tree based on a CART method by combining the optimal neighbor number of each sample as a label;
the classification module compresses sample characteristic information by using a self-encoder and inputs the sample characteristic information into a decision tree to obtain the corresponding optimal nearest neighbor numerical value, searches the corresponding nearest neighbor field of the sample, and takes the category with the largest number in the nearest neighbor field in the KNN as a prediction result.
In the system of the image classification method based on the self-encoder and the decision tree, firstly, an image input conversion module is utilized to process data, then a training module is adopted to train the data, a feature extraction module is generated based on the training module to extract data features, the data features are input into the decision tree module to obtain nearest neighbor values, the nearest neighbor module is used for searching the nearest neighbor field, and finally, a classification module is adopted to output a prediction result.
CN202310070830.0A 2023-02-07 2023-02-07 Image classification method and system based on self-encoder and decision tree Pending CN116246102A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310070830.0A CN116246102A (en) 2023-02-07 2023-02-07 Image classification method and system based on self-encoder and decision tree

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310070830.0A CN116246102A (en) 2023-02-07 2023-02-07 Image classification method and system based on self-encoder and decision tree

Publications (1)

Publication Number Publication Date
CN116246102A true CN116246102A (en) 2023-06-09

Family

ID=86634328

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310070830.0A Pending CN116246102A (en) 2023-02-07 2023-02-07 Image classification method and system based on self-encoder and decision tree

Country Status (1)

Country Link
CN (1) CN116246102A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116939210A (en) * 2023-09-13 2023-10-24 瀚博半导体(上海)有限公司 Image compression method and device based on self-encoder
CN116939210B (en) * 2023-09-13 2023-11-17 瀚博半导体(上海)有限公司 Image compression method and device based on self-encoder
CN117454277A (en) * 2023-10-11 2024-01-26 深圳励剑智能科技有限公司 Data management method, system and medium based on artificial intelligence
CN117454277B (en) * 2023-10-11 2024-06-25 深圳励剑智能科技有限公司 Data management method, system and medium based on artificial intelligence

Similar Documents

Publication Publication Date Title
Yang et al. A survey of DNN methods for blind image quality assessment
CN109063565B (en) Low-resolution face recognition method and device
CN105138973B (en) The method and apparatus of face authentication
CN116246102A (en) Image classification method and system based on self-encoder and decision tree
CN113221641B (en) Video pedestrian re-identification method based on generation of antagonism network and attention mechanism
CN112765352A (en) Graph convolution neural network text classification method based on self-attention mechanism
CN110188827A (en) A kind of scene recognition method based on convolutional neural networks and recurrence autocoder model
CN110942091A (en) Semi-supervised few-sample image classification method for searching reliable abnormal data center
CN113627266A (en) Video pedestrian re-identification method based on Transformer space-time modeling
CN111967358B (en) Neural network gait recognition method based on attention mechanism
CN113222072A (en) Lung X-ray image classification method based on K-means clustering and GAN
Wang et al. Accelerated manifold embedding for multi-view semi-supervised classification
CN114255371A (en) Small sample image classification method based on component supervision network
CN113052017A (en) Unsupervised pedestrian re-identification method based on multi-granularity feature representation and domain adaptive learning
CN116110089A (en) Facial expression recognition method based on depth self-adaptive metric learning
CN114780767A (en) Large-scale image retrieval method and system based on deep convolutional neural network
CN113065520A (en) Multi-modal data-oriented remote sensing image classification method
CN109784244B (en) Low-resolution face accurate identification method for specified target
Yao A compressed deep convolutional neural networks for face recognition
CN115392474B (en) Local perception graph representation learning method based on iterative optimization
CN111461061A (en) Pedestrian re-identification method based on camera style adaptation
CN115049894A (en) Target re-identification method of global structure information embedded network based on graph learning
CN115664970A (en) Network abnormal point detection method based on hyperbolic space
CN113269235B (en) Assembly body change detection method and device based on unsupervised learning
CN114882007A (en) Image anomaly detection method based on memory network

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination