CN117076985A - Classification recognition method for iron ore LIBS data by twin network integrated with self-encoder
- Publication number: CN117076985A (application CN202311021346.5A)
- Authority: CN (China)
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F18/24 — Pattern recognition; Analysing; Classification techniques
- G01N21/718 — Laser microanalysis, i.e. with formation of sample plasma
- G06F18/213 — Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06F18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/217 — Validation; Performance evaluation; Active pattern learning techniques
- G06N3/0455 — Auto-encoder networks; Encoder-decoder networks
- G06N3/0464 — Convolutional networks [CNN, ConvNet]
- G06N3/048 — Activation functions
- G06N3/084 — Backpropagation, e.g. using gradient descent
Abstract
The invention discloses a classification and identification method for iron ore LIBS data using a twin network fused with a self-encoder, which effectively solves the problem that traditional algorithms achieve low prediction accuracy on small-sample iron ore classes. The method comprises the following steps: S1: collecting iron ore spectrum data as original training data with a laser-induced breakdown spectrometer; S2: performing self-encoder dimension reduction on the original training data, the reduced-dimension data serving as the input of the training network; S3: selecting two identical branch networks as the twin network, each branch network processing one input spectrum sample; the two branch networks remain synchronized during training by sharing parameters. The invention uses the self-encoder to reduce the data dimension, extract the important information, and cluster the features more tightly; the twin network improves accuracy on small-sample classes; and model training and prediction speed are improved.
Description
Technical Field
The invention relates to the technical field of classification and identification of iron ore LIBS (Laser-Induced Breakdown Spectroscopy) data, in particular to a classification and identification method for iron ore LIBS data using a twin network fused with a self-encoder.
Background
The existing methods for classifying iron ores include chemical methods, deep learning and machine learning.
Chemical methods: chemical analysis is a common way to determine the content and composition of each element in an ore; common techniques include titration, colorimetry and compound indicator methods. They determine the contents of the main components, impurities and useful elements, and the iron ore is classified accordingly.
Machine learning: machine learning approaches include KNN, SVM and similar models. Patent document CN103488874A discloses an improved method for classifying steel materials that combines a support vector machine with laser-induced breakdown spectroscopy: first, one-vs-rest classification is established by building a binary classifier for each type of iron ore; during prediction, if several classes are returned, one-vs-one models are built for those classes, and the process is iterated until a unique result is selected.
Deep learning: existing LIBS iron ore classification approaches mainly train a neural network on a large data sample, using PCA for dimension reduction and feature extraction. Typically, a 1D-CNN (one-dimensional convolutional neural network) is trained for classification, such as the method for identifying iron ore producing countries and brands provided in patent document CN111239103A.
Drawbacks/deficiencies of the prior art:
1. Iron ore LIBS data is high-dimensional: one sample of iron ore LIBS spectrum data has 14914 characteristic points. Existing deep learning approaches reduce the dimension of the iron ore spectrum data with PCA. PCA is a linear dimension-reduction method and can only capture linear relationships in the data; in high-dimensional spaces it may face the curse of dimensionality, i.e., PCA may not work well when the dimension is much larger than the number of samples.
2. The existing deep learning model is often trained into a relatively accurate model under the condition of sufficient sample size. In order to increase the accuracy of the model, a mode of deepening the layer number of the model is adopted. However, when training the model, as the depth of the model increases, the gradient vanishes or the network degenerates (network degenerates: in the process of increasing the parameters of the network layer in the neural network of deep learning, the accuracy of the training set tends to be saturated, the number of layers of the network is increased, and the accuracy of the training set is reduced, which is not caused by overfitting).
3. The training data is imbalanced: the sample sizes of some kinds of iron ore are much smaller than others, so model accuracy is biased toward the well-represented kinds and is not high enough on iron ore kinds with small sample sizes.
4. Current machine learning methods, such as the method of patent document CN103488874A combining a modified support vector machine with LIBS spectroscopy, are very time-consuming for some sample predictions: if the first one-vs-rest pass cannot screen out a single class but returns multiple classes, one-vs-one models must be built and the screening repeated. This consumes time and labor and cannot achieve the real-time speed required in industry.
Disclosure of Invention
The invention aims to provide a classification and identification method for iron ore LIBS data by a twin network fused with a self-encoder, so as to solve the problems in the prior art.
In order to achieve the above purpose, the present invention provides the following technical solutions: a classification and identification method of a twin network integrated with a self-encoder for iron ore LIBS data comprises the following steps: s1: collecting iron ore spectrum data as original training data through a laser-induced breakdown spectrometer; s2: performing self-encoder dimension reduction on the original training data, wherein the dimension-reduced data is used as the input of a training network; s3: selecting two identical branch networks as twin networks, wherein each branch network processes an input spectrum sample; the two branch networks remain synchronized during the training process by sharing parameters.
Further: step S2 uses a self-encoder to extract data features, and the data after self-encoding dimension reduction serves as the input layer of the subsequent deep learning classification network; the self-encoder contains 3 linear layers, each activated by a linear rectification (ReLU) function, and the back-propagation optimization function is the mean square error.
Further: the self-encoder is used for feature extraction and reconstruction of the data; it consists of two main components, an encoder and a decoder; the goal of the self-encoder is to achieve a self-characterization of the input data by learning a compressed representation together with a decoder that reconstructs the input from that representation; the training process of the self-encoder includes two phases:
forward propagation: the input data X is passed to the encoder, producing the feature representation h = f(W1·X + b1), where f is the encoder activation, W1 the forward-propagation weight matrix, and b1 the forward-propagation bias parameter;
back propagation: the output of the encoder is passed to the decoder, producing the reconstruction X_R = g(W2·h + b2), where g is the decoder activation, W2 the back-propagation weight matrix, and b2 the back-propagation bias parameter; the reconstruction error is calculated by comparing the decoder output X_R with the original input X, and the network's weights and offsets are then updated with the back-propagation algorithm to minimize it;
the objective function of the algorithm's optimization is: minimize dist(X, X_R).
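The three-linear-layer self-encoder and its mean-square-error objective can be sketched as follows. This is a minimal PyTorch illustration; the hidden widths and the small dimensions used in the check at the bottom are assumptions for illustration, since the patent fixes only the layer count, the ReLU activations, the MSE objective, and the 14914-to-7458 reduction.

```python
import torch
import torch.nn as nn

class SpectralAutoencoder(nn.Module):
    """Self-encoder with three ReLU-activated linear layers on each side;
    trained against the mean-square reconstruction error dist(X, X_R)."""
    def __init__(self, in_dim=14914, hidden=2048, code_dim=7458):
        super().__init__()
        # Encoder: 3 linear layers, each followed by a ReLU, ending at the code size.
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, code_dim), nn.ReLU(),
        )
        # Decoder mirrors the encoder to reconstruct the input spectrum.
        self.decoder = nn.Sequential(
            nn.Linear(code_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, in_dim),
        )

    def forward(self, x):
        code = self.encoder(x)      # h = f(W1·X + b1), applied layer by layer
        recon = self.decoder(code)  # X_R = g(W2·h + b2)
        return code, recon

# Tiny dimensions so the sketch runs quickly; real spectra go 1x14914 -> 1x7458.
model = SpectralAutoencoder(in_dim=32, hidden=16, code_dim=8)
x = torch.randn(4, 32)
code, recon = model(x)
loss = nn.functional.mse_loss(recon, x)  # objective: minimize dist(X, X_R)
```

Minimizing `loss` over the spectra by back-propagation realizes the objective minimize dist(X, X_R); `code` is the reduced-dimension input handed to the classification network.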
Further: step S2 also includes normalization preprocessing of the data.
further: the branch network is a residual network, the basic constituent units of the residual network are residual blocks, and the training and performance of the network are optimized through cross-layer connection and residual learning; inside each residual block, the input data is processed through two or more convolutional layers and an activation function; and then adding the input data with the residual connection to obtain the output of the residual block.
Further: each branch network comprises convolutional and pooling layers that extract features and learn a sample representation; the two branch networks share weights and produce two vectors, whose difference is measured by the Euclidean distance; finally, an S-type (sigmoid) activation function is applied.
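The shared-weight twin arrangement — two identical branches, Euclidean distance, then an S-type activation — can be sketched as follows; the small linear branch here is a stand-in assumption (the patent's actual branch is a residual network):

```python
import torch
import torch.nn as nn

class SiameseNet(nn.Module):
    """Both inputs pass through the SAME branch module, so the two branches
    share weights by construction; the two output vectors are compared by
    Euclidean distance, and an S-type (sigmoid) activation is applied."""
    def __init__(self, branch: nn.Module):
        super().__init__()
        self.branch = branch  # a single module => parameters shared across both inputs

    def forward(self, x1, x2):
        v1 = self.branch(x1)                    # vector for sample 1
        v2 = self.branch(x2)                    # vector for sample 2 (same weights)
        dist = torch.norm(v1 - v2, p=2, dim=1)  # Euclidean distance per pair
        return torch.sigmoid(dist)              # S-type activation

# Stand-in branch; the patent uses a 50-layer residual network here.
branch = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
net = SiameseNet(branch)
score = net(torch.randn(5, 8), torch.randn(5, 8))
```

Reusing one `branch` module for both inputs is what keeps the two branches synchronized during training: every gradient step updates the single shared set of parameters.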
Further: step S3 divides the data set into a training set, a verification set and a test set; for each sample, a "positive sample" and a "negative sample" of a different class are generated, where the positive sample is the original sample and the negative sample is selected from another class; pairs of input samples are created, each pair comprising a positive sample and a negative sample; a contrastive loss function is employed, comprising a margin (boundary) loss and an inter-sample distance loss:

L = (1/(2N)) Σ [ Y·Dw^2 + (1−Y)·max(m − Dw, 0)^2 ]

where N represents the number of samples, i.e. the number of sample pairs; Dw is the Euclidean distance between the two outputs of the twin network, Dw = ‖X1 − X2‖2; m is the margin; and Y is 1 or 0: Y = 1 when the two inputs are similar, otherwise Y = 0.
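The contrastive loss can be sketched in NumPy using its standard margin-based form, which matches the quantities defined above (N sample pairs, distance Dw, label Y); the margin value used below is an assumption:

```python
import numpy as np

def contrastive_loss(d_w, y, margin=1.0):
    """Contrastive loss over N pairs, in its standard margin-based form:
    L = (1/(2N)) * sum( Y * Dw^2 + (1 - Y) * max(margin - Dw, 0)^2 ).
    d_w: Euclidean distances between the twin-network outputs, shape (N,).
    y:   1 for similar pairs, 0 for dissimilar pairs, shape (N,)."""
    d_w = np.asarray(d_w, dtype=float)
    y = np.asarray(y, dtype=float)
    similar = y * d_w ** 2                                   # distance term: pull similar pairs together
    dissimilar = (1 - y) * np.maximum(margin - d_w, 0) ** 2  # boundary term: push dissimilar pairs past the margin
    return float((similar + dissimilar).sum() / (2 * len(d_w)))

# A similar pair at distance 0 and a dissimilar pair beyond the margin both cost nothing.
loss = contrastive_loss([0.0, 2.0], [1, 0], margin=1.0)
```

Only dissimilar pairs that fall inside the margin are penalized, which is what drives the embeddings of different iron ore classes apart.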
Further: the training process of step S3 is as follows: train the twin network on the training data to minimize the defined loss function; update the network's weights and biases with a gradient-descent optimization algorithm; monitor the model's performance on the validation set, and adjust the parameters or stop training early based on that performance.
Further: step S3 further comprises evaluating the trained model's performance on unseen data with the test set, as follows: the average spectrum of each iron ore class is selected as that class's standard sample, and the unknown sample to be tested is input into the twin network together with the average spectrum to obtain a prediction; classification accuracy and the confusion matrix are calculated to evaluate the model; a new spectrum sample is input into the deployed model and classified according to the model's output.
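The standard-sample evaluation scheme can be sketched as follows; the `similarity` function below is a hypothetical stand-in for the trained twin network:

```python
import numpy as np

def class_standards(spectra, labels):
    """Average the spectra of each iron ore class to get one standard sample per class."""
    labels = np.asarray(labels)
    return {c: spectra[labels == c].mean(axis=0) for c in np.unique(labels)}

def predict(unknown, standards, similarity):
    """Pair the unknown spectrum with each class's standard sample and
    keep the class whose pair scores highest through the twin network."""
    return max(standards, key=lambda c: similarity(unknown, standards[c]))

# Toy data: two well-separated classes of 16-point "spectra".
rng = np.random.default_rng(0)
spectra = np.vstack([rng.normal(0, 0.1, (5, 16)), rng.normal(5, 0.1, (5, 16))])
labels = [0] * 5 + [1] * 5
standards = class_standards(spectra, labels)
# Hypothetical similarity: negative Euclidean distance (a trained twin network goes here).
similarity = lambda a, b: -np.linalg.norm(a - b)
pred = predict(rng.normal(5, 0.1, 16), standards, similarity)
```

Only one forward pass per class is needed at prediction time, which is why this scheme avoids the repeated one-vs-one re-modeling criticized in the background section.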
Further: the residual network is a 50-layer network, and the ratio of training set, verification set and test set sizes is 8:1:1.
Compared with the prior art, the invention has the beneficial effects that:
1. The self-encoder reduces the data dimension, extracts the important information and makes the features cluster more tightly, which effectively shortens model training time and improves model accuracy.
2. With the twin network, accuracy on small-sample classes is improved (over the 35 selected samples, the overall accuracy is 96.06%, and the overall accuracy on the five small-sample classes reaches 95.3%).
3. Model training and prediction speed are improved.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention. In the drawings:
FIG. 1 is a block diagram of a self-encoder of the present invention;
FIG. 2 is a diagram of the effect of the self-encoder of the present invention;
FIG. 3 is a diagram of a residual network of the present invention;
FIG. 4 is a diagram of the twinning network of the present invention;
FIG. 5 is the loss plot of the training process of the present invention together with the accuracy curves over the training set and the verification set.
FIG. 6 is a confusion matrix plot of predicted results on 35 iron ores according to the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, are intended to fall within the scope of the present invention. Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention.
Referring to FIGS. 1-6, in an embodiment of the present invention, a method for classifying and identifying iron ore LIBS data with a twin network integrated with a self-encoder includes the following steps:
s1: collecting original training data: collecting spectrum data of the iron ore by a laser-induced breakdown spectrometer;
s2: performing self-encoder dimension reduction on the original data:
2.1, normalize the data;
2.2, pass the data through the self-encoder (3 linear layers, each followed by a ReLU activation) to reduce its dimension; the reduced-dimension data is used as the input of the training network;
s3: training of twin networks: the twin network has two branch networks; for extracting data features;
the twin neural network takes a residual error network as a branch network;
Preferably, the residual network is a 50-layer network; the 50-layer residual network gives the best results within 140 training iterations. If the network has too few layers, the LIBS data features cannot be fully extracted, more iteration rounds are needed, and time is wasted. If the residual network exceeds 50 layers, the result does not improve; instead, the parameters multiply as the layer count grows, a deeper network needs more training time and computing resources, and each iteration round takes longer.
Both branch networks use the selected residual network; the branches share weights and yield two vectors, the difference between which is computed as the Euclidean distance; an S-type (sigmoid) activation function is then applied.
Preferably, the self-encoder is used for extracting data features, and the data after self-encoding dimension reduction serves as the input layer of the subsequent deep learning classification network; the self-encoder contains 3 linear layers, each activated by a linear rectification (ReLU) function, and the back-propagation optimization function is the mean square error.
Preferably, the self-encoder is used for extracting and reconstructing the characteristics of the data; the system consists of two main components of an encoder and a decoder; the goal of the self-encoder is to achieve self-characterization of the input data by learning a compressed representation and a decoder that reconstructs the input data from the representation;
the training process of the self-encoder includes two phases:
forward propagation: the input data X is passed to the encoder, producing the feature representation h = f(W1·X + b1);
back propagation: the output of the encoder is passed to the decoder, producing the reconstruction X_R = g(W2·h + b2); the reconstruction error is calculated by comparing the decoder output with the original input, and the network's weights and offsets are then updated with the back-propagation algorithm to minimize it;
the objective function of the algorithm's optimization is: minimize dist(X, X_R).
Preferably, the residual network introduces the concept of the residual block, optimizing the network's training and performance through cross-layer connections and residual learning;
the residual block is the basic constituent unit of the residual network; information is transferred through the introduced cross-layer connection so that the network can learn a residual function; inside each residual block, the input data is processed through two or more convolutional layers and an activation function, and the input is then added through the residual connection to give the block's output.
Preferably, for the twin neural network, the architecture is designed as follows:
the twin network consists of two identical network branches, each branch processing one input spectrum sample;
- each branch comprises convolutional, pooling and fully-connected layers so as to extract features and learn a sample representation;
by sharing the parameters, the weights of the two branches are ensured to be kept synchronous in the training process.
Preferably, training data is prepared by dividing the data set into a training set, a verification set and a test set at a size ratio of 8:1:1. Some classes contribute very little LIBS data, and this ratio ensures that enough data participates in training while preventing over- or under-fitting. If the training set exceeds this proportion, the verification and test sets can no longer characterize the model's accuracy; if it falls below this proportion, the small-sample classes cannot be trained sufficiently, the model under-fits on them, and its accuracy on small-sample predictions drops. In FIG. 5, curve (1) on the training set and curve (2) on the verification set show the same trend, indicating that the model does not overfit.
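The 8:1:1 partition can be sketched as follows (a minimal illustration; the patent does not state whether the split is stratified per class, so this sketch uses a plain shuffled split):

```python
import numpy as np

def split_811(n_samples, seed=0):
    """Shuffle sample indices and partition them into training,
    verification and test sets at an 8:1:1 ratio."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_train = int(0.8 * n_samples)   # 8 parts for training
    n_val = int(0.1 * n_samples)     # 1 part for verification
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

train_idx, val_idx, test_idx = split_811(100)
```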
Generating a "positive sample" corresponding to each sample and a "negative sample" of a different class from the positive sample; positive samples are original samples, negative samples are samples selected from other classes;
creating pairs of input samples, each pair comprising a positive sample and a negative sample;
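Pair construction — one same-class "positive" and one other-class "negative" partner per sample — can be sketched as follows (the random pairing policy is an assumption for illustration):

```python
import random

def make_pairs(samples_by_class, seed=0):
    """For every sample, build one (sample, positive, 1) pair with a same-class
    partner and one (sample, negative, 0) pair with a partner from another class."""
    rng = random.Random(seed)
    pairs = []
    classes = list(samples_by_class)
    for c in classes:
        other_classes = [o for o in classes if o != c]
        for s in samples_by_class[c]:
            positive = rng.choice(samples_by_class[c])                          # same class
            negative = rng.choice(samples_by_class[rng.choice(other_classes)])  # other class
            pairs.append((s, positive, 1))
            pairs.append((s, negative, 0))
    return pairs

pairs = make_pairs({"A": [1, 2], "B": [10, 20]})
```

Each sample thus contributes one similar and one dissimilar pair, which balances the Y = 1 and Y = 0 cases seen by the contrastive loss even when a class has few samples.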
loss function definition: employing contrast ive loss loss functions, including boundary loss and inter-sample distance loss, using a twin network;
n represents the number of samples, i.e. the number of paired samples;
dw is the euclidean distance between the two outputs of the twin network: d (D) w =∣X 1 -X 2 | 2
Y is 1 or 0; y=1 when the two inputs are similar, otherwise 0;
training process:
training the twin network using the training data to minimize a defined loss function;
- updating the network's weights and biases using a gradient-descent optimization algorithm;
- monitoring the model's performance on the verification set, and adjusting the hyperparameters or stopping training early according to that performance;
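The three training steps above — minimizing the loss by gradient descent while monitoring the verification set for early stopping — can be sketched generically; the tiny stand-in network, SGD optimizer, and patience value are assumptions for illustration:

```python
import torch

class TinyTwin(torch.nn.Module):
    """Minimal stand-in for the twin network: one shared linear layer,
    returning the Euclidean distance between the two embeddings."""
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(4, 2)

    def forward(self, a, b):
        return torch.norm(self.fc(a) - self.fc(b), dim=-1)

def train(net, loss_fn, train_pairs, val_pairs, epochs=100, lr=1e-3, patience=10):
    """Minimize the pair loss by gradient descent; stop early when the
    verification loss has not improved for `patience` epochs."""
    opt = torch.optim.SGD(net.parameters(), lr=lr)
    best, stall = float("inf"), 0
    for _ in range(epochs):
        net.train()
        for x1, x2, y in train_pairs:
            opt.zero_grad()
            loss_fn(net(x1, x2), y).backward()  # back-propagate the pair loss
            opt.step()                          # gradient-descent weight update
        net.eval()
        with torch.no_grad():                   # monitor on the verification set
            val = sum(loss_fn(net(x1, x2), y).item() for x1, x2, y in val_pairs)
        if val < best - 1e-6:
            best, stall = val, 0                # improved: keep training
        else:
            stall += 1
            if stall >= patience:               # early stopping
                break
    return best

pairs = [(torch.randn(4), torch.randn(4), torch.tensor(0.0)) for _ in range(8)]
best = train(TinyTwin(), lambda d, y: ((d - y) ** 2).mean(), pairs[:6], pairs[6:], epochs=20)
```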
testing and evaluation:
evaluating the performance of the trained model on unseen data using the test set;
calculating indexes such as classification accuracy and confusion matrix to evaluate the effect of the model;
model deployment: in practical applications, the trained model is deployed into a suitable environment.
A new spectrum sample is input into the deployed model and classified according to the model's output.
The working principle of the invention is as follows: spectrum data of the iron ore is collected with a laser-induced breakdown spectrometer (35 kinds of iron ore in total).
The five categories 3, 4, 10, 13 and 21 contain only 7, 12, 7, 12 and 7 samples respectively, while each of the remaining 30 iron ores has more than 35 spectra.
S2, performing self-encoder dimension reduction on the original data.
The raw data for one iron ore spectrum has dimension 1x14914.
Through the self-encoder — 3 linear layers, each followed by a ReLU activation — the data dimension is reduced to 1x7458. The reduced-dimension data is used as the input of the training network.
The self-encoder (Autoencoder Network) is an unsupervised learning model that can better capture complex patterns and non-linear relationships in the data; it is used for feature extraction and reconstruction. It consists of two main components, an Encoder and a Decoder. The goal of the self-encoder is to achieve a self-characterization of the input data by learning a compressed representation (encoding) together with a decoder that reconstructs the input from that representation.
The working principle of the self-encoder is as follows:
encoder (Encoder): the encoder accepts the input data and converts it to a low-dimensional representation of the features. An encoder is typically made up of multiple hidden layers, each of which contains a set of neurons. Each neuron multiplies the input data with weights and generates an output of the hidden layer by an activation function. As the hidden layer increases, the encoder gradually decreases the data dimension, thereby learning the abstract features of the data.
Decoder (Decoder): the decoder accepts the feature representation generated by the encoder and decodes it into reconstructed data. The decoder is similar in structure to the encoder but operates in reverse: it expands the representation layer by layer to reconstruct data as close as possible to the original input. The last layer of the decoder typically uses an appropriate activation function to match the range and distribution of the original data.
The training process of the self-encoder includes two phases:
forward propagation (encoding): input data is passed to an encoder, producing a representation of the characteristic.
Back propagation (decoding and reconstruction): the output of the encoder is passed to the decoder, and the reconstruction error is calculated by comparing the output of the decoder with the original input. The weights and offsets of the network are then updated using a back propagation algorithm to minimize the reconstruction error.
The objective function of the algorithm's optimization is: minimize dist(X, X_R).
In the scheme of the invention, the self-encoder is used to extract data features, and the data after self-encoding dimension reduction serves as the input layer of the subsequent deep learning classification network. The self-encoder in this scheme contains 3 linear layers, each activated by a linear rectification function (ReLU), and the back-propagation optimization function is the mean square error. The original spectral dimension is reduced from 14914 to 7458; see FIG. 1 for the overall network framework.
The Residual Network (ResNet) is a deep convolutional neural network architecture designed to solve the problems of gradient vanishing and information loss in deep network training. It introduces the concept of the Residual Block to optimize the network's training and performance through cross-layer connections and residual learning.
In the training process of a traditional deep neural network, the gradient gradually becomes smaller during back propagation, making it difficult to update the parameters of earlier layers; this is called the gradient vanishing problem. In addition, increasing the number of network layers may cause information to be lost in the network, making it difficult to improve performance. To address these problems, ResNet proposed the concept of residual learning.
In the method of the invention, the branch network of the twin network is composed of a residual network.
The residual block is the basic building block of ResNet. It conveys information by introducing a cross-layer (skip) connection so that the network can learn a residual function. Inside each residual block, the input data is processed through two or more convolutional layers and an activation function; the input is then added to this result via the residual connection to obtain the output of the block. The cross-layer connection allows information to be transmitted directly from earlier layers to deeper layers, effectively relieving the gradient vanishing problem and improving the expressive capacity of the network; the network structure is shown in fig. 3.
In the residual network, a design called the "bottleneck structure" is also introduced. This structure uses 1x1, 1x3, and 1x1 convolution layers in each residual block to reduce the dimension of the feature map and the computational complexity. This design allows the residual network to use deeper structures without introducing excessive parameters and computational burden.
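The bottleneck residual block above can be sketched on a 1-D signal (appropriate for a spectrum) as follows; the kernel values are illustrative placeholders, and a plain single-channel convolution stands in for the learned convolution layers:

```python
import numpy as np

rng = np.random.default_rng(1)

def relu(x):
    return np.maximum(x, 0.0)

def conv1d(x, w):
    """'Same'-padded single-channel 1-D convolution (illustrative)."""
    pad = len(w) // 2
    return np.convolve(np.pad(x, pad), w, mode="valid")[: len(x)]

# Bottleneck kernels: 1x1 -> 1x3 -> 1x1, then the skip connection
# adds the input back; the weights below are placeholders.
w_a, w_b, w_c = np.array([0.5]), np.array([0.2, 0.6, 0.2]), np.array([0.5])

def residual_block(x):
    f = relu(conv1d(x, w_a))   # 1x1: reduce
    f = relu(conv1d(f, w_b))   # 1x3: transform
    f = conv1d(f, w_c)         # 1x1: restore
    return relu(f + x)         # F(x) + x: learn the residual

x = rng.random(16)
y = residual_block(x)
print(y.shape)
```

Because the skip path passes the input through unchanged, the gradient reaching earlier layers always contains an identity term, which is what relieves the vanishing-gradient problem.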
Finally, it should be noted that the foregoing description is only a preferred embodiment of the present invention, and the present invention is not limited thereto. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described therein or substitute equivalents for some of the technical features. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall be included in the protection scope of the present invention.
Claims (10)
1. A classification and identification method for iron ore LIBS data by a twin network integrated with a self-encoder, characterized by comprising the following steps:
s1: collecting iron ore spectrum data as original training data through a laser-induced breakdown spectrometer;
s2: performing self-encoder dimension reduction on the original training data, wherein the dimension-reduced data is used as the input of a training network;
s3: selecting two identical branch networks as twin networks, wherein each branch network processes an input spectrum sample; the two branch networks remain synchronized during the training process by sharing parameters.
2. The classification and identification method for iron ore LIBS data by a twin network integrated with a self-encoder according to claim 1, wherein: the step S2 uses a self-encoder to extract data features, and the data after self-encoding dimension reduction serves as the input layer of the subsequent deep learning classification network; the self-encoder contains 3 linear layers activated by 3 linear rectification functions (ReLU), and the back-propagation optimization function is the mean square error.
3. The classification and identification method for iron ore LIBS data by a twin network integrated with a self-encoder according to claim 2, wherein: the self-encoder is used for feature extraction and reconstruction of the data; it consists of two main components, an encoder and a decoder; the goal of the self-encoder is to learn a compressed representation of the input data together with a decoder that reconstructs the input from that representation, thereby achieving self-characterization of the input data;
the training process of the self-encoder includes two phases:
forward propagation: the input data is passed to the encoder, producing a feature representation: h = f(W_1·X + b_1),
wherein W_1 is the forward-propagation weight and b_1 is the forward-propagation bias parameter;
back propagation: the output of the encoder is passed to the decoder, which reconstructs the input as X_R = f(W_2·h + b_2); the reconstruction error is calculated by comparing the decoder output with the original input, and the weights and biases of the network are then updated using the back-propagation algorithm to minimize the reconstruction error,
wherein W_2 is the back-propagation weight and b_2 is the back-propagation bias parameter;
the objective function of the algorithm optimization is: minimize dist(X, X_R).
4. The classification and identification method for iron ore LIBS data by a twin network integrated with a self-encoder according to claim 1, wherein: the step S2 further includes performing the following normalization processing on the data:
5. the classification and identification method for iron ore LIBS data by a twin network integrated with a self-encoder according to claim 1, wherein: the branch network is a residual network, the basic constituent units of the residual network are residual blocks, and the training and performance of the network are optimized through cross-layer connection and residual learning;
inside each residual block, the input data is processed through two or more convolutional layers and an activation function; the input is then added to this result through the residual connection to obtain the output of the residual block.
6. The classification and identification method for iron ore LIBS data by a twin network integrated with a self-encoder according to claim 5, wherein: each branch network comprises convolution and pooling layers to extract features and learn sample representations; the two branch networks share weights to obtain two vectors, and the Euclidean distance is used to measure the difference between them; finally, an S-type (sigmoid) activation function is applied.
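Outside the claim language, the shared-weight comparison of claim 6 can be sketched as follows; a single random linear-plus-ReLU layer stands in for the convolutional branch, and the width 64 is a placeholder:

```python
import numpy as np

rng = np.random.default_rng(2)

# One shared weight matrix: both inputs pass through the SAME branch,
# which is what keeps the twin branches synchronised during training.
# Input size 1496 matches the self-encoded spectra; 64 is a placeholder.
W = rng.normal(0, 0.1, (1496, 64))

def branch(x):
    return np.maximum(x @ W, 0.0)                # shared branch (ReLU features)

def similarity(x1, x2):
    d = np.linalg.norm(branch(x1) - branch(x2))  # Euclidean distance
    return 1.0 / (1.0 + np.exp(-d))              # S-type (sigmoid) squashing

s = similarity(rng.random(1496), rng.random(1496))
print(round(s, 4))
```

Because the distance is non-negative, the sigmoid output lies in [0.5, 1); in practice the trained network maps small distances (similar ores) near 0.5 and large distances near 1, or the sign convention is flipped by the loss.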
7. The classification and identification method for iron ore LIBS data by the twin network integrated with the self-encoder according to claim 6, wherein: the step S3 divides the data set into a training set, a verification set and a test set;
generating a "positive sample" corresponding to each sample and a "negative sample" of a different class from the positive sample; positive samples are original samples, negative samples are samples selected from other classes;
creating pairs of input samples, each pair comprising a positive sample and a negative sample;
adopting a contrastive loss function, including boundary (margin) loss and sample interval loss: L = (1/2N)·Σ[Y·D_W² + (1−Y)·max(margin − D_W, 0)²], wherein:
N represents the number of samples, i.e. the number of sample pairs;
D_W is the Euclidean distance between the two outputs of the twin network: D_W = ‖X_1 − X_2‖_2;
Y takes the value 1 or 0: Y = 1 when the two inputs are similar, and 0 otherwise.
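A minimal sketch of the standard contrastive loss with these definitions (the distances, labels and margin below are toy values, not data from the patent):

```python
import numpy as np

def contrastive_loss(d, y, margin=1.0):
    """y = 1 for similar pairs, 0 for dissimilar; d = pairwise distance D_W."""
    similar = y * d ** 2                                     # pull similar pairs together
    dissimilar = (1 - y) * np.maximum(margin - d, 0.0) ** 2  # push others past the margin
    return float(np.mean(similar + dissimilar) / 2.0)

d = np.array([0.1, 0.9, 1.5])   # distances between branch outputs
y = np.array([1, 0, 0])         # pair labels
loss = contrastive_loss(d, y)
print(loss)
```

Note the two terms: similar pairs are penalised by their squared distance (the "sample interval" part), while dissimilar pairs are penalised only when they fall inside the margin (the "boundary" part); a dissimilar pair already beyond the margin, like the third one above, contributes zero.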
8. The classification and identification method for iron ore LIBS data by the twin network integrated with the self-encoder according to claim 7, wherein the training process of the step S3 is as follows:
training the twin network using the training data to minimize a defined loss function;
optimizing the weight and bias of the updated network using a gradient descent algorithm;
the performance of the model is monitored using the validation set and the parameters are adjusted or training stopped in advance based on the performance.
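The loop of claim 8 can be illustrated with a deliberately tiny stand-in: a one-parameter "network", a quadratic "loss", a gradient-descent update, and early stopping driven by a synthetic validation loss. All numbers are illustrative, not the patented training schedule:

```python
import numpy as np

w = 5.0                         # toy "network" parameter
best, bad, patience = np.inf, 0, 3
for epoch in range(100):
    w -= 0.1 * (2 * w)          # gradient descent on loss = w**2
    val_loss = w ** 2           # monitored "validation" performance
    if val_loss < best - 1e-6:  # meaningful improvement?
        best, bad = val_loss, 0
    else:
        bad += 1
    if bad >= patience:         # stop training early
        break
print(epoch, best)
```

The same pattern applies to the twin network: replace the toy update with an optimizer step on the contrastive loss, and compute `val_loss` on the verification set each epoch.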
9. The classification and identification method for iron ore LIBS data by the twin network integrated with the self-encoder according to claim 7, wherein: the step S3 further comprises evaluating the performance of the trained model on unseen data using the test set, in the following manner: the average spectrum of each iron ore sample is selected as the standard sample of that ore, and the unknown sample to be tested is input into the twin network together with the average spectrum to obtain a prediction result;
calculating classification accuracy and confusion matrix indexes to evaluate the effect of the model;
for a new spectrum sample, it is input into the deployed model, and the classification is obtained from the model's prediction output.
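The identification step of claim 9 can be sketched as a nearest-standard search; here a plain Euclidean distance stands in for the trained twin network's similarity score, and the 32-channel "spectra" and class names are toy placeholders:

```python
import numpy as np

rng = np.random.default_rng(4)

# Each ore class is represented by the average spectrum of its samples
# (the "standard sample"); the dictionary below is a toy stand-in.
standards = {f"ore_{k}": rng.random(32) for k in range(3)}

def classify(unknown):
    # In the method, this distance would be the twin network's output for
    # the pair (unknown, standard); plain Euclidean distance is used here.
    dists = {name: float(np.linalg.norm(unknown - ref))
             for name, ref in standards.items()}
    return min(dists, key=dists.get)    # closest standard sample wins

query = standards["ore_1"] + rng.normal(0, 0.01, 32)  # noisy unknown sample
print(classify(query))
```

Running the classifier over the whole test set and tallying predicted versus true classes yields the accuracy and confusion matrix used to evaluate the model.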
10. The classification and identification method for iron ore LIBS data by the twin network integrated with the self-encoder according to claim 7, wherein: the residual network is a 50-layer network, and the ratio of the training set, the verification set and the test set is 8:1:1.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311021346.5A CN117076985A (en) | 2023-08-14 | 2023-08-14 | Classification recognition method for iron ore LIBS data by twin network integrated with self-encoder |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117076985A true CN117076985A (en) | 2023-11-17 |
Family
ID=88714512
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311021346.5A Pending CN117076985A (en) | 2023-08-14 | 2023-08-14 | Classification recognition method for iron ore LIBS data by twin network integrated with self-encoder |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117076985A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112232280A (en) * | 2020-11-04 | 2021-01-15 | 安徽大学 | Hyperspectral image classification method based on self-encoder and 3D depth residual error network |
CN113138178A (en) * | 2021-04-15 | 2021-07-20 | 上海海关工业品与原材料检测技术中心 | Method for identifying imported iron ore brand |
CN114821164A (en) * | 2022-04-13 | 2022-07-29 | 北京工业大学 | Hyperspectral image classification method based on twin network |
CN115187861A (en) * | 2022-07-13 | 2022-10-14 | 哈尔滨理工大学 | Hyperspectral image change detection method and system based on depth twin network |
Non-Patent Citations (3)
Title |
---|
LV Zunji: "Research on Quantitative Methods for Ceramic Raw Material Composition Based on Laser-Induced Breakdown Spectroscopy", China Master's Theses Full-text Database, Engineering Science and Technology I, no. 02, 15 February 2023 (2023-02-15), pages 19 - 22 *
XIAO Zhiqiang: "Research on Rock Spectral Classification Methods Based on Few-Shot Learning: A Case Study of the Xingcheng Area, Liaoning", China Master's Theses Full-text Database, Basic Sciences, no. 11, 15 November 2022 (2022-11-15), pages 17 - 35 *
ZHAO Qi: "Research on Iris Recognition Algorithms Based on Deep Learning", China Master's Theses Full-text Database, Information Science and Technology, no. 05, 15 May 2021 (2021-05-15), pages 41 - 54 *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||