CN117076985A - Classification recognition method for iron ore LIBS data by twin network integrated with self-encoder
- Publication number: CN117076985A (application CN202311021346.5A)
- Authority: CN (China)
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F18/24 — Pattern recognition; Analysing; Classification techniques
- G01N21/718 — Laser microanalysis, i.e. with formation of sample plasma
- G06F18/213 — Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06F18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/217 — Validation; Performance evaluation; Active pattern learning techniques
- G06N3/0455 — Auto-encoder networks; Encoder-decoder networks
- G06N3/0464 — Convolutional networks [CNN, ConvNet]
- G06N3/048 — Activation functions
- G06N3/084 — Backpropagation, e.g. using gradient descent
Abstract
The invention discloses a classification and identification method for iron ore LIBS data using a twin network fused with a self-encoder, which effectively solves the problem that traditional algorithms achieve low prediction accuracy on small-sample iron ore classes. The method comprises the following steps: S1: collecting iron ore spectrum data as original training data with a laser-induced breakdown spectrometer; S2: performing self-encoder dimension reduction on the original training data, the reduced-dimension data serving as the input of the training network; S3: selecting two identical branch networks as the twin network, each branch network processing one input spectrum sample; the two branch networks remain synchronized during training by sharing parameters. The invention uses the self-encoder to reduce the data dimension, extract the important information, and cluster the features more tightly; the twin network improves accuracy on small-sample classes; and model training and prediction speed are improved.
Description
Technical Field
The invention relates to the technical field of classification and identification of iron ore LIBS (Laser-Induced Breakdown Spectroscopy) data, in particular to a classification and identification method for iron ore LIBS data using a twin network fused with a self-encoder.
Background
The existing methods for classifying iron ores include chemical methods, deep learning and machine learning.
Chemical methods: chemical analysis is a common way to determine the content and composition of each element in an ore; common techniques include titration, colorimetry and compound indicator methods. They determine the contents of the main components, impurities and useful elements, and the iron ore is classified accordingly.
Machine learning: machine learning approaches include KNN, SVM and similar models. Patent document CN103488874A discloses an improved method for classifying steel materials that combines a support vector machine with laser-induced breakdown spectroscopy: first, one-vs-rest classification is established by building a binary classifier for each type of iron ore; during prediction, if several classes are returned, one-vs-one models are built for those classes, and the process is iterated until a unique result is selected.
Deep learning: existing LIBS iron ore classification approaches mainly train a neural network on a large data sample, using PCA for dimension reduction and feature extraction. Typically, a 1D-CNN (one-dimensional convolutional neural network) is trained for classification, such as the method for identifying iron ore producing countries and brands provided in patent document CN111239103A.
Drawbacks/deficiencies of the prior art:
1. Iron ore LIBS data is high-dimensional: one sample of iron ore LIBS spectrum data has 14914 characteristic points. Existing deep learning approaches reduce the dimension of the iron ore spectrum data with PCA. PCA is a linear dimension-reduction method and can only capture linear relationships in the data; in high-dimensional spaces it may face the curse of dimensionality, i.e., PCA may not work well when the dimension is much larger than the number of samples.
2. The existing deep learning model is often trained into a relatively accurate model under the condition of sufficient sample size. In order to increase the accuracy of the model, a mode of deepening the layer number of the model is adopted. However, when training the model, as the depth of the model increases, the gradient vanishes or the network degenerates (network degenerates: in the process of increasing the parameters of the network layer in the neural network of deep learning, the accuracy of the training set tends to be saturated, the number of layers of the network is increased, and the accuracy of the training set is reduced, which is not caused by overfitting).
3. The training data is imbalanced: the sample sizes of some kinds of iron ore are much smaller than others, so model accuracy is biased toward the well-represented kinds and is not high enough on iron ore kinds with small sample sizes.
4. Current machine learning methods, such as the method of patent document CN103488874A combining a modified support vector machine with LIBS spectroscopy, are very time-consuming for some sample predictions: if the first one-vs-rest pass cannot screen out a single class but returns multiple classes, one-vs-one models must be built and the screening repeated. This consumes time and labor and cannot achieve the real-time speed required in industry.
Disclosure of Invention
The invention aims to provide a classification and identification method for iron ore LIBS data by a twin network fused with a self-encoder, so as to solve the problems in the prior art.
In order to achieve the above purpose, the present invention provides the following technical solutions: a classification and identification method of a twin network integrated with a self-encoder for iron ore LIBS data comprises the following steps: s1: collecting iron ore spectrum data as original training data through a laser-induced breakdown spectrometer; s2: performing self-encoder dimension reduction on the original training data, wherein the dimension-reduced data is used as the input of a training network; s3: selecting two identical branch networks as twin networks, wherein each branch network processes an input spectrum sample; the two branch networks remain synchronized during the training process by sharing parameters.
Further: step S2 uses a self-encoder to extract data features, and the data after self-encoding dimension reduction serves as the input layer of the subsequent deep learning classification network; the self-encoder contains 3 linear layers, each activated by a linear rectification (ReLU) function, and the back-propagation optimization function is the mean square error.
Further: the self-encoder is used for feature extraction and reconstruction of the data; it consists of two main components, an encoder and a decoder; the goal of the self-encoder is to achieve a self-characterization of the input data by learning a compressed representation together with a decoder that reconstructs the input from that representation; the training process of the self-encoder includes two phases:
forward propagation: the input data X is passed to the encoder, producing the feature representation h = f(W1·X + b1), where f is the encoder activation, W1 the forward-propagation weight matrix, and b1 the forward-propagation bias parameter;
back propagation: the output of the encoder is passed to the decoder, producing the reconstruction X_R = g(W2·h + b2), where g is the decoder activation, W2 the back-propagation weight matrix, and b2 the back-propagation bias parameter; the reconstruction error is calculated by comparing the decoder output X_R with the original input X, and the network's weights and offsets are then updated with the back-propagation algorithm to minimize it;
the objective function of the algorithm's optimization is: minimize dist(X, X_R).
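The three-linear-layer self-encoder and its mean-square-error objective can be sketched as follows. This is a minimal PyTorch illustration; the hidden widths and the small dimensions used in the check at the bottom are assumptions for illustration, since the patent fixes only the layer count, the ReLU activations, the MSE objective, and the 14914-to-7458 reduction.

```python
import torch
import torch.nn as nn

class SpectralAutoencoder(nn.Module):
    """Self-encoder with three ReLU-activated linear layers on each side;
    trained against the mean-square reconstruction error dist(X, X_R)."""
    def __init__(self, in_dim=14914, hidden=2048, code_dim=7458):
        super().__init__()
        # Encoder: 3 linear layers, each followed by a ReLU, ending at the code size.
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, code_dim), nn.ReLU(),
        )
        # Decoder mirrors the encoder to reconstruct the input spectrum.
        self.decoder = nn.Sequential(
            nn.Linear(code_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, in_dim),
        )

    def forward(self, x):
        code = self.encoder(x)      # h = f(W1·X + b1), applied layer by layer
        recon = self.decoder(code)  # X_R = g(W2·h + b2)
        return code, recon

# Tiny dimensions so the sketch runs quickly; real spectra go 1x14914 -> 1x7458.
model = SpectralAutoencoder(in_dim=32, hidden=16, code_dim=8)
x = torch.randn(4, 32)
code, recon = model(x)
loss = nn.functional.mse_loss(recon, x)  # objective: minimize dist(X, X_R)
```

Minimizing `loss` over the spectra by back-propagation realizes the objective minimize dist(X, X_R); `code` is the reduced-dimension input handed to the classification network.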
Further: step S2 also includes normalization preprocessing of the data.
further: the branch network is a residual network, the basic constituent units of the residual network are residual blocks, and the training and performance of the network are optimized through cross-layer connection and residual learning; inside each residual block, the input data is processed through two or more convolutional layers and an activation function; and then adding the input data with the residual connection to obtain the output of the residual block.
Further: each branch network comprises convolutional and pooling layers that extract features and learn a sample representation; the two branch networks share weights and produce two vectors, whose difference is measured by the Euclidean distance; finally, an S-type (sigmoid) activation function is applied.
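The shared-weight twin arrangement — two identical branches, Euclidean distance, then an S-type activation — can be sketched as follows; the small linear branch here is a stand-in assumption (the patent's actual branch is a residual network):

```python
import torch
import torch.nn as nn

class SiameseNet(nn.Module):
    """Both inputs pass through the SAME branch module, so the two branches
    share weights by construction; the two output vectors are compared by
    Euclidean distance, and an S-type (sigmoid) activation is applied."""
    def __init__(self, branch: nn.Module):
        super().__init__()
        self.branch = branch  # a single module => parameters shared across both inputs

    def forward(self, x1, x2):
        v1 = self.branch(x1)                    # vector for sample 1
        v2 = self.branch(x2)                    # vector for sample 2 (same weights)
        dist = torch.norm(v1 - v2, p=2, dim=1)  # Euclidean distance per pair
        return torch.sigmoid(dist)              # S-type activation

# Stand-in branch; the patent uses a 50-layer residual network here.
branch = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
net = SiameseNet(branch)
score = net(torch.randn(5, 8), torch.randn(5, 8))
```

Reusing one `branch` module for both inputs is what keeps the two branches synchronized during training: every gradient step updates the single shared set of parameters.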
Further: step S3 divides the data set into a training set, a verification set and a test set; for each sample, a "positive sample" and a "negative sample" of a different class are generated, where the positive sample is the original sample and the negative sample is selected from another class; pairs of input samples are created, each pair comprising a positive sample and a negative sample; a contrastive loss function is employed, comprising a margin (boundary) loss and an inter-sample distance loss:

L = (1/(2N)) Σ [ Y·Dw^2 + (1−Y)·max(m − Dw, 0)^2 ]

where N represents the number of samples, i.e. the number of sample pairs; Dw is the Euclidean distance between the two outputs of the twin network, Dw = ‖X1 − X2‖2; m is the margin; and Y is 1 or 0: Y = 1 when the two inputs are similar, otherwise Y = 0.
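The contrastive loss can be sketched in NumPy using its standard margin-based form, which matches the quantities defined above (N sample pairs, distance Dw, label Y); the margin value used below is an assumption:

```python
import numpy as np

def contrastive_loss(d_w, y, margin=1.0):
    """Contrastive loss over N pairs, in its standard margin-based form:
    L = (1/(2N)) * sum( Y * Dw^2 + (1 - Y) * max(margin - Dw, 0)^2 ).
    d_w: Euclidean distances between the twin-network outputs, shape (N,).
    y:   1 for similar pairs, 0 for dissimilar pairs, shape (N,)."""
    d_w = np.asarray(d_w, dtype=float)
    y = np.asarray(y, dtype=float)
    similar = y * d_w ** 2                                   # distance term: pull similar pairs together
    dissimilar = (1 - y) * np.maximum(margin - d_w, 0) ** 2  # boundary term: push dissimilar pairs past the margin
    return float((similar + dissimilar).sum() / (2 * len(d_w)))

# A similar pair at distance 0 and a dissimilar pair beyond the margin both cost nothing.
loss = contrastive_loss([0.0, 2.0], [1, 0], margin=1.0)
```

Only dissimilar pairs that fall inside the margin are penalized, which is what drives the embeddings of different iron ore classes apart.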
Further: the training process of step S3 is as follows: train the twin network on the training data to minimize the defined loss function; update the network's weights and biases with a gradient-descent optimization algorithm; monitor the model's performance on the validation set, and adjust the parameters or stop training early based on that performance.
Further: step S3 further comprises evaluating the trained model's performance on unseen data with the test set, as follows: the average spectrum of each iron ore class is selected as that class's standard sample, and the unknown sample to be tested is input into the twin network together with the average spectrum to obtain a prediction; classification accuracy and the confusion matrix are calculated to evaluate the model; a new spectrum sample is input into the deployed model and classified according to the model's output.
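The standard-sample evaluation scheme can be sketched as follows; the `similarity` function below is a hypothetical stand-in for the trained twin network:

```python
import numpy as np

def class_standards(spectra, labels):
    """Average the spectra of each iron ore class to get one standard sample per class."""
    labels = np.asarray(labels)
    return {c: spectra[labels == c].mean(axis=0) for c in np.unique(labels)}

def predict(unknown, standards, similarity):
    """Pair the unknown spectrum with each class's standard sample and
    keep the class whose pair scores highest through the twin network."""
    return max(standards, key=lambda c: similarity(unknown, standards[c]))

# Toy data: two well-separated classes of 16-point "spectra".
rng = np.random.default_rng(0)
spectra = np.vstack([rng.normal(0, 0.1, (5, 16)), rng.normal(5, 0.1, (5, 16))])
labels = [0] * 5 + [1] * 5
standards = class_standards(spectra, labels)
# Hypothetical similarity: negative Euclidean distance (a trained twin network goes here).
similarity = lambda a, b: -np.linalg.norm(a - b)
pred = predict(rng.normal(5, 0.1, 16), standards, similarity)
```

Only one forward pass per class is needed at prediction time, which is why this scheme avoids the repeated one-vs-one re-modeling criticized in the background section.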
Further: the residual network is a 50-layer network, and the ratio of training set, verification set and test set sizes is 8:1:1.
Compared with the prior art, the invention has the beneficial effects that:
1. The self-encoder reduces the data dimension, extracts the important information and makes the features cluster more tightly, which effectively shortens model training time and improves model accuracy.
2. With the twin network, accuracy on small-sample classes is improved (over the 35 selected samples, the overall accuracy is 96.06%, and the overall accuracy on the five small-sample classes reaches 95.3%).
3. Model training and prediction speed are improved.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention. In the drawings:
FIG. 1 is a block diagram of a self-encoder of the present invention;
FIG. 2 is a diagram of the effect of the self-encoder of the present invention;
FIG. 3 is a diagram of a residual network of the present invention;
FIG. 4 is a diagram of the twinning network of the present invention;
FIG. 5 is the loss plot of the training process of the present invention together with the accuracy curves over the training set and the verification set.
FIG. 6 is a confusion matrix plot of predicted results on 35 iron ores according to the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, are intended to fall within the scope of the present invention. Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention.
Referring to FIGS. 1-6, in an embodiment of the present invention, a method for classifying and identifying iron ore LIBS data with a twin network integrated with a self-encoder includes the following steps:
s1: collecting original training data: collecting spectrum data of the iron ore by a laser-induced breakdown spectrometer;
s2: performing self-encoder dimension reduction on the original data:
2.1, normalize the data;
2.2, pass the data through the self-encoder (3 linear layers, each followed by a ReLU activation) to reduce its dimension; the reduced-dimension data is used as the input of the training network;
s3: training of twin networks: the twin network has two branch networks; for extracting data features;
the twin neural network takes a residual error network as a branch network;
Preferably, the residual network is a 50-layer network; the 50-layer residual network gives the best results within 140 training iterations. If the network has too few layers, the LIBS data features cannot be fully extracted, more iteration rounds are needed, and time is wasted. If the residual network exceeds 50 layers, the result does not improve; instead, the parameters multiply as the layer count grows, a deeper network needs more training time and computing resources, and each iteration round takes longer.
Both branch networks use the selected residual network; the branches share weights and yield two vectors, the difference between which is computed as the Euclidean distance; an S-type (sigmoid) activation function is then applied.
Preferably, the self-encoder is used for extracting data features, and the data after self-encoding dimension reduction serves as the input layer of the subsequent deep learning classification network; the self-encoder contains 3 linear layers, each activated by a linear rectification (ReLU) function, and the back-propagation optimization function is the mean square error.
Preferably, the self-encoder is used for extracting and reconstructing the characteristics of the data; the system consists of two main components of an encoder and a decoder; the goal of the self-encoder is to achieve self-characterization of the input data by learning a compressed representation and a decoder that reconstructs the input data from the representation;
the training process of the self-encoder includes two phases:
forward propagation: the input data X is passed to the encoder, producing the feature representation h = f(W1·X + b1);
back propagation: the output of the encoder is passed to the decoder, producing the reconstruction X_R = g(W2·h + b2); the reconstruction error is calculated by comparing the decoder output with the original input, and the network's weights and offsets are then updated with the back-propagation algorithm to minimize it;
the objective function of the algorithm's optimization is: minimize dist(X, X_R).
Preferably, the residual network introduces the concept of the residual block, optimizing the network's training and performance through cross-layer connections and residual learning;
the residual block is the basic constituent unit of the residual network; information is transferred through the introduced cross-layer connection so that the network can learn a residual function; inside each residual block, the input data is processed through two or more convolutional layers and an activation function, and the input is then added through the residual connection to give the block's output.
Preferably, for the twin neural network, the architecture is designed as follows:
the twin network consists of two identical network branches, each branch processing one input spectrum sample;
- each branch comprises convolutional, pooling and fully-connected layers so as to extract features and learn a sample representation;
by sharing the parameters, the weights of the two branches are ensured to be kept synchronous in the training process.
Preferably, training data is prepared by dividing the data set into a training set, a verification set and a test set at a size ratio of 8:1:1. Some classes contribute very little LIBS data, and this ratio ensures that enough data participates in training while preventing over- or under-fitting. If the training set exceeds this proportion, the verification and test sets can no longer characterize the model's accuracy; if it falls below this proportion, the small-sample classes cannot be trained sufficiently, the model under-fits on them, and its accuracy on small-sample predictions drops. In FIG. 5, curve (1) on the training set and curve (2) on the verification set show the same trend, indicating that the model does not overfit.
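The 8:1:1 partition can be sketched as follows (a minimal illustration; the patent does not state whether the split is stratified per class, so this sketch uses a plain shuffled split):

```python
import numpy as np

def split_811(n_samples, seed=0):
    """Shuffle sample indices and partition them into training,
    verification and test sets at an 8:1:1 ratio."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_train = int(0.8 * n_samples)   # 8 parts for training
    n_val = int(0.1 * n_samples)     # 1 part for verification
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

train_idx, val_idx, test_idx = split_811(100)
```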
Generating a "positive sample" corresponding to each sample and a "negative sample" of a different class from the positive sample; positive samples are original samples, negative samples are samples selected from other classes;
creating pairs of input samples, each pair comprising a positive sample and a negative sample;
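Pair construction — one same-class "positive" and one other-class "negative" partner per sample — can be sketched as follows (the random pairing policy is an assumption for illustration):

```python
import random

def make_pairs(samples_by_class, seed=0):
    """For every sample, build one (sample, positive, 1) pair with a same-class
    partner and one (sample, negative, 0) pair with a partner from another class."""
    rng = random.Random(seed)
    pairs = []
    classes = list(samples_by_class)
    for c in classes:
        other_classes = [o for o in classes if o != c]
        for s in samples_by_class[c]:
            positive = rng.choice(samples_by_class[c])                          # same class
            negative = rng.choice(samples_by_class[rng.choice(other_classes)])  # other class
            pairs.append((s, positive, 1))
            pairs.append((s, negative, 0))
    return pairs

pairs = make_pairs({"A": [1, 2], "B": [10, 20]})
```

Each sample thus contributes one similar and one dissimilar pair, which balances the Y = 1 and Y = 0 cases seen by the contrastive loss even when a class has few samples.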
loss function definition: employing contrast ive loss loss functions, including boundary loss and inter-sample distance loss, using a twin network;
n represents the number of samples, i.e. the number of paired samples;
dw is the euclidean distance between the two outputs of the twin network: d (D) w =∣X 1 -X 2 | 2
Y is 1 or 0; y=1 when the two inputs are similar, otherwise 0;
training process:
training the twin network using the training data to minimize a defined loss function;
- updating the network's weights and biases using a gradient-descent optimization algorithm;
- monitoring the model's performance on the verification set, and adjusting the hyperparameters or stopping training early according to that performance;
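The three training steps above — minimizing the loss by gradient descent while monitoring the verification set for early stopping — can be sketched generically; the tiny stand-in network, SGD optimizer, and patience value are assumptions for illustration:

```python
import torch

class TinyTwin(torch.nn.Module):
    """Minimal stand-in for the twin network: one shared linear layer,
    returning the Euclidean distance between the two embeddings."""
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(4, 2)

    def forward(self, a, b):
        return torch.norm(self.fc(a) - self.fc(b), dim=-1)

def train(net, loss_fn, train_pairs, val_pairs, epochs=100, lr=1e-3, patience=10):
    """Minimize the pair loss by gradient descent; stop early when the
    verification loss has not improved for `patience` epochs."""
    opt = torch.optim.SGD(net.parameters(), lr=lr)
    best, stall = float("inf"), 0
    for _ in range(epochs):
        net.train()
        for x1, x2, y in train_pairs:
            opt.zero_grad()
            loss_fn(net(x1, x2), y).backward()  # back-propagate the pair loss
            opt.step()                          # gradient-descent weight update
        net.eval()
        with torch.no_grad():                   # monitor on the verification set
            val = sum(loss_fn(net(x1, x2), y).item() for x1, x2, y in val_pairs)
        if val < best - 1e-6:
            best, stall = val, 0                # improved: keep training
        else:
            stall += 1
            if stall >= patience:               # early stopping
                break
    return best

pairs = [(torch.randn(4), torch.randn(4), torch.tensor(0.0)) for _ in range(8)]
best = train(TinyTwin(), lambda d, y: ((d - y) ** 2).mean(), pairs[:6], pairs[6:], epochs=20)
```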
testing and evaluation:
evaluating the performance of the trained model on unseen data using the test set;
calculating indexes such as classification accuracy and confusion matrix to evaluate the effect of the model;
model deployment: in practical applications, the trained model is deployed into a suitable environment.
A new spectrum sample is input into the deployed model and classified according to the model's output.
The working principle of the invention is as follows: spectrum data of the iron ore is collected with a laser-induced breakdown spectrometer (35 kinds of iron ore in total).
The five categories 3, 4, 10, 13 and 21 contain only 7, 12, 7, 12 and 7 samples respectively, while each of the remaining 30 iron ores has more than 35 spectra.
S2, performing self-encoder dimension reduction on the original data.
The raw data for one iron ore spectrum has dimension 1x14914.
Through the self-encoder — 3 linear layers, each followed by a ReLU activation — the data dimension is reduced to 1x7458. The reduced-dimension data is used as the input of the training network.
The self-encoder (Autoencoder Network) is an unsupervised learning model that can better capture complex patterns and non-linear relationships in the data; it is used for feature extraction and reconstruction. It consists of two main components, an Encoder and a Decoder. The goal of the self-encoder is to achieve a self-characterization of the input data by learning a compressed representation (encoding) together with a decoder that reconstructs the input from that representation.
The working principle of the self-encoder is as follows:
encoder (Encoder): the encoder accepts the input data and converts it to a low-dimensional representation of the features. An encoder is typically made up of multiple hidden layers, each of which contains a set of neurons. Each neuron multiplies the input data with weights and generates an output of the hidden layer by an activation function. As the hidden layer increases, the encoder gradually decreases the data dimension, thereby learning the abstract features of the data.
Decoder (Decoder): the decoder accepts the feature representation generated by the encoder and decodes it into reconstructed data. The decoder is similar in structure to the encoder but operates in reverse: it expands the representation layer by layer to reconstruct data as close as possible to the original input. The last layer of the decoder typically uses an appropriate activation function to match the range and distribution of the original data.
The training process of the self-encoder includes two phases:
forward propagation (encoding): input data is passed to an encoder, producing a representation of the characteristic.
Back propagation (decoding and reconstruction): the output of the encoder is passed to the decoder, and the reconstruction error is calculated by comparing the output of the decoder with the original input. The weights and offsets of the network are then updated using a back propagation algorithm to minimize the reconstruction error.
The objective function of the algorithm's optimization is: minimize dist(X, X_R).
In the scheme of the invention, the self-encoder is used to extract data features, and the data after self-encoding dimension reduction serves as the input layer of the subsequent deep learning classification network. The self-encoder in this scheme contains 3 linear layers, each activated by a linear rectification function (ReLU), and the back-propagation optimization function is the mean square error. The original spectral dimension is reduced from 14914 to 7458; see FIG. 1 for the overall network framework.
The Residual Network (ResNet) is a deep convolutional neural network architecture designed to solve the problems of gradient vanishing and information loss in deep network training. It introduces the concept of the Residual Block to optimize the network's training and performance through cross-layer connections and residual learning.
In the training process of a traditional deep neural network, the gradient gradually becomes smaller during back propagation, making it difficult to update the parameters of earlier layers; this is called the gradient vanishing problem. In addition, increasing the number of network layers may cause information to be lost in the network, making it difficult to improve performance. To address these problems, ResNet proposed the concept of residual learning.
In the method of the invention, the branch network of the twin network is composed of a residual network.
The residual block is the basic building block of ResNet. It conveys information by introducing a cross-layer (skip) connection so that the network can learn a residual function. Inside each residual block, the input data is processed through two or more convolutional layers and an activation function; the input is then added to this result via the residual connection to obtain the output of the block. The cross-layer connection allows information to be transmitted directly from earlier layers to deeper layers, effectively relieving the gradient vanishing problem and improving the expressive capacity of the network; the network structure is shown in fig. 3.
In the residual network, a design called the "bottleneck structure" is also introduced. This structure uses 1x1, 1x3, and 1x1 convolution layers in each residual block to reduce the dimension of the feature map and the computational complexity. This design allows the residual network to use deeper structures without introducing excessive parameters and computational burden.
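The bottleneck residual block above can be sketched on a 1-D signal (appropriate for a spectrum) as follows; the kernel values are illustrative placeholders, and a plain single-channel convolution stands in for the learned convolution layers:

```python
import numpy as np

rng = np.random.default_rng(1)

def relu(x):
    return np.maximum(x, 0.0)

def conv1d(x, w):
    """'Same'-padded single-channel 1-D convolution (illustrative)."""
    pad = len(w) // 2
    return np.convolve(np.pad(x, pad), w, mode="valid")[: len(x)]

# Bottleneck kernels: 1x1 -> 1x3 -> 1x1, then the skip connection
# adds the input back; the weights below are placeholders.
w_a, w_b, w_c = np.array([0.5]), np.array([0.2, 0.6, 0.2]), np.array([0.5])

def residual_block(x):
    f = relu(conv1d(x, w_a))   # 1x1: reduce
    f = relu(conv1d(f, w_b))   # 1x3: transform
    f = conv1d(f, w_c)         # 1x1: restore
    return relu(f + x)         # F(x) + x: learn the residual

x = rng.random(16)
y = residual_block(x)
print(y.shape)
```

Because the skip path passes the input through unchanged, the gradient reaching earlier layers always contains an identity term, which is what relieves the vanishing-gradient problem.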
Finally, it should be noted that the foregoing description is only a preferred embodiment of the present invention, and the present invention is not limited thereto. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described therein or substitute equivalents for some of the technical features. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall be included in the protection scope of the present invention.
Claims (10)
1. A classification and identification method for iron ore LIBS data by a twin network integrated with a self-encoder, characterized by comprising the following steps:
s1: collecting iron ore spectrum data as original training data through a laser-induced breakdown spectrometer;
s2: performing self-encoder dimension reduction on the original training data, wherein the dimension-reduced data is used as the input of a training network;
s3: selecting two identical branch networks as twin networks, wherein each branch network processes an input spectrum sample; the two branch networks remain synchronized during the training process by sharing parameters.
2. The classification and identification method for iron ore LIBS data by a twin network integrated with a self-encoder according to claim 1, wherein: the step S2 uses a self-encoder to extract data features, and the data after self-encoding dimension reduction serves as the input layer of the subsequent deep learning classification network; the self-encoder contains 3 linear layers activated by 3 linear rectification functions (ReLU), and the back-propagation optimization function is the mean square error.
3. The classification and identification method for iron ore LIBS data by a twin network integrated with a self-encoder according to claim 2, wherein: the self-encoder is used for feature extraction and reconstruction of the data; it consists of two main components, an encoder and a decoder; the goal of the self-encoder is to learn a compressed representation of the input data together with a decoder that reconstructs the input from that representation, thereby achieving self-characterization of the input data;
the training process of the self-encoder includes two phases:
forward propagation: the input data is passed to the encoder, producing a feature representation: h = f(W_1·X + b_1),
wherein W_1 is the forward-propagation weight and b_1 is the forward-propagation bias parameter;
back propagation: the output of the encoder is passed to the decoder, which reconstructs the input as X_R = f(W_2·h + b_2); the reconstruction error is calculated by comparing the decoder output with the original input, and the weights and biases of the network are then updated using the back-propagation algorithm to minimize the reconstruction error,
wherein W_2 is the back-propagation weight and b_2 is the back-propagation bias parameter;
the objective function of the algorithm optimization is: minimize dist(X, X_R).
4. The classification and identification method for iron ore LIBS data by a twin network integrated with a self-encoder according to claim 1, wherein: the step S2 further includes performing the following normalization processing on the data:
5. the classification and identification method for iron ore LIBS data by a twin network integrated with a self-encoder according to claim 1, wherein: the branch network is a residual network, the basic constituent units of the residual network are residual blocks, and the training and performance of the network are optimized through cross-layer connection and residual learning;
inside each residual block, the input data is processed through two or more convolutional layers and an activation function; the input is then added to this result through the residual connection to obtain the output of the residual block.
6. The classification and identification method for iron ore LIBS data by a twin network integrated with a self-encoder according to claim 5, wherein: each branch network comprises convolution and pooling layers to extract features and learn sample representations; the two branch networks share weights to obtain two vectors, and the Euclidean distance is used to measure the difference between them; finally, an S-type (sigmoid) activation function is applied.
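Outside the claim language, the shared-weight comparison of claim 6 can be sketched as follows; a single random linear-plus-ReLU layer stands in for the convolutional branch, and the width 64 is a placeholder:

```python
import numpy as np

rng = np.random.default_rng(2)

# One shared weight matrix: both inputs pass through the SAME branch,
# which is what keeps the twin branches synchronised during training.
# Input size 1496 matches the self-encoded spectra; 64 is a placeholder.
W = rng.normal(0, 0.1, (1496, 64))

def branch(x):
    return np.maximum(x @ W, 0.0)                # shared branch (ReLU features)

def similarity(x1, x2):
    d = np.linalg.norm(branch(x1) - branch(x2))  # Euclidean distance
    return 1.0 / (1.0 + np.exp(-d))              # S-type (sigmoid) squashing

s = similarity(rng.random(1496), rng.random(1496))
print(round(s, 4))
```

Because the distance is non-negative, the sigmoid output lies in [0.5, 1); in practice the trained network maps small distances (similar ores) near 0.5 and large distances near 1, or the sign convention is flipped by the loss.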
7. The classification and identification method for iron ore LIBS data by the twin network integrated with the self-encoder according to claim 6, wherein: the step S3 divides the data set into a training set, a verification set and a test set;
generating a "positive sample" corresponding to each sample and a "negative sample" of a different class from the positive sample; positive samples are original samples, negative samples are samples selected from other classes;
creating pairs of input samples, each pair comprising a positive sample and a negative sample;
adopting a contrastive loss function, including boundary (margin) loss and sample interval loss: L = (1/2N)·Σ[Y·D_W² + (1−Y)·max(margin − D_W, 0)²], wherein:
N represents the number of samples, i.e. the number of sample pairs;
D_W is the Euclidean distance between the two outputs of the twin network: D_W = ‖X_1 − X_2‖_2;
Y takes the value 1 or 0: Y = 1 when the two inputs are similar, and 0 otherwise.
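A minimal sketch of the standard contrastive loss with these definitions (the distances, labels and margin below are toy values, not data from the patent):

```python
import numpy as np

def contrastive_loss(d, y, margin=1.0):
    """y = 1 for similar pairs, 0 for dissimilar; d = pairwise distance D_W."""
    similar = y * d ** 2                                     # pull similar pairs together
    dissimilar = (1 - y) * np.maximum(margin - d, 0.0) ** 2  # push others past the margin
    return float(np.mean(similar + dissimilar) / 2.0)

d = np.array([0.1, 0.9, 1.5])   # distances between branch outputs
y = np.array([1, 0, 0])         # pair labels
loss = contrastive_loss(d, y)
print(loss)
```

Note the two terms: similar pairs are penalised by their squared distance (the "sample interval" part), while dissimilar pairs are penalised only when they fall inside the margin (the "boundary" part); a dissimilar pair already beyond the margin, like the third one above, contributes zero.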
8. The classification and identification method for iron ore LIBS data by the twin network integrated with the self-encoder according to claim 7, wherein the training process of the step S3 is as follows:
training the twin network using the training data to minimize a defined loss function;
optimizing the weight and bias of the updated network using a gradient descent algorithm;
the performance of the model is monitored using the validation set and the parameters are adjusted or training stopped in advance based on the performance.
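The loop of claim 8 can be illustrated with a deliberately tiny stand-in: a one-parameter "network", a quadratic "loss", a gradient-descent update, and early stopping driven by a synthetic validation loss. All numbers are illustrative, not the patented training schedule:

```python
import numpy as np

w = 5.0                         # toy "network" parameter
best, bad, patience = np.inf, 0, 3
for epoch in range(100):
    w -= 0.1 * (2 * w)          # gradient descent on loss = w**2
    val_loss = w ** 2           # monitored "validation" performance
    if val_loss < best - 1e-6:  # meaningful improvement?
        best, bad = val_loss, 0
    else:
        bad += 1
    if bad >= patience:         # stop training early
        break
print(epoch, best)
```

The same pattern applies to the twin network: replace the toy update with an optimizer step on the contrastive loss, and compute `val_loss` on the verification set each epoch.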
9. The classification and identification method for iron ore LIBS data by the twin network integrated with the self-encoder according to claim 7, wherein: the step S3 further comprises evaluating the performance of the trained model on unseen data using the test set, in the following manner: the average spectrum of each iron ore sample is selected as the standard sample of that ore, and the unknown sample to be tested is input into the twin network together with the average spectrum to obtain a prediction result;
calculating classification accuracy and confusion matrix indexes to evaluate the effect of the model;
for a new spectrum sample, it is input into the deployed model, and the classification is obtained from the model's prediction output.
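The identification step of claim 9 can be sketched as a nearest-standard search; here a plain Euclidean distance stands in for the trained twin network's similarity score, and the 32-channel "spectra" and class names are toy placeholders:

```python
import numpy as np

rng = np.random.default_rng(4)

# Each ore class is represented by the average spectrum of its samples
# (the "standard sample"); the dictionary below is a toy stand-in.
standards = {f"ore_{k}": rng.random(32) for k in range(3)}

def classify(unknown):
    # In the method, this distance would be the twin network's output for
    # the pair (unknown, standard); plain Euclidean distance is used here.
    dists = {name: float(np.linalg.norm(unknown - ref))
             for name, ref in standards.items()}
    return min(dists, key=dists.get)    # closest standard sample wins

query = standards["ore_1"] + rng.normal(0, 0.01, 32)  # noisy unknown sample
print(classify(query))
```

Running the classifier over the whole test set and tallying predicted versus true classes yields the accuracy and confusion matrix used to evaluate the model.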
10. The classification and identification method for iron ore LIBS data by the twin network integrated with the self-encoder according to claim 7, wherein: the residual network is a 50-layer network, and the ratio of the training set, the verification set and the test set is 8:1:1.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311021346.5A CN117076985A (en) | 2023-08-14 | 2023-08-14 | Classification recognition method for iron ore LIBS data by twin network integrated with self-encoder |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117076985A true CN117076985A (en) | 2023-11-17 |
Family
ID=88714512
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311021346.5A Pending CN117076985A (en) | 2023-08-14 | 2023-08-14 | Classification recognition method for iron ore LIBS data by twin network integrated with self-encoder |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117076985A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112232280A (en) * | 2020-11-04 | 2021-01-15 | 安徽大学 | Hyperspectral image classification method based on self-encoder and 3D depth residual error network |
CN113138178A (en) * | 2021-04-15 | 2021-07-20 | 上海海关工业品与原材料检测技术中心 | Method for identifying imported iron ore brand |
CN114821164A (en) * | 2022-04-13 | 2022-07-29 | 北京工业大学 | Hyperspectral image classification method based on twin network |
CN115187861A (en) * | 2022-07-13 | 2022-10-14 | 哈尔滨理工大学 | Hyperspectral image change detection method and system based on depth twin network |
Non-Patent Citations (3)
Title |
---|
LV Zunji: "Research on Quantitative Methods for Ceramic Raw Material Composition Based on Laser-Induced Breakdown Spectroscopy", China Master's Theses Full-text Database, Engineering Science and Technology I, no. 02, 15 February 2023 (2023-02-15), pages 19 - 22 *
XIAO Zhiqiang: "Research on Rock Spectral Classification Methods Based on Few-Shot Learning: A Case Study of the Xingcheng Area, Liaoning", China Master's Theses Full-text Database, Basic Sciences, no. 11, 15 November 2022 (2022-11-15), pages 17 - 35 *
ZHAO Qi: "Research on Iris Recognition Algorithms Based on Deep Learning", China Master's Theses Full-text Database, Information Science and Technology, no. 05, 15 May 2021 (2021-05-15), pages 41 - 54 *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||