CN115512144A - Automatic XRF spectrogram classification method based on convolution self-encoder - Google Patents


Publication number
CN115512144A
Authority
CN
China
Prior art keywords
data
encoder
xrf
sample
spectrum
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211053356.2A
Other languages
Chinese (zh)
Inventor
李福生
王欣然
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN202211053356.2A priority Critical patent/CN115512144A/en
Publication of CN115512144A publication Critical patent/CN115512144A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • G06V10/763 Non-hierarchical techniques, e.g. based on statistics of modelling distributions
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N23/00 Investigating or analysing materials by the use of wave or particle radiation, e.g. X-rays or neutrons, not covered by groups G01N3/00 – G01N17/00, G01N21/00 or G01N22/00
    • G01N23/22 Investigating or analysing materials by the use of wave or particle radiation, e.g. X-rays or neutrons, not covered by groups G01N3/00 – G01N17/00, G01N21/00 or G01N22/00 by measuring secondary emission from the material
    • G01N23/223 Investigating or analysing materials by the use of wave or particle radiation, e.g. X-rays or neutrons, not covered by groups G01N3/00 – G01N17/00, G01N21/00 or G01N22/00 by measuring secondary emission from the material by irradiating the sample with X-rays or gamma-rays and by measuring X-ray fluorescence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V10/7753 Incorporation of unlabelled data, e.g. multiple instance learning [MIL]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2223/00 Investigating materials by wave or particle radiation
    • G01N2223/07 Investigating materials by wave or particle radiation secondary emission
    • G01N2223/076 X-ray fluorescence
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2223/00 Investigating materials by wave or particle radiation
    • G01N2223/10 Different kinds of radiation or particles
    • G01N2223/101 Different kinds of radiation or particles electromagnetic radiation
    • G01N2223/1016 X-ray

Abstract

The invention discloses an automatic XRF spectrogram classification method based on a convolutional self-encoder, belonging to the field of XRF spectrum analysis and processing. In the method, the acquired XRF spectrum data to be detected are normalized, and after normalization the one-dimensional spectrum data vector is converted into a two-dimensional spectrum information matrix, so that the XRF spectrum data can be processed with image-processing techniques and the classification precision is improved. On this basis, a convolutional self-encoder neural network model is built, and the XRF spectrogram to be detected is feature-compressed to obtain a feature-compressed spectrum data spectrogram; a Kmeans classification network is then designed to realize unsupervised classification, effectively solving the problems of low classification precision and efficiency caused by the complicated characteristic indexes of XRF spectra.

Description

Automatic XRF spectrogram classification method based on convolution self-encoder
Technical Field
The invention relates to the field of XRF spectrum analysis processing, in particular to an XRF spectrogram automatic classification method based on a convolution self-encoder.
Background
XRF (X-ray fluorescence spectroscopy) excites atoms in the substance to be measured with primary X-ray photons so that they emit secondary X-rays, which are used for substance composition analysis and chemical research. The method has the advantages of simple pretreatment, pollution-free detection, low detection cost, high detection precision, a wide range of detectable elements, high analysis speed, strong applicability and good stability, and has produced wide social and economic benefits in industry, underground mining, farmland environment evaluation, medicine, health and other technical fields. The XRF spectrum contains a great deal of information, and in the face of XRF spectra of unknown samples measured across many fields, spectrum classification has always been a research hotspot.
Currently, XRF spectrum classification mainly adopts machine learning methods such as the back-propagation algorithm (BP), the support vector machine (SVM) and the extreme learning machine (ELM), and these algorithms show strong performance. However, each has its own drawbacks: the BP algorithm easily falls into local minima; the classification performance of the SVM is unstable and its computational complexity is high; and although the ELM can greatly improve learning speed, its effect is unstable.
In recent years, image classification methods based on deep learning have attracted much attention. The deep convolutional neural network (CNN) can directly process an input two-dimensional image, has strong feature-learning capability, and has been applied widely and with great success to multi-class, large-scale tasks in computer vision, speech recognition and other fields. Its application to XRF spectra nevertheless remains limited: the XRF spectrum is by nature a one-dimensional vector and the data sets are often small, so deep learning, despite its strong learning capability, easily overfits; moreover, traditional deep-learning network structures are not suited to processing one-dimensional data.
Therefore, there is a need to develop a new XRF spectrum classification method, which can realize automatic classification in the face of a large number of unknown XRF spectrum samples, and improve classification accuracy and efficiency.
Disclosure of Invention
The invention aims to provide an XRF spectrogram automatic classification method based on a convolution self-encoder, which aims to solve the problems of low classification precision and low efficiency caused by complicated XRF spectrogram characteristic indexes.
In order to realize the purpose, the invention adopts the following technical scheme:
an XRF spectrogram automatic classification method based on a convolution self-encoder comprises the following steps:
step 1, detecting an unknown sample by using a handheld X-ray fluorescence spectrum analyzer to serve as XRF spectrum data to be detected;
step 2, normalizing the XRF spectrum data to be detected obtained in step 1 and converting the data into a two-dimensional spectrum information matrix form, each spectrum information matrix converted into the two-dimensional space containing 512 × 512 features; then creating a training sample set from the converted XRF spectrum information;
step 3, generating a training set and a test set according to the training sample set created in the step 2;
step 4, building and training a neural network model of the convolution self-encoder:
step 4.1, iterative training is carried out on the training set based on the neural network of the convolutional self-encoder, and a neural network model of the convolutional self-encoder is constructed;
step 4.2, inputting the test set into the neural network model of the convolutional self-encoder obtained in the step 4.1 to obtain a compressed XRF spectrum;
and 5, building a Kmeans unsupervised classification model, inputting the XRF spectrum data graph obtained in the step 4.2 after feature compression into the Kmeans unsupervised classification model, and obtaining an XRF spectrum classification result of the unknown sample through training.
Further, the formula adopted for normalization in step 2 is:
X_norm = (X - X_min) / (X_max - X_min)   (1)
in the formula (1), X_max represents the maximum value of the one-dimensional spectral data of input samples of the same kind, X_min represents the minimum value of the one-dimensional spectral data of input samples of the same kind, and X_norm represents the normalized spectral data of the sample of that kind;
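As an illustrative sketch (not part of the patent), the min-max normalization of formula (1) can be written in Python with NumPy; the function name `normalize_spectra` and the layout of same-kind samples as rows of one array are assumptions:

```python
import numpy as np

def normalize_spectra(spectra: np.ndarray) -> np.ndarray:
    """Min-max normalize spectra of the same kind, per formula (1).

    `spectra` is assumed to be an (n_samples, n_channels) array; the
    maximum and minimum are taken over all samples of the same kind.
    """
    x_max = spectra.max()
    x_min = spectra.min()
    return (spectra - x_min) / (x_max - x_min)

# Example: two toy "spectra" with counts in [100, 500].
spectra = np.array([[100.0, 300.0], [200.0, 500.0]])
norm = normalize_spectra(spectra)
# After normalization all values lie in [0, 1].
```

Taking the maximum and minimum over all samples of the same kind, as the formula specifies, keeps relative peak heights comparable within a kind.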
further, the convolutional self-encoder neural network model constructed in the step 4.1 comprises an encoder and a decoder; wherein the encoder consists of 1 input layer and m coding hidden layers; the decoder consists of m-1 decoding hidden layers and 1 output layer; the input data of the input layer is the training set obtained in the step 3 and is used for outputting the input data to the m coding hidden layers; the m coding hidden layers are used for compressing input data to obtain core characteristics of the input data, outputting the core characteristics of the input data to the m-1 decoding hidden layers for decompression and then transmitting to the output layer, and the output layer reconstructs the received core data to obtain XRF spectrum output after characteristic compression.
Furthermore, the hidden layers of the encoder and the decoder both adopt a Tanh nonlinear function as an activation function, and the output layer of the decoder adopts a LeakyReLU nonlinear function as an activation function.
Further, the detailed process of obtaining the compressed XRF spectrum by using the convolutional auto-encoder neural network model in the step 4.2 is as follows:
step 4.2.1, inputting the test set obtained in the step 3 as input data into an input layer of a convolutional self-encoder, mapping the input data to m hidden layers of the encoder respectively for feature compression to obtain core features of the input data, and completing the encoding process of the convolutional self-encoder, wherein the expression of the encoding of the convolutional self-encoder is as follows:
y = f(w_y · x + b_y)   (5)
where x is the loaded input data, y represents the learned features of the intermediate hidden layer, w_y represents the weights of the hidden-layer input, b_y represents the bias coefficient of the hidden unit, and f represents the convolution operation;
step 4.2.2, transmitting the core features obtained in the step 4.2.1 to m-1 hidden layers of a decoder for decompression, then transmitting the decompressed core features to an output layer for reconstruction, and obtaining output data close to the input data, namely a compressed XRF spectrum after feature compression, thereby completing the decoding process of the self-encoder, wherein the expression of the decoding process is as follows:
z = f(w_z · y + b_z)   (6)
where y represents the learned features of the intermediate hidden layer, z is the data reconstructed from the hidden feature y, w_z represents the weights of the hidden-layer output, b_z represents the bias coefficient of the output unit, and f represents the convolution operation;
the constraints of the convolutional auto-encoder are as follows:
w_y = w_z′ = w   (7)
where w_z′ denotes the transpose of w_z; equation (7) indicates that the convolutional self-encoder uses the same tied weight w, which helps halve the number of model parameters;
the target of the training of the convolution self-encoder continuously reduces the error of the input data and the reconstructed data, and the mathematical expression is as follows:
Figure BDA0003824130820000031
here, the parameters that the auto-encoder needs to train are: w, b y ,b z Wherein x is the loaded sample data, z represents the reconstructed output data, and c (x, z) represents the error between the input data and the reconstructed data;
the weight updating rule is expressed by the following formula:
Figure BDA0003824130820000032
Figure BDA0003824130820000033
Figure BDA0003824130820000034
where cost (x, z) is the error loss between the input data and the reconstructed data, η is the learning rate,
Figure BDA0003824130820000035
which means that the weight W is partial-derivative,
Figure BDA0003824130820000036
representing the offset coefficient b to the hidden unit y The deviation is calculated and the deviation is calculated,
Figure BDA0003824130820000037
representing the offset coefficient b to the output unit z And (5) calculating partial derivatives.
Further, in the step 5, the process of building and training the Kmeans unsupervised classification model is as follows:
step 5.1, building of Kmeans unsupervised classification model
Step 5.1.1, given a sample data set x = {x1, x2, ..., xn}, select k (k is less than or equal to n) data samples from the data set as initial centers, compute the distance from each data sample to every center, and assign each data sample to the class whose center is nearest;
Step 5.1.2, for the data samples in each class obtained in step 5.1.1, update the center of the class, for example by taking the mean of the class;
Step 5.1.3, repeat the two steps above to update the class centers; if the class centers no longer change, or the change of the class centers is smaller than a preset threshold, stop updating to form the class clusters, completing the construction of the Kmeans unsupervised classification model; otherwise, continue;
The iteration aims to minimize the distance between each sample and the center of the class to which it belongs; the objective function is:
V = Σ_{i=1}^{k} Σ_{x_j ∈ S_i} ||x_j - μ_i||²
where μ_i represents the mean of the set S_i, and the sum of the distances between all elements in a class and the class mean is Var S_i;
The Euclidean distance is selected as the distance measure:
d(X_i, μ_j) = ||X_i - μ_j||
where X_i represents the i-th data sample;
and 5.2, inputting the XRF spectrum data after feature compression into a kmeans unsupervised classification model for training to obtain a trained kmeans classification network model.
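Steps 5.1.1 to 5.1.3 can be sketched as a plain NumPy implementation; the function name, the initialization by random sampling, and the toy two-dimensional data are assumptions for illustration:

```python
import numpy as np

def kmeans(X, k, n_iter=100, tol=1e-6, seed=0):
    """Plain Kmeans following steps 5.1.1-5.1.3 of the method."""
    rng = np.random.default_rng(seed)
    # Step 5.1.1: choose k samples from the data set as initial centers.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each sample to the class of its nearest center (Euclidean).
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 5.1.2: update each class center as the mean of the class.
        new_centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        # Step 5.1.3: stop once the centers move less than the threshold.
        if np.linalg.norm(new_centers - centers) < tol:
            break
        centers = new_centers
    return labels, centers

# Two well-separated toy clusters in 2-D.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
labels, centers = kmeans(X, k=2)
```

On the toy data the two well-separated pairs of points end up in different clusters, regardless of which initial centers the seed picks.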
The invention has the beneficial effects that: the acquired XRF spectrum data to be detected are normalized, and after normalization the data are converted from a one-dimensional spectrum data vector into a two-dimensional spectrum information matrix, so that the XRF spectrogram can be processed with image-processing methods and the classification precision is improved. On this basis, a convolutional self-encoder neural network model is built and the XRF spectrum to be detected is feature-compressed to obtain the feature-compressed XRF spectrum data spectrogram; a Kmeans classification network is designed to realize unsupervised classification, more effectively solving the problem that the complicated characteristic indexes of a large number of unknown samples affect classification precision and efficiency. Meanwhile, the method fills the gap in research on classifying unknown samples from their XRF spectra. In addition, in the convolutional self-encoder neural network model, the hidden layers of the encoder and the decoder both adopt the Tanh nonlinear function as the activation function, and the output layer of the decoder adopts the LeakyReLU nonlinear function as the activation function, to avoid overfitting. The model combines a deep-learning neural network with the XRF spectrum, breaks the limitation of purely quantitative spectral analysis, and fuses the XRF spectrum with deep learning to perform qualitative analysis of samples.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is an image generated after dimension change of a metal sample spectrum one-dimensional spectrogram data according to the embodiment;
FIG. 3 is an image generated after dimension change of the alloy sample spectrum one-dimensional spectrogram data in the embodiment;
FIG. 4 is an image generated after dimension change of soil sample spectrum one-dimensional spectrogram data in the embodiment;
FIG. 5 is a diagram of the classification results of the example.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention is further described with reference to the following embodiments and the accompanying drawings, in which:
as shown in fig. 1, the method for automatically classifying XRF spectrogram based on convolution self-encoder according to the present invention includes the following steps:
step 1, obtaining XRF spectrum data to be detected:
detecting an unknown sample by using a handheld X-ray fluorescence spectrum analyzer to serve as spectrum data to be detected;
Step 2, preprocessing the spectrogram data of the spectrum to be detected: after normalizing the XRF spectrum data to be detected obtained in step 1, convert the one-dimensional spectrum data vector into a two-dimensional spectrum information matrix. The detailed process is as follows:
Step 2.1, normalize the XRF spectrum data to be detected obtained in step 1 using the following normalization formula:
X_norm = (X - X_min) / (X_max - X_min)   (2)
where X_max represents the maximum value of the one-dimensional spectral data of input samples of the same kind, X_min represents the minimum value of the one-dimensional spectral data of input samples of the same kind, and X_norm represents the normalized spectral data of the sample of that kind.
Step 2.2, convert the XRF spectrum data normalized in step 2.1 into a two-dimensional feature matrix; each column of the two-dimensional feature matrix represents a spectral dimension, and each row represents all the spectral information of one sample to be measured. After moving to the two-dimensional space, each spectrum information matrix contains 512 × 512 features. This embodiment provides the images generated after the dimension change of the one-dimensional spectrogram data of metal, alloy and soil samples once the XRF spectra are converted into the two-dimensional space; as can be seen from FIG. 2, FIG. 3 and FIG. 4, after the dimension conversion the spectral data can be processed with image-processing methods.
Step 2.3, set the number of picture data items loaded in each packet through the batch_size parameter; then package the dimension-converted spectrum images according to the set loading number and load the packaged data to obtain the training sample set.
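The dimension change and batch packaging described above can be sketched as follows. The patent does not state how a one-dimensional spectrum whose length is not a perfect square fills the 512 × 512 matrix, so the zero-padding used here, the helper names, and the small 4 × 4 demonstration size are assumptions:

```python
import numpy as np

def to_matrix(spectrum: np.ndarray, side: int) -> np.ndarray:
    """Zero-pad a 1-D spectrum and reshape it into a side x side matrix.

    In the patent side = 512; padding with zeros is an assumption.
    """
    padded = np.zeros(side * side)
    padded[: spectrum.size] = spectrum
    return padded.reshape(side, side)

def make_batches(matrices, batch_size):
    """Package the dimension-converted images into groups of batch_size."""
    return [matrices[i : i + batch_size] for i in range(0, len(matrices), batch_size)]

# Toy example: a 10-channel "spectrum" packed into a 4 x 4 matrix.
m = to_matrix(np.arange(10, dtype=float), side=4)
batches = make_batches([m] * 5, batch_size=2)
```

The row-major reshape means element (r, c) of the matrix is channel r·side + c of the original spectrum, so each row holds a contiguous slice of the spectrum.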
And 3, generating a training set and a testing set according to the training sample set obtained in the step 2. The data of the training set and the test set generated according to the obtained training sample set is shown in table 1:
table 1 XRF spectrum of unknown sample collected by experiment
[Table 1, listing the composition of the training set and test set, is reproduced as an image in the original publication; its contents are not recoverable here.]
Step 4, building and training a neural network model of the convolutional self-encoder:
and 4.1, carrying out iterative training on the training set based on the neural network of the convolutional self-encoder, constructing a neural network model of the convolutional self-encoder, and setting an activation function and a loss function of the model.
The convolutional self-encoder neural network model constructed in the step 4.1 comprises an encoder and a decoder; wherein the encoder consists of 1 input layer and m coding hidden layers; the decoder consists of m-1 decoding hidden layers and 1 output layer; the input data of the input layer is the training set obtained in the step 3 and is used for outputting the input data to the m coding hidden layers; the m coding hidden layers are used for compressing input data to obtain core characteristics of the input data, outputting the core characteristics of the input data to the m-1 decoding hidden layers for decompression and then transmitting the decompressed core characteristics to the output layer, and the output layer reconstructs the received core data to obtain XRF spectrum spectrogram output after characteristic compression.
XRF spectrum data are nonlinear and contain a great many unknown characteristics. Traditional dimension-reduction methods cannot handle this nonlinearity well, and the excessive number of characteristics of different kinds of unknown samples in an XRF spectrum reduces classification efficiency during subsequent classification. To solve these problems, the hidden layers of the encoder and the decoder of this embodiment both use the Tanh nonlinear function as the activation function, and the output layer of the decoder uses the LeakyReLU nonlinear function as the activation function. With the Tanh and LeakyReLU nonlinear functions as activation functions, on the one hand, the performance of the convolutional self-encoder neural network model is improved, so that it can perform both linear and nonlinear transformations and process more complex data. On the other hand, after training in a reconstruction-guided form with these activation functions, the model can recover the original input well, so the features stored in the intermediate hidden layer retain enough of the input information.
In this embodiment, the expression of the Tanh nonlinear activation function is as follows:
f(x) = (e^x - e^(-x)) / (e^x + e^(-x))
where e, the base of the natural logarithm function, is a constant; x is the input data of the self-encoder, and f(x) is the output data of the self-encoder after processing by the Tanh activation function;
The LeakyReLU nonlinear activation function is expressed as follows:
f(x_end) = x_end if x_end ≥ 0; f(x_end) = α · x_end if x_end < 0
where x_end is the output data of the last layer of the encoder, f(x_end) is the output data after processing by the LeakyReLU activation function, and α is a small positive slope coefficient.
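A minimal sketch of the two activation functions, assuming only NumPy; the `alpha` slope of LeakyReLU is a conventional default here, not a value given in the patent:

```python
import numpy as np

def tanh(x):
    """Tanh activation, following the expression above."""
    return (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))

def leaky_relu(x, alpha=0.01):
    """LeakyReLU activation; alpha is the assumed negative slope."""
    return np.where(x >= 0, x, alpha * x)

vals = np.array([-2.0, 0.0, 2.0])
t = tanh(vals)
l = leaky_relu(vals)
```

Tanh squashes inputs into (-1, 1), while LeakyReLU passes positive values unchanged and keeps a small gradient for negative ones, which matches the roles the embodiment assigns them.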
A root-mean-square-error loss function is set to measure the deviation between the reconstructed data and the real data, and the adaptive moment estimation optimization algorithm is used to train the network parameters; the root-mean-square-error loss function is calculated as:
Y_RMSE = sqrt( (1/N) · Σ_l (y^l - ŷ^l)² )
where Σ is the summation operation and the square root yields Y_RMSE, the arithmetic square root value of the mean square error; y^l is the true label distribution of the input data, ŷ^l is the value predicted by the self-encoder model, the superscript l indicates which true value and predicted value (over the total number of classes in the sample) are undergoing the loss calculation, and N is the total number of samples in each batch.
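The root-mean-square-error loss can be sketched as follows; the toy label vectors are assumptions, and the adaptive moment estimation (optimizer) step is omitted:

```python
import numpy as np

def rmse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Root-mean-square error between true labels and model predictions,
    averaged over the N entries of a batch."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Toy true labels and predictions (illustrative values only).
y_true = np.array([1.0, 0.0, 1.0, 0.0])
y_pred = np.array([0.8, 0.2, 0.9, 0.1])
loss = rmse(y_true, y_pred)
```

Because the square root is applied after averaging, the loss has the same units as the data, which makes the deviation between reconstruction and input easy to interpret.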
And 4.2, inputting the test set obtained in the step 3 into a neural network model of a convolution self-encoder, and obtaining a compressed XRF spectrum by training. The detailed steps are as follows:
4.2.1, inputting the test set obtained in the step 3 as input data into an input layer of a neural network of the convolutional self-encoder, respectively mapping the input data to m hidden layers in the encoder to perform feature compression to obtain core features of the input data, and completing the encoding process of the neural network model of the convolutional self-encoder, wherein the encoding expression of the neural network model of the convolutional self-encoder is as follows:
y = f(w_y · x + b_y)   (4)
where x is the loaded input data, y represents the learned features of the intermediate hidden layer, w_y represents the weights of the hidden-layer input, b_y represents the bias coefficient of the hidden unit, and f represents the convolution operation;
Step 4.2.2, transmit the core features obtained in step 4.2.1 to the m-1 hidden layers of the decoder for decompression, and then transmit them to the output layer for reconstruction, obtaining output data similar to the input data, namely the feature-compressed XRF spectrum, thereby completing the decoding process of the self-encoder. In this embodiment, the input layer of the encoder and the output layer of the decoder have the same scale, and the decoding process is expressed as follows:
z = f(w_z · y + b_z)   (5)
where y represents the learned features of the intermediate hidden layer, z is the data reconstructed from the hidden feature y, w_z represents the weights of the hidden-layer output, b_z represents the bias coefficient of the output unit, and f represents the convolution operation;
w_y = w_z′ = w   (6)
where w_z′ denotes the transpose of w_z; this formula is the constraint condition of the convolutional self-encoder and indicates that it uses the same tied weight w, which helps halve the number of model parameters;
the target of the training of the convolution self-encoder continuously reduces the error of the input data and the reconstructed data, and the mathematical expression is as follows:
Figure BDA0003824130820000081
here, the parameters that the auto-encoder needs to train are: w, b y ,b z Wherein x is the loaded sample data, z represents the reconstructed output data, and c (x, z) represents the error between the input data and the reconstructed data;
the weight updating rule is expressed by the following formula:
Figure BDA0003824130820000082
Figure BDA0003824130820000083
Figure BDA0003824130820000084
where cost (x, z) is the error loss between the input data and the reconstructed data, η is the learning rate,
Figure BDA0003824130820000085
which means that the weight W is derived by a partial derivation,
Figure BDA0003824130820000086
representing the offset coefficient b to the hidden unit y The deviation is calculated and calculated according to the actual measurement,
Figure BDA0003824130820000087
representing the offset coefficient b to the output unit z And (5) calculating partial derivatives.
In summary, the convolutional self-encoder neural network model of this embodiment takes the training set or test set of unknown samples created in step 3 as input data, applies convolution operations in the coding layers of the self-encoder, and trains the self-encoder in an unsupervised manner, obtaining a trained convolutional self-encoder neural network model and the feature-compressed spectrogram of the training spectral data.
And 5, building a Kmeans unsupervised classification model, inputting the XRF spectrum data map obtained in the step 4.2 after feature compression into the Kmeans unsupervised classification model, and obtaining an XRF spectrum spectrogram classification result of an unknown sample through training. The detailed process is as follows:
Step 5.1, building the Kmeans unsupervised classification model:
Step 5.1.1, given a sample data set x = {x1, x2, ..., xn}, select k (k is less than or equal to n) data samples from the data set as initial centers, compute the distance from each data sample to every center, and assign each data sample to the class whose center is nearest;
Step 5.1.2, for the data samples in each class obtained in step 5.1.1, update the center of the class, for example by taking the mean of the class;
Step 5.1.3, repeat the two steps above to update the class centers; if the class centers no longer change, or the change of the class centers is smaller than a certain threshold, stop updating to form the class clusters, completing the building of the Kmeans unsupervised classification model; otherwise, continue;
and (3) iteratively realizing the target that the distance between the sample and the class center to which the sample belongs is minimum, wherein the objective function is as follows:
Figure BDA0003824130820000091
wherein, mu i Representation set S i The sum of all elements in the class and the mean distance is VarS i
The Euclidean distance is selected as the distance measure:

d(X_i, μ_j) = ||X_i - μ_j|| = sqrt(Σ_t (X_{i,t} - μ_{j,t})²)

where X_i denotes the i-th data sample.
Step 5.2, input the feature-compressed XRF spectrogram data into the Kmeans network for training to obtain a trained Kmeans classification network model.
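Steps 5.1.1-5.1.3 can be sketched as plain Lloyd-style K-means in NumPy. This is a hypothetical illustration, not the patent's code; the function and variable names are ours, and the demo seeds the initial centers with one sample per cluster to keep it deterministic:

```python
import numpy as np

def kmeans(X, k, init_idx=None, max_iter=100, tol=1e-6, seed=0):
    """Steps 5.1.1-5.1.3: pick k samples as initial centers, assign
    every sample to its nearest center, recompute each center as the
    class mean, and stop once the centers move less than a threshold."""
    rng = np.random.default_rng(seed)
    if init_idx is None:
        init_idx = rng.choice(len(X), size=k, replace=False)
    centers = X[np.asarray(init_idx)].copy()
    for _ in range(max_iter):
        # Euclidean distance from every sample to every center (step 5.1.1)
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Update each center to the mean of its class (step 5.1.2)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)])
        # Stop when the centers barely move (step 5.1.3)
        if np.linalg.norm(new_centers - centers) < tol:
            centers = new_centers
            break
        centers = new_centers
    return labels, centers

# Two well-separated toy clusters stand in for compressed spectra.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (50, 24)),
               rng.normal(5.0, 0.1, (50, 24))])
labels, centers = kmeans(X, k=2, init_idx=[0, 50])
```

In practice the 8 × 3 compressed feature maps would be flattened into 24-dimensional vectors before being passed to the clustering step, which is what the 24-column toy data above mimics.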
To verify the feasibility and effectiveness of the method, this embodiment applies it as follows:
Step 1, acquire the XRF spectral data of the unknown samples to be classified;
Step 2, preprocess and compress the XRF spectral data to be classified according to the method of step 2 above: first normalize the XRF spectra to be classified and convert each one into the form of a two-dimensional spectral information matrix; then create a training sample set from the converted XRF spectral information;
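The normalization and matrix conversion of this step can be sketched as follows. This is a simplified assumption: the patent does not specify how a spectrum shorter than side² channels is laid out, so zero-padding is used here, and a small 64 × 64 matrix stands in for the 512 × 512 case:

```python
import numpy as np

def preprocess(spectrum, side=512):
    """Min-max normalize a 1-D spectrum (formula (1) of the claims) and
    reshape it into a side x side two-dimensional spectral matrix."""
    s = np.asarray(spectrum, dtype=float)
    s_norm = (s - s.min()) / (s.max() - s.min())  # X_norm in [0, 1]
    padded = np.zeros(side * side)                # zero-pad (our assumption)
    padded[: s_norm.size] = s_norm
    return padded.reshape(side, side)

# A 2048-channel ramp spectrum mapped into a 64 x 64 matrix.
m = preprocess(np.arange(2048), side=64)
```

The resulting matrices form the sample set fed to the convolutional self-encoder.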
Step 3, input the training sample set created in step 2 into the neural network model of the convolutional self-encoder for feature compression, obtaining feature-compressed XRF spectrogram data with a compressed size of 8 × 3.
Step 4, input the feature-compressed XRF spectral data into the trained Kmeans network to obtain the classification result of the unknown samples' XRF spectrograms.
Through the above steps, the classification result of the XRF spectrograms of the unknown samples is shown in fig. 5. A confusion matrix gives an intuitive picture of how a classification model behaves on each class of sample, is often used as part of model evaluation, and makes it easy to see whether classes are confused with one another. In the confusion matrix, entries on the diagonal are correct predictions, while entries off the diagonal are samples mispredicted as other classes. As shown in fig. 5, there are only two misclassifications, between the soil class and the metals-and-alloys class; all other classes are classified well, which demonstrates the effectiveness of the method of this embodiment.
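The confusion matrix used in this evaluation can be computed as below. The labels are made up to echo the two soil vs. metals-and-alloys confusions of fig. 5, not the actual experimental data:

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes; diagonal
    entries are correct predictions, off-diagonal entries confusions."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

# Hypothetical labels: classes 0 ("soil"), 1 ("metals and alloys"),
# 2 (another material), with exactly two cross-confusions.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 1, 0, 2, 2]
cm = confusion_matrix(y_true, y_pred, 3)
n_errors = cm.sum() - np.trace(cm)   # off-diagonal total
```

Because K-means labels are arbitrary cluster indices, in a real evaluation the predicted cluster indices would first have to be matched to the true class labels before the matrix is filled in.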

Claims (6)

1. An XRF spectrogram automatic classification method based on a convolutional self-encoder, characterized in that the method comprises the following steps:
step 1, detecting an unknown sample by using a handheld X-ray fluorescence spectrum analyzer to obtain XRF spectrum data to be detected;
step 2, normalize the XRF spectrum data to be detected obtained in step 1 and convert it into the form of a two-dimensional spectral information matrix, each spectral information matrix converted into two-dimensional space containing 512 × 512 features; then create a training sample set from the converted XRF spectral information;
step 3, generating a training set and a testing set according to the training sample set created in the step 2;
step 4, building and training a neural network model of the convolution self-encoder:
step 4.1, iterative training is carried out on the training set based on the neural network of the convolutional self-encoder, and a neural network model of the convolutional self-encoder is constructed;
step 4.2, inputting the test set into the neural network model of the convolutional self-encoder obtained in the step 4.1 to obtain a compressed XRF spectrum;
and 5, building a Kmeans unsupervised classification model, inputting the XRF spectrum data map obtained in the step 4.2 after feature compression into the Kmeans unsupervised classification model, and obtaining an XRF spectrum spectrogram classification result of an unknown sample through training.
2. The XRF spectrogram automatic classification method based on a convolutional self-encoder of claim 1, wherein the formula adopted for normalization in step 2 is as follows:
X_norm = (X - X_min) / (X_max - X_min)   (1)

In formula (1), X_max denotes the maximum of the one-dimensional spectral data input for a sample of the same type, X_min denotes the minimum of the one-dimensional spectral data input for a sample of the same type, and X_norm denotes the normalized spectral data of that type of sample.
3. The method of automatically classifying XRF spectrograms based on a convolutional self-encoder as claimed in claim 1, wherein: the convolutional self-encoder neural network model constructed in step 4.1 comprises an encoder and a decoder; the encoder consists of 1 input layer and m encoding hidden layers, and the decoder consists of m-1 decoding hidden layers and 1 output layer; the input layer takes the training set obtained in step 3 as input data and passes it to the m encoding hidden layers; the m encoding hidden layers compress the input data to obtain its core features and pass them to the m-1 decoding hidden layers for decompression and then to the output layer, which reconstructs the received core data to obtain the feature-compressed XRF spectrum output.
4. The method of claim 3, wherein: the hidden layers of both the encoder and the decoder use the Tanh nonlinear function as the activation function, and the output layer of the decoder uses the LeakyReLU nonlinear function as the activation function.
5. The XRF spectrogram automatic classification method based on a convolutional self-encoder of claim 4, wherein the detailed process of obtaining the compressed XRF spectrum with the neural network model of the convolutional self-encoder in step 4.2 is as follows:
step 4.2.1, input the test set obtained in step 3 as input data into the input layer of the convolutional self-encoder, and map the input data through the m hidden layers of the encoder for feature compression to obtain the core features of the input data, completing the encoding process of the convolutional self-encoder; the expression is:
y = f(w_y · x + b_y)   (5)

where x is the loaded input data, y denotes the features learned by the intermediate hidden layer, w_y denotes the weight of the hidden-layer input, b_y denotes the bias coefficient of the hidden unit, and f denotes the convolution operation;
step 4.2.2, pass the core features obtained in step 4.2.1 through the m-1 hidden layers of the decoder for decompression and then to the output layer for reconstruction, obtaining output data similar to the input data, i.e. the feature-compressed XRF spectrum, thereby completing the decoding process of the self-encoder; the expression is:
z = f(w_z · y + b_z)   (6)

where y denotes the features learned by the intermediate hidden layer, z is the data reconstructed from the hidden feature y, w_z denotes the weight of the hidden-layer output, b_z denotes the bias coefficient of the output unit, and f denotes the convolution operation;
the constraint of the convolutional autocoder is as follows:
w y =w z ′=w (7)
wherein, w z ' means w z By transposition, indicating convolutional self-organizationThe codes have the same binding weight w, which helps to halve the parameter quantity of the model;
the target of the training of the convolution self-encoder continuously reduces the error of the input data and the reconstructed data, and the mathematical expression is as follows:
Figure FDA0003824130810000021
here, the parameters that the auto-encoder needs to train are: w, b y ,b z Wherein x is the loaded sample data, z represents the reconstructed output data, and c (x, z) represents the error between the input data and the reconstructed data;
the weight value updating rule is represented by the following formula:
Figure FDA0003824130810000022
Figure FDA0003824130810000023
Figure FDA0003824130810000031
where cost (x, z) is the error loss between the input data and the reconstructed data, η is the learning rate,
Figure FDA0003824130810000032
which means that the weight W is derived by a partial derivation,
Figure FDA0003824130810000033
representing the offset coefficient b to the hidden unit y The deviation is calculated and the deviation is calculated,
Figure FDA0003824130810000034
to indicate to the transmissionOffset coefficient b of output unit z And (5) calculating partial derivatives.
6. The XRF spectrogram automatic classification method based on a convolutional self-encoder of claim 1, wherein the process of building and training the Kmeans unsupervised classification model in step 5 is as follows:
step 5.1, building of Kmeans unsupervised classification model
Step 5.1.1, given a sample data set X = {x1, x2, …, xn}, take k (k ≤ n) data samples from the data set as the initial centers, compute the distance from each data sample in the set to the k centers, and assign each data sample to the class whose center is nearest to it;
step 5.1.2, for the data samples in each class obtained in step 5.1.1, update the center of the class by taking the class mean;
step 5.1.3, repeat the two steps above to update the class centers; if the class centers no longer change or change by less than a preset threshold, stop updating to form the class clusters, thereby completing the construction of the Kmeans unsupervised classification model; otherwise, continue;
The iteration aims to minimize the distance between each sample and the center of the class it belongs to; the objective function is:

J = argmin_S Σ_{i=1}^{k} Σ_{x∈S_i} ||x - μ_i||²

where μ_i denotes the mean of all elements in set S_i, and the sum of squared distances from the elements of a class to its mean is denoted Var S_i.
The Euclidean distance is selected as the distance measure:

d(X_i, μ_j) = ||X_i - μ_j|| = sqrt(Σ_t (X_{i,t} - μ_{j,t})²)

where X_i denotes the i-th data sample.
Step 5.2, input the feature-compressed XRF spectral data into the Kmeans unsupervised classification model for training to obtain a trained Kmeans classification network model.
CN202211053356.2A 2022-08-31 2022-08-31 Automatic XRF spectrogram classification method based on convolution self-encoder Pending CN115512144A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211053356.2A CN115512144A (en) 2022-08-31 2022-08-31 Automatic XRF spectrogram classification method based on convolution self-encoder

Publications (1)

Publication Number Publication Date
CN115512144A true CN115512144A (en) 2022-12-23

Family

ID=84502166

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211053356.2A Pending CN115512144A (en) 2022-08-31 2022-08-31 Automatic XRF spectrogram classification method based on convolution self-encoder

Country Status (1)

Country Link
CN (1) CN115512144A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116911354A (en) * 2023-09-14 2023-10-20 首都信息发展股份有限公司 Encoder neural network model construction method and data processing method


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination