CN111242201A

CN111242201A - Small-sample stellar spectrum classification method based on a generative adversarial network

Info

Publication number: CN111242201A
Application number: CN202010014622.5A
Authority: CN
Inventors: 余先川; 刘伟; 闫瑞清; 朱猛; 代聪; 陈思莹; 姚旺; 詹英; 田海峰
Original assignee: Beijing Normal University
Current assignee: Beijing Normal University
Priority date: 2020-01-07
Filing date: 2020-01-07
Publication date: 2020-06-05

Abstract

The invention discloses a semi-supervised stellar spectrum classification method based on a generative adversarial network (GAN), intended for cases in which few labeled training samples are available. Because very few spectra are known for certain specific celestial bodies, classifying their spectra is desirable but difficult: conventional classification methods based on machine learning and statistics require a large amount of labeled data to train the classification model. Unlike purely supervised approaches, the proposed method makes full use of a large number of unlabeled samples. The network consists of two parts: a generator that captures the data distribution and a discriminator that determines whether a sample is real data. The trained discriminator is then fine-tuned with a small amount of labeled data to obtain a better classification model. The performance of the model is evaluated on real-world spectral data with a limited amount of labeled spectra, and the experimental results show that the model outperforms the compared methods in classification accuracy.

Description

Small-sample stellar spectrum classification method based on a generative adversarial network

Technical field:

The invention belongs to the field of statistical-learning classification with a small amount of training data, and relates in particular to the classification of stellar spectral data.

Background art:

With the development of astronomical observation instruments, astronomy has generated more and more spectral data in recent decades, from projects such as the Sloan Digital Sky Survey (SDSS), the Global Astrometric Interferometer for Astrophysics (GAIA) and the Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST). Classifying stars helps us study the overall structure and evolution of the Galaxy, and it is also effective for detecting specific celestial bodies. Stellar spectral data contain continuous spectral information over bands ranging from the ultraviolet to the infrared. These many bands and continuous-spectrum characteristics make stellar spectra well suited to classification and identification. The search for and identification of particular celestial bodies is of great significance to astronomers.

From the perspective of data processing and mining, when a large amount of labeled data is available, a traditional supervised learning method can first train a neural network model and then classify unknown observations. However, the number of observations available for distinctive celestial bodies and rare astronomical phenomena is small. That is, compared with the large number of known celestial bodies and astronomical phenomena, the observations of distinctive celestial bodies and rare phenomena form a small sample. Searching for such bodies or phenomena in massive observation data is therefore essentially a classification problem with small or imbalanced samples. Conventional unsupervised learning methods suffer from low accuracy, so improving the accuracy of a classifier trained on a small number of spectra is a challenging problem.

The MK system (Morgan et al., 1973) is a widely used spectral classification method based on the spectral features of a small set of standard stars. It divides stellar spectra into seven classes ordered by stellar temperature from high to low. However, MK classification is mainly performed manually by comparing spectra with a small number of standard stars, which is inefficient and unreliable when the number of spectra is very large. Bailer-Jones (1997) trained an artificial neural network (ANN) directly from the observed spectra using physical parameters; this approach can be applied to spectral classification. Bailer-Jones (2003) favored pattern recognition algorithms and outlined two alternative classification frameworks (parallel and hierarchical) for data from the Galactic survey mission GAIA. The supervised methods widely used for classification can make full use of prior knowledge to improve classifier performance. However, they require a large number of labeled spectra to train the model, and obtaining such labels is time-consuming and expensive. Moreover, the number of known spectra for a given type of celestial body is generally small.

Summary of the invention:

The invention discloses a method for classifying small samples of stellar spectra based on a generative adversarial network (GAN). The method learns the features of stellar spectra by exploiting the strong feature-learning capacity of the GAN. A small amount of labeled data is then used to fine-tune the network, which solves the problem of classifying different stellar spectra when only a small sample is available.

A method for classifying small samples of stellar spectra based on a generative adversarial network comprises the following steps:

1) Collect the stellar spectral data and preprocess them using the z-score standardization method.

2) Train a generative adversarial network (GAN) using unlabeled real spectra and pseudo-spectra generated by the generative model. The trained generative model is used to simulate one-dimensional stellar spectra, and the trained discriminative model is used to determine whether a spectrum was generated by the generator network.

3) In step 2, when training the GAN, the input of the generative model is random noise p_z(z) = μ(-1, 1).

4) In step 2, when training the GAN, the generative model uses an up-sampling scheme to obtain a pseudo-spectrum of the same size as the real spectrum.

5) In step 2, when training the GAN, the generative model uses the tanh activation function to provide nonlinearity, and the last layer of the generative model is connected to the input layer of the discriminative model.

6) After the GAN is trained, the generative model is deleted, the discriminative model is retained, and the output dimension of the discriminative model is increased.

7) The output of the retained discriminative model is replaced with a Softmax function, which is typically used as a classifier for multi-class problems. The modified discrimination network can then classify stellar spectra into multiple classes.

8) A small amount of labeled spectral data is used to fine-tune the modified discrimination network.

9) After retraining with the small amount of labeled data, the network can perform multi-class classification tasks.
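Steps 1 and 3 above can be illustrated with a minimal sketch. It is not taken from the patent: the function names, the spectrum length and the noise dimension are hypothetical, standardization is assumed to be applied per spectrum, and p_z(z) is assumed to denote a uniform distribution on [-1, 1].

```python
import random
import statistics


def zscore_standardize(spectrum):
    """Rescale one spectrum to zero mean and unit standard deviation."""
    mean = statistics.fmean(spectrum)
    std = statistics.pstdev(spectrum)
    if std == 0.0:
        raise ValueError("a constant spectrum cannot be standardized")
    return [(flux - mean) / std for flux in spectrum]


def sample_noise(noise_dim, rng):
    """Draw one generator input, assuming p_z(z) is uniform on [-1, 1]."""
    return [rng.uniform(-1.0, 1.0) for _ in range(noise_dim)]


rng = random.Random(0)
# Hypothetical stand-in for a real spectrum: 1000 noisy flux values.
raw_spectrum = [rng.gauss(50.0, 5.0) for _ in range(1000)]
standardized = zscore_standardize(raw_spectrum)
noise = sample_noise(noise_dim=100, rng=rng)
```

Standardizing each spectrum separately removes differences in overall flux level between stars; standardizing each wavelength bin across the data set would be an equally plausible reading of step 1.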

Description of the drawings:

FIG. 1 shows the framework of the SCGAN network.

FIG. 2 shows spectra generated by the generative model.

FIG. 3 shows the confusion matrices of different methods for SDSS stellar spectral classification.

FIG. 4 compares the classification accuracy of different algorithms on SDSS.

Description of reference numerals:

Reference numeral 0 denotes one category of the final output of the discrimination network.

Reference numeral 1 denotes another category of the final output of the discrimination network.

Reference numeral 2 denotes the generation network.

Reference numeral 3 denotes the discrimination network.

Reference numeral 4 denotes the SVM confusion matrix.

Reference numeral 5 denotes the RF confusion matrix.

Reference numeral 6 denotes the CNN confusion matrix.

Reference numeral 7 denotes the SCGAN confusion matrix.

The symbol X denotes random noise.

The symbol Z denotes the pseudo-spectra generated from the noise.

Detailed description:

The invention is implemented as follows.

The invention consists of two parts. The first part is the training of the generative adversarial network (GAN), which comprises a generative model (G), used mainly to simulate and generate one-dimensional stellar spectra, and a discriminative model (D), used to determine whether a stellar spectrum was produced by the generation network. The second part fine-tunes the discriminative model (D) trained in the first part by using a small number of labeled stellar spectra. The two parts are detailed below.

First part:

1) We use an upsampling scheme to obtain a pseudo-spectrum of the same size as the real spectrum.

2) The input of G is random noise p_z(z) = μ(-1, 1).

3) To obtain a pseudo-spectrum (X in FIG. 1) of the same size as the real spectrum, we use upsampling to generate data of the same dimension as the real signal.

4) We use the tanh activation function in G to provide nonlinearity (Nair and Hinton, 2010). The last layer of G is connected to the input layer of D.

5) We train D and G using unlabeled real spectra and pseudo-spectra generated by G. The network framework of SCGAN is shown in FIG. 1. During SCGAN training, the task of the generation network is to produce data that are as realistic as possible in order to deceive the discrimination network, while the main objective of the discrimination network is to identify the generated data as false, i.e. to distinguish real data from generated data. The generation network and the discrimination network thus form a dynamic game process.

6) After the GAN is trained, G can generate spectra such as those shown in FIG. 2. We then delete G and increase the output dimension of D.

7) We design D as a classifier based on one-dimensional convolution, in which the ReLU activation function provides the nonlinear mapping.

8) The trained model is saved.
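The generator steps and the adversarial game of the first part can be sketched in a few lines. This is an illustration under stated assumptions, not the patented network: nearest-neighbour repetition stands in for the unspecified upsampling scheme, the toy generator has a single linear layer with hypothetical sizes and random weights, and the losses are the standard GAN losses of Goodfellow et al. (2014), which the text does not confirm as the exact objective used.

```python
import math
import random


def upsample_nearest(signal, factor):
    """Repeat every value `factor` times (nearest-neighbour upsampling)."""
    return [value for value in signal for _ in range(factor)]


def toy_generator(noise, weights, factor):
    """Map noise to a pseudo-spectrum: linear layer, upsampling, then tanh."""
    hidden = [sum(w * z for w, z in zip(row, noise)) for row in weights]
    return [math.tanh(v) for v in upsample_nearest(hidden, factor)]


def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Standard GAN discriminator loss: -E[log D(x)] - E[log(1 - D(G(z)))]."""
    real_term = -sum(math.log(p + eps) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p + eps) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake, eps=1e-12):
    """Non-saturating generator loss: -E[log D(G(z))]."""
    return -sum(math.log(p + eps) for p in d_fake) / len(d_fake)


rng = random.Random(0)
noise_dim, hidden_len, factor = 8, 25, 4  # hypothetical sizes
spectrum_len = hidden_len * factor        # length of the "real" spectrum
noise = [rng.uniform(-1.0, 1.0) for _ in range(noise_dim)]
weights = [[rng.gauss(0.0, 0.5) for _ in range(noise_dim)]
           for _ in range(hidden_len)]
pseudo_spectrum = toy_generator(noise, weights, factor)

# A discriminator that cannot tell real from fake outputs 0.5 everywhere.
undecided = discriminator_loss([0.5, 0.5], [0.5, 0.5])
```

D is trained to lower `discriminator_loss` while G is trained to lower `generator_loss`, so each update of one network changes the problem faced by the other. When D outputs 0.5 for every input, its loss equals 2 ln 2, the value at which it can no longer separate real spectra from pseudo-spectra.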

Second part:

9) The output of D is replaced with a Softmax function, which is typically used as a classifier for multi-class problems. In this way, D can classify stellar spectra into multiple classes.

10) A small amount of labeled spectral data is used to fine-tune the D obtained in the first part.

11) The retrained D can then be used to classify three spectral types (F-type, G-type and K-type spectra), achieving multi-class classification. For the experimental results, FIG. 3 shows the confusion matrices of different methods for SDSS stellar spectral classification.

12) We evaluate the classification performance of SCGAN using the F1-score and the confusion matrix. The F1-score is the harmonic mean of precision and recall, and a confusion matrix is typically used to evaluate the performance of a method in classification tasks. FIG. 4 compares the classification accuracy of different algorithms on SDSS.
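The Softmax output and the evaluation metrics of the second part can be made concrete with a short sketch. The logits and labels below are invented for illustration and are not experimental results; the function names are hypothetical, and only the three class names F, G and K come from the text.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confusion_matrix(y_true, y_pred, n_classes):
    """Entry [i][j] counts samples of true class i predicted as class j."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def f1_per_class(matrix):
    """F1 = 2 * precision * recall / (precision + recall) for each class."""
    n = len(matrix)
    scores = []
    for k in range(n):
        tp = matrix[k][k]
        predicted = sum(matrix[i][k] for i in range(n))  # column sum
        actual = sum(matrix[k])                          # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return scores


CLASSES = ["F", "G", "K"]
# Hypothetical logits from the fine-tuned D for four spectra.
logits = [[2.0, 0.5, 0.1], [0.2, 1.5, 0.3], [0.1, 0.4, 2.2], [0.3, 1.1, 1.0]]
y_true = [0, 1, 2, 2]
y_pred = [max(range(len(CLASSES)), key=softmax(row).__getitem__)
          for row in logits]
matrix = confusion_matrix(y_true, y_pred, len(CLASSES))
f1 = f1_per_class(matrix)
```

In this toy case the last K-type spectrum is predicted as G-type, which lowers precision for G and recall for K while leaving F unaffected. Reporting F1 per class keeps such errors visible even when the classes are imbalanced.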

Claims

1. A method for classifying small samples of stellar spectra based on a generative adversarial network, characterized by comprising the following steps:

step 1: learning the features of a large number of unlabeled stellar spectra by using a generative adversarial network;

step 2: training G and D of the generative adversarial network and, when D can no longer distinguish a real spectrum from a pseudo-spectrum, saving the trained model D;

step 3: replacing the output of the trained D with a Softmax function;

step 4: fine-tuning D by using a small amount of labeled spectral data, whereby the retrained D can classify stellar spectra into multiple categories.

2. The method for classifying small samples of stellar spectra based on a generative adversarial network according to claim 1, wherein the input of G is random noise p_z(z) = μ(-1, 1).

3. The method for classifying small samples of stellar spectra based on a generative adversarial network according to claim 1, wherein G uses the tanh activation function to provide nonlinearity and the last layer of G is connected to the input layer of D.

4. The method for classifying small samples of stellar spectra based on a generative adversarial network according to claim 1, wherein, after the generative adversarial network is trained, G is deleted and D is retained, the output dimension of D is increased, the last layer of D is replaced with Softmax, and D is designed as a classifier based on one-dimensional convolution in which the nonlinear mapping is provided by the ReLU activation function.