CN110929624B - Construction method of multi-task classification network based on orthogonal loss function - Google Patents

Construction method of multi-task classification network based on orthogonal loss function

Info

Publication number
CN110929624B
CN110929624B
Authority
CN
China
Prior art keywords
classification
classifier
loss function
network
fine
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911124037.4A
Other languages
Chinese (zh)
Other versions
CN110929624A (en)
Inventor
何贵青
敖振
霍胤丞
纪佳琪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwestern Polytechnical University filed Critical Northwestern Polytechnical University
Priority to CN201911124037.4A priority Critical patent/CN110929624B/en
Publication of CN110929624A publication Critical patent/CN110929624A/en
Application granted granted Critical
Publication of CN110929624B publication Critical patent/CN110929624B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Abstract

The invention provides a method for constructing a multi-task classification network based on an orthogonal loss function. The constructed multi-task classification network simulates the human learning process: a deep convolutional neural network is used as the hidden layer to imitate the human brain in extracting deep features, a tree classifier is used as the task-related output layer to perform progressive classification, and the recognition processes form different learning tasks. The invention makes the features obtained by different tasks better meet their respective requirements, so that the deep features of the same coarse class are more aggregated when the classifier performs the coarse classification task and the deep features of different fine classes are more separated when it performs the fine classification task. The task output-layer features of different classification tasks are thereby distinguished, classifiers at different levels obtain features better matched to their classification tasks, useless features are removed, and classification accuracy is improved.

Description

Construction method of multi-task classification network based on orthogonal loss function
Technical Field
The invention relates to the field of image classification, and in particular to a method for constructing a multi-task classification network.
Background
In recent years, photography and image recognition have been applied more and more widely in field exploration and daily life, which benefits from the development of deep learning. The deep convolutional neural network is currently the best-performing tool for feature extraction: it not only extracts edge information in its shallow layers, but also, as the layers deepen and the semantic information of the features becomes more abstract, obtains features that are closer to human cognition.
Subsequently, multi-task classification networks gradually came into view. In a multi-task classification network the different tasks assist one another: the different tasks of the same network are trained simultaneously, and each task has an independent loss function. According to human learning experience, recognizing the thousands of targets in the world is a progressive process from easy to difficult. In childhood, humans can only recognize rough categories of objects, e.g., birds, cars, plants. As the brain matures, increasingly fine species are encountered during learning or daily life; for example, birds are refined into parrots, sparrows, and so on, and vehicles into cars and buses. The multi-task network treats the recognition of coarse classes as a first-level task, under which there are several subtasks for recognizing fine classes. Thus, knowledge learned earlier in the task of recognizing coarse classes can also be used in the new task of recognizing fine classes.
However, the classification structure in the human brain is not as simple as expected. When a multi-task classification network performs multi-task classification, the hidden-layer parameters are learned jointly under the loss functions of all tasks, so the extracted features are shared across the classification tasks. Yet the features required by different classification tasks are not completely consistent. For example, in coarse classification, information about the steering wheel and the wheels may help the network correctly assign a target to the car category. When sub-categorizing, however, these common features may push the SUV and the coach into the same category, whereas the task of recognizing the fine classes is precisely to separate them, so the network should pay more attention to their distinctive characteristics, such as appearance and shape. Because the hidden-layer parameters are shared, the extracted features become entangled, and the multi-task classification network fails to distinguish the extracted common features from the extracted task-specific features when executing different classification tasks.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a method for constructing a multi-task classification network based on an orthogonal loss function. The invention proposes an orthogonal loss function to distinguish common features from task-specific features under a multi-task classification network model. The loss function measures loss using the spatial similarity of the features, so that the features obtained by each classification task better meet the requirements of that task, the separability of the features is improved, and the interference of useless features with the classification tasks is effectively suppressed. The multi-task classification network constructed by the invention simulates the human learning process: a deep convolutional neural network is used as the hidden layer to imitate the human brain in extracting deep features, a tree classifier is used as the task-related output layer to perform progressive classification, and the mutually related recognition processes, from easy to difficult, form different learning tasks. During back propagation the orthogonal loss function guides the multi-task classification network to distinguish the features of the different levels of the tree classifier, so that while updating its parameters the network learns features that better fit the requirements of the respective tasks.
The technical scheme adopted by the invention for solving the technical problem comprises the following steps:
step one: building a hierarchical label tree
The hierarchical label tree is divided into two layers, wherein the first layer of labels are rough labels of the images and are divided according to the species to which the images belong, and the second layer of labels are fine labels of the images and are defined according to the subclass of the species to which the images belong;
step two: constructing a deep convolutional neural network as a feature extraction module;
the deep convolutional neural network is selected to extract image depth features for Resnet-18 with a residual structure, and the image depth features comprise 18 layer network structures which comprise 17 convolutional layers and a full-connection layer. Except that the first convolution layer uses convolution kernels of 7 × 7, the other convolution layers all use convolution kernels of 3 × 3, wherein every two convolution layers form a residual block, identity mapping is added, the network requires an image RGB three-dimensional pixel value with an input dimension of 3 × 224, a feature vector output dimension of 512 × 7 is obtained after operation of 17 convolution layers, the output dimension of the feature vector input to the full-connection layer is set to be 1024, and therefore the image obtains a one-dimensional vector containing 1024 neurons through Resnet-18, namely the depth feature extracted by the depth convolution neural network;
step three: building a tree classifier for classification;
constructing a corresponding tree classifier according to the hierarchical label tree of each database in step one, wherein the tree classifier has a two-layer structure: the first layer contains one coarse classifier, used for the coarse classification task; the second layer contains N fine classifiers, used for the fine classification tasks. The sub-classifiers, namely the one coarse classifier and the N fine classifiers, have the same network structure and are independent of one another; each sub-classifier consists of two fully connected layers and a softmax classifier, and each sub-classifier produces a classification result. The depth features obtained in step two are input into each sub-classifier for classification; if the coarse classifier determines that the image belongs to the nth coarse class, then, according to the coarse classification result and the membership defined by the hierarchical label tree, the nth fine classifier is selected to perform fine classification and determine which fine class the image belongs to. The multi-task network comprising the deep convolutional neural network and the tree classifier is thus built; when an image is input into the multi-task network, it can be classified and an image classification result obtained;
step four: constructing an orthogonal loss function
Combining the deep convolutional neural network with the tree classifier, the multi-task classification network is built; training-set images are input to train the multi-task network, and a loss function needs to be constructed to update the parameters;
firstly, an orthogonal loss function is constructed to update the parameters. The result expected from the orthogonal loss is that the coarse-classifier feature vectors and the fine-classifier feature vectors are orthogonal in space, so that in the ideal case the cross features are 0. The objective that feature selection must achieve is added into the loss function as an orthogonal loss; the orthogonal loss function is constructed as follows:
L_{orth}(x) = \alpha \sum_{i=1}^{k} \mathrm{Tr}\left( f_g(x) \, f_i(x)^{T} \right)
where x denotes the pixel values of the N input images, k represents the number of coarse classes, f_1, f_2, ..., f_k represent the k sub-classification tasks, Tr represents the trace of a matrix, T represents matrix transposition, f_g(x) represents the coarse-classifier features obtained for the N images, and f_s(x) represents the sub-classifier features of the N images. The trace Tr(f_g(x) f_s(x)^T) is the sum of the corresponding inner products between the coarse-classifier features and the fine-classifier features of the N images; when Tr(f_g(x) f_s(x)^T) equals 0, every row vector of the coarse-classifier features is orthogonal to the corresponding row vector of the fine-classifier features. α is a hyperparameter;
when training-set images are input to train the multi-task network, back propagation is carried out with the orthogonal loss function, and the value of the orthogonal loss function is reduced by updating the parameters; as the orthogonal loss function approaches 0, f_g(x) and f_s(x) tend to become orthogonal;
step five: constructing a classification loss function;
in the process of passing f_g(x) and f_s(x) to the next fully connected layer and the softmax classifier for classification, the outputs of the neurons are mapped by the softmax classifier into the interval (0, 1), yielding a coarse-classification predicted value and a fine-classification predicted value respectively; the error between the predicted values and the true label values is then measured with a cross-entropy loss function, whose formula is as follows:
L_{cls} = -\log \frac{e^{W_g^{T} X + b_g}}{\sum_{j} e^{W_{g_j}^{T} X + b_{g_j}}} - \log \frac{e^{W_s^{T} X + b_s}}{\sum_{j} e^{W_{s_j}^{T} X + b_{s_j}}}
where g represents the coarse class and s the fine class, X represents the depth feature of the input image obtained in step two, W_g and W_s represent the weights in the coarse classifier and the fine classifier respectively, and b_g and b_s represent the biases in the coarse classifier and the fine classifier respectively; when the cross-entropy loss function approaches 0, the predicted values approach the true values;
step six: the orthogonal loss function of step four and the cross-entropy loss function of step five are added together, and the network parameters are updated through back propagation; during back propagation an SGD optimizer continuously updates the parameters with the training-set images by stochastic gradient descent, and the test set is evaluated after each round of training.
The invention has the advantage that, by proposing an orthogonal loss function that distinguishes common features from task-specific features under the multi-task classification network model, the features obtained by different tasks better meet their respective requirements: the depth features of the same coarse class are more aggregated when the classifier performs the coarse classification task, and the depth features of different fine classes are more separated when it performs the fine classification task. In addition, different threshold values are selected as the balance point between the orthogonal loss and the cross-entropy loss, and the optimal threshold is found through experimental comparison. After the optimal threshold is obtained, the orthogonal loss function is added to update the network parameters, so that the task output-layer features of different classification tasks are distinguished, classifiers at different levels obtain features better matched to their classification tasks, useless features are removed, and the classification accuracy is improved. The invention achieves a better classification effect on two different databases.
Drawings
FIG. 1 is a structural diagram of the Fashion-60 database hierarchical label tree in the present invention.
FIG. 2 is a structural diagram of the hierarchical label tree of the Caltech-UCSD Birds-200-2011 database in the present invention.
FIG. 3 is a schematic diagram of the selection of features in three-dimensional space by orthogonal loss functions in the present invention.
Fig. 4 is a schematic diagram of the overall structure of the multitask classification network according to the present invention.
Fig. 5 is a schematic diagram of a method for implementing a multi-task classification network with an orthogonal loss function added in the present invention.
FIG. 6 is a bar graph illustrating network identification accuracy for different Fashion-60 thresholds in accordance with the present invention.
FIG. 7 is a bar chart showing the network identification accuracy under different thresholds for Caltech-UCSD Birds-200-2011 in the present invention.
Detailed Description
The invention is further illustrated with reference to the following figures and examples.
Step one: constructing a hierarchical label tree;
because there is visual similarity between species belonging to the same ethnicity, hierarchical label trees are constructed for different databases using the dependencies between species in nature. The hierarchical label tree is divided into two layers, wherein the first layer of labels are rough labels of the images and are divided according to the species to which the images belong, and the second layer of labels are fine labels of the images and are defined according to the subclass of the species to which the images belong. Branches of the label tree represent dependencies. Caltech-UCSD libraries-200-. For example, hummingbirds are identified as a rough genus of birds, which are the primary labels of the samples. Under this genus, there are numerous types of hummingbirds as thin classes, which are the secondary labels of the sample, i.e., the original database labels. And the dependency relationship between the two levels of samples is used as a branch of the label tree to define the association between the two levels of samples, and the specific structure of the structure is shown in fig. 1. Similarly, the Fashinon-60 database contains 60 garment accessory categories. The construction method is to classify each of 5 kinds of clothes as a rough type according to its function with reference to common sense. For example, consider a shoe as a thick genus of the database as a first level tag of a hierarchical tag tree. The specific structure result of the slippers, leather boots and the like contained under the shoes as the thin classes is shown in figure 2, and the subordinate relations are also used as branches of the label tree;
step two: building deep convolutional neural network as feature extraction module
Resnet-18 with a residual structure is selected as the deep convolutional neural network for extracting image depth features; it is an 18-layer network comprising 17 convolutional layers and one fully connected layer. Except for the first convolutional layer, which uses 7 × 7 convolution kernels, all other convolutional layers use 3 × 3 kernels; every two convolutional layers form a residual block to which an identity mapping is added. The network takes as input the RGB pixel values of an image with dimensions 3 × 224 × 224; after the 17 convolutional layers a feature map of dimensions 512 × 7 × 7 is obtained and fed to the fully connected layer, whose output dimension is set to 1024. The image therefore yields, through Resnet-18, a one-dimensional vector containing 1024 neurons, which is the depth feature extracted by the deep convolutional neural network.
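A possible PyTorch realisation of this feature-extraction module is sketched below. Using torchvision's ResNet-18 and replacing its final fully connected layer is an assumption made for illustration; the text only fixes the 17-convolution-plus-one-fully-connected architecture and the 1024-dimensional output.

```python
import torch
import torch.nn as nn
from torchvision import models

class FeatureExtractor(nn.Module):
    """ResNet-18 backbone whose last fully connected layer outputs a 1024-dim depth feature."""
    def __init__(self, feature_dim=1024):
        super().__init__()
        backbone = models.resnet18(weights=None)               # 17 conv layers + avgpool + fc
        backbone.fc = nn.Linear(backbone.fc.in_features, feature_dim)
        self.backbone = backbone

    def forward(self, x):                                      # x: (N, 3, 224, 224) RGB images
        return self.backbone(x)                                # (N, 1024) depth features

features = FeatureExtractor()(torch.randn(4, 3, 224, 224))
print(features.shape)                                          # torch.Size([4, 1024])
```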
Step three: building tree classifier for classification
Constructing a corresponding tree classifier according to the hierarchical label tree of each database in step one, the tree classifier has a two-layer structure: the first layer contains one coarse classifier, used for the coarse classification task; the second layer contains N fine classifiers (assuming N coarse classes), used for the fine classification tasks. The sub-classifiers (the one coarse classifier and the N fine classifiers) have the same network structure and are independent of one another; each sub-classifier consists of two fully connected layers and a softmax classifier, and each sub-classifier produces a classification result. The depth features obtained in step two are input into each sub-classifier for classification; if the coarse classifier determines that the image belongs to the nth coarse class, then, according to the coarse classification result and the membership defined by the hierarchical label tree, the nth fine classifier is selected to perform fine classification and determine which fine class the image belongs to. The construction of the multi-task network comprising the deep convolutional neural network and the tree classifier is thus completed, as shown in fig. 5; when an image is input into the multi-task network, image classification can be carried out to obtain an image classification result;
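One way to realise this tree classifier in PyTorch is sketched below; the hidden width of 512 and the use of ReLU between the two fully connected layers are assumptions not fixed by the text, and the softmax is applied by the loss during training or explicitly at prediction time.

```python
import torch
import torch.nn as nn

class SubClassifier(nn.Module):
    """Two fully connected layers; returns the task-specific feature and the class logits."""
    def __init__(self, in_dim, num_classes, hidden=512):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)
        self.fc2 = nn.Linear(hidden, num_classes)

    def forward(self, x):
        h = torch.relu(self.fc1(x))           # task-specific feature (f_g or f_s)
        return h, self.fc2(h)                 # feature and class logits (softmax applied later)

class TreeClassifier(nn.Module):
    """One coarse sub-classifier plus one independent fine sub-classifier per coarse class."""
    def __init__(self, in_dim, fine_counts):
        super().__init__()
        self.coarse = SubClassifier(in_dim, len(fine_counts))
        self.fine = nn.ModuleList(SubClassifier(in_dim, c) for c in fine_counts)

    def forward(self, depth_feature):
        f_g, coarse_logits = self.coarse(depth_feature)
        fine_out = [fine(depth_feature) for fine in self.fine]   # list of (f_s, fine logits)
        return f_g, coarse_logits, fine_out   # fine_out[n] is used when the coarse result is n
```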
step four: constructing an orthogonal loss function
Combining a deep convolutional neural network with a tree classifier, building a multi-task classification network, inputting a training set image to train the multi-task network, and needing to construct a loss function to update parameters;
First, an orthogonal loss function is constructed to update the parameters and achieve feature selection. The result expected from the orthogonal loss is that the feature vectors of the coarse classifier and the feature vectors of the fine classifiers are orthogonal in space, so that in the ideal case the cross features are 0, as shown in fig. 3. The invention adds the objective to be achieved by feature selection into a loss function, yielding an orthogonal loss whose formula is constructed as follows:
L_{orth}(x) = \alpha \sum_{i=1}^{k} \mathrm{Tr}\left( f_g(x) \, f_i(x)^{T} \right)
where x denotes the pixel values of the N input images, k represents the number of coarse classes, f_1, f_2, ..., f_k represent the k sub-classification tasks (only one sub-classifier structure is shown in fig. 4), Tr represents the trace of a matrix, T represents matrix transposition, f_g(x) represents the coarse-classifier features obtained for the N images, and f_s(x) represents the sub-classifier features of the N images. The trace Tr(f_g(x) f_s(x)^T) is the sum of the corresponding inner products between the coarse-classifier features and the fine-classifier features of the N images; when this sum equals 0, every row vector of the coarse-classifier features is orthogonal to the corresponding row vector of the fine-classifier features. α is a hyperparameter (the optimal value is 2 for the bird database and 2.5 for the Fashion-60 database), and its size determines the influence of the orthogonal loss on the whole network parameters during back propagation.
When training-set images are input to train the multi-task network, back propagation is carried out with the orthogonal loss function, and the value of the orthogonal loss function is reduced by updating the parameters; as the orthogonal loss function approaches 0, f_g(x) and f_s(x) tend to become orthogonal. The position where the loss acts is shown in fig. 4.
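Read this way, the orthogonal loss can be written in a few lines of PyTorch. Treating it as α times the summed per-image inner products (the matrix trace) between the coarse features and each fine classifier's features, normalised by the batch size, is an assumption consistent with the description above rather than a verbatim transcription of the formula.

```python
import torch

def orthogonal_loss(f_g, fine_features, alpha=2.5):
    """f_g: (N, D) coarse-classifier features; fine_features: list of (N, D) fine-classifier features."""
    loss = f_g.new_zeros(())
    for f_s in fine_features:
        # Tr(f_g @ f_s.T) = sum over the N images of <f_g[i], f_s[i]>; with
        # non-negative (post-ReLU) features this is 0 only when every row pair is orthogonal.
        loss = loss + torch.trace(f_g @ f_s.t())
    return alpha * loss / f_g.shape[0]   # per-image normalisation is an assumption
```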
Step five: constructing a classification loss function
In the process of passing f_g(x) and f_s(x) to the next fully connected layer and the softmax classifier for classification, the outputs of the neurons are mapped into the interval (0, 1), yielding a coarse-classification predicted value and a fine-classification predicted value respectively; the error between the predicted values and the true label values is then measured with a cross-entropy loss function, whose formula is as follows:
L_{cls} = -\log \frac{e^{W_g^{T} X + b_g}}{\sum_{j} e^{W_{g_j}^{T} X + b_{g_j}}} - \log \frac{e^{W_s^{T} X + b_s}}{\sum_{j} e^{W_{s_j}^{T} X + b_{s_j}}}
where g represents the coarse class and s the fine class, X represents the depth feature of the input image obtained in step two, W_g and W_s represent the weights in the coarse classifier and the fine classifier respectively, and b_g and b_s represent the biases in the coarse classifier and the fine classifier respectively; when the cross-entropy loss function approaches 0, the predicted values approach the true values. The position where the loss acts is shown in fig. 4;
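A corresponding sketch of the classification loss is given below: standard cross entropy on the coarse prediction plus cross entropy on the output of the fine classifier that owns the ground-truth coarse label. Selecting the fine classifier by the true coarse label during training, and summing the two terms with equal weight, are assumptions made for illustration.

```python
import torch
import torch.nn as nn

ce = nn.CrossEntropyLoss()   # applies the softmax internally to the logits

def classification_loss(coarse_logits, fine_logits_list, coarse_labels, fine_labels):
    # fine_labels are assumed to be indices local to each coarse class's fine classifier
    loss = ce(coarse_logits, coarse_labels)
    for n, fine_logits in enumerate(fine_logits_list):
        mask = coarse_labels == n                 # images whose true coarse class is n
        if mask.any():
            loss = loss + ce(fine_logits[mask], fine_labels[mask])
    return loss
```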
step six: the orthogonal loss function of step four and the cross-entropy loss function of step five are added together, and the network parameters are updated through back propagation; during back propagation an SGD optimizer continuously updates the parameters with the training-set images by stochastic gradient descent, and the test set is evaluated after each round of training.
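The training step can then be sketched as follows. This reuses the FeatureExtractor, TreeClassifier, orthogonal_loss and classification_loss sketches above, and the data loader, the coarse/fine class counts and the SGD hyperparameters are assumptions made only to show how the two losses are summed and back-propagated.

```python
import torch

extractor = FeatureExtractor()
tree = TreeClassifier(in_dim=1024, fine_counts=[6] * 10)    # hypothetical: 10 coarse classes, 6 fine each
optimizer = torch.optim.SGD(
    list(extractor.parameters()) + list(tree.parameters()), lr=0.01, momentum=0.9
)

for images, coarse_labels, fine_labels in train_loader:     # train_loader is assumed to exist
    depth = extractor(images)                                # (N, 1024) depth features
    f_g, coarse_logits, fine_out = tree(depth)
    fine_feats = [h for h, _ in fine_out]
    fine_logits = [logits for _, logits in fine_out]

    loss = orthogonal_loss(f_g, fine_feats) + \
           classification_loss(coarse_logits, fine_logits, coarse_labels, fine_labels)

    optimizer.zero_grad()
    loss.backward()                                          # back-propagate both losses together
    optimizer.step()
```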
The present invention uses the Resnet network as the hidden layer for feature extraction and then passes the features to a tree classifier for multi-task classification. Although the multi-task classification network effectively alleviates the interference of inter-class similarity with the network, it also has shortcomings. In multi-task classification the features required by each classification task are different, but because the parameters of all tasks in the model are shared during feature extraction, the features required by the individual tasks are mixed together; redundant features required only by other tasks may therefore interfere with the recognition performance of a classifier. The orthogonal loss function is therefore proposed for the multi-task classification network: feature selection is achieved by measuring the spatial distance between the depth features of the classifiers at different levels of the tree classifier, so that the features obtained by each layer of classifier better meet the requirements of its own classification task, and the shortcoming of the existing model is overcome.
According to fig. 3, the present embodiment proposes a multi-task classification network based on an orthogonal loss function, which includes the following four parts: (1) a large number of picture labels are organised using the subordination relations between species to construct a complete hierarchical label tree; (2) a deep convolutional neural network is used to extract the depth features of the different pictures, so that the depth features extracted by the network can be used simultaneously by the multi-task classifier; (3) a tree classifier is used in place of the traditional N-way softmax classifier to realise multi-task classification; (4) an orthogonal loss function is constructed to measure the depth features of the classifiers at different levels, and the two kinds of features are distinguished by increasing the spatial distance between the common features and the specific features and deleting the cross features.
Compared with the traditional deep-learning network model, the model of this embodiment has two advantages. First, the embodiment constructs a hierarchical label tree using the subordination relations between species in order to realise multi-task classification; it takes into account that the degrees of discrimination between categories differ and classifies progressively, layer by layer, from easy to difficult, so that the tasks of the different levels assist one another and the error gradients are distributed more evenly. Second, the orthogonal loss function constructed in this embodiment distinguishes the depth features extracted by the deep convolutional neural network: the orthogonal loss measures the spatial distance between the coarse-classifier features and the fine-classifier features, and this distance is increased through the automatic updating of the network parameters, so that the proportion of cross features in the classification tasks is reduced and the accuracy of the model in image recognition is improved. In this embodiment, the influence of the orthogonal loss on the whole network parameters during back propagation is controlled by setting different threshold values, and experiments are carried out to verify the advantage of the orthogonal loss function in feature selection. Evidently, after the orthogonal loss function is added to the multi-task network, the obtained features better meet the requirements of the various classification tasks and the recognition accuracy of the network is improved.
In order to quantitatively evaluate the effect of the orthogonal loss function on the multitask classification network, the embodiment first selects Fashion-60 to evaluate the effect of the orthogonal loss function, and the database has 60 clothing subclasses and 10 clothing classes divided according to common sense of life as rough classes. The present embodiment will be evaluated from two aspects: (1) the influence of the orthogonal loss function on the whole network parameters in the process of back propagation is changed by adjusting the size of the threshold alpha, and how to reach the balance of the orthogonal loss and the cross entropy loss is evaluated. (2) An orthogonal loss function is used for updating parameters in the multi-task classification network, and whether the performance is better than that of a traditional multi-task classification network structure is evaluated; in the network structure of the embodiment, the CNN uses Resnet to perform feature extraction, and has 18 layers in total.
During the experiments, different settings of the threshold α were tried in order to select a suitable α for training. The embodiment therefore selects 14 different α values between 0.0001 and 6. The experimental results are shown in FIG. 6. As can be seen from the figure, the network performance is best when α is 2.5. When the influence factor is small, the influence of the orthogonality loss function on the network is insignificant. As the value increases further, the performance of the network gradually decreases, which means that when the value is too large the effect of the orthogonality loss function grows to the point of impairing the original performance of the network and plays a negative role. Therefore α = 2.5 is selected to train the network and compare it with the reference network.
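The threshold sweep itself is a simple loop. The α grid below is illustrative (the text only states that 14 values between 0.0001 and 6 were tried), and train_and_evaluate is a hypothetical helper standing in for one full training and testing run of the multi-task network with the given α.

```python
# Illustrative alpha grid and sweep; the exact values and the helper are assumptions.
alphas = [0.0001, 0.001, 0.01, 0.1, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 5.0, 6.0]

results = {}
for alpha in alphas:
    fine_acc, coarse_acc = train_and_evaluate(alpha)        # hypothetical helper
    results[alpha] = (fine_acc, coarse_acc)

best_alpha = max(results, key=lambda a: results[a][0])      # alpha with the best fine-class accuracy
```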
Table one: classification accuracy of networks
As can be seen from Table one, the classification accuracy of the multi-task network based on the orthogonal loss is clearly better than that of the conventional deep convolutional neural network and of the multi-task network without the orthogonal loss. This result demonstrates that the orthogonality loss function proposed by the invention effectively accomplishes feature selection, so that the features obtained in the multi-task network better meet the task requirements.
Similarly, 14 different α values between 0.0001 and 6 are selected in this embodiment to evaluate the results on the Caltech-UCSD Birds-200-2011 database. The experimental results are shown in FIG. 7. As can be seen from the figure, the network performance is best when α is 2. When the influence factor α is small, the orthogonality loss function is unstable for the network, and as the value gradually increases beyond a certain range, the network performance gradually decreases. The optimal value of α differs from the experimental result on Fashion-60, but the overall trend of α with respect to network performance is the same for both databases. The further experimental results show that the value of α may differ between databases, but the effect of α on network performance is regular: a value that is too small has no effect, while a value that is too large has an adverse effect. Only by taking an intermediate value can the balance between the orthogonal loss and the softmax loss be found.
Table two: classification accuracy of networks
Methods | Basic architecture | Fine-classes | Coarse-classes
CNN | Alexnet | 67.683% | --
CNN | VGG-19 | 68.816% | --
CNN | Resnet-18 | 70.094% | --
Multi-task network | Resnet-18 + tree classifier (baseline) | 72.491% | 96.303%
Multi-task network | Resnet-18 + tree classifier + Orthogonality Loss | 73.399% | 96.842%
From the data in Table two it can be seen that, on Caltech-UCSD Birds-200-2011, the multi-task network based on the orthogonal loss is also significantly better than the other two types of networks. This further demonstrates that the orthogonality loss function proposed by the invention effectively accomplishes feature selection, making the features obtained in the multi-task network more discriminative.
By proposing an orthogonal loss function that distinguishes common features from characteristic features under the multi-task classification network model, the features obtained by different tasks better meet their respective requirements: the features of the same coarse class are more aggregated when the classifier performs the coarse classification task, and the features of different fine classes are more separated when it performs the fine classification task. In addition, the method selects different threshold values as the balance point between the orthogonal loss and the cross-entropy loss and finds the optimal threshold through experimental comparison. After the optimal threshold is obtained, the orthogonal loss function is added to update the network parameters, so that the task output-layer features of different classification tasks are distinguished, classifiers at different levels obtain features better matched to their classification tasks, useless features are removed, and the classification accuracy is improved. The invention achieves a better classification effect on two different databases.
The foregoing illustrates and describes the principles, general features, and advantages of the present invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are described in the specification and illustrated only to illustrate the principle of the present invention, but that various changes and modifications may be made therein without departing from the spirit and scope of the present invention, which fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims (1)

1. A construction method of a multitask classification network based on an orthogonal loss function is characterized by comprising the following steps:
step one: constructing a hierarchical label tree;
the hierarchical label tree is divided into two layers, wherein the first layer of labels are rough labels of the images and are divided according to the species to which the images belong, and the second layer of labels are fine labels of the images and are defined according to the subclass of the species to which the images belong;
step two: constructing a deep convolutional neural network as a feature extraction module;
selecting Resnet-18 with a residual structure as the deep convolutional neural network for extracting image depth features, the network having an 18-layer structure comprising 17 convolutional layers and one fully connected layer; except for the first convolutional layer, which uses 7 × 7 convolution kernels, all other convolutional layers use 3 × 3 kernels, every two convolutional layers forming a residual block to which an identity mapping is added; the network takes as input the RGB pixel values of an image with dimensions 3 × 224 × 224, a feature map of dimensions 512 × 7 × 7 is obtained after the 17 convolutional layers and is input into the fully connected layer, whose output dimension is 1024, so that the image yields through Resnet-18 a one-dimensional vector containing 1024 neurons, namely the depth feature extracted by the deep convolutional neural network;
step three: building a tree classifier for classification;
constructing a corresponding tree classifier according to the hierarchical label tree of step one, wherein the tree classifier has a two-layer structure: the first layer comprises one coarse classifier, used for the coarse classification task; the second layer comprises N fine classifiers, used for the fine classification tasks; the sub-classifiers, namely the one coarse classifier and the N fine classifiers, have the same network structure and are independent of one another, each sub-classifier comprising two fully connected layers and a softmax classifier and producing a classification result; the depth features obtained in step two are input into each sub-classifier for classification, and if the coarse classifier determines that the image belongs to the nth coarse class, the nth fine classifier is selected, according to the classification result of the coarse classifier and the membership defined by the hierarchical label tree, to perform fine classification and determine which fine class the image belongs to; the multi-task classification network comprising the deep convolutional neural network and the tree classifier is thus built, and when an image is input into the multi-task classification network, image classification can be carried out to obtain an image classification result;
step four: constructing an orthogonal loss function
Combining a deep convolutional neural network with a tree classifier, building a multi-task classification network, inputting a training set image to train the multi-task classification network, and constructing a loss function to update parameters;
firstly, an orthogonal loss function is constructed to update the parameters, the result expected from the orthogonal loss being that the coarse-classifier feature vectors and the fine-classifier feature vectors are orthogonal in space, so that in the ideal case the cross features are 0; the objective to be achieved by feature selection is added into the loss function as an orthogonal loss, and the orthogonal loss function is constructed as follows:
L_{orth}(x) = \alpha \sum_{i=1}^{k} \mathrm{Tr}\left( f_g(x) \, f_i(x)^{T} \right)
where x denotes the pixel values of the N input images, k represents the number of coarse classes, f_1, f_2, ..., f_k represent the k sub-classification tasks, Tr represents the trace of a matrix, T represents matrix transposition, f_g(x) represents the coarse-classifier features obtained for the N images, and f_s(x) represents the sub-classifier features of the N images; the trace Tr(f_g(x) f_s(x)^T) is the sum of the corresponding inner products between the coarse-classifier features and the fine-classifier features of the N images, and when Tr(f_g(x) f_s(x)^T) equals 0, every row vector of the coarse-classifier features is orthogonal to the corresponding row vector of the fine-classifier features; α is a hyperparameter;
when training-set images are input to train the multi-task classification network, back propagation is carried out with the orthogonal loss function, and the value of the orthogonal loss function is reduced by updating the parameters; as the orthogonal loss function approaches 0, f_g(x) and f_s(x) tend to become orthogonal;
step five: constructing a classification loss function;
in the process of passing f_g(x) and f_s(x) to the next fully connected layer and the softmax classifier for classification, the outputs of the neurons are mapped by the softmax classifier into the interval (0, 1), yielding a coarse-classification predicted value and a fine-classification predicted value respectively; the error between the predicted values and the true label values is then measured with a cross-entropy loss function, whose formula is as follows:
L_{cls} = -\log \frac{e^{W_g^{T} X + b_g}}{\sum_{j} e^{W_{g_j}^{T} X + b_{g_j}}} - \log \frac{e^{W_s^{T} X + b_s}}{\sum_{j} e^{W_{s_j}^{T} X + b_{s_j}}}
where g represents the coarse class and s the fine class, X represents the depth feature of the input image obtained in step two, W_g and W_s represent the weights in the coarse classifier and the fine classifier respectively, and b_g and b_s represent the biases in the coarse classifier and the fine classifier respectively; when the cross-entropy loss function approaches 0, the predicted values approach the true label values;
step six: the orthogonal loss function of step four and the cross-entropy loss function of step five are added together, and the network parameters are updated through back propagation; during back propagation an SGD optimizer continuously updates the parameters with the training-set images by stochastic gradient descent, and the test set is evaluated after each round of training.
CN201911124037.4A 2019-11-18 2019-11-18 Construction method of multi-task classification network based on orthogonal loss function Active CN110929624B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911124037.4A CN110929624B (en) 2019-11-18 2019-11-18 Construction method of multi-task classification network based on orthogonal loss function

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911124037.4A CN110929624B (en) 2019-11-18 2019-11-18 Construction method of multi-task classification network based on orthogonal loss function

Publications (2)

Publication Number Publication Date
CN110929624A CN110929624A (en) 2020-03-27
CN110929624B true CN110929624B (en) 2021-09-14

Family

ID=69853358

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911124037.4A Active CN110929624B (en) 2019-11-18 2019-11-18 Construction method of multi-task classification network based on orthogonal loss function

Country Status (1)

Country Link
CN (1) CN110929624B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111507403A (en) * 2020-04-17 2020-08-07 腾讯科技(深圳)有限公司 Image classification method and device, computer equipment and storage medium
CN111798520B (en) * 2020-09-08 2020-12-22 平安国际智慧城市科技股份有限公司 Image processing method, device, equipment and medium based on convolutional neural network
CN112784776B (en) * 2021-01-26 2022-07-08 山西三友和智慧信息技术股份有限公司 BPD facial emotion recognition method based on improved residual error network
CN112924177B (en) * 2021-04-02 2022-07-19 哈尔滨理工大学 Rolling bearing fault diagnosis method for improved deep Q network
CN113408852B (en) * 2021-05-18 2022-04-19 江西师范大学 Meta-cognition ability evaluation model based on online learning behavior and deep neural network

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109919177A (en) * 2019-01-23 2019-06-21 西北工业大学 Feature selection approach based on stratification depth network
CN109992703A (en) * 2019-01-28 2019-07-09 西安交通大学 A kind of credibility evaluation method of the differentiation feature mining based on multi-task learning
CN110046668A (en) * 2019-04-22 2019-07-23 中国科学技术大学 A kind of high performance multiple domain image classification method

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Learning Multi-Domain Adversarial Neural Networks for Text Classification; XIAO DING et al; IEEE Access; 2019-04-09; Vol. 7; pp. 40323-40332 *
Multi-Task Networks With Universe, Group, and Task Feature Learning; Shiva Pentyala et al; arXiv:1907.01791v1 [cs.CL]; 2019-07-03; pp. 1-11 *
Study on Algorithm Evaluation of Image Fusion Based on Multi-hierarchical Synthetic Analysis; Guiqing He et al; ICSPCC2016; 2016-12-31; pp. 1-6 *
Design and Implementation of a Classification System for Short-Text User Comments; 李军炜; China Master's Theses Full-text Database, Information Science and Technology; 2018-11-15 (No. 11); pp. I138-613 *

Also Published As

Publication number Publication date
CN110929624A (en) 2020-03-27

Similar Documents

Publication Publication Date Title
CN110929624B (en) Construction method of multi-task classification network based on orthogonal loss function
CN108664924B (en) Multi-label object identification method based on convolutional neural network
Mathur et al. Crosspooled FishNet: transfer learning based fish species classification model
Çinar et al. Classification of raisin grains using machine vision and artificial intelligence methods
CN110659958B (en) Clothing matching generation method based on generation of countermeasure network
CN105809672B (en) A kind of image multiple target collaboration dividing method constrained based on super-pixel and structuring
CN106446933A (en) Multi-target detection method based on context information
Schwartz et al. Automatically discovering local visual material attributes
Sathiyanarayanan et al. Identification of breast cancer using the decision tree algorithm
CN112132014B (en) Target re-identification method and system based on non-supervised pyramid similarity learning
CN112733602B (en) Relation-guided pedestrian attribute identification method
Zhong et al. A comparative study of image classification algorithms for Foraminifera identification
CN112784921A (en) Task attention guided small sample image complementary learning classification algorithm
Lu et al. Crowdsourcing evaluation of saliency-based XAI methods
Balasubramaniyan et al. Color contour texture based peanut classification using deep spread spectral features classification model for assortment identification
Su et al. A CNN-LSVM model for imbalanced images identification of wheat leaf
Lubis et al. KNN method on credit risk classification with binary particle swarm optimization based feature selection
CN111127485B (en) Method, device and equipment for extracting target area in CT image
Jodas et al. Deep Learning Semantic Segmentation Models for Detecting the Tree Crown Foliage.
Chuntama et al. Classification of astronomical objects in the galaxy m81 using machine learning techniques ii. an application of clustering in data pre-processing
Jerandu et al. Image Classification of Decapterus Macarellus Using Ridge Regression
Rame et al. CORE: Color regression for multiple colors fashion garments
Basinet et al. Performance of two multiscale texture algorithms in classifying silver gelatin paper via k-nearest neighbors
Kumar et al. Image classification in python using Keras
Anggoro et al. Classification of Solo Batik patterns using deep learning convolutional neural networks algorithm

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant