CN110929624A - Construction method of multi-task classification network based on orthogonal loss function - Google Patents
Construction method of multi-task classification network based on orthogonal loss function
- Publication number: CN110929624A (application CN201911124037.4A)
- Authority
- CN
- China
- Prior art keywords
- classifier
- classification
- loss function
- network
- fine
- Prior art date
- Legal status
- Granted
Classifications
- G06V20/10 — Terrestrial scenes
- G06F18/241 — Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/24323 — Tree-organised classifiers
- G06N20/00 — Machine learning
- G06N3/045 — Combinations of networks
- G06N3/084 — Backpropagation, e.g. using gradient descent
Abstract
The invention provides a construction method of a multi-task classification network based on an orthogonal loss function. The constructed multi-task classification network simulates the human learning process: a deep convolutional neural network serves as the hidden layer, simulating the human brain in deep feature extraction, and a tree classifier serves as the task-related output layer for progressive classification, so that the recognition process forms different learning tasks. The invention makes the features obtained by different tasks better meet their respective requirements: the depth features of the same coarse class become more aggregated when the classifier completes the coarse classification task, and the depth features of different fine classes become more discrete when the fine classification task is completed. By distinguishing the task-output-layer features of different classification tasks, classifiers at different levels obtain features better matched to their own tasks and useless features are removed, which improves classification accuracy.
Description
Technical Field
The invention relates to the field of image classification, and in particular to a method for constructing a multi-task classification network.
Background
In recent years, photography and image recognition have been applied more and more widely in field exploration and daily life, thanks to the development of deep learning. The tool that currently performs best at feature extraction is the deep convolutional neural network: it not only extracts edge information in its shallow layers but, as the layers deepen and the semantic information of the features becomes more abstract, also obtains features closer to human cognition.
Subsequently, multi-task classification networks have gradually come into view. The different tasks of a multi-task classification network assist one another: they are trained simultaneously within the same network, and each task has its own loss function. According to human learning experience, recognizing the thousands of targets in the world is a progressive process from easy to difficult. In childhood, humans can only recognize a rough classification of objects, e.g., birds, cars, plants. As the brain matures, one comes into contact with ever finer species during learning or daily life, for example distinguishing parrots and sparrows among birds, or cars and buses among vehicles. The multi-task network regards recognition of the coarse classes as the first-level task, under which there are several subtasks recognizing the fine classes. Thus, knowledge learned earlier in the task of identifying coarse classes can also be used in the new task of identifying fine classes.
However, the classification structure in the human brain is not as simple as expected. When a multi-task classification network performs multi-task classification, the hidden-layer parameters are learned jointly under the loss functions of all tasks, so the extracted features are reused across the classification tasks. But the features required by different classification tasks are not completely consistent. For example, in coarse classification, information about the steering wheel and the wheels may help the network assign the target to the car category. In fine classification, however, these common features may push an SUV and a sedan into the same category, while the fine-grained task is precisely to separate the two, so the network should pay more attention to their distinctive characteristics, such as appearance and shape. Because the hidden-layer parameters are shared, the extracted features become entangled, and the multi-task classification network fails to distinguish the common features from the task-specific features when executing different classification tasks.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a construction method of a multi-task classification network based on an orthogonal loss function. The invention proposes an orthogonal loss function that distinguishes common features from task-specific features within a multi-task classification network model. The loss measures the spatial similarity of the features, so that the features obtained by each classification task better meet that task's requirements, the separability of the features improves, and the interference of useless features with the classification tasks is effectively suppressed. The multi-task classification network constructed by the invention simulates the human learning process: a deep convolutional neural network is used as the hidden layer to simulate the human brain in deep feature extraction, a tree classifier is used as the task-related output layer for progressive classification, and the mutually related recognition processes, ordered from easy to difficult, form different learning tasks. During back-propagation the orthogonal loss function guides the multi-task classification network to distinguish the features at different levels of the tree classifier, so that while updating its parameters the network learns features that better match the requirements of each task.
The technical scheme adopted by the invention for solving the technical problem comprises the following steps:
step one: building a hierarchical label tree
The hierarchical label tree is divided into two layers, wherein the first layer of labels are rough labels of the images and are divided according to the species to which the images belong, and the second layer of labels are fine labels of the images and are defined according to the subclass of the species to which the images belong;
step two: constructing a deep convolutional neural network as a feature extraction module;
the deep convolutional neural network is selected to extract image depth features for Resnet-18 with a residual structure, and the image depth features comprise 18 layer network structures which comprise 17 convolutional layers and a full-connection layer. Except that the first convolution layer uses convolution kernels of 7 × 7, the other convolution layers all use convolution kernels of 3 × 3, wherein every two convolution layers form a residual block, identity mapping is added, the network requires an image RGB three-dimensional pixel value with an input dimension of 3 × 224, a feature vector output dimension of 512 × 7 is obtained after operation of 17 convolution layers, the output dimension of the feature vector input to the full-connection layer is set to be 1024, and therefore the image obtains a one-dimensional vector containing 1024 neurons through Resnet-18, namely the depth feature extracted by the depth convolution neural network;
step three: building a tree classifier for classification;
A corresponding tree classifier is constructed according to the hierarchical label tree of each database from step one. The tree classifier has two layers: the first layer contains one coarse classifier, used for the coarse classification task; the second layer contains N fine classifiers (assuming N coarse classes), used for the fine classification tasks. The sub-classifiers, namely the one coarse classifier and the N fine classifiers, share the same network structure but are independent of one another; each sub-classifier consists of two fully connected layers and a softmax classifier and produces its own classification result. The depth features obtained in step two are input into each sub-classifier. If the coarse classifier decides that the image belongs to the n-th coarse class, then, according to the coarse classification result and the membership defined by the hierarchical label tree, the n-th fine classifier is selected to perform fine classification and determine which fine class the image belongs to. Once the multi-task network consisting of the deep convolutional neural network and the tree classifier is built in this way, any image input into it can be classified to obtain an image classification result;
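A minimal sketch of such a tree classifier follows, under the assumption that each sub-classifier is two fully connected layers plus a softmax and that the coarse prediction routes each image to one fine classifier; the hidden width, the class counts, and the module names are illustrative and not taken from the patent.

```python
import torch
import torch.nn as nn

class SubClassifier(nn.Module):
    """Two fully connected layers followed by a softmax; same structure for coarse and fine classifiers."""
    def __init__(self, in_dim: int, hidden_dim: int, num_classes: int):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden_dim)     # first FC layer: yields the task feature f(x)
        self.fc2 = nn.Linear(hidden_dim, num_classes)

    def forward(self, x):
        feat = torch.relu(self.fc1(x))               # task-related feature, later used by the orthogonal loss
        prob = torch.softmax(self.fc2(feat), dim=-1)
        return feat, prob

class TreeClassifier(nn.Module):
    """One coarse classifier plus N fine classifiers, routed by the hierarchical label tree."""
    def __init__(self, in_dim=1024, hidden_dim=512, fine_classes_per_coarse=(4, 3, 5)):
        super().__init__()
        self.coarse = SubClassifier(in_dim, hidden_dim, len(fine_classes_per_coarse))
        self.fines = nn.ModuleList(
            [SubClassifier(in_dim, hidden_dim, c) for c in fine_classes_per_coarse])

    def forward(self, depth_feat):
        coarse_feat, coarse_prob = self.coarse(depth_feat)
        routes = coarse_prob.argmax(dim=1)           # predicted coarse class n selects the n-th fine classifier
        fine_feats, fine_probs = [], []
        for f, n in zip(depth_feat, routes):
            feat, prob = self.fines[int(n)](f.unsqueeze(0))
            fine_feats.append(feat)
            fine_probs.append(prob.squeeze(0))       # class counts differ per coarse class, so keep a list
        return coarse_feat, coarse_prob, torch.cat(fine_feats), fine_probs
```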
step four: constructing a quadrature loss function
Combining a deep convolutional neural network with a tree classifier, building a multi-task classification network, inputting a training set image to train the multi-task network, and needing to construct a loss function to update parameters;
firstly, constructing an orthogonal loss function to update parameters, wherein an expected result realized by adopting orthogonal loss is that a coarse classifier feature vector and a fine classifier feature vector are orthogonal in space, so that a cross feature vector is 0 in an ideal state, adding a target to be completed by feature selection into the loss function provides an orthogonal loss, and the orthogonal loss function formula is constructed as follows:
L_orth(x) = α · | Tr( f_g(x) · f_s(x)^T ) |

where x denotes the pixel values of the N input images, k denotes the number of coarse classes, f_1, f_2, ..., f_k denote the k fine sub-classification tasks, Tr denotes the trace of a matrix, T denotes matrix transposition, f_g(x) denotes the coarse-classifier features of the N images, and f_s(x) denotes the fine-classifier features of the N images. The trace corresponds to the inner products between the coarse-classifier feature and the fine-classifier feature of each of the N images, summed together; when Tr( f_g(x) · f_s(x)^T ) equals 0, each pair of row vectors of the coarse-classifier features and the fine-classifier features is orthogonal. α is a hyperparameter;
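For illustration, a possible PyTorch rendering of this orthogonal loss is sketched below; the absolute value around the trace is an assumption added so that the loss stays non-negative and decreases toward 0 as the features become orthogonal, and the default α is only an example.

```python
import torch

def orthogonal_loss(f_g: torch.Tensor, f_s: torch.Tensor, alpha: float = 2.0) -> torch.Tensor:
    """f_g, f_s: (N, d) coarse- and fine-classifier features of the same N images.

    Tr(f_g @ f_s.T) is the sum over images of the inner product <f_g(x_i), f_s(x_i)>;
    driving it toward 0 pushes each pair of row vectors toward orthogonality.
    """
    trace = torch.trace(f_g @ f_s.t())
    return alpha * trace.abs()   # absolute value keeps the loss non-negative (an assumption)
```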
When the training-set images are input to train the multi-task network, the orthogonal loss function takes part in back-propagation, and updating the parameters reduces its value; as the orthogonal loss function approaches 0, f_g(x) and f_s(x) tend toward orthogonality;
step five: constructing a classification loss function;
While f_g(x) and f_s(x) are passed to the next fully connected layer and the softmax classifier for classification, the softmax classifier maps the outputs of the neurons into the (0,1) interval, yielding a coarse-classification prediction and a fine-classification prediction respectively; the error between the predictions and the true label values is then measured with a cross-entropy loss function, formulated as follows:
L_cls = − Σ_{i=1}^{N} [ log softmax( W_g · X_i + b_g )_{y_i^g} + log softmax( W_s · X_i + b_s )_{y_i^s} ]

where g denotes the coarse class and s the fine class, X denotes the depth feature of the input image obtained in step two, W_g and W_s denote the weights in the coarse classifier and the fine classifier respectively, b_g and b_s denote the biases in the coarse classifier and the fine classifier respectively, and y_i^g and y_i^s denote the true coarse and fine labels of the i-th image; when the cross-entropy loss function approaches 0, the predicted values approach the true values;
step six: and adding the orthogonal loss function in the step four and the cross entropy loss function in the step five, updating the network parameters through back propagation, continuously updating the parameters by utilizing images of a training set through a random gradient descent method by using an SGD optimizer through the back propagation, and adding a test set into each training to perform testing.
The advantage of the invention is that, by introducing an orthogonal loss function under the multi-task classification network model to distinguish common features from task-specific features, the features obtained by different tasks better meet their respective requirements: depth features of the same coarse class are more aggregated when the classifier completes the coarse classification task, and depth features of different fine classes are more discrete when the fine classification task is completed. In addition, different threshold values are tried as the balance point between the orthogonal loss and the cross-entropy loss, and the optimal threshold is found through experimental comparison. With the optimal threshold, adding the orthogonal loss function to update the network parameters distinguishes the task-output-layer features of the different classification tasks, so that classifiers at different levels obtain features better matched to their own tasks and useless features are removed, improving classification accuracy. The invention obtains a better classification effect on two different databases.
Drawings
FIG. 1 is a structural diagram of a Fashion-60 database hierarchical tag tree in accordance with the present invention.
FIG. 2 is a structural diagram of the hierarchical label tree of the Caltech-UCSD Birds-200-2011 database in the present invention.
FIG. 3 is a schematic diagram of the selection of features in three-dimensional space by orthogonal loss functions in the present invention.
Fig. 4 is a schematic diagram of the overall structure of the multitask classification network according to the present invention.
Fig. 5 is a schematic diagram of a method for implementing a multi-task classification network with an orthogonal loss function added in the present invention.
FIG. 6 is a bar graph illustrating network identification accuracy for different Fashion-60 thresholds in accordance with the present invention.
FIG. 7 is a bar chart showing the network identification accuracy under different thresholds for Caltech-UCSD Birds-200-2011 in the present invention.
Detailed Description
The invention is further illustrated with reference to the following figures and examples.
Step one: constructing a hierarchical label tree;
because there is visual similarity between species belonging to the same ethnicity, hierarchical label trees are constructed for different databases using the dependencies between species in nature. The hierarchical label tree is divided into two layers, wherein the first layer of labels are rough labels of the images and are divided according to the species to which the images belong, and the second layer of labels are fine labels of the images and are defined according to the subclass of the species to which the images belong. Branches of the label tree represent dependencies. Caltech-UCSD libraries-200-. For example, hummingbirds are identified as a rough genus of birds, which are the primary labels of the samples. Under this genus, there are numerous types of hummingbirds as thin classes, which are the secondary labels of the sample, i.e., the original database labels. And the dependency relationship between the two levels of samples is used as a branch of the label tree to define the association between the two levels of samples, and the specific structure of the structure is shown in fig. 1. Similarly, the Fashinon-60 database contains 60 garment accessory categories. The construction method is to classify each of 5 kinds of clothes as a rough type according to its function with reference to common sense. For example, consider a shoe as a thick genus of the database as a first level tag of a hierarchical tag tree. The specific structure result of the slippers, leather boots and the like contained under the shoes as the thin classes is shown in figure 2, and the subordinate relations are also used as branches of the label tree;
step two: building deep convolutional neural network as feature extraction module
Resnet-18 with a residual structure is selected as the deep convolutional neural network for extracting image depth features. It has an 18-layer structure comprising 17 convolutional layers and one fully connected layer. Except for the first convolutional layer, which uses 7 × 7 convolution kernels, all other convolutional layers use 3 × 3 kernels; every two convolutional layers form a residual block to which an identity mapping is added. The network requires RGB image input of dimension 3 × 224 × 224. After the 17 convolutional layers a feature map of dimension 512 × 7 × 7 is obtained; this is fed into the fully connected layer, whose output dimension is set to 1024. Thus each image passes through Resnet-18 to give a one-dimensional vector of 1024 neurons, namely the depth feature extracted by the deep convolutional neural network.
Step three: building tree classifier for classification
A corresponding tree classifier is constructed according to the hierarchical label tree of each database from step one. The tree classifier has two layers: the first layer contains one coarse classifier, used for the coarse classification task; the second layer contains N fine classifiers (assuming N coarse classes), used for the fine classification tasks. The sub-classifiers (one coarse classifier and N fine classifiers) share the same network structure but are independent of one another; each sub-classifier consists of two fully connected layers and a softmax classifier and produces its own classification result. The depth features obtained in step two are input into each sub-classifier. If the coarse classifier decides that the image belongs to the n-th coarse class, then, according to the coarse classification result and the membership defined by the hierarchical label tree, the n-th fine classifier is selected to perform fine classification and determine which fine class the image belongs to. This completes the construction of the multi-task network consisting of the deep convolutional neural network and the tree classifier, as shown in fig. 5; when an image is input into the multi-task network, it can be classified to obtain an image classification result;
step four: constructing a quadrature loss function
Combining a deep convolutional neural network with a tree classifier, building a multi-task classification network, inputting a training set image to train the multi-task network, and needing to construct a loss function to update parameters;
firstly, an orthogonal loss function is constructed to update parameters, the purpose of feature selection is realized, and the expected result realized by adopting orthogonal loss is that the feature vectors of the coarse classifier and the feature vectors of the fine classifier are orthogonal in space, so that the cross feature vector is 0 in an ideal state, as shown in fig. 3. The invention adds the target to be completed by feature selection into a loss function and provides an orthogonal loss, and a structural formula is as follows:
L_orth(x) = α · | Tr( f_g(x) · f_s(x)^T ) |

where x denotes the pixel values of the N input images, k denotes the number of coarse classes, f_1, f_2, ..., f_k denote the k fine sub-classification tasks (only one sub-classifier structure is shown in fig. 4), Tr denotes the trace of a matrix, T denotes matrix transposition, f_g(x) denotes the coarse-classifier features of the N images, and f_s(x) denotes the fine-classifier features of the N images. The trace corresponds to the inner products between the coarse-classifier feature and the fine-classifier feature of each of the N images, summed together; when Tr( f_g(x) · f_s(x)^T ) equals 0, each pair of row vectors of the coarse-classifier features and the fine-classifier features is orthogonal. α is a hyperparameter (the optimal value is 2 for the bird database and 2.5 for the Fashion-60 database) that controls the influence of the orthogonal loss on the overall network parameters during back-propagation.
When the training-set images are input to train the multi-task network, the orthogonal loss function takes part in back-propagation, and updating the parameters reduces its value; as the orthogonal loss function approaches 0, f_g(x) and f_s(x) tend toward orthogonality. Its position in the network is shown in fig. 4.
Step five: constructing a classification loss function
While f_g(x) and f_s(x) are passed to the next fully connected layer and the softmax classifier for classification, the softmax classifier maps the outputs of the neurons into the (0,1) interval, yielding a coarse-classification prediction and a fine-classification prediction respectively; the error between the predictions and the true label values is then measured with a cross-entropy loss function, formulated as follows:
L_cls = − Σ_{i=1}^{N} [ log softmax( W_g · X_i + b_g )_{y_i^g} + log softmax( W_s · X_i + b_s )_{y_i^s} ]

where g denotes the coarse class and s the fine class, X denotes the depth feature of the input image obtained in step two, W_g and W_s denote the weights in the coarse classifier and the fine classifier respectively, b_g and b_s denote the biases in the coarse classifier and the fine classifier respectively, and y_i^g and y_i^s denote the true coarse and fine labels of the i-th image; when the cross-entropy loss function approaches 0, the predicted values approach the true values. Its position in the network is shown in fig. 4;
step six: and adding the orthogonal loss function in the step four and the cross entropy loss function in the step five, updating the network parameters through back propagation, continuously updating the parameters by utilizing images of a training set through a random gradient descent method by using an SGD optimizer through the back propagation, and adding a test set into each training to perform testing.
The present invention uses the Resnet network as a hidden layer for feature extraction and then passes the features to a tree classifier for multi-task classification. Although the multi-task classification network effectively alleviates the interference caused by inter-class similarity, it still has shortcomings. During multi-task classification the features required by each classification task differ, but because the parameters of all tasks in the model are shared during feature extraction, the features required by each task become mixed together; redundant features needed by other tasks may therefore interfere with the recognition performance of a given classifier. The invention therefore proposes an orthogonal loss function for the multi-task classification network that realizes feature selection by measuring the spatial distance between the depth features of the classifiers at different levels of the tree classifier, so that the features obtained by each level of classifier better meet the requirements of its own classification task, overcoming the shortcomings of the existing model.
According to fig. 3, this embodiment proposes a multi-task classification network based on an orthogonal loss function, comprising the following four parts: (1) organizing a large number of picture labels by the subordination relations between species to construct a complete hierarchical label tree; (2) extracting the depth features of different pictures with a deep convolutional neural network, so that the depth features extracted by the network can be used simultaneously by the multi-task classifiers; (3) replacing the traditional N-way softmax classifier with a tree classifier to realize multi-task classification; (4) constructing an orthogonal loss function to measure the depth features of classifiers at different levels, distinguishing the two kinds of features by increasing the spatial distance between the common and task-specific features and removing the cross features.
Compared with a traditional deep-learning network model, the model of this embodiment has two advantages. First, the embodiment constructs a hierarchical label tree from the subordination relations between species to realize multi-task classification; it notes that the degree of discrimination differs between categories and builds the tasks progressively from easy to difficult, classifying layer by layer, so that the tasks of each level assist one another and the error gradients are distributed more evenly. Second, the orthogonal loss function constructed in this embodiment separates the depth features extracted by the deep convolutional neural network: the orthogonal loss measures the spatial distance between the coarse-classifier features and the fine-classifier features, and automatic updating of the network parameters increases this distance, reducing the proportion of cross features in each classification task and improving the accuracy of image recognition. By setting different thresholds, this example controls the influence of the orthogonal loss on the overall network parameters during back-propagation, and experiments verify the advantage of the orthogonal loss function in feature selection. After the orthogonal loss function is added to the multi-task network, the obtained features better meet the requirements of the classification tasks and the network recognition accuracy improves.
In order to quantitatively evaluate the effect of the orthogonal loss function on the multi-task classification network, the Fashion-60 database is selected. It has 60 fine clothing categories, with 10 clothing categories divided according to common sense of daily life serving as the coarse classes. The embodiment evaluates two aspects: (1) how adjusting the threshold α, which controls the influence of the orthogonal loss on the overall network parameters during back-propagation, balances the orthogonal loss against the cross-entropy loss; and (2) whether using the orthogonal loss function for parameter updating in the multi-task classification network performs better than the traditional multi-task classification network structure. The CNN in the network structure of this embodiment uses Resnet with 18 layers for feature extraction.
During the experiments, different settings of the threshold α are tried to select an appropriate value for training; this embodiment selects 14 different α values from 0.0001 to 6. The experimental results are shown in fig. 6. As seen from the figure, network performance is best when α is 2.5. When the influence factor is small, the effect of the orthogonal loss function is not significant; as the value increases further, network performance gradually degrades, meaning that when α is too large the orthogonal loss dominates, harms the original performance of the network, and plays a negative role. Therefore α = 2.5 is selected to train the network and compare it with the reference network.
Table one: classification accuracy of networks
As can be seen from table one, compared with the traditional deep convolutional neural network and with the multi-task network without the orthogonal loss, the classification accuracy of the multi-task network based on the orthogonal loss is clearly better than that of the other two types of network. This result shows that the orthogonal loss function proposed by the invention effectively completes feature selection, so that the features obtained in the multi-task network better match the task requirements.
Similarly, this example selects 14 different α values from 0.0001 to 6 for evaluating Caltech-UCSD Birds-200-2011; the recognition accuracy under the different thresholds is shown in fig. 7.
Table two: classification accuracy of networks
Methods | Basic architecture | Fine-classes | Coarse-classes |
CNN | Alexnet | 67.683% | -- |
CNN | VGG-19 | 68.816% | -- |
CNN | Resnet-18 | 70.094% | -- |
Multi-task network | Resnet-18+tree classifier(baseline) | 72.491% | 96.303% |
Multi-task network | Resnet-18+tree classifier+Orthogonality Loss | 73.399% | 96.842% |
From the data in table two, it can be seen that the multi-task network based on the orthogonal loss is also clearly better than the other two types of network on Caltech-UCSD Birds-200-2011. This further demonstrates that the orthogonality loss function proposed by the invention effectively completes feature selection, making the features obtained in the multi-task network more discriminative.
By proposing an orthogonal loss function that distinguishes common features from task-specific features under the multi-task classification network model, the features obtained by different tasks better meet their respective requirements: features of the same coarse class are more aggregated when the classifier completes the coarse classification task, and features of different fine classes are more discrete when the fine classification task is completed. In addition, the method tries different thresholds as the balance point between the orthogonal loss and the cross-entropy loss and finds the optimal threshold through experimental comparison. With the optimal threshold, adding the orthogonal loss function to update the network parameters distinguishes the task-output-layer features of the different classification tasks, so that classifiers at different levels obtain features better matched to their own tasks and useless features are removed, improving classification accuracy. The invention obtains a better classification effect on two different databases.
The foregoing illustrates and describes the principles, main features, and advantages of the present invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are given in the specification only to illustrate its principles; various changes and modifications may be made without departing from the spirit and scope of the present invention, and these fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and their equivalents.
Claims (1)
1. A construction method of a multitask classification network based on an orthogonal loss function is characterized by comprising the following steps:
step one: constructing a hierarchical label tree;
the hierarchical label tree is divided into two layers, wherein the first layer of labels are rough labels of the images and are divided according to the species to which the images belong, and the second layer of labels are fine labels of the images and are defined according to the subclass of the species to which the images belong;
step two: constructing a deep convolutional neural network as a feature extraction module;
selecting Resnet-18 with a residual structure as the deep convolutional neural network for extracting image depth features, having an 18-layer structure comprising 17 convolutional layers and one fully connected layer; except for the first convolutional layer, which uses 7 × 7 convolution kernels, all other convolutional layers use 3 × 3 kernels, every two convolutional layers forming a residual block to which an identity mapping is added; the network requires RGB image input of dimension 3 × 224 × 224, a feature map of dimension 512 × 7 × 7 is obtained after the 17 convolutional layers and is fed into the fully connected layer, whose output dimension is set to 1024, so that each image passes through Resnet-18 to give a one-dimensional vector of 1024 neurons, namely the depth feature extracted by the deep convolutional neural network;
step three: building a tree classifier for classification;
constructing a corresponding tree classifier according to the hierarchical label tree of each database from step one, the tree classifier having two layers: the first layer contains one coarse classifier, used for the coarse classification task; the second layer contains N fine classifiers, used for the fine classification tasks; the sub-classifiers, namely the one coarse classifier and the N fine classifiers, share the same network structure but are independent of one another, each sub-classifier consisting of two fully connected layers and a softmax classifier and producing its own classification result; the depth features obtained in step two are input into each sub-classifier, and if the coarse classifier decides that the image belongs to the n-th coarse class, then, according to the coarse classification result and the membership defined by the hierarchical label tree, the n-th fine classifier is selected to perform fine classification and determine which fine class the image belongs to; once the multi-task network consisting of the deep convolutional neural network and the tree classifier is built in this way, any image input into it can be classified to obtain an image classification result;
step four: constructing a quadrature loss function
Combining a deep convolutional neural network with a tree classifier, building a multi-task classification network, inputting a training set image to train the multi-task network, and needing to construct a loss function to update parameters;
firstly, constructing an orthogonal loss function to update parameters, wherein an expected result realized by adopting orthogonal loss is that a coarse classifier feature vector and a fine classifier feature vector are orthogonal in space, so that a cross feature vector is 0 in an ideal state, adding a target to be completed by feature selection into the loss function provides an orthogonal loss, and the orthogonal loss function formula is constructed as follows:
L_orth(x) = α · | Tr( f_g(x) · f_s(x)^T ) |

where x denotes the pixel values of the N input images, k denotes the number of coarse classes, f_1, f_2, ..., f_k denote the k fine sub-classification tasks, Tr denotes the trace of a matrix, T denotes matrix transposition, f_g(x) denotes the coarse-classifier features of the N images, and f_s(x) denotes the fine-classifier features of the N images; the trace corresponds to the inner products between the coarse-classifier feature and the fine-classifier feature of each of the N images, summed together, and when Tr( f_g(x) · f_s(x)^T ) equals 0, each pair of row vectors of the coarse-classifier features and the fine-classifier features is orthogonal; α is a hyperparameter;
when the training-set images are input to train the multi-task network, the orthogonal loss function takes part in back-propagation, and updating the parameters reduces its value; as the orthogonal loss function approaches 0, f_g(x) and f_s(x) tend toward orthogonality;
step five: constructing a classification loss function;
while f_g(x) and f_s(x) are passed to the next fully connected layer and the softmax classifier for classification, the softmax classifier maps the outputs of the neurons into the (0,1) interval, yielding a coarse-classification prediction and a fine-classification prediction respectively; the error between the predictions and the true label values is then measured with a cross-entropy loss function, formulated as follows:
L_cls = − Σ_{i=1}^{N} [ log softmax( W_g · X_i + b_g )_{y_i^g} + log softmax( W_s · X_i + b_s )_{y_i^s} ]

where g denotes the coarse class and s the fine class, X denotes the depth feature of the input image obtained in step two, W_g and W_s denote the weights in the coarse classifier and the fine classifier respectively, b_g and b_s denote the biases in the coarse classifier and the fine classifier respectively, and y_i^g and y_i^s denote the true coarse and fine labels of the i-th image; when the cross-entropy loss function approaches 0, the predicted values approach the true values;
step six: and adding the orthogonal loss function in the step four and the cross entropy loss function in the step five, updating the network parameters through back propagation, continuously updating the parameters by utilizing images of a training set through a random gradient descent method by using an SGD optimizer through the back propagation, and adding a test set into each training to perform testing.
Priority application: CN201911124037.4A, filed 2019-11-18.
Publications: CN110929624A (2020-03-27); CN110929624B (granted 2021-09-14).