CN113610150B - Model training method, object classification device and electronic equipment - Google Patents
- Publication number
- CN113610150B CN113610150B CN202110896590.0A CN202110896590A CN113610150B CN 113610150 B CN113610150 B CN 113610150B CN 202110896590 A CN202110896590 A CN 202110896590A CN 113610150 B CN113610150 B CN 113610150B
- Authority
- CN
- China
- Prior art keywords
- task
- classification
- network
- target
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
Abstract
The disclosure provides a model training method, an object classification method and device, and an electronic device, relating to the field of deep learning and in particular to the field of model training. The specific implementation scheme is as follows: acquiring a sample set for model training, wherein the sample set comprises sample objects for each classification task of a multi-task model, the multi-task model comprises a first-type network and a second-type network, and the second-type network is the network structure other than the first-type network; and training the multi-task model by using each sample object included in the sample set and the task identifier corresponding to each sample object. In the training process, each sample object is used to train the network parameters of the second-type network and the network parameters of the first-type network for the corresponding task, where the corresponding task is the classification task whose task identifier corresponds to the sample object. By this scheme, the number of network parameters that need to be trained in the multi-task model can be reduced.
Description
Technical Field
The disclosure relates to the technical field of deep learning, in particular to the field of model training, and specifically relates to a model training method, an object classification method and device, and an electronic device.
Background
A multi-task model is a model capable of simultaneously performing a plurality of classification tasks on an object. For example, in image classification, a plurality of image classifications can be realized simultaneously for one image by a multi-task model.
In the related art, a multi-task model is mainly obtained through training in a hard parameter sharing mode. However, hard parameter sharing leaves the trained multi-task model with a large number of network parameters that must be trained.
Disclosure of Invention
The present disclosure provides a model training method, an object classification method, corresponding apparatuses and an electronic device, which reduce the number of network parameters that need to be trained within a multi-task model.
According to an aspect of the present disclosure, there is provided a method of model training, comprising:
acquiring a sample set for model training; wherein the sample set comprises sample objects for each classification task of a multi-task model, the multi-task model comprises a first-type network and a second-type network, the first-type network comprises the fully connected layer and the normalization layers in the terminal feature extraction layer, and the second-type network is the network structure other than the first-type network;
training the multi-task model by utilizing each sample object included in the sample set and task identifiers corresponding to each sample object;
wherein the task identifier corresponding to each sample object is the identifier of the classification task to which the sample object belongs; in the training process, each sample object is used to train the network parameters of the second-type network and the network parameters of the first-type network for the corresponding task, where the corresponding task is the classification task whose task identifier corresponds to the sample object.
According to another aspect of the present disclosure, there is provided an object classification method including:
acquiring a target object to be classified;
performing multi-task classification on the target object based on a pre-trained target multi-task model to obtain a classification result for each classification task;
wherein the target multi-task model is a model obtained by training with any one of the model training methods described above.
According to another aspect of the present disclosure, there is provided an apparatus for model training, including:
a sample set acquisition module, configured to acquire a sample set for model training; wherein the sample set comprises sample objects for each classification task of a multi-task model, the multi-task model comprises a first-type network and a second-type network, the first-type network comprises the fully connected layer and the normalization layers in the terminal feature extraction layer, and the second-type network is the network structure other than the first-type network;
a model training module, configured to train the multi-task model by using each sample object included in the sample set and the task identifier corresponding to each sample object;
wherein the task identifier corresponding to each sample object is the identifier of the classification task to which the sample object belongs; in the training process, each sample object is used to train the network parameters of the second-type network and the network parameters of the first-type network for the corresponding task, where the corresponding task is the classification task whose task identifier corresponds to the sample object.
According to another aspect of the present disclosure, there is provided an object classification apparatus including:
the object acquisition module is used for acquiring a target object to be classified;
an object classification module, configured to perform multi-task classification on the target object based on a pre-trained target multi-task model to obtain a classification result for each classification task;
wherein the target multi-task model is a model trained by any one of the model training apparatuses described above.
According to another aspect of the present disclosure, there is provided an electronic device including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
The memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method of model training or a method of object classification.
According to another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing the computer to perform a method of model training or a method of object classification.
According to another aspect of the present disclosure, a computer program product is provided, comprising a computer program which, when executed by a processor, implements a method of model training or a method of object classification.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic diagram of a related-art multi-task model trained using hard parameter sharing;
FIG. 2 is a schematic diagram of a ResNet50 network of the related art;
FIG. 3 is a schematic diagram according to a first embodiment of the present disclosure;
FIG. 4 is a schematic diagram according to a second embodiment of the present disclosure;
FIG. 5 is a schematic diagram according to a third embodiment of the present disclosure;
FIG. 6 is a schematic diagram of a multi-task model provided in accordance with an embodiment of the present disclosure;
FIG. 7 is a schematic diagram according to a fourth embodiment of the present disclosure;
FIG. 8 is a schematic diagram according to a fifth embodiment of the present disclosure;
FIG. 9 is a block diagram of an electronic device for implementing a method of model training or a method of object classification in accordance with an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
With the continuous development of deep learning, object classification has become an important algorithm task in deep learning.
In practical applications, it is often necessary to perform multiple parallel, unassociated classification tasks. If each classification task requires a corresponding object classification model, training and deploying these object classification models greatly increases the workload and lowers deployment efficiency.
A multi-task model is an effective means of solving these problems. A multi-task model is a model that can simultaneously perform multiple classification tasks on an object. For example, in image classification, a plurality of image classifications can be realized simultaneously for one image by a multi-task model.
In the related art, a multi-task model is mainly obtained through training in a hard parameter sharing mode. Briefly, in such a multi-task model, all classification tasks share the bottom feature extraction layers, while each classification task has an independent terminal feature extraction layer and an independent fully connected layer. FIG. 1 is a schematic diagram of a related-art multi-task model trained by hard parameter sharing. In FIG. 1, Task A, Task B and Task C share the bottom feature extraction layers, that is, the shared layers in FIG. 1, while each task has its own terminal feature extraction layer and fully connected layer at the top, that is, the task-specific layers in the figure, forming a structure in which the bottom layers are shared and the top layers are independent.
Exemplarily, FIG. 2 is a schematic diagram of a ResNet50 network in the related art, wherein: input is the input layer for the object to be processed; the stem layer is the initial layer, comprising a convolution layer and a pooling layer; layer1, layer2, layer3 and layer4 form the main structure of ResNet50, where layer4 is composed of the last three bottleneck structures of ResNet50. Each layer contains a number of bottleneck structures; as shown in the right half of FIG. 2, each bottleneck structure is composed of three conv+bn+relu stages, where conv denotes a convolution layer, bn denotes batch normalization, i.e., a normalization layer, and relu is the activation function, i.e., the activation function layer, expressed as F(x)=max(0,x); GAP (Global Average Pooling) denotes a global pooling layer that outputs a fixed 2048-dimensional vector; FC is the fully connected layer.
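The layer composition described above can be sketched as follows (a minimal Python sketch; the stage and bottleneck counts are the standard ResNet50 ones, and the variable names are mine, not from the patent):

```python
def relu(x):
    # Activation function layer of each bottleneck stage: F(x) = max(0, x)
    return max(0.0, x)

# Each main stage (layer1..layer4) stacks a number of bottleneck structures,
# and each bottleneck structure is three conv+bn+relu stages.
resnet50_stages = {
    "layer1": 3,  # bottleneck structures per stage
    "layer2": 4,
    "layer3": 6,
    "layer4": 3,  # the terminal feature extraction layer
}

# GAP then outputs a fixed 2048-dimensional vector that feeds the FC layer.
gap_output_dim = 2048

assert relu(-1.5) == 0.0 and relu(2.0) == 2.0
# 16 bottlenecks x 3 conv layers = 48 conv layers (+ stem conv + FC = 50)
assert sum(resnet50_stages.values()) == 16
```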
If the classical ResNet50 is used as the base network of the multi-task model, hard parameter sharing works as follows: Task A, Task B and Task C share the underlying input, stem layer, layer1, layer2 and layer3 of the ResNet50 network, while each task has its own copy of the final layer4, GAP and fully connected layers.
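As a rough illustration of why this per-task duplication is costly, the following sketch compares parameter counts of the two schemes; all counts are approximate assumptions of mine (typical ResNet50 magnitudes), not figures from the patent:

```python
# Approximate parameter counts (my assumptions, for illustration only)
shared_backbone = 10.6e6   # stem + layer1..layer3
layer4_params   = 15.0e6   # terminal feature extraction layer
fc_params       = 2.0e6    # fully connected layer
layer4_bn_only  = 0.03e6   # gamma/beta of layer4's normalization layers

def hard_sharing_total(num_tasks):
    # Hard parameter sharing: layer4 + FC duplicated per task
    return shared_backbone + num_tasks * (layer4_params + fc_params)

def disclosed_total(num_tasks):
    # Disclosed scheme: layer4 conv weights shared; only BN + FC per task
    return shared_backbone + layer4_params + num_tasks * (layer4_bn_only + fc_params)

# With 3 tasks, sharing layer4's conv weights saves tens of millions of parameters
assert disclosed_total(3) < hard_sharing_total(3)
```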
However, for a neural network model, the terminal feature extraction layer often contains most of the network parameters, and in a hard-parameter-sharing multi-task model each classification task has an independent terminal feature extraction layer, so many network parameters need to be trained. This causes more memory to be occupied during training, and more memory is likewise required when the trained model is deployed.
In the related art there is also a scheme in which all classification tasks share all feature extraction layers and each classification task has only an independent fully connected layer. However, because the network parameters of all feature extraction layers are shared, when one classification task reaches its optimal classification, the other classification tasks cannot reach theirs; that is, the tasks cannot converge simultaneously.
In order to solve the above technical problems in the related art, embodiments of the present disclosure provide a model training method.
It should be noted that, in a specific application, the method for model training provided in the embodiments of the present disclosure may be applied to various electronic devices, for example, a personal computer, a server, and other devices having data processing capabilities. In addition, it can be understood that the model training method provided by the embodiment of the present disclosure may be implemented by software, hardware, or a combination of software and hardware.
The method for training the model provided by the embodiment of the disclosure can comprise the following steps:
acquiring a sample set for model training; wherein the sample set comprises sample objects for each classification task of a multi-task model, the multi-task model comprises a first-type network and a second-type network, the first-type network comprises the fully connected layer and the normalization layers in the terminal feature extraction layer, and the second-type network is the network structure other than the first-type network;
training the multi-task model by utilizing each sample object included in the sample set and task identifiers corresponding to each sample object;
wherein the task identifier corresponding to each sample object is the identifier of the classification task to which the sample object belongs; in the training process, each sample object is used to train the network parameters of the second-type network and the network parameters of the first-type network for the corresponding task, where the corresponding task is the classification task whose task identifier corresponds to the sample object.
According to the scheme provided by the present disclosure, in the training process each sample object is used to train the network parameters of the second-type network and the network parameters of the first-type network for the classification task whose task identifier corresponds to the sample object. As a result, in the trained multi-task model the network parameters of the second-type network are shared by all classification tasks, while the network parameters of the first-type network are task-specific. Moreover, because the first-type network contains only the normalization layers of the terminal feature extraction layer, the remaining network parameters of the terminal feature extraction layer are shared by all classification tasks, which reduces the number of network parameters unique to each classification task. It can be seen that the above scheme can reduce the network parameters that need to be trained in the multi-task model, and can therefore reduce the memory occupied when training and deploying the multi-task model.
Furthermore, because each classification task has its own parameters in the normalization layers of the terminal feature extraction layer, training conflicts between classification tasks can be avoided.
A method for model training provided by embodiments of the present disclosure is described below with reference to the accompanying drawings.
As shown in fig. 3, an embodiment of the present disclosure provides a method for model training, which may include the following steps:
S301, acquiring a sample set for model training; wherein the sample set comprises sample objects for each classification task of a multi-task model, the multi-task model comprises a first-type network and a second-type network, the first-type network comprises the fully connected layer and the normalization layers in the terminal feature extraction layer, and the second-type network is the network structure other than the first-type network;
The network parameters in the first-type network of the multi-task model belong to different classification tasks. For example, if the classification tasks of the multi-task model include classification task 1, classification task 2 and classification task 3, the first-type network of the multi-task model includes network parameters 1 for classification task 1, network parameters 2 for classification task 2 and network parameters 3 for classification task 3. The network parameters in the second-type network of the multi-task model are shared by all classification tasks; that is, there is only one copy of these parameters, and it is used when processing any classification task.
The first-type network contains the fully connected layer and the normalization layers in the terminal feature extraction layer, while network layers such as the convolution layers and activation function layers of the terminal feature extraction layer belong to the second-type network. Feature extraction layers are the structural stages into which a network is divided. Taking the ResNet50 network as an example, its feature extraction layers are layer1, layer2, layer3 and layer4, each of which comprises a plurality of convolution layers, activation function layers and normalization layers. In the embodiments of the present disclosure, layer4 is the terminal feature extraction layer, which may also be referred to as the highest feature extraction layer.
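A minimal sketch of this parameter layout (all names and values here are illustrative assumptions, not the patent's implementation):

```python
# First-type network: one copy of its parameters per classification task
first_type = {
    "task1": {"bn": {"gamma": 1.0, "beta": 0.0}, "fc": [0.1, 0.2]},
    "task2": {"bn": {"gamma": 1.0, "beta": 0.0}, "fc": [0.3, 0.4]},
    "task3": {"bn": {"gamma": 1.0, "beta": 0.0}, "fc": [0.5, 0.6]},
}
# Second-type network: a single copy shared by all classification tasks
second_type = {"conv": [0.7, 0.8]}

def params_for(task_id):
    # The parameters actually used when processing a sample of `task_id`
    return {**second_type, **first_type[task_id]}

assert params_for("task1")["conv"] is second_type["conv"]       # shared
assert params_for("task1")["fc"] != params_for("task2")["fc"]   # task-specific
```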
The acquired sample set may be pre-established. In order to train the multi-task model, the acquired sample set needs to contain sample objects for each classification task of the multi-task model. For example, if the classification tasks of the multi-task model include classification task 1 and classification task 2, the sample set includes sample objects 1 and 2 for classification task 1 and sample objects 3 and 4 for classification task 2, where sample objects 1 and 2 are used to train the multi-task model to implement classification task 1, and sample objects 3 and 4 are used to train it to implement classification task 2.
The multi-task model may be a model for classifying objects such as images and audio, and correspondingly the sample objects may be images, audio, and so on. For example, if the multi-task model is a multi-task image classification model, such as one built on a CNN (Convolutional Neural Network), the sample objects may be sample images.
Each classification task of the multi-task model is a classification task for the same kind of object, such as multiple tasks that classify images. The classification tasks may be parallel, unassociated tasks: for example, classification task 1 classifies the color of an image, while classification task 2 classifies the objects contained in the image. The classification tasks may also be tasks at different granularities: for example, classification task 1 classifies the objects contained in an image, while classification task 2 classifies the buildings contained in the image.
S302, training the multi-task model by using each sample object included in the sample set and the task identifier corresponding to each sample object; wherein the task identifier corresponding to each sample object is the identifier of the classification task to which the sample object belongs; in the training process, each sample object is used to train the network parameters of the second-type network and the network parameters of the first-type network for the corresponding task, where the corresponding task is the classification task whose task identifier corresponds to the sample object.
Each sample object contained in the sample set corresponds to a task identifier, and each task identifier characterizes the classification task to which the sample object belongs. Optionally, the correspondence between each sample object and its task identifier may be implemented through the naming of the sample object. For example, if the task identifier of classification task 1 is 1, the sample objects belonging to classification task 1 may be named 1-xx, e.g., 1-01, 1-02, and so on. Optionally, the task identifier may be the task ID (Identity Document, identification number) of the task.
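Under the example naming convention above, the task identifier can be recovered from a sample name as follows (a sketch; the "1-xx" format is only the example given in the text, not a fixed requirement):

```python
def task_id_of(sample_name):
    # "1-01" -> task identifier "1"; everything before the first "-"
    return sample_name.split("-", 1)[0]

assert task_id_of("1-01") == "1"
assert task_id_of("1-02") == "1"
assert task_id_of("2-07") == "2"
```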
After the sample set is obtained, the multi-task model can be trained by utilizing each sample object included in the sample set and the task identifier corresponding to each sample object.
In the process of training the multi-task model, after each sample object is input into the multi-task model, the sample object is used only to train the network parameters of the second-type network and the network parameters of the first-type network for the corresponding task; it is not used to train the first-type network parameters of other tasks.
For example, the classification tasks of the multi-task model include classification task 1 and classification task 2, and the sample set includes sample objects 1 and 2 of classification task 1 and sample objects 3 and 4 of classification task 2. When sample object 1 is input into the multi-task model, it is used only to train the network parameters of the second-type network and the network parameters of the first-type network for classification task 1, not those for classification task 2.
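This selective update rule can be sketched as follows (the update function, gradient values and learning rate here are illustrative assumptions of mine, not the patent's method):

```python
def update(second_type, first_type, task_id, grads, lr=0.1):
    # Second-type (shared) parameters: always updated
    for k in second_type:
        second_type[k] -= lr * grads["second"][k]
    # First-type parameters: only the slice belonging to `task_id` is updated
    for k in first_type[task_id]:
        first_type[task_id][k] -= lr * grads["first"][k]

second = {"conv": 1.0}
first = {"task1": {"fc": 1.0}, "task2": {"fc": 1.0}}
update(second, first, "task1", {"second": {"conv": 0.5}, "first": {"fc": 0.5}})

assert abs(second["conv"] - 0.95) < 1e-9         # shared parameter moved
assert abs(first["task1"]["fc"] - 0.95) < 1e-9   # task 1 parameters moved
assert first["task2"]["fc"] == 1.0               # task 2 parameters untouched
```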
According to the scheme provided by the present disclosure, because the first-type network contains only the normalization layers of the terminal feature extraction layer, the remaining network parameters of the terminal feature extraction layer, which belong to the second-type network, are shared by all classification tasks, reducing the number of network parameters unique to each classification task. The scheme can thus reduce the network parameters that need to be trained in the multi-task model. Meanwhile, because each classification task has its own parameters in the normalization layers of the terminal feature extraction layer, training conflicts between classification tasks can be avoided.
Based on the embodiment of FIG. 3, as shown in FIG. 4, in the model training method provided by another embodiment of the present disclosure, the above S302 may include steps S3021 to S3024:
S3021, selecting a target sample object from the sample set;
The target sample object may be selected from the sample set in various ways, for example randomly, or in a preset non-random manner; this will be described in detail in the following embodiments and is not repeated here.
Optionally, in one implementation, in order to make full use of each sample object in the sample set when training the target classification model, each selected sample object may be recorded, so that an as-yet-unused sample object can be taken from the sample set as the target sample object.
S3022, inputting the selected target sample object and the corresponding task identifier into the multi-task model, so that the multi-task model classifies the target sample object based on the network parameters of the second-type network and the network parameters of the first-type network for the specified task to obtain a classification result; the specified task is the classification task whose task identifier corresponds to the target sample object;
the classification task to which the target sample object belongs can be determined through the task identifier corresponding to the target sample object. Thus, the target sample object and corresponding task identity may be input into the multitasking model.
After receiving the target sample object and its corresponding task identifier, the multi-task model can determine the network parameters in the first-type network for processing the target sample object, namely the network parameters of the first-type network for the specified task, and can then classify the target sample object based on the network parameters of the second-type network and these parameters to obtain a classification result.
S3023, adjusting the network parameters of the second-type network and the network parameters of the first-type network for the specified task based on the difference between the obtained classification result and the calibration result of the target sample object;
After the classification result of the target sample object is obtained, the difference between the classification result and the calibration result of the target sample object, also referred to as the model loss, can be calculated.
For example, the target sample object is a sample object for classification task 1, which classifies objects into category 1, category 2 and category 3, and its calibration result is category 3. The target sample object is input into the multi-task model, which classifies it based on the network parameters of the second-type network and the network parameters of the first-type network for classification task 1, obtaining the classification result: category 1 with probability 20%, category 2 with probability 10%, and category 3 with probability 70%. The difference between the calibration result and the classification result can then be calculated as 30%, i.e., 0.3.
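The worked example above, restated in code (using the simple notion of difference from the text: the gap between the calibrated category's probability and 1):

```python
probs = {"category 1": 0.20, "category 2": 0.10, "category 3": 0.70}
calibration = "category 3"  # the calibration result of the target sample object

# Difference (model loss, in the simple sense used in the example)
difference = 1.0 - probs[calibration]
assert abs(difference - 0.3) < 1e-9  # 30%, i.e., 0.3, as in the example
```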
After the difference is determined, the network parameters of the second-type network and the network parameters of the first-type network for the specified task may be adjusted based on the difference between the obtained classification result and the calibration result of the target sample object.
For a neural network model, the larger the difference, the larger the adjustment applied to the parameters being adjusted; therefore, the network parameters of the second-type network and the network parameters of the first-type network for the specified task can be adjusted based on this difference according to the actual situation and requirements.
Optionally, in one implementation, a predetermined parameter adjustment method may be used to adjust the network parameters of the second-type network and the network parameters of the first-type network for the specified task in the multi-task model. Exemplarily, the predetermined parameter adjustment method may be stochastic gradient descent, batch gradient descent, or the like.
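A one-line sketch of such a gradient-descent adjustment (learning rate and values are assumptions; a larger loss signal, via its gradient, yields a larger step):

```python
def sgd_step(param, grad, lr=0.01):
    # The parameter moves against the gradient, proportionally to the
    # loss signal: a bigger difference means a bigger adjustment.
    return param - lr * grad

assert abs(sgd_step(1.0, 5.0) - 0.95) < 1e-9
assert abs(sgd_step(1.0, 10.0) - 0.90) < 1e-9  # bigger gradient, bigger step
```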
S3024, judging whether all sample objects in the sample set have been selected; if not, returning to step S3021; otherwise, ending the training.
Optionally, in one implementation, after the parameters in the multi-task model are adjusted, it is further necessary to determine whether all sample objects in the sample set have been selected. If there are unselected sample objects, execution returns to the step of selecting a target sample object from the sample set, that is, step S3021, until all sample objects in the sample set have been selected, at which point training ends.
According to the scheme provided by the disclosure, the network parameters needing training in the multi-task model can be reduced, and the problem of training conflict among classification tasks can be avoided. Further, the network parameters of the second type network and the network parameters of the appointed task of the first type network in the multi-task model are adjusted by using the target sample object, so that the trained multi-task model can execute the appointed task.
Optionally, for the case where different sample objects require different network parameters to be adjusted during multi-task model training, for each classification task, the network parameters of the second type network and the network parameters of the first type network for that classification task are first adjusted based on a plurality of sample objects of that classification task, and only then are the network parameters of another classification task adjusted. This avoids frequently switching which network parameters of the multi-task model are being trained.
Based on this, the method for model training provided in another embodiment of the present disclosure, S3021 above may include:
selecting a plurality of target sample objects of a target classification task from a sample set; the target classification task is any one of a plurality of classification tasks.
For any classification task, after the classification task is determined as a target classification task, a plurality of target sample objects of the target classification task can be selected from the sample set, and then the plurality of target sample objects are sequentially input into the multi-task model, so as to adjust network parameters of a second type network and network parameters of the target classification task of a first type network in the multi-task model.
At this time, before returning to the selection of the target sample object from the sample set in step S3023, the method further includes:
from the plurality of classification tasks, a new target classification task is determined.
Wherein the new object classification task may be randomly determined. Optionally, in an implementation manner, determining a new target classification task from the multiple classification tasks may include:
and determining a new target classification task from the plurality of classification tasks according to a mode of selecting the tasks in turn.
For example, the classification tasks of the multi-task model include classification task 1, classification task 2, and classification task 3. Classification task 1 is first determined as the target classification task, then classification task 2 is determined as the target classification task, and then classification task 3 is determined as the target classification task. After one round is completed, classification task 1 is determined as the target classification task again, and so on, until training is completed.
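The rotation in this example can be sketched with the standard library; the task names below are illustrative placeholders.

```python
import itertools

# Round-robin selection of the target classification task, as in the example:
# task 1 -> task 2 -> task 3 -> task 1 -> ...
tasks = ["task1", "task2", "task3"]
rotation = itertools.cycle(tasks)

order = [next(rotation) for _ in range(5)]
print(order)  # ['task1', 'task2', 'task3', 'task1', 'task2']
```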
By selecting tasks in turn, the network parameters in the second type network in the multi-task model can be prevented from being biased to a certain classification task, and therefore the accuracy of classification of the multi-task model can be improved.
According to the scheme provided by the disclosure, the network parameters needing training in the multi-task model can be reduced, and the problem of training conflict among classification tasks can be avoided. Furthermore, the network parameters which need to be trained can be avoided from being frequently switched by the multi-task model.
Optionally, in an embodiment, the first type of network may further include: a global pooling layer disposed between the fully connected layer and the feature extraction layer at the end. The feature extraction layer at the tail end inputs the output feature vector to the global pooling layer, and the feature vector is processed by the global pooling layer and then is input to the full connection layer.
Optionally, in an embodiment, the first type of network may further include: the first pooling layer and the feature fusion layer are arranged between the full-connection layer and the feature extraction layer at the tail end; the second class of network comprises a second pooling layer connected with the characteristic extraction layer of the tail end; the feature fusion layer is used for fusing the features of the first pooling layer and the second pooling layer and inputting the fused features into the full-connection layer.
The second pooling layer can input the feature information extracted by the feature extraction network of the bottom layer into the feature fusion layer, and the feature fusion layer can fuse the features of the first pooling layer and the second pooling layer and then input the fused features into the full-connection layer. Therefore, the fully connected layer can conduct object classification based on the feature information extracted by the feature extraction network of the bottom layer and the feature information extracted by the feature extraction network of the upper layer. The classification effect of the multi-task model can be improved.
As shown in fig. 5, an embodiment of the present disclosure provides an object classification method, which may include the following steps:
S501, obtaining a target object to be classified;
the method comprises the steps of obtaining a model training method, obtaining a multi-task model based on the model training method, and performing deployment on the multi-task model. At this time, the target object to be classified may be acquired.
S502, performing multi-task classification on the target object based on a pre-trained target multi-task model to obtain classification results of all classification tasks.
After the target object is obtained, the target object can be subjected to multi-task classification based on a pre-trained target multi-task model, so that classification results of various classification tasks are obtained.
For example, each classification task of the target multitasking model 1 includes classification task 1, classification task 2, and classification task 3. The target object may be subjected to multi-task classification by using the target multi-task model 1 to obtain a classification result of the classification task 1, a classification result of the classification task 2, and a classification result of the classification task 3.
Optionally, in one implementation, the target object may be input into the target multitasking model, so that the target multitasking model performs multitasking classification on the target object based on the network parameters of the second type network and the network parameters of the first type network for each classification task, to obtain the classification result of each classification task.
After the target object is acquired by the target multitasking model, the target object can be subjected to object processing based on the network parameters of the second class network, and an intermediate processing result is obtained. And setting task identifiers of the classification tasks for the intermediate processing results in turn, and after each time of setting the task identifiers, performing object processing on the intermediate processing results by utilizing network parameters of the first type of network aiming at the classification task corresponding to the currently set task identifier to obtain a classification result of the classification task corresponding to the currently set task identifier.
The object processing may be feature extraction, feature pooling, feature classification, and the like.
The object processing of the target object based on the network parameters of the second class network may be feature extraction of the target object by using the underlying feature extraction layers, and the intermediate processing result may be the underlying features extracted by the underlying feature extraction layers in the multi-task model.
After the bottom layer features are extracted, task identifiers of any classification task in the classification tasks can be set for the bottom layer features. And then, the network parameters of the first type of network, which aim at the classification task corresponding to the currently set task identifier, are utilized to perform the processing of feature extraction, feature pooling, feature classification and the like on the bottom features, so as to obtain the classification result of the classification task corresponding to the currently set task identifier.
After the classification result is obtained, the task identification of the next classification task is continuously set for the extracted bottom layer features, and the process is repeated until the classification results of all the classification tasks are obtained.
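The inference flow described above, one pass through the shared network followed by one task-specific pass per task identifier, can be sketched as follows. The function and variable names are assumptions, and the toy arithmetic stands in for real network layers.

```python
# Sketch of multi-task inference: compute the intermediate result once with the
# shared (second-type) network, then run each task's first-type-network head on
# it in turn, one task identifier at a time.
def classify_all_tasks(target, shared_forward, task_heads):
    intermediate = shared_forward(target)        # underlying features, computed once
    results = {}
    for task_id, head in task_heads.items():     # set each task identifier in turn
        results[task_id] = head(intermediate)    # task-specific bn/GAP/FC pass
    return results

results = classify_all_tasks(
    3,
    shared_forward=lambda x: x * 2,              # placeholder for the shared layers
    task_heads={"task1": lambda f: f + 1, "task2": lambda f: f - 1},
)
```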
According to the scheme provided by the disclosure, as the target object is obtained, the target object can be subjected to multi-task classification, and classification results of all classification tasks are obtained, so that the speed of classifying the target object is higher.
For a better understanding of the solution provided by the present disclosure, the solution is presented below for the scenario, shown in FIG. 6, of a multi-task model constructed from a ResNet50 network for image classification. The second type of network in the multi-task model includes the stem layer, layer1, layer2, and layer3 of ResNet50, as well as the conv and relu in layer4; the first type of network includes the bn, GAP, and FC layers in layer4.
Assume there are now 4 image classification tasks. The 4 image classification tasks share the underlying feature extraction layers (the stem layer, layer1, layer2, and layer3 of ResNet50) as well as the conv and relu in layer4; only the bn in layer4 is not shared. In layer4, each image classification task has its own network parameters (i.e., its own bn parameters in the bn layer), and each image classification task also has its own GAP and FC network parameters. The image classification tasks can converge simultaneously, and the bn parameters are small, so training and deployment add essentially no extra memory; meanwhile, inference and deployment are greatly simplified, and more image classification tasks can be supported and extended. When an image classification task needs to be added, only the bn parameters corresponding to that task need to be added to the bn layer, so the scheme has good extensibility. The scheme is also applicable to other classification networks with a bn structure, and likewise extends well there.
When a multi-task model based on the ResNet50 network needs to be trained, in addition to the image path and the calibration result (such as a label), the task identifier (such as a task ID) of the corresponding image classification task needs to be added to each sample object. After the underlying feature extraction layers extract the underlying features and the sample object passes into layer4, the bn parameters to be used, namely those belonging to the image classification task indicated by the task identifier, can be determined according to the task identifier corresponding to the sample object. For example, when image classification task 0 is trained, only the bn parameters with suffix 0 in the bn layers of layer4, such as bn1_0, bn2_0, and bn3_0, are used; the bn parameters belonging to other image classification tasks in the bn layers of layer4 are not used.
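The suffix-based bn lookup described above can be sketched as a simple dictionary; the parameter names follow the bn1_0/bn2_0 convention from the example, and the stored values are placeholders rather than real normalization parameters.

```python
# Sketch of per-task bn parameter selection in layer4: bn parameters are keyed
# by "<bn layer name>_<task id>", so each image classification task only ever
# touches its own suffix.
bn_params = {
    "bn1_0": "gamma/beta for task 0", "bn1_1": "gamma/beta for task 1",
    "bn2_0": "gamma/beta for task 0", "bn2_1": "gamma/beta for task 1",
}

def select_bn(layer_name, task_id):
    """Pick the bn parameters of the image classification task given by task_id."""
    return bn_params[f"{layer_name}_{task_id}"]

print(select_bn("bn1", 0))  # gamma/beta for task 0
```

Adding a new image classification task then only means adding new `*_<task id>` entries, which matches the extensibility argument above.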
When classifying the target image, the target image may be input into the ResNet50-based multi-task model. A 14×14×1024-dimensional feature N is output through the stem layer, layer1, layer2, and layer3. When passing through layer4, the task identifier of one image classification task is set at a time, and the bn parameters corresponding to that task identifier in the bn layers of layer4 are used for processing. After the layer4 processing is completed, the processing result of layer4 is output through the corresponding GAP and FC layers. Therefore, layer4 needs to set the task identifiers of the 4 image classification tasks in turn and execute the above process 4 times, so as to obtain the classification results of the 4 image classification tasks.
Further, as shown in fig. 6, the output features of layer2 may additionally be processed by GAP to obtain 512-dimensional features, which are then concatenated with the 2048-dimensional GAP-processed features of layer4 to obtain a 2560-dimensional feature. In this way, the underlying features of layer2 are also utilized, which can improve the multi-task classification effect.
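The dimension arithmetic of this fusion can be checked with a trivial sketch; the zero vectors below are placeholders for the real pooled features.

```python
# Sketch of the feature fusion in FIG. 6: GAP over layer2's output gives a
# 512-dimensional vector, GAP over layer4's output gives 2048 dimensions, and
# concatenation yields the 2560-dimensional fused feature fed to the FC layer.
layer2_gap = [0.0] * 512    # placeholder for pooled layer2 features
layer4_gap = [0.0] * 2048   # placeholder for pooled layer4 features

fused = layer2_gap + layer4_gap
print(len(fused))  # 2560
```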
According to an embodiment of the present disclosure, as shown in fig. 7, the present disclosure further provides an apparatus for model training, where the apparatus includes:
a sample set obtaining module 701, configured to obtain a sample set for model training; the sample set comprises sample objects of each classification task of a multi-task model, the multi-task model comprises a first type network and a second type network, the first type network comprises a full-connection layer and a normalization layer in a terminal feature extraction layer; the second type of network is a network structure other than the first type of network;
model training module 702, configured to train the multitasking model by using each sample object included in the sample set and a task identifier corresponding to each sample object;
the task identifier corresponding to each sample object is the identifier of the classification task to which the sample object belongs; in the training process, each sample object is used for training network parameters of the second type of network and network parameters of the first type of network aiming at corresponding tasks, and the corresponding tasks are classified tasks with task identifications corresponding to the sample objects.
Optionally, the model training module includes:
the object selection sub-module is used for selecting a target sample object from the sample set;
the object classification sub-module is used for inputting the selected target sample object and the corresponding task identifier into the multi-task model so that the multi-task model classifies the target sample object based on the network parameters of the second type network and the network parameters of the first type network aiming at the appointed task to obtain a classification result; the method comprises the steps of designating a task as a classification task with a task identifier corresponding to a target sample object;
the network parameter adjustment sub-module is used for adjusting the network parameters of the second type of network and the network parameters of the designated task of the first type of network based on the difference between the obtained classification result and the calibration result of the target sample object; and triggering the object selection sub-module again until all the sample objects in the sample set have been selected.
Optionally, the object selecting sub-module is further configured to select a plurality of target sample objects of the target classification task from the sample set; the target classification task is any one of a plurality of classification tasks;
the network parameter adjustment submodule further comprises:
the task determination sub-module is used for determining a new target classification task from a plurality of classification tasks before returning to the execution object selection sub-module.
Optionally, the task determining submodule is further configured to determine a new target classification task from the multiple classification tasks according to a mode of selecting the tasks in turn.
Optionally, the first type of network further includes: the first pooling layer and the feature fusion layer are arranged between the full-connection layer and the feature extraction layer at the tail end; the second class of network comprises a second pooling layer connected with the characteristic extraction layer of the tail end; the feature fusion layer is used for fusing the features of the first pooling layer and the second pooling layer and inputting the fused features into the full-connection layer.
According to the scheme provided by the disclosure, during training each sample object is used to train the network parameters of the second-class network and the network parameters of the first-class network for the classification task whose task identifier corresponds to that sample object. Thus, in the trained multi-task model, the network parameters of the second-class network are shared by all classification tasks, while each classification task has its own network parameters in the first-class network. Moreover, because the first-class network only includes the normalization layer of the terminal feature extraction layer (together with the fully connected layer), the network parameters of the feature extraction layers other than that normalization layer belong to the second-class network and are shared by all classification tasks, which reduces the number of network parameters unique to each classification task. It can be seen that the above solution provided by the present disclosure can reduce the network parameters that need to be trained within the multi-task model.
Furthermore, as each classification task has a unique parameter in the normalization layer in the characteristic extraction layer at the tail end, the problem of training conflict among the classification tasks can be avoided.
According to an embodiment of the object classification method of the present disclosure, as shown in fig. 8, the present disclosure further provides an object classification apparatus, where the apparatus includes:
an object obtaining module 801, configured to obtain a target object to be classified;
the object classification module 802 is configured to perform multitasking classification on the target object based on a target multitasking model trained in advance, so as to obtain classification results of each classification task;
the target multitasking classification model is a model obtained by training with the model training apparatus provided by the present disclosure.
Optionally, the object classification module includes:
the object input sub-module is used for inputting the target object into the target multi-task model so that the target multi-task model carries out multi-task classification on the target object based on the network parameters of the second class network and the network parameters of the first class network aiming at each classification task to obtain the classification result of each classification task.
Optionally, the target multitasking model is further configured to perform object processing on the target object based on the network parameters of the second class network, to obtain an intermediate processing result; setting task identifiers of all classification tasks for the intermediate processing results in sequence, and after each time of setting the task identifiers, performing object processing on the intermediate processing results by utilizing network parameters of the first type of network aiming at the classification task corresponding to the currently set task identifier to obtain a classification result of the classification task corresponding to the currently set task identifier.
According to the scheme provided by the disclosure, as the target object is obtained, the target object can be subjected to multi-task classification, and classification results of all classification tasks are obtained, so that the speed of classifying the target object is higher.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.
The embodiment of the disclosure provides an electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method of model training or a method of object classification.
The disclosed embodiments provide a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform a method of model training or a method of object classification.
Embodiments of the present disclosure provide a computer program product comprising a computer program which, when executed by a processor, implements a method of model training or a method of object classification.
Fig. 9 shows a schematic block diagram of an example electronic device 900 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 9, the apparatus 900 includes a computing unit 901 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 902 or a computer program loaded from a storage unit 908 into a Random Access Memory (RAM) 903. In the RAM 903, various programs and data required for the operation of the device 900 can also be stored. The computing unit 901, the ROM 902, and the RAM 903 are connected to each other by a bus 904. An input/output (I/O) interface 905 is also connected to the bus 904.
Various components in device 900 are connected to I/O interface 905, including: an input unit 906 such as a keyboard, a mouse, or the like; an output unit 907 such as various types of displays, speakers, and the like; a storage unit 908 such as a magnetic disk, an optical disk, or the like; and a communication unit 909 such as a network card, modem, wireless communication transceiver, or the like. The communication unit 909 allows the device 900 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunications networks.
The computing unit 901 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 901 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 901 performs the respective methods and processes described above, such as a model training method or an object classification method. For example, in some embodiments, the method of model training or the method of object classification may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as storage unit 908. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 900 via the ROM 902 and/or the communication unit 909. When the computer program is loaded into the RAM 903 and executed by the computing unit 901, one or more steps of the above-described method of model training or method of object classification may be performed. Alternatively, in other embodiments, the computing unit 901 may be configured to perform the method of model training or the method of object classification by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special purpose or general purpose programmable processor, and that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps recited in the present disclosure may be performed in parallel or sequentially or in a different order, provided that the desired results of the technical solutions of the present disclosure are achieved, and are not limited herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.
Claims (12)
1. A method of model training, comprising:
acquiring a sample image set for model training; the sample image set comprises sample images of each image classification task of a multi-task model, the multi-task model comprises a first type network and a second type network, and the first type network consists of a full-connection layer and a normalization layer in a terminal feature extraction layer; the second type network is a network structure except the first type network; each image classification task of the multi-task model is as follows: aiming at a plurality of image classification tasks of the same image, each image classification task is a parallel unassociated task, and the image classification task comprises a task for classifying the colors of the image and/or a task for classifying objects in the image;
Training the multi-task model by utilizing each sample image included in the sample image set and task identifiers corresponding to each sample image;
the task identifier corresponding to each sample image is the identifier of the image classification task to which the sample image belongs; in the training process, each sample image is used for training network parameters of the second type of network and network parameters of the first type of network aiming at corresponding tasks, wherein the corresponding tasks are classified tasks with task identifications corresponding to the sample images;
the training the multi-task model by using each sample image included in the sample image set and the task identifier corresponding to each sample image includes:
selecting a target sample image from the sample image set;
inputting the selected target sample image and the corresponding task identifier into the multi-task model, so that the multi-task model classifies the target sample image based on the network parameters of the second type network and the network parameters of the first type network aiming at the appointed task to obtain a classification result; the appointed task is an image classification task with a task identifier corresponding to the target sample image;
Based on the obtained classification result and the difference between the calibration result of the target sample image, adjusting the network parameters of the second type network and the network parameters of the designated task of the first type network; and returning to the step of selecting the target sample image from the sample image set until all sample images in the sample image set have been selected.
2. The method of claim 1, wherein selecting a target sample image from the sample image set comprises:
selecting a plurality of target sample images of a target classification task from the sample image set, the target classification task being any one of the plurality of image classification tasks;
wherein before returning to the step of selecting a target sample image from the sample image set, the method further comprises:
determining a new target classification task from the plurality of image classification tasks.
3. The method of claim 2, wherein determining a new target classification task from the plurality of image classification tasks comprises:
determining a new target classification task from the plurality of image classification tasks by selecting the tasks in turn (round-robin).
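The "selecting the tasks in turn" of claim 3 is ordinary round-robin scheduling. A minimal sketch (the task identifiers "color" and "object" are hypothetical names taken from claim 1's examples):

```python
from itertools import cycle

tasks = ["color", "object"]        # hypothetical task identifiers
next_task = cycle(tasks)           # endless round-robin iterator

# Each pass, one task is picked and a batch of its samples is trained on.
picked = [next(next_task) for _ in range(5)]
```

`picked` alternates between the two tasks, so every task's head receives updates at a similar rate.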
4. The method according to any of claims 1-3, wherein the first-type network consists of a fully connected layer, a normalization layer in a terminal feature extraction layer, and a first pooling layer and a feature fusion layer arranged between the fully connected layer and the terminal feature extraction layer; the second-type network comprises a second pooling layer connected to the terminal feature extraction layer; and the feature fusion layer is configured to fuse the features of the first pooling layer and the second pooling layer and input the fused features into the fully connected layer.
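The dual-pooling head of claim 4 can be sketched as below. Everything concrete here is an assumption: the claim does not say which pooling operators are used or how the features are fused, so max/average pooling and fusion by concatenation are illustrative choices, and the normalization layer is omitted for brevity.

```python
import numpy as np

def head_forward(feature_map, fc_weight):
    """Sketch of claim 4's head: two poolings, feature fusion, fully connected layer.

    feature_map: (C, H, W) output of the terminal feature extraction layer.
    Fusion by concatenation is an assumption; the claim only says "fuse".
    """
    pool1 = feature_map.max(axis=(1, 2))     # first pooling layer (max, assumed)
    pool2 = feature_map.mean(axis=(1, 2))    # second pooling layer (average, assumed)
    fused = np.concatenate([pool1, pool2])   # feature fusion layer
    return fused @ fc_weight                 # fully connected layer

fm = np.random.default_rng(1).normal(size=(16, 7, 7))
fc = np.zeros((32, 5))                       # toy weights: 2*16 features -> 5 classes
fc[0, 0] = 1.0                               # passes through channel 0's max-pooled value
logits = head_forward(fm, fc)
```

With these toy weights, the first logit simply equals the max of the first channel, which makes the routing through the two pooling branches easy to check.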
5. An object classification method, comprising:
acquiring a target object to be classified; wherein the target object is a target image;
performing multi-task classification on the target object based on a pre-trained target multi-task model to obtain a classification result for each image classification task;
wherein the target multi-task model is a model trained using the method of any one of claims 1-4;
wherein performing multi-task classification on the target object based on the pre-trained target multi-task model to obtain the classification result for each image classification task comprises:
inputting the target object into the target multi-task model, so that the target multi-task model performs multi-task classification on the target object based on the network parameters of the second-type network and the network parameters of the first-type network for each image classification task to obtain the classification result for each image classification task;
wherein the target multi-task model performing multi-task classification on the target object based on the network parameters of the second-type network and the network parameters of the first-type network for each image classification task to obtain the classification result for each image classification task comprises:
the target multi-task model processing the target object based on the network parameters of the second-type network to obtain an intermediate processing result;
and setting the task identifier of each image classification task for the intermediate processing result in turn and, after each task identifier is set, processing the intermediate processing result using the network parameters of the first-type network for the image classification task corresponding to the currently set task identifier, to obtain the classification result of that image classification task.
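The inference flow of claim 5 amounts to running the shared second-type network once per image and then routing the intermediate result through each task's head in turn. A minimal hypothetical sketch (linear layers and the task names are assumptions, not the patent's architecture):

```python
import numpy as np

def classify_all_tasks(image_feat, shared_w, heads):
    """Run the shared network once, then every task head on the result."""
    intermediate = image_feat @ shared_w       # second-type network, computed once
    results = {}
    for task_id, head_w in heads.items():      # set each task identifier in turn
        logits = intermediate @ head_w         # first-type network for that task
        results[task_id] = int(np.argmax(logits))
    return results

rng = np.random.default_rng(2)
shared = rng.normal(size=(8, 4))
heads = {"color": rng.normal(size=(4, 3)),     # e.g. 3 color classes (assumed)
         "object": rng.normal(size=(4, 5))}    # e.g. 5 object classes (assumed)
out = classify_all_tasks(rng.normal(size=8), shared, heads)
```

Because the intermediate result is reused, the cost of adding a task at inference time is only one extra head pass, not a full extra forward pass.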
6. An apparatus for model training, comprising:
the sample set acquisition module is used for acquiring a sample image set for model training; wherein the sample image set comprises sample images of each image classification task of a multi-task model, the multi-task model comprises a first-type network and a second-type network, the first-type network consists of a fully connected layer and a normalization layer in a terminal feature extraction layer, and the second-type network is the network structure other than the first-type network; the image classification tasks of the multi-task model are a plurality of image classification tasks for the same image, each image classification task being a parallel, mutually independent task, and the image classification tasks comprise a task of classifying the color of the image and/or a task of classifying objects in the image; the model training module is used for training the multi-task model using the sample images included in the sample image set and the task identifier corresponding to each sample image;
wherein the task identifier corresponding to each sample image is the identifier of the image classification task to which the sample image belongs; in the training process, each sample image is used to train the network parameters of the second-type network and the network parameters of the first-type network for the corresponding task, the corresponding task being the image classification task identified by the task identifier corresponding to that sample image;
the model training module comprises:
an object selection sub-module, configured to select a target sample image from the sample image set;
the object classification sub-module is used for inputting the selected target sample image and the corresponding task identifier into the multi-task model, so that the multi-task model classifies the target sample image based on the network parameters of the second-type network and the network parameters of the first-type network for a specified task to obtain a classification result; wherein the specified task is the image classification task identified by the task identifier corresponding to the target sample image;
the network parameter adjustment sub-module is used for adjusting the network parameters of the second-type network and the network parameters of the first-type network for the specified task based on the difference between the obtained classification result and the calibration result of the target sample image, and for returning to the object selection sub-module until all sample images in the sample image set have been selected.
7. The apparatus of claim 6, wherein the object selection sub-module is further configured to select a plurality of target sample images of a target classification task from the sample image set, the target classification task being any one of the plurality of image classification tasks;
the apparatus further comprises:
a task determination sub-module, configured to determine a new target classification task from the plurality of image classification tasks before the object selection sub-module is executed again.
8. The apparatus of claim 7, wherein the task determination sub-module is further configured to determine a new target classification task from the plurality of image classification tasks by selecting the tasks in turn (round-robin).
9. The apparatus of any of claims 6-8, wherein the first-type network consists of a fully connected layer, a normalization layer in a terminal feature extraction layer, and a first pooling layer and a feature fusion layer arranged between the fully connected layer and the terminal feature extraction layer; the second-type network comprises a second pooling layer connected to the terminal feature extraction layer; and the feature fusion layer is configured to fuse the features of the first pooling layer and the second pooling layer and input the fused features into the fully connected layer.
10. An object classification apparatus comprising:
the object acquisition module is used for acquiring a target object to be classified; wherein the target object is a target image;
the object classification module is used for performing multi-task classification on the target object based on a pre-trained target multi-task model to obtain a classification result for each image classification task;
wherein the target multi-task model is a model trained using the apparatus of any of claims 6-9;
the object classification module comprises:
the object input sub-module is used for inputting the target object into the target multi-task model, so that the target multi-task model performs multi-task classification on the target object based on the network parameters of the second-type network and the network parameters of the first-type network for each image classification task to obtain the classification result for each image classification task;
wherein the target multi-task model is further configured to process the target object based on the network parameters of the second-type network to obtain an intermediate processing result; and to set the task identifier of each image classification task for the intermediate processing result in turn and, after each task identifier is set, process the intermediate processing result using the network parameters of the first-type network for the image classification task corresponding to the currently set task identifier, to obtain the classification result of that image classification task.
11. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-4 or 5.
12. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of claims 1-4 or 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110896590.0A CN113610150B (en) | 2021-08-05 | 2021-08-05 | Model training method, object classification device and electronic equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113610150A CN113610150A (en) | 2021-11-05 |
CN113610150B true CN113610150B (en) | 2023-07-25 |
Family
ID=78307121
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110896590.0A Active CN113610150B (en) | 2021-08-05 | 2021-08-05 | Model training method, object classification device and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113610150B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114723966B (en) * | 2022-03-30 | 2023-04-07 | 北京百度网讯科技有限公司 | Multi-task recognition method, training method, device, electronic equipment and storage medium |
CN114860405B (en) * | 2022-05-17 | 2023-01-31 | 北京百度网讯科技有限公司 | Parameter updating method and device of multitask model and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019100998A1 (en) * | 2017-11-24 | 2019-05-31 | Tencent Technology (Shenzhen) Co., Ltd. | Voice signal processing model training method, electronic device, and storage medium
CN112527383A (en) * | 2020-12-15 | 2021-03-19 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Method, apparatus, device, medium, and program for generating multitask model
CN112559007A (en) * | 2020-12-14 | 2021-03-26 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Parameter updating method and device of multitask model and electronic equipment
CN112561077A (en) * | 2020-12-14 | 2021-03-26 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Training method and device of multi-task model and electronic equipment
CN112596868A (en) * | 2020-11-27 | 2021-04-02 | Mobvoi (Wuhan) Information Technology Co., Ltd. | Model training method and device
Non-Patent Citations (1)
Title |
---|
Research on Tongue Image Classification Based on Multi-task Convolutional Neural Networks; Tang Yiping; Wang Liran; He Xia; Chen Peng; Yuan Gongping; Computer Science, (12); full text *
Also Published As
Publication number | Publication date |
---|---|
CN113610150A (en) | 2021-11-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113610150B (en) | Model training method, object classification device and electronic equipment | |
CN113377520B (en) | Resource scheduling method, device, equipment and storage medium | |
CN113568860B (en) | Deep learning-based multi-machine cluster topology mapping method and device and program product | |
CN114723966B (en) | Multi-task recognition method, training method, device, electronic equipment and storage medium | |
CN113657483A (en) | Model training method, target detection method, device, equipment and storage medium | |
US20220156577A1 (en) | Training neural network model based on data point selection | |
CN111652354B (en) | Method, apparatus, device and storage medium for training super network | |
EP4343616A1 (en) | Image classification method, model training method, device, storage medium, and computer program | |
CN116580223A (en) | Data processing and model fine tuning method and device, electronic equipment and storage medium | |
CN113516185B (en) | Model training method, device, electronic equipment and storage medium | |
CN113132479B (en) | Flow switching and model generating method, device, equipment, storage medium, and program | |
CN110619253B (en) | Identity recognition method and device | |
CN114662607B (en) | Data labeling method, device, equipment and storage medium based on artificial intelligence | |
CN113691403B (en) | Topology node configuration method, related device and computer program product | |
CN115796228A (en) | Operator fusion method, device, equipment and storage medium | |
CN113065641B (en) | Neural network model training method and device, electronic equipment and storage medium | |
US20230122754A1 (en) | Automatically generating inventory-related information forecasts using machine learning techniques | |
CN115512365A (en) | Training of target detection model, target detection method and device and electronic equipment | |
CN115936091B (en) | Training method and device for deep learning model, electronic equipment and storage medium | |
JP6960049B2 (en) | Dialogue device | |
US20230140148A1 (en) | Methods for community search, electronic device and storage medium | |
CN113408664B (en) | Training method, classification method, device, electronic equipment and storage medium | |
CN118278534A (en) | Method and device for generating model | |
CN115545171A (en) | Searching method, device, equipment and storage medium of neural network structure | |
CN118568442A (en) | Task execution method, device, electronic equipment, storage medium and program product |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||