CN114494275A - Method and device for training image segmentation model of mobile terminal - Google Patents

Method and device for training image segmentation model of mobile terminal

Info

Publication number
CN114494275A
CN114494275A
Authority
CN
China
Prior art keywords
model
training
data
image segmentation
fine
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210396441.2A
Other languages
Chinese (zh)
Other versions
CN114494275B (en)
Inventor
李博贤
闫亚军
见良
王轶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Meishe Network Technology Co ltd
Original Assignee
Beijing Meishe Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Meishe Network Technology Co ltd filed Critical Beijing Meishe Network Technology Co ltd
Priority to CN202210396441.2A priority Critical patent/CN114494275B/en
Publication of CN114494275A publication Critical patent/CN114494275A/en
Application granted granted Critical
Publication of CN114494275B publication Critical patent/CN114494275B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20084 Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the invention provides a method and a device for training an image segmentation model for a mobile terminal. The method comprises the following steps: acquiring an image segmentation training sample set, an initial model, a teaching assistant model and a student model; training the initial model with the first fine labeling data to obtain a teacher model, wherein the teacher model is used for labeling the image segmentation training sample set; labeling the rough labeling data with the teacher model to generate second fine labeling data and difficult case data; combining the second fine labeling data, the difficult case data and the first fine labeling data to obtain a full data set; training the teaching assistant model on the full data set; and training the student model with the trained teaching assistant model on the full data set to obtain an image segmentation model, wherein the image segmentation model performs image segmentation processing on a mobile terminal. The large model guides the small model through staged training, so that speed, light weight and accuracy are all achieved.

Description

Method and device for training image segmentation model of mobile terminal
Technical Field
The invention relates to the technical field of image processing, and in particular to a method for training an image segmentation model for a mobile terminal, a device for training an image segmentation model for a mobile terminal, an electronic device, and a storage medium.
Background
Image segmentation is widely applied in many scenarios; sky segmentation in particular is used for image beautification, defogging, post-production and similar purposes. Traditional sky segmentation processes images with Gaussian filtering or edge extraction operators, which is time-consuming and computationally expensive and therefore hard to apply in real-time settings. Neural network models trained by deep learning generally have large numbers of parameters: they perform well but cannot be deployed on mobile terminals such as mobile phones, so images usually have to be sent over the network to a cloud server for processing, which lacks robustness under extreme conditions and is costly.
Disclosure of Invention
In view of the above problems, embodiments of the present invention provide a method for training an image segmentation model for a mobile terminal, a device for training an image segmentation model for a mobile terminal, an electronic device, and a storage medium that overcome or at least partially solve the above problems.
In order to solve the above problems, an embodiment of the present invention discloses a method for training an image segmentation model of a mobile terminal, including:
acquiring an image segmentation training sample set, an initial model, a teaching assistant model and a student model; the image segmentation training sample set comprises rough labeling data and first fine labeling data; the parameter quantity of the initial model is larger than the parameter quantity of the teaching assistant model, and the parameter quantity of the teaching assistant model is larger than the parameter quantity of the student model;
training the initial model by adopting the first fine labeling data to obtain a teacher model, wherein the teacher model is used for labeling the image segmentation training sample set;
marking the rough marking data by adopting the teacher model to generate second fine marking data and difficult case data;
obtaining a full data set by combining the second fine labeling data, the difficult case data and the first fine labeling data;
training the teaching assistant model by adopting the full data set;
and training the student model by adopting the trained teaching assistant model based on the full data set to obtain an image segmentation model, wherein the image segmentation model is used for carrying out image segmentation processing on a mobile terminal.
Optionally, the step of obtaining a full data set by combining the second fine labeling data, the difficult case data, and the first fine labeling data includes:
combining the first fine labeling data and the second fine labeling data to obtain a fine labeling data set;
and adding the difficult case data to the fine labeling data set to generate a full data set.
Optionally, the step of training the teaching assistant model using the full data set includes:
labeling the full data set;
and training the teaching assistant model by adopting the marked full data set.
Optionally, the step of training the student model by distillation using the trained teaching assistant model based on the full data set to obtain an image segmentation model includes:
performing distillation training on the student model on the full data set by adopting the teaching assistant model;
training the student model after distillation training by adopting the fine labeling data set to obtain a pre-training model;
and training the pre-training model by adopting a plurality of preset supervision strategies to generate an image segmentation model.
Optionally, the step of training the student model on the full data set by distillation using the teaching assistant model comprises:
and taking the teaching assistant model as a supervision model of the distillation training, and distilling the student model by adopting a full data set.
Optionally, the step of training the student model after the distillation training by using the fine labeling data set to obtain a pre-training model includes:
training the student model after distillation training by adopting the fine labeling data set;
and when the average loss output by the distillation-trained student model no longer decreases, reducing the loss constraint weight of the distillation-trained student model, and training the student model with the reduced loss constraint weight on the fine labeling data set to obtain a pre-training model.
Optionally, the supervision strategies include loss functions, and the step of training the pre-training model with a plurality of preset supervision strategies to generate the image segmentation model includes:
and training the pre-training model with the various loss functions one by one to generate an image segmentation model.
The embodiment of the invention also discloses a device for training the image segmentation model of the mobile terminal, which comprises the following steps:
the acquisition module is used for acquiring an image segmentation training sample set, an initial model, a teaching assistant model and a student model; the image segmentation training sample set comprises rough labeling data and first fine labeling data; the parameter quantity of the initial model is larger than the parameter quantity of the teaching assistant model, and the parameter quantity of the teaching assistant model is larger than the parameter quantity of the student model;
the first training module is used for training the initial model by adopting the first fine labeling data to obtain a teacher model, and the teacher model is used for labeling the image segmentation training sample set;
the labeling module is used for labeling the rough labeling data by adopting the teacher model to generate second fine labeling data and difficult case data;
the combining module is used for combining the second fine labeling data, the difficult case data and the first fine labeling data to obtain a full data set;
the second training module is used for training the teaching assistant model by adopting the full data set;
and the third training module is used for training the student model by adopting the trained teaching assistant model through distillation based on the full data set to obtain an image segmentation model, and the image segmentation model is used for carrying out image segmentation processing on the mobile terminal.
The embodiment of the invention also discloses electronic equipment, which comprises a processor, a memory and a computer program which is stored on the memory and can run on the processor, wherein when the computer program is executed by the processor, the steps of the method for training the image segmentation model of the mobile terminal are realized.
The embodiment of the invention also discloses a computer readable storage medium, wherein a computer program is stored on the computer readable storage medium, and when the computer program is executed by a processor, the steps of the method for training the image segmentation model of the mobile terminal are realized.
The embodiment of the invention has the following advantages:
the embodiment of the invention obtains an image segmentation training sample set, an initial model, a teaching assistant model and a student model; the image segmentation training sample set comprises rough labeling data and first fine labeling data; the parameter quantity of the initial model is larger than the parameter quantity of the teaching assistant model, and the parameter quantity of the teaching assistant model is larger than the parameter quantity of the student model; training the initial model by adopting the first fine labeling data to obtain a teacher model, wherein the teacher model is used for labeling the image segmentation training sample set; marking the rough marking data by adopting the teacher model to generate second fine marking data and difficult case data; obtaining a full data set by combining the second fine labeling data, the difficult case data and the first fine labeling data; training the teaching aid model by adopting the full data set; and training the student model by adopting the trained teaching assistant model based on the full data set to obtain an image segmentation model, wherein the image segmentation model is used for carrying out image segmentation processing on a mobile terminal. Training the initial model through the data of wholeness and obtaining the teacher model, rethread teacher model training helps the teaching model, recycles the teaching model distillation training student model of helping at last, and through big model guide little model training step by step, under the prerequisite of keeping the precision, the neural network model diminishes constantly to realize giving consideration to of precision and speed.
Drawings
FIG. 1 is a flowchart illustrating steps of an embodiment of a method for training an image segmentation model of a mobile terminal according to the present invention;
FIG. 2 is a flowchart illustrating steps of another embodiment of a method for training an image segmentation model of a mobile terminal according to the present invention;
FIG. 3 is a block diagram of an embodiment of an image segmentation model training apparatus for a mobile terminal according to the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
Referring to fig. 1, a flowchart illustrating steps of an embodiment of a method for training an image segmentation model of a mobile terminal according to the present invention is shown, which may specifically include the following steps:
101, acquiring an image segmentation training sample set, an initial model, a teaching assistant model and a student model; the image segmentation training sample set comprises rough labeling data and first fine labeling data; the parameter quantity of the initial model is larger than the parameter quantity of the teaching assistant model, and the parameter quantity of the teaching assistant model is larger than the parameter quantity of the student model;
in practical application, an image segmentation training sample set, an initial model, an assistant teaching model and a student model can be obtained from a specified address. The designated address may be a local storage address or a third-party storage address, which is not specifically limited in this embodiment of the present invention.
The image segmentation training sample set comprises rough labeling data and first fine labeling data; the two can be distinguished by manual labeling. The first fine labeling data is the fine labeling data already present in the image segmentation training sample set.
In addition, the initial model, the teaching assistant model and the student model have the same structure and different parameter quantities. Specifically, the parameter quantity of the initial model is greater than the parameter quantity of the teaching assistant model, and the parameter quantity of the teaching assistant model is greater than the parameter quantity of the student model; namely, the initial model, the teaching assistant model and the student model are ordered from large to small according to the parameter quantity.
Step 102, training the initial model by adopting the first fine labeling data to obtain a teacher model, wherein the teacher model is used for labeling the image segmentation training sample set;
The initial model is trained with the first fine labeling data so that the resulting teacher model has optimal precision. The teacher model is used for labeling the image segmentation training sample set, that is, for classifying and labeling the data in the sample set.
103, marking the rough marking data by adopting the teacher model to generate second fine marking data and difficult case data;
The teacher model labels the rough labeling data, performing a more detailed classification and labeling of it, and generates second fine labeling data and difficult case data. The second fine labeling data is the fine labeling data generated when the teacher model labels the rough labeling data.
104, combining the second fine labeling data, the difficult case data and the first fine labeling data to obtain a full data set;
The labeled second fine labeling data, the difficult case data and the first fine labeling data are combined to obtain a full data set covering all of the data.
105, training the teaching assistant model by adopting the full data set;
The teaching assistant model is trained on the full data set, so as to step down from the teacher model's huge parameter quantity and obtain a teaching assistant model with fewer parameters.
And 106, training the student model by adopting the trained teaching assistant model through distillation based on the full data set to obtain an image segmentation model, wherein the image segmentation model is used for image segmentation processing at the mobile terminal.
In the embodiment of the invention, the trained teaching assistant model distills the student model on the full data set, further training the student model, which has a smaller parameter quantity, into an image segmentation model usable for image segmentation processing on a mobile terminal, so that the finally trained model can be deployed there.
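The staged method of steps 101 to 106 can be summarized as a pipeline. The sketch below is illustrative only: `train` and `relabel` are hypothetical stand-ins for the actual training and teacher-labeling routines, which the embodiment describes but does not name.

```python
def train_mobile_segmenter(coarse, fine1, init_model, ta_model, student_model,
                           train, relabel):
    """Staged big-to-small training pipeline (steps 101-106).

    `train(model, data, **kw)` and `relabel(model, data)` are
    caller-supplied stand-ins for the real routines.
    """
    teacher = train(init_model, fine1)                  # step 102: fit teacher on fine labels
    fine2, hard = relabel(teacher, coarse)              # step 103: relabel the rough data
    full = fine1 + fine2 + hard                         # step 104: build the full data set
    ta = train(ta_model, full)                          # step 105: train teaching assistant
    return train(student_model, full, distill_from=ta)  # step 106: distill the student
```

Each stage reuses the previous stage's output, so the parameter quantity shrinks monotonically from teacher to teaching assistant to student.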
The embodiment of the invention acquires an image segmentation training sample set, an initial model, a teaching assistant model and a student model; the image segmentation training sample set comprises rough labeling data and first fine labeling data; the parameter quantity of the initial model is larger than the parameter quantity of the teaching assistant model, and the parameter quantity of the teaching assistant model is larger than the parameter quantity of the student model; the initial model is trained with the first fine labeling data to obtain a teacher model, which is used for labeling the image segmentation training sample set; the rough labeling data is labeled with the teacher model to generate second fine labeling data and difficult case data; the second fine labeling data, the difficult case data and the first fine labeling data are combined to obtain a full data set; the teaching assistant model is trained on the full data set; and the student model is trained with the trained teaching assistant model on the full data set to obtain an image segmentation model, which is used for image segmentation processing on a mobile terminal. The initial model is trained on the fine labeling data to obtain the teacher model, the teacher model then trains the teaching assistant model, and the teaching assistant model finally distills the student model; with the larger model guiding the smaller model stage by stage, the neural network shrinks continuously while accuracy is preserved, so that both precision and speed are achieved.
Referring to fig. 2, a flowchart illustrating steps of another embodiment of a method for training an image segmentation model of a mobile terminal according to the present invention is shown, which may specifically include the following steps:
step 201, acquiring an image segmentation training sample set, an initial model, a teaching assistant model and a student model; the image segmentation training sample set comprises rough labeling data and first fine labeling data; the parameter quantity of the initial model is larger than the parameter quantity of the teaching assistant model, and the parameter quantity of the teaching assistant model is larger than the parameter quantity of the student model;
The image segmentation training sample set, the initial model, the teaching assistant model and the student model are acquired from a storage address of the server. The image segmentation training sample set is obtained by manually screening sky segmentation data and labeling it to obtain first fine labeling data and rough labeling data. Specifically, the screening method may be fast ambiguous screening: the data judged to be labeled correctly and accurately is taken as the first fine labeling data, while the remaining data is set aside as rough labeling data for later use.
The initial model, the teaching assistant model and the student model are convolution models, have the same structure and are different in parameter quantity. Specifically, the parameter quantity of the initial model is larger than the parameter quantity of the teaching assistant model, and the parameter quantity of the teaching assistant model is larger than the parameter quantity of the student model.
Step 202, training the initial model by adopting the first fine labeling data to obtain a teacher model, wherein the teacher model is used for labeling the image segmentation training sample set;
the initial model has larger parameters, the input size is 1024x512, and the neural network model with higher precision is trained. Therefore, the initial model can be trained by adopting the first fine marking data to obtain a teacher model; in pursuit of optimal model accuracy.
Step 203, labeling the rough labeling data by adopting the teacher model to generate second fine labeling data and difficult case data;
In practical application, the rough labeling data can be input into the teacher model, which performs inference on it and labels it, generating the teacher model's labels for the rough labeling data; the original labels of the rough labeling data are discarded and replaced with the new labels.
Then the newly labeled rough labeling data is screened again with fast ambiguous screening: the data whose labels are accurate is determined, together with its labels, to be the second fine labeling data, and the data whose labels are inaccurate is determined, together with its labels, to be the difficult case data.
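One plausible form of this screening, assuming the teacher outputs per-pixel foreground probabilities and using a mean-confidence threshold; both the probability interpretation and the threshold are assumptions, since the embodiment does not specify the screening criterion:

```python
import numpy as np

def split_by_confidence(prob_maps, threshold=0.9):
    """Split teacher-relabeled samples into fine data and hard cases.

    prob_maps: list of (H, W) arrays of foreground probabilities from the
    teacher. A pixel's confidence is its distance from the 0.5 decision
    boundary, scaled to [0, 1]; samples whose mean confidence falls below
    `threshold` become difficult (hard) cases.
    """
    fine_idx, hard_idx = [], []
    for i, p in enumerate(prob_maps):
        confidence = float(np.abs(np.asarray(p) - 0.5).mean() * 2.0)
        (fine_idx if confidence >= threshold else hard_idx).append(i)
    return fine_idx, hard_idx
```

A mask whose pixels sit near 0 or 1 everywhere counts as confidently labeled; a mask full of values near 0.5 is ambiguous and is routed to the difficult case pool.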
Step 204, combining the first fine labeling data and the second fine labeling data to obtain a fine labeling data set;
The second fine labeling data is merged into the first fine labeling data to obtain a fine labeling data set, so that as much fine labeling data as possible is available for model training. The teacher model then continues training on the fine labeling data set: the teacher model trained in the previous round is taken as a pre-training model, its weights are loaded, and training proceeds on the new fine labeling data set. This makes the most of the existing data to extract the teacher model's remaining headroom, raises the labeling confidence for the data labeled in the following steps, and improves the teacher model's generalization capability.
Step 205, adding the difficult case data to the fine labeling data set to generate a full data set;
in actual application, a current teacher model can be used as a baseline model, mask data output by a default teacher model is accurate, difficult-case data are inferred and labeled by the current teacher model, and the obtained labels and the difficult-case data are merged into a fine-label data set to generate a full-scale data set. When the data is merged, the data needs to be marked in a marking file so as to distinguish the difficult data from the original fine-marking data.
Step 206, training the teaching assistant model by adopting the full data set;
if the student models are directly used for distillation training, the student models are difficult to learn the distribution of the teacher model due to the overlarge difference of the model parameters, and are difficult to converge finally, and if images with smaller sizes than the input of the student models are input to the teacher model during distillation training, the distillation effect is greatly reduced. Therefore, the teaching assistant model can be trained by adopting a full data set, so that the teaching assistant model not only learns the distribution of the teacher model, but also converts the input into the same size as the student model, and lays a foundation for the distillation training in the following steps.
In an optional embodiment of the present invention, the step of training the teaching assistant model using the full data set comprises:
substep S2061, labeling the full data set;
and a substep S2062 of training the teaching assistant model by adopting the labeled full data set.
The full data set is labeled with the teacher model, and the teaching assistant model is trained with the labeled full data set; the training procedure is the same as the teacher model's. In this embodiment the teaching assistant model has the same structure as the teacher model except for the backbone: the backbone uses the relatively smaller ResNet18 to extract features, and the input size is changed to 512x256 for training.
Step 207, performing distillation training on the student model by adopting the teaching assistant model on the full data set;
and reasoning the input training data based on the teaching assistant model on the full data set, and carrying out distillation training on the student model.
Specifically, the teaching assistant model serves as the supervision model for the distillation training, and the student model is distilled on the full data set; that is, the teaching assistant model acts as the supervisor during distillation. Distillation training here means that the teaching assistant model's weight parameters are frozen and not updated during training; it only performs inference on the input training data. The mask data it infers participates, together with the teacher model's labels for that data, in the student model's loss calculation; the resulting losses are weighted, and the weighted total is back-propagated to update the student model's weight parameters.
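The loss combination in one distillation step might look like the following sketch. The loss callables and the 0.5/0.5 weights are placeholders, since the embodiment only states that the losses are weighted and fused:

```python
def distillation_loss(student_mask, ta_mask, teacher_label,
                      js_loss, dice_loss, w_js=0.5, w_dice=0.5):
    """One batch's distillation objective: JS divergence against the frozen
    teaching assistant's mask plus DiceLoss against the teacher label,
    fused by weights (the 0.5/0.5 split is an assumed example)."""
    return (w_js * js_loss(student_mask, ta_mask)
            + w_dice * dice_loss(student_mask, teacher_label))
```

Because the teaching assistant's weights are frozen, `ta_mask` is a fixed soft target and gradients would flow only through the student's prediction.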
A JS divergence distance loss constrains the mask data generated by the teaching assistant model against the mask data of the student model, so that the student model learns the teaching assistant model's output distribution; and DiceLoss constrains the teacher model's labels against the student model's output, so that the student model learns the size and position of the mask shape corresponding to the real mask data.
It should be noted that JS divergence measures the similarity of two probability distributions; it is a variant of KL divergence that resolves the asymmetry of KL divergence. JS divergence is symmetric and its value ranges from 0 to 1, so it can serve as a distance measure.
The KL and JS divergences are defined as follows:

KL(P1 ‖ P2) = Σ_{x ∈ X} P1(x) · log( P1(x) / P2(x) )

JS(P1 ‖ P2) = (1/2) · KL( P1 ‖ (P1 + P2)/2 ) + (1/2) · KL( P2 ‖ (P1 + P2)/2 )

where P1 and P2 are the two distributions to be compared, corresponding to the outputs of the teaching assistant model and the student model; x is a value in the output matrix and X is the set of all output values; KL and JS denote the two divergence functions; P written without (x) treats the whole output as a distribution, with the additions and divisions applied element-wise to the matrices.
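The KL and JS divergences above can be sketched in a few lines of numpy. This version uses log base 2 so that JS lies in [0, 1] as stated; the clipping epsilon is an assumption to avoid log(0):

```python
import numpy as np

def kl_div(p1, p2, eps=1e-12):
    """KL(P1 || P2) with log base 2; eps keeps log() defined at zero."""
    p1 = np.clip(np.asarray(p1, float), eps, 1.0)
    p2 = np.clip(np.asarray(p2, float), eps, 1.0)
    return float(np.sum(p1 * np.log2(p1 / p2)))

def js_div(p1, p2):
    """Symmetric JS(P1 || P2); ranges over [0, 1] with log base 2."""
    p1, p2 = np.asarray(p1, float), np.asarray(p2, float)
    m = 0.5 * (p1 + p2)  # mixture distribution
    return 0.5 * kl_div(p1, m) + 0.5 * kl_div(p2, m)
```

Identical distributions give 0, disjoint distributions give 1, and swapping the arguments leaves the value unchanged, which is why JS can serve as a distance where KL cannot.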
DiceLoss is a set-similarity measure, generally used to compute the similarity of two samples, with values in the range [0, 1]. In the embodiment of the invention its main role is to complement the JS divergence constraint on the distribution by measuring the difference in area, shape and position between the mask region predicted by the model and the truly labeled mask region. Specifically, DiceLoss is defined as follows:

DiceLoss(P1, P2) = 1 - 2 · |P1 ∩ P2| / ( |P1| + |P2| )

where P1 and P2 are the two distributions to be compared, corresponding to the outputs of the teaching assistant model and the student model.
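A minimal numpy sketch of DiceLoss for soft masks follows; the smoothing term is an assumed addition (not stated in the embodiment) that keeps the ratio defined when both masks are empty:

```python
import numpy as np

def dice_loss(pred, target, smooth=1.0):
    """1 - Dice coefficient for soft masks in [0, 1].

    `smooth` is an assumed stabilizer so the loss stays defined
    when pred and target are both all-zero.
    """
    pred, target = np.asarray(pred, float), np.asarray(target, float)
    inter = np.sum(pred * target)  # soft intersection |P1 ∩ P2|
    return float(1.0 - (2.0 * inter + smooth)
                 / (pred.sum() + target.sum() + smooth))
```

Perfectly overlapping masks give a loss of 0, while disjoint masks approach 1, so the term directly penalizes errors in mask area and position.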
In addition, the two losses can be fused as a weighted sum using preset proportion weights, and the fused value serves as the total loss in back-propagation.
Step 208, training the student model after distillation training by adopting the fine labeling data set to obtain a pre-training model;
The teaching assistant distillation is removed, eliminating the teaching assistant model's influence on the student model, and the distillation-trained student model is trained on the fine labeling data set to obtain a pre-training model.
In an optional embodiment of the present invention, the step of training the distillation-trained student model by using the fine labeling data set to obtain a pre-training model includes:
a substep 2081 of training the student model after distillation training by adopting the fine labeling data set;
and a substep 2082, when the average loss output by the distillation-trained student model no longer decreases, reducing the loss constraint weight of the distillation-trained student model, and training the student model with the reduced loss constraint weight on the fine labeling data set to obtain a pre-training model.
The distillation-trained student model continues training on the fine labeling data set. When the average loss output per round has not decreased for several consecutive rounds, the weight of the JS divergence loss is reduced, several times if necessary, until that weight reaches 0, after which only the loss between the teacher model's labels and the student model's output is computed. In this state the student model is trained until the loss oscillates, that is, stops decreasing over several rounds; training then stops and the pre-training model is obtained.
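The schedule just described might be sketched as follows. The patience, decay factor and cutoff are assumptions, since the embodiment gives no concrete values:

```python
def loss_plateaued(avg_losses, patience=3):
    """True when the last `patience` rounds produced no new minimum loss."""
    if len(avg_losses) <= patience:
        return False
    return min(avg_losses[-patience:]) >= min(avg_losses[:-patience])

def anneal_js_weight(w_js, avg_losses, patience=3, decay=0.5, cutoff=1e-3):
    """Decay the JS-divergence weight on plateau; once it falls below
    `cutoff`, drop it to 0 so only the teacher-label loss remains."""
    if not loss_plateaued(avg_losses, patience):
        return w_js
    w_js *= decay
    return 0.0 if w_js < cutoff else w_js
```

Calling `anneal_js_weight` once per training round gradually hands supervision over from the teaching assistant's output distribution to the teacher's hard labels.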
In addition, the loss values computed on the difficult case data can be given extra weight, so that the model pays more attention to difficult cases, improving its generalization capability and robustness.
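One plausible realization of this extra weighting of difficult cases (the function name and `hard_weight` value are illustrative assumptions):

```python
def weighted_batch_loss(per_sample_losses, is_hard, hard_weight=2.0):
    """Weighted average of per-sample losses, with difficult cases
    counted hard_weight times as much as ordinary samples."""
    weights = [hard_weight if hard else 1.0 for hard in is_hard]
    total = sum(w * l for w, l in zip(weights, per_sample_losses))
    return total / sum(weights)
```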
Step 209, training the pre-training model by adopting a plurality of preset supervision strategies to generate an image segmentation model; the image segmentation model is used for carrying out image segmentation processing on the mobile terminal.
A plurality of preset supervision strategies are applied to the pre-training model in alternation, and fine tuning yields the final lightweight small model, namely the image segmentation model; deployed on a mobile terminal, this model performs image segmentation processing.
In an optional embodiment of the present invention, the supervision strategy includes a loss function, and the step of training the pre-training model with a plurality of preset supervision strategies to generate the image segmentation model includes:
and a substep S209 of training the pre-training model with a plurality of loss functions one by one to generate the image segmentation model.
In practical application, without changing the model structure of the pre-training model, its weights are used as the pre-training parameters and training continues on the fine labeling data set, with the learning rate reduced and the number of samples fed in per batch increased. In general, the more samples fed in per batch, the fewer iterations the pre-training model runs in one round and the smaller the oscillation as the loss decreases. This parameter adjustment improves the precision of the small model on high-confidence data and helps it find the optimal solution.
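The stated relationship between batch size and iterations per round is simply:

```python
import math

def iterations_per_round(num_samples, batch_size):
    """Each round visits every sample once, so a larger batch size
    means fewer iterations per round (and a smoother loss curve)."""
    return math.ceil(num_samples / batch_size)
```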
Dice loss is used first, on its own, so that the loss does not change drastically as the data changes; this gives a smooth transition and lets the model concentrate on learning the shape, size, and position of the corresponding classes.
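A minimal single-class soft Dice loss, for illustration only (the patent does not give its exact formulation; the `smooth` term is a common stabilizing constant):

```python
import numpy as np

def dice_loss(pred, target, smooth=1.0):
    """1 - Dice coefficient between predicted probabilities and binary labels.
    Low when the predicted shape, size and position match the target mask."""
    pred = np.asarray(pred, dtype=float).reshape(-1)
    target = np.asarray(target, dtype=float).reshape(-1)
    intersection = np.sum(pred * target)
    dice = (2.0 * intersection + smooth) / (np.sum(pred) + np.sum(target) + smooth)
    return 1.0 - dice
```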
Then, the loss function is switched to cross entropy loss, so that the pre-training model attends to the category of each pixel; this also softens the edges, leaving an obvious transition between categories: under the cross entropy constraint, many pixels near category boundary lines output values around 0.5 rather than being concentrated near 0 and 1.
Finally, the loss function is switched to the Lovász loss, which reduces noise in the pre-training model's output, yielding the image segmentation model.
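The alternation of supervision strategies described above could be driven by a loop of this shape (a sketch; `train_fn` stands in for one fine-tuning pass under a given loss function):

```python
def fine_tune_alternating(model, loss_functions, train_fn):
    """Fine-tune the pre-training model under each loss function in turn,
    e.g. [dice_loss, cross_entropy_loss, lovasz_loss]."""
    for loss_fn in loss_functions:
        model = train_fn(model, loss_fn)
    return model
```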
In the embodiment of the present invention, first fine labeling data and coarse labeling data are obtained from an image segmentation training sample set; the initial model is trained with the first fine labeling data to obtain a teacher model; the coarse labeling data is fed into the teacher model to obtain teacher labels, and the resulting second fine labeling data is merged with the first fine labeling data into a fine labeling data set; the teacher model also labels the difficult case data, which is added to form a full data set; the teaching assistant model is trained with the full data set and the teacher labels; model distillation is performed between the teaching assistant model and the student model to obtain a distilled student model; the student model is then fine-tuned with the fine labeling data set; and the supervision strategies are alternated to obtain the final lightweight image segmentation model. Through this progressive training method, the model is reduced step by step while preserving precision as much as possible, so that both precision and speed are taken into account.
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts described, as some steps may be performed in other orders or concurrently in accordance with the embodiments of the present invention. Furthermore, those skilled in the art will appreciate that the embodiments described in the specification are preferred embodiments, and that the acts involved are not necessarily required to implement the present invention.
Referring to fig. 3, a block diagram of an embodiment of the apparatus for training an image segmentation model of a mobile terminal according to the present invention is shown, which may specifically include the following modules:
the acquisition module 301 is used for acquiring an image segmentation training sample set, an initial model, an assistant teaching model and a student model; the image segmentation training sample set comprises rough labeling data and first fine labeling data; the parameter quantity of the initial model is larger than the parameter quantity of the teaching assistant model, and the parameter quantity of the teaching assistant model is larger than the parameter quantity of the student model;
a first training module 302, configured to train the initial model with the first fine labeling data to obtain a teacher model, where the teacher model is used to label the image segmentation training sample set;
the labeling module 303 is configured to label the rough labeling data by using the teacher model to generate second fine labeling data and difficult case data;
a combining module 304, configured to combine the second fine labeling data, the difficult case data, and the first fine labeling data to obtain a full data set;
a second training module 305, configured to train the teaching assistant model using the full data set;
and a third training module 306, configured to train the student model by using the trained teaching assistant model based on the full data set to obtain an image segmentation model, where the image segmentation model is used to perform image segmentation processing on the mobile terminal.
In an optional embodiment of the present invention, the combining module 304 includes:
the first labeling sub-module is used for combining the first fine labeling data and the second fine labeling data to obtain a fine labeling data set;
and the adding submodule is used for adding the difficult data to the fine labeling data set to generate a full data set.
In an optional embodiment of the present invention, the second training module 305 includes:
the second labeling submodule is used for labeling the full data set;
and the teaching assistant model training submodule is used for training the teaching assistant model by adopting the marked full data set.
In an optional embodiment of the present invention, the third training module 306 includes:
the distillation submodule is used for performing distillation training on the student model on the full data set by adopting the teaching assistant model;
the student model training submodule is used for training the student model after distillation training by adopting the fine labeling data set to obtain a pre-training model;
and the alternative supervision sub-module is used for training the pre-training model by adopting a plurality of preset supervision strategies to generate an image segmentation model.
In an alternative embodiment of the invention, the distillation submodule comprises:
and the distillation unit is used for taking the teaching assistant model as a supervision model of the distillation training and distilling the student model by adopting a full data set.
In an optional embodiment of the present invention, the student model training sub-module includes:
the student model training unit is used for training the student model after distillation training by adopting the fine labeling data set;
and the weight adjusting unit is used for reducing the loss constraint weight of the student model after the distillation training when the average loss output by the student model after the distillation training is constant, and training the student model with the reduced loss constraint weight by adopting the fine labeling data set to obtain a pre-training model.
In an optional embodiment of the invention, the supervision strategy comprises a loss function, and the alternative supervision sub-module comprises:
and the alternate training unit is used for training the pre-training model with a plurality of loss functions one by one to generate the image segmentation model.
For the device embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, refer to the partial description of the method embodiment.
An embodiment of the present invention further provides an electronic device, including:
a processor and a storage medium storing a computer program executable by the processor; when the electronic device runs, the processor executes the computer program to perform the method according to any one of the embodiments of the present invention. The specific implementation manner and technical effects are similar to those of the method embodiments and are not described herein again.
The embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to perform the method according to any one of the embodiments of the present invention. The specific implementation manner and technical effects are similar to those of the method embodiment, and are not described herein again.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
As will be appreciated by one of skill in the art, embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or terminal that comprises the element.
The method and the device for training the image segmentation model of the mobile terminal provided by the invention are described in detail, and the principle and the implementation mode of the invention are explained by applying specific examples, and the description of the embodiments is only used for helping to understand the method and the core idea of the invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.

Claims (10)

1. A method for training an image segmentation model of a mobile terminal is characterized by comprising the following steps:
acquiring an image segmentation training sample set, an initial model, a teaching assistant model and a student model; the image segmentation training sample set comprises rough labeling data and first fine labeling data; the parameter quantity of the initial model is larger than the parameter quantity of the teaching assistant model, and the parameter quantity of the teaching assistant model is larger than the parameter quantity of the student model;
training the initial model by adopting the first fine labeling data to obtain a teacher model, wherein the teacher model is used for labeling the image segmentation training sample set;
marking the rough marking data by adopting the teacher model to generate second fine marking data and difficult case data;
obtaining a full data set by combining the second fine labeling data, the difficult case data and the first fine labeling data;
training the teaching assistant model by adopting the full data set;
and training the student model by adopting the trained teaching assistant model based on the full data set to obtain an image segmentation model, wherein the image segmentation model is used for carrying out image segmentation processing on a mobile terminal.
2. The method of claim 1, wherein the step of combining the second fine annotation data, the difficult case data, and the first fine annotation data to obtain a full data set comprises:
combining the first fine labeling data and the second fine labeling data to obtain a fine labeling data set;
and adding the difficult case data to the fine labeling data set to generate a full data set.
3. The method of claim 1, wherein the step of training the teaching assistant model using the full data set comprises:
labeling the full data set;
and training the teaching assistant model by adopting the marked full data set.
4. The method of claim 2, wherein the step of training the student model using the trained teaching assistant model based on the full data set to obtain an image segmentation model comprises:
performing distillation training on the student model on the full data set by adopting the teaching assistant model;
training the student model after distillation training by adopting the fine labeling data set to obtain a pre-training model;
and training the pre-training model by adopting a plurality of preset supervision strategies to generate an image segmentation model.
5. The method of claim 4, wherein the step of performing distillation training on the student model on the full data set using the teaching assistant model comprises:
and taking the teaching assistant model as a supervision model of the distillation training, and distilling the student model by adopting the full data set.
6. The method of claim 4, wherein the step of training the student model after distillation training with the fine labeled data set to obtain a pre-trained model comprises:
training the student model after distillation training by adopting the fine labeling data set;
and when the average loss output by the student model after the distillation training is constant, reducing the loss constraint weight of the student model after the distillation training, and training the student model with the reduced loss constraint weight by adopting the fine labeling data set to obtain a pre-training model.
7. The method of claim 4, wherein the supervision strategy includes a loss function, and the step of training the pre-training model using a plurality of preset supervision strategies to generate the image segmentation model includes:
and training the pre-training model with a plurality of loss functions one by one to generate the image segmentation model.
8. An image segmentation model training device for a mobile terminal is characterized by comprising:
the acquisition module is used for acquiring an image segmentation training sample set, an initial model, a teaching assistant model and a student model; the image segmentation training sample set comprises rough labeling data and first fine labeling data; the parameter quantity of the initial model is larger than the parameter quantity of the teaching assistant model, and the parameter quantity of the teaching assistant model is larger than the parameter quantity of the student model;
the first training module is used for training the initial model by adopting the first fine labeling data to obtain a teacher model, and the teacher model is used for labeling the image segmentation training sample set;
the labeling module is used for labeling the rough labeling data by adopting the teacher model to generate second fine labeling data and difficult case data;
the combining module is used for combining the second fine labeling data, the difficult case data and the first fine labeling data to obtain a full data set;
the second training module is used for training the teaching assistant model by adopting the full data set;
and the third training module is used for training the student model by adopting the trained teaching assistant model in a distillation mode based on the full data set to obtain an image segmentation model, and the image segmentation model is used for performing image segmentation processing on the mobile terminal.
9. An electronic device, comprising a processor, a memory and a computer program stored on the memory and capable of running on the processor, wherein the computer program, when executed by the processor, implements the steps of any one of claims 1 to 7 for a mobile terminal image segmentation model training method.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method for training a mobile terminal image segmentation model according to any one of claims 1 to 7.
CN202210396441.2A 2022-04-15 2022-04-15 Method and device for training image segmentation model of mobile terminal Active CN114494275B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210396441.2A CN114494275B (en) 2022-04-15 2022-04-15 Method and device for training image segmentation model of mobile terminal

Publications (2)

Publication Number Publication Date
CN114494275A true CN114494275A (en) 2022-05-13
CN114494275B CN114494275B (en) 2022-08-05

Family

ID=81489583

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210396441.2A Active CN114494275B (en) 2022-04-15 2022-04-15 Method and device for training image segmentation model of mobile terminal

Country Status (1)

Country Link
CN (1) CN114494275B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180276815A1 (en) * 2017-03-27 2018-09-27 Siemens Healthcare Gmbh Highly Integrated Annotation and Segmentation System for Medical Imaging
EP3611699A1 (en) * 2018-08-14 2020-02-19 Siemens Healthcare GmbH Image segmentation using deep learning techniques
US20200380687A1 (en) * 2018-01-03 2020-12-03 Ramot At Tel-Aviv University Ltd. Systems and methods for the segmentation of multi-modal image data
CN113256646A (en) * 2021-04-13 2021-08-13 浙江工业大学 Cerebrovascular image segmentation method based on semi-supervised learning
CN113409299A (en) * 2021-07-12 2021-09-17 北京邮电大学 Medical image segmentation model compression method
CN113822851A (en) * 2021-06-15 2021-12-21 腾讯科技(深圳)有限公司 Image segmentation method, device, equipment and storage medium
US20210407090A1 (en) * 2020-06-24 2021-12-30 Samsung Electronics Co., Ltd. Visual object instance segmentation using foreground-specialized model imitation
US20220036124A1 (en) * 2020-07-31 2022-02-03 Sensetime Group Limited Image processing method and device, and computer-readable storage medium
CN114067119A (en) * 2022-01-17 2022-02-18 深圳市海清视讯科技有限公司 Training method of panorama segmentation model, panorama segmentation method and device
US20220067940A1 (en) * 2020-08-27 2022-03-03 The Chinese University Of Hong Kong Robust machine learning for imperfect labeled image segmentation
CN114187318A (en) * 2021-12-10 2022-03-15 北京百度网讯科技有限公司 Image segmentation method and device, electronic equipment and storage medium
CN114255237A (en) * 2021-11-12 2022-03-29 深圳大学 Semi-supervised learning-based image segmentation model training method and segmentation method

Non-Patent Citations (8)

* Cited by examiner, † Cited by third party
Title
ALIREZA MEHRTASH等: "Confidence Calibration and Predictive Uncertainty Estimation for Deep Medical Image Segmentation", 《IEEE TRANS MED IMAGING》 *
VIGNESH SAMPATH等: "A survey on generative adversarial networks", 《JOURNAL OF BIG DATA》 *
LIU Shaopeng et al.: "Semi-supervised conditional generative adversarial network for medical image segmentation", Journal of Software *
LIU Qingfei et al.: "Real-time pixel-wise classification of agricultural images based on depthwise separable convolution", Scientia Agricultura Sinica *
YIN Wenfeng et al.: "Research progress on compression and acceleration of convolutional neural networks", Computer Systems & Applications *
WANG Ziwei et al.: "Development and research of a classroom-monitoring people-counting system based on Faster R-CNN", Computer Knowledge and Technology *
ZHAO Xuejun et al.: "Road extraction from remote sensing images based on conditional random field and U-net", Journal of Detection & Control *
MA Boyuan et al.: "Segmentation method for polycrystalline microstructure images based on deep learning and region awareness", Chinese Journal of Stereology and Image Analysis *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114708286A (en) * 2022-06-06 2022-07-05 珠海横琴圣澳云智科技有限公司 Cell instance segmentation method and device based on pseudo-label dynamic update
CN114708286B (en) * 2022-06-06 2022-08-26 珠海横琴圣澳云智科技有限公司 Cell instance segmentation method and device based on pseudo-label dynamic update

Also Published As

Publication number Publication date
CN114494275B (en) 2022-08-05

Similar Documents

Publication Publication Date Title
CN109740670B (en) Video classification method and device
CN111259940B (en) Target detection method based on space attention map
CN109558901B (en) Semantic segmentation training method and device, electronic equipment and storage medium
CN110765854A (en) Video motion recognition method
CN107545301B (en) Page display method and device
WO2019091402A1 (en) Method and device for age estimation
CN114332578A (en) Image anomaly detection model training method, image anomaly detection method and device
CN111489401B (en) Image color constancy processing method, system, device and storage medium
CN110135446B (en) Text detection method and computer storage medium
CN109919302B (en) Training method and device for neural network of image
CN114494275B (en) Method and device for training image segmentation model of mobile terminal
CN113313716B (en) Training method and device for automatic driving semantic segmentation model
WO2019241346A1 (en) Visual tracking by colorization
CN116110022B (en) Lightweight traffic sign detection method and system based on response knowledge distillation
CN116563738A (en) Uncertainty-based multi-stage guided small target semi-supervised learning detection method
CN113283334B (en) Classroom concentration analysis method, device and storage medium
CN110517247A (en) Obtain the method and device of information
CN116152577B (en) Image classification method and device
CN115953652B (en) Method, device, equipment and medium for pruning target detection network batch normalization layer
CN112288701A (en) Intelligent traffic image detection method
CN117152513A (en) Vehicle boundary positioning method for night scene
CN114998570B (en) Method and device for determining object detection frame, storage medium and electronic device
CN114722822B (en) Named entity recognition method, named entity recognition device, named entity recognition equipment and named entity recognition computer readable storage medium
Zeng et al. Evaluation of physical electrical experiment operation process based on YOLOv5 and ResNeXt cascade networks
CN112633407A (en) Method and device for training classification model, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant