CN111401102A - Deep learning model training method and device, electronic equipment and storage medium - Google Patents

Deep learning model training method and device, electronic equipment and storage medium

Info

Publication number
CN111401102A
Authority
CN
China
Prior art keywords
training
data
deep learning
learning model
labeling
Prior art date
Legal status
Granted
Application number
CN201910000456.0A
Other languages
Chinese (zh)
Other versions
CN111401102B (en)
Inventor
徐青青
寿文卉
张志鹏
Current Assignee
China Mobile Communications Group Co Ltd
China Mobile Communications Ltd Research Institute
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Communications Ltd Research Institute
Priority date
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd and China Mobile Communications Ltd Research Institute
Priority to CN201910000456.0A
Publication of CN111401102A
Application granted
Publication of CN111401102B
Legal status: Active
Anticipated expiration

Classifications

    • G06V40/197 Matching; Classification (eye characteristics, e.g. of the iris)
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2163 Partitioning the feature space
    • G06F18/2431 Classification techniques for multiple classes
    • G06N3/045 Combinations of networks
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06T7/0012 Biomedical image inspection
    • G06T7/12 Edge-based segmentation
    • G06T7/136 Segmentation involving thresholding
    • G06T7/155 Segmentation involving morphological operators
    • G06T7/187 Segmentation involving region growing, region merging or connected component labelling
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/30041 Eye; Retina; Ophthalmic

Abstract

The embodiment of the invention provides a deep learning model training method and device, electronic equipment and a storage medium. The method comprises the following steps: labeling first training data by using a first deep learning model to obtain a first labeling result; obtaining first feature data of each piece of first training data based on the first labeling result; acquiring a second labeling result of the first feature data; and training a second deep learning model based on a first training set formed by the first feature data and the second labeling result.

Description

Deep learning model training method and device, electronic equipment and storage medium
Technical Field
The invention relates to the technical field of information, in particular to a deep learning model training method and device, electronic equipment and a storage medium.
Background
A deep learning model is trained using a large amount of first training data. Deep learning models include image processing models that may be used for image segmentation. To train a deep learning model, first training data must be acquired, which generally requires manual labeling. In some specialized fields, such as the medical field, labeling the first training data requires professionals and is time-consuming and labor-intensive; for high-precision deep learning models in particular, the required labeling accuracy is very high, which imposes a huge expense on model training and can seriously reduce training efficiency.
Analyzing images such as medical images with deep learning requires a large number of labeled images. Image-level labeling of an image (for image classification) is easy, but giving pixel-level labels (for image segmentation) is very difficult; as a result, large amounts of data with image-level labels can often be collected easily, while pixel-level labels for the same data are hard to obtain. Existing image processing models for medical and similar images are therefore difficult to train, their first training data is difficult to acquire, and the models obtained by such training struggle to produce high-precision processing results.
Disclosure of Invention
In view of this, embodiments of the present invention are intended to provide a deep learning model training method and apparatus, an electronic device, and a storage medium.
The technical scheme of the invention is realized as follows:
a deep learning model training method comprises the following steps:
labeling the first training data by using a first deep learning model to obtain a first labeling result;
obtaining first feature data of each piece of first training data based on the first labeling result;
acquiring a second labeling result of the first feature data;
and training a second deep learning model based on a first training set formed by the first feature data and the second labeling result.
Based on the above scheme, the obtaining first feature data of each piece of the first training data based on the first labeling result includes:
and performing reverse derivation based on the first labeling result and the model structure of the first deep learning model, and determining the first feature data which enables the first training data to belong to a target class.
Based on the scheme, the first training data is a first training image; the first labeling result comprises: a probability that the first training image belongs to the target class;
the determining the first feature data which enables the first training data to belong to a target class based on the reverse derivation of the model structure of the first deep learning model and the first labeling result comprises:
determining a gradient of the probability;
and determining, based on the gradient of the probability and reverse conduction in combination with the first deep learning model, the response region of the first training image that causes the first training image to belong to the target class.
Based on the above scheme, the obtaining of the second labeling result of the first feature data includes:
visually outputting the response region;
acquiring a second labeling result for labeling the response area;
alternatively,
visually outputting the response region, and outputting a first labeling result corresponding to the response region, wherein the first labeling result is a labeling result output by the first deep learning model;
and obtaining the second labeling result based on the revision operation acted on the first labeling result.
Based on the above scheme, the visually outputting the response region further includes:
performing morphological processing on the response region to obtain the connected response region;
visually outputting the connected response regions.
Based on the scheme, the response area is a lesion area; the second labeling result comprises: the type of lesion and/or the extent of lesion in the lesion area.
Based on the above scheme, the method further comprises:
determining second feature data satisfying a similar condition to the first feature data from second training data;
conducting the second labeling result of the first feature data to the second feature data; the second feature data and the labeling result corresponding to the second feature data form a second training set;
training a second deep learning model based on a first training set composed of the first feature data and the second labeling result, including:
training the second deep learning model based on the first training set and a second training set.
Based on the above scheme, the method further comprises:
if the first training image is a first medical image, acquiring a first feature of a lesion area in the first medical image according to the first feature data; wherein the first feature comprises at least one of: the area of the lesion area, the perimeter of the lesion area, and the color of the lesion area;
the determining second feature data satisfying similar conditions to the first feature data from second training data includes:
and selecting a second medical image with a second feature and the first feature meeting the similar condition, wherein the second feature is a feature extracted from the second medical image.
Based on the above scheme, the method further comprises:
inputting the data of the first training set and/or the second training set into the second deep learning model which is trained to obtain third labeling information;
if the third labeling information is inconsistent with fourth labeling information and the third labeling information meets a preset condition, removing corresponding abnormal training data from the first training set and/or the second training set, wherein the fourth labeling information is labeling information obtained by the first deep learning model classifying the corresponding training data;
and optimizing the second deep learning model by utilizing the first training set and/or the second training set from which abnormal training data are eliminated.
A deep learning model training apparatus comprising:
the first obtaining module is used for labeling the first training data by using the first deep learning model to obtain a first labeling result;
a second obtaining module, configured to obtain first feature data of each piece of the first training data based on the first labeling result;
a third obtaining module, configured to obtain a second labeling result of the first feature data;
and the training module is used for training a second deep learning model based on a first training set formed by the first characteristic data and the second labeling result.
An electronic device, comprising:
a memory;
and the processor is connected with the memory and is used for realizing the deep learning model training method provided by one or more of the technical schemes by running the computer executable instructions stored on the memory.
A computer storage medium having stored thereon computer-executable instructions; after being executed, the computer-executable instructions implement the deep learning model training method provided by one or more of the above technical solutions.
According to the technical scheme provided by the embodiment of the invention, when the second deep learning model is trained, a first deep learning model that already has some classification or segmentation labeling capability labels each whole piece of first training data. Based on the first labeling result, the first feature data that causes a given piece of training data to belong to certain specific classes is obtained, which screens out the useful feature data in the first training data. Training the second deep learning model with this feature data means that annotators at least do not need to search all of the first training data for the first feature data, reducing their labeling workload. Meanwhile, training the second deep learning model on first feature data with useless data removed gives it a stronger labeling capability than the first deep learning model, which can be reflected in the accuracy of its labeling results. In this way, simplified training of a second depth model with high labeling capability is achieved.
Drawings
Fig. 1 is a schematic flowchart of a first deep learning model training method according to an embodiment of the present invention;
FIG. 2 is a schematic flowchart of a second deep learning model training method according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a deep learning model training apparatus according to a first embodiment of the present invention;
FIG. 4 is a flowchart illustrating a third deep learning model training method according to an embodiment of the present invention;
fig. 5 is a schematic diagram illustrating an effect of obtaining first feature data by using a first deep learning model according to an embodiment of the present invention;
FIGS. 6A and 6B are schematic diagrams illustrating the comparison between an original image and an image containing a lesion tissue alone;
FIGS. 7A and 7B are schematic diagrams comparing the original images with the images containing the lesion tissue alone, obtained by morphological processing of the images in FIGS. 6A and 6B;
FIG. 8 is a schematic flowchart of a training set for automatically labeling a second deep learning model according to an embodiment of the present invention;
fig. 9 is a schematic diagram of iterative training of a second depth model according to an embodiment of the present invention.
Detailed Description
The technical solution of the present invention is further described in detail with reference to the drawings and the specific embodiments of the specification.
As shown in fig. 1, the present embodiment provides a deep learning model training method, including:
step S110: labeling the first training data by using a first deep learning model to obtain a first labeling result;
step S120: obtaining first feature data of each piece of first training data based on the first labeling result;
step S130: acquiring a second labeling result of the first feature data;
step S140: and training a second deep learning model based on a first training set formed by the first feature data and the second labeling result.
The deep learning model training method provided by this embodiment can be applied to training devices for various deep learning models. The training device may be an electronic device such as a server.
The first deep learning model and the second deep learning model provided by this embodiment can be various types of deep learning models, such as neural networks; the neural network may include a convolutional neural network (CNN) or a fully-connected network (FNN), etc.
In this embodiment, a first deep learning model that has already been trained is used to label the first training data to obtain a first labeling result. For example, for images, an image-level first deep learning model may label a first training image to indicate whether an imaging region composed of lesions (referred to as a lesion region in this embodiment) exists in the image, and the lesion type and/or lesion degree corresponding to that lesion region.
After the first labeling result is obtained, the feature data in each piece of the first training data that causes it to be classified into a specific type is obtained; in this embodiment, this feature data is referred to as first feature data for distinction.
For example, taking medical images: in each medical image, the image region that causes the medical image to be classified into the class of images containing diseased tissue is found; that image region is the first feature data.
After the first feature data is found, a second labeling result of the first feature data is obtained. The second labeling result can be a manual labeling result, but at this point the annotator no longer needs to search for the feature data manually. For example, a physician is not required to circle the lesion region imaged by lesion tissue in a medical image; this obviously reduces the annotator's workload. Moreover, because the first deep learning model labels the first training data, the training data that contains no feature data has already been screened out by the first deep learning model, reducing the amount of data annotators must sift through; the labeling overhead and training time for the second deep learning model are reduced at both of these levels. Furthermore, because the second deep learning model is trained on the first feature data and the second labeling result rather than on whole pieces of first training data, the trained second deep learning model can label parts of subsequently input data with the corresponding results, so its precision is further improved relative to the first deep learning model.
In some embodiments, the second deep learning model may be a lesion region segmentation model for medical images. For example, the first training data is a medical image to be labeled, and the lesion region containing lesion tissue imaging in the medical image can be obtained through the processing in step S110 and step S120; the physician then provides second labeling information, such as lesion information (e.g., lesion type and/or lesion degree), by viewing the image of the lesion region. The pixel values of the lesion region and the second labeling information are used to train the second deep learning model, and the resulting model can segment the lesion regions and non-lesion regions in a medical image one by one, realizing image segmentation; after the second deep learning model segments the lesion region, a doctor can view only the lesion region to give a diagnosis suggestion.
In some embodiments, the second deep learning model may even output annotation information, which may be used as auxiliary information to assist the doctor's diagnosis.
In some embodiments, the step S120 may include:
and performing reverse derivation based on the first labeling result and the model structure of the first deep learning model, and determining the first feature data which enables the first training data to belong to a target class.
The first deep learning model classifies the first training data and may assign a whole piece of first training data to the target class because of only part of the data within it; step S120 obtains, by reverse derivation, the first feature data that causes the first training data to be assigned to the target class.
For example, the first training data is a first training image; the first labeling result comprises: a probability that the first training image belongs to the target class;
the step S120 may include:
determining a gradient of the probability;
and determining, based on the gradient of the probability and reverse conduction in combination with the first deep learning model, the response region of the first training image that causes the first training image to belong to the target class.
The gradient of the probability can be obtained by differentiating the probability. For example, the first deep learning model is a first neural network comprising an input layer, N intermediate hidden layers and an output layer; the output layer outputs the probability that the first training data belongs to each class. The first feature data is obtained by back-propagating the gradient of the probability layer by layer: the gradient is mapped back from the output layer to the Nth hidden layer, derived back from the Nth hidden layer to the (N-1)th hidden layer, and so on layer by layer, yielding the subset of the original data input at the input layer that most strongly influences assigning the first training data to the target class.
Therefore, the first deep learning model is used for realizing that the data with specific characteristics in the first training data are selected to form the first characteristic data. If the first training data is an image, the first feature data can be an image area where feature pixels enabling the image to belong to the target class are located; the image area is the response area. Once the response region is determined, pixels outside the image region can be regarded as pixels not causing the image to belong to the target class, and thus, taking the image as an example, pixel-level segmentation of the image is realized; equivalently, the characteristic pixels in the image are determined, and the contour of the region where the characteristic pixels are located is determined; therefore, a marking person is not required to manually define the feature area where the feature pixel is located.
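As an illustration only, the following Python sketch (using PyTorch, a framework choice the patent does not specify) shows one way this reverse derivation could be realized: the gradient of the target-class probability is back-propagated to the input image, and the strongest-responding pixels are kept as the response region. The quantile threshold and all names here are assumptions.

```python
import torch

def response_region(model, image, target_class, quantile=0.95):
    """Back-propagate the target-class probability gradient to the input
    image and keep the strongest-responding pixels (illustrative sketch)."""
    model.eval()
    x = image.unsqueeze(0).requires_grad_(True)             # (1, C, H, W)
    prob = torch.softmax(model(x), dim=1)[0, target_class]  # first labeling result
    prob.backward()                                         # reverse conduction, layer by layer
    saliency = x.grad.abs().max(dim=1)[0].squeeze(0)        # per-pixel response, (H, W)
    mask = saliency >= torch.quantile(saliency, quantile)   # response region
    return prob.item(), mask
```

Any image-level classifier with a softmax output fits this sketch; the resulting mask plays the role of the first feature data.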
In some embodiments, the step S130 may include:
visually outputting the response region;
and acquiring a second labeling result for labeling the response area.
The response area is output in image form; the annotator can then view the response area as an image and label it manually based on the features presented there, so that the electronic device receives the second labeling result through the human-computer interaction interface.
In some embodiments, the step S130 may include:
visually outputting the response region, and outputting a first labeling result corresponding to the response region, wherein the first labeling result is a labeling result output by the first deep learning model; and obtaining the second labeling result based on the revision operation acted on the first labeling result.
In this embodiment, after the response region is output as an image, the first labeling result produced by the first deep learning model is output synchronously. A professional annotator then judges it: if the first labeling result is accurate, the annotator does not need to enter a labeling result manually, which further reduces the labeling workload. If the annotator considers the labeling result incorrect, the editable first labeling result can be revised, so that the electronic device receives the revision operation input by the annotator and obtains the second labeling result.
In some embodiments, the step S130 may further include:
performing morphological processing on the response region to obtain the connected response region;
visually outputting the connected response regions.
The response region obtained in step S120 may be a non-closed region or a non-connected image region due to errors of the model itself. In this embodiment, connected response regions are obtained by morphological processing. The morphological processing comprises the following steps:
first, the response region is expanded by a dilation operation; after the dilation, an equal-pixel erosion is applied to the outer contour of the dilated response region, so that gaps inside the response region are eliminated and the response region forms a closed connected region. At the same time, strip-shaped regions that extended outward before the morphological processing are eroded away by the erosion of the outer contour, yielding a complete and smooth connected response region. Through this morphological processing the response region can be obtained more accurately, which improves the accuracy of the second deep learning model trained on the data.
For example, the dilation and erosion operations are performed in sequence: the dilation expands both the inner and outer boundaries of the response region by N pixels, and the erosion then shrinks the outer boundary by N pixels. N may be a positive integer not less than 1; for example, N may be 3, 4, or 5.
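With OpenCV, the dilate-then-erode sequence above is a morphological closing. The sketch below is a minimal version assuming a binary response mask and an elliptical kernel sized from N; both are assumptions.

```python
import cv2
import numpy as np

def connect_response_region(mask: np.ndarray, n: int = 3) -> np.ndarray:
    """Dilate then erode the binary response mask so internal gaps close
    and thin outward strips are removed (n per the text, e.g. 3, 4 or 5)."""
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (2 * n + 1, 2 * n + 1))
    dilated = cv2.dilate(mask.astype(np.uint8), kernel)  # expand boundaries by ~n pixels
    return cv2.erode(dilated, kernel)                    # erode the outer contour back
```

The same result can be obtained in one call with cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel).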
In some embodiments, the response region is a lesion region; the second labeling result comprises: the type of lesion and/or the extent of lesion in the lesion region.
For example, for a human eye image, labeling a specific region in the eye image may mark it as one of the types: normal, diabetic retinopathy ("sugar net") lesion, or a lesion of another eye disease. Meanwhile, the lesion degree may indicate different severities of a lesion, e.g., mild, moderate, or severe.
In this embodiment, the first training data may be data other than images, for example motion data detected by a motion sensor, and the first labeling result is labeling data that labels a piece of motion data as a certain motion. The first feature data may be the data within the motion data that causes it to be labeled as that motion. For example, a piece of motion data is an N-dimensional vector, with the data of each dimension being an element; the first feature data may be an M-dimensional vector selected from the N-dimensional vector, where M is less than or equal to N.
In some embodiments, as shown in fig. 2, the method further comprises:
step S131: determining second feature data satisfying a similar condition to the first feature data from second training data;
step S132: conducting the second labeling result of the first feature data to the second feature data; the second feature data and the labeling result corresponding to the second feature data form a second training set;
the step S140 may include the step S141:
the step S141: training the second deep learning model based on the first training set and a second training set.
In this embodiment, the second training data may be data that does not require manual labeling. A feature-similarity approach is used to find the second feature data in the second training data that satisfies the similar condition with the first feature data, so that the training set is expanded on the basis of manual labeling or manual revision, yielding more training data.
For example, a non-deep-learning model performs feature comparison, finds from the second training data the second feature data whose similarity to the first feature data reaches a preset threshold, and conducts the second labeling result of the first feature data to the second feature data, realizing automatic labeling of the second feature data. In this way, annotators label only a small amount of data yet obtain a training set comprising a large amount of data.
For example, taking images: the response region may be input into a support vector machine (SVM), which extracts features of the response region; response regions of second training images, obtained by reverse derivation from labeling results of the first deep learning model, are then input into the SVM, which, through feature extraction and comparison, automatically finds the response regions in the second training images that are sufficiently similar to the manually labeled ones. By conducting the labeling information, labeled second training data is generated automatically.
For example, if data A in the second training data has a similarity to the first feature data exceeding a preset threshold, such as 70%, 80%, or 85%, the second labeling result of the first feature data may be directly assigned to data A, completing the conduction of the labeling result.
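A minimal sketch of this label conduction, assuming cosine similarity over already-extracted feature vectors and a 0.8 threshold; the metric, the threshold, and all names are illustrative assumptions (the patent equally allows an SVM-based comparison).

```python
import numpy as np

def propagate_labels(first_feats, first_labels, second_feats, thresh=0.8):
    """Copy a manual label onto any unlabeled feature vector whose cosine
    similarity to a labeled vector exceeds thresh."""
    def cos(a, b):
        return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    second_set = []
    for feat in second_feats:
        sims = [cos(feat, f) for f in first_feats]
        best = int(np.argmax(sims))
        if sims[best] >= thresh:               # similar condition satisfied
            second_set.append((feat, first_labels[best]))
    return second_set                          # forms the second training set
```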
In step S140, the second deep learning model is trained on the first training set and the second training set together; because the number of training samples is increased, the model parameters of the second deep learning model become more accurate, yielding a second deep learning model with higher labeling capability.
In some embodiments, the method further comprises:
if the first training image is a first medical image, acquiring a first feature of a lesion area in the first medical image according to the first feature data; wherein the first feature comprises at least one of: the area of the lesion area, the perimeter of the lesion area, and the color of the lesion area;
the step S132 may include: and selecting a second medical image with a second feature and the first feature meeting the similar condition, wherein the second feature is a feature extracted from the second medical image.
The lesion region may be the aforementioned response region. The area and perimeter of the lesion region and the color of the lesion region reflect characteristics of the imaged lesion tissue, and can be used by a non-deep-learning model such as an SVM to expand the training data.
In some embodiments, the method further comprises:
step S150: inputting the data of the first training set and/or the second training set into the second deep learning model which is trained to obtain third labeling information;
step S160: if the third labeling information is inconsistent with fourth labeling information and the third labeling information meets a preset condition, removing corresponding abnormal training data from the first training set and/or the second training set, wherein the fourth labeling information is: the first deep learning model is used for carrying out classification processing on the second training data to obtain the second training data;
step S170: and optimizing the second deep learning model by utilizing the first training set and/or the second training set from which abnormal training data are eliminated.
In this embodiment, the second deep learning model is iterated repeatedly. After being trained with the first training set and/or the second training set, the second deep learning model obtains preliminary model parameters. The preliminarily trained model is then run on the training data, and the labeling results it outputs are compared with the initial labeling results. When a labeling result differs from the initial label and the probability that the second model's result is correct exceeds a certain probability threshold, the corresponding training data is likely abnormal. To avoid the negative influence of such abnormal data on the second deep learning model, in this embodiment the abnormal training data is removed, and the second deep learning model is iteratively optimized or retrained with the training set from which the abnormal data has been removed.
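A sketch of this cleanup step; the predict helper, the dataset layout, and the wiring of the probability threshold are assumptions for illustration.

```python
def filter_abnormal(model, dataset, predict, prob_threshold=0.7):
    """Drop samples whose initial label disagrees with the trained model's
    confident prediction; the model is then retrained on what remains."""
    kept = []
    for sample, initial_label in dataset:
        pred_label, pred_prob = predict(model, sample)   # hypothetical helper
        if pred_label != initial_label and pred_prob > prob_threshold:
            continue                                     # initial label presumed wrong; discard
        kept.append((sample, initial_label))
    return kept
```

Example 1 below repeats this filter-and-retrain cycle until the detection results show no obvious errors.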
As shown in fig. 3, the present embodiment provides a deep learning model training apparatus, including:
a first obtaining module 110, configured to label the first training data by using the first deep learning model to obtain a first labeling result;
a second obtaining module 120, configured to obtain first feature data of each piece of the first training data based on the first labeling result;
a third obtaining module 130, configured to obtain a second labeling result of the first feature data;
the training module 140 is configured to train a second deep learning model based on a first training set formed by the first feature data and the second labeling result.
In some embodiments, the first obtaining module 110, the second obtaining module 120, the third obtaining module 130, and the training module 140 may be program modules, which, after being executed by a processor, are capable of obtaining the first labeling result, the first feature data, and the second labeling result, and completing training of the second deep learning model.
In other embodiments, the first obtaining module 110, the second obtaining module 120, the third obtaining module 130, and the training module 140 may be combined hardware-software modules, such as programmable arrays, e.g., field-programmable gate arrays or complex programmable logic devices.
In still other embodiments, the first obtaining module 110, the second obtaining module 120, the third obtaining module 130, and the training module 140 may be pure hardware modules, which may be application specific integrated circuits.
In some embodiments, the second obtaining module 120 is specifically configured to perform a reverse derivation based on the first labeling result and a model structure of the first deep learning model, and determine the first feature data that causes the first training data to belong to a target class.
In some embodiments, the first training data is a first training image; the first labeling result comprises: a probability that the first training image belongs to the target class;
the second obtaining module 120 is specifically configured to determine a gradient of the probability; and to determine, based on the gradient of the probability and reverse conduction in combination with the first deep learning model, the response region of the first training image that causes the first training image to belong to the target class.
In some embodiments, the third obtaining module 130 is specifically configured to visually output the response region and acquire a second labeling result for labeling the response region;
or to visually output the response region together with the first labeling result corresponding to the response region, wherein the first labeling result is the labeling result output by the first deep learning model, and to obtain the second labeling result based on a revision operation applied to the first labeling result.
In some embodiments, the second obtaining module 120 is further configured to perform morphological processing on the response regions to obtain connected response regions; visually outputting the connected response regions.
In some embodiments, the response region is a lesion region; the second labeling result comprises: the type of lesion and/or the extent of lesion in the lesion region.
In some embodiments, the apparatus further comprises:
a determining module, configured to determine second feature data from second training data, wherein the second feature data satisfies a similar condition with the first feature data;
the transmission module is used for conducting the second labeling result of the first feature data to the second feature data; the second feature data and the labeling result corresponding to the second feature data form a second training set;
the training module 140 is specifically configured to train the second deep learning model based on the first training set and the second training set.
In some embodiments, the apparatus further comprises:
the feature extraction module is used for acquiring a first feature of a lesion area in a first medical image according to the first feature data if the first training image is the first medical image; wherein the first feature comprises at least one of: the area of the lesion area, the perimeter of the lesion area, and the color of the lesion area;
the determining module is specifically configured to select a second medical image of which a second feature and the first feature satisfy the similarity condition, where the second feature is a feature extracted from the second medical image.
Several specific examples are provided below in connection with any of the embodiments described above:
example 1:
This example provides a segmentation model training method based on weakly supervised learning, which does not need to know the target contour of each picture in advance. The specific steps are as follows:
training a classification model: training an image classification model on the premise of only image class marking;
model visualization: and then, giving the classification probability of the appointed class of all the training pictures, calculating the gradient of the probability value and reversely transmitting the gradient to the image, and obtaining a response area which has the maximum influence on the class in the original image, for example, obtaining the position of a target lesion in the medical image.
Lesion segmentation: the response region is segmented in an unsupervised manner, for example based on morphological transformation.
Lesion type classification: connected domains are extracted from the segmentation result of the previous step, the lesion types of a small number of connected domains are labeled, and corresponding features are extracted to classify the lesions.
Deep segmentation network training: the obtained lesion contours are used as the gold standard to train a deep segmentation network.
A classification network is trained on the large amount of image-level annotation data that already exists. For example, a 4-class network is trained on fundus images, the 4 categories being: normal fundus, diabetic retinopathy ("sugar net"), other eye diseases, and image unavailable.
For any picture, the classification network yields the probability P of each class and the predicted class C. For an image of class 1 (a sugar-net image), the gradient of class 1 is calculated, and its response on the image is obtained by back conduction. Because the visualization results are rather discrete, a morphological closing operation is first applied to remove bright spots and fill gaps; threshold segmentation is then performed on the closing result, connected domains are counted on the segmented binary image, and finally the circumscribed rectangle of each connected domain is used as the mask for GrabCut to segment the original image.
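The following OpenCV sketch strings these steps together, assuming an 8-bit 3-channel original image and the back-propagated response map from the previous step as input; the kernel size, the relative threshold, and the GrabCut iteration count are assumptions.

```python
import cv2
import numpy as np

def segment_lesions(image, saliency, n=3, rel_thresh=0.5):
    """Closing -> thresholding -> connected domains -> circumscribed
    rectangles as GrabCut seeds over the original image (sketch)."""
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (2 * n + 1, 2 * n + 1))
    closed = cv2.morphologyEx(saliency.astype(np.float32), cv2.MORPH_CLOSE, kernel)
    binary = (closed > rel_thresh * closed.max()).astype(np.uint8)
    num, labels = cv2.connectedComponents(binary)
    lesion_masks = []
    for i in range(1, num):                              # label 0 is background
        ys, xs = np.where(labels == i)
        rect = (int(xs.min()), int(ys.min()),
                int(xs.max() - xs.min() + 1), int(ys.max() - ys.min() + 1))
        mask = np.zeros(image.shape[:2], np.uint8)
        bgd = np.zeros((1, 65), np.float64)
        fgd = np.zeros((1, 65), np.float64)
        cv2.grabCut(image, mask, rect, bgd, fgd, 5, cv2.GC_INIT_WITH_RECT)
        lesion_masks.append(((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD)).astype(np.uint8))
    return lesion_masks
```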
A small number of pictures are selected and their segmentation results are given to a doctor, who labels the lesion category of each connected domain (four categories: hemorrhage, microaneurysm, hard exudate, and background). Note that only category labeling is required here; the doctor does not need to outline the lesion. And because the subsequent lesion-category model is trained with a non-deep-learning method (such as an SVM), the required data amount is small, so the doctor's workload is not significantly increased.
The original images labeled by the doctor are first high-pass filtered to remove the influence of illumination; then, for each connected domain in the original image, the following features are extracted to judge its category (a feature-extraction sketch follows the list):
area of the connected domain;
perimeter of the connected domain;
three-channel color of the connected domain relative to the three-channel color of the whole image.
Connected domains are classified into the four categories of hemorrhage, microaneurysm, hard exudate, and background.
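A sketch of the per-domain features and the non-deep classifier, assuming 8-bit BGR images and scikit-learn's SVC; the library choice and all names are assumptions.

```python
import cv2
import numpy as np
from sklearn.svm import SVC

def domain_features(image, domain_mask):
    """Area, perimeter, and per-channel color of one connected domain
    relative to the whole image, per the feature list above."""
    m = domain_mask.astype(np.uint8)
    area = float(m.sum())
    contours, _ = cv2.findContours(m, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    perimeter = sum(cv2.arcLength(c, True) for c in contours)
    domain_color = image[m.astype(bool)].mean(axis=0)     # mean B, G, R inside the domain
    image_color = image.reshape(-1, 3).mean(axis=0)       # mean B, G, R of the whole image
    return np.concatenate(([area, perimeter], domain_color / (image_color + 1e-6)))

def train_lesion_classifier(features, labels):
    """4-class SVM: hemorrhage, microaneurysm, hard exudate, background."""
    clf = SVC(kernel="rbf")
    clf.fit(np.asarray(features), np.asarray(labels))
    return clf
```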
The result obtained above is used as the initial label for image segmentation and is used to train a deep segmentation network. For example, the picture and the initial label are each cropped into 224 × 224 patches, only the patches containing lesions are kept as training samples, and a DeepLab or U-Net network is trained.
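A sketch of that patch extraction, assuming non-overlapping 224-pixel tiling and treating "contains a lesion" as "has any nonzero label pixel"; both are assumptions.

```python
import numpy as np

def lesion_patches(image, initial_label, size=224):
    """Tile the image and its initial label and keep only patches with
    lesions as training samples for the deep segmentation network."""
    patches = []
    h, w = initial_label.shape[:2]
    for y in range(0, h - size + 1, size):
        for x in range(0, w - size + 1, size):
            lab = initial_label[y:y + size, x:x + size]
            if lab.any():                                 # patch contains a lesion
                patches.append((image[y:y + size, x:x + size], lab))
    return patches
```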
Because the classification network cannot guarantee 100% accuracy, and the visualization result cannot detect 100% of lesions, negative samples may contain some lesions and need to be discarded during training. The classification network here may be the first deep learning model described above. The second deep learning model may be further optimized as follows:
firstly, training all pictures to obtain an initial model, detecting all first training data by using the initial model, comparing a detection result with an initial label, if the initial label is non-pathological, but the detection result is pathological and the probability value is high (taking the probability value to be more than 0.7), considering that the initial label is wrong, and discarding the picture from a training sample. And after all the training pictures are detected once, retraining until no obvious error occurs in the detection result.
This example provides a deep segmentation model training method based on deep classification model visualization: a deep classification model is trained on medical images with class labels; the model is applied to all the first training data one by one to compute the visualization results of the specified class, and the visualization results are segmented by an unsupervised method. A lesion-type recognition model is then trained on the segmentation results to acquire the lesion contours and categories of all the first training data, which are used as the gold standard for training the deep segmentation network.
In conclusion, in this example the segmentation labeling result is obtained through an automated process using the image-level category labels, without requiring segmentation labeling by a doctor;
and testing the training data by adopting the model obtained by training, and screening error labels according to the test result and the probability value so as to improve the accuracy of the model.
Compared with the prior art mentioned above, in which the lesion contour of the first training data need not be labeled but the position of the lesion (i.e., its circumscribed rectangle) still must be, this scheme only requires labeling the category of the whole image, so the doctor's workload can be further reduced.
In the prior art, the result of unsupervised segmentation is used directly as the final result; that is, at test time or in actual application, to obtain the segmentation result of an image, a target detection model first detects the lesion position and an unsupervised segmentation algorithm then segments the lesion contour. In this scheme, because a deep segmentation network has been trained, actual application only requires running the segmentation network directly to obtain the segmentation result. On the other hand, given the accuracy limits of the classification network, the unsupervised segmentation results may contain errors; this example therefore further trains a deep segmentation network on the unsupervised results and improves accuracy through iterative testing and sample screening during training, so the accuracy is higher than that of unsupervised detection.
Example 2:
as shown in fig. 4, the present example provides a deep learning model training method, including:
classifying the training pictures by using a classification model; the classification model is one of the first deep learning models;
visually displaying the result, for example, displaying an image of the lesion tissue based on the classification result;
performing lesion segmentation based on processing such as reverse derivation (back conduction);
classifying the types of the lesions;
combining lesion type classification and training images to perform deep segmentation network training; the deep segmentation network here corresponds to the second deep learning model described above.
Referring to fig. 5, the forward pass of the classification model is used to calculate the probability of each class; for example, the eye image is classified into the 4 classes, and 4 class probabilities are obtained;
the gradient is calculated along a backward path, and based on this gradient the image region that causes a picture to be assigned to the class with the highest classification probability is derived in reverse.
In FIGS. 6A and 6B, the left images are the original images, corresponding to the first training data of the present application;
the right images are visualizations of the lesion tissue, including microaneurysms, hemorrhage, and hard exudate. The response regions corresponding to the aforementioned first feature data are marked with circles on the original images.
It can be observed from FIGS. 6A and 6B that the characteristic pixels in the visualization results are relatively discrete; the results shown in FIGS. 7A and 7B are obtained by morphological processing. The left images of FIGS. 7A and 7B are likewise the original images with the response regions circled. Through the morphological processing, connected lesion regions are obtained, shown as the white regions in FIGS. 7A and 7B.
Specifically referring to fig. 8, the method for obtaining training data in the deep learning model training method may include:
displaying a visual result;
morphological transformation, corresponding to the aforementioned morphological processing;
binarization, i.e., representing characteristic pixels and non-characteristic pixels with different values, thereby generating a mask for extracting the response region from the original image;
extracting each connected region of the mask and applying a segmentation algorithm such as GrabCut to obtain the response region.
Extracting category characteristics by combining the response area and the original image;
low-pass filtering is carried out on a part of original images, and interference information is filtered out, for example, the imaging influence of illumination on lesion tissues is filtered out;
and acquiring lesion-category rules and judging the lesion category with a non-deep model such as an SVM based on feature similarity, so as to expand the training set.
As shown in fig. 9, the deep learning model training method may include:
performing deep network training with the initial labels and training pictures (mainly patches containing lesion tissue imaging) to obtain a deep segmentation network;
testing with the deep segmentation network and finding, by comparison, the pictures whose labels are inconsistent;
and judging whether the deep segmentation network's probability for an inconsistent picture is greater than a probability threshold; if so, the picture is discarded, and the deep segmentation network is retrained using the training set from which the abnormal pictures have been discarded.
The present embodiment also provides an electronic device, including:
a memory;
and a processor connected with the memory, configured to implement the deep learning model training method provided by one or more of the foregoing technical solutions by running the computer-executable instructions stored on the memory, and specifically to implement the methods shown in the foregoing figures.
The memory may be various types of memory, such as random access memory, read only memory, or flash memory.
The processor may be various types of processors, such as a microprocessor, a digital signal processor, a programmable array, or a central processing unit, among others.
The processor may be connected to the memory through an integrated bus.
The present embodiments provide a computer storage medium having stored thereon computer-executable instructions; after being executed, the computer-executable instructions implement the deep learning model training method provided by one or more of the above technical solutions, and may specifically implement the methods shown in the foregoing figures.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described device embodiments are merely illustrative, for example, the division of the unit is only a logical functional division, and there may be other division ways in actual implementation, such as: multiple units or components may be combined, or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed on a plurality of network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, all the functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may be separately used as one unit, or two or more units may be integrated into one unit; the integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
Those of ordinary skill in the art will understand that: all or part of the steps for implementing the method embodiments may be implemented by hardware related to program instructions, and the program may be stored in a computer readable storage medium, and when executed, the program performs the steps including the method embodiments; and the aforementioned storage medium includes: a mobile storage device, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims (12)

1. A deep learning model training method is characterized by comprising the following steps:
labeling the first training data by using a first deep learning model to obtain a first labeling result;
obtaining first feature data of each piece of first training data based on the first labeling result;
acquiring a second labeling result of the first feature data;
and training a second deep learning model based on a first training set formed by the first feature data and the second labeling result.
2. The method according to claim 1, wherein the obtaining first feature data of each piece of the first training data based on the first labeling result comprises:
and performing reverse derivation based on the first labeling result and the model structure of the first deep learning model, and determining the first feature data which enables the first training data to belong to a target class.
3. The method of claim 2, wherein the first training data is a first training image; the first labeling result comprises: a probability that the first training image belongs to the target class;
the determining the first feature data which enables the first training data to belong to a target class based on the reverse derivation of the model structure of the first deep learning model and the first labeling result comprises:
determining a gradient of the probability;
and determining, based on the gradient of the probability and reverse conduction in combination with the first deep learning model, the response region of the first training image that causes the first training image to belong to the target class.
4. The method of claim 3,
the obtaining of the second labeling result of the first feature data includes:
visually outputting the response region;
acquiring a second labeling result for labeling the response area;
alternatively,
visually outputting the response region, and outputting a first labeling result corresponding to the response region, wherein the first labeling result is a labeling result output by the first deep learning model;
and obtaining the second labeling result based on the revision operation acted on the first labeling result.
5. The method of claim 4, wherein the visually outputting the response region further comprises:
performing morphological processing on the response region to obtain a connected response region;
visually outputting the connected response region.
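The morphological processing of claim 5 can be realized with a standard closing operation, which merges nearby activated pixels into one connected region before visualization. A sketch using OpenCV; the threshold and kernel size are illustrative choices, not values from the claim:

```python
import cv2
import numpy as np

def connect_response(saliency, thresh=0.5, ksize=7):
    """Threshold the response map, then morphologically close it so
    nearby activations merge into one connected response region."""
    mask = (saliency >= thresh).astype(np.uint8) * 255       # binary response mask
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (ksize, ksize))
    return cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)   # dilation followed by erosion
```

Closing is one common choice; the claim covers morphological processing generally.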
6. The method of claim 3,
the response region is a lesion area; and the second labeling result comprises: the lesion type and/or the lesion degree of the lesion area.
7. The method according to any one of claims 1 to 6, further comprising:
determining, from second training data, second feature data that satisfies a similarity condition with the first feature data;
conducting the second labeling result of the first feature data to the second feature data, the second feature data and its corresponding labeling result forming a second training set;
wherein the training a second deep learning model based on a first training set formed by the first feature data and the second labeling result comprises:
training the second deep learning model based on the first training set and the second training set.
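Claim 7 conducts labels from labeled feature data to similar unlabeled feature data. A sketch in which cosine similarity above a fixed threshold stands in for the "similarity condition", which the claim leaves unspecified:

```python
import numpy as np

def propagate_labels(first_feats, first_labels, second_feats, sim_thresh=0.9):
    """Conduct each first feature vector's second labeling result to
    every second feature vector that is sufficiently similar to it.
    The cosine-similarity threshold is an illustrative choice."""
    a = first_feats / np.linalg.norm(first_feats, axis=1, keepdims=True)
    b = second_feats / np.linalg.norm(second_feats, axis=1, keepdims=True)
    sims = b @ a.T                              # pairwise cosine similarities
    second_training_set = []
    for i, row in enumerate(sims):
        j = int(row.argmax())                   # most similar labeled sample
        if row[j] >= sim_thresh:                # similarity condition satisfied
            second_training_set.append((second_feats[i], first_labels[j]))
    return second_training_set                  # the "second training set"
```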
8. The method of claim 7, further comprising:
if the first training image is a first medical image, acquiring a first characteristic of a lesion area in the first medical image according to the first feature data, wherein the first characteristic comprises at least one of: the area of the lesion area, the perimeter of the lesion area, and the color of the lesion area;
wherein the determining, from second training data, second feature data that satisfies the similarity condition with the first feature data comprises:
selecting a second medical image whose second characteristic satisfies the similarity condition with the first characteristic, wherein the second characteristic is a characteristic extracted from the second medical image.
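The three first characteristics named in claim 8 (area, perimeter, and color of the lesion area) can be read off a binary lesion mask. A sketch using scikit-image; both inputs are hypothetical stand-ins for the patent's lesion mask and medical image:

```python
import numpy as np
from skimage.measure import label, regionprops

def lesion_characteristics(mask, image):
    """Compute area, perimeter, and mean colour of the largest lesion
    region in a binary mask paired with an RGB image (claim 8)."""
    regions = regionprops(label(mask.astype(int)))
    if not regions:
        return None
    lesion = max(regions, key=lambda r: r.area)       # largest connected lesion
    rr, cc = lesion.coords[:, 0], lesion.coords[:, 1]
    return {
        "area": lesion.area,
        "perimeter": lesion.perimeter,
        "color": image[rr, cc].mean(axis=0),          # mean RGB inside the lesion
    }
```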
9. The method of claim 7, further comprising:
inputting data of the first training set and/or the second training set into the trained second deep learning model to obtain third labeling information;
if the third labeling information is inconsistent with fourth labeling information and the third labeling information meets a preset condition, removing the corresponding abnormal training data from the first training set and/or the second training set, wherein the fourth labeling information is a labeling result obtained by the first deep learning model performing classification processing on the second training data;
and optimizing the second deep learning model by using the first training set and/or the second training set from which the abnormal training data have been removed.
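Claim 9 prunes training samples on which the trained second model confidently disagrees with the first model's labeling. A sketch in which a confidence threshold stands in for the claim's unspecified "preset condition"; second_model is assumed to return a class-probability vector:

```python
import numpy as np

def remove_abnormal(samples, second_model, fourth_labels, conf_thresh=0.95):
    """Drop samples where the second model's prediction (third labeling)
    contradicts the first model's labeling (fourth labeling) with high
    confidence. The threshold is an illustrative preset condition."""
    kept = []
    for (x, y), y4 in zip(samples, fourth_labels):
        probs = np.asarray(second_model(x))           # assumed probability vector
        third = int(probs.argmax())                   # third labeling information
        if third != y4 and probs[third] >= conf_thresh:
            continue                                  # abnormal training data: excluded
        kept.append((x, y))
    return kept                                       # cleaned set for further optimization
```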
10. A deep learning model training device, comprising:
a first obtaining module, configured to label the first training data by using a first deep learning model to obtain a first labeling result;
a second obtaining module, configured to obtain first feature data of each piece of the first training data based on the first labeling result;
a third obtaining module, configured to obtain a second labeling result of the first feature data;
and a training module, configured to train a second deep learning model based on a first training set formed by the first feature data and the second labeling result.
11. An electronic device, comprising:
a memory;
a processor coupled to the memory and configured to implement the method of any one of claims 1 to 9 by executing computer-executable instructions stored in the memory.
12. A computer storage medium having stored thereon computer-executable instructions, wherein the computer-executable instructions, when executed, implement the method of any one of claims 1 to 9.
CN201910000456.0A 2019-01-02 2019-01-02 Deep learning model training method and device, electronic equipment and storage medium Active CN111401102B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910000456.0A CN111401102B (en) 2019-01-02 2019-01-02 Deep learning model training method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910000456.0A CN111401102B (en) 2019-01-02 2019-01-02 Deep learning model training method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111401102A 2020-07-10
CN111401102B (en) 2023-11-14

Family

ID=71430184

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910000456.0A Active CN111401102B (en) 2019-01-02 2019-01-02 Deep learning model training method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111401102B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3252671A1 (en) * 2016-05-31 2017-12-06 Siemens Healthcare GmbH Method of training a deep neural network
CN108960232A (en) * 2018-06-08 2018-12-07 Oppo广东移动通信有限公司 Model training method, device, electronic equipment and computer readable storage medium
CN108829683A (en) * 2018-06-29 2018-11-16 北京百度网讯科技有限公司 Mixing mark learning neural network model and its training method, device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Li Zhixin et al.: "Image Semantic Annotation Combining Deep Features and Multi-Label Classification", Journal of Computer-Aided Design & Computer Graphics *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112233042A (en) * 2020-11-05 2021-01-15 中国人民解放军国防科技大学 Method for rapidly generating large-scene SAR image containing non-cooperative target
CN112233042B (en) * 2020-11-05 2021-05-11 中国人民解放军国防科技大学 Method for rapidly generating large-scene SAR image containing non-cooperative target
CN112784905A (en) * 2021-01-26 2021-05-11 北京嘀嘀无限科技发展有限公司 Data sample expansion method and device and electronic equipment
CN113378850A (en) * 2021-06-28 2021-09-10 北京百度网讯科技有限公司 Model training method, road surface damage segmentation method, device and electronic equipment
CN113378850B (en) * 2021-06-28 2023-10-13 北京百度网讯科技有限公司 Model training method, pavement damage segmentation device and electronic equipment

Also Published As

Publication number Publication date
CN111401102B (en) 2023-11-14

Similar Documents

Publication Publication Date Title
CN110232383B (en) Focus image recognition method and focus image recognition system based on deep learning model
CN109741346B (en) Region-of-interest extraction method, device, equipment and storage medium
CN108010021B (en) Medical image processing system and method
CN111325739B (en) Method and device for detecting lung focus and training method of image detection model
Saad et al. Image segmentation for lung region in chest X-ray images using edge detection and morphology
CN108133476B (en) Method and system for automatically detecting pulmonary nodules
US20090097728A1 (en) System and Method for Detecting Tagged Material Using Alpha Matting
CN112150442A (en) New crown diagnosis system based on deep convolutional neural network and multi-instance learning
CN111598875A (en) Method, system and device for building thyroid nodule automatic detection model
CN111401102B (en) Deep learning model training method and device, electronic equipment and storage medium
CN112365973A (en) Pulmonary nodule auxiliary diagnosis system based on countermeasure network and fast R-CNN
CN114332132A (en) Image segmentation method and device and computer equipment
Kundu et al. Probability density function based modeling of spatial feature variation in capsule endoscopy data for automatic bleeding detection
CN116188485A (en) Image processing method, device, computer equipment and storage medium
Ratheesh et al. Advanced algorithm for polyp detection using depth segmentation in colon endoscopy
CN117274278B (en) Retina image focus part segmentation method and system based on simulated receptive field
Fiçici et al. Fully automated brain tumor segmentation and volume estimation based on symmetry analysis in MR images
Azli et al. Ultrasound image segmentation using a combination of edge enhancement and Kirsch's template method for detecting follicles in ovaries
CN112862786B (en) CTA image data processing method, device and storage medium
WO2022089266A1 (en) Blood vessel lumen extraction method and apparatus, electronic device and storage medium
CN112862785B (en) CTA image data identification method, device and storage medium
CN111612755A (en) Lung focus analysis method, device, electronic equipment and storage medium
CN114612373A (en) Image identification method and server
Choong et al. Performance analysis of multi-level thresholding for microaneurysm detection
CN112862787B (en) CTA image data processing method, device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant