CN114612451A - Multi-organ segmentation method and system based on multi-source integrated distillation - Google Patents

Multi-organ segmentation method and system based on multi-source integrated distillation

Info

Publication number
CN114612451A
Authority
CN
China
Prior art keywords
organ
teacher
model
segmentation
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210268374.6A
Other languages
Chinese (zh)
Inventor
王延峰
张乐飞
冯世祥
王钰
张娅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University filed Critical Shanghai Jiaotong University
Priority to CN202210268374.6A priority Critical patent/CN114612451A/en
Publication of CN114612451A publication Critical patent/CN114612451A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/082Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H30/00ICT specially adapted for the handling or processing of medical images
    • G16H30/40ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a multi-organ segmentation method and system based on multi-source integrated distillation, comprising the following steps: inputting a picture, and predicting with the teacher models and the student model to obtain multiple segmentation maps; performing output conversion on the teacher segmentation maps, and expanding them to multi-class outputs; converting the outputs into organ and background regions according to the region-based masks; performing supervised learning separately on the different organ regions and the background region to align the converted prediction results of the teacher models and the student model, so as to obtain an efficient student multi-organ segmentation model; and inputting the picture to be predicted, and obtaining an organ segmentation prediction result through the student multi-organ segmentation model. The invention uses the multi-source integrated distillation method to guide the training of a multi-organ segmentation model with better performance, thereby realizing more accurate unsupervised multi-organ segmentation under privacy-friendly conditions.

Description

Multi-organ segmentation method and system based on multi-source integrated distillation
Technical Field
The invention relates to the technical field of computer vision, image processing and computer-aided medical diagnosis, in particular to a multi-organ segmentation method and system based on multi-source integrated distillation.
Background
Organ segmentation is one of the basic tasks of computer-aided medical diagnosis and plays a crucial role in disease discovery and subsequent treatment. The purpose of medical image segmentation is to extract an anatomical region of interest from a medical image in an automatic or semi-automatic manner. Depending on the clinical application, the specific region of interest may range from a tumor to bone to blood vessels. In recent years, the vigorous development of deep learning has also brought new ideas and breakthroughs to medical image processing. Although deep learning methods have been widely used for this task, their success depends largely on high-quality labels; however, medical annotation is costly and difficult to obtain in practice. Furthermore, physicians often label only the organs they are interested in and specialize in. Therefore, existing research mainly focuses on single-organ segmentation models, while multi-organ segmentation models remain under-explored. A multi-organ segmentation model can reduce the computation and storage overhead and can also exploit prior knowledge between organs. However, to create a multi-organ dataset, multiple organs in a single image need to be annotated, which requires the cooperation of multiple experts in different fields and greatly increases the difficulty and cost of annotation; this causes a lack of multi-organ annotated datasets and limits the realization of multi-organ segmentation models.
The success of deep learning based semantic segmentation of medical images also depends on the availability of medical image data. However, institutions are reluctant to share image data for various reasons, most notably privacy protection. Strict health information privacy regulations require that protected health information be removed before data is shared outside the originating institution, a process that is both expensive and time-consuming for medical image data. Institutions that release health information, either intentionally or unintentionally, may suffer serious consequences, including large fines and criminal prosecution. Furthermore, in many cases where imaging data is publicly accessible (e.g., cancer imaging data), the corresponding labels are rarely available, owing to the lack of expert annotation of the target regions and the lack of information on the annotation equipment and criteria. In practice, there are many reasons why such data is unavailable, including data storage and transmission, a limited number of cases, and the difficulty of data cleaning and labeling. Therefore, it is very difficult to train an effective medical model in such a situation. The problems of missing multi-organ annotations and data privacy place new requirements on the modeling method.
Aiming at the defects in the prior art, a new technical scheme needs to be provided.
Disclosure of Invention
Aiming at the defects in the prior art, the invention aims to provide a multi-organ segmentation method and system based on multi-source integrated distillation.
According to the invention, the multi-organ segmentation method based on multi-source integrated distillation comprises the following steps:
step S1: inputting a picture, and predicting with the teacher models and the student model to obtain multiple segmentation maps;
step S2: performing output conversion on the teacher segmentation maps, and expanding them to multi-class outputs;
step S3: converting the outputs into organ and background regions according to the region-based masks;
step S4: performing supervised learning separately on the different organ regions and the background region, and aligning the converted prediction results of the teacher models and the student model to obtain a student multi-organ segmentation model;
step S5: inputting the picture to be predicted, and obtaining an organ segmentation prediction result through the student multi-organ segmentation model.
Preferably, in step S1, a teacher single-organ neural network segmentation model composed of convolutional layers and fully-connected layers is used to predict the input picture, and the formula is as follows:
p_{s_m}^{j,k} = δ_k(f_{s_m}(x)^j)  (1)
where x is the input picture, the superscript j indicates the j-th dimension, f_{s_m} is the m-th teacher neural network model, δ_k represents the softmax prediction output in the k-th dimension, and p_{s_m}^{j,k} is the prediction probability of the k-th dimension obtained through the m-th teacher neural network model;
the input picture is also predicted by a student multi-organ neural network segmentation model composed of convolutional layers and fully-connected layers, and the formula is as follows:
p_t^{j,k} = δ_k(f_t(x)^j)  (2)
where f_t is the student neural network model and p_t^{j,k} is the prediction probability of the k-th dimension obtained through the student neural network model.
Preferably, the step S2 includes:
for the organ regions, keeping the background prediction p_{s_m}^{j,0} of the teacher segmentation model unchanged, moving the prediction p_{s_m}^{j,1} of the organ numbered m to the m-th dimension, and then setting the values of all other dimensions to 0, the formula is as follows:
p̃_{s_m}^{j,k} = p_{s_m}^{j,0} for k = 0;  p_{s_m}^{j,1} for k = m;  0 for all other k  (3)
wherein p̃_{s_m}^{j,k} is the output of the teacher model for the organ numbered m after the output conversion mapping;
for the background region, averaging the predicted values of all M teacher single-organ segmentation models, the formula is as follows:
p̃^{j,0} = (1/M) Σ_{m=1}^{M} p_{s_m}^{j,0}  (4)
wherein p̃^{j,0} is the output of the teacher models for the background after the output conversion mapping.
Preferably, the step S3 includes:
a prediction mask for organ m is obtained, the formula is as follows:
Figure BDA0003553361570000034
and converting the prediction of the teacher single-organ segmentation model on the organ region, wherein the formula is as follows:
Figure BDA0003553361570000035
transforming the predictions of the student multi-organ segmentation model for the organ regions, the formula is as follows:
Figure BDA0003553361570000036
obtaining the mask (Figure BDA0003553361570000037) of the m-th teacher single-organ segmentation model for the background, the formula is as follows:
Figure BDA0003553361570000038
aggregating the masks of all M teacher single organ segmentation models for the background, the formula is as follows:
Figure BDA0003553361570000039
converting the prediction of the teacher single organ segmentation model for the background area, wherein the formula is as follows:
Figure BDA00035533615700000310
converting the prediction of the student multi-organ segmentation model for the background area, wherein the formula is as follows:
Figure BDA00035533615700000311
preferably, the step S4 sets different weights for different organ areas and background areas, and aligns the converted prediction results of the teacher and student models, as follows:
Figure BDA00035533615700000312
where λ is a hyper-parameter used to balance the weights of the organ region and the background region.
Preferably, in step S5, the picture to be predicted is input into the student multi-organ segmentation model for prediction, and the formula is as follows:
p = δ(f_t(x))  (13)
where p is the resulting multi-organ segmentation result.
The invention also provides a multi-organ segmentation system based on multi-source integrated distillation, which comprises the following modules:
module M1: inputting a picture, and predicting with the teacher models and the student model to obtain multiple segmentation maps;
module M2: performing output conversion on the teacher segmentation maps, and expanding them to multi-class outputs;
module M3: converting the outputs into organ and background regions according to the region-based masks;
module M4: performing supervised learning separately on the different organ regions and the background region, and aligning the converted prediction results of the teacher models and the student model to obtain a student multi-organ segmentation model;
module M5: inputting the picture to be predicted, and obtaining an organ segmentation prediction result through the student multi-organ segmentation model.
Preferably, the module M1 predicts the input picture by using a teacher single-organ neural network segmentation model composed of convolutional layers and fully-connected layers, and the formula is as follows:
p_{s_m}^{j,k} = δ_k(f_{s_m}(x)^j)  (1)
where x is the input picture, the superscript j indicates the j-th dimension, f_{s_m} is the m-th teacher neural network model, δ_k represents the softmax prediction output in the k-th dimension, and p_{s_m}^{j,k} is the prediction probability of the k-th dimension obtained through the m-th teacher neural network model;
the input picture is also predicted by a student multi-organ neural network segmentation model composed of convolutional layers and fully-connected layers, and the formula is as follows:
p_t^{j,k} = δ_k(f_t(x)^j)  (2)
where f_t is the student neural network model and p_t^{j,k} is the prediction probability of the k-th dimension obtained through the student neural network model;
the module M2 includes:
for the organ regions, keeping the background prediction p_{s_m}^{j,0} of the teacher segmentation model unchanged, moving the prediction p_{s_m}^{j,1} of the organ numbered m to the m-th dimension, and then setting the values of all other dimensions to 0, the formula is as follows:
p̃_{s_m}^{j,k} = p_{s_m}^{j,0} for k = 0;  p_{s_m}^{j,1} for k = m;  0 for all other k  (3)
wherein p̃_{s_m}^{j,k} is the output of the teacher model for the organ numbered m after the output conversion mapping;
for the background region, averaging the predicted values of all M teacher single-organ segmentation models, the formula is as follows:
p̃^{j,0} = (1/M) Σ_{m=1}^{M} p_{s_m}^{j,0}  (4)
wherein p̃^{j,0} is the output of the teacher models for the background after the output conversion mapping.
Preferably, said module M3 comprises:
a prediction mask for organ m is obtained, the formula is as follows:
Figure BDA00035533615700000412
and converting the prediction of the teacher single-organ segmentation model on the organ region, wherein the formula is as follows:
Figure BDA00035533615700000413
transforming the predictions of the student multi-organ segmentation model for the organ regions, the formula is as follows:
Figure BDA0003553361570000051
obtaining the mask (Figure BDA0003553361570000052) of the m-th teacher single-organ segmentation model for the background, the formula is as follows:
Figure BDA0003553361570000053
aggregating the masks of all M teacher single organ segmentation models for the background, the formula is as follows:
Figure BDA0003553361570000054
converting the prediction of the teacher single organ segmentation model on the background area, wherein the formula is as follows:
Figure BDA0003553361570000055
converting the prediction of the student multi-organ segmentation model for the background area, wherein the formula is as follows:
Figure BDA0003553361570000056
preferably, the module M4 sets different weights for different organ and background regions, aligning the transformed prediction results of the teacher and student models, as follows:
Figure BDA0003553361570000057
wherein λ is a hyper-parameter for balancing weights of the organ region and the background region;
the module M5 inputs a prediction picture to be predicted by a student multi-organ segmentation model, and the formula is as follows:
p=δ(ft(x)) (13)
where p is the resulting multi-organ segmentation result.
Compared with the prior art, the invention has the following beneficial effects:
1. the invention provides a privacy-friendly multi-organ segmentation method and system based on multi-source integrated distillation, which can alleviate the problems of missing multi-organ annotations and data privacy to a certain extent. In the absence of the original training data, an efficient multi-organ segmentation model is trained by using several pre-trained single-organ segmentation models for different organs and an unlabeled target-domain dataset, which reduces the annotation requirement on the training data while requiring no original training data;
2. the multi-organ segmentation model trained by the invention aggregates the knowledge of several pre-trained single-organ segmentation models; compared with single-organ segmentation models and other multi-organ segmentation models, it further improves the segmentation accuracy of each organ, while saving a large amount of computation time compared with running the single-organ segmentation models;
3. the multi-organ segmentation model is trained separately for the different organ regions and the background region, and the proportion between the organ and background regions is controlled by a hyper-parameter, so that the model can attach importance to the organs and the background at the same time; the learning of the organ and background regions assists and constrains each other and jointly improves the segmentation performance for the organs;
4. the multi-organ segmentation model is trained with inputs from the target domain, so that the model can adapt to the prediction distribution and environment and achieve better performance.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:
FIG. 1 is a flow chart of a method in an embodiment of the present invention;
fig. 2 is a schematic diagram of a system in an embodiment of the invention.
Detailed Description
The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the invention, but are not intended to limit the invention in any way. It should be noted that variations and modifications can be made by persons skilled in the art without departing from the concept of the invention, all of which fall within the scope of the invention.
The invention aims to provide a multi-organ segmentation method and system based on multi-source integrated distillation, which can train a model capable of accurately segmenting multiple organs in the scenario where only pre-trained single-organ models are available and the original training data are not, thereby helping data modeling in practical medical scenarios.
The invention provides an embodiment, a multi-organ segmentation method based on multi-source integrated distillation, which comprises the following steps:
step S1: and inputting the pictures, and predicting by the teacher model and the student model to obtain a multi-component segmentation graph. The input picture is predicted by the teacher and student models to obtain a multi-component segmentation picture, and the input picture is predicted by a teacher single-organ neural network segmentation model formed by a convolutional layer and a full-link layer, wherein the formula is as follows:
p_{s_m}^{j,k} = δ_k(f_{s_m}(x)^j)  (1)
where x is the input picture, the superscript j indicates the j-th dimension, f_{s_m} is the m-th teacher neural network model, δ_k represents the softmax prediction output in the k-th dimension, and p_{s_m}^{j,k} is the prediction probability of the k-th dimension obtained through the m-th teacher neural network model.
The input picture is also predicted by a student multi-organ neural network segmentation model composed of convolutional layers and fully-connected layers, and the formula is as follows:
p_t^{j,k} = δ_k(f_t(x)^j)  (2)
where f_t is the student neural network model and p_t^{j,k} is the prediction probability of the k-th dimension obtained through the student neural network model.
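As an illustration of this prediction step, the following is a minimal sketch in PyTorch, assuming placeholder networks teacher_models (a list of M frozen single-organ models, each producing two-channel logits: background and its own organ) and student_model (producing (M+1)-channel logits); all names and shapes are illustrative assumptions, not taken from the patent.

```python
import torch
import torch.nn.functional as F

def predict_probabilities(teacher_models, student_model, images):
    """Run the frozen single-organ teachers and the student on a batch of images.

    teacher_models: list of M networks, each mapping (B, C, H, W) images to
                    logits of shape (B, 2, H, W) (channel 0 = background,
                    channel 1 = that teacher's organ).
    student_model:  network mapping the same images to logits of shape (B, M+1, H, W).
    Returns the softmax probabilities of every teacher and of the student,
    corresponding to formulas (1) and (2).
    """
    teacher_probs = []
    with torch.no_grad():  # the teachers are fixed, pre-trained models
        for f_s in teacher_models:
            teacher_probs.append(F.softmax(f_s(images), dim=1))
    student_probs = F.softmax(student_model(images), dim=1)
    return teacher_probs, student_probs
```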
Step S2: and performing output conversion on the teacher picture segmentation picture, and expanding the teacher picture segmentation picture to multi-class output. The method for performing output conversion on the teacher picture segmentation picture and expanding the teacher picture segmentation picture to multi-class output comprises the following steps: preserving background prediction of teacher segmentation model for organ regions
p_{s_m}^{j,0} unchanged, moving the prediction p_{s_m}^{j,1} of the organ numbered m to the m-th dimension, and then setting the values of all other dimensions to 0, the formula is as follows:
p̃_{s_m}^{j,k} = p_{s_m}^{j,0} for k = 0;  p_{s_m}^{j,1} for k = m;  0 for all other k  (3)
wherein p̃_{s_m}^{j,k} is the output of the teacher model for the organ numbered m after the output conversion mapping.
For the background region, the predicted values of all M teacher single-organ segmentation models are averaged, and the formula is as follows:
p̃^{j,0} = (1/M) Σ_{m=1}^{M} p_{s_m}^{j,0}  (4)
wherein p̃^{j,0} is the output of the teacher models for the background after the output conversion mapping.
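The output conversion of step S2 can be sketched as follows; this is a non-authoritative illustration that continues the assumed tensor shapes from the previous sketch: each teacher's two-channel probabilities are embedded into the student's (M+1)-channel space, and the background channel used for distillation is the average over all teachers.

```python
import torch

def expand_teacher_outputs(teacher_probs):
    """Map each teacher's (B, 2, H, W) probabilities into the (B, M+1, H, W) space.

    Channel 0 keeps the teacher's background prediction; the organ prediction of
    teacher m is moved to channel m; all other channels are set to zero
    (formula (3)).  The background prediction used for distillation is the
    average over all M teachers (formula (4)).
    """
    M = len(teacher_probs)
    B, _, H, W = teacher_probs[0].shape
    expanded = []
    for m, p in enumerate(teacher_probs, start=1):
        q = p.new_zeros(B, M + 1, H, W)
        q[:, 0] = p[:, 0]   # keep the background prediction unchanged
        q[:, m] = p[:, 1]   # move the organ-m prediction to the m-th dimension
        expanded.append(q)
    background_avg = torch.stack([p[:, 0] for p in teacher_probs]).mean(dim=0)
    return expanded, background_avg
```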
Step S3: the organ and background regions are transformed on the output according to the region-based mask. The transforming the output into organ and background regions according to the region-based mask includes: a prediction mask for organ m is obtained, the formula is as follows:
Figure BDA0003553361570000077
and converting the prediction of the teacher single-organ segmentation model on the organ region, wherein the formula is as follows:
Figure BDA0003553361570000078
transforming the predictions of the student multi-organ segmentation model for the organ regions, the formula is as follows:
Figure BDA0003553361570000079
obtaining the mask (Figure BDA00035533615700000710) of the m-th teacher single-organ segmentation model for the background, the formula is as follows:
Figure BDA00035533615700000711
aggregating the masks of all M teacher single organ segmentation models for the background, the formula is as follows:
Figure BDA00035533615700000712
converting the prediction of the teacher single organ segmentation model for the background area, wherein the formula is as follows:
Figure BDA00035533615700000713
converting the prediction of the student multi-organ segmentation model for the background area, wherein the formula is as follows:
Figure BDA00035533615700000714
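The region-based masking of step S3 can be illustrated with the sketch below. The exact mask formulas (5)-(11) appear only as images in this text, so the sketch assumes a common choice: a pixel belongs to the region of organ m where teacher m assigns the organ a higher probability than the background, and to the background region where no teacher claims it; this is an assumption for illustration, not the patent's literal definition.

```python
import torch

def region_split(expanded_teachers, background_avg, student_probs):
    """Split teacher and student predictions into per-organ regions and a background region.

    expanded_teachers: list of M tensors of shape (B, M+1, H, W) from the output conversion.
    background_avg:    tensor of shape (B, H, W), the averaged teacher background prediction.
    student_probs:     tensor of shape (B, M+1, H, W).
    """
    organ_pairs, bg_masks = [], []
    for m, q in enumerate(expanded_teachers, start=1):
        mask = (q[:, m] > q[:, 0]).float()                       # assumed region of organ m, (B, H, W)
        organ_pairs.append((q * mask.unsqueeze(1),               # teacher prediction on the organ region
                            student_probs * mask.unsqueeze(1)))  # student prediction on the same region
        bg_masks.append(1.0 - mask)
    bg_mask = torch.stack(bg_masks).prod(dim=0)                  # pixels claimed by no teacher
    teacher_bg = background_avg * bg_mask
    student_bg = student_probs[:, 0] * bg_mask
    return organ_pairs, (teacher_bg, student_bg)
```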
step S4: and respectively performing supervised learning on different organ areas and background areas, so that converted prediction results of the teacher model and the student models are aligned, and an efficient student multi-organ segmentation model is obtained. Respectively performing supervised learning on different organ areas and background areas to ensure that the converted prediction results of a teacher model and a student model are aligned, and obtaining an efficient student multi-organ segmentation model comprises the following steps: setting different weights for different organ areas and background areas, and aligning the converted prediction results of the teacher model and the student model, wherein the formula is as follows:
Figure BDA0003553361570000081
where λ is a hyper-parameter used to balance the weights of the organ region and the background region.
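Step S4 can then be written as a region-weighted alignment loss. Formula (12) is reproduced only as an image here, so the sketch below assumes a simple mean-squared alignment between the converted teacher and student predictions, with the hyper-parameter λ (lam) weighting the background term as described; other alignment losses (e.g., a KL divergence) could be substituted.

```python
import torch.nn.functional as F

def distillation_loss(organ_pairs, background_pair, lam=0.1):
    """Region-wise alignment between converted teacher and student predictions (cf. formula (12)).

    organ_pairs:     list of (teacher_pred, student_pred) tensors restricted to each organ region.
    background_pair: (teacher_bg, student_bg) tensors restricted to the background region.
    lam:             hyper-parameter balancing the organ and background regions.
    """
    organ_term = sum(F.mse_loss(student_p, teacher_p.detach())
                     for teacher_p, student_p in organ_pairs)
    teacher_bg, student_bg = background_pair
    bg_term = F.mse_loss(student_bg, teacher_bg.detach())
    return organ_term + lam * bg_term
```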
Step S5: and inputting the prediction picture to obtain an organ segmentation prediction result through a student multi-organ segmentation model. Inputting a prediction picture through a student multi-organ segmentation model, and obtaining an organ segmentation prediction result comprises the following steps: inputting a prediction picture to be predicted through a student multi-organ segmentation model, wherein the formula is as follows:
p = δ(f_t(x))  (13)
where p is the resulting multi-organ segmentation result.
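At inference time only the student model is needed; a minimal sketch with the same assumed shapes as above is:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def predict_multi_organ(student_model, image):
    """Prediction with the trained student: p = softmax(f_t(x)), followed by a per-pixel argmax."""
    probs = F.softmax(student_model(image), dim=1)  # (B, M+1, H, W)
    return probs.argmax(dim=1)                      # per-pixel organ labels (0 = background)
```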
This embodiment of the invention trains the multi-organ model by aggregating the single-organ knowledge of multiple teacher models in a region-based learning manner over organs and background, and achieves good results on multi-organ prediction.
Based on the same concept as the above embodiments, another embodiment of the present invention provides a privacy-friendly multi-organ segmentation system based on multi-source integrated distillation. Fig. 2 is a schematic diagram of the system of the present embodiment. The system comprises the following modules:
An organ segmentation model module: inputting a picture and obtaining multiple segmentation maps through the teacher models and the student model.
A single-organ segmentation output conversion module: performing output conversion on the teacher segmentation maps, and expanding them to multi-class outputs.
The mask definition and conversion module: the output is transformed into organ and background regions according to the region-based mask.
A supervision alignment module: and respectively performing supervised learning on different organ areas and background areas, so that converted prediction results of the teacher model and the student models are aligned, and an efficient student multi-organ segmentation model is obtained.
A prediction module: and inputting the prediction picture to obtain an organ segmentation prediction result through a student multi-organ segmentation model.
The invention also provides a multi-organ segmentation system based on multi-source integrated distillation, which comprises the following modules:
module M1: inputting pictures, and predicting by a teacher model and a student model to obtain a multi-component segmentation picture; the input picture is predicted by utilizing a teacher single-organ neural network segmentation model formed by the convolutional layer and the full-connection layer, and the formula is as follows:
p_{s_m}^{j,k} = δ_k(f_{s_m}(x)^j)  (1)
where x is the input picture, the superscript j indicates the j-th dimension, f_{s_m} is the m-th teacher neural network model, δ_k represents the softmax prediction output in the k-th dimension, and p_{s_m}^{j,k} is the prediction probability of the k-th dimension obtained through the m-th teacher neural network model.
The input picture is also predicted by a student multi-organ neural network segmentation model composed of convolutional layers and fully-connected layers, and the formula is as follows:
p_t^{j,k} = δ_k(f_t(x)^j)  (2)
where f_t is the student neural network model and p_t^{j,k} is the prediction probability of the k-th dimension obtained through the student neural network model.
Module M2: performing output conversion on the teacher segmentation maps, and expanding them to multi-class outputs; for the organ regions, keeping the background prediction p_{s_m}^{j,0} of the teacher segmentation model unchanged, moving the prediction p_{s_m}^{j,1} of the organ numbered m to the m-th dimension, and then setting the values of all other dimensions to 0, the formula is as follows:
p̃_{s_m}^{j,k} = p_{s_m}^{j,0} for k = 0;  p_{s_m}^{j,1} for k = m;  0 for all other k  (3)
wherein p̃_{s_m}^{j,k} is the output of the teacher model for the organ numbered m after the output conversion mapping.
For the background region, the predicted values of all M teacher single-organ segmentation models are averaged, and the formula is as follows:
p̃^{j,0} = (1/M) Σ_{m=1}^{M} p_{s_m}^{j,0}  (4)
wherein p̃^{j,0} is the output of the teacher models for the background after the output conversion mapping.
Module M3: performing organ and background region conversion on the output according to the region-based mask; a prediction mask for organ m is obtained, the formula is as follows:
Figure BDA0003553361570000097
and converting the prediction of the teacher single-organ segmentation model on the organ region, wherein the formula is as follows:
Figure BDA0003553361570000098
transforming the predictions of the student multi-organ segmentation model for the organ regions, the formula is as follows:
Figure BDA0003553361570000099
obtaining the mask (Figure BDA00035533615700000910) of the m-th teacher single-organ segmentation model for the background, the formula is as follows:
Figure BDA00035533615700000911
aggregating the masks of all M teacher single organ segmentation models for the background, the formula is as follows:
Figure BDA00035533615700000912
converting the prediction of the teacher single organ segmentation model for the background area, wherein the formula is as follows:
Figure BDA00035533615700000913
converting the prediction of the student multi-organ segmentation model for the background area, wherein the formula is as follows:
Figure BDA00035533615700000914
module M4: respectively performing supervised learning on different organ areas and background areas, and aligning converted prediction results of a teacher model and a student model to obtain a student multi-organ segmentation model; setting different weights for different organ areas and background areas, and aligning the converted prediction results of the teacher model and the student model, wherein the formula is as follows:
Figure BDA00035533615700000915
where λ is a hyper-parameter used to balance the weights of the organ region and the background region.
Module M5: inputting a prediction picture and obtaining an organ segmentation prediction result through a student multi-organ segmentation model; inputting a prediction picture to be predicted through a student multi-organ segmentation model, wherein the formula is as follows:
p = δ(f_t(x))  (13)
where p is the resulting multi-organ segmentation result.
In summary, the above embodiments aggregate the knowledge of multiple pre-trained single-organ teacher segmentation models through output conversion and region-based supervised alignment, and thereby train a student multi-organ segmentation model, so that the problems of missing multi-organ annotations and data privacy are alleviated, the segmentation accuracy of each organ is improved, and a large amount of computation time is saved compared with running single-organ segmentation models.
The invention can perform multi-organ segmentation by utilizing publicly available pre-trained single-organ segmentation models and an unlabeled multi-organ dataset, thereby achieving a good organ segmentation effect.
It should be noted that, the steps in the method provided by the present invention may be implemented by using corresponding modules, devices, units, and the like in the system, and those skilled in the art may implement the composition of the system by referring to the technical solution of the method, that is, the embodiment in the method may be understood as a preferred example for constructing the system, and will not be described herein again.
The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes or modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention. The above-described preferred features may be used in any combination without conflict with each other.
Those skilled in the art will appreciate that, in addition to implementing the system and its various devices, modules, units provided by the present invention as pure computer readable program code, the system and its various devices, modules, units provided by the present invention can be fully implemented by logically programming method steps in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Therefore, the system and various devices, modules and units thereof provided by the invention can be regarded as a hardware component, and the devices, modules and units included in the system for realizing various functions can also be regarded as structures in the hardware component; means, modules, units for performing the various functions may also be regarded as structures within both software modules and hardware components for performing the method.
The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes or modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention. The embodiments and features of the embodiments of the present application may be combined with each other arbitrarily without conflict.

Claims (10)

1. A multi-organ segmentation method based on multi-source integrated distillation, the method comprising the steps of:
step S1: inputting a picture, and predicting with the teacher models and the student model to obtain multiple segmentation maps;
step S2: performing output conversion on the teacher segmentation maps, and expanding them to multi-class outputs;
step S3: converting the outputs into organ and background regions according to the region-based masks;
step S4: performing supervised learning separately on the different organ regions and the background region, and aligning the converted prediction results of the teacher models and the student model to obtain a student multi-organ segmentation model;
step S5: inputting the picture to be predicted, and obtaining an organ segmentation prediction result through the student multi-organ segmentation model.
2. The multi-organ segmentation method based on multi-source integrated distillation of claim 1, wherein the step S1 is implemented by using a teacher single-organ neural network segmentation model composed of convolutional layers and fully-connected layers to predict the input picture, and the formula is as follows:
p_{s_m}^{j,k} = δ_k(f_{s_m}(x)^j)  (1)
where x is the input picture, the superscript j indicates the j-th dimension, f_{s_m} is the m-th teacher neural network model, δ_k represents the softmax prediction output in the k-th dimension, and p_{s_m}^{j,k} is the prediction probability of the k-th dimension obtained through the m-th teacher neural network model;
the input picture is also predicted by a student multi-organ neural network segmentation model composed of convolutional layers and fully-connected layers, and the formula is as follows:
p_t^{j,k} = δ_k(f_t(x)^j)  (2)
where f_t is the student neural network model and p_t^{j,k} is the prediction probability of the k-th dimension obtained through the student neural network model.
3. The multi-organ segmentation method based on multi-source integrated distillation of claim 1, wherein the step S2 includes:
for the organ regions, keeping the background prediction p_{s_m}^{j,0} of the teacher segmentation model unchanged, moving the prediction p_{s_m}^{j,1} of the organ numbered m to the m-th dimension, and then setting the values of all other dimensions to 0, the formula is as follows:
p̃_{s_m}^{j,k} = p_{s_m}^{j,0} for k = 0;  p_{s_m}^{j,1} for k = m;  0 for all other k  (3)
wherein p̃_{s_m}^{j,k} is the output of the teacher model for the organ numbered m after the output conversion mapping;
for the background region, averaging the predicted values of all M teacher single-organ segmentation models, the formula is as follows:
p̃^{j,0} = (1/M) Σ_{m=1}^{M} p_{s_m}^{j,0}  (4)
wherein p̃^{j,0} is the output of the teacher models for the background after the output conversion mapping.
4. The multi-organ segmentation method based on multi-source integrated distillation of claim 1, wherein the step S3 includes:
a prediction mask for organ m is obtained, the formula is as follows:
Figure FDA0003553361560000023
and converting the prediction of the teacher single-organ segmentation model on the organ region, wherein the formula is as follows:
Figure FDA0003553361560000024
transforming the predictions of the student multi-organ segmentation model for the organ regions, the formula is as follows:
Figure FDA0003553361560000025
obtaining the mask (Figure FDA0003553361560000026) of the m-th teacher single-organ segmentation model for the background, the formula is as follows:
Figure FDA0003553361560000027
aggregating the masks of all M teacher single organ segmentation models for the background, the formula is as follows:
Figure FDA0003553361560000028
converting the prediction of the teacher single organ segmentation model for the background area, wherein the formula is as follows:
Figure FDA0003553361560000029
converting the prediction of the student multi-organ segmentation model for the background area, wherein the formula is as follows:
Figure FDA00035533615600000210
5. the multi-organ segmentation method based on multi-source integrated distillation of claim 1, wherein the step S4 sets different weights for different organ areas and background areas, and aligns the converted prediction results of teacher and student models, and the formula is as follows:
Figure FDA00035533615600000211
where λ is a hyper-parameter used to balance the weights of the organ region and the background region.
6. The multi-organ segmentation method based on multi-source integrated distillation of claim 1, wherein in the step S5 the picture to be predicted is input into the student multi-organ segmentation model for prediction, and the formula is as follows:
p = δ(f_t(x))  (13)
where p is the resulting multi-organ segmentation result.
7. A multi-organ segmentation system based on multi-source integrated distillation, characterized in that the system comprises the following modules:
module M1: inputting a picture, and predicting with the teacher models and the student model to obtain multiple segmentation maps;
module M2: performing output conversion on the teacher segmentation maps, and expanding them to multi-class outputs;
module M3: converting the outputs into organ and background regions according to the region-based masks;
module M4: performing supervised learning separately on the different organ regions and the background region, and aligning the converted prediction results of the teacher models and the student model to obtain a student multi-organ segmentation model;
module M5: inputting the picture to be predicted, and obtaining an organ segmentation prediction result through the student multi-organ segmentation model.
8. The multi-organ segmentation system based on multi-source integrated distillation of claim 7, wherein the module M1 predicts the input picture by using a teacher single-organ neural network segmentation model composed of convolutional layers and fully-connected layers, and the formula is as follows:
p_{s_m}^{j,k} = δ_k(f_{s_m}(x)^j)  (1)
where x is the input picture, the superscript j indicates the j-th dimension, f_{s_m} is the m-th teacher neural network model, δ_k represents the softmax prediction output in the k-th dimension, and p_{s_m}^{j,k} is the prediction probability of the k-th dimension obtained through the m-th teacher neural network model;
the input picture is also predicted by a student multi-organ neural network segmentation model composed of convolutional layers and fully-connected layers, and the formula is as follows:
p_t^{j,k} = δ_k(f_t(x)^j)  (2)
where f_t is the student neural network model and p_t^{j,k} is the prediction probability of the k-th dimension obtained through the student neural network model;
the module M2 includes:
for the organ regions, keeping the background prediction p_{s_m}^{j,0} of the teacher segmentation model unchanged, moving the prediction p_{s_m}^{j,1} of the organ numbered m to the m-th dimension, and then setting the values of all other dimensions to 0, the formula is as follows:
p̃_{s_m}^{j,k} = p_{s_m}^{j,0} for k = 0;  p_{s_m}^{j,1} for k = m;  0 for all other k  (3)
wherein p̃_{s_m}^{j,k} is the output of the teacher model for the organ numbered m after the output conversion mapping;
for the background region, averaging the predicted values of all M teacher single-organ segmentation models, the formula is as follows:
p̃^{j,0} = (1/M) Σ_{m=1}^{M} p_{s_m}^{j,0}  (4)
wherein p̃^{j,0} is the output of the teacher models for the background after the output conversion mapping.
9. The multi-organ segmentation system based on multi-source integrated distillation of claim 7, wherein the module M3 comprises:
a prediction mask for organ m is obtained, the formula is as follows:
Figure FDA0003553361560000041
and converting the prediction of the teacher single-organ segmentation model on the organ region, wherein the formula is as follows:
Figure FDA0003553361560000042
transforming the predictions of the student multi-organ segmentation model for the organ regions, the formula is as follows:
Figure FDA0003553361560000043
obtaining the mask (Figure FDA0003553361560000044) of the m-th teacher single-organ segmentation model for the background, the formula is as follows:
Figure FDA0003553361560000045
aggregating the masks of all M teacher single organ segmentation models for the background, the formula is as follows:
Figure FDA0003553361560000046
converting the prediction of the teacher single organ segmentation model for the background area, wherein the formula is as follows:
Figure FDA0003553361560000047
converting the prediction of the student multi-organ segmentation model for the background area, wherein the formula is as follows:
Figure FDA0003553361560000048
10. the multi-organ segmentation system based on multi-source integrated distillation of claim 7 wherein the module M4 sets different weights for different organ and background regions, aligning the transformed prediction results of teacher and student models as follows:
Figure FDA0003553361560000049
wherein λ is a hyper-parameter for balancing weights of organ regions and background regions;
the module M5 inputs the picture to be predicted into the student multi-organ segmentation model for prediction, and the formula is as follows:
p = δ(f_t(x))  (13)
where p is the resulting multi-organ segmentation result.
CN202210268374.6A 2022-03-18 2022-03-18 Multi-organ segmentation method and system based on multi-source integrated distillation Pending CN114612451A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210268374.6A CN114612451A (en) 2022-03-18 2022-03-18 Multi-organ segmentation method and system based on multi-source integrated distillation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210268374.6A CN114612451A (en) 2022-03-18 2022-03-18 Multi-organ segmentation method and system based on multi-source integrated distillation

Publications (1)

Publication Number Publication Date
CN114612451A true CN114612451A (en) 2022-06-10

Family

ID=81865798

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210268374.6A Pending CN114612451A (en) 2022-03-18 2022-03-18 Multi-organ segmentation method and system based on multi-source integrated distillation

Country Status (1)

Country Link
CN (1) CN114612451A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115908823A (en) * 2023-03-09 2023-04-04 南京航空航天大学 Semantic segmentation method based on difficulty distillation
CN115908823B (en) * 2023-03-09 2023-05-12 南京航空航天大学 Semantic segmentation method based on difficulty distillation

Similar Documents

Publication Publication Date Title
Valanarasu et al. Medical transformer: Gated axial-attention for medical image segmentation
CN112651973B (en) Semantic segmentation method based on cascade of feature pyramid attention and mixed attention
Matsubara et al. Distilled split deep neural networks for edge-assisted real-time systems
CN113792113A (en) Visual language model obtaining and task processing method, device, equipment and medium
CN113705769A (en) Neural network training method and device
WO2022001805A1 (en) Neural network distillation method and device
CN111782838A (en) Image question-answering method, image question-answering device, computer equipment and medium
Li et al. Attention, suggestion and annotation: a deep active learning framework for biomedical image segmentation
CN112634296A (en) RGB-D image semantic segmentation method and terminal for guiding edge information distillation through door mechanism
Yang et al. NDNet: Narrow while deep network for real-time semantic segmentation
CN117218498B (en) Multi-modal large language model training method and system based on multi-modal encoder
CN113706544A (en) Medical image segmentation method based on complete attention convolution neural network
CN114612451A (en) Multi-organ segmentation method and system based on multi-source integrated distillation
CN113515948A (en) Language model training method, device, equipment and storage medium
CN109410189A (en) The similarity calculating method of image partition method and image, device
Yi et al. Elanet: effective lightweight attention-guided network for real-time semantic segmentation
CN116524307A (en) Self-supervision pre-training method based on diffusion model
CN113850012B (en) Data processing model generation method, device, medium and electronic equipment
Yan et al. RoboSeg: Real-time semantic segmentation on computationally constrained robots
CN114359656A (en) Melanoma image identification method based on self-supervision contrast learning and storage device
CN111652349A (en) Neural network processing method and related equipment
Hao et al. Radiographs and texts fusion learning based deep networks for skeletal bone age assessment
Hashim et al. An Optimized Image Annotation Method Utilizing Integrating Neural Networks Model and Slantlet Transformation
Pham et al. Seunet-trans: A simple yet effective unet-transformer model for medical image segmentation
CN116095183A (en) Data compression method and related equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination