WO2021179205A1

WO2021179205A1 - Medical image segmentation method, medical image segmentation apparatus and terminal device

Info

Publication number: WO2021179205A1
Application number: PCT/CN2020/078800
Authority: WO
Inventors: 王书强; 陈卓; 申妍燕; 张炽堂; 吴国宝
Original assignee: 深圳先进技术研究院
Priority date: 2020-03-11
Filing date: 2020-03-11
Publication date: 2021-09-16

Abstract

Provided is a medical image segmentation method. The method comprises: acquiring a medical image to be detected; inputting said medical image into a trained segmentation model, wherein the segmentation model comprises an encoder and a decoder, the encoder comprises a plurality of first hierarchical structures, the plurality of first hierarchical structures comprise a first input layer, at least one first intermediate layer and a first output layer, the decoder comprises a second input layer, at least one second intermediate layer and a second output layer, and the input of any second intermediate layer comprises a fusion result of the output of the previous layer of the second intermediate layer and the output of at least two first hierarchical structures; and processing said medical image by means of the segmentation model so as to obtain an output result of the segmentation model, wherein the output result comprises a segmentation result regarding a medical feature region in said medical image. By means of the method, the accuracy of image segmentation of a medical image can be improved.

Description

Medical image segmentation method, medical image segmentation device and terminal equipment

Technical field

This application relates to the field of image segmentation technology, and in particular to medical image segmentation methods, medical image segmentation devices, terminal equipment, and computer-readable storage media.

Background technique

Medical image segmentation is a key step in medical image processing and analysis. In recent years, information technology and high-end medical imaging technology represented by artificial intelligence have continued to develop, and the application of deep learning in the field of medical image segmentation has also received more and more attention.

However, when segmenting medical images, traditional segmentation models are often difficult to make good use of some contextual information in the medical image, and it is difficult to well capture the dependency relationship between pixels in medical feature areas such as diseased areas. As a result, the effective feature information obtained by the segmentation model is not sufficient, which affects the accuracy of image segmentation for medical images.

technical problem

The embodiments of the present application provide a medical image segmentation method, a medical image segmentation device, a terminal device, and a computer-readable storage medium, which can improve the accuracy of image segmentation of a medical image.

Technical solutions

The first aspect of the embodiments of the present application provides a medical image segmentation method, including:

Obtain medical images to be tested;

The medical image to be detected is input into a trained segmentation model, wherein the segmentation model includes an encoder and a decoder, the encoder includes a plurality of first hierarchical structures, and the plurality of first hierarchical structures includes a first An input layer, at least one first intermediate layer, and a first output layer, the decoder includes a second input layer, at least one second intermediate layer, and a second output layer, wherein the input of any second intermediate layer includes the first 2. The fusion result of the output of the previous layer of the middle layer and the output of at least two first-level structures;

The medical image to be detected is processed by the segmentation model to obtain an output result of the segmentation model, wherein the output result includes a segmentation result of the medical feature region in the medical image to be detected.

A second aspect of the embodiments of the present application provides a medical image segmentation device. The medical image segmentation device may include a module for implementing the steps of the medical image segmentation method described above.

A third aspect of the embodiments of the present application provides a computer-readable storage medium, the computer-readable storage medium stores computer-readable instructions, and when the computer-readable instructions are executed by a processor, the processor executes the foregoing Steps of medical image segmentation method.

A fourth aspect of the embodiments of the present application provides a computer device, which includes a memory and a processor. The memory stores computer-readable instructions. The processor implements the above-mentioned medical image segmentation when the computer-readable instructions are executed. Method steps.

Description of the drawings

FIG. 1 is a schematic flowchart of a medical image segmentation method provided by an embodiment of the present application;

Fig. 2 is an exemplary structure of the segmentation model provided by an embodiment of the present application;

FIG. 3 is a schematic flowchart of step S103 according to an embodiment of the present application;

FIG. 4 is an exemplary schematic diagram of performing second processing on the first feature matrix by the weight obtaining module according to an embodiment of the present application;

FIG. 5 is an exemplary schematic diagram of the segmentation model and the discrimination model provided by an embodiment of the present application;

Fig. 6 is a schematic structural diagram of a medical image segmentation device provided by an embodiment of the present application;

FIG. 7 is a schematic structural diagram of a terminal device provided by an embodiment of the present application.

Embodiments of the present invention

In the following description, for the purpose of illustration rather than limitation, specific details such as a specific system structure and technology are proposed for a thorough understanding of the embodiments of the present application. However, it should be clear to those skilled in the art that the present application can also be implemented in other embodiments without these specific details. In other cases, detailed descriptions of well-known systems, devices, circuits, and methods are omitted to avoid unnecessary details from obstructing the description of this application.

Reference to "one embodiment" or "some embodiments" described in the specification of this application means that one or more embodiments of this application include a specific feature, structure, or characteristic described in combination with the embodiment. Therefore, the sentences "in one embodiment", "in some embodiments", "in some other embodiments", "in some other embodiments", etc. appearing in different places in this specification are not necessarily All refer to the same embodiment, but mean "one or more but not all embodiments" unless it is specifically emphasized otherwise. The terms "including", "including", "having" and their variations all mean "including but not limited to", unless otherwise specifically emphasized.

The medical image segmentation method provided by the embodiments of this application can be applied to servers, desktop computers, mobile phones, tablet computers, wearable devices, vehicle-mounted devices, augmented reality (AR)/virtual reality (VR) devices, and notebooks. For terminal devices such as computers, ultra-mobile personal computers (UMPC), netbooks, and personal digital assistants (PDAs), the embodiments of this application do not impose any restrictions on the specific types of terminal devices.

Specifically, FIG. 1 shows a flowchart of a medical image segmentation method provided by an embodiment of the present application, and the medical image segmentation method may be applied to terminal equipment. The medical image segmentation method may include:

Step S101: Obtain a medical image to be detected.

In the embodiment of the present application, the type and acquisition method of the medical image to be detected are not limited here. Exemplarily, the medical image to be detected may include one or more of endoscopic images, angiography images, computed tomography images, positron emission tomography images, nuclear magnetic resonance images, and ultrasound images. The medical image to be detected often includes a medical characteristic area, where the medical characteristic area may be, for example, a lesion area, a specific tissue or organ area, and so on.

Step S102: Input the medical image to be detected into a trained segmentation model, where the segmentation model includes an encoder and a decoder, the encoder includes a plurality of first hierarchical structures, and the plurality of first hierarchical structures It includes a first input layer, at least one first intermediate layer, and a first output layer. The decoder includes a second input layer, at least one second intermediate layer, and a second output layer. The input of any second intermediate layer includes The fusion result of the output of the previous layer of the second intermediate layer and the output of the at least two first hierarchical structures.

In the embodiment of the present application, the trained segmentation model may be used to perform image segmentation on the medical image to be detected, so as to obtain information such as the contour of the medical feature region in the medical image to be detected. The trained segmentation model may include an encoder and a decoder, wherein the specific structure of the encoder and the decoder may be determined based on an existing or future machine learning model.

The structure of the encoder and the decoder may be symmetrical. In this case, the number of first-level structures in the encoder is the same as the number of second-level structures included in the decoder. The number of the first hierarchical structure can be determined according to actual requirements. In an example, the first hierarchical structure may have 5 layers. In this case, the encoder may include 3 first intermediate layers. It should be noted that any one of the first hierarchical structures may include one or more sub-layers. For example, any hierarchical structure in the encoder may include a convolutional layer and a down-sampling layer. Correspondingly, decoding The second hierarchical structure corresponding to the first hierarchical structure in the device may include an up-sampling layer and a convolutional layer. At this time, the output of any first hierarchical structure in the encoder may be the output of the down-sampling layer in the first hierarchical structure.

In some embodiments, the trained segmentation model may be improved based on an existing model such as U-Net.

Among them, the existing U-Net model is designed based on a jump-connected full convolutional network, which includes an encoder and decoder with a symmetric structure. At this time, the encoder and decoder of the existing U-Net model There is a one-to-one corresponding intermediate layer. The output of the intermediate layer of the encoder can be transferred to the corresponding intermediate layer in the decoder, and after the transfer, it is spliced and fused with the output of the previous layer of the corresponding intermediate layer in the decoder , And use the result of the splicing and fusion as the input of the corresponding middle layer in the decoder.

However, in the prior art, based on the symmetry of the encoder and the decoder, only the transfer of features between the corresponding intermediate layers in the U-Net model is considered.

In the embodiment of the present application, the input of any second intermediate layer in the decoder may include the fusion result of the output of the previous layer of the second intermediate layer and the output of at least two first-level structures. For example, the input of any second intermediate layer in the decoder may include the output of the previous layer of the second intermediate layer, the output of the first hierarchical structure corresponding to the second intermediate layer, and the output of the first layer corresponding to the second intermediate layer. The output fusion result of at least one adjacent layer (such as the previous layer and/or the next layer) of the first hierarchical structure corresponding to the second intermediate layer. At this time, the decoder can obtain the extracted features of different scales of multiple first-level structures, thereby fusing the multi-scale features to make full use of the context information of the pixels in the medical image to make the medical image segmentation.

There may be multiple specific fusion methods. For example, the output of the previous layer of the second intermediate layer may be spliced with the output of at least two first-level structures to obtain the fusion result.

In some embodiments, the fusion result of the output of the previous layer of the second intermediate layer and the output of the at least two first hierarchical structures is the output of the previous layer of the second intermediate layer and the output of the second intermediate layer. A fusion result of the output of the first hierarchical structure corresponding to the intermediate layer and the output of the previous layer of the first hierarchical structure corresponding to the second intermediate layer.

In the embodiment of the present application, at this time, the second intermediate layer can obtain and make full use of feature information of different depths through multiple jump connections from the encoder to the decoder, thereby improving the efficiency of feature expression and improving the segmentation model The segmentation performance.

The following specific example illustrates an example structure of the segmentation model in an embodiment of the present application.

As shown in Figure 2, it is an exemplary structure of the segmentation model, where the encoder of the segmentation network may have a first-level structure of 5 layers, namely A, B, C, D, and E. The structure of the decoder is symmetrical to the encoder, that is, the A, B, C, D, and E layers of the encoder correspond to the A', B', C', D', and E'layers in the decoder, then The A'layer in the decoder can obtain the fusion result of the output of the A layer of the encoder and the output of the B'layer of the decoder, and the B'layer in the decoder can obtain the output of the A layer and the B layer of the encoder. The output is fused with the output of the decoder C'layer. The C'layer in the decoder can obtain the output of the B layer of the encoder, the output of the C layer and the output of the decoder D'layer, and so on . It should be noted that FIG. 2 is only an exemplary structure of the segmentation model, and is not a limitation.

In step S103, the medical image to be detected is processed by the segmentation model to obtain an output result of the segmentation model, where the output result includes a segmentation result of the medical feature region in the medical image to be detected.

In the embodiment of the present application, the segmentation model may output a segmentation result about the medical feature region in the medical image to be detected, wherein, specifically, the segmentation result includes contour information of the medical feature region.

In some embodiments, the segmentation model includes a weight acquisition module, and the weight acquisition module is located between the encoder and the decoder;

The processing the medical image to be detected by the segmentation model to obtain the output result of the segmentation model includes:

Step S301: Perform first processing on the medical image to be detected by the encoder to obtain a first feature matrix output by the encoder;

Step S302, input the first feature matrix to the weight acquisition module;

Step S303: Perform a second process on the first feature matrix by the weight acquisition module to obtain the weight matrix output by the weight acquisition module;

Step S304, fusing the weight matrix and the first feature matrix to obtain a second feature matrix;

Step S305: Based on the second feature matrix, perform third processing by the decoder to obtain the output result.

In the embodiment of the present application, in order to further improve the segmentation performance of the segmentation model, the weight acquisition module may be set between the encoder and the decoder, thereby using the attention mechanism to improve the segmentation model's ability to represent the segmented region .

Wherein, the weight obtaining module may perform a second processing on the first feature matrix through a preset correlation function, so as to obtain the weight matrix output by the weight obtaining module. There may be multiple specific setting methods for the correlation function. For example, the correlation function may be obtained by combining convolution operation and activation function, or may be obtained by combining multiplication, addition, and other specific functions.

In some embodiments, each element in the weight matrix may respectively represent the weight value of the corresponding element in the corresponding first feature matrix. The weight matrix and the first feature matrix can be fused in multiple ways. For example, the weight matrix can be added to the first feature matrix, or the corresponding elements in the weight matrix can be combined with the first feature matrix. The corresponding elements in the first feature matrix are multiplied; in addition, the fusion may also include matrix dimension transformation, etc., for example, the weight matrix, the first feature matrix, or both may be added after The obtained matrix is dimensionally transformed to obtain the second feature matrix.

In some embodiments, the performing the second processing on the first feature matrix by the weight obtaining module to obtain the weight matrix output by the weight obtaining module includes:

Performing first convolution processing on the first feature matrix to obtain a third feature matrix;

Performing a second convolution process on the first feature matrix to obtain a fourth feature matrix;

Multiply the third feature matrix and the fourth feature matrix to obtain a fifth feature matrix;

Activate the fifth feature matrix through an activation function to obtain the weight matrix.

Wherein, performing the first convolution processing on the first feature matrix may be performing a convolution operation on the first convolution matrix and the first feature matrix, wherein the first convolution matrix may have a dimension of 1*1 matrix; the performing the second convolution processing on the first feature matrix may be performing a convolution operation on the second convolution matrix and the first feature matrix, wherein the second convolution matrix It can be a matrix with a dimension of 1*1. Exemplarily, the activation function may be a Softmax activation function or the like.

In an example, the above-mentioned second processing performed on the first feature matrix by the weight obtaining module may be represented by an association function. Wherein, exemplarily, through the weight acquisition module, _{the element at each position (x i} , x _j ) in the first feature matrix can be calculated through the correlation function f(x _i , x _j ) to obtain the corresponding The weight value, the correlation function f(x _i , x _j ) can be expressed as:

Among them, α(x _i ) is the first convolution process performed by the first embedding layer in the weight obtaining module, and β(x _j ) is the second convolution process performed by the second embedding layer in the weight obtaining module.

In some embodiments, the third processing performed by the decoder based on the second feature matrix to obtain the output result includes:

Fusing the second feature matrix with the output of the previous layer of the first output layer to obtain a sixth feature matrix;

Input the sixth feature matrix to the decoder;

Based on the sixth feature matrix, the decoder performs third processing to obtain the output result.

As shown in FIG. 4, it is an exemplary schematic diagram of performing the second processing on the first feature matrix by the weight obtaining module.

Wherein, the first convolution processing may be the performing the first convolution processing on the first feature matrix, which may be convolving a first convolution matrix with a dimension of 1*1 with the first feature matrix Operation; said performing the second convolution processing on the first feature matrix may be performing a convolution operation on the second convolution matrix with a dimension of 1*1 and the first feature matrix. The activation function may be a Softmax activation function. In addition, it is also possible to perform a matrix transformation on the first feature matrix and the like through a reshape operation, so as to adjust the dimensions of the corresponding matrix and realize the fusion between the matrices.

In the embodiment of the present application, after the second feature matrix is obtained, in order to enable the first input layer of the decoder to obtain features of more scales, the second feature matrix may be further combined with the first output The output of the previous layer of the layer is fused to obtain the sixth feature matrix, and then the sixth feature matrix is input to the decoder. At this time, the second input layer of the decoder can obtain the encoder The feature information extracted from different depths can be fused, so that the second input layer of the encoder can better utilize some context information in the medical image for processing, thereby improving The segmentation performance of the segmentation model.

In some embodiments, before inputting the medical image to be detected into the trained segmentation model, the method further includes:

The segmentation model to be trained is trained through the discriminant model until the training is completed, and the trained segmentation model is obtained, wherein the input of the discriminant model includes at least part of the output of the segmentation model to be trained, and the discriminant model includes volume A product neural network and an up-sampling layer, the output of the convolutional neural network is the input of the up-sampling layer.

In the prior art, a major challenge faced by medical image segmentation is the acquisition of a large amount of high-quality annotation data. Sufficient annotation data is an important factor for the reliability of a deep learning model. However, medical image annotation is highly dependent on professional physicians, cost is high, and patient privacy issues are involved. In addition, at present, medical image quality and standards have not yet been fully homogenized in China. Different types and quality of image data will not only affect the model’s performance Accuracy and universality limit the size of effectively annotated medical image data sets, thereby increasing the difficulty of training various deep learning models for medical image segmentation.

However, in the embodiment of the present application, by combining the discriminant model and the segmentation model, training is based on the form of generating a confrontation network, so that a small amount of labeled medical image data and a large amount of unlabeled medical image data can be used for training, thereby reducing The dependence on the large number of finely labeled medical image data is reduced, and the training performance is improved.

Specifically, the discriminant model may include a convolutional neural network and an up-sampling layer, where the up-sampling layer may be used to output a confidence map, and the confidence map may be used to indicate the difference between each predicted segmentation result. The location area where the similarity of the real medical feature region corresponding to the real segmentation label meets the preset similarity condition. By adding the up-sampling layer to the discriminant model, the difficulty of learning the spatial confidence of the discriminant model is increased, and the discriminative performance of the discriminant model can be made stronger, and through adversarial learning, the segmentation performance of the segmentation model can be further improved.

It should be noted that the discriminant model may include other structures besides the convolutional neural network model and the upsampling layer, for example, the input or output of the convolutional neural network, or The input or output of the up-sampling layer is subjected to other processing, such as image enhancement processing, image binarization processing, and so on. In addition, the input of the discriminant model may include at least a part of the output of the segmentation model to be trained. In addition, it may also include real segmentation labels annotated with medical image samples to train the discriminant performance of the discriminant model.

Optionally, in some embodiments, training the segmentation model to be trained by the discrimination model until the training is completed and obtaining the trained segmentation model includes:

Obtain medical image samples, where the medical image samples include labeled medical image samples and unlabeled medical image samples, the labeled medical image samples are medical image samples labeled with real segmentation labels, and the unlabeled medical image samples are unlabeled medical image samples. Annotated medical image samples;

Input the labeled medical image sample and the unlabeled medical image sample into the segmentation model to be trained, and obtain the predicted segmentation result of each medical image sample by the segmentation model to be trained;

Input the predicted segmentation result into the discriminant model to obtain the discriminant result and confidence map of the discriminant model, wherein the confidence map is output by the upsampling layer in the discriminant model, and the discriminant result is determined by the discriminant model. The output of the convolutional neural network in the model;

Calculating a loss value regarding the segmentation model to be trained and the discrimination model based on a preset loss function according to the discrimination result, the confidence map, and the predicted segmentation result;

Training the discriminant model and the segmentation model to be trained based on the loss value, until the obtained loss value meets the preset loss condition, the training is completed, and the trained segmentation model is obtained.

In the embodiment of the present application, in some examples, the medical image sample may be obtained by normalizing the corresponding original medical image. Of course, in some examples, the medical image sample may also be a medical image that has not been normalized.

Wherein, the confidence map may be used to indicate the location regions where the similarity of the real medical feature region corresponding to the real segmentation label in each of the predicted segmentation results meets the preset similarity condition.

In the embodiment of the present application, there may be multiple types and specific calculation methods for the segmentation model to be trained and the loss value of the discrimination model. Exemplarily, the loss value may include a cross-entropy loss related to the segmentation network, a supervision loss related to annotated medical image samples, a semi-supervised loss related to an unlabeled medical image sample, and/or a discrimination loss of the discriminant model, etc. One or more of etc.

Wherein, the semi-supervised loss regarding the unlabeled medical image sample may be determined based on the confidence map. Specifically, after the confidence map is obtained, the semi-supervised loss regarding the unlabeled medical image sample can be determined according to the confidence map. Among them, in some examples, the confidence map may be further processed, for example, the confidence map may be binarized or other encoding processing to highlight the credible region in the confidence map, so as to calculate the Semi-supervised loss of unlabeled medical image samples.

When training the discriminant model and the segmentation model to be trained based on the loss value and the confidence map, the discriminant model and the to-be-trained segmentation model can be adjusted by means of backpropagation to update the gradient. Divide the parameters of the model until the loss value obtained meets the preset loss condition. Exemplary preset loss conditions may be that the loss value is less than the preset loss threshold, and conditions such as convergence.

In the prior art, in traditional semi-supervised training, the training loss of unlabeled samples cannot be evaluated well, which leads to poor training effects on segmentation models using a large amount of unlabeled medical image data combined with a small amount of labeled medical image data. .

In the embodiment of the present application, by combining the confidence map, the correlation between the labeled medical image sample and the unlabeled medical image sample can be effectively used to obtain the relationship between the labeled medical image sample and the unlabeled medical image sample. To compare and evaluate the second predicted segmentation sub-result corresponding to the unlabeled medical image sample and the first predicted segmentation sub-result corresponding to the labeled medical image sample, thereby effectively improving the effectiveness of semi-supervised training.

In some embodiments, the calculating the loss value of the segmentation model to be trained and the discrimination model based on a preset loss function according to the discrimination result, the confidence map, and the predicted segmentation result includes:

According to the predicted segmentation result, the first predicted segmentation sub-result corresponding to the annotated medical image sample and the real segmentation label of the annotated medical image sample, calculating a first loss value for the segmentation model;

Calculating a second loss value of the segmentation model according to the second predicted segmentation sub-result corresponding to the unlabeled medical image sample in the predicted segmentation result and the confidence map;

According to the predicted segmentation result, calculating a third loss value with respect to the discriminant model;

The loss value is calculated according to the first loss value, the second loss value, and the third loss value.

In the embodiment of the present application, the first loss value may be calculated according to the first predicted segmentation sub-result corresponding to the labeled medical image sample and the real segmentation label of the labeled medical image sample, and the first loss value The specific calculation method and the type of loss function included can be determined based on actual experience. For example, the first loss value may include a cross-entropy loss related to the segmentation network and/or a supervision loss related to annotated medical image samples, and so on. The third loss value may refer to the discrimination loss of the discrimination model.

The second loss value may be calculated according to the second predicted segmentation sub-result corresponding to the unlabeled medical image sample and the confidence map. In this case, the second loss value may also be referred to as a semi-supervised loss.

In the embodiment of the present application, according to the first loss value, the second loss value, and the third loss value, there may be multiple specific ways to calculate the loss value. For example, the first loss value, the first loss value and the third loss value may be calculated. The second loss value and the third loss value. Alternatively, the weight values corresponding to the first loss value, the second loss value, and the third loss value may be preset, and the weight values corresponding to the first loss value and the second loss value may be set in advance. The weight value corresponding to the third loss value and the third loss value and the first loss value, the second loss value, and the third loss value are calculated to obtain the loss value.

In some embodiments, the calculating the second loss value of the segmentation model according to the second predicted segmentation sub-result corresponding to the unlabeled medical image sample in the predicted segmentation result and the confidence map includes:

Performing an encoding operation on each position in the confidence map to obtain an encoded image corresponding to the confidence map, where the encoded image includes the encoding value of each position in the confidence map;

Calculate the second loss value of the segmentation model according to the second predicted segmentation sub-result corresponding to the encoded image and the unlabeled medical image sample.

In the embodiment of the present application, the encoding operation may be used to encode each position in the confidence map, and there may be multiple specific encoding methods, and the encoding may be used to encode each position in the confidence map. Carry out category labeling. Exemplarily, when the category of each position in the confidence map includes two categories, the encoding operation may be a binarization operation; of course, the encoding operation may also include other encoding methods, for example, the encoding The calculation can be based on one-hot encoding (One-Hot Encoding) and other methods for encoding and so on.

In some embodiments, the encoded image may be used to annotate the credible region in the second predicted segmentation sub-result, wherein the credible region can be based on the authentic medical features in the second predicted segmentation sub-result. The similarity of the region is determined by the location region that meets the preset similarity condition. At this time, in some cases, the relevant loss corresponding to the unlabeled medical image sample may be determined according to the credible region in the encoded image, that is, the second loss value.

The following uses a specific example to illustrate an exemplary specific calculation method of the loss value in the embodiment of the present application.

In some examples, the labeled medical image sample is {I ^f , L ^f }, where L ^f is a true segmentation label of the labeled medical image sample. The unlabeled medical image sample is {I ⁰ }.

The labeled medical image sample and the unlabeled medical image sample are input into the segmentation model to be trained, and the predicted segmentation result of each medical image sample by the segmentation model to be trained is obtained, wherein the first segment corresponding to the labeled medical image sample is obtained. The result of a predicted segmentation is

The second predicted segmentation sub-result corresponding to the unlabeled medical image sample is S(L ^f ).

At this time, according to the labeled medical image sample as {I ^f , L ^f } and the first predicted segmentation sub-result

Calculate the first loss value with respect to the segmentation model.

The first loss value may include the cross entropy loss of the segmentation network

And about the supervision loss of labeling medical image samples

Wherein, the cross entropy loss

Said supervision loss

In addition, the confidence map of the second predicted segmentation sub-result can be obtained through the up-sampling layer in the discriminant model

At this time, according to the second predicted segmentation sub-result corresponding to the unlabeled medical image sample in the predicted segmentation result and the confidence map, the second loss value of the segmentation model is calculated

Wherein, I(·) is an indicator function, T _semi is a preset confidence threshold, and Y({I ^f ,L ⁰ }) is used to indicate the coding category of the coded image corresponding to the confidence map, and The encoding category indicates whether the encoded image corresponds to an unlabeled medical image sample, wherein, if the encoding category indicates that the encoded image corresponds to an unlabeled medical image sample, then the Y({I ^f ,L ⁰ })=1.

Wherein, the preset signal threshold T _semi can be set according to actual experience or test results. By setting the preset signal threshold T _semi , the sensitivity of model training can be controlled.

In addition, it is also possible to calculate the discriminant loss on the discriminant model, that is, the third loss value

Among them, when λ=0, it corresponds to the predicted segmentation result output by the segmentation model to be trained; when λ=1, it corresponds to an annotated medical image sample with a real segmentation label.

After obtaining the first loss value, the second loss value, and the third loss value, the loss value may be calculated according to the first loss value, the second loss value, and the third loss value

Wherein, the λ _adv may be the supervision loss

Corresponding weight coefficient, the λ _semi may be the second loss value

The corresponding weight coefficient.

At this time, by adjusting the λ _adv and λ _semi , the training results of the segmentation model and the discriminant model can be weighed and adjusted, for example, excessive correction can be avoided, and the effect such as cross entropy loss can be avoided. Of course, in some cases, the cross entropy loss

There may also be a weight coefficient corresponding to the third loss value.

In some cases, after the discriminant model and the segmentation model to be trained are trained based on the loss value, until the obtained loss value meets the preset loss condition, it can also pass the medical image test for testing Samples and medical image verification samples for verification test and verify the segmentation model, so as to select the optimal segmentation model as the trained segmentation from the segmentation models whose loss values meet the preset loss conditions. Model.

As shown in FIG. 5, it is an exemplary schematic diagram of the segmentation model and the discrimination model.

Wherein, the segmentation model and the discriminant model can use labeled medical image samples and unlabeled medical image samples to implement semi-supervised training.

In the embodiment of the present application, the medical image to be detected can be obtained, and the medical image to be detected can be input into a trained segmentation model, where the segmentation model includes an encoder and a decoder, and the encoder includes a plurality of first Hierarchical structure, the plurality of first hierarchical structures include a first input layer, at least one first intermediate layer, and a first output layer, and the decoder includes a second input layer, at least one second intermediate layer, and a second output layer , Wherein the input of any second intermediate layer includes the fusion result of the output of the previous layer of the second intermediate layer and the output of at least two first-level structures; at this time, it can pass through at least each of the decoders The middle layer acquires features of different scales extracted by multiple layers in the decoder, so as to make full use of the context information of the pixels in the medical image to perform medical image segmentation; to obtain the medical features in the medical image to be detected The segmentation result of the region. Through the embodiments of this application, the extracted multi-scale features can be effectively used when medical images are processed through the segmentation model to achieve multi-information fusion, thereby improving the generalization performance of the segmentation model and improving the image segmentation of medical images. Accuracy.

It should be understood that the size of the sequence number of each step in the foregoing embodiment does not mean the order of execution, and the execution sequence of each process should be determined by its function and internal logic, and should not constitute any limitation to the implementation process of the embodiment of the present application.

Corresponding to the medical image segmentation method described in the above embodiment, FIG. 6 shows a structural block diagram of a medical image segmentation device provided by an embodiment of the present application. part.

Referring to FIG. 6, the medical image segmentation device 6 includes:

The first obtaining module 601 is used to obtain medical images to be detected;

The input module 602 is configured to input the medical image to be detected into a trained segmentation model, where the segmentation model includes an encoder and a decoder, the encoder includes a plurality of first-level structures, and the plurality of first-level structures The hierarchical structure includes a first input layer, at least one first intermediate layer, and a first output layer. The decoder includes a second input layer, at least one second intermediate layer, and a second output layer, wherein any second intermediate layer The input of includes the fusion result of the output of the previous layer of the second intermediate layer and the output of at least two first-level structures;

The processing module 603 is configured to process the medical image to be detected by the segmentation model to obtain the output result of the segmentation model, wherein the output result includes information about the medical feature region in the medical image to be detected Segmentation result.

Optionally, the segmentation model includes a weight acquisition module, and the weight acquisition module is located between the encoder and the decoder;

The processing module 603 specifically includes:

A first processing unit, configured to perform first processing on the medical image to be detected by the encoder to obtain a first feature matrix output by the encoder;

A first input unit, configured to input the first feature matrix into the weight obtaining module;

A second processing unit, configured to perform second processing on the first feature matrix by the weight acquisition module to obtain the weight matrix output by the weight acquisition module;

A first fusion unit, configured to fuse the weight matrix and the first feature matrix to obtain a second feature matrix;

The third processing unit is configured to perform third processing by the decoder based on the second feature matrix to obtain the output result.

Optionally, the second processing unit specifically includes:

The first processing subunit is configured to perform first convolution processing on the first feature matrix to obtain a third feature matrix;

A second processing subunit, configured to perform a second convolution process on the first feature matrix to obtain a fourth feature matrix;

The third processing subunit is configured to multiply the third feature matrix and the fourth feature matrix to obtain a fifth feature matrix;

The fourth processing subunit is configured to activate the fifth feature matrix through an activation function to obtain the weight matrix.

Optionally, the third processing unit specifically includes:

The first fusion subunit is used to fuse the second feature matrix with the output of the previous layer of the first output layer to obtain a sixth feature matrix;

The first input subunit is used to input the sixth feature matrix to the decoder;

The fifth processing subunit is configured to perform third processing by the decoder based on the sixth feature matrix to obtain the output result.

Optionally, the medical image segmentation device 6 further includes:

The training module is used to train the segmentation model to be trained through the discriminant model until the training is completed, and obtain the trained segmentation model, wherein the input of the discriminant model includes at least part of the output of the segmentation model to be trained, so The discriminant model includes a convolutional neural network and an upsampling layer, and the output of the convolutional neural network is the input of the upsampling layer.

Optionally, the training module specifically includes:

An acquiring unit for acquiring medical image samples, wherein the medical image samples include labeled medical image samples and unlabeled medical image samples, the labeled medical image samples are medical image samples labeled with real segmentation tags, and the unlabeled medical image samples The medical image sample is a medical image sample that has not been labeled;

The fourth processing unit is configured to input the labeled medical image sample and the unlabeled medical image sample into the segmentation model to be trained, and obtain the predicted segmentation result of each medical image sample by the segmentation model to be trained;

The fifth processing unit is configured to input the predicted segmentation result into the discriminant model to obtain the discriminant result and confidence map of the discriminant model, wherein the confidence map is output by the upsampling layer in the discriminant model, so The discrimination result is output by the convolutional neural network in the discrimination model;

A calculation unit, configured to calculate a loss value regarding the segmentation model to be trained and the discrimination model based on a preset loss function according to the discrimination result, the confidence map, and the predicted segmentation result;

The training unit is configured to train the discriminant model and the segmentation model to be trained based on the loss value, until the obtained loss value meets the preset loss condition, then the training is completed and the trained segmentation model is obtained.

Optionally, the calculation unit specifically includes:

The first calculation subunit is configured to calculate the information about the segmentation model according to the first predicted segmentation sub-result corresponding to the labeled medical image sample and the real segmentation label of the labeled medical image sample in the predicted segmentation result. First loss value

A second calculation subunit, configured to calculate a second loss value of the segmentation model according to the second predicted segmentation sub-result corresponding to the unlabeled medical image sample in the predicted segmentation result and the confidence map;

The third calculation subunit is configured to calculate the third loss value of the discriminant model according to the predicted segmentation result;

The fourth calculation subunit is configured to calculate the loss value according to the first loss value, the second loss value, and the third loss value.

Optionally, the second calculation subunit is specifically configured to:

Optionally, the fusion result of the output of the previous layer of the second intermediate layer and the output of the at least two first hierarchical structures is the output of the previous layer of the second intermediate layer and the output of the second intermediate layer. The fusion result of the output of the corresponding first hierarchical structure and the output of the previous layer of the first hierarchical structure corresponding to the second intermediate layer.

It should be noted that the information interaction and execution process between the above-mentioned devices/units are based on the same concept as the method embodiment of this application, and its specific functions and technical effects can be found in the method embodiment section. I won't repeat it here.

Those skilled in the art can clearly understand that, for the convenience and conciseness of description, only the division of the above functional units and modules is used as an example. In practical applications, the above functions can be allocated to different functional units and modules as needed. Module completion, that is, the internal structure of the device is divided into different functional units or modules to complete all or part of the functions described above. The functional units and modules in the embodiments can be integrated into one processing unit, or each unit can exist alone physically, or two or more units can be integrated into one unit. The above-mentioned integrated units can be hardware-based Formal realization can also be realized in the form of a software functional unit. In addition, the specific names of the functional units and modules are only for the convenience of distinguishing each other, and are not used to limit the protection scope of the present application. For the specific working process of the units and modules in the foregoing system, reference may be made to the corresponding process in the foregoing method embodiment, which will not be repeated here.

FIG. 7 is a schematic structural diagram of a terminal device provided by an embodiment of this application. As shown in FIG. 7, the terminal device 7 of this embodiment includes: at least one processor 70 (only one is shown in FIG. 7), a memory 71, and is stored in the above-mentioned memory 71 and can run on the above-mentioned at least one processor 70 When the processor 70 executes the computer program 72, the steps in any of the medical image segmentation method embodiments described above are implemented.

The aforementioned terminal device 7 may be a computing device such as a server, a mobile phone, a wearable device, an augmented reality (AR)/virtual reality (VR) device, a desktop computer, a notebook, a desktop computer, and a palmtop computer. The terminal device may include, but is not limited to, a processor 70 and a memory 71. Those skilled in the art can understand that FIG. 7 is only an example of the terminal device 7 and does not constitute a limitation on the terminal device 7. It may include more or less components than shown in the figure, or a combination of certain components, or different components. , For example, can also include input devices, output devices, network access devices, and so on. The above-mentioned input device may include a keyboard, a touch panel, a fingerprint collection sensor (used to collect user fingerprint information and fingerprint orientation information), a microphone, a camera, etc., and an output device may include a display, a speaker, and the like.

The processor 70 may be a central processing unit (Central Processing Unit, CPU), and the processor 70 may also be other general-purpose processors, digital signal processors (Digital Signal Processors, DSPs), and application specific integrated circuits (Application Specific Integrated Circuits). , ASIC), Field-Programmable Gate Array (FPGA) or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components, etc. The general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.

The foregoing memory 71 may be an internal storage unit of the foregoing terminal device 7 in some embodiments, such as a hard disk or memory of the terminal device 7. The above-mentioned memory 71 may also be an external storage device of the above-mentioned terminal device 7 in other embodiments, for example, a plug-in hard disk equipped on the above-mentioned terminal device 7, a smart memory card (Smart Media Card, SMC), and a secure digital (Secure Digital). ,SD) card, flash card (Flash Card), etc. Further, the aforementioned memory 71 may also include both an internal storage unit of the aforementioned terminal device 7 and an external storage device. The above-mentioned memory 71 is used to store an operating system, an application program, a boot loader (Boot Loader), data, and other programs, for example, the program code of the above-mentioned computer program. The aforementioned memory 71 can also be used to temporarily store data that has been output or will be output.

The embodiments of the present application also provide a computer-readable storage medium, and the above-mentioned computer-readable storage medium stores a computer program, and when the above-mentioned computer program is executed by a processor, the steps in each of the above-mentioned method embodiments can be realized.

The embodiments of the present application provide a computer program product. When the computer program product runs on a terminal device, the terminal device can realize the steps in the foregoing method embodiments when the terminal device is executed.

If the above integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer readable storage medium. Based on this understanding, this application implements all or part of the processes in the above-mentioned embodiments and methods, which can be completed by instructing relevant hardware through a computer program. The above-mentioned computer program can be stored in a computer-readable storage medium. When executed by the processor, the steps of the foregoing method embodiments can be implemented. Wherein, the above-mentioned computer program includes computer program code, and the above-mentioned computer program code may be in the form of source code, object code, executable file, or some intermediate forms. The above-mentioned computer-readable medium may at least include: any entity or device capable of carrying computer program code to the camera device/terminal device, recording medium, computer memory, read-only memory (ROM, Read-Only Memory), random access memory ( RAM, Random Access Memory), electric carrier signal, telecommunications signal and software distribution medium. For example, U disk, mobile hard disk, floppy disk or CD-ROM, etc. In some jurisdictions, according to legislation and patent practices, computer-readable media cannot be electrical carrier signals and telecommunication signals.

In the above-mentioned embodiments, the description of each embodiment has its own focus. For parts that are not described in detail or recorded in an embodiment, reference may be made to related descriptions of other embodiments.

A person of ordinary skill in the art may realize that the units and algorithm steps of the examples described in combination with the embodiments disclosed herein can be implemented by electronic hardware or a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the specific application and design constraint conditions of the technical solution. Professionals and technicians can use different methods for each specific application to implement the described functions, but such implementation should not be considered beyond the scope of this application.

In the embodiments provided in this application, it should be understood that the disclosed apparatus/network equipment and method may be implemented in other ways. For example, the device/network device embodiments described above are merely illustrative, and the division of the modules or units mentioned above is only a logical function division. In actual implementation, there may be other division methods, for example, multiple units or components may be It can be combined or integrated into another system, or some features can be ignored or not implemented. In addition, the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.

The units described above as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.

The above-mentioned embodiments are only used to illustrate the technical solutions of the present application, rather than to limit them; although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they can still compare the foregoing embodiments. The recorded technical solutions are modified, or some of the technical features are equivalently replaced; these modifications or replacements do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the application, and should be included in this Within the scope of protection applied for.

Claims

A medical image segmentation method, which is characterized in that it includes:

Obtain medical images to be tested;

The medical image to be detected is input into a trained segmentation model, wherein the segmentation model includes an encoder and a decoder, the encoder includes a plurality of first hierarchical structures, and the plurality of first hierarchical structures includes a first An input layer, at least one first intermediate layer, and a first output layer, the decoder includes a second input layer, at least one second intermediate layer, and a second output layer, wherein the input of any second intermediate layer includes the first 2. The fusion result of the output of the previous layer of the middle layer and the output of at least two first-level structures;

The medical image to be detected is processed by the segmentation model to obtain an output result of the segmentation model, wherein the output result includes a segmentation result of the medical feature region in the medical image to be detected.
The medical image segmentation method according to claim 1, wherein the segmentation model comprises a weight acquisition module, and the weight acquisition module is located between the encoder and the decoder;

The processing the medical image to be detected by the segmentation model to obtain the output result of the segmentation model includes:

Performing first processing on the medical image to be detected by the encoder to obtain a first feature matrix output by the encoder;

Inputting the first feature matrix into the weight obtaining module;

Performing a second process on the first feature matrix by the weight obtaining module to obtain the weight matrix output by the weight obtaining module;

Fusing the weight matrix and the first feature matrix to obtain a second feature matrix;

Based on the second feature matrix, third processing is performed by the decoder to obtain the output result.
3. The medical image segmentation method according to claim 2, wherein the second processing of the first feature matrix by the weight acquisition module to obtain the weight matrix output by the weight acquisition module comprises:

Performing first convolution processing on the first feature matrix to obtain a third feature matrix;

Performing a second convolution process on the first feature matrix to obtain a fourth feature matrix;

Multiply the third feature matrix and the fourth feature matrix to obtain a fifth feature matrix;

Activate the fifth feature matrix through an activation function to obtain the weight matrix.
The medical image segmentation method according to claim 2, wherein the third processing is performed by the decoder based on the second feature matrix to obtain the output result, comprising:

Fusing the second feature matrix with the output of the previous layer of the first output layer to obtain a sixth feature matrix;

Input the sixth feature matrix to the decoder;

Based on the sixth feature matrix, the decoder performs third processing to obtain the output result.
The medical image segmentation method according to claim 1, wherein before inputting the medical image to be detected into a trained segmentation model, the method further comprises:

The segmentation model to be trained is trained through the discriminant model until the training is completed, and the trained segmentation model is obtained, wherein the input of the discriminant model includes at least part of the output of the segmentation model to be trained, and the discriminant model includes volume A product neural network and an up-sampling layer, the output of the convolutional neural network is the input of the up-sampling layer.
The medical image segmentation method according to claim 5, wherein the training the segmentation model to be trained by the discriminant model until the training is completed and obtaining the trained segmentation model comprises:

Obtain medical image samples, where the medical image samples include labeled medical image samples and unlabeled medical image samples, the labeled medical image samples are medical image samples labeled with real segmentation labels, and the unlabeled medical image samples are unlabeled medical image samples. Annotated medical image samples;

Input the labeled medical image sample and the unlabeled medical image sample into the segmentation model to be trained, and obtain the predicted segmentation result of each medical image sample by the segmentation model to be trained;

Input the predicted segmentation result into the discriminant model to obtain the discriminant result and confidence map of the discriminant model, wherein the confidence map is output by the upsampling layer in the discriminant model, and the discriminant result is determined by the discriminant model. The output of the convolutional neural network in the model;

Calculating a loss value regarding the segmentation model to be trained and the discrimination model based on a preset loss function according to the discrimination result, the confidence map, and the predicted segmentation result;

Training the discriminant model and the segmentation model to be trained based on the loss value, until the obtained loss value meets the preset loss condition, the training is completed, and the trained segmentation model is obtained.
7. The medical image segmentation method according to claim 6, wherein the segmentation model to be trained and the segmentation model to be trained are calculated based on a preset loss function according to the discrimination result, the confidence map, and the predicted segmentation result. The loss value of the discriminant model includes:

According to the predicted segmentation result, the first predicted segmentation sub-result corresponding to the annotated medical image sample and the real segmentation label of the annotated medical image sample, calculating a first loss value for the segmentation model;

Calculating a second loss value of the segmentation model according to the second predicted segmentation sub-result corresponding to the unlabeled medical image sample in the predicted segmentation result and the confidence map;

According to the predicted segmentation result, calculating a third loss value with respect to the discriminant model;

The loss value is calculated according to the first loss value, the second loss value, and the third loss value.
The medical image segmentation method according to claim 7, wherein the prediction segmentation result is based on the second predicted segmentation sub-result corresponding to the unlabeled medical image sample and the confidence map to calculate the The second loss value of the segmentation model includes:

Performing an encoding operation on each position in the confidence map to obtain an encoded image corresponding to the confidence map, where the encoded image includes the encoding value of each position in the confidence map;

Calculate the second loss value of the segmentation model according to the second predicted segmentation sub-result corresponding to the encoded image and the unlabeled medical image sample.
The medical image segmentation method according to any one of claims 1 to 8, wherein the fusion result of the output of the previous layer of the second intermediate layer and the output of at least two first hierarchical structures is the first The fusion result of the output of the previous layer of the second intermediate layer, the output of the first hierarchical structure corresponding to the second intermediate layer, and the output of the previous layer of the first hierarchical structure corresponding to the second intermediate layer .
A medical image segmentation device, characterized in that it comprises:

The first acquisition module is used to acquire the medical image to be detected;

The input module is used to input the medical image to be detected into a trained segmentation model, where the segmentation model includes an encoder and a decoder, the encoder includes a plurality of first-level structures, and the plurality of first-level structures The hierarchical structure includes a first input layer, at least one first intermediate layer, and a first output layer. The decoder includes a second input layer, at least one second intermediate layer, and a second output layer. The input includes a fusion result of the output of the previous layer of the second intermediate layer and the output of at least two first-level structures;

The processing module is configured to process the medical image to be detected through the segmentation model to obtain the output result of the segmentation model, wherein the output result includes segmentation of the medical feature region in the medical image to be detected result.
A terminal device includes a memory, a processor, and a computer program stored in the memory and running on the processor, wherein the processor implements the following steps when the processor executes the computer program:

Obtain medical images to be tested;

The medical image to be detected is input into a trained segmentation model, wherein the segmentation model includes an encoder and a decoder, the encoder includes a plurality of first hierarchical structures, and the plurality of first hierarchical structures includes a first An input layer, at least one first intermediate layer, and a first output layer, the decoder includes a second input layer, at least one second intermediate layer, and a second output layer, wherein the input of any second intermediate layer includes the first 2. The fusion result of the output of the previous layer of the middle layer and the output of at least two first-level structures;

The medical image to be detected is processed by the segmentation model to obtain an output result of the segmentation model, wherein the output result includes a segmentation result of the medical feature region in the medical image to be detected.
The terminal device according to claim 11, wherein the segmentation model includes a weight acquisition module, and the weight acquisition module is located between the encoder and the decoder;

When the processor executes the computer program, the processing the medical image to be detected by the segmentation model to obtain the output result of the segmentation model includes:

Performing first processing on the medical image to be detected by the encoder to obtain a first feature matrix output by the encoder;

Inputting the first feature matrix into the weight obtaining module;

Performing a second process on the first feature matrix by the weight obtaining module to obtain the weight matrix output by the weight obtaining module;

Fusing the weight matrix and the first feature matrix to obtain a second feature matrix;

Based on the second feature matrix, third processing is performed by the decoder to obtain the output result.
The terminal device according to claim 12, wherein when the processor executes the computer program, the weight acquisition module performs a second process on the first feature matrix to obtain the weight acquisition The weight matrix output by the module includes:

Performing first convolution processing on the first feature matrix to obtain a third feature matrix;

Performing a second convolution process on the first feature matrix to obtain a fourth feature matrix;

Multiply the third feature matrix and the fourth feature matrix to obtain a fifth feature matrix;

Activate the fifth feature matrix through an activation function to obtain the weight matrix.
The terminal device according to claim 12, wherein when the processor executes the computer program, the third processing is performed by the decoder based on the second feature matrix to obtain the output result ,include:

Fusing the second feature matrix with the output of the previous layer of the first output layer to obtain a sixth feature matrix;

Input the sixth feature matrix to the decoder;

Based on the sixth feature matrix, the decoder performs third processing to obtain the output result.
The terminal device according to claim 11, wherein when the processor executes the computer program, before inputting the medical image to be detected into the trained segmentation model, the method further comprises:

The segmentation model to be trained is trained through the discriminant model until the training is completed, and the trained segmentation model is obtained, wherein the input of the discriminant model includes at least part of the output of the segmentation model to be trained, and the discriminant model includes volume A product neural network and an up-sampling layer, the output of the convolutional neural network is the input of the up-sampling layer.
The terminal device according to claim 15, wherein when the processor executes the computer program, the segmentation model to be trained is trained by the discriminant model until the training is completed, and the trained segmentation model is obtained, include:

Obtain medical image samples, where the medical image samples include labeled medical image samples and unlabeled medical image samples, the labeled medical image samples are medical image samples labeled with real segmentation labels, and the unlabeled medical image samples are unlabeled medical image samples. Annotated medical image samples;

Input the labeled medical image sample and the unlabeled medical image sample into the segmentation model to be trained, and obtain the predicted segmentation result of each medical image sample by the segmentation model to be trained;

Input the predicted segmentation result into the discriminant model to obtain the discriminant result and confidence map of the discriminant model, wherein the confidence map is output by the upsampling layer in the discriminant model, and the discriminant result is determined by the discriminant model. The output of the convolutional neural network in the model;

Calculating a loss value regarding the segmentation model to be trained and the discrimination model based on a preset loss function according to the discrimination result, the confidence map, and the predicted segmentation result;

Training the discriminant model and the segmentation model to be trained based on the loss value, until the obtained loss value meets the preset loss condition, the training is completed, and the trained segmentation model is obtained.
The terminal device according to claim 16, wherein when the processor executes the computer program, the calculation is based on a preset loss function according to the discrimination result, the confidence map, and the predicted segmentation result The loss values of the segmentation model to be trained and the discrimination model include:

According to the predicted segmentation result, the first predicted segmentation sub-result corresponding to the annotated medical image sample and the real segmentation label of the annotated medical image sample, calculating a first loss value for the segmentation model;

Calculating a second loss value of the segmentation model according to the second predicted segmentation sub-result corresponding to the unlabeled medical image sample in the predicted segmentation result and the confidence map;

According to the predicted segmentation result, calculating a third loss value with respect to the discriminant model;

The loss value is calculated according to the first loss value, the second loss value, and the third loss value.
The terminal device of claim 17, wherein when the processor executes the computer program, the second predicted segmentation sub-result corresponding to the unlabeled medical image sample in the predicted segmentation result And the confidence map, calculating the second loss value of the segmentation model, including:

Performing an encoding operation on each position in the confidence map to obtain an encoded image corresponding to the confidence map, where the encoded image includes the encoding value of each position in the confidence map;

Calculate the second loss value of the segmentation model according to the second predicted segmentation sub-result corresponding to the encoded image and the unlabeled medical image sample.
The terminal device according to any one of claims 11 to 18, wherein when the processor executes the computer program, the output of the previous layer of the second intermediate layer and at least two first-level structures The fusion result of the output of the second intermediate layer is the output of the previous layer of the second intermediate layer, the output of the first hierarchical structure corresponding to the second intermediate layer, and the first hierarchical structure corresponding to the second intermediate layer The fusion result of the output of the previous layer.
A computer-readable storage medium storing a computer program, wherein the computer program is executed by a processor to implement the medical image segmentation method according to any one of claims 1 to 9 .