CN115953412A - Training method, segmentation method and device of prostate ultrasonic segmentation model - Google Patents

Training method, segmentation method and device of prostate ultrasonic segmentation model Download PDF

Info

Publication number
CN115953412A
Authority
CN
China
Prior art keywords
module
prostate
training
output end
input end
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310054691.2A
Other languages
Chinese (zh)
Inventor
魏强
郑博文
鲁仁全
陶杰
吕世栋
姚宇千
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Southern Hospital Southern Medical University
Original Assignee
Guangdong University of Technology
Southern Hospital Southern Medical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology, Southern Hospital Southern Medical University filed Critical Guangdong University of Technology
Priority to CN202310054691.2A priority Critical patent/CN115953412A/en
Publication of CN115953412A publication Critical patent/CN115953412A/en
Pending legal-status Critical Current


Classifications

    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Abstract

The invention provides a training method, a segmentation method and a device for a prostate ultrasound segmentation model, and relates to the field of image processing. The training method comprises the following steps: acquiring paired prostate ultrasound images and Mask images of each patient; performing data preprocessing on the paired prostate ultrasound images and Mask images of each patient, dividing out a training set, and performing data enhancement on the paired prostate ultrasound images and Mask images of each patient in the training set; incorporating an attention mechanism into a 3D-UNet to construct a prostate ultrasound segmentation model, and training the prostate ultrasound segmentation model on the training set; extracting prior knowledge of the prostate contour from the training set and constructing a loss function; and adjusting the model parameters of the prostate ultrasound segmentation model according to the loss function. Prostate ultrasound segmentation is performed with a deep learning method based on prostate shape prior knowledge and an attention mechanism; combining the shape prior knowledge with the ultrasound image effectively improves the accuracy of image segmentation.

Description

Training method, segmentation method and device of prostate ultrasonic segmentation model
Technical Field
The invention relates to the technical field of image processing, in particular to a training method, a segmentation method and a device of a prostate ultrasonic segmentation model.
Background
The incidence of prostate cancer is rising year by year, and prostate ultrasound is the most widely used imaging modality in prostate cancer diagnosis and treatment. In prostate fusion puncture and brachytherapy, accurate segmentation of prostate ultrasound images is particularly important for the choice of puncture strategy and radiotherapy dose. However, manual segmentation by a doctor is time-consuming, labor-intensive and poorly consistent. With the development of deep learning, the technology has gradually been applied to the field of prostate ultrasound segmentation. However, owing to the characteristics of prostate ultrasound images, deep learning still faces several problems in prostate ultrasound segmentation:
1. Ultrasound images have a low signal-to-noise ratio, and artifacts can appear during ultrasonic reflection; structural imaging failures can also occur because of signal attenuation and shadowing. These problems make the boundary between the prostate and the surrounding tissue unclear and greatly affect model performance.
2. A deep learning model computes over all of the information in an input image and cannot by itself identify the key information; in image segmentation, the target region may occupy only a small part of the image. A complex task requires a large amount of input information and heavy computation, and computing over the whole image, much of which carries little weight for the task, reduces the computational efficiency of the model.
Disclosure of Invention
In view of these shortcomings, the invention aims to provide a training method, a segmentation method and a device for a prostate ultrasound segmentation model, which perform prostate ultrasound segmentation with a deep learning method based on prostate shape prior knowledge and an attention mechanism; combining the shape prior knowledge with the ultrasound image effectively improves the accuracy of image segmentation.
In order to achieve the purpose, the invention adopts the following technical scheme:
the invention discloses a training method of a prostate ultrasonic segmentation model in a first aspect, which comprises the following steps:
acquiring paired prostate ultrasonic images and Mask images of each patient;
carrying out data preprocessing on the paired prostate ultrasonic images and Mask images of each patient, dividing a training set, and carrying out data enhancement processing on the paired prostate ultrasonic images and Mask images of each patient in the training set;
combining an attention mechanism into the 3D-UNet to construct a prostate ultrasonic segmentation model, and training the prostate ultrasonic segmentation model according to the training set;
extracting priori knowledge of the prostate contour according to the training set and constructing a loss function;
and adjusting the model parameters of the prostate ultrasonic segmentation model according to the loss function to obtain the optimized prostate ultrasonic segmentation model.
As an alternative embodiment, in the first aspect of the present invention, the above-mentioned loss function includes an active shape loss function and a mean square loss function, and the adjusting the model parameters of the ultrasound prostate segmentation model according to the above-mentioned loss function includes the following steps:
combining the active shape loss function and the mean square loss function;
the Adam algorithm is used as an optimizer, and the learning rate of the prostate ultrasonic segmentation model is dynamically adjusted through a cosine annealing restart method, so that the weight is updated through back propagation iteration of a loss function.
As an alternative embodiment, in the first aspect of the present invention, extracting prior knowledge of the prostate contour from the training set and constructing the loss function includes the following steps:
extracting the prostate contour of each patient according to the paired prostate ultrasound image and Mask image of each patient in the training set to obtain three-dimensional point cloud data of the prostate contour of each patient, and correspondingly using each set of three-dimensional point cloud data as the training sample X = [x_1, y_1, z_1, x_2, y_2, z_2, …, x_n, y_n, z_n]^T of that patient, where x_n, y_n, z_n in the training sample represent the three-dimensional coordinates of the nth point; a training sample set J = {X_1, X_2, …, X_j} is generated, where X_j in the training sample set denotes the jth training sample;
calculating the average shape of all training samples according to the three-dimensional point cloud data of each training sample, wherein the calculation formula of the average shape is as follows:
\bar{X} = \frac{1}{j} \sum_{i=1}^{j} X_i    (1)
In formula (1), \bar{X} is the average shape, X_i is the ith training sample, and j is the total number of training samples;
registering all training samples in a training sample set to an average shape through affine transformation, and calculating the offset of each registered sample compared with the average shape, wherein the offset is calculated according to the formula:
dX_i = X_i - \bar{X}    (2)
In formula (2), dX_i is the offset of the ith training sample from the average shape after registration, X_i is the ith training sample, and \bar{X} is the average shape;
and calculating a covariance matrix of the training sample set according to the offset, wherein a calculation formula of the covariance matrix is as follows:
S = \frac{1}{j} \sum_{i=1}^{j} dX_i \, dX_i^{T}    (3)
In formula (3), dX_i^{T} is the transpose of the offset of the ith registered sample compared with the average shape;
singular value decomposition is carried out on the covariance matrix to obtain an eigenvalue and an eigenvector, and the calculation formula is as follows:
S p_i = \lambda_i p_i    (4);
In formula (4), \lambda_i is the ith eigenvalue of the covariance matrix and p_i is the ith eigenvector of the covariance matrix;
selecting the largest first t characteristic values to represent the main shape of the training sample, and obtaining a statistical model of the shape vector of the sample; wherein t satisfies:
\frac{\sum_{i=1}^{t} \lambda_i}{\sum_{i} \lambda_i} \geq Ratio    (5)
In formula (5), Ratio represents the proportion of the total deformation in the original model that the principal shapes can explain;
wherein the expression of the main shape is:
X = \bar{X} + P B    (6)
In formula (6), P is the eigenvector and B is the eigenvalue corresponding to the eigenvector;
the statistical model of the sample shape vector is then:
X = \bar{X} + P_t B_t ,\qquad B_t = P_t^{T} ( X - \bar{X} )    (7)
In formula (7), P_t is the first t eigenvectors, P_t^{T} is the transpose of P_t, and B_t is the eigenvalues (coefficients) of the first t eigenvectors;
constructing a local gray model according to the statistical model to calculate the local characteristics of each characteristic point so as to adjust iteration parameters and obtain an optimal matching model of the target; the covariance matrix of the local gray model of each feature point is as follows:
S_i = \frac{1}{j} \sum_{j} \left( g_{ij} - \bar{g}_i \right) \left( g_{ij} - \bar{g}_i \right)^{T}    (8)
In formula (8), g_{ij} is the ith feature point of the jth training sample and \bar{g}_i is the mean of the local gray model at the ith feature point;
calculating the similarity between the moved characteristic points in the Mahalanobis distance comparison matching process and the new characteristic points obtained after the movement, wherein the similarity between the two characteristic points is larger when the Mahalanobis distance is smaller as an evaluation index of the active shape loss function; the Mahalanobis distance between the two characteristic points is calculated by the following formula:
f(g_s) = \left( g_s - \bar{g}_i \right)^{T} S_i^{-1} \left( g_s - \bar{g}_i \right)    (9)
In formula (9), g_s is the new feature point and S_i^{-1} is the inverse of the covariance matrix of g_i;
in the training process of the prostate ultrasonic segmentation model, an active shape loss function is obtained by calculating errors of the predicted points and corresponding points of a Mask image, wherein the active shape loss function is as follows:
L_{shape} = \sum_{j} \sum_{i} \left\| g_{ij} - \hat{g}_{ij} \right\|^{2}    (10)
In formula (10), g_{ij} represents the ith feature point of the jth sample in the gold standard, and \hat{g}_{ij} represents the ith feature point of the jth sample in the model prediction result;
in the training process of the prostate ultrasonic segmentation model, a mean square loss function is obtained by calculating the variance between the model predicted value and the sample true value, and the mean square loss function is as follows:
L_{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^{2}    (11)
In formula (11), y_i represents the predicted value of the model for the ith sample and \hat{y}_i represents the true value of the ith sample.
As an alternative embodiment, in the first aspect of the present invention, the prostate ultrasound segmentation model with attention mechanism combined includes an initialization module, a first encoding module, a second encoding module, a third encoding module, a fourth encoding module, a fifth encoding module, a first decoding module, a second decoding module, a third decoding module, a fourth decoding module, a fifth decoding module, a first attention mechanism module, a second attention mechanism module, a third attention mechanism module, a fourth attention mechanism module, a fifth attention mechanism module, and a final segmentation module;
the output end of the initialization module is connected to the input end of the first coding module; the output end of the first coding module is connected to the input end of the second coding module; the output end of the second coding module is connected to the input end of the third coding module; the output end of the third coding module is connected to the input end of the fourth coding module; the output end of the fourth coding module is connected to the input end of the fifth coding module; the output end of the fifth coding module and the output end of the fourth coding module are connected to the input end of the first attention mechanism module in a jumping manner; the output end of the first attention mechanism module is connected to the input end of the first decoding module; the output end of the first decoding module and the output end of the third encoding module are connected to the input end of the second attention mechanism module in a jumping mode; the output end of the second attention mechanism module is connected to the input end of the second decoding module; the output end of the second decoding module and the output end of the second encoding module are connected to the input end of the third attention mechanism module in a jumping manner; the output end of the third attention mechanism module is connected to the input end of the third decoding module; the output end of the third decoding module and the output end of the first encoding module are connected to the input end of the fourth attention mechanism module in a jumping mode; an output terminal of the fourth attention mechanism module is connected to an input terminal of the fourth decoding module; the output end of the fourth decoding module and the output end of the initialization module are connected to the input end of a fifth attention mechanism module in a jumping manner, and the output end of the fifth attention mechanism module is connected to the input end of the final segmentation module.
As an alternative embodiment, in the first aspect of the present invention, the first attention mechanism module, the second attention mechanism module, the third attention mechanism module, the fourth attention mechanism module and the fifth attention mechanism module each include a first parallel branch for outputting the channel attention, a second parallel branch for outputting the spatial attention, a third parallel branch for outputting the original input feature, and a summing unit, where output ends of the first parallel branch, the second parallel branch and the third parallel branch are respectively connected to the summing unit, and the summing unit is configured to sum the channel attention, the spatial attention and the original input feature.
As an optional embodiment, in the first aspect of the present invention, the first parallel branch is provided with a first max pooling layer, a first average pooling layer, a perceptron structure, a first vector adding unit and a first Sigmoid activation function unit; the input end of the first parallel branch is connected to the input end of the first max pooling layer and the input end of the first average pooling layer, and the output ends of the first max pooling layer and the first average pooling layer are respectively connected to the input end of the perceptron structure; the output end of the perceptron structure is connected to the input end of the first vector adding unit, the first vector adding unit is used for adding two vectors, the output end of the first vector adding unit is connected to the input end of the first Sigmoid activation function unit, and the output end of the first Sigmoid activation function unit is connected to the output end of the first parallel branch.
As an optional embodiment, in the first aspect of the present invention, the second parallel branch is provided with a second maximum pooling layer, a second average pooling layer, a second vector summing unit, a convolution layer, and a second Sigmoid activation function unit;
the input end of the second parallel branch is connected to the input ends of the second maximum pooling layer and the second average pooling layer, the output ends of the second maximum pooling layer and the second average pooling layer are connected to the input end of the second vector summing unit, the second vector summing unit is used for combining the two vectors on the channel, the output end of the second vector summing unit is connected to the input end of the convolutional layer, the output end of the convolutional layer is connected to the input end of the second Sigmoid activation function unit, and the output end of the second Sigmoid activation function unit is connected to the output end of the second parallel branch.
A second aspect of the invention discloses a prostate ultrasound segmentation method, which comprises the following steps: acquiring a prostate ultrasound image to be segmented; and inputting the prostate ultrasound image into the prostate ultrasound segmentation model obtained by the training method of a prostate ultrasound segmentation model according to any one of the first aspect of the present invention, so as to output a segmentation result.
A third aspect of the present invention discloses an electronic device, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement the steps of any one of the methods of the first and second aspects of the present invention.
A fourth aspect of the invention discloses a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method of any of the first and second aspects of the invention.
The technical scheme provided by the invention can have the following beneficial effects:
according to the invention, the priori knowledge of the prostate form is combined as a loss function, and the loss function is utilized to adjust the model parameters of the prostate ultrasonic cutting model, so that the correction and compensation of the prostate ultrasonic cutting model are realized, the problems of fuzzy boundary of an ultrasonic image, structural imaging failure and the like are solved, and the segmentation precision is improved. In addition, the prostate ultrasonic cutting model also combines an attention mechanism to improve the weight of the prostate area, reduce task complexity and improve the efficiency of the prostate ultrasonic cutting model.
Drawings
FIG. 1 is a schematic diagram of a training flow of one embodiment of the present invention;
FIG. 2 is a schematic structural diagram of an ultrasound prostate segmentation model according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of an attention mechanism module according to an embodiment of the invention.
Detailed Description
The technical scheme of the invention is further explained by the specific implementation mode in combination with the attached drawings.
The invention discloses a training method of a prostate ultrasonic segmentation model in a first aspect, which comprises the following steps:
step S1: pairs of ultrasound prostate images and Mask images are acquired for each patient.
Step S2: and carrying out data preprocessing on the paired prostate ultrasonic images and Mask images of each patient, dividing a training set, and carrying out data enhancement processing on the paired prostate ultrasonic images and Mask images of each patient in the training set.
Step S3: incorporating an attention mechanism into the 3D-UNet to construct a prostate ultrasonic segmentation model, and training the prostate ultrasonic segmentation model according to the training set.
Step S4: extracting prior knowledge of the prostate contour according to the training set and constructing a loss function.
Step S5: and adjusting the model parameters of the prostate ultrasonic segmentation model according to the loss function to obtain the optimized prostate ultrasonic segmentation model. The loss function is an index for evaluating the difference between the predicted value and the true value, and the deep learning model can reversely propagate the updated parameters according to the result of the loss function, so that the loss is continuously reduced.
Wherein, the step S1 includes the following steps:
step S101: ultrasound images of the prostate of each patient are acquired.
Step S102: the prostate area in the acquired ultrasound image of the prostate was manually segmented using the ITK-SNAP software, and 2 imaging physicians with more than five years of experience in ultrasound diagnosis of the prostate outlined along the prostate margin in the ultrasound image to generate a segmented Mask map golden standard. Thereby acquiring a pair of ultrasound images of the prostate and Mask images for each patient.
Wherein, the step S2 includes the following steps:
step S201: and (5) data cleaning. The data cleaning is to examine the paired data of each patient, and check whether the resolution of the prostate ultrasound image data is consistent with that of the Mask image, and whether the Mask image is binarized (the prostate area value is 1, and the background area value is 0). And outputting the patient number of the error data, and performing data collection and labeling again.
Step S202: and (4) resampling the data. The data resampling specifically includes that spatial information such as Spacing, direction, origin and the like in the paired image data and the labeled data are respectively read, the spatial information of the labeled data is resampled to the image data by taking the image data as a reference, and consistency of the spatial information of the paired image data and the labeled data is guaranteed.
Step S203: and (4) data set allocation. The data set allocation specifically includes that paired data are randomly disturbed according to the size of the collected data volume, and the paired data are divided into a training set, a verification set and a test set according to the proportion of 80%, 10% and 10%.
The data enhancement processing is specifically as follows: data enhancement layers are added with TorchIO, where different layers have different enhancement effects and each has randomness. The operations specifically include: randomly flipping the image in the horizontal direction; randomly flipping the image in the vertical direction; randomly rotating the image clockwise between -90 degrees and 90 degrees; and randomly adjusting the contrast and brightness of the image within ±10%. Because the data enhancement is random, the enhancement effect on the same picture differs in every training iteration of the deep learning model, which increases the diversity of the pictures based on the existing data and improves the generalization ability of the model.
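A library-agnostic sketch of these random enhancement operations (using NumPy/SciPy instead of the toolkit named above; the probability of 0.5 for each flip and the exact form of the brightness/contrast adjustment are assumptions):
    import numpy as np
    from scipy.ndimage import rotate

    def augment(image, mask, rng=np.random.default_rng()):
        # image, mask: volumes of shape (D, H, W); the mask stays binary
        if rng.random() < 0.5:                                    # random horizontal flip
            image, mask = np.flip(image, axis=2), np.flip(mask, axis=2)
        if rng.random() < 0.5:                                    # random vertical flip
            image, mask = np.flip(image, axis=1), np.flip(mask, axis=1)
        angle = rng.uniform(-90.0, 90.0)                          # random rotation, -90 to 90 degrees
        image = rotate(image, angle, axes=(1, 2), reshape=False, order=1)
        mask = rotate(mask, angle, axes=(1, 2), reshape=False, order=0)
        contrast = 1.0 + rng.uniform(-0.1, 0.1)                   # contrast adjusted within +/-10%
        brightness = rng.uniform(-0.1, 0.1) * float(image.std())  # brightness shift within +/-10%
        image = image * contrast + brightness
        return image, (mask > 0.5).astype(mask.dtype)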
According to the invention, prior knowledge of the prostate shape is incorporated as part of the loss function, and the model parameters of the prostate ultrasound segmentation model are adjusted through this loss function, so that the model is corrected and compensated; problems such as blurred boundaries in the ultrasound image and structural imaging failure are alleviated, and the segmentation accuracy is improved. In addition, the prostate ultrasound segmentation model also incorporates an attention mechanism to increase the weight of the prostate region, reduce task complexity and improve the efficiency of the model.
As an alternative embodiment, the loss function includes an active shape loss function and a mean square loss function, and the adjusting the model parameters of the ultrasound prostate segmentation model according to the loss function includes the following steps:
step S501: the active shape loss function and the mean square loss function are combined.
Step S502: the Adam algorithm is used as an optimizer, and the learning rate of the prostate ultrasonic segmentation model is dynamically adjusted through a cosine annealing restart method, so that the weight is updated through back propagation iteration of a loss function.
As an alternative embodiment, the training set extracting a priori knowledge of the prostate contour and constructing the loss function includes the following steps:
step S401: extracting the prostate contour of each patient according to the paired prostate ultrasound image and Mask image of each patient in the training set to obtain three-dimensional point cloud data of the prostate contour of each patient, and correspondingly using each three-dimensional point cloud data as a training sample X = [ X ] of each patient 1 ,y 1 ,z 1 ,x 2 ,y2,z 2 ,…,x n ,y n ,z n ,] T Wherein x in the training sample n ,y n ,z n Representing the three-dimensional coordinates of the nth point, a training sample set J = { X) is generated 1 ,X 2 ,…,X j Where X in the training sample set j Denoted as the jth training sample.
Step S402: calculating the average shape of all training samples according to the three-dimensional point cloud data of each training sample, wherein the calculation formula of the average shape is as follows:
\bar{X} = \frac{1}{j} \sum_{i=1}^{j} X_i    (1)
In formula (1), \bar{X} is the average shape, X_i is the ith training sample, and j is the total number of training samples.
Step S403: registering all training samples in a training sample set to an average shape through affine transformation, and calculating the offset of each registered sample compared with the average shape, wherein the offset is calculated according to the formula:
dX_i = X_i - \bar{X}    (2)
In formula (2), dX_i is the offset of the ith training sample from the average shape after registration, X_i is the ith training sample, and \bar{X} is the average shape.
Step S404: and calculating a covariance matrix of the training sample set according to the offset, wherein a calculation formula of the covariance matrix is as follows:
S = \frac{1}{j} \sum_{i=1}^{j} dX_i \, dX_i^{T}    (3)
In formula (3), dX_i^{T} is the transpose of the offset of the ith registered sample compared with the average shape.
Step S405: singular value decomposition is carried out on the covariance matrix to obtain an eigenvalue and an eigenvector, and the calculation formula is as follows:
S p_i = \lambda_i p_i    (4)
In formula (4), \lambda_i is the ith eigenvalue of the covariance matrix and p_i is the ith eigenvector of the covariance matrix.
Step S406: and selecting the maximum first t feature values to represent the main shape of the training sample, and obtaining a statistical model of the shape vector of the sample. Wherein t needs to satisfy:
\frac{\sum_{i=1}^{t} \lambda_i}{\sum_{i} \lambda_i} \geq Ratio    (5)
In formula (5), Ratio indicates the proportion of the total deformation in the original model that the principal shapes can explain; it is typically above 80% and is set to 90% in this application, so that the principal shapes of the samples in the training set can be represented with a smaller number of variables.
The expression for the main shape is:
X = \bar{X} + P B    (6)
In formula (6), P is the eigenvector and B is the eigenvalue corresponding to the eigenvector.
Then the statistical model of the sample shape vector is:
X = \bar{X} + P_t B_t ,\qquad B_t = P_t^{T} ( X - \bar{X} )    (7)
In formula (7), P_t is the first t eigenvectors, P_t^{T} is the transpose of P_t, and B_t is the eigenvalues (coefficients) of the first t eigenvectors.
Step S407: and constructing a local gray model according to the statistical model to calculate the local characteristic of each characteristic point so as to adjust iteration parameters and obtain an optimal matching model of the target. The covariance matrix of the local gray model of each feature point is as follows:
S_i = \frac{1}{j} \sum_{j} \left( g_{ij} - \bar{g}_i \right) \left( g_{ij} - \bar{g}_i \right)^{T}    (8)
In formula (8), g_{ij} is the ith feature point of the jth training sample and \bar{g}_i is the mean of the local gray model at the ith feature point.
Step S408: and calculating the similarity between the moved characteristic points in the Mahalanobis distance comparison matching process and the new characteristic points obtained after moving to serve as evaluation indexes of the active shape loss function. The similarity of the two points is larger when the Mahalanobis distance is smaller, so that the optimal position of the point is determined. The Mahalanobis distance calculation formula of the two characteristic points is as follows:
f(g_s) = \left( g_s - \bar{g}_i \right)^{T} S_i^{-1} \left( g_s - \bar{g}_i \right)    (9)
In formula (9), g_s is the new feature point and S_i^{-1} is the inverse of the covariance matrix of g_i.
Step S409: in the training process of the prostate ultrasonic segmentation model, an active shape loss function is obtained by calculating errors of the predicted points and corresponding points of the golden standard Mask, and the active shape loss function is as follows:
L_{shape} = \sum_{j} \sum_{i} \left\| g_{ij} - \hat{g}_{ij} \right\|^{2}    (10)
In formula (10), g_{ij} represents the ith feature point of the jth sample in the gold standard, and \hat{g}_{ij} represents the ith feature point of the jth sample in the model prediction result.
Step S410: in the training process of the prostate ultrasonic segmentation model, a mean square loss function is obtained by calculating the variance between a model predicted value and a sample true value, and the mean square loss function is as follows:
L_{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^{2}    (11)
In formula (11), y_i represents the predicted value of the model for the ith sample and \hat{y}_i represents the true value of the ith sample.
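A PyTorch sketch of the two losses of steps S409–S410; the averaging over points and samples and the weight used when the losses are combined are assumptions of the sketch:
    import torch

    def active_shape_loss(pred_points, gt_points):
        # pred_points, gt_points: tensors (num_samples, num_points, 3); formula (10)
        return ((pred_points - gt_points) ** 2).sum(dim=-1).mean()

    def mean_square_loss(pred, target):
        # formula (11): variance between the model prediction and the true value
        return ((pred - target) ** 2).mean()

    def combined_loss(pred_points, gt_points, pred_seg, gt_seg, weight=0.5):
        return mean_square_loss(pred_seg, gt_seg) + weight * active_shape_loss(pred_points, gt_points)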
As an alternative embodiment, the prostate ultrasound segmentation model with attention mechanism includes an initialization module, a first encoding module, a second encoding module, a third encoding module, a fourth encoding module, a fifth encoding module, a first decoding module, a second decoding module, a third decoding module, a fourth decoding module, a fifth decoding module, a first attention mechanism module, a second attention mechanism module, a third attention mechanism module, a fourth attention mechanism module, a fifth attention mechanism module, and a final segmentation module.
The output end of the initialization module is connected to the input end of the first coding module. The output end of the first coding module is connected to the input end of the second coding module. The output end of the second coding module is connected to the input end of the third coding module. The output end of the third coding module is connected to the input end of the fourth coding module. The output end of the fourth coding module is connected to the input end of the fifth coding module. The output end of the fifth coding module and the output end of the fourth coding module are connected to the input end of the first attention mechanism module in a jumping mode. The output end of the first attention mechanism module is connected to the input end of the first decoding module. The output end of the first decoding module and the output end of the third encoding module are connected to the input end of the second attention mechanism module in a jumping mode. The output end of the second attention mechanism module is connected to the input end of the second decoding module. The output end of the second decoding module and the output end of the second encoding module are connected to the input end of the third attention mechanism module in a jumping mode. The output end of the third attention mechanism module is connected to the input end of the third decoding module. The output end of the third decoding module and the output end of the first encoding module are connected to the input end of the fourth attention mechanism module in a jumping mode. The output end of the fourth attention mechanism module is connected to the input end of the fourth decoding module. The output end of the fourth decoding module and the output end of the initialization module are connected to the input end of a fifth attention mechanism module in a jumping manner, and the output end of the fifth attention mechanism module is connected to the input end of the final segmentation module.
Specifically, in the present embodiment, the image input to the prostate ultrasound segmentation model is a five-dimensional vector of size N × C × D × H × W, where N is the batch size, C is the number of image channels, D is the image depth, H is the image height, and W is the image width. Referring to fig. 2, an input image is converted by the initialization module into a five-dimensional vector with 16 channels and then sequentially passes through the first encoding module, the second encoding module, the third encoding module, the fourth encoding module, and the fifth encoding module. The output of the fifth encoding module and the output of the fourth encoding module enter the first attention module through a skip connection, and the output of the first attention module serves as the input of the first decoding module. The output of the first decoding module and the output of the third encoding module are input to the second attention module through a skip connection, and the output of the second attention module serves as the input of the second decoding module. The output of the second decoding module and the output of the second encoding module are input to the third attention module through a skip connection, and the output of the third attention module serves as the input of the third decoding module. The output of the third decoding module and the output of the first encoding module are input to the fourth attention module through a skip connection, and the output of the fourth attention module is used as the input of the fourth decoding module. The output of the fourth decoding module and the output of the initialization module are input to the fifth attention module through a skip connection, and the output of the fifth attention module is used as the input of the fifth decoding module. The output of the fifth decoding module passes through the final segmentation module, which outputs the segmentation result of the model.
The first coding module, the second coding module, the third coding module, the fourth coding module and the fifth coding module have the same structure, each consisting of 1 maximum pooling layer (kernel 2×2, sliding step 2, filling parameter 0) and 2 convolution layers (convolution kernel size 3×3, sliding step 1, filling parameter 1, activation function ReLU, normalization operation group normalization). The first to fifth coding modules output five-dimensional vectors with 32, 64, 128, 256 and 512 channels, respectively.
The first decoding module, the second decoding module, the third decoding module, the fourth decoding module and the fifth decoding module have the same structure, each consisting of 2 convolution layers (convolution kernel size 3×3, sliding step 1, filling parameter 1, activation function ReLU, normalization operation group normalization). The first to fifth decoding modules output five-dimensional vectors with 256, 128, 64, 32 and 16 channels, respectively.
The final segmentation module consists of 1 convolution layer and a sigmoid function; the convolution kernel size of the convolution layer is 1×1 and the sliding step is 1. The final segmentation module converts the output of the fifth decoding module into a 2-channel five-dimensional vector, i.e. the probability that the model predicts each pixel to belong to the background or the prostate.
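The building blocks above can be sketched in PyTorch as follows; the kernel sizes are written as isotropic 3D kernels (3×3×3, 2×2×2, 1×1×1), the number of groups for group normalization is an assumed value, and how the decoder features are upsampled and fused at each skip connection is not spelled out in this sketch:
    import torch.nn as nn

    def double_conv(in_ch, out_ch, groups=8):
        # two 3x3x3 convolutions, each followed by group normalization and ReLU
        return nn.Sequential(
            nn.Conv3d(in_ch, out_ch, kernel_size=3, stride=1, padding=1),
            nn.GroupNorm(groups, out_ch), nn.ReLU(inplace=True),
            nn.Conv3d(out_ch, out_ch, kernel_size=3, stride=1, padding=1),
            nn.GroupNorm(groups, out_ch), nn.ReLU(inplace=True),
        )

    class EncoderBlock(nn.Module):
        # max pooling (2x2x2, stride 2) followed by two convolutions
        def __init__(self, in_ch, out_ch):
            super().__init__()
            self.pool = nn.MaxPool3d(kernel_size=2, stride=2, padding=0)
            self.conv = double_conv(in_ch, out_ch)
        def forward(self, x):
            return self.conv(self.pool(x))

    class DecoderBlock(nn.Module):
        # two convolutions applied to the attention-gated skip-connection features
        def __init__(self, in_ch, out_ch):
            super().__init__()
            self.conv = double_conv(in_ch, out_ch)
        def forward(self, x):
            return self.conv(x)

    class FinalSegmentation(nn.Module):
        # 1x1x1 convolution + sigmoid: per-voxel probabilities of background / prostate
        def __init__(self, in_ch=16, num_classes=2):
            super().__init__()
            self.head = nn.Conv3d(in_ch, num_classes, kernel_size=1, stride=1)
        def forward(self, x):
            return self.head(x).sigmoid()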
According to the method, on the basis of the 3D-UNet, attention mechanism modules are added after the skip connections, which increases the weight of the prostate target region in the model and improves the efficiency of the model.
As an alternative embodiment, each of the first attention mechanism module, the second attention mechanism module, the third attention mechanism module, the fourth attention mechanism module and the fifth attention mechanism module includes a first parallel branch, a second parallel branch, a third parallel branch and a summing unit, the first parallel branch is used for outputting the channel attention, the second parallel branch is used for outputting the spatial attention, the third parallel branch is used for outputting the original input feature, the output ends of the first parallel branch, the second parallel branch and the third parallel branch are respectively connected to the summing unit, and the summing unit is used for summing the channel attention, the spatial attention and the original input feature.
As an alternative embodiment, the first parallel branch is provided with a first max pooling layer, a first average pooling layer, a perceptron structure, a first vector adding unit and a first Sigmoid activation function unit; the input end of the first parallel branch is connected to the input end of the first max pooling layer and the input end of the first average pooling layer, and the output ends of the first max pooling layer and the first average pooling layer are respectively connected to the input end of the perceptron structure. The output end of the perceptron structure is connected to the input end of the first vector adding unit, the first vector adding unit is used for adding two vectors, the output end of the first vector adding unit is connected to the input end of the first Sigmoid activation function unit, and the output end of the first Sigmoid activation function unit is connected to the output end of the first parallel branch.
In this embodiment, the first parallel branch passes the input features through the first maximum pooling layer and the first average pooling layer to obtain the vector F_max produced by the first maximum pooling layer and the vector F_avg produced by the first average pooling layer. The vectors F_max and F_avg are then input into a perceptron structure consisting of two fully connected layers and a ReLU activation function. The two vectors output by the perceptron structure are added by the first vector adding unit. Finally, the added vector is processed by the Sigmoid activation function in the first Sigmoid activation function unit to obtain the channel attention. The channel attention is expressed as:
M_c(F) = \sigma\left( MLP(AvgPool(F)) + MLP(MaxPool(F)) \right)    (12)
In formula (12), M_c(F) is the channel attention, MLP(AvgPool(F)) is the average-pooled vector output by the perceptron structure, MLP(MaxPool(F)) is the maximum-pooled vector output by the perceptron structure, and \sigma is the Sigmoid activation function.
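A PyTorch sketch of this channel-attention branch (formula (12)); the reduction ratio of the shared two-layer perceptron is an assumed value:
    import torch
    import torch.nn as nn

    class ChannelAttention(nn.Module):
        def __init__(self, channels, reduction=8):
            super().__init__()
            self.max_pool = nn.AdaptiveMaxPool3d(1)          # first maximum pooling layer
            self.avg_pool = nn.AdaptiveAvgPool3d(1)          # first average pooling layer
            self.mlp = nn.Sequential(                        # shared perceptron: two fully connected layers + ReLU
                nn.Linear(channels, channels // reduction),
                nn.ReLU(inplace=True),
                nn.Linear(channels // reduction, channels),
            )
        def forward(self, x):                                # x: N x C x D x H x W
            n, c = x.shape[:2]
            m = self.mlp(self.max_pool(x).view(n, c))
            a = self.mlp(self.avg_pool(x).view(n, c))
            return torch.sigmoid(m + a).view(n, c, 1, 1, 1)  # M_c(F), broadcast over D, H, W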
As an alternative embodiment, the second parallel branch is provided with a second maximum pooling layer, a second average pooling layer, a second vector summing unit, a convolution layer and a second Sigmoid activation function unit.
The input end of the second parallel branch is connected to the input ends of the second maximum pooling layer and the second average pooling layer, the output ends of the second maximum pooling layer and the second average pooling layer are connected to the input end of the second vector summing unit, the second vector summing unit is used for combining the two vectors on the channel, the output end of the second vector summing unit is connected to the input end of the convolutional layer, the output end of the convolutional layer is connected to the input end of the second Sigmoid activation function unit, and the output end of the second Sigmoid activation function unit is connected to the output end of the second parallel branch.
In this embodiment, the second parallel branch passes the input features through the second maximum pooling layer and the second average pooling layer to obtain the vector F_max produced by the second maximum pooling layer and the vector F_avg produced by the second average pooling layer. The second vector summing unit combines the vectors F_max and F_avg along the channel dimension. The combined vector is input into a convolution layer with a convolution kernel size of 3, a sliding step of 1 and a filling parameter of 1, and the convolved vector is processed by the Sigmoid activation function in the second Sigmoid activation function unit to obtain the spatial attention. The spatial attention is expressed as:
M_s(F) = \sigma\left( f^{3\times3}\left( [AvgPool(F); MaxPool(F)] \right) \right)    (13)
In formula (13), M_s(F) is the spatial attention, AvgPool(F) is the average-pooled vector, MaxPool(F) is the maximum-pooled vector, f^{3\times3} is the convolution, and \sigma is the Sigmoid activation function.
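A matching sketch of the spatial-attention branch (formula (13)) and of the complete attention module; ChannelAttention refers to the earlier sketch, and the final broadcast addition follows the summing unit described in the text (a multiplicative gate, as in CBAM, would scale the input instead):
    import torch
    import torch.nn as nn

    class SpatialAttention(nn.Module):
        def __init__(self):
            super().__init__()
            self.conv = nn.Conv3d(2, 1, kernel_size=3, stride=1, padding=1)
        def forward(self, x):                               # x: N x C x D x H x W
            max_map, _ = x.max(dim=1, keepdim=True)         # second maximum pooling (over channels)
            avg_map = x.mean(dim=1, keepdim=True)           # second average pooling (over channels)
            stacked = torch.cat([avg_map, max_map], dim=1)  # combined along the channel dimension
            return torch.sigmoid(self.conv(stacked))        # M_s(F)

    class AttentionModule(nn.Module):
        def __init__(self, channels):
            super().__init__()
            self.channel_att = ChannelAttention(channels)
            self.spatial_att = SpatialAttention()
        def forward(self, x):
            # summing unit: channel attention + spatial attention + original input features
            return self.channel_att(x) + self.spatial_att(x) + x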
The second aspect of the invention discloses a prostate ultrasonic segmentation method, which comprises the following steps:
step A1: and acquiring an ultrasonic image of the prostate to be segmented.
Step A2: the above prostate ultrasound image is input to the prostate ultrasound segmentation model obtained by any one of the above training methods for a prostate ultrasound segmentation model disclosed in the first aspect of the present invention, so as to output a segmentation result.
The third aspect of the present invention also discloses an electronic device, which comprises a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement the steps of the method according to any one of the first and second aspects of the present invention.
The fourth aspect of the present invention also discloses a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method of any one of the first and second aspects of the present invention.
Through the above detailed description of the embodiments, those skilled in the art will clearly understand that the embodiments may be implemented by software plus a necessary general hardware platform, and may also be implemented by hardware. Based on such understanding, the above technical solutions may be embodied in the form of a software product, which may be stored in a computer-readable storage medium, wherein the storage medium includes a Read-Only Memory (ROM), a Random Access Memory (RAM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), a One-time Programmable Read-Only Memory (OTPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Compact Disc-Read-Only Memory (CD-ROM) or other Memory capable of storing data, a magnetic tape, or any other computer-readable medium capable of storing data.
The technical principle of the present invention is described above in connection with specific embodiments. The description is made for the purpose of illustrating the principles of the invention and should not be taken in any way as limiting the scope of the invention. Based on the explanations herein, those skilled in the art will be able to conceive of other embodiments of the present invention without inventive step, and these embodiments will fall within the scope of the present invention.

Claims (10)

1. The training method of the prostate ultrasonic segmentation model is characterized by comprising the following steps: the method comprises the following steps:
acquiring paired prostate ultrasonic images and Mask images of each patient;
carrying out data preprocessing on the paired prostate ultrasonic images and Mask images of each patient, marking out a training set, and carrying out data enhancement processing on the paired prostate ultrasonic images and Mask images of each patient in the training set;
combining an attention mechanism into the 3D-UNet to construct a prostate ultrasonic segmentation model, and training the prostate ultrasonic segmentation model according to the training set;
extracting priori knowledge of the prostate contour according to the training set and constructing a loss function;
and adjusting the model parameters of the prostate ultrasonic segmentation model according to the loss function to obtain the optimized prostate ultrasonic segmentation model.
2. The method for training an ultrasound prostate segmentation model according to claim 1, wherein: the loss function comprises an active shape loss function and a mean square loss function, and the step of adjusting the model parameters of the prostate ultrasonic segmentation model according to the loss function comprises the following steps:
and combining the active shape loss function and the mean square loss function, taking an Adam algorithm as an optimizer, and dynamically adjusting the learning rate of the prostate ultrasonic segmentation model by a cosine annealing restart method so as to make the loss function propagate reversely and update the weight iteratively.
3. The method for training an ultrasound prostate segmentation model according to claim 1, characterized in that: the training set extracts the priori knowledge of the prostate contour and constructs a loss function, and the method comprises the following steps:
extracting the prostate contour of each patient according to the paired prostate ultrasound image and Mask image of each patient in the training set to obtain three-dimensional point cloud data of the prostate contour of each patient, and correspondingly using each set of three-dimensional point cloud data as the training sample X = [x_1, y_1, z_1, x_2, y_2, z_2, …, x_n, y_n, z_n]^T of that patient, where x_n, y_n, z_n in the training sample represent the three-dimensional coordinates of the nth point; a training sample set J = {X_1, X_2, …, X_j} is generated, where X_j in the training sample set denotes the jth training sample;
calculating the average shape of all training samples according to the three-dimensional point cloud data of each training sample, wherein the calculation formula of the average shape is as follows:
\bar{X} = \frac{1}{j} \sum_{i=1}^{j} X_i    (1)
In formula (1), \bar{X} is the average shape, X_i is the ith training sample, and j is the total number of training samples;
registering all training samples in a training sample set to an average shape through affine transformation, and calculating the offset of each registered sample compared with the average shape, wherein the offset is calculated according to the formula:
dX_i = X_i - \bar{X}    (2)
In formula (2), dX_i is the offset of the ith training sample from the average shape after registration, X_i is the ith training sample, and \bar{X} is the average shape;
and calculating a covariance matrix of the training sample set according to the offset, wherein the calculation formula of the covariance matrix is as follows:
S = \frac{1}{j} \sum_{i=1}^{j} dX_i \, dX_i^{T}    (3)
In formula (3), dX_i^{T} is the transpose of the offset of the ith registered sample compared with the average shape;
singular value decomposition is carried out on the covariance matrix to obtain an eigenvalue and an eigenvector, and the calculation formula is as follows:
S p_i = \lambda_i p_i    (4);
In formula (4), \lambda_i is the ith eigenvalue of the covariance matrix and p_i is the ith eigenvector of the covariance matrix;
selecting the largest first t characteristic values to represent the main shape of the training sample, and obtaining a statistical model of the shape vector of the sample;
wherein t satisfies:
\frac{\sum_{i=1}^{t} \lambda_i}{\sum_{i} \lambda_i} \geq Ratio    (5)
In formula (5), Ratio represents the proportion of the total deformation in the original model that the principal shapes can explain;
the expression for the main shape is:
X = \bar{X} + P B    (6)
In formula (6), P is the eigenvector and B is the eigenvalue corresponding to the eigenvector;
the statistical model of the sample shape vector is then:
X = \bar{X} + P_t B_t ,\qquad B_t = P_t^{T} ( X - \bar{X} )    (7)
In formula (7), P_t is the first t eigenvectors, P_t^{T} is the transpose of P_t, and B_t is the eigenvalues (coefficients) of the first t eigenvectors;
constructing a local gray model according to the statistical model to calculate the local characteristic of each characteristic point so as to adjust iteration parameters and obtain an optimal matching model of the target; the covariance matrix of the local gray model of each feature point is as follows:
S_i = \frac{1}{j} \sum_{j} \left( g_{ij} - \bar{g}_i \right) \left( g_{ij} - \bar{g}_i \right)^{T}    (8)
In formula (8), g_{ij} is the ith feature point of the jth training sample and \bar{g}_i is the mean of the local gray model at the ith feature point;
calculating the similarity between the moved characteristic points in the Mahalanobis distance comparison matching process and the new characteristic points obtained after the movement, wherein the similarity between the two characteristic points is larger when the Mahalanobis distance is smaller as an evaluation index of the active shape loss function; the Mahalanobis distance calculation formula of the two characteristic points is as follows:
f(g_s) = \left( g_s - \bar{g}_i \right)^{T} S_i^{-1} \left( g_s - \bar{g}_i \right)    (9)
In formula (9), g_s is the new feature point and S_i^{-1} is the inverse of the covariance matrix of g_i;
in the training process of the prostate ultrasonic segmentation model, an active shape loss function is obtained by calculating errors of the predicted points and corresponding points of a Mask image, wherein the active shape loss function is as follows:
L_{shape} = \sum_{j} \sum_{i} \left\| g_{ij} - \hat{g}_{ij} \right\|^{2}    (10)
In formula (10), g_{ij} represents the ith feature point of the jth sample in the gold standard, and \hat{g}_{ij} represents the ith feature point of the jth sample in the model prediction result;
in the training process of the prostate ultrasonic segmentation model, a mean square loss function is obtained by calculating the variance between the model predicted value and the sample true value, and the mean square loss function is as follows:
L_{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^{2}    (11)
In formula (11), y_i represents the predicted value of the model for the ith sample and \hat{y}_i represents the true value of the ith sample.
4. The method for training an ultrasound prostate segmentation model according to claim 1, characterized in that:
the prostate ultrasonic segmentation model combined with the attention mechanism comprises an initialization module, a first encoding module, a second encoding module, a third encoding module, a fourth encoding module, a fifth encoding module, a first decoding module, a second decoding module, a third decoding module, a fourth decoding module, a fifth decoding module, a first attention mechanism module, a second attention mechanism module, a third attention mechanism module, a fourth attention mechanism module, a fifth attention mechanism module and a final segmentation module;
the output end of the initialization module is connected to the input end of the first coding module; the output end of the first coding module is connected to the input end of the second coding module; the output end of the second coding module is connected to the input end of the third coding module; the output end of the third coding module is connected to the input end of the fourth coding module; the output end of the fourth coding module is connected to the input end of the fifth coding module; the output end of the fifth coding module and the output end of the fourth coding module are connected to the input end of the first attention mechanism module in a jumping mode; the output end of the first attention mechanism module is connected to the input end of the first decoding module; the output end of the first decoding module and the output end of the third encoding module are connected to the input end of the second attention mechanism module in a jumping mode; the output end of the second attention mechanism module is connected to the input end of the second decoding module; the output end of the second decoding module and the output end of the second encoding module are connected to the input end of the third attention mechanism module in a jumping mode; the output end of the third attention mechanism module is connected to the input end of the third decoding module; the output end of the third decoding module and the output end of the first coding module are connected to the input end of the fourth attention mechanism module in a jumping mode; the output end of the fourth attention mechanism module is connected to the input end of the fourth decoding module; the output end of the fourth decoding module and the output end of the initialization module are connected to the input end of a fifth attention mechanism module in a jumping mode, and the output end of the fifth attention mechanism module is connected to the input end of the final segmentation module.
5. The method for training an ultrasound prostate segmentation model according to claim 1, characterized in that: the first attention mechanism module, the second attention mechanism module, the third attention mechanism module, the fourth attention mechanism module and the fifth attention mechanism module respectively comprise a first parallel branch, a second parallel branch, a third parallel branch and a summing unit, the first parallel branch is used for outputting channel attention, the second parallel branch is used for outputting space attention, the third parallel branch is used for outputting original input features, output ends of the first parallel branch, the second parallel branch and the third parallel branch are respectively connected to the summing unit, and the summing unit is used for adding the channel attention, the space attention and the original input features.
6. The method for training an ultrasound prostate segmentation model according to claim 5, wherein: the first parallel branch is provided with a first maximum pooling layer, a first average pooling layer, a sensor structure, a first vector adding unit and a first Sigmoid activation function unit, the input end of the first parallel branch is connected to the input end of the first maximum pooling layer and the input end of the first average pooling layer, and the output ends of the first maximum pooling layer and the first average pooling layer are respectively connected to the input end of the sensor structure; the output end of the perceptron structure is connected with the input end of the first vector adding unit, the first vector adding unit is used for adding two vectors, the output end of the first vector unit is connected with the input end of the first Sigmoid activated function unit, and the output end of the first Sigmoid activated function unit is connected with the output end of the first parallel branch.
7. The method for training a prostate ultrasound segmentation model according to claim 5, wherein: the second parallel branch is provided with a second maximum pooling layer, a second average pooling layer, a second vector summing unit, a convolution layer and a second Sigmoid activation function unit;
the input end of the second parallel branch is connected to the input ends of the second maximum pooling layer and the second average pooling layer, the output ends of the second maximum pooling layer and the second average pooling layer are connected to the input end of the second vector summing unit, the second vector summing unit is used for combining the two vectors along the channel dimension, the output end of the second vector summing unit is connected to the input end of the convolution layer, the output end of the convolution layer is connected to the input end of the second Sigmoid activation function unit, and the output end of the second Sigmoid activation function unit is connected to the output end of the second parallel branch.
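A possible reading of this spatial-attention branch. Channel-wise concatenation of the two pooled maps and the 7×7×7 convolution kernel are assumptions; the claim only states that the two vectors are combined on the channel dimension before the convolution layer and the Sigmoid.

```python
import torch
import torch.nn as nn


class SpatialAttention3D(nn.Module):
    """Sketch of the second parallel branch (claim 7): channel-wise max and average pooling,
    combination of the two maps along the channel dimension, a convolution, then a Sigmoid."""

    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv3d(2, 1, kernel_size, padding=kernel_size // 2)  # convolution layer
        self.sigmoid = nn.Sigmoid()                  # second Sigmoid activation function unit

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        max_map, _ = x.max(dim=1, keepdim=True)      # second maximum pooling layer (over channels)
        avg_map = x.mean(dim=1, keepdim=True)        # second average pooling layer (over channels)
        stacked = torch.cat([max_map, avg_map], dim=1)  # second vector summing unit (channel-wise)
        return self.sigmoid(self.conv(stacked))      # spatial attention map, shape (B, 1, D, H, W)
```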
8. A prostate ultrasound segmentation method, comprising:
acquiring a prostate ultrasound image to be segmented;
inputting the prostate ultrasound image into the prostate ultrasound segmentation model obtained by the training method according to any one of claims 1 to 7, and outputting a segmentation result.
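A hypothetical inference sketch for this claim, reusing the AttentionUNet3D class assumed in the sketch after claim 4; the checkpoint path and the 64³ volume size are placeholders, not values taken from the patent.

```python
import torch

# Instantiate the model sketched earlier; in practice trained weights would be loaded, e.g.
# model.load_state_dict(torch.load("prostate_unet.pt", map_location="cpu"))  # hypothetical path
model = AttentionUNet3D()
model.eval()

volume = torch.randn(1, 1, 64, 64, 64)            # stand-in for a preprocessed prostate ultrasound volume
with torch.no_grad():
    logits = model(volume)                        # forward pass through the segmentation model
mask = (torch.sigmoid(logits) > 0.5).float()      # binary prostate mask as the segmentation result
```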
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 8 when executing the program.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 8.
CN202310054691.2A 2023-02-03 2023-02-03 Training method, segmentation method and device of prostate ultrasonic segmentation model Pending CN115953412A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310054691.2A CN115953412A (en) 2023-02-03 2023-02-03 Training method, segmentation method and device of prostate ultrasonic segmentation model


Publications (1)

Publication Number Publication Date
CN115953412A true CN115953412A (en) 2023-04-11

Family

ID=87289247

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310054691.2A Pending CN115953412A (en) 2023-02-03 2023-02-03 Training method, segmentation method and device of prostate ultrasonic segmentation model

Country Status (1)

Country Link
CN (1) CN115953412A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112561860A (en) * 2020-11-23 2021-03-26 重庆邮电大学 BCA-UNet liver segmentation method based on prior shape constraint
CN114037714A (en) * 2021-11-02 2022-02-11 大连理工大学人工智能大连研究院 3D MR and TRUS image segmentation method for prostate system puncture
CN115115648A (en) * 2022-06-20 2022-09-27 北京理工大学 Brain tissue segmentation method combining UNet and volume rendering prior knowledge
CN115239716A (en) * 2022-09-22 2022-10-25 杭州影想未来科技有限公司 Medical image segmentation method based on shape prior U-Net

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
GAN SHOUFEI et al.: "Aircraft target recognition method for remote sensing images based on an improved active shape model", Journal of Chongqing University, vol. 37, no. 9, pages 1 - 3 *

Similar Documents

Publication Publication Date Title
US11756160B2 (en) ML-based methods for pseudo-CT and HR MR image estimation
US11010630B2 (en) Systems and methods for detecting landmark pairs in images
CN109272443B (en) PET and CT image registration method based on full convolution neural network
RU2720440C1 (en) Image segmentation method using neural network
KR20210048523A (en) Image processing method, apparatus, electronic device and computer-readable storage medium
JP2002539870A (en) Image processing method and apparatus
US20080281203A1 (en) System and Method for Quasi-Real-Time Ventricular Measurements From M-Mode EchoCardiogram
CN112819831B (en) Segmentation model generation method and device based on convolution Lstm and multi-model fusion
CN112634265B (en) Method and system for constructing and segmenting fully-automatic pancreas segmentation model based on DNN (deep neural network)
WO2024011835A1 (en) Image processing method and apparatus, device, and readable storage medium
CN117078692B (en) Medical ultrasonic image segmentation method and system based on self-adaptive feature fusion
CN114359642A (en) Multi-modal medical image multi-organ positioning method based on one-to-one target query Transformer
EP3973508A1 (en) Sampling latent variables to generate multiple segmentations of an image
CN112750137A (en) Liver tumor segmentation method and system based on deep learning
CN116258933A (en) Medical image segmentation device based on global information perception
CN110570425A (en) Lung nodule analysis method and device based on deep reinforcement learning algorithm
CN112396605B (en) Network training method and device, image recognition method and electronic equipment
CN116563096B (en) Method and device for determining deformation field for image registration and electronic equipment
Huang et al. Feature pyramid network with level-aware attention for meningioma segmentation
CN115439423B (en) CT image-based identification method, device, equipment and storage medium
CN116309640A (en) Image automatic segmentation method based on multi-level multi-attention MLMA-UNet network
CN115953412A (en) Training method, segmentation method and device of prostate ultrasonic segmentation model
CN116485853A (en) Medical image registration method and device based on deep learning neural network
CN116309806A (en) CSAI-Grid RCNN-based thyroid ultrasound image region of interest positioning method
CN110739050A (en) left ventricle full parameter and confidence degree quantification method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination