CN116433586A - Breast ultrasound tomography image segmentation model establishment method and segmentation method - Google Patents

Breast ultrasound tomography image segmentation model establishment method and segmentation method

Info

Publication number: CN116433586A
Application number: CN202310149152.7A (filed by Huazhong University of Science and Technology)
Authority: CN (China)
Prior art keywords: tomographic image, breast, image segmentation, layer, model
Other languages: Chinese (zh)
Inventors: 丁明跃 (Ding Mingyue), 蔡梦媛 (Cai Mengyuan), 周亮 (Zhou Liang)
Current Assignee: Huazhong University of Science and Technology
Original Assignee: Huazhong University of Science and Technology
Legal status: Pending


Classifications

    • G06T7/0012 Biomedical image inspection (G06T7/00 Image analysis; G06T7/0002 Inspection of images, e.g. flaw detection)
    • G06N3/08 Learning methods (G06N3/02 Neural networks)
    • G06V10/26 Segmentation of patterns in the image field; clustering-based techniques; detection of occlusion (G06V10/20 Image preprocessing)
    • G06V10/82 Recognition or understanding using neural networks (G06V10/70 Pattern recognition or machine learning)
    • G06T2207/10024 Color image (G06T2207/10 Image acquisition modality)
    • G06T2207/20081 Training; Learning (G06T2207/20 Special algorithmic details)
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/30068 Mammography; Breast (G06T2207/30 Subject of image)


Abstract

The invention discloses a breast ultrasound tomographic image segmentation model establishment method and a segmentation method, belonging to the field of medical image segmentation and comprising the following steps: constructing a training set composed of breast ultrasound tomographic images marked with lesion areas; constructing an initial breast ultrasound tomographic image segmentation model based on the UNet model; and, taking L_total = L_E + αL_contrast + βL_uncer as the training loss function, training the constructed initial model with the training set to complete establishment of the breast ultrasound tomographic image segmentation model. Here L_total denotes the overall loss; L_E denotes the segmentation error of the breast ultrasound tomographic image segmentation model; L_contrast denotes a contrastive loss between the intermediate results output by the decoding structure and the labeling result; L_uncer denotes an uncertainty loss between the intermediate results output by the decoding structure and the labeling result; and α and β denote weight coefficients. Preferably, the model inserts a CrossFormer module and a multi-scale attention module into UNet. The invention can effectively improve the segmentation accuracy of breast ultrasound tomographic images.

Description

Breast ultrasound tomography image segmentation model establishment method and segmentation method
Technical Field
The invention belongs to the field of medical image segmentation, and particularly relates to a breast ultrasound tomographic image segmentation model establishment method and a segmentation method.
Background
Breast cancer is the second most deadly cancer in women, after lung cancer, and regular breast screening can effectively reduce breast cancer mortality. Compared with conventional ultrasound, ultrasound tomography offers high sensitivity and standardized operation, its diagnostic results do not depend on the operator's experience, and it therefore has broad application prospects in breast tumor screening. Ultrasound tomography has two modes, reflection and transmission: the reflection-mode image has a higher resolution than the conventional ultrasound image, while the transmission-mode image can provide functional information about the lesion and thus further assist the doctor's diagnosis, so the transmission mode is often used when imaging the breast. An ultrasound tomography system scans step by step to obtain images of different sections of the breast tissue and then reconstructs a three-dimensional image of the tissue, making lesions easier to visualize.
The probe frequency used in ultrasound tomography systems is typically 2-3 MHz, far below the 10 MHz and higher frequencies of conventional ultrasound probes, so the image resolution is lower; identifying lesions in breast ultrasound tomographic images is therefore a challenging and time-consuming task. Computer-aided diagnosis is now widely applied in medicine, mainly to assist doctors in diagnosis and in formulating subsequent treatment plans, improving diagnostic specificity and sensitivity. In computer-aided diagnosis of the breast, automated medical image segmentation is one of the most critical steps for improving diagnostic efficiency and accuracy. Segmenting the lesion from the ultrasound tomographic image allows the doctor to determine its size and degree of pathology and to reconstruct an intuitive three-dimensional shape, facilitating correct diagnosis and accurate treatment planning.
The UNet model performs well in medical image segmentation applications and is therefore also widely used in breast ultrasound tomographic image segmentation. UNet is a typical encoding-decoding structure, shown in fig. 1: the convolutional network on the left is the encoding structure, responsible for feature extraction; it consists mainly of convolution layers and downsampling layers, so the feature map shrinks progressively. The right side is the decoding structure, corresponding to an upsampling process that restores the feature map to nearly the original image size with the help of long skip connections (channel concatenation) carrying information from the corresponding convolution layers. To further improve the segmentation of breast ultrasound tomographic images, researchers have improved the traditional UNet model and proposed variants such as the Attention-UNet, UNet++, and ResUNet models.
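As a point of reference for the structural description above, the following is a minimal PyTorch sketch of such an encoding-decoding network with concat-style long skip connections; the depth and channel widths are illustrative defaults, not the exact configuration claimed by this patent.

```python
import torch
import torch.nn as nn

def conv_block(cin, cout):
    # two 3x3 convolutions, each followed by batch normalization and ReLU
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, padding=1), nn.BatchNorm2d(cout), nn.ReLU(inplace=True),
        nn.Conv2d(cout, cout, 3, padding=1), nn.BatchNorm2d(cout), nn.ReLU(inplace=True))

class MiniUNet(nn.Module):
    def __init__(self, in_ch=3, n_classes=2, widths=(64, 128, 256, 512)):
        super().__init__()
        self.encs = nn.ModuleList()
        c = in_ch
        for w in widths:                      # encoding path: conv blocks + 2x2 pooling
            self.encs.append(conv_block(c, w))
            c = w
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(c, c)
        self.ups, self.decs = nn.ModuleList(), nn.ModuleList()
        for w in reversed(widths):            # decoding path: upsample, concat skip, conv
            self.ups.append(nn.ConvTranspose2d(c, w, 2, stride=2))
            self.decs.append(conv_block(w + w, w))
            c = w
        self.head = nn.Conv2d(c, n_classes, 1)

    def forward(self, x):
        skips = []
        for enc in self.encs:
            x = enc(x)
            skips.append(x)                   # kept for the long skip connections
            x = self.pool(x)
        x = self.bottleneck(x)
        for up, dec, skip in zip(self.ups, self.decs, reversed(skips)):
            x = dec(torch.cat([up(x), skip], dim=1))   # concat-style skip connection
        return self.head(x)

# a 512x512 three-channel input yields a 512x512 two-class prediction map
out = MiniUNet()(torch.randn(1, 3, 512, 512))
```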
After training, such models can segment the lesion area in a breast ultrasound tomographic image, but unlike other medical images, breast ultrasound tomographic images have complex backgrounds and low contrast between the lesion and other tissues; moreover, in the decoding stage the upsampling operations lose a great deal of information, so mis-segmentation still occurs in practical application. In addition, because lesions tend to be small in breast ultrasound tomographic images, existing models also fail to segment small lesion areas accurately. Overall, the segmentation accuracy of existing breast ultrasound tomographic image segmentation methods needs further improvement.
Disclosure of Invention
In view of the defects and improvement needs of the prior art, the invention provides a breast ultrasound tomographic image segmentation model establishment method and a segmentation method, aiming to improve the segmentation accuracy of breast ultrasound tomographic images.
In order to achieve the above object, according to one aspect of the present invention, there is provided a breast ultrasound tomographic image segmentation model establishment method comprising:
constructing a training set, wherein each piece of training data in the training set is a breast ultrasound tomographic image marked with a lesion area;
constructing, based on the UNet model, an initial breast ultrasound tomographic image segmentation model for segmenting the lesion area from a breast ultrasound tomographic image;
taking L_total = L_E + αL_contrast + βL_uncer as the training loss function, training the initial breast ultrasound tomographic image segmentation model with the training set to complete establishment of the breast ultrasound tomographic image segmentation model;
where L_total denotes the overall loss; L_E denotes the segmentation error of the breast ultrasound tomographic image segmentation model; L_contrast denotes the contrastive loss between the intermediate results output by the decoding structure of the model and the labeling result, with weight coefficient α; L_uncer is the uncertainty loss, characterizing the difference between the intermediate results output by the decoding structure and the labeling result, with weight coefficient β; α ≥ 0, β ≥ 0, and α and β are not both 0.
Further,

L_uncer = (1/J) Σ_{j=1..J} Σ_p w_j^p [ L_CE(p_j(p), l(p)) + D_KL(l(p) ∥ p_j(p)) ]

D_KL(l(p) ∥ p_j(p)) = Σ_{i=1..C} l^i(p) log( l^i(p) / p_j^i(p) )

where J denotes the number of intermediate outputs of the model; L_CE(·) denotes the cross entropy loss; l denotes the labeling result; p_j denotes the j-th intermediate result output by the decoding structure of the breast ultrasound tomographic image segmentation model, and p_j^i denotes the i-th channel of the intermediate result p_j; D_KL denotes the KL divergence; C denotes the number of categories; and w_j^p denotes the weight assigned to pixel p in the j-th intermediate result.
Further,

L_contrast = −log [ exp(ℓ(s_l^m, s_o^m)/τ) / ( exp(ℓ(s_l^m, s_o^m)/τ) + Σ_{n≠m} exp(ℓ(s_l^m, s_o^n)/τ) ) ]

where ℓ(·, ·) denotes the similarity between pixel features; s_o denotes features in the intermediate result and s_l denotes features in the labeling result; m and n denote feature categories, so that (s_l^m, s_o^m) is a pair of features of the same category and (s_l^m, s_o^n) a pair of features of different categories; τ denotes the temperature coefficient.
Further, the breast ultrasound tomographic image segmentation model further includes a CrossFormer module inserted between the last layer of the encoding structure and the decoding structure of the UNet model.
Further, the breast ultrasound tomographic image segmentation model further includes: a multi-scale attention module inserted in a long jump connection between the encoding structure and the decoding structure in the UNet model;
the multi-scale attention module includes: a cavity space convolution pooling pyramid and a characteristic enhancement module;
the cavity convolution pooling pyramid is used for extracting multi-scale features of the features output by the corresponding layer in the coding structure to obtain multi-scale features;
and the feature enhancement module takes the multi-scale features and decoding features output by corresponding layers in the decoding structure as inputs, and is used for enhancing the features related to the task and inhibiting the features not related to the features.
Further, the feature enhancement module comprises a first convolution layer, a second convolution layer, a third convolution layer, a ReLU activation layer, a Sigmoid activation layer, a pixel addition layer and a pixel multiplication layer;
the first convolution layer is used for inputting multi-scale features and carrying out convolution operation;
the second convolution layer is used for inputting decoding characteristics and carrying out convolution operation;
the pixel adding layer is used for adding the results output by the first convolution layer and the second convolution layer pixel by pixel;
the ReLU activation layer is used for activating the result output by the pixel addition layer;
the third convolution layer is used for carrying out convolution operation on the result output by the ReLU activation layer;
the Sigmoid activation layer is used for activating the result output by the third convolution layer;
the pixel multiplication layer is used for carrying out pixel-by-pixel multiplication on the result output by the Sigmoid activation layer and the multi-scale features.
Further, constructing a training set, comprising:
obtaining an original data set formed by breast ultrasonic tomographic images, and labeling focus areas in each breast ultrasonic tomographic image to obtain corresponding focus area mask images;
splicing n consecutive breast ultrasound tomographic images in the channel dimension, and forming the training set from the spliced breast ultrasound tomographic images and the corresponding lesion area mask images;
wherein n is a positive integer greater than 1.
Further, n=3;
and, stitching consecutive 3 breast ultrasound tomographic images in a channel dimension, comprising:
extracting information of one channel from each breast ultrasonic tomographic image to obtain information of three channels;
and splicing the information of the three channels in the channel dimension.
According to another aspect of the present invention, there is provided a breast ultrasound tomographic image segmentation method including:
inputting the breast ultrasonic tomographic image to be segmented into a breast ultrasonic tomographic image segmentation model to obtain a focus region segmentation result;
the breast ultrasonic tomography image segmentation model is built by the breast ultrasonic tomography image segmentation model building method.
According to yet another aspect of the present invention, there is provided a computer readable storage medium comprising a stored computer program; when the computer program is executed by the processor, the equipment where the computer readable storage medium is located is controlled to execute the method for establishing the breast ultrasonic tomography image segmentation model provided by the invention, and/or the method for segmenting the breast ultrasonic tomography image provided by the invention.
In general, through the above technical solutions conceived by the present invention, the following beneficial effects can be obtained:
(1) When the established breast ultrasound tomographic image segmentation model is trained, at least one of a contrastive loss and an uncertainty loss between the intermediate results output by the decoding structure and the labeling result is introduced into the training loss function. The contrastive loss gives the model the ability to distinguish pixels of different categories (i.e., lesion area or background) and classifies pixels near the boundary more accurately, alleviating the impact of the low contrast of breast ultrasound tomographic images on segmentation accuracy. The uncertainty loss minimizes the difference between the intermediate results output by the decoding structure and the labeling result, enabling the network to learn more discriminative and reliable knowledge at an early stage and effectively reducing the loss of detail information caused by the upsampling operations in the decoding stage, so that the model can accurately segment the lesion area despite the low contrast. In general, the invention effectively improves the segmentation accuracy of breast ultrasound tomographic images by improving the model training loss function.
(2) The difference between the model output and the label is generally measured with cross entropy loss and Dice loss, but in breast ultrasound tomographic image segmentation an uncertainty loss built only on the cross entropy loss has limited effect in alleviating the detail loss caused by upsampling. In the preferred scheme of the invention, the designed uncertainty loss introduces a KL-divergence-based regularization term on top of the cross entropy loss, which reduces the loss of detail information caused by the upsampling operations in the decoding stage as much as possible and further improves the segmentation accuracy of the model.
(3) In the preferred scheme of the invention, the designed contrastive loss maximizes the similarity of pixels belonging to the same category and minimizes the similarity of pixels belonging to different categories between the intermediate outputs and the labeling result, further improving the model's ability to distinguish pixels of different categories.
(4) The breast ultrasound tomographic image segmentation task is a dense prediction task, for which both local and global features are very important. In the preferred scheme of the invention, a CrossFormer module is inserted between the last layer of the encoding structure and the decoding structure of the UNet model; on top of the rich local information extracted by the convolution and pooling layers of the encoding structure, the CrossFormer module extracts global information, and by encoding the dependency between global and local information it makes the segmentation of small lesions more accurate.
(5) In the preferred scheme of the invention, a multi-scale attention module is inserted in the long skip connections of the UNet model. The module consists of an atrous spatial pyramid pooling (ASPP) module and a feature enhancement module: the ASPP module lets the network enlarge its receptive field and capture multi-scale information while retaining as much local detail as possible, and the feature enhancement module further processes the multi-scale features extracted by the ASPP module, enhancing task-relevant features and suppressing task-irrelevant ones, so that the final model segments small lesions with higher accuracy.
(6) During ultrasound tomography, multiple images are generated for the same object. In the preferred scheme of the invention, when the data set for model training is constructed, several consecutive images are spliced in the channel dimension and used as the model input, effectively exploiting the inter-layer information of the ultrasound tomographic images and further improving segmentation accuracy. In a further preferred scheme, one channel is extracted from each of three consecutive images and the three are spliced as the model input; this avoids having the spliced image contain too many background-area features, and keeps the dimensions of the spliced image consistent with the original image, simplifying input processing in the subsequent segmentation stage.
Drawings
Fig. 1 is a schematic diagram of a conventional UNet model structure;
FIG. 2 is a breast ultrasound tomographic image and a mask image corresponding thereto provided by an embodiment of the present invention; wherein, (a) is a breast ultrasound tomographic image, and (b) is a mask image;
FIG. 3 is a schematic view of a breast ultrasound tomographic image segmentation model provided by an embodiment of the present invention;
FIG. 4 is a schematic diagram of a conventional CrossFormer module;
FIG. 5 is a schematic diagram of a multi-scale attention module according to an embodiment of the present invention;
fig. 6 shows the segmentation results of the segmentation method provided by an embodiment of the invention and of other UNet-based segmentation methods; wherein (a) is a lesion area label of a breast ultrasound tomographic image, (b) is the segmentation result of the UNet model, (c) of the Attention-UNet model, (d) of the UNet++ model, (e) of the ResUNet model, (f) of the TransUNet model, and (g) of this embodiment.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. In addition, the technical features of the embodiments of the present invention described below may be combined with each other as long as they do not collide with each other.
In the present invention, the terms "first," "second," and the like in the description and in the drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order.
In order to improve the accuracy of breast ultrasound tomographic image segmentation, the invention provides a breast ultrasound tomographic image segmentation model establishment method and a segmentation method, whose overall idea is as follows: by optimizing the loss function of the model training process, the model gains the ability to distinguish pixels of different categories and/or learns more discriminative and reliable knowledge at an early stage, reducing the loss of detail information caused by the upsampling operations, so that it can accurately segment lesion areas from breast ultrasound tomographic images of low contrast. On this basis, the structure of the model is further improved so that it makes better use of the global information and local detail information in the image, further improving segmentation accuracy.
The following are examples.
Example 1:
a breast ultrasound tomographic image segmentation model building method comprises the following steps:
constructing a training set; in the training set, each piece of training data is a breast ultrasonic tomographic image marked with a focus area; fig. 2 shows an example of training data, where (a) in fig. 2 is an acquired breast ultrasound tomographic image and (b) is a mask image marking a lesion area.
Optionally, in this embodiment, the specific step of constructing the training set includes the following steps:
(1) Acquiring breast ultrasonic tomography images by using a 2-3MHz ultrasonic tomography system, acquiring 372 images in total, and dividing the images into a training set, a verification set and a test set according to the proportion of 7:1:2;
(2) Having professionals label the lesion areas of the 372 acquired images in a consistent manner, obtaining corresponding mask images used as labels during model training;
(3) Augmenting the breast ultrasound tomographic images and the corresponding labels in the training set by operations such as flipping, rotation, and scaling, expanding the training set to 8 times its original size, for a total of 2088 images;
the robustness of the model can be improved and the overfitting can be avoided through data enhancement;
(4) Unifying the sizes of the augmented training images to 512×512 to match the model's required input size, and then splicing every three consecutive images in the channel dimension as follows: extract the information of one channel from each image, then splice the three extracted channels in the channel dimension; the resulting three-channel image serves as the input of the segmentation model, i.e., the input image size is 512×512×3 (see the sketch after the next paragraph);
during ultrasound tomography, multiple images (usually 30) are generated for the same object, and this embodiment splices consecutive images as the model input, effectively exploiting inter-layer information. In addition, the original breast ultrasound tomographic image is a three-channel image whose channels carry identical signals; this embodiment therefore extracts one channel from each of three consecutive ultrasound images and splices them into a single three-channel image as the model input. On top of using inter-layer information, this avoids having the spliced image contain too many background-area features, and it keeps the dimensions of the spliced image consistent with the original image, so that the channel dimension of an original breast ultrasound tomographic image matches the model input requirement during actual segmentation, simplifying input processing in the subsequent segmentation stage.
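A minimal sketch of this slice-splicing step follows; it assumes each stored slice is a three-channel image with identical channels, so that any one channel (channel 0 here, an arbitrary choice) can be taken from each slice.

```python
import numpy as np

def stack_slices(slice_a, slice_b, slice_c):
    """Build one 512x512x3 model input from three consecutive tomographic
    slices, each stored as an identical-signal three-channel image."""
    grays = [s[:, :, 0] for s in (slice_a, slice_b, slice_c)]  # one channel per slice
    return np.stack(grays, axis=-1)                            # -> (512, 512, 3)

# usage: slices come from the step-scanned sequence of one breast volume
a, b, c = (np.zeros((512, 512, 3), dtype=np.float32) for _ in range(3))
x = stack_slices(a, b, c)
assert x.shape == (512, 512, 3)
```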
It should be noted that, the above training set construction method is only a preferred embodiment of the present invention, and should not be construed as a unique limitation of the present invention, and other data sets of breast ultrasound tomographic images marked with focal areas can be used in the present invention; it is to be understood that in the above steps, parameters such as the number of images, the size of the images, etc. are only exemplary descriptions, and should not be construed as the only limitation of the present invention, and may be adjusted as needed in practical applications.
The present embodiment further includes: an initial breast ultrasound tomographic image segmentation model is constructed based on the UNet model for segmenting a lesion area from the breast ultrasound tomographic image.
Because both the original UNet model and the models improved from it lose detail information during the upsampling of the decoding stage and therefore cannot segment the small lesions in breast ultrasound tomographic images well, this embodiment derives a new breast ultrasound tomographic image segmentation model by improving the UNet model, specifically the model shown in fig. 1. The improvements comprise: a CrossFormer module inserted between the last layer of the encoding structure and the decoding structure of the UNet model, and multi-scale attention modules (MFE modules) inserted in the long skip connections of the UNet model.
The improved segmentation model is shown in fig. 3 and can be divided into twelve layers. The first layer consists of two convolution layers; the second to fourth layers each consist of a 2×2 pooling layer and two convolution layers, where each convolution layer comprises a 3×3 convolution, a batch normalization module, and a ReLU activation function. The output feature map of the first layer is 512×512×64 (width and height 512, 64 channels); from the second to the fourth layer, each output feature map has twice as many channels as the previous layer's and half its width and height, so the fourth layer outputs 64×64×512. The fifth layer consists of a pooling layer and a convolution layer; its output has the same number of channels as the fourth layer but again half the width and height, i.e., the output feature map of the fifth layer is 32×32×512.
Referring to fig. 3, in the breast ultrasound tomographic image segmentation model the sixth layer is the inserted CrossFormer module, which consists of two consecutive CrossFormer blocks, as shown in fig. 4. The first CrossFormer block contains a cross-scale embedding layer (CEL), two layer normalizations (LayerNorm), a relative position bias (RPB), a short-distance attention layer (SDA), and a multi-layer perceptron (MLP); the second contains two layer normalizations, a relative position bias, a long-distance attention layer (LDA), and a multi-layer perceptron. The output feature map of the sixth layer has the same size as that of the fifth layer, i.e., still 32×32×512. The seventh layer consists of a convolution layer and an upsampling layer; its output feature map has the same number of channels as that of the sixth layer and twice its width and height, i.e., 64×64×512.
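For intuition only, the sketch below shows the short-distance / long-distance attention alternation at the core of such a block pair; the cross-scale embedding layer and relative position bias of the real CrossFormer are omitted, and the window/dilation grouping is a simplified reading of SDA and LDA rather than the patent's implementation.

```python
import torch
import torch.nn as nn

class GroupAttention(nn.Module):
    """One simplified CrossFormer-style block: self-attention inside pixel
    groups. dilated=False groups g x g contiguous windows (SDA-like);
    dilated=True groups pixels sampled at a fixed stride across the map
    (LDA-like)."""
    def __init__(self, dim, heads=8, group=8, dilated=False):
        super().__init__()
        self.g, self.dilated = group, dilated
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1, self.norm2 = nn.LayerNorm(dim), nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))

    def forward(self, x):                 # x: (B, H, W, C), channels-last
        B, H, W, C = x.shape
        g = self.g
        if self.dilated:                  # split H, W as (g, H//g): group members
            x = x.reshape(B, g, H // g, g, W // g, C)   # lie H//g pixels apart
            x = x.permute(0, 2, 4, 1, 3, 5)
        else:                             # split H, W as (H//g, g): contiguous
            x = x.reshape(B, H // g, g, W // g, g, C)   # g x g windows
            x = x.permute(0, 1, 3, 2, 4, 5)
        shape = x.shape                   # (B, H//g, W//g, g, g, C)
        t = x.reshape(-1, g * g, C)       # each group becomes a token sequence
        h = self.norm1(t)
        t = t + self.attn(h, h, h, need_weights=False)[0]  # pre-norm attention
        t = t + self.mlp(self.norm2(t))                    # pre-norm MLP
        x = t.reshape(shape)
        x = x.permute(0, 3, 1, 4, 2, 5) if self.dilated else x.permute(0, 1, 3, 2, 4, 5)
        return x.reshape(B, H, W, C)

# usage on a bottleneck-sized feature map such as the sixth layer's 32x32x512
block_pair = nn.Sequential(GroupAttention(512, dilated=False),   # short-distance
                           GroupAttention(512, dilated=True))    # long-distance
y = block_pair(torch.randn(2, 32, 32, 512))
```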
In this embodiment, the CrossFormer module is inserted between the last layer of the encoding structure and the decoding structure of the UNet model; on top of the rich local information extracted by the convolution and pooling layers of the encoding structure, the CrossFormer module extracts global information, and by encoding the dependency between global and local information it makes the segmentation of small lesions more accurate.
Referring to fig. 3, in the breast ultrasound tomographic image segmentation model the eighth to tenth layers each consist of two convolution layers and one upsampling layer; each halves the number of channels of its input feature map and doubles its width and height. The width and height of the tenth layer's output feature map are restored to the input image size, 512×512. The eleventh layer consists of two convolution layers, and the twelfth layer consists of one 1×1 convolution layer whose number of output channels equals the number of classes, giving an output feature map of 512×512×2.
Taking the CrossFormer module as the dividing point, the structure on its left corresponds to the encoding structure and the structure on its right to the decoding structure.
Referring to fig. 3, in the breast ultrasound tomographic image segmentation model, the outputs of the first, second, third, and fourth layers of the encoding structure are spliced, via long skip connections through the inserted multi-scale attention modules (MFE modules), with the outputs of the tenth, ninth, eighth, and seventh layers of the decoding structure, respectively. The output feature maps of the first and tenth layers are 512×512×64; of the second and ninth layers, 256×256×128; of the third and eighth layers, 128×128×256; and of the fourth and seventh layers, 64×64×512. That is, the first- and tenth-layer outputs are spliced into 512×512×128, the second- and ninth-layer outputs into 256×256×256, the third- and eighth-layer outputs into 128×128×512, and the fourth- and seventh-layer outputs into 64×64×1024, each splice being followed by a convolution operation.
In this embodiment, the structure of the multi-scale attention module (MFE module) is shown in fig. 5 and comprises two branches. One branch is an atrous spatial pyramid pooling module (ASPP module) that performs multi-scale feature extraction on the features output by the corresponding layer of the encoding structure to obtain multi-scale features; the ASPP module comprises a 1×1 convolution, three 3×3 atrous convolutions with dilation rates of 6, 12, and 18, and a pooling layer, a structure that lets the network enlarge its receptive field and capture multi-scale information while retaining as much local detail as possible. The other branch is a feature enhancement module comprising three 1×1 convolutions (i.e., a first convolution layer, a second convolution layer, and a third convolution layer), a ReLU activation layer, a Sigmoid activation layer, a pixel addition layer, and a pixel multiplication layer, as shown in fig. 5:
the first convolution layer is used for inputting multi-scale features and carrying out convolution operation;
the second convolution layer is used for inputting decoding characteristics and carrying out convolution operation;
the pixel adding layer is used for adding the results output by the first convolution layer and the second convolution layer pixel by pixel;
the ReLU activation layer is used for activating the result output by the pixel addition layer;
the third convolution layer is used for carrying out convolution operation on the result output by the ReLU activation layer;
the Sigmoid activation layer is used for activating the result output by the third convolution layer;
the pixel multiplication layer is used for carrying out pixel-by-pixel multiplication on the result output by the Sigmoid activation layer and the multi-scale features; the result after multiplication is used as the input of the next layer.
The multi-scale features extracted by the ASPP are further processed through the feature enhancement module, so that the features related to the task can be enhanced and the features not related to the features can be restrained; it should be noted that, the description of the structure of the feature enhancing module is a preferred scheme, and should not be construed as the only limitation of the present invention, and other modules implemented based on an attention mechanism, which can enhance features related to tasks and suppress features not related to features, may also be used in the present invention.
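A compact PyTorch reading of this two-branch module is sketched below. The ASPP branch follows the dilation rates stated above and uses a global-average-pooling branch, a common ASPP convention assumed here; the enhancement branch follows the seven-layer recipe of the list above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ASPP(nn.Module):
    # 1x1 conv, three 3x3 atrous convs (rates 6/12/18), and a pooling branch
    def __init__(self, cin, cout):
        super().__init__()
        self.branches = nn.ModuleList([nn.Conv2d(cin, cout, 1)] + [
            nn.Conv2d(cin, cout, 3, padding=r, dilation=r) for r in (6, 12, 18)])
        self.pool = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Conv2d(cin, cout, 1))
        self.fuse = nn.Conv2d(5 * cout, cout, 1)

    def forward(self, x):
        h, w = x.shape[-2:]
        feats = [b(x) for b in self.branches]
        feats.append(F.interpolate(self.pool(x), size=(h, w),
                                   mode='bilinear', align_corners=False))
        return self.fuse(torch.cat(feats, dim=1))

class MFEModule(nn.Module):
    # multi-scale attention module: ASPP on the encoder feature, then a
    # feature-enhancement gate driven by the decoder feature
    def __init__(self, ch):
        super().__init__()
        self.aspp = ASPP(ch, ch)
        self.conv_ms = nn.Conv2d(ch, ch, 1)    # first 1x1 conv (multi-scale branch)
        self.conv_dec = nn.Conv2d(ch, ch, 1)   # second 1x1 conv (decoder branch)
        self.conv_gate = nn.Conv2d(ch, ch, 1)  # third 1x1 conv (attention gate)

    def forward(self, enc_feat, dec_feat):     # same spatial size, per Fig. 3
        ms = self.aspp(enc_feat)                                # multi-scale features
        a = F.relu(self.conv_ms(ms) + self.conv_dec(dec_feat))  # pixel-wise add + ReLU
        gate = torch.sigmoid(self.conv_gate(a))                 # Sigmoid attention map
        return ms * gate                        # pixel-wise multiply; fed to next layer
```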
In general, by inserting the above multi-scale attention module in the long skip connections of the UNet model, the present embodiment gives the model higher segmentation accuracy for small lesions.
It should be noted that the above model is only a preferred model of the present invention and should not be construed as its only limitation; in other embodiments of the present invention, provided the segmentation accuracy meets the application requirements, only the CrossFormer module or only the multi-scale attention module may be inserted at the corresponding position in the UNet model, or the existing UNet model or improved models such as the Attention-UNet, UNet++, and ResUNet models may be used directly.
It is to be understood that when the structure of the selected UNet model is changed, the number of layers and parameters in each layer of the improved breast ultrasound tomographic image segmentation model may also be changed accordingly, and the above description of the model parameters is only an exemplary description and should not be construed as the only limitation of the present invention.
The initial breast ultrasound tomographic image segmentation model can perform the actual segmentation task only after training. The difference between the model output and the label is typically measured with cross entropy loss and Dice loss; however, the contrast between the lesion area and the background in a breast ultrasound tomographic image is low, and in UNet and its improved models the upsampling operations lose a great deal of information during the decoding stage, so models trained with existing methods still suffer from mis-segmentation. This embodiment therefore introduces into the model's training loss function a contrastive loss between the intermediate results output by the decoding structure (i.e., O_2, O_3, O_4 in fig. 3) and the labeling result, as well as an uncertainty loss. The improved loss function is:

L_total = L_E + αL_contrast + βL_uncer

where L_total denotes the overall loss; L_E denotes the segmentation error of the breast ultrasound tomographic image segmentation model; L_contrast denotes the contrastive loss between the intermediate results output by the decoding structure of the model and the labeling result, with weight coefficient α; L_uncer is the uncertainty loss, characterizing the difference between the intermediate results output by the decoding structure and the labeling result, with weight coefficient β; α ≥ 0, β ≥ 0, and α and β are not both 0. The specific values of α and β can be flexibly adjusted according to the actual segmentation effect; optionally, in this embodiment α and β are set to 0.2 and 0.3, respectively.
In the loss function designed in this embodiment, introducing the uncertainty loss minimizes the difference between the intermediate results output by the decoding structure and the labeling result, enabling the network to learn more discriminative and reliable knowledge at an early stage and effectively reducing the loss of detail information caused by the upsampling operations in the decoding stage, so that the model can accurately segment the lesion area despite the low contrast. Experiments show that an uncertainty loss built only on the cross entropy loss has limited effect in alleviating the detail loss caused by upsampling; to reduce this loss as much as possible, this embodiment introduces a KL-divergence-based regularization term on top of the cross entropy loss. The final expression of the uncertainty loss function is as follows:
L_uncer = (1/J) Σ_{j=1..J} Σ_p w_j^p [ L_CE(p_j(p), l(p)) + D_KL(l(p) ∥ p_j(p)) ]

D_KL(l(p) ∥ p_j(p)) = Σ_{i=1..C} l^i(p) log( l^i(p) / p_j^i(p) )

where J denotes the number of intermediate outputs of the model; L_CE(·) denotes the cross entropy loss; l denotes the labeling result; p_j denotes the j-th intermediate result output by the decoding structure of the breast ultrasound tomographic image segmentation model, and p_j^i denotes the i-th channel of the intermediate result p_j; D_KL denotes the KL divergence; C denotes the number of categories; and w_j^p denotes the weight assigned to pixel p in the j-th intermediate result.
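A PyTorch sketch of the uncertainty term under the reconstruction above — per-pixel cross entropy plus a KL regularizer, weighted by w_j^p — is given below. How w_j^p is computed is not specified here, so the weights are passed in explicitly; the block is an illustrative reading, not the patent's verbatim definition.

```python
import torch
import torch.nn.functional as F

def uncertainty_loss(intermediates, label, weights, eps=1e-8):
    """intermediates: list of J logit maps, each (B, C, H, W);
    label: (B, H, W) integer class map (long);
    weights: list of J maps (B, H, W) giving w_j^p for each pixel."""
    total = 0.0
    onehot = None
    for p_j, w_j in zip(intermediates, weights):
        prob = p_j.softmax(dim=1)
        if onehot is None:                         # one-hot label l, shape (B, C, H, W)
            onehot = F.one_hot(label, prob.shape[1]).permute(0, 3, 1, 2).float()
        ce = F.cross_entropy(p_j, label, reduction='none')            # per-pixel CE
        kl = (onehot * (torch.log(onehot + eps)
                        - torch.log(prob + eps))).sum(dim=1)          # KL(l || p_j)
        total = total + (w_j * (ce + kl)).mean()   # weighted average over pixels
    return total / len(intermediates)
```

Note that with hard one-hot labels the KL term coincides with the cross entropy up to a constant; the sketch fixes the interface, while a softened label or inter-output distribution would make the regularizer distinct.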
The introduction of the contrast loss function enables the model to have the capability of distinguishing pixels belonging to different categories (namely focus areas or backgrounds), and the pixels near the boundary are more accurately classified, so that the influence of low contrast of the breast ultrasonic tomographic image on the segmentation precision is relieved; in this embodiment, the expression of the contrast loss function is specifically:
L_contrast = −log [ exp(ℓ(s_l^m, s_o^m)/τ) / ( exp(ℓ(s_l^m, s_o^m)/τ) + Σ_{n≠m} exp(ℓ(s_l^m, s_o^n)/τ) ) ]

where ℓ(·, ·) denotes the similarity between pixel features; s_o denotes features in the intermediate result and s_l denotes features in the labeling result; m and n denote feature categories: s_l^m denotes the m-th category of features in the labeling result, and s_o^m and s_o^n denote the m-th and n-th categories of features in the intermediate result s_o, so that (s_l^m, s_o^m) is a same-category pair and (s_l^m, s_o^n) a different-category pair; τ denotes the temperature coefficient, optionally set to 2 in this embodiment.

Optionally, in this embodiment the similarity is specifically measured by cosine similarity, with corresponding expression ℓ(u, v) = (u · v) / (‖u‖ ‖v‖).
It should be noted that cosine similarity is only an optional feature similarity measurement, and should not be construed as a unique limitation of the present invention, and other measurement methods, such as pearson correlation coefficient, mahalanobis distance, euclidean distance, etc., may also be used in the present invention.
Based on the contrast loss function, the similarity of pixels belonging to the same category is maximized and the similarity belonging to different categories is minimized in the intermediate output and labeling result, so that the capability of distinguishing pixels of different categories of the model can be further improved.
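Under the InfoNCE-style form above, one plausible sketch pools a single mean feature per category from the label side and from the prediction side and contrasts them; this prototype pooling is an assumption about how s_l^m and s_o^n are formed, which the text does not detail.

```python
import torch
import torch.nn.functional as F

def class_prototypes(feat, mask, n_classes):
    # one L2-normalized mean feature per class, pooled over that class's pixels
    B, D, H, W = feat.shape
    flat = feat.permute(0, 2, 3, 1).reshape(-1, D)
    m = mask.reshape(-1)
    protos = []
    for c in range(n_classes):
        sel = m == c
        v = flat[sel].mean(0) if sel.any() else flat.new_zeros(D)
        protos.append(F.normalize(v, dim=0))
    return torch.stack(protos)                     # (n_classes, D)

def contrast_loss(feat, pred_mask, label, n_classes=2, tau=2.0):
    """s_l^m: prototypes pooled with the ground-truth mask; s_o^n: prototypes
    pooled with the predicted mask. Same-category pairs are the positives."""
    s_l = class_prototypes(feat, label, n_classes)
    s_o = class_prototypes(feat, pred_mask, n_classes)
    logits = (s_l @ s_o.t()) / tau                 # cosine similarities / temperature
    target = torch.arange(n_classes, device=feat.device)
    return F.cross_entropy(logits, target)         # -log softmax on the diagonal
```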
Optionally, in the above loss function, the segmentation error L_E of the breast ultrasound tomographic image segmentation model, i.e., the error between the lesion segmentation result O_1 and the label, is still expressed with the cross entropy loss function L_CE and the Dice loss function L_Dice.
Finally, the overall loss function is expressed as follows:

L_E = L_CE + L_Dice

L_CE = −[y log l + (1 − y) log(1 − l)]

L_Dice = 1 − (2 Σ(y · l)) / (Σ y + Σ l)
wherein y represents a segmentation result output by the breast ultrasonic tomography image segmentation model, and l represents a labeling result;
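A sketch of this supervised term and of the overall combination follows; the soft Dice form and the wiring of the auxiliary terms reuse the earlier sketches and standard formulations, not code from the patent.

```python
import torch
import torch.nn.functional as F

def segmentation_error(logits, label, eps=1e-6):
    """L_E = L_CE + L_Dice for the final output O_1.
    logits: (B, C, H, W); label: (B, H, W) integer class map."""
    ce = F.cross_entropy(logits, label)                  # cross entropy term
    prob = logits.softmax(dim=1)[:, 1]                   # foreground (lesion) probability
    gt = (label == 1).float()
    inter = (prob * gt).sum()
    dice = 1 - (2 * inter + eps) / (prob.sum() + gt.sum() + eps)   # soft Dice loss
    return ce + dice

def total_loss(logits, intermediates, label, weights, alpha=0.2, beta=0.3):
    # L_total = L_E + alpha * L_contrast + beta * L_uncer, wired schematically
    # from the uncertainty_loss and contrast_loss sketches above
    return (segmentation_error(logits, label)
            + alpha * contrast_loss(intermediates[-1], logits.argmax(1), label)
            + beta * uncertainty_loss(intermediates, label, weights))
```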
based on the above analysis, after the data set and the initial breast ultrasound tomographic image segmentation model are constructed, this embodiment further includes:
taking L_total = L_E + αL_contrast + βL_uncer as the training loss function, training the initial breast ultrasound tomographic image segmentation model with the training set to complete establishment of the breast ultrasound tomographic image segmentation model.
It should be noted that, in other embodiments of the present invention, one of the super parameters α and β may be 0.
It is easy to understand that after model training is complete, to ensure the segmentation results meet requirements, the trained model is further tested and verified with the test set and the validation set; during testing, the segmentation results can be evaluated with the Dice coefficient and IoU as evaluation indexes.
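For reference, the two evaluation indexes can be computed over binary lesion masks as follows (standard definitions, not code from the patent):

```python
import numpy as np

def dice_iou(pred, gt, eps=1e-8):
    """pred, gt: binary numpy masks of the lesion area."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    dice = 2 * inter / (pred.sum() + gt.sum() + eps)
    iou = inter / (union + eps)
    return dice, iou
```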
Example 2:
a breast ultrasound tomographic image segmentation method comprising:
inputting the breast ultrasonic tomographic image to be segmented into a breast ultrasonic tomographic image segmentation model to obtain a focus region segmentation result;
the breast ultrasonic tomography image segmentation model is built by the breast ultrasonic tomography image segmentation model building method.
Example 3:
a computer readable storage medium comprising a stored computer program; when the computer program is executed by the processor, the equipment where the computer readable storage medium is located is controlled to execute the method for establishing the breast ultrasonic tomography image segmentation model provided by the invention, and/or the method for segmenting the breast ultrasonic tomography image provided by the invention.
The following further analyzes the beneficial effects obtained by the invention in combination with the segmentation effects of different models.
For the breast ultrasound tomographic image segmentation model shown in fig. 3, PyTorch was used as the deep learning framework, with an Nvidia GeForce RTX 2080Ti (11 GB) GPU as the experimental environment. The network structure provided in embodiment 1 above (i.e., the breast ultrasound tomographic image segmentation model shown in fig. 3) was adopted as the segmentation model, with C0=2, C1=64, C2=128, C3=256, and C4=512. The input image size is 512×512×3. A stochastic gradient descent optimizer (weight decay = 0.0001, momentum = 0.9) was used; the initial learning rate was set to 0.01 and multiplied by 0.1 every 100 epochs; 300 epochs were trained in total; and the batch size was set to 4. The 372 images of the data set were divided into training, validation, and test sets at a ratio of 7:1:2, and the training set was expanded to 8 times its original size. The loss function uses the cross entropy, Dice, contrastive, and uncertainty losses, with the Dice coefficient and intersection-over-union (IoU) as evaluation indexes.
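The stated training configuration maps onto PyTorch as sketched below; train_loader and the model/loss objects are stand-ins from the earlier sketches rather than the authors' code.

```python
import torch

model = MiniUNet(in_ch=3, n_classes=2).cuda()      # stand-in for the Fig. 3 model
opt = torch.optim.SGD(model.parameters(), lr=0.01,
                      momentum=0.9, weight_decay=0.0001)
sched = torch.optim.lr_scheduler.StepLR(opt, step_size=100, gamma=0.1)  # x0.1 / 100 epochs

for epoch in range(300):
    for x, y in train_loader:                      # batch size 4, inputs 512x512x3
        out = model(x.cuda())
        loss = segmentation_error(out, y.cuda())   # plus the weighted auxiliary terms
        opt.zero_grad(); loss.backward(); opt.step()
    sched.step()
```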
The breast ultrasound tomographic images were segmented with the different models, taking the existing UNet, Attention-UNet, UNet++, ResUNet, and TransUNet models as comparison models; the results are shown in fig. 6. In fig. 6, (a) shows the lesion area labels of several breast ultrasound tomographic images, (b) the segmentation results of the UNet model, (c) of the Attention-UNet model, (d) of the UNet++ model, (e) of the ResUNet model, (f) of the TransUNet model, and (g) of this embodiment. As the results in fig. 6 show, the boundaries segmented by the breast ultrasound tomographic image segmentation model established according to the invention are closer to the real boundaries, with relatively fewer mis-segmentations. For convenience of description, the model built in embodiment 1 above is hereinafter abbreviated as MFE-DSCrossUNet; the Dice and IoU coefficients of each model's segmentation results are shown in Table 1.
TABLE 1. Dice and IoU coefficients of the segmentation results of the different models (MFE-DSCrossUNet: Dice 0.8883, IoU 0.7990)
According to the evaluation indexes shown in Table 1, the segmentation results of the invention achieve a Dice coefficient of 0.8883 and an IoU coefficient of 0.7990. Compared with the UNet segmentation results, the Dice coefficient is improved by 3.68% and the IoU coefficient by 5.76%, fully demonstrating the effectiveness and superiority of the breast ultrasound tomographic image segmentation model established by the invention.
It will be readily appreciated by those skilled in the art that the foregoing description is merely a preferred embodiment of the invention and is not intended to limit the invention, but any modifications, equivalents, improvements or alternatives falling within the spirit and principles of the invention are intended to be included within the scope of the invention.

Claims (10)

1. A breast ultrasound tomographic image segmentation model establishment method, characterized by comprising:
constructing a training set, wherein each piece of training data in the training set is a breast ultrasound tomographic image marked with a lesion area;
constructing, based on the UNet model, an initial breast ultrasound tomographic image segmentation model for segmenting the lesion area from a breast ultrasound tomographic image;
taking L_total = L_E + αL_contrast + βL_uncer as the training loss function, training the initial breast ultrasound tomographic image segmentation model with the training set to complete establishment of the breast ultrasound tomographic image segmentation model;
wherein L_total denotes the overall loss; L_E denotes the segmentation error of the breast ultrasound tomographic image segmentation model; L_contrast denotes the contrastive loss between the intermediate results output by the decoding structure of the breast ultrasound tomographic image segmentation model and the labeling result, with weight coefficient α; L_uncer is an uncertainty loss characterizing the difference between the intermediate results output by the decoding structure and the labeling result, with weight coefficient β; α ≥ 0, β ≥ 0, and α and β are not both 0.
2. The method for constructing a breast ultrasound tomographic image segmentation model as in claim 1, wherein,
L_uncer = (1/J) Σ_{j=1..J} Σ_p w_j^p [ L_CE(p_j(p), l(p)) + D_KL(l(p) ∥ p_j(p)) ]

D_KL(l(p) ∥ p_j(p)) = Σ_{i=1..C} l^i(p) log( l^i(p) / p_j^i(p) )

where J denotes the number of intermediate outputs of the model; L_CE(·) denotes the cross entropy loss; l denotes the labeling result; p_j denotes the j-th intermediate result output by the decoding structure of the breast ultrasound tomographic image segmentation model, and p_j^i denotes the i-th channel of the intermediate result p_j; D_KL denotes the KL divergence; C denotes the number of categories; and w_j^p denotes the weight assigned to pixel p in the j-th intermediate result.
3. A method for constructing a breast ultrasound tomographic image segmentation model as in claim 1 or 2, wherein,
L_contrast = −log [ exp(ℓ(s_l^m, s_o^m)/τ) / ( exp(ℓ(s_l^m, s_o^m)/τ) + Σ_{n≠m} exp(ℓ(s_l^m, s_o^n)/τ) ) ]

wherein ℓ(·, ·) denotes the similarity between pixel features; s_o denotes features in the intermediate result and s_l denotes features in the labeling result; m and n denote feature categories, (s_l^m, s_o^m) being features of the same category and (s_l^m, s_o^n) features of different categories; τ denotes the temperature coefficient.
4. The breast ultrasound tomographic image segmentation model establishment method as in claim 1, wherein the breast ultrasound tomographic image segmentation model further comprises: a CrossFormer module inserted between the last layer of the encoding structure and the decoding structure of the UNet model.
5. The breast ultrasound tomographic image segmentation model establishment method according to claim 1 or 4, wherein the breast ultrasound tomographic image segmentation model further comprises: a multi-scale attention module inserted in a long skip connection between the encoding structure and the decoding structure of the UNet model;
the multi-scale attention module comprises an atrous spatial pyramid pooling (ASPP) module and a feature enhancement module;
the ASPP module is used for performing multi-scale feature extraction on the features output by the corresponding layer of the encoding structure to obtain multi-scale features;
the feature enhancement module takes the multi-scale features and the decoding features output by the corresponding layer of the decoding structure as inputs, and is used for enhancing task-relevant features and suppressing task-irrelevant features.
6. The breast ultrasound tomographic image segmentation model creation method as in claim 5, wherein the feature enhancement module comprises a first convolution layer, a second convolution layer, a third convolution layer, a ReLU activation layer, a Sigmoid activation layer, a pixel addition layer, and a pixel multiplication layer;
the first convolution layer is used for inputting the multi-scale features and performing convolution operation;
the second convolution layer is used for inputting the decoding characteristics and performing convolution operation;
the pixel adding layer is used for adding the results output by the first convolution layer and the second convolution layer pixel by pixel;
the ReLU activation layer is used for activating the result output by the pixel addition layer;
the third convolution layer is used for carrying out convolution operation on the result output by the ReLU activation layer;
the Sigmoid activation layer is used for activating the result output by the third convolution layer;
the pixel multiplication layer is used for carrying out pixel-by-pixel multiplication on the result output by the Sigmoid activation layer and the multi-scale feature.
7. The breast ultrasound tomographic image segmentation model construction method as in claim 1, wherein constructing a training set comprises:
obtaining an original data set formed by breast ultrasonic tomographic images, and labeling focus areas in each breast ultrasonic tomographic image to obtain corresponding focus area mask images;
splicing n consecutive breast ultrasound tomographic images in the channel dimension, and forming the training set from the spliced breast ultrasound tomographic images and the corresponding lesion area mask images;
wherein n is a positive integer greater than 1.
8. The breast ultrasound tomographic image segmentation model creation method as in claim 7, wherein n=3;
and, stitching consecutive 3 breast ultrasound tomographic images in a channel dimension, comprising:
extracting information of one channel from each breast ultrasonic tomographic image to obtain information of three channels;
and splicing the information of the three channels in the channel dimension.
9. A breast ultrasound tomographic image segmentation method, comprising:
inputting the breast ultrasonic tomographic image to be segmented into a breast ultrasonic tomographic image segmentation model to obtain a focus region segmentation result;
wherein the breast ultrasound tomographic image segmentation model is established by the breast ultrasound tomographic image segmentation model establishment method according to any one of claims 1 to 8.
10. A computer readable storage medium comprising a stored computer program; when the computer program is executed by a processor, the apparatus in which the computer readable storage medium is located is controlled to execute the breast ultrasound tomographic image segmentation model establishment method according to any one of claims 1 to 8, and/or the breast ultrasound tomographic image segmentation method according to claim 9.
CN202310149152.7A 2023-02-22 2023-02-22 Breast ultrasound tomography image segmentation model establishment method and segmentation method Pending CN116433586A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310149152.7A CN116433586A (en) Breast ultrasound tomography image segmentation model establishment method and segmentation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310149152.7A CN116433586A (en) Breast ultrasound tomography image segmentation model establishment method and segmentation method

Publications (1)

Publication Number Publication Date
CN116433586A true CN116433586A (en) 2023-07-14

Family

ID=87080422

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310149152.7A Pending CN116433586A (en) 2023-02-22 2023-02-22 Mammary gland ultrasonic tomography image segmentation model establishment method and segmentation method

Country Status (1)

Country Link
CN (1) CN116433586A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117422927A (en) * 2023-11-09 2024-01-19 什维新智医疗科技(上海)有限公司 Mammary gland ultrasonic image classification method, system, electronic equipment and medium
CN118196102A (en) * 2024-05-17 2024-06-14 华侨大学 Method and device for detecting breast ultrasonic tumor lesion area based on double-network shadow removal
CN118261925A (en) * 2024-04-17 2024-06-28 徐州医科大学 Breast ultrasound image segmentation method with large receptive field and enhanced attention



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination