CN115760810B - Medical image segmentation apparatus, method and computer-readable storage medium - Google Patents

Medical image segmentation apparatus, method and computer-readable storage medium

Info

Publication number
CN115760810B
CN115760810B (application CN202211486882.8A)
Authority
CN
China
Prior art keywords
module
mask
unit
feature
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211486882.8A
Other languages
Chinese (zh)
Other versions
CN115760810A (en)
Inventor
陈丽芳 (Chen Lifang)
王涛 (Wang Tao)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangnan University
Original Assignee
Jiangnan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangnan University filed Critical Jiangnan University
Priority to CN202211486882.8A priority Critical patent/CN115760810B/en
Publication of CN115760810A publication Critical patent/CN115760810A/en
Application granted granted Critical
Publication of CN115760810B publication Critical patent/CN115760810B/en

Landscapes

  • Image Processing (AREA)

Abstract

A medical image segmentation apparatus, method, and computer-readable storage medium. The apparatus comprises: a salient feature acquisition unit that extracts salient features of the image to be segmented; a salient feature encoding unit that encodes the salient features; a mask acquisition unit that extracts edge information of the subject to be segmented; an encoder that performs feature extraction on the received image to be segmented, whose first encoding unit contains a salient feature fusion module that fuses the image to be segmented with its salient features; and a decoder that restores image details and outputs the final segmentation result. The first encoding unit further comprises a mask fusion module, which receives the output feature map of the salient feature fusion module and the mask output by the mask acquisition unit and fuses the two. Combining the salient feature fusion module and the mask fusion module with a deep semantic network improves the segmentation accuracy of the subject to be segmented.

Description

Medical image segmentation apparatus, method and computer-readable storage medium
Technical Field
The present disclosure relates to the technical field of medical image segmentation, and in particular to a medical image segmentation method based on deep semantics.
Background
Image segmentation is an important branch of computer vision and is widely applied in fields such as industrial inspection, biometric recognition, intelligent transportation, security, and intelligent healthcare. Image segmentation techniques can be broadly divided into three categories: graph-theory-based methods, pixel-clustering-based methods, and deep-semantics-based methods. Traditional segmentation methods rely only on low-level pixel information such as color, brightness, and texture; their segmentation quality is poor and they are prone to mis-segmentation. Deep-semantic image segmentation can exploit high-level semantic information, largely remedying the missing semantics of traditional methods, and has achieved great success.
Currently, deep-semantic methods for medical image segmentation, and in particular for segmentation of cancerous skin, are mostly based on the U-Net architecture and models that extend it. For example, O. Ronneberger et al. first proposed a U-shaped medical image segmentation network, and Zhou, Z. et al. proposed a medical image segmentation method based on the UNet++ model. Because skin images suffer from hair occlusion and lightly colored lesions, existing segmentation methods cannot segment the cancerous skin region completely, and their segmentation quality is poor.
References:
1. O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional networks for biomedical image segmentation," in Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2015, pp. 234-241.
2. Zhou, Z., Siddiquee, M. M. R., Tajbakhsh, N., and Liang, J., "UNet++: A nested U-Net architecture for medical image segmentation," in Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, pp. 3-11, Springer (2018).
3. CN110782466A, publication date: 2020-02-11.
Disclosure of Invention
Accordingly, an object of the present disclosure is to provide a medical image segmentation apparatus and method that solve the technical problems of incomplete and inaccurate medical image segmentation.
A medical image segmentation apparatus comprising a salient feature acquisition unit, a salient feature encoding unit, a mask acquisition unit, an encoder, and a decoder, wherein the encoder and the decoder are connected by skip connections;
the salient feature acquisition unit extracts salient features of the image to be segmented;
the salient feature encoding unit has n layers, n ≥ 3, and encodes the salient features extracted by the salient feature acquisition unit layer by layer;
the mask acquisition unit extracts edge information of the subject to be segmented;
the encoder receives the image to be segmented and performs feature extraction on it; the encoder comprises n layers of encoding units, n ≥ 3, which extract features from the received image layer by layer; each encoding unit comprises a salient feature fusion module and a downsampling module connected in series;
the salient feature fusion module of the first encoding unit receives the image to be segmented and the salient features output by the salient feature acquisition unit, fuses the two, and outputs the processed feature map;
the decoder restores image details and finally outputs the image segmentation result; the decoder comprises n layers of decoding units, n ≥ 3;
the first encoding unit further comprises a mask fusion module; the salient feature fusion module, the mask fusion module, and the downsampling module of the first encoding unit are connected in series in that order; the mask fusion module receives the output feature map of the salient feature fusion module and the mask output by the mask acquisition unit, and fuses the two.
Optionally, the second encoding unit further includes a mask fusion module; the salient feature fusion module, the mask fusion module, and the downsampling module of the second encoding unit are connected in series in that order. The mask fusion module receives the output feature map of the second encoding unit's salient feature fusion module and the mask output by the mask acquisition unit, and fuses the two.
Optionally, the skip connection includes an adaptive feature encoding module.
Optionally, the salient feature encoding unit comprises a first residual unit.
Optionally, the salient feature fusion module includes a second residual unit and a concatenation unit.
Optionally, the first residual unit and the second residual unit each include at least one residual module; each residual module comprises a first branch and a second branch that process the received features in parallel.
Optionally, the second residual unit includes at least two residual modules connected in sequence, namely a second residual module and a third residual module.
Optionally, the first residual unit and the second residual unit each comprise an attention module, the attention module comprising an energy function.
Optionally, the mask fusion module includes a logical AND operation module and a logical OR operation module.
The present disclosure also provides a medical image segmentation method comprising the steps of:
acquiring a medical image to be segmented;
the medical image segmentation apparatus described above receives the image to be segmented, performs image segmentation, and outputs the processing result.
The present disclosure also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a computer, causes the computer to perform the above-described medical image segmentation method.
Advantageous effects
1. The salient features of the image to be segmented are fused by the salient feature fusion module, yielding the basic shape and size of the segmentation target and enabling effective discrimination when the region to be segmented is inconspicuous.
2. The salient feature fusion module adopts residual modules with an energy function, which effectively assign weights to the salient features, improving both the screening of the subject to be segmented and the filtering of noise.
3. Edge information of the subject to be segmented is extracted through the mask, effectively delimiting the lesion region and filtering noise. The mask fusion module uses a logical AND and a logical OR to extract body information and edge information respectively, then fuses them with a concatenation module. Taking the logical AND of the mask with the backbone-network features yields the shape and size of the mask and locates the subject to be segmented; the logical OR yields the edge information of the image to be segmented. Combining, through the logical OR, the feature map produced by the backbone network improves the accuracy on lightly colored edge regions.
Drawings
In order to more clearly illustrate the technical solutions of the present disclosure, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below are obviously only some embodiments of the present disclosure, and a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a schematic diagram of a medical image segmentation apparatus according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a salient feature encoding unit and salient feature fusion module according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a mask fusion module according to an embodiment of the disclosure;
FIGS. 4a0, 4a1, 4a2 are, respectively, a schematic diagram of an image to be segmented, of its salient features, and of its mask, according to an embodiment of the disclosure;
FIGS. 5b0, 5b1, 5b2 are, respectively, a schematic diagram of a skin cancer image to be segmented, of the ground-truth segmentation, and of the segmentation result of the apparatus or method, according to an embodiment of the disclosure;
FIGS. 5c0, 5c1, 5c2 are the corresponding schematic diagrams according to another embodiment of the disclosure;
FIGS. 5d0, 5d1, 5d2 are the corresponding schematic diagrams according to yet another embodiment of the disclosure;
FIGS. 6a0, 6a1, 6a2 are, respectively, a schematic diagram of a cell-slice image to be segmented, of the ground-truth segmentation, and of the segmentation result of the apparatus or method, according to an embodiment of the disclosure;
FIGS. 6b0, 6b1, 6b2 are the corresponding schematic diagrams according to another embodiment of the disclosure.
Detailed Description
The technical solutions of the embodiments provided in this specification are described below clearly and completely with reference to the accompanying drawings. The described embodiments are obviously only some, not all, of the possible embodiments. All other embodiments obtained by a person of ordinary skill in the art, based on the embodiments provided herein and without inventive effort, fall within the scope of protection of the present disclosure.
The present disclosure provides a medical image segmentation apparatus including a salient feature acquisition unit, a salient feature encoding unit, a mask acquisition unit, an encoder, and a decoder, as shown in FIG. 1. The encoder and the decoder are connected by skip connections and preferably form a U-shaped network structure. The U-shaped structure effectively captures relationships among image pixels, acquires image features well, and extracts richer semantic information from shallow to deep layers, achieving good segmentation completeness and edge similarity.
The salient feature acquisition unit receives the image to be segmented and extracts its salient features with a saliency extraction algorithm, obtaining the basic shape and size of the segmentation target and enabling effective discrimination when the region to be segmented is inconspicuous. A frequency-tuned algorithm is preferred, as it quickly extracts the main portion of the image to be segmented while filtering out noise, as sketched below.
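For illustration, a minimal sketch of frequency-tuned saliency extraction, assuming the Achanta et al. formulation (the disclosure names only "a frequency tuning algorithm"); the function name and the 5 × 5 Gaussian kernel are assumptions:

```python
import cv2
import numpy as np

def ft_saliency(image_bgr: np.ndarray) -> np.ndarray:
    """Frequency-tuned saliency: distance of each pixel from the mean image color."""
    # Gaussian smoothing suppresses high-frequency noise and fine texture (e.g., hair).
    blurred = cv2.GaussianBlur(image_bgr, (5, 5), 0)
    # Lab space, where Euclidean distance approximates perceptual color difference.
    lab = cv2.cvtColor(blurred, cv2.COLOR_BGR2LAB).astype(np.float32)
    mean_lab = lab.reshape(-1, 3).mean(axis=0)
    saliency = np.linalg.norm(lab - mean_lab, axis=2)
    return saliency / (saliency.max() + 1e-8)  # normalize to [0, 1]
```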
The multi-layer salient feature encoding unit encodes the salient features extracted by the salient feature acquisition unit layer by layer; it has n layers, n ≥ 3. Layer-by-layer encoding proceeds as follows: the first salient feature encoding unit encodes the extracted salient features and outputs its result; the second unit encodes the output of the first; and so on, until the nth unit encodes the output of the (n-1)th. Each encoding layer preferably includes a downsampling operation, so that repeated downsampling extracts the salient features of the image accurately, as sketched below.
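A minimal sketch of this layer-by-layer encoding, assuming max pooling as the downsampling step (the disclosure states only that downsampling is preferred) and treating the per-layer encoding units as pluggable modules:

```python
import torch.nn as nn

class SalientEncoder(nn.Module):
    """Applies n salient feature encoding units in sequence, downsampling each time."""
    def __init__(self, units):  # units: list of n >= 3 encoding modules
        super().__init__()
        self.units = nn.ModuleList(units)
        self.pool = nn.MaxPool2d(2)

    def forward(self, s):
        stage_outputs = []
        for unit in self.units:
            s = self.pool(unit(s))      # encode the previous output, then downsample
            stage_outputs.append(s)     # one salient map per encoder stage
        return stage_outputs
```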
The mask acquisition unit extracts edge information of the subject to be segmented; it can effectively delimit clearly visible lesion regions and filter out some noise. A binary image of the original image is generated with a conventional thresholding-based segmentation algorithm and used to initialize the mask. Suitable algorithms include maximum entropy, local thresholding, and maximum inter-class variance (OTSU), with OTSU preferred, as sketched below.
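For illustration, a minimal sketch of OTSU-based mask initialization with OpenCV; the grayscale conversion and the inverted polarity (lesions assumed darker than the surrounding skin) are assumptions:

```python
import cv2

def otsu_mask(image_bgr):
    """Initialize a binary mask with maximum inter-class variance (OTSU) thresholding."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    # OTSU automatically picks the threshold that maximizes inter-class variance.
    _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    return mask  # 255 = candidate lesion region, 0 = background
```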
The encoder receives the image to be segmented and extracts image feature information. It comprises multiple encoding units — a first encoding unit, a second encoding unit, and so on up to an nth encoding unit — connected in sequence to extract features at successive stages; n ≥ 3, preferably 4. Each encoding unit comprises a salient feature fusion module and a downsampling module connected in series. The structure of each encoding unit is described in turn below.
The salient feature fusion module of the first encoding unit receives the image to be segmented and the salient features output by the salient feature acquisition unit, fuses the two, and outputs the processed feature map.
The first encoding unit further comprises a mask fusion module, which receives the output feature map of the salient feature fusion module and the mask output by the mask acquisition unit and fuses the two.
The downsampling module of the first encoding unit receives the output feature map of the first encoding unit's mask fusion module and downsamples it; a max-pooling structure is preferred.
The salient feature fusion module, the mask fusion module, and the downsampling module of the first encoding unit are connected in series in that order.
Preferably, the second encoding unit has the same structure as the first, comprising a salient feature fusion module, a mask fusion module, and a downsampling module. Processing by at least two layers of mask fusion modules strengthens edge-information extraction and improves its accuracy. Alternatively, the second encoding unit may omit the mask fusion module.
The salient feature fusion module of the second encoding unit receives the output feature map of the first encoding unit's downsampling module and the corresponding salient features, fuses the two, and outputs the processed feature map.
The mask fusion module of the second encoding unit receives the output feature map of the second encoding unit's salient feature fusion module and the mask output by the mask acquisition unit, and fuses the two.
The downsampling module of the second encoding unit receives the output feature map of the second encoding unit's mask fusion module and downsamples it; a max-pooling structure is preferred.
The nth encoding unit, n ≥ 3, comprises a salient feature fusion module and a downsampling module.
The salient feature fusion module of the nth encoding unit receives the output feature map of the (n-1)th encoding unit's downsampling module and the corresponding salient features, fuses the two, and outputs the processed feature map.
The downsampling module of the nth encoding unit receives the output feature map of the nth encoding unit's salient feature fusion module and downsamples it; a max-pooling structure is preferred.
The decoder realizes the upsampling path, restores image details, and finally outputs the image segmentation result. It comprises multiple decoding units — a first decoding unit, a second decoding unit, and so on up to an nth decoding unit; n ≥ 3, preferably 4. Each decoding unit comprises an upsampling module, a concatenation module, and a residual module stacked in series.
The encoding units correspond one-to-one with the decoding units, connected by skip connections. This connection closes the semantic gap between encoder and decoder, recovers detail of the subject (target object) to be segmented, and improves granularity. Preferably, an adaptive feature encoding module is added in each skip connection to match the distributions of the two feature maps being joined in the encoder and decoder and to restore fine-grained image features; that is, each encoding unit is connected to its corresponding decoding unit through an adaptive feature encoding module. Its structure is a residual block with an energy function, identical to the first residual unit described below.
The upsampling module of the first decoding unit receives the output feature map of the nth encoding unit's downsampling module and upsamples it.
The concatenation module of the first decoding unit receives the output of the first decoding unit's upsampling module and the output feature map of the nth encoding unit's salient feature fusion module, and fuses the two.
The residual module of the first decoding unit receives the output of the first decoding unit's concatenation module, processes it, and outputs this decoder layer's result; the residual module comprises at least two residual blocks with energy functions.
The upsampling module of the second decoding unit receives the output of the first decoding unit and upsamples it, preferably by bilinear interpolation.
The concatenation module of the second decoding unit receives the output of the second decoding unit's upsampling module and the output feature map of the (n-1)th encoding unit's downsampling module, and fuses the two.
The residual module of the second decoding unit receives the output of the second decoding unit's concatenation module, processes it, and outputs this decoder layer's result; the residual module comprises at least two residual blocks with energy functions.
The nth decoding unit has the same structure and connections as the second decoding unit.
The upsampling module preferably performs upsampling by a bilinear interpolation algorithm, as in the sketch below.
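A minimal sketch of one decoding step as described above — bilinear upsampling, concatenation with the skip feature, then the residual module; the function and parameter names are assumptions:

```python
import torch
import torch.nn.functional as F

def decode_step(x, skip, residual_module):
    """Upsample, fuse with the skip-connection feature map, then refine."""
    x = F.interpolate(x, scale_factor=2, mode="bilinear", align_corners=False)
    x = torch.cat([x, skip], dim=1)  # concatenation module
    return residual_module(x)        # residual module with energy-function attention
```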
In some embodiments of the present disclosure:
the structure of the salient feature encoding module is shown in fig. 2, and includes a first residual unit. The first residual unit comprises a first residual module which comprises two branches and is used for respectively processing the salient features in parallel; and summing the results after the first branch and the second branch. The first branch comprises a first convolution module which convolves the received salient features. The second branch comprises a second convolution module and a first attention module which are connected in sequence, and the received significance signature is processed. The first attention module contains an energy function capable of giving different weights on different pixels and/or channels of the image to be processed, which can more effectively eliminate noise effects than if the saliency image were used directly. And carrying out summation operation on the results after the first branch and the second branch processing, and outputting the operation result of the first residual error unit.
The structure of the salient feature fusion module is shown in FIG. 2; it comprises a second residual unit and a concatenation unit. The second residual unit contains two residual modules connected in sequence: a second residual module and a third residual module. The second residual module has two branches, a third branch and a fourth branch. The third branch comprises a third convolution module, which convolves the received image to be segmented. The fourth branch comprises a fourth convolution module and a second attention module connected in sequence, which process the received salient feature map. The results of the third and fourth branches are summed and output by the second residual module. The third residual module includes a first activation module, a fifth branch, and a sixth branch. The fifth branch comprises a fifth convolution module, which convolves the output of the first activation module. The sixth branch comprises a sixth convolution module and a third attention module connected in sequence, which process the output of the first activation module. The results of the fifth and sixth branches are summed; a second activation module processes the sum and outputs the result of the second residual unit. The first and second activation modules are preferably ReLU functions. The second and third attention modules contain energy functions that effectively assign weights to the salient features, screening the subject to be segmented and filtering noise.
The concatenation unit fuses the output results of the first residual unit and the second residual unit. It comprises a concatenation layer, a convolution layer, a batch-normalization layer, and an activation layer. The concatenation layer of the salient feature fusion module fuses the channels; the convolution, batch-normalization, and activation layers then process the result in sequence, activating the obtained features and preventing overfitting. The activation function is preferably ReLU, as in the sketch below.
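A minimal sketch of the concatenation unit just described (channel-wise concatenation, then convolution, batch normalization, and ReLU); the channel widths and the 3 × 3 kernel are assumptions:

```python
import torch
import torch.nn as nn

class ConcatUnit(nn.Module):
    """Fuses the outputs of the first and second residual units along the channel axis."""
    def __init__(self, ch1: int, ch2: int, ch_out: int):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(ch1 + ch2, ch_out, 3, padding=1),  # convolution layer
            nn.BatchNorm2d(ch_out),                      # batch-normalization layer
            nn.ReLU(inplace=True),                       # activation layer
        )

    def forward(self, a, b):
        return self.fuse(torch.cat([a, b], dim=1))  # concatenation layer fuses channels
```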
In some embodiments of the present disclosure:
the structure of the mask fusion module is shown in fig. 3, and the mask fusion module comprises a first processing module, a second processing module, a third processing module, an operation unit and a splicing unit. The first processing module receives the mask information, and outputs a processing result after passing through the convolution layer, the mapping layer and the binarization layer in sequence. The second processing module receives the input feature map, and outputs a processing result after sequentially passing through the first convolution layer, the batch normalization layer, the nerve activation layer, the second convolution layer, the mapping layer and the binarization layer. The third processing module receives the input feature map, and outputs a processing result after sequentially passing through the first convolution layer, the batch normalization layer and the nerve activation layer processing layer.
The operation unit comprises a logical AND operation module, a logical OR operation module, a first multiplication operation module and a second multiplication operation module. The logical AND operation module performs logical AND operation on the output of the first processing module and the output of the second processing module, and outputs an operation result. The logical OR operation module carries out logical OR operation on the output of the first processing module and the output of the second processing module, and an operation result is output. The first multiplication operation module performs multiplication operation on the output and input feature graphs of the logical AND operation module and outputs an operation result. The second multiplication operation module performs multiplication operation on the output and input feature graphs of the logical OR operation module and outputs an operation result.
The splicing unit of the mask fusion module comprises three branches, wherein the first branch receives the output of the first multiplication operation module, and outputs a processing result after the convolution layer, the batch normalization layer and the nerve activation layer are processed in sequence. The second branch receives the output of the second multiplication operation module, processes the convolution layer, the batch normalization layer and the nerve activation layer in sequence, and outputs a processing result. And the third branch receives the output result of the third processing module of the mask fusion module. And the splicing layer of the splicing unit performs splicing operation on the output results of the first branch, the second branch and the third branch and outputs the processing result of the mask fusion module.
Preferably, the mapping layer uses a Sigmoid function to map feature values into the range 0 to 1. The binarization layer binarizes its input: values greater than 0.5 are set to 1, the segmented region, and values not greater than 0.5 are set to 0, the background. A sketch of the mask-fusion core follows below.
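A minimal sketch of the mask-fusion core under the description above; for brevity the per-branch convolution-normalization-activation stages of the concatenation unit are folded into a single output convolution, the mask is assumed already resized to the feature-map resolution, and all channel widths are assumptions:

```python
import torch
import torch.nn as nn

def binarize(x: torch.Tensor) -> torch.Tensor:
    """Mapping layer (sigmoid to (0, 1)) followed by binarization at 0.5."""
    return (torch.sigmoid(x) > 0.5).float()

class MaskFusion(nn.Module):
    def __init__(self, ch: int):
        super().__init__()
        self.mask_proj = nn.Conv2d(1, ch, 3, padding=1)  # first processing module
        self.feat_proj = nn.Sequential(                  # second processing module
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch),
            nn.ReLU(inplace=True), nn.Conv2d(ch, ch, 3, padding=1),
        )
        self.ident = nn.Sequential(                      # third processing module
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
        )
        self.out = nn.Conv2d(3 * ch, ch, 3, padding=1)   # stands in for the concat unit

    def forward(self, feat, mask):                       # mask: (B, 1, H, W)
        m = binarize(self.mask_proj(mask))               # binary mask map in {0, 1}
        f = binarize(self.feat_proj(feat))               # binary feature map in {0, 1}
        body = (m * f) * feat                            # logical AND: lesion body
        edge = torch.clamp(m + f, max=1.0) * feat        # logical OR: lesion edges
        return self.out(torch.cat([body, edge, self.ident(feat)], dim=1))
```

Note that the hard threshold in `binarize` is not differentiable; a straight-through or soft approximation would be needed during training, which this sketch leaves out.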
The mask fusion module uses a logical AND and a logical OR to extract body information and edge information respectively, then fuses them with the concatenation unit. Taking the logical AND of the mask with the backbone-network features yields the shape and size of the mask and locates the subject to be segmented; the logical OR yields the edge information of the image to be segmented.
Using only the edge of the mask generated by the OTSU method as edge information easily mis-segments lightly colored lesions that resemble normal skin. Combining, through the logical OR, the feature map produced by the backbone network improves the accuracy on these lightly colored edge regions.
The disclosure also provides a training method of the medical image segmentation device, comprising the following steps:
s10, preprocessing a data set. And processing the pictures in the training set by data enhancement methods such as random clipping, overturning, rotating, elastic transformation, grid distortion, optical distortion, gray level conversion, random brightness, contrast, channel, course exit and the like. Alternatively, the dataset employs ISIC-2018. The dataset consisted of 2594 pictures. Randomly dividing the training set and the verification set according to a preset proportion. 2336 pictures were used as training sets and 258 pictures were used as test sets. All images are resized to a predetermined size, optionally 256 x 256.
S20, preparing the medical image segmentation apparatus. The apparatus is preferably built in PyTorch 1.6; the hardware is preferably a GTX 1070 graphics card running Ubuntu, with CUDA 10.1. The apparatus is implemented as a purely convolutional network, preferably a U-Net network.
S30, setting the training parameters and inputting the training-set images into the medical image segmentation apparatus for training. Preferably, an AdamW optimizer is used with an initial learning rate of 1e-4 (0.0001), training runs for 100 epochs, and the loss function is the Dice loss. The model with the minimum validation loss is saved as the trained network, as in the sketch below.
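A minimal sketch of this training setup with the stated hyper-parameters; `model`, `train_loader`, and the `validate` helper are assumptions:

```python
import torch

def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss on sigmoid probabilities."""
    prob = torch.sigmoid(pred)
    inter = (prob * target).sum(dim=(1, 2, 3))
    denom = prob.sum(dim=(1, 2, 3)) + target.sum(dim=(1, 2, 3))
    return 1 - ((2 * inter + eps) / (denom + eps)).mean()

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)  # initial learning rate 0.0001
best_val = float("inf")
for epoch in range(100):                                    # 100 training epochs
    model.train()
    for images, masks in train_loader:
        optimizer.zero_grad()
        loss = dice_loss(model(images), masks)
        loss.backward()
        optimizer.step()
    val = validate(model)                                   # assumed validation-loss helper
    if val < best_val:                                      # keep the checkpoint with the
        best_val = val                                      # minimum validation loss
        torch.save(model.state_dict(), "best_model.pt")
```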
S40, inputting the validation-set images into the medical image segmentation apparatus for verification. The Dice similarity coefficient reaches 90.32%, the recall reaches 91.31%, and the mean intersection-over-union (mIoU) reaches 83.73%. Segmentation of cancerous skin is accomplished, and the edge regions are also segmented well. A sketch of how these metrics are computed follows below.
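For reference, a sketch of how these three metrics are commonly computed from binary masks; averaging over images (and over foreground/background classes for mIoU) is an assumption:

```python
import numpy as np

def dice_recall_iou(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8):
    """pred and gt are binary masks in {0, 1} of the same shape."""
    tp = np.logical_and(pred == 1, gt == 1).sum()
    fp = np.logical_and(pred == 1, gt == 0).sum()
    fn = np.logical_and(pred == 0, gt == 1).sum()
    dice = 2 * tp / (2 * tp + fp + fn + eps)   # Dice similarity coefficient
    recall = tp / (tp + fn + eps)              # recall
    iou = tp / (tp + fp + fn + eps)            # IoU; average across images/classes for mIoU
    return dice, recall, iou
```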
The present disclosure also provides a medical image segmentation method comprising the steps of:
and P10, acquiring an image to be segmented. The acquisition mode comprises an image acquisition unit or an image receiving unit, wherein the image acquisition unit comprises an image sensor, a camera and the like; the image receiving unit includes a device capable of receiving an image signal by a wired or wireless manner, such as an image acquisition card, an upper computer, a lower computer, a server, a workstation, and the like.
P20, receiving the image to be segmented with the medical image segmentation apparatus described above, performing image segmentation, and outputting the processing result.
Referring to FIGS. 4 and 5, results of segmenting skin cancer images with the medical image segmentation apparatus or method described above: FIGS. 4a0, 4a1, 4a2 show the image to be segmented, its salient features, and its mask, respectively. FIGS. 5b0, 5c0, 5d0 are the original images to be processed; FIGS. 5b1, 5c1, 5d1 show the corresponding ground-truth segmentations; FIGS. 5b2, 5c2, 5d2 show the results of segmentation with the apparatus or method described above. The segmentation accuracy is 89.8%. As the figures show, the image segmentation apparatus and method of the present disclosure segment diseased skin completely and accurately.
Referring to FIG. 6, results of segmenting cell-slice images with the medical image segmentation apparatus or method described above: FIGS. 6a0 and 6b0 are the original images to be processed; FIGS. 6a1 and 6b1 show the corresponding ground-truth segmentations; FIGS. 6a2 and 6b2 show the results of segmentation with the apparatus or method described above. As the figures show, the image segmentation apparatus and method of the present disclosure segment cell images completely and accurately.
The above examples demonstrate that the apparatus and method provided by the present disclosure have good generalization ability and robustness in medical image segmentation applications.
The present disclosure also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a computer, causes the computer to perform the medical image segmentation method in any of the above-described embodiments.
The present disclosure also provides a computer program product comprising instructions which, when executed by a computer, cause the computer to perform the medical image segmentation method of any of the above embodiments.
It will be appreciated that the specific examples herein are intended only to assist those skilled in the art in better understanding the present disclosure and are not intended to limit the scope of the present invention.
It should be understood that, in various embodiments of the present disclosure, the sequence number of each process does not mean that the execution sequence is sequential, and the execution sequence of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the present disclosure.
It will be appreciated that the various embodiments described in this specification may be implemented either alone or in combination, and this disclosure is not limited in this regard.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. The terminology used in the description is for the purpose of describing particular embodiments only and is not intended to limit the scope of the description. The term "and/or" as used in this specification includes any and all combinations of one or more of the associated listed items. As used in this disclosure and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
Those of ordinary skill in the art will appreciate that the elements and algorithm steps described in connection with the embodiments disclosed herein can be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present specification.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described system, apparatus and unit may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
In the several embodiments provided in this specification, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiment.
In addition, each functional unit in each embodiment of the present specification may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the present specification may be essentially or portions contributing to the prior art or portions of the technical solutions may be embodied in the form of a software product stored in a storage medium, including several instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present specification. And the aforementioned storage medium includes: a usb disk, a removable hard disk, a read-only memory (ROM), a random-access memory (RAM), a magnetic disk, or an optical disk, etc.
The foregoing is merely specific embodiments of the present disclosure, but the scope of the present disclosure is not limited thereto, and any person skilled in the art can easily think about changes or substitutions within the technical scope disclosed in the present disclosure, and should be covered by the scope of the present disclosure. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (7)

1. A medical image segmentation apparatus, comprising a salient feature acquisition unit, a salient feature encoding unit, a mask acquisition unit, an encoder, and a decoder, wherein the encoder and the decoder are connected by skip connections;
the salient feature acquisition unit extracts salient features of the image to be segmented and acquires basic shape and size information of the segmentation target;
the salient feature encoding unit has n layers, n ≥ 3, and encodes the salient features extracted by the salient feature acquisition unit layer by layer;
the salient feature encoding unit comprises a first residual unit; the first residual unit comprises at least one residual module, including a first residual module that comprises a first convolution module, a second convolution module, and an attention module;
the mask acquisition unit extracts edge information of the subject to be segmented;
the encoder receives the image to be segmented and performs feature extraction on it; the encoder comprises n layers of encoding units, n ≥ 3, which extract features from the received image layer by layer; each encoding unit comprises a salient feature fusion module and a downsampling module;
the salient feature fusion module of the first encoding unit of the encoder receives the image to be segmented and the salient features output by the salient feature encoding unit, fuses the two, and outputs the processed feature map;
the salient feature fusion module comprises a second residual unit and a concatenation unit, the second residual unit comprising at least one residual module;
the decoder restores image details and finally outputs the image segmentation result; the decoder comprises n layers of decoding units, n ≥ 3;
the first encoding unit further comprises a mask fusion module; the salient feature fusion module, the mask fusion module, and the downsampling module of the first encoding unit are connected in series in that order; the mask fusion module receives the output feature map of the salient feature fusion module and the mask output by the mask acquisition unit, and fuses the two.
2. The apparatus of claim 1, wherein the second encoding unit of the encoder further comprises a mask fusion module; the salient feature fusion module, the mask fusion module, and the downsampling module of the second encoding unit are connected in series in that order; the mask fusion module receives the output feature map of the second encoding unit's salient feature fusion module and the mask output by the mask acquisition unit, and fuses the two.
3. The apparatus of claim 1, wherein the skip connection includes an adaptive feature encoding module for matching the distributions of the two feature maps being joined in the encoder and the decoder so as to restore fine-grained image features; the adaptive feature encoding module is structured as a residual block with an energy function, identical in structure to the first residual unit.
4. The apparatus of claim 1, wherein the first residual unit and the second residual unit each comprise an attention module, the attention module comprising an energy function.
5. The apparatus of claim 1, wherein the mask fusion module comprises a logical and operation module and a logical or operation module.
6. A medical image segmentation method, comprising the steps of:
acquiring a medical image to be segmented;
the medical image segmentation apparatus according to any one of claims 1 to 5 receives the image to be segmented, performs image segmentation, and outputs the processing result.
7. A computer-readable storage medium having stored thereon a computer program which, when executed by a computer, causes the computer to perform the medical image segmentation method of claim 6.
CN202211486882.8A 2022-11-24 2022-11-24 Medical image segmentation apparatus, method and computer-readable storage medium Active CN115760810B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211486882.8A CN115760810B (en) 2022-11-24 2022-11-24 Medical image segmentation apparatus, method and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211486882.8A CN115760810B (en) 2022-11-24 2022-11-24 Medical image segmentation apparatus, method and computer-readable storage medium

Publications (2)

Publication Number Publication Date
CN115760810A CN115760810A (en) 2023-03-07
CN115760810B (en) 2024-04-12

Family

ID=85338847

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211486882.8A Active CN115760810B (en) 2022-11-24 2022-11-24 Medical image segmentation apparatus, method and computer-readable storage medium

Country Status (1)

Country Link
CN (1) CN115760810B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116665215A (en) * 2023-05-25 2023-08-29 北京航星永志软件技术有限公司 Image salient region extraction method, device, computer equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109584246A (en) * 2018-11-16 2019-04-05 成都信息工程大学 Based on the pyramidal DCM cardiac muscle diagnosis and treatment irradiation image dividing method of Analysis On Multi-scale Features
EP3567548A1 (en) * 2018-05-09 2019-11-13 Siemens Healthcare GmbH Medical image segmentation
CN114119638A (en) * 2021-12-02 2022-03-01 上海理工大学 Medical image segmentation method integrating multi-scale features and attention mechanism
CN114219790A (en) * 2021-12-17 2022-03-22 杭州电子科技大学 Steel surface defect significance detection method based on edge information
CN114565770A (en) * 2022-03-23 2022-05-31 中南大学 Image segmentation method and system based on edge auxiliary calculation and mask attention
CN114782440A (en) * 2022-06-21 2022-07-22 杭州三坛医疗科技有限公司 Medical image segmentation method and electronic equipment

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109872328B (en) * 2019-01-25 2021-05-07 腾讯科技(深圳)有限公司 Brain image segmentation method, device and storage medium
US11170504B2 (en) * 2019-05-02 2021-11-09 Keyamed Na, Inc. Method and system for intracerebral hemorrhage detection and segmentation based on a multi-task fully convolutional network
EP4033403B1 (en) * 2021-01-14 2024-08-07 Tata Consultancy Services Limited System and method for attention-based surface crack segmentation

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3567548A1 (en) * 2018-05-09 2019-11-13 Siemens Healthcare GmbH Medical image segmentation
CN109584246A (en) * 2018-11-16 2019-04-05 成都信息工程大学 Based on the pyramidal DCM cardiac muscle diagnosis and treatment irradiation image dividing method of Analysis On Multi-scale Features
CN114119638A (en) * 2021-12-02 2022-03-01 上海理工大学 Medical image segmentation method integrating multi-scale features and attention mechanism
CN114219790A (en) * 2021-12-17 2022-03-22 杭州电子科技大学 Steel surface defect significance detection method based on edge information
CN114565770A (en) * 2022-03-23 2022-05-31 中南大学 Image segmentation method and system based on edge auxiliary calculation and mask attention
CN114782440A (en) * 2022-06-21 2022-07-22 杭州三坛医疗科技有限公司 Medical image segmentation method and electronic equipment

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
Coarse-to-fine transformer for articular disc of the temporomandibular joint segmentation; Wu C. et al.; 2022 15th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI); full text *
Shape and boundary-aware multi-branch model for semi-supervised medical image segmentation; Liu X. et al.; Computers in Biology and Medicine; full text *
Research on image segmentation and enhancement technology based on attention mechanisms; Wang Tao (王涛); China Master's Theses Full-text Database; full text *
Medical image segmentation based on feature-fusion visual saliency; Wu Di (吴迪) et al.; Chinese Journal of Medical Physics (No. 06); full text *
Lane-line semantic segmentation neural network based on edge feature fusion and cross connections; Pang Yanwei (庞彦伟) et al.; Journal of Tianjin University (Science and Technology) (No. 08); full text *
Multimodal pancreas and pancreatic tumor image segmentation; Chen Shuzhe (陈姝喆); China Master's Theses Full-text Database, Information Science and Technology; full text *

Also Published As

Publication number Publication date
CN115760810A (en) 2023-03-07

Similar Documents

Publication Publication Date Title
CN112329800B (en) Salient object detection method based on global information guiding residual attention
Kim et al. Fully deep blind image quality predictor
Wang et al. Laplacian pyramid adversarial network for face completion
CN110827216A (en) Multi-generator generation countermeasure network learning method for image denoising
Öztürk et al. Effects of histopathological image pre-processing on convolutional neural networks
CN116681679A (en) Medical image small target segmentation method based on double-branch feature fusion attention
CN116433914A (en) Two-dimensional medical image segmentation method and system
US11935213B2 (en) Laparoscopic image smoke removal method based on generative adversarial network
CN112668519A (en) Abnormal face recognition living body detection method and system based on MCCAE network and Deep SVDD network
CN115760810B (en) Medical image segmentation apparatus, method and computer-readable storage medium
Shan et al. SCA-Net: A spatial and channel attention network for medical image segmentation
Kauba et al. Inverse biometrics: Generating vascular images from binary templates
CN117078505A (en) Image cartoon method based on structural line extraction
CN117726540A (en) Image denoising method for enhanced gate control converter
Du et al. FVSR-Net: An end-to-end finger vein image scattering removal network
Gao A method for face image inpainting based on generative adversarial networks
CN116912253B (en) Lung cancer pathological image classification method based on multi-scale mixed neural network
CN114202460A (en) Super-resolution high-definition reconstruction method, system and equipment facing different damage images
CN115829962B (en) Medical image segmentation device, training method, and medical image segmentation method
CN116778158B (en) Multi-tissue composition image segmentation method and system based on improved U-shaped network
CN117237188A (en) Multi-scale attention network saliency target detection method based on remote sensing image
CN116385720A (en) Breast cancer focus ultrasonic image segmentation algorithm
CN116563111A (en) Image amplification method based on depth recursion residual error channel attention
CN115410032A (en) OCTA image classification structure training method based on self-supervision learning
CN112801909B (en) Image fusion denoising method and system based on U-Net and pyramid module

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant