WO2021104058A1

WO2021104058A1 - Image segmentation method and apparatus, and terminal device

Info

Publication number: WO2021104058A1
Application number: PCT/CN2020/128846
Authority: WO
Inventors: 司伟鑫; 李才子; 王琼; 王平安
Original assignee: 中国科学院深圳先进技术研究院
Priority date: 2019-11-26
Filing date: 2020-11-13
Publication date: 2021-06-03
Also published as: CN111047602A

Abstract

An image segmentation method and apparatus, and a terminal device. The image segmentation method comprises: obtaining a target image to be segmented (101); performing convolution processing on said image by means of multiple convolutional layers (102), wherein the multiple convolutional layers are connected to one another, the convolutional layers in the first convolutional layer are sequentially connected in series, the convolutional layer of a first scale receives said image, and the convolutional layers sequentially perform convolutional down-sampling on said image; the convolutional layers in the last convolutional layer of the multiple convolutional layers are sequentially connected in series, sequentially perform convolutional up-sampling on received feature information, and output a convolution processing result by means of the convolutional layer of the first scale; and performing image segmentation according to the convolution processing result (103). Compared with traditional neural networks, the method can greatly reduce the number of parameters, thereby reducing the amount of computation and improving the performance and efficiency of the neural network.

Description

Image segmentation method, device and terminal equipment

Technical field

This application belongs to the field of image processing technology, and in particular relates to image segmentation methods, devices, and terminal equipment.

Background

Research on neural network systems for image segmentation has produced many results. In most cases, however, humans can easily extract information from an image across a range of spatial scales, capturing image details and characteristics from small regions to large regions, whereas this remains a more challenging task for computer equipment. Moreover, neural network training requires a large number of parameters to participate in the computation, and the process is cumbersome, which leads to high cost and poor accuracy when neural networks are used for image segmentation.

Summary of the invention

In order to overcome at least one problem in the related art, the embodiments of the present application provide an image segmentation method, device, and terminal device.

This application is realized through the following technical solutions:

In the first aspect, an embodiment of the present application provides an image segmentation method, including:

Obtaining a target image to be segmented;

Performing convolution processing on the target image to be segmented through a multi-layer convolutional layer; wherein the convolutional layers of the multi-layer convolutional layer are connected to each other, the convolutional layers in the first convolutional layer are sequentially connected in series, the first-scale convolutional layer receives the target image to be segmented, and each convolutional layer sequentially performs convolutional down-sampling on the target image to be segmented; the convolutional layers in the last convolutional layer of the multi-layer convolutional layer are sequentially connected in series, each convolutional layer sequentially performs convolutional up-sampling on the received feature information, and the convolution processing result is output through the first-scale convolutional layer;

Performing image segmentation according to the convolution processing result.

In a possible implementation of the first aspect, the convolutional layers of the multi-layer convolutional layer constitute a plurality of sequentially connected subtrees, each subtree includes at least two convolutional layers, and the parent node of each subtree is the aggregation of all previous subtrees.

In a possible implementation of the first aspect, each subtree includes an output convolutional layer, the output convolutional layer is connected to the other convolutional layers in the current subtree and to the parent node of the current subtree respectively, and the parent node of the current subtree and the other convolutional layers in the current subtree are sequentially connected; wherein the parent node of the current subtree is the output convolutional layer of the previous subtree.

In a possible implementation of the first aspect, between two convolutional layers that have a connection relationship, each convolutional layer of the next convolutional layer is connected to the corresponding convolutional layer in the previous convolutional layer.

In a possible implementation of the first aspect, the multi-layer convolutional layer includes convolutional layers of multiple scales, and the convolutional layer of each scale includes multiple convolutional layers;

Connecting each convolutional layer of the next convolutional layer to the corresponding convolutional layer in the previous convolutional layer comprises:

Each convolutional layer of the next convolutional layer is correspondingly connected to the convolutional layer of the same scale in the previous convolutional layer; wherein the next convolutional layer and the previous convolutional layer are two adjacent convolutional layers.

Alternatively, the convolutional layer of the current scale in the next convolutional layer is respectively connected to the convolutional layer of the current scale in the previous convolutional layer and to the convolutional layers of the scales adjacent to the current scale;

Wherein the next convolutional layer and the previous convolutional layer are two adjacent convolutional layers.

In a possible implementation of the first aspect, the convolutional upsampling or the convolutional downsampling is performed by a nearest neighbor interpolation method.

In a second aspect, an embodiment of the present application provides an image segmentation device, including:

The image acquisition module is used to acquire the target image to be segmented;

The convolution processing module is configured to perform convolution processing on the target image to be segmented through a multi-layer convolutional layer; wherein the convolutional layers of the multi-layer convolutional layer are connected to each other, the convolutional layers in the first convolutional layer are sequentially connected in series, the first-scale convolutional layer receives the target image to be segmented, and each convolutional layer sequentially performs convolutional down-sampling on the target image to be segmented; the convolutional layers in the last convolutional layer of the multi-layer convolutional layer are sequentially connected in series, each convolutional layer sequentially performs convolutional up-sampling on the received feature information, and the convolution processing result is output through the first-scale convolutional layer;

The segmentation module is configured to perform image segmentation according to the convolution processing result.

In a third aspect, an embodiment of the present application provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, the image segmentation method according to any one of the implementations of the first aspect is implemented.

In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium. The computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the image segmentation method according to any one of the implementations of the first aspect is implemented.

In a fifth aspect, embodiments of the present application provide a computer program product, which when the computer program product runs on a terminal device, causes the terminal device to execute the image segmentation method described in any one of the above-mentioned first aspects.

It is understandable that, for the beneficial effects of the second aspect to the fifth aspect described above, reference may be made to the relevant description in the first aspect described above, and details are not repeated here.

Compared with the prior art, the embodiments of this application have the following beneficial effects:

In the embodiments of the present application, the target image to be segmented is acquired, and convolution processing is then performed on it through a multi-layer convolutional layer, wherein the convolutional layers in the first convolutional layer are sequentially connected in series, the first-scale convolutional layer receives the target image to be segmented, and each convolutional layer sequentially performs convolutional down-sampling on the target image to be segmented; the convolutional layers in the last convolutional layer are sequentially connected in series, each convolutional layer sequentially performs convolutional up-sampling on the received feature information, and the convolution processing result is output through the first-scale convolutional layer; image segmentation is then performed according to the convolution processing result. Compared with a traditional neural network, this can greatly reduce the number of parameters, thereby reducing the amount of computation and improving the performance and efficiency of the neural network.

Further, the convolutional layers of the multi-layer convolutional layer constitute a plurality of sequentially connected subtrees, each subtree includes at least two convolutional layers, and the parent node of each subtree is the aggregation of all previous subtrees. In this way, the features extracted by different layers of the neural network are fused. If each feature pyramid is regarded as a single feature extractor, a feature pyramid closer to the input can be called a shallow pyramid, and one farther from the input can be called a deep pyramid. The shallow pyramid has advantages in low-level feature extraction, while the features in the deep pyramid are more semantic, high-level features. Combining the two levels efficiently achieves fusion of deep and shallow features, makes effective use of the information in feature pyramids at different levels, and can further improve the accuracy of image segmentation.

It should be understood that the above general description and the following detailed description are only exemplary and explanatory, and cannot limit this specification.

Description of the drawings

In order to describe the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application; those of ordinary skill in the art may obtain other drawings based on these drawings without creative labor.

FIG. 1 is a schematic diagram of an application environment of an image segmentation method provided by an embodiment of the present application;

FIG. 2 is a schematic flowchart of an image segmentation method provided by an embodiment of the present application;

FIG. 3 is a schematic diagram of a parallel mechanism between convolutional layers of a neural network provided by an embodiment of the present application;

FIG. 4 is a schematic structural diagram of a multi-layer convolutional layer provided by an embodiment of the present application;

FIG. 5 is a schematic structural diagram of a multi-layer convolutional layer provided by an embodiment of the present application;

FIG. 6 is a schematic structural diagram of a multi-layer convolutional layer provided by an embodiment of the present application;

FIG. 7 is a schematic diagram of connections between intermediate convolutional layers of a multi-layer convolutional layer provided by an embodiment of the present application;

FIG. 8 is a schematic structural diagram of an image segmentation device provided by an embodiment of the present application;

FIG. 9 is a schematic structural diagram of a terminal device provided by an embodiment of the present application;

FIG. 10 is a schematic structural diagram of a computer to which the image segmentation method provided by an embodiment of the present application is applicable.

Detailed description

In the following description, for the purpose of illustration rather than limitation, specific details such as a specific system structure and technology are proposed for a thorough understanding of the embodiments of the present application. However, it should be clear to those skilled in the art that the present application can also be implemented in other embodiments without these specific details. In other cases, detailed descriptions of well-known systems, devices, circuits, and methods are omitted to avoid unnecessary details from obstructing the description of this application.

It should be understood that, when used in the specification and appended claims of this application, the term "comprising" indicates the existence of the described features, integers, steps, operations, elements and/or components, but does not exclude the existence or addition of one or more other features, integers, steps, operations, elements, components, and/or collections thereof.

It should also be understood that the term "and/or" used in the specification of this application and the appended claims refers to any combination of one or more of the items listed in association and all possible combinations, and includes these combinations.

As used in the description of this application and the appended claims, the term "if" can be construed as "when" or "once" or "in response to determining" or "in response to detecting", depending on the context. Similarly, the phrase "if determined" or "if [described condition or event] is detected" can be construed as meaning "once determined" or "in response to determining" or "once [described condition or event] is detected" or "in response to detecting [described condition or event]", depending on the context.

In addition, in the description of the specification of this application and the appended claims, the terms "first", "second", "third", etc. are only used to distinguish the description, and cannot be understood as indicating or implying relative importance.

A reference to "one embodiment" or "some embodiments" in the specification of this application means that one or more embodiments of this application include a specific feature, structure, or characteristic described in combination with that embodiment. Therefore, the phrases "in one embodiment", "in some embodiments", "in some other embodiments", etc. appearing in different places in this specification do not necessarily all refer to the same embodiment, but mean "one or more but not all embodiments" unless specifically emphasized otherwise. The terms "including", "comprising", "having" and their variations all mean "including but not limited to", unless specifically emphasized otherwise.

Improving the multi-scale representation ability of neural networks is an important way to improve multi-tissue segmentation of cardiac MRI images. At present, image pyramids are widely used in computer vision tasks in various forms. Although research on neural network systems for image segmentation has produced many results, in most cases the segmentation network has too many parameters: even though a convolutional neural network is itself built on parameter sharing, a complicated network structure still brings a large number of parameters into the gradient optimization process, which generates a huge amount of computation. Moreover, although the multi-scale fusion pyramid network has shown good segmentation ability in cardiac MRI image segmentation, the long connection between the input-layer and output-layer networks results in too little information interaction between the shallow pyramid and the deep pyramid, which means that the multi-scale feature representation of the network is insufficient.

Based on the above problems, the embodiments of the application provide an image segmentation method, device, and terminal device, in which a neural network with a multi-layer convolutional layer structure is designed. The convolutional layers in the first convolutional layer sequentially perform convolutional down-sampling on the target image to be segmented, the convolutional layers in the last convolutional layer sequentially perform convolutional up-sampling on the received feature information, and the intermediate convolutional layers are connected to each other. Compared with a traditional neural network, this can greatly reduce the number of parameters, thereby reducing the amount of computation and improving the performance and efficiency of the neural network.

Specifically, the target image to be segmented can be obtained, and convolution processing is then performed on it through the multi-layer convolutional layer, wherein the convolutional layers of the multi-layer convolutional layer are connected to each other, the convolutional layers in the first convolutional layer are sequentially connected in series, the first-scale convolutional layer receives the target image to be segmented, and each convolutional layer sequentially performs convolutional down-sampling on the target image to be segmented; the convolutional layers in the last convolutional layer are sequentially connected in series, each convolutional layer sequentially performs convolutional up-sampling on the received feature information, and the convolution processing result is output through the first-scale convolutional layer; finally, image segmentation is performed according to the convolution processing result.

For example, the embodiments of the present application can be applied to the exemplary scene shown in FIG. 1. In this scene, the magnetic resonance scanning device 10 scans a certain part of the human body to obtain a scanned image of that part for segmentation, for example a heart image, and sends the scanned image to the image segmentation device 20. After the image segmentation device 20 obtains the scanned image, it uses the scanned image as the target image to be segmented, performs convolution processing on the target image to be segmented through a multi-layer convolutional layer to obtain a convolution processing result, and then performs image segmentation according to the convolution processing result.

It should be noted that the above application scenario is used as an exemplary description, and it is not used to limit the application scenario during the implementation of the embodiment of the present application. In fact, the embodiment of the present application may also be applied to other application scenarios. For example, in some other exemplary application scenarios, medical personnel may also select the target image to be segmented and send it to the image segmentation device.

In order to enable those skilled in the art to better understand the solutions of the present invention, the technical solutions in the embodiments of the present application will be described clearly and completely with reference to FIG. 1. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of the present invention.

FIG. 2 is a schematic flowchart of an image segmentation method provided by an embodiment of the present application. Referring to FIG. 1, the image segmentation method is described in detail as follows:

In step 101, a target image to be segmented is acquired.

The above-mentioned target image to be segmented may be an image obtained by performing magnetic resonance imaging MRI on a certain part of the human body, for example, an MRI image of the human heart.

Optionally, after the target image to be segmented is obtained, it may be preprocessed, and the preprocessed image is then used in the subsequent steps. Exemplarily, the target image to be segmented may include an image of the heart and have a size of m*n. The heart region may be identified in the target image and extracted to obtain an image with a size of 128*128 for subsequent processing.
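As a rough illustration of the optional preprocessing, the following sketch crops a 128*128 window around a given center from an m*n image. The function name `crop_roi`, the center coordinates, the image size and the zero padding at the borders are all assumptions made for illustration; the application does not specify how the heart region is identified or extracted.

```python
import numpy as np

def crop_roi(image, center, size=128):
    """Crop a size*size window centered at `center` (row, col), zero-padding at borders."""
    half = size // 2
    # Pad by half a window so that a crop near the border still has the full size.
    padded = np.pad(image, half, mode="constant")
    r, c = center  # coordinates in the unpadded image
    # Pixel (r, c) sits at (r + half, c + half) after padding,
    # so the window that is centered on it starts at index (r, c).
    return padded[r:r + size, c:c + size]

# Stand-in for an m*n scan; the 256*216 size is hypothetical.
image = np.arange(256 * 216, dtype=np.float32).reshape(256, 216)
patch = crop_roi(image, center=(120, 100))
print(patch.shape)
```

In practice the center would come from whatever localization step identifies the heart; here it is simply passed in by hand.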

In step 102, convolution processing is performed on the target image to be segmented through a multi-layer convolution layer.

Wherein the convolutional layers of the above-mentioned multi-layer convolutional layer are connected to each other, the convolutional layers in the first convolutional layer are sequentially connected in series, the convolutional layer of the first scale receives the target image to be segmented, and each convolutional layer sequentially performs convolutional down-sampling on the above-mentioned target image to be segmented; the convolutional layers in the last convolutional layer of the above-mentioned multi-layer convolutional layer are sequentially connected in series, each convolutional layer sequentially performs convolutional up-sampling on the received feature information, and the convolution processing result is output through the first-scale convolutional layer.

In order to facilitate understanding of the above image segmentation method, the parallel multi-scale cross fusion pyramid is introduced first.

The traditional U-Net structure extracts features at different levels of the image by successively convolving and pooling the input image, and then gradually restores the deep semantic feature maps to the original image size through successive deconvolution operations. In the process of restoring the size, the cross-connection operation plays a very important role in enhancing the feature expression ability of the convolutional layers on the restoration path: the convolutional layers with good contour characteristics on the contraction path and the convolutional layers with strong semantics on the expansion path are merged with each other, and pixel-by-pixel classification is finally completed. However, the U-Net structure still has shortcomings for the segmentation of cardiac MRI images. U-Net's cross-connection mechanism applies only to convolutional layers of the same scale, and this feature fusion ability is not sufficient. At the base and top of the segmentation target, especially the top part, the left and right ventricles and myocardium are usually compared. Therefore, the segmentation ability of U-Net at these positions still has obvious defects. In addition, as U-Net goes from a high-resolution convolutional layer to the next low-resolution convolutional layer, the number of feature maps doubles, resulting in a large number of model parameters and a certain burden on computing resources.

In order to avoid the shortcomings of U-Net's contraction and expansion structure in multi-scale information fusion, to reduce the number of parameters of the segmentation model, and to further improve the multi-tissue segmentation capability for images, such as cardiac MRI images, a parallel cross-scale neural network structure is proposed in the embodiments of this application.
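The effect of doubling the number of feature maps on the parameter count can be illustrated with simple arithmetic. The sketch below counts the weights and biases of one 3*3 convolution per scale for two channel layouts. The channel widths (starting at 64), the single convolution per scale and the constant-width layout are assumptions chosen only for comparison; they are not taken from this application and the constant-width layout is not a statement about the widths used in the proposed network.

```python
def conv3x3_params(c_in, c_out):
    """Weights plus biases of one 3*3 convolution."""
    return 3 * 3 * c_in * c_out + c_out

def encoder_params(channels, in_channels=1):
    """Sum the parameters of one 3*3 convolution per scale along a contracting path."""
    total, prev = 0, in_channels
    for c in channels:
        total += conv3x3_params(prev, c)
        prev = c
    return total

doubling = [64, 128, 256, 512, 1024]  # feature maps double at each lower resolution (assumed widths)
constant = [64, 64, 64, 64, 64]       # same number of feature maps at every scale (assumed widths)

print(encoder_params(doubling), encoder_params(constant))
```

Because the parameter count of a convolution grows with the product of its input and output channels, the last, widest scales dominate the total in the doubling layout.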

The core of the parallel cross-scale neural network structure is the mutual fusion of the convolutional layer features at each scale, which enhances the multi-scale information exchange between the convolutional layers of the neural network. A parallel multi-scale fusion unit is shown in formula (1):

$$P_{i+1}^{n} = F\left(P_{i}^{n-1} \circ P_{i}^{n} \circ P_{i}^{n+1}\right) \qquad (1)$$

In a parallel multi-scale fusion unit, the output convolutional layer $P_{i+1}^{n}$ represents the nth convolutional layer of the (i+1)th layer, and the input convolutional layers $P_{i}^{n-1}$, $P_{i}^{n}$ and $P_{i}^{n+1}$ respectively represent the nth convolutional layer of the i-th layer and the convolutional layers of its two adjacent scales; $\circ$ represents the splicing operation, and the function $F(\cdot)$ represents a set of operations, such as convolution, batch normalization, and activation. The three input convolutional layers contain features of different scales: the large-scale feature map is down-sampled to the size of the output convolutional layer $P_{i+1}^{n}$, and the small-scale feature map is up-sampled to the size of the output convolutional layer $P_{i+1}^{n}$. The two resampled convolutional layers are then spliced, along the channel axis, with the input convolutional layer that has the same scale as the output, and the spliced convolutional layer is subjected to a 3*3 convolution, a batch normalization layer and a ReLU activation operation. The up-sampling and down-sampling operations use nearest neighbor interpolation and max pooling, respectively. After multi-scale feature fusion, the output convolutional layer is equivalent to having extracted feature information of different scales. It should be noted that when $P_{i}^{n}$ has no larger-scale adjacent convolutional layer or no smaller-scale adjacent convolutional layer, there are only two input convolutional layers.
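The fusion unit described above can be sketched in plain NumPy as follows. This is a minimal illustration under stated assumptions, not the implementation of this application: feature maps are stored as (channels, height, width) arrays, adjacent scales differ by a factor of 2, the weights are random, and batch normalization is approximated by a per-channel normalization of a single sample without learned scale and shift. All function names are hypothetical.

```python
import numpy as np

def max_pool2x2(x):
    """Down-sample a (C, H, W) map by 2 with max pooling (H and W assumed even)."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))

def upsample_nearest2x(x):
    """Up-sample a (C, H, W) map by 2 with nearest neighbor interpolation."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def conv3x3_bn_relu(x, weight, eps=1e-5):
    """Stand-in for F(.): 3*3 convolution with zero padding, normalization, ReLU."""
    _, h, w = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((weight.shape[0], h, w))
    for dy in range(3):
        for dx in range(3):
            # Add the contribution of kernel tap (dy, dx) to every output channel.
            out += np.einsum("oc,chw->ohw", weight[:, :, dy, dx],
                             padded[:, dy:dy + h, dx:dx + w])
    mean = out.mean(axis=(1, 2), keepdims=True)
    var = out.var(axis=(1, 2), keepdims=True)
    out = (out - mean) / np.sqrt(var + eps)  # simplified batch normalization
    return np.maximum(out, 0.0)              # ReLU

def fusion_unit(same, weight, larger=None, smaller=None):
    """Fuse the same-scale input with whichever adjacent-scale inputs exist."""
    parts = []
    if larger is not None:
        parts.append(max_pool2x2(larger))          # large-scale map is down-sampled
    parts.append(same)
    if smaller is not None:
        parts.append(upsample_nearest2x(smaller))  # small-scale map is up-sampled
    stacked = np.concatenate(parts, axis=0)        # splice along the channel axis
    return conv3x3_bn_relu(stacked, weight)

rng = np.random.default_rng(0)
larger = rng.standard_normal((4, 32, 32))
same = rng.standard_normal((4, 16, 16))
smaller = rng.standard_normal((4, 8, 8))
weight = rng.standard_normal((4, 12, 3, 3))  # 12 input channels = 3 inputs * 4 channels
fused = fusion_unit(same, weight, larger, smaller)
print(fused.shape)
```

At the highest or lowest scale one neighbor is missing, so only two inputs are spliced and the convolution takes 8 input channels instead of 12, for example `fusion_unit(same, weight[:, :8], smaller=smaller)`.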

A parallel cross-scale multi-scale pyramid unit comprises an input feature pyramid $P_{i}$ and an output feature pyramid $P_{i+1}$. Figure 3 shows the connection relationship between $P_{i}$ and $P_{i+1}$: at each scale of the two pyramids there is a parallel cross-scale fusion unit, which means that, for the output pyramid, the convolutional layer at each scale can receive information from the convolutional layer of the corresponding scale in the input pyramid and from all of its adjacent layers.

The above-mentioned parallel multi-scale fusion pyramid has three advantages for semantic segmentation: 1) compared with the traditional encoding and decoding structure, the parallel cross-scale fusion pyramid can greatly enhance the multi-scale feature fusion ability from the shallow feature layers to the deep feature layers; 2) repeated application of the feature pyramid can exponentially increase the receptive field of the neural network, which is more conducive to pixel-level classification; 3) convolutional layers of the same resolution can interact directly, which can reduce the information loss caused by pooling and up-sampling.

In the embodiments of the present application, a cascaded parallel multi-scale fusion pyramid network (PCP-Net) is proposed by further simplifying the multi-scale fusion structure. The structure of PCP-Net is shown in Figure 4. In order to show the network structure concisely, multiple identical parallel multi-scale connection pyramids are omitted in the figure. The overall structure is composed of several cascaded feature pyramids, and each pyramid contains multiple (for example, 5) convolutional layers of different scales. For the first pyramid on the left, the 5 convolutional layers of different scales are obtained by sequentially convolving and down-sampling the input image; for the last pyramid, the 5 convolutional layers of different scales are obtained by up-sampling from low resolution to high resolution; the pyramids in the middle are connected to each other, for example using the parallel multi-scale fusion rule.

In some embodiments, between two convolutional layers with a connection relationship, each convolutional layer of the next convolutional layer is connected to the corresponding convolutional layer of the previous convolutional layer.

Referring to FIG. 4, the multi-layer convolutional layer may include m layers of convolutional layers, specifically the first convolutional layer, the second convolutional layer, ..., the (i-1)th convolutional layer, the i-th convolutional layer, the (i+1)th convolutional layer, ..., the (m-1)th convolutional layer, and the mth convolutional layer.

The convolutional layers in the first convolutional layer are connected in series in sequence, the uppermost convolutional layer receives the target image to be segmented, and each convolutional layer sequentially performs convolutional down-sampling on the target image to be segmented; the convolutional layers in the mth convolutional layer are connected in series in sequence, each convolutional layer sequentially performs convolutional up-sampling on the received feature information, and the convolution processing result is output through the uppermost convolutional layer. The above-mentioned convolutional up-sampling or convolutional down-sampling can be performed by the nearest neighbor interpolation method.
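The serial down-sampling of the first layer and the serial up-sampling of the mth layer can be sketched as follows, using nearest neighbor resampling only. The convolution applied at each scale is deliberately omitted, and the factor of 2 between scales, the 5 scales, and the convention of keeping the top-left pixel of each 2*2 cell when down-sampling are assumptions made for illustration.

```python
import numpy as np

def downsample_nearest(x, factor=2):
    """Nearest neighbor down-sampling of an (H, W) map: keep one pixel per cell."""
    return x[::factor, ::factor]

def upsample_nearest(x, factor=2):
    """Nearest neighbor up-sampling of an (H, W) map: repeat each pixel along both axes."""
    return x.repeat(factor, axis=0).repeat(factor, axis=1)

def first_pyramid(image, num_scales=5):
    """Scale 1 receives the image; each following scale down-samples the previous one."""
    scales = [image]
    for _ in range(num_scales - 1):
        scales.append(downsample_nearest(scales[-1]))
    return scales

def last_pyramid(lowest, num_scales=5):
    """Start from the lowest resolution and up-sample step by step toward scale 1."""
    scales = [lowest]
    for _ in range(num_scales - 1):
        scales.append(upsample_nearest(scales[-1]))
    return scales[::-1]  # reorder so that index 0 is scale 1 (highest resolution)

image = np.arange(128 * 128, dtype=np.float32).reshape(128, 128)
down = first_pyramid(image)
up = last_pyramid(down[-1])
print([s.shape for s in down])
print([s.shape for s in up])
```

With a 128*128 input the five scales have side lengths 128, 64, 32, 16 and 8, and the output of the last layer is read from index 0, the scale that matches the input resolution.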

In this embodiment, the multi-layer convolutional layer includes convolutional layers of multiple scales, and the convolutional layer of each scale includes multiple convolutional layers. In a possible implementation, connecting each convolutional layer of the next convolutional layer to the corresponding convolutional layer in the previous convolutional layer can be: the convolutional layer of the current scale in the next convolutional layer is respectively connected to the convolutional layer of the current scale in the previous convolutional layer and to the convolutional layers of the scales adjacent to the current scale.

For example, referring to Figure 4, the convolutional layer corresponding to scale 1 in the i-th convolutional layer can be connected to the convolutional layers corresponding to scale 1 and scale 2 in the (i-1)th convolutional layer; the convolutional layer corresponding to scale 2 in the i-th convolutional layer can be connected to the convolutional layers corresponding to scale 1, scale 2 and scale 3 in the (i-1)th convolutional layer; the convolutional layer corresponding to scale 3 in the i-th convolutional layer can be connected to the convolutional layers corresponding to scale 2, scale 3 and scale 4 in the (i-1)th convolutional layer; the convolutional layer corresponding to scale 4 in the i-th convolutional layer can be connected to the convolutional layers corresponding to scale 3, scale 4 and scale 5 in the (i-1)th convolutional layer; and the convolutional layer corresponding to scale 5 in the i-th convolutional layer can be connected to the convolutional layers corresponding to scale 4 and scale 5 in the (i-1)th convolutional layer.

It should be noted that a batch normalization layer and a ReLU activation layer are applied after each convolutional layer in Figure 4. For brevity, the batch normalization layer and the ReLU activation layer are implicitly included in the convolution operation in the figure.
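The connection rule enumerated above reduces to "same scale plus any adjacent scale that exists". A minimal sketch, with the hypothetical helper `input_scales` and 1-based scale numbers as in the text:

```python
def input_scales(scale, num_scales=5):
    """Scales of the (i-1)th layer that feed `scale` of the i-th layer (1-based)."""
    candidates = (scale - 1, scale, scale + 1)
    # Scales outside 1..num_scales do not exist, so the border scales get two inputs.
    return [s for s in candidates if 1 <= s <= num_scales]

connections = {s: input_scales(s) for s in range(1, 6)}
for scale, sources in connections.items():
    print(f"scale {scale} <- scales {sources}")
```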

Figures 5 and 6 are schematic diagrams of the structure of the multi-layer convolutional layer provided by the embodiments of this application. See Figures 5 and 6, on the basis of the above-mentioned PCP-Net, a hierarchical aggregation parallel multi-scale fusion pyramid network (APCP-Net) is proposed. . Each convolutional layer of the multi-layer convolutional layer may form a plurality of sequentially connected subtrees, each subtree includes at least two convolutional layers, and the parent node of each subtree is the aggregation of all previous subtrees.

Exemplarily, the multi-layer convolutional layer includes 9 convolutional layers, denoted P₁, P₂, ..., P₉, where P₁, P₂ and P₃ form a subtree, and P₃ aggregates the features of the two nodes P₁ and P₂; P₃, P₄, P₅ and P₆ form a subtree, and P₆ aggregates the features of P₃, P₄ and P₅; P₆, P₇, P₈ and P₉ form a subtree, and P₉ aggregates the features of P₆, P₇ and P₈. That is to say, hierarchical aggregation fuses the convolutional layers of the different levels, which achieves better integration of shallow and deep features; and because it avoids dense connections, which require too many concatenation operations on the convolutional layers, the fusion of features from different levels is more efficient.

Exemplarily, each subtree includes an output convolutional layer. The output convolutional layer is connected to each of the other convolutional layers in the current subtree and to the parent node of the current subtree, and the parent node of the current subtree and the other convolutional layers in the current subtree are connected in sequence; the parent node of the current subtree is the output convolutional layer of the previous subtree.

For example, referring to Figures 5 and 6, for the subtree composed of P₁, P₂ and P₃, P₃ is the output convolutional layer of the subtree, and the parent node of this first subtree can be regarded as P₁, which receives the target image to be segmented. The input of P₃ is connected to P₁ and P₂ respectively, and P₁ and P₂ are connected to each other; P₃ and P₁ are two non-adjacent convolutional layers, and the connection between them is a cross-layer connection (shown by the thin solid line with an arrow in Figure 6).

For the subtree composed of P₃, P₄, P₅ and P₆, P₆ is the output convolutional layer of the subtree and P₃ is the parent node of the subtree. P₆ is connected to P₃, P₄ and P₅ respectively, and P₃, P₄ and P₅ are connected in turn; P₆ and P₃ are two non-adjacent convolutional layers, and the connection between them is a cross-layer connection (shown by the thin solid line with an arrow in Figure 6). P₆ and P₄ are also two non-adjacent convolutional layers, and the connection between them is likewise a cross-layer connection (shown by the thin solid line with an arrow in Figure 6).

For the subtree composed of P₆, P₇, P₈ and P₉, P₉ is the output convolutional layer of the subtree and P₆ is the parent node of the subtree. P₉ is connected to P₆, P₇ and P₈ respectively, and P₆, P₇ and P₈ are connected in turn; P₉ and P₆ are two non-adjacent convolutional layers, and the connection between them is a cross-layer connection (shown by the thin solid line with an arrow in Figure 6). P₉ and P₇ are also two non-adjacent convolutional layers, and the connection between them is likewise a cross-layer connection (shown by the thin solid line with an arrow in Figure 6).
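The subtree wiring described for P₁ to P₉ can be written out as a small edge list. The sketch below is illustrative only: the function `subtree_edges`, the `SUBTREES` constant and the integer labels standing in for P₁ to P₉ are assumptions introduced here, and the code models connectivity, not the convolution operations themselves.

```python
def subtree_edges(subtree: list) -> list:
    """Return (source, destination) edges inside one subtree.

    `subtree` is ordered as [parent, ..., output]. Consecutive layers are
    connected in sequence, and every layer that is not adjacent to the
    output layer also feeds it through a cross-layer connection.
    """
    *inner, output = subtree
    sequential = list(zip(subtree, subtree[1:]))           # e.g. 3->4, 4->5, 5->6
    cross_layer = [(node, output) for node in inner[:-1]]  # e.g. 3->6, 4->6
    return sequential + cross_layer


# The three subtrees of the nine-layer example; each parent is the output
# layer of the previous subtree.
SUBTREES = [[1, 2, 3], [3, 4, 5, 6], [6, 7, 8, 9]]

for tree in SUBTREES:
    print(tree, "->", subtree_edges(tree))
```

For the second subtree this yields the sequential edges P₃→P₄, P₄→P₅, P₅→P₆ and the cross-layer edges P₃→P₆ and P₄→P₆, which agrees with the connections described above.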

In a possible implementation, connecting each convolutional layer of the next convolutional layer to the corresponding convolutional layer in the previous convolutional layer may be: each convolutional layer of the next convolutional layer is connected to the convolutional layer of the same scale in the previous convolutional layer, where the next convolutional layer and the previous convolutional layer are two non-adjacent convolutional layers.

For example, the 9th convolutional layer P₉ and the 7th convolutional layer P₇ are two non-adjacent convolutional layers that have a connection relationship. In this case, the convolutional layer corresponding to scale 1 in P₉ can be connected to the convolutional layer corresponding to scale 1 in P₇; the convolutional layer corresponding to scale 2 in P₉ can be connected to the convolutional layer corresponding to scale 2 in P₇; the convolutional layer corresponding to scale 3 in P₉ can be connected to the convolutional layer corresponding to scale 3 in P₇; the convolutional layer corresponding to scale 4 in P₉ can be connected to the convolutional layer corresponding to scale 4 in P₇; and the convolutional layer corresponding to scale 5 in P₉ can be connected to the convolutional layer corresponding to scale 5 in P₇.

It should be noted that, for any other two non-adjacent convolutional layers that have a connection relationship, reference may be made to the connection relationship described above between the respective convolutional layers of the 9th convolutional layer P₉ and the 7th convolutional layer P₇, which will not be repeated here.

In another possible implementation, connecting each convolutional layer of the next convolutional layer to the corresponding convolutional layer in the previous convolutional layer may be: the convolutional layer of the current scale in the next convolutional layer is connected to the convolutional layer of the current scale in the previous convolutional layer and to the convolutional layers of the scales adjacent to the current scale, where the next convolutional layer and the previous convolutional layer are two adjacent convolutional layers.

Referring to Figure 7, the i-th convolutional layer Pᵢ and the (i+1)-th convolutional layer Pᵢ₊₁ are two adjacent convolutional layers that have a connection relationship, where 1 ≤ i ≤ m-1 and m is the number of convolutional layers in the multi-layer convolutional layer. In this case, the convolutional layer corresponding to scale 1 in Pᵢ₊₁ can be connected to the convolutional layers corresponding to scale 1 and scale 2 in Pᵢ; the convolutional layer corresponding to scale 2 in Pᵢ₊₁ can be connected to the convolutional layers corresponding to scale 1, scale 2 and scale 3 in Pᵢ; the convolutional layer corresponding to scale 3 in Pᵢ₊₁ can be connected to the convolutional layers corresponding to scale 2, scale 3 and scale 4 in Pᵢ; the convolutional layer corresponding to scale 4 in Pᵢ₊₁ can be connected to the convolutional layers corresponding to scale 3, scale 4 and scale 5 in Pᵢ; and the convolutional layer corresponding to scale 5 in Pᵢ₊₁ can be connected to the convolutional layers corresponding to scale 4 and scale 5 in Pᵢ.

It should be noted that only the convolutional layers are shown in Figures 5 to 7. A batch normalization layer and a ReLU activation layer are applied after each convolutional layer; for simplicity, they are implicitly included in the convolution operation.

The cross-layer connections serve to forward shallow contour information and do not require multi-scale fusion; moreover, too many multi-scale fusions would greatly increase the number of parameters, which would undermine the simplification of the model.
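To make the difference between the two connection modes concrete, the number of per-scale links between one pair of convolutional layers can be counted. This is a rough sketch under the assumption that the link count is a reasonable proxy for fusion cost; the function names are hypothetical, and the actual parameter counts depend on channel widths and kernel sizes that are not modeled here.

```python
def adjacent_mode_links(num_scales: int) -> int:
    """Links when each scale fuses the same scale and its neighboring scales."""
    total = 0
    for scale in range(1, num_scales + 1):
        neighbors = (scale - 1, scale, scale + 1)
        total += sum(1 for s in neighbors if 1 <= s <= num_scales)
    return total


def same_scale_links(num_scales: int) -> int:
    """Links when each scale connects only to the same scale (cross-layer mode)."""
    return num_scales


print(adjacent_mode_links(5))  # 13 links for multi-scale fusion over five scales
print(same_scale_links(5))     # 5 links for same-scale cross-layer connections
```

With five scales, the multi-scale fusion between adjacent layers uses 13 links, whereas a same-scale cross-layer connection uses only 5.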

Of course, in other embodiments, two convolutional layers joined by a cross-layer connection may also use the connection mode used between the i-th convolutional layer Pᵢ and the (i+1)-th convolutional layer Pᵢ₊₁; this embodiment does not limit the connection mode.

In step 103, image segmentation is performed according to the convolution processing result.

Wherein, after convolution processing is performed on the target image to be segmented through the above-mentioned multi-layer convolution layer, the image segmentation is performed according to the convolution processing result to obtain the image segmentation result.

In the above image segmentation method, the target image to be segmented is acquired, and convolution processing is then performed on it through a multi-layer convolutional layer. The convolutional layers in the first convolutional layer are connected in series, the convolutional layer of the first scale receives the target image to be segmented, and each convolutional layer in turn performs convolutional down-sampling on the target image to be segmented. The convolutional layers in the last convolutional layer are connected in series, each convolutional layer in turn performs convolutional up-sampling on the received feature information, and the convolution processing result is output through the convolutional layer of the first scale. Image segmentation is then performed according to the convolution processing result. Compared with a traditional neural network, the number of parameters can be greatly reduced, thereby reducing the amount of calculation and improving the performance and efficiency of the neural network.

The following experiments are performed on the PCP-Net structure and APCP-Net structure in the above image segmentation method to verify the effectiveness of the two structures.

Among them, the Dice similarity coefficient and Hausdorff distance on the verification data set are shown in Table 1.

Table 1 Experimental results of PCP-Net structure

It can be seen from the experimental results that the PCP-Net structure exceeds the benchmark model in the segmentation results for the left ventricle and the myocardium, while the result for the right ventricle remains unchanged. That is, the PCP-Net network improves segmentation accuracy for the left ventricle and the myocardium but has no effect on the right ventricle. In fact, the morphology of the right ventricle differs greatly between diastole and systole, and on some slices the target is absent altogether. The target may therefore disappear during feature down-sampling and successive convolutions, so the linearly connected multi-level pyramid network still has shortcomings in segmenting the right ventricle. In terms of computational complexity, the model has 0.278 million parameters, which is nearly 90% fewer than the benchmark model. From the verification results, it can be concluded that the PCP-Net structure not only improves segmentation accuracy but also greatly reduces model complexity.
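For reference, the two evaluation metrics named above can be computed as in the following minimal sketch. It is not the evaluation code used for the experiments: it assumes that masks are given as non-empty sets of (row, column) pixel coordinates, that distances are measured in pixels over all mask pixels, and that two empty masks have a Dice coefficient of 1.0. The function names are hypothetical.

```python
import math


def dice_coefficient(pred: set, truth: set) -> float:
    """Dice similarity coefficient 2|A∩B| / (|A| + |B|) of two pixel sets."""
    if not pred and not truth:
        return 1.0  # assumed convention for two empty masks
    return 2 * len(pred & truth) / (len(pred) + len(truth))


def hausdorff_distance(a: set, b: set) -> float:
    """Symmetric Hausdorff distance between two non-empty point sets."""
    def directed(src, dst):
        # Largest distance from a point in src to its nearest point in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(directed(a, b), directed(b, a))


# Toy masks that differ in a single pixel.
pred = {(0, 0), (0, 1), (1, 0)}
truth = {(0, 0), (0, 1), (1, 1)}
print(dice_coefficient(pred, truth))
print(hausdorff_distance(pred, truth))
```

A higher Dice coefficient and a lower Hausdorff distance both indicate closer agreement between the predicted and reference segmentations.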

Compared with the traditional U-Net network, the PCP-Net network structure has the following advantages:

1. Different pyramids are connected in parallel from the high-resolution to the low-resolution convolutional layers. Except in the input-layer and output-layer pyramids, the convolutional layers within a pyramid do not communicate with one another. This design plays an important role in hierarchical image feature extraction: the convolutional layer at each resolution focuses on extracting features at its own level, so the hierarchical feature extraction capability of the pyramid is used effectively while the complexity of the model is kept as low as possible, since the more feature maps are fused, the more the number of parameters grows;

2. Multiple feature pyramids are connected in parallel, so that low-resolution deep semantic features can be extracted efficiently and, because multiple high-resolution feature representations are maintained from input to output, the contours of the segmentation targets can be refined. This is very important for MRI, especially given the low resolution of MRI images: such a design preserves and refines low-level contour features;

3. The U-Net network doubles the number of feature maps each time it halves the resolution of the feature maps, which inevitably brings a large increase in the number of parameters. For MRI segmentation tasks, however, there are few target categories to segment and the gray values within the image change relatively smoothly, so there is no need to extract many features to represent high-level semantics. On the contrary, describing the fuzzy edges between tissues well requires better features. Therefore, in the multi-level parallel multi-scale fusion neural network structure, the design of doubling the number of feature maps when the resolution is halved is dropped, and the same number of feature maps is used at every scale instead.
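The effect described in point 3 can be illustrated with simple parameter arithmetic. The sketch below assumes one 3×3 convolution with bias per scale, a single-channel input and hypothetical channel widths (16 doubling to 256, versus a constant 16); none of these figures come from the application or from its experiments.

```python
def conv_params(c_in: int, c_out: int, kernel: int = 3) -> int:
    """Weights plus biases of one kernel x kernel convolution."""
    return kernel * kernel * c_in * c_out + c_out


def stack_params(widths: list, in_channels: int = 1) -> int:
    """Total parameters of a chain with one convolution per scale."""
    total, previous = 0, in_channels
    for width in widths:
        total += conv_params(previous, width)
        previous = width
    return total


doubling = [16, 32, 64, 128, 256]  # hypothetical U-Net-style widths
constant = [16] * 5                # hypothetical constant widths

print(stack_params(doubling))  # 392320
print(stack_params(constant))  # 9440
```

Under these assumptions the constant-width chain needs roughly 2% of the parameters of the doubling chain, which illustrates why keeping the number of feature maps fixed limits parameter growth.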

Table 2 shows the experimental results of the APCP-Net structure on the validation set. Compared with PCP-Net, the hierarchical aggregation mechanism increases the Dice similarity coefficient of each tissue during systole, which indicates that the hierarchical aggregation mechanism contributes significantly to the fusion of features at different depths. Owing to the efficiency of the hierarchical aggregation mechanism, APCP-Net does not incur large-scale parameter growth: the parameter count of APCP-Net is 0.317 million, which is close to that of PCP-Net.

Table 2 Experimental results of APCP-Net structure

The hierarchical fusion mechanism allows shallow features to continue propagating forward during feature fusion. Although the down-sampling in the first pyramid and the successive fusion steps may cause the right ventricle, which occupies only a small proportion of the image, to disappear, the hierarchical fusion mechanism retains the shallow features until they take part in the final prediction together with the deep features. This also demonstrates the necessity of fusing shallow and deep features, whose role is especially prominent in segmenting small targets when the pixels of the various categories in the image are unevenly distributed.

It should be understood that the size of the sequence number of each step in the foregoing embodiment does not mean the order of execution. The execution sequence of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.

Corresponding to the image segmentation method described in the above embodiment, FIG. 8 shows a structural block diagram of an image segmentation device provided in an embodiment of the present application. For ease of description, only parts related to the embodiment of the present application are shown.

Referring to FIG. 8, the image segmentation device in the embodiment of the present application may include an image acquisition module 201, a convolution processing module 202 and a segmentation module 203.

Among them, the image acquisition module 201 is used to acquire the target image to be segmented;

The convolution processing module 202 is configured to perform convolution processing on the target image to be segmented through a multi-layer convolutional layer; wherein the convolutional layers of the multi-layer convolutional layer are connected to one another, the convolutional layers in the first convolutional layer are connected in series, the convolutional layer of the first scale receives the target image to be segmented, and each convolutional layer in turn performs convolutional down-sampling on the target image to be segmented; the convolutional layers in the last convolutional layer of the multi-layer convolutional layer are connected in series, each convolutional layer in turn performs convolutional up-sampling on the received feature information, and the convolution processing result is output through the convolutional layer of the first scale;

The segmentation module 203 is configured to perform image segmentation according to the output result of the convolution processing.
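The division of the device into three modules can be pictured as a simple pipeline. The sketch below is a structural illustration only: the class name `ImageSegmentationDevice`, its field names and the stand-in callables (an identity "convolution" and a 0.5 threshold) are assumptions introduced here and do not represent the network of the application.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ImageSegmentationDevice:
    acquire: Callable[[], Any]      # plays the role of image acquisition module 201
    convolve: Callable[[Any], Any]  # plays the role of convolution processing module 202
    segment: Callable[[Any], Any]   # plays the role of segmentation module 203

    def run(self) -> Any:
        image = self.acquire()           # obtain the target image to be segmented
        features = self.convolve(image)  # convolution processing result
        return self.segment(features)    # segmentation based on that result


# Stand-in callables, used only to show how the modules are chained.
device = ImageSegmentationDevice(
    acquire=lambda: [[0.2, 0.8], [0.6, 0.1]],
    convolve=lambda image: image,
    segment=lambda feats: [[1 if v > 0.5 else 0 for v in row] for row in feats],
)
print(device.run())  # [[0, 1], [1, 0]]
```

Passing the modules in as callables keeps the sketch independent of any particular framework; in practice the convolution processing module would wrap the multi-layer convolutional network.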

Optionally, the convolutional layers of the multi-layer convolutional layer constitute a plurality of sequentially connected subtrees, each subtree includes at least two convolutional layers, and the parent node of each subtree is the aggregation of all previous subtrees.

Optionally, each subtree includes an output convolutional layer, and the output convolutional layer is connected to other convolutional layers in the current subtree and the parent node of the current subtree, and the current subtree The parent node and the other convolutional layers in the current subtree are sequentially connected; wherein, the parent node of the current subtree is the output convolutional layer of the previous subtree.

Optionally, between two convolutional layers that have a connection relationship, each convolutional layer of the next convolutional layer is connected to the corresponding convolutional layer of the previous convolutional layer.

In a possible implementation manner, the multi-layer convolutional layer includes convolutional layers of multiple scales, and the convolutional layer of each scale includes multiple convolutional layers;

Exemplarily, the convolutional upsampling or the convolutional downsampling may be performed by the nearest neighbor interpolation method.
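The nearest-neighbor interpolation mentioned above can be sketched as follows. Only the resampling step is shown, without the accompanying convolution; the function name `resize_nearest`, the nested-list image representation and the floor-based index mapping are assumptions made for illustration.

```python
def resize_nearest(image: list, out_h: int, out_w: int) -> list:
    """Resize a 2D grid by copying the nearest source pixel (floor mapping)."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


small = [[1, 2],
         [3, 4]]
up = resize_nearest(small, 4, 4)  # up-sampling: each pixel becomes a 2x2 block
down = resize_nearest(up, 2, 2)   # down-sampling: keeps every second pixel
print(up)
print(down)
```

Because nearest-neighbor interpolation only copies existing values, it introduces no learnable parameters of its own.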

It should be noted that the information interaction and execution processes between the above-mentioned devices/units are based on the same concept as the method embodiments of this application. For their specific functions and technical effects, reference may be made to the method embodiment section, and details are not repeated here.

Those skilled in the art can clearly understand that, for convenience and conciseness of description, only the division of the above functional units and modules is used as an example. In practical applications, the above functions can be allocated to different functional units and modules as needed; that is, the internal structure of the device is divided into different functional units or modules to complete all or part of the functions described above. The functional units and modules in the embodiments can be integrated into one processing unit, or each unit can exist alone physically, or two or more units can be integrated into one unit. The above-mentioned integrated units can be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only for the convenience of distinguishing them from each other, and are not used to limit the protection scope of the present application. For the specific working process of the units and modules in the foregoing system, reference may be made to the corresponding process in the foregoing method embodiment, which will not be repeated here.

An embodiment of the present application also provides a terminal device. Referring to FIG. 9, the terminal device 300 may include: at least one processor 310, a memory 320, and a computer program stored in the memory 320 and executable on the at least one processor 310. When the processor 310 executes the computer program, the steps in any of the foregoing method embodiments, such as steps S101 to S103 in the embodiment shown in FIG. 2, are implemented. Alternatively, when the processor 310 executes the computer program, the functions of the modules/units in the foregoing device embodiments, for example the functions of the modules 201 to 203 shown in FIG. 8, are implemented.

Exemplarily, the computer program may be divided into one or more modules/units, and the one or more modules/units are stored in the memory 320 and executed by the processor 310 to complete the application. The one or more modules/units may be a series of computer program segments capable of completing specific functions, and the program segments are used to describe the execution process of the computer program in the terminal device 300.

Those skilled in the art can understand that FIG. 9 is only an example of a terminal device and does not constitute a limitation on the terminal device. The terminal device may include more or fewer components than those shown in the figure, or a combination of certain components, or different components, such as input and output devices, network access devices, buses, etc.

The processor 310 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.

The memory 320 may be an internal storage unit of the terminal device, or an external storage device of the terminal device, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, a flash card, and so on. The memory 320 is used to store the computer program and other programs and data required by the terminal device. The memory 320 can also be used to temporarily store data that has been output or will be output.

The bus can be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, etc. The bus can be divided into an address bus, a data bus, a control bus and so on. For ease of representation, the buses in the drawings of this application are not limited to only one bus or one type of bus.

The image segmentation method provided in the embodiments of this application can be applied to terminal devices such as computers, tablets, notebooks, netbooks, personal digital assistants (PDAs), etc. The embodiments of this application do not impose any restrictions on the specific types of terminal devices.

Take the case where the terminal device is a computer as an example. FIG. 10 shows a block diagram of part of the structure of a computer provided in an embodiment of the present application. Referring to FIG. 10, the computer includes: a communication circuit 410, a memory 420, an input unit 430, a display unit 440, an audio circuit 450, a wireless fidelity (WiFi) module 460, a processor 470, a power supply 480 and other components. Those skilled in the art can understand that the computer structure shown in FIG. 10 does not constitute a limitation on the computer, which may include more or fewer components than shown in the figure, or a combination of certain components, or a different arrangement of components.

The following is a detailed introduction to the various components of the computer in conjunction with Figure 10:

The communication circuit 410 can be used for receiving and sending signals while information is being sent and received or during a call. In particular, after an image sample sent by the image acquisition device is received, it is passed to the processor 470 for processing; in addition, image acquisition instructions are sent to the image acquisition device. Generally, the communication circuit includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier (LNA), a duplexer, and the like. In addition, the communication circuit 410 can also communicate with networks and other devices through wireless communication. The wireless communication can use any communication standard or protocol, including but not limited to Global System of Mobile Communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), e-mail, Short Messaging Service (SMS), etc.

The memory 420 may be used to store software programs and modules. The processor 470 executes the various functional applications and data processing of the computer by running the software programs and modules stored in the memory 420. The memory 420 may mainly include a program storage area and a data storage area. The program storage area may store an operating system, an application program required by at least one function (such as a sound playback function, an image playback function, etc.), and so on; the data storage area may store data created through the use of the computer (such as audio data, a phone book, etc.), and so on. In addition, the memory 420 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage devices.

The input unit 430 may be used to receive input numeric or character information, and to generate key signal input related to user settings and function control of the computer. Specifically, the input unit 430 may include a touch panel 431 and other input devices 432. The touch panel 431, also called a touch screen, can collect the user's touch operations on or near it (for example, operations performed by the user on or near the touch panel 431 with a finger, a stylus, or any other suitable object or accessory), and drive the corresponding connection device according to a preset program. Optionally, the touch panel 431 may include two parts: a touch detection device and a touch controller. The touch detection device detects the user's touch position, detects the signal produced by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact coordinates, sends them to the processor 470, and can receive and execute commands sent by the processor 470. In addition, the touch panel 431 can be implemented in multiple types such as resistive, capacitive, infrared, and surface acoustic wave. In addition to the touch panel 431, the input unit 430 may also include other input devices 432. Specifically, the other input devices 432 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control buttons, switch buttons, etc.), a trackball, a mouse, and a joystick.

The display unit 440 may be used to display information input by the user or information provided to the user, as well as the various menus of the computer. The display unit 440 may include a display panel 441. Optionally, the display panel 441 may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED), etc. Further, the touch panel 431 can cover the display panel 441. When the touch panel 431 detects a touch operation on or near it, it transmits the operation to the processor 470 to determine the type of the touch event, and the processor 470 then provides the corresponding visual output on the display panel 441 according to the type of the touch event. Although in FIG. 10 the touch panel 431 and the display panel 441 are used as two independent components to realize the input and output functions of the computer, in some embodiments the touch panel 431 and the display panel 441 can be integrated to realize the input and output functions of the computer.

The audio circuit 450 may provide an audio interface between the user and the computer. On the one hand, the audio circuit 450 can convert received audio data into an electrical signal and transmit it to the speaker, which converts it into a sound signal for output; on the other hand, the microphone converts the collected sound signal into an electrical signal, which the audio circuit 450 receives and converts into audio data. The audio data is then output to the processor 470 for processing and sent, for example, to another computer through the communication circuit 410, or output to the memory 420 for further processing.

WiFi is a short-distance wireless transmission technology. The computer can help users send and receive emails, browse web pages, and access streaming media through the WiFi module 460. It provides users with wireless broadband Internet access. Although FIG. 10 shows the WiFi module 460, it can be understood that it is not a necessary component of the computer and can be omitted as needed without changing the essence of the invention.

The processor 470 is the control center of the computer. It uses various interfaces and lines to connect the various parts of the entire computer, and performs the various functions of the computer and processes data by running or executing the software programs and/or modules stored in the memory 420 and calling the data stored in the memory 420, thereby monitoring the computer as a whole. Optionally, the processor 470 may include one or more processing units; preferably, the processor 470 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, application programs, etc., and the modem processor mainly handles wireless communication. It can be understood that the foregoing modem processor may alternatively not be integrated into the processor 470.

The computer also includes a power source 480 (such as a battery) for supplying power to various components. Preferably, the power source 480 may be logically connected to the processor 470 through a power management system, so that functions such as charging, discharging, and power consumption management can be managed through the power management system.

The embodiments of the present application also provide a computer-readable storage medium, the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the steps in each embodiment of the above-mentioned image segmentation method can be realized.

The embodiments of the present application provide a computer program product. When the computer program product runs on a mobile terminal, the mobile terminal is caused to implement the steps in each embodiment of the above-mentioned image segmentation method.

If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the above-mentioned embodiments of the present application can be accomplished by instructing relevant hardware through a computer program. The computer program can be stored in a computer-readable storage medium, and when it is executed by the processor, the steps of the foregoing method embodiments can be implemented. The computer program includes computer program code, and the computer program code may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable medium may at least include: any entity or device capable of carrying the computer program code to the photographing device/terminal device, a recording medium, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, and a software distribution medium, for example a USB flash drive, a removable hard disk, a floppy disk, or a CD-ROM. In some jurisdictions, in accordance with legislation and patent practice, computer-readable media cannot be electrical carrier signals or telecommunication signals.

In the above-mentioned embodiments, the description of each embodiment has its own focus. For parts that are not described in detail or recorded in an embodiment, reference may be made to related descriptions of other embodiments.

A person of ordinary skill in the art may realize that the units and algorithm steps of the examples described in combination with the embodiments disclosed herein can be implemented by electronic hardware or a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the specific application and design constraint conditions of the technical solution. Professionals and technicians can use different methods for each specific application to implement the described functions, but such implementation should not be considered beyond the scope of this application.

In the embodiments provided in this application, it should be understood that the disclosed apparatus/network equipment and method may be implemented in other ways. For example, the device/network device embodiments described above are merely illustrative. The division of the modules or units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components can be combined or integrated into another system, or some features can be omitted or not implemented. In addition, the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.

The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.

The above-mentioned embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments can still be modified, or some of their technical features can be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present application, and should be included within the scope of protection of this application.

Claims

An image segmentation method, characterized in that it includes:

Obtain the target image to be segmented;

Convolution processing is performed on the target image to be segmented through a multi-layer convolutional layer; wherein the layers of the multi-layer convolutional layer are connected to one another; the convolutional layers in the first layer are sequentially connected in series, the first-scale convolutional layer receives the target image to be segmented, and each convolutional layer sequentially performs convolutional down-sampling on the target image to be segmented; the convolutional layers in the last layer of the multi-layer convolutional layer are sequentially connected in series, each convolutional layer sequentially performs convolutional up-sampling on the received feature information, and the convolution processing result is output through the first-scale convolutional layer;

Image segmentation is performed according to the result of the convolution processing.
The image segmentation method according to claim 1, wherein the convolutional layers of the multi-layer convolutional layer constitute a plurality of sequentially connected subtrees, each subtree includes at least two convolutional layers, and the parent node of each subtree is the aggregation of all previous subtrees.
The image segmentation method according to claim 2, wherein each subtree includes an output convolutional layer, the output convolutional layer is connected respectively to the other convolutional layers in the current subtree and to the parent node of the current subtree, and the parent node of the current subtree is connected in sequence to the other convolutional layers in the current subtree; wherein the parent node of the current subtree is the output convolutional layer of the previous subtree.
The image segmentation method according to claim 2, characterized in that, between two convolutional layers that have a connection relationship, each convolutional layer of the next convolutional layer is correspondingly connected to the corresponding convolutional layer in the previous convolutional layer.
5. The image segmentation method according to claim 4, wherein the multi-layer convolutional layer includes a plurality of scales of convolutional layers, and each scale of the convolutional layer includes a plurality of convolutional layers;

The connection of each convolutional layer of the next convolutional layer to the corresponding convolutional layer in the previous convolutional layer is as follows:

Each convolutional layer of the next convolutional layer is correspondingly connected to the convolutional layer of the same scale in the previous convolutional layer; wherein the next convolutional layer and the previous convolutional layer are two adjacent convolutional layers.
6. The image segmentation method according to claim 4, wherein the multi-layer convolutional layer includes a plurality of scales of convolutional layers, and each scale of the convolutional layer includes a plurality of convolutional layers;

The connection of each convolutional layer of the next convolutional layer to the corresponding convolutional layer in the previous convolutional layer is as follows:

The convolutional layer of the current scale of the next convolutional layer is respectively connected to the convolutional layer of the current scale in the previous convolutional layer and to the convolutional layer of the scale adjacent to the current scale;

Wherein the next convolutional layer and the previous convolutional layer are two adjacent convolutional layers.
The image segmentation method according to claim 1, wherein the convolutional upsampling or the convolutional downsampling is performed by a nearest neighbor interpolation method.
An image segmentation device, characterized in that it comprises:

The image acquisition module is used to acquire the target image to be segmented;

The convolution processing module is configured to perform convolution processing on the target image to be segmented through a multi-layer convolutional layer; wherein the layers of the multi-layer convolutional layer are connected to one another; the convolutional layers in the first layer are sequentially connected in series, the first-scale convolutional layer receives the target image to be segmented, and each convolutional layer sequentially performs convolutional down-sampling on the target image to be segmented; the convolutional layers in the last layer of the multi-layer convolutional layer are sequentially connected in series, each convolutional layer sequentially performs convolutional up-sampling on the received feature information, and the convolution processing result is output through the first-scale convolutional layer;

The segmentation module is configured to perform image segmentation according to the output result of the convolution processing.
A terminal device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the method according to any one of claims 1 to 7.
A computer-readable storage medium storing a computer program, wherein the computer program implements the method according to any one of claims 1 to 7 when the computer program is executed by a processor.
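
For illustration only, the nearest neighbor interpolation named in claim 7 can be sketched as follows. This is a minimal sketch and not the implementation of the application: the function names (resize_nearest, downsample, upsample), the scale factor of 2, and the floor-based choice of source index are all assumptions introduced here, and the convolution that accompanies each resampling step in the claims is omitted.

```python
from typing import List

Grid = List[List[float]]


def resize_nearest(img: Grid, out_h: int, out_w: int) -> Grid:
    """Resample a 2-D grid to (out_h, out_w) by nearest neighbor lookup.

    Assumption: each output pixel copies the input pixel whose index is the
    floor of the scaled output index (one common convention among several).
    """
    in_h, in_w = len(img), len(img[0])
    out: Grid = []
    for r in range(out_h):
        src_r = min(in_h - 1, r * in_h // out_h)
        row = []
        for c in range(out_w):
            src_c = min(in_w - 1, c * in_w // out_w)
            row.append(img[src_r][src_c])
        out.append(row)
    return out


def downsample(img: Grid, factor: int = 2) -> Grid:
    """Shrink both dimensions by a hypothetical integer factor."""
    return resize_nearest(img, max(1, len(img) // factor),
                          max(1, len(img[0]) // factor))


def upsample(img: Grid, factor: int = 2) -> Grid:
    """Enlarge both dimensions by a hypothetical integer factor."""
    return resize_nearest(img, len(img) * factor, len(img[0]) * factor)


# A 4x4 toy "feature map": go down one scale, then back up to the first scale.
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
small = downsample(image)   # 2x2
restored = upsample(small)  # 4x4, each value repeated in a 2x2 block
print(small)
print(restored)
```

Because nearest neighbor resampling only copies existing values, it adds no learnable parameters of its own; any learned behavior would come from the convolutions that this sketch leaves out.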