CN112001931A - Image segmentation method, device, equipment and storage medium - Google Patents

Image segmentation method, device, equipment and storage medium Download PDF

Info

Publication number
CN112001931A
CN112001931A (application CN202010857319.1A)
Authority
CN
China
Prior art keywords
image segmentation
feature map
pooling
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010857319.1A
Other languages
Chinese (zh)
Inventor
丁子凡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Eye Control Technology Co Ltd
Original Assignee
Shanghai Eye Control Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Eye Control Technology Co Ltd filed Critical Shanghai Eye Control Technology Co Ltd
Priority to CN202010857319.1A priority Critical patent/CN112001931A/en
Publication of CN112001931A publication Critical patent/CN112001931A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/10 - Segmentation; Edge detection
    • G06T7/11 - Region-based segmentation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20016 - Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; Pyramid transform
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20081 - Training; Learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20084 - Artificial neural networks [ANN]

Abstract

The embodiment of the invention provides an image segmentation method, device, equipment and storage medium. An initial feature map is extracted from an image to be processed through a base network of an image segmentation model; the initial feature map is pooled through an average pooling sub-model of the image segmentation model to obtain a first feature map carrying short-distance dependency relationship information; the initial feature map is processed through at least one branch sub-model to obtain at least one target feature map, wherein the target feature map comprises a second feature map carrying global dependency relationship information and/or a third feature map carrying long-distance dependency relationship information; and the first feature map and the target feature map are cascaded, and the cascaded result is convolved to obtain and output an image segmentation result. By arranging branch sub-models in parallel with the average pooling sub-model, the invention obtains feature maps carrying global dependency relationship information and/or long-distance dependency relationship information and cascades them with the feature map carrying short-distance dependency relationship information obtained by the average pooling sub-model, thereby enhancing the feature representation capability and improving the accuracy of image segmentation.

Description

Image segmentation method, device, equipment and storage medium
Technical Field
The embodiment of the invention relates to the technical field of image processing, in particular to an image segmentation method, an image segmentation device, image segmentation equipment and a storage medium.
Background
With the continuous development of AI and image processing technology, applying AI image technology to improve everyday life has become a new trend. Image segmentation is an important and fundamental image analysis technology that aims to divide an image into regions with distinct characteristics and to extract the parts of interest. The result of image segmentation is the basis of higher-level image understanding tasks such as image feature extraction and recognition, so image segmentation occupies an important position in the field of computer vision while also facing new challenges.
In the prior art, neural network models such as FCN, SegNet and PSPNet are generally adopted for image segmentation. Although these methods help capture objects of different scales by fusing context information, they cannot exploit the relationships between objects from a global view, so the segmentation effect is limited. In addition, the convolution operation in a neural network has a local receptive field, so pixels with the same label may produce different features; such differences cause intra-class inconsistency and thereby reduce the accuracy of recognition.
Disclosure of Invention
The embodiment of the invention provides an image segmentation method, an image segmentation device, image segmentation equipment and a storage medium, which are used for enhancing feature representation and improving the segmentation effect of an image.
A first aspect of an embodiment of the present invention provides an image segmentation method, including:
inputting an image to be processed into an image segmentation model, and extracting an initial feature map from the image to be processed through a base network of the image segmentation model;
pooling the initial feature map through an average pooling sub-model of the image segmentation model to obtain a first feature map carrying short-distance dependency relationship information;
processing the initial feature map through at least one branch sub-model of the image segmentation model to obtain at least one target feature map, wherein the at least one target feature map comprises a second feature map carrying global dependency relationship information and/or a third feature map carrying long-distance dependency relationship information;
and cascading the first feature map and the target feature map, performing convolution processing on the cascaded result to obtain an image segmentation result, and outputting the image segmentation result.
In a possible implementation manner, the processing the initial feature map through at least one branch sub-model of the image segmentation model to obtain at least one target feature map includes:
processing the initial feature map through an attention mechanism sub-model of the image segmentation model to obtain a second feature map carrying global dependency relationship information; and/or
pooling the initial feature map through a stripe pooling sub-model of the image segmentation model to obtain a third feature map carrying long-distance dependency relationship information.
In a possible implementation manner, the processing the initial feature map by the attention mechanism sub-model of the image segmentation model to obtain a second feature map carrying global dependency relationship information includes:
performing convolution processing on the initial feature map through a convolution layer of the attention mechanism sub-model to obtain an initial tensor, and reshaping the initial tensor to convert it from third order to second order, so as to obtain an intermediate tensor;
acquiring an attention matrix according to the intermediate tensor, wherein the attention matrix is used for representing the correlation among the features at the positions of the intermediate tensor;
and multiplying the intermediate tensor by the attention matrix to obtain a fourth feature map, and adding the initial feature map and the fourth feature map to obtain the second feature map.
In one possible implementation, the obtaining an attention matrix according to the intermediate tensor, where the attention matrix is used to represent a correlation between position features of the intermediate tensor, includes:
multiplying the transpose of the intermediate tensor by the intermediate tensor to obtain a feature matrix;
and inputting the feature matrix into a Softmax layer of the attention mechanism submodel to acquire the attention matrix.
In a possible implementation manner, the pooling the initial feature map by the stripe pooling sub-model of the image segmentation model to obtain a third feature map carrying long-distance dependency relationship information includes:
performing horizontal stripe pooling and vertical stripe pooling on the initial feature map through the stripe pooling sub-model;
and performing convolution and up-sampling on the horizontal stripe pooling result and the vertical stripe pooling result respectively, and cascading the up-sampled results to obtain the third feature map.
In a possible implementation manner, the pooling the initial feature map by the average pooling sub-model of the image segmentation model to obtain the first feature map carrying short-distance dependency relationship information includes:
performing pyramid pooling on the initial feature map through the average pooling sub-model;
and performing convolution and up-sampling on each pyramid pooling result respectively, and cascading the up-sampled results to obtain the first feature map.
In one possible implementation, the method further includes:
acquiring training data, wherein the training data are training images that have been annotated with image segmentation labels;
carrying out transformation operation on the training image, wherein the training image after the transformation operation is also used as the training data, and the transformation operation comprises at least one of translation, scaling and rotation;
and training the image segmentation model according to the training data.
A second aspect of an embodiment of the present invention provides an image segmentation apparatus, including:
the input module is used for inputting the image to be processed into the image segmentation model;
the processing module is used for extracting an initial characteristic map from the image to be processed through a base network of the image segmentation model; pooling the initial characteristic graph through an average pooling sub-model of the image segmentation model to obtain a first characteristic graph carrying short-distance dependency relationship information; processing the initial feature map through at least one branch sub-model of the image segmentation model to obtain at least one target feature map, wherein the at least one target feature map comprises a second feature map carrying global dependency relationship information and/or a third feature map carrying long-distance dependency relationship information; cascading the first feature map and the target feature map, and performing convolution processing on a cascading result to obtain an image segmentation result;
and the output module is used for outputting the image segmentation result.
In a possible implementation manner, when the processing module processes the initial feature map through at least one branch sub-model of the image segmentation model to obtain at least one target feature map, the processing module is configured to:
processing the initial characteristic diagram through an attention mechanism sub-model of the image segmentation model to obtain a second characteristic diagram carrying global dependency relationship information; and/or
Pooling the initial characteristic diagram through a stripe pooling submodel of the image segmentation model to obtain a third characteristic diagram carrying long-distance dependency relationship information.
In a possible implementation manner, when the processing module processes the initial feature map through an attention mechanism sub-model of the image segmentation model to obtain a second feature map carrying global dependency relationship information, the processing module is configured to:
performing convolution processing on the initial characteristic diagram through a convolution layer of the attention mechanism submodel to obtain an initial tensor, and performing reshaping processing on the initial tensor to convert the initial tensor from a third order to a second order to obtain an intermediate tensor;
acquiring an attention matrix according to the intermediate tensor, wherein the attention matrix is used for representing the correlation among the characteristics of the positions of the intermediate tensor;
and multiplying the intermediate tensor by the attention matrix to obtain a fourth feature map, and adding the initial feature map and the fourth feature map to obtain the second feature map.
In one possible implementation, the processing module, when obtaining an attention matrix from the intermediate tensor, the attention matrix being used to represent a correlation between the position features of the intermediate tensor, is configured to:
multiplying the transpose of the intermediate tensor by the intermediate tensor to obtain a feature matrix;
and inputting the feature matrix into a Softmax layer of the attention mechanism submodel to acquire the attention matrix.
In a possible implementation manner, when the initial feature map is pooled through a stripe pooling sub-model of the image segmentation model and a third feature map carrying long-distance dependency relationship information is obtained, the processing module is configured to:
performing horizontal stripe pooling and vertical stripe pooling on the initial characteristic map through the stripe pooling sub-model;
and performing convolution and up-sampling on the horizontal stripe pooling result and the vertical stripe pooling result respectively, and cascading the up-sampling results to obtain the third characteristic diagram.
In a possible implementation manner, when the initial feature map is pooled through an average pooling sub-model of the image segmentation model and a first feature map carrying short-distance dependency relationship information is obtained, the processing module is configured to:
pyramid pooling is carried out on the initial feature map through the average pooling sub-model;
and performing convolution and up-sampling on the pyramid pooling results respectively, and cascading the up-sampling results to obtain the first characteristic diagram.
In one possible implementation, the processing module is further configured to:
acquiring training data, wherein the training data are training images that have been annotated with image segmentation labels;
carrying out transformation operation on the training image, wherein the training image after the transformation operation is also used as the training data, and the transformation operation comprises at least one of translation, scaling and rotation;
and training the image segmentation model according to the training data.
A third aspect of embodiments of the present invention is to provide a computer device, including: at least one processor and memory;
the memory stores computer-executable instructions;
the at least one processor executing the computer-executable instructions stored by the memory causes the at least one processor to perform the method of the first aspect.
A fourth aspect of embodiments of the present invention is to provide a computer-readable storage medium having stored thereon a computer program;
which when executed by a processor implements the method according to the first aspect.
According to the image segmentation method, device, equipment and storage medium provided by the embodiment of the invention, an initial feature map is extracted from the image to be processed through the base network of the image segmentation model; the initial feature map is pooled through the average pooling sub-model of the image segmentation model to obtain a first feature map carrying short-distance dependency relationship information; the initial feature map is processed through at least one branch sub-model of the image segmentation model to obtain at least one target feature map, the at least one target feature map comprising a second feature map carrying global dependency relationship information and/or a third feature map carrying long-distance dependency relationship information; and the first feature map and the target feature map are cascaded, convolution processing is performed on the cascaded result to obtain an image segmentation result, and the image segmentation result is output. In the embodiment of the invention, branch sub-models are arranged in parallel with the average pooling sub-model in the image segmentation model; the feature map carrying global dependency relationship information and/or long-distance dependency relationship information is obtained and cascaded with the feature map carrying short-distance dependency relationship information obtained by the average pooling sub-model, so that the feature representation capability is effectively enhanced and the accuracy of image segmentation is improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a diagram illustrating an image segmentation model according to an embodiment of the present invention;
FIG. 2 is a flowchart of an image segmentation method according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating an image segmentation model according to another embodiment of the present invention;
FIG. 4 is a flowchart of an image segmentation method according to another embodiment of the present invention;
FIG. 5 is a flowchart of an image segmentation method according to another embodiment of the present invention;
FIG. 6 is a schematic diagram of a stripe pooling process provided in accordance with an embodiment of the present invention;
FIG. 7 is a flowchart of an image segmentation method according to another embodiment of the present invention;
FIG. 8 is a block diagram of an image segmentation apparatus according to an embodiment of the present invention;
fig. 9 is a block diagram of a computer device for performing an image segmentation method according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments of the present invention without any creative efforts shall fall within the protection scope of the embodiments of the present invention.
In the prior art, neural network models such as FCN, SegNet and PSPNet are generally adopted for image segmentation. Although these methods help capture objects of different scales by fusing context information, they cannot exploit the relationships between objects from a global view, so the segmentation effect is limited. In addition, the convolution operation in a neural network has a local receptive field, so pixels with the same label may produce different features; such differences cause intra-class inconsistency and thereby reduce the accuracy of recognition.
In order to solve the above problems, and considering that neural network models such as PSPNet can only capture short-distance dependency relationships between different positions in an image but cannot capture global feature dependency relationships, the embodiment of the present invention introduces global feature dependency relationships and establishes rich contextual relationships on top of local features. On the basis of the PSPNet network, a branch network is added in parallel with the average pooling layer (Pyramid Pooling Module) of PSPNet to obtain a feature map carrying global dependency relationship information and/or a feature map carrying long-distance dependency relationship information, which is concat-cascaded with the feature map obtained by the pooling layer, thereby enhancing the feature representation capability and improving the accuracy of image segmentation and recognition. The attention mechanism (Attention) can capture the global feature dependency relationships in space and establish rich contextual relationships on local features, so it can serve as a branch network in the embodiment of the present invention; stripe pooling can likewise capture long-distance dependency relationships between different positions, making it possible to connect regions distributed discretely over the whole scene and to encode regions with a strip structure, so it can also serve as a branch network in the embodiment of the present invention. Therefore, in the embodiment of the present invention, the attention mechanism model and/or the stripe pooling model may be selected as branch networks parallel to the average pooling layer in the PSPNet network. In the network architecture shown in fig. 1, for example, the network branch parallel to the average pooling layer is the stripe pooling model; of course, only the attention mechanism model may be used in parallel, or only the stripe pooling model may be used in parallel.
The image segmentation process is described in detail below with reference to specific embodiments.
Fig. 2 is a flowchart of an image segmentation method according to an embodiment of the present invention. This embodiment provides an image segmentation method, which is executed by a computer device with processing capability, and which comprises the following specific steps:
S201, inputting an image to be processed into an image segmentation model, and extracting an initial feature map from the image to be processed through a base network of the image segmentation model.
In this embodiment, the base network in the image segmentation model is a neural network used to extract a feature map from the image to be processed, and may include convolutional layers, fully connected layers and the like; for example, it may be a residual network (ResNet) such as ResNet18, ResNet50 or ResNet101, which is not described in detail herein.
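For illustration, a minimal PyTorch sketch of such a base network is given below. It assumes a torchvision ResNet-50 truncated before its global pooling and fully connected layers; the exact backbone depth and cut-off point are not fixed by this embodiment, so they are assumptions made only for the example.

```python
# Hedged sketch of the base network: a ResNet backbone used to extract the
# initial feature map from the image to be processed. Truncating at the last
# residual stage (dropping avgpool/fc) is an assumption for illustration.
import torch
import torch.nn as nn
from torchvision.models import resnet50

class BaseNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = resnet50()  # ResNet18/50/101 would all fit the description
        # Keep everything up to the last residual stage; drop avgpool and fc.
        self.features = nn.Sequential(*list(backbone.children())[:-2])

    def forward(self, x):
        return self.features(x)  # initial feature map, e.g. 2048 x H/32 x W/32 for ResNet-50

image = torch.randn(1, 3, 473, 473)          # image to be processed (size is arbitrary here)
initial_feature_map = BaseNetwork()(image)   # C x H x W initial feature map
```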
S202, pooling the initial feature map through an average pooling sub-model of the image segmentation model to obtain a first feature map carrying short-distance dependency relationship information.
In this embodiment, average pooling may be performed on the initial feature map, that is, the feature points in a neighborhood are averaged, so that the initial feature map is down-sampled and the first feature map carrying short-distance dependency relationship information is obtained. Average pooling captures the short-distance dependency relationships between different positions well, and is especially necessary when semantic regions are closely distributed.
Optionally, in this embodiment, the average pooling sub-model may adopt a pyramid pooling module, which performs pyramid pooling on the initial feature map through pooling kernels of different pyramid levels to obtain features of different levels, convolves each of them, up-samples the results to feature maps of the same size as the initial feature map, and concat-cascades these feature maps to obtain the first feature map, thereby fusing features of different pyramid scales and reducing the information loss between different regions.
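For illustration only, the following is a minimal PyTorch sketch of such a pyramid-pooling (average pooling) sub-model. The bin sizes (1, 2, 3, 6), the per-level channel reduction, and keeping the initial feature map in the cascade follow the common PSPNet configuration and are assumptions, since the embodiment does not fix them.

```python
# Hedged sketch of the average pooling sub-model (pyramid pooling): average
# pooling at several pyramid levels, 1x1 convolution, up-sampling to the input
# size, then concat cascade to form the first feature map.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PyramidPooling(nn.Module):
    def __init__(self, in_channels, bins=(1, 2, 3, 6)):
        super().__init__()
        out_channels = in_channels // len(bins)
        self.stages = nn.ModuleList([
            nn.Sequential(
                nn.AdaptiveAvgPool2d(b),                              # average pooling at one pyramid level
                nn.Conv2d(in_channels, out_channels, 1, bias=False),
                nn.BatchNorm2d(out_channels),
                nn.ReLU(inplace=True),
            )
            for b in bins
        ])

    def forward(self, x):
        h, w = x.shape[2:]
        # Up-sample every pooled level back to the size of the initial feature map.
        levels = [F.interpolate(stage(x), size=(h, w), mode="bilinear", align_corners=False)
                  for stage in self.stages]
        return torch.cat([x] + levels, dim=1)                         # first feature map
```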
S203, processing the initial feature map through at least one branch sub-model of the image segmentation model to obtain at least one target feature map, wherein the at least one target feature map comprises a second feature map carrying global dependency relationship information and/or a third feature map carrying long-distance dependency relationship information.
In this embodiment, at least one branch sub-model is arranged in parallel with the average pooling sub-model in the image segmentation model. Each of the at least one branch sub-models yields a target feature map of one dimension, such as a feature map of the global dimension or of the long-distance dimension; that is, each target feature map carries global dependency relationship information or long-distance dependency relationship information.
Optionally, in this embodiment, the initial feature map may be processed by an attention mechanism sub-model to obtain a second feature map carrying global dependency relationship information; and/or the initial feature map may be pooled by a stripe pooling sub-model to obtain a third feature map carrying long-distance dependency relationship information.
The attention mechanism can capture global feature dependency relationships in space, establish rich contextual relationships on local features, and encode wider context information into the local features, thereby enhancing their representation capability; stripe pooling can capture long-distance dependency relationships between different positions, making it possible to connect regions distributed discretely over the whole scene and to encode regions with a strip structure.
And S204, cascading the first feature map and the target feature map, performing convolution processing on the cascaded result to obtain an image segmentation result, and outputting the image segmentation result.
In this embodiment, the first feature map and the target feature map are concat-cascaded, and convolution processing is performed on the cascaded result to obtain the image segmentation result. The concat cascade makes the feature representation more robust, which in turn leads to a significant improvement in the segmentation effect.
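A hedged sketch of this cascade-and-convolve step is shown below. The 3 × 3 fusion convolution, the intermediate channel width of 256, the number of classes and the final bilinear resize to the input size are assumptions; the embodiment only requires concatenation followed by convolution to produce the segmentation result.

```python
# Hedged sketch of step S204: concat the first feature map with the target
# feature map(s) from the branch sub-models, then convolve to obtain the
# segmentation result.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SegmentationHead(nn.Module):
    def __init__(self, in_channels, num_classes):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(in_channels, 256, 3, padding=1, bias=False),
            nn.BatchNorm2d(256),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, num_classes, 1),
        )

    def forward(self, first_map, target_maps, out_size):
        # Concat (cascade) the first feature map with every target feature map.
        fused = torch.cat([first_map] + list(target_maps), dim=1)
        logits = self.fuse(fused)
        # Resize to the original image size so the segmentation result can be output per pixel.
        return F.interpolate(logits, size=out_size, mode="bilinear", align_corners=False)
```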
In the image segmentation method provided by this embodiment, an initial feature map is extracted from the image to be processed through the base network of the image segmentation model; the initial feature map is pooled through the average pooling sub-model of the image segmentation model to obtain a first feature map carrying short-distance dependency relationship information; the initial feature map is processed through at least one branch sub-model of the image segmentation model to obtain at least one target feature map, the at least one target feature map comprising a second feature map carrying global dependency relationship information and/or a third feature map carrying long-distance dependency relationship information; and the first feature map and the target feature map are cascaded, convolution processing is performed on the cascaded result to obtain an image segmentation result, and the image segmentation result is output. In this embodiment, branch sub-models are arranged in parallel with the average pooling sub-model in the image segmentation model; the feature map carrying global dependency relationship information and/or long-distance dependency relationship information is obtained and cascaded with the feature map carrying short-distance dependency relationship information obtained by the average pooling sub-model, so that the feature representation capability is effectively enhanced and the accuracy of image segmentation is improved.
Based on the foregoing embodiment, the at least one branch sub-model in the image segmentation model of this embodiment may be an attention mechanism sub-model and/or a stripe pooling sub-model; the image segmentation model is specifically shown in fig. 3. Further, in S203, processing the initial feature map through the at least one branch sub-model of the image segmentation model to obtain at least one target feature map may include:
processing the initial feature map through an attention mechanism sub-model of the image segmentation model to obtain a second feature map carrying global dependency relationship information; and/or
pooling the initial feature map through a stripe pooling sub-model of the image segmentation model to obtain a third feature map carrying long-distance dependency relationship information.
On the basis of the foregoing embodiment, as shown in fig. 4, the processing the initial feature map through the attention mechanism sub-model of the image segmentation model to obtain a second feature map carrying global dependency information may include:
S301, performing convolution processing on the initial feature map through a convolution layer of the attention mechanism sub-model to obtain an initial tensor, and reshaping the initial tensor to convert it from third order to second order, so as to obtain an intermediate tensor;
S302, acquiring an attention matrix according to the intermediate tensor, wherein the attention matrix is used for expressing the correlation among the features at each position of the intermediate tensor;
S303, multiplying the intermediate tensor by the attention matrix to obtain a fourth feature map, and adding the initial feature map and the fourth feature map to obtain the second feature map.
In this embodiment, as shown in the attention mechanism part of fig. 3, the initial feature map X (C × H × W) is fed into convolution layers to obtain initial tensors X1, X2 and X3 of the same shape, each of size C × H × W. Each initial tensor is reshaped to obtain an intermediate tensor of size C × N, where N = H × W; that is, the original H × W spatial dimensions are flattened, reducing the tensor from three dimensions to two.
Further, the attention matrix is obtained according to the intermediate tensors and is used to express the correlation among the features at each position. Specifically, the transpose of one intermediate tensor is multiplied by the other intermediate tensor to obtain a feature matrix, and the feature matrix is input into the Softmax layer of the attention mechanism sub-model to obtain the attention matrix. That is, after the reshape, X1 and X2 become C × N matrices; the transpose of one reshaped tensor (N × C) is multiplied by the other reshaped tensor (C × N) to obtain an N × N feature matrix (adjacency matrix), denoted M, i.e. of size (H × W) × (H × W). The feature matrix M is then fed into Softmax to obtain the attention matrix P (N × N). The values in the attention matrix P reflect the correlation between the features at two positions: the more similar the features of two positions are, the greater the correlation between them.
The attention matrix obtained through Softmax can be computed by the following formula:

Sij = exp(Mij) / Σi exp(Mij), where the sum in the denominator runs over all i

Here Mij denotes the values in the feature matrix M, with i ∈ [0, N-1] and j ∈ [0, N-1], and Sij is the corresponding value in the attention matrix P, which measures the influence of the i-th position on the j-th position, i.e. the degree of association (correlation) between the i-th and j-th positions; the larger Sij is, the more strongly the i-th and j-th positions are associated.
After the attention matrix P is obtained, X3 is likewise reshaped to C × N and multiplied by the attention matrix P, and the result is reshaped back to obtain the fourth feature map T (C × H × W). Finally, the fourth feature map T and the initial feature map X are added to obtain the second feature map, which is subsequently concat-cascaded with the first feature map.
In this embodiment, the attention mechanism described above adopts a position attention module (Position Attention Module) and captures the long-range dependencies over the space (picture) based on the idea of non-local mean filtering: when computing the output at each pixel position, the correlation is calculated with respect to all positions in the image rather than only a neighborhood, and these correlations are then used as weights representing the similarity between other positions and the current position. In this way, more extensive context information is encoded into the local features, thereby enhancing their representation capability.
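The position attention computation described above (S301 to S303) can be sketched in PyTorch as follows. The three 1 × 1 convolutions producing X1, X2 and X3 and the softmax normalization axis follow the description and the formula above; keeping the full channel count C for all three tensors (rather than a DANet-style channel reduction) is an assumption based on the statement that they share the size C × H × W.

```python
# Hedged sketch of the attention mechanism sub-model (position attention).
import torch
import torch.nn as nn

class PositionAttention(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 1)
        self.conv2 = nn.Conv2d(channels, channels, 1)
        self.conv3 = nn.Conv2d(channels, channels, 1)
        self.softmax = nn.Softmax(dim=1)  # normalize over i, as in Sij = exp(Mij) / sum_i exp(Mij)

    def forward(self, x):                 # x: B x C x H x W, the initial feature map
        b, c, h, w = x.shape
        n = h * w
        x1 = self.conv1(x).reshape(b, c, n)        # intermediate tensors of size C x N
        x2 = self.conv2(x).reshape(b, c, n)
        x3 = self.conv3(x).reshape(b, c, n)
        m = torch.bmm(x1.transpose(1, 2), x2)      # feature (adjacency) matrix, N x N
        p = self.softmax(m)                        # attention matrix P
        t = torch.bmm(x3, p).reshape(b, c, h, w)   # fourth feature map T
        return x + t                               # second feature map = X + T
```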
On the basis of any of the above embodiments, as shown in fig. 5, the pooling the initial feature map by the stripe pooling sub-model of the image segmentation model to obtain a third feature map carrying long-distance dependency relationship information may include:
S401, performing horizontal stripe pooling and vertical stripe pooling on the initial feature map through the stripe pooling sub-model;
S402, performing convolution and then up-sampling on the horizontal stripe pooling result and the vertical stripe pooling result respectively, and cascading the up-sampled results to obtain the third feature map.
In this embodiment, with reference to the pyramid pooling module in PSPNet, the average pooling in PSPNet may be replaced by stripe pooling; that is, after stripe pooling, each pooled result is convolved and then up-sampled, and the up-sampled results are cascaded.
For example, after the initial feature map X (C × H × W) is input into the stripe pooling sub-model, horizontal stripe pooling and vertical stripe pooling are performed, as shown in fig. 6 (for clarity, C = 1 is taken as an example, i.e. the initial feature map X has a single channel). Pooling results of size H × 1 and 1 × W are obtained after the horizontal and vertical stripe pooling respectively; each result is then passed through a one-dimensional convolution with a kernel of 3, up-sampled to recover the size C × H × W, and finally the two results are concat-cascaded to obtain the third feature map, which is subsequently concat-cascaded with the first feature map.
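The stripe pooling just described can be sketched as follows. Implementing the one-dimensional convolution with kernel 3 as (3, 1) and (1, 3) two-dimensional convolutions, and recovering the C × H × W size with bilinear up-sampling, are assumptions consistent with, but not mandated by, the description; following the text, the two up-sampled results are concat-cascaded.

```python
# Hedged sketch of the stripe pooling sub-model (S401-S402).
import torch
import torch.nn as nn
import torch.nn.functional as F

class StripePooling(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv_h = nn.Conv2d(channels, channels, kernel_size=(3, 1), padding=(1, 0), bias=False)
        self.conv_w = nn.Conv2d(channels, channels, kernel_size=(1, 3), padding=(0, 1), bias=False)

    def forward(self, x):                                  # x: B x C x H x W, the initial feature map
        h, w = x.shape[2:]
        pooled_h = F.adaptive_avg_pool2d(x, (h, 1))        # horizontal stripe pooling -> H x 1
        pooled_w = F.adaptive_avg_pool2d(x, (1, w))        # vertical stripe pooling   -> 1 x W
        up_h = F.interpolate(self.conv_h(pooled_h), size=(h, w), mode="bilinear", align_corners=False)
        up_w = F.interpolate(self.conv_w(pooled_w), size=(h, w), mode="bilinear", align_corners=False)
        return torch.cat([up_h, up_w], dim=1)              # third feature map (long-distance context)
```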
On the basis of any of the above embodiments, as shown in fig. 7, the method further includes a training process for the image segmentation model, which is specifically as follows:
S501, acquiring training data, wherein the training data are training images that have been annotated with image segmentation labels;
S502, performing a transformation operation on the training images, wherein the transformed training images are also used as training data, and the transformation operation comprises at least one of translation, scaling and rotation;
S503, training the image segmentation model according to the training data.
In this embodiment, a predetermined number of images may be obtained and subjected to image segmentation annotation to serve as training data. To expand the amount of training data, transformation operations, including but not limited to translation, scaling and rotation, may also be performed on the annotated training images, and the transformed training images are likewise used as training data. The constructed image segmentation model can then be trained according to the training data; the training process is not described in detail herein.
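A hedged sketch of this training flow (S501 to S503) is given below. The affine augmentation ranges, the optimizer, learning rate and loss function are illustrative assumptions only; the embodiment merely requires translation, scaling and/or rotation of the labelled training images and training on the resulting data.

```python
# Hedged sketch of training-data augmentation and model training.
import random
import torch
import torch.nn as nn
import torchvision.transforms.functional as TF

def augment(image, mask):
    # Random translation / scaling / rotation, applied identically to the image
    # and its segmentation label (mask is assumed to be a float tensor of class ids).
    angle = random.uniform(-10.0, 10.0)
    translate = [random.randint(-20, 20), random.randint(-20, 20)]
    scale = random.uniform(0.8, 1.2)
    image = TF.affine(image, angle=angle, translate=translate, scale=scale, shear=[0.0])
    mask = TF.affine(mask, angle=angle, translate=translate, scale=scale, shear=[0.0])
    return image, mask

def train(model, loader, epochs=50, lr=1e-3, device="cuda"):
    model.to(device).train()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    criterion = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for image, mask in loader:                   # mask: B x 1 x H x W per-pixel labels
            image, mask = augment(image, mask)
            logits = model(image.to(device))         # B x num_classes x H x W
            loss = criterion(logits, mask.squeeze(1).long().to(device))
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```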
Optionally, after the training data is obtained, it may be checked for training images with labeling errors or missing labels; the checking may be performed manually or through other means.
Furthermore, the trained image segmentation model can be applied to actual scenarios, and decision logic can be configured as required to interpret the segmentation results output by the image segmentation model. For example, when the trained image segmentation model is applied in the traffic field, vehicle appearance and vehicle type can be identified and road conditions analyzed according to the image segmentation results.
Fig. 8 is a structural diagram of an image segmentation apparatus according to an embodiment of the present invention. The image segmentation apparatus provided in this embodiment can execute the processing procedure provided in the embodiment of the image segmentation method, and as shown in fig. 8, the image segmentation apparatus 80 includes an input module 81, a processing module 82, and an output module 83.
An input module 81, configured to input an image to be processed into an image segmentation model;
a processing module 82, configured to extract an initial feature map from the to-be-processed image through a base network of the image segmentation model; pooling the initial characteristic graph through an average pooling sub-model of the image segmentation model to obtain a first characteristic graph carrying short-distance dependency relationship information; processing the initial feature map through at least one branch sub-model of the image segmentation model to obtain at least one target feature map, wherein the at least one target feature map comprises a second feature map carrying global dependency relationship information and/or a third feature map carrying long-distance dependency relationship information; cascading the first feature map and the target feature map, and performing convolution processing on a cascading result to obtain an image segmentation result;
and an output module 83, configured to output the image segmentation result.
On the basis of the foregoing embodiment, when the processing module 82 processes the initial feature map through at least one branch sub-model of the image segmentation model to obtain at least one target feature map, the processing module is configured to:
processing the initial characteristic diagram through an attention mechanism sub-model of the image segmentation model to obtain a second characteristic diagram carrying global dependency relationship information; and/or
Pooling the initial characteristic diagram through a stripe pooling submodel of the image segmentation model to obtain a third characteristic diagram carrying long-distance dependency relationship information.
On the basis of any of the above embodiments, when the processing module 82 processes the initial feature map through the attention mechanism sub-model of the image segmentation model to obtain the second feature map carrying the global dependency relationship information, it is configured to:
performing convolution processing on the initial characteristic diagram through a convolution layer of the attention mechanism submodel to obtain an initial tensor, and performing reshaping processing on the initial tensor to convert the initial tensor from a third order to a second order to obtain an intermediate tensor;
acquiring an attention matrix according to the intermediate tensor, wherein the attention matrix is used for representing the correlation among the characteristics of the positions of the intermediate tensor;
and multiplying the intermediate tensor by the attention matrix to obtain a fourth feature map, and adding the initial feature map and the fourth feature map to obtain the second feature map.
On the basis of any of the foregoing embodiments, the processing module 82, when obtaining an attention matrix from the intermediate tensor, where the attention matrix is used to represent a correlation between the position features of the intermediate tensor, is configured to:
multiplying the transpose of the intermediate tensor by the intermediate tensor to obtain a feature matrix;
and inputting the feature matrix into a Softmax layer of the attention mechanism submodel to acquire the attention matrix.
On the basis of any of the above embodiments, when the initial feature map is pooled by the stripe pooling sub-model of the image segmentation model and a third feature map carrying long-distance dependency relationship information is obtained, the processing module 82 is configured to:
performing horizontal stripe pooling and vertical stripe pooling on the initial characteristic map through the stripe pooling sub-model;
and performing convolution and up-sampling on the horizontal stripe pooling result and the vertical stripe pooling result respectively, and cascading the up-sampling results to obtain the third characteristic diagram.
On the basis of any of the above embodiments, when the initial feature map is pooled through the average pooling sub-model of the image segmentation model and the first feature map carrying short-distance dependency relationship information is obtained, the processing module 82 is configured to:
pyramid pooling is carried out on the initial feature map through the average pooling sub-model;
and performing convolution and up-sampling on the pyramid pooling results respectively, and cascading the up-sampling results to obtain the first characteristic diagram.
On the basis of any of the above embodiments, the processing module 82 is further configured to:
acquiring training data, wherein the training data are training images that have been annotated with image segmentation labels;
carrying out transformation operation on the training image, wherein the training image after the transformation operation is also used as the training data, and the transformation operation comprises at least one of translation, scaling and rotation;
and training the image segmentation model according to the training data.
The image segmentation apparatus provided in the embodiment of the present invention may be specifically configured to execute the method embodiments provided in fig. 2, 4-5, and 7, and specific functions are not described herein again.
The image segmentation device provided by the embodiment of the invention extracts an initial feature map from the image to be processed through the base network of the image segmentation model; pools the initial feature map through the average pooling sub-model of the image segmentation model to obtain a first feature map carrying short-distance dependency relationship information; processes the initial feature map through at least one branch sub-model of the image segmentation model to obtain at least one target feature map, the at least one target feature map comprising a second feature map carrying global dependency relationship information and/or a third feature map carrying long-distance dependency relationship information; and cascades the first feature map and the target feature map, performs convolution processing on the cascaded result to obtain an image segmentation result, and outputs the image segmentation result. In this embodiment, branch sub-models are arranged in parallel with the average pooling sub-model in the image segmentation model; the feature map carrying global dependency relationship information and/or long-distance dependency relationship information is obtained and cascaded with the feature map carrying short-distance dependency relationship information obtained by the average pooling sub-model, so that the feature representation capability is effectively enhanced and the accuracy of image segmentation is improved.
Fig. 9 is a schematic structural diagram of a computer device according to an embodiment of the present invention. The computer device provided in the embodiment of the present invention may execute the processing flow provided in the embodiment of the image segmentation method, as shown in fig. 9, the computer device 90 includes a memory 91, a processor 92, a computer program, and a communication interface 93; wherein a computer program is stored in the memory 91 and is configured to execute the image segmentation method described in the above embodiments by the processor 92.
The computer device of the embodiment shown in fig. 9 can be used to implement the technical solution of the above method embodiment, and the implementation principle and technical effect are similar, which are not described herein again.
In addition, the present embodiment also provides a computer-readable storage medium on which a computer program is stored, the computer program being executed by a processor to implement the image segmentation method described in the above embodiments.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
The integrated unit implemented in the form of a software functional unit may be stored in a computer readable storage medium. The software functional unit is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, a server, or a network device) or a processor (processor) to execute some steps of the methods described in the embodiments of the present invention. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
It is obvious to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional modules is merely used as an example, and in practical applications, the above function distribution may be performed by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules to perform all or part of the above described functions. For the specific working process of the device described above, reference may be made to the corresponding process in the foregoing method embodiment, which is not described herein again.
Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the embodiments of the present invention, and are not limited thereto; although embodiments of the present invention have been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. An image segmentation method, comprising:
inputting an image to be processed into an image segmentation model, and extracting an initial characteristic map from the image to be processed through a base network of the image segmentation model;
pooling the initial characteristic graph through an average pooling sub-model of the image segmentation model to obtain a first characteristic graph carrying short-distance dependency relationship information;
processing the initial feature map through at least one branch sub-model of the image segmentation model to obtain at least one target feature map, wherein the at least one target feature map comprises a second feature map carrying global dependency relationship information and/or a third feature map carrying long-distance dependency relationship information;
and cascading the first characteristic diagram and the target characteristic diagram, performing convolution processing on a cascading result to obtain an image segmentation result, and outputting the image segmentation result.
2. The method of claim 1, wherein the processing the initial feature map through at least one branch sub-model of the image segmentation model to obtain at least one target feature map comprises:
processing the initial characteristic diagram through an attention mechanism sub-model of the image segmentation model to obtain a second characteristic diagram carrying global dependency relationship information; and/or
Pooling the initial characteristic diagram through a stripe pooling submodel of the image segmentation model to obtain a third characteristic diagram carrying long-distance dependency relationship information.
3. The method of claim 2, wherein the processing the initial feature map through an attention mechanism submodel of the image segmentation model to obtain a second feature map carrying global dependency information comprises:
performing convolution processing on the initial characteristic diagram through a convolution layer of the attention mechanism submodel to obtain an initial tensor, and performing reshaping processing on the initial tensor to convert the initial tensor from a third order to a second order to obtain an intermediate tensor;
acquiring an attention matrix according to the intermediate tensor, wherein the attention matrix is used for representing the correlation among the characteristics of the positions of the intermediate tensor;
and multiplying the intermediate tensor by the attention matrix to obtain a fourth feature map, and adding the initial feature map and the fourth feature map to obtain the second feature map.
4. The method of claim 3, wherein obtaining an attention matrix from the intermediate tensor, the attention matrix being used to represent a correlation between features at locations of the intermediate tensor comprises:
multiplying the transpose of the intermediate tensor by the intermediate tensor to obtain a feature matrix;
and inputting the feature matrix into a Softmax layer of the attention mechanism submodel to acquire the attention matrix.
5. The method according to claim 2, wherein pooling the initial feature map by a stripe pooling submodel of the image segmentation model to obtain a third feature map carrying long-distance dependency information comprises:
performing horizontal stripe pooling and vertical stripe pooling on the initial characteristic map through the stripe pooling sub-model;
and performing convolution and up-sampling on the horizontal stripe pooling result and the vertical stripe pooling result respectively, and cascading the up-sampling results to obtain the third characteristic diagram.
6. The method of claim 1, wherein pooling the initial feature map by an average pooling submodel of the image segmentation model to obtain a first feature map carrying short-distance dependency information comprises:
pyramid pooling is carried out on the initial feature map through the average pooling sub-model;
and performing convolution and up-sampling on the pyramid pooling results respectively, and cascading the up-sampling results to obtain the first characteristic diagram.
7. The method of claim 1, further comprising:
acquiring training data, wherein the training data are training images that have been annotated with image segmentation labels;
carrying out transformation operation on the training image, wherein the training image after the transformation operation is also used as the training data, and the transformation operation comprises at least one of translation, scaling and rotation;
and training the image segmentation model according to the training data.
8. An image segmentation apparatus, comprising:
the input module is used for inputting the image to be processed into the image segmentation model;
the processing module is used for extracting an initial characteristic map from the image to be processed through a base network of the image segmentation model; pooling the initial characteristic graph through an average pooling sub-model of the image segmentation model to obtain a first characteristic graph carrying short-distance dependency relationship information; processing the initial feature map through at least one branch sub-model of the image segmentation model to obtain at least one target feature map, wherein the at least one target feature map comprises a second feature map carrying global dependency relationship information and/or a third feature map carrying long-distance dependency relationship information; cascading the first feature map and the target feature map, and performing convolution processing on a cascading result to obtain an image segmentation result;
and the output module is used for outputting the image segmentation result.
9. A computer device, comprising: at least one processor and memory;
the memory stores computer-executable instructions;
the at least one processor executing the computer-executable instructions stored by the memory causes the at least one processor to perform the method of any one of claims 1-7.
10. A computer-readable storage medium, having stored thereon a computer program;
the computer program, when executed by a processor, implementing the method of any one of claims 1-7.
CN202010857319.1A 2020-08-24 2020-08-24 Image segmentation method, device, equipment and storage medium Pending CN112001931A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010857319.1A CN112001931A (en) 2020-08-24 2020-08-24 Image segmentation method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010857319.1A CN112001931A (en) 2020-08-24 2020-08-24 Image segmentation method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112001931A true CN112001931A (en) 2020-11-27

Family

ID=73470528

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010857319.1A Pending CN112001931A (en) 2020-08-24 2020-08-24 Image segmentation method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112001931A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112598003A (en) * 2020-12-18 2021-04-02 燕山大学 Real-time semantic segmentation method based on data expansion and full-supervision preprocessing
CN113229767A (en) * 2021-04-12 2021-08-10 佛山市顺德区美的洗涤电器制造有限公司 Method for processing image, processor, control device and household appliance
CN113229767B (en) * 2021-04-12 2022-08-19 佛山市顺德区美的洗涤电器制造有限公司 Method for processing image, processor, control device and household appliance
CN113326851A (en) * 2021-05-21 2021-08-31 中国科学院深圳先进技术研究院 Image feature extraction method and device, electronic equipment and storage medium
CN113326851B (en) * 2021-05-21 2023-10-27 中国科学院深圳先进技术研究院 Image feature extraction method and device, electronic equipment and storage medium
CN113689434A (en) * 2021-07-14 2021-11-23 淮阴工学院 Image semantic segmentation method based on strip pooling
CN113689434B (en) * 2021-07-14 2022-05-27 淮阴工学院 Image semantic segmentation method based on strip pooling
CN116385814A (en) * 2023-03-07 2023-07-04 广州市妇女儿童医疗中心 Ultrasonic screening method, system, device and medium for detection target
CN116385814B (en) * 2023-03-07 2023-12-05 广州市妇女儿童医疗中心 Ultrasonic screening method, system, device and medium for detection target

Similar Documents

Publication Publication Date Title
CN110738207B (en) Character detection method for fusing character area edge information in character image
CN112001931A (en) Image segmentation method, device, equipment and storage medium
CN112001914A (en) Depth image completion method and device
CN112115783A (en) Human face characteristic point detection method, device and equipment based on deep knowledge migration
CN113343982B (en) Entity relation extraction method, device and equipment for multi-modal feature fusion
CN113537254B (en) Image feature extraction method and device, electronic equipment and readable storage medium
CN113554032B (en) Remote sensing image segmentation method based on multi-path parallel network of high perception
CN114429637B (en) Document classification method, device, equipment and storage medium
CN113642585B (en) Image processing method, apparatus, device, storage medium, and computer program product
CN111179270A (en) Image co-segmentation method and device based on attention mechanism
CN115082675A (en) Transparent object image segmentation method and system
CN113343981A (en) Visual feature enhanced character recognition method, device and equipment
Xu et al. Missing data reconstruction in VHR images based on progressive structure prediction and texture generation
CN116189162A (en) Ship plate detection and identification method and device, electronic equipment and storage medium
CN113066089B (en) Real-time image semantic segmentation method based on attention guide mechanism
CN114998756A (en) Yolov 5-based remote sensing image detection method and device and storage medium
CN115272691A (en) Training method, recognition method and equipment for steel bar binding state detection model
CN114612681A (en) GCN-based multi-label image classification method, model construction method and device
CN114049491A (en) Fingerprint segmentation model training method, fingerprint segmentation device, fingerprint segmentation equipment and fingerprint segmentation medium
Lu et al. Multi-scale enhanced deep network for road detection
CN113096133A (en) Method for constructing semantic segmentation network based on attention mechanism
CN116796287A (en) Pre-training method, device, equipment and storage medium for graphic understanding model
CN115810152A (en) Remote sensing image change detection method and device based on graph convolution and computer equipment
CN114529450B (en) Face image super-resolution method based on improved depth iteration cooperative network
CN112989919B (en) Method and system for extracting target object from image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination