WO2021115061A1 - Image segmentation method and apparatus, and server - Google Patents

Image segmentation method and apparatus, and server

Info

Publication number
WO2021115061A1
WO2021115061A1 (PCT/CN2020/129521; CN2020129521W)
Authority
WO
WIPO (PCT)
Prior art keywords
feature map
spatial position
image
information
feature
Prior art date
Application number
PCT/CN2020/129521
Other languages
French (fr)
Chinese (zh)
Inventor
廖祥云
孙寅紫
王琼
王平安
Original Assignee
中国科学院深圳先进技术研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国科学院深圳先进技术研究院 filed Critical 中国科学院深圳先进技术研究院
Publication of WO2021115061A1 publication Critical patent/WO2021115061A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/12Edge-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Definitions

  • the present invention relates to the technical field of image segmentation, in particular to an image segmentation method, device and server.
  • Image segmentation is one of the research hotspots of computer graphics, and it has important applications in the fields of medical disease diagnosis and unmanned driving.
  • at present, there are many image segmentation algorithms, among which the U-Net (U-shaped neural network) algorithm is one of the most commonly used.
  • the U-shaped neural network algorithm is composed of an encoder and a decoder, and the encoder and the decoder are connected by concatenation along the image channel dimension.
  • the image to be segmented is first extracted through an encoder for image feature extraction.
  • the encoder is composed of multiple convolutional layers, and the convolutional layers are connected by a pooling layer, thereby reducing the dimension of the original image to a certain size.
  • the image output from the encoder is restored to the original image size by the decoder.
  • the decoder is composed of multiple convolutional layers, and the convolutional layers are connected by transposed convolutional layers. Finally, the output image is converted into a probability map using the softmax activation function.
  • compared with traditional image segmentation algorithms, such as threshold segmentation, region segmentation, and edge segmentation, the UNet algorithm has a simple network structure and high image segmentation accuracy.
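The following is a minimal, hedged sketch of the U-Net pattern described above: an encoder of convolutional blocks joined by pooling layers, a decoder of transposed convolutions, channel-wise concatenation between encoder and decoder, and a softmax that turns the output into a probability map. The channel counts, depth, and input size are illustrative assumptions, not values taken from the patent.

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    # two 3x3 convolutions, as in a typical U-Net stage
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    def __init__(self, in_ch=1, n_classes=2):
        super().__init__()
        self.enc1 = conv_block(in_ch, 64)
        self.enc2 = conv_block(64, 128)
        self.pool = nn.MaxPool2d(2)                          # joins encoder stages
        self.up = nn.ConvTranspose2d(128, 64, 2, stride=2)   # joins decoder stages
        self.dec1 = conv_block(128, 64)                      # 128 = 64 (skip) + 64 (up)
        self.head = nn.Conv2d(64, n_classes, 1)

    def forward(self, x):
        e1 = self.enc1(x)                                    # full-resolution features
        e2 = self.enc2(self.pool(e1))                        # reduced-dimension features
        d1 = self.dec1(torch.cat([self.up(e2), e1], dim=1))  # channel-dimension splice
        return torch.softmax(self.head(d1), dim=1)           # probability map

print(TinyUNet()(torch.randn(1, 1, 64, 64)).shape)  # torch.Size([1, 2, 64, 64])
```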
  • however, the current UNet image segmentation algorithm tends to exaggerate the differences between objects of the same class (inter-class distinction) or the similarity between objects of different classes (intra-class consistency), and cannot segment the boundary between "different types of similar features" and "same type of difference features". As a result, the boundaries between different objects to be segmented cannot be accurately segmented, and the segmentation accuracy of the image is low.
  • the embodiments of the present invention provide an image segmentation method, device, and server to solve the problem that the boundary between different objects to be segmented cannot be accurately segmented.
  • the first aspect of the embodiments of the present invention provides an image segmentation method, including:
  • inputting the image to be segmented into an image segmentation model, performing feature extraction on the image to be segmented to generate a feature map, and calculating the spatial position relationship between pixels in the feature map to obtain spatial position information; fusing the feature map and the spatial position information to obtain a feature map containing the spatial position information; and segmenting the image to be segmented according to the feature map containing the spatial position information, and outputting a target image.
  • in an implementation example, the inputting of the image to be segmented into an image segmentation model, performing feature extraction on the image to be segmented to generate a feature map, and calculating the spatial position relationship between pixels in the feature map to obtain spatial position information includes:
  • performing feature extraction on the image to be segmented by N information extraction modules connected in series in the encoder to generate a feature map, where the N information extraction modules are set according to preset scale information and N ≥ 1;
  • the spatial position relationship between pixels in the feature map generated by the information extraction module is calculated to obtain spatial position information.
  • in another implementation example, the inputting of the image to be segmented into an image segmentation model, performing feature extraction on the image to be segmented to generate a feature map, and calculating the spatial position relationship between pixels in the feature map to obtain spatial position information includes:
  • when the image to be segmented is input to the first information extraction module, performing feature extraction on the image to be segmented to generate a feature map, and calculating the spatial position relationship between pixels in the feature map to obtain spatial position information; and fusing the feature map and the spatial position information to generate a new feature map that is output to the next information extraction module, so that the next information extraction module performs feature extraction and spatial position relationship calculation on the new feature map.
  • the fusing of the feature map and the spatial location information to obtain a feature map containing spatial location information includes:
  • Context information is generated by fusing the feature map and spatial position information output by the Nth information extraction module through the encoder.
  • for each of the information extraction modules, calculating the spatial position relationship between pixels in the feature map generated by the information extraction module to obtain spatial position information may include: convolving the feature map, through a convolutional neural network, in a direction perpendicular to the feature map generated by the information extraction module, and calculating the spatial position relationship between pixels in the feature map to obtain the spatial position information.
  • if the feature map generated by the information extraction module is a two-dimensional feature map, the formula for calculating the spatial position relationship between pixels in the feature map generated by the information extraction module is:
  • $a = \delta\bigl(\sum_{c=1}^{k} w_{(i,j)}^{l} \odot x_{c}^{l} + b\bigr)$
  • where a is the spatial position information; δ is the activation function; l is the number of convolutional layers of the convolutional neural network; w_(i,j) is the weight coefficient of the pixel with coordinate (i, j) in the feature map; x_c^l denotes the c-th channel of the feature map at layer l; k is the number of channels of the feature map; b is the offset; and ⊙ is the Hadamard product.
  • if the feature map generated by the information extraction module is a three-dimensional feature map, the formula for calculating the spatial position relationship between pixels in the feature map generated by the information extraction module is:
  • $a = \delta\bigl(\sum_{c=1}^{m} w_{(i,j,k)}^{l} \odot x_{c}^{l} + b\bigr)$
  • where a is the spatial position information; δ is the activation function; l is the number of convolutional layers of the convolutional neural network; w_(i,j,k) is the weight coefficient of the pixel with coordinate (i, j, k) in the feature map; x_c^l denotes the c-th channel of the feature map at layer l; m is the number of channels of the feature map; b is the offset; and ⊙ is the Hadamard product.
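As a concrete reading of the two formulas above, the following NumPy sketch evaluates the two-dimensional case literally: for each output pixel (i, j), a weight map w[(i, j)] is Hadamard-multiplied with every channel slice of the feature map, and the products are summed over the k channels before the activation. Treating the weight map as spanning the full feature-map plane, and the formula structure itself, are reconstructions from the variable definitions above, not a verified copy of the patent's equation images.

```python
import numpy as np

def relu(z):
    # stand-in for the activation function delta
    return np.maximum(z, 0.0)

def spatial_position_info_2d(x, w, b):
    """x: feature map of shape (k, H, W); w: one H x W weight map per
    output pixel, shape (H, W, H, W); b: scalar offset."""
    k, H, W = x.shape
    a = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            # Hadamard product of the (i, j) weight map with each of the
            # k channel slices, summed over all channels
            a[i, j] = relu(np.sum(w[i, j] * x) + b)
    return a

x = np.random.rand(3, 4, 4)        # k = 3 channels of a 4x4 feature map
w = np.random.rand(4, 4, 4, 4)     # one 4x4 weight map per output pixel
print(spatial_position_info_2d(x, w, b=0.1).shape)  # (4, 4)
```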
  • in an implementation example, the segmenting of the image to be segmented according to the feature map containing the spatial position information and outputting the target image includes:
  • the decoder segments the image to be segmented according to the context information, and outputs the target image.
  • a second aspect of the embodiments of the present invention provides an image segmentation device, including:
  • the image feature and location information extraction module is used to input the image to be segmented into the image segmentation model, perform feature extraction on the image to be segmented to generate a feature map, and calculate the spatial position relationship between pixels in the feature map to obtain the spatial position information;
  • the feature fusion module is used to fuse the feature map and the spatial location information to obtain a feature map containing the spatial location information
  • the image segmentation module is configured to segment the image to be segmented according to the feature map containing the spatial position information, and output a target image.
  • the third aspect of the embodiments provides a server, including: a memory, a processor, and a computer program stored in the memory and runnable on the processor, where the processor, when executing the computer program, implements the image segmentation method of the first aspect.
  • an image to be segmented is input into an image segmentation model, feature extraction is performed on the image to be segmented to generate a feature map, and the spatial position relationship between pixels in the feature map is calculated to obtain spatial position information; the feature map and the spatial position information are fused to obtain a feature map containing spatial position information; and the image to be segmented is segmented according to the feature map containing spatial position information to output the target image.
  • the spatial position information is obtained by calculating the spatial position relationship between pixels in the feature map, and the relative position relationship of pixels in different spatial positions in the feature map can be extracted.
  • after the feature map containing the image information is fused with the calculated spatial position information to obtain a feature map containing spatial position information, the image to be segmented is segmented according to that feature map, so that the image segmentation model can obtain the feature relationships between pixels of the feature map from the spatial position relationships between those pixels, thereby segmenting the boundary between "different types of similar features" and "same type of difference features", achieving accurate segmentation of the boundaries between different objects to be segmented and improving the segmentation accuracy of the image.
  • FIG. 1 is a schematic flowchart of an image segmentation method provided by Embodiment 1 of the present invention
  • FIG. 2 is a schematic structural diagram of an image segmentation model provided by Embodiment 1 of the present invention
  • FIG. 3 is a schematic flowchart of an image segmentation method provided by Embodiment 2 of the present invention.
  • FIG. 4 is a schematic diagram of convolution calculation of the feature map depth convolution layer of the second branch in the information extraction module provided by the second embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of an image segmentation device provided by Embodiment 3 of the present invention.
  • Fig. 6 is a schematic structural diagram of a server provided in the fourth embodiment of the present invention.
  • as shown in FIG. 1, it is a schematic flowchart of an image segmentation method provided by Embodiment 1 of the present invention.
  • This embodiment can be applied to the application scenario of multi-target segmentation of an image.
  • the method can be executed by an image segmentation device, which can be a server, a smart terminal, a tablet or a PC, etc.; in this embodiment of the application, the image segmentation device is taken as the execution subject for explanation. The method specifically includes the following steps:
  • S110 Input the image to be segmented into an image segmentation model, perform feature extraction on the image to be segmented to generate a feature map, and calculate the spatial position relationship between pixels in the feature map to obtain spatial position information;
  • image segmentation can be performed by constructing an image segmentation model containing a neural network and training it on target images through deep learning.
  • in practice, the image features extracted by the multi-layer convolutional layers of a trained image segmentation model often exaggerate the differences between objects of the same class (the issue of inter-class distinction) or the similarity between objects of different classes (the issue of intra-class consistency).
  • as a result, when the image segmentation model performs target image segmentation on the image to be segmented based on the extracted features, it cannot segment the boundary between "different types of similar features" and "same type of difference features", leading to over-segmentation and under-segmentation in the segmentation process.
  • to overcome this, the feature relationships between feature-map pixels can be extracted at different levels of the convolutional neural network of the image segmentation model, overcoming the inability to segment the boundary between "different types of similar features" and "same type of difference features".
  • the image to be segmented can be segmented through an image segmentation model trained based on multiple target images.
  • features of the image to be segmented are extracted to generate a feature map, and the spatial position relationship between the pixels in the feature map is calculated to obtain the spatial position information, so as to obtain the relative position relationships of pixels at different spatial positions in the feature map.
  • the image segmentation model may adopt a U-shaped neural network (Feature Depth UNet) framework, in which an encoder and a decoder form a symmetric structure; the encoder and the decoder are connected by concatenation along the image channel dimension.
  • Figure 2 shows the structure diagram of the image segmentation model.
  • the specific process of performing feature extraction on the image to be segmented to generate a feature map, and calculating the spatial position relationship between pixels in the feature map to obtain spatial position information, may be: performing feature extraction on the image to be segmented by N information extraction modules connected in series in the encoder to generate a feature map, where the N information extraction modules are set according to preset scale information and N ≥ 1; and, for each of the information extraction modules, calculating the spatial position relationship between pixels in the feature map generated by the information extraction module to obtain the spatial position information.
  • the encoder includes N information extraction modules connected in series to perform image feature extraction on the input image to be segmented to generate a feature map.
  • the N information extraction modules are set according to the preset scale information, so that each information extraction module has different scale information.
  • feature extraction of the image to be segmented through the N information extraction modules can therefore generate a feature map containing multi-scale information; and after each information extraction module performs feature extraction on the image to be segmented, the spatial position relationship between pixels in the feature map generated by that information extraction module is calculated to obtain spatial position information.
  • since the N information extraction modules correspond to different scale information, calculating the spatial position relationship between the pixels in their feature maps yields spatial position information containing multi-scale information.
  • the specific process of performing feature extraction on the image to be segmented by the N information extraction modules connected in series in the encoder to generate a feature map, and calculating the spatial position relationship between pixels in the feature map generated by each information extraction module to obtain the spatial position information, may be: when the image to be segmented is input to the first information extraction module, feature extraction is performed on the image to be segmented to generate a feature map, and the spatial position relationship between pixels in the feature map is calculated to obtain spatial position information; the feature map and the spatial position information are then fused to generate a new feature map, which is output to the next information extraction module so that the next information extraction module performs feature extraction and spatial position relationship calculation on the new feature map.
  • each information extraction module can include two branches.
  • the first branch is used to extract features from the input image to generate a feature map, extracting the pixel-value information of the image; the second branch performs the same feature extraction as the first branch and, after the feature map is generated, also calculates the spatial position relationship between the pixels in the feature map to obtain the spatial position information, realizing the extraction of the spatial position relationships between pixels.
  • the first branch, used to extract features from the input image to generate a feature map, can be composed of several convolutional layers; the second branch can be composed of several convolutional layers identical to those of the first branch plus one feature map depth convolution layer, so as to perform feature extraction on the input image to generate a feature map and then calculate the spatial position relationship between pixels in the feature map to obtain the spatial position information.
  • the N information extraction modules in the encoder can be connected in series through the maximum pooling layer.
  • each pixel in the feature map depth convolution layer can be mapped to a different field of view on the original image.
  • in addition, a Batch Normalization layer can be added between the convolutional layers in each information extraction module, and L2 regularization can be added to the loss function.
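The following PyTorch sketch shows one way the information extraction module described above could be organized: a first branch of convolutional layers (with Batch Normalization) extracting pixel-value features, and a second branch with the same convolutional layers followed by a "feature map depth convolution". Approximating that depth convolution with a 1x1 convolution across channels, and fusing the two branches by element-wise addition, are simplifying assumptions of this sketch; the patent does not fix either choice here.

```python
import torch
import torch.nn as nn

class InfoExtractionModule(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        def convs():
            return nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, padding=1),
                nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
                nn.Conv2d(out_ch, out_ch, 3, padding=1),
                nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            )
        self.branch1 = convs()                          # pixel-value features
        self.branch2 = convs()                          # same layers, then...
        self.depth_conv = nn.Conv2d(out_ch, out_ch, 1)  # ...cross-channel mixing

    def forward(self, x):
        feat = self.branch1(x)                  # feature map
        pos = self.depth_conv(self.branch2(x))  # spatial position information
        return feat + pos                       # fused feature map

# the N modules are connected in series through max pooling, as described above
encoder = nn.Sequential(
    InfoExtractionModule(1, 64), nn.MaxPool2d(2),
    InfoExtractionModule(64, 128), nn.MaxPool2d(2),
)
print(encoder(torch.randn(1, 1, 64, 64)).shape)  # torch.Size([1, 128, 16, 16])
```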
  • when the image to be segmented is input to the first information extraction module, feature extraction is performed on the image to be segmented through several convolutional layers in the first branch of the first information extraction module to generate a feature map; at the same time, several convolutional layers in the second branch of the first information extraction module perform feature extraction on the image to be segmented to generate a feature map, and the feature map depth convolution layer in the second branch calculates the spatial position relationship between pixels in that feature map to obtain the spatial position information.
  • a new feature map is then generated by fusing, through the pooling layer, the feature map output by the first branch of the first information extraction module and the spatial position information output by the second branch, and the newly generated feature map is input to the next information extraction module, so that the first branch of the next information extraction module performs feature extraction on the feature map input to the module, and the second branch of the next information extraction module performs feature extraction on that feature map and calculates the spatial position relationship.
  • finally, the first branch of the Nth information extraction module performs feature extraction on the feature map input to the module to generate a feature map containing multi-scale information, and the second branch of the Nth information extraction module performs feature extraction on that feature map and calculates the spatial position relationship to obtain spatial position information containing multi-scale information.
  • after the N information extraction modules connected in series in the encoder perform feature extraction on the image to be segmented to generate feature maps, and the spatial position relationship between pixels in the feature map generated by each information extraction module is calculated to obtain the spatial position information, the Nth information extraction module outputs the finally obtained feature map and spatial position information, which are fused to obtain a feature map containing spatial position information, completing the feature fusion.
  • the image segmentation model can be composed of an encoder and a decoder
  • the decoder needs to perform image segmentation on the image to be segmented according to the context information sent by the encoder.
  • Context information can be generated by fusing the feature map and spatial position information output by the Nth information extraction module through the pooling layer in the encoder.
  • the image segmentation model can include an encoder and a decoder, and the encoder and the decoder have a symmetrical structure.
  • the decoder mirrors the convolutional layer structure of the encoder with corresponding transposed convolutional layers, and, in order to allow the neural network to retain shallower information, the encoder and the decoder are connected by skip connections.
  • the image to be segmented is segmented by the decoder according to the context information encoded by the encoder, and the target image is output.
  • since the context information is generated based on the feature map containing the spatial position information, the decoder can obtain the feature relationships between the pixels of the feature map according to the spatial position relationships between the pixels in the context information, thereby segmenting the boundary between "different types of similar features" and "same type of difference features" and realizing precise segmentation of the boundaries between different objects to be segmented.
  • the image segmentation method of this embodiment inputs an image to be segmented into an image segmentation model, performs feature extraction on the image to be segmented to generate a feature map, and calculates the spatial position relationship between pixels in the feature map to obtain spatial position information; fuses the feature map and the spatial position information to obtain a feature map containing spatial position information; and segments the image to be segmented according to the feature map containing spatial position information to output a target image.
  • the spatial position information is obtained by calculating the spatial position relationship between the pixels in the feature map, extracting the relative position relationships of pixels at different spatial positions in the feature map.
  • after the feature map containing the image information is fused with the calculated spatial position information to obtain a feature map containing spatial position information, the image to be segmented is segmented according to that feature map, so that the image segmentation model can obtain the feature relationships between pixels of the feature map from the spatial position relationships between those pixels, thereby segmenting the boundary between "different types of similar features" and "same type of difference features", achieving accurate segmentation of the boundaries between different objects to be segmented and improving the segmentation accuracy of the image.
  • FIG. 3 is a schematic flowchart of the image segmentation method provided in the second embodiment of the present invention.
  • on the basis of the first embodiment, this embodiment further details the process of calculating the spatial position relationship between pixels in the feature map to obtain spatial position information, thereby further improving the accuracy of image segmentation.
  • the method specifically includes:
  • the N information extraction modules are set according to the preset scale information, so that each information extraction module has different scale information.
  • feature extraction of the image to be segmented through the N information extraction modules can therefore generate a feature map containing multi-scale information; and after each information extraction module performs feature extraction on the image to be segmented, the spatial position relationship between pixels in the feature map generated by that information extraction module is calculated to obtain spatial position information.
  • since the N information extraction modules correspond to different scale information, calculating the spatial position relationship between the pixels in their feature maps yields spatial position information containing multi-scale information.
  • each information extraction module can include two branches.
  • the first branch is used to extract features from the input image to generate a feature map; the second branch performs feature extraction on the input image in the same way as the first branch and, after the feature map is generated, also calculates the spatial position relationship between the pixels in the feature map to obtain the spatial position information.
  • the first branch, used to extract features from the input image to generate a feature map, can be composed of several convolutional layers; the second branch can be composed of several convolutional layers identical to those of the first branch plus one feature map depth convolution layer, so as to perform feature extraction on the input image to generate a feature map and then calculate the spatial position relationship between pixels in the feature map to obtain the spatial position information.
  • the N information extraction modules in the encoder can be connected in series through the maximum pooling layer.
  • feature extraction is performed on the image to be segmented through a number of convolutional layers in the first branch of the first information extraction module to generate a feature map; at the same time, the convolutional layers in the second branch of the first information extraction module perform feature extraction on the image to be segmented to generate a feature map, and the feature map depth convolution layer in the second branch calculates the spatial position relationship between pixels in that feature map to obtain spatial position information.
  • a new feature map is then generated by fusing, through the pooling layer, the feature map output by the first branch of the first information extraction module and the spatial position information output by the second branch, and the newly generated feature map is input to the next information extraction module, so that the first branch of the next information extraction module performs feature extraction on the feature map input to the module, and the second branch of the next information extraction module performs feature extraction on that feature map and calculates the spatial position relationship.
  • finally, the first branch of the Nth information extraction module performs feature extraction on the feature map input to the module to generate a feature map containing multi-scale information, and the second branch of the Nth information extraction module performs feature extraction on that feature map and calculates the spatial position relationship to obtain spatial position information containing multi-scale information.
  • as described above, each information extraction module can include two branches, and the second branch can be composed of several convolutional layers identical to those of the first branch plus one feature map depth convolution layer, so as to perform feature extraction on the input image to generate a feature map and then calculate the spatial position relationship between the pixels in the feature map to obtain the spatial position information.
  • therefore, for each of the information extraction modules, convolving the feature map in a direction perpendicular to the feature map generated by the information extraction module through a convolutional neural network can be: using the feature map depth convolution layer of the second branch in the information extraction module to convolve the feature map in the direction perpendicular to the feature map obtained by the convolution calculation of the several convolutional layers of the second branch, and calculating the spatial position relationship between pixels in the feature map to obtain the spatial position information.
  • if the feature map generated by the information extraction module, that is, the feature map obtained by the convolution calculation of the several convolutional layers of the second branch, is a two-dimensional feature map, the formula used by the feature map depth convolution layer of the second branch in the information extraction module to calculate the spatial position relationship between pixels in the feature map is:
  • $a = \delta\bigl(\sum_{c=1}^{k} w_{(i,j)}^{l} \odot x_{c}^{l} + b\bigr)$
  • where a is the spatial position information; δ is the activation function; l is the number of convolutional layers of the convolutional neural network; w_(i,j) is the weight coefficient of the pixel with coordinate (i, j) in the feature map; x_c^l denotes the c-th channel of the feature map at layer l; k is the number of channels of the feature map; b is the offset; and ⊙ is the Hadamard product.
  • the feature map depth convolution layer of the second branch in the information extraction module can use an H × W × C convolution kernel, where H × W represents the size of the convolution kernel and C represents the number of convolution kernels, whose value equals the number of pixels in the XY plane of the output feature map.
  • as shown in FIG. 4, it is a schematic diagram of the convolution calculation of the feature map depth convolution layer of the second branch in the information extraction module. To calculate the output of the depth convolution of the two-dimensional feature map, the H × W convolution kernel is first placed at the upper-left corner of the feature map, and the first convolution operation is performed.
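The following NumPy sketch illustrates the scheme FIG. 4 depicts for the two-dimensional case: each of the C kernels starts at the upper-left corner of the feature map and slides along the channel axis, i.e. the direction perpendicular to the XY plane, and the C results are arranged on the XY plane. Keeping each kernel at the upper-left XY position and summing its per-channel responses are assumptions of this sketch, since the combination rule is not spelled out above.

```python
import numpy as np

def feature_map_depth_conv(x, kernels):
    """x: (channels, H, W) feature map; kernels: (C, H_k, W_k) with
    C equal to the number of pixels in the output XY plane."""
    channels, H, W = x.shape
    C, Hk, Wk = kernels.shape
    side = int(np.sqrt(C))     # arrange the C results on the XY plane
    out = np.empty(C)
    for c in range(C):
        # slide kernel c through every channel slice (perpendicular to
        # the feature map) at the upper-left position, summing responses
        out[c] = sum(np.sum(kernels[c] * x[ch, :Hk, :Wk])
                     for ch in range(channels))
    return out.reshape(side, side)

x = np.random.rand(8, 6, 6)          # 8 channels of a 6x6 feature map
kernels = np.random.rand(16, 3, 3)   # C = 16 kernels -> 4x4 output plane
print(feature_map_depth_conv(x, kernels).shape)  # (4, 4)
```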
  • if the feature map generated by the information extraction module, that is, the feature map obtained by the convolution calculation of the several convolutional layers of the second branch, is a three-dimensional feature map, the formula used by the feature map depth convolution layer of the second branch in the information extraction module to calculate the spatial position relationship between pixels in the feature map is:
  • $a = \delta\bigl(\sum_{c=1}^{m} w_{(i,j,k)}^{l} \odot x_{c}^{l} + b\bigr)$
  • where a is the spatial position information; δ is the activation function; l is the number of convolutional layers of the convolutional neural network; w_(i,j,k) is the weight coefficient of the pixel with coordinate (i, j, k) in the feature map; x_c^l denotes the c-th channel of the feature map at layer l; m is the number of channels of the feature map; b is the offset; and ⊙ is the Hadamard product.
  • the feature map depth convolution layer of the second branch in the information extraction module can use a convolution kernel of H × W × P × C, where H × W × P represents the size of the convolution kernel and C represents the number of convolution kernels, whose value equals the number of pixels in the XY plane of the output feature map.
  • to calculate the output, the H × W × P convolution kernel is first placed at the upper-left corner of the feature map, and the first three-dimensional convolution operation is performed; the convolution kernel is then slid along the Z axis, and the same three-dimensional convolution operation is performed in turn in the direction perpendicular to the feature map.
  • the calculation results of the C convolution kernels are arranged on the XY plane according to their positions in the feature map to obtain the spatial position information.
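The three-dimensional case can be sketched under the same assumptions as the two-dimensional code above: an H_k × W_k × P_k kernel starts at the upper-left corner of the three-dimensional feature map, slides along the Z axis, and the results of the C kernels are arranged on the XY plane. The summation over the Z positions is again an assumption of this sketch.

```python
import numpy as np

def feature_map_depth_conv_3d(x, kernels):
    """x: (D, H, W) three-dimensional feature map; kernels: (C, H_k, W_k, P_k)."""
    D, H, W = x.shape
    C, Hk, Wk, Pk = kernels.shape
    side = int(np.sqrt(C))
    out = np.empty(C)
    for c in range(C):
        # sum the kernel's responses at every valid position along the Z axis;
        # each (P_k, H_k, W_k) slab is reordered to match the kernel layout
        out[c] = sum(np.sum(kernels[c] * x[z:z + Pk, :Hk, :Wk].transpose(1, 2, 0))
                     for z in range(D - Pk + 1))
    return out.reshape(side, side)

x = np.random.rand(10, 6, 6)             # depth 10, 6x6 in the XY plane
kernels = np.random.rand(16, 3, 3, 2)    # C = 16 kernels of size 3x3x2
print(feature_map_depth_conv_3d(x, kernels).shape)  # (4, 4)
```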
  • Context information is generated by fusing the feature map and spatial position information output by the Nth information extraction module through the pooling layer in the encoder.
  • the image to be segmented is segmented by the decoder according to the context information encoded by the encoder, and the target image is output. Since the context information is generated based on the feature map containing the spatial position information, the decoder can obtain the feature relationships between the pixels of the feature map according to the spatial position relationships between the pixels in the context information, thereby segmenting the boundary between "different types of similar features" and "same type of difference features" and realizing precise segmentation of the boundaries between different objects to be segmented.
  • FIG. 5 shows the image segmentation device provided in the third embodiment of the present invention.
  • an embodiment of the present invention also provides an image segmentation device, as shown in FIG. 5, and the device includes:
  • the image feature and location information extraction module 501 is used to input the image to be segmented into the image segmentation model, perform feature extraction on the image to be segmented to generate a feature map, and calculate the spatial position relationship between pixels in the feature map to obtain the spatial location information;
  • when inputting the image to be segmented into the image segmentation model, performing feature extraction on the image to be segmented to generate a feature map, and calculating the spatial position relationship between pixels in the feature map to obtain the spatial position information, the image feature and location information extraction module 501 includes:
  • the image feature extraction unit is configured to perform feature extraction on the image to be segmented to generate a feature map through N information extraction modules connected in series in the encoder; the N information extraction modules are set according to preset scale information, N ⁇ 1;
  • the location information extraction unit is configured to calculate the spatial location relationship between pixels in the feature map generated by the information extraction module for each of the information extraction modules to obtain spatial location information.
  • the position information extraction unit includes:
  • the position information extraction subunit is used to convolve, for each of the information extraction modules, the feature map in a direction perpendicular to the feature map generated by the information extraction module through a convolutional neural network, and calculate the spatial position relationship between the pixels in the feature map to obtain the spatial position information.
  • the feature fusion module 502 is configured to fuse the feature map and the spatial location information to obtain a feature map containing the spatial location information;
  • when fusing the feature map and the spatial location information to obtain a feature map containing spatial location information, the feature fusion module 502 includes:
  • the feature fusion unit is used for fusing the feature map and the spatial position information output by the Nth information extraction module through the encoder to generate context information.
  • the image segmentation module 503 is configured to segment the image to be segmented according to the feature map containing spatial position information, and output a target image.
  • when segmenting the image to be segmented according to the feature map containing spatial position information and outputting the target image, the image segmentation module 503 includes:
  • the image segmentation unit is configured to segment the image to be segmented according to the context information by the decoder, and output a target image.
  • the image segmentation device of this embodiment inputs an image to be segmented into an image segmentation model, performs feature extraction on the image to be segmented to generate a feature map, and calculates the spatial position relationship between pixels in the feature map to obtain spatial position information; fuses the feature map and the spatial position information to obtain a feature map containing spatial position information; and segments the image to be segmented according to the feature map containing spatial position information to output a target image.
  • the spatial position information is obtained by calculating the spatial position relationship between the pixels in the feature map, extracting the relative position relationships of pixels at different spatial positions in the feature map.
  • after the feature map containing the image information is fused with the calculated spatial position information to obtain a feature map containing spatial position information, the image to be segmented is segmented according to that feature map, so that the image segmentation model can obtain the feature relationships between pixels of the feature map from the spatial position relationships between those pixels, thereby segmenting the boundary between "different types of similar features" and "same type of difference features", achieving accurate segmentation of the boundaries between different objects to be segmented and improving the segmentation accuracy of the image.
  • Fig. 6 is a schematic structural diagram of a server provided in the fourth embodiment of the present invention.
  • the server includes a processor 61, a memory 62, and a computer program 63 stored in the memory 62 and running on the processor 61, such as a program for an image segmentation method.
  • the processor 61 implements the steps in the embodiment of the image segmentation method when the computer program 63 is executed, for example, steps S110 to S130 shown in FIG. 1.
  • the computer program 63 may be divided into one or more modules, and the one or more modules are stored in the memory 62 and executed by the processor 61 to complete the application.
  • the one or more modules may be a series of computer program instruction segments capable of completing specific functions, and the instruction segments are used to describe the execution process of the computer program 63 in the server.
  • the computer program 63 can be divided into an image feature and location information extraction module, a feature fusion module, and an image segmentation module, and the specific functions of each module are as follows:
  • the image feature and location information extraction module is used to input the image to be segmented into the image segmentation model, perform feature extraction on the image to be segmented to generate a feature map, and calculate the spatial position relationship between pixels in the feature map to obtain the spatial position information;
  • the feature fusion module is used to fuse the feature map and the spatial location information to obtain a feature map containing the spatial location information
  • the image segmentation module is configured to segment the image to be segmented according to the feature map containing the spatial position information, and output a target image.
  • the server may include, but is not limited to, a processor 61, a memory 62, and a computer program 63 stored in the memory 62.
  • FIG. 6 is only an example of a server and does not constitute a limitation on the server; the server may include more or fewer components than those shown in the figure, a combination of certain components, or different components; for example, the server may also include input and output devices, network access devices, buses, and so on.
  • the processor 61 may be a central processing unit (Central Processing Unit, CPU), another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the memory 62 may be an internal storage unit of the server, such as a hard disk or memory of the server.
  • the memory 62 may also be an external storage device, such as a plug-in hard disk equipped on a server, a smart memory card (Smart Media Card, SMC), a Secure Digital (SD) card, a flash memory card (Flash Card), and so on.
  • the memory 62 may also include both an internal storage unit of the server and an external storage device.
  • the memory 62 is used to store the computer program and other programs and data required by the image segmentation method.
  • the memory 62 can also be used to temporarily store data that has been output or will be output.
  • the disclosed device/terminal device and method may be implemented in other ways.
  • the device/terminal device embodiments described above are only illustrative.
  • the division of the modules or units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components can be combined or integrated into another system, or some features can be omitted or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • if the integrated module/unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • implementation of all or part of the processes in the methods of the above embodiments of the present invention can also be completed by instructing relevant hardware through a computer program.
  • the computer program can be stored in a computer-readable storage medium. When the program is executed by the processor, it can implement the steps of the foregoing method embodiments.
  • the computer program includes computer program code, and the computer program code may be in the form of source code, object code, executable file, or some intermediate forms.
  • the computer-readable medium may include: any entity or device capable of carrying the computer program code, recording medium, U disk, mobile hard disk, magnetic disk, optical disk, computer memory, read-only memory (ROM, Read-Only Memory) , Random Access Memory (RAM, Random Access Memory), electrical carrier signal, telecommunications signal, and software distribution media, etc.
  • the content contained in the computer-readable medium can be appropriately added or deleted according to the requirements of the legislation and patent practice in the jurisdiction.
  • for example, in some jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunication signals.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

An image segmentation method and apparatus, and a server. The method comprises: inputting an image to be segmented into an image segmentation model, performing feature extraction on the image to be segmented so as to generate a feature graph, and calculating a spatial position relation between pixel points in the feature graph so as to obtain spatial position information (S110); fusing the feature graph and the spatial position information to obtain a feature graph comprising the spatial position information (S120); and according to the feature graph comprising the spatial position information, segmenting the image to be segmented, and outputting a target image (S130). The method solves the problem that the boundary between different targets to be segmented cannot be accurately segmented.

Description

Image segmentation method, device and server

Technical field

The present invention relates to the technical field of image segmentation, and in particular to an image segmentation method, device and server.

Background

Image segmentation is one of the research hotspots of computer graphics, and it has important applications in fields such as medical disease diagnosis and unmanned driving. At present, there are many image segmentation algorithms, among which the U-Net (U-shaped neural network) algorithm is one of the most commonly used. The U-shaped neural network algorithm is composed of an encoder and a decoder, which are connected by concatenation along the image channel dimension. Specifically, the image to be segmented first passes through the encoder for image feature extraction. The encoder is composed of multiple convolutional layers connected by pooling layers, which reduce the dimensions of the original image to a certain size. The image output from the encoder is then restored to the original image size by the decoder, which is composed of multiple convolutional layers connected by transposed convolutional layers. Finally, the output image is converted into a probability map using the softmax activation function. Compared with traditional image segmentation algorithms such as threshold segmentation, region segmentation and edge segmentation, the UNet algorithm has a simple network structure and high segmentation accuracy. However, the current UNet image segmentation algorithm tends to exaggerate the differences between objects of the same class (inter-class distinction) or the similarity between objects of different classes (intra-class consistency), and cannot segment the boundary between "different types of similar features" and "same type of difference features". As a result, the boundaries between different objects to be segmented cannot be accurately segmented, and the segmentation accuracy of the image is low.
Summary of the invention

In view of this, the embodiments of the present invention provide an image segmentation method, device and server to solve the problem that the boundaries between different objects to be segmented cannot be accurately segmented.

The first aspect of the embodiments of the present invention provides an image segmentation method, including:

inputting an image to be segmented into an image segmentation model, performing feature extraction on the image to be segmented to generate a feature map, and calculating the spatial position relationship between pixels in the feature map to obtain spatial position information;

fusing the feature map and the spatial position information to obtain a feature map containing the spatial position information;

segmenting the image to be segmented according to the feature map containing the spatial position information, and outputting a target image.
In an implementation example, the inputting of the image to be segmented into an image segmentation model, performing feature extraction on the image to be segmented to generate a feature map, and calculating the spatial position relationship between pixels in the feature map to obtain spatial position information includes:

performing feature extraction on the image to be segmented by N information extraction modules connected in series in the encoder to generate a feature map, where the N information extraction modules are set according to preset scale information and N ≥ 1;

for each of the information extraction modules, calculating the spatial position relationship between pixels in the feature map generated by the information extraction module to obtain spatial position information.

In an implementation example, the inputting of the image to be segmented into an image segmentation model, performing feature extraction on the image to be segmented to generate a feature map, and calculating the spatial position relationship between pixels in the feature map to obtain spatial position information includes:

when the image to be segmented is input to the first information extraction module, performing feature extraction on the image to be segmented to generate a feature map, and calculating the spatial position relationship between pixels in the feature map to obtain spatial position information;

fusing the feature map and the spatial position information to generate a new feature map and outputting it to the next information extraction module, so that the next information extraction module performs feature extraction and spatial position relationship calculation on the new feature map.

In an implementation example, the fusing of the feature map and the spatial position information to obtain a feature map containing spatial position information includes:

generating context information by fusing, through the encoder, the feature map and the spatial position information output by the Nth information extraction module.
In an implementation example, the calculating, for each of the information extraction modules, of the spatial position relationship between pixels in the feature map generated by the information extraction module to obtain spatial position information includes:

for each of the information extraction modules, convolving the feature map, through a convolutional neural network, in a direction perpendicular to the feature map generated by the information extraction module, and calculating the spatial position relationship between pixels in the feature map to obtain the spatial position information.

In an implementation example, the calculating, for each of the information extraction modules, of the spatial position relationship between pixels in the feature map generated by the information extraction module to obtain spatial position information includes:

if the feature map generated by the information extraction module is a two-dimensional feature map, the formula for calculating the spatial position relationship between pixels in the feature map generated by the information extraction module is:

$a = \delta\bigl(\sum_{c=1}^{k} w_{(i,j)}^{l} \odot x_{c}^{l} + b\bigr)$

where a is the spatial position information; δ is the activation function; l is the number of convolutional layers of the convolutional neural network; w_(i,j) is the weight coefficient of the pixel with coordinate (i, j) in the feature map; x_c^l denotes the c-th channel of the feature map at layer l; k is the number of channels of the feature map; b is the offset; and ⊙ is the Hadamard product.

In an implementation example, the calculating, for each of the information extraction modules, of the spatial position relationship between pixels in the feature map generated by the information extraction module to obtain spatial position information includes:

if the feature map generated by the information extraction module is a three-dimensional feature map, the formula for calculating the spatial position relationship between pixels in the feature map generated by the information extraction module is:

$a = \delta\bigl(\sum_{c=1}^{m} w_{(i,j,k)}^{l} \odot x_{c}^{l} + b\bigr)$

where a is the spatial position information; δ is the activation function; l is the number of convolutional layers of the convolutional neural network; w_(i,j,k) is the weight coefficient of the pixel with coordinate (i, j, k) in the feature map; x_c^l denotes the c-th channel of the feature map at layer l; m is the number of channels of the feature map; b is the offset; and ⊙ is the Hadamard product.
In an implementation example, the segmenting of the image to be segmented according to the feature map containing the spatial position information and outputting a target image includes:

segmenting, by the decoder, the image to be segmented according to the context information, and outputting the target image.

The second aspect of the embodiments of the present invention provides an image segmentation device, including:

an image feature and position information extraction module, used to input the image to be segmented into the image segmentation model, perform feature extraction on the image to be segmented to generate a feature map, and calculate the spatial position relationship between pixels in the feature map to obtain the spatial position information;

a feature fusion module, used to fuse the feature map and the spatial position information to obtain a feature map containing the spatial position information;

an image segmentation module, used to segment the image to be segmented according to the feature map containing the spatial position information and output a target image.

The third aspect of the embodiments provides a server, including a memory, a processor, and a computer program stored in the memory and runnable on the processor, where the processor, when executing the computer program, implements the image segmentation method of the first aspect.

According to the image segmentation method, device and server provided by the embodiments of the present invention, an image to be segmented is input into an image segmentation model, feature extraction is performed on the image to be segmented to generate a feature map, and the spatial position relationship between pixels in the feature map is calculated to obtain spatial position information; the feature map and the spatial position information are fused to obtain a feature map containing spatial position information; and the image to be segmented is segmented according to the feature map containing spatial position information to output a target image. The spatial position information is obtained by calculating the spatial position relationship between pixels in the feature map, so the relative position relationships of pixels at different spatial positions in the feature map can be extracted. After the feature map containing the image information is fused with the calculated spatial position information to obtain a feature map containing spatial position information, the image to be segmented is segmented according to that feature map, so that the image segmentation model can obtain the feature relationships between pixels of the feature map from the spatial position relationships between those pixels, thereby segmenting the boundary between "different types of similar features" and "same type of difference features", achieving accurate segmentation of the boundaries between different objects to be segmented and improving the segmentation accuracy of the image.
Description of the Drawings
To explain the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative work.
FIG. 1 is a schematic flowchart of an image segmentation method provided by Embodiment 1 of the present invention;
FIG. 2 is a schematic structural diagram of an image segmentation model provided by Embodiment 1 of the present invention;
FIG. 3 is a schematic flowchart of an image segmentation method provided by Embodiment 2 of the present invention;
FIG. 4 is a schematic diagram of the convolution computation of the feature-map depth convolution layer in the second branch of an information extraction module provided by Embodiment 2 of the present invention;
FIG. 5 is a schematic structural diagram of an image segmentation apparatus provided by Embodiment 3 of the present invention;
FIG. 6 is a schematic structural diagram of a server provided by Embodiment 4 of the present invention.
Detailed Description
To enable those skilled in the art to better understand the solutions of the present invention, the technical solutions in the embodiments of the present invention are clearly described below in conjunction with the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of the present invention.
The term "comprising" in the specification and claims of the present invention and in the above drawings, and any variations thereof, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or optionally also includes other steps or units inherent to the process, method, product, or device. In addition, the terms "first", "second", "third", and the like are used to distinguish different objects rather than to describe a specific order.
Embodiment 1
FIG. 1 is a schematic flowchart of the image segmentation method provided by Embodiment 1 of the present invention. This embodiment is applicable to application scenarios of multi-target segmentation of an image. The method may be executed by an image segmentation apparatus, which may be a server, a smart terminal, a tablet, a PC, or the like; in the embodiments of the present application, the image segmentation apparatus is taken as the executing subject for illustration. The method specifically includes the following steps:
S110. Input an image to be segmented into an image segmentation model, perform feature extraction on the image to be segmented to generate a feature map, and calculate the spatial position relationship between pixels in the feature map to obtain spatial position information.
In existing image segmentation methods, an image segmentation model containing a neural network can be built by deep learning on target images. However, the image features extracted from the image to be segmented after the convolution computations of the multiple convolutional layers in a trained image segmentation model often exaggerate the differences between objects of the same class (inter-class distinction) or the similarity between objects of different classes (intra-class consistency). As a result, when the image segmentation model segments the target image from the image to be segmented based on the extracted features, it cannot separate the boundary between "similar features of different classes" and "differing features of the same class", causing over-segmentation and under-segmentation during the segmentation process and making it difficult to accurately segment the boundaries between different target images to be segmented. To solve this technical problem, the feature relationships between feature-image pixels can be extracted from different levels of the convolutional neural network of the image segmentation model, overcoming the inability to segment such boundaries.
Specifically, the image to be segmented can be segmented by an image segmentation model trained on multiple target images. After the image to be segmented is input into the image segmentation model, image feature extraction is performed on it to generate a feature map, and the spatial position relationship between pixels in the feature map is calculated to obtain spatial position information, thereby obtaining the relative positional relationships of feature-map pixels at different spatial positions.
In one implementation example, the image segmentation model may adopt a U-shaped neural network (Feature Depth UNet) framework, in which an encoder and a decoder form a symmetric structure and are spliced along the image channel dimension. FIG. 2 is a schematic structural diagram of the image segmentation model. The specific process of performing feature extraction on the image to be segmented to generate a feature map and calculating the spatial position relationship between pixels in the feature map to obtain spatial position information may be: performing feature extraction on the image to be segmented through N information extraction modules connected in series in the encoder to generate a feature map, where the N information extraction modules are set according to preset scale information and N ≥ 1; and, for each information extraction module, calculating the spatial position relationship between pixels in the feature map generated by that information extraction module to obtain spatial position information.
Specifically, the encoder includes N information extraction modules connected in series that perform image feature extraction on the input image to be segmented to generate feature maps. The N information extraction modules are set according to preset scale information, so that each information extraction module has different scale information. After the image to be segmented is input into the image segmentation model, feature extraction through the N information extraction modules can generate feature maps containing multi-scale information; and after each information extraction module performs feature extraction, the spatial position relationship between pixels in the feature map generated by that module is calculated to obtain spatial position information. Computing the spatial position relationships through N information extraction modules corresponding to different scale information thus yields spatial position information containing multi-scale information.
In one implementation example, performing feature extraction on the image to be segmented through the N serially connected information extraction modules in the encoder to generate a feature map, and calculating the spatial position relationship between pixels in the feature map generated by each information extraction module to obtain spatial position information, may proceed as follows: when the image to be segmented is input into the first information extraction module, feature extraction is performed on it to generate a feature map, and the spatial position relationship between pixels in the feature map is calculated to obtain spatial position information; the feature map and the spatial position information are then fused to generate a new feature map, which is output to the next information extraction module, so that the next information extraction module performs feature extraction and spatial-position-relationship calculation on the new feature map.
Specifically, each information extraction module may include two branches. The first branch is used to perform feature extraction on the input image to generate a feature map, extracting the pixel-value information of the image; the second branch, after performing feature extraction on the input image in the same way as the first branch to generate a feature map, further calculates the spatial position relationship between pixels in that feature map to obtain spatial position information, extracting the spatial-position-relationship information between pixels. Optionally, the first branch, which extracts features from the input image to generate a feature map, may consist of several convolutional layers; the second branch may consist of the same several convolutional layers as the first branch plus one feature-map depth convolution layer, so that after feature extraction generates a feature map, the spatial position relationship between its pixels is calculated to obtain spatial position information. The N information extraction modules in the encoder may be connected in series through max pooling layers.
Stacking several convolutional layers in the second branch before the feature-map depth convolution layer enlarges the receptive field, so that when the feature map produced by those convolutional layers is input into the feature-map depth convolution layer to compute the spatial position relationships between its pixels, each pixel in the feature-map depth convolution layer can be mapped to a different receptive field on the original image. Optionally, to reduce overfitting, a Batch Normalization layer can be added between the convolutional layers in each information extraction module, and L2 regularization can be added to the loss function.
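As a concrete illustration, the following is a minimal PyTorch sketch of one such two-branch information extraction module. It is a sketch under assumptions rather than the patented implementation: the layer count, channel widths, 3×3 kernels, and the injected depth_conv submodule (standing in for the feature-map depth convolution layer, sketched under Embodiment 2 below) are all illustrative, hypothetical choices.

```python
import torch
import torch.nn as nn


class InfoExtractionModule(nn.Module):
    """Two-branch block: branch 1 extracts pixel-value features; branch 2
    applies the same convolutions and then a feature-map depth convolution
    to obtain spatial position information."""

    def __init__(self, in_ch: int, out_ch: int, depth_conv: nn.Module):
        super().__init__()

        def conv_stack() -> nn.Sequential:
            # A few 3x3 convolutional layers with Batch Normalization
            # between them, as suggested above to reduce overfitting.
            return nn.Sequential(
                nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
                nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )

        self.branch1 = conv_stack()        # feature map (pixel-value information)
        self.branch2_convs = conv_stack()  # same conv layers, enlarging the receptive field
        self.branch2_depth = depth_conv    # feature-map depth convolution layer

    def forward(self, x: torch.Tensor):
        feat = self.branch1(x)                            # image features
        pos = self.branch2_depth(self.branch2_convs(x))   # spatial position information
        return feat, pos
```

The L2 regularization mentioned above would live in the training loop (for example, as the optimizer's weight decay) rather than inside the module itself.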
In detail, when the image to be segmented is input into the first information extraction module, feature extraction is performed on it through the several convolutional layers in the first branch of the first information extraction module to generate a feature map; at the same time, the several convolutional layers in the second branch of the first information extraction module perform feature extraction on the image to be segmented to generate a feature map, and the feature-map depth convolution layer in the second branch calculates the spatial position relationship between pixels in that feature map to obtain spatial position information. The feature map output by the first branch and the spatial position information output by the second branch are fused through a pooling layer to generate a new feature map, which is input into the next information extraction module, so that the first branch of the next module performs feature extraction on the input feature map while the second branch performs feature extraction and calculates the spatial position relationship. This continues until the first branch of the N-th information extraction module generates a feature map containing multi-scale information and the second branch of the N-th module performs feature extraction and calculates the spatial position relationship to obtain spatial position information containing multi-scale information.
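Under the same assumptions, the serial wiring of the N modules might look as follows. Note that the patent states only that a pooling layer fuses the two branch outputs into a new feature map; element-wise addition followed by max pooling is an assumed concrete choice here, not the confirmed fusion operator.

```python
import torch
import torch.nn as nn


class Encoder(nn.Module):
    """N information extraction modules in series, joined by max pooling."""

    def __init__(self, blocks):            # blocks: iterable of InfoExtractionModule
        super().__init__()
        self.blocks = nn.ModuleList(blocks)
        self.pool = nn.MaxPool2d(kernel_size=2)

    def forward(self, x: torch.Tensor):
        skips = []
        fused = x
        for i, block in enumerate(self.blocks):
            feat, pos = block(x)            # pos broadcasts as a (B, 1, H, W) map
            fused = feat + pos              # assumed fusion into a new feature map
            skips.append(fused)             # kept for the decoder's skip connections
            if i < len(self.blocks) - 1:
                x = self.pool(fused)        # hand the new feature map to the next module
        return fused, skips                 # multi-scale context + skip features
```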
S120. Fuse the feature map and the spatial position information to obtain a feature map containing spatial position information.
After the N serially connected information extraction modules in the encoder have performed feature extraction on the image to be segmented to generate feature maps, and the spatial position relationship between pixels has been calculated for the feature map generated by each module, the N-th information extraction module outputs the final feature map and spatial position information. The feature map and spatial position information output by the N-th information extraction module are fused to obtain a feature map, completing the feature fusion.
In one implementation example, since the image segmentation model may consist of an encoder and a decoder, the decoder needs to segment the image to be segmented according to the context information sent by the encoder. The context information can be generated by fusing, through a pooling layer in the encoder, the feature map and spatial position information output by the N-th information extraction module.
S130. Segment the image to be segmented according to the feature map containing spatial position information, and output a target image.
The image segmentation model may include an encoder and a decoder with a symmetric structure: the decoder is provided with transposed convolution layers corresponding to the convolutional-layer structure of the encoder, and, so that the neural network retains shallower-level information, the encoder and decoder are connected by skip connections. In one implementation example, the decoder segments the image to be segmented according to the context information encoded by the encoder and outputs a target image. Since the context information is generated from the feature map containing spatial position information, the decoder can derive the feature relationships between feature-map pixels from the spatial position relationships between pixels in the context information, thereby segmenting the boundary between "similar features of different classes" and "differing features of the same class" and achieving accurate segmentation of the boundaries between different targets to be segmented.
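A hedged sketch of the symmetric decoder side follows: transposed convolutions mirror the encoder's pooling steps, skip connections splice encoder features along the channel dimension, and a final softmax yields the probability map. The channel sizes, depth, and class count are illustrative assumptions.

```python
import torch
import torch.nn as nn


class Decoder(nn.Module):
    """Mirror of the encoder: transposed convolutions restore resolution,
    and skip connections concatenate encoder features channel-wise."""

    def __init__(self, chs=(256, 128, 64), n_classes: int = 2):
        super().__init__()
        self.ups = nn.ModuleList(
            nn.ConvTranspose2d(c, c // 2, kernel_size=2, stride=2) for c in chs)
        self.convs = nn.ModuleList(
            nn.Sequential(nn.Conv2d(c, c // 2, kernel_size=3, padding=1),
                          nn.ReLU(inplace=True))
            for c in chs)
        self.head = nn.Conv2d(chs[-1] // 2, n_classes, kernel_size=1)

    def forward(self, context: torch.Tensor, skips):
        x = context
        # Skip features are consumed deepest-first, excluding the context itself.
        for up, conv, skip in zip(self.ups, self.convs, reversed(skips[:-1])):
            x = up(x)                               # transposed convolution upsampling
            x = torch.cat([x, skip], dim=1)         # channel-dimension splicing
            x = conv(x)
        return torch.softmax(self.head(x), dim=1)   # per-pixel probability map
```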
According to the image segmentation method provided by the embodiments of the present invention, an image to be segmented is input into an image segmentation model, feature extraction is performed on the image to be segmented to generate a feature map, and the spatial position relationship between pixels in the feature map is calculated to obtain spatial position information; the feature map and the spatial position information are fused to obtain a feature map containing spatial position information; and the image to be segmented is segmented according to that feature map to output a target image. Calculating the spatial position relationship between pixels in the feature map to obtain spatial position information extracts the relative positional relationships of feature-map pixels at different spatial positions. After fusing the feature map containing image information with the calculated spatial position information, the image to be segmented is segmented according to the feature map containing spatial position information, so that the image segmentation model can derive the feature relationships between feature-map pixels from the spatial position relationships between them, thereby segmenting the boundary between "similar features of different classes" and "differing features of the same class", achieving accurate segmentation of the boundaries between different targets to be segmented and improving the segmentation accuracy of the image.
Embodiment 2
FIG. 3 is a schematic flowchart of the image segmentation method provided by Embodiment 2 of the present invention. On the basis of Embodiment 1, this embodiment further provides the process of calculating the spatial position relationship between pixels in a feature map to obtain spatial position information, thereby further improving the accuracy of image segmentation. The method specifically includes:
S210. Input an image to be segmented into an image segmentation model, and perform feature extraction on the image to be segmented through N information extraction modules connected in series in the encoder to generate a feature map; the N information extraction modules are set according to preset scale information, N ≥ 1.
The N information extraction modules are set according to preset scale information, so that each information extraction module has different scale information. After the image to be segmented is input into the image segmentation model, feature extraction through the N information extraction modules can generate feature maps containing multi-scale information; and after each information extraction module performs feature extraction, the spatial position relationship between pixels in the feature map generated by that module is calculated to obtain spatial position information. Computing the spatial position relationships through N information extraction modules corresponding to different scale information thus yields spatial position information containing multi-scale information.
Specifically, each information extraction module may include two branches. The first branch is used to perform feature extraction on the input image to generate a feature map; the second branch, after performing feature extraction on the input image in the same way as the first branch to generate a feature map, further calculates the spatial position relationship between pixels in that feature map to obtain spatial position information. Optionally, the first branch may consist of several convolutional layers; the second branch may consist of the same several convolutional layers as the first branch plus one feature-map depth convolution layer, so that after feature extraction generates a feature map, the spatial position relationship between its pixels is calculated to obtain spatial position information. The N information extraction modules in the encoder may be connected in series through max pooling layers.
When the image to be segmented is input into the first information extraction module, feature extraction is performed on it through the several convolutional layers in the first branch of the first information extraction module to generate a feature map; at the same time, the several convolutional layers in the second branch perform feature extraction on the image to be segmented to generate a feature map, and the feature-map depth convolution layer in the second branch calculates the spatial position relationship between pixels in that feature map to obtain spatial position information. The feature map output by the first branch and the spatial position information output by the second branch are fused through a pooling layer to generate a new feature map, which is input into the next information extraction module, so that the first branch of the next module performs feature extraction on the input feature map while the second branch performs feature extraction and calculates the spatial position relationship. This continues until the first branch of the N-th information extraction module generates a feature map containing multi-scale information and the second branch of the N-th module obtains spatial position information containing multi-scale information.
S220. For each information extraction module, convolve the feature map generated by the information extraction module along the direction perpendicular to the feature map through a convolutional neural network, and calculate the spatial position relationship between pixels in the feature map to obtain spatial position information.
Specifically, since each information extraction module may include two branches, and the second branch may consist of the same several convolutional layers as the first branch plus one feature-map depth convolution layer, convolving the feature map along the direction perpendicular to the feature map generated by the information extraction module through a convolutional neural network may be: convolving, by the feature-map depth convolution layer of the second branch, the feature map obtained from the convolution computations of the second branch's convolutional layers, along the direction perpendicular to that feature map, and calculating the spatial position relationship between pixels in the feature map to obtain spatial position information.
In one implementation example, if the feature map generated by the information extraction module, that is, the feature map obtained by the convolution computations of the several convolutional layers of the second branch, is a two-dimensional feature map, the formula by which the feature-map depth convolution layer of the second branch calculates the spatial position relationship between pixels in the feature map is:
$$a^{l} = \delta\left(\sum_{c=1}^{k} w_{(i,j)}^{l} \odot x_{c}^{l-1} + b\right)$$
where a is the spatial position information; δ is the activation function; l is the number of convolutional layers of the convolutional neural network; w_(i,j) is the weight coefficient of the pixel with coordinates (i, j) in the feature map; k is the number of channels of the feature map; x_c^(l-1) is the H×W patch of the c-th channel of the input feature map at (i, j); b is the offset; and ⊙ is the Hadamard product.
Specifically, the feature-map depth convolution layer of the second branch in the information extraction module may use H×W×C convolution kernels, where H×W is the size of each kernel and C is the number of kernels, whose value equals the number of pixels of the output feature map in the XY plane. Optionally, FIG. 4 shows a schematic diagram of the convolution computation of the feature-map depth convolution layer of the second branch in the information extraction module. To compute the output of the two-dimensional feature-map depth convolution, the H×W kernel is first placed at the top-left corner of the feature map and the first convolution operation is performed. The kernel is then slid along the Z-axis, performing the same convolution operation successively along the direction perpendicular to the feature map. Finally, the results of the convolution operations of the C kernels are arranged on the XY plane according to their positions in the feature map to obtain the spatial position information.
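One possible reading of this operation in code is sketched below, assuming ReLU for the activation δ and 3×3 kernels; the class name FeatureMapDepthConv2d and all sizes are hypothetical. Each output position (i, j) owns one H×W kernel that is slid along the channel (Z) axis, and the Hadamard products with the patches at (i, j) are summed over the k channels, with the offset b and the activation applied last.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FeatureMapDepthConv2d(nn.Module):
    """One H x W kernel per output pixel (C kernels = number of XY positions).
    Each kernel is slid along the channel (Z) axis: its Hadamard product with
    the patch at its own position (i, j) is summed over the k channels."""

    def __init__(self, h: int, w: int, kh: int = 3, kw: int = 3):
        super().__init__()
        self.kh, self.kw = kh, kw
        # w_(i,j): an independent kernel for every output position.
        self.weight = nn.Parameter(torch.randn(h, w, kh, kw) * 0.01)
        self.bias = nn.Parameter(torch.zeros(1))   # offset b

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, k, H, W); the input spatial size must match (h, w) above.
        B, K, H, W = x.shape
        cols = F.unfold(x, (self.kh, self.kw),
                        padding=(self.kh // 2, self.kw // 2))    # (B, k*kh*kw, H*W)
        cols = cols.view(B, K, self.kh * self.kw, H * W)
        cols = cols.permute(0, 1, 3, 2).reshape(B, K, H, W, self.kh, self.kw)
        wgt = self.weight[None, None]                            # (1, 1, H, W, kh, kw)
        a = (cols * wgt).sum(dim=(1, 4, 5)) + self.bias          # sum over k and the kernel
        return torch.relu(a).unsqueeze(1)                        # (B, 1, H, W) position map
```

With this sketch, depth_conv=FeatureMapDepthConv2d(h=64, w=64) would produce a 64×64 spatial-position map for a 64×64 feature map, which the pooling-layer fusion described above can combine with the first branch's output.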
In one implementation example, if the feature map generated by the information extraction module, that is, the feature map obtained by the convolution computations of the several convolutional layers of the second branch, is a three-dimensional feature map, the formula by which the feature-map depth convolution layer of the second branch calculates the spatial position relationship between pixels in the feature map is:
$$a^{l} = \delta\left(\sum_{c=1}^{m} w_{(i,j,k)}^{l} \odot x_{c}^{l-1} + b\right)$$
where a is the spatial position information; δ is the activation function; l is the number of convolutional layers of the convolutional neural network; w_(i,j,k) is the weight coefficient of the pixel with coordinates (i, j, k) in the feature map; m is the number of channels of the feature map; x_c^(l-1) is the H×W×P patch of the c-th channel of the input feature map at (i, j, k); b is the offset; and ⊙ is the Hadamard product.
Specifically, the feature-map depth convolution layer of the second branch in the information extraction module may use H×W×P×C convolution kernels, where H×W×P is the size of each kernel and C is the number of kernels, whose value equals the number of pixels of the output feature map in the XY plane. To compute the output of the depth convolution layer for the three-dimensional feature map, the H×W×P kernel is first placed at the top-left corner of the feature map and the first three-dimensional convolution operation is performed. The kernel is then slid along the Z-axis, performing the same three-dimensional convolution operation successively along the direction perpendicular to the feature map. Finally, the computation results of the C kernels are arranged on the XY plane according to their positions in the feature map to obtain the spatial position information.
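The three-dimensional case admits the same hedged sketch: one H×W×P kernel per output position, slid through the m channels of the three-dimensional feature map, again with ReLU assumed for δ and all names and sizes hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FeatureMapDepthConv3d(nn.Module):
    """3-D variant: one H x W x P kernel per output position, with the
    Hadamard products summed over the m channels of the 3-D feature map."""

    def __init__(self, d: int, h: int, w: int,
                 kd: int = 3, kh: int = 3, kw: int = 3):
        super().__init__()
        self.k = (kd, kh, kw)
        self.weight = nn.Parameter(torch.randn(d, h, w, kd, kh, kw) * 0.01)
        self.bias = nn.Parameter(torch.zeros(1))   # offset b

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, m, D, H, W); the input volume must match (d, h, w) above.
        kd, kh, kw = self.k
        x = F.pad(x, (kw // 2, kw // 2, kh // 2, kh // 2, kd // 2, kd // 2))
        # Sliding neighborhoods at every position of the padded volume
        # (memory-hungry but faithful to the per-position kernels).
        p = x.unfold(2, kd, 1).unfold(3, kh, 1).unfold(4, kw, 1)
        wgt = self.weight[None, None]              # (1, 1, D, H, W, kd, kh, kw)
        a = (p * wgt).sum(dim=(1, 5, 6, 7)) + self.bias
        return torch.relu(a).unsqueeze(1)          # (B, 1, D, H, W) position map
```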
S230. Fuse the feature map and the spatial position information to obtain a feature map containing spatial position information.
The context information is generated by fusing, through a pooling layer in the encoder, the feature map and spatial position information output by the N-th information extraction module.
S240. Segment the image to be segmented according to the feature map containing spatial position information, and output a target image.
The decoder segments the image to be segmented according to the context information encoded by the encoder and outputs a target image. Since the context information is generated from the feature map containing spatial position information, the decoder can derive the feature relationships between feature-map pixels from the spatial position relationships between pixels in the context information, thereby segmenting the boundary between "similar features of different classes" and "differing features of the same class" and achieving accurate segmentation of the boundaries between different targets to be segmented.
Embodiment 3
FIG. 5 shows the image segmentation apparatus provided by Embodiment 3 of the present invention. On the basis of Embodiment 1 or 2, an embodiment of the present invention further provides an image segmentation apparatus, which includes:
an image feature and position information extraction module 501, configured to input an image to be segmented into an image segmentation model, perform feature extraction on the image to be segmented to generate a feature map, and calculate the spatial position relationship between pixels in the feature map to obtain spatial position information.
In one implementation example, when inputting the image to be segmented into the image segmentation model, performing feature extraction on it to generate a feature map, and calculating the spatial position relationship between pixels in the feature map to obtain spatial position information, the image feature and position information extraction module 501 includes:
an image feature extraction unit, configured to perform feature extraction on the image to be segmented through N information extraction modules connected in series in the encoder to generate a feature map, the N information extraction modules being set according to preset scale information, N ≥ 1; and
a position information extraction unit, configured to calculate, for each information extraction module, the spatial position relationship between pixels in the feature map generated by the information extraction module to obtain spatial position information.
In one implementation example, when calculating, for each information extraction module, the spatial position relationship between pixels in the feature map generated by the information extraction module to obtain spatial position information, the position information extraction unit includes:
a position information extraction subunit, configured to convolve, for each information extraction module, the feature map along the direction perpendicular to the feature map generated by the information extraction module through a convolutional neural network, and calculate the spatial position relationship between pixels in the feature map to obtain spatial position information.
The apparatus further includes a feature fusion module 502, configured to fuse the feature map and the spatial position information to obtain a feature map containing spatial position information.
In one implementation example, when fusing the feature map and the spatial position information to obtain the feature map containing spatial position information, the feature fusion module 502 includes:
a feature fusion unit, configured to fuse, through the encoder, the feature map and spatial position information output by the N-th information extraction module to generate context information.
The apparatus further includes an image segmentation module 503, configured to segment the image to be segmented according to the feature map containing spatial position information and output a target image.
In one implementation example, when segmenting the image to be segmented according to the feature map containing spatial position information and outputting the target image, the image segmentation module 503 includes:
an image segmentation unit, configured to segment, by a decoder, the image to be segmented according to the context information and output a target image.
According to the image segmentation apparatus provided by the embodiments of the present invention, an image to be segmented is input into an image segmentation model, feature extraction is performed on the image to be segmented to generate a feature map, and the spatial position relationship between pixels in the feature map is calculated to obtain spatial position information; the feature map and the spatial position information are fused to obtain a feature map containing spatial position information; and the image to be segmented is segmented according to that feature map to output a target image. Calculating the spatial position relationship between pixels in the feature map to obtain spatial position information extracts the relative positional relationships of feature-map pixels at different spatial positions. After fusing the feature map containing image information with the calculated spatial position information, the image to be segmented is segmented according to the feature map containing spatial position information, so that the image segmentation model can derive the feature relationships between feature-map pixels from the spatial position relationships between them, thereby segmenting the boundary between "similar features of different classes" and "differing features of the same class", achieving accurate segmentation of the boundaries between different targets to be segmented and improving the segmentation accuracy of the image.
Embodiment 4
FIG. 6 is a schematic structural diagram of the server provided by Embodiment 4 of the present invention. The server includes a processor 61, a memory 62, and a computer program 63 stored in the memory 62 and executable on the processor 61, for example a program for the image segmentation method. When executing the computer program 63, the processor 61 implements the steps in the above embodiments of the image segmentation method, for example steps S110 to S130 shown in FIG. 1.
Exemplarily, the computer program 63 may be divided into one or more modules, which are stored in the memory 62 and executed by the processor 61 to complete the present application. The one or more modules may be a series of computer program instruction segments capable of completing specific functions, the instruction segments being used to describe the execution process of the computer program 63 in the server. For example, the computer program 63 may be divided into an image feature and position information extraction module, a feature fusion module, and an image segmentation module, whose specific functions are as follows:
an image feature and position information extraction module, configured to input an image to be segmented into an image segmentation model, perform feature extraction on the image to be segmented to generate a feature map, and calculate the spatial position relationship between pixels in the feature map to obtain spatial position information;
a feature fusion module, configured to fuse the feature map and the spatial position information to obtain a feature map containing spatial position information; and
an image segmentation module, configured to segment the image to be segmented according to the feature map containing spatial position information and output a target image.
The server may include, but is not limited to, the processor 61, the memory 62, and the computer program 63 stored in the memory 62. Those skilled in the art can understand that FIG. 6 is only an example of a server and does not constitute a limitation on the server, which may include more or fewer components than shown, or combine certain components, or have different components; for example, the server may also include input and output devices, network access devices, buses, and the like.
The processor 61 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 62 may be an internal storage unit of the server, such as a hard disk or memory of the server. The memory 62 may also be an external storage device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card equipped on the server. Further, the memory 62 may include both an internal storage unit of the server and an external storage device. The memory 62 is used to store the computer program and other programs and data required by the image segmentation method. The memory 62 can also be used to temporarily store data that has been or will be output.
Those skilled in the art can clearly understand that, for convenience and brevity of description, only the division of the above functional units and modules is illustrated as an example. In practical applications, the above functions can be allocated to different functional units and modules as needed, that is, the internal structure of the apparatus is divided into different functional units or modules to complete all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit; the integrated unit can be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only for convenience of distinguishing them from each other and are not used to limit the protection scope of the present application. For the specific working process of the units and modules in the above system, reference may be made to the corresponding process in the foregoing method embodiments, which will not be repeated here.
In the above embodiments, the description of each embodiment has its own focus. For parts that are not described in detail in a certain embodiment, reference may be made to the related descriptions of other embodiments.
A person of ordinary skill in the art may realize that the units and algorithm steps of the examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are executed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each specific application, but such implementation should not be considered as going beyond the scope of the present invention.
In the embodiments provided by the present invention, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. For example, the apparatus/terminal device embodiments described above are only illustrative; for example, the division of the modules or units is only a logical functional division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be in electrical, mechanical, or other forms.
The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments. In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit. The above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
If the integrated module/unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the present invention implements all or part of the processes in the methods of the above embodiments, which can also be completed by instructing relevant hardware through a computer program. The computer program can be stored in a computer-readable storage medium, and, when executed by a processor, can implement the steps of the above method embodiments. The computer program includes computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable medium may include any entity or apparatus capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium can be appropriately added or removed according to the requirements of legislation and patent practice in the jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunications signals.
The above embodiments are only used to illustrate the technical solutions of the present invention rather than to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they can still modify the technical solutions recorded in the foregoing embodiments or equivalently replace some of the technical features therein; and these modifications or replacements do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all be included within the protection scope of the present invention.

Claims (10)

1. An image segmentation method, characterized by comprising:
inputting an image to be segmented into an image segmentation model, performing feature extraction on the image to be segmented to generate a feature map, and calculating the spatial position relationship between pixels in the feature map to obtain spatial position information;
fusing the feature map and the spatial position information to obtain a feature map containing spatial position information; and
segmenting the image to be segmented according to the feature map containing spatial position information, and outputting a target image.
2. The image segmentation method according to claim 1, characterized in that the inputting an image to be segmented into an image segmentation model, performing feature extraction on the image to be segmented to generate a feature map, and calculating the spatial position relationship between pixels in the feature map to obtain spatial position information comprises:
performing feature extraction on the image to be segmented through N information extraction modules connected in series in an encoder to generate a feature map, the N information extraction modules being set according to preset scale information, N ≥ 1; and
for each of the information extraction modules, calculating the spatial position relationship between pixels in the feature map generated by the information extraction module to obtain spatial position information.
3. The image segmentation method according to claim 2, characterized in that the inputting an image to be segmented into an image segmentation model, performing feature extraction on the image to be segmented to generate a feature map, and calculating the spatial position relationship between pixels in the feature map to obtain spatial position information comprises:
when the image to be segmented is input into the first information extraction module, performing feature extraction on the image to be segmented to generate a feature map, and calculating the spatial position relationship between pixels in the feature map to obtain spatial position information; and
fusing the feature map and the spatial position information to generate a new feature map and outputting the new feature map to the next information extraction module, so that the next information extraction module performs feature extraction and spatial-position-relationship calculation on the new feature map.
4. The image segmentation method according to claim 3, characterized in that the fusing the feature map and the spatial position information to obtain a feature map containing spatial position information comprises:
fusing, through the encoder, the feature map and spatial position information output by the N-th information extraction module to generate context information.
5. The image segmentation method according to claim 3, characterized in that the calculating, for each of the information extraction modules, the spatial position relationship between pixels in the feature map generated by the information extraction module to obtain spatial position information comprises:
for each of the information extraction modules, convolving the feature map along the direction perpendicular to the feature map generated by the information extraction module through a convolutional neural network, and calculating the spatial position relationship between pixels in the feature map to obtain spatial position information.
6. The image segmentation method according to claim 5, characterized in that the calculating, for each of the information extraction modules, the spatial position relationship between pixels in the feature map generated by the information extraction module to obtain spatial position information comprises:
if the feature map generated by the information extraction module is a two-dimensional feature map, the formula for calculating the spatial position relationship between pixels in the feature map generated by the information extraction module is:
    (Formula provided as image PCTCN2020129521-appb-100001 in the original publication.)
    where a is the spatial position information; δ is the activation function; l is the number of convolutional layers of the convolutional neural network; w_(i,j) is the weight coefficient of the pixel at coordinates (i,j) in the feature map; k is the number of channels of the feature map; b is the offset; and ⊙ is the Hadamard product.
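The formula itself survives only as an image placeholder in this extraction. From the symbol definitions above, one consistent reconstruction, with c as an assumed channel index running over the k channels and x^c_{(i,j)} as the assumed value of pixel (i,j) in channel c (neither symbol appears in the original), would be:

$$ a = \delta\Big( \sum_{c=1}^{k} w^{l}_{(i,j)} \odot x^{c}_{(i,j)} + b \Big) $$

This is a sketch of the likely form, not the filed formula.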
  7. The image segmentation method according to claim 5, wherein, for each of the information extraction modules, calculating the spatial position relationship between pixels in the feature map generated by the information extraction module to obtain spatial position information comprises:
    if the feature map generated by the information extraction module is a three-dimensional feature map, the spatial position relationship between pixels in the feature map generated by the information extraction module is calculated by the formula:
    (Formula provided as image PCTCN2020129521-appb-100002 in the original publication.)
    where a is the spatial position information; δ is the activation function; l is the number of convolutional layers of the convolutional neural network; w_(i,j,k) is the weight coefficient of the pixel at coordinates (i,j,k) in the feature map; m is the number of channels of the feature map; b is the offset; and ⊙ is the Hadamard product.
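As with claim 6, only an image placeholder remains here. A reconstruction consistent with the symbol definitions, with c as an assumed channel index over the m channels and x^c_{(i,j,k)} as the assumed voxel value (neither symbol present in the original), would be:

$$ a = \delta\Big( \sum_{c=1}^{m} w^{l}_{(i,j,k)} \odot x^{c}_{(i,j,k)} + b \Big) $$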
  8. The image segmentation method according to claim 4, wherein segmenting the image to be segmented according to the feature map containing spatial position information and outputting a target image comprises:
    segmenting, by the decoder, the image to be segmented according to the context information, and outputting the target image.
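Claim 8's decoding step can likewise be sketched. The layer layout below (transposed convolution for upsampling, then a softmax probability map, matching common U-Net-style decoders) is a conventional assumption, not the filed architecture:

```python
import torch
import torch.nn as nn

# hypothetical decoder: restore resolution, then emit per-pixel class probabilities
decoder = nn.Sequential(
    nn.ConvTranspose2d(128, 64, kernel_size=2, stride=2),  # upsample the context features
    nn.Conv2d(64, 64, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(64, 2, kernel_size=1),                       # 2 classes: object / background
)

context = torch.randn(1, 128, 128, 128)         # output of the Nth information extraction module
probs = torch.softmax(decoder(context), dim=1)  # target image as a per-pixel probability map
```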
  9. An image segmentation apparatus, comprising:
    an image feature and position information extraction module, configured to input an image to be segmented into an image segmentation model, perform feature extraction on the image to be segmented to generate a feature map, and calculate the spatial position relationship between pixels in the feature map to obtain spatial position information;
    a feature fusion module, configured to fuse the feature map and the spatial position information to obtain a feature map containing spatial position information;
    an image segmentation module, configured to segment the image to be segmented according to the feature map containing spatial position information, and output a target image.
  10. A server, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the image segmentation method according to any one of claims 1 to 8.
PCT/CN2020/129521 2019-12-11 2020-11-17 Image segmentation method and apparatus, and server WO2021115061A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911266841.6 2019-12-11
CN201911266841.6A CN111145196A (en) 2019-12-11 2019-12-11 Image segmentation method and device and server

Publications (1)

Publication Number Publication Date
WO2021115061A1 true WO2021115061A1 (en) 2021-06-17

Family

ID=70518054

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/129521 WO2021115061A1 (en) 2019-12-11 2020-11-17 Image segmentation method and apparatus, and server

Country Status (2)

Country Link
CN (1) CN111145196A (en)
WO (1) WO2021115061A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111145196A (en) * 2019-12-11 2020-05-12 中国科学院深圳先进技术研究院 Image segmentation method and device and server
CN112363844B (en) * 2021-01-12 2021-04-09 之江实验室 Convolutional neural network vertical segmentation method for image processing

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109493347B (en) * 2017-09-12 2021-03-23 深圳科亚医疗科技有限公司 Method and system for segmenting sparsely distributed objects in an image

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190311223A1 (en) * 2017-03-13 2019-10-10 Beijing Sensetime Technology Development Co., Ltd. Image processing methods and apparatus, and electronic devices
CN109087318A (en) * 2018-07-26 2018-12-25 东北大学 A kind of MRI brain tumor image partition method based on optimization U-net network model
CN109461157A (en) * 2018-10-19 2019-03-12 苏州大学 Image, semantic dividing method based on multi-stage characteristics fusion and Gauss conditions random field
CN110163875A (en) * 2019-05-23 2019-08-23 南京信息工程大学 One kind paying attention to pyramidal semi-supervised video object dividing method based on modulating network and feature
CN110428428A (en) * 2019-07-26 2019-11-08 长沙理工大学 A kind of image, semantic dividing method, electronic equipment and readable storage medium storing program for executing
CN111145196A (en) * 2019-12-11 2020-05-12 中国科学院深圳先进技术研究院 Image segmentation method and device and server

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113610754A (en) * 2021-06-28 2021-11-05 浙江文谷科技有限公司 Defect detection method and system based on Transformer
CN113610754B (en) 2024-05-07 Defect detection method and system based on Transformer

Also Published As

Publication number Publication date
CN111145196A (en) 2020-05-12

Similar Documents

Publication Publication Date Title
WO2021115061A1 (en) Image segmentation method and apparatus, and server
WO2020119527A1 (en) Human action recognition method and apparatus, and terminal device and storage medium
WO2020199693A1 (en) Large-pose face recognition method and apparatus, and device
CN109960742B (en) Local information searching method and device
US10726580B2 (en) Method and device for calibration
CN110598714B (en) Cartilage image segmentation method and device, readable storage medium and terminal equipment
EP4027299A2 (en) Method and apparatus for generating depth map, and storage medium
CN110832501A (en) System and method for pose-invariant face alignment
CN107730514B (en) Scene segmentation network training method and device, computing equipment and storage medium
CN111967467B (en) Image target detection method and device, electronic equipment and computer readable medium
EP3803803A1 (en) Lighting estimation
US20220156968A1 (en) Visual feature database construction method, visual positioning method and apparatus, and storage medium
WO2022134464A1 (en) Target detection positioning confidence determination method and apparatus, and electronic device and storage medium
CN112336342A (en) Hand key point detection method and device and terminal equipment
WO2021097595A1 (en) Method and apparatus for segmenting lesion area in image, and server
CN111368860B (en) Repositioning method and terminal equipment
WO2019109410A1 (en) Fully convolutional network model training method for splitting abnormal signal region in mri image
CN110288691B (en) Method, apparatus, electronic device and computer-readable storage medium for rendering image
CN114066930A (en) Planar target tracking method and device, terminal equipment and storage medium
US20230048643A1 (en) High-Precision Map Construction Method, Apparatus and Electronic Device
US20220392251A1 (en) Method and apparatus for generating object model, electronic device and storage medium
CN114494782B (en) Image processing method, model training method, related device and electronic equipment
CN113610856B (en) Method and device for training image segmentation model and image segmentation
CN115147469A (en) Registration method, device, equipment and storage medium
CN114022458A (en) Skeleton detection method and device, electronic equipment and computer readable storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20900307

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20900307

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 20.01.2023)
