WO2021115061A1 - Image segmentation method, device and server - Google Patents
Image segmentation method, device and server
- Publication number
- WO2021115061A1 (PCT/CN2020/129521)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- feature map
- spatial position
- image
- information
- feature
- Prior art date
Classifications
- G06T7/12 — Edge-based segmentation (G — Physics; G06 — Computing; calculating or counting; G06T — Image data processing or generation, in general; G06T7/00 — Image analysis; G06T7/10 — Segmentation; edge detection)
- G06F18/253 — Fusion techniques of extracted features (G06F — Electric digital data processing; G06F18/00 — Pattern recognition; G06F18/20 — Analysing; G06F18/25 — Fusion techniques)
- G06T2207/20081 — Training; learning (G06T2207/00 — Indexing scheme for image analysis or image enhancement; G06T2207/20 — Special algorithmic details)
- G06T2207/20084 — Artificial neural networks [ANN] (G06T2207/00 — Indexing scheme for image analysis or image enhancement; G06T2207/20 — Special algorithmic details)
Description
- the present invention relates to the technical field of image segmentation, and in particular to an image segmentation method, device, and server.
- image segmentation is one of the research hotspots of computer graphics, and it has important applications in fields such as medical disease diagnosis and autonomous driving.
- the U-Net (U-shaped neural network) algorithm is composed of an encoder and a decoder, and the encoder and the decoder are connected by splicing in the image channel dimension.
- the image to be segmented first passes through the encoder for image feature extraction.
- the encoder is composed of multiple convolutional layers, and the convolutional layers are connected by pooling layers, thereby reducing the dimensions of the original image to a certain size.
- the feature map output from the encoder is then restored to the original image size by the decoder.
- the decoder is composed of multiple convolutional layers, and the convolutional layers are connected by transposed convolutional layers; finally, the output image is converted into a probability map using the softmax activation function.
- compared with traditional image segmentation algorithms such as threshold segmentation, region segmentation, and edge segmentation, the UNet algorithm has a simple network structure and high image segmentation accuracy.
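For orientation, here is a minimal U-Net-style encoder/decoder sketch in PyTorch matching the structure just described: convolutional blocks joined by max pooling in the encoder, transposed convolutions in the decoder, skip connections spliced in the channel dimension, and a softmax producing the probability map. The layer counts, channel widths, and names are illustrative assumptions, not taken from the patent.

```python
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    # Two 3x3 convolutions with ReLU, the usual U-Net building block.
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    def __init__(self, in_ch=3, n_classes=2):
        super().__init__()
        self.enc1 = conv_block(in_ch, 32)
        self.enc2 = conv_block(32, 64)
        self.pool = nn.MaxPool2d(2)                          # connects encoder blocks
        self.bottleneck = conv_block(64, 128)
        self.up2 = nn.ConvTranspose2d(128, 64, 2, stride=2)  # restores spatial size
        self.dec2 = conv_block(128, 64)                      # 128 = 64 (skip) + 64 (up)
        self.up1 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(64, 32)
        self.head = nn.Conv2d(32, n_classes, 1)

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))  # channel-dimension splice
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return torch.softmax(self.head(d1), dim=1)           # per-pixel probability map

probs = TinyUNet()(torch.randn(1, 3, 64, 64))  # -> shape (1, 2, 64, 64)
```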
- however, the current UNet image segmentation algorithm tends to exaggerate the differences between objects of the same class (the intra-class inconsistency problem) or the similarities between objects of different classes (the inter-class indistinction problem), and cannot separate the boundary between "similar features of different classes" and "differing features of the same class". As a result, the boundaries between different objects to be segmented cannot be accurately segmented, and the segmentation accuracy of the image is low.
- the embodiments of the present invention provide an image segmentation method, device, and server to solve the problem that the boundary between different objects to be segmented cannot be accurately segmented.
- the first aspect of the embodiments of the present invention provides an image segmentation method, including: inputting the image to be segmented into an image segmentation model, performing feature extraction on the image to be segmented to generate a feature map, and calculating the spatial position relationship between pixels in the feature map to obtain spatial position information; fusing the feature map and the spatial position information to obtain a feature map containing spatial position information; and segmenting the image to be segmented according to the feature map containing the spatial position information and outputting the target image.
- inputting the image to be segmented into an image segmentation model, performing feature extraction on the image to be segmented to generate a feature map, and calculating the spatial position relationship between pixels in the feature map to obtain spatial position information includes:
- performing feature extraction on the image to be segmented through N information extraction modules connected in series in the encoder to generate a feature map, where the N information extraction modules are set according to preset scale information and N ≥ 1;
- for each of the information extraction modules, calculating the spatial position relationship between pixels in the feature map generated by the information extraction module to obtain spatial position information.
- inputting the image to be segmented into an image segmentation model, performing feature extraction on the image to be segmented to generate a feature map, and calculating the spatial position relationship between pixels in the feature map to obtain spatial position information includes:
- when the image to be segmented is input to the first information extraction module, performing feature extraction on the image to be segmented to generate a feature map, and calculating the spatial position relationship between pixels in the feature map to obtain spatial position information; fusing the feature map and the spatial position information to generate a new feature map, which is output to the next information extraction module, so that the next information extraction module performs feature extraction and spatial position relationship calculation on the new feature map.
- fusing the feature map and the spatial location information to obtain a feature map containing spatial location information includes:
- generating context information by fusing, through the encoder, the feature map and the spatial position information output by the Nth information extraction module.
- calculating the spatial position relationship between pixels in the feature map generated by the information extraction module to obtain spatial position information includes: for each of the information extraction modules, convolving the feature map through a convolutional neural network in a direction perpendicular to the feature map generated by the information extraction module, and calculating the spatial position relationship between pixels in the feature map to obtain the spatial position information.
- when the feature map generated by the information extraction module is a two-dimensional feature map, the formula for calculating the spatial position relationship between pixels in the feature map generated by the information extraction module uses the following symbols:
- a is the spatial position information;
- σ is the activation function;
- l is the convolutional layer number of the convolutional neural network;
- w(i, j) is the weight coefficient of the pixel with coordinates (i, j) in the feature map;
- k is the number of channels of the feature map;
- b is the offset;
- ⊙ is the Hadamard product.
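The equation itself did not survive extraction. Based purely on the symbol definitions above, one plausible reconstruction is the following, where $x$ denotes the input feature map activations (a symbol introduced here, not defined in the patent) and the sum runs over the $k$ channels; this is an assumption, not the patent's verbatim formula:

$$a^{l}_{(i,j)} = \sigma\left(\sum_{c=1}^{k} w^{l}_{(i,j)} \odot x^{l}_{(i,j,c)} + b^{l}\right)$$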
- when the feature map generated by the information extraction module is a three-dimensional feature map, the formula for calculating the spatial position relationship between pixels in the feature map generated by the information extraction module uses the following symbols:
- a is the spatial position information;
- σ is the activation function;
- l is the convolutional layer number of the convolutional neural network;
- w(i, j, k) is the weight coefficient of the pixel with coordinates (i, j, k) in the feature map;
- m is the number of channels of the feature map;
- b is the offset;
- ⊙ is the Hadamard product.
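As above, the equation image is missing; a plausible reconstruction from these symbol definitions, again with $x$ denoting the input activations and the sum running over the $m$ channels (an assumption, not the patent's verbatim formula), is:

$$a^{l}_{(i,j,k)} = \sigma\left(\sum_{c=1}^{m} w^{l}_{(i,j,k)} \odot x^{l}_{(i,j,k,c)} + b^{l}\right)$$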
- segmenting the image to be segmented according to the feature map containing the spatial position information and outputting the target image includes:
- segmenting, by the decoder, the image to be segmented according to the context information, and outputting the target image.
- a second aspect of the embodiments of the present invention provides an image segmentation device, including:
- the image feature and location information extraction module is used to input the image to be segmented into the image segmentation model, perform feature extraction on the image to be segmented to generate a feature map, and calculate the spatial position relationship between pixels in the feature map to obtain the spatial position information;
- the feature fusion module is used to fuse the feature map and the spatial location information to obtain a feature map containing the spatial location information;
- the image segmentation module is configured to segment the image to be segmented according to the feature map containing the spatial position information, and output a target image.
- the third aspect of the embodiments provides a server, including: a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the image segmentation method of the first aspect when executing the computer program.
- in the embodiments of the present invention, an image to be segmented is input into an image segmentation model, feature extraction is performed on the image to be segmented to generate a feature map, and the spatial position relationship between pixels in the feature map is calculated to obtain spatial position information; the feature map and the spatial position information are fused to obtain a feature map containing spatial position information; and the image to be segmented is segmented according to the feature map containing spatial position information to output the target image.
- the spatial position information is obtained by calculating the spatial position relationship between pixels in the feature map, and the relative position relationship of pixels in different spatial positions in the feature map can be extracted.
- the image to be segmented is then segmented according to the feature map containing the spatial location information, so that the image segmentation model can derive the feature relationship between pixels of the feature map from the spatial position relationship between them, thereby segmenting the boundary between "similar features of different classes" and "differing features of the same class", achieving accurate segmentation of the boundaries between different objects to be segmented and improving the accuracy of image segmentation.
- FIG. 1 is a schematic flowchart of an image segmentation method provided by Embodiment 1 of the present invention
- FIG. 2 is a schematic structural diagram of an image segmentation model provided by Embodiment 1 of the present invention.
- FIG. 3 is a schematic flowchart of an image segmentation method provided by Embodiment 2 of the present invention.
- FIG. 4 is a schematic diagram of convolution calculation of the feature map depth convolution layer of the second branch in the information extraction module provided by the second embodiment of the present invention.
- FIG. 5 is a schematic structural diagram of an image segmentation device provided by Embodiment 3 of the present invention.
- FIG. 6 is a schematic structural diagram of a server provided by the fourth embodiment of the present invention.
- as shown in FIG. 1, which is a schematic flowchart of an image segmentation method provided by Embodiment 1 of the present invention.
- This embodiment can be applied to the application scenario of multi-target segmentation of an image.
- the method can be executed by an image segmentation device, which can be a server, a smart terminal, a tablet, or a PC, etc.; in this embodiment of the application, the image segmentation device is taken as the executing subject for explanation. The method specifically includes the following steps:
- S110 Input the image to be segmented into an image segmentation model, perform feature extraction on the image to be segmented to generate a feature map, and calculate the spatial position relationship between pixels in the feature map to obtain spatial position information;
- image segmentation can be performed by constructing an image segmentation model containing a neural network through deep learning on target images.
- however, the image features extracted through the convolution calculations of the multi-layer convolutional layers in a trained image segmentation model often exaggerate the differences between objects of the same class (the intra-class inconsistency problem) or the similarities between objects of different classes (the inter-class indistinction problem).
- as a result, when the image segmentation model performs target image segmentation on the image to be segmented based on the extracted features, it cannot segment the boundary between "similar features of different classes" and "differing features of the same class", resulting in over-segmentation and under-segmentation in the segmentation process.
- therefore, the feature relationship between feature map pixels can be extracted from the different levels of the convolutional neural network of the image segmentation model to overcome the problem of being unable to segment the boundary between "similar features of different classes" and "differing features of the same class".
- the image to be segmented can be segmented through an image segmentation model trained based on multiple target images.
- specifically, features of the image to be segmented are extracted to generate a feature map, and the spatial position relationship between the pixels in the feature map is calculated to obtain the spatial position information, so as to obtain the relative positional relationship of pixels at different spatial positions in the feature map.
- the image segmentation model may adopt a U-shaped neural network (Feature Depth UNet) framework, in which an encoder and a decoder form a symmetric structure; the encoder and the decoder are spliced in the image channel dimension.
- Figure 2 shows the structure diagram of the image segmentation model.
- the specific process of performing feature extraction on the image to be segmented to generate a feature map and calculating the spatial position relationship between pixels in the feature map to obtain spatial position information may be: performing feature extraction on the image to be segmented through the N information extraction modules connected in series in the encoder to generate a feature map, where the N information extraction modules are set according to preset scale information and N ≥ 1; and, for each of the information extraction modules, calculating the spatial position relationship between pixels in the feature map generated by that information extraction module to obtain the spatial position information.
- the encoder includes N information extraction modules connected in series, which perform image feature extraction on the input image to be segmented to generate a feature map.
- the N information extraction modules are set according to preset scale information, so that each information extraction module has different scale information.
- performing feature extraction on the image to be segmented through the N information extraction modules can therefore generate a feature map containing multi-scale information; and after each information extraction module performs feature extraction on the image to be segmented, the spatial position relationship between pixels in the feature map generated by that information extraction module is calculated to obtain spatial position information.
- since the N information extraction modules corresponding to different scale information each calculate the spatial position relationship between the pixels in their feature maps, spatial position information containing multi-scale information can be obtained.
- feature extraction is performed on the image to be segmented by the N information extraction modules connected in series in the encoder to generate a feature map, and the spatial position relationship between pixels in the feature map generated by each information extraction module is calculated to obtain the spatial position information; the specific process may be: when the image to be segmented is input to the first information extraction module, feature extraction is performed on the image to be segmented to generate a feature map, and the spatial position relationship between pixels in the feature map is calculated to obtain spatial position information; the feature map and the spatial position information are then fused to generate a new feature map, which is output to the next information extraction module, so that the next information extraction module performs feature extraction and spatial position relationship calculation on the new feature map.
- each information extraction module can include two branches.
- the first branch is used to extract features from the input image to generate a feature map, extracting the pixel-value information of the image; the second branch, in the same way as the first branch, also performs feature extraction on the input image to generate a feature map, and then additionally calculates the spatial position relationship between the pixels in the feature map to obtain the spatial position information, realizing extraction of the spatial position relationship between pixels.
- the first branch, used to extract features from the input image to generate a feature map, can be composed of several convolutional layers; the second branch can be composed of the same several convolutional layers as the first branch plus one feature map depth convolutional layer, so as to first extract features from the input image to generate a feature map and then calculate the spatial position relationship between the pixels in the feature map to obtain the spatial position information.
- the N information extraction modules in the encoder can be connected in series through the maximum pooling layer.
- each pixel in the feature map depth convolutional layer can map to a different receptive field on the original image.
- a Batch Normalization layer can be added between the convolutional layers in each information extraction module, and L2 regularization can be added to the loss function.
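As a sketch of the two-branch information extraction module just described: branch one is a stack of convolutions extracting pixel-value features; branch two applies the same convolutions followed by a "feature map depth convolution" relating pixels across the direction perpendicular to the feature map, here approximated with a 1x1 convolution that mixes all channels per spatial position. The fusion step, layer counts, and class names are illustrative assumptions, not the patent's exact design.

```python
import torch
import torch.nn as nn

class InfoExtractionModule(nn.Module):
    def __init__(self, c_in, c_out):
        super().__init__()
        def branch():
            return nn.Sequential(
                nn.Conv2d(c_in, c_out, 3, padding=1),
                nn.BatchNorm2d(c_out),            # BN between conv layers, as in the text
                nn.ReLU(inplace=True),
                nn.Conv2d(c_out, c_out, 3, padding=1),
                nn.BatchNorm2d(c_out),
                nn.ReLU(inplace=True),
            )
        self.branch1 = branch()                   # pixel-value features
        self.branch2_convs = branch()             # same conv stack as branch one
        # Stand-in for the feature map depth convolution: mixes the channel
        # (depth) axis at each (i, j), i.e. perpendicular to the feature map.
        self.depth_conv = nn.Conv2d(c_out, c_out, kernel_size=1)

    def forward(self, x):
        feat = self.branch1(x)                            # feature map
        spatial = self.depth_conv(self.branch2_convs(x))  # spatial position info
        return feat + spatial                             # fused new feature map

# Illustrative encoder: N modules in series, joined by max pooling layers.
encoder = nn.Sequential(
    InfoExtractionModule(3, 32), nn.MaxPool2d(2),
    InfoExtractionModule(32, 64), nn.MaxPool2d(2),
    InfoExtractionModule(64, 128),
)
out = encoder(torch.randn(1, 3, 64, 64))  # -> shape (1, 128, 16, 16)
```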
- specifically, when the image to be segmented is input to the first information extraction module, feature extraction is performed on the image to be segmented through several convolutional layers in the first branch of the first information extraction module to generate a feature map; at the same time, several convolutional layers in the second branch of the first information extraction module perform feature extraction on the image to be segmented to generate a feature map, and the feature map depth convolutional layer in the second branch calculates the spatial position relationship between pixels in that feature map to obtain the spatial position information.
- a new feature map is generated by fusing, through the pooling layer, the feature map output by the first branch of the first information extraction module and the spatial position information output by the second branch, and the newly generated feature map is input to the next information extraction module, so that the first branch of the next information extraction module performs feature extraction on the feature map input to the module, and the second branch of the next information extraction module performs feature extraction on the feature map input to the module and calculates the spatial position relationship.
- this continues until the first branch of the Nth information extraction module performs feature extraction on the feature map input to the module to generate a feature map containing multi-scale information, and the second branch of the Nth information extraction module performs feature extraction on the feature map input to the module and calculates the spatial position relationship to obtain spatial position information containing multi-scale information.
- in this way, the N information extraction modules connected in series in the encoder perform feature extraction on the image to be segmented to generate a feature map, the spatial position relationship between pixels in the feature map generated by each information extraction module is calculated to obtain the spatial position information, and the Nth information extraction module outputs the finally obtained feature map and spatial position information.
- the feature map output by the Nth information extraction module and the spatial position information are fused to obtain a feature map to complete the feature fusion.
- the image segmentation model can be composed of an encoder and a decoder
- the decoder needs to perform image segmentation on the image to be segmented according to the context information sent by the encoder.
- Context information can be generated by fusing the feature map and spatial position information output by the Nth information extraction module through the pooling layer in the encoder.
- the image segmentation model can include an encoder and a decoder, and the encoder and the decoder have a symmetrical structure.
- the decoder mirrors the convolutional layer structure of the encoder with corresponding transposed convolutional layers; and, in order for the neural network to retain shallower information, the encoder and the decoder are connected by skip connections.
- the image to be segmented is segmented by the decoder according to the context information encoded by the encoder, and the target image is output.
- the decoder can obtain the feature relationship between the pixels of the feature map according to the spatial position relationship between the pixels in the context information, thereby segmenting the boundary between "similar features of different classes" and "differing features of the same class" and realizing precise segmentation of the boundary between different objects to be segmented.
- this embodiment of an image segmentation method inputs an image to be segmented into an image segmentation model, extracts features of the image to be segmented to generate a feature map, and calculates the spatial position relationship between pixels in the feature map to obtain spatial position information; fuses the feature map and the spatial position information to obtain a feature map containing spatial position information; and segments the image to be segmented according to the feature map containing spatial position information, outputting a target image.
- the spatial position information is obtained by calculating the spatial position relationship between the pixels in the feature map, and the relative position relationship of the pixels in different spatial positions in the feature map is extracted.
- the image to be segmented is then segmented according to the feature map containing the spatial location information, so that the image segmentation model can derive the feature relationship between pixels of the feature map from the spatial position relationship between them, thereby segmenting the boundary between "similar features of different classes" and "differing features of the same class", achieving accurate segmentation of the boundaries between different objects to be segmented and improving the accuracy of image segmentation.
- FIG. 3 is a schematic flowchart of the image segmentation method provided in the second embodiment of the present invention.
- on the basis of the above, this embodiment further provides the process of calculating the spatial position relationship between pixels in the feature map to obtain spatial position information, thereby further improving the accuracy of image segmentation.
- the method specifically includes:
- the N information extraction modules are set according to preset scale information, so that each information extraction module has different scale information.
- performing feature extraction on the image to be segmented through the N information extraction modules can therefore generate a feature map containing multi-scale information; and after each information extraction module performs feature extraction on the image to be segmented, the spatial position relationship between pixels in the feature map generated by that information extraction module is calculated to obtain spatial position information.
- since the N information extraction modules corresponding to different scale information each calculate the spatial position relationship between the pixels in their feature maps, spatial position information containing multi-scale information can be obtained.
- each information extraction module can include two branches.
- the first branch is used to extract features from the input image to generate a feature map; the second branch performs feature extraction on the input image in the same way as the first branch, and after the feature map is generated, additionally calculates the spatial position relationship between the pixels in the feature map to obtain the spatial position information.
- the first branch, used to extract features from the input image to generate a feature map, can be composed of several convolutional layers; the second branch can be composed of the same several convolutional layers as the first branch plus one feature map depth convolutional layer, so as to first extract features from the input image to generate a feature map and then calculate the spatial position relationship between the pixels in the feature map to obtain the spatial position information.
- the N information extraction modules in the encoder can be connected in series through the maximum pooling layer.
- feature extraction is performed on the image to be segmented through several convolutional layers in the first branch of the first information extraction module to generate a feature map; at the same time, several convolutional layers in the second branch of the first information extraction module perform feature extraction on the image to be segmented to generate a feature map, and the feature map depth convolutional layer in the second branch calculates the spatial position relationship between pixels in that feature map to obtain spatial position information.
- a new feature map is generated by fusing, through the pooling layer, the feature map output by the first branch of the first information extraction module and the spatial position information output by the second branch, and the newly generated feature map is input to the next information extraction module, so that the first branch of the next information extraction module performs feature extraction on the feature map input to the module, and the second branch of the next information extraction module performs feature extraction on the feature map input to the module and calculates the spatial position relationship.
- this continues until the first branch of the Nth information extraction module performs feature extraction on the feature map input to the module to generate a feature map containing multi-scale information, and the second branch of the Nth information extraction module performs feature extraction on the feature map input to the module and calculates the spatial position relationship to obtain spatial position information containing multi-scale information.
- as described above, each information extraction module can include two branches, and the second branch can be composed of the same several convolutional layers as the first branch plus a feature map depth convolutional layer, realizing feature extraction on the input image to generate a feature map and then calculating the spatial position relationship between the pixels in the feature map to obtain the spatial position information. Therefore, for each of the information extraction modules, convolving the feature map through a convolutional neural network in a direction perpendicular to the feature map generated by the information extraction module can be: using the feature map depth convolutional layer of the second branch in the information extraction module to convolve the feature map in the direction perpendicular to the feature map obtained from the convolution calculations of the several convolutional layers of the second branch, and calculating the spatial position relationship between pixels in the feature map to obtain spatial position information.
- if the feature map generated by the information extraction module, that is, the feature map obtained from the convolution calculations of the several convolutional layers of the second branch, is a two-dimensional feature map, the feature map depth convolutional layer of the second branch in the information extraction module calculates the spatial position relationship between pixels in the feature map according to the formula given above, where:
- a is the spatial position information;
- σ is the activation function;
- l is the convolutional layer number of the convolutional neural network;
- w(i, j) is the weight coefficient of the pixel with coordinates (i, j) in the feature map;
- k is the number of channels of the feature map;
- b is the offset;
- ⊙ is the Hadamard product.
- in this case, the feature map depth convolutional layer of the second branch in the information extraction module can use an H × W × C convolution kernel, where H × W represents the size of the convolution kernel and C represents the number of convolution kernels, whose value equals the number of pixels in the XY plane of the output feature map.
- as shown in FIG. 4, which is a schematic diagram of the convolution calculation of the feature map depth convolutional layer of the second branch in the information extraction module: to calculate the output of the depth convolution of the two-dimensional feature map, first place the H × W convolution kernel at the upper-left corner of the feature map and perform the first convolution operation, then slide the kernel in the direction perpendicular to the feature map and repeat the operation in turn.
- if the feature map generated by the information extraction module, that is, the feature map obtained from the convolution calculations of the several convolutional layers of the second branch, is a three-dimensional feature map, the feature map depth convolutional layer of the second branch in the information extraction module calculates the spatial position relationship between pixels in the feature map according to the formula given above, where:
- a is the spatial position information;
- σ is the activation function;
- l is the convolutional layer number of the convolutional neural network;
- w(i, j, k) is the weight coefficient of the pixel with coordinates (i, j, k) in the feature map;
- m is the number of channels of the feature map;
- b is the offset;
- ⊙ is the Hadamard product.
- in this case, the feature map depth convolutional layer of the second branch in the information extraction module can use an H × W × P × C convolution kernel, where H × W × P represents the size of the convolution kernel and C represents the number of convolution kernels, whose value equals the number of pixels in the XY plane of the output feature map.
- to compute the output, first place the H × W × P convolution kernel at the upper-left corner of the feature map and perform the first three-dimensional convolution operation.
- then slide the convolution kernel along the Z axis, performing the same three-dimensional convolution operation in turn in the direction perpendicular to the feature map.
- the calculation results of the C convolution kernels are arranged on the XY plane according to the position of the feature map to obtain the spatial position information.
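A NumPy sketch of one plausible reading of this feature map depth convolution, shown for the two-dimensional case: each of the C = X*Y kernels of size H × W is anchored at its own XY position, slid along the Z axis (perpendicular to the feature map), and its accumulated responses are arranged back onto the XY plane. The padding, the per-kernel anchoring, and the tanh stand-in for the activation σ are assumptions; this is not a verified reference implementation of the patent's exact indexing.

```python
import numpy as np

def feature_map_depth_conv(fmap, kernels, bias=0.0):
    """fmap: (X, Y, Z) feature map with Z channels; kernels: (C, H, W), C == X*Y."""
    X, Y, Z = fmap.shape
    C, H, W = kernels.shape
    assert C == X * Y, "one kernel per output pixel in the XY plane"
    # Zero-pad the XY plane so an HxW window fits at every anchor position.
    padded = np.pad(fmap, ((0, H - 1), (0, W - 1), (0, 0)))
    out = np.empty(C)
    for c in range(C):
        i, j = divmod(c, Y)                       # XY position of this kernel
        acc = 0.0
        for z in range(Z):                        # slide along the Z axis
            window = padded[i:i + H, j:j + W, z]  # HxW patch of channel slice z
            acc += np.sum(kernels[c] * window)    # Hadamard product, then summed
        out[c] = np.tanh(acc + bias)              # assumed activation for sigma
    # Arrange the C results on the XY plane to form the spatial position map.
    return out.reshape(X, Y)

a = feature_map_depth_conv(np.random.rand(4, 4, 3), np.random.rand(16, 2, 2))
print(a.shape)  # (4, 4)
```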
- Context information is generated by fusing the feature map and spatial position information output by the Nth information extraction module through the pooling layer in the encoder.
- the image to be segmented is segmented by the decoder according to the context information encoded by the encoder, and the target image is output. Since the context information is generated based on the feature map containing the spatial position information, the decoder can obtain the feature relationship between the pixels of the feature map according to the spatial position relationship between pixels in the context information, thereby segmenting the boundary between "similar features of different classes" and "differing features of the same class" and realizing precise segmentation of the boundary between different objects to be segmented.
- FIG. 5 shows the image segmentation device provided in the third embodiment of the present invention.
- an embodiment of the present invention also provides an image segmentation device, as shown in FIG. 5, and the device includes:
- the image feature and location information extraction module 501 is used to input the image to be segmented into the image segmentation model, perform feature extraction on the image to be segmented to generate a feature map, and calculate the spatial position relationship between pixels in the feature map to obtain the space location information;
- when the image to be segmented is input into the image segmentation model, feature extraction is performed on the image to be segmented to generate a feature map, and the spatial position relationship between pixels in the feature map is calculated to obtain the spatial position information; for this, the image feature and location information extraction module 501 includes:
- the image feature extraction unit is configured to perform feature extraction on the image to be segmented through N information extraction modules connected in series in the encoder to generate a feature map; the N information extraction modules are set according to preset scale information, N ≥ 1;
- the location information extraction unit is configured to calculate the spatial location relationship between pixels in the feature map generated by the information extraction module for each of the information extraction modules to obtain spatial location information.
- the position information extraction unit includes:
- the position information extraction subunit is used, for each of the information extraction modules, to convolve the feature map through a convolutional neural network in a direction perpendicular to the feature map generated by the information extraction module, calculating the spatial position relationship between the pixels in the feature map to obtain the spatial position information.
- the feature fusion module 502 is configured to fuse the feature map and the spatial location information to obtain a feature map containing the spatial location information;
- when fusing the feature map and the spatial location information to obtain a feature map containing spatial location information, the feature fusion module 502 includes:
- the feature fusion unit is used for fusing the feature map and the spatial position information output by the Nth information extraction module through the encoder to generate context information.
- the image segmentation module 503 is configured to segment the image to be segmented according to the feature map containing spatial position information, and output a target image.
- when segmenting the image to be segmented according to the feature map containing spatial position information and outputting the target image, the image segmentation module 503 includes:
- the image segmentation unit is configured to segment the image to be segmented according to the context information by the decoder, and output a target image.
- this embodiment of an image segmentation device inputs an image to be segmented into an image segmentation model, extracts features of the image to be segmented to generate a feature map, and calculates the spatial position relationship between pixels in the feature map to obtain spatial position information; fuses the feature map and the spatial position information to obtain a feature map containing spatial position information; and segments the image to be segmented according to the feature map containing spatial position information, outputting a target image.
- the spatial position information is obtained by calculating the spatial position relationship between the pixels in the feature map, and the relative position relationship of the pixels in different spatial positions in the feature map is extracted.
- the image to be segmented is then segmented according to the feature map containing the spatial location information, so that the image segmentation model can derive the feature relationship between pixels of the feature map from the spatial position relationship between them, thereby segmenting the boundary between "similar features of different classes" and "differing features of the same class", achieving accurate segmentation of the boundaries between different objects to be segmented and improving the accuracy of image segmentation.
- Fig. 6 is a schematic structural diagram of a server provided in the fourth embodiment of the present invention.
- the server includes a processor 61, a memory 62, and a computer program 63 stored in the memory 62 and running on the processor 61, such as a program for an image segmentation method.
- the processor 61 implements the steps in the embodiment of the image segmentation method when the computer program 63 is executed, for example, steps S110 to S130 shown in FIG. 1.
- the computer program 63 may be divided into one or more modules, and the one or more modules are stored in the memory 62 and executed by the processor 61 to complete the application.
- the one or more modules may be a series of computer program instruction segments capable of completing specific functions, and the instruction segments are used to describe the execution process of the computer program 63 in the server.
- the computer program 63 can be divided into an image feature and location information extraction module, a feature fusion module, and an image segmentation module, and the specific functions of each module are as follows:
- the image feature and location information extraction module is used to input the image to be segmented into the image segmentation model, perform feature extraction on the image to be segmented to generate a feature map, and calculate the spatial position relationship between pixels in the feature map to obtain the spatial position information;
- the feature fusion module is used to fuse the feature map and the spatial location information to obtain a feature map containing the spatial location information;
- the image segmentation module is configured to segment the image to be segmented according to the feature map containing the spatial position information, and output a target image.
- the server may include, but is not limited to, a processor 61, a memory 62, and a computer program 63 stored in the memory 62.
- those skilled in the art can understand that FIG. 6 is only an example of a server and does not constitute a limitation on the server; it may include more or fewer components than shown, combine certain components, or have different components. For example, the server may also include input and output devices, network access devices, buses, and so on.
- the processor 61 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
- the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
- the memory 62 may be an internal storage unit of the server, such as a hard disk or memory of the server.
- the memory 62 may also be an external storage device of the server, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card (Flash Card) equipped on the server.
- the memory 62 may also include both an internal storage unit of the server and an external storage device.
- the memory 62 is used to store the computer program and other programs and data required by the image segmentation method.
- the memory 62 can also be used to temporarily store data that has been output or will be output.
- the disclosed device/terminal device and method may be implemented in other ways.
- the device/terminal device embodiments described above are only illustrative.
- the division of the modules or units is only a logical function division; in actual implementation there may be other divisions, for example multiple units or components can be combined or integrated into another system, or some features can be omitted or not implemented.
- the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
- the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
- the functional units in the various embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
- the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
- if the integrated module/unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
- based on this understanding, the present invention can implement all or part of the processes in the methods of the above embodiments by instructing the relevant hardware through a computer program.
- the computer program can be stored in a computer-readable storage medium, and when the program is executed by the processor, it can implement the steps of the foregoing method embodiments.
- the computer program includes computer program code, and the computer program code may be in the form of source code, object code, executable file, or some intermediate forms.
- the computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so on.
- it should be noted that the content contained in the computer-readable medium can be appropriately added or deleted according to the requirements of legislation and patent practice in the jurisdiction; for example, in some jurisdictions, the computer-readable medium does not include electrical carrier signals and telecommunication signals.
Claims (10)
- An image segmentation method, characterized by comprising: inputting an image to be segmented into an image segmentation model, performing feature extraction on the image to be segmented to generate a feature map, and calculating the spatial position relationship between pixels in the feature map to obtain spatial position information; fusing the feature map and the spatial position information to obtain a feature map containing spatial position information; and segmenting the image to be segmented according to the feature map containing spatial position information, and outputting a target image.
- The image segmentation method according to claim 1, wherein inputting the image to be segmented into an image segmentation model, performing feature extraction on the image to be segmented to generate a feature map, and calculating the spatial position relationship between pixels in the feature map to obtain spatial position information comprises: performing feature extraction on the image to be segmented through N information extraction modules connected in series in an encoder to generate a feature map, the N information extraction modules being set according to preset scale information, N ≥ 1; and, for each of the information extraction modules, calculating the spatial position relationship between pixels in the feature map generated by the information extraction module to obtain spatial position information.
- The image segmentation method according to claim 2, wherein inputting the image to be segmented into an image segmentation model, performing feature extraction on the image to be segmented to generate a feature map, and calculating the spatial position relationship between pixels in the feature map to obtain spatial position information comprises: when the image to be segmented is input to the first information extraction module, performing feature extraction on the image to be segmented to generate a feature map, and calculating the spatial position relationship between pixels in the feature map to obtain spatial position information; and fusing the feature map and the spatial position information to generate a new feature map and outputting it to the next information extraction module, so that the next information extraction module performs feature extraction and spatial position relationship calculation on the new feature map.
- The image segmentation method according to claim 3, wherein fusing the feature map and the spatial position information to obtain a feature map containing spatial position information comprises: fusing, through the encoder, the feature map and the spatial position information output by the Nth information extraction module to generate context information.
- The image segmentation method according to claim 3, wherein, for each of the information extraction modules, calculating the spatial position relationship between pixels in the feature map generated by the information extraction module to obtain spatial position information comprises: for each of the information extraction modules, convolving the feature map through a convolutional neural network in a direction perpendicular to the feature map generated by the information extraction module, and calculating the spatial position relationship between pixels in the feature map to obtain spatial position information.
- The image segmentation method according to claim 4, wherein segmenting the image to be segmented according to the feature map containing spatial position information and outputting a target image comprises: segmenting, by a decoder, the image to be segmented according to the context information, and outputting the target image.
- An image segmentation device, characterized by comprising: an image feature and position information extraction module, configured to input an image to be segmented into an image segmentation model, perform feature extraction on the image to be segmented to generate a feature map, and calculate the spatial position relationship between pixels in the feature map to obtain spatial position information; a feature fusion module, configured to fuse the feature map and the spatial position information to obtain a feature map containing spatial position information; and an image segmentation module, configured to segment the image to be segmented according to the feature map containing spatial position information and output a target image.
- A server, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the image segmentation method according to any one of claims 1 to 8 when executing the computer program.
Applications Claiming Priority (2)

| Application Number | Priority Date | Filing Date | Title |
| --- | --- | --- | --- |
| CN201911266841.6A (CN111145196A) | 2019-12-11 | 2019-12-11 | Image segmentation method, device and server |
| CN201911266841.6 | 2019-12-11 | | |
Publications (1)

| Publication Number | Publication Date |
| --- | --- |
| WO2021115061A1 | 2021-06-17 |
Family ID: 70518054
Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
| --- | --- | --- | --- |
| PCT/CN2020/129521 (WO2021115061A1, published 2021-06-17) | Image segmentation method, device and server | 2019-12-11 | 2020-11-17 |
Country Status (2)

| Country | Link |
| --- | --- |
| CN | CN111145196A |
| WO | WO2021115061A1 |
Families Citing this family (2)

| Publication number | Priority date | Publication date | Assignee | Title |
| --- | --- | --- | --- | --- |
| CN111145196A | 2019-12-11 | 2020-05-12 | 中国科学院深圳先进技术研究院 | Image segmentation method, device and server |
| CN112363844B | 2021-01-12 | 2021-04-09 | 之江实验室 | A vertical partitioning method of convolutional neural networks for image processing |
Citations (6)

| Publication number | Priority date | Publication date | Assignee | Title |
| --- | --- | --- | --- | --- |
| US20190311223A1 | 2017-03-13 | 2019-10-10 | Beijing Sensetime Technology Development Co., Ltd. | Image processing methods and apparatus, and electronic devices |
| CN109087318A | 2018-07-26 | 2018-12-25 | 东北大学 | MRI brain tumor image segmentation method based on an optimized U-net network model |
| CN109461157A | 2018-10-19 | 2019-03-12 | 苏州大学 | Image semantic segmentation method based on multi-level feature fusion and a Gaussian conditional random field |
| CN110163875A | 2019-05-23 | 2019-08-23 | 南京信息工程大学 | Semi-supervised video object segmentation method based on a modulation network and a feature attention pyramid |
| CN110428428A | 2019-07-26 | 2019-11-08 | 长沙理工大学 | Image semantic segmentation method, electronic device and readable storage medium |
| CN111145196A | 2019-12-11 | 2020-05-12 | 中国科学院深圳先进技术研究院 | Image segmentation method, device and server |
Family Cites Families (1)

| Publication number | Priority date | Publication date | Assignee | Title |
| --- | --- | --- | --- | --- |
| CN110838124B | 2017-09-12 | 2021-06-18 | 深圳科亚医疗科技有限公司 | Method, system and medium for segmenting images of objects with a sparse distribution |

Application timeline:
- 2019-12-11: CN application CN201911266841.6A filed (published as CN111145196A, status pending)
- 2020-11-17: PCT application PCT/CN2020/129521 filed (WO2021115061A1, active application filing)
Cited By (2)

| Publication number | Priority date | Publication date | Assignee | Title |
| --- | --- | --- | --- | --- |
| CN113610754A | 2021-06-28 | 2021-11-05 | 浙江文谷科技有限公司 | Transformer-based defect detection method and system |
| CN113610754B | 2021-06-28 | 2024-05-07 | 浙江文谷科技有限公司 | Transformer-based defect detection method and system |
Also Published As

| Publication number | Publication date |
| --- | --- |
| CN111145196A | 2020-05-12 |
Legal Events

| Date | Code | Title | Description |
| --- | --- | --- | --- |
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20900307; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 20900307; Country of ref document: EP; Kind code of ref document: A1 |
| | 32PN | Ep: public notification in the ep bulletin as address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 20.01.2023) |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 20900307; Country of ref document: EP; Kind code of ref document: A1 |