CN112950653A - Attention image segmentation method, device and medium


Info

Publication number
CN112950653A
Authority
CN
China
Prior art keywords
matrix
image
segmentation
network
attention
Prior art date
Legal status
Granted
Application number
CN202110217268.0A
Other languages
Chinese (zh)
Other versions
CN112950653B (en)
Inventor
Wang Li (王立)
Guo Zhenhua (郭振华)
Zhao Yaqian (赵雅倩)
Li Rengang (李仁刚)
Current Assignee
Shandong Yingxin Computer Technology Co Ltd
Original Assignee
Shandong Yingxin Computer Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Shandong Yingxin Computer Technology Co Ltd
Priority to CN202110217268.0A
Publication of CN112950653A
Application granted
Publication of CN112950653B
Legal status: Active

Classifications

    • GPHYSICS > G06 COMPUTING; CALCULATING OR COUNTING > G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis > G06T7/10 Segmentation; Edge detection > G06T7/11 Region-based segmentation
    • G06T2207/00 Indexing scheme for image analysis or image enhancement > G06T2207/20 Special algorithmic details
    • G06T2207/20076 Probabilistic image processing
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/20212 Image combination > G06T2207/20221 Image fusion; Image merging
    • Y02T Climate change mitigation technologies related to transportation > Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an attention image segmentation method, which comprises the following steps: convolving an image and extracting a plurality of feature maps of the image; selecting and fusing the feature maps to obtain a fused feature map; obtaining a first segmentation result of the image through an attention network and the fused feature map; selecting a segmentation network; performing a size transformation on the first segmentation result of the image to obtain region information; performing weighted fusion on the image through the segmentation network and the region information to obtain a fourth matrix; and inputting the fourth matrix into the segmentation network to obtain a second segmentation result of the image. In this way, the feature maps can be fused and weighted fusion performed within the segmentation network, improving the segmentation precision.

Description

Attention image segmentation method, device and medium
Technical Field
The present invention relates to the field of image processing, and in particular to an attention image segmentation method, apparatus and medium.
Background
Image segmentation is an important research direction in the field of computer vision and an important part of image semantic understanding. Image segmentation refers to the process of dividing an image into several regions with similar properties. In recent years image segmentation technology has developed rapidly, and related techniques such as scene object segmentation, human foreground/background segmentation, face and body parsing, and three-dimensional reconstruction are widely applied in industries such as autonomous driving, augmented reality and security monitoring.
Image segmentation divides an image into a number of mutually disjoint regions according to features such as gray level, color, spatial texture and geometric shape, so that these features show consistency or similarity within the same region and differ markedly between different regions. In brief, the object is separated from the background in an image. For grayscale images, pixels inside a region generally share similar gray levels, while pixels at region boundaries generally show gray-level discontinuities.
Generally, image segmentation needs to predict, for each pixel in an image, whether it belongs to a certain target class or scene class. Image scenes vary in complex ways (illumination, viewing angle, scale, occlusion and the like), which makes scene understanding and pixel classification very difficult.
Disclosure of Invention
The invention mainly solves the problem of segmenting images more accurately by classifying and judging image pixels accurately.
To solve this technical problem, the invention adopts the following technical scheme: an attention image segmentation method is provided, comprising the following steps:
convolving an image and extracting a plurality of feature maps of the image;
selecting and fusing the plurality of feature maps to obtain a fused feature map;
obtaining a first segmentation result of the image through an attention network and the fused feature map;
selecting a segmentation network;
performing a size transformation on the first segmentation result of the image to obtain region information;
performing weighted fusion on the image through the segmentation network and the region information to obtain a fourth matrix;
and inputting the fourth matrix into the segmentation network to obtain a second segmentation result of the image.
Preferably, the step of performing weighted fusion on the image through the segmentation network and the region information to obtain a fourth matrix further comprises:
inputting the image into the segmentation network for calculation to obtain a feature matrix;
the feature matrix comprising a first matrix, a second matrix and a third matrix;
performing weight calculation on the first matrix, the second matrix and the region information to obtain a weighting strategy;
and obtaining the fourth matrix based on the weighting strategy and the third matrix.
Preferably, the step of performing weight calculation on the first matrix, the second matrix and the region information further comprises:
obtaining the vector dimension of a first element in the second matrix;
querying the elements in the region information of the same class as the first element, and recording them as second elements;
querying the elements in the first matrix of the same class as the second elements, and recording them as third elements;
acquiring the vector dimension of the third elements;
calculating the vector inner product of the vector dimension of the first element and the vector dimension of the third elements to obtain first data;
normalizing the first data to obtain a first vector;
and returning to the step of obtaining the vector dimension of a first element in the second matrix until the second matrix has been traversed.
Preferably, the step of obtaining the fourth matrix based on the weighting strategy and the third matrix further comprises:
performing weighted fusion of all the first vectors obtained after traversing the second matrix with the third matrix, respectively, to obtain a plurality of second vectors;
and arranging the plurality of second vectors according to the positions of the first elements in the second matrix to obtain the fourth matrix.
Preferably, the step of obtaining the fourth matrix based on the weighting strategy and the third matrix further comprises:
arranging all the first vectors obtained after traversing the second matrix according to the positions of the first elements in the second matrix to obtain a weighting matrix;
and performing weighted fusion of the weighting matrix with the third matrix to obtain the fourth matrix.
Preferably, the step of selecting and fusing a plurality of feature maps further comprises: making the sizes of the feature maps the same through bilinear interpolation or a deconvolution network;
and adding the feature maps of the same size to obtain the fused feature map.
Preferably, the size transformation is performed by downsampling, so that the size of the first segmentation result is the same as the size of the image input to the segmentation network.
Preferably, the step of obtaining the first segmentation result of the image through the attention network and the fused feature map further comprises: inputting the fused feature map into the attention network;
changing the size of the fused feature map to the size of the image through bilinear interpolation;
normalizing the value range of the fused feature map through a normalization function;
and obtaining the first segmentation result of the image through an argmax function.
The present invention also provides an attention image segmentation system, comprising: an extraction module, a fusion module, a first segmentation module, a transformation module and a second segmentation module;
the extraction module is used for convolving the image through a convolution kernel and extracting a plurality of feature maps of the image;
the fusion module is used for selecting and fusing the feature maps to obtain a fused feature map;
the first segmentation module is used for obtaining a first segmentation result of the image through an attention network and the fused feature map;
the transformation module is used for selecting a segmentation network and performing a size transformation on the first segmentation result of the image to obtain region information;
the second segmentation module is used for performing weighted fusion on the image through the segmentation network and the region information to obtain a fourth matrix, and inputting the fourth matrix into the segmentation network to obtain a second segmentation result of the image.
The invention also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the attention image segmentation method.
The invention has the following beneficial effects:
1. The attention image segmentation method can fuse feature maps and perform weighted fusion within the segmentation network, thereby improving the segmentation precision.
2. The attention image segmentation system uses an attention weighting method so that the attention result produced by the auxiliary attention network weights the features in the main segmentation network, improving the accuracy of image segmentation.
3. The computer-readable storage medium implements the element and vector inner-product calculations on the matrices, computes a coarse segmentation result through the attention network and a fine segmentation result from the coarse one; this improves calculation efficiency and avoids errors when the calculation process is implemented in software.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a schematic diagram of an attention image segmentation method according to embodiment 1 of the present invention;
FIG. 2 is a schematic diagram of the attention network structure in the attention image segmentation method according to embodiment 1 of the present invention;
FIG. 3 is a schematic diagram of the segmentation network structure in the attention image segmentation method according to embodiment 1 of the present invention;
FIG. 4 is a flowchart of the weighted fusion in the attention image segmentation method according to embodiment 1 of the present invention;
FIG. 5 is a flowchart of the weight calculation method in the attention image segmentation method according to embodiment 1 of the present invention;
FIG. 6 is a schematic structural diagram of an attention image segmentation system according to embodiment 2 of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present invention, it should be noted that the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
It should be noted that, in the description of the present invention, the following terms are used:
The attention mechanism has made significant breakthroughs in the image domain in recent years and has proven beneficial for improving model performance. The attention mechanism itself is modeled on the perception mechanisms of the human brain and eye.
The essence of the attention mechanism is to locate the information of interest, suppress unneeded information, and let the network focus on the more relevant regions. More specifically, in image processing the attention mechanism learns the relationship between a given pixel and the pixels at all other positions in an image (including distant ones), and the learned relationship features are used to assist in segmenting the details of the image, making the segmentation result more accurate and finer.
ResNet is a residual network; its residual blocks can be understood as sub-networks that are stacked to form a very deep network.
When a convolution kernel is used for image processing, each pixel of the output image is formed as a weighted average of the pixels in a small region of the input image, where the weights are defined by a function; this function is called the convolution kernel.
At each convolution layer of a CNN, the data is three-dimensional and can be seen as a stack of two-dimensional maps, each of which is called a feature map.
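As a small, hypothetical illustration in PyTorch (not part of the original text): a convolution layer with 16 kernels turns a 3-channel input into a stack of 16 such feature maps:

```python
import torch
import torch.nn as nn

# 16 learned kernels -> a stack of 16 output feature maps
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
fmaps = conv(torch.randn(1, 3, 32, 32))  # shape 1 x 16 x 32 x 32
```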
In the present embodiment, featureMap denotes such a feature map;
Bilinear interpolation is the extension of linear interpolation to interpolation functions of two variables; its core idea is to perform linear interpolation in each of the two directions in turn.
As an interpolation algorithm in numerical analysis, bilinear interpolation is widely applied in signal processing and in digital image and video processing.
Deconvolution is a special forward convolution: the input image is first enlarged by zero-padding according to a certain ratio, the convolution kernel is rotated, and a forward convolution is then performed.
Normalization here is a simplifying transformation that converts a dimensional expression into a dimensionless one.
argmax is the operator that returns the argument(s) at which a function attains its maximum value.
The softmax function, also called the normalized exponential function, maps a vector of real values to a probability distribution.
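For reference, the standard definitions of these two functions (written out here for completeness; the original text does not state them explicitly) are:

```latex
\operatorname{softmax}(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{n} e^{z_j}}, \qquad
\operatorname*{argmax}_{x} f(x) = \text{the argument } x \text{ at which } f(x) \text{ attains its maximum.}
```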
It should also be noted that, in the description of the present invention:
the first segmentation result is a coarse segmentation result and the second segmentation result is a fine segmentation result;
u_new is a second vector; f denotes the vector of the first element and e denotes the vectors of the third elements;
the first data is denoted h.
Example 1
An embodiment of the present invention provides an attention image segmentation method; referring to FIG. 1, the method comprises the following steps:
s100, training an attention network, wherein the attention network is a trainable rough segmentation network; the attention network in this embodiment is based on the ResNet network, but is not limited to this type of network; the method comprises the steps of obtaining an original image to be segmented and a backbone network structure;
s110, the main network convolutes the image through a convolution kernel and extracts a characteristic graph of the image; setting convolution step length, and controlling the size of the feature image after convolution through the convolution step length; in the backbone network, the size of the feature map of the image decreases by one time after each convolution, for example, the image with the previous image of 200 × 200 becomes 100 × 100 after one convolution;
s120, performing multiple convolution on the original image to be segmented to obtain a heatMap;
the method comprises the following specific steps:
s121, performing convolution conv1 on the original image to be segmented to obtain a first image; 1/2, the first image becomes the original image to be segmented;
s122, performing secondary convolution conv2 on the first image to obtain a second image; the second image becomes 1/2 of the first image, becomes 1/4 of the original image to be segmented;
s123, carrying out convolution conv3 on the second image for three times to obtain a third image; the third image becomes 1/2 of the second image, becomes 1/8 of the original image to be segmented; outputting a first featureMap of the current image;
s124, carrying out convolution conv4 on the third image for four times to obtain a fourth image; the fourth image becomes 1/2 of the third image, becomes 1/16 of the original image to be segmented; outputting a second featureMap of the current image;
s125, performing five times of convolution conv5 on the fourth image to obtain a fifth image; the fifth image becomes 1/2 of the fourth image, becomes 1/32 of the original image to be segmented; outputting a third featureMap of the current image;
s126, conv6 is performed for six times on the fifth image, the number of channels of the third featureMap is changed,
after the sixth convolution, the number of featureMap channels of the image changes, which is a common practice in ResNet networks. In the convolutional neural network, the size and the number of channels of the feature map can be changed by manual setting, respectively. In the present invention, the most common setting means (resnet50) is used to control the number of channels of the feature map and the output of the feature map size for each layer.
Obtaining a sixth image, wherein the size of the sixth image is still 1/32 of the original image to be segmented, and outputting a fourth featureMap of the current image; and the fourth featureMap of the sixth image at this time is the heatMap heat map;
typically, the last layer of feature maps in an image segmentation convolutional neural network is referred to as a heat map. The thermal map is a characteristic map in the present embodiment; because the last layer of the graph is taken out for calculation and the researcher draws the image representation of the last layer of the feature graph.
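As a concrete illustration of steps S121-S126, the following is a minimal PyTorch-style sketch of such a backbone. It is a simplified assumption (plain stride-2 convolutions and arbitrary channel counts stand in for the ResNet50 stages; all names are illustrative), intended only to show how each stage halves the spatial size and where the three featureMaps and the heatMap are produced:

```python
import torch
import torch.nn as nn

class ToyBackbone(nn.Module):
    """Illustrative stand-in for the ResNet-style backbone of S121-S126."""
    def __init__(self, in_ch=3, ch=64, num_classes=21):
        super().__init__()
        # each stride-2 convolution halves H and W, as conv1..conv5 do
        self.conv1 = nn.Conv2d(in_ch, ch, 3, stride=2, padding=1)  # 1/2
        self.conv2 = nn.Conv2d(ch, ch, 3, stride=2, padding=1)     # 1/4
        self.conv3 = nn.Conv2d(ch, ch, 3, stride=2, padding=1)     # 1/8  -> first featureMap
        self.conv4 = nn.Conv2d(ch, ch, 3, stride=2, padding=1)     # 1/16 -> second featureMap
        self.conv5 = nn.Conv2d(ch, ch, 3, stride=2, padding=1)     # 1/32 -> third featureMap
        self.conv6 = nn.Conv2d(ch, num_classes, 1)                 # changes channels only

    def forward(self, x):
        x = torch.relu(self.conv1(x))
        x = torch.relu(self.conv2(x))
        f1 = torch.relu(self.conv3(x))    # 1/8 of the input size
        f2 = torch.relu(self.conv4(f1))   # 1/16
        f3 = torch.relu(self.conv5(f2))   # 1/32
        heat_map = self.conv6(f3)         # same spatial size as f3: the heatMap
        return f1, f2, f3, heat_map

f1, f2, f3, hm = ToyBackbone()(torch.randn(1, 3, 224, 224))
print(f1.shape, f2.shape, f3.shape, hm.shape)  # spatial sizes 28, 14, 7, 7
```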
S130, selecting a plurality of feature maps and fusing the feature maps;
The fusion method is as follows: feature maps of different sizes are fused;
Assume that the size of the first of the feature maps is (C × H × W) = 1 × 28 × 28, where C is the number of channels, H the height and W the width of the first feature map; the size of the second feature map (C × H/2 × W/2) is 1 × 14 × 14, and the size of the third (C × H/4 × W/4) is 1 × 7 × 7. During feature fusion, simple addition cannot be performed directly because the feature sizes differ; to add them, the small feature maps are upsampled, usually using a deconvolution network or bilinear interpolation, to the same size as the large feature map, after which the addition fusion is performed;
for example: the third feature map is brought to the same size as the second through bilinear interpolation (F.interpolate) or deconvolution and then added to the second feature map to realize feature fusion, and a coarse segmentation result of the image is calculated from the fused features through the attention network;
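A minimal sketch of this fusion step, assuming PyTorch (F.interpolate performs the bilinear upsampling; nn.ConvTranspose2d would be the deconvolution alternative):

```python
import torch
import torch.nn.functional as F

def fuse(small, large):
    """Upsample `small` to the spatial size of `large`, then add (the fusion of S130)."""
    up = F.interpolate(small, size=large.shape[-2:], mode="bilinear", align_corners=False)
    return up + large

f2 = torch.randn(1, 64, 14, 14)  # second featureMap
f3 = torch.randn(1, 64, 7, 7)    # third featureMap
fused = fuse(f3, f2)             # 1 x 64 x 14 x 14
```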
In this embodiment the feature fusion method is not limited to the above one and can be realized in various ways, for example:
fusing the second feature map with the first feature map, fusing the third feature map with the second feature map, fusing the third feature map with the first feature map, and the like;
or directly and independently using the first, second or third feature map, without feature fusion, to compute the coarse segmentation result of the original image to be segmented through the attention network.
The fused feature map yields a coarse segmentation result of the original image to be segmented through the attention network;
the method comprises the following specific steps:
s131, referring to FIG. 2, the attention network includes a convolution layer, a reduction layer, a softmax normalization function layer and an argmax parameter-solving layer
Inputting the fused feature map into an attention network, and restoring a second feature map to an original size through bilinear interpolation (F.interplate); normalizing the value range of the feature map to be in a range of [0, 1] through a softmax layer in the attention network, and obtaining a rough segmentation result of a second feature map through an argmax function; calculating a loss value of the attention network through a loss function; wherein the loss function adopts a cross entropy loss function which is common to image segmentation.
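A sketch of S131 under the same assumptions (hypothetical names; the fused map is assumed to carry one channel per class): the fused feature map is upsampled to the image size, normalized with softmax over the class channel, reduced with argmax, and trained with the cross-entropy loss:

```python
import torch
import torch.nn.functional as F

def attention_head(fused, image_size, labels=None):
    """Coarse segmentation of S131 (illustrative sketch)."""
    logits = F.interpolate(fused, size=image_size, mode="bilinear", align_corners=False)
    probs = F.softmax(logits, dim=1)      # value range normalized to [0, 1] per pixel
    coarse = probs.argmax(dim=1)          # coarse segmentation result (per-pixel class map)
    loss = F.cross_entropy(logits, labels) if labels is not None else None
    return coarse, loss

fused = torch.randn(1, 21, 14, 14)              # fused feature map; 21 classes assumed
labels = torch.randint(0, 21, (1, 224, 224))    # toy ground-truth labels
coarse, loss = attention_head(fused, (224, 224), labels)
```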
S200, selecting a segmentation network; there are many segmentation networks, such as FCN, SegNet and ENet, and this embodiment does not limit which type of segmentation network is adopted;
Referring to FIG. 3, this embodiment uses a network similar in structure to the attention network as the segmentation network; the specific structure of the segmentation network comprises: a Conv2d convolution layer, a bilinear interpolation (F.interpolate) layer, an output layer, a softmax layer and an argmax layer;
where Conv2d denotes a 2d convolution layer that convolves the input feature map to extract its features.
S300, referring to FIG. 4, weighted fusion is performed using the coarse-segmentation attention network to obtain a finely segmented image segmentation result;
A feature map of size C × H × W input through the segmentation network is passed through three 1 × 1 convolution layers (conv2d) respectively, outputting three feature matrices, recorded as the first matrix, the second matrix and the third matrix;
A weighting strategy is calculated from the first matrix and the second matrix through the weight calculation;
The weighting strategy specifies how each feature element in the feature map should be weighted,
and how the weighting of each feature element is applied to the third matrix.
Referring to FIG. 5, the weight calculation method comprises the following steps:
s310, after the attention network passes through the argmax layer, a rough segmentation result of the input image is obtained, the size of the rough segmentation result is equal to that of the original image, the rough segmentation result is subjected to size conversion to obtain region information, the size of the rough segmentation result is converted into C H W, the C H W is consistent with the size of a feature map input by the segmentation network, and a down-sampling method is adopted in a size conversion method;
s320, traversing the element u at each position in the second matrix, for example, the size of the second matrix is C × H × W, where C represents the number of channels, and only traversing the positions of the elements represented by H × W; assuming that element u at the first position in the second matrix is traversed, the vector dimension of the position of element u is 1 × C, that is, vector dimension C;
this is partly because the matrix C × H × W is a three-dimensional matrix, but the positions of the elements represented by H × W are traversed, so when traversing the H × W elements, the vector dimension in which the position of the element u is located is 1 × C;
s330, searching the element position associated with the element u position in the region removing information according to the element u position, finding the region associated with the element u position according to the position, and outputting;
inquiring elements with the same category as the element u in the region information, outputting position information of the elements with the same category, and if N elements with the same category as the element u exist in the region information, outputting the position information of the N elements; the position information is coordinates;
s340, obtaining a vector dimension C at the position of the element u in the second matrix, obtaining a vector dimension C x 1, representing by f, obtaining N vector dimensions C associated with the element u in the first matrix, obtaining the position information of the N elements through the step S330 in the same method as the step S330, and then obtaining the vector dimension C corresponding to the N position information; marking the vector dimension C corresponding to the N pieces of information, N & ltC & gt as e;
s350, obtaining an inner vector product of e and f, [ C1 ] · [ C × N ] ═ 1 × N ], and denoted by h, where h includes N elements, each of the N elements includes respective position information, and the position information is the same as the position information of the N elements acquired in the first matrix;
s360, solving softmax of h, and carrying out normalization to obtain a vector of [1 x N ];
Using the weighted fusion formula

u_new = Σ_{i=1}^{N} h_i · G_i

the [1 × N] vector h is weight-fused with the third matrix;
where G_i denotes the vector taken from the third matrix at the position corresponding to h_i; G_i has dimension 1 × C, and the weight-fused u_new also has dimension 1 × C;
s370, traversing the elements at each position in the third matrix, wherein the number of the elements is H, W, and substituting the elements at each position and the vector of each element in the corresponding H into the weighted fusion formula to obtain a plurality of u after weighted fusionnewFor a number u according to the position of the element u in the second matrixnewArranging to obtain a weighted and fused matrix, and marking the weighted and fused matrix as a fourth matrix, wherein the dimensionality of the fourth matrix is C x H x W;
or the vectors of each element in h are arranged according to the position of the element u in the second matrix to obtain a weighting matrix, and each vector in the weighting matrix and the corresponding element in each position in the third matrix are subjected to weighted fusion to obtain a weighted-fused matrix, wherein the matrix is a fourth matrix.
S380, the fourth matrix is taken as the input of the segmentation network, and a finer segmentation result is obtained through the segmentation network's computation. The vector u_new is obtained by scale-transforming the coarse segmentation result of the attention network and weight-fusing it with the matrices derived from the segmentation network's input. Moreover, before the coarse segmentation by the attention network, the image is convolved several times to extract its feature maps, and the feature maps obtained from different convolutions are fused; the coarse segmentation result obtained through the attention network from these fused features therefore already improves on the prior art in precision. The weighted fusion that yields u_new then weights the coarse result once more, further improving the precision, so the final segmentation using the segmentation network improves on the previous segmentation precision and a finer segmentation result is obtained.
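Pulling S310-S380 together, the following is a consolidated, self-contained sketch of this region-guided weighting, written in PyTorch under stated assumptions (toy sizes, random tensors in place of real network outputs, and a loop-based formulation chosen for clarity over speed); it is an illustration of the described computation, not the patented implementation itself:

```python
import torch
import torch.nn.functional as F

C, H, W, NUM_CLASSES = 8, 4, 4, 3
feat = torch.randn(1, C, H, W)          # feature map entering the segmentation network

# S300: three 1x1 projections produce the first, second and third matrices
proj1, proj2, proj3 = (torch.nn.Conv2d(C, C, 1) for _ in range(3))
m1 = proj1(feat)[0]                     # first matrix,  C x H x W
m2 = proj2(feat)[0]                     # second matrix, C x H x W
m3 = proj3(feat)[0]                     # third matrix,  C x H x W

# S310: coarse result downsampled to H x W ("region information": per-pixel classes)
coarse = torch.randint(0, NUM_CLASSES, (1, 1, 32, 32)).float()
region = F.interpolate(coarse, size=(H, W), mode="nearest")[0, 0].long()

fourth = torch.zeros_like(m3)           # the fourth matrix, C x H x W
for y in range(H):                      # S320: traverse the H*W positions of the second matrix
    for x in range(W):
        f_vec = m2[:, y, x]             # f: the C-dimensional vector at element u
        same = (region == region[y, x]) # S330: positions with the same class as u
        e = m1[:, same]                 # e: C x N vectors from the first matrix
        h = F.softmax(f_vec @ e, dim=0) # S340-S360: inner products, then softmax -> 1 x N
        G = m3[:, same]                 # corresponding C x N vectors of the third matrix
        fourth[:, y, x] = (G * h).sum(dim=1)  # S370: u_new = sum_i h_i * G_i

# S380: `fourth` is then fed through the segmentation network to obtain the fine result
```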
Example 2
An embodiment of the present invention further provides an attention image segmentation system; referring to FIG. 6, the system comprises: an extraction module, a fusion module, a first segmentation module, a transformation module and a second segmentation module;
the extraction module is used for convolving the image through a convolution kernel and extracting a plurality of feature maps of the image;
the fusion module is used for selecting and fusing the feature maps to obtain a fused feature map;
the first segmentation module is used for obtaining a first segmentation result of the image through an attention network and the fused feature map;
the transformation module is used for selecting a segmentation network and performing a size transformation on the first segmentation result of the image to obtain region information;
and the second segmentation module is used for performing weighted fusion on the image through the segmentation network and the region information to obtain a fourth matrix, and inputting the fourth matrix into the segmentation network to obtain a second segmentation result of the image.
Based on the same inventive concept as the method in the foregoing embodiments, this specification further provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program implements the steps of the attention image segmentation method disclosed above.
The numbers of the embodiments disclosed in the embodiments of the present invention are merely for description, and do not represent the merits of the embodiments.
It will be understood by those skilled in the art that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware; such a program may be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disk, or the like.
The above description is only an embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes performed by the present specification and drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. An attention image segmentation method, characterized by comprising the following steps:
convolving an image and extracting a plurality of feature maps of the image;
selecting and fusing the plurality of feature maps to obtain a fused feature map;
obtaining a first segmentation result of the image through an attention network and the fused feature map;
selecting a segmentation network;
performing a size transformation on the first segmentation result of the image to obtain region information;
performing weighted fusion on the image through the segmentation network and the region information to obtain a fourth matrix;
and inputting the fourth matrix into the segmentation network to obtain a second segmentation result of the image.
2. An attention image segmentation method as claimed in claim 1, characterized in that the step of performing weighted fusion on the image through the segmentation network and the region information to obtain a fourth matrix further comprises:
inputting the image into the segmentation network for calculation to obtain a feature matrix;
the feature matrix comprising a first matrix, a second matrix and a third matrix;
performing weight calculation on the first matrix, the second matrix and the region information to obtain a weighting strategy;
and obtaining the fourth matrix based on the weighting strategy and the third matrix.
3. An attention image segmentation method as claimed in claim 2, characterized in that the step of performing weight calculation on the first matrix, the second matrix and the region information further comprises:
obtaining the vector dimension of a first element in the second matrix;
querying the elements in the region information of the same class as the first element, and recording them as second elements;
querying the elements in the first matrix of the same class as the second elements, and recording them as third elements;
acquiring the vector dimension of the third elements;
calculating the vector inner product of the vector dimension of the first element and the vector dimension of the third elements to obtain first data;
normalizing the first data to obtain a first vector;
and returning to the step of obtaining the vector dimension of a first element in the second matrix until the second matrix has been traversed.
4. An attention image segmentation method as claimed in claim 3, characterized in that the step of obtaining the fourth matrix based on the weighting strategy and the third matrix further comprises:
performing weighted fusion of all the first vectors obtained after traversing the second matrix with the third matrix, respectively, to obtain a plurality of second vectors;
and arranging the second vectors according to the positions of the first elements in the second matrix to obtain the fourth matrix.
5. An attention image segmentation method as claimed in claim 3, characterized in that the step of obtaining the fourth matrix based on the weighting strategy and the third matrix further comprises:
arranging all the first vectors obtained after traversing the second matrix according to the positions of the first elements in the second matrix to obtain a weighting matrix;
and performing weighted fusion of the weighting matrix with the third matrix to obtain the fourth matrix.
6. An attention image segmentation method as claimed in claim 1, characterized in that the step of selecting and fusing a plurality of feature maps further comprises: making the sizes of the feature maps the same through bilinear interpolation or a deconvolution network;
and adding the feature maps of the same size to obtain the fused feature map.
7. An attention image segmentation method as claimed in claim 1, characterized in that the size transformation makes the size of the first segmentation result the same as the size of the image input to the segmentation network by means of downsampling.
8. An attention image segmentation method as claimed in claim 1, characterized in that the step of obtaining the first segmentation result of the image through the attention network and the fused feature map further comprises: inputting the fused feature map into the attention network;
changing the size of the fused feature map to the size of the image through bilinear interpolation;
normalizing the value range of the fused feature map through a normalization function;
and obtaining the first segmentation result of the image through an argmax function.
9. An attention image segmentation system, comprising: an extraction module, a fusion module, a first segmentation module, a transformation module and a second segmentation module;
the extraction module is used for convolving the image through a convolution kernel and extracting a plurality of feature maps of the image;
the fusion module is used for selecting and fusing the feature maps to obtain a fused feature map;
the first segmentation module is used for obtaining a first segmentation result of the image through an attention network and the fused feature map;
the transformation module is used for selecting a segmentation network and performing a size transformation on the first segmentation result of the image to obtain region information;
and the second segmentation module is used for performing weighted fusion on the image through the segmentation network and the region information to obtain a fourth matrix, and inputting the fourth matrix into the segmentation network to obtain a second segmentation result of the image.
10. A computer-readable storage medium in which a computer program is stored, characterized in that the computer program, when executed by a processor, carries out the steps of the attention image segmentation method according to any one of claims 1 to 7.
CN202110217268.0A 2021-02-26 2021-02-26 Attention image segmentation method, device and medium Active CN112950653B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110217268.0A CN112950653B (en) 2021-02-26 2021-02-26 Attention image segmentation method, device and medium


Publications (2)

Publication Number Publication Date
CN112950653A (en) 2021-06-11
CN112950653B (en) 2023-05-23

Family

ID=76246443

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110217268.0A Active CN112950653B (en) 2021-02-26 2021-02-26 Attention image segmentation method, device and medium

Country Status (1)

Country Link
CN (1) CN112950653B (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018111940A1 (en) * 2016-12-12 2018-06-21 Danny Ziyi Chen Segmenting ultrasound images
US20200372648A1 (en) * 2018-05-17 2020-11-26 Tencent Technology (Shenzhen) Company Limited Image processing method and device, computer apparatus, and storage medium
CN109992784A (en) * 2019-04-08 2019-07-09 北京航空航天大学 A kind of heterogeneous network building and distance metric method for merging multi-modal information
CN110084299A (en) * 2019-04-24 2019-08-02 中国人民解放军国防科技大学 target detection method and device based on multi-head fusion attention
CN112116605A (en) * 2020-09-29 2020-12-22 西北工业大学深圳研究院 Pancreas CT image segmentation method based on integrated depth convolution neural network
CN112258526A (en) * 2020-10-30 2021-01-22 南京信息工程大学 CT (computed tomography) kidney region cascade segmentation method based on dual attention mechanism

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Qingxia Meng et al.: "Factorization-Based Active Contour for Water-Land SAR Image Segmentation via the Fusion of Features", IEEE Access *
Mao Wei: "Remote sensing image segmentation fusing spectral clustering and multiple features", Software Guide (软件导刊) *
Zhou Xiaoling et al.: "Infrared and visible image fusion combining a pulse-coupled neural network with guided filtering", Acta Optica Sinica (光学学报) *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115731243A (en) * 2022-11-29 2023-03-03 北京长木谷医疗科技有限公司 Spine image segmentation method and device based on artificial intelligence and attention mechanism
CN115731243B (en) * 2022-11-29 2024-02-09 北京长木谷医疗科技股份有限公司 Spine image segmentation method and device based on artificial intelligence and attention mechanism

Also Published As

Publication number Publication date
CN112950653B (en) 2023-05-23


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant