CN111080666B - Object segmentation method and device based on cyclic convolution - Google Patents

Object segmentation method and device based on cyclic convolution

Publication number
CN111080666B
Authority
CN
China
Prior art keywords
curve
node
closed curve
feature
segmentation method
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911374778.8A
Other languages
Chinese (zh)
Other versions
CN111080666A (en)
Inventor
周晓巍
鲍虎军
彭思达
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN201911374778.8A
Publication of CN111080666A
Application granted
Publication of CN111080666B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/149 Segmentation; Edge detection involving deformable models, e.g. active contour models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/12 Edge-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Abstract

The invention discloses an object segmentation method and device that segment an object by predicting its contour line in an image. To predict the contour line, the invention provides a feature learning method based on cyclic convolution and a curve deformation method. The implementation of the invention comprises: constructing a feature vector for each node of an initialized closed curve; performing feature learning on the feature vector sequence defined on the closed curve by using cyclic convolution; providing a deep neural network based on cyclic convolution for curve deformation; realizing object segmentation based on the curve deformation method; and processing a target object containing a plurality of connected regions. Through cyclic convolution, the method achieves efficient feature learning on the curve and improves the accuracy of contour-based object segmentation.

Description

Object segmentation method and device based on cyclic convolution
Technical Field
The invention belongs to the technical field of computers, and particularly relates to an object segmentation method and device based on cyclic convolution.
Background
In related object segmentation technology, conventional image processing methods obtain an object contour curve by optimizing an initial curve, but are prone to falling into local optima. Some recent deep learning methods directly regress the object contour curve, but the segmentation is not accurate enough. Other implementations use graph convolution to perform feature learning on the initial curve in order to predict the object contour curve. However, generic graph convolution does not take full advantage of the topological structure of the curve, so feature learning is not very efficient.
Disclosure of Invention
To address the defects of the prior art, the invention provides a method for learning features on a curve based on cyclic convolution, and uses the learned features to deform the curve. The invention performs object segmentation based on this curve deformation method.
According to a first aspect of the present invention, there is provided a method for learning features on a curve based on cyclic convolution, comprising:
1. Generation of image features: given a picture on which object segmentation is to be performed, the picture is processed by a deep neural network to extract picture features. Like the picture itself, the picture features form a tensor. The resolution of the picture features is determined by the input picture and the neural network. Most existing deep neural networks can be used to extract the picture features.
2. Construction of features on the curve: a closed curve consisting of N nodes is given on the image. Based on the picture features, a feature vector can be constructed for each node.
3. Feature learning on the curve based on cyclic convolution: in graph theory, a closed curve is a cycle graph. In the cycle graph, N nodes form a closed chain, and each node has degree 2, i.e., each node is an end point of two edges. When the curve is not closed, the feature vectors on the curve form a one-dimensional discrete signal, which can be convolved with a one-dimensional convolution kernel to realize signal processing. When the curve is closed, the feature vectors on the curve form a periodic one-dimensional discrete signal. In this case, the one-dimensional convolution on the periodic sequence of feature vectors is called a cyclic convolution.
4. Neural networks based on cyclic convolution: standard one-dimensional convolution convolves a one-dimensional convolution kernel with a one-dimensional discrete signal, whereas cyclic convolution convolves a one-dimensional convolution kernel with a periodic one-dimensional discrete signal. Therefore, like one-dimensional convolution, cyclic convolution can be used to form a neural network layer that is embedded into a common deep neural network for feature extraction.
According to a second aspect of the present invention, there is provided a curve deformation method comprising:
1. Offset prediction: given a picture and a closed curve, feature learning is performed on the curve as provided by the invention. After feature learning, each node has a feature vector carrying high-level semantic information. An offset can then be predicted at each curve node using a regressor commonly used in deep learning, such as a multi-layer perceptron or 1 × 1 convolution. This offset is the displacement from the current node coordinates to the target node coordinates. In object segmentation, the target curve is the object contour, and the offset points from the curve node to an object edge point.
2. Deformation of the curve: after an offset has been predicted for each node, the offset is added to the coordinates of that node, which updates the node coordinates of the curve and realizes the deformation of the curve.
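The deformation step of this second aspect can be illustrated with a minimal sketch in plain Python. The function names `deform` and `iterate_deformation`, and the toy regressor `halfway`, are hypothetical and only stand in for the learned regressor described above; they are not taken from the patent.

```python
def deform(nodes, offsets):
    """Add a predicted 2-D offset to each node of a closed curve."""
    if len(nodes) != len(offsets):
        raise ValueError("one offset per node is required")
    return [(x + dx, y + dy) for (x, y), (dx, dy) in zip(nodes, offsets)]


def iterate_deformation(nodes, predict_offsets, num_iters=3):
    """Apply the deformation step several times.

    predict_offsets is a stand-in for the learned regressor: it maps the
    current node list to one (dx, dy) pair per node.
    """
    for _ in range(num_iters):
        nodes = deform(nodes, predict_offsets(nodes))
    return nodes


# Toy stand-in regressor (assumption): move each node halfway to a known target.
target = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]


def halfway(nodes):
    return [((tx - x) / 2, (ty - y) / 2) for (x, y), (tx, ty) in zip(nodes, target)]


start = [(1.0, 1.0), (3.0, 1.0), (3.0, 3.0), (1.0, 3.0)]
result = iterate_deformation(start, halfway, num_iters=3)
```

With this toy regressor the remaining distance to the target halves on every pass, which mirrors why repeated deformation helps when the initial curve is far from the contour.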
According to a third aspect of the present invention, there is provided an object segmentation method comprising:
1. Object detector: the object segmentation method of the present invention can use most existing target object detectors. A target object detector based on deep learning is composed of two parts: a neural network backbone and a regressor. The neural network backbone reads the target picture and outputs the picture features. Based on the picture features, the regressor predicts the location and class of the objects in the picture. The position of an object is represented by a two-dimensional rectangular box, and the class of the object is represented by a one-hot vector.
2. Curve initialization: the invention performs curve initialization based on the two-dimensional rectangular frame given by the target object detector. Each side of the two-dimensional rectangular frame has a midpoint, and connecting the four midpoints yields a quadrangle. The quadrangle is a closed curve, and its four corner points can be deformed into the four poles (extreme points) of the object by the curve deformation method. Based on the four poles of the object, the invention constructs an octagon that lies relatively close to the object and uses the octagon as the initialized curve.
3. Iterative curve deformation: being an octagon, the initialized curve has only 8 nodes. The invention resamples it uniformly to obtain N nodes. The contour line of the target object is likewise sampled uniformly to obtain N nodes. The two curves are aligned according to the poles of the object, so that each node on the initialization curve has a target node. The curve deformation method provided by the invention can then deform the initialization curve toward the contour line of the target object. Because the initial curve may be far from the target curve, the curve deformation method can be applied multiple times to deform the curve iteratively, yielding the final object contour line and thereby segmenting the object in the image.
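Two of the geometric steps above, building the quadrangle from the midpoints of the detection box and uniformly resampling a closed polygon to N nodes, can be sketched as follows. This is only an illustration under stated assumptions: the helper names `box_to_diamond` and `resample_closed` are hypothetical, "uniform" is taken to mean equal arc-length spacing, and the alignment of the two curves by poles is not shown.

```python
import math


def box_to_diamond(x_min, y_min, x_max, y_max):
    """Connect the midpoints of the four box sides (top, right, bottom, left)."""
    cx, cy = (x_min + x_max) / 2, (y_min + y_max) / 2
    return [(cx, y_min), (x_max, cy), (cx, y_max), (x_min, cy)]


def resample_closed(polygon, n):
    """Return n points spaced evenly by arc length along a closed polygon."""
    m = len(polygon)
    edges = [(polygon[i], polygon[(i + 1) % m]) for i in range(m)]
    lengths = [math.dist(a, b) for a, b in edges]
    perimeter = sum(lengths)
    points, edge, walked = [], 0, 0.0
    for k in range(n):
        s = perimeter * k / n  # arc-length position of the k-th sample
        # Advance to the edge that contains arc-length position s.
        while edge < m - 1 and walked + lengths[edge] <= s:
            walked += lengths[edge]
            edge += 1
        (ax, ay), (bx, by) = edges[edge]
        t = (s - walked) / lengths[edge] if lengths[edge] > 0 else 0.0
        points.append((ax + t * (bx - ax), ay + t * (by - ay)))
    return points


diamond = box_to_diamond(0, 0, 4, 2)
square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
samples = resample_closed(square, 8)
```

The same `resample_closed` helper could be applied both to the initialization curve and to a ground-truth contour so that the two node lists have equal length.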
According to a fourth aspect of the present invention, there is provided a method of processing an object containing a plurality of connected regions, comprising:
When the object is not occluded, it forms a single connected region, represented by one closed curve. When the object is occluded, it is divided into a plurality of connected regions, represented by a plurality of closed curves. If only one initialization curve is used, the object cannot be completely segmented. Therefore, within the complete two-dimensional rectangular frame of the object, the rectangular frames of all connected regions of the object are detected, and each of these rectangular frames is then deformed into the contour line of its connected region. The contour lines within the complete object rectangular frame are merged to segment the complete object.
The beneficial effects of the invention are: the object is segmented by predicting the contour of the object in the image. In order to predict the contour line of an object, the invention provides a feature learning method based on cyclic convolution and a curve deformation method. According to the method, through cyclic convolution, efficient feature learning on the curve is achieved, and the accuracy of the object segmentation method based on the contour line is improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
Fig. 1 is a schematic diagram of the feature learning method on a curve based on cyclic convolution, and of curve deformation, according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of feature learning on a curve using cyclic convolution according to an embodiment of the present invention.
Fig. 3 is a schematic structural diagram of a deep neural network constructed based on cyclic convolution according to an embodiment of the present invention.
Fig. 4 is a flow chart of object segmentation according to an embodiment of the present invention.
FIG. 5 is a schematic diagram of an embodiment of the present invention for processing an object containing a plurality of connected regions.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention more comprehensible, embodiments accompanying figures are described in detail below.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, the present invention may be practiced in ways other than those specifically described herein, and those skilled in the art will appreciate that it may be so practiced without departing from its spirit and scope. The present invention is therefore not limited by the specific embodiments disclosed below.
As shown in fig. 1, the present invention provides a method for learning features on a curve based on cyclic convolution, which performs curve deformation based on the learned features. The method comprises the following specific steps:
1. A target picture is input, and a picture feature F is extracted by using an existing deep neural network.
2. An initialization curve is given; as in Fig. 1, the curve surrounds the target object. The curve is a closed chain of N nodes. The mathematical representation of the curve is $\{x_i \mid i = 1, \dots, N\}$, where $x_i$ represents the two-dimensional coordinates of the i-th node.
3. Constructing a feature vector for each node on the curve: each node has a two-dimensional coordinate $x_i$. For each node, the invention constructs a corresponding feature vector $[F(x_i); x'_i]$, where $[\,\cdot\,;\,\cdot\,]$ represents the concatenation of two vectors, $F(x_i)$ is the feature extracted from the picture feature F at position $x_i$, and $x'_i$ is a translation-invariant version of $x_i$. The picture feature F is a three-dimensional tensor similar to a picture and $x_i$ is a two-dimensional coordinate, so $F(x_i)$ can be obtained by interpolating F at position $x_i$. $F(x_i)$ carries the high-level features that the network has learned from the picture, which encode the picture content. Besides semantic content, curve deformation also needs to know the relative distribution of the nodes, so the two-dimensional coordinates of each node are added. Because the curve deformation should not change with the position of the curve in the picture, the invention constructs translation-invariant two-dimensional coordinates $x'_i$ as follows: over all nodes $\{x_i \mid i = 1, \dots, N\}$, the minimum x-axis coordinate and the minimum y-axis coordinate are computed, and these two minima are then subtracted from the coordinates of every node.
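The per-node feature construction of step 3 (an interpolated picture feature concatenated with translation-invariant coordinates) can be sketched in plain Python. The sketch assumes bilinear interpolation, a feature map stored as nested lists in channel × height × width order, and node coordinates already expressed in feature-map units; the text only says "interpolation", and the names `bilinear` and `node_features` are hypothetical.

```python
import math


def bilinear(feature_map, x, y):
    """Interpolate a C x H x W feature map (nested lists) at point (x, y)."""
    h, w = len(feature_map[0]), len(feature_map[0][0])
    x = min(max(x, 0.0), w - 1.0)  # clamp to the valid range
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    out = []
    for channel in feature_map:
        top = (1 - ax) * channel[y0][x0] + ax * channel[y0][x1]
        bottom = (1 - ax) * channel[y1][x0] + ax * channel[y1][x1]
        out.append((1 - ay) * top + ay * bottom)
    return out


def node_features(feature_map, nodes):
    """Build the concatenation [F(x_i); x'_i] for every node of the curve."""
    min_x = min(x for x, _ in nodes)
    min_y = min(y for _, y in nodes)
    return [bilinear(feature_map, x, y) + [x - min_x, y - min_y] for x, y in nodes]


# One-channel 2 x 2 feature map and a three-node curve (toy data).
F = [[[0.0, 1.0],
      [2.0, 3.0]]]
curve = [(0.5, 0.5), (1.0, 0.0), (0.0, 1.0)]
features = node_features(F, curve)
```

Subtracting the per-curve minimum means the last two entries of each feature vector depend only on the shape of the curve, not on where it sits in the picture.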
4. Feature learning is performed on the curve by cyclic convolution, as shown in Fig. 2. There is one feature vector on each node, so N nodes give N feature vectors. In general, the N feature vectors can be regarded as a one-dimensional discrete function
$$f: \mathbb{Z} \to \mathbb{R}^{D}$$
(D being the length of a feature vector) and processed directly with one-dimensional convolution. However, standard one-dimensional convolution does not take the periodicity of the closed curve into account and destroys its topology. The present invention instead uses cyclic convolution to perform feature learning on the sequence of feature vectors defined on the curve. This sequence is a periodic signal $f_N$, which can be defined as:
$$(f_N)_i = f_{i \bmod N}$$
where $(f_N)_i$ represents the i-th element of the periodic feature vector sequence $f_N$, and $i \bmod N$ denotes the remainder of i divided by N.
The invention processes this periodic signal $f_N$ using a cyclic convolution, defined as:
$$(f_N * k)_i = \sum_{j=-r}^{r} (f_N)_{i+j}\, k_j$$
where the symbol $*$ denotes a standard one-dimensional convolution,
$$k: [-r, r] \to \mathbb{R}^{D}$$
is a learnable convolution kernel, and $k_j$ represents the j-th parameter vector of the convolution kernel k. In this formula, the size of the convolution kernel is 2r + 1.
Fig. 2 shows an example of a circular convolution on a curve. As shown in fig. 2, the lowest circular node sequence is a feature vector sequence defined on a curve, the middle 5 nodes represent one-dimensional convolution kernels of size 5, and the top is the result of the output of the circular convolution. The meaning of the convolution kernel is the same as that of a standard one-dimensional convolution, inner products are respectively made on 5 parameter vectors of the convolution kernel and feature vectors of 5 nodes taking the current node as the center, and the 5 inner product results are added to obtain the output of the current position.
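The size-5 example of Fig. 2 can be sketched in plain Python as follows. This is a single-output-channel illustration only; a real network layer would typically have several output channels and a bias term, and the function name `circular_conv` is hypothetical.

```python
def circular_conv(features, kernel):
    """Circular convolution over a closed sequence of feature vectors.

    features: N vectors of length D, one per curve node.
    kernel:   2r + 1 parameter vectors of length D (single output channel).
    Returns N scalars: output[i] = sum_j <features[(i + j) mod N], kernel[j + r]>.
    """
    n = len(features)
    r = len(kernel) // 2
    output = []
    for i in range(n):
        total = 0.0
        for j in range(-r, r + 1):
            neighbor = features[(i + j) % n]  # wrap around the closed curve
            total += sum(f * k for f, k in zip(neighbor, kernel[j + r]))
        output.append(total)
    return output


# Six nodes with 1-D features and an all-ones kernel of size 5 (r = 2).
feats = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
ones_kernel = [[1.0]] * 5
out = circular_conv(feats, ones_kernel)
```

Because indices wrap with `% n`, the first node sees the last two nodes as its left neighbors, so no padding is needed and every node is treated identically.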
5. Based on the learned features, the offsets are regressed using a multi-layer perceptron or 1x1 convolution. Fig. 1 shows a schematic diagram of the offset regression. Specifically, the feature vector sequence defined on the curve is passed through a series of cyclic convolution network layers to obtain a learned feature vector on each node. For each node, a multi-layer perceptron or several 1x1 convolutional network layers can then be used for feature transformation, mapping the features to the node offset space and predicting the offset of each node. The initialized curve has N nodes, and the object contour line also has N nodes. The nodes are in one-to-one correspondence, and the offset of each node is predicted in this way.
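A 1x1 convolution over the node sequence amounts to applying the same linear map to every node independently. The following sketch shows a single such layer mapping a D-dimensional learned feature to a two-dimensional offset; the weights and features are hand-picked toy values, and the name `pointwise_linear` is hypothetical.

```python
def pointwise_linear(features, weight, bias):
    """Apply the same linear map to every node (what a 1x1 convolution does).

    features: N vectors of length D.
    weight:   2 rows of length D, one row per output coordinate.
    bias:     2 scalars.
    Returns N offsets (dx, dy).
    """
    offsets = []
    for f in features:
        dx, dy = (sum(w * v for w, v in zip(row, f)) + b
                  for row, b in zip(weight, bias))
        offsets.append((dx, dy))
    return offsets


# Hypothetical learned features (D = 3) and hand-picked weights.
learned = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]
W = [[1.0, 0.0, 0.5],
     [0.0, -1.0, 0.0]]
b = [0.0, 0.5]
offsets = pointwise_linear(learned, W, b)
```

Stacking several such layers with a nonlinearity in between would give the per-node multi-layer perceptron mentioned in the text.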
As shown in Fig. 3, the present invention provides a deep neural network based on cyclic convolution as a curve deformer. The deep neural network comprises three parts: a network backbone, a feature fusion part and a prediction regression part. The network backbone is used for feature learning on the curve and is composed of a plurality of cyclic convolution network layers. The feature fusion part pools the feature vectors learned by the network backbone over all nodes, thereby fusing the information of all nodes on the closed curve into a fusion vector, and then concatenates the fusion vector to the feature vector learned at each node. The prediction regression part uses a multi-layer perceptron or several 1x1 convolutional network layers to map the fused feature vector on each node to a two-dimensional offset that points to the target node.
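The feature fusion part can be illustrated with a short sketch. The text does not say which pooling operation is used; max pooling is assumed here purely for illustration, and the name `fuse_global` is hypothetical.

```python
def fuse_global(features):
    """Max-pool over all nodes and append the pooled vector to each node.

    Max pooling is an assumption; the source only states that the
    per-node feature vectors are pooled.
    """
    dim = len(features[0])
    pooled = [max(f[d] for f in features) for d in range(dim)]
    return [f + pooled for f in features]


node_feats = [[1.0, 5.0], [3.0, 2.0], [0.0, 4.0]]
fused = fuse_global(node_feats)
```

After fusion every node carries both its own local feature and a summary of the whole curve, so the regressor can take global shape into account when predicting each offset.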
As shown in Fig. 4, the present invention provides an object segmentation method based on curve deformation. A picture is input, and an initial closed curve is obtained by using the target object detector. The closed curve can be a rectangular frame or a closed curve that roughly surrounds the object. When the closed curve is a rectangular frame, the invention connects the midpoints of the four sides of the rectangle to form a quadrangle. The quadrangle is input into the curve deformer provided by the invention, which deforms its four nodes to predict the four object poles. The object poles are the pixel points at the top, bottom, left and right extremes of the object. Based on the predicted poles of the object, an octagon can be constructed. Specifically, extending horizontal and vertical lines through the four poles forms a rectangular frame. For the top pole, a line segment whose length is a quarter of the side length of the rectangular frame is extended along the horizontal direction; similar operations for the bottom, left and right poles give four line segments in total. Connecting the four line segments yields the octagon. The octagon is sampled (preferably uniformly) to obtain N nodes, which are input into the curve deformer; the deformer deforms the N sampled nodes to predict N nodes on the object contour line.
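The octagon construction can be sketched as below. Several details are assumptions that the text does not fix: each segment is centered on its pole, it is clipped to the rectangular frame, and image coordinates with y growing downward are used. The name `octagon_from_extremes` is hypothetical.

```python
def octagon_from_extremes(top, right, bottom, left):
    """Build an octagon from four extreme points given as (x, y) pairs.

    Image coordinates are assumed: y grows downward, so `top` has the
    smallest y. Each extreme point is widened into a segment whose length is
    a quarter of the matching box side, centered on the point and clipped to
    the box (centering and clipping are assumptions of this sketch).
    """
    x_min, x_max = left[0], right[0]
    y_min, y_max = top[1], bottom[1]
    half_w = (x_max - x_min) / 8  # half of a quarter of the width
    half_h = (y_max - y_min) / 8  # half of a quarter of the height

    def clip(v, lo, hi):
        return max(lo, min(hi, v))

    # Vertices are listed in one consistent direction around the box.
    return [
        (clip(top[0] - half_w, x_min, x_max), y_min),
        (clip(top[0] + half_w, x_min, x_max), y_min),
        (x_max, clip(right[1] - half_h, y_min, y_max)),
        (x_max, clip(right[1] + half_h, y_min, y_max)),
        (clip(bottom[0] + half_w, x_min, x_max), y_max),
        (clip(bottom[0] - half_w, x_min, x_max), y_max),
        (x_min, clip(left[1] + half_h, y_min, y_max)),
        (x_min, clip(left[1] - half_h, y_min, y_max)),
    ]


octagon = octagon_from_extremes(top=(4.0, 0.0), right=(8.0, 4.0),
                                bottom=(4.0, 8.0), left=(0.0, 4.0))
```

The eight vertices returned here would then be resampled to N nodes before being passed to the curve deformer.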
As shown in Fig. 5, the present invention provides a method of processing an object comprising a plurality of connected regions. In Fig. 5, a car is divided by pillars into three unconnected regions. The invention first uses a target object detector to detect the complete object rectangular frame, and then detects the rectangular frames of all connected regions within it. The rectangular frame of each connected region is used as an initial closed curve and is deformed into the contour line of that connected region through the flow shown in Fig. 4. The contour lines of the three connected regions are then merged to complete the segmentation of the target object in the picture.
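One possible reading of "merging" the contour lines is to rasterize the union of the per-region polygons into a single binary mask. The sketch below takes that reading as an assumption; the even-odd ray-casting test and the names `point_in_polygon` and `merge_contours` are illustrative choices, not taken from the patent.

```python
def point_in_polygon(px, py, polygon):
    """Even-odd ray casting test for a closed polygon given as (x, y) nodes."""
    inside = False
    n = len(polygon)
    for i in range(n):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
        if (y1 > py) != (y2 > py):  # edge crosses the horizontal ray
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def merge_contours(contours, width, height):
    """Rasterize the union of several contours into one binary mask.

    Each pixel is tested at its center (x + 0.5, y + 0.5).
    """
    return [[int(any(point_in_polygon(x + 0.5, y + 0.5, c) for c in contours))
             for x in range(width)]
            for y in range(height)]


# Two visible parts of one object, separated by an occluder at column 2.
left_part = [(0, 0), (2, 0), (2, 2), (0, 2)]
right_part = [(3, 0), (5, 0), (5, 2), (3, 2)]
mask = merge_contours([left_part, right_part], width=5, height=2)
```

In this toy case the occluded column stays empty while both visible parts are labeled as the same object.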
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus embodiment, since it is substantially similar to the method embodiment, it is relatively simple to describe, and reference may be made to some descriptions of the method embodiment for relevant points.
The foregoing is merely a preferred embodiment of the present invention. Although the present invention has been disclosed in the context of preferred embodiments, it is not intended to be limited thereto. Using the methods and techniques disclosed above, those skilled in the art can make many possible variations and modifications to the disclosed solution, or modify it into equivalent embodiments, without departing from the scope of the solution. Therefore, any simple modification, equivalent change or refinement made to the above embodiments according to the technical essence of the present invention, without departing from the content of the technical solution of the present invention, still falls within the scope of protection of the technical solution of the present invention.

Claims (10)

1. An object segmentation method based on cyclic convolution, the method comprising:
determining an initialized closed curve of a target image, constructing a feature vector for each node of the closed curve, and performing feature learning on the closed curve through cyclic convolution;
predicting an offset on each node of the closed curve by using a regressor, wherein the offset points from the curve node to the object contour line node; adding the offset to each node coordinate to realize curve deformation;
iterating the curve deformation to obtain a final object contour line, and segmenting the object in the image.
2. The object segmentation method according to claim 1, wherein the constructing a feature vector for each node of the closed curve comprises: extracting picture features of the target image by using a deep neural network, and constructing a feature vector for each node based on the picture features.
3. The object segmentation method according to claim 2, wherein the feature vector of each node is formed by concatenating the feature extracted from the picture features at the two-dimensional coordinate of the node with the translation-invariant two-dimensional coordinate of the node.
4. The object segmentation method according to claim 1, wherein the feature learning by cyclic convolution includes: the feature vector sequence on the closed curve is a periodic one-dimensional discrete signal, and the periodic signal is processed by using cyclic convolution to realize feature learning.
5. The object segmentation method according to claim 1, wherein the regressor is a multi-layer perceptron or a plurality of 1x1 convolutional network layers.
6. The object segmentation method according to claim 1, wherein the determining an initialized closed curve of the target image comprises: the position of a target object is predicted using a target object detector based on deep learning, and the position of the target object is represented by a closed curve surrounding the object.
7. The object segmentation method according to claim 6, wherein the position of the target object in the target image is represented by a two-dimensional rectangular frame, the midpoints of each side of the two-dimensional rectangular frame are connected to obtain a quadrangle, the quadrangle is subjected to curve deformation, four corner points of the quadrangle are deformed into four poles of the object, an octagon is constructed based on the four poles, the octagon is used as an initialized closed curve, and the initialized closed curve is sampled and then subjected to curve deformation to obtain an object contour line.
8. The object segmentation method according to claim 1, wherein when the object is occluded, rectangular frames of respective connected regions of the object are detected in a complete object two-dimensional rectangular frame, then the rectangular frames are deformed into the contour lines of the connected regions, and the contour lines in the complete object rectangular frames are merged to segment the complete object.
9. The object segmentation method according to claim 1, wherein the feature vectors learned at each node are pooled, information of all nodes on the closed curve is fused to obtain a fused vector, the fused vector is concatenated to the feature vector previously learned at each node, and predictive regression is then performed.
10. An apparatus for object segmentation based on cyclic convolution, the apparatus comprising:
a feature learning module: inputting an initialized closed curve of a target object, constructing a feature vector for each node of the closed curve, and performing feature learning on the closed curve through cyclic convolution;
a curve deformation module: for each node of the closed curve, inputting the feature vector learned at the node and predicting an offset on the node by using a regressor, the offset pointing from the curve node to the object contour line node; adding the offset to each node coordinate to realize curve deformation;
an object segmentation module: iterating the curve deformation to obtain a final object contour line, and segmenting the object in the image.
CN201911374778.8A 2019-12-27 2019-12-27 Object segmentation method and device based on cyclic convolution Active CN111080666B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911374778.8A CN111080666B (en) 2019-12-27 2019-12-27 Object segmentation method and device based on cyclic convolution

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911374778.8A CN111080666B (en) 2019-12-27 2019-12-27 Object segmentation method and device based on cyclic convolution

Publications (2)

Publication Number Publication Date
CN111080666A CN111080666A (en) 2020-04-28
CN111080666B CN111080666B (en) 2022-07-15

Family

ID=70318411

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911374778.8A Active CN111080666B (en) 2019-12-27 2019-12-27 Object segmentation method and device based on cyclic convolution

Country Status (1)

Country Link
CN (1) CN111080666B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106846322A (en) * 2016-12-30 2017-06-13 西安电子科技大学 Based on the SAR image segmentation method that curve wave filter and convolutional coding structure learn
CN109543519A (en) * 2018-10-15 2019-03-29 天津大学 A kind of depth segmentation guidance network for object detection

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012154260A2 (en) * 2011-02-17 2012-11-15 The Johns Hopkins University Multiparametric non-linear dimension reduction methods and systems related thereto

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Curve shape blending method based on the Poisson equation; 雷开彬 et al.; Journal of Computer-Aided Design & Computer Graphics; 2007-03-30 (No. 03); full text *

Also Published As

Publication number Publication date
CN111080666A (en) 2020-04-28

Similar Documents

Publication Publication Date Title
Dang et al. Msr-gcn: Multi-scale residual graph convolution networks for human motion prediction
CN108549893B (en) End-to-end identification method for scene text with any shape
CN110837835B (en) End-to-end scene text identification method based on boundary point detection
US20200273192A1 (en) Systems and methods for depth estimation using convolutional spatial propagation networks
AU2016201908B2 (en) Joint depth estimation and semantic labeling of a single image
CN111027493B (en) Pedestrian detection method based on deep learning multi-network soft fusion
CN110378348B (en) Video instance segmentation method, apparatus and computer-readable storage medium
CN110852349B (en) Image processing method, detection method, related equipment and storage medium
CN113673425B (en) Multi-view target detection method and system based on Transformer
EP3815043A1 (en) Systems and methods for depth estimation via affinity learned with convolutional spatial propagation networks
CN108305260B (en) Method, device and equipment for detecting angular points in image
CN114969405A (en) Cross-modal image-text mutual inspection method
CN108764244B (en) Potential target area detection method based on convolutional neural network and conditional random field
Huang et al. Image saliency detection via multi-scale iterative CNN
CN116645592A (en) Crack detection method based on image processing and storage medium
Yao et al. As‐global‐as‐possible stereo matching with adaptive smoothness prior
CN116977674A (en) Image matching method, related device, storage medium and program product
Hu et al. PolyBuilding: Polygon transformer for building extraction
CN111080666B (en) Object segmentation method and device based on cyclic convolution
CN111652181A (en) Target tracking method and device and electronic equipment
JP2021184141A (en) Code decoding device, code decoding method, and program
CN111444834A (en) Image text line detection method, device, equipment and storage medium
JPH08335268A (en) Area extracting method
CN113139540B (en) Backboard detection method and equipment
CN115775220A (en) Method and system for detecting anomalies in images using multiple machine learning programs

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant