CN103942779A

CN103942779A - Image segmentation method based on combination of graph theory and semi-supervised learning

Info

Publication number: CN103942779A
Application number: CN201410118303.3A
Authority: CN
Inventors: 朱松豪; 陈玲玲; 罗青青; 李向向
Original assignee: Nanjing Post and Telecommunication University
Current assignee: Nanjing Post and Telecommunication University; Nanjing University of Posts and Telecommunications
Priority date: 2014-03-27
Filing date: 2014-03-27
Publication date: 2014-07-23

Abstract

The invention discloses an image segmentation method based on a combination of graph theory and semi-supervised learning. The method comprises the steps of: coarsely segmenting an image into a number of region blocks and mapping them to a weighted graph; constructing an affinity matrix using a semi-supervised learning method; and performing semantic segmentation of the image using the normalized cut method. By segmenting the image on the basis of graph theory and semi-supervised learning, the method can improve segmentation precision; it benefits image segmentation and object extraction, and can also help advance fields such as pattern recognition, computer vision, and artificial intelligence.

Description

An image segmentation method based on a combination of graph theory and semi-supervised learning

Technical field

The present invention relates to the technical field of image processing, and in particular to an image segmentation method based on a combination of graph theory and semi-supervised learning.

Background art

Image segmentation and object extraction form an important branch of image processing and computer vision and have long attracted the attention of many researchers. They are also widely applied in fields such as pattern recognition, computer vision, and artificial intelligence. In-depth research on image segmentation and object extraction therefore not only helps to solve these problems more completely, but also helps to advance pattern recognition, computer vision, artificial intelligence, and related fields. This is the direct motivation for the present invention.

Traditional image segmentation methods include: the mean shift method, a region-based segmentation method that is insensitive to the smooth regions and texture regions of an image and has good robustness; the normalized cut method, a graph-theory-based segmentation method that is very sensitive to parameter selection and computationally expensive; and the K-means method, a clustering-based segmentation method suited to images with low requirements on segmentation precision or with a wide brightness distribution. The present invention addresses the above problems well.

Summary of the invention

The object of the present invention is to segment an image using a graph-theory-based method: the mean shift method is used to obtain a coarse segmentation of the image; the resulting image blocks are then treated as nodes of a graph model, and associations between adjacent blocks are established, thereby building a weighted graph based on image blocks. The method modifies the affinity matrix between nodes accordingly so that the associations between nodes are modeled better, which improves the precision of image segmentation.

The technical solution adopted by the present invention to solve its technical problem is as follows: the invention provides an image segmentation method based on a combination of graph theory and semi-supervised learning, the method comprising the following steps:

Step 1: coarsely segment the image into a number of region blocks and map them to a weighted graph;

Because the method proposed by the present invention performs block-based image segmentation, the construction of the affinity graph is modified accordingly.

In image processing, neighboring pixels from the same region are generally assumed to have similar color values, and the elements of the affinity model W with color as the feature are expressed as:

w_{ij}^{c} = \exp\left(-\theta_{p} \|p_{i} - p_{j}\|^{2} - \theta_{c} \|c_{i} - c_{j}\|\right) \qquad (1)

In the above formula, p_i denotes the position of pixel i, c_i denotes its color value, and θ_p and θ_c are constants.

The present invention treats each image region obtained after coarse segmentation as a node. V in the weighted graph is then the set of all nodes, and E is the set of edges connecting pairs of nodes, with weights:

w_{ij} = \exp\left(-\theta_{c} \|c_{i} - c_{j}\|\right) \qquad (2)

In the above formula, c_i denotes the mean color of segmented region i, and θ_c is a constant controlling the weight. Formula (2) shows that, when the weights are computed, the affinity between pixels in the same segmented region is greater than the affinity between pixels from different blocks; formula (2) therefore also implicitly captures the edge characteristics of the image.
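As a minimal sketch of formula (2), assuming the mean colors of the coarse regions are already available as an n × 3 array, the weights could be computed with NumPy as follows. The function name `region_affinity` and the sample values are illustrative and do not come from the source; the sketch connects every pair of nodes and does not restrict edges to adjacent blocks.

```python
import numpy as np

def region_affinity(mean_colors, theta_c):
    """Weights w_ij = exp(-theta_c * ||c_i - c_j||) between region nodes (formula (2))."""
    c = np.asarray(mean_colors, dtype=float)
    # Pairwise Euclidean distances between the mean colors of all regions.
    dist = np.linalg.norm(c[:, None, :] - c[None, :, :], axis=2)
    return np.exp(-theta_c * dist)

# Hypothetical example: two dark regions and one bright region.
colors = [[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [1.0, 1.0, 1.0]]
W = region_affinity(colors, theta_c=2.0)
```

Regions with similar mean colors receive weights close to 1, while regions separated by a strong color difference receive weights close to 0, which is how the weights reflect region boundaries.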

Step 2: construct the affinity matrix using a semi-supervised learning method;

The present invention uses a semi-supervised learning method to compute the affinity matrix between all nodes. Specifically, all nodes are first divided into labeled nodes and unlabeled nodes (labeled nodes being the minority), and semi-supervised learning is then used to compute the affinity between labeled and unlabeled nodes. In a general unsupervised learning framework, propagating label information through an iterative process takes a large amount of time; the present invention therefore performs label propagation directly to reduce the time cost.

The affinity vector between node m and all nodes is expressed as:

a_{m} = \alpha \left(I - (1 - \alpha) P\right)^{-1} b_{m} \qquad (3)

where the identity matrix I has dimension n, the number of coarse-segmentation regions; the matrix P = D^{-1} W; and each element d_i of the diagonal matrix D = diag(d_1, d_2, ..., d_n) is:

d_{i} = \sum_{j=1}^{n} w_{ij} \qquad (4)

In addition, the initial state vector b_m = [b_im]_{n × 1} in formula (3) takes the value b_im = 1 at the labeled node m and 0 at all other nodes; the parameter α represents the ratio between a node's original label information and the label information it receives from adjacent nodes.
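The following sketch contrasts the direct computation of formula (3) with an iterative propagation. The iterative update a ← (1 − α) P a + α b_m is an assumption made for illustration (its fixed point is formula (3)); the function names and the sample matrix W are hypothetical.

```python
import numpy as np

def affinity_vector_direct(W, alpha, m):
    """Formula (3): a_m = alpha * (I - (1 - alpha) P)^-1 b_m, solved directly."""
    n = W.shape[0]
    P = W / W.sum(axis=1, keepdims=True)   # P = D^-1 W, with d_i from formula (4)
    b = np.zeros(n)
    b[m] = 1.0                              # initial state vector b_m
    # Solve the linear system rather than forming the inverse explicitly.
    return alpha * np.linalg.solve(np.eye(n) - (1.0 - alpha) * P, b)

def affinity_vector_iterative(W, alpha, m, n_iter=200):
    """Assumed iterative scheme a <- (1 - alpha) P a + alpha b_m, for comparison."""
    n = W.shape[0]
    P = W / W.sum(axis=1, keepdims=True)
    b = np.zeros(n)
    b[m] = 1.0
    a = b.copy()
    for _ in range(n_iter):
        a = ((1.0 - alpha) * P) @ a + alpha * b
    return a

# Hypothetical 3-node weight matrix.
W = np.array([[1.0, 0.8, 0.1],
              [0.8, 1.0, 0.2],
              [0.1, 0.2, 1.0]])
a_direct = affinity_vector_direct(W, alpha=0.3, m=0)
a_iter = affinity_vector_iterative(W, alpha=0.3, m=0)
```

Both routes give the same vector up to numerical tolerance; the direct route needs one linear solve, whereas the iterative route needs many matrix-vector products.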

The above method minimizes the following energy function on the graph:

Q = \sum_{i,j=1}^{n} w_{ij} \left| \frac{a_{im}^{u}}{\sqrt{d_{i}}} - \frac{a_{jm}^{u}}{\sqrt{d_{j}}} \right|^{2} - \mu \sum_{i=1}^{n} \left| a_{im}^{u} - b_{im} \right|^{2} \qquad (5)

Formula (5) indicates that, to obtain a good segmentation, the segmentation process should satisfy a smoothness constraint and a fitting constraint. The smoothness constraint requires that the label information between adjacent nodes does not change too much before and after image segmentation; the fitting constraint requires that the label information on any given node does not change too much before and after image segmentation.

Formula (3) estimates the local smooth variation between nodes well; the affinity matrix in the present invention is therefore:

A = [\vec{a}_{1}^{u}, \ldots, \vec{a}_{n}^{u}] = \alpha \left(I - (1 - \alpha) P\right)^{-1} \qquad (6)

In the above formula, P = D^{-1} W, and the matrix (I - (1 - α) P) is a sparse positive definite matrix.
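A minimal sketch of formula (6), assuming a small dense weight matrix W (the source describes the matrix as sparse, so a sparse solver would be the natural choice at scale). The function name `affinity_matrix` and the sample values are illustrative.

```python
import numpy as np

def affinity_matrix(W, alpha):
    """Formula (6): A = alpha * (I - (1 - alpha) P)^-1 with P = D^-1 W."""
    n = W.shape[0]
    d = W.sum(axis=1)          # degrees d_i, formula (4)
    P = W / d[:, None]         # row-normalized weights
    return alpha * np.linalg.inv(np.eye(n) - (1.0 - alpha) * P)

# Hypothetical 3-node weight matrix.
W = np.array([[1.0, 0.8, 0.1],
              [0.8, 1.0, 0.2],
              [0.1, 0.2, 1.0]])
A = affinity_matrix(W, alpha=0.3)
row_sums = A.sum(axis=1)
```

Because every row of P sums to 1, every row of A also sums to 1, which gives a quick sanity check on an implementation. Column m of A is the vector a_m of formula (3).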

Step 3: perform semantic segmentation of the image using the normalized cut method.

The present invention treats image segmentation as a label assignment problem: a label k ∈ {1, 2, ..., K} from the label set is assigned to each block node i in the image. The segmentation vector y_k = [y_ik]_{n × 1} takes the value y_ik = 1 for nodes in the k-th segmented region and 0 for all other nodes.

S(Y) = \frac{1}{K} \sum_{k=1}^{K} \frac{y_{k}^{T} A y_{k}}{y_{k}^{T} D y_{k}} \qquad (7)

where Y = [y_1, y_2, ..., y_n] is the segmentation matrix and satisfies YY^T = I, and the diagonal matrix D has the same meaning as in formula (3).

The optimal solution of formula (7) is the subspace spanned by the eigenvectors associated with the K smallest eigenvalues of the matrix D^{1/2} (I - (1 - α) P) D^{1/2}.
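The eigen-subspace named above can be sketched as follows, assuming a small dense W. The matrix is built exactly as written in the text; because it is not symmetric in general, the sketch uses a general eigen-solver and keeps the real parts. How the subspace is then discretized into K region labels is not specified in the source and is left out. The function names are illustrative.

```python
import numpy as np

def build_matrix(W, alpha):
    """M = D^(1/2) (I - (1 - alpha) P) D^(1/2), with P = D^-1 W."""
    n = W.shape[0]
    d = W.sum(axis=1)
    P = W / d[:, None]
    D_half = np.diag(np.sqrt(d))
    return D_half @ (np.eye(n) - (1.0 - alpha) * P) @ D_half

def smallest_k_subspace(M, K):
    """Eigenvalues and eigenvectors of M for its K smallest eigenvalues."""
    vals, vecs = np.linalg.eig(M)
    order = np.argsort(vals.real)[:K]      # indices of the K smallest eigenvalues
    return vals.real[order], vecs.real[:, order]

# Hypothetical 3-node weight matrix.
W = np.array([[1.0, 0.8, 0.1],
              [0.8, 1.0, 0.2],
              [0.1, 0.2, 1.0]])
M = build_matrix(W, alpha=0.3)
vals, vecs = smallest_k_subspace(M, K=2)
```

The columns of `vecs` span the subspace; each row can be read as a K-dimensional embedding of one region node.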

Beneficial effects:

1. The graph-theoretic approach used by the present invention is relatively mature both in theory and in practice.

2. After the present invention maps the relations between image pixels to a weighted graph, the image can be processed using graph theory.

3. The present invention can improve the precision and quality of image segmentation, helps to solve image segmentation and object extraction more completely, and helps to advance fields such as pattern recognition, computer vision, and artificial intelligence.

Brief description of the drawings

Fig. 1 is a flowchart of the method of the present invention.

Fig. 2 shows a comparison before and after preprocessing;

Legend: a, c, and e are original images; b, d, and f are the preprocessed images.

Fig. 3 shows example images from the Berkeley image library.

Fig. 4 shows example images from the MSRC image library.

Fig. 5 shows two groups of comparative examples of segmentation results on the Berkeley image library for four different methods;

Legend, from left to right: original image, segmentation result of the mean shift method, segmentation result of the normalized cut method, segmentation result of the multiscale normalized cut method, and segmentation result of the method of the present invention.

Fig. 6 shows two groups of comparative examples of segmentation results on the MSRC image library for four different methods;

Embodiment

The invention is described in further detail below with reference to the accompanying drawings.

As shown in Fig. 1, the invention provides an image segmentation method based on a combination of graph theory and semi-supervised learning. Graph-theory-based methods use a weighted graph G = (V, E) to map the relations between image pixels, where the nodes V represent image pixels and the weights E represent the relations between image pixels. The approach is to first obtain a coarse segmentation of the image using the mean shift method, then treat the resulting image blocks as nodes of the graph model and establish associations between adjacent blocks, thereby building a weighted graph based on image blocks.

Method flow:

Pixels are a discrete representation of image information, and as resolution keeps increasing, pixel-based optimization becomes more and more difficult. The present invention therefore first preprocesses the image with the mean shift method, dividing it into a number of small blocks with locality and continuity. Fig. 2 gives some examples comparing images before and after preprocessing.
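A minimal sketch of the coarse-segmentation idea, assuming a flat-kernel mean shift on color features only; the source does not specify the kernel, the bandwidth, or whether pixel position is included. The helper names `mean_shift_modes`, `group_modes`, and `region_mean_colors`, and the sample colors, are illustrative.

```python
import numpy as np

def mean_shift_modes(points, bandwidth, n_iter=50, tol=1e-6):
    """Shift every point to the mean of the original points within `bandwidth`."""
    pts = np.asarray(points, dtype=float)
    modes = pts.copy()
    for _ in range(n_iter):
        dist = np.linalg.norm(modes[:, None, :] - pts[None, :, :], axis=2)
        inside = dist <= bandwidth                      # flat kernel window
        new_modes = (inside[:, :, None] * pts[None, :, :]).sum(axis=1)
        new_modes /= inside.sum(axis=1, keepdims=True)
        shift = np.abs(new_modes - modes).max()
        modes = new_modes
        if shift < tol:
            break
    return modes

def group_modes(modes, bandwidth):
    """Give points whose modes lie close together the same region label."""
    labels = np.full(len(modes), -1, dtype=int)
    centers = []
    for i, mode in enumerate(modes):
        for k, center in enumerate(centers):
            if np.linalg.norm(mode - center) < bandwidth / 2:
                labels[i] = k
                break
        else:                                           # no existing center matched
            centers.append(mode)
            labels[i] = len(centers) - 1
    return labels

def region_mean_colors(points, labels):
    """Mean color of each region, usable as c_i in formula (2)."""
    pts = np.asarray(points, dtype=float)
    return np.array([pts[labels == k].mean(axis=0) for k in range(labels.max() + 1)])

# Hypothetical pixel colors: three dark pixels and three bright pixels.
pixels = [[0.0, 0.0, 0.0], [0.05, 0.0, 0.0], [0.0, 0.05, 0.0],
          [1.0, 1.0, 1.0], [0.95, 1.0, 1.0], [1.0, 0.95, 1.0]]
modes = mean_shift_modes(pixels, bandwidth=0.3)
labels = group_modes(modes, bandwidth=0.3)
means = region_mean_colors(pixels, labels)
```

The per-region mean colors returned by `region_mean_colors` are the quantities a block-based graph would use as node features.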

It is well known that, in graph-theory-based image segmentation, segmentation quality depends to a large extent on the affinity graph that is constructed. Many researchers now use local features such as color and edges to build a suitable affinity graph, and all of these methods use pixels directly as the nodes of the graph. Because the method proposed by the present invention performs block-based image segmentation, the construction of the affinity graph is modified accordingly.

w_{ij}^{c} = \exp\left(-\theta_{p} \|p_{i} - p_{j}\|^{2} - \theta_{c} \|c_{i} - c_{j}\|\right) \qquad (1)

The present invention treats each image region obtained after coarse segmentation as a node. V in the weighted graph is then the set of all nodes, and E is the set of edges connecting pairs of nodes, with weights:

w_{ij} = \exp\left(-\theta_{c} \|c_{i} - c_{j}\|\right) \qquad (2)

Step 2: construct the affinity matrix using a semi-supervised learning method;

Semi-supervised learning has been an active research topic in pattern recognition and machine learning in recent years; it is a learning method that combines supervised and unsupervised learning. It mainly considers how to train with a small number of labeled samples and classify a large number of unlabeled samples, and it is of great practical significance for reducing labeling cost and improving learning performance. The key to semi-supervised learning is the consistency assumption: (1) adjacent nodes have the same label; (2) nodes on the same structure have the same label. These assumptions are known as the cluster assumption.

The present invention uses a semi-supervised learning method to compute the affinity matrix between all nodes. Specifically, all nodes are first divided into labeled nodes and unlabeled nodes (labeled nodes being the minority), and semi-supervised learning is then used to compute the affinity between labeled and unlabeled nodes. In a general unsupervised learning framework, propagating label information through an iterative process takes a large amount of time; the present invention therefore performs label propagation directly to reduce the time cost.

The affinity vector between node m and all nodes is expressed as:

a_{m} = \alpha \left(I - (1 - \alpha) P\right)^{-1} b_{m} \qquad (3)

d_{i} = \sum_{j=1}^{n} w_{ij} \qquad (4)

The above method is essentially equivalent to minimizing the following energy function on the graph:

Q = \sum_{i,j=1}^{n} w_{ij} \left| \frac{a_{im}^{u}}{\sqrt{d_{i}}} - \frac{a_{jm}^{u}}{\sqrt{d_{j}}} \right|^{2} - \mu \sum_{i=1}^{n} \left| a_{im}^{u} - b_{im} \right|^{2} \qquad (5)

Formula (3) estimates the local smooth variation between nodes well; the affinity matrix used in the present invention is therefore:

A = [\vec{a}_{1}^{u}, \ldots, \vec{a}_{n}^{u}] = \alpha \left(I - (1 - \alpha) P\right)^{-1} \qquad (6)

The present invention treats image segmentation as a label assignment problem: a label k ∈ {1, 2, ..., K} from the label set is assigned to each block node i in the image. The segmentation vector y_k = [y_ik]_{n × 1} takes the value y_ik = 1 for nodes in the k-th segmented region and 0 for all other nodes.

S(Y) = \frac{1}{K} \sum_{k=1}^{K} \frac{y_{k}^{T} A y_{k}}{y_{k}^{T} D y_{k}} \qquad (7)

To evaluate performance, the present invention uses two common image libraries as experimental image sets. The first is the Berkeley image library, which contains 12,000 manually segmented blocks from 300 images; 200 images are used for training and the other 100 for testing. Fig. 3 gives some examples from the Berkeley image library. The second is the Microsoft Research Cambridge (MSRC) image library, which contains 591 images from 29 categories; 216 of them are used for training and another 275 for testing. Fig. 4 gives some examples from the MSRC image library.

In this experiment, the present invention is compared with three classical image segmentation methods:

Mean-shift-based segmentation: obtains the segmentation map of the image by selecting a suitable kernel function.

Normalized-cut-based segmentation: obtains the segmentation map of the image by maximizing the association within each subgraph of the graph model.

Multiscale normalized-cut-based segmentation: obtains the segmentation map of the image by gathering statistics of the image structure over neighborhoods of gradually expanding extent.

Table 1: Comparative experiment on the Berkeley image library.

The comparison of the four methods in Table 1 shows that the present invention outperforms the other three classical methods in every respect. The reasons include: (1) the associations between segmented images and unsegmented images are considered together, which helps to better reflect the correlation between region blocks in the image; (2) the structure of the affinity matrix is improved, which helps to achieve semantic segmentation of the image more accurately.

Fig. 5 and Fig. 6 give example segmentation results on the Berkeley and MSRC image libraries, respectively, for the four different methods; the number of segmented region blocks K is set to 20 on the Berkeley image library and 30 on the MSRC image library.

Claims

1. An image segmentation method based on a combination of graph theory and semi-supervised learning, characterized in that the method comprises the following steps:

In step 1 above, each image region obtained after coarse segmentation is treated as a node; V in the weighted graph is then the set of all nodes, and E is the set of edges connecting pairs of nodes, with weights:

w_{ij} = \exp\left(-\theta_{c} \|c_{i} - c_{j}\|\right) \qquad (2)

In formula (2) above, c_i denotes the mean color of a segmented region and θ_c is a constant controlling the weight; when the weights are computed, the affinity between pixels in the same segmented region is greater than the affinity between pixels from different blocks;

Step 2: construct the affinity matrix using a semi-supervised learning method;

In step 2 above, all nodes are first divided into labeled nodes and unlabeled nodes, labeled nodes being the minority; semi-supervised learning is then used to compute the affinity between labeled and unlabeled nodes, and label propagation is performed directly to reduce the time cost;

The affinity vector between node m and all nodes is expressed as:

a_{m} = \alpha \left(I - (1 - \alpha) P\right)^{-1} b_{m} \qquad (3)

d_{i} = \sum_{j=1}^{n} w_{ij} \qquad (4)

In formula (3) above, the initial state vector b_m = [b_im]_{n × 1} takes the value b_im = 1 at the labeled node m and 0 at all other nodes; the parameter α represents the ratio between a node's original label information and the label information it receives from adjacent nodes;

The method minimizes the following energy function on the graph:

Q = \sum_{i,j=1}^{n} w_{ij} \left| \frac{a_{im}^{u}}{\sqrt{d_{i}}} - \frac{a_{jm}^{u}}{\sqrt{d_{j}}} \right|^{2} - \mu \sum_{i=1}^{n} \left| a_{im}^{u} - b_{im} \right|^{2} \qquad (5)

Formula (5) indicates that, to obtain a good segmentation, the segmentation process should satisfy a smoothness constraint and a fitting constraint, wherein the smoothness constraint requires that the label information between adjacent nodes does not change too much before and after image segmentation, and the fitting constraint requires that the label information on any given node does not change too much before and after image segmentation;

Formula (3) estimates the local smooth variation between nodes well, and the affinity matrix is:

A = [\vec{a}_{1}^{u}, \ldots, \vec{a}_{n}^{u}] = \alpha \left(I - (1 - \alpha) P\right)^{-1} \qquad (6)

In formula (6) above, P = D^{-1} W, and the matrix (I - (1 - α) P) is a sparse positive definite matrix;

Step 3: perform semantic segmentation of the image using the normalized cut method;

In step 3 above, image segmentation is treated as a label assignment problem: a label k ∈ {1, 2, ..., K} from the label set is assigned to each block node i in the image, and the segmentation vector y_k = [y_ik]_{n × 1} takes the value y_ik = 1 for nodes in the k-th segmented region and 0 for all other nodes;

S(Y) = \frac{1}{K} \sum_{k=1}^{K} \frac{y_{k}^{T} A y_{k}}{y_{k}^{T} D y_{k}} \qquad (7)

2. The image segmentation method based on a combination of graph theory and semi-supervised learning according to claim 1, characterized in that formula (2) of step 1 of the method captures the edge characteristics of the image.

3. The image segmentation method based on a combination of graph theory and semi-supervised learning according to claim 1, characterized in that step 3 of the method comprises: the optimal solution of formula (7) above is the subspace spanned by the eigenvectors associated with the K smallest eigenvalues of the matrix D^{1/2} (I - (1 - α) P) D^{1/2}.

4. The image segmentation method based on a combination of graph theory and semi-supervised learning according to any one of claims 1-3, characterized in that the method uses a weighted graph G = (V, E) to map the relations between image pixels, where the nodes V represent image pixels and the weights E represent the relations between image pixels.