CN104778683A

CN104778683A - Multi-modal image segmenting method based on functional mapping

Info

Publication number: CN104778683A
Application number: CN201510040592.4A
Authority: CN
Inventors: 李平; 李黎; 李建军; 俞俊
Original assignee: Hangzhou Dianzi University
Current assignee: Hangzhou Huicui Intelligent Technology Co ltd
Priority date: 2015-01-27
Filing date: 2015-01-27
Publication date: 2015-07-15
Anticipated expiration: 2035-01-27
Also published as: CN104778683B

Abstract

The invention relates to a multi-modal image segmentation method based on functional mapping. For an image set containing targets, the method comprises the following steps: (1) segmenting each image into superpixel blocks and describing the superpixels with different feature descriptors to obtain a multi-modal image representation; (2) building a superpixel graph on the multi-modal images and constructing the corresponding Laplacian matrices; (3) characterizing the reduced functional space of each image and establishing functional maps between image pairs; (4) aligning the functional map of each modality with the image cues and introducing latent (implicit) functions to keep the functional maps consistent; (5) obtaining the functional map representation from the multi-modal map consistency and computing the segmentation function of each image by jointly optimizing an objective function, which gives the optimal segmentation of the image. By exploiting the latent target correlations shared among the different modal feature representations of the images and among the images themselves, the method can accurately identify each target region block of an image, which improves segmentation performance.

Description

A multi-modal image segmentation method based on functional mapping

Technical field

The invention belongs to the technical field of image segmentation within image processing, and in particular relates to a multi-modal image segmentation method based on functional mapping.

Background technology

The rapid development of digital imaging technology has given rise to many new industries, such as remote-sensing satellite image positioning, medical image analysis, and intelligent traffic recognition, and has helped the information society mature. Images are an important bridge through which humans perceive the world and are closely tied to the field of vision. For example, the demand for image processing in vision applications across artificial intelligence, machine vision, physiology, medicine, meteorology, and military science keeps growing, and image processing plays an increasingly critical role. Image segmentation, as an image preprocessing method, lays a solid foundation for mid-level and high-level semantic analysis of images; applications such as image recognition, target localization, and edge detection can all use image segmentation to improve performance.

Image segmentation, as the name suggests, partitions a given image into regions according to some rule or target. For example, a picture taken at a lakeside can be divided into regions representing different semantic classes, such as lake surface, people, boats, houses, trees, and sky; the people and boats can be regarded as foreground targets and the rest as background. Traditional image segmentation techniques mainly process cues of a single image such as gray level, color, texture, and shape. Typical methods include threshold segmentation, region segmentation, edge segmentation, graph segmentation, and energy-functional-based segmentation. Threshold segmentation assigns each pixel to a class by comparing its gray value with a set threshold. Edge segmentation detects edges from the step-like or abrupt changes of gray value along them. Region segmentation decides according to an image similarity criterion and mainly includes watershed, region splitting and merging, and region growing. Graph segmentation treats the image as an undirected graph whose vertices are pixels and whose edges connect neighboring pixels, with each segmented region regarded as a subgraph. Energy-functional-based segmentation represents the target edge with a continuous curve and obtains the segmentation by minimizing an energy functional; it is generally divided into parametric active contour models and geometric active contour models.

The shortcomings of the above methods are mainly the following. First, they operate directly on the raw pixels of the image, which raises the time complexity of the algorithm and increases computational cost. Second, low-level techniques such as threshold segmentation and edge segmentation are hard to relate to the semantic features of the image. Third, they ignore the complementary information between images; images that contain similar targets in particular share common structure and latent information, and ignoring it directly degrades the segmentation of the image targets. These methods are therefore ill-suited to large-scale segmentation of images containing common targets, which adversely affects large-scale practical applications such as image recognition and target localization. For these reasons, applications such as intelligent traffic recognition, medical image analysis, and large-scale image recognition urgently need an image segmentation technique that links the low-level and semantic features of images and makes effective, multi-faceted use of the latent structural information between images.

Summary of the invention

To make effective use of the latent structural information between images, reduce the computational complexity of image segmentation, and improve the segmentation of targets in images, the invention proposes a multi-modal image segmentation method based on functional mapping, comprising the following steps:

1. After an image set containing targets is obtained, the following operations are carried out:

1) each image in the set is segmented into superpixel blocks, and the segmented superpixels are described with different feature descriptors to obtain a multi-modal image representation;

2) a superpixel-based graph is built on the multi-modal images, and the corresponding Laplacian matrices are constructed;

3) the reduced functional space of each image is characterized, and functional maps between image pairs are established;

4) the functional map of each modality is aligned with the image cues, and latent (implicit) functions are introduced to keep the functional maps consistent;

5) the functional map representation is obtained from the multi-modal map consistency, and the segmentation function of each image is computed by jointly optimizing an objective function, which gives the optimal segmentation representation and completes the image segmentation.

Further, in step 1), segmenting each image in the set into superpixel blocks and describing the segmented superpixels with different feature descriptors to obtain the multi-modal image representation is carried out as follows:

1) Let the set consist of n related images; each image contains one or more target classes, and the number of target classes in the whole set is C;

2) The pixels of an image are regarded as the vertices of a graph, and a graph segmentation method divides each image in the set into q small regions (for example, 100), where q is a positive integer. These small regions consist of pixels with similar values and are called superpixels. The block belonging to class c in the i-th image is denoted S_{ic}, where i ∈ {1, 2, ..., n} and c ∈ {1, 2, ..., C};

3) m different feature descriptors, such as the scale-invariant feature transform (SIFT), local binary patterns (LBP), and the histogram of oriented gradients (HOG), are used to describe each superpixel in the image, giving a multi-modal feature representation that reflects the extrinsic information of the image from several angles. The i-th image thus corresponds to a set of matrices, in which the k-th feature descriptor yields the k-th modality X_{i}^{k}.
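The construction of one feature matrix per modality can be illustrated with a minimal NumPy sketch. It assumes a superpixel label map and dense per-pixel descriptor maps are already available, and it pools each descriptor by averaging inside every superpixel; the pooling rule and the names `superpixel_features` and `multimodal_representation` are illustrative assumptions, not details taken from the patent text.

```python
import numpy as np

def superpixel_features(labels, pixel_features, q):
    """Average per-pixel features inside each superpixel.

    labels: (H, W) integer array, values in [0, q), one superpixel id per pixel.
    pixel_features: (H, W, d) array produced by one (hypothetical) descriptor.
    Returns a (q, d) matrix with one row per superpixel.
    """
    flat_labels = labels.ravel()
    flat_feats = pixel_features.reshape(-1, pixel_features.shape[-1])
    sums = np.zeros((q, flat_feats.shape[1]))
    np.add.at(sums, flat_labels, flat_feats)      # accumulate features per superpixel
    counts = np.bincount(flat_labels, minlength=q).astype(float)
    counts[counts == 0] = 1.0                     # guard against unused superpixel ids
    return sums / counts[:, None]

def multimodal_representation(labels, descriptor_maps, q):
    """One (q, d_k) matrix per descriptor; the list stands in for the m modalities."""
    return [superpixel_features(labels, fmap, q) for fmap in descriptor_maps]
```

With m descriptor maps the returned list has m entries, one per modality, each describing the same q superpixels.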

Further, in step 2), building the superpixel-based graph on the multi-modal images and constructing the corresponding Laplacian matrices is carried out as follows:

2.1) The q superpixels in each image modality are regarded as the vertices of a graph, and a fully connected superpixel graph is built on these vertices;

2.2) A Laplacian matrix is built on the superpixel graph of each modality from a weight matrix W computed with a Gaussian weighting scheme, i.e., L = D - W, where D is a diagonal matrix whose diagonal entries are the column sums of W.
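As a sketch of step 2.2), the following NumPy function builds L = D - W for a fully connected superpixel graph. The Gaussian bandwidth `sigma`, the use of squared Euclidean distance between superpixel feature vectors, and the removal of self-loops are assumptions made for illustration; the source only states that W comes from a Gaussian weighting scheme.

```python
import numpy as np

def gaussian_laplacian(features, sigma=1.0):
    """Graph Laplacian of a fully connected superpixel graph.

    features: (q, d) matrix, one feature vector per superpixel (one modality).
    Returns (L, W, D) with L = D - W.
    """
    diffs = features[:, None, :] - features[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)          # pairwise squared distances
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))      # Gaussian weights
    np.fill_diagonal(W, 0.0)                        # assumption: no self-loops
    D = np.diag(W.sum(axis=0))                      # diagonal of column sums of W
    return D - W, W, D
```

Because W is symmetric, every row of L sums to zero, which is a quick sanity check on the construction.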

In step 3), characterizing the linear reduced functional space of each image and establishing functional maps between image pairs is carried out as follows:

1) The eigenvalues and eigenvectors of the multi-modal Laplacian matrices are computed; in each modality of each image, the first p (p < q) eigenvectors span the reduced functional space, and the corresponding eigenvalues form the diagonal matrix Λ_{i}^{k};

2) Let the segmentation function of each image be f_{oi}, corresponding to S_{ic}. The search space of this function is the p-dimensional space spanned by a set of basis vectors, and f_{oi} is expressed as a linear combination of these basis vectors with a coefficient vector for the i-th image, where B_{i} is formed by the first p eigenvectors of the Laplacian matrix;

3) A functional map reflects the relation between any pair of images. The linear functional map from the subspace of the i-th image to the subspace of the j-th image is represented by a matrix R_{ij}; that is, the representation in the subspace of the j-th image of a function f in the subspace of the i-th image is obtained by computing R_{ij} f.
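The reduced functional space and the action of a functional map can be sketched as follows. The sketch assumes that "the first p eigenvectors" means those with the smallest eigenvalues and that the Laplacian is symmetric, so `np.linalg.eigh` applies; the helper names are hypothetical.

```python
import numpy as np

def reduced_basis(L, p):
    """First p eigenvectors (smallest eigenvalues) of a symmetric Laplacian.

    Returns B of shape (q, p) with orthonormal columns and the (p, p)
    diagonal matrix of the corresponding eigenvalues.
    """
    eigvals, eigvecs = np.linalg.eigh(L)     # eigenvalues in ascending order
    return eigvecs[:, :p], np.diag(eigvals[:p])

def to_coefficients(B, f_full):
    """Coefficients of a function on the q superpixels in the reduced basis."""
    return B.T @ f_full

def transfer(R_ij, B_i, B_j, f_full_i):
    """Map a function on image i to image j: project, apply R_ij, expand."""
    return B_j @ (R_ij @ to_coefficients(B_i, f_full_i))
```

Working with p coefficients instead of q superpixel values is what keeps each R_{ij} a small p x p matrix.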

In step 4), aligning the functional map of each modality with the image cues and introducing latent (implicit) functions to keep the functional maps consistent is carried out as follows:

1) The image cues correspond to the different descriptors. The functional map of each modality is aligned with the image cues by optimizing

\min_{R_{ij}} H(R_{ij}) = \sum_{k=1}^{m} \| R_{ij} X_{i}^{k} - X_{j}^{k} \|_{1} + \alpha \| R_{ij} \Lambda_{i}^{k} - \Lambda_{j}^{k} R_{ij} \|_{F}^{2} + \beta \| R_{ij} \|_{1},

where the constants α > 0 and β > 0, \| \cdot \|_{1} denotes the L1 norm of a matrix, and \| \cdot \|_{F} denotes the Frobenius norm of a matrix;
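To make the three terms of H concrete, the sketch below evaluates the objective for a given R_{ij}. It rests on several assumptions that the text does not settle: each X_{i}^{k} is taken to be a p x d_k matrix of descriptor coefficients in the reduced basis, the sum over k is taken to cover the commutativity term as well (since that term carries the index k), and the matrix L1 norm is read as the entrywise sum of absolute values.

```python
import numpy as np

def alignment_cost(R, X_i, X_j, Lam_i, Lam_j, alpha=1.0, beta=0.1):
    """Evaluate H(R) for one image pair (i, j).

    X_i, X_j: lists of m arrays of shape (p, d_k), one per modality.
    Lam_i, Lam_j: lists of m diagonal (p, p) eigenvalue matrices.
    """
    # Descriptor preservation: R should carry descriptors of i onto those of j.
    data = sum(np.abs(R @ Xi - Xj).sum() for Xi, Xj in zip(X_i, X_j))
    # Commutativity with the Laplacian eigenvalues of both images.
    commute = sum(np.linalg.norm(R @ Li - Lj @ R, 'fro') ** 2
                  for Li, Lj in zip(Lam_i, Lam_j))
    # Sparsity of the functional map.
    sparsity = np.abs(R).sum()
    return data + alpha * commute + beta * sparsity
```

For identical inputs on both images and R equal to the identity, only the sparsity term remains, which gives a simple check of an implementation.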

2) The latent (implicit) functions introduced are shared by the input images, and the functional map consistency term lets the functional maps effectively relate the corresponding latent functions on each image. Each latent function appears only in some subset of the images. For the i-th image, the indicator z_{i} = [z_{i1}, z_{i2}, ..., z_{il}] with entries in {0, 1} characterizes the relation between the latent functions and the image, and the continuous variable Φ_{i} = [φ_{i1}, φ_{i2}, ..., φ_{il}] describes each latent function on the image;

The functional map consistency term of the previous step is expressed as

Q(R_{ij}, \Phi_{i}, z_{i}) = \gamma \sum_{(i,j) \in E} \| R_{ij} \Phi_{i} - \Phi_{j} \, diag(z_{i}) \|_{2}^{2} + \lambda \sum_{i=1}^{n} \| \Phi_{i} - \Phi_{i} \, diag(z_{i}) \|_{2}^{2},

where the constants γ > 0 and λ > 0, \| \cdot \|_{2} denotes the L2 norm of a matrix, diag(z_{i}) denotes a diagonal matrix, and (i, j) ∈ E denotes the set of neighboring image pairs; for example, the 20 nearest neighbor images may be used in the computation.
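The consistency term Q can be evaluated directly from its definition, as in the sketch below. It assumes that each Φ_{i} is a p x l matrix, that the matrix L2 norm is meant entrywise (which coincides with the Frobenius norm), and that the maps are stored in a dictionary keyed by the pair (i, j); these storage choices are illustrative only.

```python
import numpy as np

def consistency_cost(R, Phi, z, edges, gamma=1.0, lam=1.0):
    """Evaluate Q for given functional maps, latent functions and indicators.

    R: dict mapping (i, j) to a (p, p) functional map.
    Phi: list of n arrays of shape (p, l), the latent functions per image.
    z: list of n arrays of shape (l,), entries 0 or 1.
    edges: iterable of neighboring image pairs (i, j).
    """
    pair_term = sum(
        np.linalg.norm(R[(i, j)] @ Phi[i] - Phi[j] @ np.diag(z[i])) ** 2
        for (i, j) in edges
    )
    self_term = sum(
        np.linalg.norm(P - P @ np.diag(zi)) ** 2 for P, zi in zip(Phi, z)
    )
    return gamma * pair_term + lam * self_term
```

When every indicator is one and the maps transport the latent functions exactly, both sums vanish.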

In step 5), obtaining the functional map representation from the multi-modal map consistency and computing the segmentation function of each image by jointly optimizing an objective function is carried out as follows:

1) The functional map representation is computed from the established multi-modal map consistency relation, i.e.,

\min_{R_{ij}, \Phi_{i}, z_{i}} \sum_{(i,j) \in E} H(R_{ij}) + Q(R_{ij}, \Phi_{i}, z_{i}),

where an orthogonality constraint holds between the variables Φ_{i} and Φ_{j}. The problem is solved by alternating optimization: two of the variables are fixed while the remaining one is optimized. The variable z_{i} is initialized to the all-ones vector, and the iteration is repeated until the objective converges, which yields the optimal functional map representation R_{ij};
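One block of such an alternating scheme can be sketched for the indicator z_{i}. With every R_{ij} and Φ fixed, H does not depend on z_{i}, and Q as written above separates over the l latent functions, so each entry can be set by comparing the cost of z = 1 with the cost of z = 0. This is a sketch derived from that formula under the stated reading of the norms, not the update prescribed by the patent; the updates of R_{ij} and Φ_{i} (including the orthogonality constraint) are not shown.

```python
import numpy as np

def update_z(i, R, Phi, neighbors, gamma=1.0, lam=1.0):
    """Coordinate-wise update of z_i with all R and Phi held fixed.

    R: dict mapping (i, j) to a (p, p) functional map.
    Phi: list of (p, l) arrays of latent functions, one per image.
    neighbors: list of j such that (i, j) is in the neighbor set E.
    """
    l = Phi[i].shape[1]
    z = np.zeros(l)
    for t in range(l):
        phi = Phi[i][:, t]
        mapped = [R[(i, j)] @ phi for j in neighbors]
        # z_it = 1: the mapped function must match the neighbor's latent function.
        cost_on = gamma * sum(np.sum((m - Phi[j][:, t]) ** 2)
                              for m, j in zip(mapped, neighbors))
        # z_it = 0: the neighbor term keeps only the mapped function,
        # and the self term penalizes the latent function being nonzero.
        cost_off = gamma * sum(np.sum(m ** 2) for m in mapped) + lam * np.sum(phi ** 2)
        z[t] = 1.0 if cost_on <= cost_off else 0.0
    return z
```

In a full loop this update would alternate with the updates of the maps and the latent functions until the objective stops decreasing.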

2) Taking the images as the vertices of a graph and assigning a weight to each pair of vertices, the joint optimization objective of the image segmentation functions is then formed,

where the constant ζ > 0, c ∈ {1, 2, ..., C}, (·)^{T} denotes the transpose of a vector or matrix, the subspace B_{ik} is spanned by the first p eigenvectors of the Laplacian matrix of the superpixel graph of the i-th image in the k-th modality, and the segmentation functions f_{ic} of different classes satisfy a mutual-exclusion constraint;

3) Solving for the optimum of the objective function in the above step gives the optimal segmentation function of the i-th image, from which the optimal segmentation block belonging to the c-th target class in the image is determined.
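Once per-class segmentation functions are available as coefficient vectors in the reduced basis, turning them into superpixel blocks can be sketched as follows. The rule used here, assigning each superpixel to the class with the largest function value, is only one simple way to respect mutual exclusion and is an assumption; the sketch also uses a single basis per image rather than one per modality.

```python
import numpy as np

def assign_superpixels(B_i, F_i):
    """Assign each superpixel of image i to one target class.

    B_i: (q, p) reduced basis of the image.
    F_i: (p, C) matrix whose column c holds the coefficients of f_ic.
    Returns a (q,) array of class indices in [0, C).
    """
    scores = B_i @ F_i                 # column c: class-c function on the q superpixels
    return np.argmax(scores, axis=1)   # one class per superpixel (mutual exclusion)

def class_blocks(labels, C):
    """Superpixel indices forming the block of each class."""
    return [np.flatnonzero(labels == c) for c in range(C)]
```

The list returned by `class_blocks` plays the role of the blocks S_{ic} for a single image.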

The multi-modal image segmentation method based on functional mapping proposed by the invention has the following advantages: graph segmentation of the raw image pixels into superpixels reduces computational cost; the multi-modal superpixel representation reflects the image content from the viewpoint of different descriptors; and establishing functional maps between image pairs in the reduced functional space, with latent (implicit) functions keeping them consistent, effectively links the low-level features of the images to their high-level semantics. This improves image segmentation and lays a solid foundation for vision applications such as image recognition and target localization.

Brief description of the drawings

Fig. 1 is a flowchart of the method of the invention.

Embodiment

The invention is further described with reference to Fig. 1; steps 1) to 5) are carried out as detailed in the Summary of the invention.


Claims

1. A multi-modal image segmentation method based on functional mapping, characterized in that the following operations are performed on an image set containing targets:

2. The multi-modal image segmentation method based on functional mapping according to claim 1, characterized in that, in step 1), segmenting each image in the set into superpixel blocks and describing the segmented superpixels with different feature descriptors to obtain the multi-modal image representation is carried out as follows:

1.1) Let the set consist of n related images; each image contains one or more target classes, and the number of target classes in the whole set is C;

1.2) The pixels of an image are regarded as the vertices of a graph, and a graph segmentation method divides each image in the set into q small regions, where q is a positive integer. These small regions consist of pixels with similar values and are called superpixels. The block belonging to class c in the i-th image is denoted S_{ic}, where i ∈ {1, 2, ..., n} and c ∈ {1, 2, ..., C};

1.3) m different feature descriptors are used to describe each superpixel in the image, giving a multi-modal feature representation that reflects the extrinsic information of the image from several angles. The i-th image thus corresponds to a set of matrices, in which the k-th feature descriptor yields the k-th modality X_{i}^{k}.

3. The multi-modal image segmentation method based on functional mapping according to claim 1, characterized in that, in step 2), building the superpixel-based graph on the multi-modal images and constructing the corresponding Laplacian matrices is carried out as follows:

4. The multi-modal image segmentation method based on functional mapping according to claim 1, characterized in that, in step 3), characterizing the reduced functional space of each image and establishing functional maps between image pairs is carried out as follows:

3.1) The eigenvalues and eigenvectors of the Laplacian matrices of the multi-modal images are computed; in each modality of each image, the first p eigenvectors span the reduced functional space, and the corresponding eigenvalues form the diagonal matrix Λ_{i}^{k}, where p < q;

3.2) Let the segmentation function of each image be f_{oi}, corresponding to S_{ic}. The search space of this function is the p-dimensional space spanned by a set of basis vectors, and f_{oi} is expressed as a linear combination of these basis vectors with a coefficient vector for the i-th image, where B_{i} is formed by the first p eigenvectors of the Laplacian matrix;

3.3) A functional map reflects the relation between any pair of images. The linear functional map from the subspace of the i-th image to the subspace of the j-th image is represented by a matrix R_{ij}; that is, the representation in the subspace of the j-th image of a function f in the subspace of the i-th image is obtained by computing R_{ij} f.

5. The multi-modal image segmentation method based on functional mapping according to claim 1, characterized in that, in step 4), aligning the functional map of each modality with the image cues and introducing latent (implicit) functions to keep the functional maps consistent is carried out as follows:

4.1) The image cues correspond to the different descriptors. The functional map of each modality is aligned with the image cues by optimizing

\min_{R_{ij}} H(R_{ij}) = \sum_{k=1}^{m} \| R_{ij} X_{i}^{k} - X_{j}^{k} \|_{1} + \alpha \| R_{ij} \Lambda_{i}^{k} - \Lambda_{j}^{k} R_{ij} \|_{F}^{2} + \beta \| R_{ij} \|_{1},

where the constants α > 0 and β > 0, \| \cdot \|_{1} denotes the L1 norm of a matrix, and \| \cdot \|_{F} denotes the Frobenius norm of a matrix;

4.2) The latent functions introduced are shared by the input images, and the functional map consistency term lets the functional maps effectively relate the corresponding latent functions on each image. Each latent function appears only in some subset of the images. For the i-th image, the indicator z_{i} = [z_{i1}, z_{i2}, ..., z_{il}] with entries in {0, 1} characterizes the relation between the latent functions and the image, and the continuous variable Φ_{i} = [φ_{i1}, φ_{i2}, ..., φ_{il}] describes each latent function on the image;

The functional map consistency term of the previous step is expressed as

Q(R_{ij}, \Phi_{i}, z_{i}) = \gamma \sum_{(i,j) \in E} \| R_{ij} \Phi_{i} - \Phi_{j} \, diag(z_{i}) \|_{2}^{2} + \lambda \sum_{i=1}^{n} \| \Phi_{i} - \Phi_{i} \, diag(z_{i}) \|_{2}^{2},

where the constants γ > 0 and λ > 0, \| \cdot \|_{2} denotes the L2 norm of a matrix, diag(z_{i}) denotes a diagonal matrix, and (i, j) ∈ E denotes the set of neighboring image pairs.

6. The multi-modal image segmentation method based on functional mapping according to claim 1, characterized in that, in step 5), obtaining the functional map representation from the multi-modal map consistency and computing the segmentation function of each image by jointly optimizing an objective function is carried out as follows:

5.1) The functional map representation is computed from the multi-modal map consistency relation established in step 4), i.e.,

\min_{R_{ij}, \Phi_{i}, z_{i}} \sum_{(i,j) \in E} H(R_{ij}) + Q(R_{ij}, \Phi_{i}, z_{i}),

5.2) Taking the images as the vertices of a graph and assigning a weight to each pair of vertices, the joint optimization objective of the image segmentation functions is then formed;

5.3) Solving for the optimum of the objective function in 5.2) gives the optimal segmentation function of the i-th image, from which the optimal segmentation block belonging to the c-th target class in the image is determined.