CN104063685A

CN104063685A - Object recognition method based on submodular optimization

Info

Publication number: CN104063685A
Application number: CN201410270916.9A
Authority: CN
Inventors: 邵岭; 朱凡; 江卓林
Original assignee: Nanjing University of Information Science and Technology
Current assignee: Nanjing University of Information Science and Technology
Priority date: 2014-06-17
Filing date: 2014-06-17
Publication date: 2014-09-24
Anticipated expiration: 2034-06-17
Also published as: CN104063685B

Abstract

The invention provides an object recognition method based on submodular optimization. The method comprises: applying the CPMC image segmentation algorithm to each image, in an unsupervised manner, to obtain a series of bottom-up image segment hypotheses; constructing a graph from the generated segment hypotheses; iteratively selecting elements and adding them to a set so as to find the most discriminative subset; extracting an object mask by stacking the selected image segments as the foreground object; and classifying and recognizing objects with a linear classifier. The method considers multiple segments by exploiting the properties of submodular functions, and builds on the recently proposed CPMC algorithm, which has shown strong advantages in image segmentation.

Description

Object recognition method based on submodular optimization

Technical field

The present invention relates to the field of anti-counterfeiting technology, and in particular to an object recognition method based on submodular optimization.

Background technology

In the past few years, the bag-of-features (BoF) model and its extension, spatial pyramid matching (SPM), have been very popular in the field of object recognition. When combined with densely sampled pyramid grids and powerful classifiers, BoF and SPM models have achieved outstanding results on many object recognition benchmark datasets, including PASCAL VOC2007, Caltech-101 and ETHZ-shape. These densely sampled grids retain the contextual information of an object category, such as its spatial layout, but they also retain irrelevant background information. To address this problem, many earlier methods tried to use image segmentation results to make object recognition more discriminative. Recognition aided by image segmentation brings two benefits: 1) accurate segmentation results enhance the contrast along object boundaries, so features along the boundaries carry more shape information; 2) computing features within homogeneous image segments improves the signal-to-noise ratio. However, for lack of reliable image segmentation methods, segmentation-based object recognition has not made much progress. Take the segmentation-based flower recognition of Nilsback and Zisserman as an example. Because only a single segment is considered in each image, an accurate segmentation can be obtained only when the image background is very simple. As a result, their method yields no obvious improvement over classification without segmentation. Among recent bottom-up object recognition methods, many try to exploit the spatial distribution of objects to obtain better performance.

He et al. (see "X. He, R. S. Zemel, and M. A. Carreira-Perpiñán. Multiscale conditional random fields for image labeling. In IEEE Conference on Computer Vision and Pattern Recognition, volume 2, pages II-695. IEEE, 2004.") built a conditional random field (CRF) framework that assigns each image pixel a label from a finite label set, so that image features and image labels are both included in one probabilistic framework. Shotton et al. (see "J. Shotton, J. Winn, C. Rother, and A. Criminisi. TextonBoost for image understanding: Multi-class object recognition and segmentation by jointly modelling texture, layout, and context. International Journal of Computer Vision, 81(1):2-23, 2009.") proposed the TextonBoost method, which combines texture, layout and context information in a unary classifier. By incorporating the unary classifier in a CRF, the spatial interactions between the class labels of neighboring pixels are captured to ensure the smoothness of the data. A significant limitation of pixel-level methods is their poor performance in separating nearby objects of the same category. Gould et al. (see "S. Gould, R. Fulton, and D. Koller. Decomposing a scene into geometric and semantically consistent regions. In IEEE International Conference on Computer Vision, pages 1-8. IEEE, 2009.") and Ladicky et al. (see "L. Ladicky, C. Russell, P. Kohli, and P. H. Torr. Associative hierarchical CRFs for object class image segmentation. In IEEE International Conference on Computer Vision, pages 739-746. IEEE, 2009.") use rectangular bounding-box detections to address this problem. Compared with bounding-box methods, methods based on image segments or image pixels come closer to the true spatial extent of an object. Fulkerson et al. (see "B. Fulkerson, A. Vedaldi, and S. Soatto. Class segmentation and object localization with superpixel neighborhoods. In IEEE International Conference on Computer Vision, pages 670-677. IEEE, 2009.") use superpixels as the basic unit of the recognition framework. There, the classifier is constructed on the histograms of local features within each superpixel and is regularized by aggregating the histograms of neighboring superpixels.

For recognition based on image segments, Rabinovich et al. (see "A. Rabinovich, S. Belongie, T. Lange, and J. M. Buhmann. Model order selection and cue combination for image segmentation. In IEEE Conference on Computer Vision and Pattern Recognition, volume 1, pages 1130-1137. IEEE, 2006.") used a stability heuristic to select a reduced set of samples from the segmentations obtained by normalized cuts (see "J. Shi and J. Malik. Normalized cuts and image segmentation. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 22(8):888-905, 2000."). For an image, each image segment in the resulting list is treated as an independent image, and the labels of these segments vote on the category of the image. Compared with recognition from a single image segment, a set of image segments captures more object boundary information. However, they provide no reliable segment selection mechanism for excluding strongly deviating segments, and treating the whole set of image segments as a new image collection requires a huge amount of computation. Li et al. (see "F. Li, J. Carreira, and C. Sminchisescu. Object recognition as ranking holistic figure-ground hypotheses. In IEEE Conference on Computer Vision and Pattern Recognition, pages 1712-1719. IEEE, 2010.") introduced an object recognition framework that, like the work proposed here, is based on multiple bottom-up segmentations generated by the CPMC algorithm. However, the method provided by the present invention differs greatly from theirs in how compact and discriminative bottom-up image segments are selected. The present invention uses a constant-factor approximation based on submodularity, rather than choosing bottom-up image segments in an ad hoc way.

Therefore, it is necessary to propose an object recognition method based on submodular optimization that, unlike methods considering only a single segment per image, considers multiple segments by exploiting the properties of submodular functions. Built on the recently proposed CPMC algorithm, which has shown strong advantages in image segmentation, the new method can efficiently select discriminative segments from a series of bottom-up image segments through a submodular objective function and use them for object recognition.

Summary of the invention

The object of the present invention is to provide an object recognition method based on submodular optimization that considers multiple segments by exploiting the properties of submodular functions. The method builds on the recently proposed CPMC algorithm, which has shown strong advantages in image segmentation.

The invention provides an object recognition method based on submodular optimization, comprising: applying the CPMC image segmentation algorithm to each image, in an unsupervised manner, to obtain a series of bottom-up image segment hypotheses; building a graph G from the generated segments; iteratively selecting elements of S and adding them to A, so as to find the most discriminative subset A of S; extracting an object mask by stacking the selected image segments as the foreground object; and classifying objects with a linear classifier.

Further, the method adopts the following:

Let N_A denote the number of opened facilities. Under the constraint K, the combinatorial formulation of the facility location problem is:

\max_{A} H(A) = \sum_{i \in V} \max_{j \in A} w_{ij} - \sum_{j \in A} \varphi_{j}, \quad \text{s.t. } A \subseteq S \subseteq V, \; N_{A} \leq K \qquad (1)

where w_{ij} denotes the relationship between an element v_i and a potential cluster center v_j, with v_i regarded as a client and v_j regarded as a facility, and the cost φ_j of opening a new facility is fixed to δ.
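As an illustration only, formula (1) can be evaluated for a given selection as in the following Python sketch. The function names `facility_location_value` and `is_feasible`, the nested-list layout of the weights, and the convention that an empty selection has value 0 are assumptions made for this example and are not taken from the original text.

```python
from typing import Sequence, Set


def facility_location_value(
    w: Sequence[Sequence[float]],
    selected: Set[int],
    open_cost: float,
) -> float:
    """Evaluate H(A) from formula (1) for a chosen set of facilities.

    w[i][j] is the weight between client i and candidate facility j.
    open_cost plays the role of the fixed facility cost (delta in the text).
    """
    if not selected:
        # Assumption: the empty selection serves no client and costs nothing.
        return 0.0
    # Each client i is credited with its best-matching opened facility.
    coverage = sum(max(row[j] for j in selected) for row in w)
    # Every opened facility pays the same fixed cost.
    return coverage - open_cost * len(selected)


def is_feasible(selected: Set[int], k: int) -> bool:
    """Cardinality constraint N_A <= K from formula (1)."""
    return len(selected) <= k
```

In this sketch, opening an additional facility is worthwhile only when the improvement in coverage exceeds `open_cost`, which is the trade-off that formula (1) expresses.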

Further, the method adopts the following:

The first term in formula (1) encourages each element v_i to have the largest value with respect to the cluster center it is assigned to. It favors selected image segments v_j (the cluster centers) that better represent, or lie closer to, the elements of their clusters, so the finally selected set A is highly representative. The weight w_{ij} on each edge e_{ij} is given by:

w_{ij} = K(v_i, v_j) + O(v_i, v_j) \qquad (2)

where K(v_i, v_j) denotes the chi-square-distance kernel \exp(\gamma \chi^2(v_i, v_j)) computed on the feature histograms of any pair of elements, and O(v_i, v_j) measures the overlap between the two elements:

O(v_i, v_j) = \frac{|v_i \cap v_j|}{|v_i \cup v_j|} \qquad (3)

If w_{ij} were computed from the overlap measure alone, the facility location term would tend to select image segments that overlap heavily with their neighboring segments, so image segments containing a lot of background information would be preferred.
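The edge weight of formulas (2) and (3) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: segments are modeled as sets of pixel coordinates, the chi-square distance uses one common convention (no factor of 1/2), and the kernel is written with the conventional negative exponent so that similar histograms score higher, a sign that the text above does not show. The names `chi_square_distance`, `overlap`, `edge_weight` and the default `gamma` are hypothetical.

```python
import math
from typing import Sequence, Set, Tuple

Pixel = Tuple[int, int]


def chi_square_distance(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Chi-square distance between two feature histograms of equal length."""
    total = 0.0
    for a, b in zip(h1, h2):
        if a + b > 0:  # skip empty bins to avoid division by zero
            total += (a - b) ** 2 / (a + b)
    return total


def overlap(seg_i: Set[Pixel], seg_j: Set[Pixel]) -> float:
    """Intersection over union of two segments, as in formula (3)."""
    union = seg_i | seg_j
    if not union:
        return 0.0
    return len(seg_i & seg_j) / len(union)


def edge_weight(
    h_i: Sequence[float],
    h_j: Sequence[float],
    seg_i: Set[Pixel],
    seg_j: Set[Pixel],
    gamma: float = 1.0,
) -> float:
    """w_ij = K(v_i, v_j) + O(v_i, v_j), as in formula (2)."""
    # Assumption: negative exponent, so identical histograms give K = 1.
    kernel = math.exp(-gamma * chi_square_distance(h_i, h_j))
    return kernel + overlap(seg_i, seg_j)
```

Under these assumptions, two identical segments with identical histograms receive the maximum weight of 2, while disjoint segments are scored by appearance similarity alone.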

Further, the method adopts the following:

A class purity constraint is added to make the selected set A more discriminative. This discriminative term constrains the class purity of the image segments using the regressors learned for each object category. The image segments are represented by pyramid-structured descriptors, written {x_1, x_2, ..., x_n}. The object categories contained in an image I are k ∈ {1, 2, ..., m}, so the present invention needs to learn m scoring functions f_1(S_i), f_2(S_i), ..., f_m(S_i). Each function is defined on the set O of scores obtained by computing the overlap between an image segment S_i in an image and the ground-truth segment of category k. O_i is computed as:

O_i(S_i, G_I^k) = \frac{|S_i \cap G_I^k|}{|S_i \cup G_I^k|} \qquad (4)

In this way, each bottom-up image segment S_i is associated with its regression value O_i. Because an image segment usually overlaps more than one ground-truth segment when f_k(S_i) is trained, it has different regression values when the scoring functions of different categories are trained. If a category k is absent from an image I, all image segments in I are regarded as having no overlap with category k. Over all images in the training set, a simple linear support vector regressor is used to learn the scoring function f_k(S_i) of each category.
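The regression targets of formula (4) can be illustrated with the short Python sketch below. It assumes that segments and ground-truth masks are sets of pixel coordinates and that each image has at most one ground-truth mask per category; the names `iou` and `regression_targets` are hypothetical. Fitting the regressors themselves is not shown; any linear support vector regressor (for example `sklearn.svm.LinearSVR`) could be trained on targets of this form.

```python
from typing import Dict, List, Set, Tuple

Pixel = Tuple[int, int]


def iou(a: Set[Pixel], b: Set[Pixel]) -> float:
    """Intersection over union of two pixel sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def regression_targets(
    segment: Set[Pixel],
    ground_truth: Dict[int, Set[Pixel]],
    num_classes: int,
) -> List[float]:
    """Overlap targets O_i for one segment, one value per class k = 1..m.

    ground_truth maps a class index k to that class's mask in this image.
    A class that is absent from the image yields a target of 0.
    """
    targets = []
    for k in range(1, num_classes + 1):
        mask = ground_truth.get(k)
        targets.append(iou(segment, mask) if mask is not None else 0.0)
    return targets
```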

Further, the method adopts the following:

The facility location term and the discriminative term are combined into a unified objective function C(A);

C(A) is a non-negative linear combination of H(A) and E(A), and both H(A) and E(A) are submodular functions, so C(A) is also a submodular function;

An image segment a* ∈ S \ A with the maximum gain R_{a*} is chosen and added to the set A;

In each iteration, the gain of the selected image segment is updated to 0 and the remaining facilities in V are updated. A stops absorbing new image segments when the required number of image segments has been reached or when the gain becomes negative;

The constraint on the number of facilities to be opened forms a simple uniform matroid U = (S, I), where I is the collection of subsets in which the number of opened facilities is less than K;

Maximizing a submodular function under a uniform matroid constraint yields a (1-1/e)-approximate solution.
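The greedy selection described in these steps can be sketched as follows. This is a naive illustration, not the disclosed implementation: it recomputes every marginal gain in each iteration instead of maintaining updated gains, and the toy objective contains only a facility-location value with invented weights and cost, leaving the entropy term out. All names (`TOY_W`, `TOY_COST`, `toy_objective`, `greedy_select`) are hypothetical.

```python
from typing import Callable, FrozenSet, Iterable, Optional, Set

# Toy data, invented for illustration: weights between three segments
# and a fixed opening cost.
TOY_W = [[1.0, 0.9, 0.0], [0.9, 1.0, 0.0], [0.0, 0.0, 1.0]]
TOY_COST = 0.5


def toy_objective(a: FrozenSet[int]) -> float:
    """Facility-location value only; the entropy term is left out here."""
    if not a:
        return 0.0
    return sum(max(row[j] for j in a) for row in TOY_W) - TOY_COST * len(a)


def greedy_select(
    candidates: Iterable[int],
    objective: Callable[[FrozenSet[int]], float],
    k: int,
) -> Set[int]:
    """Grow A by repeatedly adding the candidate with the largest gain."""
    remaining = set(candidates)
    chosen: Set[int] = set()
    current = objective(frozenset(chosen))
    while remaining and len(chosen) < k:
        best: Optional[int] = None
        best_gain = 0.0
        for a in sorted(remaining):  # sorted only to make ties deterministic
            gain = objective(frozenset(chosen | {a})) - current
            if best is None or gain > best_gain:
                best, best_gain = a, gain
        if best is None or best_gain < 0:
            break  # stop when even the best candidate has a negative gain
        chosen.add(best)
        remaining.remove(best)
        current += best_gain
    return chosen
```

With the toy data, segments 0 and 1 are near duplicates, so after one of them is chosen the other no longer repays its opening cost, and the selection moves on to segment 2 instead.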

Further, the method adopts the following:

The final mask M is combined with the SPM framework for object representation;

For any image, the mask M is used to select the foreground object;

The representation of an image is obtained by computing the SPM representation of the original image while discarding the information that falls outside the mask M;

A linear classifier satisfying formula (8) is trained:

where X denotes the training samples, H denotes the class label matrix of X, and W denotes the classifier parameters. A solution is thereby obtained, in which Z is the identity matrix;

For a test image I, its representation x_i is computed first and is then used to estimate its class label vector l, where l ∈ R^m; the class of the image is the index of the largest element of l.
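The prediction step can be illustrated with the Python sketch below. Because the expression for l is not reproduced in the text, the sketch assumes the plain linear form l = W x, with one row of W per class; that form, the 1-based class indexing and the name `predict_class` are assumptions made for this example only.

```python
from typing import Sequence


def predict_class(w: Sequence[Sequence[float]], x: Sequence[float]) -> int:
    """Return the 1-based index of the largest entry of l = W x.

    w holds one row of classifier parameters per class (m rows);
    x is the representation of the test image.
    """
    # One score per class: the dot product of that class's row with x.
    scores = [sum(w_kd * x_d for w_kd, x_d in zip(row, x)) for row in w]
    # Classes are numbered 1..m in the text, hence the +1.
    return max(range(len(scores)), key=scores.__getitem__) + 1
```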

Unlike methods that consider only a single segment per image, the object recognition method based on submodular optimization provided by the present invention considers multiple segments by exploiting the properties of submodular functions. Built on the recently proposed CPMC algorithm, which has shown strong advantages in image segmentation, the new method can efficiently select discriminative segments from a series of bottom-up image segments through a submodular objective function and use them for object recognition.

The object recognition method based on submodular optimization provided by the present invention considers multiple segments by exploiting the properties of submodular functions, and builds on the recently proposed CPMC algorithm, which has shown strong advantages in image segmentation. The present invention proposes a submodular objective function for efficiently selecting discriminative segments from a series of bottom-up image segments for object recognition. The present invention learns one regression function per category. The regression values of these functions are determined by the amount of overlap between each bottom-up image segment and its corresponding ground-truth segment. The present invention uses the strengths of the regression model to distinguish the category and quality of image segments. The proposed objective function comprises a facility location term and a discriminative term. The facility location term is determined by the sum of the pairwise similarities between image segments and a facility penalty term, and the discriminative term is determined by the class consistency of the selected image segments. The main contributions of this work are threefold:

1) Object recognition is formulated as a facility location problem with a constraint on the class consistency of the selected image segments, and this problem is solved by maximizing a submodular function. The present invention offers a new perspective on using submodular functions to solve object recognition problems.

2) Owing to its submodularity, the proposed objective function can be solved efficiently by a greedy algorithm, with performance guaranteed to be close to the optimum.

3) The proposed object recognition method based on submodular functions achieves advanced results on three popular object recognition benchmark datasets.

Additional aspects and advantages of the present invention are given in part in the following description; they will become apparent from the description or be learned through practice of the present invention.

Brief description of the drawings

Fig. 1 is a schematic diagram of the effect of different penalty values on the accumulated votes of the selected segments, according to an embodiment of the present invention;

Fig. 2 is a schematic diagram of an example of image segment selection using the submodular function for the binary classification problem of judging whether a target object is present, according to an embodiment of the present invention;

Fig. 3 is a schematic diagram of image segment selection based on the facility location term and the entropy term, according to an embodiment of the present invention;

Fig. 4 is a schematic flowchart of the anti-counterfeiting method for an NFC tag with a password protection function, according to an embodiment of the present invention.

Detailed description

Embodiments of the present invention are described in detail below. Examples of the embodiments are shown in the drawings, in which the same or similar reference numerals denote, throughout, the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the present invention and shall not be construed as limiting it.

Those skilled in the art will understand that, unless expressly stated otherwise, the singular forms "a", "an", "said" and "the" used herein may also include the plural. It should be further understood that the word "comprising" used in this specification indicates the presence of the stated features, integers, steps, operations, elements and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof. It should be understood that when an element is said to be "connected" or "coupled" to another element, it may be directly connected or coupled to the other element, or intermediate elements may be present. In addition, "connected" or "coupled" as used herein may include wireless connection or coupling. The term "and/or" as used herein includes any one and all combinations of one or more of the associated listed items.

Those skilled in the art will understand that, unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meaning as commonly understood by one of ordinary skill in the art to which the present invention belongs. It should also be understood that terms such as those defined in general dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the prior art and, unless defined as herein, are not to be interpreted in an idealized or overly formal sense.

The present invention proposes an object recognition method based on a greedy algorithm and a submodular function. The present invention formulates a submodular function as a facility location problem under a class consistency constraint on the selected image segments, and selects discriminative image segments by maximizing this submodular function. The category of each image segment is estimated by regressors trained separately for each category. The proposed objective function is optimized by a highly efficient greedy algorithm.

The proposed objective function can be solved efficiently by a greedy algorithm, with performance guaranteed to be close to the optimum. The proposed framework achieves outstanding results on three benchmark datasets: PASCAL VOC2007, Caltech-101 and ETHZ-shape.

Fig. 1 shows the effect of different penalty values on the accumulated votes of the selected segments, according to an embodiment of the present invention. In Fig. 1, (a) is the input image and (b) is the ground-truth segment of the target object in the image; (c)-(f) show the voting results of the selected segments for different values of the penalty term. In (c), only a small number of image segments are selected, and the voting result does not cover the target object accurately. In (d), more segments are selected, and the voting result covers the target object much better. In (e), the target object is covered most accurately. The case in (f) shows that selecting too many image segments makes the coverage inaccurate.

Fig. 2 shows an example of image segment selection using the submodular function for the binary classification problem of judging whether a target object is present, according to an embodiment of the present invention. In Fig. 2, (a) is the input image; (b)-(f) are a small sample of bottom-up image segments of varying quality generated by the CPMC algorithm; (g)-(i) are the weighted stacking results of the selected image segments, where (g) is based on the facility location term only, while (h) and (i) are based on both the facility location term and the entropy term: (h) is the result obtained with the "car" regressor and (i) is the result obtained with the "boat" regressor. Because no discriminative term is added in (g), the category of the image segments is not considered during selection. As a result, when an image contains more than one object category, the selected image segments cannot concentrate on a single object category as they do in (h) or (i). (j)-(l) are the results of filtering (g)-(i) with a threshold, giving the foreground regions of the final masks in the corresponding original image.

Fig. 4 is a schematic flowchart of the anti-counterfeiting method for an NFC tag with a password protection function, according to an embodiment of the present invention. The method provided by the present invention solves the object recognition problem by selecting a series of sub-segments from the image segments, so as to best locate the target object in the query image. First, the CPMC image segmentation algorithm is applied to each image, in an unsupervised manner, to obtain a series of bottom-up image segment hypotheses. Then, a graph G is built from the generated segments. Because using all image segment hypotheses would require too much computation and could produce misleading predictions, elements of S are selected iteratively and added to A so as to find the most discriminative subset A of S. The object mask is extracted by stacking the selected image segments as the foreground object. Finally, a linear classifier is used to classify the objects.
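The mask extraction step, in which the selected segments are stacked and the result is filtered with a threshold as described for Fig. 2, can be sketched as follows. The representation of segments as binary grids, the per-segment weights and the threshold value are assumptions for illustration, since the text does not specify them; `stack_segments` and `threshold_mask` are hypothetical names.

```python
from typing import List, Sequence

Grid = Sequence[Sequence[int]]


def stack_segments(segments: Sequence[Grid], weights: Sequence[float]) -> List[List[float]]:
    """Accumulate weighted votes from binary segment masks of equal size."""
    rows, cols = len(segments[0]), len(segments[0][0])
    votes = [[0.0] * cols for _ in range(rows)]
    for mask, weight in zip(segments, weights):
        for r in range(rows):
            for c in range(cols):
                votes[r][c] += weight * mask[r][c]
    return votes


def threshold_mask(votes: Sequence[Sequence[float]], threshold: float) -> List[List[int]]:
    """Keep as foreground the pixels whose accumulated vote reaches the threshold."""
    return [[1 if v >= threshold else 0 for v in row] for row in votes]
```

Pixels covered by many selected segments accumulate high votes, so the threshold controls how much agreement among segments is required before a pixel counts as foreground.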

1. Graph construction

For an image I, N bottom-up image segments S = {s_1, s_2, ..., s_N} are generated by CPMC, and the ground-truth segments belonging to object category k are provided in the training set. A subset of the bottom-up image segment hypotheses is shown in Fig. 2 (b)-(f). The present invention builds a graph G = (V, E) on the segment hypotheses of image I, where each vertex v ∈ V represents an image segment hypothesis and each edge e ∈ E describes the relationship between a pair of distinct segment hypotheses. The weight w_{ij} attached to an edge e_{ij} is computed by formula (2).
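A minimal Python sketch of this graph construction is given below. It assumes a complete graph over the segment hypotheses, with one weighted edge per unordered pair, and takes the weight function as a parameter; the overlap-only weight `iou_weight` is a stand-in used to keep the example short and is not the full weight of formula (2). All names are hypothetical.

```python
from itertools import combinations
from typing import Callable, Dict, List, Sequence, Set, Tuple

Segment = Set[Tuple[int, int]]


def iou_weight(a: Segment, b: Segment) -> float:
    """Stand-in weight: overlap only (the text also adds a histogram kernel)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_graph(
    segments: Sequence[Segment],
    weight_fn: Callable[[Segment, Segment], float],
) -> Tuple[List[int], Dict[Tuple[int, int], float]]:
    """Return (V, E): vertex indices and a weight for every unordered pair."""
    vertices = list(range(len(segments)))
    edges = {
        (i, j): weight_fn(segments[i], segments[j])
        for i, j in combinations(vertices, 2)
    }
    return vertices, edges
```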

2. Selection of specific image segments

The present invention selects image segments according to the following criteria: 1) to obtain image segments that are discriminative for object recognition, the selection of discriminative image segments is treated as a facility location problem; 2) to associate image segments with object categories, one regressor is trained per object category to predict the overlap between candidate image segments and ground-truth segments. Assuming that each image segment is assigned the object category with the largest regression value, the present invention controls the class purity of the image segments selected within an image. The two criteria are met by including a facility location term and an entropy term in the objective function, which is maximized on the basis of its submodularity.

2-1. Facility location term

The present invention formulates the selection of discriminative image segments from all image segments of an image as a facility location problem. The image segments are regarded as candidate sites at which facilities may be opened. Let N_A denote the number of opened facilities. Under the constraint K, the combinatorial formulation of the facility location problem is:

\max_{A} H(A) = \sum_{i \in V} \max_{j \in A} w_{ij} - \sum_{j \in A} \varphi_{j}, \quad \text{s.t. } A \subseteq S \subseteq V, \; N_{A} \leq K \qquad (1)

where w_{ij} denotes the relationship between an element v_i (regarded as a client) and a potential cluster center v_j (regarded as a facility), and the cost φ_j of opening a new facility is fixed to δ. The submodularity of the global gain H has been proved (see "R. D. Galvão. Uncapacitated facility location problems: contributions. Pesquisa Operacional, 24(1):7-38, 2004." and "N. Lazic, I. Givoni, B. Frey, and P. Aarabi. Floss: Facility location for subspace segmentation. In IEEE International Conference on Computer Vision, pages 825-832. IEEE, 2009.").

The first term in formula (1) encourages each element v_i to have the largest value with respect to the cluster center it is assigned to. It favors selected image segments v_j (the cluster centers) that better represent, or lie closer to, their clients (the elements of their clusters), so the finally selected set A is highly representative. The weight w_{ij} on each edge e_{ij} is given by:

w_{ij} = K(v_i, v_j) + O(v_i, v_j) \qquad (2)

O(v_i, v_j) = \frac{|v_i \cap v_j|}{|v_i \cup v_j|} \qquad (3)

If w_{ij} were computed from the overlap measure alone, the facility location term would tend to select image segments that overlap heavily with their neighboring segments, so image segments containing a lot of background information would be preferred. The present invention adds the chi-square distance between the feature histograms of the image segments to prevent this problem effectively. The second term penalizes irrelevant facilities. When the gain from adding a new image segment is offset by the cost of opening that facility, A stops selecting new image segments. In this way, A is both representative and diverse.
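The stopping behavior described above can be made explicit by writing out the marginal gain of opening one more facility. The following derivation is a worked illustration that follows directly from formula (1); the symbol ΔH is introduced here for convenience and does not appear in the original text.

```latex
% Marginal gain of opening facility a \notin A, for non-empty A:
\Delta H(a \mid A) = H(A \cup \{a\}) - H(A)
  = \sum_{i \in V} \max\bigl(0,\; w_{ia} - \max_{j \in A} w_{ij}\bigr) - \varphi_{a}

% The selection stops growing once no remaining candidate has a positive gain:
\max_{a \in S \setminus A} \Delta H(a \mid A) \le 0
```

The sum counts only the clients that segment a serves better than every facility already opened, so a segment that duplicates existing members of A gains little and cannot repay its opening cost.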

2-2. Discriminative term

The present invention adds a class purity constraint to make the selected set A more discriminative. This discriminative term constrains the class purity of the image segments using the regressors learned for each object category.

The image segments are represented by pyramid-structured descriptors, written {x_1, x_2, ..., x_n}. The object categories contained in an image I are k ∈ {1, 2, ..., m}, so the present invention needs to learn m scoring functions f_1(S_i), f_2(S_i), ..., f_m(S_i). Each function is defined on the set O of scores obtained by computing the overlap between an image segment S_i in an image and the ground-truth segment of category k. More precisely, O_i is computed as:

O_i(S_i, G_I^k) = \frac{|S_i \cap G_I^k|}{|S_i \cup G_I^k|} \qquad (4)

In this way, each bottom-up image segment S_i is associated with its regression value O_i. Because an image segment usually overlaps more than one ground-truth segment when f_k(S_i) is trained, it has different regression values when the scoring functions of different categories are trained. If a category k is absent from an image I, all image segments in I are regarded as having no overlap with category k. Over all images in the training set, a simple linear support vector regressor is used to learn the scoring function f_k(S_i) of each category. At test time, the scoring function with the highest regression value determines the category of a query image segment S_i; in other words, the category of S_i is given by y_i = \arg\max_k f_k(S_i).

The entropy term is determined by the class probability distribution within A, and it measures the class consistency of the selected image segments. Note that the probability p(j) here is not determined by counting the image segments that belong to each category, but directly from the class label of each selected image segment. The entropy and the probability are defined as follows:

E(A) = -\sum_{j \in A} p(j) \log p(j) \qquad (5)

p(j) = \frac{\arg\max_{k} f_k(S_j)}{\sum_{j \in A} \arg\max_{k} f_k(S_j)} \qquad (6)

Here, the numerator in formula (6) is the object category of image segment j, and the denominator is the sum of the object categories of all image segments in A, which ensures that the probability distribution sums to 1. For each candidate image segment, the category is determined by the scoring function that gives it the largest regression value. By maximizing E(A), the present invention encourages the selected image segments in A to share the same class label, which reduces the negative effect of inaccurate regressors. E(A) attains its maximum when all image segments in A come from the same object category. Note that the present invention adds the entropy term to the objective function with different strategies for multi-class classification tasks (the Caltech-101 dataset) and for binary presence/absence classification tasks (the PASCAL VOC2007 and ETHZ-shape datasets). For multi-class classification, the regressors of all categories are used during image segment selection, and the category of each image segment is assigned by the regressor that gives it the largest regression value. For binary presence/absence tasks, only the single regressor of the query category is considered: an image segment is labeled '1' if its regression value is higher than 0.5, and '2' otherwise. When the query category changes, different image segments are selected according to the corresponding regressor. In this way, the method provided by the present invention can handle binary classification tasks well even when an image contains multiple object categories. The proof of the monotonicity and submodularity of E(A) is given in the appendix.
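Formulas (5) and (6) and the two labeling strategies can be illustrated with the Python sketch below. It follows the formulas literally, using each segment's integer class label as its unnormalized mass; the function names and the convention that an empty selection has entropy 0 are assumptions made for this example.

```python
import math
from typing import Sequence


def assign_label(scores: Sequence[float]) -> int:
    """Multi-class case: 1-based index of the regressor with the largest value."""
    return max(range(len(scores)), key=scores.__getitem__) + 1


def assign_binary_label(score: float) -> int:
    """Presence/absence case: label 1 above 0.5, otherwise label 2."""
    return 1 if score > 0.5 else 2


def entropy_term(labels: Sequence[int]) -> float:
    """E(A) from formulas (5) and (6) for the labels of the selected segments."""
    if not labels:
        return 0.0  # assumption: an empty selection contributes nothing
    total = sum(labels)
    # Formula (6): each segment's label divided by the sum of all labels in A.
    probs = [label / total for label in labels]
    # Formula (5): labels are >= 1, so every probability is positive.
    return -sum(p * math.log(p) for p in probs)
```

With this definition, a selection whose segments all carry the same label gives a uniform p(j), which maximizes the entropy for a fixed number of segments; this matches the statement that E(A) is largest when all segments in A come from one category.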

Fig. 3 is a schematic diagram of image segment selection results based on the facility location term and the entropy term according to an embodiment of the present invention. Points of different colors represent graph vertices from different classes; a black circle marks a selected image segment. The number next to an edge is the weight w_{ij} of formula (2) between the two vertices. "Obj value" denotes the objective function values obtained by the two selection schemes. By adding the entropy term, the class purity and the representativeness of the selected image segments are considered at the same time. The present invention therefore prefers selection scheme (b) in Fig. 3.

Fig. 2 (j)-(l) show the image segment selection results for the binary classification task of judging whether a target object is present. In Fig. 2 (j), without the entropy term, the facility location term simply tends to select representative image segments. When the entropy term is added, image segments belonging to the query class are selected. When detecting the object class "vehicle", the image segments that obtain very high regression values on the "vehicle" regressor are preferred (as shown in Fig. 2 (k)).

3. Optimization

The present invention combines the facility location term H(A) and the discriminative (entropy) term E(A) into a unified objective function C(A):

Directly maximizing C(A) is an NP-hard problem (see "R. D. Galvão. Uncapacitated facility location problems: contributions. Pesquisa Operacional, 24(1):7–38, 2004."). C(A) is a non-negative linear combination of H(A) and E(A), and both H(A) and E(A) are submodular functions, so C(A) is also a submodular function. Using this property, the maximization of C(A) can be solved efficiently by a greedy algorithm (see the Galvão reference above and "G. L. Nemhauser, L. A. Wolsey, and M. L. Fisher. An analysis of approximations for maximizing submodular set functions - I. Mathematical Programming, 14(1):265–294, 1978."). The image segment set A is initialized to the empty set, and in each subsequent iteration the present invention chooses the image segment a* ∈ S \ A with the maximum gain R_{a*} and adds it to A. In each iteration, the algorithm sets the gain of the selected image segment to 0 and at the same time updates the remaining facilities in V. A stops absorbing new image segments when the required number of image segments has been reached or when the gain becomes negative. The constraint on the number of opened facilities forms a simple uniform matroid U = (S, I), where I is the collection of subsets in which the number of opened facilities N_A does not exceed K. Maximizing a submodular function under a uniform matroid constraint yields a (1-1/e)-approximate solution (see the Nemhauser et al. reference above). Therefore, the method provided by the present invention offers a solution with a performance guarantee.
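The greedy selection described above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the combination is written as C(A) = H(A) + lam * E(A) with a hypothetical non-negative weight `lam`, the opening cost is a single hypothetical constant `cost`, H of the empty set is taken to be 0, and all function names are invented for the example.

```python
import math

def facility_term(A, w, cost):
    """H(A): every client i is served by its best opened facility in A, minus opening costs."""
    if not A:
        return 0.0  # convention assumed for the empty set
    return sum(max(row[j] for j in A) for row in w) - cost * len(A)

def entropy_term(A, labels):
    """E(A) over the class indices of the segments in A (formulas (5) and (6))."""
    if not A:
        return 0.0
    total = sum(labels[j] for j in A)
    return -sum((labels[j] / total) * math.log(labels[j] / total) for j in A)

def objective(A, w, labels, cost, lam):
    """C(A) as an assumed non-negative combination H(A) + lam * E(A)."""
    return facility_term(A, w, cost) + lam * entropy_term(A, labels)

def greedy_select(w, labels, K, cost=0.1, lam=1.0):
    """Add the segment with the largest gain until K are chosen or the gain is negative."""
    A, current = [], 0.0
    while len(A) < K:
        best_gain, best = None, None
        for a in range(len(w)):
            if a in A:
                continue
            gain = objective(A + [a], w, labels, cost, lam) - current
            if best_gain is None or gain > best_gain:
                best_gain, best = gain, a
        if best is None or best_gain < 0:
            break
        A.append(best)
        current += best_gain
    return A

# Toy graph with two groups of similar segments: {0, 1} and {2, 3}.
w = [[1.0, 0.9, 0.1, 0.1],
     [0.9, 1.0, 0.1, 0.1],
     [0.1, 0.1, 1.0, 0.8],
     [0.1, 0.1, 0.8, 1.0]]
selected = greedy_select(w, labels=[1, 1, 1, 1], K=2)
print(selected)
```

On this toy input the sketch picks one segment from each group, since a second segment from an already covered group adds little to the facility location term.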

On this basis, the optimization process can be further accelerated by exploiting the submodularity of the objective function. Recomputing the gain of every remaining image segment a ∈ V \ A each time a new image segment is added requires |V| - |A| evaluations of the gain of C(A). The present invention uses the lazy evaluation of [19] instead. The pseudocode of the whole object recognition framework based on submodular functions is given in Algorithm 1, where the submodular optimization process is given in lines 3 to 20.
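A generic form of lazy evaluation can be sketched as follows. It is not claimed to reproduce Algorithm 1 or the procedure of [19]; it only shows the common idea that, for a submodular function, a gain computed in an earlier iteration is an upper bound on the current gain, so only the top candidate needs to be refreshed. The function `lazy_greedy` and the toy coverage function `coverage` are assumptions made for the example.

```python
import heapq

def lazy_greedy(candidates, f, K):
    """Greedy maximization of a submodular set function f with lazily refreshed gains."""
    A = []
    current = f(A)
    # Max-heap via negated gains; the initial entries are gains with respect to the empty set.
    heap = [(-(f([a]) - current), a) for a in candidates]
    heapq.heapify(heap)
    evaluations = len(heap)
    while heap and len(A) < K:
        _, a = heapq.heappop(heap)
        gain = f(A + [a]) - current  # refresh the possibly stale bound of the top candidate
        evaluations += 1
        if not heap or gain >= -heap[0][0]:
            # The refreshed gain beats every remaining upper bound, so a is the best choice.
            if gain < 0:
                break
            A.append(a)
            current += gain
        else:
            heapq.heappush(heap, (-gain, a))
    return A, evaluations

# Toy submodular function: number of distinct items covered by the chosen sets.
sets = {0: {1, 2, 3}, 1: {3, 4}, 2: {4, 5, 6, 7}, 3: {1}}

def coverage(A):
    covered = set()
    for a in A:
        covered |= sets[a]
    return len(covered)

chosen, evaluations = lazy_greedy(list(sets), coverage, K=2)
print(chosen, evaluations)
```

The heap keeps candidates ordered by their last known gain, so a candidate whose refreshed gain still exceeds the next bound is accepted without re-evaluating the others.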

4. Construction of the image segment mask

The final mask M is obtained by stacking all image segments in set A. The regression value f_k(a_j) of each image segment a_j ∈ A serves as its weight in the stacking. The present invention uses an adaptive threshold τ = 0.6 × N_A, and pixels in M below this threshold are removed.
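The weighted stacking and thresholding can be sketched as follows. The sketch assumes that each image segment is given as a binary mask of equal size (nested lists of 0/1) and that N_A is the number of stacked segments; the function name `build_mask` and the parameter `ratio` are hypothetical.

```python
def build_mask(segments, weights, ratio=0.6):
    """Stack binary segment masks weighted by their regression values, then threshold.

    segments: list of equally sized binary masks (lists of rows of 0/1), an assumed format.
    weights:  regression value f_k(a_j) of each segment, used as its stacking weight.
    ratio:    factor of the adaptive threshold tau = ratio * N_A.
    """
    rows, cols = len(segments[0]), len(segments[0][0])
    stacked = [[0.0] * cols for _ in range(rows)]
    for seg, wgt in zip(segments, weights):
        for r in range(rows):
            for c in range(cols):
                stacked[r][c] += wgt * seg[r][c]
    tau = ratio * len(segments)
    # Pixels whose stacked value falls below tau are removed from the mask.
    return [[1 if stacked[r][c] >= tau else 0 for c in range(cols)] for r in range(rows)]

seg1 = [[1, 1], [0, 0]]
seg2 = [[1, 1], [1, 0]]
seg3 = [[1, 0], [1, 0]]
mask = build_mask([seg1, seg2, seg3], weights=[0.9, 0.8, 0.7])
print(mask)
```

In this example only the pixel covered by all three segments accumulates enough weight to exceed τ = 0.6 × 3 and stays in the mask.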

5. Classification

The present invention combines the final mask M with the spatial pyramid matching (SPM) framework (see "J. Yang, K. Yu, Y. Gong, and T. Huang. Linear spatial pyramid matching using sparse coding for image classification. In IEEE Conference on Computer Vision and Pattern Recognition, pages 1794–1801. IEEE, 2009.") to represent objects. For any image, the mask M is used to select the foreground object. The image representation is computed on the original image within the SPM framework, with the information falling outside the mask M discarded. Next, the present invention trains a linear classifier that satisfies formula (8):

Here X denotes the training samples, H denotes the class label matrix of X, and W denotes the classifier parameters. A solution is thereby obtained, in which Z is the identity matrix. For a test image I, the present invention first computes its representation x_i and then estimates its class label vector l, where l ∈ R^m. Its class is the index of the largest element of l.
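Formula (8) and its closed-form solution are not legible in this text. As a hedged illustration only, the derivation below assumes that the linear classifier is trained by ridge regression with a hypothetical regularization weight \alpha, which is consistent with a solution that contains an identity matrix Z; the exact form used by the invention may differ.

```latex
% Assumed objective (ridge regression); \alpha > 0 is a hypothetical regularization weight.
W = \arg\min_{W} \; \| H - W X \|_F^2 + \alpha \| W \|_F^2

% Setting the gradient with respect to W to zero:
-2 (H - W X) X^{T} + 2 \alpha W = 0
\;\Longrightarrow\;
W \left( X X^{T} + \alpha Z \right) = H X^{T}

% Closed-form solution, with Z the identity matrix:
W = H X^{T} \left( X X^{T} + \alpha Z \right)^{-1}

% Prediction for a test representation x_i, with l \in R^m:
l = W x_i, \qquad \text{class} = \arg\max_{k} \; l_k
```

Under this assumption the classifier is obtained in one matrix inversion, and the predicted class is the index of the largest entry of l, as stated above.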

Those skilled in the art will appreciate that the present invention may relate to equipment for performing one or more of the operations described in this application. The equipment may be specially designed and manufactured for the required purpose, or may comprise known devices in a general-purpose computer that are selectively activated or reconfigured by programs stored in it. Such a computer program may be stored in a device-readable (for example, computer-readable) medium, or in any type of medium suitable for storing electronic instructions and coupled to a bus, the computer-readable medium including but not limited to any type of disk (including floppy disks, hard disks, optical discs, CD-ROMs and magneto-optical disks), random access memory (RAM), read-only memory (ROM), electrically programmable ROM, erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), flash memory, magnetic cards or optical cards. A computer-readable recording medium includes any mechanism that stores or transmits information in a form readable by a device (for example, a computer). For example, a computer-readable recording medium includes random access memory (RAM), read-only memory (ROM), magnetic disk storage media, optical storage media, flash memory devices, and signals propagated in electrical, optical, acoustic or other forms (such as carrier waves, infrared signals and digital signals).

Those skilled in the art will appreciate that each block in these structural diagrams and/or block diagrams and/or flow diagrams, and combinations of blocks in them, can be implemented with computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer or other programmable data processing apparatus to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing apparatus create means for implementing what is specified in the block or blocks of the structural diagrams and/or block diagrams and/or flow diagrams.

Those skilled in the art will appreciate that the steps, measures and schemes in the various operations, methods and flows discussed in the present invention may be alternated, changed, combined or deleted. Further, other steps, measures and schemes in the various operations, methods and flows discussed in the present invention may also be alternated, changed, rearranged, decomposed, combined or deleted. Further, steps, measures and schemes in the prior art that correspond to the various operations, methods and flows disclosed in the present invention may also be alternated, changed, rearranged, decomposed, combined or deleted.

The above are only some embodiments of the present invention. It should be pointed out that those of ordinary skill in the art can make improvements and modifications without departing from the principles of the present invention, and these improvements and modifications shall also be regarded as falling within the protection scope of the present invention.

Claims

1. An object recognition method based on submodular optimization, characterized by comprising:

applying the constrained parametric min-cuts (CPMC) image segmentation algorithm to each image in an unsupervised manner to obtain a series of low-level image segment hypotheses;

building a graph G based on the generated low-level image segments;

iteratively selecting elements of S and adding them to A, so as to find the most discriminative subset A of S;

extracting an object mask M for the foreground object by stacking the selected image segments;

classifying and recognizing the object with a linear classifier.

2. The object recognition method based on submodular optimization as claimed in claim 1, characterized in that the method adopts:

\max_{A} H(A) = \sum_{i \in V} \max_{j \in A} w_{ij} - \sum_{j \in A} \phi_{j}, \quad \text{s.t. } A \subseteq S \subseteq V, \; N_{A} \leq K \qquad (1)

wherein w_{ij} represents the relation between a group element v_i and the center vertex v_j of a group of potential image segments, wherein v_i is regarded as a client and v_j is regarded as a facility, the cost \phi_j of opening a new facility is fixed to δ, and each vertex v ∈ V represents an image segment hypothesis.

3. The object recognition method based on submodular optimization as claimed in claim 2, characterized in that the method adopts:

for a group element v_i in formula (1), the value is largest at the cluster center to which it is assigned, which favors selecting center vertices v_j of image segments that better represent, or are closer to, the elements of their groups, wherein the center vertex v_j of an image segment is the cluster center, so that the finally selected set A is highly representative, and the weight w_{ij} on each edge e_{ij} is obtained by the following formulas:

w_{ij} = K(v_{i}, v_{j}) + O(v_{i}, v_{j}), \qquad (2)

O(v_{i}, v_{j}) = \frac{| v_{i} \cap v_{j} |}{| v_{i} \cup v_{j} |}. \qquad (3)

4. The object recognition method based on submodular optimization as claimed in claim 1, characterized in that the method adopts:

adding a class purity constraint to strengthen the discriminability of the selected set A, the constraint being based on regressors learned for the image segments of each object class, wherein these image segments are represented by pyramid-structured descriptors and can be expressed as {X_1, X_2, ..., X_n}, the object classes contained in image I are k ∈ {1, 2, ..., m}, so that m scoring functions f_1(S_i), f_2(S_i), ..., f_m(S_i) need to be learned, each function being defined on the score set O obtained by computing the overlap between an image segment S_i in an image and the ground-truth segment of class k, and O_i is calculated by the following formula:

O_{i}(S_{i}, G_{I}^{k}) = \frac{| S_{i} \cap G_{I}^{k} |}{| S_{i} \cup G_{I}^{k} |}. \qquad (4)

5. The object recognition method based on submodular optimization as claimed in claim 1, characterized in that the method adopts:

C(A) is a non-negative linear combination of H(A) and E(A), and H(A) and E(A) are submodular functions, so C(A) is also a submodular function;

choosing the image segment a* ∈ S \ A with the maximum gain R_{a*} and adding it to set A;

in each iteration, setting the gain of the selected image segment to 0 and at the same time updating the remaining facilities in V, wherein A stops absorbing new image segments when the required number of image segments has been reached or when the gain becomes negative.

6. The object recognition method based on submodular optimization as claimed in claim 1, characterized in that the method adopts:

for any image, using the object mask M to select the foreground object;

training a linear classifier that satisfies formula (8):

wherein X denotes the training samples, H denotes the class label matrix of X, and W denotes the classifier parameters, whereby a solution is obtained in which Z is the identity matrix;

for a test image I, first computing its representation x_i and then estimating its class label vector l, wherein l ∈ R^m, and its class is the index of the largest element of l.