CN102509104B - Confidence map-based method for distinguishing and detecting virtual object of augmented reality scene - Google Patents


Info

Publication number
CN102509104B
CN102509104B (application CN 201110299857 / CN201110299857A; published as CN102509104A)
Authority
CN
China
Prior art keywords
virtual
virtual-real
augmented reality
point
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN 201110299857
Other languages
Chinese (zh)
Other versions
CN102509104A (en)
Inventor
陈小武
赵沁平
穆珺
王哲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University filed Critical Beihang University
Priority to CN 201110299857 priority Critical patent/CN102509104B/en
Publication of CN102509104A publication Critical patent/CN102509104A/en
Application granted granted Critical
Publication of CN102509104B publication Critical patent/CN102509104B/en

Landscapes

  • Image Analysis (AREA)

Abstract

The invention relates to a confidence map-based method for discriminating and detecting virtual objects in an augmented reality scene. The method comprises the following steps: selecting virtual-real classification features; constructing a pixel-level virtual-real classifier from these features; extracting region contrast features of augmented reality scenes and real scenes respectively and constructing a region-level virtual-real classifier; given a test augmented reality scene, detecting with the pixel-level classifier and a small detection window to obtain a virtual score map reflecting the virtual-real classification result of each pixel; defining a virtual confidence map and obtaining the confidence map of the test scene from the score map by thresholding; inferring the rough shape and position of the virtual object bounding box from the distribution of high virtual-response points in the confidence map; and detecting in the test scene with the region-level classifier and a large detection window to obtain the final detection result of the virtual object. The method can be applied to fields such as film and television production, digital entertainment, and education and training.

Description

Confidence map-based discrimination and detection method for virtual objects in augmented reality scenes
Technical field
The present invention relates to the fields of image processing, computer vision and augmented reality, and in particular to a confidence map-based method for discriminating and detecting virtual objects in augmented reality scenes.
Background technology
Augmented reality is a further extension of virtual reality: through the necessary equipment, computer-generated virtual objects and the objectively existing real environment coexist in the same augmented reality system, presenting to the user an environment in which, both perceptually and experientially, virtual objects and the real environment blend together. With the development of augmented reality, scenes with a high degree of photorealism have appeared, and criteria for measuring and evaluating the credibility of augmented reality scenes are urgently needed. Judging whether a given scene is an augmented reality scene, and further detecting the virtual objects within it, is one approach to credibility evaluation of augmented reality imagery, and has significant research value and application demand.
In 2011, researchers at the University of Trento, Italy proposed an image forgery discrimination method that can detect computer-generated components blended into a real scene. This is the only known existing work that takes augmented reality scenes as its processing object. However, its detection is not carried out with objects as the unit: it only detects virtual components in the scene, so the detection result may be a region, or merely scattered points.
In 2005, researchers at Dartmouth College in the United States proposed a natural-image statistical model based on wavelet decomposition, using support vector machines and linear discriminant analysis to classify virtual images and real images. The method first decomposes a colour image with wavelets and extracts four statistical moments (mean, variance, skewness, kurtosis) of the decomposition coefficients on each subband and orientation, together with fourth-order linear prediction error features between adjacent decomposition coefficients; it then trains a classifier with support vector machines and linear discriminant analysis, and feeds the test set into the trained classifier to obtain classification results. The virtual-real classification of this method is performed on whole images, and its accuracy fluctuates considerably with the size of the region from which the classification features are extracted.
In 2007, researchers at the Polytechnic University in New York proposed a method that distinguishes virtual images from real images using colour filter array interpolation detection features and the consistency of chromatic aberration within the image. The method first extracts these features from the positive and negative samples of the training set, then feeds the extracted features into a support vector machine to train a classifier, and finally feeds the test set into the trained classifier to obtain classification results.
In 2009, researchers at the University of Alberta, Canada proposed a method that classifies virtual images and real images by the consistency of image-block resampling parameters. The principle is that generating a virtual image may involve operations such as rotating and scaling texture images when mapping textures onto model surfaces, which makes the resampling parameters of different image blocks within a virtual image inconsistent. Virtual and real images can therefore be distinguished by checking whether the resampling parameters detected for the image blocks are consistent. In this method, the estimation of resampling parameters is carried out on whole images.
In 2004, researchers at Compaq's Cambridge Research Laboratory proposed a face detection method based on Haar filters and the AdaBoost classification algorithm. The method first extracts classification features from the training set and trains a classifier on the statistical characteristics of faces and non-faces; it then feeds the classification features extracted from the image under test into the classifier, using a cascade of classifiers to reduce the number of detection windows that must be evaluated and so improve efficiency, finally obtaining the detection result. Its feature extraction is based on Haar filters, which describe the regional contrast produced by the intrinsic structure of the human face.
In 2005, researchers at the French National Institute for Research in Computer Science and Automation (INRIA) proposed a person detection method using histograms of oriented gradients and a linear support vector machine. The method first colour-normalises the input image, then computes image gradients, accumulates pixels into bins over different gradient orientations, contrast-normalises overlapping spatial blocks, and generates the histogram of oriented gradients for each detection window; finally a linear SVM classifier separates person regions from non-person regions to obtain the detection result. The method detects more reliably than other approaches, but requires the persons in the image to remain roughly upright. Its feature extraction uses image gradient histograms, which describe the intrinsic characteristics of the human silhouette.
What the above methods for distinguishing virtual images from real images have in common is that the virtual-real classification features they extract are not suitable for classifying an arbitrary given region of an image. Moreover, in existing object detection work, the objects being processed generally have strong, easily described appearance characteristics available as prior information. By comparison, in virtual object detection for augmented reality scenes the detection target (the virtual object) has no explicit, easily described appearance prior such as colour, shape or size, so discrimination and detection are considerably harder.
Summary of the invention
Technical solution of the present invention: to overcome the deficiencies of the prior art, a confidence map-based method for discriminating and detecting virtual objects in augmented reality scenes is provided. The method needs no advance knowledge of any appearance information of the virtual object, such as colour, shape or size, nor of the position the virtual object occupies in the scene. Instead it exploits the physical imaging differences between virtual objects and real imagery to extract virtual-real classification features, computes the region self-features and region correlation features of the positive and negative training samples respectively, and constructs a pixel-level virtual-real classifier and a region-level virtual-real classifier. On this basis, preliminary shape localisation and accurate detection of the virtual object are carried out through virtual object discrimination and detection based on the virtual confidence map.
The technical solution adopted by the present invention is a confidence map-based method for discriminating and detecting virtual objects in augmented reality scenes, with the following steps: build an augmented reality scene training dataset, and choose virtual-real classification features that exploit the physical imaging differences between virtual objects and real imagery; on the training dataset, use the virtual-real classification features to extract region self-features of augmented reality scenes and real scenes respectively, and build a pixel-level virtual-real classifier; on the training dataset, use the virtual-real classification features to extract region correlation features of augmented reality scenes and real scenes respectively, and build a region-level virtual-real classifier; given a test augmented reality scene, detect with the pixel-level classifier and a small detection window to obtain a virtual score map reflecting the virtual-real classification result of every pixel; define a virtual confidence map, and obtain the test scene's virtual confidence map from the score map by thresholding; from the distribution of high virtual-response points in the confidence map, obtain the rough shape and position of the virtual object bounding box; and, starting from this coarse localisation, detect in the test image with the region-level classifier and a large detection window to obtain the final detection result of the virtual object.
Build the augmented reality scene training dataset. In the training set, augmented reality scene images containing virtual objects serve as positive samples and real scene images as negative samples. Exploiting the physical imaging differences between virtual objects and real imagery, the chosen virtual-real classification features comprise: local statistics, surface gradient, second fundamental form, and Beltrami flow. These features can be extracted at every pixel of an image.
On the training dataset, use the virtual-real classification features to extract region self-features of the augmented reality scenes and build the pixel-level virtual-real classifier. When building this classifier, only the virtual object region of each augmented reality image is taken as a positive sample region, and only regions of real scene images that resemble the virtual objects in the positive samples are taken as negative sample regions. For a given image region, compute the virtual-real classification features (local statistics, surface gradient, second fundamental form, Beltrami flow) at every point in the region; compress the features of the region with the moment-of-inertia compression method to obtain the region self-feature of that region. The set of region self-features of the positive and negative samples is fed into a support vector machine for training, yielding the pixel-level virtual-real classifier.
On the training dataset, use the virtual-real classification features to extract region correlation features of the augmented reality scenes and build the region-level virtual-real classifier. Each positive or negative sample region is itself treated as the object region to be judged, and an equal-area rectangular region outside the region's bounding box as the background region in which the object sits. Extract the virtual-real classification features at every point of the object region and of the background region; from the features of all points, build the joint distribution histogram of the object-region features and that of the background-region features; compute the chi-square distance between the two histograms and treat it as a measure of the difference between the object and its background, called the region correlation feature. The region correlation features of the positive and negative samples are fed into a support vector machine for training, yielding the region-level virtual-real classifier.
Virtual score map construction. For an input augmented reality scene image, scan the entire image with a small detection window (window size in [10, 30] × [10, 30] pixels) and a small step length (e.g. {1, 2, 3, 4, 5} pixels); compute the region self-feature of the small image block inside each window; feed the region self-features of all blocks into the pixel-level virtual-real classifier to obtain a score for each block, where a high score expresses high certainty that the classifier regards the block as a virtual region. Because the detection windows are very small relative to the whole image and densely distributed, the score of each block can be mapped to the block's centre pixel and taken as that pixel's virtual score. This yields the virtual score map of the whole augmented reality scene image. The computation can be accelerated with two-dimensional integral images.
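The score-map construction can be sketched as follows. This is a minimal illustration (Python with NumPy), not the patented implementation: `score_fn` stands in for the trained pixel-level virtual-real classifier, and the window size and step are example values from the stated ranges.

```python
import numpy as np

def virtual_score_map(image, score_fn, win=15, step=2):
    """Slide a small window over a grayscale image and map each
    window's classifier score to the window's centre pixel.
    score_fn is a stand-in for the trained pixel-level virtual-real
    classifier: it takes a (win, win) patch and returns a scalar."""
    h, w = image.shape
    scores = np.full((h, w), np.nan)   # pixels never covered stay NaN
    half = win // 2
    for y in range(0, h - win + 1, step):
        for x in range(0, w - win + 1, step):
            patch = image[y:y + win, x:x + win]
            scores[y + half, x + half] = score_fn(patch)
    return scores
```

With a step of 1 the map is dense; larger steps trade coverage for speed, which is where the integral-image acceleration mentioned above would apply.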
Virtual confidence map construction. Threshold the virtual score map of the augmented reality scene image and record all points whose virtual score is positive. With a fixed percentage N%, record the top N% of positively scored points and their positions in the original image; these are called high virtual-response points. With a fixed small constant M (e.g. M ∈ [10, 100]), record the top M positively scored points and their positions in the original image; these are called highest virtual-response points. The parameters are set so that the highest virtual-response points are also contained in the set of high virtual-response points, i.e. they are the highest-scoring subset of the high virtual-response points. The high virtual-response points, the highest virtual-response points, and their positions in the original image together constitute the virtual confidence map.
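A minimal NumPy sketch of this thresholding step, assuming illustrative parameter names `top_percent` (N) and `top_m` (M); the 0/1/2 encoding of none/high/highest response is a choice made here, not taken from the patent.

```python
import numpy as np

def virtual_confidence_map(score_map, top_percent=10, top_m=20):
    """From a virtual score map, keep positively scored points, mark
    the top N% as high-response points and the top M of those as
    highest-response points (M is kept inside the top-N% set, as the
    text requires). Values: 0 = none, 1 = high, 2 = highest."""
    scores = np.where(np.isnan(score_map), -np.inf, score_map)
    pos = np.argwhere(scores > 0)              # positively scored pixels
    if len(pos) == 0:
        return np.zeros(scores.shape, dtype=int)
    vals = scores[pos[:, 0], pos[:, 1]]
    order = np.argsort(-vals)                  # descending by score
    n_high = max(1, int(len(pos) * top_percent / 100))
    m = min(top_m, n_high)                     # highest subset of high
    conf = np.zeros(scores.shape, dtype=int)
    for rank, idx in enumerate(order[:n_high]):
        y, x = pos[idx]
        conf[y, x] = 2 if rank < m else 1
    return conf
```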
The rough shape and position of the virtual object bounding box are inferred as follows. Divide the virtual confidence map into five equal-area, possibly overlapping subregions, and find the distribution centre of the high virtual-response points in each subregion. Treat each subregion centre as a candidate virtual object centre point, and from each centre expand the search outward to obtain a region where high virtual-response points are densely distributed; for each such region, approximately compute the candidate object shape (expressed as a candidate virtual object bounding box) and combine it with the region's position to form a preliminary virtual object candidate region. Among the several preliminary candidate regions, select the one with the largest weighted count of the high virtual-response points and highest virtual-response points it contains as the virtual object candidate region; this region contains the rough shape and position of the virtual object bounding box.
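The outward expansion search can be sketched as follows, as a simplified reading of the procedure: a box grows from a candidate centre while each step still captures new high-response points. The parameters `grow` and `min_gain` are illustrative, not taken from the patent.

```python
import numpy as np

def expand_box(conf, cy, cx, grow=2, min_gain=1):
    """Grow a box outward from candidate centre (cy, cx) while each
    expansion step captures at least min_gain new high-response points
    (conf > 0). Returns the box (y0, x0, y1, x1) and its point count."""
    h, w = conf.shape
    y0 = y1 = cy
    x0 = x1 = cx
    count = int(conf[cy, cx] > 0)
    while True:
        ny0, ny1 = max(0, y0 - grow), min(h - 1, y1 + grow)
        nx0, nx1 = max(0, x0 - grow), min(w - 1, x1 + grow)
        new_count = int((conf[ny0:ny1 + 1, nx0:nx1 + 1] > 0).sum())
        if new_count - count < min_gain or (ny0, ny1, nx0, nx1) == (y0, y1, x0, x1):
            break
        y0, y1, x0, x1 = ny0, ny1, nx0, nx1
        count = new_count
    return (y0, x0, y1, x1), count
```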
The coarse localisation of the virtual object is then further optimised to obtain the final detection result. Concretely: take the surrounding region whose area is twice that of the virtual object candidate region, and construct within it a number of overlapping large detection windows of the same shape as the candidate region (the size range of the large detection window is generally [200, 500] × [200, 500] pixels, with its length and width equal to those of the virtual object bounding box in the candidate region); take the image block inside each large detection window and compute its region correlation feature; feed the region correlation features of all the blocks into the region-level virtual-real classifier for classification, and select the window with the highest score as the final detection result of the virtual object.
Compared with the prior art, the present invention has the following beneficial effects:
(1) The present invention takes the virtual object in the augmented reality scene as the detection target, so the virtual object can be discriminated and detected as a whole.
(2) The present invention builds a two-level virtual-real classifier, comprising a pixel-level virtual-real classifier and a region-level virtual-real classifier, meeting the needs of both confidence map construction and final virtual object detection.
(3) The present invention builds a confidence map; based on the virtual confidence map, the approximate position and shape of the virtual object in the scene can be derived without any prior information such as the virtual object's appearance, shape or position.
(4) The present invention needs no advance knowledge of any appearance prior of the virtual object, such as colour, shape or size, nor of its position in the augmented reality scene; it therefore has wide applicability and can be extended to fields such as film and television production, digital entertainment, and education and training.
Description of drawings
Fig. 1 is the overall design structure of the present invention;
Fig. 2 is the flowchart of virtual confidence map construction;
Fig. 3 is the flowchart of virtual object bounding box shape and position inference;
Fig. 4 is the flowchart of obtaining candidate centre points;
Fig. 5 is the flowchart of the expansion search that obtains the regions where high virtual-response points are densely distributed.
Embodiment
As shown in Figure 1, the main steps of the present invention are as follows: build an augmented reality scene training dataset and choose virtual-real classification features that exploit the physical imaging differences between virtual objects and real imagery; on the training dataset, use these features to extract region self-features of the augmented reality scenes and build a pixel-level virtual-real classifier; on the training dataset, use these features to extract region correlation features of the augmented reality scenes and build a region-level virtual-real classifier; given a test augmented reality scene, perform small-scale detection with the pixel-level classifier to obtain a virtual score map reflecting the virtual-real classification result of every pixel; define a virtual confidence map and obtain it from the score map by thresholding; from the distribution of high virtual-response points in the confidence map, obtain the rough shape and position of the virtual object bounding box; and, starting from this coarse localisation, detect in the test image with the region-level classifier and a large detection window to obtain the final detection result of the virtual object.
Construct the training dataset used to train the virtual-real classifiers. The training dataset consists of augmented reality scene images containing virtual objects as positive samples and real scene images as negative samples. When training the pixel-level classifier, only the virtual object region of each augmented reality image is taken as a positive sample, and only regions of real scene images resembling the virtual objects in the positive samples are taken as negative samples. When training the region-level classifier, the virtual object together with an equal-area surrounding image region is taken as a positive sample from each augmented reality image, and a region of a real scene image resembling the virtual object, together with an equal-area surrounding image region, is taken as a negative sample.
Region self-feature extraction. For a given image region, compute the virtual-real classification features at every point of the region, namely: local statistics, surface gradient, second fundamental form, and Beltrami flow; then compress the features of the region with the moment-of-inertia compression method to obtain the region self-feature of that region.
The physical meanings and computation methods of local statistics, surface gradient, second fundamental form, and Beltrami flow are as follows.
Local statistics reflect fine local edge structure. They are computed as follows: take any point P on the grayscale version of the original image, and arrange the pixel values of the 3 × 3 pixel block centred at P, in order, into the 9-dimensional vector x = [x_1, x_2, ..., x_9]. The local statistic y at P is the 9-dimensional vector defined as
y = (x − x̄) / ||x − x̄||_D,
where x̄ = (1/9) Σ_{i=1}^{9} x_i is subtracted from every component, and || · ||_D is the D-norm operation, defined as
||x||_D = sqrt( Σ_{i~j} (x_i − x_j)² ),
where i~j ranges over all pairs of neighbouring points in the image block.
The local-statistics virtual-real classification feature at an arbitrary point p is the 9-dimensional vector y at that point.
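A minimal NumPy sketch of this local statistic, under the assumption that the D-norm sums squared differences over 4-connected neighbour pairs within the 3 × 3 block (the precise neighbourhood relation is not fully specified above):

```python
import numpy as np

def local_statistic(patch3x3):
    """9-D local statistic of a 3x3 patch: subtract the mean, then
    divide by the D-norm, read here as sqrt(sum over 4-neighbour
    pairs i~j of (x_i - x_j)^2)."""
    x = patch3x3.astype(float).ravel()      # row-major x1..x9
    xc = x - x.mean()
    # 4-neighbour pairs inside the 3x3 grid, as flat indices
    pairs = [(r * 3 + c, r * 3 + c + 1) for r in range(3) for c in range(2)] + \
            [(r * 3 + c, (r + 1) * 3 + c) for r in range(2) for c in range(3)]
    d_norm = np.sqrt(sum((xc[i] - xc[j]) ** 2 for i, j in pairs))
    return xc / d_norm
```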
The surface gradient measures the nonlinear change characteristics of the real-scene imaging process. The surface gradient S at any point of the image is defined as
S = α|∇I| / sqrt(1 + α²|∇I|²),
where |∇I| = sqrt(I_x² + I_y²) is the image gradient modulus at the point, I_x and I_y denote the partial derivatives of the image in the x direction (horizontal) and y direction (vertical) respectively, and α is a constant, α = 0.25.
The surface-gradient virtual-real classification feature at an arbitrary point p is the image pixel value I at that point joined with the surface gradient S at that point.
The second fundamental form describes the local concavity and convexity of the image surface. Its two components λ_1 and λ_2 are the two eigenvalues of the matrix A:
A = (1 / sqrt(1 + I_x² + I_y²)) [ I_xx  I_xy ; I_xy  I_yy ],
where I_x, I_y denote the partial derivatives of the image in the x and y directions, and I_xx, I_xy, I_yy its second-order partial derivatives in the xx, xy and yy directions; the value of the matrix A is computed from this formula. Writing A = [ a_11  a_12 ; a_21  a_22 ], where a_11, a_12, a_21, a_22 are the four element values of A, the two eigenvalues of A are computed as
{λ_1, λ_2} = ( a_11 + a_22 ± sqrt( (a_11 − a_22)² + 4 a_12 a_21 ) ) / 2,  with λ_1 ≥ λ_2.
The second-fundamental-form virtual-real classification feature at an arbitrary point p is the image gradient modulus |∇I| at that point joined with the two components λ_1, λ_2 of the second fundamental form at that point.
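A sketch of the second-fundamental-form components using finite-difference derivatives; the 1/sqrt(1 + I_x² + I_y²) scaling follows standard surface geometry, which is how we read the partially garbled formula above:

```python
import numpy as np

def second_fundamental_form(I, y, x):
    """Eigenvalues (lam1 >= lam2) of A = H / sqrt(1 + Ix^2 + Iy^2)
    at pixel (y, x), where H is the Hessian of the image, with first
    and second derivatives from central differences."""
    Iy_, Ix_ = np.gradient(I.astype(float))   # d/dy, d/dx
    Iyy, Iyx = np.gradient(Iy_)
    Ixy, Ixx = np.gradient(Ix_)
    s = 1.0 / np.sqrt(1 + Ix_[y, x] ** 2 + Iy_[y, x] ** 2)
    a11, a12 = s * Ixx[y, x], s * Ixy[y, x]
    a21, a22 = s * Iyx[y, x], s * Iyy[y, x]
    tr = a11 + a22
    disc = np.sqrt((a11 - a22) ** 2 + 4 * a12 * a21)
    return (tr + disc) / 2, (tr - disc) / 2
```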
The Beltrami flow can be used to describe the correlation between different colour channels. The Beltrami flow Δ_g I^c corresponding to colour channel c (c ∈ {R, G, B}) is defined as
Δ_g I^c = (1/√|g|) ∂_x( √|g| ( g^xx ∂_x I^c + g^xy ∂_y I^c ) ) + (1/√|g|) ∂_y( √|g| ( g^yx ∂_x I^c + g^yy ∂_y I^c ) ),
where I^c denotes the image of colour channel c of the original image, and the operators ∂_x and ∂_y denote the partial derivatives in the x and y directions of the quantity they act on. The metric matrix is
g = [ 1 + (I_x^R)² + (I_x^G)² + (I_x^B)²,  I_x^R I_y^R + I_x^G I_y^G + I_x^B I_y^B ;
      I_x^R I_y^R + I_x^G I_y^G + I_x^B I_y^B,  1 + (I_y^R)² + (I_y^G)² + (I_y^B)² ],
in which I_x^R, I_y^R are the partial derivatives of the image's R channel (red channel) in the x and y directions, I_x^G, I_y^G those of the G channel (green channel), and I_x^B, I_y^B those of the B channel (blue channel); |g| is the determinant of the matrix g; and g^xx, g^xy, g^yx, g^yy are given by
g^{-1} = [ g^xx  g^xy ; g^yx  g^yy ],
i.e. they are the four element values of the inverse matrix of g.
The Beltrami-flow virtual-real classification feature at an arbitrary point p is the Beltrami flow component Δ_g I^c of each colour channel c ∈ {R, G, B} at that point joined with the image gradient modulus of each colour channel.
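A NumPy sketch of the Beltrami flow of an RGB image, assuming the standard Laplace-Beltrami form Δ_g I = (1/√|g|) ∂_μ(√|g| g^{μν} ∂_ν I) — the √|g| factors are our reconstruction of the formula as printed:

```python
import numpy as np

def beltrami_flow(rgb):
    """Laplace-Beltrami flow of each colour channel of an (h, w, 3)
    image, with the colour metric g built from all three channels."""
    I = rgb.astype(float)
    dy = [np.gradient(I[..., c], axis=0) for c in range(3)]
    dx = [np.gradient(I[..., c], axis=1) for c in range(3)]
    g11 = 1 + sum(d ** 2 for d in dx)           # 1 + sum_c (Ix^c)^2
    g22 = 1 + sum(d ** 2 for d in dy)           # 1 + sum_c (Iy^c)^2
    g12 = sum(a * b for a, b in zip(dx, dy))    # sum_c Ix^c Iy^c
    det = g11 * g22 - g12 ** 2
    sq = np.sqrt(det)
    gxx, gyy, gxy = g22 / det, g11 / det, -g12 / det   # inverse metric
    flows = []
    for c in range(3):
        fx = sq * (gxx * dx[c] + gxy * dy[c])
        fy = sq * (gxy * dx[c] + gyy * dy[c])
        flows.append((np.gradient(fx, axis=1) + np.gradient(fy, axis=0)) / sq)
    return np.stack(flows, axis=-1)
```

On a constant image all gradients vanish and the flow is zero, which is a quick sanity check of the implementation.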
Behind four groups of actual situation characteristic of divisions of every bit in calculating the zone (comprising: local statistic, surface graded, second fundamental form, Marco Beltrami flow), need to utilize the moment of inertia compression method that the actual situation characteristic of division is compressed.Moment of inertia compression method step is as follows: consider that first separately local statistic, surface graded, second fundamental form, Marco Beltrami flow any one group (processing mode of each group actual situation characteristic of division is all identical) in these four groups of actual situation characteristic of divisions.If total N point in the given area, any point P i(i=1 ..., the total M dimension of one group of actual situation characteristic of division N) (value of M according in four groups of actual situation characteristic of divisions, get fixed wherein one group can determine), will put P i(i=1 ..., one group of actual situation characteristic of division N) is designated as v i=(v I1..., v Im).At this moment, will put P iActual situation characteristic of division v i=(v I1..., v Im) be considered as a particle in the M dimensional feature space, stipulate that the quality of this particle is
Figure BDA0000095251690000079
, the position coordinates of this particle in the M dimensional feature space is v i=(v I1..., v Im), then can calculate by the solid moment of inertia Matrix Formula moment of inertia matrix J of the system of particles that all N particles consist of.Moment of inertia matrix J is that a M * M ties up matrix, and matrix J can be write as following form:
Figure BDA00000952516900000710
Any one element of matrix J is designated as J Jk(j, k=1 ..., M).J JkComputing method be:
Figure BDA00000952516900000711
(j, k=1 ..., M).M wherein iExpression particle P iQuality,
Figure BDA00000952516900000712
v i=(v I1..., v Im) expression point P iPosition coordinates in feature space; | v i| expression particle P iTo the Euclidean distance of true origin, namely
Figure BDA0000095251690000081
δ JkBe Kronecker function, its computing method are δ jk = 1 , if i = j 0 , if i ≠ j . Can determine thus all elements J of moment of inertia matrix J Jk(j, k=1 ..., M).By the symmetry of moment of inertia matrix J as can be known, J Jk=J Kj, therefore only get principal diagonal and the above all elements J of principal diagonal of matrix J Jk(j, k=1 ..., M and j≤k), these elements can represent all information of original matrix J.
All elements J_jk (j, k = 1, ..., M and j <= k) of the moment-of-inertia matrix are taken; joined with the centroid vector of all particles; and further joined with the mean, variance, skewness, and kurtosis of the distances |v_i| of all particles to the origin; together these constitute one feature vector. This feature vector is the compressed representation, obtained through the moment-of-inertia compression method, of that group of classification features over all points of the region. Joining the four compressed representations obtained from the four groups gives the region self-characterization feature corresponding to the region. Because the moment-of-inertia matrix describes the distribution of many particles in feature space well, the moment-of-inertia compression method preserves the information of the original data distribution to a large extent while compressing the many high-dimensional data points of a region into a representation of dimensionality as low as possible.
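The assembly of the compressed feature vector can be sketched as follows. This is a hedged sketch under the same equal-mass assumption as above; the exact normalization of J and the skewness/kurtosis conventions are choices of this illustration, not specified by the source.

```python
import numpy as np
from scipy.stats import skew, kurtosis

def compress_region(V):
    """Moment-of-inertia compression of one feature group for a region.

    V: (N, M) array, one M-dimensional classification feature per point.
    Returns the concatenation of: the upper-triangular entries of J
    (j <= k), the particle centroid, and the mean/variance/skewness/
    kurtosis of the distances |v_i| to the origin.
    """
    V = np.asarray(V, dtype=float)
    N, M = V.shape
    sq = np.sum(V ** 2, axis=1)
    J = (np.sum(sq) * np.eye(M) - V.T @ V) / N   # inertia matrix, m_i = 1/N
    upper = J[np.triu_indices(M)]                # J_jk with j <= k
    centroid = V.mean(axis=0)                    # centroid of all particles
    d = np.sqrt(sq)                              # distances |v_i| to origin
    stats = np.array([d.mean(), d.var(), skew(d), kurtosis(d)])
    return np.concatenate([upper, centroid, stats])
```

For an M-dimensional group the output has M(M+1)/2 + M + 4 entries, e.g. 9 entries for M = 2; concatenating the four groups' outputs gives the region self-characterization feature.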
Extraction of the region contrast feature. For a given image region, the region itself is taken as the candidate object area, and the equal-area rectangular ring immediately outside the region's bounding box is taken as the background area in which the object sits. The virtual-real classification features of every point in the object area and in the background area are computed separately; the features of all points in each area are accumulated into a joint distribution histogram, giving the histogram of the object-area features and the histogram of the background-area features. The chi-square distance between the two histograms is then computed; it is taken as a feature measuring the contrast, or difference, between the object and the background it sits in, and is called the region contrast feature.
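The chi-square distance between the two histograms can be sketched as below; the symmetric 0.5-weighted form and the epsilon guard against empty bins are conventions of this illustration.

```python
import numpy as np

def chi_square_distance(h_obj, h_bg, eps=1e-10):
    """Chi-square distance between two feature histograms.

    Used here as the region contrast feature: h_obj is the joint
    histogram of classification features over the object area, h_bg
    the one over the equal-area surrounding background area.
    """
    h_obj = np.asarray(h_obj, dtype=float)
    h_bg = np.asarray(h_bg, dtype=float)
    # Per-bin squared difference normalized by the bin mass of both
    # histograms; identical histograms give distance 0.
    return 0.5 * np.sum((h_obj - h_bg) ** 2 / (h_obj + h_bg + eps))
```

A large distance indicates a strong object/background contrast, which is what the region-level classifier is trained on.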
A pixel-level virtual-real classifier and a region-level virtual-real classifier are built; they judge whether a given region belongs to the area of a virtual object from the viewpoint of the region self-characterization feature and of the region contrast feature, respectively.
Pixel-level virtual-real classifier construction: positive and negative training-set samples are input; the region self-characterization features of the positive and negative samples are extracted; and the extracted feature sets are fed to a support vector machine for training, yielding the pixel-level virtual-real classifier. A characteristic of this classifier is that the feature compression it adopts makes its classification results scale-adaptive: when a self-characterization feature to be classified is extracted from a region whose size differs markedly from the region sizes of the training set, the classifier still decides with good accuracy whether the given region belongs to a virtual object. Concretely, although the classifier is trained on the self-characterization features of virtual-object regions sized [10, 30] x [10, 30] pixels, experimental results show that its classification remains accurate for regions much smaller than those of the training set. Since the objects this classifier classifies are small regions, and these small regions approximately describe the pixel at each region's center, it is called the pixel-level virtual-real classifier.
Region-level virtual-real classifier construction: positive and negative training-set samples are input; the region contrast features of the positive and negative samples are extracted; and the extracted feature sets are fed to a support vector machine for training, yielding the region-level virtual-real classifier. Because the classification feature this classifier uses reflects the overall distribution difference between a region and the background it sits in, it discriminates and detects the object to be detected well as a whole.
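Both classifiers are support vector machines trained on feature sets of positive and negative samples. A minimal training sketch using scikit-learn's SVC is shown below; the random feature vectors stand in for the extracted region features, and all names and parameters are illustrative, not part of the patent.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Stand-ins for extracted features: positive (augmented reality) samples
# and negative (real scene) samples, made separable for the demo.
pos = rng.normal(loc=1.0, size=(50, 20))
neg = rng.normal(loc=-1.0, size=(50, 20))
X = np.vstack([pos, neg])
y = np.array([1] * 50 + [0] * 50)

# The same recipe applies to both the pixel-level classifier (region
# self-characterization features) and the region-level classifier
# (region contrast features).
clf = SVC(kernel="rbf").fit(X, y)
score = clf.decision_function(X[:1])   # signed margin, usable as a score
```

The signed decision value plays the role of the "virtual score" used in the next step.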
Building the virtual score map. For an input augmented reality scene image, a small detection window (window size [10, 30] x [10, 30] pixels) is slid over the entire image with a small step (e.g., {1, 2, 3, 4, 5} pixels); the region self-characterization feature of the small image patch in each window is computed; the features of all small patches are fed into the pixel-level virtual-real classifier, giving each patch a self-characterization feature score, where a high score means the pixel-level classifier classifies the patch as a virtual region with high certainty. Because the detection windows are very small relative to the whole image and densely distributed, the score of each small patch can be mapped to the patch's center pixel and taken as the virtual score of that pixel; together these scores constitute the virtual score map of the whole augmented reality scene image. Since computing the classification features during region self-characterization and the feature compression are relatively time-consuming, and the features must be computed one by one for the large number of overlapping patches generated, an integral image method is adopted in this step to accelerate the computation. The virtual score map obtained in this way reflects well whether a point of the image belongs to a virtual object. That is, experimental results show that in the virtual score map, points with high virtual scores generally concentrate in the regions of virtual objects; conversely, points inside a virtual object region generally have high virtual scores.
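The window-to-center-pixel mapping can be sketched as follows. This is a simplified illustration: the real pipeline computes compressed region features and a trained classifier score per window (stubbed here by a dummy `score_fn`) and accelerates the scan with integral images; all names are hypothetical.

```python
import numpy as np

def virtual_score_map(image, window=15, step=2, score_fn=None):
    """Slide a small window over the image and write each window's
    score to the window's center pixel; unvisited pixels stay at 0.

    score_fn stands in for the pixel-level classifier applied to the
    compressed region feature of the patch (dummy: mean intensity).
    """
    if score_fn is None:
        score_fn = lambda patch: float(patch.mean())
    H, W = image.shape[:2]
    smap = np.zeros((H, W))
    for y in range(0, H - window + 1, step):
        for x in range(0, W - window + 1, step):
            patch = image[y:y + window, x:x + window]
            smap[y + window // 2, x + window // 2] = score_fn(patch)
    return smap
```

With a dummy bright square as the "virtual object", center pixels inside the square receive high scores while the border band, never a window center, stays zero.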
Building the virtual confidence map, whose flow is shown in Figure 2. First, the virtual score map obtained above is thresholded: all points with positive virtual scores are selected and recorded. A fixed percentage N% is set, and the top N% of the positively scored points, together with their positions on the original image, are selected and recorded; these points are called high virtual-response points. A fixed small constant M is set (e.g., M in [10, 100]), and the top M positively scored points, with their positions on the original image, are selected and recorded; these points are called highest virtual-response points. The number of highest virtual-response points is much smaller than the number of high virtual-response points. The high virtual-response points, the highest virtual-response points, and their positions on the original image together constitute the virtual confidence map. The virtual confidence map reflects well whether a point of the image belongs to a virtual object. That is, experimental results show that in the virtual confidence map, high virtual-response points generally concentrate in the regions of virtual objects; conversely, inside a virtual object region the high virtual-response points are densely distributed. Similarly, highest virtual-response points generally appear only in virtual object regions; conversely, most highest virtual-response points appear inside virtual object regions. High or highest virtual-response points that appear outside virtual object regions are referred to as noise points.
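The thresholding and selection of response points can be sketched as below; the function name and the particular parameter values are illustrative.

```python
import numpy as np

def virtual_confidence_map(score_map, top_percent=10, top_m=20):
    """Build the virtual confidence map from a virtual score map.

    Records all positively scored points, keeps the top N% of them
    ("high virtual-response points") and the top M of them ("highest
    virtual-response points"); taking the top M of the same ranking
    guarantees the highest-response set is a subset of the high set.
    """
    ys, xs = np.nonzero(score_map > 0)        # thresholding: positive scores
    scores = score_map[ys, xs]
    order = np.argsort(scores)[::-1]          # descending by virtual score
    n_high = max(1, int(len(order) * top_percent / 100))
    high = [(ys[i], xs[i]) for i in order[:n_high]]
    highest = [(ys[i], xs[i]) for i in order[:min(top_m, n_high)]]
    return high, highest
```

The returned positions on the original image, together with the two point sets, are exactly what the text calls the virtual confidence map.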
The flow for inferring the rough shape and position of the virtual object bounding box, shown in Figure 3, comprises the following steps: dividing subregions and obtaining candidate center points; expanded search to obtain the densely distributed area of high virtual-response points; determining the preliminary virtual object candidate regions; and determining the virtual object candidate region. Specifically, for subregion division, the virtual confidence map obtained above is divided into five equal-area, possibly overlapping subregions. The flow for obtaining candidate center points is shown in Figure 4: according to the distribution of the high virtual-response points in each subregion, the mean-shift algorithm is used to find the distribution center of those points in each subregion; this center is called a candidate center point. The number of candidate center points is k (k <= 5; k < 5 corresponds to some subregion containing no high virtual-response points), and it may be assumed that the center of the region occupied by the virtual object lies among these k candidate center points. Expanded search, which obtains the densely distributed area of high virtual-response points, proceeds as shown in Figure 5: for each candidate center point, circular search regions are dynamically grown with that point as center and with a radius increased by a fixed step; when the number of high virtual-response points in the current search region no longer increases, the border of the virtual object region is considered found. Ideally, the condition for stopping the expanded search is that the increment in the number of high virtual-response points is zero as the radius grows, but to eliminate the influence of the noise points present in the virtual confidence map, a noise-suppression parameter is introduced and the stopping condition is strengthened: as the radius grows, the search continues only while the increment in the number of high virtual-response points in the search region exceeds the noise-suppression parameter.
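The strengthened stopping condition can be sketched as follows; the step length and threshold values are illustrative, and the function name is hypothetical.

```python
import numpy as np

def expanded_search(center, points, step=5.0, noise_threshold=2):
    """Grow a circular search region around a candidate center until
    the per-step increment of enclosed high virtual-response points
    is no longer above the noise-suppression parameter.

    points: (N, 2) array of high-response coordinates. Returns the
    final radius (approximating the object region border) and the
    number of enclosed points.
    """
    points = np.asarray(points, dtype=float)
    center = np.asarray(center, dtype=float)
    d = np.linalg.norm(points - center, axis=1)   # distances to center
    radius, count = step, int(np.sum(d <= step))
    while True:
        new_count = int(np.sum(d <= radius + step))
        if new_count - count <= noise_threshold:  # growth within noise level
            return radius, count
        radius += step
        count = new_count
```

On a tight Gaussian cluster of response points, the search stops shortly after the radius covers the cluster, since further growth adds at most a few noise points.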
Determining the preliminary virtual object candidate region: the candidate object shape within the densely distributed area of high virtual-response points is approximately calculated and, combined with the position information of that area, constitutes a preliminary virtual object candidate region. When the expanded search stops, the densely distributed area of high virtual-response points is obtained, and the set P of all high virtual-response points in that area is known; from P the candidate object shape in the area, embodied as the shape of the candidate bounding box, is obtained as:

x_min = min({x | <x, y> ∈ P});  x_max = max({x | <x, y> ∈ P});
y_min = min({y | <x, y> ∈ P});  y_max = max({y | <x, y> ∈ P});

where x_min and x_max are respectively the minimum and maximum x coordinates, in image coordinates, of the area occupied by the candidate bounding box, and y_min and y_max are respectively the minimum and maximum y coordinates. The position and shape of the candidate bounding box within the image are thus determined.
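The bounding-box formulas above reduce to coordinate-wise extremes of the point set P:

```python
def bounding_box(points):
    """Candidate bounding box of the dense high-response area: the
    axis-aligned extremes (x_min, x_max, y_min, y_max) of the set P
    of <x, y> points, as in the formulas above."""
    xs = [x for x, y in points]
    ys = [y for x, y in points]
    return min(xs), max(xs), min(ys), max(ys)
```
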
The candidate object shape in this area, combined with the position information of the area (the candidate center point), constitutes the preliminary virtual object candidate region.
Among the k preliminary virtual object candidate regions obtained, the one with the largest weighted count of the high virtual-response points and highest virtual-response points it contains is selected as the virtual object candidate region; this region contains the rough shape and position information of the virtual object bounding box.
The coarse localization of the virtual object obtained above is then further refined, to reduce errors that may have arisen while computing the virtual object candidate region, and thereby obtain the final detection result. The concrete steps are: around the virtual object candidate region, take the area twice that of the candidate region; within this area, construct multiple overlapping large detection windows with the same shape as the candidate region (large detection windows generally range in size over [200, 500] x [200, 500] pixels; the exact length and width equal those of the virtual object bounding box of the candidate region); take the image patch in each large detection window and compute its region contrast feature; feed the region contrast features of all patches into the region-level virtual-real classifier for classification, and select the detection window with the highest score as the final detection result of the virtual object.
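The refinement step amounts to scoring shifted copies of the candidate box and keeping the best one. The sketch below is illustrative: `score_fn` stands in for the region-level classifier applied to a window's region contrast feature, and the offset grid is an assumption of this example.

```python
import numpy as np

def refine_detection(candidate_box, score_fn, n_offsets=5, shift=10):
    """Final virtual-object detection: score several large windows of
    the candidate's size, shifted within the doubled-area surround,
    and keep the best-scoring one.

    candidate_box: (x, y, width, height) of the coarse localization.
    """
    x0, y0, w, h = candidate_box
    best_box, best_score = None, -np.inf
    for dy in range(-n_offsets, n_offsets + 1):
        for dx in range(-n_offsets, n_offsets + 1):
            # Overlapping windows of the same shape, shifted around
            # the candidate position inside the surrounding area.
            box = (x0 + dx * shift // n_offsets,
                   y0 + dy * shift // n_offsets, w, h)
            s = score_fn(box)
            if s > best_score:
                best_box, best_score = box, s
    return best_box, best_score
```

With a toy score that peaks at a known true position near the coarse estimate, the search recovers that position exactly.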
The above is only a basic description of the present invention; any equivalent transformation made according to the technical solution of the present invention falls within the protection scope of the present invention.
Parts of the present invention not elaborated here belong to techniques well known to those skilled in the art.

Claims (1)

1. A confidence-map-based method for discriminating and detecting a virtual object in an augmented reality scene, characterized in that the implementation steps are as follows:
(1) constructing an augmented reality scene training data set with augmented reality images containing virtual objects as positive samples and real scene images as negative samples, and choosing virtual-real classification features that exploit the physical imaging differences between virtual objects and real images;
(2) on the training data set, using the virtual-real classification features to extract the region self-characterization features of the augmented reality scenes and of the real scenes respectively, and building a pixel-level virtual-real classifier;
(3) on the training data set, using the virtual-real classification features to extract the region contrast features of the augmented reality scenes and of the real scenes respectively, and building a region-level virtual-real classifier;
(4) for a given test augmented reality scene, detecting with the pixel-level virtual-real classifier and a small detection window, obtaining a virtual score map reflecting the virtual-real classification result of each pixel;
(5) defining a virtual confidence map and, on the basis of the virtual score map, thresholding to obtain the virtual confidence map of the test augmented reality scene;
(6) based on the virtual confidence map, coarsely localizing the virtual object, obtaining the rough shape and position of the virtual object bounding box;
(7) on the basis of the coarse localization, detecting in the test augmented reality scene image with the region-level virtual-real classifier and a large detection window, obtaining the final detection result of the virtual object;
The virtual-real classification features chosen in said step (1) comprise: local statistics, surface gradient, second fundamental form, and Beltrami flow; all of the above classification features can be extracted at each pixel of an image to obtain the features corresponding to that point;
When the pixel-level classifier is built in said step (2), on the training data set, for augmented reality scene images only the virtual object regions are chosen as positive sample regions, and for real scene images only regions closely resembling the virtual objects of the positive samples are chosen as negative sample regions; for a given image region, the virtual-real classification features of every point in the region are computed; the moment-of-inertia compression method compresses the classification features of each given positive or negative sample region, obtaining the region self-characterization feature corresponding to that region; and the self-characterization feature sets of the positive and negative samples are fed to a support vector machine for training, yielding the pixel-level virtual-real classifier;
When the region-level virtual-real classifier is built in said step (3), on the training data set, each positive or negative sample region itself is taken as the candidate object area, and the equal-area rectangular ring outside the region's bounding box as the background area in which the object sits; the virtual-real classification features of every point in the object area and the background area are extracted respectively; the features of all points in the object area and the background area are accumulated into the joint distribution histogram of the object-area features and the joint distribution histogram of the background-area features respectively; the chi-square distance between the two histograms is computed and taken as a feature measuring the difference between the object and the background it sits in, called the region contrast feature; and the extracted contrast feature sets of the positive and negative samples are fed to a support vector machine for training, yielding the region-level virtual-real classifier;
The virtual score map construction step of said step (4) is: for the input augmented reality scene image, the small detection window is slid over the entire image with a small step; the region self-characterization feature of the small image patch in each small detection window is computed; the features of all small patches are fed into the pixel-level virtual-real classifier, giving each patch a self-characterization feature score, where a high score means the pixel-level classifier classifies the patch as a virtual region with high certainty; because the detection windows are very small relative to the whole image and densely distributed, the score of each small patch is mapped to the patch's center pixel and taken as the virtual score of that pixel; these scores together constitute the virtual score map of the whole augmented reality scene image;
The virtual confidence map construction method of said step (5) is: the virtual score map of the augmented reality scene image is thresholded, and all points with positive virtual scores are recorded; a fixed percentage N% is set, and the top N% of the positively scored points, with their positions on the original image, are recorded; these points are called high virtual-response points; a fixed small constant M is set, and the top M positively scored points, with their positions on the original image, are recorded; these points are called highest virtual-response points; the parameters are set so that the highest virtual-response points are also contained in the set of high virtual-response points, i.e., the highest virtual-response points are the highest-scoring part of the high virtual-response points; the high virtual-response points, the highest virtual-response points, and their positions on the original image together constitute the virtual confidence map;
The method by which said step (6) obtains the rough shape and position of the virtual object bounding box is: the obtained virtual confidence map is divided into five equal-area, possibly overlapping subregions, and the distribution center of the high virtual-response points in each subregion is found respectively; each subregion center is taken as a candidate virtual object center point, and an expanded search outward from each center point yields the densely distributed area of high virtual-response points; for each densely distributed area, the candidate object shape in that area is approximately calculated and, combined with the position information of the area, constitutes a preliminary virtual object candidate region; among the preliminary virtual object candidate regions, the one with the largest weighted count of the high virtual-response points and highest virtual-response points it contains is selected as the virtual object candidate region, which contains the rough shape and position information of the virtual object bounding box;
The virtual object detection method of said step (7) is specifically: densely sampling around the virtual object candidate region in the test augmented reality scene, constructing multiple overlapping detection windows, classifying them with the region-level virtual-real classifier, and choosing the best-scoring detection window as the final detection result of the virtual object.
CN 201110299857 2011-09-30 2011-09-30 Confidence map-based method for distinguishing and detecting virtual object of augmented reality scene Expired - Fee Related CN102509104B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201110299857 CN102509104B (en) 2011-09-30 2011-09-30 Confidence map-based method for distinguishing and detecting virtual object of augmented reality scene


Publications (2)

Publication Number Publication Date
CN102509104A CN102509104A (en) 2012-06-20
CN102509104B true CN102509104B (en) 2013-03-20

Family

ID=46221185


Country Status (1)

Country Link
CN (1) CN102509104B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11830214B2 (en) * 2018-06-01 2023-11-28 Apple Inc. Methods and devices for detecting and identifying features in an AR/VR scene

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102798583B (en) * 2012-07-13 2014-07-30 长安大学 Ore rock block degree measurement method based on improved FERRET
EP2932708B1 (en) 2013-05-28 2017-07-19 Hewlett-Packard Enterprise Development LP Mobile augmented reality for managing enclosed areas
WO2015062164A1 (en) * 2013-10-31 2015-05-07 The Chinese University Of Hong Kong Method for optimizing localization of augmented reality-based location system
CN105654504A (en) * 2014-11-13 2016-06-08 丁业兵 Adaptive bandwidth mean value drift object tracking method based on rotary inertia
CN104794754B (en) * 2015-05-12 2018-04-20 成都绿野起点科技有限公司 A kind of Distributed Virtual Reality System
CN104869160B (en) * 2015-05-12 2018-07-31 成都绿野起点科技有限公司 A kind of Distributed Virtual Reality System based on cloud platform
CN104780180B (en) * 2015-05-12 2019-02-12 国电物资集团有限公司电子商务中心 A kind of Virtual Reality Platform based on mobile terminal
CN108492374B (en) * 2018-01-30 2022-05-27 青岛中兴智能交通有限公司 Application method and device of AR (augmented reality) in traffic guidance
CN111739084B (en) * 2019-03-25 2023-12-05 上海幻电信息科技有限公司 Picture processing method, atlas processing method, computer device, and storage medium
CN112270063B (en) * 2020-08-07 2023-03-28 四川航天川南火工技术有限公司 Sensitive parameter hypothesis testing method for initiating explosive system
CN115346002B (en) * 2022-10-14 2023-01-17 佛山科学技术学院 Virtual scene construction method and rehabilitation training application thereof
CN117315375B (en) * 2023-11-20 2024-03-01 腾讯科技(深圳)有限公司 Virtual part classification method, device, electronic equipment and readable storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101520904B (en) * 2009-03-24 2011-12-28 上海水晶石信息技术有限公司 Reality augmenting method with real environment estimation and reality augmenting system
WO2011084720A2 (en) * 2009-12-17 2011-07-14 Qderopateo, Llc A method and system for an augmented reality information engine and product monetization therefrom
CN101893935B (en) * 2010-07-14 2012-01-11 北京航空航天大学 Cooperative construction method for enhancing realistic table-tennis system based on real rackets




Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20130320

Termination date: 20150930

EXPY Termination of patent right or utility model