CN103679142A - Target human body identification method based on spatial constraint - Google Patents

Target human body identification method based on spatial constraint

Info

Publication number
CN103679142A
CN103679142A
Authority
CN
China
Prior art keywords
image
foreground area
identified
width
image block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310636477.4A
Other languages
Chinese (zh)
Other versions
CN103679142B (en)
Inventor
陈普强
郭立君
柯伟扬
张�荣
赵杰煜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ningbo Skyview Image Technology Co., Ltd.
Original Assignee
Ningbo University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ningbo University filed Critical Ningbo University
Priority to CN201310636477.4A priority Critical patent/CN103679142B/en
Publication of CN103679142A publication Critical patent/CN103679142A/en
Application granted granted Critical
Publication of CN103679142B publication Critical patent/CN103679142B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a target human body identification method based on spatial constraint. The method includes: extracting the foreground region of an image to be identified and the foreground regions of the images in an image target database; dividing the foreground region of each image in the image target database into first image blocks of equal size that overlap one another, and dividing the foreground region of the image to be identified into second image blocks of equal size that overlap one another; extracting the feature vectors of the first image blocks and of the second image blocks; finding, for each second image block, the most similar block in each image of the image target database; extracting the minimum-distance block sequences of the first image blocks and of the second image blocks; and calculating the matching score of the image to be identified with respect to each image in the image target database so as to perform human body identification. The method has the advantages that rich structural information is integrated into the matching score, the influence of changes in shooting angle, posture or lighting is effectively avoided, small differences between similar regions of different human bodies can be effectively distinguished, and no labeling of the image to be identified is required.

Description

Target human body recognition method based on spatial constraint
Technical field
The present invention relates to a target human body recognition method, and in particular to a target human body recognition method based on spatial constraint.
Background technology
Target human body recognition (hereinafter referred to as human body recognition) is the task of judging whether a pedestrian appearing under one camera is the same pedestrian as one appearing under another camera; in computer vision this is known as the person re-identification problem (Person Re-identification) in non-overlapping multi-camera systems. Human body recognition has important applications in intelligent video surveillance, mainly in the extraction of the target human body and in target tracking across non-overlapping camera systems. With the spread of intelligent surveillance, human body recognition has received increasing attention; recent research shows that current methods still face enormous challenges. First, the resolution of images captured by surveillance cameras is low, so face recognition and other biometric techniques can hardly be applied, which restricts the main basis of human body recognition to appearance features. Second, to simplify the problem it is usually assumed that the clothing of the same pedestrian remains unchanged across scenes, yet under different scenes (as shown in Fig. 1a) factors such as shooting angle, posture, background, lighting and camera parameters still pose great challenges to recognition. Existing human body recognition methods fall into two main categories: supervised learning methods and unsupervised learning methods.
Among supervised learning methods, Prosser et al. (B. Prosser, W. Zheng, S. Gong, T. Xiang, and Q. Mary. Person re-identification by support vector ranking. In BMVC (British Machine Vision Conference), 2010) convert person re-identification into a ranking problem and use an improved SVM algorithm; Gray et al. (D. Gray and H. Tao. Viewpoint invariant pedestrian recognition with an ensemble of localized features. ECCV (European Conference on Computer Vision), 2008) combine local features such as space and color and use a boosting algorithm; Schwartz et al. (W. Schwartz and L. Davis. Learning discriminative appearance-based models using partial least squares. In XXII Brazilian Symposium on Computer Graphics and Image Processing (SIBGRAPI), 2009) use partial least squares to reduce the dimensionality of three high-dimensional features, color, gradient and texture, and assign each a different weight according to its discriminative power; Li et al. (W. Li and X. Wang. Locally aligned feature transforms across views. In CVPR (IEEE Conference on Computer Vision and Pattern Recognition), 2013) first transform the images from two different cameras into different feature spaces and then perform metric learning separately in each; Zheng et al. (W. Zheng, S. Gong, and T. Xiang. Person re-identification by probabilistic relative distance comparison. In CVPR, 2011) treat person re-identification as a distance-learning problem; Weinberger et al. (K. Q. Weinberger and L. K. Saul. Fast solvers and efficient implementations for distance metric learning. In ICML (International Conference on Machine Learning), 2008) propose the LMNN algorithm, which learns a metric by pulling same-class samples closer and penalizing different-class samples that are not pushed far enough apart; and Kostinger et al. (M. Kostinger, M. Hirzer, P. Wohlhart, P. Roth, and H. Bischof. Large scale metric learning from equivalence constraints. In CVPR, 2012) apply a relaxed distance metric learning to the problem. All of these supervised methods require human samples to be labeled before learning; when the camera environment changes, for example in shooting angle, posture or lighting, the samples must be labeled again, so these methods cannot efficiently handle large numbers of human samples or recognition across varying environments.
Unsupervised learning methods mainly propose more effective feature vectors and compute the distance between images directly. Farenzena et al. (M. Farenzena, L. Bazzani, A. Perina, V. Murino, and M. Cristani. Person re-identification by symmetry-driven accumulation of local features. In CVPR, 2010) combine a global HSV feature, maximally stable color regions and recurrent highly structured patches, and compute the distance between two images for recognition; Zhao et al. (R. Zhao, W. Ouyang, and X. Wang. Unsupervised Salience Learning for Person Re-identification. In CVPR, 2013) learn a salience weight for each patch and use it to weight the contribution of each local patch during local matching; Zhang et al. (Y. Zhang and S. Li. Gabor-LBP based region covariance descriptor for person reidentification. Proc. Int. Conference on Image and Graphics, 2011) combine LBP texture features and Gabor features in a region covariance descriptor and compute the distance between regions, which better withstands the influence of illumination and posture changes; Ma et al. (B. Ma, Y. Su, and F. Jurie. Local descriptors encoded by fisher vectors for person re-identification. 2012) encode higher-order local feature statistics with Fisher vectors, which noticeably improves recognition. These unsupervised methods concentrate on feature design and local matching but do not exploit the rich structural information in the image; when two different human bodies contain similar regions, the small differences are hard to distinguish. As shown in Fig. 1b, unsupervised methods that accumulate local matches easily judge the different people with similar regions in Fig. 1b to be the same person.
Summary of the invention
The technical problem to be solved by the present invention is to provide a target human body recognition method based on spatial constraint that does not require labeling the image to be identified, that can effectively avoid the influence of changes in shooting angle, posture and lighting, and that can effectively distinguish the small differences between similar regions of different human bodies.
The technical solution adopted by the present invention to solve the above technical problem is a target human body recognition method based on spatial constraint, characterized by comprising the following steps:
① Assume there is a camera A, and form an image target database from all the images captured by camera A. Extract the foreground region of every image in the image target database; the foreground region of each image in the image target database is the human body in that image. Divide the foreground region of every image in the image target database into first image blocks of identical size that overlap one another, and then extract the feature vector of every first image block in the foreground region of every image in the image target database;
② Assume there is a camera B, and take the image captured by camera B in real time as the image to be identified. Extract the foreground region of the image to be identified; the foreground region of the image to be identified is the human body in the image to be identified. Divide the foreground region of the image to be identified into second image blocks of identical size that overlap one another, and then extract the feature vector of every second image block in the foreground region of the image to be identified;
③ From the feature vectors of the first image blocks in the foreground region of every image in the image target database, obtain the minimum-distance block sequence of every first image block in the foreground region of every image in the image target database. Denote the minimum-distance block sequence of the first image block whose center pixel lies at coordinate (i, j) in the q-th image of the image target database as $d_{i,j}^{B,q}=\{d_{i,j,1}^{B,q},d_{i,j,2}^{B,q},\ldots,d_{i,j,k}^{B,q},\ldots,d_{i,j,M}^{B,q}\}$, where 1 ≤ q ≤ Q, Q is the number of images in the image target database, (i, j) is the coordinate of the center pixel of the first image block in the q-th image, $(i,j)\in F$, $\in$ is the belonging symbol, F is the set of the coordinates of all pixels in the foreground region of the q-th image, 1 ≤ k ≤ M, M is the total number of rows of first image blocks contained in the foreground region of the q-th image, $d_{i,j,k}^{B,q}=\min\{d(x_{i,j}^{B,q},x_{k,1}^{B,q}),\ldots,d(x_{i,j}^{B,q},x_{k,j'}^{B,q}),\ldots,d(x_{i,j}^{B,q},x_{k,N'}^{B,q})\}$, min{} is the minimum function, 1 ≤ j' ≤ N', N' is the total number of first image blocks contained in the k-th row of first image blocks in the foreground region of the q-th image, $d(x_{i,j}^{B,q},x_{k,j'}^{B,q})=\|x_{i,j}^{B,q}-x_{k,j'}^{B,q}\|_2$, $\|\cdot\|_2$ is the 2-norm, $x_{i,j}^{B,q}$ is the feature vector of the first image block whose center pixel lies at (i, j) in the q-th image, $x_{k,j'}^{B,q}$ is the feature vector of the first image block whose center pixel lies at (k, j') in the q-th image, and $(k,j')\in F$.
Then, from the feature vectors of the second image blocks in the foreground region of the image to be identified, obtain the minimum-distance block sequence of every second image block in the foreground region of the image to be identified. Denote the minimum-distance block sequence of the second image block whose center pixel lies at coordinate (m, n) in the image to be identified as $d_{m,n}^{A}=\{d_{m,n,1}^{A},d_{m,n,2}^{A},\ldots,d_{m,n,k'}^{A},\ldots,d_{m,n,K}^{A}\}$, where (m, n) is the coordinate of the center pixel of the second image block in the image to be identified, $(m,n)\in F'$, F' is the set of the coordinates of all pixels in the foreground region of the image to be identified, 1 ≤ k' ≤ K, K is the total number of rows of second image blocks contained in the foreground region of the image to be identified, $d_{m,n,k'}^{A}=\min\{d(x_{m,n}^{A},x_{k',1}^{A}),\ldots,d(x_{m,n}^{A},x_{k',n'}^{A}),\ldots,d(x_{m,n}^{A},x_{k',L'}^{A})\}$, 1 ≤ n' ≤ L', L' is the total number of second image blocks contained in the k'-th row of second image blocks in the foreground region of the image to be identified, $d(x_{m,n}^{A},x_{k',n'}^{A})=\|x_{m,n}^{A}-x_{k',n'}^{A}\|_2$, $x_{m,n}^{A}$ is the feature vector of the second image block whose center pixel lies at (m, n) in the image to be identified, $x_{k',n'}^{A}$ is the feature vector of the second image block whose center pixel lies at (k', n') in the image to be identified, and $(k',n')\in F'$;
④ Using a nearest-neighbor algorithm, find, for each second image block in the foreground region of the image to be identified, the most similar block in the foreground region of every image in the image target database. For $x_{m,n}^{A}$, search all first image blocks in rows m−l to m+l of the foreground region of every image in the image target database and find the block most similar to $x_{m,n}^{A}$. Suppose the most similar block found among the first image blocks in rows m−l to m+l of the foreground region of the q-th image in the image target database is the first image block whose center pixel lies at $(m_1, n_1)$ in the q-th image; denote this most similar block as $x_{m_1,n_1}^{B,q}$, where m−l ≤ m₁ ≤ m+l, 1 ≤ l ≤ m, $(m_1,n_1)\in F$.
Then, from the feature vector of each second image block in the foreground region of the image to be identified and the feature vector of its most similar block in the foreground region of every image in the image target database, obtain the matching score of each second image block against every image in the image target database. For $x_{m,n}^{A}$, denote its matching score against the q-th image in the image target database as $score(x_{m,n}^{A},x_{m_1,n_1}^{B,q})=e^{t_1}$, where e is the natural base, $t_1=-\frac{\|x_{m,n}^{A}-x_{m_1,n_1}^{B,q}\|_2^2}{\sigma\delta_2^2}$, σ is a constant, δ₂ is a constant representing the bandwidth of the matching score, $x_{m,n}^{A}$ is the feature vector of the second image block, and $x_{m_1,n_1}^{B,q}$ is the feature vector of its most similar first image block in the q-th image;
⑤ From the minimum-distance block sequences of all second image blocks in the foreground region of the image to be identified and the minimum-distance block sequences of all first image blocks in the foreground region of every image in the image target database, obtain the spatial constraint weight of every second image block in the foreground region of the image to be identified with respect to every image in the image target database. For $x_{m,n}^{A}$, denote its spatial constraint weight with respect to the q-th image in the image target database as $w(x_{m,n}^{A},x_{m_1,n_1}^{B,q})=e^{f_1}$, where e is the natural base, $f_1=-\frac{\left(\sum_{k'=1}^{K}\left(d_{m,n,k'}^{A}-d_{m_1,n_1,k'}^{B,q}\right)\right)^2}{2\delta_3^2}$, δ₃ is a constant representing the bandwidth of the constraint weight, $d_{m,n,k'}^{A}$ is the k'-th element of $d_{m,n}^{A}$, $d_{m_1,n_1,k'}^{B,q}$ is the k'-th element of $d_{m_1,n_1}^{B,q}$, $d_{m_1,n_1}^{B,q}$ is the minimum-distance block sequence of the first image block in the q-th image of the image target database that is most similar to $x_{m,n}^{A}$, 1 ≤ k' ≤ K, and K is the total number of rows of second image blocks contained in the foreground region of the image to be identified;
⑥ From the matching score of each second image block in the foreground region of the image to be identified against every image in the image target database and the spatial constraint weights of all second image blocks in the foreground region of the image to be identified with respect to every image in the image target database, obtain the matching score of the image to be identified with respect to every image in the image target database. Denote the matching score of the image to be identified with respect to the q-th image in the image target database as $Score(I^{A},I_q^{B})=\sum_{m=1,n=1}^{m=K,\,n=L}\left\{score(x_{m,n}^{A},x_{m_1,n_1}^{B,q})\cdot w(x_{m,n}^{A},x_{m_1,n_1}^{B,q})\right\}$, where $I^{A}$ is the image to be identified, $I_q^{B}$ is the q-th image in the image target database, and L is the total number of columns of second image blocks contained in the foreground region of the image to be identified;
⑦ Sort the images in the image target database in descending order of the matching score of the image to be identified with respect to each of them to obtain an image sequence, and identify the human body in the image to be identified as the human body in the first image of that sequence.
The detailed process of step ① is as follows:
①-1. Assume there is a camera A, and form an image target database from all the images captured by camera A;
①-2. Define the q-th image currently to be processed in the image target database as the current image, where 1 ≤ q ≤ Q and Q is the number of images in the image target database;
①-3. Equalize the V channel of the HSV representation of the current image, then extract the foreground region of the current image using pose estimation; the foreground region of the current image is the human body in the current image. Divide the foreground region of the current image into a plurality of first image blocks of identical size that overlap one another, and define the first image block currently to be processed as the current first image block;
①-4. Extract the LAB color feature of the current first image block; then evenly divide the current first image block into a plurality of non-overlapping first image cells and, using all first image cells, quantize the local gradient of the current first image block into eight directions; then extract the SIFT feature of the current first image block from its eight directions; finally, from the LAB color feature and the SIFT feature of the current first image block, obtain its feature vector, denoted $c_q=\begin{bmatrix}a_q\\b_q\end{bmatrix}$, where $a_q$ is a 288-dimensional vector, obtained by sampling each of the three color channels of the current first image block three times and then extracting a 32-bin color histogram of each color channel as the LAB color feature, and $b_q$ is a 384-dimensional vector, obtained by extracting 16-dimensional features in each of the eight directions of the current first image block on each of its three color channels as the SIFT feature;
①-5. Take the next first image block to be processed as the current first image block and return to step ①-4 until the feature vectors of all first image blocks in the foreground region of the current image have been obtained;
①-6. Let q = q + 1, take the next image to be processed in the image target database as the current image, and return to step ①-3 until all images in the image target database have been processed.
In step ①-3, the process of dividing the foreground region of the current image into a plurality of first image blocks of identical size that overlap one another is: move a sliding window of size 10 × 10 across the current image with a step of 4 pixels, dividing the current image into a plurality of first image blocks of size 10 × 10.
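A minimal sketch of this sliding-window partition is given below in Python with NumPy; the function name and the representation of the foreground as a boolean mask are illustrative assumptions, not part of the patent.

```python
import numpy as np

def extract_blocks(image, fg_mask, block=10, step=4):
    """Cut overlapping block x block patches with the given step.

    Only patches whose centre pixel lies inside the foreground mask are kept,
    mirroring the requirement that the centre coordinate belongs to the set F.
    Returns a dict mapping the centre coordinate (row, col) to the patch.
    """
    h, w = image.shape[:2]
    patches = {}
    for top in range(0, h - block + 1, step):
        for left in range(0, w - block + 1, step):
            ci, cj = top + block // 2, left + block // 2   # centre pixel of the block
            if fg_mask[ci, cj]:                            # keep foreground blocks only
                patches[(ci, cj)] = image[top:top + block, left:left + block]
    return patches
```

For a 48 × 128 pedestrian image, for example, this yields roughly 30 rows by 10 columns of overlapping blocks, each indexed by its centre coordinate.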
In step ①-4, the size of the first image cells is 4 × 4.
The detailed process of step ② is as follows:
②-1. Assume there is a camera B, and take the image captured by camera B in real time as the image to be identified. Equalize the V channel of the HSV representation of the image to be identified, then extract the foreground region of the image to be identified using pose estimation; the foreground region of the image to be identified is the human body in the image to be identified. Divide the foreground region of the image to be identified into second image blocks of identical size that overlap one another, and define the second image block currently being processed as the current second image block;
②-2. Extract the LAB color feature of the current second image block; then evenly divide the current second image block into a plurality of non-overlapping second image cells and, using all second image cells, quantize the local gradient of the current second image block into eight directions; then extract the SIFT feature of the current second image block from its eight directions; finally, from the LAB color feature and the SIFT feature of the current second image block, obtain its feature vector, denoted $c=\begin{bmatrix}a\\b\end{bmatrix}$, where a is a 288-dimensional vector, obtained by sampling each of the three color channels of the current second image block three times and then extracting a 32-bin color histogram of each color channel as the LAB color feature, and b is a 384-dimensional vector, obtained by extracting 16-dimensional features in each of the eight directions of the current second image block on each of its three color channels as the SIFT feature;
②-3. Take the next second image block to be processed as the current second image block and return to step ②-2 until the feature vectors of all second image blocks in the foreground region of the image to be identified have been obtained.
In step ②-1, the process of dividing the foreground region of the image to be identified into a plurality of second image blocks of identical size that overlap one another is: move a sliding window of size 10 × 10 across the image to be identified with a step of 4 pixels, dividing the image to be identified into a plurality of second image blocks of size 10 × 10.
In step ②-2, the size of the second image cells is 4 × 4.
In step ④, the nearest-neighbor search for the most similar block of each second image block in the foreground region of the image to be identified may alternatively be restricted to the identical row: for $x_{m,n}^{A}$, among all first image blocks in row m of the foreground region of every image in the image target database, find the block most similar to $x_{m,n}^{A}$. Suppose the most similar block found among the first image blocks in row m of the foreground region of the q-th image in the image target database is the first image block whose center pixel lies at $(m, n_1)$ in the q-th image; denote this most similar block as $x_{m,n_1}^{B,q}$, $(m,n_1)\in F$.
Then, from the feature vector of each second image block in the foreground region of the image to be identified and the feature vector of its most similar block in the foreground region of every image in the image target database, obtain the matching score of each second image block against every image in the image target database. For $x_{m,n}^{A}$, denote its matching score against the q-th image in the image target database as $score(x_{m,n}^{A},x_{m,n_1}^{B,q})=e^{t}$, where e is the natural base, $t=-\frac{\|x_{m,n}^{A}-x_{m,n_1}^{B,q}\|_2^2}{\sigma\delta^2}$, σ is a constant, δ is a constant representing the bandwidth of the matching score, $x_{m,n}^{A}$ is the feature vector of the second image block, and $x_{m,n_1}^{B,q}$ is the feature vector of its most similar first image block in the q-th image.
Correspondingly, in step ⑤ the spatial constraint weight of $x_{m,n}^{A}$ with respect to the q-th image in the image target database is denoted $w(x_{m,n}^{A},x_{m,n_1}^{B,q})=e^{f}$, where e is the natural base, $f=-\frac{\left(\sum_{k'=1}^{K}\left(d_{m,n,k'}^{A}-d_{m,n_1,k'}^{B,q}\right)\right)^2}{2\delta_1^2}$, δ₁ is a constant representing the bandwidth of the constraint weight, $d_{m,n,k'}^{A}$ is the k'-th element of $d_{m,n}^{A}$, $d_{m,n_1,k'}^{B,q}$ is the k'-th element of $d_{m,n_1}^{B,q}$, $d_{m,n_1}^{B,q}$ is the minimum-distance block sequence of the first image block in the q-th image of the image target database that is most similar to $x_{m,n}^{A}$, 1 ≤ k' ≤ K, and K is the total number of rows of second image blocks contained in the foreground region of the image to be identified.
Correspondingly, in step ⑥ the matching score of the image to be identified with respect to the q-th image in the image target database is denoted $Score(I^{A},I_q^{B})=\sum_{m=1,n=1}^{m=K,\,n=L}\left\{score(x_{m,n}^{A},x_{m,n_1}^{B,q})\cdot w(x_{m,n}^{A},x_{m,n_1}^{B,q})\right\}$, where $I^{A}$ is the image to be identified and $I_q^{B}$ is the q-th image in the image target database.
Compared with the prior art, the invention has the following advantages. In the process of target human body recognition, the foreground regions of the image to be identified and of the images in the image target database are first extracted; the foreground region of each image in the image target database is divided into first image blocks of equal size that overlap one another, and the foreground region of the image to be identified is divided into second image blocks of equal size that overlap one another; the feature vector of every first image block and every second image block is extracted; for each second image block, the most similar block in every image of the image target database is found; the minimum-distance block sequences of the first image blocks and of the second image blocks are extracted; and the matching score of the image to be identified with respect to each image in the image target database is then computed to perform human body recognition. Because the minimum-distance block sequences integrate the rich structural information of the image to be identified into the matching score, the influence of changes in shooting angle, posture and lighting is effectively avoided, small differences between similar regions of different human bodies can be effectively distinguished, and no labeling of the image to be identified is required in the recognition process.
Brief description of the drawings
Fig. 1a is a comparison of the same person under different scenes, taken from the VIPeR and CUHK02 databases; the two images in each row show the same person;
Fig. 1b is a comparison of different people with similar regions, taken from the VIPeR and CUHK02 databases; the two images in each row show two similar people;
Fig. 2 is the flow chart of the method of the invention;
Fig. 3a is the image to be identified selected in the specific implementation of the method of the invention;
Fig. 3b is the foreground region of the image to be identified, where the region inside the rectangular box is the foreground region;
Fig. 3c is the image from the image target database selected in the specific implementation of the method of the invention;
Fig. 3d is the foreground region of the image from the image target database selected in the specific implementation of the method of the invention, where the region inside the rectangular box is the foreground region;
Fig. 4a shows the influence of posture change on the constraint weight in the method of the invention; the first and third images in the first row show the same person, the first and third images in the second row show different people, the gray region in the second image is the background region, the non-gray region represents the constraint weight, a darker color indicates a larger constraint weight, and white indicates the smallest constraint weight;
Fig. 4b shows the influence of partial occlusion of the human body on the constraint weight in the method of the invention, with the same conventions as Fig. 4a;
Fig. 4c shows the influence of lighting change on the constraint weight in the method of the invention, with the same conventions as Fig. 4a;
Fig. 5 compares the experimental results of the method of the invention with those of existing target human body recognition methods (using images from the CUHK02 database);
Fig. 6 compares the experimental results of the method of the invention with those of existing target human body recognition methods (using images from the VIPeR database).
Embodiment
The present invention is described in further detail below with reference to the drawings and an embodiment.
The present invention proposes a target human body recognition method based on spatial constraint; the flow chart of the method is shown in Fig. 2, and it comprises the following steps:
① Assume there is a camera A, and form an image target database from all the images captured by camera A. Extract the foreground region of every image in the image target database; the foreground region of each image in the image target database is the human body in that image. Divide the foreground region of every image in the image target database into first image blocks of identical size that overlap one another, and then extract the feature vector of every first image block in the foreground region of every image in the image target database. The detailed process is:
①-1. Assume there is a camera A, and form an image target database from all the images captured by camera A.
①-2. Define the q-th image currently to be processed in the image target database as the current image, where 1 ≤ q ≤ Q and Q is the number of images in the image target database. The images in the image target database are used for comparison with the image to be identified; the current image is shown in Fig. 3c.
①-3. To reduce the influence of lighting on the image, first equalize the V channel of the HSV representation of the current image, then extract the foreground region of the current image using pose estimation (as shown in Fig. 3d). Divide the foreground region of the current image into a plurality of first image blocks of identical size that overlap one another, and define the first image block currently to be processed as the current first image block. The size of these first image blocks is 10 × 10, and their step in both the horizontal and vertical directions is 4 pixels; that is, the foreground region of the current image is divided into identical, overlapping first image blocks by moving a sliding window of size 10 × 10 across the current image with a step of 4 pixels, dividing the current image into a plurality of first image blocks of size 10 × 10.
①-4. Extract the LAB color feature of the current first image block. To better capture the discriminative information of the contour and details of the current image, each color channel of the current first image block is down-sampled three times with factors 0.5, 0.75 and 1, and a 32-bin color histogram is extracted on each color channel, giving a 288-dimensional color feature for the current first image block. To allow combination with other features, all histograms are normalized. To reduce the influence of shooting angle and lighting on recognition, a SIFT (Scale-Invariant Feature Transform) feature is additionally extracted as a complement to the color feature. In the SIFT extraction stage, the current first image block is evenly divided into a plurality of non-overlapping first image cells of size 4 × 4; each first image block contains four first image cells located at its four corners, with two pixels separating adjacent first image cells within the same first image block. Using all first image cells, the local gradient of the current first image block is quantized into eight directions, and a 128-dimensional SIFT feature is extracted on each of the three color channels, giving a 384-dimensional SIFT feature for the current first image block when the three channels are combined. To allow combination with the LAB color feature of the current first image block, this SIFT feature is also normalized. From the LAB color feature and the SIFT feature of the current first image block, its feature vector is obtained and denoted $c_q=\begin{bmatrix}a_q\\b_q\end{bmatrix}$; the feature vector of the current first image block is thus a 672-dimensional vector, where $a_q$ is a 288-dimensional vector whose elements are the extracted LAB color feature and $b_q$ is a 384-dimensional vector whose elements are the extracted SIFT feature.
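As a rough illustration of how such a 672-dimensional block descriptor (288-dimensional color part plus 384-dimensional gradient part) could be assembled, the NumPy-only sketch below follows the scale factors, bin counts and normalization described above; the gradient-orientation histogram over a 4 × 4 cell grid is a simplified stand-in for the dense SIFT descriptor, the nearest-neighbor resizing is a stand-in for proper resampling, and the 0–255 value range of the LAB channels is an assumption.

```python
import numpy as np

def nn_resize(channel, factor):
    """Nearest-neighbour down-sampling, a simple stand-in for proper resampling."""
    h, w = channel.shape
    rows = np.minimum((np.arange(max(1, int(h * factor))) / factor).astype(int), h - 1)
    cols = np.minimum((np.arange(max(1, int(w * factor))) / factor).astype(int), w - 1)
    return channel[np.ix_(rows, cols)]

def colour_part(block_lab):
    """288-dim colour feature: 3 channels x 3 scales x normalised 32-bin histograms."""
    feats = []
    for c in range(3):
        for factor in (0.5, 0.75, 1.0):
            ch = nn_resize(block_lab[:, :, c], factor)
            hist, _ = np.histogram(ch, bins=32, range=(0.0, 255.0))
            hist = hist.astype(float)
            feats.append(hist / (hist.sum() + 1e-12))
    return np.concatenate(feats)                     # 3 * 3 * 32 = 288

def gradient_part(block_lab, n_dirs=8, grid=4):
    """384-dim gradient feature: per channel, a 4 x 4 cell grid with 8 orientation
    bins (16 * 8 = 128 per channel), standing in for the dense SIFT descriptor."""
    feats = []
    for c in range(3):
        ch = block_lab[:, :, c].astype(float)
        gy, gx = np.gradient(ch)
        mag = np.hypot(gx, gy)
        ori = (np.arctan2(gy, gx) + np.pi) / (2 * np.pi)      # orientation in [0, 1]
        bins = np.minimum((ori * n_dirs).astype(int), n_dirs - 1)
        desc = np.zeros((grid, grid, n_dirs))
        row_cells = np.array_split(np.arange(ch.shape[0]), grid)
        col_cells = np.array_split(np.arange(ch.shape[1]), grid)
        for i, rs in enumerate(row_cells):
            for j, cs in enumerate(col_cells):
                b = bins[np.ix_(rs, cs)].ravel()
                m = mag[np.ix_(rs, cs)].ravel()
                desc[i, j] = np.bincount(b, weights=m, minlength=n_dirs)
        desc = desc.ravel()
        feats.append(desc / (np.linalg.norm(desc) + 1e-12))
    return np.concatenate(feats)                     # 3 * 128 = 384

def block_feature(block_lab):
    """672-dim feature vector c = [a; b] of one 10 x 10 image block (LAB, H x W x 3)."""
    return np.concatenate([colour_part(block_lab), gradient_part(block_lab)])
```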
①-5. Take the next first image block to be processed as the current first image block and return to step ①-4 until the feature vectors of all first image blocks in the foreground region of the current image have been obtained.
①-6. Let q = q + 1, take the next image to be processed in the image target database as the current image, and return to step ①-3 until all images in the image target database have been processed.
② Assume there is a camera B, and take the image captured by camera B in real time as the image to be identified, as shown in Fig. 3a. Extract the foreground region of the image to be identified; the foreground region of the image to be identified is the human body in the image to be identified. Divide the foreground region of the image to be identified into second image blocks of identical size that overlap one another, and then extract the feature vector of every second image block in the foreground region of the image to be identified. The detailed process is:
②-1. Assume there is a camera B, and take the image captured by camera B in real time as the image to be identified. To reduce the influence of lighting on the image, first equalize the V channel of the HSV representation of the image to be identified, then extract the foreground region of the image to be identified using pose estimation; the foreground region of the image to be identified is the human body in the image to be identified (as shown in Fig. 3b). Divide the foreground region of the image to be identified into second image blocks of identical size that overlap one another, and define the second image block currently being processed as the current second image block. The size of these second image blocks is 10 × 10, and their step in both the horizontal and vertical directions is 4 pixels; that is, the foreground region of the image to be identified is divided into identical, overlapping second image blocks by moving a sliding window of size 10 × 10 across the image to be identified with a step of 4 pixels, dividing the image to be identified into a plurality of second image blocks of size 10 × 10.
②-2. Extract the LAB color feature of the current second image block. To better capture the discriminative information of the contour and details of the image to be identified, each of the three color channels of the current second image block is down-sampled three times with factors 0.5, 0.75 and 1, and a 32-bin color histogram of the current second image block is extracted on each color channel, giving a 288-dimensional color feature. To allow combination with other features, all histograms are normalized. To reduce the influence of shooting angle and lighting on recognition, a SIFT feature is additionally extracted as a complement to the color feature. In the SIFT extraction stage, the current second image block is evenly divided into a plurality of non-overlapping second image cells of size 4 × 4; each second image block contains four second image cells located at its four corners, with two pixels separating adjacent second image cells within the same second image block. Using all second image cells, the local gradient of the current second image block is quantized into eight directions, and a 128-dimensional SIFT feature is extracted on each of the three color channels, giving a 384-dimensional SIFT feature for the current second image block when the three channels are combined. To allow combination with the LAB color feature of the current second image block, this SIFT feature is also normalized. From the LAB color feature and the SIFT feature of the current second image block, its feature vector is obtained and denoted $c=\begin{bmatrix}a\\b\end{bmatrix}$; the feature vector of the current second image block is thus a 672-dimensional vector, where a is a 288-dimensional vector whose elements are the extracted LAB color feature and b is a 384-dimensional vector whose elements are the extracted SIFT feature.
②-3. Define the next second image block as the current second image block and return to step ②-2 until the feature vectors of all second image blocks in the image to be identified have been obtained.
③ In order to make the constraint robust in the vertical direction against partial occlusion and against changes in posture, lighting and clothing wrinkles, each block searches every row of its own image in the vertical direction for the block with the nearest Euclidean distance. Therefore, from the feature vectors of the first image blocks in the foreground region of every image in the image target database, obtain the minimum-distance block sequence of every first image block in the foreground region of every image in the image target database. Denote the minimum-distance block sequence of the first image block whose center pixel lies at coordinate (i, j) in the q-th image of the image target database as $d_{i,j}^{B,q}=\{d_{i,j,1}^{B,q},d_{i,j,2}^{B,q},\ldots,d_{i,j,k}^{B,q},\ldots,d_{i,j,M}^{B,q}\}$, where 1 ≤ q ≤ Q, Q is the number of images in the image target database, (i, j) is the coordinate of the center pixel of the first image block in the q-th image, $(i,j)\in F$, $\in$ is the belonging symbol, F is the set of the coordinates of all pixels in the foreground region of the q-th image, 1 ≤ k ≤ M, M is the total number of rows of first image blocks contained in the foreground region of the q-th image, $d_{i,j,k}^{B,q}=\min\{d(x_{i,j}^{B,q},x_{k,1}^{B,q}),\ldots,d(x_{i,j}^{B,q},x_{k,j'}^{B,q}),\ldots,d(x_{i,j}^{B,q},x_{k,N'}^{B,q})\}$, min{} is the minimum function, 1 ≤ j' ≤ N', N' is the total number of first image blocks contained in the k-th row of first image blocks in the foreground region of the q-th image, $d(x_{i,j}^{B,q},x_{k,j'}^{B,q})=\|x_{i,j}^{B,q}-x_{k,j'}^{B,q}\|_2$, $\|\cdot\|_2$ is the 2-norm, $x_{i,j}^{B,q}$ is the feature vector of the first image block whose center pixel lies at (i, j) in the q-th image, $x_{k,j'}^{B,q}$ is the feature vector of the first image block whose center pixel lies at (k, j') in the q-th image, and $(k,j')\in F$.
In the same way as the minimum-distance block sequences of the first image blocks in the foreground regions of the images in the image target database, obtain, from the feature vectors of the second image blocks in the foreground region of the image to be identified, the minimum-distance block sequence of every second image block in the foreground region of the image to be identified. Denote the minimum-distance block sequence of the second image block whose center pixel lies at coordinate (m, n) in the image to be identified as $d_{m,n}^{A}=\{d_{m,n,1}^{A},d_{m,n,2}^{A},\ldots,d_{m,n,k'}^{A},\ldots,d_{m,n,K}^{A}\}$, where (m, n) is the coordinate of the center pixel of the second image block in the image to be identified, $(m,n)\in F'$, F' is the set of the coordinates of all pixels in the foreground region of the image to be identified, 1 ≤ k' ≤ K, K is the total number of rows of second image blocks contained in the foreground region of the image to be identified, $d_{m,n,k'}^{A}=\min\{d(x_{m,n}^{A},x_{k',1}^{A}),\ldots,d(x_{m,n}^{A},x_{k',n'}^{A}),\ldots,d(x_{m,n}^{A},x_{k',L'}^{A})\}$, 1 ≤ n' ≤ L', L' is the total number of second image blocks contained in the k'-th row of second image blocks in the foreground region of the image to be identified, $d(x_{m,n}^{A},x_{k',n'}^{A})=\|x_{m,n}^{A}-x_{k',n'}^{A}\|_2$, $x_{m,n}^{A}$ is the feature vector of the second image block whose center pixel lies at (m, n) in the image to be identified, $x_{k',n'}^{A}$ is the feature vector of the second image block whose center pixel lies at (k', n') in the image to be identified, and $(k',n')\in F'$.
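A small sketch of how the minimum-distance block sequence of one block could be computed from the per-block feature vectors is given below; it assumes the features are stored in a dictionary keyed by the (row, column) coordinate of each block centre, as in the earlier sketches, and the helper name is illustrative.

```python
import numpy as np

def min_distance_sequence(center, features):
    """Minimum-distance block sequence of the block centred at `center`.

    `features` maps each block-centre coordinate (row, col) of one image to its
    672-dim feature vector.  For every block row k of that image, the k-th
    element of the sequence is the smallest Euclidean distance between the
    given block and any block in row k.
    """
    x = features[center]
    rows = sorted({r for r, _ in features})        # block rows present in the foreground
    seq = []
    for k in rows:
        dists = [np.linalg.norm(x - features[(r, c)]) for (r, c) in features if r == k]
        seq.append(min(dists))
    return np.asarray(seq)                         # one entry per block row (length M or K)
```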
The constraint weight of the method of the invention is always computed by matching each block to the most similar block in the corresponding region, so it withstands well the influence of factors such as lighting and posture on the weight constraint. As shown in Fig. 4a, Fig. 4b and Fig. 4c, the two different people in each group share most of their appearance features, but the overall structure of the same person under different cameras is more likely to be similar; the spatial constraint therefore suppresses matches between blocks of different people whose local regions are similar but whose overall structures differ greatly.
④ Since the data of the method of the invention all come from human detection, the bounding box obtained by human detection can drift slightly in space around the human region and cannot be aligned exactly. To better handle this spatial variation, the search for the most similar block of each second image block is not restricted to the identical row of the images in the image target database but is allowed a small fluctuation range in the vertical direction. In the present embodiment, the most similar block of the second image block in row m and column n of the image to be identified is searched for among the blocks in rows m−l to m+l of the q-th image in the image target database, where 1 ≤ l ≤ m. If l is set too small, the second image block may fail to find its most similar block because the vertical direction cannot be perfectly aligned; if l is set too large, a second image block from the upper body may find its most similar block among second image blocks of the lower body. Taking this into account, l = 2 is used in the specific implementation.
Using a nearest-neighbor algorithm, find, for each second image block in the foreground region of the image to be identified, the most similar block in the foreground region of every image in the image target database. For $x_{m,n}^{A}$, search all first image blocks in rows m−l to m+l of the foreground region of every image in the image target database and find the block most similar to $x_{m,n}^{A}$. Suppose the most similar block found among the first image blocks in rows m−l to m+l of the foreground region of the q-th image in the image target database is the first image block whose center pixel lies at $(m_1, n_1)$ in the q-th image; denote this most similar block as $x_{m_1,n_1}^{B,q}$, where m−l ≤ m₁ ≤ m+l, 1 ≤ l ≤ m, $(m_1,n_1)\in F$.
Then, from the feature vector of each second image block in the foreground region of the image to be identified and the feature vector of its most similar block in the foreground region of every image in the image target database, obtain the matching score of each second image block against every image in the image target database. For $x_{m,n}^{A}$, denote its matching score against the q-th image in the image target database as $score(x_{m,n}^{A},x_{m_1,n_1}^{B,q})=e^{t_1}$, where e is the natural base, $t_1=-\frac{\|x_{m,n}^{A}-x_{m_1,n_1}^{B,q}\|_2^2}{\sigma\delta_2^2}$, σ is a constant, δ₂ is a constant representing the bandwidth of the matching score (σ = 1 and δ₂ = 2.8 are used in the specific implementation), $x_{m,n}^{A}$ is the feature vector of the second image block, and $x_{m_1,n_1}^{B,q}$ is the feature vector of its most similar first image block in the q-th image.
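The nearest-neighbor search with a vertical window of ±l rows and the resulting matching score can be sketched as follows, reusing the coordinate-keyed feature dictionaries from the earlier sketches; sigma and delta2 default to the values given above, and the function names are illustrative.

```python
import numpy as np

def most_similar_block(center, probe_features, gallery_features, l=2):
    """Return the gallery block centre in rows m-l .. m+l closest to the probe block."""
    m, _ = center
    x = probe_features[center]
    candidates = [(r, c) for (r, c) in gallery_features if m - l <= r <= m + l]
    return min(candidates, key=lambda rc: np.linalg.norm(x - gallery_features[rc]))

def matching_score(x_probe, x_gallery, sigma=1.0, delta2=2.8):
    """score = exp(-||x_probe - x_gallery||^2 / (sigma * delta2^2))."""
    d2 = float(np.sum((x_probe - x_gallery) ** 2))
    return np.exp(-d2 / (sigma * delta2 ** 2))
```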
⑤ From the minimum-distance block sequences of all second image blocks in the foreground region of the image to be identified and the minimum-distance block sequences of all first image blocks in the foreground region of every image in the image target database, obtain the spatial constraint weight of every second image block in the foreground region of the image to be identified with respect to every image in the image target database. For $x_{m,n}^{A}$, denote its spatial constraint weight with respect to the q-th image in the image target database as $w(x_{m,n}^{A},x_{m_1,n_1}^{B,q})=e^{f_1}$. The larger the constraint weight, the more similar the global structure of the image in the image target database containing the most similar block is to that of the image to be identified, and the higher the probability that the two images show the same person. Here e is the natural base, $f_1=-\frac{\left(\sum_{k'=1}^{K}\left(d_{m,n,k'}^{A}-d_{m_1,n_1,k'}^{B,q}\right)\right)^2}{2\delta_3^2}$, δ₃ is a constant representing the bandwidth of the constraint weight (δ₃ = 2.8 is used in the specific implementation), $d_{m,n,k'}^{A}$ is the k'-th element of $d_{m,n}^{A}$, $d_{m_1,n_1,k'}^{B,q}$ is the k'-th element of $d_{m_1,n_1}^{B,q}$, $d_{m_1,n_1}^{B,q}$ is the minimum-distance block sequence of the first image block in the q-th image of the image target database that is most similar to $x_{m,n}^{A}$, 1 ≤ k' ≤ K, and K is the total number of rows of second image blocks contained in the foreground region of the image to be identified.
Extracting the minimum-distance block sequences of the image blocks in every image of the image target database and of the second image blocks in the image to be identified imposes a spatial constraint on the images; this spatial constraint remedies the defect of purely local matching, and the constraint weight obtained from it gives each image block and each second image block global constraint information during matching.
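A corresponding sketch of the spatial constraint weight, computed from the two minimum-distance block sequences, is given below; delta3 defaults to the value given above, and comparing only the common prefix when the two sequences have different lengths is an assumption not spelled out in the patent.

```python
import numpy as np

def constraint_weight(d_probe, d_gallery, delta3=2.8):
    """w = exp(-(sum_k (d_probe[k] - d_gallery[k]))^2 / (2 * delta3^2))."""
    k = min(len(d_probe), len(d_gallery))            # compare the common prefix only
    diff = float(np.sum(np.asarray(d_probe[:k]) - np.asarray(d_gallery[:k])))
    return np.exp(-diff ** 2 / (2.0 * delta3 ** 2))
```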
⑥ From the matching score of each second image block in the foreground region of the image to be identified against every image in the image target database and the spatial constraint weights of all second image blocks in the foreground region of the image to be identified with respect to every image in the image target database, obtain the matching score of the image to be identified with respect to every image in the image target database. Denote the matching score of the image to be identified with respect to the q-th image in the image target database as $Score(I^{A},I_q^{B})=\sum_{m=1,n=1}^{m=K,\,n=L}\left\{score(x_{m,n}^{A},x_{m_1,n_1}^{B,q})\cdot w(x_{m,n}^{A},x_{m_1,n_1}^{B,q})\right\}$. The larger the value of $Score(I^{A},I_q^{B})$, the more likely the people in the two images are the same person; $I^{A}$ is the image to be identified, $I_q^{B}$ is the q-th image in the image target database, and L is the total number of columns of second image blocks contained in the foreground region of the image to be identified.
⑦ Sort the images in the image target database in descending order of the matching score of the image to be identified with respect to each of them to obtain an image sequence, and identify the human body in the image to be identified as the human body in the first image of that sequence.
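Putting the pieces together, the image-level matching score and the final ranking of step ⑦ could look like the sketch below, which reuses the helper functions from the earlier sketches and assumes each image is represented by its per-block feature dictionary and per-block minimum-distance sequences.

```python
def image_score(probe_features, probe_sequences, gallery_features, gallery_sequences, l=2):
    """Score(I_A, I_B_q): sum over probe blocks of matching score times constraint weight."""
    total = 0.0
    for center, x in probe_features.items():
        best = most_similar_block(center, probe_features, gallery_features, l)
        s = matching_score(x, gallery_features[best])
        w = constraint_weight(probe_sequences[center], gallery_sequences[best])
        total += s * w
    return total

def rank_gallery(probe, gallery):
    """Sort gallery image ids by descending Score; the top image gives the identity."""
    scores = {q: image_score(probe["features"], probe["sequences"],
                             g["features"], g["sequences"])
              for q, g in gallery.items()}
    return sorted(scores, key=scores.get, reverse=True)
```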
In the present embodiment, in the process of finding in step 4 the most similar block of each second image block in the foreground region of the image to be identified: if the search is restricted strictly to the same row of the image in the image target database, the nearest-neighbour node algorithm is applied in that manner to find the most similar block in the foreground region of every image in the image target database. For $x_{m,n}^{A}$, the most similar block is sought among all first image blocks in row m of the foreground region of every image in the image target database. Suppose that, among all first image blocks in row m of the foreground region of the q-th image in the image target database, the most similar block is the first image block whose central pixel has coordinate position (m, n_1) in that image; the most similar block of $x_{m,n}^{A}$ among all first image blocks in row m of the foreground region of the q-th image is then denoted $x_{m,n_1}^{B,q}$, with $(m, n_1) \subseteq F$.
Then, according to the feature vector of each second image block in the foreground region of the image to be identified and the feature vector of its most similar block in the foreground region of every image in the image target database, obtain the matching score of each second image block in every image in the image target database. For $x_{m,n}^{A}$, its matching score in the q-th image of the image target database is denoted $score(x_{m,n}^{A}, x_{m,n_1}^{B,q})$, where $score(x_{m,n}^{A}, x_{m,n_1}^{B,q}) = e^{t}$, e denotes the natural constant, $t = -\{\|x_{m,n}^{A} - x_{m,n_1}^{B,q}\|_2\}^2 / (\sigma\delta^2)$, σ is a constant and δ is a constant that describes the bandwidth of t (in the specific implementation σ = 1 and δ = 2.8 are taken), $x_{m,n}^{A}$ denotes the feature vector of the second image block whose central pixel has coordinate position (m, n) in the image to be identified, and $x_{m,n_1}^{B,q}$ denotes the feature vector of its most similar first image block in the q-th image.
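A minimal numpy sketch of this per-block matching score, assuming the two block feature vectors are already available (the function name is illustrative only, not from the patent):

import numpy as np

def block_matching_score(x_a, x_b, sigma=1.0, delta=2.8):
    """score = exp(t) with t = -||x_a - x_b||_2**2 / (sigma * delta**2):
    a Gaussian-kernel similarity between two block feature vectors,
    taking sigma = 1 and delta = 2.8 as in the embodiment."""
    dist = float(np.linalg.norm(np.asarray(x_a) - np.asarray(x_b)))
    t = -(dist ** 2) / (sigma * delta ** 2)
    return float(np.exp(t))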
Likewise, under the same-row search in the images of the image target database, the constraint weight of $x_{m,n}^{A}$ in step 5 with respect to the q-th image in the image target database is denoted $w(x_{m,n}^{A}, x_{m,n_1}^{B,q})$, where $w(x_{m,n}^{A}, x_{m,n_1}^{B,q}) = e^{f}$, e denotes the natural constant, $f = -\{\sum_{k'=1}^{K}(d_{m,n,k'}^{A} - d_{m,n_1,k'}^{B,q})\}^2 / (2\delta_1^2)$, $\delta_1$ is a constant that describes the bandwidth of f (in the specific implementation $\delta_1 = 2.8$ is taken), $d_{m,n,k'}^{A}$ denotes the k'-th element of $d_{m,n}^{A}$, $d_{m,n_1,k'}^{B,q}$ denotes the k'-th element of $d_{m,n_1}^{B,q}$, and $d_{m,n_1}^{B,q}$ denotes the minimum-distance block sequence of $x_{m,n_1}^{B,q}$, the block most similar to $x_{m,n}^{A}$ in the q-th image of the image target database, with 1 ≤ k' ≤ K, where K denotes the total number of rows of second image blocks in the foreground region of the image to be identified.
Under the same-row search in the images of the image target database, $Score(I^{A}, I_{q}^{B})$ in step 6 is written $Score(I^{A}, I_{q}^{B}) = \sum_{m=1,n=1}^{m=K,n=L}\{score(x_{m,n}^{A}, x_{m,n_1}^{B,q}) \cdot w(x_{m,n}^{A}, x_{m,n_1}^{B,q})\}$, where $I^{A}$ denotes the image to be identified and $I_{q}^{B}$ denotes the q-th image in the image target database. This same-row search, however, cannot cope well with spatial variation; the inventive method therefore adopts a search that allows a small fluctuation range in the vertical direction when finding the most similar block of each second image block in the foreground region of the image to be identified.
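The difference between the strict same-row search and the search actually adopted (a fluctuation of at most l rows in the vertical direction) can be sketched as follows; block feature vectors are assumed to be stored in a dictionary keyed by the (row, column) centre coordinates of each first image block, and all names are illustrative only:

import numpy as np

def most_similar_block(x_a, gallery_blocks, m, l=0):
    """Return the (m1, n1) centre coordinates of the first image block most
    similar (smallest 2-norm feature distance) to the probe block feature x_a.

    l = 0 reproduces the strict same-row search discussed above;
    l > 0 allows the vertical fluctuation range m-l .. m+l used by the method."""
    candidates = {pos: feat for pos, feat in gallery_blocks.items()
                  if m - l <= pos[0] <= m + l}
    return min(candidates,
               key=lambda pos: float(np.linalg.norm(x_a - candidates[pos])))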
To better illustrate the significant effect of the global constraint of the inventive method, the method was tested on the images of the VIPeR database and the CUHK02 database and compared with existing human body identification methods. 316 pedestrians were chosen at random from the VIPeR database (613 pedestrians in total) and 486 pedestrians were chosen at random from the CUHK02 database (971 pedestrians in total) as the sets to be tested, and comparison experiments were carried out on each; the comparison results are shown in Fig. 5 and Fig. 6, in which the correct matching ratio denotes the proportion of images correctly identified within the first n ranked (Rank) images. For example, the coordinate position (10, 52) in Fig. 5 means that when the 316 images to be identified from the CUHK02 database are matched by the inventive method against all images in the image target database and the images in the image target database are ranked by matching score, taking the images ranked in the top 10 of the image target database as the matching images of an image to be identified, 52% of the 316 images to be identified are correctly identified. The coordinate position (5, 47) in Fig. 6 means that when the 485 images to be identified from the VIPeR database are matched by the inventive method against all images in the image target database and the images in the image target database are ranked by matching score, taking the images ranked in the top 5 of the image target database as the matching images of an image to be identified, 46% of the 485 images to be identified are correctly identified. In Fig. 5 and Fig. 6, Our is the experimental result of the inventive method; SDALF is the experimental result of the method proposed by M. Farenzena et al. (M. Farenzena, L. Bazzani, A. Perina, V. Murino, and M. Cristani, Person re-identification by symmetry-driven accumulation of local features, CVPR 2010); ELF is the experimental result of the method proposed by D. Gray and H. Tao (Viewpoint invariant pedestrian recognition with an ensemble of localized features, ECCV 2008); and bLDFV is the experimental result of the method proposed by B. Ma, Y. Su, and F. Jurie (Local descriptors encoded by Fisher vectors for person re-identification, 2012). The experimental results shown in Fig. 5 and Fig. 6 show that the global constraint proposed by the inventive method plays a significant role in target human body identification.
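The correct matching ratio plotted in Fig. 5 and Fig. 6 is the usual cumulative match measure: the fraction of images to be identified whose true match appears among the top n ranked gallery images. A minimal sketch of how such a rank-n value is computed (names are ours, not the patent's):

def rank_n_accuracy(rankings, ground_truth, n):
    """Fraction of probes whose correct gallery identity is ranked within the top n.

    rankings[i] is the list of gallery identities for probe i, sorted by
    matching score in descending order; ground_truth[i] is probe i's true identity."""
    hits = sum(1 for ranked, gt in zip(rankings, ground_truth) if gt in ranked[:n])
    return hits / len(ground_truth)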

Claims (10)

1. A target human body identification method based on spatial constraint, characterized in that it comprises the following steps:
1. Suppose there is a camera A; all images captured by camera A form an image target database. Extract the foreground region of every image in the image target database, the foreground region of each image in the image target database being the human body in that image; then divide the foreground region of every image in the image target database into first image blocks of equal size that overlap one another, and extract the feature vector of each first image block in the foreground region of every image in the image target database;
2. Suppose there is a camera B; take an image captured by camera B in real time as the image to be identified. Extract the foreground region of the image to be identified, the foreground region being the human body in the image to be identified; then divide the foreground region of the image to be identified into second image blocks of equal size that overlap one another, and extract the feature vector of each second image block in the foreground region of the image to be identified;
3. According to the feature vector of each first image block in the foreground region of every image in the image target database, obtain the minimum-distance block sequence of each first image block in the foreground region of every image in the image target database. The minimum-distance block sequence of the first image block whose central pixel has coordinate position (i, j) in the q-th image of the image target database is denoted $d_{i,j}^{B,q}$, where $d_{i,j}^{B,q} = \{d_{i,j,1}^{B,q}, d_{i,j,2}^{B,q}, \ldots, d_{i,j,k}^{B,q}, \ldots, d_{i,j,M}^{B,q}\}$, 1 ≤ q ≤ Q, Q denotes the number of images in the image target database, (i, j) denotes the coordinate position of the central pixel of the first image block in the q-th image, $(i, j) \subseteq F$, $\subseteq$ is the belonging-to symbol, F denotes the set of coordinate positions of all pixels in the foreground region of the q-th image, 1 ≤ k ≤ M, M denotes the total number of rows of first image blocks in the foreground region of the q-th image, $d_{i,j,k}^{B,q} = \min\{d(x_{i,j}^{B,q}, x_{k,1}^{B,q}), \ldots, d(x_{i,j}^{B,q}, x_{k,j'}^{B,q}), \ldots, d(x_{i,j}^{B,q}, x_{k,N'}^{B,q})\}$, min{} denotes the minimum-value function, 1 ≤ j' ≤ N', N' denotes the total number of first image blocks in the k-th row of first image blocks in the foreground region of the q-th image, $d(x_{i,j}^{B,q}, x_{k,j'}^{B,q}) = \|x_{i,j}^{B,q} - x_{k,j'}^{B,q}\|_2$, $\|\cdot\|_2$ denotes the 2-norm symbol, $x_{i,j}^{B,q}$ denotes the feature vector of the first image block whose central pixel has coordinate position (i, j) in the q-th image, $x_{k,j'}^{B,q}$ denotes the feature vector of the first image block whose central pixel has coordinate position (k, j') in the q-th image, $(k, j') \subseteq F$;
Then, according to the feature vector of each second image block in the foreground region of the image to be identified, obtain the minimum-distance block sequence of each second image block in the foreground region of the image to be identified. The minimum-distance block sequence of the second image block whose central pixel has coordinate position (m, n) in the image to be identified is denoted $d_{m,n}^{A}$, where $d_{m,n}^{A} = \{d_{m,n,1}^{A}, d_{m,n,2}^{A}, \ldots, d_{m,n,k'}^{A}, \ldots, d_{m,n,K}^{A}\}$, (m, n) denotes the coordinate position of the central pixel of the second image block in the image to be identified, $(m, n) \subseteq F'$, F' denotes the set of coordinate positions of all pixels in the foreground region of the image to be identified, 1 ≤ k' ≤ K, K denotes the total number of rows of second image blocks in the foreground region of the image to be identified, $d_{m,n,k'}^{A} = \min\{d(x_{m,n}^{A}, x_{k',1}^{A}), \ldots, d(x_{m,n}^{A}, x_{k',n'}^{A}), \ldots, d(x_{m,n}^{A}, x_{k',L'}^{A})\}$, 1 ≤ n' ≤ L', L' denotes the total number of second image blocks in the k'-th row of second image blocks in the foreground region of the image to be identified, $d(x_{m,n}^{A}, x_{k',n'}^{A}) = \|x_{m,n}^{A} - x_{k',n'}^{A}\|_2$, $x_{m,n}^{A}$ denotes the feature vector of the second image block whose central pixel has coordinate position (m, n) in the image to be identified, $x_{k',n'}^{A}$ denotes the feature vector of the second image block whose central pixel has coordinate position (k', n') in the image to be identified, $(k', n') \subseteq F'$;
4. Adopt the nearest-neighbour node algorithm to find, for each second image block in the foreground region of the image to be identified, its most similar block in the foreground region of every image in the image target database. For $x_{m,n}^{A}$, find its most similar block among all first image blocks in rows m−l to m+l of the foreground region of every image in the image target database; suppose that, among all first image blocks in rows m−l to m+l of the foreground region of the q-th image in the image target database, the most similar block found is the first image block whose central pixel has coordinate position (m_1, n_1) in the q-th image, and denote the most similar block of $x_{m,n}^{A}$ among all first image blocks in rows m−l to m+l of the foreground region of the q-th image as $x_{m_1,n_1}^{B,q}$, where m−l ≤ m_1 ≤ m+l, 1 ≤ l ≤ m, $(m_1, n_1) \subseteq F$;
Then, according to the feature vector of each second image block in the foreground region of the image to be identified and the feature vector of its most similar block in the foreground region of every image in the image target database, obtain the matching score of each second image block in the foreground region of the image to be identified in every image in the image target database. For $x_{m,n}^{A}$, denote its matching score in the q-th image of the image target database as $score(x_{m,n}^{A}, x_{m_1,n_1}^{B,q})$, where $score(x_{m,n}^{A}, x_{m_1,n_1}^{B,q}) = e^{t_1}$, e denotes the natural constant, $t_1 = -\{\|x_{m,n}^{A} - x_{m_1,n_1}^{B,q}\|_2\}^2 / (\sigma\delta_2^2)$, σ is a constant, $\delta_2$ is a constant that describes the bandwidth of $t_1$, $x_{m,n}^{A}$ denotes the feature vector of the second image block whose central pixel has coordinate position (m, n) in the image to be identified, and $x_{m_1,n_1}^{B,q}$ denotes the feature vector of the first image block whose central pixel has coordinate position (m_1, n_1) in the q-th image;
5. According to the minimum-distance block sequences of all second image blocks in the foreground region of the image to be identified and the minimum-distance block sequences of all first image blocks in the foreground region of every image in the image target database, obtain the constraint weight of the spatial constraint of all second image blocks in the foreground region of the image to be identified with respect to every image in the image target database. For $x_{m,n}^{A}$, denote its constraint weight of the spatial constraint with respect to the q-th image in the image target database as $w(x_{m,n}^{A}, x_{m_1,n_1}^{B,q})$, where $w(x_{m,n}^{A}, x_{m_1,n_1}^{B,q}) = e^{f_1}$, e denotes the natural constant, $f_1 = -\{\sum_{k'=1}^{K}(d_{m,n,k'}^{A} - d_{m_1,n_1,k'}^{B,q})\}^2 / (2\delta_3^2)$, $\delta_3$ is a constant that describes the bandwidth of $f_1$, $d_{m,n,k'}^{A}$ denotes the k'-th element of $d_{m,n}^{A}$, $d_{m_1,n_1,k'}^{B,q}$ denotes the k'-th element of $d_{m_1,n_1}^{B,q}$, $d_{m_1,n_1}^{B,q}$ denotes the minimum-distance block sequence of $x_{m_1,n_1}^{B,q}$, the block most similar to $x_{m,n}^{A}$ in the q-th image of the image target database, 1 ≤ k' ≤ K, and K denotes the total number of rows of second image blocks in the foreground region of the image to be identified;
6. According to the matching score of each second image block in the foreground region of the image to be identified in every image in the image target database and the constraint weights of the spatial constraint of all second image blocks in the foreground region of the image to be identified with respect to every image in the image target database, obtain the matching score of the image to be identified with respect to every image in the image target database. Denote the matching score of the image to be identified with respect to the q-th image in the image target database as $Score(I^{A}, I_{q}^{B})$, where $Score(I^{A}, I_{q}^{B}) = \sum_{m=1,n=1}^{m=K,n=L}\{score(x_{m,n}^{A}, x_{m_1,n_1}^{B,q}) \cdot w(x_{m,n}^{A}, x_{m_1,n_1}^{B,q})\}$, $I^{A}$ denotes the image to be identified, and $I_{q}^{B}$ denotes the q-th image in the image target database;
7. Sort the images in the image target database in descending order of the matching score of the image to be identified with respect to each of them to obtain an image sequence, and identify the human body in the image to be identified as the human body in the first image of the image sequence.
2. The target human body identification method based on spatial constraint according to claim 1, characterized in that the detailed process of step 1 is:
1.-1. Suppose there is a camera A; all images captured by camera A form the image target database;
1.-2. Define the q-th image currently to be processed in the image target database as the current image, where 1 ≤ q ≤ Q and Q denotes the number of images in the image target database;
1.-3. Equalize the V channel among the HSV channels of the current image, then extract the foreground region of the current image by pose estimation, the foreground region of the current image being the human body in the current image; then divide the foreground region of the current image into a plurality of first image blocks of equal size that overlap one another, and define the first image block currently to be processed as the current first image block;
1.-4. Extract the LAB colour feature of the current first image block, then evenly divide the current first image block into a plurality of non-overlapping first image unit cells, quantize the local gradient of the current first image block into eight directions according to all the first image unit cells, and extract the SIFT feature of the current first image block according to its eight directions; finally, obtain the feature vector of the current first image block from its LAB colour feature and its SIFT feature, denoted $c_q$, where $c_q = \begin{bmatrix} a_q \\ b_q \end{bmatrix}$, $a_q$ is a 288-dimensional vector, namely the LAB colour feature obtained by sampling each of the three colour channels of the current first image block three times and then extracting a 32-dimensional colour histogram for each colour channel (3 × 3 × 32 = 288), and $b_q$ is a 384-dimensional vector, namely the SIFT feature obtained by extracting a 16-dimensional feature in each of the eight directions of the current first image block on each of its three colour channels (3 × 8 × 16 = 384);
1.-5. Take the next first image block to be processed as the current first image block, then return to step 1.-4 and continue until the feature vectors of all first image blocks in the foreground region of the current image have been obtained;
1.-6. Let q = q + 1, take the next image to be processed in the image target database as the current image, then return to step 1.-3 and continue until all images in the image target database have been processed.
3. The target human body identification method based on spatial constraint according to claim 2, characterized in that the process of dividing the foreground region of the current image into a plurality of first image blocks of equal size that overlap one another in step 1.-3 is: move a sliding window of size 10 × 10 over the current image in steps of 4 pixels, dividing the current image into a plurality of first image blocks of size 10 × 10.
4. The target human body identification method based on spatial constraint according to claim 3, characterized in that the size of the first image unit cells in step 1.-4 is 4 × 4.
5. The target human body identification method based on spatial constraint according to claim 4, characterized in that the detailed process of step 2 is:
2.-1. Suppose there is a camera B; take an image captured by camera B in real time as the image to be identified, then equalize the V channel among the HSV channels of the image to be identified, then extract the foreground region of the image to be identified by pose estimation, the foreground region of the image to be identified being the human body in the image to be identified; then divide the foreground region of the image to be identified into second image blocks of equal size that overlap one another, and define the second image block currently to be processed as the current second image block;
2.-2. Extract the LAB colour feature of the current second image block, then evenly divide the current second image block into a plurality of non-overlapping second image unit cells, quantize the local gradient of the current second image block into eight directions according to all the second image unit cells, and extract the SIFT feature of the current second image block according to its eight directions; then, from the LAB colour feature and the SIFT feature of the current second image block, obtain the feature vector of the current second image block, denoted c, where $c = \begin{bmatrix} a \\ b \end{bmatrix}$, a is a 288-dimensional vector, namely the LAB colour feature obtained by sampling each of the three colour channels of the current second image block three times and then extracting a 32-dimensional colour histogram for each colour channel, and b is a 384-dimensional vector, namely the SIFT feature obtained by extracting a 16-dimensional feature in each of the eight directions of the current second image block on each of its three colour channels;
2.-3. Take the next second image block to be processed as the current second image block, then return to step 2.-2 and continue until the feature vectors of all second image blocks in the foreground region of the image to be identified have been obtained.
6. The target human body identification method based on spatial constraint according to claim 5, characterized in that the process of dividing the foreground region of the image to be identified into a plurality of second image blocks of equal size that overlap one another in step 2.-1 is: move a sliding window of size 10 × 10 over the image to be identified in steps of 4 pixels, dividing the image to be identified into a plurality of second image blocks of size 10 × 10.
7. The target human body identification method based on spatial constraint according to claim 6, characterized in that the size of the second image unit cells in step 2.-2 is 4 × 4.
8. The target human body identification method based on spatial constraint according to claim 7, characterized in that, in the process of adopting the nearest-neighbour node algorithm in step 4 to find the most similar block of each second image block in the foreground region of the image to be identified in the foreground region of every image in the image target database: for $x_{m,n}^{A}$, its most similar block is found among all first image blocks in row m of the foreground region of every image in the image target database; suppose that, among all first image blocks in row m of the foreground region of the q-th image in the image target database, the most similar block found is the first image block whose central pixel has coordinate position (m, n_1) in the q-th image, and denote the most similar block of $x_{m,n}^{A}$ among all first image blocks in row m of the foreground region of the q-th image as $x_{m,n_1}^{B,q}$, with $(m, n_1) \subseteq F$;
Then, according to the feature vector of each second image block in the foreground region of the image to be identified and the feature vector of its most similar block in the foreground region of every image in the image target database, obtain the matching score of each second image block in the foreground region of the image to be identified in every image in the image target database. For $x_{m,n}^{A}$, denote its matching score in the q-th image of the image target database as $score(x_{m,n}^{A}, x_{m,n_1}^{B,q})$, where $score(x_{m,n}^{A}, x_{m,n_1}^{B,q}) = e^{t}$, e denotes the natural constant, $t = -\{\|x_{m,n}^{A} - x_{m,n_1}^{B,q}\|_2\}^2 / (\sigma\delta^2)$, σ is a constant, δ is a constant that describes the bandwidth of t, $x_{m,n}^{A}$ denotes the feature vector of the second image block whose central pixel has coordinate position (m, n) in the image to be identified, and $x_{m,n_1}^{B,q}$ denotes the feature vector of the first image block whose central pixel has coordinate position (m, n_1) in the q-th image.
9. The target human body identification method based on spatial constraint according to claim 8, characterized in that, in step 5, the constraint weight of $x_{m,n}^{A}$ of the spatial constraint with respect to the q-th image in the image target database is denoted $w(x_{m,n}^{A}, x_{m,n_1}^{B,q})$, where $w(x_{m,n}^{A}, x_{m,n_1}^{B,q}) = e^{f}$, e denotes the natural constant, $f = -\{\sum_{k'=1}^{K}(d_{m,n,k'}^{A} - d_{m,n_1,k'}^{B,q})\}^2 / (2\delta_1^2)$, $\delta_1$ is a constant that describes the bandwidth of f, $d_{m,n,k'}^{A}$ denotes the k'-th element of $d_{m,n}^{A}$, $d_{m,n_1,k'}^{B,q}$ denotes the k'-th element of $d_{m,n_1}^{B,q}$, $d_{m,n_1}^{B,q}$ denotes the minimum-distance block sequence of $x_{m,n_1}^{B,q}$, the block most similar to $x_{m,n}^{A}$ in the q-th image of the image target database, 1 ≤ k' ≤ K, and K denotes the total number of rows of second image blocks in the foreground region of the image to be identified.
10. The target human body identification method based on spatial constraint according to claim 9, characterized in that, in step 6, $Score(I^{A}, I_{q}^{B})$ is written $Score(I^{A}, I_{q}^{B}) = \sum_{m=1,n=1}^{m=K,n=L}\{score(x_{m,n}^{A}, x_{m,n_1}^{B,q}) \cdot w(x_{m,n}^{A}, x_{m,n_1}^{B,q})\}$, where $I^{A}$ denotes the image to be identified and $I_{q}^{B}$ denotes the q-th image in the image target database.
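To make the block geometry and the minimum-distance block sequence of the claims easier to follow, the following Python sketch combines the two ideas. It is only an illustration under our own assumptions: the function names are not from the patent, the feature extractor is a placeholder, and the block feature vectors of one image are assumed to be stored in an array of shape (rows, columns, dimension).

import numpy as np

BLOCK, STRIDE = 10, 4   # 10 x 10 blocks moved every 4 pixels (claims 3 and 6); 4 x 4 unit cells per claims 4 and 7

def split_into_blocks(image):
    """Divide an H x W x 3 image into overlapping BLOCK x BLOCK blocks and
    return the centre coordinates of each block together with its pixels."""
    h, w = image.shape[:2]
    blocks = []
    for top in range(0, h - BLOCK + 1, STRIDE):
        for left in range(0, w - BLOCK + 1, STRIDE):
            centre = (top + BLOCK // 2, left + BLOCK // 2)
            blocks.append((centre, image[top:top + BLOCK, left:left + BLOCK]))
    return blocks

def block_feature(block):
    """Placeholder for the feature vector c = [a; b] of claims 2 and 5:
    a is the 288-dim LAB colour part, b the 384-dim SIFT part."""
    a = np.zeros(288)   # 3 channels x 3 samplings x 32-bin colour histograms
    b = np.zeros(384)   # 3 channels x 8 directions x 16-dim SIFT features
    return np.concatenate([a, b])

def min_distance_sequence(features, i, j):
    """Minimum-distance block sequence d_{i,j} of step 3 in claim 1.

    features has shape (M, N, dim): M rows of blocks, N blocks per row.
    Element k of the result is the smallest 2-norm distance between the
    block centred at (i, j) and any block in row k of the same image."""
    dists = np.linalg.norm(features - features[i, j], axis=2)   # (M, N) distance matrix
    return dists.min(axis=1)                                    # per-row minima, length M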
CN201310636477.4A 2013-12-02 2013-12-02 A kind of recognition method for target human body based on space constraint Active CN103679142B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310636477.4A CN103679142B (en) 2013-12-02 2013-12-02 A kind of recognition method for target human body based on space constraint

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310636477.4A CN103679142B (en) 2013-12-02 2013-12-02 A kind of recognition method for target human body based on space constraint

Publications (2)

Publication Number Publication Date
CN103679142A true CN103679142A (en) 2014-03-26
CN103679142B CN103679142B (en) 2016-09-07

Family

ID=50316632

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310636477.4A Active CN103679142B (en) 2013-12-02 2013-12-02 A kind of recognition method for target human body based on space constraint

Country Status (1)

Country Link
CN (1) CN103679142B (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103955678A (en) * 2014-05-13 2014-07-30 深圳市同洲电子股份有限公司 Image recognition method and device
CN104794451A (en) * 2015-04-28 2015-07-22 上海交通大学 Human identification method based on block matching structure
CN104850857A (en) * 2015-05-15 2015-08-19 浙江大学 Trans-camera pedestrian target matching method based on visual space significant constraints
CN106504264A (en) * 2016-10-27 2017-03-15 锐捷网络股份有限公司 Video foreground image extraction method and device
CN106980864A (en) * 2017-03-31 2017-07-25 合肥工业大学 A kind of pedestrian's recognition methods again based on support sample indirect type
CN107316031A (en) * 2017-07-04 2017-11-03 北京大学深圳研究生院 The image characteristic extracting method recognized again for pedestrian
CN107871327A (en) * 2017-10-23 2018-04-03 武汉大学 The monocular camera pose estimation of feature based dotted line and optimization method and system
CN108305299A (en) * 2017-01-13 2018-07-20 谷歌有限责任公司 Use the composition image in machine learning detection composograph
CN108475072A (en) * 2017-04-28 2018-08-31 深圳市大疆创新科技有限公司 A kind of tracking and controlling method, device and aircraft
CN108694347A (en) * 2017-04-06 2018-10-23 北京旷视科技有限公司 Image processing method and device
CN109255043A (en) * 2018-09-14 2019-01-22 哈尔滨工业大学 Image search method based on scene understanding
CN109784261A (en) * 2019-01-09 2019-05-21 深圳市烨嘉为技术有限公司 Pedestrian's segmentation and recognition methods based on machine vision
CN113378881A (en) * 2021-05-11 2021-09-10 广西电网有限责任公司电力科学研究院 Instruction set identification method and device based on information entropy gain SVM model

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100290677A1 (en) * 2009-05-13 2010-11-18 John Kwan Facial and/or Body Recognition with Improved Accuracy
CN102521609A (en) * 2011-12-02 2012-06-27 湖南大学 Near-infrared and visible light face image recognition method based on distributed compression sensing theory
CN102609686A (en) * 2012-01-19 2012-07-25 宁波大学 Pedestrian detection method
CN102663411A (en) * 2012-02-29 2012-09-12 宁波大学 Recognition method for target human body

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
RUI ZHAO等: "Unsupervised Salience Learning for Person Re-identification", 《2013 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION》 *

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103955678A (en) * 2014-05-13 2014-07-30 深圳市同洲电子股份有限公司 Image recognition method and device
CN104794451A (en) * 2015-04-28 2015-07-22 上海交通大学 Human identification method based on block matching structure
CN104794451B (en) * 2015-04-28 2018-01-02 上海交通大学 Pedestrian's comparison method based on divided-fit surface structure
CN104850857A (en) * 2015-05-15 2015-08-19 浙江大学 Trans-camera pedestrian target matching method based on visual space significant constraints
CN104850857B (en) * 2015-05-15 2017-12-05 浙江大学 Across the video camera pedestrian target matching process of view-based access control model spatial saliency constraint
CN106504264A (en) * 2016-10-27 2017-03-15 锐捷网络股份有限公司 Video foreground image extraction method and device
CN106504264B (en) * 2016-10-27 2019-09-20 锐捷网络股份有限公司 Video foreground image extraction method and device
US11093751B2 (en) 2017-01-13 2021-08-17 Google Llc Using machine learning to detect which part of the screen includes embedded frames of an uploaded video
US11829854B2 (en) 2017-01-13 2023-11-28 Google Llc Using machine learning to detect which part of the screen includes embedded frames of an uploaded video
CN108305299A (en) * 2017-01-13 2018-07-20 谷歌有限责任公司 Use the composition image in machine learning detection composograph
CN106980864A (en) * 2017-03-31 2017-07-25 合肥工业大学 A kind of pedestrian's recognition methods again based on support sample indirect type
CN106980864B (en) * 2017-03-31 2019-07-19 合肥工业大学 A kind of pedestrian's recognition methods again based on support sample indirect type
CN108694347A (en) * 2017-04-06 2018-10-23 北京旷视科技有限公司 Image processing method and device
CN108694347B (en) * 2017-04-06 2022-07-12 北京旷视科技有限公司 Image processing method and device
CN108475072A (en) * 2017-04-28 2018-08-31 深圳市大疆创新科技有限公司 A kind of tracking and controlling method, device and aircraft
US11587355B2 (en) 2017-04-28 2023-02-21 SZ DJI Technology Co., Ltd. Tracking control method, device, and aircraft
CN107316031A (en) * 2017-07-04 2017-11-03 北京大学深圳研究生院 The image characteristic extracting method recognized again for pedestrian
CN107316031B (en) * 2017-07-04 2020-07-10 北京大学深圳研究生院 Image feature extraction method for pedestrian re-identification
CN107871327A (en) * 2017-10-23 2018-04-03 武汉大学 The monocular camera pose estimation of feature based dotted line and optimization method and system
CN109255043A (en) * 2018-09-14 2019-01-22 哈尔滨工业大学 Image search method based on scene understanding
CN109255043B (en) * 2018-09-14 2020-08-11 哈尔滨工业大学 Image retrieval method based on scene understanding
CN109784261B (en) * 2019-01-09 2020-08-21 深圳市烨嘉为技术有限公司 Pedestrian segmentation and identification method based on machine vision
CN109784261A (en) * 2019-01-09 2019-05-21 深圳市烨嘉为技术有限公司 Pedestrian's segmentation and recognition methods based on machine vision
CN113378881A (en) * 2021-05-11 2021-09-10 广西电网有限责任公司电力科学研究院 Instruction set identification method and device based on information entropy gain SVM model
CN113378881B (en) * 2021-05-11 2022-06-21 广西电网有限责任公司电力科学研究院 Instruction set identification method and device based on information entropy gain SVM model

Also Published As

Publication number Publication date
CN103679142B (en) 2016-09-07

Similar Documents

Publication Publication Date Title
CN103679142A (en) Target human body identification method based on spatial constraint
He et al. Multi-scale FCN with cascaded instance aware segmentation for arbitrary oriented word spotting in the wild
Zhu et al. Orientation robust object detection in aerial images using deep convolutional neural network
Bregonzio et al. Fusing appearance and distribution information of interest points for action recognition
Ouyang et al. Single-pedestrian detection aided by multi-pedestrian detection
Jiang et al. Recognizing human actions by learning and matching shape-motion prototype trees
Gomez-Ojeda et al. Training a convolutional neural network for appearance-invariant place recognition
Zhang et al. Learning semantic scene models by object classification and trajectory clustering
Baysal et al. Recognizing human actions using key poses
Saghafi et al. Review of person re‐identification techniques
Carmona et al. Human action recognition by means of subtensor projections and dense trajectories
Xia et al. Loop closure detection for visual SLAM using PCANet features
Huang et al. DeepDiff: Learning deep difference features on human body parts for person re-identification
Zhang et al. Salient object detection by lossless feature reflection
Rahman et al. Fast action recognition using negative space features
Cheng et al. Person re-identification by articulated appearance matching
Shah et al. A novel biomechanics-based approach for person re-identification by generating dense color sift salience features
Estrela et al. Sign language recognition using partial least squares and RGB-D information
Basavaiah et al. Human activity detection and action recognition in videos using convolutional neural networks
Zhang et al. Category modeling from just a single labeling: Use depth information to guide the learning of 2d models
Jiang et al. A unified tree-based framework for joint action localization, recognition and segmentation
Alavi et al. Multi-shot person re-identification via relational stein divergence
Li et al. Human Action Recognition Using Multi-Velocity STIPs and Motion Energy Orientation Histogram.
Tahboub et al. Person re-identification using a patch-based appearance model
Feng et al. The application of scale invariant feature transform fused with shape model in the human face recognition

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20200103

Address after: 315040 room 505, floor 5, building e, No. 655, bachelor Road, Yinzhou District, Ningbo City, Zhejiang Province

Patentee after: Ningbo Skyview Image Technology Co., Ltd.

Address before: 315211 Zhejiang Province, Ningbo Jiangbei District Fenghua Road No. 818

Patentee before: Ningbo University