CN103942563A - Multi-mode pedestrian re-identification technology - Google Patents


Info

Publication number
CN103942563A
Authority
CN
China
Prior art keywords
image
target
similarity
projection
sgn
Prior art date
Legal status
Pending
Application number
CN201410125981.2A
Other languages
Chinese (zh)
Inventor
赵志诚
刘凯
苏菲
赵衍运
庄伯金
Current Assignee
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date
Application filed by Beijing University of Posts and Telecommunications filed Critical Beijing University of Posts and Telecommunications
Priority to CN201410125981.2A priority Critical patent/CN103942563A/en
Publication of CN103942563A publication Critical patent/CN103942563A/en

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a multi-modal pedestrian re-identification method comprising the following steps: first, foreground images containing targets are cropped from a first image and a second image captured by a first camera and a second camera respectively, where the second image corresponds to a known target; second, color features and texture features are extracted from each cropped foreground image and concatenated to form an image feature; third, the image features are input to a hash projection model, which computes the similarity between the targets in the first and second images; fourth, if the computed similarity is greater than a preset threshold, the target in the first image is identified as the known target corresponding to the second image.

Description

A multi-modal pedestrian re-identification method
Technical field
The invention belongs to the field of image pattern recognition, and in particular relates to a pedestrian re-identification method based on anchor nodes and multi-modal hash projection.
Background technology
With the emergence of computer vision techniques and the rapid growth of computing power, modern video surveillance systems have developed quickly. At the same time, as surveillance applications diversify, a single camera cannot meet the requirements of monitoring a large-scale scene because of its limited field of view, so using multiple cameras to monitor large areas has become an important trend in video surveillance. When the fields of view of multiple cameras overlap, camera calibration and the spatio-temporal information of targets can be exploited jointly. When the fields of view do not overlap, however, pedestrians pass through temporal and spatial "blind areas" between camera views, and to guarantee the continuity of pedestrian tracking, the identities of targets appearing in different views must be verified for consistency.
In large-scale video surveillance systems, verifying that a pedestrian observed at different times, whether under different camera views or within the same camera view, is the same person has become a major problem to be solved. We call this the pedestrian re-identification problem in surveillance video. Current re-identification techniques use the pedestrian's clothing appearance as the basis for judgment, assume that the pedestrian does not change clothes during monitoring, and identify the pedestrian's identity by matching appearance similarity.
At present, pedestrian re-identification techniques fall mainly into the following classes.
Technical scheme (1)
This class of methods tries to extract from the original image features that are both stable and discriminative. Stability means that the feature of the same person should remain the same at different times; discriminativity means that the feature should differ between different people, whether observed at the same time or not. Designing features from the original image that meet both requirements is the key problem in this scheme. A typical example is Symmetry-Driven Accumulation of Local Features [reference 1]: the method first detects the pedestrian's body in the image, then divides the body vertically into head, torso and legs, and horizontally into left and right halves. After the head is removed, the whole body is divided into four parts: left torso, right torso, left leg and right leg. For each part, an HSV color histogram, maximally stable color regions (MSCR) and recurrent image patches are extracted as image features. These three feature vectors are concatenated to form the feature vector of the whole body. This feature extraction method incorporates spatial information.
Shortcomings of technical scheme (1)
First, to guarantee that the extracted features are stable and discriminative, feature design requires manual experience and repeated trial and error.
Second, in real re-identification problems the parameter configurations of different cameras differ, the illumination conditions in each camera's field of view differ, the same person is photographed from different angles in different views, and occlusion also varies. Under such complex capture conditions it is difficult to design a feature that is both stable and discriminative.
Technical scheme (2)
This class of methods no longer focuses on the design of raw image features; instead it projects the raw features by metric learning so that the projected features are stable and discriminative. Suppose the raw features of two images are x ∈ R^d and y ∈ R^d; the direct (Euclidean) distance between the two raw features is

d_E(x, y) = ||x - y||_2^2    (1)

Metric learning tries to find a projection matrix L ∈ R^{d×r}, projects the raw features with this matrix, and computes the Euclidean distance after projection:

d_L(x, y) = ||Lx - Ly||_2^2    (2)

How to obtain a good projection matrix L is the key problem of metric learning. [Reference 2] tries to maximize the probability that the distance between images of the same person is smaller than the distance between images of different people. [Reference 3] adapts the classical large-margin nearest-neighbor metric learning method to the particularities of the re-identification problem. Metric-learning-based methods achieve better performance than technical scheme (1).
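The projected distance of Eq. (2) can be sketched in a few lines of numpy. This is an illustrative sketch, not the patent's implementation, and the function names are ours:

```python
import numpy as np

def euclidean_dist_sq(x, y):
    # Eq. (1): d_E(x, y) = ||x - y||_2^2
    return float(np.sum((x - y) ** 2))

def projected_dist_sq(L, x, y):
    # Eq. (2): d_L(x, y) = ||Lx - Ly||_2^2
    return float(np.sum((L @ x - L @ y) ** 2))
```

With L equal to the identity, the two distances coincide; metric learning seeks an L under which same-person pairs become closer than different-person pairs.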
Shortcomings of technical scheme (2)
Although this scheme applies a reasonable linear projection matrix to the original features so that the projected features are stable and discriminative, the two images to be matched are taken by different cameras, so various differences exist between them: the camera parameter configurations differ, the illumination conditions in each field of view differ, the same person is shot from different angles in different views, and occlusion also varies. These differences mean the two images can be regarded as lying in different modalities. In this case a single metric matrix is not sufficient to project the features of the two modalities into the same space for distance computation.
Summary of the invention
In the pedestrian re-identification problem, images from different cameras lie in different modal spaces, but existing schemes use only a single metric projection matrix and therefore cannot measure similarity across modalities. To overcome this problem, the present invention proposes a multi-modal pedestrian re-identification method based on anchor nodes and hash projection.
The proposed method belongs to the fields of pattern recognition and intelligent surveillance and is applied to the detection and identification of specific pedestrian targets across cameras in a video surveillance network. It combines anchor-node dimensionality reduction, hash projection and cross-modal techniques: first, anchor-node projection reduces the feature dimension; then, different hash functions project the features of images from different cameras into the same Hamming space, forming binary features; finally, similarity is measured in the Hamming space. In a surveillance network, the parameters of different cameras differ, and illumination and external conditions such as occlusion also vary, so images of the same person taken by different cameras have different appearances and lie in different modal spaces. The method effectively overcomes the problem that images from different cameras cannot be matched directly because of these modality differences. Moreover, the XOR computation on binary features effectively improves the real-time performance of the re-identification system. The method also uses anchor-node projection for dimensionality reduction, which, compared with PCA, avoids the singular value decomposition step and reduces the computational cost.
In the present invention, different hash functions are used for images taken by different cameras, projecting the raw features from their different modal spaces into a unified Hamming space, where Hamming distances are computed. This not only improves re-identification performance but also effectively reduces retrieval time and improves the practicality of the system.
According to an embodiment of the invention, a multi-modal target re-identification method is provided, comprising the following steps. Step 1: from a first image and a second image taken by a first camera and a second camera respectively, crop the foreground images containing the targets, where the second image corresponds to a known target. Step 2: extract color features and texture features from each cropped foreground image and concatenate them to form an image feature. Step 3: input the image features into a hash projection model and compute the similarity between the targets in the first and second images. Step 4: if the computed similarity is greater than a predetermined threshold, identify the target in the first image as the known target corresponding to the second image.
The beneficial effects of the invention mainly include the following:
1. Different hash projection functions are used for the features of images from different cameras, so features in different modalities can be mapped into the same Hamming space before distances are computed, improving the recognition performance of the re-identification system.
2. After the raw features are projected into the Hamming space, the real-valued vector distance computation becomes a binary XOR in the Hamming space, which effectively improves the real-time performance of the re-identification system.
3. Anchor-node projection is used for feature dimensionality reduction, avoiding the singular value decomposition required by conventional PCA and lowering the computational complexity of the re-identification system.
Brief description of the drawings
Fig. 1 is a functional block diagram of a pedestrian re-identification system according to an embodiment of the invention;
Fig. 2 is a schematic diagram of image feature extraction according to an embodiment of the invention;
Fig. 3 is a structural block diagram of multi-modal hash projection according to an embodiment of the invention.
Detailed description of the embodiments
Below, the implementation of the technical scheme is described in further detail with reference to the drawings.
Fig. 1 is a functional block diagram of a pedestrian re-identification system according to an embodiment of the invention. The pedestrian re-identification method according to an embodiment of the invention mainly comprises the following steps.
Step 1: locating the pedestrian in the camera image
Pedestrian localization determines the position of the pedestrian in the whole surveillance image; its accuracy has an important impact on the performance of the whole system. The invention applies foreground/background separation for pedestrian localization: a Gaussian mixture model is first used for background modeling, then background subtraction locates the moving foreground (the pedestrian). The final result is a rectangular box containing the pedestrian target, within which the subsequent feature extraction is performed.
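As an illustration of the foreground-localization step, the following sketch substitutes a simple per-pixel median background for the Gaussian mixture model described above; the function name and threshold are illustrative assumptions, not the patent's implementation:

```python
import numpy as np

def locate_foreground(frames, current, thresh=30):
    """Locate the moving foreground in `current` against a background
    estimated from `frames` (a per-pixel median stands in for the
    Gaussian mixture model); returns the bounding rectangle
    (x_min, y_min, x_max, y_max), or None if nothing moves."""
    background = np.median(np.stack(frames).astype(float), axis=0)
    mask = np.abs(current.astype(float) - background) > thresh
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return (xs.min(), ys.min(), xs.max(), ys.max())
```

The returned rectangle plays the role of the pedestrian detection box on which feature extraction is run.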
Step 2: extracting raw features of the pedestrian image
Feature extraction is performed on the rectangular image containing the pedestrian target obtained in step 1, finally yielding a 5895-dimensional image feature. The concrete extraction method is as follows.
Existing methods can be used to extract the raw features of the pedestrian image (rectangular box). For example, bilinear interpolation normalizes the rectangular image to 128×48 pixels, and the normalized image is divided into blocks of 16×24 pixels, with an overlap of 8 pixels between vertically adjacent blocks and an overlap of 12 pixels between horizontally adjacent blocks. The original image is thus divided into 45 (15×3) blocks in total.
For each block, color features and texture features are extracted. The color features cover 9 channels in total (RGB, HSV and YCbCr); the values of each channel are quantized into an 8-bin color histogram representing the block's distribution in color space, giving 9 histograms in all. The texture feature uses local binary patterns (LBP), forming a 59-dimensional texture histogram; LBP is rotation-invariant and invariant to monotonic gray-level changes, and describes local texture well (see [reference 4] for the computation). The feature dimension of each block is therefore 8×9+59=131. Finally, the color and texture features of all blocks are concatenated, giving a whole-image feature of dimension 131×45=5895. Fig. 2 is a schematic diagram of color and texture feature extraction.
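The block layout and per-block dimensionality described above (45 blocks of 131 dimensions, 5895 in total) can be checked with a short sketch. Channel stacking and histogram details are simplified assumptions, and a real LBP histogram would replace the placeholder argument:

```python
import numpy as np

def block_grid(h=128, w=48, bh=16, bw=24, step_y=8, step_x=12):
    # 15 vertical x 3 horizontal positions -> 45 overlapping blocks
    return [(y, x) for y in range(0, h - bh + 1, step_y)
                   for x in range(0, w - bw + 1, step_x)]

def block_feature(block_channels, lbp_hist):
    """block_channels: (16, 24, 9) stack of the RGB+HSV+YCbCr channels
    of one block; an 8-bin histogram per channel (72 dims) is
    concatenated with the 59-dim LBP histogram -> 131 dims per block."""
    hists = [np.histogram(block_channels[..., c], bins=8, range=(0, 256))[0]
             for c in range(9)]
    return np.concatenate(hists + [np.asarray(lbp_hist)])
```

`len(block_grid())` is 45 and each block feature has 131 dimensions, so the concatenated whole-image feature has 45 × 131 = 5895 dimensions as stated above.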
Step 3: dimensionality reduction by anchor-node projection
Because the raw feature dimension of the image (5895) is too high, using it directly would make subsequent operations consume a large amount of computing time. Therefore, before the features are fed into the re-identification system, the raw image features are reduced in dimension, finally yielding 150-dimensional features. The invention uses the anchor-node projection technique [reference 7] for dimensionality reduction. The mathematical expression of anchor-node projection is:

z(x) = [exp(-D^2(x, u_1)/t), ..., exp(-D^2(x, u_m)/t)] / Σ_{j=1}^{m} exp(-D^2(x, u_j)/t)    (3)

where x ∈ R^5895 is the 5895-dimensional raw image feature, u_1, ..., u_m are the m=150 anchor nodes, D(·) is the Euclidean distance, and t is a normalization constant. z(x) ∈ R^150 is the 150-dimensional feature obtained by projecting the raw image feature x, called the anchor-node feature. Anchor-node projection maps the 5895-dimensional raw feature to a 150-dimensional anchor-node feature; since 150 is much smaller than 5895, the projection realizes dimensionality reduction. Whether the anchor nodes are chosen reasonably directly affects the quality of the reduction. In the invention, K-means clustering is run on all raw features with the number of cluster centers set to 150, and the 150 K-means centers are used as the anchor nodes; in this way the anchors are distributed relatively evenly over the whole raw feature space, making the dimensionality reduction robust.
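Eq. (3) maps directly to a few lines of numpy. This is a sketch under the assumption that D²(·,·) is the squared Euclidean distance, with illustrative function names:

```python
import numpy as np

def anchor_projection(x, anchors, t=1.0):
    """Eq. (3): z(x)_j is proportional to exp(-D^2(x, u_j)/t),
    normalised so the m components sum to 1.
    `anchors` is an (m, d) array whose rows are the nodes u_j."""
    d2 = np.sum((anchors - x) ** 2, axis=1)   # squared Euclidean distances
    z = np.exp(-d2 / t)
    return z / z.sum()
```

With m = 150 anchors taken as K-means centres, a 5895-dimensional x is mapped to a 150-dimensional z(x); the component for the nearest anchor receives the largest weight.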
Step 4: measuring the similarity between features
To judge whether two images belong to the same person, the two low-dimensional features corresponding to the images (obtained in step 3) are fed into the hash projection model (whose training process is described below), which outputs the similarity of the two images; whether the pedestrians in the two images have the same identity is judged from this similarity.
Specifically, the invention considers two images from different cameras to lie in different modal spaces. We therefore first project the two low-dimensional features into a unified Hamming space, forming two binary features, then compute the Hamming distance between them, and finally compute the similarity between the features based on this Hamming distance. Fig. 3 is a structural block diagram of the method.
This similarity measurement process is described in detail below.
Hash projection is a technique that projects raw data into a Hamming space using hash functions [reference 5]. Suppose the original data space is X ⊆ R^d, x ∈ X is a data point in that space, H = {-1, +1} is the Hamming space, and h(x) ∈ H is the result of hash-projecting x. The hash function is defined as

h(x) = sgn(p^T x + a) ∈ {-1, +1}    (4)

where sgn(·) ∈ {-1, +1} is the sign function, p^T is the transpose of the projection vector p, and a is a scalar offset.
Because the results of hash projection are the binary values -1 and +1, the similarity of data x and y under hash function h(·) can be defined as

s(x, y) = h(x) h(y)    (5)

The definition of this similarity function s(x, y) assumes that the two images to be compared lie in the same modal space. In the re-identification problem, however, the pictures taken by different cameras lie in different modal spaces, so the above similarity function cannot be used directly for re-identification similarity measurement.
To overcome this problem, the invention proposes a cross-modal hash projection and similarity function. Suppose the image features x and y from the two cameras lie in space X and space Y respectively. Two different hash functions h_X(x) and h_Y(y) project the features of these two spaces:

h_X(x) = sgn(p^T x + a) ∈ {-1, +1}
h_Y(y) = sgn(q^T y + b) ∈ {-1, +1}    (6)

where p^T and q^T are the transposes of the projection vectors p and q respectively, and a and b are scalar offsets.
Through hash projection, the image features x and y are projected into the same Hamming space, and the corresponding similarity function is rewritten as

s(x, y) = h_X(x) h_Y(y)    (7)

A single pair of hash functions h_X(·), h_Y(·) can express only two levels of similarity (s(x, y) = +1 means x and y are similar; s(x, y) = -1 means x and y are dissimilar). To describe the degree of similarity of x and y more finely, the invention introduces, as an example, 50 pairs of hash functions (each pair with its own projection vectors p, q and offsets), and the similarity function of x and y is rewritten as:
s(x, y) = Σ_{l=1}^{50} h_Xl(z(x)) h_Yl(z(y)) = Σ_{l=1}^{50} sgn(p_l^T z(x) + a_l) sgn(q_l^T z(y) + b_l)    (8)

Furthermore, considering that different hash functions contribute differently to the similarity measurement, we set a weight α_l for each pair of hash functions, and formula (8) is further rewritten as:

s(x, y) = Σ_{l=1}^{50} α_l h_Xl(z(x)) h_Yl(z(y)) = Σ_{l=1}^{50} α_l sgn(p_l^T z(x) + a_l) sgn(q_l^T z(y) + b_l)    (9)
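The weighted cross-modal similarity of Eq. (9) can be sketched as follows; the function names and the convention of mapping sgn(0) to +1 are our assumptions:

```python
import numpy as np

def sgn(v):
    # sign function into {-1, +1}; 0 is mapped to +1 by convention
    return np.where(v >= 0, 1, -1)

def cross_modal_similarity(zx, zy, P, Q, a, b, alpha):
    """Eq. (9): s = sum_l alpha_l * sgn(p_l^T z(x) + a_l) * sgn(q_l^T z(y) + b_l).
    P, Q: (L, d) arrays whose rows are p_l, q_l; a, b: (L,) offsets;
    alpha: (L,) per-pair weights."""
    hx = sgn(P @ zx + a)   # binary code of the camera-A feature
    hy = sgn(Q @ zy + b)   # binary code of the camera-B feature
    return float(np.sum(alpha * hx * hy))
```

When the two binary codes agree in every bit, the similarity equals the sum of the weights; disagreeing bits subtract their weight.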
The above process is summarized as follows.
Suppose an image Q is taken by camera A and N images are taken by camera B, and we want to find the image G most similar to Q in order to determine the identity of the pedestrian (or other target) appearing in Q, where each of the N images from camera B corresponds to a known target class (for example, a certain pedestrian's identity). Steps 1-3 are used to obtain the features x and y_1, ..., y_N corresponding to the images, formula (9) is used to compute the similarities s(x, y_i), and formula (10) is used to find the y* most similar to x:

y* = argmax_{y_i} s(x, y_i)    (10)

Afterwards, the class information of the image corresponding to y* (from camera B) is taken as the recognition result.
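The retrieval rule of Eq. (10), combined with the threshold test of step 4, can be sketched generically. The similarity callable passed in here is a stand-in (in the test, a negative squared distance), not the trained hash similarity of Eq. (9); names are illustrative:

```python
import numpy as np

def identify(similarity, x, gallery, threshold):
    """Eq. (10) plus the threshold test: return the index of the most
    similar gallery feature, or None if even the best match falls
    below the threshold (unknown target)."""
    scores = [similarity(x, y) for y in gallery]
    best = int(np.argmax(scores))
    return best if scores[best] > threshold else None
```

In the full system, `similarity` would be the trained cross-modal hash similarity and `gallery` the features y_1, ..., y_N from camera B.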
Below, the parameter training process of the hash projection model is described.
To ensure that formula (9) measures similarity reasonably, its parameters (p_l, q_l, a_l, b_l, α_l) must be set reasonably; therefore training images, in which the pedestrians' identities are known, are used to learn the parameters. Suppose there are 316 pairs of training samples (x_k, y_k) with labels s(x_k, y_k) ∈ {-1, +1} indicating that x_k and y_k belong to the same person (s(x_k, y_k) = +1) or to different people (s(x_k, y_k) = -1). A reasonable cross-modal hash projection function should have the following properties:
1) after projection, the distance between features belonging to different people (with different clothing appearance) is large;
2) after projection, the distance between features belonging to the same person (with the same clothing appearance) is small.
An AdaBoost-based method is used to train the 50 pairs of hash functions. The input of the training process is the 316 pairs of training samples with their labels and the 150 anchor nodes. The whole training process runs 50 iterations; in each iteration, the optimal projection vectors p_l, q_l and offsets a_l, b_l are first determined, then the weight of the hash-function pair is computed, and finally the sample weights are updated (in preparation for the next iteration). The output of the training process is the projection vectors and offsets of the 50 pairs of hash functions together with the corresponding pair weights. The l-th iteration is described as follows.
The objective function is shown in formula (11); the optimal projection vectors p_l, q_l and offsets a_l, b_l are obtained by maximizing it.

Φ_l = Σ_{k=1}^{K} s(x_k, y_k) h_Xl(z(x_k)) h_Yl(z(y_k)) = Σ_{k=1}^{K} s(x_k, y_k) sgn(p_l^T z(x_k) + a_l) sgn(q_l^T z(y_k) + b_l)    (11)
(1) Training {p_l, q_l}
To overcome the difficulty that the sign function brings to optimization, formula (11) is relaxed:

Φ̂_l = Σ_{k=1}^{K} s(x_k, y_k) (p_l^T z(x_k) + a_l)(q_l^T z(y_k) + b_l)
    = Σ_{k=1}^{K} s(x_k, y_k) (p_l^T z̄(x_k))(q_l^T z̄(y_k))
    = Σ_{k=1}^{K} ε_lk (p_l^T z̄(x_k))(q_l^T z̄(y_k))
    = p_l^T (Σ_{k=1}^{K} ε_lk z̄(x_k) z̄^T(y_k)) q_l
    = p_l^T Σ_l q_l    (12)

where ε_lk = s(x_k, y_k), and z̄(x_k), z̄(y_k) are the centered versions of z(x_k), z(y_k). According to [reference 6], p_l and q_l should lie in the eigen-subspaces of the matrix Σ_l. Suppose u_1, ..., u_50 and v_1, ..., v_50 are the first 50 left and right singular vectors of Σ_l respectively; then p_l and q_l can be approximately represented as their linear combinations:

p_l = Σ_{m=1}^{50} ζ_m u_m,    q_l = Σ_{m=1}^{50} ξ_m v_m    (13)

where ζ_m and ξ_m are the linear coefficients of u_m and v_m respectively.
To reduce computational complexity, 3000 pairs of 50-dimensional projection weights (ζ, ξ) are sampled at random, formula (13) is used to obtain the corresponding candidate projection vectors, and the pair that maximizes the objective function is selected as the optimal result:

{p_l*, q_l*} = argmax_{p_l, q_l} Φ̂_l    (14)
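Eqs. (12)-(14) amount to building the matrix Σ_l, restricting p_l and q_l to its leading singular subspaces, and keeping the best of a set of random linear combinations. A small-scale sketch follows, with illustrative sizes and names (the patent uses 50 singular vectors and 3000 random pairs):

```python
import numpy as np

def train_pq(Zx, Zy, eps, n_eig=2, n_trials=200, rng=None):
    """Sketch of Eqs. (12)-(14). Zx, Zy: (K, d) centred anchor features;
    eps: (K,) labels in {-1, +1}. Builds Sigma_l = sum_k eps_k zx_k zy_k^T,
    restricts p, q to the top singular subspaces (Eq. 13) and keeps the
    best of `n_trials` random coefficient pairs (Eq. 14)."""
    if rng is None:
        rng = np.random.default_rng(0)
    Sigma = (Zx * eps[:, None]).T @ Zy                  # (d, d)
    U, _, Vt = np.linalg.svd(Sigma)
    U, V = U[:, :n_eig], Vt[:n_eig].T                   # leading subspaces
    best_obj, best_pq = -np.inf, None
    for _ in range(n_trials):
        p = U @ rng.standard_normal(n_eig)              # Eq. (13)
        q = V @ rng.standard_normal(n_eig)
        obj = float(np.sum(eps * (Zx @ p) * (Zy @ q)))  # relaxed Eq. (12)
        if obj > best_obj:
            best_obj, best_pq = obj, (p, q)
    return best_pq, best_obj
```

The random search over coefficients replaces a closed-form maximization and keeps the sketch short; the returned objective value equals Eq. (12) evaluated at the returned pair.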
(2) Training {a_l, b_l}
After the projection vectors p_l*, q_l* are obtained, the objective function becomes

Φ̄_l = Σ_{k=1}^{K} s(x_k, y_k) sgn(p_l*^T z(x_k) + a_l) sgn(q_l*^T z(y_k) + b_l)    (15)

The combination (a, b) that maximizes the objective function is then found as the optimal offset pair. Specifically, the (a, b) plane is discretized into a uniform 100×100 grid, producing 10000 (a, b) combinations in total; the objective function is computed for each combination, and the combination that maximizes the objective is selected as the optimal offset pair:

{a_l*, b_l*} = argmax_{a_l, b_l} Φ̄_l    (16)
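The offset search of Eqs. (15)-(16) is an exhaustive grid search; a direct sketch follows (the grid range and the function names are assumptions):

```python
import numpy as np

def train_offsets(px, qy, s, n_grid=100, lo=-1.0, hi=1.0):
    """Eqs. (15)-(16): exhaustive search over a uniform n_grid x n_grid
    mesh of (a, b). px = p^T z(x_k) and qy = q^T z(y_k) are the
    pre-computed projections for the K training pairs; s: (K,) labels."""
    grid = np.linspace(lo, hi, n_grid)
    best_obj, best_ab = -np.inf, (lo, lo)
    for a in grid:
        hx = np.where(px + a >= 0, 1, -1)
        for b in grid:
            hy = np.where(qy + b >= 0, 1, -1)
            obj = float(np.sum(s * hx * hy))            # Eq. (15)
            if obj > best_obj:
                best_obj, best_ab = obj, (float(a), float(b))
    return best_ab, best_obj
```

Pre-computing the projections px, qy keeps the 10000-combination loop cheap: each grid point costs only a sign and a sum over the K pairs.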
The above describes the training of the l-th pair of hash projection functions. All hash functions (50 pairs in total) are trained jointly with the AdaBoost method. Throughout this process, for each pair of hash projection functions, sample weights are added to the objective function:

Φ_l = Σ_{k=1}^{K} ω_l(x_k, y_k) s(x_k, y_k) h_Xl(z(x_k)) h_Yl(z(y_k)) = Σ_{k=1}^{K} ω_l(x_k, y_k) s(x_k, y_k) sgn(p_l^T z(x_k) + a_l) sgn(q_l^T z(y_k) + b_l)    (17)

where ω_l(x_k, y_k) is the weight of the k-th sample pair.
(3) Training {α_l}
The weight of each hash-function pair is computed as

α_l = (1/2) ln(1 + Φ_l) - (1/2) ln(1 - Φ_l)    (18)
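Eq. (18) is the standard AdaBoost-style confidence weight; a one-line sketch follows, assuming Φ_l has been normalised into (-1, 1):

```python
import math

def hash_pair_weight(phi):
    """Eq. (18): alpha_l = 1/2 ln(1 + phi) - 1/2 ln(1 - phi),
    defined for phi in (-1, 1); a better-performing pair (larger phi)
    receives a larger weight."""
    return 0.5 * math.log(1.0 + phi) - 0.5 * math.log(1.0 - phi)
```

The weight is 0 for a pair no better than chance (phi = 0) and grows without bound as phi approaches 1, so the weighted sum in Eq. (9) emphasises the most reliable hash-function pairs.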
List of references
1. Michela Farenzena, Loris Bazzani, Alessandro Perina, Vittorio Murino, and Marco Cristani, "Person re-identification by symmetry-driven accumulation of local features," in Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on. IEEE, 2010, pp. 2360-2367.
2. Wei-Shi Zheng, Shaogang Gong, and Tao Xiang, "Person re-identification by probabilistic relative distance comparison," in Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on. IEEE, 2011, pp. 649-656.
3. Mert Dikmen, Emre Akbas, Thomas S. Huang, and Narendra Ahuja, "Pedestrian recognition with a learned metric," in Computer Vision - ACCV 2010, pp. 501-512. Springer, 2011.
4. T. Ojala, M. Pietikäinen, and D. Harwood (1994), "Performance evaluation of texture measures with classification based on Kullback discrimination of distributions," Proceedings of the 12th IAPR International Conference on Pattern Recognition (ICPR 1994), vol. 1, pp. 582-585.
5. A. Torralba, R. Fergus, et al., "Small codes and large image databases for recognition," in Computer Vision and Pattern Recognition (CVPR), 2008 IEEE Conference on. IEEE, 2008.
6. A. M. Bronstein, M. M. Bronstein, et al., "The video genome," arXiv preprint arXiv:1003.5320, 2010.
7. W. Liu, J. Wang, R. Ji, et al., "Supervised hashing with kernels," in Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE, 2012, pp. 2074-2081.
To avoid making this specification tediously long, some technical details that can be obtained from the above references or other prior-art material may have been omitted, simplified or adapted in the descriptions herein; this is understandable to those skilled in the art and does not affect the sufficiency of the disclosure of this specification. The above references are hereby incorporated by reference in their entirety.
In summary, those skilled in the art will appreciate that various modifications, variations and substitutions can be made to the above embodiments of the invention, all of which fall within the protection scope of the invention as defined by the appended claims.

Claims (8)

1. A multi-modal target re-identification method, comprising the following steps:
step 1, cropping, from a first image and a second image taken by a first camera and a second camera respectively, foreground images containing targets, wherein the second image corresponds to a known target;
step 2, extracting color features and texture features from the cropped foreground images respectively, and concatenating the color features and texture features to form image features;
step 3, inputting said image features into a hash projection model and computing the similarity between the targets in the first image and the second image;
step 4, if the computed similarity is greater than a predetermined threshold, identifying the target in the first image as the known target corresponding to the second image.
2. The target re-identification method according to claim 1, wherein said foreground image containing the target is bounded by the circumscribed rectangular box of the target.
3. The target re-identification method according to claim 2, wherein said color features comprise 9 color histograms corresponding to the 9 channels of RGB, HSV and YCbCr, representing the numerical distribution of the pixels of said foreground image under each channel,
wherein the numerical range of each channel is quantized into 8 values, thereby forming 9 color histograms of 8 dimensions each, as said color features,
and wherein local binary patterns are computed on the foreground image to obtain a 59-dimensional texture histogram, as said texture features.
4. The target re-identification method according to claim 3, further comprising, after said step 2:
step 21, reducing the dimension of said color features and said texture features by anchor-node projection, according to the following formula:

z(x) = [exp(-D^2(x, u_1)/t), ..., exp(-D^2(x, u_m)/t)] / Σ_{j=1}^{m} exp(-D^2(x, u_j)/t)    (3)

wherein x is the raw image feature formed by concatenating said color features and said texture features, u_1, ..., u_m are the m anchor nodes, D(·) is the Euclidean distance, t is a normalization constant, and z(x) is the low-dimensional feature obtained by anchor-node projection of the raw image feature x.
5. The target re-identification method according to claim 4, wherein said step 3 comprises:
step 31, computing the similarity s(x, y) of the respective targets of the first and second cameras by the following formula:

s(x, y) = h_X(x) h_Y(y)    (7)

wherein
h_X(x) = sgn(p^T x + a) ∈ {-1, +1}
h_Y(y) = sgn(q^T y + b) ∈ {-1, +1}    (6)
wherein p^T and q^T are the transposes of the projection vectors p and q in the hash projection model respectively, a and b are the offsets in the hash projection model, and sgn(·) is the sign function.
6. The target re-identification method according to claim 4, wherein step 3 comprises:

Step 31: calculate the similarity $s(x,y)$ between the respective targets of the first and second cameras by the following formula:

$$s(x,y)=\sum_{l=1}^{50}\alpha_{l}\,\operatorname{sgn}\left(p_{l}^{T}z(x)+a_{l}\right)\operatorname{sgn}\left(q_{l}^{T}z(y)+b_{l}\right) \qquad (9)$$

wherein $p_{l}^{T}$ and $q_{l}^{T}$ denote respectively the transposes of the 50 projection vectors $p_{l}$ and $q_{l}$ in the Hash projection models, $a_{l}$ and $b_{l}$ denote the offsets of the 50 Hash projection models, $\operatorname{sgn}(\cdot)$ is the sign function, and $\alpha_{l}$ denotes the weight of each Hash projection model.
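Equation (9) is a weighted sum over 50 pairs of one-bit hash functions, one per camera view; each term contributes $+\alpha_l$ when the two bits agree and $-\alpha_l$ when they disagree. A compact sketch (our naming; `np.where` is used instead of `np.sign` so a zero score still maps to a {-1, +1} bit):

```python
import numpy as np

def hash_similarity(zx, zy, P, Q, a, b, alpha):
    """Equation (9): weighted agreement of L one-bit hash pairs.
    P, Q are (L, m) matrices stacking the projection vectors p_l, q_l;
    a, b, alpha are length-L arrays of offsets and weights."""
    hx = np.where(P @ zx + a >= 0, 1.0, -1.0)   # sgn(p_l^T z(x) + a_l)
    hy = np.where(Q @ zy + b >= 0, 1.0, -1.0)   # sgn(q_l^T z(y) + b_l)
    return float(alpha @ (hx * hy))
```

With non-negative weights the similarity is bounded by the total weight: $|s(x,y)| \le \sum_l \alpha_l$.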
7. The target re-identification method according to claim 5 or 6, wherein the second camera is one or more second cameras, the second images are a plurality of second images respectively captured by the second cameras, and each of the plurality of second images corresponds to a different target,
wherein, in step 3, calculating the similarity between the targets in the first and second images comprises: calculating respectively the similarity between the target in the first image and the target in each second image, to obtain a plurality of similarities,
and step 4 comprises:
Step 41: if the maximum similarity among the plurality of similarities is greater than the predetermined threshold, determine the known target corresponding to the second image having the maximum similarity to be the target in the first image.
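The decision rule of claim 7 reduces to a thresholded arg-max over the gallery of known targets. A minimal sketch, assuming similarities have already been computed per equation (9); names are illustrative:

```python
def identify_target(similarities, threshold):
    """Claim 7, step 41 (sketch): `similarities` maps each known target's id
    to its computed similarity with the target in the first image. Return the
    id of the most similar known target, or None when even the maximum
    similarity does not exceed the threshold (unknown target)."""
    best = max(similarities, key=similarities.get)
    return best if similarities[best] > threshold else None
```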
8. The target re-identification method according to claim 6, wherein the Hash projection model is obtained by training as follows:

Step 51: obtain the projection vectors $p_{l}^{*}$ and $q_{l}^{*}$ by training according to

$$\{p_{l}^{*},q_{l}^{*}\}=\arg\max_{\{p_{l},q_{l}\}}\hat{\Phi}_{l},\qquad \hat{\Phi}_{l}=p_{l}^{T}\Sigma_{l}\,q_{l},$$

Step 52: with the projection vectors $p_{l}^{*}$ and $q_{l}^{*}$ obtained, the objective function becomes

$$\{a_{l}^{*},b_{l}^{*}\}=\arg\max_{\{a_{l},b_{l}\}}\bar{\Phi}_{l} \qquad (16)$$

wherein

$$\bar{\Phi}_{l}=\sum_{k=1}^{K}s(x_{k},y_{k})\operatorname{sgn}\left(p_{l}^{*T}z(x_{k})+a_{l}\right)\operatorname{sgn}\left(q_{l}^{*T}z(y_{k})+b_{l}\right) \qquad (15)$$

Step 53: obtain the weight $\alpha_{l}$ by training according to

$$\alpha_{l}=\frac{1}{2}\ln(1+\Phi_{l})-\frac{1}{2}\ln(1-\Phi_{l}) \qquad (18).$$
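Steps 52 and 53 can be sketched numerically. This is our illustrative reading, not the patent's implementation: equation (15) scores how well one hash pair's bits agree with the supervision labels $s(x_k,y_k)\in\{-1,+1\}$ for candidate offsets, and equation (18) is an AdaBoost-style weight, which requires $\Phi_l\in(-1,1)$, so the score is averaged over the $K$ training pairs here (the patent's sum would need equivalent normalization):

```python
import numpy as np

def phi_bar(s_pairs, px_scores, qy_scores, a, b):
    """Normalized equation (15): mean agreement between the supervision
    s(x_k, y_k) and the product of the two hash bits, for offsets (a, b).
    px_scores[k] = p_l*^T z(x_k), qy_scores[k] = q_l*^T z(y_k)."""
    hx = np.where(px_scores + a >= 0, 1.0, -1.0)
    hy = np.where(qy_scores + b >= 0, 1.0, -1.0)
    return float(np.mean(s_pairs * hx * hy))   # lies in [-1, 1]

def alpha_weight(phi):
    """Equation (18): larger weight for a hash pair that agrees more
    strongly with the supervision; zero when agreement is at chance."""
    return 0.5 * np.log(1.0 + phi) - 0.5 * np.log(1.0 - phi)
```

Step 52's offsets $(a_l^*, b_l^*)$ could then be found by a grid search maximizing `phi_bar`, and the maximizing value plugged into `alpha_weight`.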
CN201410125981.2A 2014-03-31 2014-03-31 Multi-mode pedestrian re-identification technology Pending CN103942563A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410125981.2A CN103942563A (en) 2014-03-31 2014-03-31 Multi-mode pedestrian re-identification technology


Publications (1)

Publication Number Publication Date
CN103942563A true CN103942563A (en) 2014-07-23

Family

ID=51190226




Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007103494A3 (en) * 2006-03-09 2007-11-01 Gen Electric Method and system for performing image re-identification
US20090034791A1 (en) * 2006-12-04 2009-02-05 Lockheed Martin Corporation Image processing for person and object Re-identification
CN101504655A (en) * 2009-03-06 2009-08-12 中山大学 Color relationship characteristic based image approximate copy detection method


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU Kai, ZHAO Zhicheng, GUO Xin, CAI Anni: "Anchor-supported multi-modality hashing embedding for person re-identification", IEEE International Conference on Visual Communications and Image Processing (VCIP), 2013. *

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104484324B (en) * 2014-09-26 2017-11-21 罗普特(厦门)科技集团有限公司 A kind of pedestrian retrieval method of multi-model and fuzzy color
CN104484324A (en) * 2014-09-26 2015-04-01 徐晓晖 Pedestrian retrieval method based on multiple models and fuzzy color
CN104574440A (en) * 2014-12-30 2015-04-29 安科智慧城市技术(中国)有限公司 Video movement target tracking method and device
CN106446524A (en) * 2016-08-31 2017-02-22 北京智能管家科技有限公司 Intelligent hardware multimodal cascade modeling method and apparatus
CN106557757A (en) * 2016-11-24 2017-04-05 深圳明创自控技术有限公司 A kind of intelligent robot system
CN109426785B (en) * 2017-08-31 2021-09-10 杭州海康威视数字技术股份有限公司 Human body target identity recognition method and device
CN109426785A (en) * 2017-08-31 2019-03-05 杭州海康威视数字技术股份有限公司 A kind of human body target personal identification method and device
US11126828B2 (en) 2017-08-31 2021-09-21 Hangzhou Hikvision Digital Technology Co., Ltd. Method and device for recognizing identity of human target
CN109101875A (en) * 2018-06-22 2018-12-28 上海市保安服务总公司 Dealing based on recognition of face divides illegal activities to investigate and prosecute system
CN109271545A (en) * 2018-08-02 2019-01-25 深圳市商汤科技有限公司 A kind of characteristic key method and device, storage medium and computer equipment
WO2020124448A1 (en) * 2018-12-19 2020-06-25 Zhejiang Dahua Technology Co., Ltd. Systems and methods for video surveillance
CN113243015A (en) * 2018-12-19 2021-08-10 浙江大华技术股份有限公司 Video monitoring system and method
US11605220B2 (en) 2018-12-19 2023-03-14 Zhejiang Dahua Technology Co., Ltd. Systems and methods for video surveillance
CN113243015B (en) * 2018-12-19 2024-03-26 浙江大华技术股份有限公司 Video monitoring system
WO2022153406A1 (en) * 2021-01-13 2022-07-21 日本電信電話株式会社 Calculation device, calculation method, and program
CN115909741A (en) * 2022-11-30 2023-04-04 山东高速股份有限公司 Method, device and medium for judging traffic state
CN115909741B (en) * 2022-11-30 2024-03-26 山东高速股份有限公司 Traffic state judging method, equipment and medium


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20140723
