CN110826417B - Cross-view pedestrian re-identification method based on discriminant dictionary learning - Google Patents
- Publication number
- CN110826417B (application CN201910966029.8A)
- Authority
- CN
- China
- Prior art keywords
- pedestrian
- domain
- dictionary
- view
- learning
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/217—Validation; Performance evaluation; Active pattern learning techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Multimedia (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Engineering & Computer Science (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Human Computer Interaction (AREA)
- Life Sciences & Earth Sciences (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The invention relates to a cross-view pedestrian re-identification method based on discriminative dictionary learning and belongs to the technical field of digital image processing. First, on the premise that pedestrian images from the same camera view share the same domain, the pedestrian features of different views are divided into view-specific domain information components and domain-invariant pedestrian appearance feature components. A discriminative dictionary learning algorithm creates a domain-general dictionary describing the domain information components and a domain-invariant dictionary describing the domain-invariant components, while forcing the coding coefficients of pedestrians under the same view to be strongly similar. Then, an extended regularization term is proposed to force the coding coefficients of different pedestrians to keep a certain distance apart and the coding coefficients of the same pedestrian to be as close as possible. Finally, a pedestrian matching scheme using the Euclidean distance is designed on the model that retains only pedestrian feature information. The proposed method separates the domain information in the image to address the domain shift between different views and achieves good recognition performance.
Description
Technical Field
The invention relates to a cross-view pedestrian re-identification method based on discriminative dictionary learning, and belongs to the technical field of digital image processing.
Background
Pedestrian re-identification is a technique that uses computer vision to determine whether a target pedestrian is present in images or video sequences captured by different cameras. In recent years it has attracted increasing attention from researchers because of its wide application in pedestrian search, pedestrian tracking and pedestrian behavior analysis, and a large number of re-identification methods have been proposed. Although computer vision researchers have made great efforts to improve the performance of pedestrian re-identification systems, the technique still faces significant challenges because the appearance of a pedestrian is often highly ambiguous across camera views.
Disclosure of Invention
The invention aims to provide a cross-view pedestrian re-identification method based on discriminative dictionary learning that addresses the domain shift problem of pedestrian re-identification in the prior art.
The technical scheme of the invention is as follows: a cross-view pedestrian re-identification method based on discriminative dictionary learning comprises the following steps:
1) determining the overall model framework for cross-view pedestrian re-identification based on discriminative dictionary learning;
2) dividing the pedestrian image features of different views into view-specific domain information components and domain-invariant pedestrian appearance feature components, and using a discriminative dictionary learning algorithm to create a domain-general dictionary describing the domain information components and a domain-invariant dictionary describing the domain-invariant components;
3) training the discrimination-promoting term of the dictionaries;
4) proposing an extended regularization term that forces the coding coefficients of different pedestrians to keep a certain distance apart and the coding coefficients of the same pedestrian to be as close as possible;
5) training the discrimination-promoting term of the coding coefficients, forcing the coding coefficients of pedestrian images from the same view to be strongly similar;
6) determining the overall objective function for cross-view pedestrian re-identification based on discriminative dictionary learning;
7) solving for the variables to be updated in the overall objective function;
8) designing a pedestrian matching scheme using the Euclidean distance, based on the model that retains only the domain-invariant pedestrian appearance features.
Specifically, the overall model framework of step 1) comprises:
Let $X_a$ and $X_b$ denote the training sample sets under the two camera views. Robust feature representation learning and discriminative metric learning need to be integrated into one framework, and the overall model framework is given by formula (1):
In the formula, $D$ is the domain dictionary shared by the pedestrian images under all cameras, and $D_t$ is the domain-specific dictionary used to encode pedestrian appearance features after the domain information has been separated. $Z_a$ and $Z_b$ are the coding coefficients of the domain information of $X_a$ and $X_b$ over the dictionary $D$, and $Z_{ta}$ and $Z_{tb}$ are the coding coefficients of the domain-specific information over the dictionary $D_t$. $\Phi(D, D_t, Z_a, Z_b, Z_{ta}, Z_{tb})$ is the data fidelity term; minimizing it gives the dictionaries $D$ and $D_t$ their representation capability. $\Psi(D, D_t)$ is the discrimination-promoting term of the dictionaries and $\Gamma(Z_a, Z_b, Z_{ta}, Z_{tb})$ is the discrimination-promoting term of the coding coefficients; minimizing these two terms gives the dictionaries and the coding coefficients strong discriminative capability. The two remaining symbols in formula (1) denote the corresponding columns of $D$ and of $D_t$, respectively.
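Read together, these definitions suggest that formula (1) simply sums the three terms, with the weights carried inside $\Psi$ and $\Gamma$. The following is a hedged reconstruction only: the column symbols $d_i$, $d_{t,j}$ and the unit-norm bound on the dictionary columns are assumptions (a common convention in dictionary learning) and are not confirmed by the text.

```latex
\min_{D,\, D_t,\, Z_a,\, Z_b,\, Z_{ta},\, Z_{tb}}
  \Phi(D, D_t, Z_a, Z_b, Z_{ta}, Z_{tb})
  + \Psi(D, D_t)
  + \Gamma(Z_a, Z_b, Z_{ta}, Z_{tb})
\quad \text{s.t.} \quad
  \|d_i\|_2^2 \le 1, \;\; \|d_{t,j}\|_2^2 \le 1 \quad \forall\, i, j
```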
Specifically, the discriminative dictionary algorithm of step 2) comprises:
To mitigate the domain shift between different camera views, the domain information is separated from the pedestrian image features, and the data fidelity term $\Phi(D, D_t, Z_a, Z_b, Z_{ta}, Z_{tb})$ is expressed as:
In the formula, one term establishes the domain information of the two camera views a and b, and the other separates the domain information from the pedestrian appearance features that are unaffected by the domain.
Specifically, the discrimination-promoting term of the dictionaries in step 3) comprises:
The dictionary $D$ is used to represent the domain information of the different camera views. Since images from the same camera have the same domain features, the images are expected to be linearly correlated with one another in terms of domain features. To separate the domain information from the samples $X_a$ and $X_b$, the proposed discrimination-promoting term of the dictionaries is:
In the formula, $\|D\|_*$ is the nuclear norm of the dictionary $D$. Because the domain information components and the true appearance features of pedestrians have different spatial morphological characteristics, a structural incoherence regularization term is introduced to promote independence between the domain dictionary $D$ and the pedestrian feature dictionary $D_t$. $\alpha_1$ and $\alpha_2$ are two scalar parameters that weight $\|D\|_*$ and the structural incoherence term, respectively.
Specifically, the extended regularization term of step 4) comprises:
The same pedestrian seen from different camera views is expected to have the same coding coefficients over the domain-specific dictionary $D_t$, while the algorithm should make the distance between the coding coefficients of different pedestrians from different camera views larger than a constant. To meet this requirement, the following function is proposed for view a; a similar function is defined for view b in the same way and is not repeated here:
In the formula, $\{z\}_+ = \max\{z, 0\}$ and $c$ is an arbitrary constant. The formula involves three coding coefficients: that of the $k$-th image of the $l$-th pedestrian under camera view a; that of the $k^*$-th image of the $l$-th pedestrian under view b, chosen as the one most dissimilar to the $k$-th image of the $l$-th pedestrian, where $k^* \neq k$; and that of the $k^*$-th image of the $l^*$-th pedestrian under view b, chosen as the one most similar to the $k$-th image of the $l$-th pedestrian, where $l^* \neq l$. In one case of the formula the coefficients are arranged so that the identity of the pedestrian is not misjudged. In the other case, matching pedestrians by the coding coefficients of their image features would cause misrecognition; in this case, minimizing the term promotes the arrangement that avoids the misjudgment.
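The behaviour of the hinge $\{z\}_+ = \max\{z, 0\}$ can be illustrated with a small sketch. The margin expression used here (squared distance to the same-identity coefficient, minus squared distance to the different-identity coefficient, plus $c$) is an assumption chosen for illustration; the exact expression in the source formula is not reproduced in this text, and the function names and toy vectors are hypothetical.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two coefficient vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def margin_term(z_anchor, z_same, z_diff, c):
    """Hinge {z}_+ = max(z, 0) applied to an ASSUMED margin expression.

    z_anchor: coefficient of an image under view a
    z_same:   hardest same-identity coefficient under view b
    z_diff:   hardest different-identity coefficient under view b
    """
    return max(sq_dist(z_anchor, z_same) - sq_dist(z_anchor, z_diff) + c, 0.0)


# Margin satisfied: the same-identity pair is much closer, so the term vanishes.
ok = margin_term([0.0, 0.0], [0.1, 0.0], [2.0, 0.0], c=1.0)
# Margin violated: a different pedestrian is closer, so the term is positive.
bad = margin_term([0.0, 0.0], [1.0, 0.0], [0.5, 0.0], c=1.0)
```

Only violated configurations contribute, so minimizing such a term acts on exactly the pairs that would otherwise be confused.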
Specifically, the discrimination-promoting term of the coding coefficients in step 5) comprises:
For the coding coefficient matrices $Z_a$ and $Z_b$ of the domain information under views a and b, features from the same domain should share the same sparse representation. Based on this consideration, $\Gamma(Z_a, Z_b, Z_{ta}, Z_{tb})$ in the overall model framework (1) is defined as:
In the formula, minimizing $\|Z\|_{2,1}$ pushes the entries in each row of $Z$ toward the same value. This term causes the same atoms to be selected from $D$ to represent the original features of the same domain, so that the coding coefficients of these features share the same sparse representation over $D$. $\alpha_3$, $\alpha_4$ and $\alpha_5$ are three scalar parameters that weight $\|Z_a\|_{2,1} + \|Z_b\|_{2,1}$, $\|Z_{ta}\|_1 + \|Z_{tb}\|_1$ and the extended regularization term of step 4), respectively.
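The difference between the two sparsity penalties can be seen on a toy matrix. This sketch assumes the usual definition of the $\ell_{2,1}$ norm as the sum of the $\ell_2$ norms of the rows, which matches the row-wise description above but whose formula is not reproduced in this text; the helper names and matrices are hypothetical.

```python
import math


def l21_norm(Z):
    """Sum of the l2 norms of the rows (assumed definition of ||Z||_{2,1})."""
    return sum(math.sqrt(sum(v * v for v in row)) for row in Z)


def l1_norm(Z):
    """Sum of absolute values of all entries (||Z||_1)."""
    return sum(abs(v) for row in Z for v in row)


# Both samples (columns) use the same atom (row 0).
Z_shared = [[3.0, 4.0],
            [0.0, 0.0]]
# Each sample uses a different atom.
Z_spread = [[3.0, 0.0],
            [0.0, 4.0]]

shared_l21, spread_l21 = l21_norm(Z_shared), l21_norm(Z_spread)
shared_l1, spread_l1 = l1_norm(Z_shared), l1_norm(Z_spread)
```

The $\ell_1$ norm cannot tell the two matrices apart, while the $\ell_{2,1}$ norm is smaller when the samples share the same atom, which is the behaviour wanted for the domain coefficients.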
Specifically, the overall objective function of step 6) comprises:
In the formula, $M_a$ and $M_b$ denote the numbers of pedestrians under the two camera views, and $N_{al}$ and $N_{bl}$ denote the numbers of images of the $l$-th pedestrian under the two camera views.
Specifically, the variable solving of step 7) comprises:
The overall objective function (6) is not jointly convex in the variables $D$, $D_t$, $Z_a$, $Z_b$, $Z_{ta}$, $Z_{tb}$, but it is convex in each variable when all the other variables are fixed. The variables can therefore be optimized by alternating iterations, and each variable is solved as follows:
To update the coding coefficient $Z_a$ (the variable $Z_b$ is updated in the same way, so the details are not repeated), first assume that $D$, $D_t$, $Z_b$, $Z_{ta}$, $Z_{tb}$ are all fixed, which gives the following objective function:
This is a typical $\ell_{2,1}$ minimization problem, and the analytical solution for $Z_a$ can be expressed as:
$Z_a = (4 D^T D + \alpha_3 \Lambda_1)^{-1} (4 D^T X_a + 2 D^T D_t Z_{ta})$ (8)
In the formula, $\Lambda_1$ is a diagonal matrix formed from the vectors of $Z_i$, where the indexed symbol denotes the $j$-th column of $Z_i$.
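A minimal NumPy sketch of the update in equation (8), implemented exactly as printed above. The construction of $\Lambda_1$ is an assumption: it uses the common reweighting for $\ell_{2,1}$ problems, with diagonal entries $1 / (2\|z^i\|_2)$ computed from the rows of the previous $Z_a$, because the exact definition is not recoverable from the text. The function name, the matrix sizes and the random data are hypothetical; `alpha3=28.0` is the value reported in the experiments.

```python
import numpy as np


def update_Za(X_a, D, D_t, Z_ta, Z_a_prev, alpha3, eps=1e-8):
    """One reweighted evaluation of equation (8) as printed in the text."""
    # ASSUMPTION: Lambda_1 = diag(1 / (2 * ||i-th row of Z_a||_2)).
    row_norms = np.linalg.norm(Z_a_prev, axis=1)
    Lambda1 = np.diag(1.0 / (2.0 * np.maximum(row_norms, eps)))
    A = 4.0 * D.T @ D + alpha3 * Lambda1
    B = 4.0 * D.T @ X_a + 2.0 * D.T @ (D_t @ Z_ta)
    # Solve A Z = B instead of forming the inverse explicitly.
    return np.linalg.solve(A, B)


rng = np.random.default_rng(0)
m, d, d_t, n = 20, 5, 8, 12          # toy sizes, not the patent's d=50, d_t=760
X_a = rng.standard_normal((m, n))    # one feature vector per column
D = rng.standard_normal((m, d))
D_t = rng.standard_normal((m, d_t))
Z_ta = rng.standard_normal((d_t, n))
Z_a = rng.standard_normal((d, n))

# Lambda_1 depends on Z_a, so the closed form is re-evaluated a few times.
for _ in range(5):
    Z_a = update_Za(X_a, D, D_t, Z_ta, Z_a, alpha3=28.0)
```

Because $4 D^T D$ is positive semidefinite and $\alpha_3 \Lambda_1$ has a strictly positive diagonal, the system matrix is invertible, which is why a direct solve is safe here.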
Then, $D$, $D_t$, $Z_a$, $Z_b$, $Z_{tb}$ are fixed to update $Z_{ta}$ (the variable $Z_{tb}$ is updated in the same way, so the details are not repeated), which gives the following objective function:
For convenience of optimization, equation (9) is rewritten in vector form:
In the formula, the feature vector is the visual feature of the $k$-th image of the $l$-th pedestrian under view a. To solve (10), a relaxation variable is introduced, and equation (10) can then be relaxed as:
After the coding coefficients $Z_a$ and $Z_{ta}$ have been updated, the dictionaries $D$ and $D_t$ can be updated alternately, with the following objective function:
To update $D$, an intermediate variable $C$ is introduced, and equation (14) becomes:
$C$ can be solved by:
This is a typical nuclear norm minimization problem, which can be solved by the singular value thresholding algorithm. To update $D_t$, a relaxation variable $H$ is introduced:
The closed-form solution for the relaxation variable $H$ can be expressed as:
$H = (\alpha_2 D_t D_t^T + I_1)^{-1} D$ (18)
where $I_1$ is an identity matrix. Using the updated $C$ and $H$, $D$ can be optimized by solving:
This problem can be solved via the Lagrange dual. Finally, $D_t$ can be optimized by solving:
This problem can be solved in the same way as the problem in equation (19).
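The closed form (18) is consistent with a simple quadratic subproblem in $H$. As a check, assume (the form of the subproblem is an assumption, since its formula is not reproduced in this text) that $H$ is tied to $D$ by a squared Frobenius penalty and that the structural incoherence penalty is $\alpha_2 \|D_t^T H\|_F^2$. Setting the gradient with respect to $H$ to zero then recovers (18):

```latex
\min_{H} \; \|H - D\|_F^2 + \alpha_2 \|D_t^{T} H\|_F^2
\;\Longrightarrow\;
2 (H - D) + 2 \alpha_2 D_t D_t^{T} H = 0
\;\Longrightarrow\;
H = (\alpha_2 D_t D_t^{T} + I_1)^{-1} D
```

The matrix $\alpha_2 D_t D_t^T + I_1$ is positive definite for any $\alpha_2 \ge 0$, so the inverse always exists.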
Specifically, the pedestrian matching scheme of step 8) comprises:
In testing, with the learned dictionaries $D$ and $D_t$, the domain information and the pedestrian-specific information can be separated by solving:
In the formula, $Z_a$ and $Z_b$ are the coding coefficient matrices of the domain information under views a and b, and $Z_{ta}$ and $Z_{tb}$ are the coding coefficient matrices of the pedestrian-specific information under views a and b. This problem can be solved by alternating iteration, and the iteration stops once the stopping conditions on the updates are met. Taking the coding coefficient vectors of individual pedestrians from $Z_{ta}$ and $Z_{tb}$, the similarity between pedestrians can be measured by computing the following distance:
The invention has the following beneficial effects:
1. Most current pedestrian re-identification methods assume that the pedestrian images to be identified have no domain difference between the two views. This not only loses much image information but also introduces false information into the result and degrades the visual quality of the pedestrian image. The proposed method separates the domain information from the pedestrian image, avoids propagating false information, reduces time consumption and improves the ability to discriminate pedestrians.
2. Compared with other methods, the proposed pedestrian re-identification method markedly improves recognition performance.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 shows pedestrian image pairs from the two camera views of the PRID2011 dataset, provided by an embodiment of the present invention;
FIG. 3 shows the CMC curves for parameter $\alpha_1$ of the algorithm on the PRID2011 dataset, provided by an embodiment of the present invention;
FIG. 4 shows the CMC curves for parameter $\alpha_2$ of the algorithm on the PRID2011 dataset, provided by an embodiment of the present invention;
FIG. 5 shows the CMC curves for parameter $\alpha_3$ of the algorithm on the PRID2011 dataset, provided by an embodiment of the present invention;
FIG. 6 shows the CMC curves for parameter $\alpha_4$ of the algorithm on the PRID2011 dataset, provided by an embodiment of the present invention;
FIG. 7 shows the CMC curves for parameter $\alpha_5$ of the algorithm on the PRID2011 dataset, provided by an embodiment of the present invention.
Detailed Description
The invention is further described with reference to the following drawings and detailed description.
Example 1: Domain shift between pedestrian images from different camera views is one of the major causes of pedestrian appearance ambiguity. In addition, the domain information within one camera view is stable over a certain period, and all images in the same view share the same domain information. If the domain information can be separated from the pedestrian images, the remaining information is no longer disturbed by it, and no domain shift arises between pedestrian images from different camera views. Based on this idea, the invention proposes a novel domain-invariant dictionary learning method for cross-view pedestrian re-identification. The method assumes that images from the same camera view share the same domain. To obtain domain-invariant visual features, the pedestrian features under different views are divided into two components: a domain-specific component and a domain-invariant feature component.
As shown in FIG. 1, a cross-view pedestrian re-identification method based on discriminative dictionary learning comprises the following steps:
1) determining the overall model framework for cross-view pedestrian re-identification based on discriminative dictionary learning;
2) dividing the pedestrian image features of different views into view-specific domain information components and domain-invariant pedestrian appearance feature components, and using a discriminative dictionary learning algorithm to create a domain-general dictionary describing the domain information components and a domain-invariant dictionary describing the domain-invariant components;
3) training the discrimination-promoting term of the dictionaries;
4) proposing an extended regularization term that forces the coding coefficients of different pedestrians to keep a certain distance apart and the coding coefficients of the same pedestrian to be as close as possible;
5) training the discrimination-promoting term of the coding coefficients, forcing the coding coefficients of pedestrian images from the same view to be strongly similar;
6) determining the overall objective function for cross-view pedestrian re-identification based on discriminative dictionary learning;
7) solving for the variables to be updated in the overall objective function;
8) designing a pedestrian matching scheme using the Euclidean distance, based on the model that retains only the domain-invariant pedestrian appearance features.
The specific implementation process is as follows. First, on the premise that pedestrian images from the same camera view share the same domain, the pedestrian features of different views are divided into view-specific domain information components and domain-invariant pedestrian appearance feature components. A discriminative dictionary learning algorithm creates a domain-general dictionary describing the domain information components and a domain-invariant dictionary describing the domain-invariant components, while forcing the coding coefficients of pedestrians under the same view to be strongly similar. Then, to overcome appearance ambiguity, an extended regularization term is proposed to force the coding coefficients of different pedestrians to keep a certain distance apart and the coding coefficients of the same pedestrian to be as close as possible. Finally, a pedestrian matching scheme using the Euclidean distance is designed on the model that retains only pedestrian feature information.
Further, the overall model framework of step 1) comprises:
Let $X_a$ and $X_b$ denote the training sample sets under the two camera views. Robust feature representation learning and discriminative metric learning need to be integrated into one framework, and the overall model framework is given by formula (1):
In the formula, $D$ is the domain dictionary shared by the pedestrian images under all cameras, and $D_t$ is the domain-specific dictionary used to encode pedestrian appearance features after the domain information has been separated. $Z_a$ and $Z_b$ are the coding coefficients of the domain information of $X_a$ and $X_b$ over the dictionary $D$, and $Z_{ta}$ and $Z_{tb}$ are the coding coefficients of the domain-specific information over the dictionary $D_t$. $\Phi(D, D_t, Z_a, Z_b, Z_{ta}, Z_{tb})$ is the data fidelity term; minimizing it gives the dictionaries $D$ and $D_t$ their representation capability. $\Psi(D, D_t)$ is the discrimination-promoting term of the dictionaries and $\Gamma(Z_a, Z_b, Z_{ta}, Z_{tb})$ is the discrimination-promoting term of the coding coefficients; minimizing these two terms gives the dictionaries and the coding coefficients strong discriminative capability. The two remaining symbols in formula (1) denote the corresponding columns of $D$ and of $D_t$, respectively.
Further, the discriminative dictionary algorithm of step 2) comprises:
To mitigate the domain shift between different camera views, the domain information is separated from the pedestrian image features, and the data fidelity term $\Phi(D, D_t, Z_a, Z_b, Z_{ta}, Z_{tb})$ is expressed as:
In the formula, one term establishes the domain information of the two camera views a and b, and the other separates the domain information from the pedestrian appearance features that are unaffected by the domain.
Further, the discrimination-promoting term of the dictionaries in step 3) comprises:
The dictionary $D$ is used to represent the domain information of the different camera views. Since images from the same camera have the same domain features, the images are expected to be linearly correlated with one another in terms of domain features. To separate the domain information from the samples $X_a$ and $X_b$, the proposed discrimination-promoting term of the dictionaries is:
In the formula, $\|D\|_*$ is the nuclear norm of the dictionary $D$. Because the domain information components and the true appearance features of pedestrians have different spatial morphological characteristics, a structural incoherence regularization term is introduced to promote independence between the domain dictionary $D$ and the pedestrian feature dictionary $D_t$. $\alpha_1$ and $\alpha_2$ are two scalar parameters that weight $\|D\|_*$ and the structural incoherence term, respectively.
Further, the extended regularization term of step 4) comprises:
The same pedestrian seen from different camera views is expected to have the same coding coefficients over the domain-specific dictionary $D_t$, while the algorithm should make the distance between the coding coefficients of different pedestrians from different camera views larger than a constant. To meet this requirement, the following function is proposed for view a; a similar function is defined for view b in the same way and is not repeated here:
In the formula, $\{z\}_+ = \max\{z, 0\}$ and $c$ is an arbitrary constant. The formula involves three coding coefficients: that of the $k$-th image of the $l$-th pedestrian under camera view a; that of the $k^*$-th image of the $l$-th pedestrian under view b, chosen as the one most dissimilar to the $k$-th image of the $l$-th pedestrian, where $k^* \neq k$; and that of the $k^*$-th image of the $l^*$-th pedestrian under view b, chosen as the one most similar to the $k$-th image of the $l$-th pedestrian, where $l^* \neq l$. In one case of the formula the coefficients are arranged so that the identity of the pedestrian is not misjudged. In the other case, matching pedestrians by the coding coefficients of their image features would cause misrecognition; in this case, minimizing the term promotes the arrangement that avoids the misjudgment.
Further, the discrimination-promoting term of the coding coefficients in step 5) comprises:
For the coding coefficient matrices $Z_a$ and $Z_b$ of the domain information under views a and b, features from the same domain should share the same sparse representation. Based on this consideration, $\Gamma(Z_a, Z_b, Z_{ta}, Z_{tb})$ in the overall model framework (1) is defined as:
In the formula, minimizing $\|Z\|_{2,1}$ pushes the entries in each row of $Z$ toward the same value. This term causes the same atoms to be selected from $D$ to represent the original features of the same domain, so that the coding coefficients of these features share the same sparse representation over $D$. $\alpha_3$, $\alpha_4$ and $\alpha_5$ are three scalar parameters that weight $\|Z_a\|_{2,1} + \|Z_b\|_{2,1}$, $\|Z_{ta}\|_1 + \|Z_{tb}\|_1$ and the extended regularization term of step 4), respectively.
Further, the overall objective function of step 6) comprises:
In the formula, $M_a$ and $M_b$ denote the numbers of pedestrians under the two camera views, and $N_{al}$ and $N_{bl}$ denote the numbers of images of the $l$-th pedestrian under the two camera views.
Further, the variable solving of step 7) comprises:
The overall objective function (6) is not jointly convex in the variables $D$, $D_t$, $Z_a$, $Z_b$, $Z_{ta}$, $Z_{tb}$, but it is convex in each variable when all the other variables are fixed. The variables can therefore be optimized by alternating iterations, and each variable is solved as follows:
To update the coding coefficient $Z_a$ (the variable $Z_b$ is updated in the same way, so the details are not repeated), first assume that $D$, $D_t$, $Z_b$, $Z_{ta}$, $Z_{tb}$ are all fixed, which gives the following objective function:
This is a typical $\ell_{2,1}$ minimization problem, and the analytical solution for $Z_a$ can be expressed as:
$Z_a = (4 D^T D + \alpha_3 \Lambda_1)^{-1} (4 D^T X_a + 2 D^T D_t Z_{ta})$ (8)
In the formula, $\Lambda_1$ is a diagonal matrix formed from the vectors of $Z_i$, where the indexed symbol denotes the $j$-th column of $Z_i$.
Then, $D$, $D_t$, $Z_a$, $Z_b$, $Z_{tb}$ are fixed to update $Z_{ta}$ (the variable $Z_{tb}$ is updated in the same way, so the details are not repeated), which gives the following objective function:
For convenience of optimization, equation (9) is rewritten in vector form:
In the formula, the feature vector is the visual feature of the $k$-th image of the $l$-th pedestrian under view a. To solve (10), a relaxation variable is introduced, and equation (10) can then be relaxed as:
After the coding coefficients $Z_a$ and $Z_{ta}$ have been updated, the dictionaries $D$ and $D_t$ can be updated alternately, with the following objective function:
To update $D$, an intermediate variable $C$ is introduced, and equation (14) becomes:
$C$ can be solved by:
This is a typical nuclear norm minimization problem, which can be solved by the singular value thresholding algorithm. To update $D_t$, a relaxation variable $H$ is introduced:
The closed-form solution for the relaxation variable $H$ can be expressed as:
$H = (\alpha_2 D_t D_t^T + I_1)^{-1} D$ (18)
where $I_1$ is an identity matrix. Using the updated $C$ and $H$, $D$ can be optimized by solving:
This problem can be solved via the Lagrange dual. Finally, $D_t$ can be optimized by solving:
This problem can be solved in the same way as the problem in equation (19).
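The singular value thresholding step named for the update of $C$ can be sketched as follows. The function below solves the generic problem $\min_C \tau \|C\|_* + \tfrac{1}{2}\|C - M\|_F^2$; how $\tau$ and $M$ relate to $\alpha_1$ and to the other quantities in the subproblem for $C$ is not specified in this text, so both are treated as hypothetical inputs.

```python
import numpy as np


def svt(M, tau):
    """Singular value thresholding: shrink each singular value of M by tau."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)   # soft-threshold the spectrum
    return (U * s_shrunk) @ Vt            # scale the columns of U, then recombine


# Toy input with singular values 3 and 1; thresholding at 2 leaves rank one.
M = np.diag([3.0, 1.0])
C = svt(M, tau=2.0)
```

Singular values below the threshold are set to zero, which is what makes the resulting dictionary low rank.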
Further, the pedestrian matching scheme of step 8) comprises:
In testing, with the learned dictionaries $D$ and $D_t$, the domain information and the pedestrian-specific information can be separated by solving:
In the formula, $Z_a$ and $Z_b$ are the coding coefficient matrices of the domain information under views a and b, and $Z_{ta}$ and $Z_{tb}$ are the coding coefficient matrices of the pedestrian-specific information under views a and b. This problem can be solved by alternating iteration, and the iteration stops once the stopping conditions on the updates are met. Taking the coding coefficient vectors of individual pedestrians from $Z_{ta}$ and $Z_{tb}$, the similarity between pedestrians can be measured by computing the following distance:
In step 3), since images from the same camera view have domain similarity, the dictionary used to represent the domain components is refined by a low-rank term, and a structural incoherence regularization term is introduced to promote independence between the domain dictionary $D$ and the pedestrian feature dictionary $D_t$. Adding these two discrimination-promoting terms gives the dictionaries stronger discriminative capability.
In steps 4) and 5), two discrimination-promoting terms for the coding coefficients are added, which gives the coding coefficients stronger discriminative capability; the coding coefficients $Z_{ta}$ and $Z_{tb}$ are updated using gradient descent.
In step 8), a pedestrian matching scheme is designed using the Euclidean distance on the model that retains only the domain-invariant pedestrian appearance features, which avoids the adverse effect of domain shift on the recognition result.
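The matching step can be illustrated with a short sketch that ranks gallery pedestrians by Euclidean distance between coding coefficient vectors. The function names and the toy vectors are hypothetical; in the method they would be columns of $Z_{ta}$ and $Z_{tb}$ obtained over the dictionary $D_t$.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two coding coefficient vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def rank_gallery(probe, gallery):
    """Return gallery indices sorted from most to least similar to the probe."""
    dists = [euclidean(probe, g) for g in gallery]
    return sorted(range(len(gallery)), key=lambda i: dists[i])


# Toy coefficients: one probe from view a, three gallery entries from view b.
probe = [1.0, 0.0, 2.0]
gallery = [[0.0, 0.0, 0.0], [1.0, 0.1, 2.0], [3.0, 3.0, 3.0]]
order = rank_gallery(probe, gallery)
```

The first index in `order` is the predicted match; the full ordering is what a CMC evaluation consumes.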
The invention is further illustrated below with reference to specific experimental data.
In the experiments, each dataset was randomly divided into two non-overlapping parts, one used as training samples and the other as test samples. Cumulative match characteristic (CMC) curves are used to quantitatively evaluate recognition performance. The model has seven parameters: the sizes $d$ and $d_t$ of the dictionaries $D$ and $D_t$, and the five scalar parameters $\alpha_1$, $\alpha_2$, $\alpha_3$, $\alpha_4$ and $\alpha_5$. Throughout the experiments these were set to $d = 50$, $d_t = 760$, $\alpha_1 = 1$, $\alpha_2 = 0.01$, $\alpha_3 = 28$, $\alpha_4 = 1$ and $\alpha_5 = 5$. The influence of $\alpha_1$ through $\alpha_5$ on recognition performance is shown in FIGS. 3-7. Table 1 compares performance with recent results on the PRID2011 dataset, with the maximum values in bold.
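For readers unfamiliar with the metric, a CMC curve reports, for each rank $r$, the fraction of probes whose correct identity appears among the top $r$ gallery matches. The sketch below is a generic single-shot implementation with hypothetical identities, not the evaluation code used for the reported results.

```python
def cmc_curve(ranked_ids, true_ids, max_rank):
    """ranked_ids[q] lists gallery identities for probe q, best match first."""
    hits = [0] * max_rank
    for ranking, truth in zip(ranked_ids, true_ids):
        if truth in ranking[:max_rank]:
            first = ranking.index(truth)
            # A hit at position `first` counts for that rank and all later ranks.
            for r in range(first, max_rank):
                hits[r] += 1
    return [h / len(true_ids) for h in hits]


# Three probes; only the first is matched correctly at rank 1.
ranked = [["p1", "p2", "p3"], ["p3", "p2", "p1"], ["p1", "p3", "p2"]]
truth = ["p1", "p2", "p3"]
curve = cmc_curve(ranked, truth, max_rank=3)
```

Here `curve[0]` is the rank-1 recognition rate, and the curve is non-decreasing in the rank by construction.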
Table 1: Performance comparison with recent results on the PRID2011 dataset
The comparison shows that the proposed method achieves the highest recognition rate at every rank, exceeding the second-best method by 5.4%, 3.9%, 4.9% and 0.5% at ranks 1, 5, 10 and 20, respectively.
While the present invention has been described in detail with reference to the embodiments, the present invention is not limited to the embodiments, and various changes can be made without departing from the spirit of the present invention within the knowledge of those skilled in the art.
Claims (5)
1. A cross-view pedestrian re-identification method based on discriminant dictionary learning, characterized in that the method comprises the following steps:
1) determining an overall model framework for cross-view pedestrian re-identification based on discriminant dictionary learning;
2) dividing the pedestrian image features from different views into view-specific domain information components and domain-invariant pedestrian appearance feature components, and using a discriminant dictionary learning algorithm to create a domain general dictionary describing the domain information components and a domain invariant dictionary describing the domain-invariant components;
3) training the discrimination-promoting term of the dictionary;
4) applying an expansion regularization term that forces the coding coefficients of different pedestrians to keep a certain distance apart and the coding coefficients of the same pedestrian to be as close to each other as possible;
5) training the discrimination-promoting term of the coding coefficients, forcing the coding coefficients of pedestrian images from the same view to be strongly similar;
6) determining the overall objective function for cross-view pedestrian re-identification based on discriminant dictionary learning;
7) solving for the variables to be updated in the overall objective function;
8) designing a pedestrian matching scheme that uses the Euclidean distance, based on a model that retains only the domain-invariant pedestrian appearance features;
the overall model framework of step 1) comprises:
let $X_a$ and $X_b$ denote the training sample sets under the two camera views; in this case, robust feature representation learning and discriminant metric learning need to be integrated into a single framework, and the overall model framework is given by formula (1):
where $D$ is the domain dictionary representing the pedestrian images under all cameras; $D_t$ is the domain-specific dictionary used to code the pedestrian appearance features once the domain information has been separated; $Z_a$ and $Z_b$ are the coding coefficients of the domain information of $X_a$ and $X_b$ on dictionary $D$; $Z_{ta}$ and $Z_{tb}$ are the coding coefficients of the corresponding domain-specific information on dictionary $D_t$; $\Phi(D, D_t, Z_a, Z_b, Z_{ta}, Z_{tb})$ is the data fidelity term; $\Psi(D, D_t)$ is the discrimination-promoting term of the dictionaries; $\Gamma(Z_a, Z_b, Z_{ta}, Z_{tb})$ is the discrimination-promoting term of the coding coefficients; and the remaining two symbols in the formula denote individual columns of $D$ and of $D_t$, respectively;
the discriminant dictionary algorithm of step 2) comprises:
the data fidelity term $\Phi(D, D_t, Z_a, Z_b, Z_{ta}, Z_{tb})$ is expressed as:
where one term models the domain information of the two camera views $a$ and $b$, and the other separates the domain information from the pedestrian appearance features that are not affected by the domain;
the discrimination-promoting term of the dictionary in step 3) comprises:
the proposed discrimination-promoting term of the dictionary is:
where $\|D\|_*$ is the nuclear norm of the dictionary $D$, the second term is a structural incoherence regularization term, and $\alpha_1$ and $\alpha_2$ are two scalar parameters that weight $\|D\|_*$ and the structural incoherence term, respectively;
the expansion regularization term of step 4) comprises:
the following function is proposed for view $a$; a similar function is defined for view $b$ in the same way and is not described again here:
where $\{z\}_+ = \max\{z, 0\}$ and $c$ is an arbitrary constant. Three quantities appear in the formula: the first denotes the $k$-th image of the $l$-th pedestrian under camera view $a$; the second denotes the $k^*$-th image of the $l$-th pedestrian under view $b$, whose coding coefficient is the most dissimilar to that of the $k$-th image of the $l$-th pedestrian, where $k^* \neq k$; the third denotes the $k^*$-th image of the $l^*$-th pedestrian under view $b$, which is the most similar to the $k$-th image of the $l$-th pedestrian, where $l^* \neq l$. When the first condition in the formula holds, the identity of the pedestrian is not misjudged; when the second condition holds, matching pedestrians by the coding coefficients of their image features leads to misidentification, and in that case minimizing the term pushes the coding coefficients back toward the condition under which no misjudgment occurs.
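The exact expression of the expansion regularization term is not reproduced in the text above, so the following is only a sketch of one plausible form: a hinge penalty $\{z\}_+$ applied to the gap between the hardest positive (most dissimilar image of the same pedestrian in view $b$) and the hardest negative (most similar image of a different pedestrian in view $b$). The function name `expansion_penalty`, the use of squared Euclidean distances and the way the margin `c` enters are assumptions for illustration.

```python
import numpy as np

def expansion_penalty(Za, ids_a, Zb, ids_b, c=1.0):
    """Hinge-style margin penalty over the columns of Za (one coding vector per column).

    Assumed form (not taken from the patent): for each view-a vector,
    max(d_hardest_positive - d_hardest_negative + c, 0), summed over all vectors.
    """
    Za, Zb = np.asarray(Za, dtype=float), np.asarray(Zb, dtype=float)
    ids_a, ids_b = np.asarray(ids_a), np.asarray(ids_b)
    # Squared Euclidean distance between every view-a and every view-b coding vector.
    d2 = ((Za[:, :, None] - Zb[:, None, :]) ** 2).sum(axis=0)
    same = ids_a[:, None] == ids_b[None, :]
    total = 0.0
    for i in range(Za.shape[1]):
        if not same[i].any() or same[i].all():
            continue  # skip vectors lacking a positive or a negative in view b
        hardest_pos = d2[i, same[i]].max()     # same pedestrian, most dissimilar
        hardest_neg = d2[i, ~same[i]].min()    # different pedestrian, most similar
        total += max(hardest_pos - hardest_neg + c, 0.0)   # {z}_+ = max{z, 0}
    return total
```

The penalty is zero when every same-identity pair is closer than every different-identity pair by at least `c`, which corresponds to the case described above in which no misjudgment occurs.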
2. The cross-view pedestrian re-identification method based on discriminant dictionary learning, characterized in that the discrimination-promoting term of the coding coefficients in step 5) comprises:
$\Gamma(Z_a, Z_b, Z_{ta}, Z_{tb})$ in the overall model framework (1) is defined as:
3. The cross-view pedestrian re-identification method based on discriminant dictionary learning as claimed in claim 2, wherein the overall objective function of step 6) comprises:
where $M_a$ and $M_b$ denote the numbers of pedestrians under the two camera views, and $N_{al}$ and $N_{bl}$ denote the numbers of images of the $l$-th pedestrian under the two camera views, respectively.
4. The cross-view pedestrian re-identification method based on discriminant dictionary learning, characterized in that the variable solving of step 7) comprises:
the overall objective function (6) is not jointly convex in the variables $D$, $D_t$, $Z_a$, $Z_b$, $Z_{ta}$ and $Z_{tb}$, but it is convex in each variable when all the others are fixed, so the variables are optimized by an alternating iterative process; the solution for each variable is as follows:
to update the coding coefficient $Z_a$ (the variable $Z_b$ is updated in the same way as $Z_a$ and is not described again here), first assume that $D$, $D_t$, $Z_b$, $Z_{ta}$ and $Z_{tb}$ are fixed, which gives the following objective function:
this is a typical $\ell_{2,1}$ minimization problem, and the analytic solution for $Z_a$ can be expressed as:
$$Z_a = \left(4 D^{T} D + \alpha_3 \Lambda_1\right)^{-1}\left(4 D^{T} X_a + 2 D^{T} D_t Z_{ta}\right) \tag{8}$$
where $\Lambda_1$ is a diagonal matrix whose entries are given in the formula, and the accompanying symbol denotes the $j$-th column of $Z_i$;
then $Z_{ta}$ is updated by fixing $D$, $D_t$, $Z_a$, $Z_b$ and $Z_{tb}$ (the variable $Z_{tb}$ is updated in the same way as $Z_{ta}$ and is not described again here), which gives the following objective function:
for convenience of optimization, equation (9) is rewritten in vector form:
where the symbol in the formula denotes the visual feature of the $k$-th image of the $l$-th pedestrian under view $a$; to solve (10), a relaxation variable is introduced, and equation (10) can then be relaxed as:
after the coding coefficients $Z_a$ and $Z_{ta}$ have been updated, the dictionaries $D$ and $D_t$ can be updated alternately, with the following objective function:
to update $D$, an intermediate variable $C$ is introduced, and equation (14) becomes:
$C$ can be solved by:
this is a typical nuclear norm minimization problem, which can be solved by the singular value thresholding algorithm; to update $D_t$, a relaxation variable $H$ is introduced:
the closed-form solution for the relaxation variable $H$ can be expressed as:
$$H = \left(\alpha_2 D_t D_t^{T} + I_1\right)^{-1} D \tag{18}$$
where $I_1$ is an identity matrix; using the updated $C$ and $H$, $D$ can be optimized by solving:
this problem can be solved by Lagrange duality; finally, $D_t$ can be optimized by solving:
this problem can be solved in the same way as the problem in equation (19).
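The closed-form updates named in this claim can be sketched numerically as follows. Equations (8) and (18) are implemented literally as written above; the singular value thresholding step is the standard proximal operator of the nuclear norm, with a threshold `tau` that is a hypothetical parameter because the subproblem for $C$ is not reproduced in the text. The definition of $\Lambda_1$ is likewise not available here, so it is passed in as an argument. All function names are assumptions for this example.

```python
import numpy as np

def singular_value_threshold(A, tau):
    """Standard SVT: argmin_C  tau * ||C||_* + 0.5 * ||C - A||_F^2."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Shrink each singular value by tau and clip at zero, then recompose.
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def update_H(D, Dt, alpha2):
    """Equation (18): H = (alpha2 * Dt Dt^T + I)^{-1} D, computed via a linear solve."""
    identity = np.eye(D.shape[0])
    return np.linalg.solve(alpha2 * (Dt @ Dt.T) + identity, D)

def update_Za(D, Dt, Xa, Zta, Lambda1, alpha3):
    """Equation (8) as written: Za = (4 D^T D + alpha3 Lambda1)^{-1} (4 D^T Xa + 2 D^T Dt Zta).

    Lambda1 is supplied by the caller because its definition is not reproduced in the text.
    """
    lhs = 4.0 * (D.T @ D) + alpha3 * Lambda1
    rhs = 4.0 * (D.T @ Xa) + 2.0 * (D.T @ Dt @ Zta)
    return np.linalg.solve(lhs, rhs)
```

Using `np.linalg.solve` rather than forming the inverse explicitly is a numerical design choice of this sketch; it yields the same result as the inverse in (8) and (18) when the system matrix is well conditioned.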
5. The cross-view pedestrian re-identification method based on discriminant dictionary learning according to claim 4, characterized in that the pedestrian matching scheme of step 8) comprises:
in the test phase, with the learned dictionaries $D$ and $D_t$, the domain information and the pedestrian-specific information can be separated by solving:
where $Z_a$ and $Z_b$ denote the domain coding coefficient matrices under views $a$ and $b$, respectively, and $Z_{ta}$ and $Z_{tb}$ denote the coding coefficient matrices of the pedestrian-specific information under views $a$ and $b$, respectively. The problem can be solved by alternating iteration, which stops once the stopping conditions in the formula are met. Taking the coding coefficient vectors of individual pedestrians from the resulting matrices, the similarity between pedestrians can be measured by calculating the following distance:
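The distance formula itself is not reproduced in the text, but step 8) of claim 1 specifies a Euclidean distance, so the matching step can be sketched as below. The function name `match_by_euclidean` and the convention of one coding vector per column are assumptions for this example.

```python
import numpy as np

def match_by_euclidean(Z_probe, Z_gallery):
    """Rank gallery coding vectors for each probe coding vector by Euclidean distance.

    Both inputs hold one coding vector per column. Returns (ranking, dist), where
    ranking[i] lists gallery column indices from nearest to farthest for probe i.
    """
    Z_probe = np.asarray(Z_probe, dtype=float)
    Z_gallery = np.asarray(Z_gallery, dtype=float)
    diff = Z_probe[:, :, None] - Z_gallery[:, None, :]
    dist = np.sqrt((diff ** 2).sum(axis=0))    # shape: (num_probe, num_gallery)
    return np.argsort(dist, axis=1), dist
```

The first column of `ranking` gives the rank-1 match for each probe, and the full `dist` matrix can be fed to a CMC evaluation.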
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910966029.8A CN110826417B (en) | 2019-10-12 | 2019-10-12 | Cross-view pedestrian re-identification method based on discriminant dictionary learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110826417A CN110826417A (en) | 2020-02-21 |
CN110826417B CN110826417B (en) | 2022-08-16 |
Family
ID=69548968
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910966029.8A Active CN110826417B (en) | 2019-10-12 | 2019-10-12 | Cross-view pedestrian re-identification method based on discriminant dictionary learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110826417B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111783521B (en) * | 2020-05-19 | 2022-06-07 | 昆明理工大学 | Pedestrian re-identification method based on low-rank prior guidance and based on domain invariant information separation |
CN111783526B (en) * | 2020-05-21 | 2022-08-05 | 昆明理工大学 | Cross-domain pedestrian re-identification method using posture invariance and graph structure alignment |
CN113554569B (en) * | 2021-08-04 | 2022-03-08 | 哈尔滨工业大学 | Face image restoration system based on double memory dictionaries |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2001202516A (en) * | 2000-01-19 | 2001-07-27 | Victor Co Of Japan Ltd | Device for individual identification |
CN103729462A (en) * | 2014-01-13 | 2014-04-16 | 武汉大学 | Pedestrian search method for processing shield on the basis of sparse representation |
CN104298992A (en) * | 2014-10-14 | 2015-01-21 | 武汉大学 | Self-adaptive scale pedestrian re-identification method based on data driving |
CN104778446A (en) * | 2015-03-19 | 2015-07-15 | 南京邮电大学 | Method for constructing image quality evaluation and face recognition efficiency relation model |
CN107194378A (en) * | 2017-06-28 | 2017-09-22 | 深圳大学 | A kind of face identification method and device based on mixing dictionary learning |
CN107679461A (en) * | 2017-09-12 | 2018-02-09 | 国家新闻出版广电总局广播科学研究院 | Pedestrian's recognition methods again based on antithesis integration analysis dictionary learning |
CN108509925A (en) * | 2018-04-08 | 2018-09-07 | 东北大学 | A kind of pedestrian's recognition methods again of view-based access control model bag of words |
CN109214442A (en) * | 2018-08-24 | 2019-01-15 | 昆明理工大学 | A kind of pedestrian's weight recognizer constrained based on list and identity coherence |
CN109284668A (en) * | 2018-07-27 | 2019-01-29 | 昆明理工大学 | A kind of pedestrian's weight recognizer based on apart from regularization projection and dictionary learning |
CN109409201A (en) * | 2018-09-05 | 2019-03-01 | 昆明理工大学 | A kind of pedestrian's recognition methods again based on shared and peculiar dictionary to combination learning |
CN109447123A (en) * | 2018-09-28 | 2019-03-08 | 昆明理工大学 | A kind of pedestrian's recognition methods again constrained based on tag compliance with stretching regularization dictionary learning |
CN109766748A (en) * | 2018-11-27 | 2019-05-17 | 昆明理工大学 | A kind of pedestrian based on projective transformation and dictionary learning knows method for distinguishing again |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7796121B2 (en) * | 2005-04-28 | 2010-09-14 | Research In Motion Limited | Handheld electronic device with reduced keyboard and associated method of providing improved disambiguation with reduced degradation of device performance |
Non-Patent Citations (4)
Title |
---|
A novel dictionary learning approach for multi-modality medical image fusion;Zhu Z;《Neurocomputing》;20161231;第471-482页 * |
基于字典学习和Fisher判别稀疏表示的行人重识别方法;张见威等;《华南理工大学学报(自然科学版)》;20170715(第07期);第55-62页 * |
基于核协同表示的步态识别;李占利等;《广西大学学报(自然科学版)》;20170425(第02期);第705-711页 * |
融合底层和中层字典特征的行人重识别;王丽;《中国光学》;20161015(第05期);第540-546页 * |
Also Published As
Publication number | Publication date |
---|---|
CN110826417A (en) | 2020-02-21 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||