CN110826417B - Cross-view pedestrian re-identification method based on discriminant dictionary learning - Google Patents

Cross-view pedestrian re-identification method based on discriminant dictionary learning Download PDF

Info

Publication number
CN110826417B
Authority
CN
China
Prior art keywords
pedestrian
domain
dictionary
view
learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910966029.8A
Other languages
Chinese (zh)
Other versions
CN110826417A (en)
Inventor
谢明鸿
颜悦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kunming University of Science and Technology
Original Assignee
Kunming University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kunming University of Science and Technology filed Critical Kunming University of Science and Technology
Priority to CN201910966029.8A priority Critical patent/CN110826417B/en
Publication of CN110826417A publication Critical patent/CN110826417A/en
Application granted granted Critical
Publication of CN110826417B publication Critical patent/CN110826417B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217 Validation; Performance evaluation; Active pattern learning techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to a cross-view pedestrian re-identification method based on discriminative dictionary learning, and belongs to the technical field of digital image processing. First, based on the fact that pedestrian images from the same camera view share the same domain, the pedestrian features of different views are divided into a view-specific domain information component and a domain-invariant pedestrian appearance feature component, and a discriminative dictionary learning algorithm is designed to create a domain-general dictionary that describes the domain information component and a domain-invariant dictionary that describes the domain-invariant component, while forcing the coding coefficients of pedestrians under the same view to have strong similarity. Then, a stretching regularization term is proposed to force the coding coefficients of different pedestrians to keep a certain distance apart while keeping the coding coefficients of the same pedestrian as close as possible. Finally, a pedestrian matching scheme using the Euclidean distance is designed on the model that contains only the pedestrian feature information. The pedestrian re-identification method provided by the invention separates the domain information in the image to overcome the domain shift between different views, and produces a good recognition effect.

Description

Cross-view pedestrian re-identification method based on discriminant dictionary learning
Technical Field
The invention relates to a cross-view pedestrian re-identification method based on discriminative dictionary learning, and belongs to the technical field of digital image processing.
Background
Pedestrian re-identification is a technique that uses computer vision to determine whether a target pedestrian is present in images or video sequences taken by different cameras. In recent years, pedestrian re-identification has attracted increasing attention from researchers because of its wide application in pedestrian search, pedestrian tracking, and pedestrian behavior analysis, and many pedestrian re-identification methods have been proposed. Although computer vision researchers have made great efforts to improve the performance of pedestrian re-identification systems, the technique still faces significant challenges because the appearance of a pedestrian is often highly ambiguous across camera views.
Disclosure of Invention
The invention aims to provide a cross-view pedestrian re-identification method based on discriminative dictionary learning, which is used to solve the problem of domain shift in pedestrian re-identification in the prior art.
The technical scheme of the invention is as follows: a cross-view pedestrian re-identification method based on discriminative dictionary learning comprises the following steps:
1) determining an overall model framework of cross-view pedestrian re-identification based on discriminative dictionary learning;
2) dividing the pedestrian image features of different views into a view-specific domain information component and a domain-invariant pedestrian appearance feature component, and designing a discriminative dictionary learning algorithm to create a domain-general dictionary that describes the domain information component and a domain-invariant dictionary that describes the domain-invariant component;
3) training the discrimination-promoting term of the dictionaries;
4) proposing a stretching regularization term that forces the coding coefficients of different pedestrians to keep a certain distance apart while keeping the coding coefficients of the same pedestrian as close as possible;
5) training the discrimination-promoting term of the coding coefficients, forcing the coding coefficients of pedestrian images from the same view to have strong similarity;
6) determining the overall objective function of cross-view pedestrian re-identification based on discriminative dictionary learning;
7) solving the variables to be updated in the overall objective function;
8) designing a pedestrian matching scheme using the Euclidean distance, based on the model that retains only the domain-invariant pedestrian appearance features.
Specifically, the overall model framework of step 1) comprises:
X_a and X_b (whose exact definitions are given as an equation image in the original) denote the training sample sets under the two camera views. Robust feature representation learning and discriminative metric learning need to be integrated into a single framework, and the overall model framework is shown in formula (1):

[formula (1), given as an equation image in the original]

In the formula, D denotes the domain dictionary of the pedestrian images under all cameras, and D_t denotes the domain-specific dictionary used to encode the pedestrian appearance features after the domain information has been separated. Z_a, Z_b are the coding coefficients of the domain information of X_a and X_b over the dictionary D, and Z_ta, Z_tb are the coding coefficients, over the dictionary D_t, of the pedestrian-specific information. Φ(D, D_t, Z_a, Z_b, Z_ta, Z_tb) is the data fidelity term; minimizing it allows the dictionaries D and D_t to acquire representation ability. Ψ(D, D_t) is the discrimination-promoting term of the dictionaries, and Γ(Z_a, Z_b, Z_ta, Z_tb) is the discrimination-promoting term of the coding coefficients; these two terms are minimized so that the dictionaries and the coding coefficients have strong discriminative ability. The remaining symbols in formula (1) denote the i-th row of D and the j-th column of D_t, respectively (given as equation images in the original).
Specifically, the discriminative dictionary algorithm of step 2) includes:
To mitigate the domain shift between different camera views, the domain information is separated from the pedestrian image features, and the data fidelity term Φ(D, D_t, Z_a, Z_b, Z_ta, Z_tb) is expressed as:

[formula (2), given as an equation image in the original]

In the formula, the first group of terms (given as an equation image) establishes the domain information of the a and b camera views, and the second group of terms (given as an equation image) separates the domain information from the pedestrian appearance features that are unaffected by the domain.
Specifically, the dictionary discrimination-promoting term of step 3) includes:
The dictionary D is used to represent the domain information of the different camera views. Since images from the same camera share the same domain features, these images are expected to be linearly related to one another in terms of their domain features. To separate the domain information from the samples X_a and X_b, the proposed dictionary discrimination-promoting term is:

[formula (3), given as an equation image in the original]

In the formula, ||D||_* is the nuclear norm of the dictionary D. Because the domain information component and the true pedestrian appearance features have different spatial morphological characteristics, a structural incoherence regularization term (given as an equation image in the original) is introduced to promote mutual independence between the domain dictionary D and the pedestrian feature dictionary D_t. α_1 and α_2 are two scalar parameters that weight the ||D||_* term and the structural incoherence term, respectively.
Specifically, the stretching regularization term of step 4) includes:
The same pedestrian observed from different camera views is expected to have the same coding coefficients over the domain-specific dictionary D_t, while the algorithm should keep the distance between the coding coefficients of different pedestrians from different camera views larger than a constant. To meet this requirement, the following function is proposed for view a; a similar function is constructed for view b in the same way and is not repeated here:

[formula (4), given as an equation image in the original]

In the formula, {z}_+ = max{z, 0} and c is an arbitrary constant. The three image symbols (given as equation images in the original) denote, respectively: the k-th image of the l-th pedestrian under camera view a; the k*-th image of the same l-th pedestrian under view b whose coding coefficient is the most dissimilar to that of the k-th image, where k* ≠ k; and the k*-th image of a different pedestrian l* under view b that is the most similar to the k-th image of the l-th pedestrian, where l* ≠ l. The distance between the coding coefficients of the first pair (same pedestrian) does not cause misjudgment of the pedestrian identity, whereas the distance between the coding coefficients of the second pair (different pedestrians) means that pedestrian matching based on these coding coefficients would cause misrecognition. In this case, minimizing formula (4) promotes a margin between these two distances.
Specifically, the coding coefficient discrimination promoting term in step 5) includes:
For the coding coefficient matrices Z_a and Z_b of the a and b view domains, the same domain should have the same sparse representation. Based on this consideration, Γ(Z_a, Z_b, Z_ta, Z_tb) in the overall model framework (1) is defined as:

[formula (5), given as an equation image in the original]

In the formula, ||Z||_{2,1} denotes the l_{2,1} norm (its definition is given as an equation image in the original). Minimizing ||Z||_{2,1} makes the entries in each row of Z the same; this term causes the same atoms to be selected from D to represent the original features of the same domain, and makes the coding coefficients of these features share the same sparse representation over D. α_3, α_4, α_5 are three scalar parameters that weight the ||Z_a||_{2,1} + ||Z_b||_{2,1} term, the ||Z_ta||_1 + ||Z_tb||_1 term, and the remaining term (given as an equation image), respectively.
Specifically, the overall objective function of step 6) includes:
[formula (6), given as an equation image in the original]

In the formula, M_a and M_b denote the numbers of pedestrians under the two camera views, and N_al and N_bl denote the numbers of images of the l-th pedestrian under the two camera views, respectively.
Specifically, the variable solving of step 7) includes:
The variables D, D_t, Z_a, Z_b, Z_ta, Z_tb to be solved in the overall objective function (6) are not jointly convex, but the function is convex with respect to each variable when all the other variables are fixed. They can therefore be optimized by an alternating iterative process; the solution for each variable is as follows:
To update the coding coefficient Z_a (the variable Z_b is updated in the same way as Z_a and is not repeated here), first assume that D, D_t, Z_b, Z_ta, Z_tb are all fixed, which gives the following objective function:

[formula (7), given as an equation image in the original]

This is a typical l_{2,1} minimization problem, and the analytic solution of Z_a can be expressed as:

Z_a = (4D^T D + α_3 Λ_1)^{-1} (4D^T X_a + 2D^T D_t Z_ta)    (8)

In the formula, Λ_1 is a diagonal matrix formed from quantities given by an equation image in the original, in which the symbol denotes the j-th column of Z_i.
Then, Z_ta is updated by fixing D, D_t, Z_a, Z_b, Z_tb (the variable Z_tb is updated in the same way as Z_ta and is not repeated here), which gives the following objective function:

[formula (9), given as an equation image in the original]

For convenience of optimization, formula (9) is rewritten in vector form:

[formula (10), given as an equation image in the original]

In the formula, the feature symbol (given as an equation image) is the visual feature of the k-th image of the l-th pedestrian under view a. To solve (10), a relaxation variable (given as an equation image) is introduced, and formula (10) can then be relaxed as:

[formula (11), given as an equation image in the original]

The variables can then be updated by solving the subproblems given as equation images in the original. The above problem can be solved by an iterative shrinkage algorithm, with the coefficient vector updated by the rule given as an equation image in the original, where h denotes the h-th iteration. Using the updated coefficient vectors, Z_ta can be reconstructed as indicated by the equation image in the original.
After the coding coefficients Z_a and Z_ta have been updated, the dictionaries D and D_t can be updated alternately, with the following objective function:

[formula (14), given as an equation image in the original]

To update D, an intermediate variable C is introduced, and formula (14) becomes the relaxed problem given as an equation image in the original. C can then be solved from the subproblem given as an equation image in the original. This is a typical nuclear norm minimization problem that can be solved by the singular value thresholding algorithm. To update D_t, a relaxation variable H is introduced (the relaxed problem is given as an equation image in the original), and the closed-form solution for the relaxation variable H can be expressed as:

H = (α_2 D_t D_t^T + I_1)^{-1} D    (18)

where I_1 is an identity matrix. Using the updated C and H, D can be optimized by solving formula (19), given as an equation image in the original; this problem can be solved by the Lagrange dual. Finally, D_t is optimized by solving the corresponding problem, given as an equation image in the original, which is solved in the same way as the problem in formula (19).
Specifically, the pedestrian matching scheme of step 8) includes:
During testing, with the learned dictionaries D and D_t, the separation of the domain information from the pedestrian-specific information is achieved by solving the coding problem given as an equation image in the original. In the formula, Z_a, Z_b denote the domain coding coefficient matrices under views a and b, respectively, and Z_ta, Z_tb denote the coding coefficient matrices of the pedestrian-specific information under views a and b, respectively. This problem is solved by an alternating iteration method, and the iteration stops when the convergence conditions given as equation images in the original are satisfied. Let the two coding coefficient vectors defined by the equation images in the original be the coding coefficient vectors of the two pedestrians to be compared; the similarity between pedestrians is then measured by calculating the distance given as an equation image in the original (the Euclidean distance between the two coding coefficient vectors).
the invention has the beneficial effects that:
1. In current pedestrian re-identification methods, most studies assume that the pedestrian images to be identified have no domain difference between the two views; as a result, image information is lost and false information is introduced into the result, which degrades the recognition of the pedestrian images. The pedestrian re-identification method provided by the invention separates the domain information from the pedestrian image, avoids the propagation of false information, reduces time consumption, and improves the discriminative ability for pedestrians.
2. Compared with other methods, the pedestrian re-identification method provided by the invention significantly improves recognition performance.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 shows pedestrian image pairs from the two camera views of the PRID2011 dataset, provided by an embodiment of the present invention;
FIG. 3 is the CMC curve for the parameter α_1 of the algorithm on the PRID2011 dataset, provided by an embodiment of the present invention;
FIG. 4 is the CMC curve for the parameter α_2 of the algorithm on the PRID2011 dataset, provided by an embodiment of the present invention;
FIG. 5 is the CMC curve for the parameter α_3 of the algorithm on the PRID2011 dataset, provided by an embodiment of the present invention;
FIG. 6 is the CMC curve for the parameter α_4 of the algorithm on the PRID2011 dataset, provided by an embodiment of the present invention;
FIG. 7 is the CMC curve for the parameter α_5 of the algorithm on the PRID2011 dataset, provided by an embodiment of the present invention.
Detailed Description
The invention is further described with reference to the following drawings and detailed description.
Example 1: domain shift between pedestrian images from different camera perspectives is one of the major factors contributing to pedestrian appearance ambiguity. In addition, domain information in the same camera view is stable for a certain time, and all images in the same view share the same domain information. If the domain information can be separated from the pedestrian image, the remaining information will not be interfered by the domain information, and domain shift will not occur between pedestrian images from different camera perspectives. Based on the thought, the invention provides a novel domain invariant dictionary learning method which is used for cross-view pedestrian re-identification. In this approach, it is assumed that images from the same camera perspective share the same domain. In order to achieve a domain-invariant visual feature, pedestrian features at different viewing angles are divided into two components, one of which is a domain-specific component and the other of which is a domain-invariant feature component.
As shown in FIG. 1, a cross-view pedestrian re-identification method based on discriminative dictionary learning includes the following steps:
1) determining an overall model framework of cross-view pedestrian re-identification based on discriminative dictionary learning;
2) dividing the pedestrian image features of different views into a view-specific domain information component and a domain-invariant pedestrian appearance feature component, and designing a discriminative dictionary learning algorithm to create a domain-general dictionary that describes the domain information component and a domain-invariant dictionary that describes the domain-invariant component;
3) training the discrimination-promoting term of the dictionaries;
4) proposing a stretching regularization term that forces the coding coefficients of different pedestrians to keep a certain distance apart while keeping the coding coefficients of the same pedestrian as close as possible;
5) training the discrimination-promoting term of the coding coefficients, forcing the coding coefficients of pedestrian images from the same view to have strong similarity;
6) determining the overall objective function of cross-view pedestrian re-identification based on discriminative dictionary learning;
7) solving the variables to be updated in the overall objective function;
8) designing a pedestrian matching scheme using the Euclidean distance, based on the model that retains only the domain-invariant pedestrian appearance features.
The specific implementation process is as follows. First, based on the fact that pedestrian images from the same camera view share the same domain, the pedestrian features of different views are divided into a view-specific domain information component and a domain-invariant pedestrian appearance feature component, and a discriminative dictionary learning algorithm is designed to create a domain-general dictionary that describes the domain information component and a domain-invariant dictionary that describes the domain-invariant component, while forcing the coding coefficients of pedestrians under the same view to have strong similarity. Then, to overcome the appearance ambiguity, a stretching regularization term is proposed to force the coding coefficients of different pedestrians to keep a certain distance apart while keeping the coding coefficients of the same pedestrian as close as possible. Finally, a pedestrian matching scheme using the Euclidean distance is designed on the model that contains only the pedestrian feature information.
Further, the overall model framework of step 1) comprises:
X_a and X_b (whose exact definitions are given as an equation image in the original) denote the training sample sets under the two camera views. In this case, robust feature representation learning and discriminative metric learning need to be integrated into a single framework, and the overall model framework is shown in formula (1):

[formula (1), given as an equation image in the original]

In the formula, D denotes the domain dictionary of the pedestrian images under all cameras, and D_t denotes the domain-specific dictionary used to encode the pedestrian appearance features after the domain information has been separated. Z_a, Z_b are the coding coefficients of the domain information of X_a and X_b over the dictionary D, and Z_ta, Z_tb are the coding coefficients, over the dictionary D_t, of the pedestrian-specific information. Φ(D, D_t, Z_a, Z_b, Z_ta, Z_tb) is the data fidelity term; minimizing it allows the dictionaries D and D_t to acquire representation ability. Ψ(D, D_t) is the discrimination-promoting term of the dictionaries, and Γ(Z_a, Z_b, Z_ta, Z_tb) is the discrimination-promoting term of the coding coefficients; these two terms are minimized so that the dictionaries and the coding coefficients have strong discriminative ability. The remaining symbols in formula (1) denote the i-th row of D and the j-th column of D_t, respectively (given as equation images in the original).
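For illustration, the following is a minimal sketch (in Python/NumPy) of the variables assumed by framework (1). The feature dimension m and the sample counts are placeholders chosen only for the sketch; the dictionary sizes are the values used later in the embodiment, and whether the unit-norm constraint applies to rows or columns of D follows the equation image in the original.

import numpy as np

# Assumed shapes (illustrative only): m-dimensional features,
# N_a / N_b training images under camera views a and b.
m, N_a, N_b = 1000, 100, 100
d, d_t = 50, 760                 # dictionary sizes used in the embodiment

X_a = np.random.randn(m, N_a)    # training features, view a
X_b = np.random.randn(m, N_b)    # training features, view b

def init_dictionary(m, k):
    # Random dictionary with unit-norm atoms (taken here as columns).
    D = np.random.randn(m, k)
    return D / np.linalg.norm(D, axis=0, keepdims=True)

D   = init_dictionary(m, d)      # domain dictionary shared by all views
D_t = init_dictionary(m, d_t)    # dictionary for domain-invariant pedestrian appearance

# Coding coefficients to be learned by the alternating optimization
Z_a,  Z_b  = np.zeros((d,  N_a)), np.zeros((d,  N_b))
Z_ta, Z_tb = np.zeros((d_t, N_a)), np.zeros((d_t, N_b))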
Further, the discriminant dictionary algorithm in step 2) includes:
To mitigate the domain shift between different camera views, the domain information is separated from the pedestrian image features, and the data fidelity term Φ(D, D_t, Z_a, Z_b, Z_ta, Z_tb) is expressed as:

[formula (2), given as an equation image in the original]

In the formula, the first group of terms (given as an equation image) establishes the domain information of the a and b camera views, and the second group of terms (given as an equation image) separates the domain information from the pedestrian appearance features that are unaffected by the domain.
Further, the dictionary discrimination-promoting term of step 3) includes:
The dictionary D is used to represent the domain information of the different camera views. Since images from the same camera share the same domain features, these images are expected to be linearly related to one another in terms of their domain features. To separate the domain information from the samples X_a and X_b, the proposed dictionary discrimination-promoting term is:

[formula (3), given as an equation image in the original]

In the formula, ||D||_* is the nuclear norm of the dictionary D. Because the domain information component and the true pedestrian appearance features have different spatial morphological characteristics, a structural incoherence regularization term (given as an equation image in the original) is introduced to promote mutual independence between the domain dictionary D and the pedestrian feature dictionary D_t. α_1 and α_2 are two scalar parameters that weight the ||D||_* term and the structural incoherence term, respectively.
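A minimal sketch of how the two quantities in this dictionary term can be evaluated. The ||D_t^T D||_F^2 form of the structural incoherence term is an assumption for illustration; the patent only states that the term promotes independence between D and D_t, and the exact expression is given as an equation image.

import numpy as np

def nuclear_norm(D):
    # ||D||_*: sum of the singular values of the domain dictionary.
    return np.linalg.svd(D, compute_uv=False).sum()

def structural_incoherence(D, D_t):
    # Assumed incoherence penalty ||D_t^T D||_F^2, encouraging the two
    # dictionaries to span mutually independent subspaces.
    return np.linalg.norm(D_t.T @ D, 'fro') ** 2

def psi(D, D_t, alpha_1, alpha_2):
    # Dictionary discrimination-promoting term in the spirit of formula (3),
    # up to the exact weighting used in the original equation image.
    return alpha_1 * nuclear_norm(D) + alpha_2 * structural_incoherence(D, D_t)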
Further, the stretching regularization term of step 4) includes:
The same pedestrian observed from different camera views is expected to have the same coding coefficients over the domain-specific dictionary D_t, while the algorithm should keep the distance between the coding coefficients of different pedestrians from different camera views larger than a constant. To meet this requirement, the following function is proposed for view a; a similar function is constructed for view b in the same way and is not repeated here:

[formula (4), given as an equation image in the original]

In the formula, {z}_+ = max{z, 0} and c is an arbitrary constant. The three image symbols (given as equation images in the original) denote, respectively: the k-th image of the l-th pedestrian under camera view a; the k*-th image of the same l-th pedestrian under view b whose coding coefficient is the most dissimilar to that of the k-th image, where k* ≠ k; and the k*-th image of a different pedestrian l* under view b that is the most similar to the k-th image of the l-th pedestrian under view a, where l* ≠ l. The distance between the coding coefficients of the first pair (same pedestrian) does not cause misjudgment of the pedestrian identity, whereas the distance between the coding coefficients of the second pair (different pedestrians) means that pedestrian matching based on these coding coefficients would cause misrecognition. In this case, minimizing formula (4) promotes a margin between these two distances.
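A sketch of the stretching term for view a, assuming it takes the common margin form {c + d(hardest same-pedestrian pair) - d(hardest different-pedestrian pair)}_+; the exact expression is given only as an equation image in the original, so the function below is illustrative rather than the patented formula.

import numpy as np

def stretching_term_view_a(Z_ta, Z_tb, labels_a, labels_b, c=1.0):
    # Assumed hinge-style term: for each coding vector under view a, take the
    # most dissimilar same-pedestrian code and the most similar
    # different-pedestrian code under view b, and penalize margin violations.
    loss = 0.0
    for k in range(Z_ta.shape[1]):
        z = Z_ta[:, k]
        same = labels_b == labels_a[k]
        if not same.any() or same.all():
            continue                     # need both positives and negatives
        d_same = np.linalg.norm(Z_tb[:, same] - z[:, None], axis=0)
        d_diff = np.linalg.norm(Z_tb[:, ~same] - z[:, None], axis=0)
        # hardest positive (most dissimilar same pedestrian) vs.
        # hardest negative (most similar different pedestrian)
        loss += max(0.0, c + d_same.max() - d_diff.min())
    return loss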
Further, the coding coefficient discrimination-promoting term of step 5) includes:
For the coding coefficient matrices Z_a and Z_b of the a and b view domains, the same domain should have the same sparse representation. Based on this consideration, Γ(Z_a, Z_b, Z_ta, Z_tb) in the overall model framework (1) is defined as:

[formula (5), given as an equation image in the original]

In the formula, ||Z||_{2,1} denotes the l_{2,1} norm (its definition is given as an equation image in the original). Minimizing ||Z||_{2,1} makes the entries in each row of Z the same; this term causes the same atoms to be selected from D to represent the original features of the same domain, and makes the coding coefficients of these features share the same sparse representation over D. α_3, α_4, α_5 are three scalar parameters that weight the ||Z_a||_{2,1} + ||Z_b||_{2,1} term, the ||Z_ta||_1 + ||Z_tb||_1 term, and the remaining term (given as an equation image), respectively.
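A minimal sketch of the l_{2,1} norm used in formula (5) (row-wise l_2 norms summed, which is what drives whole rows of Z toward zero so that images of the same domain select the same atoms of D). The assembly of Γ below assumes that the third weighted term is the stretching term of formula (4); the exact combination is given as an equation image in the original.

import numpy as np

def l21_norm(Z):
    # ||Z||_{2,1}: sum of the l2 norms of the rows of Z. Minimizing it
    # encourages row sparsity, i.e. the same dictionary atoms are selected
    # for all columns (images of the same domain).
    return np.linalg.norm(Z, axis=1).sum()

def gamma(Z_a, Z_b, Z_ta, Z_tb, alpha_3, alpha_4, alpha_5, stretch):
    # Coding-coefficient discrimination-promoting term in the spirit of (5);
    # `stretch` stands for the stretching term of formula (4).
    return (alpha_3 * (l21_norm(Z_a) + l21_norm(Z_b))
            + alpha_4 * (np.abs(Z_ta).sum() + np.abs(Z_tb).sum())
            + alpha_5 * stretch)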
Further, the overall objective function of step 6) includes:
[formula (6), given as an equation image in the original]

In the formula, M_a and M_b denote the numbers of pedestrians under the two camera views, and N_al and N_bl denote the numbers of images of the l-th pedestrian under the two camera views, respectively.
Further, the variable solving of step 7) includes:
The variables D, D_t, Z_a, Z_b, Z_ta, Z_tb to be solved in the overall objective function (6) are not jointly convex, but the function is convex with respect to each variable when all the other variables are fixed. They can therefore be optimized by an alternating iterative process; the solution for each variable is as follows:
To update the coding coefficient Z_a (the variable Z_b is updated in the same way as Z_a and is not repeated here), first assume that D, D_t, Z_b, Z_ta, Z_tb are all fixed, which gives the following objective function:

[formula (7), given as an equation image in the original]

This is a typical l_{2,1} minimization problem, and the analytic solution of Z_a can be expressed as:

Z_a = (4D^T D + α_3 Λ_1)^{-1} (4D^T X_a + 2D^T D_t Z_ta)    (8)

In the formula, Λ_1 is a diagonal matrix formed from quantities given by an equation image in the original, in which the symbol denotes the j-th column of Z_i.
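A minimal sketch of the closed-form update (8). The construction of Λ_1 from the previous iterate is the standard reweighting used for l_{2,1} problems and is an assumption here, since its exact entries appear only as an equation image in the original.

import numpy as np

def update_Z_a(D, D_t, X_a, Z_a_prev, Z_ta, alpha_3, eps=1e-8):
    # Closed-form update of formula (8):
    # Z_a = (4 D^T D + alpha_3 * Lambda_1)^{-1} (4 D^T X_a + 2 D^T D_t Z_ta)
    row_norms = np.linalg.norm(Z_a_prev, axis=1) + eps
    Lambda_1 = np.diag(1.0 / (2.0 * row_norms))     # assumed l_{2,1} reweighting
    A = 4.0 * D.T @ D + alpha_3 * Lambda_1
    B = 4.0 * D.T @ X_a + 2.0 * D.T @ D_t @ Z_ta
    return np.linalg.solve(A, B)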
Then, Z_ta is updated by fixing D, D_t, Z_a, Z_b, Z_tb (the variable Z_tb is updated in the same way as Z_ta and is not repeated here), which gives the following objective function:

[formula (9), given as an equation image in the original]

For convenience of optimization, formula (9) is rewritten in vector form:

[formula (10), given as an equation image in the original]

In the formula, the feature symbol (given as an equation image) is the visual feature of the k-th image of the l-th pedestrian under view a. To solve (10), a relaxation variable (given as an equation image) is introduced, and formula (10) can then be relaxed as:

[formula (11), given as an equation image in the original]

The variables can then be updated by solving the subproblems given as equation images in the original. The above problem can be solved by an iterative shrinkage algorithm, with the coefficient vector updated by the rule given as an equation image in the original, where h denotes the h-th iteration. Using the updated coefficient vectors, Z_ta can be reconstructed as indicated by the equation image in the original.
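A generic iterative shrinkage (ISTA) sketch for the per-image coefficient update. The data term, step size, and threshold are assumptions, since formulas (9)-(13) appear only as equation images in the original; the columns obtained this way are stacked to rebuild Z_ta, as stated above.

import numpy as np

def soft_threshold(v, tau):
    # Element-wise shrinkage operator used by iterative shrinkage algorithms.
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista_update(D_t, residual, z0, alpha_4, n_iter=50):
    # Assumed ISTA iteration for one coding vector z of Z_ta, minimizing
    # ||residual - D_t z||_2^2 + alpha_4 ||z||_1 (illustrative form).
    # `residual` stands for the part of the image feature not explained
    # by the domain dictionary.
    L = np.linalg.norm(D_t, 2) ** 2        # Lipschitz constant of the gradient
    z = z0.copy()
    for _ in range(n_iter):                # h-th iteration of the update rule
        grad = D_t.T @ (D_t @ z - residual)
        z = soft_threshold(z - grad / L, alpha_4 / (2 * L))
    return z

# Z_ta is rebuilt by stacking the updated vectors column by column, e.g.
# Z_ta = np.column_stack([ista_update(D_t, r_k, z_k, alpha_4) for r_k, z_k in pairs])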
After the coding coefficients Z_a and Z_ta have been updated, the dictionaries D and D_t can be updated alternately, with the following objective function:

[formula (14), given as an equation image in the original]

To update D, an intermediate variable C is introduced, and formula (14) becomes the relaxed problem given as an equation image in the original. C can then be solved from the subproblem given as an equation image in the original. This is a typical nuclear norm minimization problem that can be solved by the singular value thresholding algorithm. To update D_t, a relaxation variable H is introduced (the relaxed problem is given as an equation image in the original), and the closed-form solution for the relaxation variable H can be expressed as:

H = (α_2 D_t D_t^T + I_1)^{-1} D    (18)

where I_1 is an identity matrix. Using the updated C and H, D can be optimized by solving formula (19), given as an equation image in the original; this problem can be solved by the Lagrange dual. Finally, D_t is optimized by solving the corresponding problem, given as an equation image in the original, which is solved in the same way as the problem in formula (19).
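A sketch of the two auxiliary updates that have explicit forms: singular value thresholding for the nuclear-norm subproblem in C, and the closed-form solution (18) for the relaxation variable H. The threshold tau and the subsequent Lagrange-dual updates of D and D_t are not spelled out in the text, so they are left as assumptions or omitted.

import numpy as np

def svt(M, tau):
    # Singular value thresholding: proximal operator of the nuclear norm,
    # used to solve the C-subproblem.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def update_H(D, D_t, alpha_2):
    # Closed-form solution of formula (18): H = (alpha_2 D_t D_t^T + I_1)^{-1} D.
    I_1 = np.eye(D_t.shape[0])
    return np.linalg.solve(alpha_2 * D_t @ D_t.T + I_1, D)

# The dictionaries D and D_t are then refreshed with the updated C and H,
# e.g. via the Lagrange dual as stated above (details given only in the original images).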
Further, the pedestrian matching scheme of step 8) includes:
During testing, with the learned dictionaries D and D_t, the separation of the domain information from the pedestrian-specific information is achieved by solving the coding problem given as an equation image in the original. In the formula, Z_a, Z_b denote the domain coding coefficient matrices under views a and b, respectively, and Z_ta, Z_tb denote the coding coefficient matrices of the pedestrian-specific information under views a and b, respectively. This problem is solved by an alternating iteration method, and the iteration stops when the convergence conditions given as equation images in the original are satisfied. Let the two coding coefficient vectors defined by the equation images in the original be the coding coefficient vectors of the two pedestrians to be compared; the similarity between pedestrians is then measured by calculating the distance given as an equation image in the original (the Euclidean distance between the two coding coefficient vectors).
in the step 3), since images from the same camera view have domain similarity, dictionaries used for representing domain components are refined by low-rank terms, and meanwhile structural incoherent regular terms are introduced to enable a domain dictionary D and a pedestrian feature dictionary D to be promoted t The two judgment promoting terms aiming at the dictionary are added, so that the dictionary has stronger judgment capability.
In the steps 4) and 5), two discrimination promoting items aiming at the coding coefficient are added, so that the coding coefficient has stronger discrimination capability, and meanwhile, the coding coefficient Z is updated ta ,Z tb In this case, a gradient descent method is used.
In the step 8), a pedestrian matching scheme is designed by adopting an Euclidean distance based on the model with only unchanged pedestrian appearance characteristics of the domain, so that adverse effects on the recognition result caused by domain deviation are avoided.
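A minimal sketch of the matching in step 8): once the test images have been coded over D_t so that only domain-invariant appearance information remains, gallery identities are ranked by Euclidean distance between coding vectors. The function name and the ranking output format are illustrative.

import numpy as np

def match_pedestrians(Z_ta_probe, Z_tb_gallery):
    # For every probe coding vector (view a), rank the gallery coding vectors
    # (view b) by Euclidean distance; the closest gallery entry comes first.
    rankings = []
    for i in range(Z_ta_probe.shape[1]):
        d = np.linalg.norm(Z_tb_gallery - Z_ta_probe[:, [i]], axis=0)
        rankings.append(np.argsort(d))
    return np.asarray(rankings)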
The invention is further illustrated below with reference to specific experimental data.
In the experiments, each dataset was randomly divided into two non-overlapping parts, one used as training samples and the other as test samples. Cumulative matching characteristic (CMC) curves are used to quantitatively evaluate the recognition performance. There are seven parameters in the model, namely the sizes d and d_t of the dictionaries D and D_t, and five scalar parameters α_1, α_2, α_3, α_4 and α_5. Throughout the experiments, these parameters were set to d = 50, d_t = 760, α_1 = 1, α_2 = 0.01, α_3 = 28, α_4 = 1 and α_5 = 5. The effect of the parameters α_1, α_2, α_3, α_4 and α_5 on the recognition performance is shown in FIGS. 3-7. Table 1 shows the performance comparison with recent results on the PRID2011 dataset, with the maximum values shown in bold.
[Table 1, given as an image in the original]
Table 1: Performance comparison with recent results on the PRID2011 dataset
The comparison shows that the recognition rate of the proposed method is the highest at every rank, exceeding the second-best methods at ranks 1, 5, 10 and 20 by 5.4%, 3.9%, 4.9% and 0.5%, respectively.
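The CMC curves used in this evaluation can be computed as in the following sketch; the input rankings are the output of the matching step, and the protocol details (e.g. single-shot versus multi-shot selection of gallery images) are assumptions.

import numpy as np

def cmc_curve(rankings, probe_ids, gallery_ids, max_rank=20):
    # rankings[i] lists gallery indices for probe i, best match first.
    # Assumes every probe identity appears in the gallery.
    gallery_ids = np.asarray(gallery_ids)
    hits = np.zeros(max_rank)
    for i, order in enumerate(rankings):
        ranked_ids = gallery_ids[order]
        first_correct = np.where(ranked_ids == probe_ids[i])[0][0]
        if first_correct < max_rank:
            hits[first_correct:] += 1      # a hit at rank r counts for all ranks >= r
    return hits / len(rankings)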
While the present invention has been described in detail with reference to the embodiments, the present invention is not limited to the embodiments, and various changes can be made without departing from the spirit of the present invention within the knowledge of those skilled in the art.

Claims (5)

1. A cross-view pedestrian re-identification method based on discriminative dictionary learning, characterized in that the method comprises the following steps:
1) determining an overall model framework of cross-view pedestrian re-identification based on discriminative dictionary learning;
2) dividing the pedestrian image features of different views into a view-specific domain information component and a domain-invariant pedestrian appearance feature component, and designing a discriminative dictionary learning algorithm to create a domain-general dictionary that describes the domain information component and a domain-invariant dictionary that describes the domain-invariant component;
3) training the discrimination-promoting term of the dictionaries;
4) forcing, through a stretching regularization term, the coding coefficients of different pedestrians to keep a certain distance apart while keeping the coding coefficients of the same pedestrian as close as possible;
5) training the discrimination-promoting term of the coding coefficients, forcing the coding coefficients of pedestrian images from the same view to have strong similarity;
6) determining the overall objective function of cross-view pedestrian re-identification based on discriminative dictionary learning;
7) solving the variables to be updated in the overall objective function;
8) designing a pedestrian matching scheme using the Euclidean distance, based on the model that retains only the domain-invariant pedestrian appearance features;
the overall model framework of the step 1) comprises the following steps:
X_a and X_b (given as an equation image in the original) denote the training sample sets under the two camera views; in this case, robust feature representation learning and discriminative metric learning need to be integrated into one framework, and the overall model framework is shown as formula (1):

[formula (1), given as an equation image in the original]

in the formula, D denotes the domain dictionary of the pedestrian images under all cameras, D_t denotes the domain-specific dictionary for coding the pedestrian appearance features after the domain information has been separated, Z_a, Z_b are the coding coefficients of the domain information of X_a and X_b over the dictionary D, Z_ta, Z_tb are the corresponding coding coefficients of the pedestrian-specific information over the dictionary D_t, Φ(D, D_t, Z_a, Z_b, Z_ta, Z_tb) is the data fidelity term, Ψ(D, D_t) is the discrimination-promoting term of the dictionaries, Γ(Z_a, Z_b, Z_ta, Z_tb) is the discrimination-promoting term of the coding coefficients, and the remaining symbols denote the i-th row of D and the j-th column of D_t, respectively (given as equation images in the original);
the discriminant dictionary algorithm in the step 2) comprises the following steps:
the data fidelity term Φ(D, D_t, Z_a, Z_b, Z_ta, Z_tb) is expressed as:

[formula (2), given as an equation image in the original]

in the formula, the first group of terms (given as an equation image) establishes the domain information of the a and b camera views, and the second group of terms (given as an equation image) separates the domain information from the pedestrian appearance features that are unaffected by the domain;
the dictionary discrimination-promoting term of step 3) comprises:
the proposed dictionary discrimination-promoting term is:

[formula (3), given as an equation image in the original]

in the formula, ||D||_* is the nuclear norm of the dictionary D, the structural incoherence regularization term (given as an equation image) promotes mutual independence between the domain dictionary D and the pedestrian feature dictionary D_t, and α_1 and α_2 are two scalar parameters that weight the ||D||_* term and the structural incoherence term, respectively;
the stretching regularization term of step 4) comprises:
the following function is proposed for view a, and a similar function is constructed for view b in the same way and is not repeated here:

[formula (4), given as an equation image in the original]

in the formula, {z}_+ = max{z, 0} and c is an arbitrary constant; the three image symbols (given as equation images in the original) denote, respectively, the k-th image of the l-th pedestrian under camera view a, the k*-th image of the same l-th pedestrian under view b whose coding coefficient is the most dissimilar to that of the k-th image (k* ≠ k), and the k*-th image of a different pedestrian l* under view b that is the most similar to the k-th image of the l-th pedestrian (l* ≠ l); the distance between the coding coefficients of the first pair (same pedestrian) does not cause misjudgment of the pedestrian identity, whereas the distance between the coding coefficients of the second pair (different pedestrians) means that pedestrian matching based on these coding coefficients would cause misrecognition; in this case, minimizing formula (4) promotes a margin between these two distances.
2. The cross-view pedestrian re-identification method based on discriminative dictionary learning according to claim 1, characterized in that the coding coefficient discrimination-promoting term of step 5) comprises:
Γ(Z_a, Z_b, Z_ta, Z_tb) in the overall model framework (1) is defined as:

[formula (5), given as an equation image in the original]

in the formula, ||Z||_{2,1} denotes the l_{2,1} norm (its definition is given as an equation image in the original), and α_3, α_4, α_5 are three scalar parameters that weight the ||Z_a||_{2,1} + ||Z_b||_{2,1} term, the ||Z_ta||_1 + ||Z_tb||_1 term, and the remaining term (given as an equation image), respectively.
3. The cross-view pedestrian re-identification method based on discriminative dictionary learning as claimed in claim 2, wherein the overall objective function of step 6) comprises:
[formula (6), given as an equation image in the original]

in the formula, M_a and M_b denote the numbers of pedestrians under the two camera views, and N_al and N_bl denote the numbers of images of the l-th pedestrian under the two camera views, respectively.
4. The cross-view pedestrian re-identification method based on discriminative dictionary learning according to claim 3, characterized in that the variable solving of step 7) comprises the following steps:
for the variables D, D_t, Z_a, Z_b, Z_ta, Z_tb to be solved in the overall objective function (6), the function is not jointly convex, but it is convex with respect to each variable when all the other variables are fixed, so the variables are optimized by an alternating iterative process, the solution for each variable being as follows:
to update the coding coefficient Z_a (the variable Z_b is updated in the same way as Z_a and is not repeated here), first assume that D, D_t, Z_b, Z_ta, Z_tb are all fixed, giving the following objective function:

[formula (7), given as an equation image in the original]

this is a typical l_{2,1} minimization problem, and the analytic solution of Z_a can be expressed as:

Z_a = (4D^T D + α_3 Λ_1)^{-1} (4D^T X_a + 2D^T D_t Z_ta)    (8)

in the formula, Λ_1 is a diagonal matrix formed from quantities given by an equation image in the original, in which the symbol denotes the j-th column of Z_i;
then, Z_ta is updated by fixing D, D_t, Z_a, Z_b, Z_tb (the variable Z_tb is updated in the same way as Z_ta and is not repeated here), giving the following objective function:

[formula (9), given as an equation image in the original]

for convenience of optimization, formula (9) is rewritten in vector form:

[formula (10), given as an equation image in the original]

in the formula, the feature symbol (given as an equation image) is the visual feature of the k-th image of the l-th pedestrian under view a; to solve (10), a relaxation variable (given as an equation image) is introduced, and formula (10) can then be relaxed as:

[formula (11), given as an equation image in the original]

the variables can be updated by solving the subproblems given as equation images in the original; the above problem can be solved by an iterative shrinkage algorithm, with the coefficient vector updated by the rule given as an equation image in the original, wherein h denotes the h-th iteration; using the updated coefficient vectors, Z_ta is reconstructed as indicated by the equation image in the original;
after the coding coefficients Z_a and Z_ta have been updated, the dictionaries D and D_t can be updated alternately, with the following objective function:

[formula (14), given as an equation image in the original]

to update D, an intermediate variable C is introduced, and formula (14) becomes the relaxed problem given as an equation image in the original; C can then be solved from the subproblem given as an equation image in the original, which is a typical nuclear norm minimization problem that can be solved by the singular value thresholding algorithm; to update D_t, a relaxation variable H is introduced (the relaxed problem is given as an equation image in the original), and the closed-form solution for the relaxation variable H can be expressed as:

H = (α_2 D_t D_t^T + I_1)^{-1} D    (18)

wherein I_1 is an identity matrix; using the updated C and H, D can be optimized by solving formula (19), given as an equation image in the original, which can be solved by the Lagrange dual; finally, D_t is optimized by solving the corresponding problem, given as an equation image in the original, which is solved in the same way as the problem in formula (19).
5. The cross-view pedestrian re-identification method based on discriminative dictionary learning according to claim 4, characterized in that the pedestrian matching scheme of step 8) comprises the following steps:
during testing, with the learned dictionaries D and D_t, the separation of the domain information from the pedestrian-specific information is achieved by solving the coding problem given as an equation image in the original; in the formula, Z_a, Z_b denote the domain coding coefficient matrices under views a and b, respectively, and Z_ta, Z_tb denote the coding coefficient matrices of the pedestrian-specific information under views a and b, respectively; the problem is solved by the alternating iteration method, and the iteration stops when the convergence conditions given as equation images in the original are satisfied; letting the two coding coefficient vectors defined by the equation images in the original be the coding coefficient vectors of the two pedestrians to be compared, the similarity between pedestrians is measured by calculating the distance given as an equation image in the original (the Euclidean distance between the two coding coefficient vectors).
CN201910966029.8A 2019-10-12 2019-10-12 Cross-view pedestrian re-identification method based on discriminant dictionary learning Active CN110826417B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910966029.8A CN110826417B (en) 2019-10-12 2019-10-12 Cross-view pedestrian re-identification method based on discriminant dictionary learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910966029.8A CN110826417B (en) 2019-10-12 2019-10-12 Cross-view pedestrian re-identification method based on discriminant dictionary learning

Publications (2)

Publication Number Publication Date
CN110826417A CN110826417A (en) 2020-02-21
CN110826417B true CN110826417B (en) 2022-08-16

Family

ID=69548968

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910966029.8A Active CN110826417B (en) 2019-10-12 2019-10-12 Cross-view pedestrian re-identification method based on discriminant dictionary learning

Country Status (1)

Country Link
CN (1) CN110826417B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111783521B (en) * 2020-05-19 2022-06-07 昆明理工大学 Pedestrian re-identification method based on low-rank prior guidance and based on domain invariant information separation
CN111783526B (en) * 2020-05-21 2022-08-05 昆明理工大学 Cross-domain pedestrian re-identification method using posture invariance and graph structure alignment
CN113554569B (en) * 2021-08-04 2022-03-08 哈尔滨工业大学 Face image restoration system based on double memory dictionaries

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001202516A (en) * 2000-01-19 2001-07-27 Victor Co Of Japan Ltd Device for individual identification
CN103729462A (en) * 2014-01-13 2014-04-16 武汉大学 Pedestrian search method for processing shield on the basis of sparse representation
CN104298992A (en) * 2014-10-14 2015-01-21 武汉大学 Self-adaptive scale pedestrian re-identification method based on data driving
CN104778446A (en) * 2015-03-19 2015-07-15 南京邮电大学 Method for constructing image quality evaluation and face recognition efficiency relation model
CN107194378A (en) * 2017-06-28 2017-09-22 深圳大学 A kind of face identification method and device based on mixing dictionary learning
CN107679461A (en) * 2017-09-12 2018-02-09 国家新闻出版广电总局广播科学研究院 Pedestrian's recognition methods again based on antithesis integration analysis dictionary learning
CN108509925A (en) * 2018-04-08 2018-09-07 东北大学 A kind of pedestrian's recognition methods again of view-based access control model bag of words
CN109214442A (en) * 2018-08-24 2019-01-15 昆明理工大学 A kind of pedestrian's weight recognizer constrained based on list and identity coherence
CN109284668A (en) * 2018-07-27 2019-01-29 昆明理工大学 A kind of pedestrian's weight recognizer based on apart from regularization projection and dictionary learning
CN109409201A (en) * 2018-09-05 2019-03-01 昆明理工大学 A kind of pedestrian's recognition methods again based on shared and peculiar dictionary to combination learning
CN109447123A (en) * 2018-09-28 2019-03-08 昆明理工大学 A kind of pedestrian's recognition methods again constrained based on tag compliance with stretching regularization dictionary learning
CN109766748A (en) * 2018-11-27 2019-05-17 昆明理工大学 A kind of pedestrian based on projective transformation and dictionary learning knows method for distinguishing again

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7796121B2 (en) * 2005-04-28 2010-09-14 Research In Motion Limited Handheld electronic device with reduced keyboard and associated method of providing improved disambiguation with reduced degradation of device performance

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001202516A (en) * 2000-01-19 2001-07-27 Victor Co Of Japan Ltd Device for individual identification
CN103729462A (en) * 2014-01-13 2014-04-16 武汉大学 Pedestrian search method for processing shield on the basis of sparse representation
CN104298992A (en) * 2014-10-14 2015-01-21 武汉大学 Self-adaptive scale pedestrian re-identification method based on data driving
CN104778446A (en) * 2015-03-19 2015-07-15 南京邮电大学 Method for constructing image quality evaluation and face recognition efficiency relation model
CN107194378A (en) * 2017-06-28 2017-09-22 深圳大学 A kind of face identification method and device based on mixing dictionary learning
CN107679461A (en) * 2017-09-12 2018-02-09 国家新闻出版广电总局广播科学研究院 Pedestrian's recognition methods again based on antithesis integration analysis dictionary learning
CN108509925A (en) * 2018-04-08 2018-09-07 东北大学 A kind of pedestrian's recognition methods again of view-based access control model bag of words
CN109284668A (en) * 2018-07-27 2019-01-29 昆明理工大学 A kind of pedestrian's weight recognizer based on apart from regularization projection and dictionary learning
CN109214442A (en) * 2018-08-24 2019-01-15 昆明理工大学 A kind of pedestrian's weight recognizer constrained based on list and identity coherence
CN109409201A (en) * 2018-09-05 2019-03-01 昆明理工大学 A kind of pedestrian's recognition methods again based on shared and peculiar dictionary to combination learning
CN109447123A (en) * 2018-09-28 2019-03-08 昆明理工大学 A kind of pedestrian's recognition methods again constrained based on tag compliance with stretching regularization dictionary learning
CN109766748A (en) * 2018-11-27 2019-05-17 昆明理工大学 A kind of pedestrian based on projective transformation and dictionary learning knows method for distinguishing again

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
A novel dictionary learning approach for multi-modality medical image fusion; Zhu Z; Neurocomputing; 2016-12-31; pp. 471-482 *
Pedestrian re-identification method based on dictionary learning and Fisher discriminant sparse representation; Zhang Jianwei et al.; Journal of South China University of Technology (Natural Science Edition); 2017-07-15 (No. 07); pp. 55-62 *
Gait recognition based on kernel collaborative representation; Li Zhanli et al.; Journal of Guangxi University (Natural Science Edition); 2017-04-25 (No. 02); pp. 705-711 *
Pedestrian re-identification fusing low-level and mid-level dictionary features; Wang Li; Chinese Optics; 2016-10-15 (No. 05); pp. 540-546 *

Also Published As

Publication number Publication date
CN110826417A (en) 2020-02-21

Similar Documents

Publication Publication Date Title
US10153001B2 (en) Video skimming methods and systems
Zhu et al. Multi-view deep subspace clustering networks
CN110826417B (en) Cross-view pedestrian re-identification method based on discriminant dictionary learning
Yang et al. Super normal vector for activity recognition using depth sequences
CN105590091B (en) Face recognition method and system
Lee et al. Collaborative expression representation using peak expression and intra class variation face images for practical subject-independent emotion recognition in videos
Deng et al. Equidistant prototypes embedding for single sample based face recognition with generic learning and incremental learning
US9697614B2 (en) Method for segmenting and tracking content in videos using low-dimensional subspaces and sparse vectors
Qin et al. Compressive sequential learning for action similarity labeling
CN110889375B (en) Hidden-double-flow cooperative learning network and method for behavior recognition
Xu et al. Dynamic texture reconstruction from sparse codes for unusual event detection in crowded scenes
CN109409201B (en) Pedestrian re-recognition method based on shared and special dictionary pair joint learning
CN111783521B (en) Pedestrian re-identification method based on low-rank prior guidance and based on domain invariant information separation
CN108389189B (en) Three-dimensional image quality evaluation method based on dictionary learning
Chen et al. 3D object tracking via image sets and depth-based occlusion detection
Cao et al. Robust depth-based object tracking from a moving binocular camera
Shao et al. Action recognition using correlogram of body poses and spectral regression
Paul et al. A conditional random field approach for audio-visual people diarization
Alavi et al. Multi-shot person re-identification via relational stein divergence
Zhang et al. Kernel dictionary learning based discriminant analysis
Bak et al. Brownian descriptor: A rich meta-feature for appearance matching
Zhu et al. Correspondence-free dictionary learning for cross-view action recognition
Torpey et al. Human action recognition using local two-stream convolution neural network features and support vector machines
Guha et al. A sparse reconstruction based algorithm for image and video classification
Al Ghamdi et al. Alignment of nearly-repetitive contents in a video stream with manifold embedding

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant