CN116311377A - Method and system for re-identifying clothing-changing pedestrians based on intra-image and inter-image relationships - Google Patents


Info

Publication number
CN116311377A
Authority
CN
China
Prior art keywords
image
pedestrian
features
identified
images
Prior art date
Legal status
Pending
Application number
CN202310324819.2A
Other languages
Chinese (zh)
Inventor
袁彩虹
苏晨爽
邹明东
周玉洁
许元辰
关志杰
Current Assignee
Henan University
Original Assignee
Henan University
Priority date
Filing date
Publication date
Application filed by Henan University
Priority to CN202310324819.2A
Publication of CN116311377A
Legal status: Pending


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103 - Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 - Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 - Feature extraction; Face representation
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T - CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 - Road transport of goods or passengers
    • Y02T10/10 - Internal combustion engine [ICE] based vehicles
    • Y02T10/40 - Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Human Computer Interaction (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a method and system for re-identifying clothing-changing pedestrians based on intra-image and inter-image relationships. The method comprises the following steps: acquiring all pedestrian images to be identified and preprocessing them to obtain a series of pedestrian images wearing the same clothes; constructing an intra-image relation mining model and an inter-image relation mining model; dividing the preprocessed pedestrian images to be identified into a plurality of batches; for the current batch, performing intra-image relation modeling with the intra-image relation mining model to obtain N fusion features; for the current batch, constructing inter-image relation features from the N fusion features with the inter-image relation mining model, fusing the inter-image relation features with the N fusion features respectively to obtain the final feature of each of the N pedestrian images to be identified, and judging from the final features whether the pedestrian in each image is the target pedestrian; the two batch-level steps are then repeated for the next batch until all batches have been identified.

Description

Method and system for re-identifying clothing-changing pedestrians based on intra-image and inter-image relationships
Technical Field
The invention relates to the technical field of pedestrian re-identification, in particular to a method and a system for re-identifying clothing-changing pedestrians based on intra-image and inter-image relationships.
Background
Pedestrian re-identification can be viewed as a pedestrian retrieval problem: retrieving a specific person from a large number of images taken by different cameras in different scenes. It faces many challenges, such as low image resolution, viewpoint change, posture change, illumination change, occlusion, and clothing change. At present, most pedestrian re-identification methods assume that a pedestrian's clothing does not change over the short term; to bring pedestrian re-identification into practical deployment, the study of clothing-changing pedestrian re-identification is unavoidable.
Pedestrian re-identification research that accounts for clothing change better matches real conditions and has stronger practical significance. For example, a potential criminal may escape tracking by deliberately changing clothes, and the clothing of a lost child or elderly person may change over time, for example by removing a coat or hat. For clothing-changing re-identification, a person's identity is typically determined by physiological characteristics, such as body shape and height, rather than by apparent characteristics, such as clothing, shoes, and hairstyle. The key to the clothing-changing re-identification problem is therefore to force the model to learn physiological characteristics of the pedestrian that are hard to disguise or change, rather than apparent characteristics such as clothing color.
To address the impact and challenges of clothing change on pedestrian re-identification, some studies segment person contours and extract more discriminative body-shape features. Ye et al. use body key-point features and model the relationships between the key points; they also introduce a shape-decomposition module that eliminates clothing, feeding the regularized difference between global features and relation features into a self-attention network so that the network automatically separates clothing features from body-shape features. Jin et al. adopt a dual-stream architecture to learn rich gait information: the gait stream approximately predicts a continuous gait sequence from a single input query image; the ReID stream obtains feature vectors through an off-the-shelf network (such as ResNet-50); a high-level semantic consistency constraint is then imposed on the features of the two streams for the same person, encouraging the ReID stream to learn clothing-independent gait motion features.
However, existing pedestrian re-identification methods still have the following limitations. (1) Most existing methods only extract global features, local features, or contour features from a single image, and rarely exploit the relationships between images; although some works model inter-image relationships with tools such as conditional random fields, they only model the relationships among a few images during training, which limits relation learning. (2) Existing methods inevitably suffer from the influence of apparent characteristics such as color when facing the clothing-change problem. (3) Most existing approaches are CNN-based networks, but CNNs can only exploit local dependencies and lose information through downsampling operations.
Disclosure of Invention
In order to solve at least some of the above problems, the invention provides a method and a system for re-identifying clothing-changing pedestrians based on intra-image and inter-image relationships.
In one aspect, the invention provides a method for re-identifying clothing-changing pedestrians based on intra-image and inter-image relationships, comprising the following steps:
step 1: acquiring all pedestrian images to be identified, and preprocessing all the pedestrian images to be identified so that pedestrians contained in all the pedestrian images to be identified wear the same clothes;
step 2: constructing an intra-image relation mining model and an inter-image relation mining model;
step 3: dividing all the preprocessed pedestrian images to be identified into a plurality of batches, wherein each batch contains N pedestrian images to be identified;
step 4: for the current batch, performing intra-image relation modeling on the N pedestrian images to be identified using the intra-image relation mining model to obtain N fusion features; the intra-image relation modeling of each pedestrian image to be identified specifically comprises: extracting the global feature and the local features of the pedestrian image to be identified, and taking the combination of the global feature and the local features as the original intra-image features; constructing the intra-image relation features of the pedestrian image to be identified from the global feature and the local features, and fusing the intra-image relation features with the original intra-image features to obtain the fusion feature of the pedestrian image to be identified;
step 5: for the current batch, constructing the inter-image relation features among the N pedestrian images to be identified from their N corresponding fusion features using the inter-image relation mining model, fusing the inter-image relation features with the N fusion features respectively to obtain the final feature of each of the N pedestrian images to be identified, and judging from these final features whether the pedestrian in each pedestrian image to be identified is the target pedestrian;
step 6: repeating steps 4 to 5 for the next batch until the identification of all batches is completed.
Further, the step 1 specifically includes:
step 1.1: selecting one image from all pedestrian images to be identified as a target pedestrian reference image;
step 1.2: performing semantic segmentation on each input pedestrian image to be identified using a human body parsing model to obtain the pixels belonging to the pedestrian's body in the image; the pedestrian body comprises two body parts, namely the upper garment and the lower garment;
step 1.3: replacing the pixels of each body part of the target pedestrian reference image into the positions of the pixels of the corresponding body part in each of the remaining pedestrian images to be identified, while the pixels at other positions of the remaining pedestrian images to be identified are kept unchanged.
Further, the human body parsing model is an SCHP model.
Further, the intra-image relation mining model comprises a CNN model, a human body pose estimation module and a first Transformer module; correspondingly, the step 4 specifically includes:
extracting the global feature of the input pedestrian image to be identified with the CNN model;
extracting the local key-point heat maps of the input pedestrian image to be identified with the human body pose estimation module;
taking the result of multiplying the global feature by the local key-point heat maps as the local features;
constructing, with the first Transformer module, the intra-image relation features of the pedestrian image to be identified from the global feature and the local features;
and taking the result of multiplying the intra-image relation features by the original intra-image features as the fusion feature of the pedestrian image to be identified.
Further, in step 4, before constructing the intra-image relation features, the method further includes:
optimizing the global feature and the local features with the loss function shown in formula (1):

$$\mathcal{L}_{1}=\sum_{k=1}^{K+1}\beta_{k}\left(\mathcal{L}_{cls}(v_{k})+\mathcal{L}_{tri}(v_{k})\right)\tag{1}$$

with $\mathcal{L}_{cls}(v_{k})=-\log p_{k}$ and $\mathcal{L}_{tri}(v_{k})=\max\left(\alpha+D(v_{ak},v_{pk})-D(v_{ak},v_{nk}),\,0\right)$,

where $K$ is the number of features in the local feature set $V_{l}$; $\beta_{k}=\max(m_{kp}[k])\in[0,1]$ is the confidence of the $k$-th key point, $m_{kp}[k]$ is the heat map of the $k$-th key point, and $\max$ denotes the maximum-value operation; $\beta_{K+1}=1$ is the confidence of the global feature $v_{K+1}$; $\mathcal{L}_{cls}$ is the classification loss function and $\mathcal{L}_{tri}$ the triplet loss function; $p_{k}$ is the probability of the identity ground truth predicted by the classifier for local feature $v_{k}$; $\alpha$ is the margin; $D(v_{ak},v_{pk})$ is the distance between a positive feature pair from the same pedestrian, and $D(v_{ak},v_{nk})$ the distance between a negative feature pair from different pedestrians.
Further, the key points include one or more of the nose, left eye, right eye, left ear, right ear, left shoulder, right shoulder, left elbow, right elbow, left wrist, right wrist, left hip, right hip, left knee, right knee, left ankle, and right ankle.
Further, the intra-image relation features are expressed by the aggregated representation vector $u_{i}$ shown in formula (2):

$$u_{i}=\sum_{j=1}^{K+1}s(A_{ij})\,g_{v}(v_{j})\tag{2}$$

where $s(\cdot)$ is the Softmax function that converts affinities into weights, $g_{v}(\cdot)$ is the linear projection function that maps a vector to the v matrix, and $A\in\mathbb{R}^{(K+1)\times(K+1)}$ is the affinity matrix containing the similarity between any two features $v_{i}$ and $v_{j}$, where $i\neq j$, $i\in[1,2,\ldots,K,K+1]$ and $j\in[1,2,\ldots,K,K+1]$; when $i=K+1$ or $j=K+1$, $v_{i}$ or $v_{j}$ denotes the global feature, and otherwise a local feature;

the affinity matrix $A$ is computed according to formula (3):

$$A_{ij}=k(q_{i},k_{j})=\frac{q_{i}^{T}k_{j}}{\sqrt{d}}\tag{3}$$

where $g_{q}(\cdot)$ and $g_{k}(\cdot)$ denote the linear projection functions that map a vector to the q matrix and the k matrix respectively, $k(\cdot,\cdot)$ is the inner product function, $\sqrt{d}$ is a scale factor, $q_{i}=g_{q}(v_{i})$ is the matrix corresponding to $v_{i}$, $k_{j}=g_{k}(v_{j})$ is the matrix corresponding to $v_{j}$, and $T$ denotes the transpose.
Further, the inter-image relation mining model comprises a second Transformer module; correspondingly, the step 5 specifically includes:
constructing, with the second Transformer module and according to the N fusion features corresponding to the N pedestrian images to be identified, the inter-image relation features among the N pedestrian images to be identified.
Further, the inter-image relation features are expressed by the aggregated representation vector $w_{d}$ shown in formula (4):

$$w_{d}=\sum_{e=1}^{N}s(B_{de})\,g_{v}(f_{e})\tag{4}$$

where $s(\cdot)$ is the Softmax function that converts affinities into weights, $g_{v}(\cdot)$ is the linear projection function that maps a vector to the v matrix, and $B\in\mathbb{R}^{N\times N}$ is the affinity matrix containing the similarity between any two fusion features $f_{d}$ and $f_{e}$, where $d\neq e$, $d\in[1,2,\ldots,N]$ and $e\in[1,2,\ldots,N]$;

the affinity matrix $B$ is computed according to formula (5):

$$B_{de}=k(q_{d},k_{e})=\frac{q_{d}^{T}k_{e}}{\sqrt{d}}\tag{5}$$

where $g_{q}(\cdot)$ and $g_{k}(\cdot)$ denote the linear projection functions that map a vector to the q matrix and the k matrix respectively, $k(\cdot,\cdot)$ is the inner product function, $\sqrt{d}$ is a scale factor, $q_{d}=g_{q}(f_{d})$ is the matrix corresponding to $f_{d}$, $k_{e}=g_{k}(f_{e})$ is the matrix corresponding to $f_{e}$, and $T$ denotes the transpose.
Further, the method further comprises: constructing a loss function $\mathcal{L}_{2}$ to optimize the inter-image relation features:

$$\mathcal{L}_{2}=\lambda_{1}\mathcal{L}_{id}+\lambda_{2}\mathcal{L}_{tri}+\lambda_{3}\mathcal{L}_{C}\tag{6}$$

$$\mathcal{L}_{id}=-\sum_{r=1}^{M}\log p_{f_{r}}\tag{7}$$

$$\mathcal{L}_{tri}=\sum_{r=1}^{M}\max\left(\alpha+D(f_{ar},f_{pr})-D(f_{ar},f_{nr}),\,0\right)\tag{8}$$

$$\mathcal{L}_{C}=\frac{1}{2}\sum_{t=1}^{M}\left\|f_{t}-c_{y_{t}}\right\|_{2}^{2}\tag{9}$$

where $\lambda_{1}$, $\lambda_{2}$ and $\lambda_{3}$ are weights; $\mathcal{L}_{id}$ denotes the Identity Loss, $\mathcal{L}_{tri}$ the Triplet Loss, and $\mathcal{L}_{C}$ the Center Loss; $p_{f_{r}}$ is the probability of the identity ground truth predicted by the classifier for fusion feature $f_{r}$; $\alpha$ is the margin; $D(f_{ar},f_{pr})$ is the distance between a positive feature pair and $D(f_{ar},f_{nr})$ the distance between a negative feature pair; $M$ is the total number of fusion features, and $c_{y_{t}}$ is the feature representation center of the class $y_{t}$ of fusion feature $f_{t}$.
On the other hand, the invention provides a system for re-identifying clothing-changing pedestrians based on intra-image and inter-image relationships, comprising: an image preprocessing unit, an intra-image relation mining unit, an inter-image relation mining unit and an identification unit;
the image preprocessing unit is used for acquiring all the pedestrian images to be identified and preprocessing them so that the pedestrians contained in all the pedestrian images to be identified wear the same clothes;
the intra-image relation mining unit is used for constructing the intra-image relation mining model, and for performing intra-image relation modeling on each of the N pedestrian images to be identified contained in an input batch using the intra-image relation mining model to obtain N fusion features; the intra-image relation modeling of each pedestrian image to be identified specifically comprises: extracting the global feature and the local features of the pedestrian image to be identified, and taking the combination of the global feature and the local features as the original intra-image features; constructing the intra-image relation features of the pedestrian image to be identified from the global feature and the local features, and fusing the intra-image relation features with the original intra-image features to obtain the fusion feature of the pedestrian image to be identified;
the inter-image relation mining unit is used for constructing the inter-image relation mining model, and for constructing the inter-image relation features among the N pedestrian images to be identified contained in the input batch from their N corresponding fusion features using the inter-image relation mining model, and fusing the inter-image relation features with the N fusion features to obtain the final feature of each of the N pedestrian images to be identified;
the identification unit is used for judging, from the final features of the N pedestrian images to be identified contained in the input batch, whether the pedestrians in the pedestrian images to be identified are target pedestrians.
The invention has the beneficial effects that:
(1) Unlike existing methods that separate clothing features from identity features with a self-attention network, the invention does not separate a pedestrian's clothing features from identity features; instead, all input original pedestrian images to be identified are preprocessed into a series of images with identical clothes, so that in the preprocessed images even pedestrians with different identities wear the same clothes. Clothing features are therefore no longer used to distinguish pedestrian identities, the subsequent model design need not pay special attention to pedestrian clothing, the influence of clothing features on the model is avoided, the model no longer depends on color appearance features to distinguish identities, and the designed model can extract more discriminative shape features;
(2) Compared with existing methods, which apply no extra processing to the original image data set, the invention uses a human body parsing model to segment each image and obtain the pixels of the body parts with the greatest influence on pedestrian identification (namely the upper and lower garments), and replaces these pixels, so that the largest confounding factor is removed before pedestrian feature extraction, a series of pedestrian images with the same clothes is obtained, and the influence of clothing change on pedestrian re-identification is eliminated. Meanwhile, this preprocessing lets the model concentrate on physiological characteristics that are hard to change, without expanding the data set, so that the model extracts more discriminative shape features;
(3) Unlike existing methods that identify pedestrians by extracting and comparing only local information, or only global information, from a single image, the invention constructs intra-image relation features and inter-image relation features, uses the intra-image and inter-image relation information to extract more discriminative identity features, and makes full use of the semantic information contained in the images;
(4) Compared with existing methods that obtain local features by horizontal partitioning, the invention obtains more accurate local key points through human body pose estimation; since the local key points of adjacent joints are connected in the human body topology, exploring the relationships among all local key points extracts more relation information;
(5) Compared with existing clothing-change re-identification methods that process the head region separately, the human body pose estimation adopted by the invention can additionally extract five facial key points, namely the nose, left eye, right eye, left ear and right ear, and exploring the relationships among these five key points extracts more discriminative fine-grained features;
(6) Compared with other CNN-based pedestrian re-identification methods, the invention exploits the Transformer's strong ability to capture long-range dependencies; with multi-head attention, the model attends jointly to different representative elements, so that different parts of a pedestrian receive attention even when all images wear the same clothes, and discriminative shape features can be extracted.
Drawings
Fig. 1 is a schematic flow chart of the method for re-identifying clothing-changing pedestrians based on intra-image and inter-image relationships according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of the intra-image relation mining model and the inter-image relation mining model according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions in the embodiments of the present invention will be clearly described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Example 1
As shown in fig. 1, an embodiment of the present invention provides a method for re-identifying clothing-changing pedestrians based on intra-image and inter-image relationships, including the following steps:
S101: acquiring all pedestrian images to be identified, and preprocessing all the pedestrian images to be identified so that the pedestrians contained in all the pedestrian images to be identified wear the same clothes;
S102: constructing an intra-image relation mining model and an inter-image relation mining model;
S103: dividing all the preprocessed pedestrian images to be identified into a plurality of batches, wherein each batch contains N pedestrian images to be identified;
S104: for the current batch, performing intra-image relation modeling on the N pedestrian images to be identified using the intra-image relation mining model to obtain N fusion features; the intra-image relation modeling of each pedestrian image to be identified specifically comprises: extracting the global feature and the local features of the pedestrian image to be identified, and taking the combination of the global feature and the local features as the original intra-image features; constructing the intra-image relation features of the pedestrian image to be identified from the global feature and the local features, and fusing the intra-image relation features with the original intra-image features to obtain the fusion feature of the pedestrian image to be identified;
S105: for the current batch, constructing the inter-image relation features among the N pedestrian images to be identified from their N corresponding fusion features using the inter-image relation mining model, fusing the inter-image relation features with the N fusion features respectively to obtain the final feature of each of the N pedestrian images to be identified, and judging from these final features whether the pedestrian in each pedestrian image to be identified is the target pedestrian;
In particular, a person's identity is typically determined by physiological characteristics, not apparent characteristics such as clothing. By using the interactions between images to extract inter-image relation features, the salient features that distinguish one pedestrian image from other pedestrian images can be found more effectively.
S106: steps S104 to S105 are repeated for the next batch until the identification of all batches is completed.
Unlike the conventional approach of separating clothing features from identity features with a self-attention network, the preprocessing adopted by the embodiment of the invention does not separate a pedestrian's clothing features from identity features; instead, all input original pedestrian images to be identified are preprocessed into a series of pedestrian images with identical clothes, in which even pedestrians of different identities wear the same clothes. Clothing features are thus no longer used to distinguish pedestrian identities, so the subsequent model design need not focus excessively on pedestrian clothing, the influence of clothing features on the model is avoided, the model does not rely on color appearance features to identify pedestrians, and the designed model can extract more discriminative shape features.
in addition, unlike the existing method that only local information is extracted from an image and is compared, or only global information is extracted and is compared to identify the identity of a pedestrian, the embodiment of the invention provides a method for constructing the intra-image relationship features and the inter-image relationship features, further utilizes the intra-image and inter-image relationship information to extract more discernable identity features, and fully utilizes semantic information contained in the image.
Example 2
On the basis of the above embodiment, the embodiment of the present invention provides an image preprocessing method so that the pedestrians contained in all the pedestrian images to be identified wear the same clothes. The method specifically comprises the following steps:
S201: selecting one image from all the pedestrian images to be identified as the target pedestrian reference image; in this embodiment, the first input pedestrian image to be identified is selected as the target pedestrian reference image by default.
S202: performing semantic segmentation on each input pedestrian image to be identified using a human body parsing model to obtain the pixels belonging to the pedestrian's body in the image; the pedestrian body comprises two body parts, namely the upper garment and the lower garment (also called trousers); in this embodiment, the human body parsing model is the SCHP model.
Specifically, all the pedestrian images to be identified of the current batch are denoted $X=[x_{1},x_{2},\ldots,x_{N}]$, where $N$ is the batch size and $x_{i}$ is the $i$-th pedestrian image to be identified, of size $C\times H\times W$, where $C$, $H$ and $W$ denote the number of channels, the height and the width respectively.
First, the result of semantic segmentation with the human body parsing model is denoted $S=[s_{1},s_{2},\ldots,s_{N}]$, where $s_{i}$ is the segmentation map of image $x_{i}$, of size $1\times H\times W$. To identify the semantic information of the pixels in $s_{i}$, the pixel values of the background, head, upper garment, lower garment, arm and leg positions are set to 0, 1, 2, 3, 4 and 5 respectively.
Then, the pixels belonging to the pedestrian body (i.e., the upper and lower garments) are obtained from the above semantic segmentation result. Each pixel of image $x_{i}$ can be represented as a vector $v_{j}$ of length $C$, so image $x_{i}$ contains $H\times W$ pixel vectors in total. The sets of body-part pixels of image $x_{i}$ are:

$$P_{i}^{upper}=\{v_{j_{1}}\mid s_{i}(j_{1})=2,\ j_{1}=1,\ldots,U_{1}\}$$

$$P_{i}^{pants}=\{v_{j_{2}}\mid s_{i}(j_{2})=3,\ j_{2}=1,\ldots,U_{2}\}$$

where $P_{i}^{upper}$ is the set of pixels of the upper-garment region and $P_{i}^{pants}$ the set of pixels of the lower-garment region; $U_{1}$ is the total number of upper-garment pixel vectors in image $x_{i}$ and $U_{2}$ the total number of lower-garment pixel vectors; $v_{j_{1}}$ denotes the $j_{1}$-th pixel vector and $v_{j_{2}}$ the $j_{2}$-th pixel vector; 2 and 3 are the semantic labels of the upper and lower garments respectively.
It will be appreciated that $U_{1}$ and $U_{2}$ may differ from one $x_{i}$ to another.
S203: when the other pedestrian images to be identified are input, the pixels of each body part of the target pedestrian reference image are replaced into the positions of the pixels of the corresponding body part in the other pedestrian images to be identified, while the pixels at other positions of those images are kept unchanged.
Specifically, the pixel sets of the body parts corresponding to the target pedestrian reference image are stored separately and denoted $G$; the pixel sets of the two body parts (i.e., the upper and lower garments) contained in $G$ are denoted $G_{upper}$ and $G_{pants}$ respectively. Let $M$ be the total number of pixels; for the N pedestrian images to be identified, $M=N\times H\times W$, and all the pixel vectors in $X$ are denoted $V_{X}$:

$$V_{X}=[v_{1},v_{2},\ldots,v_{M}],\qquad V_{X}^{upper}\subset V_{X},\quad V_{X}^{pants}\subset V_{X}$$

where $V_{X}^{upper}$ are the pixel vectors belonging to the upper garments and $V_{X}^{pants}$ the pixel vectors belonging to the trousers.
Next, for the pedestrian images to be identified in $V_{X}$ other than the reference image, the pixel vector sets $V_{X}^{upper}$ and $V_{X}^{pants}$ are replaced by $G_{upper}$ and $G_{pants}$ respectively; the changed pixel vectors can be denoted $V_{X}'$:

$$V_{X}'=\left(V_{X}\setminus\left(V_{X}^{upper}\cup V_{X}^{pants}\right)\right)\cup G_{upper}\cup G_{pants}$$
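For illustration only, the following sketch shows one way the pixel replacement of S203 could be realized, under the simplifying assumption that garment pixels are exchanged at the spatial positions where the reference mask and the target mask agree. The semantic labels follow S202 (2 = upper garment, 3 = lower garment); the function name and this masking strategy are assumptions, not the SCHP model's actual API.

```python
# Hypothetical sketch of the garment pixel replacement of S203.
import torch

UPPER, PANTS = 2, 3   # semantic labels assigned in S202

def replace_clothes(images, seg_maps, ref_idx=0):
    """images: (N, C, H, W) float tensor; seg_maps: (N, H, W) long tensor."""
    ref_img, ref_seg = images[ref_idx], seg_maps[ref_idx]
    out = images.clone()
    for part in (UPPER, PANTS):                    # G_upper, then G_pants
        ref_mask = ref_seg == part
        for i in range(images.shape[0]):
            if i == ref_idx:
                continue
            both = (seg_maps[i] == part) & ref_mask
            # overwrite this image's garment pixels with the reference garment
            out[i][:, both] = ref_img[:, both]
    return out
```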
Other steps are the same as those of embodiment 1, and will not be repeated here.
In the embodiment of the invention, a human body parsing model is used to segment each image and obtain the pixels of the body parts with the greatest influence on pedestrian identification (namely the upper and lower garments), and these pixels are replaced, so that the largest confounding factor is removed before pedestrian feature extraction, a series of pedestrian images with the same clothes is obtained, and the influence of clothing change on pedestrian re-identification is eliminated. Meanwhile, this preprocessing lets the model concentrate on physiological characteristics that are hard to change, without expanding the data set, so that the model extracts more discriminative shape features.
Example 3
On the basis of the above embodiments, as shown in fig. 2, the embodiment of the present invention provides the network architecture of the intra-image relation mining model and of the inter-image relation mining model; based on the two relation mining models provided in this embodiment, the intra-image relation features and the inter-image relation features can be constructed more effectively. In fig. 2, IP denotes the image preprocessing process, INS the intra-image relation mining model, and ITS the inter-image relation mining model. The details are as follows:
the intra-image relation mining model comprises a CNN model, a human body posture estimation module and a first transducer module; correspondingly, the step S104 specifically includes: extracting a feature map f of an input pedestrian image to be identified by adopting a CNN model cnn The method comprises the steps of carrying out a first treatment on the surface of the The human body posture estimation module is adopted to extract the local key point heat map of the input pedestrian image to be identified, and the local key point heat map is recorded as m kp The method comprises the steps of carrying out a first treatment on the surface of the The global feature V is obtained by an average pooling operation (g ()) of the feature map g (for convenience of description, it is denoted as V g =g(f cnn ) A) is provided; the result of multiplying the global feature and the local key point heat map is used as a local feature (for convenience of description, written as
Figure BDA0004152907360000118
Adopting a first transducer module to construct an intra-image relation feature related to the pedestrian image to be identified according to the global feature and the local feature; and taking the result of multiplying the intra-image relation feature and the intra-image feature of the original image as the fusion feature of the pedestrian image to be identified.
Specifically, the key points in the local key-point heat maps comprise one or more of the nose, left eye, right eye, left ear, right ear, left shoulder, right shoulder, left elbow, right elbow, left wrist, right wrist, left hip, right hip, left knee, right knee, left ankle and right ankle; it will be appreciated that each key point corresponds to one local feature, that is, there may be more than one local feature.
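For illustration, a minimal sketch of the feature extraction just described follows: the CNN feature map is weighted by each key-point heat map and average-pooled, yielding one local feature per key point. The callables backbone and pose_estimator are placeholders, not components named by the patent.

```python
# Illustrative sketch of global/local feature extraction; names are assumptions.
import torch

def extract_features(images, backbone, pose_estimator):
    """images: (B, 3, H, W) tensor of preprocessed pedestrian images."""
    f_cnn = backbone(images)                  # (B, C, h, w) feature map
    v_g = f_cnn.mean(dim=(2, 3))              # global feature V_g = g(f_cnn)
    m_kp = pose_estimator(images)             # (B, K, h, w) key-point heat maps
    # heat-map-weighted pooling: one local feature per key point
    v_l = (f_cnn.unsqueeze(1) * m_kp.unsqueeze(2)).mean(dim=(3, 4))  # (B, K, C)
    beta = m_kp.flatten(2).max(dim=2).values  # key-point confidences beta_k
    return v_g, v_l, beta
```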
Further, to optimize the global feature and the local features, the method also includes, before constructing the intra-image relation features: optimizing the global feature and the local features with the loss function of formula (1):

$$\mathcal{L}_{1}=\sum_{k=1}^{K+1}\beta_{k}\left(\mathcal{L}_{cls}(v_{k})+\mathcal{L}_{tri}(v_{k})\right)\tag{1}$$

with $\mathcal{L}_{cls}(v_{k})=-\log p_{k}$ and $\mathcal{L}_{tri}(v_{k})=\max\left(\alpha+D(v_{ak},v_{pk})-D(v_{ak},v_{nk}),\,0\right)$,

where $K$ is the number of features in the local feature set $V_{l}$; $\beta_{k}=\max(m_{kp}[k])\in[0,1]$ is the confidence of the $k$-th key point, $m_{kp}[k]$ is the heat map of the $k$-th key point, and $\max$ denotes the maximum-value operation; $\beta_{K+1}=1$ is the confidence of the global feature $v_{K+1}$; $\mathcal{L}_{cls}$ is the classification loss function and $\mathcal{L}_{tri}$ the triplet loss function; $p_{k}$ is the probability of the identity ground truth predicted by the classifier for local feature $v_{k}$; $\alpha$ is the margin; $D(v_{ak},v_{pk})$ is the distance between a positive feature pair from the same pedestrian, and $D(v_{ak},v_{nk})$ the distance between a negative feature pair from different pedestrians. It is noted that the classifiers of the different local features are not shared.
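A sketch of this confidence-weighted optimization is given below. It assumes cross-entropy for $\mathcal{L}_{cls}$ and the standard margin triplet loss for $\mathcal{L}_{tri}$, with per-feature pair distances supplied by the caller; all names are illustrative, not from the patent.

```python
# Illustrative sketch of formula (1); names and tensor layouts are assumptions.
import torch
import torch.nn.functional as F

def intra_feature_loss(logits_list, labels, beta, d_ap, d_an, alpha=0.3):
    """logits_list: K+1 per-feature classifier outputs, each (B, num_ids);
    beta: (B, K) key-point confidences; d_ap, d_an: (K+1, B) pair distances."""
    total = 0.0
    K = beta.shape[1]
    for k, logits in enumerate(logits_list):        # k = 0..K; k == K is global
        b_k = beta[:, k] if k < K else torch.ones_like(beta[:, 0])  # beta_{K+1} = 1
        l_cls = F.cross_entropy(logits, labels, reduction="none")   # -log p_k
        l_tri = F.relu(alpha + d_ap[k] - d_an[k])   # max(alpha + D_ap - D_an, 0)
        total = total + (b_k * (l_cls + l_tri)).mean()
    return total
```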
Based on the intra-image relation mining model shown in fig. 2, the global feature $V_{g}$ and the set of local features $V_{l}$ are input together into the first Transformer module for intra-image relation modeling, and the intra-image relation features are expressed by the aggregated representation vector $u_{i}$ shown in formula (2):

$$u_{i}=\sum_{j=1}^{K+1}s(A_{ij})\,g_{v}(v_{j})\tag{2}$$

where $s(\cdot)$ is the Softmax function that converts affinities into weights, $g_{v}(\cdot)$ is the linear projection function that maps a vector to the v matrix, and $A\in\mathbb{R}^{(K+1)\times(K+1)}$ is the affinity matrix containing the similarity between any two features $v_{i}$ and $v_{j}$, where $i\neq j$, $i\in[1,2,\ldots,K,K+1]$ and $j\in[1,2,\ldots,K,K+1]$; when $i=K+1$ or $j=K+1$, $v_{i}$ or $v_{j}$ denotes the global feature, and otherwise a local feature;

the affinity matrix $A$ is computed according to formula (3):

$$A_{ij}=k(q_{i},k_{j})=\frac{q_{i}^{T}k_{j}}{\sqrt{d}}\tag{3}$$

where $g_{q}(\cdot)$ and $g_{k}(\cdot)$ denote the linear projection functions that map a vector to the q matrix and the k matrix respectively, $k(\cdot,\cdot)$ is the inner product function, $\sqrt{d}$ is a scale factor, $q_{i}=g_{q}(v_{i})$ is the matrix corresponding to $v_{i}$, $k_{j}=g_{k}(v_{j})$ is the matrix corresponding to $v_{j}$, and $T$ denotes the transpose.
Finally, the obtained intra-image relation features $u_{i}$ are multiplied by the corresponding original features $v_{i}$ respectively, and all the resulting features are concatenated into one feature, namely the fusion feature of the image; the fusion feature fully exploits the relation information among the local key points within the image, improving robustness.
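For illustration, a minimal sketch of formulas (2)-(3) follows: single-head scaled dot-product attention over the K local features plus the global feature of one image, where $g_{q}$, $g_{k}$ and $g_{v}$ are the linear projections named in the text. The feature dimension and class name are assumptions.

```python
# Illustrative sketch of formulas (2)-(3); dimensions are assumptions.
import math
import torch
import torch.nn as nn

class IntraImageRelation(nn.Module):
    def __init__(self, dim=2048):
        super().__init__()
        self.g_q = nn.Linear(dim, dim)   # maps v_i to q_i
        self.g_k = nn.Linear(dim, dim)   # maps v_j to k_j
        self.g_v = nn.Linear(dim, dim)   # maps v_j to the value in formula (2)
        self.scale = math.sqrt(dim)      # sqrt(d) scale factor

    def forward(self, v):                # v: (B, K+1, dim), local + global features
        q, k, val = self.g_q(v), self.g_k(v), self.g_v(v)
        A = q @ k.transpose(-2, -1) / self.scale     # formula (3)
        u = A.softmax(dim=-1) @ val                  # formula (2)
        # multiply relation features with the originals, then concatenate
        return (u * v).flatten(1)        # (B, (K+1)*dim) fusion feature
```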
The inter-image relation mining model comprises a second Transformer module; correspondingly, step S105 specifically includes: constructing, with the second Transformer module and according to the N fusion features corresponding to the N pedestrian images to be identified, the inter-image relation features among the N pedestrian images to be identified. For convenience of description, the N fusion features corresponding to the N pedestrian images to be identified are denoted $F=[f_{1},f_{2},\ldots,f_{N}]$.
Based on the inter-image relation mining model shown in fig. 2, the N fusion features $F$ are input into the second Transformer module for inter-image relation modeling, and the inter-image relation features are expressed by the aggregated representation vector $w_{d}$ shown in formula (4):

$$w_{d}=\sum_{e=1}^{N}s(B_{de})\,g_{v}(f_{e})\tag{4}$$

where $s(\cdot)$ is the Softmax function that converts affinities into weights, $g_{v}(\cdot)$ is the linear projection function that maps a vector to the v matrix, and $B\in\mathbb{R}^{N\times N}$ is the affinity matrix containing the similarity between any two fusion features $f_{d}$ and $f_{e}$, where $d\neq e$, $d\in[1,2,\ldots,N]$ and $e\in[1,2,\ldots,N]$;

the affinity matrix $B$ is computed according to formula (5):

$$B_{de}=k(q_{d},k_{e})=\frac{q_{d}^{T}k_{e}}{\sqrt{d}}\tag{5}$$

where $g_{q}(\cdot)$ and $g_{k}(\cdot)$ denote the linear projection functions that map a vector to the q matrix and the k matrix respectively, $k(\cdot,\cdot)$ is the inner product function, $\sqrt{d}$ is a scale factor, $q_{d}=g_{q}(f_{d})$ is the matrix corresponding to $f_{d}$, $k_{e}=g_{k}(f_{e})$ is the matrix corresponding to $f_{e}$, and $T$ denotes the transpose.
Finally, the obtained inter-image relation features $w_{d}$ are multiplied by the corresponding original features $f_{d}$ respectively, and the resulting features are taken as the final features of the corresponding pedestrian images to be identified.
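A minimal sketch of formulas (4)-(5) follows: the same scaled dot-product relation, computed by a second, separately parameterized Transformer-style module across the N fusion features of a batch, so that the batch itself is the sequence attended over. The class name is an assumption.

```python
# Illustrative sketch of formulas (4)-(5); names are assumptions.
import math
import torch
import torch.nn as nn

class InterImageRelation(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.g_q = nn.Linear(dim, dim)   # maps f_d to q_d
        self.g_k = nn.Linear(dim, dim)   # maps f_e to k_e
        self.g_v = nn.Linear(dim, dim)   # maps f_e to the value in formula (4)
        self.scale = math.sqrt(dim)

    def forward(self, f):                # f: (N, dim) fusion features of one batch
        q, k, val = self.g_q(f), self.g_k(f), self.g_v(f)
        B_aff = q @ k.t() / self.scale                 # formula (5)
        w = B_aff.softmax(dim=-1) @ val                # formula (4)
        return w * f                     # final features: relation x original
```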
Further, in order to improve the discriminative power of the deep-learning features, keep the features of different classes separable and minimize intra-class variation, this embodiment also constructs a loss function $\mathcal{L}_{2}$ to optimize the inter-image relation features, as follows:

$$\mathcal{L}_{2}=\lambda_{1}\mathcal{L}_{id}+\lambda_{2}\mathcal{L}_{tri}+\lambda_{3}\mathcal{L}_{C}\tag{6}$$

$$\mathcal{L}_{id}=-\sum_{r=1}^{M}\log p_{f_{r}}\tag{7}$$

$$\mathcal{L}_{tri}=\sum_{r=1}^{M}\max\left(\alpha+D(f_{ar},f_{pr})-D(f_{ar},f_{nr}),\,0\right)\tag{8}$$

$$\mathcal{L}_{C}=\frac{1}{2}\sum_{t=1}^{M}\left\|f_{t}-c_{y_{t}}\right\|_{2}^{2}\tag{9}$$

where $\lambda_{1}$, $\lambda_{2}$ and $\lambda_{3}$ are weights; $\mathcal{L}_{id}$ denotes the Identity Loss, $\mathcal{L}_{tri}$ the Triplet Loss, and $\mathcal{L}_{C}$ the Center Loss; $p_{f_{r}}$ is the probability of the identity ground truth predicted by the classifier for fusion feature $f_{r}$; $\alpha$ is the margin; $D(f_{ar},f_{pr})$ is the distance between a positive feature pair and $D(f_{ar},f_{nr})$ the distance between a negative feature pair. Each fusion feature is one sample; to minimize the distance between each sample in a mini-batch and the center of its class, a class center is provided for each class. $M$ is the total number of fusion features (i.e., the number of samples), and $c_{y_{t}}$ is the feature representation center of the class $y_{t}$ of fusion feature $f_{t}$. In this embodiment, $\lambda_{1}$, $\lambda_{2}$ and $\lambda_{3}$ take the values 1, 1 and 0.0005 respectively.
It is noted that when updating a class center, only the images of that class in the current batch are used to compute the update amount. The update strategy of the Center Loss is given by formula (10):

$$\Delta c_{j}=\frac{\sum_{t=1}^{M}\delta(y_{t}=j)\,(c_{j}-f_{t})}{1+\sum_{t=1}^{M}\delta(y_{t}=j)}\tag{10}$$

where $\delta(\text{condition})$ takes the value 1 when the condition holds and 0 otherwise.
Compared with existing methods that obtain local features by horizontal partitioning, this embodiment obtains more accurate local key points through human body pose estimation; since the local key points of adjacent joints are connected in the human body topology, exploring the relationships among all local key points extracts more relation information. Meanwhile, compared with clothing-change re-identification methods that process the head region separately, the human body pose estimation adopted here can additionally extract five facial key points, namely the nose, left eye, right eye, left ear and right ear, and exploring the relationships among these five key points extracts more discriminative fine-grained features.
Meanwhile, compared with other CNN-based pedestrian re-identification methods, this embodiment exploits the Transformer's ability to capture long-range dependencies; with multi-head attention, the model attends jointly to different representative elements, so that different parts of a pedestrian receive attention even when all images wear the same clothes, and discriminative shape features can be extracted.
Example 4
Corresponding to the above method, the embodiment of the invention provides a system for re-identifying clothing-changing pedestrians based on intra-image and inter-image relationships, comprising an image preprocessing unit, an intra-image relation mining unit, an inter-image relation mining unit and an identification unit;
the image preprocessing unit is used for acquiring all the pedestrian images to be identified and preprocessing them so that the pedestrians contained in all the pedestrian images to be identified wear the same clothes;
the intra-image relation mining unit is used for constructing the intra-image relation mining model, and for performing intra-image relation modeling on each of the N pedestrian images to be identified contained in an input batch using the intra-image relation mining model to obtain N fusion features; the intra-image relation modeling of each pedestrian image to be identified specifically comprises: extracting the global feature and the local features of the pedestrian image to be identified, and taking the combination of the global feature and the local features as the original intra-image features; constructing the intra-image relation features of the pedestrian image to be identified from the global feature and the local features, and fusing the intra-image relation features with the original intra-image features to obtain the fusion feature of the pedestrian image to be identified;
the inter-image relation mining unit is used for constructing the inter-image relation mining model, and for constructing the inter-image relation features among the N pedestrian images to be identified contained in the input batch from their N corresponding fusion features using the inter-image relation mining model, and fusing the inter-image relation features with the N fusion features to obtain the final feature of each of the N pedestrian images to be identified;
the identification unit is used for judging, from the final features of the N pedestrian images to be identified contained in the input batch, whether the pedestrians in the pedestrian images to be identified are target pedestrians.
It should be noted that the system for re-identifying clothing-changing pedestrians based on intra-image and inter-image relationships provided by the embodiment of the present invention is intended to implement the above method embodiments; for its functions, reference may be made to the above method embodiments, which are not repeated here.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for re-identifying clothing-changing pedestrians based on intra-image and inter-image relationships, characterized by comprising the following steps:
step 1: acquiring all pedestrian images to be identified, and preprocessing all the pedestrian images to be identified so that pedestrians contained in all the pedestrian images to be identified wear the same clothes;
step 2: constructing an intra-image relation mining model and an inter-image relation mining model;
step 3: dividing all the preprocessed pedestrian images to be identified into a plurality of batches, wherein each batch contains N pedestrian images to be identified;
step 4: for the current batch, performing intra-image relation modeling on the N pedestrian images to be identified using the intra-image relation mining model to obtain N fusion features; the intra-image relation modeling of each pedestrian image to be identified specifically comprises: extracting the global feature and the local features of the pedestrian image to be identified, and taking the combination of the global feature and the local features as the original intra-image features; constructing the intra-image relation features of the pedestrian image to be identified from the global feature and the local features, and fusing the intra-image relation features with the original intra-image features to obtain the fusion feature of the pedestrian image to be identified;
step 5: for the current batch, constructing the inter-image relation features among the N pedestrian images to be identified from their N corresponding fusion features using the inter-image relation mining model, fusing the inter-image relation features with the N fusion features respectively to obtain the final feature of each of the N pedestrian images to be identified, and judging from these final features whether the pedestrian in each pedestrian image to be identified is the target pedestrian;
step 6: repeating steps 4 to 5 for the next batch until the identification of all batches is completed.
2. The method for re-identifying clothing-changing pedestrians based on intra-image and inter-image relationships according to claim 1, characterized in that the step 1 specifically comprises:
step 1.1: selecting one image from all pedestrian images to be identified as a target pedestrian reference image;
step 1.2: performing semantic segmentation on each input pedestrian image to be identified using a human body parsing model to obtain the pixels belonging to the pedestrian's body in the image; the pedestrian body comprises two body parts, namely the upper garment and the lower garment;
step 1.3: replacing the pixels of each body part of the target pedestrian reference image into the positions of the pixels of the corresponding body part in each of the remaining pedestrian images to be identified, while the pixels at other positions of the remaining pedestrian images to be identified are kept unchanged.
3. The method for re-identifying clothing-changing pedestrians based on intra-image and inter-image relationships according to claim 2, characterized in that the human body parsing model is an SCHP model.
4. The method for re-identifying clothing-changing pedestrians based on intra-image and inter-image relationships according to claim 1, characterized in that the intra-image relation mining model comprises a CNN model, a human body pose estimation module and a first Transformer module; correspondingly, the step 4 specifically comprises:
extracting the global feature of the input pedestrian image to be identified with the CNN model;
extracting the local key-point heat maps of the input pedestrian image to be identified with the human body pose estimation module;
taking the result of multiplying the global feature by the local key-point heat maps as the local features;
constructing, with the first Transformer module, the intra-image relation features of the pedestrian image to be identified from the global feature and the local features;
and taking the result of multiplying the intra-image relation features by the original intra-image features as the fusion feature of the pedestrian image to be identified.
5. The method for re-identifying clothing-changing pedestrians based on intra-image and inter-image relationships according to claim 4, characterized in that, in step 4, before constructing the intra-image relation features, the method further comprises:
optimizing the global feature and the local features with the loss function shown in formula (1):

$$\mathcal{L}_{1}=\sum_{k=1}^{K+1}\beta_{k}\left(\mathcal{L}_{cls}(v_{k})+\mathcal{L}_{tri}(v_{k})\right)\tag{1}$$

with $\mathcal{L}_{cls}(v_{k})=-\log p_{k}$ and $\mathcal{L}_{tri}(v_{k})=\max\left(\alpha+D(v_{ak},v_{pk})-D(v_{ak},v_{nk}),\,0\right)$,

where $K$ is the number of features in the local feature set $V_{l}$; $\beta_{k}=\max(m_{kp}[k])\in[0,1]$ is the confidence of the $k$-th key point, $m_{kp}[k]$ is the heat map of the $k$-th key point, and $\max$ denotes the maximum-value operation; $\beta_{K+1}=1$ is the confidence of the global feature $v_{K+1}$; $\mathcal{L}_{cls}$ is the classification loss function and $\mathcal{L}_{tri}$ the triplet loss function; $p_{k}$ is the probability of the identity ground truth predicted by the classifier for local feature $v_{k}$; $\alpha$ is the margin; $D(v_{ak},v_{pk})$ is the distance between a positive feature pair from the same pedestrian, and $D(v_{ak},v_{nk})$ the distance between a negative feature pair from different pedestrians.
6. The method for re-identifying a clothing-changing pedestrian based on intra-image and inter-image relationships according to claim 4, wherein the intra-image relation features are expressed by the aggregated vector $u_i$ shown in formula (2):

$$u_i=\sum_{j\ne i}s(A_{ij})\,g_v(v_j)\tag{2}$$

where $s(\cdot)$ is the Softmax function that converts affinities into weights; $g_v(\cdot)$ is the linear projection function mapping a vector to the v matrix; $A\in\mathbb{R}^{(K+1)\times(K+1)}$ is the affinity matrix containing the similarity between any two features $v_i$ and $v_j$, with $i\ne j$, $i\in[1,2,\ldots,K,K+1]$ and $j\in[1,2,\ldots,K,K+1]$; when $i=K+1$ or $j=K+1$, $v_i$ or $v_j$ denotes the global feature, and otherwise a local feature;

wherein the affinity matrix $A$ is calculated according to formula (3):

$$A_{ij}=\frac{k\left(g_q(v_i),\,g_k(v_j)\right)}{\sqrt{d_k}}=\frac{q_i^{\mathsf{T}}k_j}{\sqrt{d_k}}\tag{3}$$

where $g_q(\cdot)$ and $g_k(\cdot)$ are the linear projection functions mapping a vector to the q matrix and the k matrix, respectively; $k(\cdot,\cdot)$ is the inner product function; $\frac{1}{\sqrt{d_k}}$ is the scale factor; $q_i$ denotes the matrix corresponding to $v_i$; $k_j$ denotes the matrix corresponding to $v_j$; and $\mathsf{T}$ denotes transposition.
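Formulas (2) and (3) amount to scaled dot-product attention in which each feature aggregates all other features but not itself. A minimal sketch, with the shared projection dimension as an assumption:

```python
import torch
import torch.nn as nn

class RelationAggregation(nn.Module):
    """u_i = sum_{j != i} s(A_ij) g_v(v_j), with A_ij = q_i^T k_j / sqrt(d_k)."""
    def __init__(self, dim: int):
        super().__init__()
        self.g_q = nn.Linear(dim, dim)  # g_q: v_i -> q_i
        self.g_k = nn.Linear(dim, dim)  # g_k: v_j -> k_j
        self.g_v = nn.Linear(dim, dim)  # g_v: v_j -> value vector
        self.scale = dim ** -0.5        # 1 / sqrt(d_k)

    def forward(self, tokens):          # tokens: (B, K+1, C)
        q, k, v = self.g_q(tokens), self.g_k(tokens), self.g_v(tokens)
        affinity = q @ k.transpose(1, 2) * self.scale       # (B, K+1, K+1)
        # Exclude self-affinity before the Softmax, since i != j in formula (2).
        eye = torch.eye(tokens.size(1), dtype=torch.bool, device=tokens.device)
        affinity = affinity.masked_fill(eye, float('-inf'))
        weights = affinity.softmax(dim=-1)                  # s(A_ij)
        return weights @ v                                  # aggregated u_i
```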
7. The method for re-identifying a clothing-changing pedestrian based on intra-image and inter-image relationships according to claim 1, wherein the inter-image relationship mining model comprises a second Transformer module; correspondingly, step 5 specifically comprises:
using the second Transformer module to construct the inter-image relation features among the N pedestrian images to be identified from the N fusion features corresponding to the N pedestrian images to be identified.
8. The method for re-identifying a clothing-changing pedestrian based on intra-image and inter-image relationships according to claim 7, wherein the inter-image relation features are expressed by the aggregated vector $w_d$ shown in formula (4):

$$w_d=\sum_{e\ne d}s(B_{de})\,g_v(f_e)\tag{4}$$

where $s(\cdot)$ is the Softmax function that converts affinities into weights; $g_v(\cdot)$ is the linear projection function mapping a vector to the v matrix; $B\in\mathbb{R}^{N\times N}$ is the affinity matrix containing the similarity between any two fusion features $f_d$ and $f_e$, with $d\ne e$, $d\in[1,2,\ldots,N]$ and $e\in[1,2,\ldots,N]$;

wherein the affinity matrix $B$ is calculated according to formula (5):

$$B_{de}=\frac{k\left(g_q(f_d),\,g_k(f_e)\right)}{\sqrt{d_k}}=\frac{q_d^{\mathsf{T}}k_e}{\sqrt{d_k}}\tag{5}$$

where $g_q(\cdot)$ and $g_k(\cdot)$ are the linear projection functions mapping a vector to the q matrix and the k matrix, respectively; $k(\cdot,\cdot)$ is the inner product function; $\frac{1}{\sqrt{d_k}}$ is the scale factor; $q_d$ denotes the matrix corresponding to $f_d$; $k_e$ denotes the matrix corresponding to $f_e$; and $\mathsf{T}$ denotes transposition.
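Formulas (4) and (5) apply the same aggregation across the batch rather than within one image: the N fusion features play the role of the tokens. Reusing the sketch given for formulas (2) and (3), with batch size and feature dimension as assumed values:

```python
import torch

N, C = 8, 2048                            # assumed batch size and feature dimension
fusion = torch.randn(1, N, C)             # N fusion features stacked as one sequence
inter_relation = RelationAggregation(C)   # same module as for formulas (2)-(3)
w = inter_relation(fusion)                # w_d for every image d in the batch
final_features = w * fusion               # fuse with the N fusion features
```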
9. The method for re-identifying a clothing-changing pedestrian based on intra-image and inter-image relationships according to claim 8, further comprising: constructing a loss function $L$ to optimize the inter-image relation features:

$$L=\lambda_1 L_{id}+\lambda_2 L_{tri}+\lambda_3 L_C$$

$$L_{id}=-\frac{1}{M}\sum_{r=1}^{M}\log(p_{f_r})$$

$$L_{tri}=\max\left(\alpha+D(f_{ar},f_{pr})-D(f_{ar},f_{nr}),\,0\right)$$

$$L_C=\frac{1}{2}\sum_{t=1}^{M}\left\|f_t-c_{y_t}\right\|_2^2$$

where $\lambda_1$, $\lambda_2$ and $\lambda_3$ denote weights; $L_{id}$ denotes the Identity Loss, $L_{tri}$ denotes the Triplet Loss, and $L_C$ denotes the Center Loss; $p_{f_r}$ is the probability, predicted by the classifier, that the fusion feature $f_r$ belongs to its identity ground truth; $\alpha$ is the margin; $D(f_{ar},f_{pr})$ denotes the distance between a positive feature pair $(f_{ar},f_{pr})$ from the same pedestrian, and $D(f_{ar},f_{nr})$ denotes the distance between a negative feature pair $(f_{ar},f_{nr})$ from different pedestrians; $M$ denotes the total number of fusion features; and $c_{y_t}$ denotes the class representation center of the fusion features with identity label $y_t$.
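A sketch of the combined loss of claim 9, with standard forms assumed for the three components and illustrative default weights:

```python
import torch
import torch.nn.functional as F

def inter_loss(logits, labels, feats, centers, pos_dist, neg_dist,
               alpha=0.3, lambdas=(1.0, 1.0, 5e-4)):
    """L = lambda_1 * L_id + lambda_2 * L_tri + lambda_3 * L_C.

    logits:  (M, num_ids) classifier outputs for the M fusion features
    labels:  (M,) identity ground truth y_t
    feats:   (M, C) fusion features f_t
    centers: (num_ids, C) learnable class representation centers c_{y_t}
    pos_dist / neg_dist: (M,) distances D(f_ar, f_pr) and D(f_ar, f_nr)
    """
    l_id = F.cross_entropy(logits, labels)              # Identity Loss: -log p_{f_r}
    l_tri = F.relu(alpha + pos_dist - neg_dist).mean()  # Triplet Loss, margin alpha
    l_c = 0.5 * ((feats - centers[labels]) ** 2).sum(dim=1).mean()  # Center Loss
    l1, l2, l3 = lambdas
    return l1 * l_id + l2 * l_tri + l3 * l_c
```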
10. A system for re-identifying clothing-changing pedestrians based on intra-image and inter-image relationships, characterized by comprising: an image preprocessing unit, an intra-image relation mining unit, an inter-image relation mining unit and an identification unit;
the image preprocessing unit is used for acquiring all pedestrian images to be identified and preprocessing them so that the pedestrians contained in all the pedestrian images to be identified wear the same clothes;
the intra-image relation mining unit is used for constructing an intra-image relationship mining model, and using it to perform intra-image relationship modeling on each of the N pedestrian images to be identified contained in an input batch to obtain N fusion features; the intra-image relationship modeling of each pedestrian image to be identified specifically comprises: extracting the global features and local features of the pedestrian image to be identified and taking the combination of the global features and the local features as the original intra-image features; constructing the intra-image relation features of the pedestrian image to be identified from the global features and the local features, and fusing the intra-image relation features with the original intra-image features to obtain the fusion features of the pedestrian image to be identified;
the inter-image relation mining unit is used for constructing an inter-image relationship mining model, and using it to construct the inter-image relation features among the N pedestrian images to be identified from the N fusion features corresponding to the N pedestrian images to be identified contained in the input batch, and fusing the inter-image relation features with the N fusion features to obtain the final features of each of the N pedestrian images to be identified;
the identification unit is used for judging, according to the final features of the N pedestrian images to be identified contained in the input batch, whether the pedestrian in each pedestrian image to be identified is the target pedestrian.
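The four units of claim 10 compose a linear pipeline; a schematic sketch in which every name is hypothetical:

```python
def reidentify_batch(batch_images, reference_image, system):
    """End-to-end flow of the claim-10 system (illustrative names throughout)."""
    # 1. Image preprocessing unit: dress every pedestrian in the same clothes.
    images = [system.preprocess(img, reference_image) for img in batch_images]
    # 2. Intra-image relation mining unit: one fusion feature per image.
    fusion = [system.intra_unit(img) for img in images]
    # 3. Inter-image relation mining unit: final features for the whole batch.
    final = system.inter_unit(fusion)
    # 4. Identification unit: compare against the target pedestrian's feature.
    return [system.match(f, system.target_feature) for f in final]
```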
CN202310324819.2A 2023-03-29 2023-03-29 Method and system for re-identifying clothing changing pedestrians based on relationship between images Pending CN116311377A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310324819.2A CN116311377A (en) 2023-03-29 2023-03-29 Method and system for re-identifying clothing changing pedestrians based on relationship between images

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310324819.2A CN116311377A (en) 2023-03-29 2023-03-29 Method and system for re-identifying clothing changing pedestrians based on relationship between images

Publications (1)

Publication Number Publication Date
CN116311377A true CN116311377A (en) 2023-06-23

Family

ID=86803112

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310324819.2A Pending CN116311377A (en) 2023-03-29 2023-03-29 Method and system for re-identifying clothing changing pedestrians based on relationship between images

Country Status (1)

Country Link
CN (1) CN116311377A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117671297A (en) * 2024-02-02 2024-03-08 华东交通大学 Pedestrian re-recognition method integrating interaction attributes
CN117831081A (en) * 2024-03-06 2024-04-05 齐鲁工业大学(山东省科学院) Method and system for re-identifying clothing changing pedestrians based on clothing changing data and residual error network
CN117831081B (en) * 2024-03-06 2024-05-24 齐鲁工业大学(山东省科学院) Method and system for re-identifying clothing changing pedestrians based on clothing changing data and residual error network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination