CN113128461A - Pedestrian re-recognition performance improving method based on human body key point mining full-scale features - Google Patents

Pedestrian re-recognition performance improving method based on human body key point mining full-scale features

Info

Publication number
CN113128461A
CN113128461A (application CN202110492149.6A)
Authority
CN
China
Prior art keywords
pedestrian
network
key point
human body
picture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110492149.6A
Other languages
Chinese (zh)
Other versions
CN113128461B (en)
Inventor
杨绿溪
韩志伟
胡欣毅
惠鸿儒
李春国
黄永明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University filed Critical Southeast University
Priority to CN202110492149.6A priority Critical patent/CN113128461B/en
Publication of CN113128461A publication Critical patent/CN113128461A/en
Application granted granted Critical
Publication of CN113128461B publication Critical patent/CN113128461B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/53Recognition of crowd images, e.g. recognition of crowd congestion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a pedestrian re-identification performance improvement method based on mining full-scale features with human body key points. It uses human body key points as local pedestrian features to relieve the insufficient discriminability of global pedestrian features under occlusion, and combines the key-point heatmaps produced by a human key point network with a pedestrian re-identification network to add local features. The visibility of each key point is also predicted and applied to both the loss function and the feature-distance computation, mitigating the negative influence of invisible local features on network performance. Experimental results show that the proposed method effectively improves pedestrian re-identification accuracy.

Description

Pedestrian re-recognition performance improving method based on human body key point mining full-scale features
Technical Field
The invention relates to pedestrian re-identification technology and belongs to the technical field of computer vision image retrieval.
Background
Pedestrian re-identification (ReID) addresses the problem of identifying and retrieving pedestrians across cameras and across scenes. Given a picture of a target pedestrian, the goal is to find the best-matching pedestrian in a gallery of pictures captured by other cameras. Because blind spots exist between cameras, the complete trajectory of a pedestrian cannot be obtained by target tracking alone; the pedestrian must instead be identified and matched between cameras, and if timestamps are also used, the number of candidate pictures can be greatly reduced. Pedestrian re-identification can thus be regarded as a picture retrieval technique specialized for pedestrians. Compared with face recognition, it places fewer constraints on the scene, making it better suited to security applications.
Early pedestrian re-identification algorithms were based on hand-crafted features. First, pedestrian features are extracted with manually designed templates; common choices include color, texture, local, and semantic features. The distance between the features of a query picture and those of each candidate picture is then computed with a suitable metric, such as the Mahalanobis distance, and finally each candidate is judged to match the query or not. The core of these traditional algorithms is designing pedestrian features with high discriminability together with appropriate metric learning.
With the development of deep learning, more and more computer vision tasks have adopted it with great success. In pedestrian re-identification, deep learning methods far outperform algorithms based on hand-crafted features, so deep learning has become the mainstream research direction in this field in recent years. Current mainstream approaches include methods based on representation learning, local features, generative adversarial networks, and metric learning.
Disclosure of Invention
The purpose of the invention is as follows: in view of the prior art, a pedestrian re-identification performance improvement method based on mining full-scale features with human body key points is provided to address the problem of pedestrian occlusion in pedestrian re-identification.
The technical scheme is as follows: a pedestrian re-identification performance improvement method based on mining full-scale features with human body key points comprises the following steps:
Step 1: training an hourglass network for human key point detection on a human key point detection data set, and inputting a pedestrian re-identification picture into the trained hourglass network to obtain heatmaps of the pedestrian's key points;
Step 2: inputting the pedestrian heatmaps from the human key point detection data set into a visibility classification sub-network to train it, and inputting the key-point heatmaps obtained in step 1 into the trained sub-network for classification to obtain a visibility probability for each human key point;
Step 3: inputting the pedestrian re-identification picture into the full-scale network of the pedestrian re-identification network to obtain the pedestrian's global feature map before global average pooling;
Step 4: multiplying the global feature map obtained in step 3 by each key-point heatmap to obtain a pedestrian feature map for each key point, and passing these feature maps together with the global feature map through the subsequent global average pooling to obtain the pedestrian's global and local features;
Step 5: during training, inputting the global and local features into a classifier to obtain, for each feature, a probability over pedestrian identities, wherein the loss function is the weighted average of the cross-entropy losses between each feature's predicted probabilities and the true identity, with the key-point visibility probabilities as weights;
during testing, obtaining the global and local features of the query picture and of each database picture as in step 4 and computing the distance between the two pictures' features, wherein the distance is a weighted average of the global and local Euclidean distances and each local feature's weight is the normalized visibility probability of the corresponding human key point.
Further, step 1 specifically comprises the following sub-steps:
Step 1.1: training the hourglass network for human key point detection by cropping the pedestrian data set in the human key point detection data set according to the provided bounding boxes to obtain individual pedestrian pictures, so that the hourglass network can be trained on single-person key point detection;
Step 1.2: performing data augmentation on each pedestrian picture obtained in step 1.1, ensuring a picture size of 256 × 128, the augmentation comprising picture flipping, resizing and padding;
Step 1.3: inputting the augmented pictures into the hourglass network, which comprises several stacked hourglass modules, each module capturing the pedestrian's global and local information, combining them to predict key-point heatmaps, and passing the predicted heatmaps to the next hourglass module as input, until training of the hourglass network is completed;
Step 1.4: resizing the pedestrian re-identification picture to 256 × 128 and inputting it into the trained hourglass network to obtain heatmaps of the pedestrian's key points.
Further, step 2 specifically comprises the following sub-steps:
Step 2.1: training the visibility classification sub-network by inputting the key-point heatmaps into its convolutional layer to obtain heatmap features, feeding these features into a fully connected layer, and finally constraining the output to the range 0 to 1 with an activation function, so that the sub-network's output represents the visibility probability of each human key point;
Step 2.2: using the visibility probabilities as input to a binary classification loss function, with visible key points labeled 1 and invisible key points labeled 0;
Step 2.3: taking the key-point heatmaps obtained in step 1.4 as input to the visibility classification sub-network trained in step 2.2, and passing the sub-network's output through the activation function to obtain the visibility probability of each human key point.
Further, step 3 specifically comprises the following sub-steps:
Step 3.1: scaling the pedestrian re-identification picture to the common size 256 × 128, then augmenting it by horizontal flipping;
Step 3.2: inputting the augmented picture into the full-scale network of the pedestrian re-identification network, which first extracts features with a 7 × 7 convolution and a max-pooling layer, then feeds them through 4 residual modules to obtain new features, namely the pedestrian's global feature map before global average pooling.
Further, step 4 specifically comprises the following sub-steps:
Step 4.1: bilinearly interpolating the key-point heatmaps obtained in step 1 so that their size matches that of the global feature map obtained in step 3;
Step 4.2: multiplying each heatmap from step 4.1 by the global feature map from step 3 to obtain a pedestrian feature map for each key point;
Step 4.3: inputting the per-key-point feature maps and the global feature map into a global average pooling layer, then flattening each feature to obtain the pedestrian's local and global representations.
Further, step 5 specifically comprises the following sub-steps:
Step 5.1: during training, feeding the local and global representations obtained in step 4 through the classifier's fully connected layer and activation layer in turn, the activation function constraining the outputs to the range 0 to 1 with a sum of 1, so that they represent probabilities over pedestrian identities;
Step 5.2: computing the network loss from the probabilities of step 5.1 using cross-entropy, the loss comprising the classification loss of the pedestrian's global representation and those of the local representations, and forming the loss function as their weighted average with the key-point visibility probabilities as weights;
Step 5.3: during testing, obtaining the local and global representations of the query picture using step 4, and those of the database picture gallery by the same method;
Step 5.4: computing the distance between the query picture and every picture representation in the database, each distance being a weighted average of the Euclidean distances of the global and local representations with each local weight the normalized visibility probability of the corresponding key point, and taking the database picture with the smallest distance to the query picture as the match for the query.
Beneficial effects: in real scenes, occlusion is inevitable. To solve pedestrian re-identification under occlusion, the invention improves and optimizes two aspects. On one hand, extracting local pedestrian features is essential to handling occlusion, since effective local features avoid the influence occlusion has on global features; key points are local features commonly used for pedestrians, so combining human key points with pedestrian re-identification is effective. On the other hand, occlusion can make key points invisible, and invisible local features degrade overall network performance, so it is necessary to increase the weight of visible key-point features and decrease that of invisible ones. The invention introduces key-point visibility and applies it to the network's loss function and feature-distance computation, reducing the influence of invisible key points.
Moreover, the heatmaps generated by the human key point detection network are used to extract the pedestrian's local features, avoiding manually designed key-point region sizes. The method reaches 68.1% accuracy on the occluded pedestrian re-identification data set, a large improvement over current algorithms.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a schematic diagram of a full-scale interleaved-branch module;
FIG. 3 is a diagram of the overall structure of a full-scale network;
FIG. 4 is a schematic view of an hourglass module of the hourglass network;
FIG. 5 is an overall structural view of the hourglass network;
fig. 6 is a diagram of the overall network architecture of the present invention.
Detailed Description
The invention is further explained below with reference to the drawings.
As shown in fig. 1 to fig. 6, the pedestrian re-identification performance improvement method based on mining full-scale features with human body key points comprises the following steps:
Step 1: train an hourglass network for human key point detection on a human key point detection data set, then input a pedestrian re-identification picture into the trained hourglass network to obtain heatmaps of the pedestrian's key points.
Step 1 specifically comprises the following sub-steps:
Step 1.1: to train the hourglass network, crop the pedestrian data set in the human key point detection data set according to the provided bounding boxes to obtain individual pedestrian pictures, so that the hourglass network can be trained on single-person key point detection.
Step 1.2: perform data augmentation on each pedestrian picture obtained in step 1.1, ensuring a picture size of 256 × 128; the augmentation comprises picture flipping, resizing and padding.
Step 1.3: input the augmented pictures into the hourglass network, which comprises several stacked hourglass modules; each module captures the pedestrian's global and local information, combines them to predict key-point heatmaps, and passes the predicted heatmaps to the next hourglass module as input, until training of the hourglass network is completed.
Step 1.4: resize the pedestrian re-identification picture to 256 × 128 and input it into the trained hourglass network to obtain heatmaps of the pedestrian's key points.
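For concreteness, the heatmaps of step 1 can be pictured with a small numpy sketch. The patent does not give the heatmap formula; the common practice for hourglass networks, assumed here, is a 2D Gaussian rendered around each key point at a reduced resolution (64 × 32 for a 256 × 128 input is an assumption, matching the typical 1/4 output downsampling).

```python
import numpy as np

def gaussian_heatmap(h, w, cx, cy, sigma=2.0):
    """Render a 2D Gaussian centered at (cx, cy) on an h x w grid."""
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))

# One heatmap per key point; a 256 x 128 input gives 64 x 32 heatmaps
# under the assumed 1/4 output resolution.
heatmap = gaussian_heatmap(64, 32, cx=16, cy=20)
```

The peak of the heatmap sits at the key-point location, which is what step 4 later exploits when the heatmap is multiplied with the feature map.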
Step 2: input the pedestrian heatmaps from the human key point detection data set into a visibility classification sub-network to train it, then input the key-point heatmaps obtained in step 1 into the trained sub-network for classification to obtain a visibility probability for each human key point.
Step 2 specifically comprises the following sub-steps:
Step 2.1: train the visibility classification sub-network by inputting the key-point heatmaps into its convolutional layer to obtain heatmap features, feeding these features into a fully connected layer, and finally constraining the output to the range 0 to 1 with an activation function, so that the sub-network's output represents the visibility probability of each human key point.
Step 2.2: use the visibility probabilities as input to a binary classification loss function, with visible key points labeled 1 and invisible key points labeled 0.
Step 2.3: take the key-point heatmaps obtained in step 1.4 as input to the visibility classification sub-network trained in step 2.2, and pass the sub-network's output through the activation function to obtain the visibility probability of each human key point.
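Steps 2.1 and 2.2 amount to a sigmoid output trained with a binary cross-entropy loss against 0/1 visibility labels. A minimal numpy sketch, with hypothetical logits standing in for the convolutional and fully connected layers the patent describes:

```python
import numpy as np

def sigmoid(z):
    """Activation constraining each output to the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def visibility_bce(logits, labels, eps=1e-12):
    """Binary cross-entropy between predicted visibility and 0/1 labels."""
    p = sigmoid(logits)
    return -np.mean(labels * np.log(p + eps) + (1 - labels) * np.log(1 - p + eps))

logits = np.array([2.0, -1.5, 3.0])  # hypothetical per-key-point scores
labels = np.array([1.0, 0.0, 1.0])   # 1 = visible, 0 = occluded
probs = sigmoid(logits)              # visibility probabilities (step 2.3)
loss = visibility_bce(logits, labels)
```

At test time only the sigmoid output is used, as the per-key-point visibility probability of step 2.3.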
Step 3: input the pedestrian re-identification picture into the full-scale network of the pedestrian re-identification network to obtain the pedestrian's global feature map before global average pooling.
Step 3 specifically comprises the following sub-steps:
Step 3.1: scale the pedestrian re-identification picture to the common size 256 × 128, then augment it by horizontal flipping.
Step 3.2: input the augmented picture into the full-scale network of the pedestrian re-identification network, which first extracts features with a 7 × 7 convolution and a max-pooling layer, then feeds them through 4 residual modules to obtain new features, namely the pedestrian's global feature map before global average pooling. The residual modules use depthwise separable convolutions to reduce model parameters and computation, and each module learns multi-scale features through several interleaved parallel branches joined by residual connections.
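The parameter saving from the depthwise separable convolutions mentioned above can be checked by counting weights; the 256-channel, 3 × 3 setting below is illustrative and not taken from the patent:

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def separable_params(c_in, c_out, k):
    """Depthwise k x k filter per input channel plus a 1 x 1 pointwise projection."""
    return c_in * k * k + c_in * c_out

std = conv_params(256, 256, 3)       # 589824 weights
sep = separable_params(256, 256, 3)  # 67840 weights, roughly 8.7x fewer
```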
Step 4: multiply the global feature map obtained in step 3 by each key-point heatmap to obtain a pedestrian feature map for each key point, then pass these feature maps together with the global feature map through the subsequent global average pooling to obtain the pedestrian's global and local features.
Step 4 specifically comprises the following sub-steps:
Step 4.1: bilinearly interpolate the key-point heatmaps obtained in step 1 so that their size matches that of the global feature map obtained in step 3.
Step 4.2: multiply each heatmap from step 4.1 by the global feature map from step 3 to obtain a pedestrian feature map for each key point.
Step 4.3: input the per-key-point feature maps and the global feature map into a global average pooling layer, then flatten each feature to obtain the pedestrian's local and global representations.
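Steps 4.1 to 4.3 reduce to an elementwise multiply followed by global average pooling. A numpy sketch with hypothetical sizes (512 channels, a 16 × 8 feature map and 17 key points are assumptions; none of these values appear in the patent):

```python
import numpy as np

rng = np.random.default_rng(0)
C, H, W, K = 512, 16, 8, 17        # channels, feature size, key points (assumed)
feat = rng.random((C, H, W))       # global feature map before pooling (step 3)
heatmaps = rng.random((K, H, W))   # key-point heatmaps resized to (H, W) (step 4.1)

# Steps 4.2-4.3: heatmap-weighted feature map, then global average pooling
# (the spatial mean) and flattening to one vector per key point.
local = np.stack([(feat * hm).mean(axis=(1, 2)) for hm in heatmaps])  # (K, C)
global_feat = feat.mean(axis=(1, 2))                                  # (C,)
```

Each row of `local` is the local representation of one key point; `global_feat` is the global representation used alongside them in step 5.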
Step 5: during training, input the global and local features into a classifier to obtain, for each feature, a probability over pedestrian identities; the loss function is the weighted average of the cross-entropy losses between each feature's predicted probabilities and the true identity, with the key-point visibility probabilities as weights. During testing, obtain the global and local features of the query picture and of each database picture as in step 4 and compute the distance between the two pictures' features; the distance is a weighted average of the global and local Euclidean distances, where each local feature's weight is the normalized visibility probability of the corresponding human key point.
Step 5 specifically comprises the following sub-steps:
Step 5.1: during training, feed the local and global representations obtained in step 4 through the classifier's fully connected layer and activation layer in turn; the activation function constrains the outputs to the range 0 to 1 with a sum of 1, so that they represent probabilities over pedestrian identities.
Step 5.2: compute the network loss from the probabilities of step 5.1 using cross-entropy; the loss comprises the classification loss of the pedestrian's global representation and those of the local representations, and the loss function is their weighted average with the key-point visibility probabilities as weights.
Step 5.3: during testing, obtain the local and global representations of the query picture using step 4, and those of the database picture gallery by the same method.
Step 5.4: compute the distance between the query picture and every picture representation in the database, each distance being a weighted average of the Euclidean distances of the global and local representations with each local weight the normalized visibility probability of the corresponding key point; the database picture with the smallest distance to the query picture is taken as the match for the query.
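The test-time matching of step 5.4 can be sketched as below. The patent specifies a weighted average of the global and local Euclidean distances with normalized visibility as the local weights, but not the global/local balance; the equal 0.5/0.5 split here is an assumption.

```python
import numpy as np

def match_distance(gq, gd, lq, ld, vis):
    """Query-gallery distance: global Euclidean distance averaged with
    per-key-point local distances weighted by normalized visibility."""
    w = vis / (vis.sum() + 1e-12)              # normalized visibility weights
    d_global = np.linalg.norm(gq - gd)
    d_local = np.linalg.norm(lq - ld, axis=1)  # one distance per key point
    return 0.5 * d_global + 0.5 * np.dot(w, d_local)  # assumed 0.5/0.5 split

rng = np.random.default_rng(1)
gq, gd = rng.random(512), rng.random(512)              # global representations
lq, ld = rng.random((17, 512)), rng.random((17, 512))  # local representations
vis = rng.random(17)                                   # visibility probabilities
d = match_distance(gq, gd, lq, ld, vis)
```

The gallery picture minimizing this distance is returned as the match for the query.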
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various modifications and refinements without departing from the principle of the invention, and such modifications and refinements should also be regarded as falling within the protection scope of the invention.

Claims (6)

1. A pedestrian re-identification performance improvement method based on mining full-scale features with human body key points, characterized by comprising the following steps:
Step 1: training an hourglass network for human key point detection on a human key point detection data set, and inputting a pedestrian re-identification picture into the trained hourglass network to obtain heatmaps of the pedestrian's key points;
Step 2: inputting the pedestrian heatmaps from the human key point detection data set into a visibility classification sub-network to train it, and inputting the key-point heatmaps obtained in step 1 into the trained sub-network for classification to obtain a visibility probability for each human key point;
Step 3: inputting the pedestrian re-identification picture into the full-scale network of the pedestrian re-identification network to obtain the pedestrian's global feature map before global average pooling;
Step 4: multiplying the global feature map obtained in step 3 by each key-point heatmap to obtain a pedestrian feature map for each key point, and passing these feature maps together with the global feature map through the subsequent global average pooling to obtain the pedestrian's global and local features;
Step 5: during training, inputting the global and local features into a classifier to obtain, for each feature, a probability over pedestrian identities, wherein the loss function is the weighted average of the cross-entropy losses between each feature's predicted probabilities and the true identity, with the key-point visibility probabilities as weights;
during testing, obtaining the global and local features of the query picture and of each database picture as in step 4 and computing the distance between the two pictures' features, wherein the distance is a weighted average of the global and local Euclidean distances and each local feature's weight is the normalized visibility probability of the corresponding human key point.
2. The pedestrian re-identification performance improvement method based on mining full-scale features with human body key points according to claim 1, characterized in that step 1 specifically comprises the following sub-steps:
Step 1.1: training the hourglass network for human key point detection by cropping the pedestrian data set in the human key point detection data set according to the provided bounding boxes to obtain individual pedestrian pictures, so that the hourglass network can be trained on single-person key point detection;
Step 1.2: performing data augmentation on each pedestrian picture obtained in step 1.1, ensuring a picture size of 256 × 128, the augmentation comprising picture flipping, resizing and padding;
Step 1.3: inputting the augmented pictures into the hourglass network, which comprises several stacked hourglass modules, each module capturing the pedestrian's global and local information, combining them to predict key-point heatmaps, and passing the predicted heatmaps to the next hourglass module as input, until training of the hourglass network is completed;
Step 1.4: resizing the pedestrian re-identification picture to 256 × 128 and inputting it into the trained hourglass network to obtain heatmaps of the pedestrian's key points.
3. The method for improving pedestrian re-identification performance based on human body key point mining full-scale features of claim 2, wherein step 2 specifically comprises the following sub-steps:
step 2.1: training the visibility classification sub-network: inputting the heatmaps of the pedestrian key points into the convolution layers of the visibility classification sub-network to obtain heatmap features, inputting the heatmap features into a fully connected layer, and finally constraining the output to the range 0 to 1 through an activation function, so that the output of the visibility classification sub-network represents the visibility probability of each human body key point;
step 2.2: taking the visibility probabilities as the input of a binary classification loss function, with visible key points labeled 1 and invisible key points labeled 0;
step 2.3: taking the heatmaps of the pedestrian key points obtained in step 1.4 as the input of the visibility classification sub-network trained in step 2.2, and feeding the output of the sub-network through the activation function to obtain the visibility probability of each human body key point.
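A minimal sketch of the head of the visibility classification sub-network in steps 2.1 and 2.2, assuming a single fully connected layer with a sigmoid activation and hypothetical weight shapes; the convolutional feature extractor is omitted:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def visibility_probs(heatmap_feats, weights, bias):
    # hypothetical fully connected head: flatten the heatmap features,
    # apply one linear layer, then squash each score into (0, 1)
    return sigmoid(heatmap_feats.reshape(-1) @ weights + bias)

def binary_cross_entropy(probs, labels):
    # step 2.2 loss: labels are 1 for visible key points, 0 for invisible
    eps = 1e-12
    return float(-np.mean(labels * np.log(probs + eps)
                          + (1.0 - labels) * np.log(1.0 - probs + eps)))

rng = np.random.default_rng(1)
feats = rng.standard_normal((8, 16))    # toy heatmap features from the conv layers
W = rng.standard_normal((8 * 16, 6))    # 6 key points, purely illustrative
b = np.zeros(6)
probs = visibility_probs(feats, W, b)
loss = binary_cross_entropy(probs, np.array([1.0, 1.0, 0.0, 1.0, 0.0, 1.0]))
```

Because the sigmoid output lies strictly in (0, 1), each entry of `probs` can be read directly as a per-key-point visibility probability.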
4. The method for improving pedestrian re-identification performance based on human body key point mining full-scale features of claim 1, wherein step 3 specifically comprises the following sub-steps:
step 3.1: scaling each pedestrian re-identification picture to the same 256×128 size, and then performing data enhancement on it through horizontal flipping;
step 3.2: inputting the data-enhanced picture into the full-scale network of the pedestrian re-identification network: features are first extracted with a 7×7 convolution layer and a max pooling layer, and the extracted features are then fed into 4 residual modules to obtain new features, namely the global features of the pedestrian before global average pooling.
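The feature-extraction flow of step 3.2 can be outlined as below; the strided 7×7 convolution and max pooling are approximated by plain spatial downsampling, and the residual modules are toy stand-ins for the full-scale network's learned blocks:

```python
import numpy as np

def residual_block(x, scale):
    # toy residual module: identity shortcut plus a ReLU branch;
    # real residual modules use learned convolutions, this only
    # mimics the x + F(x) structure
    return x + np.maximum(0.0, scale * x)

def extract_global_features(picture):
    # stand-in for step 3.2: the strided 7x7 convolution and max pooling
    # are approximated by 4x spatial downsampling, then 4 residual modules;
    # the output plays the role of the global features before pooling
    feat = picture[..., ::4, ::4]               # 256x128 -> 64x32
    for scale in (0.1, 0.2, 0.3, 0.4):          # 4 residual modules
        feat = residual_block(feat, scale)
    return feat

img = np.random.default_rng(2).random((3, 256, 128))   # C x H x W picture
global_feat = extract_global_features(img)
```

The point of the sketch is the shape bookkeeping: a 256×128 input leaves the stem as a 64×32 feature map, which is what the heatmaps are later resized to match.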
5. The method for improving pedestrian re-identification performance based on human body key point mining full-scale features of claim 1, wherein step 4 specifically comprises the following sub-steps:
step 4.1: performing bilinear interpolation on the heatmaps of the pedestrian key points obtained in step 1 so that their size matches that of the global features obtained in step 3;
step 4.2: multiplying each heatmap obtained in step 4.1 with the global features obtained in step 3 to obtain the pedestrian feature map corresponding to each key point;
step 4.3: inputting the pedestrian feature map corresponding to each key point, together with the global features, into a global average pooling layer, and then flattening each feature to obtain the local representations and the global representation of the pedestrian.
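Steps 4.1 to 4.3 can be sketched in NumPy as a bilinear resize of each heatmap, an element-wise product with the global features, and global average pooling; the channel count and heatmap sizes are placeholders:

```python
import numpy as np

def bilinear_resize(img, out_h, out_w):
    # plain NumPy bilinear interpolation of a single-channel map (step 4.1)
    h, w = img.shape
    ys = np.linspace(0.0, h - 1, out_h)
    xs = np.linspace(0.0, w - 1, out_w)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]
    wx = (xs - x0)[None, :]
    top = img[np.ix_(y0, x0)] * (1 - wx) + img[np.ix_(y0, x1)] * wx
    bot = img[np.ix_(y1, x0)] * (1 - wx) + img[np.ix_(y1, x1)] * wx
    return top * (1 - wy) + bot * wy

def keypoint_representations(global_feat, heatmaps):
    # steps 4.2-4.3: mask the global features with each resized heatmap,
    # then global-average-pool and flatten into vectors
    c, H, W = global_feat.shape
    local_reps = []
    for hm in heatmaps:
        mask = bilinear_resize(hm, H, W)[None]      # broadcast over channels
        local_reps.append((global_feat * mask).mean(axis=(1, 2)))
    global_rep = global_feat.mean(axis=(1, 2))      # global representation
    return np.stack(local_reps), global_rep

rng = np.random.default_rng(3)
feat = rng.random((8, 64, 32))      # hypothetical global features (C, H, W)
hms = rng.random((5, 16, 8))        # 5 key-point heatmaps at lower resolution
local_reps, global_rep = keypoint_representations(feat, hms)
```

Each local representation is thus a C-dimensional vector whose entries are down-weighted wherever the corresponding key-point heatmap is near zero.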
6. The method for improving pedestrian re-identification performance based on human body key point mining full-scale features of claim 1, wherein step 5 specifically comprises the following sub-steps:
step 5.1: during training, sequentially inputting the local representations and the global representation of the pedestrian obtained in step 4 into the fully connected layer and activation layer of a classifier, the activation function constraining the outputs to the range 0 to 1 with their sum equal to 1, so that the outputs represent the probabilities of the different pedestrian identities;
step 5.2: computing the network loss from the probabilities obtained in step 5.1 with the cross-entropy loss; the loss comprises the classification loss of the global representation and the classification losses of the local representations of the pedestrian, and the loss function is the weighted average of these losses, the weights being the visibility probabilities of the human body key points;
step 5.3: during testing, obtaining the local representations and the global representation of the query picture by step 4, and obtaining the local representations and the global representations of the database pictures by the method of step 4;
step 5.4: computing the distance between the query picture and the representation of every picture in the database, each distance being the weighted average of the Euclidean distance of the global representations and the Euclidean distances of the local representations, the local weights being the normalized visibility probabilities of the human body key points; then taking the database picture with the smallest distance to the query picture as the picture matching the query.
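The matching rule of step 5.4 can be sketched as below; the equal split between the global term and the visibility-weighted local term is an assumption for illustration, since the claim only specifies a weighted average:

```python
import numpy as np

def match_distance(q_glob, q_locs, g_glob, g_locs, vis):
    # step 5.4: Euclidean distance of the global representations averaged
    # with the visibility-weighted Euclidean distances of the local ones;
    # the 0.5/0.5 split between the two terms is an assumption
    d_global = np.linalg.norm(q_glob - g_glob)
    d_locals = np.linalg.norm(q_locs - g_locs, axis=1)
    w = vis / vis.sum()                   # normalized visibility probabilities
    return 0.5 * d_global + 0.5 * float(w @ d_locals)

def best_match(query, gallery, vis):
    # return the index of the database picture closest to the query;
    # query and every gallery entry are (global, locals) pairs
    q_glob, q_locs = query
    dists = [match_distance(q_glob, q_locs, g, l, vis) for g, l in gallery]
    return int(np.argmin(dists))

q = (np.zeros(8), np.zeros((5, 8)))
gallery = [(np.ones(8), np.ones((5, 8))),            # far from the query
           (np.full(8, 0.1), np.full((5, 8), 0.1))]  # close to the query
vis = np.array([0.9, 0.8, 0.1, 0.7, 0.5])
idx = best_match(q, gallery, vis)
```

Normalizing the visibility probabilities keeps occluded key points from dominating the distance: a key point with near-zero visibility contributes almost nothing to the match score.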
CN202110492149.6A 2021-05-06 2021-05-06 Pedestrian re-recognition performance improving method based on human body key point mining full-scale features Active CN113128461B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110492149.6A CN113128461B (en) 2021-05-06 2021-05-06 Pedestrian re-recognition performance improving method based on human body key point mining full-scale features

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110492149.6A CN113128461B (en) 2021-05-06 2021-05-06 Pedestrian re-recognition performance improving method based on human body key point mining full-scale features

Publications (2)

Publication Number Publication Date
CN113128461A true CN113128461A (en) 2021-07-16
CN113128461B CN113128461B (en) 2022-11-08

Family

ID=76781541

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110492149.6A Active CN113128461B (en) 2021-05-06 2021-05-06 Pedestrian re-recognition performance improving method based on human body key point mining full-scale features

Country Status (1)

Country Link
CN (1) CN113128461B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109784258A * 2019-01-08 2019-05-21 South China University of Technology Pedestrian re-identification method based on multi-scale feature cropping and fusion
CN110796026A * 2019-10-10 2020-02-14 Hubei University of Technology Pedestrian re-identification method based on global feature stitching
CN111666843A * 2020-05-25 2020-09-15 Hubei University of Technology Pedestrian re-identification method based on global feature and local feature splicing

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115830637A * 2022-12-13 2023-03-21 Hangzhou Dianzi University Method for re-identifying occluded pedestrians based on pose estimation and background suppression
CN115830637B * 2022-12-13 2023-06-23 Hangzhou Dianzi University Method for re-identifying occluded pedestrians based on pose estimation and background suppression
US11908222B1 (en) 2022-12-13 2024-02-20 Hangzhou Dianzi University Occluded pedestrian re-identification method based on pose estimation and background suppression

Also Published As

Publication number Publication date
CN113128461B (en) 2022-11-08

Similar Documents

Publication Publication Date Title
CN111126379B (en) Target detection method and device
CN109543667B (en) Text recognition method based on attention mechanism
CN110738207A (en) character detection method for fusing character area edge information in character image
CN111832568B (en) License plate recognition method, training method and device of license plate recognition model
CN111967470A (en) Text recognition method and system based on decoupling attention mechanism
Kang et al. Deep learning-based weather image recognition
CN114187450A (en) Remote sensing image semantic segmentation method based on deep learning
CN111310728B (en) Pedestrian re-identification system based on monitoring camera and wireless positioning
CN109635726B (en) Landslide identification method based on combination of symmetric deep network and multi-scale pooling
CN113744311A Siamese neural network moving target tracking method based on fully connected attention module
CN105243154A Remote sensing image retrieval method and system based on salient point features and sparse autoencoders
CN111460980A (en) Multi-scale detection method for small-target pedestrian based on multi-semantic feature fusion
CN113627266A (en) Video pedestrian re-identification method based on Transformer space-time modeling
CN114330529A Real-time occluded pedestrian detection method based on improved YOLOv4
CN112434599A Pedestrian re-identification method based on random occlusion recovery of noise channels
CN112861840A (en) Complex scene character recognition method and system based on multi-feature fusion convolutional network
CN115240121A (en) Joint modeling method and device for enhancing local features of pedestrians
CN113128461B (en) Pedestrian re-recognition performance improving method based on human body key point mining full-scale features
Xia et al. Urban remote sensing scene recognition based on lightweight convolution neural network
CN112668662B (en) Outdoor mountain forest environment target detection method based on improved YOLOv3 network
CN114022703A (en) Efficient vehicle fine-grained identification method based on deep learning
CN111079585B Pedestrian re-identification method combining image enhancement with pseudo-Siamese convolutional neural network
Cho et al. Modified perceptual cycle generative adversarial network-based image enhancement for improving accuracy of low light image segmentation
Saha et al. Neural network based road sign recognition
CN107358200B (en) Multi-camera non-overlapping vision field pedestrian matching method based on sparse learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant