CN113128461A - Pedestrian re-identification performance improvement method based on mining full-scale features with human body key points - Google Patents
Pedestrian re-identification performance improvement method based on mining full-scale features with human body key points
- Publication number
- CN113128461A (application CN202110492149.6A)
- Authority
- CN
- China
- Prior art keywords
- pedestrian
- network
- key point
- human body
- picture
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
- G06V20/53—Recognition of crowd images, e.g. recognition of crowd congestion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Molecular Biology (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a method for improving pedestrian re-identification performance based on mining full-scale features with human body key points. The method uses human body key points as local pedestrian features to relieve the insufficient discriminability of global pedestrian features under occlusion: key point heat maps produced by a human body key point network are combined with the pedestrian re-identification network to add local features. Meanwhile, the visibility of each key point is predicted and applied to both the loss function and the feature distance computation, mitigating the negative impact of invisible local features on network performance. Experimental results show that the method effectively improves pedestrian re-identification accuracy.
Description
Technical Field
The invention relates to pedestrian re-identification technology and belongs to the technical field of computer vision image retrieval.
Background
Pedestrian re-identification (ReID) addresses identifying and retrieving pedestrians across cameras and scenes: given a picture of a target pedestrian, find the best match in a gallery of pedestrian pictures taken by other cameras. Because there are blind spots between cameras, a pedestrian's complete trajectory cannot be obtained with target tracking alone; the pedestrian must instead be identified and matched across cameras, and combining time stamps can greatly reduce the number of candidate pictures. Pedestrian re-identification can thus be regarded as a picture retrieval technique aimed specifically at pedestrians. Compared with face recognition, it imposes fewer constraints on the scene, making it better suited to security applications.
Early pedestrian re-identification algorithms were based on hand-crafted feature extraction. Features of pedestrians, commonly colour, texture, local, and semantic features, were extracted with manually designed templates; the distance between the features of a query picture and those of each candidate picture was then computed with a suitable metric such as the Mahalanobis distance; finally, whether a candidate matched the query could be judged. The core of these traditional algorithms was designing more discriminative pedestrian features and more appropriate metric learning.
With the development of deep learning, more and more computer vision tasks have adopted it with great success. In pedestrian re-identification, deep learning methods far outperform algorithms based on hand-crafted features, and deep learning has become the mainstream research direction in the field in recent years. Current mainstream approaches include methods based on representation learning, local features, generative adversarial networks, and metric learning.
Disclosure of Invention
The purpose of the invention is as follows: in view of the prior art, a method for improving pedestrian re-identification performance based on mining full-scale features with human body key points is provided, addressing the problem of occluded pedestrians in pedestrian re-identification.
The technical scheme is as follows: a method for improving pedestrian re-identification performance based on mining full-scale features with human body key points comprises the following steps:
Step 1: train an hourglass network for human body key point detection on a human body key point detection dataset, then feed the pedestrian re-identification picture into the trained hourglass network to obtain heat maps of the pedestrian key points;
Step 2: feed the pedestrian heat maps from the human body key point detection dataset into a visibility classification sub-network and train it; then feed the pedestrian key point heat maps obtained in step 1 into the trained sub-network for classification to obtain the visibility probability of each human body key point;
Step 3: feed the pedestrian re-identification picture into the full-scale network of the pedestrian re-identification branch to obtain the pedestrian's global feature map before global average pooling;
Step 4: multiply the global feature map obtained in step 3 with the heat map of each pedestrian key point to obtain a pedestrian feature map per key point, then pass these feature maps and the global feature map through global average pooling to obtain the pedestrian's global and local features;
Step 5: during training, feed the global and local features into a classifier to obtain each feature's probability for each pedestrian identity; the loss function is the weighted average of the cross-entropy losses between these predicted probabilities and the true identity, with weights given by the visibility probabilities of the human body key points;
during testing, obtain the global and local features of the query picture and each database picture as in step 4, then compute the distance between the two pictures as the weighted average of the Euclidean distances of their global and local features, where each local feature's weight is the normalised visibility probability of the corresponding human body key point.
Further, step 1 specifically comprises the following sub-steps:
Step 1.1: to train the hourglass network for human body key point detection, crop each pedestrian from the human body key point detection dataset according to the provided bounding box, so that the hourglass network can be trained on single-person key point detection;
Step 1.2: apply data augmentation to each pedestrian picture obtained in step 1.1 and resize it to 256 × 128, the augmentation comprising picture flipping, scaling, and padding;
Step 1.3: feed the augmented pictures into the hourglass network, which comprises several hourglass modules; each module captures global and local information of the pedestrian, combines them to predict heat maps of the pedestrian key points, and passes the predicted heat maps to the next hourglass module, until training of the hourglass network is completed;
Step 1.4: resize the pedestrian re-identification picture to 256 × 128 and feed it into the trained hourglass network to obtain the heat maps of the pedestrian key points.
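As an illustration of the heat maps used in steps 1.3 and 1.4, the regression target for one key point is a 2-D Gaussian centred on the joint location. The NumPy sketch below (hypothetical helper names; not the patent's hourglass implementation) renders such per-key-point heat maps:

```python
import numpy as np

def keypoint_heatmap(h, w, cx, cy, sigma=2.0):
    """Render a 2-D Gaussian heat map peaked at keypoint (cx, cy).

    This mirrors the target an hourglass network regresses for each
    keypoint: a map over the image grid with values in (0, 1],
    concentrated around the annotated joint location.
    """
    ys, xs = np.mgrid[0:h, 0:w]
    d2 = (xs - cx) ** 2 + (ys - cy) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

# One heat map per keypoint, stacked into shape (K, H, W)
heatmaps = np.stack([keypoint_heatmap(64, 32, 16, 10),
                     keypoint_heatmap(64, 32, 16, 40)])
```

A trained hourglass network would predict one such map per key point; here they are generated directly for illustration.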
Further, step 2 specifically comprises the following sub-steps:
Step 2.1: to train the visibility classification sub-network, feed the pedestrian key point heat maps into its convolution layers to extract heat-map features, pass these through a fully connected layer, and finally constrain the output to the range 0 to 1 with an activation function, so that each output of the sub-network represents the visibility probability of the corresponding human body key point;
Step 2.2: train with a binary classification loss on the visibility probabilities, labelling visible key points 1 and invisible key points 0;
Step 2.3: feed the pedestrian key point heat maps obtained in step 1.4 into the visibility classification sub-network trained in step 2.2 and pass its output through the activation function to obtain the visibility probability of each human body key point.
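The visibility sub-network of steps 2.1–2.3 maps each key point heat map to a probability in (0, 1). A minimal sketch of that idea, substituting a single pooled scalar per heat map for the patent's convolution and fully connected layers (all names and parameter values are illustrative assumptions):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def visibility_scores(heatmaps, w_fc, b_fc):
    """Toy visibility head: pool each keypoint heat map to a scalar,
    then map it through a (hypothetical) linear layer and a sigmoid so
    each output lies in (0, 1).

    heatmaps : (K, H, W) keypoint heat maps
    w_fc, b_fc : scalar weight and bias, stand-ins for trained
                 parameters of the real conv + FC sub-network.
    """
    # A visible keypoint has a strong, concentrated peak, so the max
    # activation of its heat map is a crude visibility cue.
    peak = heatmaps.reshape(heatmaps.shape[0], -1).max(axis=1)
    return sigmoid(w_fc * peak + b_fc)
```

With a sharp peak the score approaches 1; with a flat (occluded) heat map it approaches 0, which is exactly the 1/0 labelling scheme of step 2.2.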
Further, step 3 specifically comprises the following sub-steps:
Step 3.1: scale all pedestrian re-identification pictures to the same size of 256 × 128, then augment them with horizontal flipping;
Step 3.2: feed the augmented picture into the full-scale network of the pedestrian re-identification branch, which first extracts features with a 7 × 7 convolution layer and a max pooling layer, then passes them through 4 residual modules to obtain the pedestrian's global feature map before global average pooling.
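The preprocessing of step 3.1 can be sketched as follows; `resize_nearest` is a simplified hypothetical stand-in for the bilinear resize a real pipeline would use:

```python
import numpy as np

def resize_nearest(img, out_h=256, out_w=128):
    """Nearest-neighbour resize to the fixed 256 x 128 input size used
    by the re-identification backbone (a minimal stand-in for bilinear
    resizing)."""
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h   # source row per output row
    cols = np.arange(out_w) * w // out_w   # source col per output col
    return img[rows][:, cols]

def horizontal_flip(img):
    """Mirror the image left-right: the augmentation of step 3.1."""
    return img[:, ::-1]
```

Flipping is an involution, so applying it twice recovers the original picture, a convenient sanity check for the augmentation.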
Further, step 4 specifically comprises the following sub-steps:
Step 4.1: bilinearly interpolate the pedestrian key point heat maps obtained in step 1 so that their size matches that of the global feature map obtained in step 3;
Step 4.2: multiply each heat map from step 4.1 with the global feature map from step 3 to obtain a pedestrian feature map for each key point;
Step 4.3: feed the per-key-point pedestrian feature maps and the global feature map into a global average pooling layer, then flatten each pooled feature to obtain the pedestrian's local and global representations.
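The core of step 4, masking the backbone feature map with each key point heat map and then pooling, can be sketched with NumPy broadcasting (illustrative only; shapes follow a channels-first convention, and the heat maps are assumed already resized as in step 4.1):

```python
import numpy as np

def local_global_features(feat, heatmaps):
    """Mine one local feature per keypoint from the backbone feature map.

    feat     : (C, H, W) global feature map before pooling (step 3)
    heatmaps : (K, H, W) keypoint heat maps resized to (H, W)

    Returns the globally averaged feature (C,) and K local features
    (K, C), each obtained by masking the feature map with one heat map
    and applying the same global average pooling.
    """
    global_feat = feat.mean(axis=(1, 2))          # GAP -> (C,)
    # Broadcast: (K, 1, H, W) * (1, C, H, W) -> (K, C, H, W)
    masked = heatmaps[:, None] * feat[None]
    local_feats = masked.mean(axis=(2, 3))        # GAP per keypoint -> (K, C)
    return global_feat, local_feats
```

If a heat map is uniformly 1, its local feature reduces to the global feature, which makes the masking behaviour easy to verify.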
Further, step 5 specifically comprises the following sub-steps:
Step 5.1: during training, feed the local and global representations obtained in step 4 through the classifier's fully connected layer and activation layer; the activation function constrains the outputs to the range 0 to 1 and makes them sum to 1, so they represent the probabilities of the different pedestrian identities;
Step 5.2: compute the network loss from the probabilities of step 5.1 using cross entropy; the loss comprises the classification loss of the pedestrian's global representation and those of its local representations, and the loss function is their weighted average, with weights given by the visibility probabilities of the human body key points;
Step 5.3: during testing, obtain the local and global representations of the query picture, and likewise those of every picture in the database gallery, using the method of step 4;
Step 5.4: compute the distance between the query picture and the representation of every database picture as the weighted average of the Euclidean distances of their global and local representations, where each local representation's weight is the normalised visibility probability of the corresponding key point; the database picture with the smallest distance to the query picture is then returned as the match for the query.
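The visibility-weighted loss of step 5.2 and the visibility-weighted distance of step 5.4 can be sketched as follows. The exact balance between the global and local terms is a design choice: here the global branch gets weight 1 in the loss, and the global and local distances are averaged equally (both are assumptions, not values stated in the patent):

```python
import numpy as np

def weighted_ce_loss(log_probs_global, log_probs_local, label, vis):
    """Training loss sketch for step 5.2: cross entropy of the global
    branch plus per-keypoint cross entropies weighted by visibility,
    combined as a weighted average."""
    losses = [-log_probs_global[label]]           # global CE, weight 1 (assumed)
    weights = [1.0]
    for k, lp in enumerate(log_probs_local):
        losses.append(-lp[label])                 # local CE per keypoint
        weights.append(vis[k])                    # visibility as weight
    losses, weights = np.array(losses), np.array(weights)
    return (losses * weights).sum() / weights.sum()

def weighted_distance(gq, gd, lq, ld, vis):
    """Test-time distance sketch for step 5.4: Euclidean distance of the
    global features averaged with local distances weighted by the
    normalised visibilities."""
    d_global = np.linalg.norm(gq - gd)
    w = vis / vis.sum()                           # normalise visibilities
    d_local = sum(w[k] * np.linalg.norm(lq[k] - ld[k])
                  for k in range(len(vis)))
    return 0.5 * (d_global + d_local)             # equal split (assumed)
```

An invisible key point (visibility near 0) thus contributes almost nothing to either the loss or the matching distance, which is exactly the stated goal of the weighting.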
Advantageous effects: in real scenes, occlusion is inevitable. To address pedestrian re-identification under occlusion, the invention improves on two fronts. First, extracting local pedestrian features is crucial, since effective local features avoid the impact of occlusion on the global feature; human body key points are widely used local features, so combining them with pedestrian re-identification is effective. Second, occlusion can make key points invisible, and invisible local features degrade overall network performance, so the weights of visible key point features should be increased and those of invisible ones decreased. The invention therefore introduces key point visibility and applies it to both the network's loss function and the feature distance computation, reducing the influence of invisible key points.
Moreover, the heat maps produced by the human body key point detection network are used to extract the local pedestrian features, which avoids hand-designing the size of the key point regions. The method reaches 68.1% accuracy on an occluded pedestrian re-identification dataset, a large improvement over current algorithms.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a schematic diagram of the full-scale interleaving module;
FIG. 3 is a diagram of the overall structure of the full-scale network;
FIG. 4 is a schematic view of one hourglass module of the hourglass network;
FIG. 5 is a diagram of the overall structure of the hourglass network;
FIG. 6 is a diagram of the overall network architecture of the present invention.
Detailed Description
The invention is further explained below with reference to the drawings.
As shown in figs. 1 to 6, the pedestrian re-identification performance improvement method based on mining full-scale features with human body key points comprises the following steps:
Step 1: train an hourglass network for human body key point detection on a human body key point detection dataset, then feed the pedestrian re-identification picture into the trained hourglass network to obtain heat maps of the pedestrian key points.
Step 1 specifically comprises the following sub-steps:
Step 1.1: to train the hourglass network for human body key point detection, crop each pedestrian from the human body key point detection dataset according to the provided bounding box, so that the hourglass network can be trained on single-person key point detection.
Step 1.2: apply data augmentation to each pedestrian picture obtained in step 1.1 and resize it to 256 × 128; the augmentation comprises picture flipping, scaling, and padding.
Step 1.3: feed the augmented pictures into the hourglass network, which comprises several hourglass modules; each module captures global and local information of the pedestrian, combines them to predict heat maps of the pedestrian key points, and passes the predicted heat maps to the next hourglass module, until training of the hourglass network is completed.
Step 1.4: resize the pedestrian re-identification picture to 256 × 128 and feed it into the trained hourglass network to obtain the heat maps of the pedestrian key points.
Step 2: feed the pedestrian heat maps from the human body key point detection dataset into a visibility classification sub-network and train it; then feed the pedestrian key point heat maps obtained in step 1 into the trained sub-network for classification to obtain the visibility probability of each human body key point.
Step 2 specifically comprises the following sub-steps:
Step 2.1: to train the visibility classification sub-network, feed the pedestrian key point heat maps into its convolution layers to extract heat-map features, pass these through a fully connected layer, and finally constrain the output to the range 0 to 1 with an activation function, so that each output of the sub-network represents the visibility probability of the corresponding human body key point.
Step 2.2: train with a binary classification loss on the visibility probabilities, labelling visible key points 1 and invisible key points 0.
Step 2.3: feed the pedestrian key point heat maps obtained in step 1.4 into the visibility classification sub-network trained in step 2.2 and pass its output through the activation function to obtain the visibility probability of each human body key point.
Step 3: feed the pedestrian re-identification picture into the full-scale network of the pedestrian re-identification branch to obtain the pedestrian's global feature map before global average pooling.
Step 3 specifically comprises the following sub-steps:
Step 3.1: scale all pedestrian re-identification pictures to the same size of 256 × 128, then augment them with horizontal flipping.
Step 3.2: feed the augmented picture into the full-scale network of the pedestrian re-identification branch, which first extracts features with a 7 × 7 convolution layer and a max pooling layer, then passes them through 4 residual modules to obtain the pedestrian's global feature map before global average pooling. The residual modules use depthwise separable convolutions to reduce the number of parameters and the computational cost of the model, and learn multi-scale features by interleaving several parallel branches, to which residual connections are added.
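The depthwise separable convolution mentioned above factorises a standard convolution into a per-channel spatial convolution followed by a 1 × 1 channel-mixing convolution, cutting the parameter count from C_out · C_in · k² to roughly C_in · k² + C_out · C_in. A naive NumPy sketch (loop-based for clarity; not an efficient or faithful reproduction of the full-scale network's module):

```python
import numpy as np

def depthwise_separable_conv(x, dw_kernels, pw_weights):
    """Depthwise separable convolution, same-padded.

    x          : (C_in, H, W) input feature map
    dw_kernels : (C_in, k, k) one spatial kernel per input channel
    pw_weights : (C_out, C_in) 1x1 pointwise (channel-mixing) weights
    """
    c_in, h, w = x.shape
    k = dw_kernels.shape[1]
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    dw = np.zeros_like(x)
    for c in range(c_in):                 # depthwise: each channel convolved
        for i in range(h):                # with its own kernel only
            for j in range(w):
                dw[c, i, j] = (xp[c, i:i + k, j:j + k] * dw_kernels[c]).sum()
    # pointwise: mix channels with a 1x1 convolution
    return np.einsum('oc,chw->ohw', pw_weights, dw)
```

With identity spatial kernels and identity pointwise weights the operation is a no-op, which makes the factorisation easy to test in isolation.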
Step 4: multiply the global feature map obtained in step 3 with the heat map of each pedestrian key point to obtain a pedestrian feature map per key point, then pass these feature maps and the global feature map through global average pooling to obtain the pedestrian's global and local features.
Step 4 specifically comprises the following sub-steps:
Step 4.1: bilinearly interpolate the pedestrian key point heat maps obtained in step 1 so that their size matches that of the global feature map obtained in step 3.
Step 4.2: multiply each heat map from step 4.1 with the global feature map from step 3 to obtain a pedestrian feature map for each key point.
Step 4.3: feed the per-key-point pedestrian feature maps and the global feature map into a global average pooling layer, then flatten each pooled feature to obtain the pedestrian's local and global representations.
Step 5: during training, feed the global and local features into a classifier to obtain each feature's probability for each pedestrian identity; the loss function is the weighted average of the cross-entropy losses between these predicted probabilities and the true identity, with weights given by the visibility probabilities of the human body key points.
During testing, obtain the global and local features of the query picture and each database picture as in step 4, then compute the distance between the two pictures as the weighted average of the Euclidean distances of their global and local features, where each local feature's weight is the normalised visibility probability of the corresponding human body key point.
Step 5 specifically comprises the following sub-steps:
Step 5.1: during training, feed the local and global representations obtained in step 4 through the classifier's fully connected layer and activation layer; the activation function constrains the outputs to the range 0 to 1 and makes them sum to 1, so they represent the probabilities of the different pedestrian identities.
Step 5.2: compute the network loss from the probabilities of step 5.1 using cross entropy; the loss comprises the classification loss of the pedestrian's global representation and those of its local representations, and the loss function is their weighted average, with weights given by the visibility probabilities of the human body key points.
Step 5.3: during testing, obtain the local and global representations of the query picture, and likewise those of every picture in the database gallery, using the method of step 4.
Step 5.4: compute the distance between the query picture and the representation of every database picture as the weighted average of the Euclidean distances of their global and local representations, where each local representation's weight is the normalised visibility probability of the corresponding key point; the database picture with the smallest distance to the query picture is then returned as the match for the query.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various modifications and refinements without departing from the principle of the present invention, and such modifications and refinements should also be regarded as falling within the protection scope of the present invention.
Claims (6)
1. A pedestrian re-identification performance improvement method based on mining full-scale features with human body key points, characterized by comprising the following steps:
Step 1: train an hourglass network for human body key point detection on a human body key point detection dataset, then feed the pedestrian re-identification picture into the trained hourglass network to obtain heat maps of the pedestrian key points;
Step 2: feed the pedestrian heat maps from the human body key point detection dataset into a visibility classification sub-network and train it; then feed the pedestrian key point heat maps obtained in step 1 into the trained sub-network for classification to obtain the visibility probability of each human body key point;
Step 3: feed the pedestrian re-identification picture into the full-scale network of the pedestrian re-identification branch to obtain the pedestrian's global feature map before global average pooling;
Step 4: multiply the global feature map obtained in step 3 with the heat map of each pedestrian key point to obtain a pedestrian feature map per key point, then pass these feature maps and the global feature map through global average pooling to obtain the pedestrian's global and local features;
Step 5: during training, feed the global and local features into a classifier to obtain each feature's probability for each pedestrian identity; the loss function is the weighted average of the cross-entropy losses between these predicted probabilities and the true identity, with weights given by the visibility probabilities of the human body key points;
during testing, obtain the global and local features of the query picture and each database picture as in step 4, then compute the distance between the two pictures as the weighted average of the Euclidean distances of their global and local features, where each local feature's weight is the normalised visibility probability of the corresponding human body key point.
2. The pedestrian re-identification performance improvement method based on mining full-scale features with human body key points according to claim 1, characterized in that step 1 specifically comprises the following sub-steps:
Step 1.1: to train the hourglass network for human body key point detection, crop each pedestrian from the human body key point detection dataset according to the provided bounding box, so that the hourglass network can be trained on single-person key point detection;
Step 1.2: apply data augmentation to each pedestrian picture obtained in step 1.1 and resize it to 256 × 128, the augmentation comprising picture flipping, scaling, and padding;
Step 1.3: feed the augmented pictures into the hourglass network, which comprises several hourglass modules; each module captures global and local information of the pedestrian, combines them to predict heat maps of the pedestrian key points, and passes the predicted heat maps to the next hourglass module, until training of the hourglass network is completed;
Step 1.4: resize the pedestrian re-identification picture to 256 × 128 and feed it into the trained hourglass network to obtain the heat maps of the pedestrian key points.
3. The pedestrian re-identification performance improvement method based on mining full-scale features with human body key points according to claim 2, characterized in that step 2 specifically comprises the following sub-steps:
Step 2.1: to train the visibility classification sub-network, feed the pedestrian key point heat maps into its convolution layers to extract heat-map features, pass these through a fully connected layer, and finally constrain the output to the range 0 to 1 with an activation function, so that each output of the sub-network represents the visibility probability of the corresponding human body key point;
Step 2.2: train with a binary classification loss on the visibility probabilities, labelling visible key points 1 and invisible key points 0;
Step 2.3: feed the pedestrian key point heat maps obtained in step 1.4 into the visibility classification sub-network trained in step 2.2 and pass its output through the activation function to obtain the visibility probability of each human body key point.
4. The method for improving the pedestrian re-identification performance based on the human body key point mining full-scale feature of claim 1, wherein the step 3 specifically comprises the following substeps:
step 3.1: scaling the pedestrian re-identification pictures to the same size of 256×128, and then performing data enhancement on them by horizontal flipping;
step 3.2: inputting the enhanced pictures into the full-scale network of the pedestrian re-identification network: first extracting features with a 7×7 convolutional layer and a max pooling layer, and then inputting the extracted features into 4 residual modules to obtain new features, namely the global features of the pedestrian before global average pooling.
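With the stride-2 settings common for an initial 7×7 convolution followed by 3×3 max pooling (the strides and padding are assumptions; the claim does not specify them), the 256×128 input would shrink to 64×32 before the residual modules. The standard output-size formula can be checked directly:

```python
def conv_out(size, kernel, stride, pad):
    """Spatial output size of a convolution or pooling layer:
    floor((size + 2*pad - kernel) / stride) + 1."""
    return (size + 2 * pad - kernel) // stride + 1

h, w = 256, 128
h, w = conv_out(h, 7, 2, 3), conv_out(w, 7, 2, 3)   # 7x7 conv, stride 2, pad 3
assert (h, w) == (128, 64)
h, w = conv_out(h, 3, 2, 1), conv_out(w, 3, 2, 1)   # 3x3 max pool, stride 2
assert (h, w) == (64, 32)
```

The residual modules that follow typically preserve or further halve this resolution, so the global feature map stays much smaller than the input picture.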
5. The pedestrian re-identification performance improving method based on human body key point mining of full-scale features according to claim 1, wherein step 4 specifically comprises the following sub-steps:
step 4.1: applying bilinear interpolation to the heat maps of the pedestrian key points obtained in step 1 so that their size is consistent with the size of the global features obtained in step 3;
step 4.2: multiplying each heat map obtained in step 4.1 by the global features obtained in step 3 to obtain the pedestrian feature map corresponding to each key point;
step 4.3: inputting the pedestrian feature map corresponding to each key point and the global features into a global average pooling layer, and then flattening each feature to obtain the local representations and the global representation of the pedestrian.
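Steps 4.1–4.3 can be sketched end to end in plain Python: resize a key-point heat map to the feature-map size with bilinear interpolation, multiply it element-wise into a feature channel, and global-average-pool the result. The tiny 2×2 heat map and 4×4 single-channel feature map below are illustrative, not the claimed sizes:

```python
def bilinear_resize(src, out_h, out_w):
    """Resize a 2-D list with bilinear interpolation (corner-aligned)."""
    in_h, in_w = len(src), len(src[0])
    sy = (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
    sx = (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
    out = []
    for i in range(out_h):
        y = i * sy
        y0, dy = int(y), y - int(y)
        y1 = min(y0 + 1, in_h - 1)
        row = []
        for j in range(out_w):
            x = j * sx
            x0, dx = int(x), x - int(x)
            x1 = min(x0 + 1, in_w - 1)
            top = src[y0][x0] * (1 - dx) + src[y0][x1] * dx
            bot = src[y1][x0] * (1 - dx) + src[y1][x1] * dx
            row.append(top * (1 - dy) + bot * dy)
        out.append(row)
    return out

def global_avg_pool(fmap):
    """Average all spatial positions of a 2-D feature map into one scalar."""
    return sum(sum(r) for r in fmap) / (len(fmap) * len(fmap[0]))

heat = [[0.0, 1.0], [1.0, 0.0]]              # tiny key-point heat map
feat = [[1.0] * 4 for _ in range(4)]         # one feature channel
mask = bilinear_resize(heat, 4, 4)           # step 4.1: match feature size
local = [[m * f for m, f in zip(mr, fr)]     # step 4.2: element-wise product
         for mr, fr in zip(mask, feat)]
local_repr = global_avg_pool(local)          # step 4.3: pool to one value
assert 0.0 < local_repr < 1.0
```

In the claimed method this runs once per key point and per feature channel, and the pooled values are flattened into the local representation vectors.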
6. The pedestrian re-identification performance improving method based on human body key point mining of full-scale features according to claim 1, wherein step 5 specifically comprises the following sub-steps:
step 5.1: during training, inputting the local representations and the global representation of the pedestrian obtained in step 4 into the fully connected layer and activation layer of a classifier in turn, the activation function constraining the outputs to the range 0 to 1 with their sum equal to 1, so that they represent the probabilities of the different pedestrian identities;
step 5.2: computing the network loss from the probabilities obtained in step 5.1 using the cross-entropy loss; the loss comprises the classification loss of the global representation and the classification losses of the local representations of the pedestrian, the loss function being the weighted average of the two, where the weights are the visibility probabilities of the human body key points;
step 5.3: during testing, obtaining the local and global representations of the query picture using step 4, and obtaining the local and global representations of the pictures in the database gallery by the same method;
step 5.4: computing the distances between the query picture and the representations of all pictures in the database, each distance being the weighted average of the Euclidean distances of the global representations and of the local representations, where the local weights are the normalized visibility probabilities of the human body key points; then taking the database picture with the smallest distance to the query picture as the picture matching the query.
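The test-time matching of step 5.4 reduces to a visibility-weighted average of Euclidean distances. A stdlib sketch with made-up 2-D representations and three key points; giving the global term weight 1 against the pooled local term is an assumption, since the claim does not fix the global/local balance:

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_distance(q_global, q_locals, g_global, g_locals, vis):
    """Weighted average of the global distance and the per-key-point
    local distances; local weights are the normalized visibility
    probabilities, as in step 5.4."""
    w = [v / sum(vis) for v in vis]
    d_local = sum(wi * euclidean(ql, gl)
                  for wi, ql, gl in zip(w, q_locals, g_locals))
    return (euclidean(q_global, g_global) + d_local) / 2

query = ([1.0, 0.0], [[0.5, 0.5], [0.2, 0.8], [0.9, 0.1]])
gallery = [
    ([1.0, 0.1], [[0.5, 0.4], [0.2, 0.7], [0.9, 0.2]]),   # near match
    ([0.0, 1.0], [[0.9, 0.9], [0.8, 0.1], [0.1, 0.9]]),   # far
]
vis = [0.9, 0.4, 0.8]                       # key-point visibility probabilities
dists = [match_distance(query[0], query[1], g, ls, vis) for g, ls in gallery]
best = min(range(len(dists)), key=dists.__getitem__)
assert best == 0                            # nearest database picture wins
```

Down-weighting the local distances of occluded key points is what lets the method stay robust when parts of the pedestrian are invisible.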
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110492149.6A CN113128461B (en) | 2021-05-06 | 2021-05-06 | Pedestrian re-recognition performance improving method based on human body key point mining full-scale features |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113128461A true CN113128461A (en) | 2021-07-16 |
CN113128461B CN113128461B (en) | 2022-11-08 |
Family
ID=76781541
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110492149.6A Active CN113128461B (en) | 2021-05-06 | 2021-05-06 | Pedestrian re-recognition performance improving method based on human body key point mining full-scale features |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113128461B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109784258A (en) * | 2019-01-08 | 2019-05-21 | 华南理工大学 | A kind of pedestrian's recognition methods again cut and merged based on Analysis On Multi-scale Features |
CN110796026A (en) * | 2019-10-10 | 2020-02-14 | 湖北工业大学 | Pedestrian re-identification method based on global feature stitching |
CN111666843A (en) * | 2020-05-25 | 2020-09-15 | 湖北工业大学 | Pedestrian re-identification method based on global feature and local feature splicing |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115830637A (en) * | 2022-12-13 | 2023-03-21 | 杭州电子科技大学 | Method for re-identifying shielded pedestrian based on attitude estimation and background suppression |
CN115830637B (en) * | 2022-12-13 | 2023-06-23 | 杭州电子科技大学 | Method for re-identifying blocked pedestrians based on attitude estimation and background suppression |
US11908222B1 (en) | 2022-12-13 | 2024-02-20 | Hangzhou Dianzi University | Occluded pedestrian re-identification method based on pose estimation and background suppression |
Also Published As
Publication number | Publication date |
---|---|
CN113128461B (en) | 2022-11-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111126379B (en) | Target detection method and device | |
CN109543667B (en) | Text recognition method based on attention mechanism | |
CN110738207A (en) | character detection method for fusing character area edge information in character image | |
CN111832568B (en) | License plate recognition method, training method and device of license plate recognition model | |
CN111967470A (en) | Text recognition method and system based on decoupling attention mechanism | |
Kang et al. | Deep learning-based weather image recognition | |
CN114187450A (en) | Remote sensing image semantic segmentation method based on deep learning | |
CN111310728B (en) | Pedestrian re-identification system based on monitoring camera and wireless positioning | |
CN109635726B (en) | Landslide identification method based on combination of symmetric deep network and multi-scale pooling | |
CN113744311A (en) | Twin neural network moving target tracking method based on full-connection attention module | |
CN105243154A (en) | Remote sensing image retrieval method and system based on salient point features and sparse auto-encoding | |
CN111460980A (en) | Multi-scale detection method for small-target pedestrian based on multi-semantic feature fusion | |
CN113627266A (en) | Video pedestrian re-identification method based on Transformer space-time modeling | |
CN114330529A (en) | Real-time pedestrian shielding detection method based on improved YOLOv4 | |
CN112434599A (en) | Pedestrian re-identification method based on random shielding recovery of noise channel | |
CN112861840A (en) | Complex scene character recognition method and system based on multi-feature fusion convolutional network | |
CN115240121A (en) | Joint modeling method and device for enhancing local features of pedestrians | |
CN113128461B (en) | Pedestrian re-recognition performance improving method based on human body key point mining full-scale features | |
Xia et al. | Urban remote sensing scene recognition based on lightweight convolution neural network | |
CN112668662B (en) | Outdoor mountain forest environment target detection method based on improved YOLOv3 network | |
CN114022703A (en) | Efficient vehicle fine-grained identification method based on deep learning | |
CN111079585B (en) | Pedestrian re-identification method combining image enhancement with pseudo-twin convolutional neural network | |
Cho et al. | Modified perceptual cycle generative adversarial network-based image enhancement for improving accuracy of low light image segmentation | |
Saha et al. | Neural network based road sign recognition | |
CN107358200B (en) | Multi-camera non-overlapping vision field pedestrian matching method based on sparse learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||