CN107832672B - Pedestrian re-identification method for designing multi-loss function by utilizing attitude information - Google Patents

Info

Publication number: CN107832672B
Authority: CN (China)
Prior art keywords: pedestrian, information, loss function, library, network
Legal status: Active
Application number: CN201710946443.3A
Other languages: Chinese (zh)
Other versions: CN107832672A
Inventors: Zhou Zhong (周忠), Wu Wei (吴威), Jiang Na (姜那), Liu Junqi (刘俊琦), Sun Chenxin (孙晨新)
Current Assignee: Beihang University
Original Assignee: Beihang University
Priority date: 2017-10-12
Filing date: 2017-10-12
Application filed by Beihang University
Priority to CN201710946443.3A
Publication of CN107832672A: 2018-03-23
Application granted; publication of CN107832672B: 2020-07-07

Classifications

    • G06V 40/103: Recognition of biometric, human-related or animal-related patterns in image or video data; human or animal bodies; static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G06F 18/214: Pattern recognition; design or setup of recognition systems or techniques; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F 18/22: Pattern recognition; matching criteria, e.g. proximity measures
    • G06F 18/2431: Pattern recognition; classification techniques; multiple classes
    • G06F 18/253: Pattern recognition; fusion techniques of extracted features

Abstract

The invention discloses a pedestrian re-identification method that uses pose information to design a multi-loss function. The method effectively addresses the difficulties caused by frequent pedestrian occlusion, large illumination differences across videos, and highly variable non-rigid pedestrian poses in surveillance video, and is widely applicable to fields such as security monitoring. The method is divided into two stages, an off-line stage and an on-line stage. The off-line stage trains and learns a deep-learning network model with high accuracy; it comprises preprocessing, joint point information extraction, local feature extraction, and fusion of the local features with the global features extracted by the backbone network, after which training is completed with a quintuple loss function over the fused features. The on-line stage uses the trained deep-learning network model for feature extraction, realizing pedestrian re-identification between the target to be analyzed and the stored target picture library through similarity calculation.

Description

Pedestrian re-identification method for designing multi-loss function by utilizing attitude information
Technical Field
The invention belongs to the technical field of computer vision, and particularly relates to a pedestrian re-identification method that uses pose information to design a multi-loss function: an accurate pedestrian re-identification method, robust to pedestrian occlusion and variable poses, applied in intelligent surveillance analysis systems.
Background
Pedestrian re-identification technology searches for a given target across multiple cameras and associates and matches the search results. The technology provides basic support for applications in the video surveillance field, such as pedestrian retrieval, cross-camera tracking and human-computer interaction. For person-search tasks over massive video data, pedestrian re-identification can greatly reduce manual labor. However, the problem is very challenging because of differing camera viewing angles, complex lighting conditions, frequent occlusion, highly variable non-rigid pedestrian poses, and the like. To overcome these difficulties, researchers have proposed many different solutions over the past 20 years. By algorithmic principle, these can be roughly divided into two categories: designing representation features and optimizing distance metrics.
Designing representation features refers to finding features that are robust to changes in image appearance. Feature-representation methods focus on how to design feature descriptions that are discriminative for pedestrians and stable under image changes. These include low-level visual features such as color histograms, texture features and local feature points, as well as mid-level features with semantic attributes.
To exploit spatial information effectively, existing methods generally divide the image into different areas. For example, Zheng Wei-Shi, from 2006 to 2013, divided the pedestrian image into several horizontal stripes from top to bottom. Farenzena et al. in 2010 divided the pedestrian image into head, torso and legs using image symmetry and asymmetry priors, so as to extract feature combinations across the different regions. Thanks to the advent of the large-scale pedestrian re-identification data sets Market-1501 and MARS, researchers began using deep-learning-based methods to represent image features. Cheng et al. in 2016 proposed a multi-channel deep neural network framework based on local blocks that simultaneously extracts global and local features from horizontally partitioned local stripes together with the original image. However, because of varying camera viewing angles and pedestrian poses, horizontal segmentation can produce misalignment and adversely affect model accuracy. Based on this consideration, the invention adopts pedestrian joint point detection to obtain more accurate local positions, achieves semantics-based alignment, and provides the key conditions for the complementary fusion of global and local features.
Optimizing distance metrics refers to learning a distance space in which feature distances between images of the same person are small and feature distances between images of different persons are large. In 2009, Weinberger et al. proposed large-margin nearest neighbor classification (LMNN), which employs a triplet constraint so that the k nearest neighbors of each sample belong to the same class in the new metric space. In 2012, Köstinger et al. proposed the simple and direct distance metric learning algorithm KISSME. Researchers subsequently combined distance metric learning with deep learning to build verification models for pedestrian re-identification. Such a model takes an image pair as network input, computes the distance between the extracted image features, and finally outputs the similarity between the images. Integrating feature extraction and similarity metrics into one framework is the main advantage of this type of model. However, with a verification model alone, only the features that distinguish picture pairs can be extracted; the salient features of each individual picture are often ignored. Therefore, the invention trains a combined classification model and verification model, computing the classification loss and the verification loss simultaneously and weighting them so that the models complement each other.
With the widespread application of deep learning to multiple subproblems in computer vision, the method proposed by Wei et al. for accurately extracting joint point information in complex scenes makes accurate local information acquisition possible for pedestrian re-identification. Considering that pedestrian poses in surveillance video follow certain regularities and rarely include abnormal postures, such deep-learning-based automatic joint point extraction can be applied to the pedestrian re-identification problem. The invention therefore uses the joint information so obtained to compute the local positions of the human body and to infer the pedestrian's pose orientation: the local position information is used to extract local features for fusion with the global features, and the pose orientation is used to design a quintuple loss function. Together, this information improves the accuracy of pedestrian re-identification in complex surveillance environments.
Disclosure of Invention
The purpose of the invention is a pose-information-based pedestrian re-identification method that can handle frequent pedestrian occlusion, large illumination differences, highly variable non-rigid pedestrian poses and similar conditions in surveillance video, and that can be integrated into any intelligent surveillance system to provide basic pedestrian analysis.
The technical scheme adopted by the invention is as follows: a pedestrian re-identification method that uses pose information to design a multi-loss function, comprising two main parts: off-line feature extraction network model training and on-line pedestrian re-identification;
step (1), an off-line extraction characteristic network model training stage:
(m1) preprocessing all pictures; the original picture $rI_i$ is denoted $I_i$ after processing;
(m2) detecting joint point information for each picture; the 18 obtained joint points are $P_{I_i} = \{x_1, y_1, \ldots, x_{18}, y_{18}\}$, and a corresponding Boolean array $label_i$ (True or False) indicates whether each joint point was detected;
(m3) estimating the height $high_i$ of each pedestrian from the joint point information extracted in step (m2), and calculating the local region information of the head, trunk and legs respectively;
(m4) estimating the orientation of the pedestrian target from the joint point information extracted in step (m2), recorded as $dir_i \in \{1, 2, 3\}$, where 1 denotes a forward sample, 2 a lateral sample, and 3 a backward sample;
(m5) extracting global features according to the designed backbone network, extracting local features according to the local region position information extracted in the step (m3) and the branch network structure, and fusing the global features and the local features of each picture to form expressive feature vectors together;
(m6) calculating a multi-classification loss function and the first (triplet) constraint of the invention from the true data labels, while calculating the second (quintuple, pose-based) constraint from the pedestrian pose orientation inferred in step (m4);
(m7) training the current feature extraction network by combining the multiple loss function errors calculated in step (m6), analyzing the influence of different loss function weights on the network, and selecting the optimal weights $\lambda_1$ and $\lambda_2$ to complete the joint training;
step (2), an online pedestrian re-identification stage:
(s1) preprocessing all pictures $I_{gallery}$ in the picture library, extracting features with the network model obtained by the off-line training of step (1), and storing the extracted features one by one under the identification information of the corresponding pictures to form a feature library $F_{gallery}$;
(s2) preprocessing the picture $I_{query}$ to be analyzed and extracting features with the network model obtained by the off-line training of step (1); the resulting feature vector $f_{query}$ is the sole input to the similarity measure of the subsequent step (s3);
(s3) calculating the similarity between $f_{query}$ extracted in step (s2) and the feature library $F_{gallery}$, carrying out normalization and sorting, and selecting the pictures with similarity greater than 0.7 among the top M as the retrieval result of pedestrian re-identification, where the value of M is chosen dynamically according to the number of pictures in the current library;
(s4) periodically updating the picture library and its corresponding feature library, covering both the static picture library and the dynamic library of targets detected and captured from live video.
Further, the step (m3) comprises the following steps:
(m3.1) according to the joint point information $P_{I_i}$ extracted in step (m2), removing samples whose $label_i$ entries are all False, i.e., samples for which joint detection failed, and removing samples whose trunk joint points are mostly False;
(m3.2) letting the samples whose joint information extracted in step (m2) meets the invention's requirements participate in training, and inferring the pedestrian height $high_i$ from the available joint information $P_{I_i}$;
(m3.3) calculating the head region information from the joint points of the left and right ears, the nose, etc.;
(m3.4) calculating the trunk region information from the position information of the joint points of the left and right shoulders and the left and right hips;
(m3.5) calculating the leg region information from the waist position, ankles, height, etc.; because the detected bounding box often does not contain the feet, this region is scaled proportionally according to the height;
(m3.6) generating regions of interest from the local region position information calculated in steps (m3.3) to (m3.5); through an improved region-of-interest feature extraction layer, these enter the branch networks for local feature extraction (see the sketch below).
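For illustration only, a minimal Python sketch of the region computation in steps (m3.1) to (m3.6) is given below. The OpenPose-style joint ordering, the torso-presence test and the leg-extension factor are assumptions of the example, not values fixed by the invention.

```python
import numpy as np

# Assumed OpenPose-style indices: 0 nose, 2/5 R/L shoulder, 8/11 R/L hip,
# 9/12 knees, 10/13 ankles, 14/15 eyes, 16/17 ears.
def local_regions(joints, valid):
    """joints: (18, 2) array of (x, y); valid: (18,) boolean detection mask."""
    if not valid[[2, 5, 8, 11]].any():
        return None                          # no torso joints: discard (m3.1)
    ys = joints[valid, 1]
    height = ys.max() - ys.min()             # rough height estimate (m3.2)

    def box(idx):
        # assumes at least one joint of the region survived the screening
        idx = np.asarray(idx)
        pts = joints[idx[valid[idx]]]
        x0, y0 = pts.min(axis=0)
        x1, y1 = pts.max(axis=0)
        return (x0, y0, x1 - x0, y1 - y0)    # quadruple (x, y, w, h)

    head = box([0, 14, 15, 16, 17])          # nose, eyes, ears (m3.3)
    torso = box([2, 5, 8, 11])               # shoulders and hips (m3.4)
    lx, ly, lw, lh = box([8, 9, 10, 11, 12, 13])
    legs = (lx, ly, lw, lh + 0.1 * height)   # extend downward: the detected
    return head, torso, legs                 # box often misses the feet (m3.5)
```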
Further, the step (m4) comprises the following steps:
(m4.1) after the screening of step (m3.1), determining the pose orientation of the samples participating in training; samples missing the left or right shoulder are judged lateral, $dir_i = 2$;
(m4.2) for samples in which both shoulders are present, calculating the left-to-right shoulder vector;
(m4.3) calculating the included angle $dir\_angle_{I_i}$ between the shoulder vector obtained in step (m4.2) and the vertical line;
(m4.4) judging the range of the included angle $dir\_angle_{I_i}$ calculated in step (m4.3): if it lies within $[260°, 280°]$ the sample is marked forward, $dir_i = 1$; otherwise, if it lies within $[80°, 100°]$ it is marked backward, $dir_i = 3$; if it lies in neither range, the sample is marked lateral, $dir_i = 2$ (see the sketch below).
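A minimal sketch of the orientation decision of steps (m4.1) to (m4.4) follows, assuming image coordinates with y growing downward and the same OpenPose-style shoulder indices as above; the exact angle convention is an assumption of the example.

```python
import math

def orientation(joints, valid, l_shoulder=5, r_shoulder=2):
    if not (valid[l_shoulder] and valid[r_shoulder]):
        return 2                                 # missing shoulder: lateral (m4.1)
    dx = joints[r_shoulder][0] - joints[l_shoulder][0]
    dy = joints[r_shoulder][1] - joints[l_shoulder][1]
    # clockwise angle between the shoulder vector and the vertical, in [0, 360)
    angle = math.degrees(math.atan2(dx, -dy)) % 360
    if 260 <= angle <= 280:
        return 1                                 # forward (m4.4)
    if 80 <= angle <= 100:
        return 3                                 # backward
    return 2                                     # lateral
```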
Further, the step (m5) comprises the following steps:
(m5.1) extracting global features from the backbone network of the proposed network framework, labeled $f_{global}(I_i)$;
(m5.2) extracting the three local features of $I_i$ from the regions obtained in step (m3.6), labeled $f_h(I_i)$, $f_t(I_i)$ and $f_l(I_i)$ respectively;
(m5.3) fusing the global feature of step (m5.1) with the local features of step (m5.2) through a fully connected layer to obtain $f(I_i)$.
Further, the step (m6) comprises the following steps:
(m6.1) calculating a multi-classification loss function error;
(m6.2) calculating the first, triplet constraint of the invention:

$$D_{id}(I_i^a, I_i^p, I_i^n) = d(f(I_i^a) - f(I_i^p)) - d(f(I_i^a) - f(I_i^n)) < \alpha$$

where $I_i^a$ is any reference pedestrian image in the data set, $I_i^p$ is another image of the same person as the reference (a positive sample), and $I_i^n$ is an image of a different person (a negative sample); the triplet input passes through the network to yield the feature vectors $\{f(I_i^a), f(I_i^p), f(I_i^n)\}$; $d(f(I_i^a) - f(I_i^p))$ is the distance between the reference image and the positive sample, $d(f(I_i^a) - f(I_i^n))$ is the distance between the reference image and the negative sample, and $\alpha$ is the threshold of the triplet constraint;
(m6.3) calculating the second, quintuple constraint of the invention:

$$D_{pose}(I_i^a, I_i^{ps}, I_i^{pd}) = d(f(I_i^a) - f(I_i^{ps})) - d(f(I_i^a) - f(I_i^{pd})) < \beta$$

where $I_i^{ps}$ denotes a positive sample with the same pose as $I_i^a$, $I_i^{pd}$ denotes a positive sample with a different pose, and $\beta$ is the threshold of the quintuple's double constraint.
Further, the step (m7) comprises the following steps:
(m7.1) calculating the back-propagated joint error value from the multi-loss-function errors obtained in step (m6):

$$Loss_1(I, w) = -\sum_{i=1}^{n} p_i \log \hat{p}_i$$

$$Loss_2(I, w) = \frac{1}{N} \sum \left[ \max\big(D_{id}(I_i^a, I_i^p, I_i^n) - \alpha,\, 0\big) + \lambda \max\big(D_{pose}(I_i^a, I_i^{ps}, I_i^{pd}) - \beta,\, 0\big) \right]$$

$$Loss_3(I, w) = \lambda_1 Loss_1(I, w) + \lambda_2 Loss_2(I, w)$$

where $Loss_1$ denotes the multi-class loss function, $Loss_2$ the quintuple loss function and $Loss_3$ the joint loss function; $\lambda_1$ and $\lambda_2$ balance the weights of the joint loss function, $\lambda$ balances the triplet and quintuple constraints, $w$ denotes the network parameters, $\hat{p}_i$ is the predicted probability, $p_i$ the target probability, $n$ the number of pedestrian classes, and $N$ the number of quintuples.
(m7.2) analyzing the error weight parameters $\lambda_1$ and $\lambda_2$ of step (m7.1) and determining the optimal loss-function weighting used in the off-line stage (see the sketch below).
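A PyTorch sketch of the joint loss of step (m7.1) is given below, written under assumptions: the hinge (max) form of the quintuple term and the default margins and weights are illustrative; only the overall structure of cross-entropy plus a weighted double-constraint term follows the text above.

```python
import torch.nn.functional as F

def quintuple_loss(fa, fp, fn, fps, fpd, alpha=0.0, beta=0.0, lam=0.5):
    """fa..fpd: (batch, dim) features of {Ia, Ip, In, Ips, Ipd}."""
    d = lambda u, v: (u - v).pow(2).sum(dim=1)        # squared L2 distance
    l_id = F.relu(d(fa, fp) - d(fa, fn) - alpha)      # hinge on D_id < alpha
    l_pose = F.relu(d(fa, fps) - d(fa, fpd) - beta)   # hinge on D_pose < beta
    return (l_id + lam * l_pose).mean()               # Loss_2

def joint_loss(logits, labels, quintuple_feats, lam1=1.0, lam2=1.0):
    loss1 = F.cross_entropy(logits, labels)           # Loss_1, multi-class
    loss2 = quintuple_loss(*quintuple_feats)          # Loss_2, quintuple
    return lam1 * loss1 + lam2 * loss2                # Loss_3, joint loss
```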
Further, the step (s3) includes the steps of:
(s3.1) dynamically selecting the value of M according to the number in the current picture library;
(s3.2) calculating in sequence the feature distances between $f_{query}$ extracted in step (s2) and the feature library $F_{gallery}$;
(s3.3) normalizing and sorting all the feature distances calculated in step (s3.2), and selecting the pictures with similarity greater than 0.7 among the top M as the retrieval result of pedestrian re-identification (see the sketch after this list);
(s3.4) visualizing the pedestrian re-identification retrieval result obtained in step (s3.3): for the static picture library, displaying $I_{query}$ and the sorted $I_{results}$; for the dynamic video library, using the camera ID, pedestrian ID, bounding box position, frame number, time, etc. stored in the database to restore, from $I_{results}$, the true situation of each result in the video at that moment.
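For illustration, a minimal sketch of the retrieval of steps (s3.1) to (s3.3): the 0.7 threshold and the top-M cut come from the text, while cosine similarity (the metric chosen in the detailed description below) and min-max normalization are assumptions of the example.

```python
import numpy as np

def retrieve(f_query, f_gallery, ids, M):
    """f_query: (dim,); f_gallery: (n, dim); ids: n picture identifiers."""
    q = f_query / np.linalg.norm(f_query)
    g = f_gallery / np.linalg.norm(f_gallery, axis=1, keepdims=True)
    sims = g @ q                                       # cosine similarities
    sims = (sims - sims.min()) / (sims.max() - sims.min() + 1e-12)
    top = np.argsort(-sims)[:M]                        # top-M ranking (s3.1)
    return [(ids[i], float(sims[i])) for i in top if sims[i] > 0.7]
```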
Further, the step (s4) includes the steps of:
(s4.1) setting a time t for periodic updating;
(s4.2) within the time window t, continuously adding the information and features of query pictures $I_{query}$ to the static picture library; at time t, replacing or updating the picture library as required, re-extracting the features of the changed pictures, and building a new feature library;
(s4.3) within the time window t, continuously adding newly detected targets to the dynamic video library and storing the camera ID, pedestrian ID, bounding box position, frame number, time, place and other information in a database; after time t is reached, clearing half of the pedestrian data in the current database by age, adding new detection results frame by frame, and extracting their features as a main attribute stored in the database.
The principle of the invention is as follows:
the invention provides a pedestrian re-identification method for learning features by calculating various loss functions by utilizing human posture information. The design of the invention firstly derives from the increase of the number of monitoring cameras and bayonets and the enhancement of the storage capacity, and provides resource guarantee for pedestrian big data. The pedestrian data with different magnitudes provides a good data base for the pedestrian re-identification technology based on deep learning. Secondly, the method considers the presenting rule of the pedestrian in the monitoring video, and adjusts the aspect ratio of each picture in the preprocessing stage so as to keep good spatial information characteristics when extracting the characteristics in the subsequent deep network framework. In addition, in order to process the situation that the background in the surveillance video is noisy and frequently shielded, the invention introduces local features to make up the deficiency of global features. Namely, joint point information is introduced to calculate the position of a local area of the pedestrian and the orientation posture of the local area relative to the camera. And then extracting the local features of the human body according to the local position information, and fusing the local features with the global features. Finally, the invention also considers the improvement of the expression capability of the deep learning network model from the aspect of the training strategy, designs the quintuple loss function by utilizing the orientation information, and completes the training by combining with the cross entropy loss function, thereby obtaining the efficient and robust feature extraction model in the off-line stage.
In the face of the huge data volume of surveillance videos, it has become impractical to manually complete pedestrian re-identification. The automatic pedestrian re-identification technology can promote the development of various applications such as video analysis, security and the like. The main reason for the low efficiency of manual pedestrian re-identification is that the number of targets to be analyzed is large, and a large number of observed target features cannot be stored in the human brain in a short time. Therefore, after the feature extraction model is obtained, the technical route for completing pedestrian re-identification on line is designed. In the process, firstly, the existing picture library is updated regularly according to the monitoring content, and relevant characteristics are pre-fetched, so that the retrieval time is shortened. And then, after the target to be analyzed is obtained, the target to be analyzed is quickly matched, and necessary pedestrian re-identification is completed.
Specifically, the pedestrian re-identification method for designing the multi-loss function by utilizing the attitude information is divided into an off-line stage and an on-line stage. In an off-line stage, the invention firstly provides a feature extraction depth network framework for keeping the pedestrian aspect ratio; secondly, joint point information is introduced to calculate the position of the pedestrian local area and the orientation posture of the pedestrian local area relative to the camera; then, extracting local features of the human body according to the local position information, and fusing the local features with the global features; and finally, designing a quintuple loss function by using the orientation information, and training the quintuple loss function together with the cross entropy loss function. In the on-line stage, firstly, a feature extraction model obtained by off-line stage training is used for extracting and storing features of a preprocessed picture library; secondly, adjusting the aspect ratio of the target picture to be analyzed, and extracting features after adjustment; according to the extracted target features to be analyzed, similarity measurement is carried out in storage features of a picture library, the calculated similarities are subjected to normalization sorting, and pictures in the library which meet similarity conditions and are ranked in front are selected as retrieval results; and finally, integrating the information such as the camera and the ID matched with the retrieval results, outputting the information in a visual mode, and simultaneously storing the information in a query library to provide input for analysis of other applications. In addition, for surveillance videos or pedestrian data acquired currently, the picture library and the characteristics thereof need to be updated regularly to ensure that the most accurate pedestrian re-identification result is obtained.
In the off-line stage, the specific steps are as follows:
firstly, the invention preprocesses all pictures (both the pictures to be analyzed and the picture library), adjusting the aspect ratio to 1:2 and the size to 107 × 219 before training. This ensures that effective spatial information is preserved in the next feature extraction stage, while also reducing the number of network parameters.
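A one-line preprocessing sketch matching the 107 × 219 input described above, assuming OpenCV; the interpolation mode is an assumption of the example.

```python
import cv2

def preprocess(img):
    # width 107, height 219, i.e. the 1:2 aspect ratio described above
    return cv2.resize(img, (107, 219), interpolation=cv2.INTER_LINEAR)
```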
Secondly, the network structure proposed by the invention consists of the following parts: one joint point detection network with fixed parameters, one backbone network, three local branch networks, three feature-integration connection layers, and two loss layers. The joint point detection network provides pedestrian joint point information. The backbone network, branch networks and connection layers are responsible for extracting the global and local features and fusing them. The loss layers combine the two loss functions and perform metric learning.
The joint point detection network extracts 18 joint points of the human body, including the neck, nose, and the left and right shoulders, elbows, wrists, hips, knees, ankles, eyes and ears, and tolerates the loss of some joint points. After the coordinates of all joint points are obtained, the pedestrian's height is estimated from them and used as an aid; the region boundaries are computed from the maxima and minima of the joint coordinates of each region, providing position information for the subsequent extraction of local features by the network.
Meanwhile, the invention also uses the obtained joint point information to estimate the pedestrian's pose orientation. Pedestrian targets for which joint point detection fails, or which lack trunk joint points, are discarded; such defective samples are not used for training, to avoid polluting the feature extraction model. For the samples participating in training, it is first checked whether the left and right shoulder joint points exist, which identifies the clearly lateral samples. A left-to-right shoulder direction vector is then calculated, and the angle between this vector and the vertical line serves as the primary evidence for orientation discrimination. Samples with angles in the range [80°, 100°] are labeled backward, and those in the range [260°, 280°] are labeled forward.
Thirdly, the global and local features required for pedestrian re-identification are extracted using the backbone network, the branch networks and the local region information designed by the invention. The backbone structure is based on the idea of Inception-v3, but differs in that the invention's structure comprises 5 convolution modules, each with several branches, each branch being a stack of convolution layers of various scales and pooling layers. Such a structure increases the width of the network while reducing its parameters, and also improves adaptability to scale. The network uses ReLU activations to introduce non-linearity, applies batch normalization before each ReLU to speed up convergence and mitigate the effects of shifting parameter distributions, and sets 50% Dropout at the last fully connected layer to prevent overfitting. The branch networks share the parameters before conv5_x with the backbone, and add position information through the pooling layer to extract the local features of their respective regions. Each branch network is similar in structure to the backbone, except that the output sizes of its last pooling layer and fully connected layer are smaller, which serves to adjust the weighting. At the back end of the network, a fully connected layer combines the local and global features into the pedestrian's feature vector.
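The PyTorch fragment below sketches only the convolution + batch-normalization + ReLU stacking and the final 50% Dropout described above; the channel and layer sizes are illustrative assumptions, not the invention's actual Inception-style configuration.

```python
import torch.nn as nn

def conv_bn_relu(c_in, c_out, k, stride=1, padding=0):
    # batch normalization is applied before each ReLU, as described above
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, k, stride=stride, padding=padding),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
    )

head = nn.Sequential(
    nn.Flatten(),
    nn.Linear(2048, 1024),   # hypothetical last fully connected layer
    nn.Dropout(p=0.5),       # 50% Dropout to prevent overfitting
)
```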
Finally, a pedestrian similarity rule was obtained from long-term experimental observation: the feature distance between different pedestrians is larger than that between images of the same pedestrian, and the feature distance between different poses of the same person is larger than that between the same poses of that person. According to this rule, the invention designs a doubly constrained quintuple loss function and proposes a strategy of training it jointly with a multi-classification loss function. The new loss function corrects the network's mistaken belief that a negative sample in the same pose is more similar in appearance than a positive sample in a different pose, and fundamentally lets the network learn representations that overcome pose change. The joint training strategy increases the expressive power of the network without changing its structure, so the obtained network model transfers better, yielding the pedestrian re-identification feature extraction network model required by the invention.
After the feature extraction network model obtained in the above steps is obtained in the off-line stage, pedestrian re-identification is carried out in the on-line stage, and the specific steps are as follows:
firstly, all pictures in the picture library are preprocessed and adjusted to the uniform input size of the feature extraction model; features are then extracted from the preprocessed picture library with the feature extraction model obtained by off-line training, and the feature vectors are stored entry by entry under the key information of the corresponding pictures to form a feature library;
secondly, preprocessing a target picture to be analyzed and extracting a feature vector with expression capability;
and thirdly, carrying out similarity measurement on the extracted target feature vector to be analyzed in a stored feature vector library. Carrying out normalization sorting on the calculated similarities, and selecting the images in the library which accord with the similarity conditions and are ranked in front as the retrieval results;
and finally, integrating the information such as the camera and the ID matched with the retrieval results, outputting the information in a visual mode, and simultaneously storing the information in a query library to provide input for other application analysis. In addition, for surveillance videos or pedestrian data acquired currently, the picture library and the feature vector library thereof need to be updated regularly, so that the most accurate pedestrian re-identification result is ensured.
Compared with the prior art, the invention has the advantages that:
1. the invention provides a deep neural network framework consisting of a main network and three sub-networks, wherein the main network is used for extracting global features, and the three sub-networks are used for extracting local features of the head, the trunk and the legs of a human body by utilizing joint point information. And finally, fusing the global and local features to improve the retrieval accuracy and effectively resist frequent occlusion in the surveillance video.
2. Quintuple constraint is designed by utilizing the deduced pedestrian orientation information, the metric learning ability is enhanced, and a strategy of training a network model by using joint classification loss and verification loss is used. It is ensured that the characteristic distance between images belonging to the same person is smaller than the characteristic distance between images belonging to different persons, and the characteristic distance between images of the same pose of the same person is smaller than the characteristic distance between images of different poses. The difficulty brought to the re-identification of the pedestrian by the variable target postures of the non-rigid pedestrian is fundamentally overcome.
3. The invention is decoupled from common video analysis modules such as detection and tracking. It can be integrated into any intelligent surveillance system as an independent module, provides accurate input for upper-layer analysis, and is convenient and robust to use.
Drawings
FIG. 1 is a general diagram of a pedestrian re-identification method using attitude information to design a multi-loss function according to the present invention;
FIG. 2 is a schematic diagram of the present invention for estimating the orientation of a pedestrian and calculating the local region information to extract the local fine features of the head, the trunk and the legs according to the attitude information;
fig. 3 compares the joint point and local region position information extracted by the invention with the conventional striped local area division: the first group is the original images, the second group the extraction effect of the invention, and the third group the striped division effect; comparing the second and third groups shows that the method of the invention aligns the local regions of the pedestrian target more effectively and eliminates part of the background interference;
FIG. 4 is a flow chart of estimating a pedestrian target attitude heading;
FIG. 5 is an exemplary diagram illustrating the method of estimating the orientation of a pedestrian using joint information, wherein the method is generally divided into three orientations, i.e., a side orientation, a front orientation and a back orientation;
FIG. 6 is a schematic diagram of the design of quintuple loss function according to the present invention.
Detailed Description
The specific steps of the present invention will be described in detail with reference to the accompanying drawings and examples.
The invention provides a pedestrian re-identification method for designing a multi-loss function by utilizing attitude information, which firstly introduces the pedestrian re-identification processing process in detail by combining the general schematic diagram of figure 1. The method comprises an off-line stage and an on-line stage, wherein the off-line stage comprises preprocessing, rough feature extraction, fine feature extraction, feature fusion, quintuple similarity measurement, multi-class loss function calculation, network parameter learning and the like; the online stage comprises four parts of feature extraction, similarity measurement, picture library updating and result visualization.
Stage (1) off-line stage: and training and learning a network model for extracting features.
A. The data preprocessing steps are as follows: note that in real video the pedestrian bounding box is mostly rectangular, with an aspect ratio of about 0.5. Most existing deep-learning-based pedestrian re-identification methods use square network inputs, which is not conducive to preserving the spatial characteristics of pedestrians. Therefore, the input size of the network is changed to 107 × 219, consistent with the actual aspect ratio of pedestrian images, which facilitates effective feature extraction while also reducing network parameters. This preprocessing is applied to every picture $I_i$ in the image list $L = \{I_1, I_2, \ldots, I_n\}$.
B. The coarse feature extraction steps are as follows: the network architecture designed by the invention mainly comprises a backbone network, branch networks and loss function layers. The backbone and branch networks share network parameters before Conv5_x; this part is mainly responsible for extracting the coarse features of the pictures, which carry semantic and related information and are close to global features, and therefore also serve as the base features for global feature extraction. Taking picture $I_i$ as an example, the feature extracted by this part is the input to the fine feature extraction of the next part.
C. The fine feature extraction steps are as follows: after the coarse features are obtained, the main network further extracts the fine features, and the branch network extracts the local fine features according to the local feature extraction schematic diagram shown in fig. 2. The specific process is as follows:
1) Detecting pedestrian joint points: as shown in fig. 2, joint detection is performed on the preprocessed pictures. The invention extracts 18 joint points of the human body (losses allowed), including the neck, nose, and the left and right shoulders, elbows, wrists, hips, knees, ankles, eyes and ears. Samples for which joint detection fails, or which lack trunk information, do not participate in training. After the coordinates of each joint point are obtained, the position information of the left and right shoulders of qualifying samples serves as the main basis for 2), estimating the pedestrian's pose orientation, while the height prediction assists 3), extracting the local region information.
2) Inferring the pedestrian's pose orientation: after removing the samples for which joint detection failed or which lack a torso, the presence of the left and right shoulders is checked, as shown in fig. 4, to identify the clearly lateral samples. For samples in which both shoulders exist, the shoulder vector is calculated, the included angle between this vector and the vertical line is computed, and the sample is judged forward, backward or lateral according to the angle range. Fig. 5 shows examples of pedestrian orientations estimated by the method of the invention.
3) Calculating pedestrian local region information: traditional striped local area division cannot eliminate the interference of a complex background and, as the example of fig. 3 shows, cannot achieve local feature region alignment; such errors may drive the network model to learn wrong features. Therefore, after the coordinates of each joint point are obtained, the pedestrian's height is estimated and used as an aid: the region boundaries are computed from the maxima and minima of the joint coordinates of each region, providing position information for the subsequent extraction of local features by the network. Again taking picture $I_i$ as an example of the formulation, the three local region descriptors are each composed of a quadruple $(x_i, y_i, w_i, h_i)$, where x, y, w, h denote the upper-left coordinates (x, y) of a region and its width and height, respectively.
4) Extracting local fine features: after the local region information is obtained in 3), the local fine features are extracted by the branch networks within the network structure. The parameters of these branch parts are not shared.
D. The feature fusion mode is as follows: the invention analyzed several feature fusion modes in experiments, chiefly comparing an element-wise mode and a Concat mode. The results show that the Concat mode, in which global and local features complement each other, obtains the most effective feature vector under the design principle of the invention.
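A sketch of the Concat-style fusion preferred above: the global feature and the three local features are concatenated and mapped by a fully connected layer; the feature dimensions are assumptions of the example.

```python
import torch
import torch.nn as nn

class ConcatFusion(nn.Module):
    def __init__(self, g_dim=1024, l_dim=256, out_dim=1024):
        super().__init__()
        self.fc = nn.Linear(g_dim + 3 * l_dim, out_dim)

    def forward(self, f_global, f_head, f_torso, f_leg):
        # Concat mode: complementary global and local features
        fused = torch.cat([f_global, f_head, f_torso, f_leg], dim=1)
        return self.fc(fused)    # the pedestrian's expressive feature vector
```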
E. The design and construction steps of the quintuple are as follows: the quintuple loss function improves on the triplet loss function by adding a pose constraint. The commonly used triplet loss function is expressed mathematically over a triplet $\{I_i^a, I_i^p, I_i^n\}$, where $I_i^a$ is any reference pedestrian image in the data set, $I_i^p$ is another image of the same person as the reference (a positive sample), and $I_i^n$ is an image of a different person (a negative sample). The triplet input passes through the network to yield the feature vectors $\{f(I_i^a), f(I_i^p), f(I_i^n)\}$, subject to the triplet constraint:

$$D_{id}(I_i^a, I_i^p, I_i^n) = d(f(I_i^a) - f(I_i^p)) - d(f(I_i^a) - f(I_i^n)) < \alpha$$

where $d(f(I_i^a) - f(I_i^p))$ is the distance between the reference image and the positive sample and $d(f(I_i^a) - f(I_i^n))$ is the distance between the reference image and the negative sample. The inequality becomes meaningful by learning a metric in which the feature distance between images of the same person must be smaller than the feature distance between images of different persons, i.e., image features of the same person are more similar than those of different persons. On this basis, the invention introduces a pose double constraint: using the pedestrian pose orientations obtained in C.2), the samples are classified into three classes, forward, lateral and backward, and the following pose constraint is imposed:

$$D_{pose}(I_i^a, I_i^{ps}, I_i^{pd}) = d(f(I_i^a) - f(I_i^{ps})) - d(f(I_i^a) - f(I_i^{pd})) < \beta$$
where $I_i^{ps}$ denotes a positive sample with the same pose as $I_i^a$ and $I_i^{pd}$ denotes a positive sample with a different pose. The objective of this loss term is a metric in which, in the distance space, the feature distance between images of the same person in the same pose is smaller than the feature distance between images of the same person in different poses. Such a constraint ensures that the distance between positive samples with the same pose is smaller, reducing the influence of pose change.
The method takes the original triplet constraint as the first constraint and the improved pose constraint as the second, combines the two into the quintuple structure, and computes its loss to realize verification-model training. The loss function is calculated as follows:

$$Loss_2(I, w) = \frac{1}{N} \sum \left[ \max\big(D_{id}(I_i^a, I_i^p, I_i^n) - \alpha,\, 0\big) + \lambda \max\big(D_{pose}(I_i^a, I_i^{ps}, I_i^{pd}) - \beta,\, 0\big) \right]$$
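For illustration, a sketch of assembling one quintuple {Ia, Ip, In, Ips, Ipd} from a data set with identity and orientation tags; the random sampling strategy is an assumption of the example, since the text does not specify how quintuples are mined.

```python
import random

def sample_quintuple(index, anchor, pid, pose):
    """index: dict mapping (person_id, orientation) -> list of images."""
    same_pose = [x for x in index.get((pid, pose), []) if x is not anchor]
    diff_pose = [x for (p, o), v in index.items() if p == pid and o != pose
                 for x in v]
    negatives = [x for (p, _), v in index.items() if p != pid for x in v]
    if not (same_pose and diff_pose and negatives):
        return None                                # skip incomplete quintuple
    ips = random.choice(same_pose)                 # positive, same pose
    ipd = random.choice(diff_pose)                 # positive, different pose
    i_n = random.choice(negatives)                 # negative sample
    i_p = random.choice(same_pose + diff_pose)     # generic positive
    return anchor, i_p, i_n, ips, ipd
```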
F. The implementation of the multi-class loss calculation and joint network training is as follows: the invention uses two loss functions jointly. One is a softmax loss function, which focuses on classifying the images; the other is the quintuple loss function with the added pose constraint, which focuses on verifying whether two images show the same person. FIG. 6 is a schematic diagram of the quintuple loss function design of the invention. The classification model uses a softmax layer of output size k after the network's final feature layer, where k is the number of classes in the training set. Training the classification network consists in minimizing the cross-entropy loss; this classification loss, together with the quintuple loss function of step E above, jointly trains the network model. The joint loss function is calculated as:

$$Loss_3(I, w) = \lambda_1 Loss_1(I, w) + \lambda_2 Loss_2(I, w)$$
stage (2) on-line stage: and carrying out re-identification on the specified pedestrian in the pedestrian database.
A. Extraction of expressive features: the invention utilizes the feature extraction network obtained by off-line stage training to extract the expressive features of the picture to be analyzed and the existing pedestrian picture library. And simultaneously storing the feature vectors corresponding to the current picture library one by one before the next updating.
B. Similarity measurement against the pedestrian picture library and feature library: after comparing Euclidean distance and cosine distance, cosine distance was selected as the standard metric. The feature vector of the picture to be analyzed is measured against the feature vectors of the picture library in turn, and the resulting similarities are normalized and sorted. Pictures with similarity above 0.7 that rank in the top M are taken as the retrieval result, where M is set dynamically according to the total number of current pictures.
C. Periodic updating of the pedestrian picture library: for the static picture library, each picture to be analyzed is continuously added and its feature vector stored. For pedestrian data generated from live video, the pedestrian information obtained by detection is updated every 30 minutes; before the next update, pedestrian targets judged to be new persons are continuously added and their features extracted, so that pedestrian re-identification can be completed on line once a target to be analyzed is obtained.
D. Visualization scheme of pedestrian re-identification result: the invention records the result of each inquiry and stores the result into the database. Displaying the pedestrian re-identification result in two modes, and displaying the target to be analyzed and no more than M pictures which are determined as the same pedestrian aiming at the static picture library; and for the dynamic video, firstly, locking the picture of the retrieval result, and visualizing the picture into a corresponding video picture according to the camera ID, the pedestrian ID, the frame number, the position in the video and other information stored in the database. The stored entry information can be used for visualization, and can also be used for upper-layer applications such as camera topology analysis and video content analysis.

Claims (6)

1. A pedestrian re-identification method that uses pose information to design a multi-loss function, characterized in that the method comprises two main parts: off-line feature extraction network model training and on-line pedestrian re-identification;
step (1), an off-line extraction characteristic network model training stage:
(m1) preprocessing all pictures; the original picture $rI_i$ is denoted $I_i$ after processing;
(m2) detecting joint point information for each picture; the 18 obtained joint points are $P_{I_i} = \{x_1, y_1, \ldots, x_{18}, y_{18}\}$, and a corresponding Boolean array $label_i$ (True or False) indicates whether each joint point was detected;
(m3) estimating the height $high_i$ of each pedestrian from the joint point information extracted in step (m2), and calculating the local region information of the head, trunk and legs respectively;
(m4) estimating the orientation of the pedestrian target from the joint point information extracted in step (m2), recorded as $dir_i \in \{1, 2, 3\}$, where 1 denotes a forward sample, 2 a lateral sample, and 3 a backward sample;
(m5) extracting global features according to the designed backbone network, extracting local features according to the local region position information extracted in the step (m3) and the branch network structure, and fusing the global features and the local features of each picture to form expressive feature vectors together;
(m6) calculating a multi-classification loss function and a triplet constraint according to the true data labels, while designing a quintuple according to the pedestrian pose orientation inferred in step (m4) and calculating a quintuple loss function; the step (m6) comprises the following steps:
(m6.1) calculating a multi-classification loss function error;
(m6.2) computing the triplet constraint:

$$D_{id}(I_i^a, I_i^p, I_i^n) = d(f(I_i^a) - f(I_i^p)) - d(f(I_i^a) - f(I_i^n)) < \alpha$$

where $I_i^a$ is any reference pedestrian image in the data set, $I_i^p$ is another image of the same person as the reference (a positive sample), and $I_i^n$ is an image of a different person (a negative sample); the triplet input passes through the network to yield the feature vectors $\{f(I_i^a), f(I_i^p), f(I_i^n)\}$; $d(f(I_i^a) - f(I_i^p))$ is the distance between the reference image and the positive sample, $d(f(I_i^a) - f(I_i^n))$ is the distance between the reference image and the negative sample, and $\alpha$ is the threshold of the triplet constraint;
(m6.3) computing the pose constraint of the quintuple:

$$D_{pose}(I_i^a, I_i^{ps}, I_i^{pd}) = d(f(I_i^a) - f(I_i^{ps})) - d(f(I_i^a) - f(I_i^{pd})) < \beta$$

where $I_i^{ps}$ denotes a positive sample with the same pose as $I_i^a$, $I_i^{pd}$ denotes a positive sample with a different pose, and $\beta$ is the threshold of the pose constraint;
(m7) training the current feature extraction network by combining the multiple loss function errors calculated in step (m6), analyzing the influence of different loss function weights on the network, and selecting the optimal weights $\lambda_1$ and $\lambda_2$ to complete the joint training; the step (m7) comprises the following steps:
(m7.1) calculating the back-propagated joint error value from the multi-loss-function errors obtained in step (m6):

$$Loss_1(I, w) = -\sum_{i=1}^{n} p_i \log \hat{p}_i$$

$$Loss_2(I, w) = \frac{1}{N} \sum \left[ \max\big(D_{id}(I_i^a, I_i^p, I_i^n) - \alpha,\, 0\big) + \lambda \max\big(D_{pose}(I_i^a, I_i^{ps}, I_i^{pd}) - \beta,\, 0\big) \right]$$

$$Loss_3(I, w) = \lambda_1 Loss_1(I, w) + \lambda_2 Loss_2(I, w)$$

where $Loss_1$ denotes the multi-class loss function, $Loss_2$ the quintuple loss function and $Loss_3$ the joint loss function; $\lambda_1$ and $\lambda_2$ balance the weights of the joint loss function, $\lambda$ is the weight of $D_{pose}$ within the quintuple loss function, $w$ denotes the network parameters, $\hat{p}_i$ is the predicted probability, $p_i$ the target probability, $n$ the number of pedestrian classes, and $N$ the number of quintuples;
(m7.2) analyzing the error weight parameters $\lambda_1$ and $\lambda_2$ of step (m7.1) and determining the optimal loss-function weighting used in the off-line stage;
step (2), an online pedestrian re-identification stage:
(s1) preprocessing all pictures $I_{gallery}$ in the picture library, extracting features with the network model obtained by the off-line training of step (1), and storing the extracted features one by one under the identification information of the corresponding pictures to form a feature library $F_{gallery}$;
(s2) preprocessing the picture $I_{query}$ to be analyzed and extracting features with the network model obtained by the off-line training of step (1); the resulting feature vector $f_{query}$ is the sole input to the similarity measure of the subsequent step (s3);
(s3) calculating the similarity between $f_{query}$ extracted in step (s2) and the feature library $F_{gallery}$, carrying out normalization and sorting, and selecting the pictures with similarity greater than 0.7 among the top M as the retrieval result of pedestrian re-identification, where the value of M is chosen dynamically according to the number of pictures in the current library;
(s4) periodically updating the picture library and its corresponding feature library, covering both the static picture library and the dynamic library of targets detected and captured from live video.
2. The method of claim 1, wherein the pedestrian re-identification method using attitude information to design multi-loss function is characterized in that: the step (m3) comprises the following steps:
(m3.1) according to the joint point information $P_{I_i}$ extracted in step (m2), removing samples whose $label_i$ entries are all False, i.e., samples for which joint detection failed;
(m3.2) letting the samples whose joint information extracted in step (m2) meets the requirements participate in training, and inferring the pedestrian height $high_i$ from the available joint information $P_{I_i}$;
(m3.3) calculating the head region information from the left and right ear and nose joint points;
(m3.4) calculating the trunk region information from the position information of the joint points of the left and right shoulders and the left and right hips;
(m3.5) calculating the leg region information from the waist position, ankle and height information; because the detected bounding box often does not contain the feet, this region is scaled proportionally according to the height;
(m3.6) generating regions of interest from the local region position information calculated in steps (m3.3) to (m3.5); through an improved region-of-interest feature extraction layer, these enter the branch networks for local feature extraction.
3. The pedestrian re-identification method for designing a multi-loss function by using attitude information as claimed in claim 2, wherein: the step (m4) comprises the following steps:
(m4.1) after the screening of step (m3.1), determining the pose orientation of the samples participating in training; samples missing the left or right shoulder are judged lateral, $dir_i = 2$;
(m4.2) for samples in which both shoulders are present, calculating the left-to-right shoulder vector;
(m4.3) calculating the included angle $dir\_angle_{I_i}$ between the shoulder vector obtained in step (m4.2) and the vertical line;
(m4.4) judging the range of the included angle $dir\_angle_{I_i}$ calculated in step (m4.3): if it lies within $[260°, 280°]$ the sample is marked forward, $dir_i = 1$; otherwise, if it lies within $[80°, 100°]$ it is marked backward, $dir_i = 3$; if it lies in neither range, the sample is marked lateral, $dir_i = 2$.
4. The pedestrian re-identification method for designing a multi-loss function by using attitude information as claimed in claim 2, wherein: the step (m5) comprises the following steps:
(m5.1) extracting global features from the backbone network of the network framework, labeled $f_{global}(I_i)$;
(m5.2) extracting the three local features of $I_i$ from the regions obtained in step (m3.6), labeled $f_h(I_i)$, $f_t(I_i)$ and $f_l(I_i)$ respectively;
(m5.3) fusing the global feature of step (m5.1) with the local features of step (m5.2) through a fully connected layer to obtain $f(I_i)$.
5. The pedestrian re-identification method for designing a multi-loss function by using attitude information as claimed in claim 1, wherein: the step (s3) comprises the following steps:
(s3.1) dynamically selecting the value of M according to the number of pictures in the current picture library;
(s3.2) sequentially calculating the feature distances between f_query extracted in step (s2) and the feature library F_gallery;
(s3.3) normalizing and sorting all the feature distances calculated in step (s3.2), and selecting the pictures whose similarity exceeds 0.7 and that rank within the top M as the pedestrian re-identification retrieval result;
(s3.4) visualizing the pedestrian re-identification retrieval result obtained in step (s3.3): for the static picture library, displaying I_query and the sorted I_results; for the dynamic video library, recovering from I_results, together with the camera ID, pedestrian ID, bounding box position, frame number and time stored in the database, the actual situation of each result in the video at its moment in time (steps (s3.1) to (s3.3) are sketched in code after this claim).
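Steps (s3.1) to (s3.3) reduce to a distance computation, score normalization, a 0.7 similarity threshold, and a top-M cut. A minimal sketch, assuming cosine similarity and min-max normalization since the claim names neither the distance nor the normalization:

```python
import numpy as np

def retrieve(f_query, F_gallery, m, sim_threshold=0.7):
    """Rank gallery features against one query (steps s3.1-s3.3)."""
    q = f_query / np.linalg.norm(f_query)
    G = F_gallery / np.linalg.norm(F_gallery, axis=1, keepdims=True)
    sims = G @ q
    # Min-max normalize scores into [0, 1] before thresholding.
    sims = (sims - sims.min()) / (sims.max() - sims.min() + 1e-12)
    order = np.argsort(-sims)  # descending: most similar first
    return [i for i in order[:m] if sims[i] > sim_threshold]
```

Note that a similarity rather than a distance is ranked here, so larger scores come first; with a true distance one would sort ascending and map it to a similarity before applying the 0.7 threshold.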
6. The pedestrian re-identification method for designing a multi-loss function by using attitude information as claimed in claim 1, wherein: the step (s4) comprises the following steps:
(s4.1) setting a time t for periodic updating;
(s4.2) within the time window t, continuously adding the information and features of query pictures I_query to the static picture library; at time t, replacing or updating the picture library as required, re-extracting the features of the changed pictures, and building a new feature library;
(s4.3) within the time window t, continuously adding newly detected targets to the dynamic video library, and storing the camera ID, pedestrian ID, bounding box position, frame number, time and world location information in a database; when time t is reached, clearing the older half of the pedestrian data in the current database by time, then adding new detection results frame by frame while extracting their features as a main attribute stored in the database (the periodic update is sketched in code after this claim).
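The timed half-purge of the dynamic library in steps (s4.1) and (s4.3) can be sketched as below. The record fields mirror the claim (camera ID, pedestrian ID, bounding box, frame number, time); an in-memory deque stands in for the database the claim describes, and the one-hour default period is an assumption.

```python
import time
from collections import deque

class DynamicGallery:
    """Periodic update of the dynamic video library (steps s4.1-s4.3)."""

    def __init__(self, period_s=3600):
        self.period_s = period_s
        self.records = deque()          # ordered by insertion time
        self.last_update = time.time()

    def add(self, cam_id, ped_id, bbox, frame_no, feature):
        # New detections are stored with their feature as a main attribute.
        self.records.append({"cam": cam_id, "ped": ped_id, "bbox": bbox,
                             "frame": frame_no, "t": time.time(),
                             "feat": feature})
        if time.time() - self.last_update >= self.period_s:
            self._purge()

    def _purge(self):
        # Clear the older half of the stored records by time, as in (s4.3).
        for _ in range(len(self.records) // 2):
            self.records.popleft()
        self.last_update = time.time()
```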
CN201710946443.3A 2017-10-12 2017-10-12 Pedestrian re-identification method for designing multi-loss function by utilizing attitude information Active CN107832672B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710946443.3A CN107832672B (en) 2017-10-12 2017-10-12 Pedestrian re-identification method for designing multi-loss function by utilizing attitude information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710946443.3A CN107832672B (en) 2017-10-12 2017-10-12 Pedestrian re-identification method for designing multi-loss function by utilizing attitude information

Publications (2)

Publication Number Publication Date
CN107832672A CN107832672A (en) 2018-03-23
CN107832672B true CN107832672B (en) 2020-07-07

Family

ID=61647742

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710946443.3A Active CN107832672B (en) 2017-10-12 2017-10-12 Pedestrian re-identification method for designing multi-loss function by utilizing attitude information

Country Status (1)

Country Link
CN (1) CN107832672B (en)

Families Citing this family (83)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107316031B (en) * 2017-07-04 2020-07-10 北京大学深圳研究生院 Image feature extraction method for pedestrian re-identification
CN108596211B (en) * 2018-03-29 2020-08-28 中山大学 Shielded pedestrian re-identification method based on centralized learning and deep network learning
CN108537181A (en) * 2018-04-13 2018-09-14 盐城师范学院 A kind of gait recognition method based on the study of big spacing depth measure
CN108764065B (en) * 2018-05-04 2020-12-08 华中科技大学 Pedestrian re-recognition feature fusion aided learning method
CN109190646B (en) * 2018-06-25 2019-08-20 北京达佳互联信息技术有限公司 A kind of data predication method neural network based, device and nerve network system
CN108960140B (en) * 2018-07-04 2021-04-27 国家新闻出版广电总局广播科学研究院 Pedestrian re-identification method based on multi-region feature extraction and fusion
CN109190446A (en) * 2018-07-06 2019-01-11 西北工业大学 Pedestrian's recognition methods again based on triple focused lost function
CN109063607B (en) * 2018-07-17 2022-11-25 北京迈格威科技有限公司 Method and device for determining loss function for re-identification
CN109214271B (en) * 2018-07-17 2022-10-18 北京迈格威科技有限公司 Method and device for determining loss function for re-identification
CN109165589B (en) * 2018-08-14 2021-02-23 北京颂泽科技有限公司 Vehicle weight recognition method and device based on deep learning
CN110858295B (en) * 2018-08-24 2021-04-20 广州汽车集团股份有限公司 Traffic police gesture recognition method and device, vehicle control unit and storage medium
CN109446898B (en) * 2018-09-20 2021-10-15 暨南大学 Pedestrian re-identification method based on transfer learning and feature fusion
CN111091020A (en) * 2018-10-22 2020-05-01 百度在线网络技术(北京)有限公司 Automatic driving state distinguishing method and device
CN109614853B (en) * 2018-10-30 2023-05-05 国家新闻出版广电总局广播科学研究院 Bilinear pedestrian re-identification network construction method based on body structure division
CN109299707A (en) * 2018-10-30 2019-02-01 天津师范大学 A kind of unsupervised pedestrian recognition methods again based on fuzzy depth cluster
CN109508663B (en) * 2018-10-31 2021-07-13 上海交通大学 Pedestrian re-identification method based on multi-level supervision network
CN109583315B (en) * 2018-11-02 2023-05-12 北京工商大学 Multichannel rapid human body posture recognition method for intelligent video monitoring
CN109492583A (en) * 2018-11-09 2019-03-19 安徽大学 A kind of recognition methods again of the vehicle based on deep learning
CN109472248B (en) * 2018-11-22 2022-03-25 广东工业大学 Pedestrian re-identification method and system, electronic equipment and storage medium
CN109522850B (en) * 2018-11-22 2023-03-10 中山大学 Action similarity evaluation method based on small sample learning
CN109583502B (en) * 2018-11-30 2022-11-18 天津师范大学 Pedestrian re-identification method based on anti-erasure attention mechanism
CN111310518B (en) * 2018-12-11 2023-12-08 北京嘀嘀无限科技发展有限公司 Picture feature extraction method, target re-identification method, device and electronic equipment
CN109800794B (en) * 2018-12-27 2021-10-22 上海交通大学 Cross-camera re-identification fusion method and system for appearance similar targets
CN109711366B (en) * 2018-12-29 2021-04-23 浙江大学 Pedestrian re-identification method based on group information loss function
CN111401113A (en) * 2019-01-02 2020-07-10 南京大学 Pedestrian re-identification method based on human body posture estimation
CN109711386B (en) * 2019-01-10 2020-10-09 北京达佳互联信息技术有限公司 Method and device for obtaining recognition model, electronic equipment and storage medium
CN109919320B (en) * 2019-01-23 2022-04-01 西北工业大学 Triplet network learning method based on semantic hierarchy
CN109902573B (en) * 2019-01-24 2023-10-31 中国矿业大学 Multi-camera non-labeling pedestrian re-identification method for video monitoring under mine
CN109886141B (en) * 2019-01-28 2023-06-06 同济大学 Pedestrian re-identification method based on uncertainty optimization
CN109934197B (en) * 2019-03-21 2023-07-07 深圳力维智联技术有限公司 Training method and device for face recognition model and computer readable storage medium
CN110046553A (en) * 2019-03-21 2019-07-23 华中科技大学 A kind of pedestrian weight identification model, method and system merging attributive character
CN109993116B (en) * 2019-03-29 2022-02-11 上海工程技术大学 Pedestrian re-identification method based on mutual learning of human bones
CN110110755B (en) * 2019-04-04 2021-02-26 长沙千视通智能科技有限公司 Pedestrian re-identification detection method and device based on PTGAN region difference and multiple branches
CN109919141A (en) * 2019-04-09 2019-06-21 广东省智能制造研究所 A kind of recognition methods again of the pedestrian based on skeleton pose
CN111832348B (en) * 2019-04-17 2022-05-06 中国科学院宁波材料技术与工程研究所 Pedestrian re-identification method based on pixel and channel attention mechanism
CN110309701B (en) * 2019-04-17 2022-08-05 武汉大学 Pedestrian re-identification method based on same cross-view-angle area
CN110163110B (en) * 2019-04-23 2023-06-06 中电科大数据研究院有限公司 Pedestrian re-recognition method based on transfer learning and depth feature fusion
CN111738039A (en) * 2019-05-10 2020-10-02 北京京东尚科信息技术有限公司 Pedestrian re-identification method, terminal and storage medium
CN111783506A (en) * 2019-05-17 2020-10-16 北京京东尚科信息技术有限公司 Method and device for determining target characteristics and computer-readable storage medium
CN110288677B (en) * 2019-05-21 2021-06-15 北京大学 Pedestrian image generation method and device based on deformable structure
CN110232330B (en) * 2019-05-23 2020-11-06 复钧智能科技(苏州)有限公司 Pedestrian re-identification method based on video detection
CN110334738A (en) * 2019-06-05 2019-10-15 大连理工大学 The method of more sorter networks for image recognition
CN110321813B (en) * 2019-06-18 2023-06-20 南京信息工程大学 Cross-domain pedestrian re-identification method based on pedestrian segmentation
CN110458004B (en) * 2019-07-02 2022-12-27 浙江吉利控股集团有限公司 Target object identification method, device, equipment and storage medium
CN110321862B (en) * 2019-07-09 2023-01-10 天津师范大学 Pedestrian re-identification method based on compact ternary loss
CN110334675B (en) * 2019-07-11 2022-12-27 山东大学 Pedestrian re-identification method based on human skeleton key point segmentation and column convolution
CN110490901A (en) * 2019-07-15 2019-11-22 武汉大学 The pedestrian detection tracking of anti-attitudes vibration
CN110543817A (en) * 2019-07-25 2019-12-06 北京大学 Pedestrian re-identification method based on posture guidance feature learning
CN110688888B (en) * 2019-08-02 2022-08-05 杭州未名信科科技有限公司 Pedestrian attribute identification method and system based on deep learning
CN110619271A (en) * 2019-08-12 2019-12-27 浙江浩腾电子科技股份有限公司 Pedestrian re-identification method based on depth region feature connection
CN112417932B (en) * 2019-08-23 2023-04-07 中移雄安信息通信科技有限公司 Method, device and equipment for identifying target object in video
CN110874574A (en) * 2019-10-30 2020-03-10 平安科技(深圳)有限公司 Pedestrian re-identification method and device, computer equipment and readable storage medium
CN110968734B (en) * 2019-11-21 2023-08-04 华东师范大学 Pedestrian re-recognition method and device based on deep measurement learning
CN111126198B (en) * 2019-12-11 2023-05-09 中山大学 Pedestrian re-identification method based on deep representation learning and dynamic matching
CN111274958B (en) * 2020-01-20 2022-10-04 福州大学 Pedestrian re-identification method and system with network parameter self-correction function
CN111597876A (en) * 2020-04-01 2020-08-28 浙江工业大学 Cross-modal pedestrian re-identification method based on difficult quintuple
CN111582154A (en) * 2020-05-07 2020-08-25 浙江工商大学 Pedestrian re-identification method based on multitask skeleton posture division component
CN111598037B (en) * 2020-05-22 2023-04-25 北京字节跳动网络技术有限公司 Human body posture predicted value acquisition method, device, server and storage medium
CN111657926B (en) * 2020-07-08 2021-04-23 中国科学技术大学 Arrhythmia classification method based on multi-lead information fusion
CN111797813B (en) * 2020-07-21 2022-08-02 天津理工大学 Partial pedestrian re-identification method based on visible perception texture semantic alignment
CN111861335B (en) * 2020-07-23 2021-08-06 印象(山东)大数据有限公司 Industrial interconnection material management system
CN112084917A (en) * 2020-08-31 2020-12-15 腾讯科技(深圳)有限公司 Living body detection method and device
CN112307979A (en) * 2020-10-31 2021-02-02 成都新潮传媒集团有限公司 Personnel attribute identification method and device and computer equipment
CN112101300A (en) * 2020-11-02 2020-12-18 北京妙医佳健康科技集团有限公司 Medicinal material identification method and device and electronic equipment
CN112382068B (en) * 2020-11-02 2022-09-16 鲁班软件股份有限公司 Station waiting line crossing detection system based on BIM and DNN
CN112381859A (en) * 2020-11-20 2021-02-19 公安部第三研究所 System, method, device, processor and storage medium for realizing intelligent analysis, identification and processing for video image data
CN112733594A (en) * 2020-12-01 2021-04-30 贵州电网有限责任公司 Machine room figure re-identification method based on deformable convolutional network
CN112989911A (en) * 2020-12-10 2021-06-18 奥比中光科技集团股份有限公司 Pedestrian re-identification method and system
CN112488071B (en) * 2020-12-21 2021-10-26 重庆紫光华山智安科技有限公司 Method, device, electronic equipment and storage medium for extracting pedestrian features
CN112597944A (en) * 2020-12-29 2021-04-02 北京市商汤科技开发有限公司 Key point detection method and device, electronic equipment and storage medium
CN112733921A (en) * 2020-12-31 2021-04-30 深圳辰视智能科技有限公司 Neural network loss function calculation method and system for predicting rigid body 6D posture
CN112733707B (en) * 2021-01-07 2023-11-14 浙江大学 Pedestrian re-recognition method based on deep learning
CN112784772B (en) * 2021-01-27 2022-05-27 浙江大学 In-camera supervised cross-camera pedestrian re-identification method based on contrast learning
CN112990120B (en) * 2021-04-25 2022-09-16 昆明理工大学 Cross-domain pedestrian re-identification method using camera style separation domain information
CN113408351B (en) * 2021-05-18 2022-11-29 河南大学 Pedestrian re-recognition method for generating confrontation network based on attitude guidance
CN113255598B (en) * 2021-06-29 2021-09-28 南京视察者智能科技有限公司 Pedestrian re-identification method based on Transformer
CN113963206A (en) * 2021-10-20 2022-01-21 中国石油大学(华东) Posture guidance-based target detection method for fast skating athletes
CN114067356B (en) * 2021-10-21 2023-05-09 电子科技大学 Pedestrian re-recognition method based on combined local guidance and attribute clustering
CN114120665B (en) * 2022-01-29 2022-04-19 山东科技大学 Intelligent phase control method and system based on pedestrian number
CN114550220B (en) * 2022-04-21 2022-09-09 中国科学技术大学 Training method of pedestrian re-recognition model and pedestrian re-recognition method
CN115762172A (en) * 2022-11-02 2023-03-07 济南博观智能科技有限公司 Method, device, equipment and medium for identifying vehicles entering and exiting parking places
CN115631464B (en) * 2022-11-17 2023-04-04 北京航空航天大学 Pedestrian three-dimensional representation method oriented to large space-time target association
CN117640794A (en) * 2023-02-21 2024-03-01 兴容(上海)信息技术股份有限公司 Network flow dividing method and system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7711146B2 (en) * 2006-03-09 2010-05-04 General Electric Company Method and system for performing image re-identification

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105518744A (en) * 2015-06-29 2016-04-20 北京旷视科技有限公司 Pedestrian re-identification method and equipment
CN105138998A (en) * 2015-09-07 2015-12-09 上海交通大学 Method and system for re-identifying pedestrian based on view angle self-adaptive subspace learning algorithm
CN106778527A (en) * 2016-11-28 2017-05-31 中通服公众信息产业股份有限公司 A kind of improved neutral net pedestrian recognition methods again based on triple losses
CN107145852A (en) * 2017-04-28 2017-09-08 深圳市唯特视科技有限公司 A kind of character recognition method based on homologous cosine losses function

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
DeepReID: Deep Filter Pairing Neural Network for Person Re-Identification; Wei Li et al.; 2014 IEEE Conference on Computer Vision and Pattern Recognition; 2014-09-25; full text *
Person Re-Identification by Multi-Channel Parts-Based CNN with Improved Triplet Loss Function; De Cheng et al.; 2016 IEEE Conference on Computer Vision and Pattern Recognition; 2016-12-12; full text *
Fine-grained pedestrian recognition based on dual convolutional neural networks; Wang Feng et al.; China Sciencepaper; 2017-07-31; Vol. 12, No. 14; full text *

Also Published As

Publication number Publication date
CN107832672A (en) 2018-03-23

Similar Documents

Publication Publication Date Title
CN107832672B (en) Pedestrian re-identification method for designing multi-loss function by utilizing attitude information
CN109508654B (en) Face analysis method and system fusing multitask and multi-scale convolutional neural network
CN112101150B (en) Multi-feature fusion pedestrian re-identification method based on orientation constraint
CN108520226B (en) Pedestrian re-identification method based on body decomposition and significance detection
Jiang et al. Recognizing human actions by learning and matching shape-motion prototype trees
WO2016131300A1 (en) Adaptive cross-camera cross-target tracking method and system
Su et al. Global localization of a mobile robot using lidar and visual features
Bi et al. Rethinking camouflaged object detection: Models and datasets
CN109800794B (en) Cross-camera re-identification fusion method and system for appearance similar targets
CN113221625B (en) Method for re-identifying pedestrians by utilizing local features of deep learning
CN111178208A (en) Pedestrian detection method, device and medium based on deep learning
CN111814845B (en) Pedestrian re-identification method based on multi-branch flow fusion model
CN104794451B (en) Pedestrian's comparison method based on divided-fit surface structure
CN110110694B (en) Visual SLAM closed-loop detection method based on target detection
CN105718882A (en) Resolution adaptive feature extracting and fusing for pedestrian re-identification method
CN113963032A (en) Twin network structure target tracking method fusing target re-identification
Galiyawala et al. Person retrieval in surveillance video using height, color and gender
WO2013075295A1 (en) Clothing identification method and system for low-resolution video
CN111401113A (en) Pedestrian re-identification method based on human body posture estimation
CN111582154A (en) Pedestrian re-identification method based on multitask skeleton posture division component
Hu et al. Fast face detection based on skin color segmentation using single chrominance Cr
Mitsui et al. Object detection by joint features based on two-stage boosting
CN110766093A (en) Video target re-identification method based on multi-frame feature fusion
Liu et al. Mean shift fusion color histogram algorithm for nonrigid complex target tracking in sports video
Hou et al. Forest: A Lightweight Semantic Image Descriptor for Robust Visual Place Recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant