WO2022236668A1 - Pedestrian re-identification method, system, electronic device and storage medium - Google Patents

Pedestrian re-identification method, system, electronic device and storage medium

Info

Publication number
WO2022236668A1
WO2022236668A1 (PCT/CN2021/093029)
Authority
WO
WIPO (PCT)
Prior art keywords
pedestrian
sequence
candidate
query
feature
Prior art date
Application number
PCT/CN2021/093029
Other languages
English (en)
French (fr)
Inventor
裴继红
倪子阳
赵阳
杨烜
Original Assignee
深圳大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳大学
Priority to PCT/CN2021/093029 priority Critical patent/WO2022236668A1/zh
Publication of WO2022236668A1 publication Critical patent/WO2022236668A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Definitions

  • the present invention relates to the technical field of image processing, in particular to a pedestrian re-identification method, system, electronic device and storage medium.
  • Pedestrian re-identification is a very important part of multi-camera intelligent video surveillance: pedestrians who have appeared in one field of view often need to be re-identified in other fields of view, solving the problem of matching and associating pedestrians across cameras and scenes. Pedestrian re-identification is closely tied to the camera monitoring equipment: because monitoring devices are deployed in different positions, the shooting environments of different cameras differ, and the posture, angle, lighting and occlusion of pedestrian targets vary from camera to camera. These variations all affect pedestrian re-identification results. Accuracy and efficiency are therefore very important goals in the field of person re-identification.
  • pedestrian re-identification methods based on single-frame images have achieved good results, but the information provided by a single frame is very limited, and such methods place high demands on pedestrian image quality: when pedestrians in the image are subject to occlusion, lighting or pose changes, large re-identification errors result. In a real monitoring environment it is difficult to guarantee that the pedestrian images captured by the cameras are clear and interference-free, so pedestrian re-identification based on single-frame images has certain limitations.
  • considering these problems, much work has turned to pedestrian re-identification based on pedestrian video sequences, performing re-identification with the video sequence as the unit; how to make full use of the sequence information is therefore a key part, and a problem not yet well solved. Compared with single-frame methods, sequence-based methods obtain more pedestrian information from multiple frames. The main idea of most current methods is to first extract single-frame pedestrian image features with a convolutional neural network (CNN), then fuse these single-frame features through max pooling, mean pooling, 3D convolution, recurrent neural networks (RNN) or an attention mechanism to obtain pedestrian sequence features, which are used to characterize the entire pedestrian sequence; distance measurement is then performed to obtain the final re-identification result.
  • however, pedestrian sequence features obtained by feature fusion tend to ignore and lose the detailed information of some images in the sequence and do not make full use of all the image information in the sequence, which affects the re-identification results.
  • the main purpose of the present invention is to provide a pedestrian re-identification method, system, electronic device and storage medium that can make full use of all the image information in a sequence, avoid the problems caused by feature fusion, and achieve a significant performance improvement in pedestrian sequence re-identification tasks.
  • the first aspect of the present invention provides a pedestrian re-identification method, including: data preparation, where the data includes a training data set and a test data set for the feature extraction network, the training data set and the test data set have different query sets and candidate sets, the query set includes query pedestrian sequences, and the candidate set includes candidate pedestrian sequences; pre-training and fine-tuning the feature extraction network using the training data set; using the feature extraction network to extract features from the test data in the test data set and reducing the dimensionality of the extracted features to obtain image features; performing similarity measurement on the image features of the test data to obtain the best matching candidate pedestrian; and outputting the best matching candidate pedestrian.
  • the pre-training and fine-tuning of the feature extraction network using the training data set includes: constructing a feature extraction network, the feature extraction network being MGN; determining the training parameters of the feature extraction network; preprocessing the training data set; and inputting the training data of the preprocessed training data set into the feature extraction network, calculating the loss value, using the loss value for backpropagation through the feature extraction network, and updating the training parameters, thereby realizing the pre-training and fine-tuning of the feature extraction network.
  • said performing similarity measurement on the features of the test data to obtain the best matching candidate pedestrian includes: for the pedestrian sequence represented by each test data, constructing from all of its image features a probability distribution function that describes the corresponding sequence feature space; estimating the probability of the image feature points of the query pedestrian sequence; measuring the similarity between the probability distribution functions of the query pedestrian sequence and the candidate pedestrian sequences; repeating the similarity measurement steps until all query pedestrian sequences have been measured against all candidate pedestrian sequences; and sorting by similarity to find the candidate pedestrian that best matches the query pedestrian sequence.
  • the probability estimation of the image feature points of the query pedestrian sequence includes: inputting the image feature sample points of the query pedestrian sequence into the RBFNN probability function estimator constructed from that same sequence to calculate the corresponding probability estimates; inputting the image feature sample points of the query pedestrian sequence into the RBFNN probability function estimators constructed from the candidate pedestrian sequence features to obtain the corresponding probability estimates; and repeating the above steps until probability estimation of the query sequence's image feature sample points has been completed for all pedestrian sequences in the candidate set.
  • the measurement of the similarity between the probability distribution functions of the query pedestrian sequence and the candidate pedestrian sequence includes: computing the KL divergence from the probability estimates of all image feature sample points of the query pedestrian sequence under its own feature distribution and under the candidate sequence's feature distribution, thereby completing the similarity measurement between the probability distribution function constructed from the query pedestrian sequence features and those constructed from all candidate pedestrian sequence features in the candidate set.
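The distribution-level similarity measure described above can be sketched in Python. This is a minimal illustration, not the patent's implementation: it assumes an isotropic Gaussian (Parzen-window) density with a scalar kernel width `sigma`, and estimates the KL divergence from the query sequence's own feature sample points, as in the steps above. The function names and the `eps` smoothing constant are my own.

```python
import math

def sequence_pdf(x, centers, sigma=1.0):
    """Parzen-window density: the average of Gaussian kernels centred on
    each image feature sample of the sequence (one kernel per sample)."""
    d = len(x)
    norm = (2.0 * math.pi * sigma * sigma) ** (d / 2.0)
    return sum(
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, c)) / (2.0 * sigma * sigma)) / norm
        for c in centers
    ) / len(centers)

def kl_divergence(query_feats, cand_feats, sigma=1.0, eps=1e-12):
    """Estimate D_KL(P_query || P_candidate) over the query's own samples;
    a smaller value means a more similar candidate sequence."""
    total = 0.0
    for x in query_feats:
        pq = sequence_pdf(x, query_feats, sigma)
        pg = sequence_pdf(x, cand_feats, sigma)
        total += math.log((pq + eps) / (pg + eps))
    return total / len(query_feats)
```

A candidate whose feature samples overlap the query's yields a divergence near zero, while a distant candidate yields a large positive value.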
  • the preprocessing of the training data set includes: adjusting the size of the training images in the training data set to m ⁇ n; performing random horizontal flipping and cropping on the training images to complete the preprocessing.
  • the feature extraction network includes a Resnet50 framework and three different, mutually independent branches: a global branch, a coarse-grained branch and a fine-grained branch; the global branch extracts global features, the coarse-grained branch extracts coarse-grained local features, and the fine-grained branch extracts fine-grained local features.
  • the second aspect of the present application provides a pedestrian re-identification system, including: a data preparation module for data preparation, where the data includes a training data set and a test data set for the feature extraction network, the training data set and the test data set have different query sets and candidate sets, the query set includes query pedestrian sequences, and the candidate set includes candidate pedestrian sequences; a network processing module for pre-training and fine-tuning the feature extraction network using the training data set; a feature extraction module for using the feature extraction network to extract features from the test data in the test data set and reducing the dimensionality of the extracted features to obtain image features; a similarity measurement module for performing similarity measurement on the image features of the test data to obtain the best matching candidate pedestrian; and a result output module configured to output the best matching candidate pedestrian.
  • the third aspect of the present application provides an electronic device, including a memory and a processor, the memory storing a computer program that can run on the processor, characterized in that the processor, when executing the computer program, implements the pedestrian re-identification method described in any one of the above.
  • the fourth aspect of the present application provides a computer-readable storage medium on which a computer program is stored, wherein, when the computer program is executed by a processor, the pedestrian re-identification method described in any one of the above is implemented.
  • the pedestrian re-identification method, system, electronic device and storage medium provided by the present application have the beneficial effect that, by using a similarity measure between sequence feature distributions instead of a measure between two sequence feature points, the information of every image in the pedestrian sequence can be fully utilized while avoiding the loss of image detail information caused by feature fusion.
  • FIG. 1 is a flowchart of a pedestrian re-identification method provided by an embodiment of the present invention
  • Fig. 2 shows sample images from the Market1501 and DukeMTMC-reID datasets used for training in the embodiment of the present invention;
  • Fig. 3 shows the laboratory-produced ATR pedestrian sequence dataset used for fine-tuning and testing in the embodiment of the present invention;
  • Fig. 4 is the training flowchart of the MGN used for image feature extraction
  • FIG. 5 is a schematic diagram of the structure of MGN
  • the three pictures in Fig. 6 from front to back are the feature point distribution map of the query sequence, the constructed probability distribution function map and the probability distribution map of feature points;
  • the three pictures in Fig. 7 from front to back are the feature point distribution map of the candidate sequence, the constructed probability distribution function map and the probability distribution map of feature points;
  • the three pictures in Fig. 8 from front to back are the feature point distribution map of the query sequence and the candidate sequence, the comparison map of the probability distribution function of the constructed query sequence and candidate sequence, and the comparison map of the probability distribution of common sample points in the two feature spaces;
  • the three pictures in Figure 9 from front to back are the contour map of the probability distribution of the query sequence, the contour map of the probability distribution of the candidate sequence, and the contrast map of the probability distribution contours of the two;
  • Fig. 10 is a flowchart of pedestrian sequence feature distribution similarity measurement
  • Fig. 11 is a frame structure diagram of pedestrian sequence feature distribution similarity measurement
  • Figure 12 is a RBFNN structure diagram for realizing the construction of pedestrian sequence probability distribution function
  • Fig. 13 is a flowchart of probability estimation for image feature points of query pedestrian sequence
  • Fig. 14 is a flow chart of measuring the similarity between the probability distribution functions of the query pedestrian sequence and the candidate pedestrian sequence
  • Figure 15 is the experimental results of the present invention, including mAP, Rank-1 and Rank-5 values;
  • Fig. 16 is a system structural block diagram of the pedestrian re-identification method provided by the present invention.
  • FIG. 17 is a schematic block diagram of the structure of the electronic device of the present invention.
  • the present invention proposes a pedestrian re-identification method.
  • the main idea of this method is to obtain the re-identification result by constructing the probability distribution function of the pedestrian sequence and measuring the similarity of the feature distribution of the pedestrian sequence. After obtaining all the image features in the pedestrian sequence through the feature extraction network, each pedestrian sequence can obtain the corresponding feature point set. For any query sequence and a candidate sequence in the test sample, the method constructs a corresponding probability distribution function through the set of image feature points in the sequence, which describes the feature distribution of the pedestrian sequence and is used to represent the corresponding sequence of pedestrians.
  • This method obtains the re-identification result using a similarity measure between sequence feature distributions instead of the measurement between two sequence feature points commonly used in current pedestrian sequence re-identification methods, fully utilizing the information of each image in the pedestrian sequence and avoiding the problems brought about by feature fusion.
  • Figure 1 shows the flow of the pedestrian re-identification method provided by the embodiment of the present invention, specifically including S1-S5, wherein,
  • the data includes a training data set and a testing data set for the feature extraction network.
  • the training data set and the testing data set have different query sets and candidate sets, the query set includes query pedestrian sequences, and the candidate set includes candidate pedestrian sequences.
  • the data includes the Market1501 and DukeMTMC-reID datasets, and the ATR pedestrian sequence dataset produced by our laboratory.
  • Figure 2 shows some images of the Market1501 and DukeMTMC-reID data sets
  • Figure 3 shows the ATR pedestrian sequence dataset produced by our laboratory. The invention uses the above datasets for pre-training and fine-tuning, and the ATR pedestrian sequence dataset is finally used for testing.
  • the Market1501 data set contains 1501 pedestrians
  • the training set contains 751 pedestrians, a total of 12936 pedestrian images
  • the test set contains 750 pedestrians, a total of 19732 images, of which the query set has 3368 pedestrian images
  • the candidate set has 16364 pedestrian images
  • the DukeMTMC-reID dataset contains 1404 pedestrians, of which the training set contains 702 pedestrians, a total of 16522 images
  • the test set contains 702 people, a total of 19889 pedestrian images, of which the query set has 2228 images
  • the candidate set has 17,661 images
  • the ATR pedestrian sequence dataset was produced by our laboratory and includes 300 pedestrians and 754 pedestrian sequences, a total of 152,951 pedestrian images; the query set has 241 pedestrian sequences, a total of 52,483 images, with an average of 217 images per pedestrian sequence
  • the candidate set has 513 pedestrian sequences with a total of 100,468 images, with an average of 195 images per pedestrian sequence.
  • S2 includes the following steps:
  • Figure 5 shows a schematic diagram of the MGN structure, which includes the Resnet50 framework and three different branches, namely the global branch, coarse-grained branch and fine-grained branch, and each branch is independent of each other.
  • the global branch is responsible for extracting global features
  • Coarse-grained branches extract coarse-grained local features
  • Fine-grained branches extract fine-grained local features
  • the coarse-grained and fine-grained branches also generate corresponding global features
  • for each training iteration, the present invention selects P×K pedestrian images from the training set to form a training batch for the network, where each batch contains P different pedestrians with K images randomly selected per pedestrian; ADAM is used as the optimizer for parameter training, the L2 regularization attenuation factor and the initial learning rate lr are set, and training runs for M iterations in total, during which the learning rate decays a set number of times.
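The P×K batch construction and the stepped learning-rate decay described above can be sketched as follows. This is an illustrative sketch only: the milestone epochs, the decay factor 0.1, and the example learning rate in the test are assumptions, since the source states only that the learning rate decays a set number of times.

```python
import random

def pk_batch(ids_to_images, P, K, rng=random):
    """Sample one P*K training batch: P identities, K images per identity.
    Identities with fewer than K images are sampled with replacement."""
    pids = rng.sample(sorted(ids_to_images), P)
    batch = []
    for pid in pids:
        imgs = ids_to_images[pid]
        picks = rng.sample(imgs, K) if len(imgs) >= K else [rng.choice(imgs) for _ in range(K)]
        batch.extend((pid, im) for im in picks)
    return batch

def stepped_lr(initial_lr, epoch, milestones, gamma=0.1):
    """Learning rate after multiplying by gamma at every passed milestone epoch."""
    return initial_lr * gamma ** sum(1 for m in milestones if epoch >= m)
```

In a training loop, `pk_batch` would feed each iteration and `stepped_lr` would set the optimizer's learning rate per epoch.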
  • the size of the training image needs to be adjusted to m ⁇ n, and then the training data is enhanced by random horizontal flipping and cropping to prevent the network from overfitting during training.
  • Horizontal flipping mirrors the image along the horizontal direction, like folding it about its vertical axis.
  • Cropping is to randomly select a part of the image for cropping.
  • the cropped image area is generally smaller than the network input image scale, so it is necessary to readjust the cropped image size to m ⁇ n.
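The preprocessing pipeline above (random horizontal flip, random crop, resize back to m×n) can be sketched in pure Python, with images represented as nested lists of pixel values. The 0.9 crop ratio, the flip probability, and the nearest-neighbour resize are illustrative assumptions; the source does not fix these details.

```python
import random

def hflip(img):
    # mirror each row left-to-right (img is a list of rows of pixel values)
    return [row[::-1] for row in img]

def random_crop(img, ch, cw, rng=random):
    # take a random ch x cw sub-region of the image
    H, W = len(img), len(img[0])
    top = rng.randint(0, H - ch)
    left = rng.randint(0, W - cw)
    return [row[left:left + cw] for row in img[top:top + ch]]

def resize_nn(img, m, n):
    # nearest-neighbour resize back to the network input size m x n
    H, W = len(img), len(img[0])
    return [[img[i * H // m][j * W // n] for j in range(n)] for i in range(m)]

def augment(img, m, n, crop_scale=0.9, p_flip=0.5, rng=random):
    """Random horizontal flip, random crop, then resize to m x n."""
    if rng.random() < p_flip:
        img = hflip(img)
    ch = max(1, int(len(img) * crop_scale))
    cw = max(1, int(len(img[0]) * crop_scale))
    return resize_nn(random_crop(img, ch, cw, rng), m, n)
```

The output always has the network input size m×n regardless of the randomly chosen crop region.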
  • S24 Input the training data in the preprocessed training data set into the feature extraction network, calculate the loss value, and use the loss value to perform backpropagation on the feature extraction network, update the training parameters, and realize the pre-training and fine-tuning of the feature extraction network .
  • P and K represent the number of pedestrian categories in the training batch and the number of pedestrian images selected for each category
  • the margin represents the tolerance value for the distance between the farthest positive sample and the nearest negative sample in the batch;
  • the remaining symbols denote the features of the a-th image in the i-th category, the features of the p-th image of the same pedestrian category, and the features of the n-th image of a different pedestrian category; p i,a represents the pedestrian category of the sample feature, and q i,a represents the network's predicted value for the sample feature.
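The symbols above (farthest positive, nearest negative, margin over a P×K batch) describe a batch-hard triplet loss; a common form of it is sketched below. This is an assumed illustration consistent with those symbols, not a reproduction of the patent's exact loss formula, and the classification term suggested by q i,a is omitted here.

```python
import math

def euclid(a, b):
    # Euclidean distance between two feature vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def batch_hard_triplet_loss(feats, labels, margin):
    """Average hinge loss over anchors, using the hardest (farthest) positive
    and the hardest (nearest) negative within one P*K batch."""
    loss, count = 0.0, 0
    for i, (f, y) in enumerate(zip(feats, labels)):
        pos = [euclid(f, g) for j, (g, z) in enumerate(zip(feats, labels)) if z == y and j != i]
        neg = [euclid(f, g) for g, z in zip(feats, labels) if z != y]
        if pos and neg:
            loss += max(0.0, max(pos) - min(neg) + margin)
            count += 1
    return loss / max(count, 1)
```

When the classes are well separated by more than the margin, the loss is zero; otherwise the hardest pairs drive the gradient.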
  • S3 Use the feature extraction network to extract features from the test data in the test data set, and reduce the dimensionality of the extracted features to obtain image features.
  • ⁇ qi represents the feature of the i-th pedestrian sequence in the query set
  • m represents the number of pedestrian sequences in the query set
  • ⁇ gi represents the feature of the i-th pedestrian sequence in the candidate set
  • n represents the number of pedestrian sequences in the candidate set.
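The patent does not specify how the extracted features are dimensionality-reduced, so the sketch below uses PCA via power iteration purely as an assumed illustration of the reduction step: each sequence's per-image features are projected onto the top-k principal directions.

```python
import math
import random

def pca_reduce(feats, k, iters=200, seed=0):
    """Project feature vectors onto their top-k principal directions
    (power iteration with deflation); illustrative, not the patent's method."""
    n, d = len(feats), len(feats[0])
    mean = [sum(f[j] for f in feats) / n for j in range(d)]
    X = [[f[j] - mean[j] for j in range(d)] for f in feats]
    # d x d covariance matrix of the centred features
    C = [[sum(x[i] * x[j] for x in X) / n for j in range(d)] for i in range(d)]
    rng = random.Random(seed)
    comps = []
    for _ in range(k):
        v = [rng.gauss(0, 1) for _ in range(d)]
        for _ in range(iters):  # power iteration towards the top eigenvector
            w = [sum(C[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(c * c for c in w)) or 1.0
            v = [c / norm for c in w]
        lam = sum(v[i] * sum(C[i][j] * v[j] for j in range(d)) for i in range(d))
        comps.append(v)
        for i in range(d):  # deflate: remove the found component from C
            for j in range(d):
                C[i][j] -= lam * v[i] * v[j]
    return [[sum(x[j] * c[j] for j in range(d)) for c in comps] for x in X]
```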
  • S4 Perform similarity measurement on the image features of the test data to obtain the best matching candidate pedestrians.
  • the present invention does not characterize the corresponding pedestrian sequence through a fused sequence feature, but instead constructs, from all the image features in the pedestrian sequence, a probability distribution function describing the sequence feature space, as shown in Figure 6 and Figure 7; the re-identification result is obtained by measuring the similarity of the pedestrian sequence feature distributions, as shown in Figure 8 and Figure 9. Further, Figure 10 shows the flow chart of the pedestrian sequence feature distribution similarity measurement and Figure 11 shows its framework structure. Step S4 includes the following steps:
  • the present invention uses each image feature sample in the pedestrian sequence as the core to construct a corresponding Gaussian function, and uses these Gaussian functions to fit the probability distribution function of the pedestrian sequence.
  • the expression of the Gaussian function is shown in formula (5).
  • the constructed pedestrian sequence probability distribution function can estimate the probability of any feature sample, and use it to describe the probability distribution of the feature sample in the feature space of the pedestrian sequence.
  • when the input feature sample is close to the core, the Gaussian function shows high activity; on the contrary, when the input feature sample is far from the core, the Gaussian function shows lower activity, and its activity becomes lower and lower as the distance increases.
  • the constructed probability distribution functions are shown in formulas (6) and (7), respectively,
  • x represents the input feature sample
  • xi represents the feature sample as the Gaussian core
  • represents the kernel width of the Gaussian function
  • the optimal value of the kernel width will be selected through experiments
  • N q and N g respectively represent the number of image features in the query sequence feature ⁇ q and the number of image features in the candidate sequence feature ⁇ g .
  • the present invention adopts the RBFNN structure to realize the construction of the probability distribution function of the pedestrian sequence.
  • the RBFNN is a three-layer feedforward neural network, and its structure is shown in FIG. 12 . For the pedestrian sequence feature set Q of the query set and the pedestrian sequence feature set G of the candidate set obtained by S3, corresponding probability distribution functions, also called RBFNN probability function estimators, are constructed on the RBFNN structure to describe the corresponding pedestrian sequence feature distributions; this yields the RBFNN probability function estimator set Q RBF of the query set and the RBFNN probability function estimator set G RBF of the candidate set, as shown in formulas (8) and (9),
  • FIG. 13 shows the process of performing probability estimation on the image feature points of the query pedestrian sequence
  • the specific step S42 includes the following steps:
  • step S423 Repeat step S422 until the probability estimation of all pedestrian sequences in the candidate set G of the image feature sample point x k of the query pedestrian sequence feature ⁇ q ⁇ Q is completed.
  • the probability estimates of the image feature sample point x k of the query sequence feature ⁇ q ⁇ Q in different pedestrian sequence feature spaces can thus be obtained; these probability estimates describe how the query sequence feature ⁇ q ⁇ Q is distributed in each of those feature spaces.
  • FIG. 14 shows the process of measuring the similarity between the probability distribution functions of the query pedestrian sequence and the candidate pedestrian sequence
  • the specific step S43 includes the following steps:
  • S432 Repeat S431 until the similarity measurement of the probability distribution function constructed between the query pedestrian sequence feature ⁇ q ⁇ Q and all candidate pedestrian sequence features in the candidate set G is completed.
  • S44 Repeat S42 and S43 until the similarity measurement between all query pedestrian sequences and candidate pedestrian sequences is completed.
  • S45 Sort according to the similarity and find the candidate pedestrians that best match the query pedestrian sequence.
  • the divergence value of each query pedestrian sequence and all candidate pedestrian sequences is calculated by S44, and all candidate pedestrian sequences are sorted in ascending order according to the divergence value.
  • the recognition results of each query pedestrian sequence in the candidate set will be sorted according to the similarity.
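The ascending sort by divergence described above can be sketched as a one-liner; the dictionary keys standing in for candidate sequence identifiers are hypothetical.

```python
def rank_candidates(divergences):
    """Order candidate sequence ids by ascending KL divergence:
    the smallest divergence is the best-matching candidate."""
    return [cid for cid, div in sorted(divergences.items(), key=lambda kv: kv[1])]
```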
  • the mean average precision (mAP) and the top-k accuracy (Rank-k) are used as the performance indicators to measure the pedestrian re-identification algorithm.
  • mAP is the mean of the average precision (AP) over the pedestrian sequences in the query set, and describes how highly the results of the same category as the query sequence rank in the candidate-set query results.
  • Rank-k indicates the probability that the top k candidates in the candidate set contain the correct pedestrian after similarity sorting. The larger the value, the higher the accuracy rate.
  • Rank-1 and Rank-5 are often used as evaluation indicators, respectively indicating the probability of a hit at the first position and of a hit within the top 5 positions.
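The mAP and Rank-k metrics defined above can be sketched as follows; this assumes a single correct identity per query, which matches the description but simplifies the general multi-ground-truth case.

```python
def rank_k(ranked_labels, true_label, k):
    """1 if a correct match appears within the top-k results, else 0."""
    return int(true_label in ranked_labels[:k])

def average_precision(ranked_labels, true_label):
    """AP of a single query over its similarity-sorted candidate list."""
    hits, precisions = 0, []
    for i, lab in enumerate(ranked_labels, start=1):
        if lab == true_label:
            hits += 1
            precisions.append(hits / i)
    return sum(precisions) / hits if hits else 0.0

def mean_ap(ranked_lists, true_labels):
    """mAP: the mean of AP over all query sequences."""
    return sum(average_precision(r, t) for r, t in zip(ranked_lists, true_labels)) / len(true_labels)
```

Averaging `rank_k` over all queries gives the reported Rank-1 and Rank-5 rates.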
  • Figure 15 is the experimental results of the present invention, showing the mAP, Rank-1 and Rank-5 values of the present invention in the ATR pedestrian sequence data set, "Ours" represents the pedestrian re-identification method proposed by the present invention.
  • the pedestrian re-identification system includes: a data preparation module 1, a network processing module 2, a feature extraction module 3, a similarity measurement module 4, and a result output module 5;
  • the data preparation module 1 is used for data preparation, where the data includes the training data set and the test data set for the feature extraction network, the training data set and the test data set have different query sets and candidate sets, the query set includes the query pedestrian sequences, and the candidate set includes the candidate pedestrian sequences;
  • the network processing module 2 is used to utilize the training data set
  • the feature extraction network is pre-trained and fine-tuned;
  • the feature extraction module 3 is used to use the feature extraction network to perform feature extraction on the test data in the test data set, and perform dimensionality reduction on the extracted features to obtain image features;
  • the similarity measurement module 4 is used to measure the similarity of the image features of the test data to obtain the best matching candidate pedestrian; the result output module 5 is configured to output the best matching candidate pedestrian.
  • the electronic device includes: a memory 601, a processor 602, and a computer program stored in the memory 601 and operable on the processor 602; when the processor 602 executes the computer program, the pedestrian re-identification method described above is implemented.
  • the electronic device further includes: at least one input device 603 and at least one output device 604 .
  • the above-mentioned memory 601 , processor 602 , input device 603 and output device 604 are connected through a bus 605 .
  • the input device 603 may specifically be a camera, a touch panel, a physical button or a mouse, and the like.
  • the output device 604 may specifically be a display screen.
  • the memory 601 can be a high-speed random access memory (RAM, Random Access Memory) or a non-volatile memory, such as disk memory.
  • the memory 601 is used to store a set of executable program codes, and the processor 602 is coupled to the memory 601 .
  • the embodiment of the present application also provides a computer-readable storage medium, which can be set in the electronic device in each of the above-mentioned embodiments, and the computer-readable storage medium can be the aforementioned memory 601.
  • a computer program is stored on the computer-readable storage medium, and when the program is executed by the processor 602, the pedestrian re-identification method described in the foregoing embodiments is implemented.
  • the computer storage medium can also be any of various media that can store program codes, such as a U disk, mobile hard disk, read-only memory (ROM, Read-Only Memory), RAM, magnetic disk or optical disk.
  • the disclosed devices and methods can be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division into modules is only a division by logical function; in actual implementation there may be other division methods, for example multiple modules or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or modules may be in electrical, mechanical or other forms.
  • the modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical modules, that is, they may be located in one place, or may be distributed to multiple network modules. Part or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional module in each embodiment of the present invention may be integrated into one processing module, each module may exist separately physically, or two or more modules may be integrated into one module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or in the form of software function modules.
  • the integrated modules are realized in the form of software function modules and sold or used as independent products, they can be stored in a computer-readable storage medium.
  • the essence of the technical solution of the present invention, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for making a computer device (which may be a personal computer, a server, a network device, etc.) execute all or part of the steps of the method described in each embodiment of the present invention.
  • the aforementioned storage media include: U disk, mobile hard disk, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), magnetic disk, optical disc and other media that can store program codes.

Landscapes

  • Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The present invention discloses a pedestrian re-identification method, system, electronic device, and storage medium. The method includes: data preparation, where the data include a training dataset and a test dataset for a feature extraction network, the training dataset and the test dataset have different query sets and candidate sets, the query set includes query pedestrian sequences, and the candidate set includes candidate pedestrian sequences; pre-training and fine-tuning the feature extraction network with the training dataset; using the feature extraction network to extract features from the test data in the test dataset and reducing the dimensionality of the extracted features to obtain image features; measuring the similarity of the image features of the test data to obtain the best-matching candidate pedestrian; and outputting the best-matching candidate pedestrian. The present application can make full use of all image information in a sequence, avoid the problems caused by feature fusion, and achieve a significant performance improvement in pedestrian sequence re-identification tasks.

Description

Pedestrian Re-identification Method, System, Electronic Device, and Storage Medium. Technical Field
The present invention relates to the technical field of image processing, and in particular to a pedestrian re-identification method, system, electronic device, and storage medium.
Background Art
Pedestrian re-identification is a very important part of multi-camera intelligent video surveillance: a pedestrian who has already appeared in one field of view often needs to be recognized again in other fields of view, solving the problem of matching and associating pedestrians across cameras and scenes. Pedestrian re-identification is closely tied to the camera surveillance equipment. Because surveillance devices are deployed at different positions, the shooting conditions of different cameras differ to some extent; for example, the pose, angle, illumination, and occlusion of a pedestrian target vary from camera to camera, and these variations all affect re-identification results. Accuracy and efficiency are therefore very important goals in the field of pedestrian re-identification.
At present, pedestrian re-identification methods based on single-frame images have achieved fairly good results, but the information a single frame provides is very limited, and such methods place high demands on image quality: when the pedestrian in the image is occluded or undergoes illumination or pose changes, large re-identification errors result. In a real surveillance environment it is difficult to guarantee that the pedestrian images captured by the cameras are all clear and interference-free, so single-frame-based pedestrian re-identification has certain limitations.
In view of these problems, much work has begun to study pedestrian re-identification using pedestrian video sequences. Such methods perform re-identification at the level of whole pedestrian sequences, so how to make full use of the sequence information is a key part of the problem and one that has not yet been solved well. Compared with single-frame methods, video-sequence-based methods can obtain more pedestrian information by exploiting multiple frames. The main idea of most current methods is to first extract per-frame pedestrian features with a convolutional neural network (CNN) and then fuse those single-frame features into a sequence feature through max pooling, mean pooling, 3D convolution, recurrent neural networks (RNN), attention mechanisms, or similar methods; the fused feature represents the whole sequence, distances are measured on it, and the re-identification result is obtained. However, sequence features obtained by feature fusion easily ignore and lose the detail information of some images in the sequence and do not make full use of all image information, which affects the re-identification result.
Summary of the Invention
The main purpose of the present invention is to provide a pedestrian re-identification method, system, electronic device, and storage medium that can make full use of all image information in a sequence, avoid the problems caused by feature fusion, and achieve a significant performance improvement in pedestrian sequence re-identification tasks.
To achieve the above purpose, a first aspect of the present invention provides a pedestrian re-identification method, including: data preparation, where the data include a training dataset and a test dataset for a feature extraction network, the training dataset and the test dataset have different query sets and candidate sets, the query set includes query pedestrian sequences, and the candidate set includes candidate pedestrian sequences; pre-training and fine-tuning the feature extraction network with the training dataset; using the feature extraction network to extract features from the test data in the test dataset and reducing the dimensionality of the extracted features to obtain image features; measuring the similarity of the image features of the test data to obtain the best-matching candidate pedestrian; and outputting the best-matching candidate pedestrian.
Further, pre-training and fine-tuning the feature extraction network with the training dataset includes: constructing the feature extraction network, which is an MGN; determining the training parameters of the feature extraction network; preprocessing the training dataset; and feeding the preprocessed training data into the feature extraction network, computing loss values, back-propagating the loss values through the network, and updating the training parameters, thereby accomplishing the pre-training and fine-tuning of the network.
Further, measuring the similarity of the features of the test data to obtain the best-matching candidate pedestrian includes: for the pedestrian sequence represented by each piece of test data, constructing from all of its image features a probability distribution function that describes the feature space of the corresponding sequence; performing probability estimation on the image feature points of a query pedestrian sequence; measuring the similarity between the probability distribution functions of the query pedestrian sequence and the candidate pedestrian sequences; repeating the similarity-measurement steps until the similarity between every query pedestrian sequence and the candidate pedestrian sequences has been measured; and sorting by similarity to find the candidate pedestrian that best matches the query pedestrian sequence.
Further, performing probability estimation on the image feature points of a query pedestrian sequence includes: feeding the image feature sample points of the query sequence's features into the RBFNN probability function estimator constructed from that sequence and computing the corresponding probability estimates; feeding the image feature sample points of the query sequence's features into the RBFNN probability function estimator constructed from a candidate sequence's features and obtaining the corresponding probability estimates; and repeating the above steps until the image feature sample points of the query sequence's features have been probability-estimated against every pedestrian sequence in the candidate set.
Further, measuring the similarity between the probability distribution functions of the query pedestrian sequence and the candidate pedestrian sequences includes: computing the KL divergence between the probability estimates of all image feature sample points of the query sequence under its own feature distribution and their probability estimates under a candidate sequence's feature distribution, thereby completing the similarity measurement between the probability distribution function built from the query sequence's features and those built from all candidate sequences' features in the candidate set.
Further, preprocessing the training dataset includes: resizing the training images in the training dataset to m×n; and applying random horizontal flipping and cropping to the training images to complete the preprocessing.
Further, the feature extraction network includes a Resnet50 backbone and three distinct, mutually independent branches: a global branch, a coarse-grained branch, and a fine-grained branch. The global branch is used to extract global features, the coarse-grained branch to extract coarse-grained local features, and the fine-grained branch to extract fine-grained local features.
A second aspect of the present application provides a pedestrian re-identification system, including: a data preparation module for data preparation, where the data include a training dataset and a test dataset for a feature extraction network, the training dataset and the test dataset have different query sets and candidate sets, the query set includes query pedestrian sequences, and the candidate set includes candidate pedestrian sequences; a network processing module for pre-training and fine-tuning the feature extraction network with the training dataset; a feature extraction module for using the feature extraction network to extract features from the test data in the test dataset and reducing the dimensionality of the extracted features to obtain image features; a similarity measurement module for measuring the similarity of the image features of the test data to obtain the best-matching candidate pedestrian; and a result output module for outputting the best-matching candidate pedestrian.
A third aspect of the present application provides an electronic device, including a memory and a processor, the memory storing a computer program executable on the processor, wherein the processor, when executing the computer program, implements any one of the pedestrian re-identification methods described above.
A fourth aspect of the present application provides a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements any one of the pedestrian re-identification methods described above.
The pedestrian re-identification method, system, electronic device, and storage medium provided by the present application have the beneficial effect that a similarity measure between sequence feature distributions replaces the measure between two sets of sequence feature points; this makes full use of the information of every image in a pedestrian sequence while avoiding the loss of image detail information caused by feature fusion.
Brief Description of the Drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of the pedestrian re-identification method provided by an embodiment of the present invention;
Fig. 2 shows the Market1501 and DukeMTMC-reID datasets used for training in an embodiment of the present invention;
Fig. 3 shows the ATR pedestrian sequence dataset, produced by our laboratory, used for fine-tuning and testing in an embodiment of the present invention;
Fig. 4 is a training flowchart of the MGN used for image feature extraction;
Fig. 5 is a schematic diagram of the MGN structure;
Fig. 6 shows, from front to back, the feature point distribution of a query sequence, the constructed probability distribution function, and the probability distribution of the feature points;
Fig. 7 shows, from front to back, the feature point distribution of a candidate sequence, the constructed probability distribution function, and the probability distribution of the feature points;
Fig. 8 shows, from front to back, the feature point distributions of a query sequence and a candidate sequence, a comparison of their constructed probability distribution functions, and a comparison of the probability distributions of the common sample points in the two feature spaces;
Fig. 9 shows, from front to back, the probability distribution contour map of a query sequence, that of a candidate sequence, and a comparison of the two;
Fig. 10 is a flowchart of the pedestrian sequence feature distribution similarity measurement;
Fig. 11 is a framework diagram of the pedestrian sequence feature distribution similarity measurement;
Fig. 12 is a structure diagram of the RBFNN used to construct the pedestrian sequence probability distribution functions;
Fig. 13 is a flowchart of the probability estimation of the image feature points of a query pedestrian sequence;
Fig. 14 is a flowchart of measuring the similarity between the probability distribution functions of a query pedestrian sequence and a candidate pedestrian sequence;
Fig. 15 shows the experimental results of the present invention, including mAP, Rank-1, and Rank-5 values;
Fig. 16 is a block diagram of the system structure of the pedestrian re-identification method provided by the present invention;
Fig. 17 is a schematic block diagram of the structure of the electronic device of the present invention.
Detailed Description
To make the purpose, features, and advantages of the present invention clearer and easier to understand, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art without creative effort based on the embodiments of the present invention fall within the protection scope of the present invention.
Aiming at the limited information of single-frame images and the loss of image detail information caused by feature fusion in current pedestrian sequence re-identification methods, the present invention proposes a pedestrian re-identification method. Unlike current pedestrian sequence re-identification methods, its main idea is to construct a probability distribution function for each pedestrian sequence and measure the similarity of the sequences' feature distributions to obtain the re-identification result. After the feature extraction network obtains all image features in a pedestrian sequence, each sequence has a corresponding set of feature points. For any query sequence and any candidate sequence in the test samples, the method constructs from the sequence's set of image feature points a corresponding probability distribution function, which describes the feature distribution of the pedestrian sequence and is used to represent it.
When the similarity between a query sequence and a candidate sequence needs to be measured, it suffices to measure the similarity of the feature distributions of the two sequences. Taking the feature point set of the query sequence as a common point set, the probability distributions of this common set in the feature spaces of the query sequence and of the candidate sequence are computed separately, and the degree to which these two distributions match is compared; this completes the feature distribution similarity measurement of the two pedestrian sequences, and the candidate with the greatest similarity is chosen as the re-identification result. This method replaces the point-to-point measure between two sequences' feature points commonly used in current pedestrian sequence re-identification with a similarity measure between sequence feature distributions, making full use of the information of every image in a pedestrian sequence and avoiding the problems brought by feature fusion.
Fig. 1 shows the flow of the pedestrian re-identification method provided by an embodiment of the present invention, which specifically includes S1-S5, where:
S1: Data preparation.
The data include a training dataset and a test dataset for the feature extraction network. The training dataset and the test dataset have different query sets and candidate sets; the query set includes query pedestrian sequences, and the candidate set includes candidate pedestrian sequences.
In this embodiment, the data include the Market1501 and DukeMTMC-reID datasets as well as the ATR pedestrian sequence dataset produced by our laboratory.
Fig. 2 shows some images from the Market1501 and DukeMTMC-reID datasets, and Fig. 3 shows the ATR pedestrian sequence dataset produced by our laboratory. The present invention uses these datasets for pre-training and fine-tuning, and the final tests are run on the ATR pedestrian sequence dataset. The Market1501 dataset contains 1501 pedestrians; its training set contains 751 pedestrians with 12936 pedestrian images in total, and its test set contains 750 pedestrians with 19732 images in total, of which the query set has 3368 pedestrian images and the candidate set has 16364 pedestrian images. The DukeMTMC-reID dataset contains 1404 pedestrians; its training set contains 702 pedestrians with 16522 images in total, and its test set contains 702 pedestrians with 19889 pedestrian images in total, of which the query set has 2228 images and the candidate set has 17661 images. The ATR pedestrian sequence dataset, produced by our laboratory, contains 300 pedestrians and 754 pedestrian sequences with 152951 pedestrian images in total; its query set has 241 pedestrian sequences with 52483 images in total, averaging 217 images per sequence, and its candidate set has 513 pedestrian sequences with 100468 images in total, averaging 195 images per sequence.
S2: Pre-train and fine-tune the feature extraction network with the training dataset.
Further, the MGN training flow is shown in Fig. 4, so S2 includes the following steps:
S21: Construct the feature extraction network, which is an MGN.
Fig. 5 shows a schematic of the MGN structure, which consists of a Resnet50 backbone and three distinct, mutually independent branches: a global branch, a coarse-grained branch, and a fine-grained branch. The global branch extracts the global feature, the coarse-grained branch extracts coarse-grained local features, and the fine-grained branch extracts fine-grained local features. In addition, the coarse-grained and fine-grained branches each also produce a corresponding global feature. Finally, the above local features are concatenated with the global features as the final pedestrian feature representation f=(x_1, x_2, ..., x_2048).
S22: Determine the training parameters of the feature extraction network.
The MGN is trained and fine-tuned on the Market1501, DukeMTMC-reID, and ATR pedestrian sequence datasets. For each training step, P×K pedestrian images are selected from the training set to form one batch fed into the network, where each batch contains P different pedestrians and K randomly selected images per pedestrian. ADAM is used as the optimizer for parameter training, an L2 regularization decay factor β and an initial learning rate lr are set, and training runs for M steps in total, during which the learning rate decays with the step count.
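The P×K batch construction just described can be sketched in a few lines. This is a minimal pure-Python illustration, not the actual training code of the patent; the identity labels and image names are hypothetical stand-ins for the real training data:

```python
import random

def make_pk_batch(images_by_identity, P, K, rng=random):
    """Sample one training batch: P distinct identities, K images per identity."""
    identities = rng.sample(sorted(images_by_identity), P)
    batch = []
    for pid in identities:
        imgs = images_by_identity[pid]
        # Fall back to sampling with replacement if an identity has fewer than K images.
        chosen = rng.sample(imgs, K) if len(imgs) >= K else [rng.choice(imgs) for _ in range(K)]
        batch.extend((pid, img) for img in chosen)
    return batch

# Toy data: 4 hypothetical identities with 5 image names each.
data = {i: ["id%d_img%d" % (i, j) for j in range(5)] for i in range(4)}
batch = make_pk_batch(data, P=2, K=3)
print(len(batch))  # P*K = 6 samples
```

In the real pipeline each sampled image would be preprocessed and fed to the MGN, with the ADAM update and learning-rate decay handled by the training framework.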
S23: Preprocessing of the training data.
Before the training data are fed into the MGN, they must be preprocessed. First, the training images are resized to m×n; then the training data are augmented with two methods, random horizontal flipping and cropping, to prevent the network from overfitting during training. Horizontal flipping flips an image 180 degrees along the horizontal direction, similar to folding it against a mirror. Cropping selects a random region of the image; since the cropped region is generally smaller than the network's input scale, the cropped image is resized back to m×n.
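The two augmentations can be illustrated on a toy 2-D "image" stored as a list of pixel rows. This is a pure-Python sketch; real code would operate on decoded image tensors, and the crop would be resized back to m×n with an image library:

```python
import random

def hflip(img):
    """Horizontal flip: mirror each pixel row."""
    return [list(reversed(row)) for row in img]

def random_crop(img, ch, cw, rng=random):
    """Cut a random ch x cw region out of the image."""
    h, w = len(img), len(img[0])
    top = rng.randint(0, h - ch)
    left = rng.randint(0, w - cw)
    return [row[left:left + cw] for row in img[top:top + ch]]

img = [[r * 10 + c for c in range(6)] for r in range(4)]  # 4x6 toy image
flipped = hflip(img)
crop = random_crop(img, 2, 3)
print(flipped[0])               # first row mirrored: [5, 4, 3, 2, 1, 0]
print(len(crop), len(crop[0]))  # 2 3
```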
S24: Feed the preprocessed training data into the feature extraction network, compute the loss values, back-propagate them through the network, and update the training parameters, thereby accomplishing the pre-training and fine-tuning of the network.
The data obtained in S23 are fed into the MGN. The global features it outputs are used to compute the Softmax cross-entropy loss and the triplet loss, while the local features output by the coarse-grained and fine-grained branches are used to compute the Softmax cross-entropy loss. These losses are minimized and back-propagated to update the network parameters, repeating until the number of training steps is reached and the training and fine-tuning are complete. The Softmax cross-entropy loss function and the triplet loss function are as shown in Equations (1) and (2),

L_softmax = -Σ_{i=1}^{P} Σ_{a=1}^{K} p_{i,a} log q_{i,a}        (1)

L_triplet = Σ_{i=1}^{P} Σ_{a=1}^{K} [ α + max_{p=1..K} ||f_a^(i) - f_p^(i)||_2 - min_{j≠i, n=1..K} ||f_a^(i) - f_n^(j)||_2 ]_+        (2)

where P and K denote the number of pedestrian identities in a training batch and the number of images selected per identity, α denotes the tolerance (margin) between the distance to the farthest same-identity sample and the distance to the nearest different-identity sample in the batch, f_a^(i) denotes the feature of the a-th image of identity i, f_p^(i) denotes the feature of the p-th image with the same identity as f_a^(i), f_n^(j) denotes the feature of the n-th image with a different identity from f_a^(i), p_{i,a} denotes the pedestrian identity of the sample feature, and q_{i,a} denotes the network's prediction for the sample feature.
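The two losses can be checked numerically on a toy batch. Below is a hedged pure-Python sketch of the standard batch-hard triplet loss and a per-sample softmax cross-entropy consistent with the symbol definitions above; the feature values are invented for illustration:

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def batch_hard_triplet_loss(features, labels, margin):
    """Per anchor: hardest positive distance minus hardest negative distance, hinged at the margin."""
    total = 0.0
    for i, (fa, la) in enumerate(zip(features, labels)):
        pos = max(euclidean(fa, f) for j, (f, l) in enumerate(zip(features, labels))
                  if l == la and j != i)
        neg = min(euclidean(fa, f) for f, l in zip(features, labels) if l != la)
        total += max(margin + pos - neg, 0.0)
    return total

def softmax_ce(logits, label):
    """Cross-entropy of softmax(logits) against the true class, for a single sample."""
    m = max(logits)
    z = sum(math.exp(v - m) for v in logits)
    return -(logits[label] - m - math.log(z))

feats = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 1.0]]  # P=2 identities, K=2 images each
labels = [0, 0, 1, 1]
loss_small = batch_hard_triplet_loss(feats, labels, margin=0.3)  # well separated, hinge inactive
loss_large = batch_hard_triplet_loss(feats, labels, margin=1.5)  # margin violated, positive loss
print(loss_small, round(loss_large, 4))
```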
S3: Use the feature extraction network to extract features from the test data in the test dataset and reduce the dimensionality of the extracted features to obtain image features.
After the MGN is pre-trained and fine-tuned in S2, features are extracted from the pedestrian sequences of the test set. The MGN's output feature for a single pedestrian image, f=(x_1, x_2, ..., x_d), has a high dimensionality, and the sequence feature of a pedestrian sequence containing k images can be expressed as ω={f_1, f_2, ..., f_k}. Since high-dimensional features impose a huge computational burden on the algorithm, PCA is applied to all obtained features, with the retained energy set to α, yielding the reduced feature f'=(x_1, x_2, ..., x_d'), where d'<d; the feature of a pedestrian sequence with k images can then be expressed as ω={f'_1, f'_2, ..., f'_k}. The pedestrian sequence feature set Q of the query set and the pedestrian sequence feature set G of the candidate set are thus obtained, as shown in Equations (3) and (4),
Q={ω_q1, ω_q2, ..., ω_qi, ..., ω_qm}         (3)
G={ω_g1, ω_g2, ..., ω_gi, ..., ω_gn}         (4)
where ω_qi denotes the feature of the i-th pedestrian sequence in the query set, m denotes the number of pedestrian sequences in the query set, ω_gi denotes the feature of the i-th pedestrian sequence in the candidate set, and n denotes the number of pedestrian sequences in the candidate set.
S4: Measure the similarity of the image features of the test data to obtain the best-matching candidate pedestrian.
Because the feature fusion used in current pedestrian sequence re-identification methods has limitations, the present invention does not represent a pedestrian sequence by a fused sequence feature. Instead, it constructs from all image features of the sequence a probability distribution function describing the sequence's feature space, as shown in Figs. 6 and 7, and obtains the re-identification result by measuring the similarity of the sequences' feature distributions, as shown in Figs. 8 and 9. Further, Fig. 10 shows the flowchart of the pedestrian sequence feature distribution similarity measurement, and Fig. 11 shows its framework structure. Step S4 includes the following steps:
S41: For the pedestrian sequence represented by each piece of test data, construct from all of its image features a probability distribution function describing the feature space of the corresponding sequence.
The present invention builds a Gaussian function centered on each image feature sample of a pedestrian sequence and uses these Gaussian functions to fit the sequence's probability distribution function; the Gaussian function is expressed as in Equation (5). The constructed probability distribution function can estimate the probability of any feature sample, thereby describing the probability distribution of that sample in the sequence's feature space. When an input feature sample is close to a center, the Gaussian function shows high activity; conversely, when the input sample is far from the center, the Gaussian function shows low activity, and its activity decreases further as the distance grows. For any query sequence feature ω_q and candidate sequence feature ω_g, the constructed probability distribution functions are as shown in Equations (6) and (7),

G(x, x_i) = exp( -||x - x_i||^2 / (2σ^2) )        (5)

p(x|ω_q) = (1/N_q) Σ_{x_i∈ω_q} G(x, x_i)        (6)

p(x|ω_g) = (1/N_g) Σ_{x_i∈ω_g} G(x, x_i)        (7)

where x denotes the input feature sample, x_i denotes a feature sample serving as a Gaussian center, σ denotes the kernel width of the Gaussian function (an optimal value is chosen by experiment), and N_q and N_g denote the number of image features in the query sequence feature ω_q and in the candidate sequence feature ω_g, respectively.
The present invention uses an RBFNN structure to construct the pedestrian sequence probability distribution functions; an RBFNN is a three-layer feed-forward neural network whose structure is shown in Fig. 12. Accordingly, for the query-set sequence feature set Q and the candidate-set sequence feature set G obtained in S3, a corresponding probability distribution function (also called an RBFNN probability function estimator) is constructed on the RBFNN structure to describe each sequence's feature distribution, yielding the query set's estimator collection Q_RBF and the candidate set's estimator collection G_RBF, as shown in Equations (8) and (9),

Q_RBF = { p(·|ω_q1), p(·|ω_q2), ..., p(·|ω_qm) }        (8)

G_RBF = { p(·|ω_g1), p(·|ω_g2), ..., p(·|ω_gn) }        (9)

where p(·|ω_qi) and p(·|ω_gj) denote the RBFNN probability function estimators constructed from the i-th sequence feature of the query set and the j-th sequence feature of the candidate set, respectively.
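The kernel-density construction of Equations (5)-(7) can be sketched directly; the Gaussian-sum estimator below plays the role of the RBFNN probability function estimator. Toy 2-D features and an illustrative σ are used, not real MGN features:

```python
import math

def make_estimator(sequence_features, sigma):
    """Probability estimator of Eqs. (5)-(7): the mean of Gaussian kernels,
    one centered on each image feature of the sequence."""
    n = len(sequence_features)
    def p(x):
        return sum(
            math.exp(-sum((a - b) ** 2 for a, b in zip(x, c)) / (2 * sigma ** 2))
            for c in sequence_features
        ) / n
    return p

seq = [[0.0, 0.0], [0.2, 0.1], [0.1, -0.1]]  # toy image features of one sequence
p_q = make_estimator(seq, sigma=0.5)
print(p_q([0.1, 0.0]) > p_q([3.0, 3.0]))  # True: near the sequence means higher probability
```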
S42: Perform probability estimation on the image feature points of a query pedestrian sequence.
Further, Fig. 13 shows the flow of the probability estimation of the image feature points of a query pedestrian sequence; specifically, step S42 includes the following steps:
S421: Feed the image feature sample points x_k of a query sequence feature ω_q∈Q into the RBFNN probability function estimator constructed from that sequence and compute the corresponding probability estimates P_q, defined in Equation (10).
P_q = { p(x_k|ω_q) | x_k∈ω_q }       (10)
S422: Likewise, feed the image feature sample points x_k of the query sequence feature ω_q∈Q into the RBFNN probability function estimator constructed from a candidate sequence feature ω_g∈G and obtain the corresponding probability estimates P_g, defined in Equation (11).
P_g = { p(x_k|ω_g) | x_k∈ω_q }      (11)
S423: Repeat step S422 until the image feature sample points x_k of the query sequence feature ω_q∈Q have been probability-estimated against every pedestrian sequence in the candidate set G.
After the above computation, the probability estimates of the query sequence's image feature sample points x_k in the feature spaces of the different pedestrian sequences are obtained; these estimates describe the distribution of the query sequence's sample points x_k in its own feature space and in the feature spaces of all candidate pedestrian sequences.
S43: Measure the similarity between the probability distribution functions of the query pedestrian sequence and a candidate pedestrian sequence.
Further, Fig. 14 shows the flow of measuring the similarity between the probability distribution functions of a query pedestrian sequence and a candidate pedestrian sequence; specifically, step S43 includes the following steps:
S431: Compute the KL divergence between the probability estimates P_q of the query sequence's sample points x_k under its own feature distribution and their probability estimates P_g under the feature distribution of the candidate sequence ω_g∈G, as shown in Equation (12),

D_KL(P_q ∥ P_g) = Σ_{x_k∈ω_q} p(x_k|ω_q) log( p(x_k|ω_q) / p(x_k|ω_g) )        (12)

S432: Repeat S431 until the similarity between the probability distribution function of the query sequence feature ω_q∈Q and those constructed from all candidate sequence features in the candidate set G has been measured.
S44: Repeat S42 and S43 until the similarity between every query pedestrian sequence and the candidate pedestrian sequences has been measured.
S45: Sort by similarity and find the candidate pedestrian that best matches the query pedestrian sequence.
S44 yields the divergence values between each query sequence and all candidate sequences. All candidate sequences are sorted in ascending order of divergence: the smaller the divergence, the more similar the two feature distributions; the larger the divergence, the greater the difference between them. The candidate pedestrian corresponding to the minimum divergence is taken as the re-identification result, as in Equation (13),

g* = argmin_{ω_g∈G} D_KL(P_q ∥ P_g)        (13)
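Putting the matching steps of S41-S45 together, the whole pipeline can be sketched as follows. This is a pure-Python sketch with toy 2-D features; a real run would use the PCA-reduced MGN features and an experimentally chosen σ:

```python
import math

def gauss_kde(points, sigma):
    """Sequence probability estimator in the style of Eqs. (6)/(7)."""
    def p(x):
        return sum(math.exp(-sum((a - b) ** 2 for a, b in zip(x, c)) / (2 * sigma ** 2))
                   for c in points) / len(points)
    return p

def kl_score(query_points, p_q, p_g, eps=1e-12):
    """Eq. (12): KL divergence of the query points' probabilities under the two estimators."""
    return sum(p_q(x) * math.log((p_q(x) + eps) / (p_g(x) + eps)) for x in query_points)

def best_match(query, gallery, sigma=0.5):
    """Eq. (13): index of the candidate sequence with the smallest divergence."""
    p_q = gauss_kde(query, sigma)
    scores = [kl_score(query, p_q, gauss_kde(g, sigma)) for g in gallery]
    return min(range(len(scores)), key=scores.__getitem__)

query = [[0.0, 0.0], [0.1, 0.1], [-0.1, 0.0]]
gallery = [
    [[2.0, 2.0], [2.1, 2.0]],    # far from the query distribution
    [[0.05, 0.0], [0.0, 0.1]],   # close to the query distribution
]
print(best_match(query, gallery))  # 1
```

Sorting all candidates by `kl_score` in ascending order gives the full ranked list used for evaluation in S5.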
S5: Output the result.
The recognition results of each query pedestrian sequence over the candidate set are sorted by similarity. Mean average precision (mAP) and top-k accuracy (Rank-k) are used as performance metrics for the pedestrian re-identification algorithm.
mAP is the mean of the average precision (AP) of the pedestrian sequences in the query set; it describes, among the query results over the candidate set, the ranking share of results with the same identity as the query sequence. The higher the mAP, the better the algorithm performs. In the ATR pedestrian sequence dataset, the query set Q has 241 pedestrian sequences and the candidate set G has 513 pedestrian sequences in total. Suppose that after the similarity between a query sequence q_i and all candidate sequences is computed, the sorted candidate set is denoted G_sort(q_i) and the set of hit (same-identity) pedestrian sequences is denoted H(q_i), as in Equations (14) and (15); the AP of the query sequence q_i can then be computed as in Equation (16), and averaging the AP over all query sequences in the query set gives the mAP, as in Equation (17),

G_sort(q_i) = { g_(1), g_(2), ..., g_(n) }        (14)

H(q_i) = { h_1, h_2, ..., h_{N_i} }        (15)

AP(q_i) = (1/N_i) Σ_{j=1}^{N_i} ( j / r_j )        (16)

mAP = (1/m) Σ_{i=1}^{m} AP(q_i)        (17)

where N_i denotes the number of sequences in the candidate set with the same pedestrian identity as the query sequence q_i, r_j denotes the rank of the j-th hit candidate sequence in G_sort(q_i), and j denotes its rank within H(q_i).
Rank-k denotes the probability that the top k entries of the similarity-sorted candidate set contain the correct pedestrian; the larger this value, the higher the accuracy. Rank-1 and Rank-5 are commonly used as evaluation metrics, denoting the probability of a first-position hit and of a hit within the top 5, respectively.
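The two metrics are easy to compute from a ranked candidate list. Below is a minimal sketch consistent with Equations (16) and (17); the identity labels are invented for illustration, and mAP is then just the mean of AP over all queries:

```python
def average_precision(ranked_labels, query_label):
    """Eq. (16): AP of one query over the similarity-ranked candidate list."""
    hits, precisions = 0, []
    for rank, label in enumerate(ranked_labels, start=1):
        if label == query_label:
            hits += 1
            precisions.append(hits / rank)  # j / r_j
    return sum(precisions) / hits if hits else 0.0

def rank_k(ranked_labels, query_label, k):
    """Rank-k: 1 if a correct candidate appears within the top k, else 0."""
    return int(query_label in ranked_labels[:k])

ranked = ["B", "A", "C", "A", "A"]  # candidate identities sorted by similarity
ap = average_precision(ranked, "A")
print(ap)  # (1/2 + 2/4 + 3/5) / 3
print(rank_k(ranked, "A", 1), rank_k(ranked, "A", 5))  # 0 1
```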
Fig. 15 shows the experimental results of the present invention, namely the mAP, Rank-1, and Rank-5 values on the ATR pedestrian sequence dataset, where "Ours" denotes the pedestrian re-identification method proposed by the present invention.
Fig. 16 shows a block diagram of the system structure of the pedestrian re-identification method provided by the present invention; for ease of explanation, only the parts relevant to the present invention are shown. Specifically, the pedestrian re-identification system includes: a data preparation module 1, a network processing module 2, a feature extraction module 3, a similarity measurement module 4, and a result output module 5. The data preparation module 1 is used for data preparation, where the data include a training dataset and a test dataset for the feature extraction network, the training dataset and the test dataset have different query sets and candidate sets, the query set includes query pedestrian sequences, and the candidate set includes candidate pedestrian sequences; the network processing module 2 is used to pre-train and fine-tune the feature extraction network with the training dataset; the feature extraction module 3 is used to extract features from the test data in the test dataset with the feature extraction network and reduce the dimensionality of the extracted features to obtain image features; the similarity measurement module 4 is used to measure the similarity of the image features of the test data to obtain the best-matching candidate pedestrian; and the result output module 5 is used to output the best-matching candidate pedestrian.
Aiming at the limited information of single-frame images and the loss of image detail information caused by feature fusion in current pedestrian sequence re-identification methods, the present invention proposes a pedestrian re-identification method. Unlike current pedestrian sequence re-identification methods, its main idea is to construct a probability distribution function for each pedestrian sequence and measure the similarity of the sequences' feature distributions to obtain the re-identification result. After the feature extraction network obtains all image features in a pedestrian sequence, each sequence has a corresponding set of feature points. For any query sequence and any candidate sequence in the test samples, the method constructs from the sequence's set of image feature points a corresponding probability distribution function, which describes the feature distribution of the pedestrian sequence and is used to represent it.
When the similarity between a query sequence and a candidate sequence needs to be measured, it suffices to measure the similarity of the feature distributions of the two sequences. Taking the feature point set of the query sequence as a common point set, the probability distributions of this common set in the feature spaces of the query sequence and of the candidate sequence are computed separately, and the degree to which these two distributions match is compared; this completes the feature distribution similarity measurement of the two pedestrian sequences, and the candidate with the greatest similarity is chosen as the re-identification result. This method replaces the point-to-point measure between two sequences' feature points commonly used in current pedestrian sequence re-identification with a similarity measure between sequence feature distributions, making full use of the information of every image in a pedestrian sequence and avoiding the problems brought by feature fusion.
An embodiment of the present application provides an electronic device; referring to Fig. 17, the electronic device includes: a memory 601, a processor 602, and a computer program stored in the memory 601 and executable on the processor 602. When the processor 602 executes the computer program, the pedestrian re-identification method described above is implemented.
Further, the electronic device also includes: at least one input device 603 and at least one output device 604.
The above memory 601, processor 602, input device 603, and output device 604 are connected through a bus 605.
The input device 603 may specifically be a camera, a touch panel, a physical button, a mouse, or the like. The output device 604 may specifically be a display screen.
The memory 601 may be high-speed random access memory (RAM) or non-volatile memory, such as disk storage. The memory 601 is used to store a set of executable program code, and the processor 602 is coupled to the memory 601.
Further, an embodiment of the present application also provides a computer-readable storage medium, which may be provided in the electronic device of each of the above embodiments and may be the memory 601 described above. A computer program is stored on the computer-readable storage medium, and when executed by the processor 602 the program implements the pedestrian re-identification method described in the foregoing embodiments.
Further, the computer-readable storage medium may also be a USB flash drive, a removable hard disk, read-only memory (ROM), RAM, a magnetic disk, an optical disc, or any other medium that can store program code.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; the division of the modules is only a division by logical function, and there may be other divisions in actual implementation: multiple modules or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be implemented through certain interfaces, and the indirect coupling or communication connection between apparatuses or modules may be electrical, mechanical, or take other forms.
The modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical modules; that is, they may be located in one place or distributed across multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional module in each embodiment of the present invention may be integrated into one processing module, each module may exist physically on its own, or two or more modules may be integrated into one module. The above integrated modules may be implemented in the form of hardware or in the form of software function modules.
If the integrated modules are implemented in the form of software function modules and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the essence of the technical solution of the present invention, the part that contributes to the prior art, or all or part of the technical solution may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the method described in each embodiment of the present invention. The aforementioned storage media include media that can store program code, such as USB flash drives, removable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, and optical discs.
It should be noted that, for ease of description, each of the foregoing method embodiments is expressed as a series of action combinations; however, those skilled in the art should know that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in other orders or simultaneously. Furthermore, those skilled in the art should also know that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily all required by the present invention.
In the above embodiments, the description of each embodiment has its own emphasis; for parts not detailed in one embodiment, reference may be made to the relevant descriptions of other embodiments.
The above is a description of the pedestrian re-identification method, system, electronic device, and storage medium provided by the present invention. For those skilled in the art, according to the ideas of the embodiments of the present invention, there will be changes in specific implementations and scope of application. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (10)

  1. A pedestrian re-identification method, characterized by comprising:
    data preparation, wherein the data comprise a training dataset and a test dataset for a feature extraction network, the training dataset and the test dataset have different query sets and candidate sets, the query set comprises query pedestrian sequences, and the candidate set comprises candidate pedestrian sequences;
    pre-training and fine-tuning the feature extraction network with the training dataset;
    using the feature extraction network to extract features from the test data in the test dataset, and reducing the dimensionality of the extracted features to obtain image features;
    measuring the similarity of the image features of the test data to obtain the best-matching candidate pedestrian;
    outputting the best-matching candidate pedestrian.
  2. The pedestrian re-identification method according to claim 1, characterized in that
    pre-training and fine-tuning the feature extraction network with the training dataset comprises:
    constructing the feature extraction network, the feature extraction network being an MGN;
    determining training parameters of the feature extraction network;
    preprocessing the training dataset;
    feeding the training data of the preprocessed training dataset into the feature extraction network, computing loss values, back-propagating the loss values through the feature extraction network, and updating the training parameters, thereby accomplishing the pre-training and fine-tuning of the feature extraction network.
  3. The pedestrian re-identification method according to claim 1, characterized in that
    measuring the similarity of the features of the test data to obtain the best-matching candidate pedestrian comprises:
    for the pedestrian sequence represented by each piece of test data, constructing from all of its image features a probability distribution function describing the feature space of the corresponding sequence;
    performing probability estimation on the image feature points of a query pedestrian sequence;
    measuring the similarity between the probability distribution functions of the query pedestrian sequence and the candidate pedestrian sequences;
    repeating the similarity-measurement steps until the similarity between every query pedestrian sequence and the candidate pedestrian sequences has been measured;
    sorting by similarity and finding the candidate pedestrian that best matches the query pedestrian sequence.
  4. The pedestrian re-identification method according to claim 3, characterized in that
    performing probability estimation on the image feature points of a query pedestrian sequence comprises:
    feeding the image feature sample points of the query pedestrian sequence's features into the RBFNN probability function estimator constructed from them and computing the corresponding probability estimates;
    feeding the image feature sample points of the query pedestrian sequence's features into the RBFNN probability function estimator constructed from a candidate pedestrian sequence's features and obtaining the corresponding probability estimates;
    repeating the above steps until the image feature sample points of the query pedestrian sequence's features have been probability-estimated against every pedestrian sequence in the candidate set.
  5. The pedestrian re-identification method according to claim 3, characterized in that
    measuring the similarity between the probability distribution functions of the query pedestrian sequence and the candidate pedestrian sequences comprises: computing the KL divergence between the probability estimates of all image feature sample points of the query pedestrian sequence's features under their own feature distribution and their probability estimates under a candidate pedestrian sequence's feature distribution, thereby completing the similarity measurement between the probability distribution function constructed from the query pedestrian sequence's features and those constructed from all candidate pedestrian sequences' features in the candidate set.
  6. The pedestrian re-identification method according to claim 2, characterized in that
    preprocessing the training dataset comprises:
    resizing the training images in the training dataset to m×n;
    applying random horizontal flipping and cropping to the training images to complete the preprocessing.
  7. The pedestrian re-identification method according to claim 2, characterized in that
    the feature extraction network comprises a Resnet50 backbone and three distinct, mutually independent branches: a global branch, a coarse-grained branch, and a fine-grained branch, wherein the global branch is used to extract global features, the coarse-grained branch is used to extract coarse-grained local features, and the fine-grained branch is used to extract fine-grained local features.
  8. A pedestrian re-identification system, characterized by comprising:
    a data preparation module for data preparation, wherein the data comprise a training dataset and a test dataset for a feature extraction network, the training dataset and the test dataset have different query sets and candidate sets, the query set comprises query pedestrian sequences, and the candidate set comprises candidate pedestrian sequences;
    a network processing module for pre-training and fine-tuning the feature extraction network with the training dataset;
    a feature extraction module for using the feature extraction network to extract features from the test data in the test dataset and reducing the dimensionality of the extracted features to obtain image features;
    a similarity measurement module for measuring the similarity of the image features of the test data to obtain the best-matching candidate pedestrian;
    a result output module for outputting the best-matching candidate pedestrian.
  9. An electronic device, comprising a memory and a processor, the memory storing a computer program executable on the processor, characterized in that the processor, when executing the computer program, implements the pedestrian re-identification method according to any one of claims 1 to 7.
  10. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the pedestrian re-identification method according to any one of claims 1 to 7.
PCT/CN2021/093029 2021-05-11 2021-05-11 Pedestrian re-identification method, system, electronic device and storage medium WO2022236668A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/093029 WO2022236668A1 (zh) 2021-05-11 2021-05-11 Pedestrian re-identification method, system, electronic device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/093029 WO2022236668A1 (zh) 2021-05-11 2021-05-11 Pedestrian re-identification method, system, electronic device and storage medium

Publications (1)

Publication Number Publication Date
WO2022236668A1 true WO2022236668A1 (zh) 2022-11-17

Family

ID=84029186

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/093029 WO2022236668A1 (zh) 2021-05-11 2021-05-11 Pedestrian re-identification method, system, electronic device and storage medium

Country Status (1)

Country Link
WO (1) WO2022236668A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115830637A (zh) * 2022-12-13 2023-03-21 Hangzhou Dianzi University Occluded pedestrian re-identification method based on pose estimation and background suppression

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110457985A (zh) * 2019-06-05 2019-11-15 Shenzhen University Pedestrian re-identification method and apparatus based on video sequences, and computer device
CN111310633A (zh) * 2020-02-10 2020-06-19 Jiangnan University Video-based parallel spatio-temporal attention pedestrian re-identification method
US20200226421A1 (en) * 2019-01-15 2020-07-16 Naver Corporation Training and using a convolutional neural network for person re-identification

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200226421A1 (en) * 2019-01-15 2020-07-16 Naver Corporation Training and using a convolutional neural network for person re-identification
CN110457985A (zh) * 2019-06-05 2019-11-15 Shenzhen University Pedestrian re-identification method and apparatus based on video sequences, and computer device
CN111310633A (zh) * 2020-02-10 2020-06-19 Jiangnan University Video-based parallel spatio-temporal attention pedestrian re-identification method

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115830637A (zh) * 2022-12-13 2023-03-21 Hangzhou Dianzi University Occluded pedestrian re-identification method based on pose estimation and background suppression
CN115830637B (zh) * 2022-12-13 2023-06-23 Hangzhou Dianzi University Occluded pedestrian re-identification method based on pose estimation and background suppression
US11908222B1 (en) 2022-12-13 2024-02-20 Hangzhou Dianzi University Occluded pedestrian re-identification method based on pose estimation and background suppression

Similar Documents

Publication Publication Date Title
CN107577990B Large-scale face recognition method based on GPU-accelerated retrieval
CN113516012B Pedestrian re-identification method and system based on multi-level feature fusion
CN110033473B Moving-object tracking method based on template matching and a deep classification network
CN111339988B Video face recognition method based on a dynamic-margin loss function and probabilistic features
CN108960184A Pedestrian re-identification method based on a heterogeneous-parts deep neural network
CN112884742A Multi-target real-time detection, recognition, and tracking method based on multi-algorithm fusion
CN111507183B Crowd counting method based on multi-scale density-map fusion and dilated convolution
CN111898406A Face detection method based on focal loss and multi-task cascading
WO2023061102A1 Video behavior recognition method and apparatus, computer device, and storage medium
Liu et al. Iterative relaxed collaborative representation with adaptive weights learning for noise robust face hallucination
WO2022213857A1 Action recognition method and apparatus
Oikarinen et al. Graphmdn: Leveraging graph structure and deep learning to solve inverse problems
WO2024032010A1 Real-time few-shot object detection method based on a transfer learning strategy
Liu et al. Pose-adaptive hierarchical attention network for facial expression recognition
CN109919246A Pedestrian re-identification method based on adaptive feature clustering and multiple-loss fusion
CN112364747A Object detection method under limited samples
WO2022236668A1 Pedestrian re-identification method, system, electronic device and storage medium
Talker et al. Efficient sliding window computation for nn-based template matching
Gao et al. Structure-aware person search with self-attention and online instance aggregation matching
CN113536939B Video deduplication method based on 3D convolutional neural networks
CN115690545A Method and apparatus for training a target tracking model and for target tracking
CN111862156B Graph-matching-based multi-target tracking method and system
CN115830643B Pose-guided aligned lightweight pedestrian re-identification method
Blanchart et al. Cascaded active learning for object retrieval using multiscale coarse to fine analysis
CN116681742A Registration method for visible-light and infrared thermal images based on graph neural networks

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21941244

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 23-02-2024)