CN110858276A - Pedestrian re-identification method combining identification model and verification model
- Publication number: CN110858276A (application CN201810957969.6A)
- Authority: CN (China)
- Prior art keywords: pedestrian, pedestrians, picture, pictures, similarity
- Prior art date: 2018-08-22
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/30—Scenes; Scene-specific elements in albums, collections or shared content, e.g. social network photos or video
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- General Engineering & Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
The invention provides a pedestrian re-identification method combining an identification model and a verification model. The method comprises: extracting video frames from preset areas under cameras at different angles, and performing pedestrian detection and label annotation to obtain training data; preprocessing the training pictures so that each sample is a pair of pictures; using the trained deep learning network to extract features of the test picture and of the pictures in the search library, and selecting the output of the last convolutional layer as the abstract representation of each pedestrian picture; for each test picture, calculating the feature similarity between the test picture and the abstract features of all pedestrians in the image library, the result indicating how similar the test picture is to each pedestrian in the library; and, when the feature similarity meets a preset requirement, determining that the pedestrian in the test picture and the person with the maximum similarity in the picture library are the same person.
Description
Technical Field
The invention belongs to the technical field of pattern recognition, and particularly relates to a pedestrian re-identification method combining an identification model and a verification model.
Background
Pedestrian Re-Identification (Person Re-Identification) refers to matching pedestrians across a multi-camera network with non-overlapping views, i.e. determining whether the pedestrians observed by cameras at different positions and at different times are the same person. Specifically, a target pedestrian is used as the query, and pictures related to the query are found automatically, by their appearance features, in an existing pedestrian library collected from non-overlapping cameras. Pedestrian re-identification can also be understood as a video picture retrieval task, and the technology has important applications in video surveillance fields such as criminal investigation, pedestrian detection, multi-camera pedestrian tracking and behaviour analysis. However, because the same target is affected by factors such as viewing angle, illumination and occlusion under different cameras, the features of the same target often deviate across views, and the appearance features of different pedestrians may even be more similar than those of the same person, so pedestrian re-identification still faces considerable practical challenges.
Deep learning has become the method of choice for video and image processing tasks: instead of hand-crafted features, it learns features automatically from large data sets through multi-layer non-linear mappings, which has improved the capability of classification and recognition algorithms and opened a new chapter in intelligent video image analysis. For pedestrian re-identification, most prior approaches train a supervised classification (identification) network and then extract the abstract features of a certain layer to represent the pedestrian. Although such features are discriminative, they perform poorly when measuring similarity, because the intrinsic relation between images of the same person under different cameras is not considered. Other algorithms adopt a deep learning verification model to overcome the insufficient intra-class correlation: a pair of pictures is input each time and the label indicates whether they show the same person, but this ignores the multi-target discrimination capability of the classification (identification) formulation.
To solve the above problems, the present invention provides a pedestrian re-identification method based on the combination of an identification model and a verification model in a deep learning framework.
Disclosure of the Invention
In view of the above defects and improvement requirements of the prior art, the invention aims to provide a pedestrian re-identification method combining an identification model and a verification model, so as to improve the discriminability and representativeness of the features from multiple angles, better guarantee the accuracy and practicability of the similarity comparison, and improve the accuracy of the pedestrian re-identification technology.
In order to achieve this purpose, the invention adopts the following technical scheme:
a pedestrian re-identification method combining an identification model and a verification model is characterized in that: the cross entropy recognition loss function and paired picture two-classification verification loss function joint learning method of multiple pedestrians is added to a deep learning network structure, and the method comprises the following steps:
step 1: an acquisition module: and (4) pedestrian detection, namely extracting video frames in preset areas under cameras with different angles, and carrying out pedestrian detection and label marking to obtain training data.
Step 2: the training module is used for preprocessing the training data pictures, each sample is a pair of pictures, namely two pictures of the same pedestrian under different cameras and a pair of pictures of different pedestrians.
And step 3: an extraction module: and respectively using the trained deep learning network to extract the characteristics of the test picture and the pictures in the search library, and selecting the output of the last convolution layer as the abstract representation of the pedestrian picture.
And 4, step 4: a calculation module: for each test picture, calculating the feature similarity of the test picture with all the abstract features of the pedestrians in the image library, wherein the result represents the similarity degree of the test picture and the abstract features of the pedestrians.
And 5: a confirmation module: and when the similarity of the characteristics meets the preset requirement, determining that the pedestrian in the test picture and the person with the maximum similarity in the picture library are the same person.
The invention has the beneficial effects that:
1. The core deep learning feature extractor is trained by multi-task joint learning; the two tasks share the features of the convolutional neural network, which reduces model complexity while solving the identification and verification tasks simultaneously.
2. The model of the method has stronger generalization capability, and the extracted abstract features have stronger robustness.
3. The method can effectively handle the pedestrian re-identification problem of a monitoring system in real time.
Drawings
FIG. 1 is a block diagram of two deep learning approaches that are used in the present invention;
FIG. 2 is a block diagram of a multi-tasking convolutional neural network proposed by the present invention;
FIG. 3 is a flow chart of supervised training of the neural network architecture of the present invention;
FIG. 4 is an offline pedestrian identification flow diagram of the present invention;
fig. 5 is a flow chart of the on-line multi-camera real-time pedestrian re-identification of the present invention.
Detailed Description
On a deep learning network structure, the invention adds a joint learning scheme combining a multi-pedestrian cross-entropy identification loss function and a paired-picture binary-classification verification loss function, comprising the following steps:
Step 1, acquisition module: extract video frames from preset areas under cameras at different angles, and perform pedestrian detection and label annotation to obtain training data.
Step 2, training module: preprocess the training pictures so that each sample is a pair of pictures, either two pictures of the same pedestrian under different cameras or a pair of pictures of different pedestrians.
As shown in Fig. 1(a), the multi-target convolutional neural network identification structure can learn a large amount of discriminative information. As shown in Fig. 1(b), the verification network has a strong pairwise similarity learning capability.
Fig. 2 shows the multi-task deep convolutional network designed by the invention. A pair of pedestrian images, which may show the same pedestrian or different pedestrians, is first input into the network structure of the deep model. The model adopts a multi-task fusion optimization strategy that fuses the classification (identification) task and the binary verification task, reflected in the two components of the loss function: a multi-pedestrian cross-entropy identification loss and a binary-classification verification loss. With the support of a large amount of training data and a gradient-descent error back-propagation strategy, the deep convolutional neural network is fine-tuned so that the feature discrimination capability and the similarity measurement capability are improved at the same time, completing the training task.
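As an illustration of how these two loss components can be combined, the following minimal PyTorch sketch computes the joint objective for one input pair. The function name `joint_loss`, the weighting factor `w` and the tensor layout are assumptions made for illustration, not terms taken from the patent.

```python
# Minimal sketch of the two-component loss, assuming PyTorch tensors.
import torch.nn.functional as F

def joint_loss(id_logits1, id_logits2, verif_logits, id1, id2, same, w=1.0):
    """Identification cross-entropy on each picture + binary verification loss on the pair."""
    loss_id = F.cross_entropy(id_logits1, id1) + F.cross_entropy(id_logits2, id2)
    loss_verif = F.cross_entropy(verif_logits, same)   # 2-class: same person / different persons
    return loss_id + w * loss_verif                    # multi-task fusion of the two objectives
```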
Step 3, extraction module: use the deep learning network trained by the invention to extract features of the test picture and of the pictures in the search library respectively, and select the output of the last convolutional layer as the abstract representation of each pedestrian picture.
Step 4, calculation module: for each test picture, calculate the feature similarity between the test picture and the abstract features of all pedestrians in the image library; the result indicates how similar the test picture is to each pedestrian in the library.
Step 5, confirmation module: when the feature similarity meets the preset requirement, determine that the pedestrian in the test picture and the person with the maximum similarity in the picture library are the same person.
To make the objects, technical solutions and advantages of the invention clearer, a detailed and complete description is given below with reference to the accompanying drawings.
Referring to Fig. 3, the supervised training of the core deep convolutional neural network of the invention is described.
S301, data set construction: pedestrian detection can be performed by methods such as a deep-learning pedestrian detector, background subtraction, optical flow, inter-frame difference or statistical-learning-based methods. Pictures of the same pedestrian taken from different angles share the same label. From these labelled detections, a large number of positive and negative sample image pairs can be produced as network input (a positive pair consists of detections of the same person from two different cameras; a negative pair consists of detections of different persons under different cameras), for example as in the sketch below.
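The following sketch shows one possible way to draw such pairs. The data layout (`detections[label][camera] -> list of image paths`) and the function name are assumptions, and a positive pair requires that the person has been seen by at least two cameras.

```python
import random

def sample_pair(detections, positive):
    """detections: {person_label: {camera_id: [image_path, ...]}} (assumed layout)."""
    labels = list(detections)
    if positive:
        # positive pair: the same person seen by two different cameras (pair label 1)
        pid = random.choice([p for p in labels if len(detections[p]) >= 2])
        cam_a, cam_b = random.sample(list(detections[pid]), 2)
        return random.choice(detections[pid][cam_a]), random.choice(detections[pid][cam_b]), 1
    # negative pair: two different persons, each from one of their cameras (pair label 0)
    p1, p2 = random.sample(labels, 2)
    cam1 = random.choice(list(detections[p1]))
    cam2 = random.choice(list(detections[p2]))
    return random.choice(detections[p1][cam1]), random.choice(detections[p2][cam2]), 0
```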
S302, structure of the convolutional neural network: the main network adopted by the invention is the 50-layer deep residual network ResNet-50, which learns the latent discriminative features of pedestrian pictures through deep learning and avoids the interference of hand-crafted feature design; it mainly consists of a large number of convolutional layers, pooling layers and a fully connected layer. To meet the task requirement, the last fully connected layer is removed and a new convolutional layer is added, so that the network becomes a fully convolutional network. For the pair of input pictures, the extracted features jointly enter a new difference-square layer (similarity contrast layer), which is then connected to the verification loss function, as shown in Fig. 2.
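A minimal PyTorch sketch of such a structure is given below. The class name `JointReIDNet`, the use of torchvision's ResNet-50 and the 1x1 convolutions standing in for the new convolutional layer are illustrative assumptions rather than the exact network of the patent.

```python
import torch
import torch.nn as nn
from torchvision import models

class JointReIDNet(nn.Module):
    """ResNet-50 backbone shared by an identification head and a verification head."""
    def __init__(self, num_ids):
        super().__init__()
        resnet = models.resnet50()                                    # 50-layer deep residual network (pretrained weights may be loaded)
        self.backbone = nn.Sequential(*list(resnet.children())[:-1])  # drop the final fully connected layer
        self.id_head = nn.Conv2d(2048, num_ids, kernel_size=1)        # new conv layer: multi-pedestrian identity classifier
        self.verif_head = nn.Conv2d(2048, 2, kernel_size=1)           # binary same/different classifier

    def extract(self, x):
        return self.backbone(x)                                       # (B, 2048, 1, 1) abstract feature

    def forward(self, x1, x2):
        f1, f2 = self.extract(x1), self.extract(x2)
        id_logits1 = self.id_head(f1).flatten(1)                      # identification branch, picture 1
        id_logits2 = self.id_head(f2).flatten(1)                      # identification branch, picture 2
        diff_sq = (f1 - f2) ** 2                                      # difference-square (similarity contrast) layer
        verif_logits = self.verif_head(diff_sq).flatten(1)            # verification branch
        return id_logits1, id_logits2, verif_logits
```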
S303, model learning process: paired positive and negative samples are input into the network for forward propagation, and the model obtains the identification loss from the identity labels of the samples and the verification loss from whether the two samples show the same person. At a given learning rate, the weight updates are computed from the partial derivatives via gradient descent and the chain rule. With a large amount of labelled training data, the deep convolutional neural network is optimized and fine-tuned, and the multi-task learning guarantees the discriminability and representativeness of the features, until the model converges stably.
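Building on the `JointReIDNet` and `joint_loss` sketches above, one training step could look as follows; the optimiser settings, learning rate and number of identities are illustrative assumptions. In practice this step would be wrapped in an epoch loop over the sampled pairs.

```python
import torch

model = JointReIDNet(num_ids=751)           # number of training identities (example value)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

def train_step(x1, x2, id1, id2, same):
    model.train()
    id_logits1, id_logits2, verif_logits = model(x1, x2)              # forward propagation of the pair
    loss = joint_loss(id_logits1, id_logits2, verif_logits, id1, id2, same)
    optimizer.zero_grad()
    loss.backward()                                                    # error back-propagation via the chain rule
    optimizer.step()                                                   # gradient-descent weight update
    return loss.item()
```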
Referring to Fig. 4, the offline pedestrian re-identification flow of the invention is described.
S401, database picture construction: pedestrian detection can be performed by methods such as a deep-learning pedestrian detector, background subtraction, optical flow, inter-frame difference or statistical-learning-based methods. Pictures of the same pedestrian taken from different angles share the same label and together form the target pedestrian search library.
S402: in the actual test scene, pedestrian detection is performed on the video data obtained by the camera to obtain the pictures of the pedestrians to be identified.
S403: feature extraction is performed on the database pictures (only once) and on the picture to be tested using the trained deep learning model; the output of the last convolutional layer is retained as the feature vector specific to the target.
S404, forming a feature library: the extracted database features and the corresponding labels are stored; this can be understood as the feature dictionary of the pedestrian re-identification system. A sketch covering S403-S404 is given below.
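One possible implementation of S403-S404, reusing the `JointReIDNet` sketch above; the gallery layout (a list of (image tensor, label) pairs) and the L2 normalisation step are assumptions made so that the cosine similarity used later reduces to a dot product.

```python
import torch

@torch.no_grad()
def build_feature_library(model, gallery):
    """gallery: iterable of (image_tensor[C,H,W], label) pairs (assumed layout)."""
    model.eval()
    feats, labels = [], []
    for img, label in gallery:
        f = model.extract(img.unsqueeze(0)).flatten(1)   # last-conv-layer output as the feature vector
        feats.append(f / f.norm())                       # L2-normalise for cosine similarity
        labels.append(label)
    return torch.cat(feats), labels                      # the "feature dictionary": features + labels
```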
S405, calculating similarity: the similarity between every feature in the feature dictionary and the feature of the pedestrian to be identified can be calculated with various measures such as cosine distance or Euclidean distance; the smaller the distance, the more alike the two features and the more likely they belong to the same pedestrian target.
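The sketch below computes cosine distances between a query feature and every feature in the library built above (Euclidean distance could be substituted); the function name and tensor shapes are illustrative assumptions.

```python
import torch

def cosine_distances(query_feat, library_feats):
    """query_feat: (1, D); library_feats: (N, D), already L2-normalised."""
    q = query_feat / query_feat.norm()
    similarity = library_feats @ q.squeeze(0)   # cosine similarity with every stored feature
    return 1.0 - similarity                     # smaller distance = more alike = more likely the same pedestrian
```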
S406, decision: if the similarity reaches the preset value, the two pictures may show the same pedestrian. Since several features in the feature library may exceed the threshold, the decision can follow two strategies: 1. the labelled pedestrian corresponding to the library feature with the minimum distance is taken as the pedestrian to be identified; 2. all features exceeding the threshold vote by their labels to determine the identity of the pedestrian to be identified.
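The two strategies can be sketched as follows; the distance threshold of 0.3 and the helper names are assumptions for illustration only.

```python
from collections import Counter

def decide(distances, labels, threshold=0.3, strategy="nearest"):
    """distances: 1-D tensor of distances to the library; labels: matching label list."""
    if strategy == "nearest":
        d, idx = distances.min(dim=0)                    # strategy 1: nearest neighbour in the feature library
        return labels[int(idx)] if float(d) <= threshold else None
    # strategy 2: majority vote over all library features within the threshold
    votes = [labels[i] for i, d in enumerate(distances.tolist()) if d <= threshold]
    return Counter(votes).most_common(1)[0][0] if votes else None
```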
Referring to Fig. 5, the online multi-camera real-time pedestrian re-identification flow of the invention is described.
S501, S502: cameras A and B collect real-time video and perform pedestrian detection to obtain pedestrian pictures. Pedestrians appear first in camera A, and the goal of the system is to judge whether the pedestrians seen by camera B have already appeared in camera A.
S503: the trained deep learning model extracts the features of the pedestrian pictures from cameras A and B respectively, retaining the output of the last convolutional layer as the feature vector.
S504: every feature extracted from camera A is stored, providing a feature library for the pedestrian re-identification of camera B.
S505, calculating similarity: the similarity between every feature in the feature library and the feature of the pedestrian to be identified in camera B is calculated with measures such as cosine distance or Euclidean distance; the smaller the distance, the more alike the two features and the more likely they belong to the same pedestrian target.
S506, decision: if the similarity reaches the preset value, the two pictures may show the same pedestrian. Since several features in the feature library may exceed the threshold, the decision can follow two strategies: 1. the labelled pedestrian corresponding to the library feature with the minimum distance is taken as the pedestrian to be identified; 2. all features exceeding the threshold vote by their labels to determine the identity of the pedestrian to be identified. One possible organisation of this online flow is sketched below.
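Combining the earlier sketches, the online flow of S501-S506 could be organised as below; the class `OnlineMatcher` and its method names are assumptions, and `track_id` stands for whatever identifier camera A assigns to a newly seen pedestrian.

```python
import torch

class OnlineMatcher:
    """Accumulates camera-A features and matches each camera-B detection against them."""
    def __init__(self, model, threshold=0.3):
        self.model, self.threshold = model, threshold
        self.model.eval()
        self.feats, self.labels = [], []

    @torch.no_grad()
    def add_from_camera_a(self, img, track_id):
        f = self.model.extract(img.unsqueeze(0)).flatten(1)
        self.feats.append(f / f.norm())                       # grow the feature library (S504)
        self.labels.append(track_id)

    @torch.no_grad()
    def match_from_camera_b(self, img):
        if not self.feats:
            return None                                        # camera A has not seen anyone yet
        f = self.model.extract(img.unsqueeze(0)).flatten(1)
        dists = cosine_distances(f, torch.cat(self.feats))     # S505: similarity to every stored feature
        return decide(dists, self.labels, self.threshold)      # S506: nearest-neighbour strategy
```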
Claims (1)
1. A pedestrian re-identification method combining an identification model and a verification model, characterized in that a joint learning scheme combining a multi-pedestrian cross-entropy identification loss function and a paired-picture binary-classification verification loss function is added to a deep learning network structure, the method comprising the following steps:
step 1, acquisition module: extracting video frames from preset areas under cameras at different angles, and performing pedestrian detection and label annotation to obtain training data;
step 2, training module: preprocessing the training pictures so that each sample is a pair of pictures, which may be two pictures of the same pedestrian under different cameras or a pair of pictures of different pedestrians;
step 3, extraction module: using the trained deep learning network to extract features of the test picture and of the pictures in the search library respectively, and selecting the output of the last convolutional layer as the abstract representation of the pedestrian picture;
step 4, calculation module: for each test picture, calculating the feature similarity between the test picture and the abstract features of all pedestrians in the image library, the result representing how similar the test picture is to each pedestrian in the library;
step 5, confirmation module: when the feature similarity meets the preset requirement, determining that the pedestrian in the test picture and the person with the maximum similarity in the picture library are the same person.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810957969.6A CN110858276A (en) | 2018-08-22 | 2018-08-22 | Pedestrian re-identification method combining identification model and verification model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810957969.6A CN110858276A (en) | 2018-08-22 | 2018-08-22 | Pedestrian re-identification method combining identification model and verification model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110858276A (en) | 2020-03-03
Family
ID=69634762
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810957969.6A (CN110858276A, pending) | 2018-08-22 | 2018-08-22 | Pedestrian re-identification method combining identification model and verification model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110858276A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111507289A (en) * | 2020-04-22 | 2020-08-07 | 上海眼控科技股份有限公司 | Video matching method, computer device and storage medium |
CN111897993A (en) * | 2020-07-20 | 2020-11-06 | 杭州叙简科技股份有限公司 | Efficient target person track generation method based on pedestrian re-recognition |
CN112052868A (en) * | 2020-06-15 | 2020-12-08 | 上海集成电路研发中心有限公司 | Model training method, image similarity measuring method, terminal and storage medium |
CN113221807A (en) * | 2021-05-26 | 2021-08-06 | 新疆爱华盈通信息技术有限公司 | Pedestrian re-identification method and system with multiple cameras |
CN113920573A (en) * | 2021-11-22 | 2022-01-11 | 河海大学 | Face change decoupling relativity relationship verification method based on counterstudy |
WO2024061311A1 (en) * | 2022-09-21 | 2024-03-28 | 北京沃东天骏信息技术有限公司 | Model training method and apparatus, and image classification method and apparatus |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20200303 |