CN111274992A - Cross-camera pedestrian re-identification method and system - Google Patents
- Publication number: CN111274992A (application number CN202010088462.9A)
- Authority: CN (China)
- Prior art keywords: pedestrian, image, camera, identification, training
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06V40/103: Static body considered as a whole, e.g. static pedestrian or occupant recognition
- G06F18/214: Generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06N3/045: Combinations of networks
- G06N3/084: Backpropagation, e.g. using gradient descent
- G06V10/267: Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
- G06V10/44: Local feature extraction by analysis of parts of the pattern, e.g. edges, contours, loops, corners, strokes or intersections
Abstract
The invention provides a cross-camera pedestrian re-identification method and system. By training a camera style transfer network on each image in the training data set, an auxiliary training set containing a variety of style-transferred fake image samples is obtained, reducing the influence of camera style on identification accuracy. In addition, by segmenting images at different granularities, the local features of an image express the feature information of their own partition more intensively, filtering out information from other regions and giving the description features produced by the re-identification network greater discriminative power. The accuracy and generalization of pedestrian re-identification are thereby improved, identification performance when tracking and searching for pedestrians in wide-area video surveillance scenes is markedly better, and the influence of environmental factors, camera factors and pedestrian posture is reduced.
Description
Technical Field
The invention relates to the field of intelligent transportation, and in particular to a video surveillance and identification method and system.
Background
Pedestrian re-identification is an important task in the field of wide-area video surveillance. The technology aims to identify the same person across cameras; it can be used to track specific pedestrians and plays an important role in traffic, public safety and video surveillance.
In the prior art, when a person is to be queried, the pedestrian re-identification task must be deployed simultaneously on multiple cameras at different positions, and the most likely targets are listed from the data those cameras collect. The central difficulty of pedestrian re-identification is overcoming the changes in pedestrian pose and appearance caused by differences in body posture, illumination and field of view; differences between cameras are another important factor affecting re-identification accuracy. Because of environmental factors, differences between devices and variation in pedestrian posture, the accuracy of existing methods in practice is not high.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a cross-camera pedestrian re-identification method and system. The method is based on computer vision, video processing and pattern recognition, and combines multi-granularity features with global features to obtain more detailed information that supplements the global information and improves recognition accuracy. It has the advantages of low installation cost and high recognition accuracy, and has broad application prospects in the field of intelligent transportation.
To achieve this purpose, the cross-camera pedestrian re-identification method comprises the following steps:
First, acquiring the pedestrian images of each region captured by the cameras in wide-area video surveillance, manually annotating each image with the corresponding pedestrian identity information and camera information, and establishing a training data set S;
second, training a camera style transfer network from each image and its corresponding pedestrian identity and camera information to obtain an auxiliary training set C, and using the fake image samples x_s* in C to compute a distance loss, obtaining the distance loss function L_T, where x_s denotes the real image sample corresponding to a fake image sample x_s*, n_s denotes the number of real image samples in a training batch, and n_s* denotes the number of generated image samples;
third, segmenting the training data set S and the auxiliary training set C at different granularities, inputting each segmented image into a re-identification network for training, and obtaining a loss function L_Cross(j,i) for each partition, where j denotes the number of partitions in a division and i indexes the partitions within it; from the per-partition losses, computing the set of all partition losses L_Cross, updating the re-identification network by back-propagating L_Cross, and performing multi-granularity feature enhancement with the overall loss function L = L_Cross + λL_T, where λ denotes the weight of the distance loss L_T;
fourth, inputting the image to be re-identified captured by a camera and the images in the gallery into the re-identification network updated in the third step to obtain the corresponding description features, and comparing the description feature of the image to be re-identified with those of the gallery images to produce a similarity ranking;
and fifth, ranking by similarity, visually outputting the ranking result, and determining the region where the pedestrian appears.
Optionally, in any of the above cross-camera pedestrian re-identification methods, the specific step of acquiring the pedestrian images captured by the cameras in each region of the wide-area video surveillance comprises: obtaining the videos collected by the cameras in each region and capturing one frame every 120 frames to obtain pedestrian images of the corresponding region.
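The every-120-frames sampling rule above can be sketched as a small helper. This is an illustrative sketch only; the function name and signature are assumptions, not part of the patent:

```python
def sample_frames(total_frames: int, interval: int = 120) -> list[int]:
    """Return the frame indices to capture: one frame every `interval` frames,
    as described for building the training data set from surveillance video."""
    return list(range(0, total_frames, interval))

# e.g. a 360-frame clip yields captures at frames 0, 120 and 240
```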
Optionally, in any of the above cross-camera pedestrian re-identification methods, in the second step the camera style transfer network is trained from each image and its corresponding pedestrian identity and camera information, and the step of obtaining the auxiliary training set C comprises: according to the number N of cameras corresponding to the training data set S, generating N-1 fake image samples for each real image sample, one in the style of each of the other cameras.
Optionally, in the third step, the training data set S and the auxiliary training set C are segmented at three different granularities (the whole image, two partitions and three partitions), and each segmented image is input into the re-identification network for training, yielding the loss function L_Cross1 for the whole-image partition, the loss functions L_Cross2,1 and L_Cross2,2 for the two partitions, and the loss functions L_Cross3,1, L_Cross3,2 and L_Cross3,3 for the three partitions. From the per-partition losses L_Cross(j,i), the set of all partition losses is computed as L_Cross = L_Cross1 + L_Cross2,1 + L_Cross2,2 + L_Cross3,1 + L_Cross3,2 + L_Cross3,3, the re-identification network is updated by back-propagating L_Cross, and multi-granularity feature enhancement is performed with the overall loss function L = L_Cross + λL_T, where λ denotes the weight of the distance loss L_T.
Optionally, in the cross-camera pedestrian re-identification method, in the third step the training data set S and the auxiliary training set C are divided evenly at the two-partition and three-partition granularities.
Optionally, in the cross-camera pedestrian re-identification method, in the third step ResNet-50 is used as the backbone network in the multi-granularity feature enhancement process.
Optionally, in the fifth step, the step of visually outputting the ranking result according to the similarity and determining the region where the pedestrian appears specifically comprises:
ranking by the similarity of the description features of the image to be re-identified and the gallery images, transmitting the ranking result to the pedestrian re-identification visual display module, which retrieves the corresponding gallery images and sends them to the front end for display, while simultaneously displaying the position and time information of the image to be re-identified corresponding to the ranking result.
Meanwhile, the invention also provides a cross-camera pedestrian re-identification system, which comprises:
the system comprises a data acquisition module, a data acquisition module and a data processing module, wherein the data acquisition module is used for acquiring pedestrian images of areas corresponding to the cameras in each area in wide-area video monitoring, manually marking the pedestrian identity information and the camera information corresponding to each image and establishing a training data set S; it is also used to determine the image according to the image and its locationTraining a camera shooting style transfer network according to corresponding pedestrian identity information and camera information to obtain an auxiliary training set C, and training a camera shooting style transfer network according to false image samples in the auxiliary training set CTo calculate the distance loss and obtain the distance loss function
the pedestrian re-identification module, which segments the training data set S and the auxiliary training set C at different granularities, inputs each segmented image into a re-identification network for training, and obtains the loss function L_Cross(j,i) of each partition, where j denotes the number of partitions in a division and i indexes the partitions; from the per-partition losses it computes the set of all partition losses L_Cross and back-propagates L_Cross to update the re-identification network;
the pedestrian matching module is used for respectively inputting the image needing to be re-identified and the image in the image library which are shot by the camera into the re-identification network obtained by updating the pedestrian re-identification module to obtain the description characteristics corresponding to the image, and comparing the description characteristics corresponding to the image needing to be re-identified and shot by the camera with the description characteristics corresponding to the image in the image library to carry out similarity sorting;
and the display module, which ranks by similarity, visually outputs the ranking result, and simultaneously displays the position and time information of the pedestrian.
Optionally, in the cross-camera pedestrian re-identification system, ResNet-50 is used as the backbone of the re-identification network in the pedestrian re-identification module.
Advantageous effects
The invention obtains an auxiliary training set containing a variety of fake image samples by training a camera style transfer network on the training data set, thereby reducing the influence of camera style on identification accuracy. Through segmentation at different granularities, the local features of an image express the feature information of their own partition more intensively, filtering out information from other regions and giving the description features produced by the re-identification network greater discriminative power. The accuracy and generalization of pedestrian re-identification are thereby improved, identification performance when tracking and searching for pedestrians in wide-area video surveillance scenes is markedly better, and the influence of environmental factors, camera factors and pedestrian posture is reduced.
Further, the pedestrian image is segmented at three different granularities (whole, two partitions and three partitions), and the segmented images are then input into the re-identification network for training. Different partition counts capture the diversity of image information: as the number of partitions increases, the local features express the feature information of their own partition more intensively, filtering out information from other regions and making the acquired features more discriminative. Combining the multi-granularity loss with camera style learning better expresses image details and reduces the influence of camera differences.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is a schematic overall flow chart of a cross-camera pedestrian re-identification method according to the invention;
FIG. 2 is a schematic diagram of the transfer of the camera style in the cross-camera pedestrian re-identification method of the present invention;
fig. 3 is a schematic diagram of multi-granularity feature enhancement in the cross-camera pedestrian re-identification method of the present invention.
Detailed Description
In order to make the purpose and technical solution of the embodiments of the present invention clearer, the technical solution of the embodiments of the present invention will be clearly and completely described below with reference to the drawings of the embodiments of the present invention. It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the described embodiments of the invention without any inventive step, are within the scope of protection of the invention.
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
Referring to fig. 1, aiming at the shortcomings of existing pedestrian re-identification methods in generalization and accuracy, the invention provides a cross-camera pedestrian re-identification method suitable for real traffic scenes, whose steps comprise:
First, acquiring pedestrian images of each region through the cameras in wide-area video surveillance, manually annotating each image with the corresponding pedestrian identity information and camera information, and establishing a training data set S;
second, training a camera style transfer network from each image and its corresponding pedestrian identity and camera information to obtain an auxiliary training set C, and using the fake image samples x_s* in C to compute the distance loss, obtaining the distance loss function L_T, where x_s denotes the real image sample corresponding to a fake image sample x_s*, n_s denotes the number of real image samples in a training batch, and n_s* denotes the number of generated image samples;
third, segmenting the training data set S and the auxiliary training set C at different granularities, inputting each segmented image into a re-identification network for training, and obtaining a loss function L_Cross(j,i) for each partition, where j denotes the number of partitions in a division and i indexes the partitions within it; from the per-partition losses, computing the set of all partition losses L_Cross, updating the re-identification network by back-propagating L_Cross, and performing multi-granularity feature enhancement with the overall loss function L = L_Cross + λL_T, where λ denotes the weight of the distance loss L_T;
fourth, inputting the image to be re-identified captured by a camera and the images in the gallery into the re-identification network updated in the third step to obtain the corresponding description features, and comparing the description feature of the image to be re-identified with those of the gallery images to produce a similarity ranking;
and fifth, ranking by similarity, visually outputting the ranking result, and determining the region where the pedestrian appears.
In this way, the pedestrian re-identification network is trained on the combination of the original training set S and the auxiliary training set C, acquiring more discriminative features through multi-granularity feature enhancement. When a camera captures a target to be queried, its description features are obtained through the re-identification network and compared with the gallery images, and the ranking result is visualized by software to determine the region where the pedestrian appears. By applying pattern recognition together with the camera style transfer network and the detail features revealed at different granularities to the pedestrians captured by the cameras of each region, the pedestrian to be found can be located more accurately in those images.
In other implementations, the pedestrian re-identification method can also track and search for pedestrians in wide-area video surveillance scenes through the following system, integrated at the control end of an intelligent traffic monitoring network. The system comprises:
the data acquisition module, which acquires through the cameras the pedestrian images of the regions covered in wide-area video surveillance, manually annotates each image with the corresponding pedestrian identity information and camera information, and establishes a training data set S; it also trains a camera style transfer network from each image and its corresponding pedestrian identity and camera information to obtain an auxiliary training set C, and computes the distance loss from the fake image samples x_s* in C to obtain the distance loss function L_T. In the auxiliary training set C, if the data set was collected by N cameras, N-1 fake image samples are generated for each real image sample directly collected by a camera. The style-transferred images are shown in fig. 2; the data in fig. 2 were collected by 6 cameras;
the pedestrian re-recognition module is used for segmenting the training data set S and the auxiliary training set C into different granularities, sending each segmented image into the re-recognition network model respectively for model training, acquiring the characteristics of the partitioned force through multi-granularity characteristic enhancement by using resnet-50 as a main network in the training process, and acquiring the loss function corresponding to each segmented partition respectivelyWherein j represents the number of divided partitions, i represents the number of each divided partition, and the loss function according to each partitionComputing a set of all partition penalty functions According to the set L of all loss functionsCrossCarrying out reverse propagation to update the re-identification network;
the pedestrian matching module, which inputs the image to be re-identified captured by a camera and the gallery image of the pedestrian to be queried into the re-identification network updated by the pedestrian re-identification module to obtain the corresponding description features, compares the distances between the description feature of the image to be re-identified and those of the gallery images, performs similarity ranking, and stores the ranking result and related image information in order to determine information about the pedestrian, including but not limited to position;
and the display module, which ranks by similarity and visually outputs the ranking result, retrieving images from the back-end gallery using the stored information and sending them to the front end for display, while simultaneously showing the pedestrian's position, time and other related information. This makes the results more visual and easier to observe, organizes the pedestrian information, and helps determine the spatio-temporal relevance of pedestrian appearances.
In this way, the data required for model training are obtained first, and the distinctive style images of the different cameras are obtained through the camera style migration network, so that in any group of image samples, the samples collected by each camera can be converted into the styles of the other cameras, improving re-identification generalization. The re-identification model is then trained on the acquired data; the trained model is loaded to match gallery images, and the computed similarity measurement results are transmitted to the display module for convenient observation.
In a more specific implementation, the data acquisition module may obtain fake image samples in different camera styles by training the camera style transfer network in the manner shown in fig. 2, improving the generalization of pedestrian re-identification. This can be carried out in the following steps:
Step 2.1: obtain the number of cameras; if the data set was collected by N cameras, generate N-1 fake image samples for each real image sample.
Step 2.2: camera j (j = 1, 2, …, N) collects real image samples x_{s,j}, where N is the number of cameras used to collect the data set. During training, x_{s,j} and the fake images generated from it are assigned the same identity. Omitting the camera subscript, the real image sample x_s and the fake image samples x_s* generated from it are used together to compute the distance loss function L_T, where n_s denotes the number of real image samples in a training batch and n_s* denotes the number of generated image samples.
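The exact formula for L_T is not reproduced in this text, so the sketch below assumes a mean squared feature distance between each real sample and its generated fakes; that choice, and the function names, are illustrative assumptions rather than the patent's definition:

```python
import numpy as np

def distance_loss(real_feats: np.ndarray, fake_feats: np.ndarray) -> float:
    """Assumed form of L_T: mean squared feature distance between each real
    sample x_s and its N-1 style-transferred fakes x_s*.

    real_feats: (n_s, d) features of the real samples in the batch.
    fake_feats: (n_s, k, d) features of the k fakes generated per real sample.
    """
    diffs = fake_feats - real_feats[:, None, :]   # broadcast real over its fakes
    return float(np.mean(np.sum(diffs ** 2, axis=-1)))
```

Pulling a real sample and its fakes toward the same point in feature space reflects the patent's rule that x_{s,j} and its generated fakes share one identity.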
In a more specific implementation, the pedestrian re-identification module may filter out information from other regions through segmentation at different granularities in the manner shown in fig. 3, obtaining more detailed information from the image; combining different granularities avoids losing the connection between local and global information and effectively improves identification accuracy. The rationale is that the image samples used for training are captured by cameras in complex scenes, and the global feature expresses the image as a whole, making it difficult to distinguish foreground from background. In particular, when the pedestrian to be identified is occluded, or under severe lighting changes, the global features are corrupted by environmental disturbances. That is, prior-art methods that consider only global features may miss important detail features. The invention therefore combines multi-granularity features with the global feature and reflects the diversity of image information through different partition counts. As the number of partitions increases, the local features express the feature information of their own partition more intensively, so camera style migration improves re-identification generalization and applicability, while multi-granularity feature enhancement improves re-identification accuracy.
In one implementation, the multi-granularity feature enhancement can be achieved by the following steps:
step 3.1, segmenting the pedestrian image into a whole region, two-partition regions, and three-partition regions;
step 3.2, inputting the segmented images into the re-identification network for training; different partition counts represent the diversity of image information, and as the number of partitions increases, the local features express the feature information of their own partition more intensively, filtering out information from other regions so that the obtained features have discriminative power;
step 3.3, obtaining 6 loss functions: the whole-region loss L_Cross1, the two-partition losses L^1_Cross2 and L^2_Cross2, and the three-partition losses L^1_Cross3, L^2_Cross3 and L^3_Cross3; the set of all partition loss functions can be expressed as L_Cross = L_Cross1 + Σ_{i=1}^{2} L^i_Cross2 + Σ_{i=1}^{3} L^i_Cross3;
the re-identification network is then updated by back-propagation;
step 3.4, combining the methods of step 2.2 and step 3.3 to obtain the overall loss function as follows:
L = L_Cross + λ·L_T
the above loss function contains 7 losses, LCrossIs the set of all loss functions in step 3.3, which contains 6 losses, λ is the loss function LTThe weight and the multi-granularity loss can better express the details of the image by combining the camera style learning, and the influence of the camera difference is reduced.
Thereafter, the pictures to be re-identified are input into the re-identification network to obtain description features, and the pictures in the image library are likewise fed into the re-identification network to obtain their description features; the two sets of description features are compared and ranked, and the ranking results are transmitted to the pedestrian re-identification visualization software, which can display the identified pedestrian information together with the time and place corresponding to each image.
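The comparison-and-ranking step can be sketched as a similarity search over description features. The cosine metric and the function `rank_gallery` below are illustrative assumptions, not the patent's prescribed distance measure.

```python
import numpy as np

def rank_gallery(query_feat: np.ndarray, gallery_feats: np.ndarray):
    """Rank gallery images by cosine similarity to the query's description feature."""
    q = query_feat / np.linalg.norm(query_feat)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    sims = g @ q                # cosine similarity of each gallery item
    order = np.argsort(-sims)   # gallery indices, best match first
    return order, sims[order]

# Toy gallery of three 2-D description features.
query = np.array([1.0, 0.0])
gallery = np.array([[0.9, 0.1],   # near-duplicate of the query
                    [0.0, 1.0],   # orthogonal (different pedestrian)
                    [0.7, 0.7]])  # partial match
order, sims = rank_gallery(query, gallery)
```

The resulting `order` is exactly what the visualization software would consume: gallery indices sorted by similarity, each index carrying the camera, time, and place metadata of its image.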
By processing the pedestrian images captured by the cameras in each area and applying pattern recognition technology, the pedestrian target to be searched can be accurately found among the pictures captured by those cameras. Addressing the shortcomings of existing pedestrian re-identification methods in generalization and accuracy, the invention provides a pedestrian re-identification method based on improved camera style transfer, which can improve both the accuracy and the generalization of pedestrian re-identification. First, pedestrian images are acquired from wide-area video monitoring, the relevant pedestrian information is manually labeled, and a style transfer model is trained on the given data set; then the style transfer model is used to generate an auxiliary training set, reducing the influence of camera style on recognition accuracy; finally, a pedestrian re-identification model is trained in combination with the auxiliary training set and used to obtain the description features of a pedestrian and the pedestrian sample determined to be closest to the pedestrian under query, thereby obtaining the relevant pedestrian information. With the re-identification model trained on the given data set, techniques such as style transfer and feature enhancement yield more discriminative features, and the re-identification results can be observed more clearly through the pedestrian visualization software. The visualized pedestrian information includes the position and time information corresponding to each camera, is better organized, and better establishes the relevance of the pedestrian's spatio-temporal information.
The invention thus solves the problem of reduced recognition accuracy caused by camera style, while the multi-granularity approach mitigates the influence of pedestrian posture changes and environmental occlusion on identification accuracy. The method can therefore achieve stable and reliable pedestrian re-identification in wide-area video monitoring.
The above are merely embodiments of the present invention, which are described in detail and with particularity, and therefore should not be construed as limiting the scope of the invention. It should be noted that, for those skilled in the art, various changes and modifications can be made without departing from the spirit of the present invention, and these changes and modifications are within the scope of the present invention.
Claims (9)
1. A cross-camera pedestrian re-identification method is characterized by comprising the following steps:
firstly, acquiring the pedestrian images of the corresponding areas collected by the cameras of each area in wide-area video monitoring, manually labeling the pedestrian identity information and camera information corresponding to each image, and establishing a training data set S;
secondly, training a camera style transfer network according to each image and its corresponding pedestrian identity information and camera information to obtain an auxiliary training set C, and calculating the distance loss according to each false image sample x_f in the auxiliary training set C and its corresponding real image sample x_s to obtain the distance loss function L_T, wherein n_s represents the number of real image samples in a training batch and n_f represents the number of generated image samples;
thirdly, segmenting the training data set S and the auxiliary training set C at different granularities and respectively inputting each segmented image into a re-identification network for training, obtaining the loss function L^i_Cross j of each segmented partition, wherein j represents the number of partitions and i indexes each partition; computing the set of all partition loss functions L_Cross from the per-partition losses; updating the re-identification network by back-propagation according to L_Cross; and performing multi-granularity feature enhancement with the overall loss function L = L_Cross + λ·L_T, wherein λ represents the weight of the distance loss function L_T;
fourthly, respectively inputting the image needing to be re-identified and the image in the image library which are shot by the camera into the re-identification network obtained by updating in the third step to obtain the description characteristics corresponding to the image, and comparing the description characteristics corresponding to the image needing to be re-identified and shot by the camera with the description characteristics corresponding to the image in the image library to carry out similarity sorting;
and fifthly, visually outputting the ranking result according to the similarity ranking, and determining the region where the pedestrian appears.
2. The method for re-identifying pedestrians across cameras according to claim 1, wherein in the first step, the specific step of acquiring the images of the pedestrians in the corresponding areas collected by the cameras in the areas in the wide area video monitoring includes: the method comprises the steps of obtaining videos collected by cameras in various regions in wide-area video monitoring, and capturing pictures in the videos once every 120 frames to obtain pedestrian images of the corresponding regions.
3. The cross-camera pedestrian re-identification method according to claim 1 or 2, wherein in the second step, the step of training the camera style transfer network according to each image and its corresponding pedestrian identity information and camera information to obtain the auxiliary training set C specifically includes: according to the number N of cameras corresponding to the training data set S, generating N−1 false image samples for each real image sample.
4. The cross-camera pedestrian re-identification method according to any one of claims 1 to 3, wherein in the third step, the training data set S and the auxiliary training set C are segmented at three different granularities (whole, two partitions, and three partitions) and each segmented image is respectively input into the re-identification network for training, obtaining the whole-region loss function L_Cross1, the two-partition loss functions L^1_Cross2 and L^2_Cross2, and the three-partition loss functions L^1_Cross3, L^2_Cross3 and L^3_Cross3; the set of all partition loss functions L_Cross is computed from the per-partition losses; the re-identification network is updated by back-propagation according to L_Cross; and multi-granularity feature enhancement is performed with the overall loss function L = L_Cross + λ·L_T, wherein λ represents the weight of the distance loss function L_T.
5. The method for re-identifying pedestrians across cameras according to claim 4, wherein in the third step, the training data set S and the auxiliary training set C are evenly divided according to the granularity of two partitions and three partitions.
6. The cross-camera pedestrian re-identification method according to any one of claims 1 to 3, wherein in the third step, resnet-50 is used as the backbone network in the multi-granularity feature enhancement process.
7. The cross-camera pedestrian re-identification method according to claims 1 to 6, wherein in the fifth step, the sorting result is visually output according to the similarity sorting, and the step of determining the region where the pedestrian appears specifically includes:
and ranking according to the similarity between the description features of the image to be re-identified and the description features corresponding to the images in the image library, transmitting the ranking result to a pedestrian re-identification visual display module, which retrieves the corresponding images from the image library and sends them to the front end for display, while simultaneously displaying the position information and time information of the images corresponding to the ranking result.
8. A cross-camera pedestrian re-identification system, comprising:
the system comprises a data acquisition module, a data acquisition module and a data processing module, wherein the data acquisition module is used for acquiring pedestrian images of areas corresponding to the cameras in each area in wide-area video monitoring, manually marking the pedestrian identity information and the camera information corresponding to each image and establishing a training data set S; the method is also used for training the camera shooting style transfer network according to each image, the corresponding pedestrian identity information and the corresponding camera information to obtain an auxiliary training set C, and false image samples in the auxiliary training set CTo calculate the distance loss and obtain the distance loss function
the pedestrian re-identification module is used for segmenting the training data set S and the auxiliary training set C at different granularities, inputting each segmented image into a re-identification network for training, obtaining the loss function L^i_Cross j of each segmented partition, wherein j represents the number of partitions and i indexes each partition, computing the set of all partition loss functions L_Cross from the per-partition losses, and updating the re-identification network by back-propagation according to L_Cross;
the pedestrian matching module is used for respectively inputting the image needing to be re-identified and the image in the image library which are shot by the camera into the re-identification network obtained by updating the pedestrian re-identification module to obtain the description characteristics corresponding to the image, and comparing the description characteristics corresponding to the image needing to be re-identified and shot by the camera with the description characteristics corresponding to the image in the image library to carry out similarity sorting;
and the display module is used for visually outputting the ranking result according to the similarity ranking while displaying the position information and time information of the pedestrian.
9. The cross-camera pedestrian re-identification system according to claim 8, wherein the pedestrian re-identification module uses resnet-50 as the backbone network in the re-identification network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010088462.9A CN111274992A (en) | 2020-02-12 | 2020-02-12 | Cross-camera pedestrian re-identification method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010088462.9A CN111274992A (en) | 2020-02-12 | 2020-02-12 | Cross-camera pedestrian re-identification method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111274992A true CN111274992A (en) | 2020-06-12 |
Family
ID=70999346
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010088462.9A Pending CN111274992A (en) | 2020-02-12 | 2020-02-12 | Cross-camera pedestrian re-identification method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111274992A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111814655A (en) * | 2020-07-03 | 2020-10-23 | 浙江大华技术股份有限公司 | Target re-identification method, network training method thereof and related device |
CN112465078A (en) * | 2021-02-03 | 2021-03-09 | 成都点泽智能科技有限公司 | Cross-camera pedestrian track processing method, computer equipment and readable storage medium |
CN112906483A (en) * | 2021-01-25 | 2021-06-04 | 中国银联股份有限公司 | Target re-identification method and device and computer readable storage medium |
CN113011435A (en) * | 2021-02-04 | 2021-06-22 | 精英数智科技股份有限公司 | Target object image processing method and device and electronic equipment |
CN113158891A (en) * | 2021-04-20 | 2021-07-23 | 杭州像素元科技有限公司 | Cross-camera pedestrian re-identification method based on global feature matching |
CN113239776A (en) * | 2021-05-10 | 2021-08-10 | 北方工业大学 | Pedestrian re-identification method based on energy model |
CN116188919A (en) * | 2023-04-25 | 2023-05-30 | 之江实验室 | Test method and device, readable storage medium and electronic equipment |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180374233A1 (en) * | 2017-06-27 | 2018-12-27 | Qualcomm Incorporated | Using object re-identification in video surveillance |
CN109961051A (en) * | 2019-03-28 | 2019-07-02 | 湖北工业大学 | A kind of pedestrian's recognition methods again extracted based on cluster and blocking characteristic |
CN110110755A (en) * | 2019-04-04 | 2019-08-09 | 长沙千视通智能科技有限公司 | Based on the pedestrian of PTGAN Regional disparity and multiple branches weight recognition detection algorithm and device |
- 2020-02-12: application CN202010088462.9A filed in China (CN), patent/CN111274992A/en, status Pending
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180374233A1 (en) * | 2017-06-27 | 2018-12-27 | Qualcomm Incorporated | Using object re-identification in video surveillance |
CN109961051A (en) * | 2019-03-28 | 2019-07-02 | 湖北工业大学 | A kind of pedestrian's recognition methods again extracted based on cluster and blocking characteristic |
CN110110755A (en) * | 2019-04-04 | 2019-08-09 | 长沙千视通智能科技有限公司 | Based on the pedestrian of PTGAN Regional disparity and multiple branches weight recognition detection algorithm and device |
Non-Patent Citations (1)
Title |
---|
Zhang Shilin; Cao Xu: "An improved pedestrian re-identification algorithm based on CamStyle" (基于Camstyle改进的行人重识别算法) * |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111814655A (en) * | 2020-07-03 | 2020-10-23 | 浙江大华技术股份有限公司 | Target re-identification method, network training method thereof and related device |
CN111814655B (en) * | 2020-07-03 | 2023-09-01 | 浙江大华技术股份有限公司 | Target re-identification method, network training method thereof and related device |
CN112906483A (en) * | 2021-01-25 | 2021-06-04 | 中国银联股份有限公司 | Target re-identification method and device and computer readable storage medium |
CN112906483B (en) * | 2021-01-25 | 2024-01-23 | 中国银联股份有限公司 | Target re-identification method, device and computer readable storage medium |
CN112465078A (en) * | 2021-02-03 | 2021-03-09 | 成都点泽智能科技有限公司 | Cross-camera pedestrian track processing method, computer equipment and readable storage medium |
CN113011435A (en) * | 2021-02-04 | 2021-06-22 | 精英数智科技股份有限公司 | Target object image processing method and device and electronic equipment |
CN113158891A (en) * | 2021-04-20 | 2021-07-23 | 杭州像素元科技有限公司 | Cross-camera pedestrian re-identification method based on global feature matching |
CN113239776A (en) * | 2021-05-10 | 2021-08-10 | 北方工业大学 | Pedestrian re-identification method based on energy model |
CN116188919A (en) * | 2023-04-25 | 2023-05-30 | 之江实验室 | Test method and device, readable storage medium and electronic equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111274992A (en) | Cross-camera pedestrian re-identification method and system | |
Rakibe et al. | Background subtraction algorithm based human motion detection | |
Neubert et al. | Appearance change prediction for long-term navigation across seasons | |
EP3550516B1 (en) | Environmental parameter based selection of a data model for recognizing an object of a real environment | |
CN109584213B (en) | Multi-target number selection tracking method | |
CN110427905A (en) | Pedestrian tracting method, device and terminal | |
CN111104867A (en) | Recognition model training and vehicle heavy recognition method and device based on component segmentation | |
CN110796686A (en) | Target tracking method and device and storage device | |
CN104615986A (en) | Method for utilizing multiple detectors to conduct pedestrian detection on video images of scene change | |
Wu et al. | Multivehicle object tracking in satellite video enhanced by slow features and motion features | |
CN115512251A (en) | Unmanned aerial vehicle low-illumination target tracking method based on double-branch progressive feature enhancement | |
CN113963240A (en) | Comprehensive detection method for multi-source remote sensing image fusion target | |
CN113792686B (en) | Vehicle re-identification method based on visual representation of invariance across sensors | |
CN115620090A (en) | Model training method, low-illumination target re-recognition method and device and terminal equipment | |
Sun et al. | IRDCLNet: Instance segmentation of ship images based on interference reduction and dynamic contour learning in foggy scenes | |
Elihos et al. | Deep learning based segmentation free license plate recognition using roadway surveillance camera images | |
CN110414336A (en) | A kind of depth complementation classifier pedestrian's searching method of triple edge center loss | |
CN110349176A (en) | Method for tracking target and system based on triple convolutional networks and perception interference in learning | |
CN112257638A (en) | Image comparison method, system, equipment and computer readable storage medium | |
CN116862832A (en) | Three-dimensional live-action model-based operator positioning method | |
CN112818837B (en) | Aerial photography vehicle weight recognition method based on attitude correction and difficult sample perception | |
Zhang et al. | Vehicle detection and tracking in remote sensing satellite vidio based on dynamic association | |
CN114937239A (en) | Pedestrian multi-target tracking identification method and tracking identification device | |
Yang et al. | Pedestrian angle recognition based on JDE multi-object tracking algorithm | |
CN113705304A (en) | Image processing method and device, storage medium and computer equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination |