CN110765903A - Pedestrian re-identification method and device and storage medium - Google Patents

Pedestrian re-identification method and device and storage medium

Info

Publication number
CN110765903A
Authority
CN
China
Prior art keywords
pedestrian
identified
processed
target
video
Prior art date
Legal status
Pending
Application number
CN201910960572.7A
Other languages
Chinese (zh)
Inventor
吴翠玲
Current Assignee
Zhejiang Dahua Technology Co Ltd
Original Assignee
Zhejiang Dahua Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Zhejiang Dahua Technology Co Ltd filed Critical Zhejiang Dahua Technology Co Ltd
Priority claimed from CN201910960572.7A
Publication of CN110765903A
Legal status: Pending

Classifications

    • G06V 40/25 Recognition of walking or running movements, e.g. gait recognition
    • G06F 18/22 Matching criteria, e.g. proximity measures
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods
    • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
    • G06V 20/40 Scenes; scene-specific elements in video content

Abstract

The invention discloses a pedestrian re-identification method, a pedestrian re-identification device and a storage medium. The re-identification method comprises the following steps: extracting a set of video frames to be processed of at least one pedestrian to be identified from a first video stream; inputting the set of video frames to be processed of each pedestrian to be identified into a convolutional neural network to obtain a feature matrix of each pedestrian to be identified; calculating a matching value of each pedestrian to be identified and a target pedestrian according to the feature matrix of the target pedestrian and the feature matrix of at least one pedestrian to be identified; and re-identifying the target pedestrian according to the matching values. In this way, the invention improves the identification effect and adapts to scenarios with a large number of pedestrians to be identified.

Description

Pedestrian re-identification method and device and storage medium
Technical Field
The present application relates to the field of video image processing, and in particular, to a method and an apparatus for re-identifying a pedestrian, and a storage medium.
Background
Pedestrian re-identification (person re-identification, often abbreviated re-ID) refers to identifying the same pedestrian across video sequences or images captured by different image capturing devices; that is, given a video sequence or image of a monitored pedestrian, searching whether that pedestrian appears in video sequences or images from other devices. Pedestrian re-identification has great application prospects in video surveillance. Because pedestrian images in different monitoring scenes are strongly affected by background, illumination, shooting angle and the like, pedestrian re-identification is difficult.
In the prior art, a graph-clustering-based cross-video pedestrian re-identification method clusters local area images; the number of input categories must be determined manually, and the final categories require multiple iterations, so the complexity is high, and when the number of pedestrian IDs in a video is extremely large the re-identification effect is difficult to guarantee. Moreover, after a pedestrian moves from one camera to another, the shooting angle may change; for example, the pedestrian may be captured from the front under one camera but from the back under another. The features extracted under the two cameras then differ greatly, so the same person may be judged as two different people, which reduces the accuracy of pedestrian re-identification.
Disclosure of Invention
The application provides a pedestrian re-identification method, a pedestrian re-identification device and a storage medium, which can solve the problem that the accuracy of pedestrian re-identification is influenced by a shooting angle in the prior art.
In order to solve the technical problem, the application adopts a technical scheme that: provided is a pedestrian re-identification method, comprising:
extracting a to-be-processed video frame set of at least one to-be-identified pedestrian from a first video stream, wherein the to-be-processed video frame set comprises a plurality of to-be-processed video frames, each to-be-processed video frame comprises the to-be-identified pedestrian, and the to-be-processed video frames are divided into at least one type according to the shooting angle of the to-be-identified pedestrian;
respectively inputting the to-be-processed video frame set of each pedestrian to be identified into a convolutional neural network to obtain a feature matrix of each pedestrian to be identified, wherein the feature matrix of the pedestrian to be identified comprises a plurality of features of different areas of the pedestrian to be identified, which are output by the convolutional neural network, at different shooting angles;
calculating a matching value of each pedestrian to be recognized and a target pedestrian according to a feature matrix of the target pedestrian and at least one feature matrix of the pedestrian to be recognized, wherein the feature matrix of the target pedestrian is obtained by inputting a set of video frames to be processed of the target pedestrian into the convolutional neural network, the set of video frames to be processed of the target pedestrian is extracted from a second video stream, and the first video stream and the second video stream are from different cameras;
and re-identifying the target pedestrian according to the matching value of the pedestrian to be identified and the target pedestrian.
In order to solve the above technical problem, another technical solution adopted by the present application is: provided is a pedestrian re-recognition device including:
the device comprises a video processing module, a pedestrian recognition module and a pedestrian recognition module, wherein the video processing module is used for extracting a to-be-processed video frame set of at least one to-be-recognized pedestrian from a first video stream, the to-be-processed video frame set comprises a plurality of to-be-processed video frames, each to-be-processed video frame comprises the to-be-recognized pedestrian, and the to-be-processed video frames are divided into at least one type according to the shooting angle of the to-be-recognized pedestrian;
the feature extraction module is used for inputting the to-be-processed video frame set of each pedestrian to be identified into a convolutional neural network to obtain a feature matrix of each pedestrian to be identified, wherein the feature matrix of the pedestrian to be identified comprises a plurality of features of different areas of the pedestrian to be identified at different shooting angles, which are output by the convolutional neural network;
a calculating module, configured to calculate a matching value of each pedestrian to be identified and a target pedestrian according to a feature matrix of the target pedestrian and at least one feature matrix of the pedestrian to be identified, where the feature matrix of the target pedestrian is obtained by inputting a set of video frames to be processed of the target pedestrian into the convolutional neural network, the set of video frames to be processed of the target pedestrian is extracted from a second video stream, and the first video stream and the second video stream are from different cameras;
and the re-identification module is used for re-identifying the target pedestrian according to the matching value of the pedestrian to be identified and the target pedestrian.
In order to solve the above technical problem, another technical solution adopted by the present application is: providing a pedestrian re-identification device, the device comprising a processor, and a memory coupled to the processor, the memory storing program instructions for implementing the pedestrian re-identification method described above; the processor is configured to execute the program instructions stored by the memory for target pedestrian re-identification.
In order to solve the above technical problem, another technical solution adopted by the present application is: there is provided a storage medium having stored therein program instructions capable of implementing the pedestrian re-identification method described above.
The beneficial effect of this application is: according to the pedestrian re-identification method, device and storage medium, features of different regions of a pedestrian at different shooting angles are extracted through a convolutional neural network to construct the pedestrian's feature matrix; the matching value of each pedestrian to be identified and the target pedestrian is calculated from the feature matrix of the target pedestrian and the feature matrix of the pedestrian to be identified, and the target pedestrian is then re-identified according to the matching values. Because a feature matrix is constructed for each pedestrian and the matching value is computed directly from the two feature matrices, the complexity is low, and the method adapts to scenarios with a large number of pedestrians to be identified.
Drawings
Fig. 1 is a flowchart illustrating a pedestrian re-identification method according to a first embodiment of the present invention;
FIG. 2 is a schematic diagram of a convolutional neural network in the method of the first embodiment of the present invention;
FIG. 3 is a flow chart illustrating a pedestrian re-identification method according to a second embodiment of the present invention;
FIG. 4 is a flowchart illustrating a pedestrian re-identification method according to a third embodiment of the present invention;
fig. 5 is a schematic structural diagram of a pedestrian re-identification apparatus according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of a pedestrian re-identification apparatus according to another embodiment of the present invention;
fig. 7 is a schematic structural diagram of a storage medium according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "first", "second" and "third" in this application are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implying the number of technical features indicated. Thus, a feature defined as "first," "second," or "third" may explicitly or implicitly include at least one such feature. In the description of the present application, "plurality" means at least two, e.g., two, three, etc., unless explicitly and specifically limited otherwise. All directional indications (such as up, down, left, right, front, and rear) in the embodiments of the present application are only used to explain the relative positional relationship, movement, and the like between components in a specific posture (as shown in the drawings); if the specific posture changes, the directional indication changes accordingly. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
Fig. 1 is a flowchart illustrating a pedestrian re-identification method according to a first embodiment of the present invention. It should be noted that the method of the present invention is not limited to the flow sequence shown in fig. 1 if the results are substantially the same. As shown in fig. 1, the pedestrian re-identification method includes the steps of:
s101, extracting a to-be-processed video frame set of at least one to-be-identified pedestrian from a first video stream, wherein the to-be-processed video frame set comprises a plurality of to-be-processed video frames, each to-be-processed video frame comprises the to-be-identified pedestrian, and the to-be-processed video frames are divided into at least one type according to the shooting angle of the to-be-identified pedestrian.
In step S101, the first video stream includes a plurality of consecutive video frames captured by the first camera, or the first video stream includes any number of the plurality of consecutive video frames captured by the first camera. And acquiring tracks of a plurality of pedestrians to be identified from the first video stream by adopting a pedestrian detection and tracking algorithm, classifying video frames in the tracks of the pedestrians to be identified according to different shooting angles, screening the video frames of each shooting angle, and finally acquiring a to-be-processed video frame set of each pedestrian to be identified.
Specifically, in step S101, firstly, a pedestrian detection and tracking algorithm is used to obtain the track of each pedestrian to be identified in the first video stream, where the track includes a plurality of video frames containing the pedestrian to be identified, and each such video frame includes a detection frame of the pedestrian, i.e., a frame that selects the circumscribed region of the pedestrian; then, the video frames in the track of the pedestrian to be identified are divided into at least one type according to the shooting angle of the pedestrian; and finally, in each type of video frames, a specified number of video frames whose detection frames of the pedestrian have the largest area and are not occluded are selected as video frames to be processed and added to the set of video frames to be processed.
It should be noted that the detection frame may be a circumscribed rectangle or a circumscribed circle; preferably, the detection frame is the minimum circumscribed rectangle of the pedestrian to be identified, so that the rich detail features of the pedestrian are captured completely. When the video frames of each shooting angle are screened, the video frames in which the detection frame of the pedestrian to be identified is occluded are deleted, the remaining video frames of each shooting angle are sorted by the area of the detection frame from large to small, and the K top-ranked video frames are extracted to obtain the set of video frames to be processed of the pedestrian, where K is an integer greater than or equal to 20.
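The screening step above can be sketched as follows; the record format (dicts with `frame_id`, `box_area`, `occluded` fields) is a hypothetical illustration, the patent only fixes the selection criteria:

```python
def select_frames(candidates, k=20):
    """For one shooting angle, keep the k unoccluded frames whose
    pedestrian detection frame has the largest area."""
    # Delete frames in which the detection frame is occluded.
    visible = [c for c in candidates if not c['occluded']]
    # Sort by detection-frame area from large to small and take the top k.
    visible.sort(key=lambda c: c['box_area'], reverse=True)
    return visible[:k]
```

Running this once per shooting angle and concatenating the results yields the set of video frames to be processed for one pedestrian.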
S102, respectively inputting the to-be-processed video frame set of each pedestrian to be identified into a convolutional neural network, and acquiring a feature matrix of each pedestrian to be identified, wherein the feature matrix of the pedestrian to be identified comprises a plurality of features of different areas of the pedestrian to be identified, which are output by the convolutional neural network and have different shooting angles.
In step 102, firstly, respectively inputting a to-be-processed video frame set of each to-be-identified pedestrian into a convolutional neural network, and performing feature extraction on different regions of each to-be-processed video frame in the to-be-processed video frame set of the to-be-identified pedestrian to obtain features of the different regions of each to-be-processed video frame; then, fusing the characteristics of the same area of the video frame to be processed of each shooting angle to obtain the characteristics of each area of each shooting angle; and finally, establishing a feature matrix of the pedestrian to be identified according to the features of different areas at different shooting angles.
In an alternative embodiment, the plurality of shooting angles includes a front view, a side view and a back view, and the different regions include an upper-body region and a lower-body region. Thus, the feature matrix of the pedestrian i to be identified is F_i = [F_i^u, F_i^d, B_i^u, B_i^d, S_i^u, S_i^d], where F_i^u and F_i^d respectively represent the features of the upper-body and lower-body regions of the front view of pedestrian i, B_i^u and B_i^d respectively represent the features of the upper-body and lower-body regions of the back view of pedestrian i, and S_i^u and S_i^d respectively represent the features of the upper-body and lower-body regions of the side view of pedestrian i.
Specifically, referring to fig. 2, in step S102, obtaining the features of the different regions output for each video frame to be processed specifically includes the following steps:
and S1021, respectively inputting the to-be-processed video frame set of each pedestrian to be identified into a convolutional neural network to obtain a feature map of each to-be-processed video frame of each pedestrian to be identified.
And S1022, dividing the feature map into different regions according to a preset dividing mode, extracting features of the different regions of the feature map, and performing dimension reduction processing on the features to obtain features of the different regions output by each to-be-processed video frame of each to-be-identified pedestrian.
In steps S1021 and S1022, each to-be-processed video frame corresponds to one feature map, the feature maps are divided according to different areas, and features of different areas in the feature maps are respectively obtained as features of different areas of the to-be-processed video frame.
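One way to realize the region split and dimension reduction of steps S1021 and S1022 is sketched below; the half-height split and average pooling are illustrative assumptions, since the patent fixes neither the partitioning rule nor the pooling operator:

```python
import numpy as np

def region_features(feature_map):
    """Split a CNN feature map of shape (C, H, W) into an upper-body and
    a lower-body region, and average-pool each region to a C-dim vector
    (the pooling doubles as the dimension-reduction step)."""
    c, h, w = feature_map.shape
    upper = feature_map[:, :h // 2, :].mean(axis=(1, 2))  # top half of the map
    lower = feature_map[:, h // 2:, :].mean(axis=(1, 2))  # bottom half of the map
    return upper, lower
```

Each video frame to be processed thus contributes one upper-body feature vector and one lower-body feature vector at its shooting angle.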
For example, taking the front view of the pedestrian i to be identified: suppose the set of video frames to be processed of pedestrian i includes N front-view frames; after feature extraction, N feature values of the upper-body region at the front view are obtained, f_{i1}^u, f_{i2}^u, …, f_{iN}^u, together with N feature values of the lower-body region at the front view, f_{i1}^d, f_{i2}^d, …, f_{iN}^d.
In step S102, fusing the features of the same region of the video frames to be processed at each shooting angle includes: calculating the arithmetic mean of the features of that region over the video frames at that shooting angle, and taking the arithmetic mean as the fused feature. Then the fused upper-body feature of the front view of pedestrian i is

F_i^u = (f_{i1}^u + f_{i2}^u + … + f_{iN}^u) / N,

and the fused lower-body feature of the front view of pedestrian i is

F_i^d = (f_{i1}^d + f_{i2}^d + … + f_{iN}^d) / N.
The fused features of the back view and the side view of the pedestrian i to be identified are calculated in the same way and are not described in detail herein.
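Repeating the arithmetic-mean fusion for every (shooting angle, region) pair yields the 6-row feature matrix of one pedestrian; a minimal sketch, in which the angle/region names and the input layout are assumptions for illustration:

```python
import numpy as np

ANGLES = ('front', 'back', 'side')
REGIONS = ('upper', 'lower')

def build_feature_matrix(per_frame_feats):
    """per_frame_feats[angle][region] is an (N, D) array holding the
    region feature of each of the N frames at that angle; each row of
    the result is the arithmetic mean over those N frames."""
    rows = [per_frame_feats[a][r].mean(axis=0)
            for a in ANGLES for r in REGIONS]
    return np.stack(rows)  # shape (6, D): one row per (angle, region) pair
```

The row order chosen here mirrors the matrix layout [F^u, F^d, B^u, B^d, S^u, S^d] used in the text.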
S103, calculating a matching value of each pedestrian to be recognized and the target pedestrian according to a feature matrix of the target pedestrian and at least one feature matrix of the pedestrian to be recognized, wherein the feature matrix of the target pedestrian is obtained by inputting a video frame set to be processed of the target pedestrian into the convolutional neural network, the video frame set to be processed of the target pedestrian is extracted from a second video stream, and the first video stream and the second video stream are from different cameras.
In step S103, the second video stream includes a plurality of consecutive video frames captured by the second camera, or any number of those consecutive video frames, and the target pedestrian appears in the second video stream. The feature matrix of the target pedestrian is obtained in the same manner as that of a pedestrian to be identified; specifically, the feature matrix of the target pedestrian is F = [F^u, F^d, B^u, B^d, S^u, S^d], where F^u and F^d respectively represent the features of the upper-body and lower-body regions of the front view of the target pedestrian, B^u and B^d respectively represent the features of the upper-body and lower-body regions of the back view, and S^u and S^d respectively represent the features of the upper-body and lower-body regions of the side view.
In an alternative embodiment, the matching value of the pedestrian to be identified and the target pedestrian is calculated as follows. First, the feature matrix of the target pedestrian and the transpose of the feature matrix of the pedestrian to be identified are each normalized; then, the normalized feature matrix of the target pedestrian is multiplied by the normalized transpose of the feature matrix of the pedestrian to be identified to obtain an operation matrix; finally, the distance between the two feature matrices is calculated from the elements of the operation matrix and used as the matching value. Further, 1 is added to each element of the operation matrix and the result is multiplied by a natural coefficient P to obtain an adjustment value for each element, where P is a positive integer; the mean of the adjustment values is taken as the distance between the feature matrix of the target pedestrian and the feature matrix of the pedestrian to be identified. Specifically, taking the matching value of pedestrian i to be identified and the target pedestrian as an example, the operation matrix is M = F′ × F′_i^T, where F′ is obtained by normalizing F and F′_i^T is obtained by normalizing F_i^T. The distance between F′ and F′_i is

d = (1/n²) Σ_{j,k} (M_{jk} + 1) · P,

where M_{jk} is the element in row j and column k of the operation matrix M and n is the number of rows of the feature matrix (n = 6 here). When the natural coefficient P is 50, the distance d lies between 0 and 100.
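The matching-value computation above can be sketched as follows; row-wise L2 normalisation is an assumption, since the patent says "normalization" without fixing the norm:

```python
import numpy as np

def matching_value(F_target, F_cand, P=50):
    """Normalise each row of both feature matrices, form the operation
    matrix M = F' @ F_i'^T, then average the adjusted elements
    (M_jk + 1) * P; with P = 50 the result lies in [0, 100],
    larger values indicating a better match."""
    Fn = F_target / np.linalg.norm(F_target, axis=1, keepdims=True)
    Fc = F_cand / np.linalg.norm(F_cand, axis=1, keepdims=True)
    M = Fn @ Fc.T                    # cosine similarities, each in [-1, 1]
    return float(((M + 1.0) * P).mean())
```

Note that although the patent calls this quantity a "distance", it grows with similarity: matching values are later sorted from large to small, so a larger value means a closer match.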
And S104, re-identifying the target pedestrian according to the matching value of the pedestrian to be identified and the target pedestrian.
In step S104, the matching values of the pedestrians to be identified in the first video stream and the target pedestrian are sorted from large to small, and the T top-ranked pedestrians to be identified are taken as the re-identification result of the target pedestrian, where T is an integer greater than or equal to 1.
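The final ranking step above can be sketched as follows (the candidate IDs are hypothetical):

```python
def re_identify(match_values, t=1):
    """match_values maps a candidate pedestrian ID to its matching
    value; return the t candidates with the largest values, i.e. the
    re-identification result for the target pedestrian."""
    ranked = sorted(match_values, key=match_values.get, reverse=True)
    return ranked[:t]
```

For example, with matching values {'p1': 72.5, 'p2': 91.0, 'p3': 64.2} and T = 2, the result is ['p2', 'p1'].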
Fig. 3 is a flowchart illustrating a pedestrian re-identification method according to a second embodiment of the present invention. It should be noted that the method of the present invention is not limited to the flow sequence shown in fig. 3 if the results are substantially the same. As shown in fig. 3, the pedestrian re-identification method includes the steps of:
s201, extracting a to-be-processed video frame set of a target pedestrian from a second video stream, wherein the to-be-processed video frame set comprises a plurality of to-be-processed video frames, each to-be-processed video frame comprises the target pedestrian, and the to-be-processed video frames are divided into at least one type according to the shooting angle of the target pedestrian.
In step S201, a pedestrian detection algorithm and a tracking algorithm are used to obtain the track of the target pedestrian in the second video stream, where the track includes a plurality of video frames containing the target pedestrian, and each such video frame includes a detection frame of the target pedestrian, i.e., a frame that selects the circumscribed region of the target pedestrian; the video frames in the track of the target pedestrian are divided into at least one type according to the shooting angle of the target pedestrian; and, in each type of video frames, a specified number of video frames whose detection frames of the target pedestrian have the largest area and are not occluded are selected as video frames to be processed and added to the set of video frames to be processed.
S202, inputting the to-be-processed video frame set of the target pedestrian into a convolutional neural network, and acquiring a feature matrix of the target pedestrian, wherein the feature matrix of the target pedestrian comprises a plurality of features of different areas of different shooting angles of the target pedestrian, which are output by the convolutional neural network.
In step S202, the to-be-processed video frame set of the target pedestrian is respectively input into the convolutional neural network, and feature extraction is performed on different regions of each to-be-processed video frame in the to-be-processed video frame set of the target pedestrian, so as to obtain features of different regions of each to-be-processed video frame; then, fusing the characteristics of the same area of the video frame to be processed of each shooting angle to obtain the characteristics of each area of each shooting angle; and finally, establishing a feature matrix of the target pedestrian according to the features of different areas at different shooting angles.
S203, extracting a to-be-processed video frame set of at least one to-be-identified pedestrian from the first video stream, wherein the to-be-processed video frame set comprises a plurality of to-be-processed video frames, each to-be-processed video frame comprises the to-be-identified pedestrian, and the to-be-processed video frames are divided into at least one type according to the shooting angle of the to-be-identified pedestrian.
And S204, respectively inputting the to-be-processed video frame set of each pedestrian to be identified into a convolutional neural network, and acquiring a feature matrix of each pedestrian to be identified, wherein the feature matrix of the pedestrian to be identified comprises a plurality of features of different areas of different shooting angles of the pedestrian to be identified, which are output by the convolutional neural network.
S205, calculating the matching value of each pedestrian to be recognized and the target pedestrian according to the feature matrix of the target pedestrian and at least one feature matrix of a pedestrian to be recognized.
And S206, re-identifying the target pedestrian according to the matching value of the pedestrian to be identified and the target pedestrian.
For steps S203 to S206, refer to steps S101 to S104 of the first embodiment. The set of video frames to be processed and the feature matrix of the target pedestrian in steps S201 and S202 are acquired in a similar manner to those of a pedestrian to be identified; see the first embodiment for details.
Fig. 4 is a flowchart illustrating a pedestrian re-identification method according to a third embodiment of the present invention. It should be noted that the method of the present invention is not limited to the flow sequence shown in fig. 4 if the results are substantially the same. As shown in fig. 4, the pedestrian re-identification method includes the steps of:
s301, training the convolutional neural network.
In step S301, for each video frame of different shooting angles in the training set, a detection frame is used to mark a known pedestrian; dividing the detection frame of the known pedestrian of each video frame into a plurality of regions according to a preset dividing mode, and determining the corresponding characteristic of each region; the convolutional neural network is trained from each video frame for which the characteristics of the different regions are determined.
S302, extracting a to-be-processed video frame set of at least one to-be-identified pedestrian from a first video stream, wherein the to-be-processed video frame set comprises a plurality of to-be-processed video frames, each to-be-processed video frame comprises the to-be-identified pedestrian, and the to-be-processed video frames are divided into at least one type according to the shooting angle of the to-be-identified pedestrian.
And S303, respectively inputting the to-be-processed video frame set of each pedestrian to be identified into a convolutional neural network, and acquiring a feature matrix of each pedestrian to be identified, wherein the feature matrix of the pedestrian to be identified comprises a plurality of features of different areas of different shooting angles of the pedestrian to be identified, which are output by the convolutional neural network.
S304, calculating a matching value of each pedestrian to be recognized and the target pedestrian according to a feature matrix of the target pedestrian and at least one feature matrix of the pedestrian to be recognized, wherein the feature matrix of the target pedestrian is obtained by inputting a set of video frames to be processed of the target pedestrian into the convolutional neural network, the set of video frames to be processed of the target pedestrian is extracted from a second video stream, and the first video stream and the second video stream are from different cameras.
S305, re-identifying the target pedestrian according to the matching value of the pedestrian to be identified and the target pedestrian.
For steps S302 to S305, refer specifically to steps S101 to S104 of the first embodiment; details are not repeated here.
Fig. 5 is a schematic structural diagram of a pedestrian re-identification apparatus according to an embodiment of the present invention. As shown in fig. 5, the apparatus 40 includes: a video processing module 41, a feature extraction module 42, a calculation module 43 and a re-identification module 44.
The video processing module 41 is configured to extract a to-be-processed video frame set of at least one to-be-identified pedestrian from the first video stream, where the to-be-processed video frame set includes a plurality of to-be-processed video frames, each of the to-be-processed video frames includes the to-be-identified pedestrian, and the to-be-processed video frames are divided into at least one category according to a shooting angle of the to-be-identified pedestrian.
The feature extraction module 42 is configured to input the to-be-processed video frame set of each to-be-identified pedestrian into the convolutional neural network, and acquire a feature matrix of each to-be-identified pedestrian, where the feature matrix of each to-be-identified pedestrian includes features of different areas of different shooting angles of the to-be-identified pedestrian, which are output by the convolutional neural network.
A calculating module 43, configured to calculate a matching value of each pedestrian to be identified and a target pedestrian according to a feature matrix of the target pedestrian and at least one feature matrix of the pedestrian to be identified, where the feature matrix of the target pedestrian is obtained by inputting a set of video frames to be processed of the target pedestrian into the convolutional neural network, the set of video frames to be processed of the target pedestrian is extracted from a second video stream, and the first video stream and the second video stream are from different cameras.
And the re-identification module 44 is used for re-identifying the target pedestrian according to the matching value of the pedestrian to be identified and the target pedestrian.
In this embodiment, the shooting angle includes a front view angle, a side view angle and a back view angle, and the different regions include an upper body region and a lower body region.
Optionally, the video processing module 41 acquires a trajectory of the pedestrian to be identified in the first video stream by using a pedestrian detection and tracking algorithm, where the trajectory includes a plurality of video frames containing the pedestrian to be identified, each such video frame includes a detection frame of the pedestrian to be identified, and the detection frame is a box that frames the circumscribed region of the pedestrian to be identified; divides the multiple video frames in the trajectory of the pedestrian to be identified into at least one type according to the shooting angle of the pedestrian to be identified; and selects, in each type of video frames, a specified number of video frames in which the detection frame of the pedestrian to be identified has the largest area and is not blocked, as the video frames to be processed, and adds them to the set of video frames to be processed.
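The per-angle frame selection performed by the video processing module can be sketched as follows; the dictionary keys 'angle' and 'box' are illustrative assumptions, not part of the embodiment:

```python
def select_frames(track, per_class=3):
    """Group a track's frames by shooting angle and keep, per angle class,
    the `per_class` frames whose detection boxes have the largest area.

    `track` is a list of dicts with an 'angle' key ('front'/'side'/'back')
    and a 'box' key (x, y, w, h); these names are illustrative. Occlusion
    filtering, also required by the embodiment, is assumed done upstream.
    """
    by_angle = {}
    for item in track:
        by_angle.setdefault(item['angle'], []).append(item)
    selected = []
    for angle, items in by_angle.items():
        # sort by detection-box area, largest first
        items.sort(key=lambda it: it['box'][2] * it['box'][3], reverse=True)
        selected.extend(items[:per_class])
    return selected
```

Larger boxes generally contain more pixels on the pedestrian, which is presumably why the embodiment prefers the largest-area, unblocked detections.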
Optionally, the feature extraction module 42 respectively inputs the to-be-processed video frame set of each to-be-identified pedestrian into the convolutional neural network, so as to obtain features of different areas output by each to-be-processed video frame of each to-be-identified pedestrian; fusing the characteristics of the same area of the video frame to be processed of each shooting angle to obtain the characteristics of each area of each shooting angle; and establishing a characteristic matrix of the pedestrian to be identified according to the characteristics of different areas at different shooting angles.
Optionally, the feature extraction module 42 respectively inputs the to-be-processed video frame set of each to-be-identified pedestrian into the convolutional neural network to obtain a feature map of each to-be-processed video frame of each to-be-identified pedestrian; dividing the feature map into different regions according to a preset dividing mode, extracting features of the different regions of the feature map, and performing dimension reduction processing on the features to obtain features of the different regions output by each to-be-processed video frame of each to-be-identified pedestrian.
Optionally, the calculating module 43 calculates an arithmetic mean of the features of the same region of the video frames to be processed at each shooting angle, and takes the arithmetic mean as the fused feature.
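The arithmetic-mean fusion and the construction of the feature matrix can be sketched as follows; the row ordering of angle/region combinations and the dictionary layout are illustrative assumptions:

```python
import numpy as np

def build_feature_matrix(features,
                         angles=('front', 'side', 'back'),
                         regions=('upper', 'lower')):
    """Fuse per-frame region features into one feature matrix.

    `features[(angle, region)]` holds a list of per-frame feature vectors
    for that shooting angle and body region (illustrative layout). Fusion
    is the arithmetic mean described in the text; each row of the result
    is one fused angle/region feature, each column a feature dimension.
    """
    rows = []
    for angle in angles:
        for region in regions:
            vecs = np.stack(features[(angle, region)])
            rows.append(vecs.mean(axis=0))  # arithmetic-mean fusion
    return np.stack(rows)  # shape: (len(angles) * len(regions), dim)
```

With three shooting angles and two body regions, the feature matrix has six rows, one per angle/region combination.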
Optionally, the calculating module 43 multiplies the feature matrix of the target pedestrian by the transposed matrix of the feature matrix of the pedestrian to be identified to obtain an operation matrix; and calculating the distance between the characteristic matrix of the target pedestrian and the characteristic matrix of the pedestrian to be identified according to each element of the operation matrix, and taking the distance as a matching value.
Optionally, the calculating module 43 performs normalization processing on the feature matrix of the target pedestrian and the transposed matrix of the feature matrix of the pedestrian to be identified respectively.
Optionally, the calculating module 43 adds 1 to the value of each element of the operation matrix and then multiplies the value by a natural coefficient P to obtain an adjustment value corresponding to each element, where P is a positive integer; and calculating the average value of the obtained adjustment values, and taking the average value as the distance between the characteristic matrix of the target pedestrian and the characteristic matrix of the pedestrian to be identified.
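Taken together, the normalization, transposed multiplication, and per-element adjustment described for the calculating module can be sketched as follows; per-row L2 normalization is an assumption, as the text only states that the matrices are normalized:

```python
import numpy as np

def matching_value(target_mat, candidate_mat, p=1):
    """Compute the matching value between two feature matrices.

    Steps, following the description: L2-normalize each row (assumed
    normalization scheme), multiply the target matrix by the transpose of
    the candidate matrix to get the operation matrix, add 1 to each
    element, multiply by the coefficient P, and average the adjustment
    values to obtain the distance used as the matching value.
    """
    t = target_mat / np.linalg.norm(target_mat, axis=1, keepdims=True)
    c = candidate_mat / np.linalg.norm(candidate_mat, axis=1, keepdims=True)
    op = t @ c.T            # operation matrix of cosine similarities
    adjusted = (op + 1) * p  # adjustment value for each element
    return adjusted.mean()
```

With row-normalized matrices the elements of the operation matrix are cosine similarities in [-1, 1], so the +1 shift keeps every adjustment value non-negative before averaging.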
Optionally, the re-identification module 44 sorts the matching values of the pedestrians to be identified in the first video stream and the target pedestrian from largest to smallest, and extracts the T pedestrians to be identified ranked highest as the re-identification result of the target pedestrian, where T is an integer greater than or equal to 1.
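The top-T ranking performed by the re-identification module can be sketched as:

```python
def top_t_matches(match_values, t=1):
    """Sort candidate pedestrians by matching value, largest first, and
    return the IDs of the top T as the re-identification result.

    `match_values` maps a candidate pedestrian ID to its matching value
    with the target pedestrian (illustrative data layout).
    """
    ranked = sorted(match_values, key=match_values.get, reverse=True)
    return ranked[:t]
```

For example, with matching values {'a': 0.2, 'b': 0.9, 'c': 0.5} and T = 2, the result is ['b', 'c'].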
Referring to fig. 6, fig. 6 is a schematic structural diagram of a pedestrian re-identification apparatus according to an embodiment of the present invention. As shown in fig. 6, the pedestrian re-identification apparatus 50 includes a processor 51 and a memory 52 coupled to the processor 51.
The memory 52 stores program instructions for implementing the pedestrian re-identification method according to any one of the above-described embodiments.
The processor 51 is operative to execute program instructions stored in the memory 52 for target pedestrian re-identification.
The processor 51 may also be referred to as a CPU (Central Processing Unit). The processor 51 may be an integrated circuit chip having signal processing capabilities. The processor 51 may also be a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
Referring to fig. 7, fig. 7 is a schematic structural diagram of a storage medium according to an embodiment of the invention. The storage medium of the embodiment of the present invention stores program instructions 61 capable of implementing all of the methods described above. The program instructions 61 may be stored in the storage medium in the form of a software product and include several instructions for causing a computer device (which may be a personal computer, a server, or a network device) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes: various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, or terminal devices such as a computer, a server, a mobile phone, or a tablet.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, a division of a unit is merely a logical division, and an actual implementation may have another division, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit. The above embodiments are merely examples and are not intended to limit the scope of the present disclosure, and all modifications, equivalents, and flow charts using the contents of the specification and drawings are included in the scope of the present disclosure.

Claims (14)

1. A pedestrian re-identification method, the method comprising:
extracting a to-be-processed video frame set of at least one to-be-identified pedestrian from a first video stream, wherein the to-be-processed video frame set comprises a plurality of to-be-processed video frames, each to-be-processed video frame comprises the to-be-identified pedestrian, and the to-be-processed video frames are divided into at least one type according to the shooting angle of the to-be-identified pedestrian;
respectively inputting the to-be-processed video frame set of each pedestrian to be identified into a convolutional neural network to obtain a feature matrix of each pedestrian to be identified, wherein the feature matrix of the pedestrian to be identified comprises a plurality of features of different areas of the pedestrian to be identified, which are output by the convolutional neural network, at different shooting angles;
calculating a matching value of each pedestrian to be recognized and a target pedestrian according to a feature matrix of the target pedestrian and at least one feature matrix of the pedestrian to be recognized, wherein the feature matrix of the target pedestrian is obtained by inputting a set of video frames to be processed of the target pedestrian into the convolutional neural network, the set of video frames to be processed of the target pedestrian is extracted from a second video stream, and the first video stream and the second video stream are from different cameras;
and re-identifying the target pedestrian according to the matching value of the pedestrian to be identified and the target pedestrian.
2. The pedestrian re-identification method according to claim 1, wherein the photographing angle includes a front view angle, a side view angle, and a back view angle, and the different regions include an upper body region and a lower body region.
3. The pedestrian re-identification method according to claim 1 or 2, wherein the extracting the set of to-be-processed video frames of at least one to-be-identified pedestrian from the first video stream comprises:
acquiring a trajectory of the pedestrian to be identified in the first video stream by adopting a pedestrian detection and tracking algorithm, wherein the trajectory comprises a plurality of video frames containing the pedestrian to be identified, each video frame containing the pedestrian to be identified comprises a detection frame of the pedestrian to be identified, and the detection frame is a box that frames the circumscribed region of the pedestrian to be identified;
dividing the multiframe video frames in the track of the pedestrian to be identified into at least one type according to the shooting angle of the pedestrian to be identified;
and selecting, in each type of video frames, a specified number of video frames in which the detection frame of the pedestrian to be identified has the largest area and is not blocked, as the video frames to be processed, and adding them to the set of video frames to be processed.
4. The pedestrian re-identification method according to claim 3, wherein the step of inputting the to-be-processed video frame set of each to-be-identified pedestrian into a convolutional neural network to obtain the feature matrix of each to-be-identified pedestrian comprises the steps of:
respectively inputting the to-be-processed video frame set of each pedestrian to be identified into a convolutional neural network to obtain the characteristics of different areas output by each to-be-processed video frame of each pedestrian to be identified;
fusing the characteristics of the same region of the video frame to be processed of each shooting angle to obtain the characteristics of each region of each shooting angle;
and establishing a characteristic matrix of the pedestrian to be identified according to the characteristics of different areas at different shooting angles.
5. The pedestrian re-identification method according to claim 4, wherein the step of inputting the to-be-processed video frame set of each to-be-identified pedestrian into a convolutional neural network respectively to obtain the features of different areas output by each to-be-processed video frame of each to-be-identified pedestrian comprises:
respectively inputting the to-be-processed video frame set of each to-be-identified pedestrian into a convolutional neural network to obtain a feature map of each to-be-processed video frame of each to-be-identified pedestrian;
dividing the feature map into different regions according to a preset dividing mode, extracting features of the different regions of the feature map, and performing dimension reduction processing on the features to obtain features of the different regions output by each to-be-processed video frame of each to-be-identified pedestrian.
6. The pedestrian re-identification method according to claim 4, wherein the fusing the features of the same region of the video frame to be processed of each shooting angle comprises:
and calculating the arithmetic mean value of the features of the same area of the video frame to be processed of each shooting angle, and taking the arithmetic mean value as the fused feature.
7. The pedestrian re-identification method according to claim 1, wherein the calculating of the matching value of each pedestrian to be identified and the target pedestrian according to the feature matrix of the target pedestrian and the feature matrix of the pedestrian to be identified comprises:
multiplying the characteristic matrix of the target pedestrian with the transposed matrix of the characteristic matrix of the pedestrian to be identified to obtain an operation matrix;
and calculating the distance between the characteristic matrix of the target pedestrian and the characteristic matrix of the pedestrian to be identified according to each element of the operation matrix, and taking the distance as a matching value.
8. The pedestrian re-identification method according to claim 7, wherein the calculating of the matching value of each pedestrian to be identified and the target pedestrian according to the feature matrix of the target pedestrian and the feature matrix of the pedestrian to be identified further comprises:
and respectively carrying out normalization processing on the feature matrix of the target pedestrian and the transposed matrix of the feature matrix of the pedestrian to be identified.
9. The pedestrian re-identification method according to claim 7, wherein the calculating of the distance between the feature matrix of the target pedestrian and the feature matrix of the pedestrian to be identified according to each element of the operation matrix comprises:
adding 1 to the value of each element of the operation matrix, and multiplying the value by a natural coefficient P to obtain an adjustment value corresponding to each element, wherein P is a positive integer;
and calculating the average value of the obtained adjustment values, and taking the average value as the distance between the characteristic matrix of the target pedestrian and the characteristic matrix of the pedestrian to be identified.
10. The pedestrian re-identification method according to claim 1, wherein the re-identification of the target pedestrian according to the matching value of the pedestrian to be identified and the target pedestrian comprises:
the method comprises the steps of sorting matching values of pedestrians to be recognized and target pedestrians of a first video stream from big to small, and extracting T pedestrians to be recognized at T positions before the sorting of the matching values as re-recognition results of the target pedestrians, wherein T is an integer larger than or equal to 1.
11. The pedestrian re-identification method according to claim 1 or 2, wherein the step of extracting the set of video frames to be processed of the target pedestrian comprises:
acquiring a trajectory of the target pedestrian in the second video stream by adopting a pedestrian detection and tracking algorithm, wherein the trajectory comprises a plurality of video frames containing the target pedestrian, each video frame containing the target pedestrian comprises a detection frame of the target pedestrian, and the detection frame is a box that frames the circumscribed region of the target pedestrian;
dividing the multi-frame video frames in the track of the target pedestrian into at least one type according to the shooting angle of the target pedestrian;
and selecting a specified number of video frames with the maximum area and the detection frames of the target pedestrians which are not blocked in each type of video frames as the video frames to be processed and adding the video frames to be processed into the video frame set to be processed.
12. A pedestrian re-identification apparatus, the apparatus comprising:
the device comprises a video processing module, a pedestrian recognition module and a pedestrian recognition module, wherein the video processing module is used for extracting a to-be-processed video frame set of at least one to-be-recognized pedestrian from a first video stream, the to-be-processed video frame set comprises a plurality of to-be-processed video frames, each to-be-processed video frame comprises the to-be-recognized pedestrian, and the to-be-processed video frames are divided into at least one type according to the shooting angle of the to-be-recognized pedestrian;
the feature extraction module is used for inputting the to-be-processed video frame set of each pedestrian to be identified into a convolutional neural network to obtain a feature matrix of each pedestrian to be identified, wherein the feature matrix of the pedestrian to be identified comprises a plurality of features of different areas of the pedestrian to be identified at different shooting angles, which are output by the convolutional neural network;
a calculating module, configured to calculate a matching value of each pedestrian to be identified and a target pedestrian according to a feature matrix of the target pedestrian and at least one feature matrix of the pedestrian to be identified, where the feature matrix of the target pedestrian is obtained by inputting a set of video frames to be processed of the target pedestrian into the convolutional neural network, the set of video frames to be processed of the target pedestrian is extracted from a second video stream, and the first video stream and the second video stream are from different cameras;
and the re-identification module is used for re-identifying the target pedestrian according to the matching value of the pedestrian to be identified and the target pedestrian.
13. A pedestrian re-identification apparatus, comprising a processor and a memory coupled to the processor,
the memory stores program instructions for implementing a pedestrian re-identification method according to any one of claims 1 to 11;
the processor is configured to execute the program instructions stored by the memory for target pedestrian re-identification.
14. A storage medium having stored therein program instructions capable of implementing the pedestrian re-identification method according to any one of claims 1 to 11.
CN201910960572.7A 2019-10-10 2019-10-10 Pedestrian re-identification method and device and storage medium Pending CN110765903A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910960572.7A CN110765903A (en) 2019-10-10 2019-10-10 Pedestrian re-identification method and device and storage medium


Publications (1)

Publication Number Publication Date
CN110765903A true CN110765903A (en) 2020-02-07

Family

ID=69331565

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910960572.7A Pending CN110765903A (en) 2019-10-10 2019-10-10 Pedestrian re-identification method and device and storage medium

Country Status (1)

Country Link
CN (1) CN110765903A (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105631408A (en) * 2015-12-21 2016-06-01 小米科技有限责任公司 Video-based face album processing method and processing device
CN107578063A (en) * 2017-08-21 2018-01-12 西安电子科技大学 Image Spectral Clustering based on fast selecting landmark point
CN109522843A (en) * 2018-11-16 2019-03-26 北京市商汤科技开发有限公司 A kind of multi-object tracking method and device, equipment and storage medium
CN109543602A (en) * 2018-11-21 2019-03-29 太原理工大学 A kind of recognition methods again of the pedestrian based on multi-view image feature decomposition
CN109784130A (en) * 2017-11-15 2019-05-21 株式会社日立制作所 Pedestrian recognition methods and its device and equipment again
CN109871762A (en) * 2019-01-16 2019-06-11 平安科技(深圳)有限公司 A kind of evaluation method and device of human face recognition model


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JONATHAN DURAND ESPINOZA ET AL.: "An enhanced triplet CNN based on body parts for person re-identificacion", 《2017 36TH INTERNATIONAL CONFERENCE OF THE CHILEAN COMPUTER SCIENCE SOCIETY (SCCC)》 *
胡正平 等: "智能视频监控系统中行人再识别技术研究综述", 《燕山大学学报》 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111325135A (en) * 2020-02-17 2020-06-23 天津中科智能识别产业技术研究院有限公司 Novel online real-time pedestrian tracking method based on deep learning feature template matching
CN111325135B (en) * 2020-02-17 2022-11-29 天津中科智能识别产业技术研究院有限公司 Novel online real-time pedestrian tracking method based on deep learning feature template matching
CN111462200A (en) * 2020-04-03 2020-07-28 中国科学院深圳先进技术研究院 Cross-video pedestrian positioning and tracking method, system and equipment
CN111462200B (en) * 2020-04-03 2023-09-19 中国科学院深圳先进技术研究院 Cross-video pedestrian positioning and tracking method, system and equipment
CN111582107A (en) * 2020-04-28 2020-08-25 浙江大华技术股份有限公司 Training method and recognition method of target re-recognition model, electronic equipment and device
CN111582107B (en) * 2020-04-28 2023-09-29 浙江大华技术股份有限公司 Training method and recognition method of target re-recognition model, electronic equipment and device
CN111667001A (en) * 2020-06-05 2020-09-15 平安科技(深圳)有限公司 Target re-identification method and device, computer equipment and storage medium
WO2021114612A1 (en) * 2020-06-05 2021-06-17 平安科技(深圳)有限公司 Target re-identification method and apparatus, computer device, and storage medium
CN111667001B (en) * 2020-06-05 2023-08-04 平安科技(深圳)有限公司 Target re-identification method, device, computer equipment and storage medium
CN111898435A (en) * 2020-06-29 2020-11-06 北京大学 Pedestrian identification method and device based on video, storage medium and terminal

Similar Documents

Publication Publication Date Title
CN109255352B (en) Target detection method, device and system
CN110765903A (en) Pedestrian re-identification method and device and storage medium
US9619708B2 (en) Method of detecting a main subject in an image
CN108388879B (en) Target detection method, device and storage medium
US9652694B2 (en) Object detection method, object detection device, and image pickup device
US9373034B2 (en) Apparatus and method for tracking object
CN109492577B (en) Gesture recognition method and device and electronic equipment
EP3101594A1 (en) Saliency information acquisition device and saliency information acquisition method
CN109635686B (en) Two-stage pedestrian searching method combining human face and appearance
WO2016183766A1 (en) Method and apparatus for generating predictive models
CN109426785B (en) Human body target identity recognition method and device
CN110555439A (en) identification recognition method, training method and device of model thereof and electronic system
WO2022217876A1 (en) Instance segmentation method and apparatus, and electronic device and storage medium
CN111598067B (en) Re-recognition training method, re-recognition method and storage device in video
CN113496208B (en) Video scene classification method and device, storage medium and terminal
WO2013075295A1 (en) Clothing identification method and system for low-resolution video
CN110610123A (en) Multi-target vehicle detection method and device, electronic equipment and storage medium
CN112417970A (en) Target object identification method, device and electronic system
CN111814690A (en) Target re-identification method and device and computer readable storage medium
CN111222508B (en) ROI-based house type graph scale identification method and device and computer equipment
CN110472561B (en) Football goal type identification method, device, system and storage medium
CN113920585A (en) Behavior recognition method and device, equipment and storage medium
CN111160107A (en) Dynamic region detection method based on feature matching
CN111126102A (en) Personnel searching method and device and image processing equipment
CN112257628A (en) Method, device and equipment for identifying identities of outdoor competition athletes

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200207