CN115620098A - Evaluation method and system of cross-camera pedestrian tracking algorithm and electronic equipment - Google Patents


Info

Publication number
CN115620098A
CN115620098A
Authority
CN
China
Prior art keywords
frame
pedestrian
matching
target
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211636207.9A
Other languages
Chinese (zh)
Other versions
CN115620098B (en)
Inventor
尚果超 (Shang Guochao)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Digital City Technology Co ltd
Original Assignee
China Telecom Digital City Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Telecom Digital City Technology Co., Ltd.
Priority to CN202211636207.9A
Publication of CN115620098A
Application granted
Publication of CN115620098B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/776: Validation; Performance evaluation
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/40: Scenes; Scene-specific elements in video content
    • G06V20/50: Context or environment of the image
    • G06V20/52: Surveillance or monitoring of activities, e.g. for recognising suspicious objects

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides an evaluation method and system for a cross-camera pedestrian tracking algorithm, and an electronic device, relating to the technical field of artificial intelligence. The method comprises the following steps: first, generating pedestrian prediction results with a cross-camera pedestrian tracking algorithm model based on a test data set, where the test data set comprises label frames and each pedestrian prediction result comprises predicted frames; then dividing the monitoring video according to nodes of a time axis and determining the number of prediction matching nodes and the number of label matching nodes; performing frame matching between the label frames and the predicted frames to determine target matching nodes; calculating the recall and precision of each pedestrian prediction result based on the number of prediction matching nodes, the number of label matching nodes, and the number of target matching nodes; and finally calculating an evaluation score of the cross-camera pedestrian tracking algorithm model based on the recall and precision. This addresses the technical problems of low prediction efficiency and low precision in existing cross-camera pedestrian tracking and achieves the technical effect of improving prediction accuracy.

Description

Evaluation method and system of cross-camera pedestrian tracking algorithm and electronic equipment
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a cross-camera pedestrian tracking algorithm evaluation method and system and electronic equipment.
Background
In a real cross-environment tracking scene, factors such as camera resolution, recording distance, illumination conditions, pedestrian occlusion, and changes in clothing color prevent an algorithm from predicting and tracking pedestrians completely and accurately in every frame. However, if the algorithm can identify, at predicted key nodes, the video frames in which a pedestrian appears, the pedestrian's motion trajectory can still be judged successfully. Most existing techniques rely on frame-by-frame comparison to evaluate the percentage of correctly predicted frames, which is time-consuming and unreliable. In other words, existing cross-camera pedestrian tracking technology suffers from low prediction efficiency and low precision.
Disclosure of Invention
The invention aims to provide a method and a system for evaluating a cross-camera pedestrian tracking algorithm and electronic equipment, so as to solve the technical problems of low prediction efficiency and low precision of the existing cross-camera pedestrian tracking algorithm.
In a first aspect, an embodiment of the present invention provides an evaluation method for a cross-camera pedestrian tracking algorithm, where the method includes: generating at least one pedestrian prediction result by using a cross-camera pedestrian tracking algorithm model based on a pre-acquired test data set; the pre-acquired test data set comprises cross-camera monitoring video clips and corresponding label frames; the pedestrian prediction result comprises a prediction frame;
dividing the monitoring video according to the nodes of a time axis, and determining the number of prediction matching nodes contained in the prediction frame and the number of label matching nodes contained in the label frame;
performing frame matching on the label frame and the predicted frame to determine a target matching node;
calculating the recall rate and the accuracy rate of each pedestrian prediction result based on the number of the prediction matching nodes, the number of the label matching nodes and the number of the target matching nodes;
and calculating the evaluation score of the cross-camera pedestrian tracking algorithm model based on the recall rate and the accuracy rate.
In some possible embodiments, the method further comprises: acquiring a target monitoring video clip containing a target pedestrian picture and a corresponding label frame according to target pedestrian videos of a plurality of monitoring cameras across scenes; the parameters of the tag frame comprise a target camera number, a target frame number, a target pedestrian ID number and target rectangular frame position information; generating target pedestrian library information based on the target monitoring video clips and the label frames; and generating a test data set according to a plurality of pieces of target pedestrian library information.
In some possible embodiments, the step of performing frame matching on the tag frame and the predicted frame to determine a target matching node includes: matching parameters contained in the prediction frame and the label frame respectively; wherein the parameters of the predicted frame include: predicting the number of a camera, the number of a frame, the ID number of a pedestrian and the position information of a rectangular frame;
if the parameters of the predicted frame and the parameters of the label frame both meet the matching conditions, the predicted frame and the label frame are correctly matched; wherein the matching conditions include: the predicted camera number matches the target camera number, the predicted frame number matches the target frame number, the predicted pedestrian ID number matches the target pedestrian ID number, and the predicted rectangular frame position information matches the target rectangular frame position information.
In some possible embodiments, the condition that the predicted rectangular frame position information matches the target rectangular frame position information is: the intersection-over-union (IoU) of the predicted rectangular frame and the target rectangular frame is larger than a set threshold; the IoU of the predicted rectangular frame and the target rectangular frame is the ratio of a first overlap region to a first union region; the first overlap region is the intersection region of the predicted rectangular frame and the target rectangular frame; the first union region is the union region of the predicted rectangular frame and the target rectangular frame.
In some possible embodiments, the step of performing frame matching on the tag frame and the predicted frame to determine a target matching node further includes: and if the number of the correct matches between the predicted frame and the label frame in the current node is not less than half of the number of the label frames, determining that the current node is a target matching node.
In some possible embodiments, the step of dividing the monitoring video according to the nodes of a time axis and determining the number of prediction matching nodes contained in the prediction frame and the number of label matching nodes contained in the label frame includes: dividing the monitoring video according to the nodes of a time axis to generate a plurality of time nodes; and determining the number of prediction matching nodes and the number of label matching nodes from the relationships of the label-frame track and the predicted-frame track to the time nodes. The relationship between the track of the label frame and a time node includes: if the track of the label frame of a pedestrian passes through a camera within the current time axis, the track of the label frame contains the time node, and the time node is a label matching node. The relationship between the track of the predicted frame and a time node includes: if the track of the predicted frame of the pedestrian falls within a certain time node, the track of the predicted frame contains the time node, and the time node is a prediction matching node.
In some possible embodiments, the method further comprises: calculating the total recall rate and the total accuracy rate of all the pedestrian prediction results in the test data set based on the recall rate and the accuracy rate of each pedestrian prediction result;
the step of calculating the evaluation score of the cross-camera pedestrian tracking algorithm model based on the recall rate and the accuracy rate comprises the following steps: and carrying out weighted average on the total recall rate and the total accuracy rate, and determining the evaluation score of the cross-camera pedestrian tracking algorithm model.
In a second aspect, an embodiment of the present invention provides an evaluation system for a cross-camera pedestrian tracking algorithm, where the system includes:
the pedestrian prediction result generation module is used for generating at least one pedestrian prediction result by utilizing a cross-camera pedestrian tracking algorithm model based on a pre-acquired test data set; the pre-acquired test data set comprises cross-camera monitoring video clips and corresponding label frames; the pedestrian prediction result comprises a prediction frame;
the node number determining module is used for dividing the monitoring video according to nodes of a time axis and determining the number of prediction matching nodes contained in the prediction frame and the number of label matching nodes contained in the label frame;
a target matching node generation module, configured to perform frame matching on the tag frame and the predicted frame, and determine a target matching node;
a calculating module, configured to calculate a recall rate and an accuracy rate of each pedestrian prediction result based on the number of prediction matching nodes, the number of tag matching nodes, and the number of target matching nodes;
the calculation module is further configured to calculate an evaluation score of the cross-camera pedestrian tracking algorithm model based on the recall rate and the accuracy rate.
In a third aspect, an embodiment of the present invention provides an electronic device, including a memory and a processor, where the memory stores a computer program operable on the processor, and the processor implements the steps of the method according to any one of the first aspect when executing the computer program.
The invention provides an evaluation method and system for a cross-camera pedestrian tracking algorithm, and an electronic device. The method comprises: first, generating pedestrian prediction results with a cross-camera pedestrian tracking algorithm model based on a pre-acquired test data set, where the test data set comprises cross-camera monitoring video clips and corresponding label frames, and each pedestrian prediction result comprises predicted frames; then dividing the monitoring video according to nodes of a time axis and determining the number of prediction matching nodes and the number of label matching nodes; performing frame matching on the label frames and the predicted frames to determine target matching nodes; calculating the recall and precision of each pedestrian prediction result based on the number of prediction matching nodes, the number of label matching nodes and the number of target matching nodes; and finally calculating an evaluation score of the cross-camera pedestrian tracking algorithm model based on the recall and precision. The method realizes evaluation of existing cross-camera pedestrian tracking algorithms, alleviates the technical problems of low prediction efficiency and low precision of existing models, and achieves the technical effect of improving prediction accuracy.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the embodiments or the prior art descriptions will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
Fig. 1 is a schematic flowchart of an evaluation method of a cross-camera pedestrian tracking algorithm according to an embodiment of the present invention;
fig. 2 is a schematic flowchart of another cross-camera pedestrian tracking algorithm evaluation method according to an embodiment of the present invention;
fig. 3 is a schematic diagram illustrating a corresponding relationship between a time axis and a matching node according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Some embodiments of the invention are described in detail below with reference to the accompanying drawings. The embodiments described below and the features of the embodiments can be combined with each other without conflict.
A traditional cross-camera pedestrian tracking system mainly comprises a pedestrian detection module, a pedestrian target tracking module and a pedestrian re-identification module. The pedestrian detection module usually adopts a lightweight network (RFBNet) to perform multi-scale, multi-aspect-ratio feature fusion. The pedestrian target tracking module generally adopts the SORT tracking algorithm, which takes the output of the pedestrian detector as a key component, propagates the target state into future frames, associates current detections with existing targets, and manages the life cycles of tracked targets. The pedestrian re-identification module usually adopts a ReID algorithm to calculate the feature distance between pedestrian library images and detected-target images in order to judge whether they show the same pedestrian.
The task of cross-camera pedestrian tracking is as follows: (1) a pedestrian library is given, i.e., the specified target pedestrian information to be identified; (2) the algorithm model is required to find the pedestrians in the pedestrian library, locate their positions, and track them across videos of different scenes; (3) if a pedestrian disappears from the current scene, the algorithm model is required to search the videos of the other scenes to continue tracking; (4) if a pedestrian target appears in several videos, or appears several times in a single video, the algorithm model is required to search iteratively to continue tracking; (5) the tracking result for each pedestrian in the pedestrian library (camera number, frame number, pedestrian ID number, rectangular-frame position information) is predicted.
The principle of the cross-camera pedestrian tracking algorithm is as follows: (1) detect all pedestrians in a video frame with a pedestrian detection network to obtain pedestrian detection frames; (2) perform person re-identification on the detected pedestrians and extract the target pedestrian in the video; (3) track the target pedestrian with a pedestrian tracking algorithm, and if tracking is lost (the trackID is lost), re-locate the target with the pedestrian re-identification algorithm (ReID); (4) calculate the distance between the center coordinates of the target pedestrian's detection frames across three consecutive frames, and if the distance is greater than a set threshold, consider the tracking erroneous and re-locate the target with ReID; (5) if tracking succeeds, continue tracking with the SORT algorithm.
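As an illustration of step (4), the following is a minimal sketch of the center-distance consistency check; the helper names and the threshold value are illustrative assumptions, not taken from the patent:

```python
import math

def box_center(box):
    """Center (cx, cy) of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

def tracking_is_consistent(prev_box, curr_box, max_center_dist=80.0):
    """Return False when the detection-frame center jumps farther than
    max_center_dist between frames, which per step (4) signals a
    tracking error and should trigger ReID-based re-localization."""
    px, py = box_center(prev_box)
    cx, cy = box_center(curr_box)
    return math.hypot(cx - px, cy - py) <= max_center_dist
```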
In a real cross-environment tracking scene, factors such as camera resolution, recording distance, illumination conditions, pedestrian occlusion, and changes in clothing color prevent an algorithm from predicting and tracking pedestrians accurately in every frame. However, if the algorithm can identify, at predicted key nodes, the video frames in which a pedestrian appears, the pedestrian's motion trajectory can still be judged successfully. Most prior art relies on frame-by-frame comparison to evaluate the percentage of correctly predicted frames, which is time-consuming and unreliable. In other words, existing cross-camera pedestrian tracking technology suffers from low prediction efficiency and low precision, and a comprehensive evaluation system is lacking that could effectively assess the precision of such algorithms.
Based on this, the embodiment of the invention provides an evaluation method and system for a cross-camera pedestrian tracking algorithm and electronic equipment, so as to realize comprehensive evaluation of the cross-camera pedestrian tracking algorithm, and further alleviate the technical problems of low prediction efficiency and low precision of the existing cross-camera pedestrian tracking algorithm.
To facilitate understanding of the present embodiment, the evaluation method for a cross-camera pedestrian tracking algorithm disclosed in the embodiment of the present invention is first described in detail. Referring to the flowchart of the evaluation method shown in fig. 1, the method may be executed by an electronic device and mainly includes the following steps:
s110: generating at least one pedestrian prediction result by using a cross-camera pedestrian tracking algorithm model based on a pre-acquired test data set; the method comprises the steps that a pre-acquired test data set comprises cross-camera monitoring video clips and corresponding label frames; the pedestrian prediction result comprises a prediction frame;
s120: dividing the monitoring video according to the nodes of a time axis, and determining the number of prediction matching nodes contained in a prediction frame and the number of label matching nodes contained in a label frame;
in this embodiment, a target surveillance video clip including a target pedestrian picture and a corresponding tag frame may be obtained according to a target pedestrian video of a plurality of surveillance cameras across a scene; then generating target pedestrian library information based on the target monitoring video clips and the label frames; and generating a test data set according to the information of the target pedestrian banks.
The parameters of the tag frame comprise a target camera number, a target frame number, a target pedestrian ID number and target rectangular frame position information; parameters of the predicted frame include: the predicted camera number, the predicted frame number, the predicted pedestrian ID number and the predicted rectangular frame position information.
S130: performing frame matching on the tag frame and the predicted frame, and determining a target matching node;
in this embodiment, the step S130 of performing frame matching on the tag frame and the predicted frame, and determining the target matching node may include:
(1) Respectively matching parameters contained in the predicted frame and the label frame;
(2) If the parameters of the predicted frame and the parameters of the label frame both meet the matching conditions, the predicted frame and the label frame are correctly matched;
the matching conditions specifically include: the predicted camera number matches with the target camera number, the predicted frame number matches with the target frame number, the predicted pedestrian ID number matches with the target pedestrian ID number, and the predicted rectangular frame position information matches with the target rectangular frame position information.
In this embodiment, the condition that the predicted rectangular frame position information matches the target rectangular frame position information may be: the intersection-over-union (IoU) of the predicted rectangular frame and the target rectangular frame is larger than a set threshold.
The IoU of the predicted rectangular frame and the target rectangular frame is the ratio of the first overlap region to the first union region, where the first overlap region is the intersection of the two frames and the first union region is their union.
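As a minimal sketch of this matching condition, the following assumes prediction and label records stored as plain dicts with keys 'camera', 'frame', 'pid' and 'box' (an assumed layout, not mandated by the patent), and uses the 0.7 threshold mentioned later as an example value:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)  # first overlap region
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter                    # first union region
    return inter / union if union > 0 else 0.0

def frames_match(pred, label, iou_threshold=0.7):
    """A predicted frame correctly matches a label frame when camera
    number, frame number and pedestrian ID number agree and the two
    rectangular frames overlap with IoU above the set threshold."""
    return (pred['camera'] == label['camera']
            and pred['frame'] == label['frame']
            and pred['pid'] == label['pid']
            and iou(pred['box'], label['box']) > iou_threshold)
```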
In one embodiment, the step of performing frame matching on the tag frame and the predicted frame to determine the target matching node further includes: and if the number of the correct matches between the predicted frame and the label frame in the current node is not less than half of the number of the label frames, determining that the current node is a target matching node.
S140: calculating the recall rate and the accuracy rate of each pedestrian prediction result based on the number of the prediction matching nodes, the number of the label matching nodes and the number of the target matching nodes;
s150: and calculating the evaluation score of the cross-camera pedestrian tracking algorithm model based on the recall rate and the accuracy rate.
The invention provides a cross-camera pedestrian tracking algorithm evaluation method comprising the following steps: first, generating pedestrian prediction results with a cross-camera pedestrian tracking algorithm model based on a pre-acquired test data set, where the test data set comprises cross-camera monitoring video clips and corresponding label frames, and each pedestrian prediction result comprises predicted frames; then dividing the monitoring video according to nodes of a time axis and determining the number of prediction matching nodes and the number of label matching nodes; performing frame matching on the label frames and the predicted frames to determine target matching nodes; calculating the recall and precision of each pedestrian prediction result based on the number of prediction matching nodes, the number of label matching nodes and the number of target matching nodes; and finally calculating an evaluation score of the cross-camera pedestrian tracking algorithm model based on the recall and precision. The method realizes evaluation of existing cross-camera pedestrian tracking algorithms, alleviates the technical problems of low prediction efficiency and low precision of existing models, and achieves the technical effect of improving prediction accuracy.
In an embodiment, dividing the monitoring video according to nodes of a time axis in step S120 and determining the number of prediction matching nodes contained in the prediction frames and the number of label matching nodes contained in the label frames may include: first dividing the monitoring video according to nodes of a time axis to generate a plurality of time nodes; and then determining the number of prediction matching nodes and the number of label matching nodes from the relationships of the label-frame track and the predicted-frame track to the time nodes.
The relationship between the track of the label frame and the time node comprises the following steps: if the track of the tag frame of the pedestrian passes through one camera in the current time axis, the track of the tag frame comprises a time node, and the time node is a tag matching node; the relationship of the trajectory of the predicted frame to the temporal node includes: if the track of the predicted frame of the pedestrian falls within a certain time node, the track of the predicted frame comprises the time node, and the time node is a prediction matching node.
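A minimal sketch of this node division and counting follows; the node interval and half-width are illustrative assumptions (the patent fixes only the idea of fixed-interval nodes with a range such as 50 frames before and after), and each track is assumed to be given as a set of frame numbers within one camera:

```python
def node_ranges(total_frames, node_interval=100, node_halfwidth=50):
    """Place nodes at fixed frame intervals along the time axis; each
    node covers [center - halfwidth, center + halfwidth]."""
    centers = range(node_interval // 2, total_frames, node_interval)
    return [(c - node_halfwidth, c + node_halfwidth) for c in centers]

def count_matching_nodes(track_frames, nodes):
    """Count the nodes whose range contains at least one frame of the
    given track. Applied to a label-frame track this yields the label
    matching node count; applied to a predicted-frame track it yields
    the prediction matching node count."""
    return sum(
        1 for lo, hi in nodes
        if any(lo <= f <= hi for f in track_frames)
    )
```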
In an embodiment, after calculating the recall rate and the accuracy rate of the prediction result of each pedestrian in the step S140, the method may further include: and calculating the total recall rate and the total accuracy rate of all the pedestrian prediction results in the test data set based on the recall rate and the accuracy rate of each pedestrian prediction result.
As a specific example, the recall and precision of an individual pedestrian prediction result are calculated as follows (writing $G_{i,n}$, $P_{i,n}$ and $T_{i,n}$ for the node counts defined below):

$$\mathrm{Recall}_i = \frac{\sum_{n=1}^{N} T_{i,n}}{\sum_{n=1}^{N} G_{i,n}} \qquad \text{(Formula 1)}$$

$$\mathrm{Precision}_i = \frac{\sum_{n=1}^{N} T_{i,n}}{\sum_{n=1}^{N} P_{i,n}} \qquad \text{(Formula 2)}$$

where $i$ is the pedestrian number, $n$ is the camera number, $N$ is the total number of cameras, $G_{i,n}$ is the number of label matching nodes of the label-frame track of the $i$-th pedestrian in camera $n$, $P_{i,n}$ is the number of prediction matching nodes of the predicted-frame track of the $i$-th pedestrian in camera $n$, and $T_{i,n}$ is the number of target matching nodes of the $i$-th pedestrian in camera $n$.
The total recall and total precision over all pedestrian prediction results in the test data set are calculated as follows:

$$\mathrm{Recall} = \frac{1}{M}\sum_{i=1}^{M} \mathrm{Recall}_i \qquad \text{(Formula 3)}$$

$$\mathrm{Precision} = \frac{1}{M}\sum_{i=1}^{M} \mathrm{Precision}_i \qquad \text{(Formula 4)}$$

where $M$ is the number of pedestrians annotated in the pedestrian library of the test data set; for example, $M = 10$ means that 10 different pedestrians are annotated in the videos.
Correspondingly, the step S150 of calculating the evaluation score of the cross-camera pedestrian tracking algorithm model based on the recall ratio and the accuracy ratio may include: and carrying out weighted average on the total recall rate and the total accuracy rate, and determining the evaluation score of the cross-camera pedestrian tracking algorithm model.
As a specific example, the evaluation score is calculated as follows (written here in the standard F-score form of the weighted average of recall and precision, since the original formula image is not preserved):

$$\mathrm{Fscore} = \frac{2 \times \mathrm{Recall} \times \mathrm{Precision}}{\mathrm{Recall} + \mathrm{Precision}} \qquad \text{(Formula 5)}$$

Finally, algorithms are ranked for precision according to the principle that the larger the Fscore, the higher the model precision.
The evaluation method for a cross-camera pedestrian tracking algorithm provided by this embodiment evaluates the pedestrian detection, pedestrian re-identification and pedestrian tracking modules of an existing cross-camera pedestrian tracking algorithm jointly. It can effectively evaluate the accuracy of pedestrian detection, the accuracy of pedestrian re-identification and the effectiveness of tracking; it removes the algorithm's sensitivity to initial-frame information, avoids the unreliability of threshold-percentage evaluation, better fits practical application scenarios, and can effectively evaluate the precision of the algorithm.
As a specific example, an overall test and evaluation flow for a cross-camera pedestrian tracking algorithm mainly includes the following four steps: (1) prepare a diversified test set; (2) input the pedestrian library information into the cross-camera pedestrian tracking algorithm model and save the pedestrian results the model identifies and tracks; (3) divide the video into nodes at fixed intervals along the time axis, calculate the Recall and Precision values over all pedestrian libraries, and finally take a weighted average to obtain the Fscore; (4) rank the algorithms according to the principle that a larger Fscore means higher model precision.
Referring to fig. 2, the specific algorithm evaluation test procedure is as follows:
1. Constructing the test data set;
s1-1, preparing a test set, wherein the test data set comprises a monitoring video clip and a tag (ground-route) of a pedestrian bank. Pedestrian videos (indoor and outdoor recordings) of N monitoring cameras across scenes are manufactured, and the pedestrian videos comprise various factors such as body shielding, head shielding, backpacks, front (back and side) faces of people, dresses in different color styles, different fuzzy degrees and the like. And the pedestrian library is randomly intercepted from the monitoring video and then manually marked, wherein the pedestrian library comprises a camera serial number, a frame number, a pedestrian ID number and rectangular frame position information.
Table 1: example of Manual annotation
cameraID FrameIndex TargetId X1 Y1 X2 Y2
1.mp4 2 895 2 3509 0 .825 208.152 295.004 1046.361
2.mp4 5 00 23595 1 23.456 2 35.236 3 33.123 6 78.123
N.mp4 4 19 23598 2 34.689 6 78.123 1 23.123 4 56.123
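A minimal sketch of reading such annotation rows follows; the whitespace-separated single-line format and the field types are assumptions (the patent shows only the column order of Table 1):

```python
from dataclasses import dataclass

@dataclass
class LabelFrame:
    camera_id: str    # e.g. "1.mp4"
    frame_index: int
    target_id: int
    x1: float
    y1: float
    x2: float
    y2: float

def parse_annotation_line(line):
    """Parse one whitespace-separated row in the Table 1 column order:
    cameraID FrameIndex TargetId X1 Y1 X2 Y2."""
    cam, frame, tid, x1, y1, x2, y2 = line.split()
    return LabelFrame(cam, int(frame), int(tid),
                      float(x1), float(y1), float(x2), float(y2))

# Example: parse_annotation_line("1.mp4 2895 23509 0.825 208.152 295.004 1046.361")
```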
2. Obtaining a model prediction result by using the test set;
and S2-1, inputting the pedestrian library information into a cross-camera pedestrian tracking algorithm model, and storing pedestrian results (camera number, frame number, pedestrian ID number and rectangular frame position information) identified and tracked by the model.
3. Dividing video nodes and calculating Fscore;
and S3-1, dividing the video nodes, and counting the number of nodes contained in the predicted frame, the number of nodes contained in the label frame and the number of matched nodes. For all that
Figure F_221216173255895_895024009
Each of the cameras is arranged on a time axis
Figure F_221216173256006_006803010
And each matching node has a certain range (for example, 50 frames before and after) on the time axis.
Referring to fig. 3, in the above steps a node means: the video is divided into node segments at fixed frame intervals along the time axis. Frame matching means: a predicted frame is considered to match a label frame if the camera number, the frame number, the pedestrian ID number and the rectangular-frame position all match.
Rectangular-frame position matching works as follows: the matching metric is the intersection-over-union (IoU); if the IoU of the detection frame and the labeled frame is greater than a set threshold, the rectangular-frame positions are considered matched. The threshold has no fixed value and may, for example, be set above 0.7.
A label matching node (a matching node contained by the label frames): if the motion trajectory of a pedestrian passes through a camera within the time axis, the pedestrian's label-frame track is considered to contain the corresponding node, and that node is a label matching node. A prediction matching node (a matching node contained by the predicted frames): if the predicted-frame track lies within a certain node, it is considered to contain the corresponding node, and that node is a prediction matching node.
50% frame matching: for a specified tracked object, within the range of a matching node contained by the label frames, when the number of correctly predicted frames is greater than or equal to half the number of label frames, the case is considered 50% frame matching.
Target matching node: for a specified tracked object, within the range of a matching node contained by the label frames, when at least 50% of the submitted tracking track (i.e., the predicted-frame track) can be matched with the ground-truth track (i.e., the label-frame track), the tracked object is considered matched at this node during cross-camera tracking, and the node is a target matching node.
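A minimal sketch of this rule follows, reusing the frames_match predicate sketched earlier; the dict-based frame records are the same assumed layout:

```python
def is_target_matching_node(node_range, label_frames, pred_frames,
                            iou_threshold=0.7):
    """A node is a target matching node for a tracked object when, within
    the node's frame range, at least half of the object's label frames
    are correctly matched by predicted frames (the 50% frame-matching
    rule)."""
    lo, hi = node_range
    labels = [lf for lf in label_frames if lo <= lf['frame'] <= hi]
    if not labels:
        return False
    matched = sum(
        1 for lf in labels
        if any(frames_match(pf, lf, iou_threshold) for pf in pred_frames)
    )
    return matched >= len(labels) / 2.0
```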
S3-2, calculate the Recall and Precision values of each pedestrian's prediction result from the number of matching nodes contained in the predicted frames, the number of matching nodes contained in the label frames (ground truth), and the number of 50%-frame-matching (target matching) nodes. As can be seen from fig. 3, the predicted frames contain 5 matching nodes, the label frames contain 7 matching nodes, and 2 target matching nodes are matched. The evaluation result on this track is therefore: Recall = 2/7 and Precision = 2/5. The specific calculation formulas are as follows.
For the ith pedestrian target, the evaluation indexes are as follows:
$$\mathrm{Recall}_i = \frac{\sum_{n=1}^{N} T_{i,n}}{\sum_{n=1}^{N} G_{i,n}} \qquad \text{(Formula 1)}$$

$$\mathrm{Precision}_i = \frac{\sum_{n=1}^{N} T_{i,n}}{\sum_{n=1}^{N} P_{i,n}} \qquad \text{(Formula 2)}$$

where $N$ is the total number of cameras, $G_{i,n}$ is the number of matching nodes contained in the label frames (the ground-truth track) of the $i$-th specified tracked object in camera $n$, $P_{i,n}$ is the number of matching nodes contained in the predicted frames of the $i$-th specified tracked object in camera $n$, and $T_{i,n}$ is the number of 50%-frame-matching (target matching) nodes of the $i$-th specified tracked object in camera $n$.
S3-3, calculate the total Recall and Precision values over all pedestrian library prediction results:
$$\mathrm{Recall} = \frac{1}{M}\sum_{i=1}^{M} \mathrm{Recall}_i \qquad \text{(Formula 3)}$$

$$\mathrm{Precision} = \frac{1}{M}\sum_{i=1}^{M} \mathrm{Precision}_i \qquad \text{(Formula 4)}$$
and S3-4, calculating an Fscore value of a cross-camera pedestrian tracking algorithm. The Recall and Precision weighted average of all pedestrian pools resulted in the Fscore value.
$$\mathrm{Fscore} = \frac{2 \times \mathrm{Recall} \times \mathrm{Precision}}{\mathrm{Recall} + \mathrm{Precision}} \qquad \text{(Formula 5)}$$
and S3-5, performing algorithm precision evaluation sequencing according to the principle that the larger the Fscore value is, the higher the model precision is.
The evaluation method for a cross-camera pedestrian tracking algorithm provided by the embodiment of the application can evaluate algorithm precision more effectively and efficiently, and is better suited to practical application scenarios. The method introduces the concepts of frame matching, detection-position matching, node matching, label matching nodes, prediction matching nodes, 50% frame matching, and target matching nodes; it calculates the Recall and Precision of all specified tracked pedestrians over the video time-axis nodes, takes their weighted average to obtain the Fscore, and judges the quality of the algorithm by the magnitude of the Fscore.
This realizes a comprehensive evaluation combining the pedestrian detection, pedestrian re-identification and pedestrian tracking modules of existing cross-camera pedestrian tracking algorithms; it can effectively evaluate the accuracy of pedestrian detection, the accuracy of pedestrian re-identification and the effectiveness of tracking, removes the algorithm's sensitivity to initial-frame information, and avoids the unreliability of threshold-percentage evaluation.
In addition, the embodiment of the invention also provides an evaluation system of a cross-camera pedestrian tracking algorithm, which comprises the following steps:
the pedestrian prediction result generation module is used for generating at least one pedestrian prediction result by utilizing a cross-camera pedestrian tracking algorithm model based on a pre-acquired test data set; the pre-acquired test data set comprises cross-camera monitoring video clips and corresponding label frames; the pedestrian prediction result comprises a prediction frame;
the node number determining module is used for dividing the monitoring video according to the nodes of the time axis and determining the number of prediction matching nodes contained in the prediction frame and the number of label matching nodes contained in the label frame;
the target matching node generation module is used for carrying out frame matching on the label frame and the prediction frame to determine a target matching node;
the calculation module is used for calculating the recall rate and the accuracy rate of each pedestrian prediction result based on the number of the prediction matching nodes, the number of the label matching nodes and the number of the target matching nodes;
and the calculation module is also used for calculating the evaluation score of the cross-camera pedestrian tracking algorithm model based on the recall rate and the accuracy rate.
It should be noted that the evaluation system of the cross-camera pedestrian tracking algorithm provided by the embodiment of the invention can be deployed on embedded terminal equipment or an AI cloud server and implemented as a terminal evaluation product or an AI platform evaluation system.
The evaluation system for a cross-camera pedestrian tracking algorithm provided by the embodiment of the application may be specific hardware on a device, or software or firmware installed on a device. The system has the same implementation principle and technical effect as the foregoing method embodiments; for brevity, where the system embodiments omit details, reference may be made to the corresponding contents of the method embodiments. Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the system, apparatus and units described above may all refer to the corresponding processes in the method embodiments, and are not repeated here. The evaluation system has the same technical features as the evaluation method provided by the foregoing embodiment, so it can solve the same technical problems and achieve the same technical effects.
The embodiment of the application further provides an electronic device, and specifically, the electronic device comprises a processor and a storage device; the storage means has stored thereon a computer program which, when executed by the processor, performs the method of any of the above described embodiments.
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application, where the electronic device 400 includes: a processor 40, a memory 41, a bus 42 and a communication interface 43, wherein the processor 40, the communication interface 43 and the memory 41 are connected through the bus 42; the processor 40 is arranged to execute executable modules, such as computer programs, stored in the memory 41.
The Memory 41 may include a high-speed Random Access Memory (RAM) and may also include a non-volatile memory, such as at least one disk memory. The communication connection between the network element of the system and at least one other network element is realized through at least one communication interface 43 (wired or wireless), which may use the Internet, a wide area network, a local area network, a metropolitan area network, etc.
The bus 42 may be an ISA bus, PCI bus, EISA bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one double-headed arrow is shown in FIG. 4, but that does not indicate only one bus or one type of bus.
The memory 41 is used for storing a program, and the processor 40 executes the program after receiving an execution instruction, and the method performed by the apparatus defined by the flow program disclosed in any of the foregoing embodiments of the present invention may be applied to the processor 40, or implemented by the processor 40.
The processor 40 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or by instructions in the form of software in the processor 40. The processor 40 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The various methods, steps, and logic blocks disclosed in the embodiments of the present invention may be implemented or performed. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present invention may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software module may be located in RAM, flash memory, ROM, PROM, EPROM, a register, or another storage medium well known in the art. The storage medium is located in the memory 41, and the processor 40 reads the information in the memory 41 and completes the steps of the method in combination with its hardware.
Corresponding to the method, the embodiment of the application also provides a computer readable storage medium, wherein the computer readable storage medium stores machine executable instructions, and when the computer executable instructions are called and executed by a processor, the computer executable instructions cause the processor to execute the steps of the method.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments provided in the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as separate products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, an electronic device, or a network device) to perform all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
It should be noted that: like reference numbers and letters indicate like items in the figures, and thus once an item is defined in a figure, it need not be further defined or explained in subsequent figures, and moreover, the terms "first," "second," "third," etc. are used merely to distinguish one description from another and are not to be construed as indicating or implying relative importance.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and these modifications or substitutions do not depart from the spirit of the corresponding technical solutions of the embodiments of the present invention.

Claims (9)

1. A cross-camera pedestrian tracking algorithm evaluation method is characterized by comprising the following steps:
generating at least one pedestrian prediction result by using a cross-camera pedestrian tracking algorithm model based on a pre-acquired test data set; the pre-acquired test data set comprises cross-camera monitoring video clips and corresponding label frames; the pedestrian prediction result comprises a prediction frame;
dividing the monitoring video according to the nodes of a time axis, and determining the number of prediction matching nodes contained in the prediction frame and the number of label matching nodes contained in the label frame;
performing frame matching on the label frame and the predicted frame, and determining a target matching node;
calculating the recall rate and the accuracy rate of each pedestrian prediction result based on the number of the prediction matching nodes, the number of the label matching nodes and the number of target matching nodes;
calculating an evaluation score of the cross-camera pedestrian tracking algorithm model based on the recall rate and the accuracy rate.
2. The method for evaluating a cross-camera pedestrian tracking algorithm according to claim 1, further comprising:
acquiring a target monitoring video clip containing a target pedestrian picture and a corresponding label frame according to target pedestrian videos of a plurality of monitoring cameras across scenes;
the parameters of the tag frame comprise a target camera number, a target frame number, a target pedestrian ID number and target rectangular frame position information;
generating target pedestrian library information based on the target surveillance video clip and the tag frame;
and generating a test data set according to the information of the target pedestrian library.
3. The cross-camera pedestrian tracking algorithm evaluation method according to claim 2, wherein the step of performing frame matching on the tag frame and the predicted frame to determine a target matching node comprises:
respectively matching parameters contained in the predicted frame and the label frame; wherein the parameters of the predicted frame include: predicting the number of a camera, the number of a frame, the ID number of a pedestrian and the position information of a rectangular frame;
if the parameters of the predicted frame and the parameters of the label frame both meet the matching conditions, the predicted frame and the label frame are correctly matched; wherein the matching condition includes: the predicted camera number matches the target camera number, the predicted frame number matches the target frame number, the predicted pedestrian ID number matches the target pedestrian ID number, and the predicted rectangular frame position information matches the target rectangular frame position information.
4. The cross-camera pedestrian tracking algorithm evaluation method according to claim 3, wherein the condition that the predicted rectangular frame position information matches the target rectangular frame position information is: the intersection-over-union (IoU) of the predicted rectangular frame and the target rectangular frame is larger than a set threshold;
the IoU of the predicted rectangular frame and the target rectangular frame is the ratio of a first overlap region to a first union region; the first overlap region is the intersection region of the predicted rectangular frame and the target rectangular frame; the first union region is the union region of the predicted rectangular frame and the target rectangular frame.
5. The cross-camera pedestrian tracking algorithm evaluation method according to claim 3, wherein the step of performing frame matching on the tag frame and the predicted frame to determine a target matching node further comprises:
and if the number of the correct matches between the predicted frame and the label frame in the current node is not less than half of the number of the label frames, determining that the current node is a target matching node.
6. The cross-camera pedestrian tracking algorithm evaluation method according to claim 1, wherein the step of dividing the surveillance video according to the nodes of a time axis and determining the number of prediction matching nodes included in the prediction frame and the number of tag matching nodes included in the tag frame includes:
dividing the monitoring video according to the nodes of a time axis to generate a plurality of time nodes;
determining the number of prediction matching nodes and the number of label matching nodes according to the track of the label frame and the relation between the track of the prediction frame and the time node;
wherein the relationship between the track of the tag frame and the time node comprises: if the track of the tag frame of the pedestrian passes through one camera in the current time axis, the track of the tag frame comprises the time node, and the time node is a tag matching node;
the relationship of the trajectory of the predicted frame to the temporal node includes: if the track of the predicted frame of the pedestrian falls within a certain time node, the track of the predicted frame comprises the time node, and the time node is the predicted matching node.
7. The cross-camera pedestrian tracking algorithm evaluation method of claim 1, further comprising:
calculating the total recall rate and the total accuracy rate of all the pedestrian prediction results in the test data set based on the recall rate and the accuracy rate of each pedestrian prediction result;
calculating an evaluation score of the cross-camera pedestrian tracking algorithm model based on the recall rate and the accuracy rate, comprising:
and carrying out weighted average on the total recall rate and the total accuracy rate, and determining the evaluation score of the cross-camera pedestrian tracking algorithm model.
8. An evaluation system for a cross-camera pedestrian tracking algorithm, the system comprising:
the pedestrian prediction result generation module is used for generating at least one pedestrian prediction result by utilizing a cross-camera pedestrian tracking algorithm model based on a pre-acquired test data set; the pre-acquired test data set comprises cross-camera monitoring video clips and corresponding label frames; the pedestrian prediction result comprises a prediction frame;
the node number determining module is used for dividing the monitoring video according to the nodes of a time axis and determining the number of prediction matching nodes contained in the prediction frame and the number of label matching nodes contained in the label frame;
the target matching node generation module is used for carrying out frame matching on the label frame and the predicted frame and determining a target matching node;
the calculation module is used for calculating the recall rate and the accuracy rate of each pedestrian prediction result based on the prediction matching node number, the label matching node number and the target matching node number;
the calculation module is further used for calculating an evaluation score of the cross-camera pedestrian tracking algorithm model based on the recall rate and the accuracy rate.
9. An electronic device comprising a memory and a processor, wherein the memory stores a computer program operable on the processor, and wherein the processor implements the steps of the method of any of claims 1 to 7 when executing the computer program.
CN202211636207.9A 2022-12-20 2022-12-20 Evaluation method and system of cross-camera pedestrian tracking algorithm and electronic equipment Active CN115620098B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211636207.9A CN115620098B (en) 2022-12-20 2022-12-20 Evaluation method and system of cross-camera pedestrian tracking algorithm and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211636207.9A CN115620098B (en) 2022-12-20 2022-12-20 Evaluation method and system of cross-camera pedestrian tracking algorithm and electronic equipment

Publications (2)

Publication Number | Publication Date
CN115620098A | 2023-01-17
CN115620098B | 2023-03-10

Family

ID=84880864

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211636207.9A Active CN115620098B (en) 2022-12-20 2022-12-20 Evaluation method and system of cross-camera pedestrian tracking algorithm and electronic equipment

Country Status (1)

Country Link
CN (1) CN115620098B (en)


Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020164049A1 (en) * 1994-03-17 2002-11-07 Rhoads Geoffrey B. Emulsion products and imagery employing steganography
CN103971384A (en) * 2014-05-27 2014-08-06 苏州经贸职业技术学院 Node cooperation target tracking method of wireless video sensor
CN105930768A (en) * 2016-04-11 2016-09-07 武汉大学 Spatial-temporal constraint-based target re-identification method
CN107563313A (en) * 2017-08-18 2018-01-09 北京航空航天大学 Multiple target pedestrian detection and tracking based on deep learning
CN109146910A (en) * 2018-08-27 2019-01-04 公安部第研究所 A kind of video content analysis index Evaluation Method based on target positioning
CN111242972A (en) * 2019-12-23 2020-06-05 中国电子科技集团公司第十四研究所 Online cross-scale multi-fluid target matching and tracking method
CN113095263A (en) * 2021-04-21 2021-07-09 中国矿业大学 Method and device for training heavy identification model of pedestrian under shielding and method and device for heavy identification of pedestrian under shielding
CN114037886A (en) * 2021-11-04 2022-02-11 重庆紫光华山智安科技有限公司 Image recognition method and device, electronic equipment and readable storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117152197A (en) * 2023-10-30 2023-12-01 成都睿芯行科技有限公司 Method and system for determining tracking object and method and system for tracking
CN117152197B (en) * 2023-10-30 2024-01-23 成都睿芯行科技有限公司 Method and system for determining tracking object and method and system for tracking

Also Published As

Publication number Publication date
CN115620098B (en) 2023-03-10


Legal Events

Code | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant