CN114693746A - Intelligent monitoring system and method based on identity recognition and cross-camera target tracking - Google Patents


Info

Publication number
CN114693746A
CN114693746A (application CN202210335008.8A)
Authority
CN
China
Prior art keywords
camera
frame
pedestrian
track
tracking
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210335008.8A
Other languages
Chinese (zh)
Inventor
贺宇航
余文涛
韩洁
魏星
龚怡宏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Jiaotong University filed Critical Xian Jiaotong University
Priority to CN202210335008.8A
Publication of CN114693746A
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/292 Multi-camera tracking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person
    • G06T2207/30201 Face

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Closed-Circuit Television Systems (AREA)

Abstract

An intelligent monitoring system and method based on identity recognition and cross-camera target tracking. A data reading module acquires video stream data from multiple cameras in parallel. An identity registration module performs target detection and face recognition on the video acquired by each camera in the data reading module and binds face information to pedestrian detection frames using a center-point-distance greedy matching strategy, yielding initial tracks carrying target identity information. A single-camera pedestrian tracking module performs target detection under the multiple cameras in parallel and carries out data association based on appearance and motion information using the Hungarian algorithm, yielding the tracking track of each target under a single camera. A cross-camera pedestrian track matching module performs cross-camera track matching based on the appearance, motion and position features of the local tracks and generates a cross-camera global track for each target. The invention efficiently realizes real-time target identity recognition and cross-camera track generation.

Description

Intelligent monitoring system and method based on identity recognition and cross-camera target tracking
Technical Field
The invention belongs to the technical field of intelligent video monitoring, and particularly relates to an intelligent monitoring system and method based on identity recognition and cross-camera target tracking.
Background
Cross-camera multi-target tracking aims to realize long-duration, wide-area tracking of different targets using a monitoring network formed by multiple cameras, generating a complete tracking track for each target. Compared with single-camera tracking, cross-camera tracking can follow targets continuously over a much wider range; it is an important basic technology in fields such as video monitoring, behavior analysis and intelligent transportation, and brings significant practical value to the development of smart cities. For example, in a shopping mall, the consumption behavior of customers can be analyzed by tracking their shopping trajectories, enabling the optimization and upgrading of the consumption structure and yielding greater economic benefit.
Compared with single-camera multi-target tracking, cross-camera multi-target tracking mainly focuses on track matching between different cameras, and its difficulty lies mainly in three aspects: first, targets exhibit significant appearance differences under different cameras due to factors such as shooting angle, shooting distance and lighting conditions; second, blind areas exist between camera fields of view, so target motion information between different cameras is lost; third, because the total number of targets is unknown, the matching relationship between cross-camera tracks is uncertain.
At present, cross-camera target tracking methods are generally built on three core algorithms: target detection, data association under a single camera, and cross-camera track matching. For cross-camera track matching, camera-calibration and feature-matching methods are mainly used. As mentioned above, for tracking multiple targets in complex scenes, feature-matching methods face great challenges due to the influence of shooting angles, distances and other factors, while camera-calibration methods place high demands on camera placement, requiring overlapping regions between the fields of view of different cameras. Given the precision, robustness and real-time requirements of tracking systems in practical applications, current algorithms fall short, and further optimization is needed in both tracking accuracy and resource overhead.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides an intelligent monitoring system and method based on identity recognition and cross-camera target tracking, which can track targets accurately while using computing resources as efficiently as possible.
The invention is realized by adopting the following technical scheme:
the intelligent monitoring method based on identity recognition and cross-camera target tracking comprises the following steps:
1) the data reading module acquires video stream data of a plurality of paths of cameras in parallel;
2) the identity registration module divides the video stream acquired by each camera in the data reading module according to frames, performs target detection and face recognition on each frame of image, and binds face information with a pedestrian detection frame by adopting a central point distance greedy matching strategy to obtain an initial track with a real identity;
3) the single-camera pedestrian tracking module performs target detection and pedestrian feature extraction in parallel under multiple paths of cameras and performs data association based on appearance information and motion information by adopting a Hungarian algorithm to obtain a motion track of a target under a single camera;
4) the cross-camera pedestrian track matching module matches tracks and targets according to the appearance features, the motion features and the position features of the local tracks, and generates a global track under a cross-camera for each target.
A further improvement of the invention is that in step 1), the data reading module creates a multi-process read queue for each camera based on a multi-process strategy: sub-process A continuously reads each frame image from the video stream via the RTSP protocol and puts it into the multi-process read queue, and sub-process B continuously takes images out of the read queue, processes them, and sends them to the subsequent sub-process modules.
A further improvement of the invention is that in step 2), for the frame-by-frame images under each camera obtained by the data reading module, the identity registration module obtains the pedestrian detection frames, face detection frames and face recognition results in each image based on a target detection device and a face recognition device; after filtering out unreasonable matches by computing the intersection-over-union and position information of the pedestrian and face detection frames, greedy matching is performed based on the Euclidean distance between the upper midpoint of each pedestrian detection frame and the center point of each face detection frame, and the face recognition results are bound to the pedestrian detection frames, yielding initialized tracks with real identities.
The invention is further improved in that the target detection device obtains the pedestrian detection frames in the image using the YOLOv4 algorithm, where each pedestrian detection frame comprises parameters (x_p, y_p, w_p, h_p), with (x_p, y_p) the coordinates of the top-left point of the frame and w_p and h_p its width and height, respectively; the face recognition device obtains the face detection frames and face recognition results using the InsightFace algorithm, where each face detection frame comprises parameters (x_f, y_f, w_f, h_f) and r, with (x_f, y_f) the coordinates of the top-left point of the frame, w_f and h_f its width and height, and r the face recognition result. The greedy matching based on the Euclidean distance between the upper midpoint of the pedestrian detection frame and the center point of the face detection frame comprises the following steps:
S41: constructing a two-dimensional matrix D with M rows and N columns representing the position distances between the face detection frames and the pedestrian detection frames, where M is the number of pedestrian detection frames in the current frame and N is the number of face detection frames; every entry of the distance matrix D is initialized to infinity. Let d_i denote the i-th pedestrian detection frame and f_j the j-th face detection frame, where i ∈ [1, M] and j ∈ [1, N];
S42: calculating the intersection-over-union IOU_{i,j} of pedestrian detection frame d_i and face detection frame f_j, and filtering out unreasonable matching relationships by judging whether the face detection frame f_j is contained within the pedestrian detection frame d_i in the horizontal direction;
S43: adjusting the distance matrix D according to the degree of overlap between the face detection frames and the pedestrian detection frames:
D_{i,j} = ‖c^p_i − c^f_j‖_2 if IOU_{i,j} > 0, and D_{i,j} = ∞ otherwise,
where c^p_i = (x_p + w_p/2, y_p) is the upper midpoint of pedestrian detection frame d_i and c^f_j = (x_f + w_f/2, y_f + h_f/2) is the center point of face detection frame f_j;
S44: performing steps S42 and S43 for all pedestrian detection frames and face detection frames; based on the adjusted distance matrix D, using a greedy matching strategy to select, for each pedestrian detection frame, the face detection frame with the minimum distance as the matching result, and adding it to a matching list M_list;
S45: binding the pedestrian detection frames to the face recognition results according to the matching list M_list, completing track initialization with real identities.
The invention has the further improvement that in the step 3), the single-camera pedestrian tracking module adopts a multi-process strategy, so that multiple paths of cameras can be mutually independent and perform multi-target tracking under the single camera in parallel; distributing three sub-processes for each camera, respectively carrying out pedestrian target detection and pedestrian identity re-identification, and carrying out inter-frame data association based on appearance information and motion information by using a Hungarian algorithm, wherein the sub-processes carry out message transmission through a multi-process message queue:
s51: the pedestrian target detection subprocess continuously obtains frame-by-frame image information from the reading queue, and stores a frame-by-frame detection result and the image information into a detection queue after target detection is completed;
s52: continuously obtaining frame-by-frame detection results and image information from the detection queue by a pedestrian identity re-identification subprocess, and storing the frame-by-frame detection results and pedestrian features into a feature queue after feature extraction is completed;
S53: the data association sub-process continuously obtains frame-by-frame detection results and pedestrian features from the feature queue, passes the tracking result of the previous frame through a Kalman filter to obtain a prediction for the current frame, and uses the Hungarian algorithm to associate the tracks of the previous frame with the detections of the current frame, combining motion and appearance information;
s54: and completing the steps of S51, S52 and S53 under each camera to obtain the motion track of the pedestrian under a single camera, and transmitting the tracking result under each camera into a shared folder for a cross-camera pedestrian track matching module to use, wherein the tracking result under each camera comprises a camera number, a current frame number, a track number, identity information, a tracking frame position, pedestrian characteristics and human face characteristics.
The further improvement of the invention is that in the step 4), the specific implementation method is as follows:
s61: a multi-process strategy is adopted to obtain local tracks tracked under each camera in parallel;
S62: representing the appearance features of the local tracks under each camera using ReID features;
s63: projecting the local track under each camera to the same reference coordinate system, and representing the position characteristics of the local track by using the position in the same reference coordinate system;
s64: acquiring all targets needing to be tracked in the current scene according to an identity registration module;
s65: based on the appearance characteristics and the position characteristics, the local track is matched with the target by using a cascade matching strategy, and the global track of the pedestrian crossing the camera is obtained.
In a further improvement of the present invention, in step S62, the appearance of a local track is characterized as follows: let T^i_u denote the u-th track in the local track set T^i = {T^i_1, ..., T^i_{U_i}} under the i-th camera, and let w^{i,u}_t and h^{i,u}_t denote the width and height of the tracking frame of local track T^i_u at time t; the appearance of T^i_u is characterized by the pedestrian feature of its tracking frame at the time t* given by
t* = argmax_t ( w^{i,u}_t · h^{i,u}_t );
in step S63, the local track under each camera is projected into the reference coordinate system as follows: let (x^{i,u}_t, y^{i,u}_t) denote the position of the top-left vertex of the tracking frame of the u-th local track T^i_u under the i-th camera in the image at time t; the projected position (x̂^{i,u}_t, ŷ^{i,u}_t) of the tracking frame in the reference coordinate system is obtained from
λ (x̂^{i,u}_t, ŷ^{i,u}_t, 1)^T = H_i^{-1} (x^{i,u}_t, y^{i,u}_t, 1)^T,
wherein H_i is the mapping matrix between the i-th camera and the reference plane, calculated from the calibration parameters of the camera, i.e. H_i = R(K_i[R_i, T_i], [1,2,4]), where [·,·] is a matrix column-splicing function, R(·, [1,2,4]) denotes splicing columns 1, 2 and 4 of the input matrix into a new matrix, and K_i, R_i and T_i are the intrinsic matrix, extrinsic rotation matrix and extrinsic translation vector of the i-th camera, respectively;
in step S65, the cascade matching comprises the following steps:
(1) establishing a position similarity measurement matrix between each local track and each target based on the position features of step S63;
(2) matching the local tracks with the targets using a greedy matching strategy according to the position similarity measurement matrix;
(3) matching the tracks that were not successfully matched in step (2) again according to the appearance features obtained in step S62.
The intelligent monitoring system based on identity recognition and cross-camera target tracking includes:
the control unit is used for controlling the start and the stop of the whole intelligent monitoring system based on the identity recognition and the cross-camera target tracking and controlling the scheduling among different units in the tracking system;
the image pickup unit is used for acquiring relevant video stream data of pedestrians;
the computing unit is used for performing operations on the acquired video stream data such as image preprocessing, face recognition, target detection, feature extraction and target tracking;
and the display unit is used for displaying a real-time tracking result of the pedestrian under the single camera, a cross-camera tracking track of the pedestrian and relevant information extracted from the tracking track.
A further improvement of the invention is that the system further comprises:
a switch for communicatively coupling the control unit with the camera unit, the computing unit and the display unit, and for communicatively coupling the computing unit with the camera unit and the display unit.
A further improvement of the invention is that the camera unit comprises an RGB camera network and a thermal infrared camera network. The RGB camera network is used for acquiring RGB optical images of the current scene; it is arranged so that the fields of view of the multiple cameras cover the whole scene, the cameras are evenly distributed, and the fields of view of different cameras overlap. The thermal infrared camera network is used for acquiring thermal infrared images of the current scene at night and under insufficient illumination, compensating for the weakness of RGB cameras in low light; its arrangement is consistent with that of the RGB camera network;
the computing unit is realized by a plurality of GPU-equipped computing nodes deployed in a distributed manner via the switch;
the display unit includes various types of displays including a touch screen.
The invention has at least the following beneficial technical effects:
the invention provides an intelligent monitoring method based on identity recognition and cross-camera target tracking, which comprises the steps of segmenting a plurality of paths of video streams acquired in parallel according to frames, carrying out target detection and face recognition on each frame of image, binding face information with a pedestrian detection frame by adopting a central point distance greedy matching strategy, and obtaining an initial track with a real identity; then, performing target detection and pedestrian feature extraction in parallel under multiple paths of cameras, and performing data association based on appearance information and motion information by adopting a Hungarian algorithm to obtain a motion track of a target under a single camera; and finally, matching the tracks and the targets according to the appearance characteristics, the motion characteristics and the position characteristics of the local tracks, and generating a global track crossing the camera for each target. The invention utilizes the visual information and the motion information of the target to accurately track the pedestrian under a single camera and generate the global track of the pedestrian across the cameras under the condition of utilizing the computing resources as efficiently as possible.
The intelligent monitoring system based on identity recognition and cross-camera target tracking can identify tracked targets through the face recognition system, and long-term collection and analysis of personnel information enables statistics on the behavior habits and interests of targets. Meanwhile, with cross-camera target tracking, persons in the scene can be tracked seamlessly and without interruption throughout the whole course, which has important application value in unmanned supermarkets and large stores. On the cross-camera multi-target tracking dataset EPFL, the precision of the monitoring system greatly exceeds that of existing methods.
drawings
Fig. 1 is a schematic structural diagram of an intelligent monitoring system based on identification and cross-camera target tracking according to an embodiment of the present invention.
Fig. 2 is a schematic flowchart of an intelligent monitoring method based on identification and cross-camera target tracking according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of a physical architecture of an intelligent monitoring system based on identification and cross-camera target tracking according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described in detail below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a schematic structural diagram of an intelligent monitoring system based on identity recognition and cross-camera target tracking in an embodiment of the present invention; the system includes: a data reading module, an identity registration module, a single-camera pedestrian tracking module and a cross-camera pedestrian track matching module.
1) The data reading module is specifically implemented as follows:
based on a multi-process strategy, a multi-process reading queue is established for each path of camera, a subprocess A continuously reads each frame image from a video stream through an rtsp protocol and puts the image into the multi-process reading queue, and a subprocess B continuously takes out the image from the multi-process reading queue for processing and then sends the image into a subsequent subprocess module.
2) The identity registration module is specifically implemented in the following manner:
for the frame-by-frame images under each camera obtained by the data reading module, a pedestrian detection frame, a face detection frame and a face recognition result in the images are respectively obtained on the basis of a target detection device and a face recognition device, after unreasonable matching is carried out by calculating intersection and comparison of the pedestrian detection frame and the face detection frame and filtering position information, greedy matching is carried out on the basis of Euclidean distance between the upper midpoint of the pedestrian detection frame and the central point of the face detection frame, the face recognition result and the pedestrian detection frame are bound, and therefore an initialization track with real identity can be obtained.
The target detection device obtains the pedestrian detection frames in the image using the YOLOv4 algorithm, where each pedestrian detection frame comprises parameters (x_p, y_p, w_p, h_p), with (x_p, y_p) the coordinates of the top-left point of the frame and w_p and h_p its width and height, respectively; the face recognition device obtains the face detection frames and face recognition results using the InsightFace algorithm, where each face detection frame comprises parameters (x_f, y_f, w_f, h_f) and r, with (x_f, y_f) the coordinates of the top-left point of the frame, w_f and h_f its width and height, and r the face recognition result. The center-point-distance greedy matching strategy comprises the following steps:
S21: constructing a two-dimensional matrix D with M rows and N columns representing the position distances between the face detection frames and the pedestrian detection frames, where M is the number of pedestrian detection frames in the current frame and N is the number of face detection frames; every entry of the distance matrix D is initialized to infinity. Let d_i denote the i-th pedestrian detection frame and f_j the j-th face detection frame, where i ∈ [1, M] and j ∈ [1, N];
S22: calculating the intersection-over-union IOU_{i,j} of pedestrian detection frame d_i and face detection frame f_j, and filtering out unreasonable matching relationships by judging whether the face detection frame f_j is contained within the pedestrian detection frame d_i in the horizontal direction;
S23: adjusting the distance matrix D according to the degree of overlap between the face detection frames and the pedestrian detection frames:
D_{i,j} = ‖c^p_i − c^f_j‖_2 if IOU_{i,j} > 0, and D_{i,j} = ∞ otherwise,
where c^p_i = (x_p + w_p/2, y_p) is the upper midpoint of pedestrian detection frame d_i and c^f_j = (x_f + w_f/2, y_f + h_f/2) is the center point of face detection frame f_j;
S24: performing steps S22 and S23 for all pedestrian detection frames and face detection frames; based on the adjusted distance matrix D, using a greedy matching strategy to select, for each pedestrian detection frame, the face detection frame with the minimum distance as the matching result, and adding it to the matching list M_list;
S25: binding the pedestrian detection frames to the face recognition results according to the matching list M_list, completing track initialization with real identities.
3) The single-camera pedestrian tracking module is specifically implemented as follows:
a multi-process strategy is adopted, so that multiple paths of cameras can be mutually independent and can perform multi-target tracking under a single camera in parallel; distributing three sub-processes for each camera, respectively carrying out pedestrian target detection and pedestrian identity re-identification, and carrying out inter-frame data association based on appearance information and motion information by using a Hungarian algorithm, wherein the sub-processes carry out message transmission through a multi-process message queue:
s31: the pedestrian target detection subprocess continuously obtains frame-by-frame image information from the reading queue, and stores a frame-by-frame detection result and the image information into a detection queue after target detection is completed;
s32: continuously obtaining frame-by-frame detection results and image information from the detection queue by a pedestrian identity re-identification subprocess, and storing the frame-by-frame detection results and pedestrian features into a feature queue after feature extraction is completed;
S33: the data association sub-process continuously obtains frame-by-frame detection results and pedestrian features from the feature queue, passes the tracking result of the previous frame through a Kalman filter to obtain a prediction for the current frame, and uses the Hungarian algorithm to associate the tracks of the previous frame with the detections of the current frame, combining motion and appearance information.
S34: and completing the previous three steps under each camera to obtain the motion track of the pedestrian under a single camera, and transmitting the tracking result (camera number, current frame number, track number, identity information, tracking frame position, pedestrian characteristic and face characteristic) under each camera into a shared folder for a cross-camera pedestrian track matching module to use.
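The association in step S33 can be sketched as below. For brevity, a brute-force minimum-cost assignment over permutations stands in for the Hungarian algorithm (it yields the same optimal assignment for small target counts), and tracks and detections are reduced to 2-D box centers; the Kalman-predicted positions shown are illustrative values, not patent data.

```python
from itertools import permutations
import math

def associate(tracks, detections):
    """Frame-to-frame data association, a minimal sketch of step S33.
    `tracks` holds Kalman-predicted box centers, `detections` the
    current-frame centers; brute-force minimum-cost assignment stands in
    for the Hungarian algorithm (fine for small numbers of targets)."""
    if not tracks or not detections:
        return []
    n = min(len(tracks), len(detections))
    best, best_cost = None, math.inf
    for perm in permutations(range(len(detections)), n):
        cost = sum(math.dist(tracks[i], detections[j])
                   for i, j in enumerate(perm))
        if cost < best_cost:
            best, best_cost = perm, cost
    return list(enumerate(best))  # pairs (track index, detection index)

tracks = [(10.0, 10.0), (50.0, 50.0)]      # Kalman-predicted positions
dets = [(52.0, 49.0), (11.0, 9.0)]         # current-frame detections
print(associate(tracks, dets))             # → [(0, 1), (1, 0)]
```

In the full system the cost would also mix in appearance (ReID feature) distance rather than position alone, as the patent combines motion and appearance information.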
4) The cross-camera pedestrian track matching module is specifically implemented as follows:
s41: and acquiring local tracks tracked under each camera in parallel from the shared folder by adopting a multi-process strategy.
S42: the appearance features of the local tracks under each camera are characterized using ReID features.
Specifically, let T^i_u denote the u-th track in the local track set T^i = {T^i_1, ..., T^i_{U_i}} under the i-th camera, and let w^{i,u}_t and h^{i,u}_t denote the width and height of the tracking frame of T^i_u at time t. The appearance of the local track T^i_u is characterized by the pedestrian feature of its tracking frame at the time t* given by
t* = argmax_t ( w^{i,u}_t · h^{i,u}_t )
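As a sketch of this appearance characterization, assuming t* is the frame whose tracking box has the largest area w_t · h_t (as the width and height definitions suggest); the track data below is illustrative.

```python
def representative_feature(track):
    """Pick the ReID feature of the largest tracking box in a local track,
    i.e. the frame t* maximizing w_t * h_t.
    `track` maps time t -> (width, height, feature)."""
    t_star = max(track, key=lambda t: track[t][0] * track[t][1])
    return track[t_star][2]

# Toy local track: three frames with different box sizes.
track = {0: (20, 50, "feat_small"), 1: (40, 100, "feat_big"), 2: (30, 60, "feat_mid")}
print(representative_feature(track))  # → feat_big
```

Picking the largest box is a common heuristic because bigger crops tend to produce more reliable ReID features.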
S43: the local track under each camera is projected into the same reference coordinate system, and its position in that coordinate system is used to characterize the position features of the local track.
Specifically, the local tracks generated under each camera are mapped into the world-coordinate-system plane using the calibration information of each camera, so that tracks belonging to the same ID lie close together in the reference plane while tracks belonging to different IDs lie far apart. The operation is as follows: let $(x_t^{i,u}, y_t^{i,u})$ denote the position of the top-left vertex of the tracking frame of the $u$-th local track $T_i^u$ under the $i$-th camera in the image at time $t$. The projected position $(\hat{x}_t^{i,u}, \hat{y}_t^{i,u})$ of the tracking frame in the reference coordinate system is obtained by the following formula:

$$[\hat{x}_t^{i,u},\; \hat{y}_t^{i,u},\; 1]^{\top} \sim H_i\, [x_t^{i,u},\; y_t^{i,u},\; 1]^{\top}$$

where $H_i$ is the mapping matrix between the $i$-th camera and the reference plane, calculated from the camera's calibration parameters as $H_i = R(K_i[R_i, T_i], [1,2,4])$, in which $[\cdot\,,\cdot]$ is a matrix column-concatenation function, $R(\cdot\,,[1,2,4])$ splices columns 1, 2 and 4 of the input matrix into a new matrix, and $K_i$, $R_i$ and $T_i$ are respectively the intrinsic matrix, extrinsic rotation matrix and extrinsic translation vector of the $i$-th camera.
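The construction $R(K_i[R_i, T_i], [1,2,4])$ amounts to taking columns 1, 2 and 4 of the 3x4 projection matrix $K_i[R_i \mid T_i]$, which is the standard ground-plane (Z = 0) homography. A hedged sketch with toy calibration values; mapping an image point into the reference plane via the inverse homography is one conventional reading of the projection described above:

```python
import numpy as np

def homography_from_calibration(K, R, T):
    # H_i = R(K_i [R_i, T_i], [1,2,4]): build the 3x4 projection matrix
    # K [R | T] and splice its 1st, 2nd and 4th columns into a 3x3 matrix
    P = K @ np.hstack([R, T.reshape(3, 1)])
    return P[:, [0, 1, 3]]          # columns 1, 2, 4 (1-indexed)

def project_to_plane(H, x, y):
    # map an image point (e.g. the top-left vertex of a tracking frame)
    # into the reference plane via the inverse homography, then dehomogenize
    p = np.linalg.solve(H, np.array([x, y, 1.0]))
    return p[0] / p[2], p[1] / p[2]

# toy calibration: K = I, R = I, T = (0, 0, 1) gives H = I
K, R, T = np.eye(3), np.eye(3), np.array([0.0, 0.0, 1.0])
H = homography_from_calibration(K, R, T)
```

With real calibration data, points from all cameras land in one shared plane, which is what makes the Euclidean position distance of step S45 meaningful across cameras.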
S44: acquiring all targets needing to be tracked in the current scene according to a registration module;
s45: based on the appearance characteristics and the position characteristics, the local track is matched with the target by using a cascade matching strategy to obtain the global track of the pedestrian crossing the camera, and the specific cascade matching steps are as follows:
(1) and establishing a similarity measurement matrix between each local track and each target based on the position characteristics of S43, and specifically measuring the distance between two positions in the same reference frame by using Euclidean distance.
(2) According to the position similarity measurement matrix, matching the local track with the target by using a greedy matching strategy;
(3) and matching the tracks which are not successfully matched in the previous step again according to the appearance characteristics obtained in the step S42, and specifically measuring the distance between the appearance characteristics by using the Mahalanobis distance.
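The two-stage cascade above can be sketched as follows; the threshold values and the greedy rule (always take the smallest remaining distance, each track and target used at most once) are illustrative assumptions, not values from the patent:

```python
import numpy as np

def greedy_match(dist, threshold):
    # greedy matching: repeatedly take the smallest remaining entry of the
    # distance matrix that is still below the threshold
    dist = dist.astype(float).copy()
    matches = []
    while True:
        i, j = np.unravel_index(np.argmin(dist), dist.shape)
        if dist[i, j] >= threshold:
            break
        matches.append((int(i), int(j)))
        dist[i, :] = np.inf      # each track is matched at most once
        dist[:, j] = np.inf      # each target is matched at most once
    return matches

def cascade_match(pos_dist, app_dist, pos_thr=2.0, app_thr=0.5):
    # stage 1: match on positions in the shared reference plane
    matched = greedy_match(pos_dist, pos_thr)
    # stage 2: rematch the leftovers on appearance distances
    left_tracks = [i for i in range(pos_dist.shape[0]) if i not in {m[0] for m in matched}]
    left_targets = [j for j in range(pos_dist.shape[1]) if j not in {m[1] for m in matched}]
    sub = app_dist[np.ix_(left_tracks, left_targets)]
    if sub.size:
        for a, b in greedy_match(sub, app_thr):
            matched.append((left_tracks[a], left_targets[b]))
    return sorted(matched)

pos = np.array([[0.1, 5.0], [5.0, 9.0]])   # track-target position distances
app = np.array([[0.9, 0.9], [0.9, 0.2]])   # track-target appearance distances
result = cascade_match(pos, app)
```

In the toy data, track 0 is matched to target 0 by position alone, while track 1 only pairs with target 1 in the appearance stage.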
Fig. 3 is a schematic physical architecture diagram of an intelligent monitoring system 100 based on identity recognition and cross-camera target tracking according to an embodiment of the present invention, which can automatically track pedestrians across cameras in all-weather indoor or outdoor scenes.
The control unit 102 controls the start and stop of the whole intelligent monitoring system 100 based on identity recognition and cross-camera target tracking, and schedules the different units of the tracking system. The control unit 102 is communicatively coupled to the camera unit 104 and provides corresponding control instructions or control signals to the camera unit 104 to control image acquisition by the system. The control unit 102 is also communicatively coupled to the computing unit 110 to control the system's processing operations on the images.
The camera unit 104 is configured to obtain video stream data related to pedestrians in all-weather indoor or outdoor scenes, and transmit the video stream data to the computing unit 110 through the switch for processing.
The computing unit 110 performs the computational processing of the video data acquired by the camera unit 104, including image preprocessing, face recognition, target detection, feature extraction, single-camera target tracking and cross-camera trajectory matching. The computing unit 110 is implemented by a plurality of GPU-equipped computing nodes deployed in a distributed manner through the switch. The results of the processing performed by the computing unit 110 may be provided to the display unit 112 via the control unit 102 for display.
The display unit 112 displays the results, mainly the real-time tracking result of each pedestrian under a single camera, the cross-camera tracking track of each person, and related information extracted from the tracking tracks. The control unit 102 sends control instructions and control signals and provides the results processed by the computing unit 110 to the display unit 112 for display. The display unit 112 includes various types of displays, including touch screens.
As shown in fig. 3, the intelligent monitoring system based on identification and cross-camera object tracking further includes a switch 114 for communication coupling between the control unit 102 and the camera unit 104, the computing unit 110, and the display unit 112, and also for communication coupling between the computing unit 110 and the camera unit 104 and the display unit 112, so as to provide coordination and relay services for communication between different devices in the intelligent monitoring system 100 based on identification and cross-camera object tracking.
As shown in fig. 3, the camera unit 104 includes an RGB camera network 106 and a thermal infrared camera network 108 for capturing images of all-weather scenes and providing usable data for all-weather pedestrian tracking. The RGB camera network 106 acquires RGB optical images of the current scene; it is laid out so that the fields of view of the multiple cameras cover the whole scene, the cameras are evenly distributed, and the fields of view of different cameras overlap. The thermal infrared camera network 108 acquires thermal infrared images of the current scene at night and under insufficient illumination, compensating for the shortcomings of the RGB cameras in low light; its layout is consistent with that of the RGB camera network.
Implementation example:
The invention has been applied in practice in a science and technology exhibition hall to monitor and analyze visitor behavior. Face recognition is first performed on people entering the hall, and their movement tracks inside the hall are then tracked throughout their visit. By tracking people in each exhibition area, the number of visitors per area in different time periods can be obtained; by analyzing the tracks of people across the whole hall, on-site snapshots, analysis of historical visit information and corresponding recommendations of content of interest can further be provided.
On the cross-camera multi-target tracking dataset EPFL, the precision of the monitoring system greatly exceeds that of existing methods, as shown in Table 1:
Table 1: Performance comparison on the EPFL dataset. The present invention is significantly superior to existing methods.
Although the invention has been described in detail hereinabove with respect to a general description and specific embodiments thereof, it will be apparent to those skilled in the art that modifications or improvements may be made thereto based on the invention. Accordingly, such modifications and improvements are intended to be within the scope of the invention as claimed.

Claims (10)

1. The intelligent monitoring method based on identity recognition and cross-camera target tracking is characterized by comprising the following steps of:
1) the data reading module acquires video stream data of a plurality of paths of cameras in parallel;
2) the identity registration module divides the video stream acquired by each camera in the data reading module according to frames, performs target detection and face recognition on each frame of image, and binds face information with a pedestrian detection frame by adopting a central point distance greedy matching strategy to obtain an initial track with a real identity;
3) the single-camera pedestrian tracking module performs target detection and pedestrian feature extraction in parallel under multiple paths of cameras and performs data association based on appearance information and motion information by adopting a Hungarian algorithm to obtain a motion track of a target under a single camera;
4) the cross-camera pedestrian track matching module matches tracks and targets according to the appearance characteristics, the motion characteristics and the position characteristics of the local tracks, and generates a global track under a cross-camera for each target.
2. The intelligent monitoring method based on identity recognition and cross-camera target tracking according to claim 1, wherein in step 1) the data reading module creates a multi-process reading queue for each camera based on a multi-process strategy: sub-process A continuously reads each frame of image from the video stream over the RTSP protocol and puts it into the multi-process reading queue, and sub-process B continuously takes images out of the multi-process reading queue for processing and then sends them to the subsequent sub-process modules.
3. The intelligent monitoring method based on identity recognition and cross-camera target tracking according to claim 1, wherein in step 2) the identity registration module obtains the frame-by-frame images under each camera from the data reading module, obtains the pedestrian detection frames, face detection frames and face recognition results in the images based on the target detection device and the face recognition device respectively, filters out unreasonable matches by calculating the intersection-over-union between the pedestrian and face detection frames together with their position information, performs greedy matching based on the Euclidean distance between the upper midpoint of each pedestrian detection frame and the center point of each face detection frame, and binds the face recognition results to the pedestrian detection frames, thereby obtaining initialization tracks with real identities.
4. The intelligent monitoring method based on identity recognition and cross-camera target tracking according to claim 3, wherein the target detection device obtains the pedestrian detection frames in the image using the YOLOv4 algorithm, each pedestrian detection frame comprising parameters $(x_p, y_p, w_p, h_p)$, where $(x_p, y_p)$ are the coordinates of the top-left point of the pedestrian detection frame and $w_p$ and $h_p$ are respectively its width and height; the face recognition device obtains the face detection frames and face recognition results using the InsightFace algorithm, each face detection frame comprising parameters $(x_f, y_f, w_f, h_f)$ and $r$, where $(x_f, y_f)$ are the coordinates of the top-left point of the face detection frame, $w_f$ and $h_f$ are respectively its width and height, and $r$ is the face recognition result; and the greedy matching based on the Euclidean distance between the upper midpoint of the pedestrian detection frame and the center point of the face detection frame comprises the following steps:
S41: construct a two-dimensional matrix $D$ with $M$ rows and $N$ columns representing the position distances between face detection frames and pedestrian detection frames, where $M$ is the number of pedestrian detection frames in the current frame, $N$ is the number of face detection frames, and every entry of $D$ is initialized to infinity; denote by $d_i$ the $i$-th pedestrian detection frame and by $f_j$ the $j$-th face detection frame, where $i \in [1, M]$, $j \in [1, N]$;
S42: calculate the intersection-over-union $IOU_{i,j}$ of pedestrian detection frame $d_i$ and face detection frame $f_j$, and filter out unreasonable matching relationships by judging in the horizontal direction whether face detection frame $f_j$ is contained in pedestrian detection frame $d_i$;
S43: adjust the distance matrix $D$ according to the degree of overlap between the face and pedestrian detection frames:

$$D_{i,j} = \begin{cases} \left\lVert c_i^{d} - c_j^{f} \right\rVert_2, & IOU_{i,j} > 0 \text{ and the pair } (d_i, f_j) \text{ is not filtered out} \\ \infty, & \text{otherwise} \end{cases}$$

where $c_i^{d}$ is the upper midpoint of pedestrian detection frame $d_i$ and $c_j^{f}$ is the center point of face detection frame $f_j$;
S44: perform steps S42 and S43 for all pedestrian and face detection frames; based on the adjusted distance matrix $D$, use a greedy matching strategy to select for each pedestrian detection frame the face detection frame with the minimum distance as the matching result, and add it to the matching list $M_{list}$;
S45: bind the pedestrian detection frames to the face recognition results according to the matching list $M_{list}$, completing the initialization of tracks with real identities.
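As an illustration only (not part of the claims), the greedy binding of steps S41-S45 can be sketched as follows; the box layout, the horizontal-containment filter and the distance threshold are simplified assumptions:

```python
import numpy as np

def bind_faces(ped_boxes, face_boxes, max_dist=50.0):
    # boxes are (x, y, w, h); build the distance matrix D between the upper
    # midpoint of each pedestrian box and the centre of each face box,
    # leaving infinity where the face does not lie horizontally inside the
    # pedestrian box, then match greedily by smallest distance
    M, N = len(ped_boxes), len(face_boxes)
    D = np.full((M, N), np.inf)
    for i, (xp, yp, wp, hp) in enumerate(ped_boxes):
        for j, (xf, yf, wf, hf) in enumerate(face_boxes):
            if xp <= xf and xf + wf <= xp + wp:      # horizontal containment
                top_mid = np.array([xp + wp / 2.0, yp])
                centre = np.array([xf + wf / 2.0, yf + hf / 2.0])
                D[i, j] = np.linalg.norm(top_mid - centre)
    matches = []
    while np.isfinite(D).any():
        i, j = np.unravel_index(np.argmin(D), D.shape)
        if D[i, j] > max_dist:
            break
        matches.append((int(i), int(j)))
        D[i, :] = np.inf     # each pedestrian box bound to at most one face
        D[:, j] = np.inf     # each face bound to at most one pedestrian
    return matches

peds = [(0, 0, 40, 120), (100, 0, 40, 120)]     # two pedestrian boxes
faces = [(110, 5, 20, 20), (10, 5, 20, 20)]     # two face boxes, swapped order
result = bind_faces(peds, faces)
```

The containment filter prevents a face from being bound to a pedestrian box it does not overlap, even if the two are geometrically close.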
5. The intelligent monitoring method based on identity recognition and cross-camera target tracking according to claim 1, wherein in step 3), a single-camera pedestrian tracking module adopts a multi-process strategy, so that multiple paths of cameras can be independent of each other and perform multi-target tracking under the condition of single camera in parallel; distributing three sub-processes for each camera, respectively carrying out pedestrian target detection and pedestrian identity re-identification, and carrying out inter-frame data association based on appearance information and motion information by using a Hungarian algorithm, wherein the sub-processes carry out message transmission through a multi-process message queue:
S51: the pedestrian target detection sub-process continuously obtains frame-by-frame image information from the reading queue and, after target detection is completed, stores the frame-by-frame detection results together with the image information into a detection queue;
S52: the pedestrian identity re-identification sub-process continuously obtains the frame-by-frame detection results and image information from the detection queue and, after feature extraction is completed, stores the frame-by-frame detection results and pedestrian features into a feature queue;
S53: the data association sub-process continuously obtains the frame-by-frame detection results and pedestrian features from the feature queue, passes the tracking result of the previous frame through a Kalman filter to obtain a prediction for the current frame, and associates the tracks of the previous frame with the detections of the current frame using the Hungarian algorithm, combining motion and appearance information;
S54: complete steps S51, S52 and S53 under each camera to obtain the motion track of each pedestrian under a single camera, and write the tracking result under each camera, comprising the camera number, current frame number, track number, identity information, tracking frame position, pedestrian features and face features, into a shared folder for use by the cross-camera pedestrian track matching module.
6. The intelligent monitoring method based on identity recognition and cross-camera target tracking according to claim 1, wherein step 4) is implemented as follows:
S61: acquire the local tracks tracked under each camera in parallel using a multi-process strategy;
S62: characterize the appearance of the local tracks under each camera using ReID features;
S63: project the local tracks under each camera into a common reference coordinate system, and represent the position features of the local tracks by their positions in that coordinate system;
S64: acquire all targets that need to be tracked in the current scene from the identity registration module;
S65: based on the appearance and position features, match the local tracks to the targets using a cascade matching strategy to obtain the global cross-camera tracks of the pedestrians.
7. The intelligent monitoring method based on identity recognition and cross-camera target tracking according to claim 6, wherein in step S62 the appearance of a local track is characterized as follows: let $\mathcal{T}_i = \{T_i^1, T_i^2, \ldots\}$ denote the local track set under the $i$-th camera and $T_i^u$ its $u$-th track, and let $w_t^{i,u}$ and $h_t^{i,u}$ respectively denote the width and height of the tracking frame of the local track $T_i^u$ at time $t$; the appearance of the local track $T_i^u$ is characterized by the pedestrian feature of the tracking frame at time $t^*$, obtained by the following formula:

$$t^{*} = \arg\max_{t}\; w_t^{i,u} \cdot h_t^{i,u}$$

in step S63, the local track under each camera is projected into the reference coordinate system as follows: let $(x_t^{i,u}, y_t^{i,u})$ denote the position of the top-left vertex of the tracking frame of the $u$-th local track $T_i^u$ under the $i$-th camera in the image at time $t$; the projected position $(\hat{x}_t^{i,u}, \hat{y}_t^{i,u})$ of the tracking frame in the reference coordinate system is obtained by the following formula:

$$[\hat{x}_t^{i,u},\; \hat{y}_t^{i,u},\; 1]^{\top} \sim H_i\, [x_t^{i,u},\; y_t^{i,u},\; 1]^{\top}$$

where $H_i$ is the mapping matrix between the $i$-th camera and the reference plane, calculated from the camera's calibration parameters as $H_i = R(K_i[R_i, T_i], [1,2,4])$, in which $[\cdot\,,\cdot]$ is a matrix column-concatenation function, $R(\cdot\,,[1,2,4])$ splices columns 1, 2 and 4 of the input matrix into a new matrix, and $K_i$, $R_i$ and $T_i$ are respectively the intrinsic matrix, extrinsic rotation matrix and extrinsic translation vector of the $i$-th camera;
in step S65, the cascade matching steps are:
(1) establish a position similarity measurement matrix between each local track and each target based on the position features of step S63;
(2) match the local tracks to the targets using a greedy matching strategy according to the position similarity matrix;
(3) rematch the tracks not successfully matched in step (2) according to the appearance features obtained in step S62.
8. Intelligent monitoring system based on identity recognition and cross-camera target tracking, characterized by comprising:
the control unit (102), used for controlling the starting and stopping of the whole intelligent monitoring system (100) based on identity recognition and cross-camera target tracking, and for controlling the scheduling among different units in the tracking system;
the image pickup unit (104) is used for acquiring relevant video stream data of pedestrians;
the computing unit (110), used for performing operations including image preprocessing, face recognition, target detection, feature extraction and target tracking on the acquired video stream data;
and the display unit (112) is used for displaying a real-time tracking result of the pedestrian under the single camera, a cross-camera tracking track of the pedestrian and related information extracted from the tracking track.
9. The intelligent monitoring system based on identification and cross-camera target tracking of claim 8, further comprising:
and the switch (114), used for the communication coupling between the control unit (102) and the image pickup unit (104), the computing unit (110) and the display unit (112), and also for the communication coupling between the computing unit (110) and the image pickup unit (104) and the display unit (112).
10. The intelligent monitoring system based on identification and cross-camera target tracking of claim 9, wherein the camera unit (104) comprises an RGB camera network (106) and a thermal infrared camera network (108); the RGB camera network (106) is used for acquiring RGB optical images of a current scene, the RGB camera network is arranged in a mode that the visual fields of a plurality of cameras can cover the whole scene, the cameras are uniformly distributed, and the visual field ranges of different cameras are overlapped; the thermal infrared camera network (108) is used for acquiring a thermal infrared image of a current scene at night under the condition of insufficient illumination, and is used for making up for the defect of the RGB camera under the condition of insufficient illumination, and the arrangement mode of the thermal infrared camera network is consistent with that of the RGB camera network;
the computing unit (110) is realized by a plurality of computing nodes with GPUs in a distributed deployment mode through a switch (114);
the display unit (112) includes various types of displays including a touch screen.
CN202210335008.8A 2022-03-31 2022-03-31 Intelligent monitoring system and method based on identity recognition and cross-camera target tracking Pending CN114693746A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210335008.8A CN114693746A (en) 2022-03-31 2022-03-31 Intelligent monitoring system and method based on identity recognition and cross-camera target tracking

Publications (1)

Publication Number Publication Date
CN114693746A true CN114693746A (en) 2022-07-01

Family

ID=82140864

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210335008.8A Pending CN114693746A (en) 2022-03-31 2022-03-31 Intelligent monitoring system and method based on identity recognition and cross-camera target tracking

Country Status (1)

Country Link
CN (1) CN114693746A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116542858A (en) * 2023-07-03 2023-08-04 众芯汉创(江苏)科技有限公司 Data splicing analysis system based on space track
CN116580063A (en) * 2023-07-14 2023-08-11 深圳须弥云图空间科技有限公司 Target tracking method, target tracking device, electronic equipment and storage medium
CN117241133A (en) * 2023-11-13 2023-12-15 武汉益模科技股份有限公司 Visual work reporting method and system for multi-task simultaneous operation based on non-fixed position
CN117495913A (en) * 2023-12-28 2024-02-02 中电科新型智慧城市研究院有限公司 Cross-space-time correlation method and device for night target track
CN117576764A (en) * 2024-01-15 2024-02-20 四川大学 Video irrelevant person automatic identification method based on multi-target tracking
CN117576167A (en) * 2024-01-16 2024-02-20 杭州华橙软件技术有限公司 Multi-target tracking method, multi-target tracking device, and computer storage medium
CN117953580A (en) * 2024-01-29 2024-04-30 浙江大学 Behavior recognition method and system based on cross-camera multi-target tracking and electronic equipment
CN118115927A (en) * 2024-04-30 2024-05-31 山东云海国创云计算装备产业创新中心有限公司 Target tracking method, apparatus, computer device, storage medium and program product
CN118115927B (en) * 2024-04-30 2024-07-09 山东云海国创云计算装备产业创新中心有限公司 Target tracking method, apparatus, computer device, storage medium and program product



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination