CN114693746A - Intelligent monitoring system and method based on identity recognition and cross-camera target tracking - Google Patents


Info

Publication number
CN114693746A
CN114693746A (application CN202210335008.8A)
Authority
CN
China
Prior art keywords
camera
frame
pedestrian
track
tracking
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210335008.8A
Other languages
Chinese (zh)
Inventor
贺宇航
余文涛
韩洁
魏星
龚怡宏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Jiaotong University filed Critical Xian Jiaotong University
Priority to CN202210335008.8A
Publication of CN114693746A
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/292 Multi-camera tracking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person
    • G06T2207/30201 Face

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Closed-Circuit Television Systems (AREA)

Abstract

An intelligent monitoring system and method based on identity recognition and cross-camera target tracking. A data reading module acquires video stream data from multiple cameras in parallel. An identity registration module performs target detection and face recognition on the video acquired by each camera in the data reading module and binds face information to pedestrian detection frames using a center-point-distance greedy matching strategy, yielding initial tracks carrying target identity information. A single-camera pedestrian tracking module performs target detection under the multiple cameras in parallel and carries out data association based on appearance and motion information using the Hungarian algorithm, yielding the tracking track of each target under a single camera. A cross-camera pedestrian track matching module performs cross-camera track matching based on the appearance, motion and position features of the local tracks and generates a cross-camera global track for each target. The invention efficiently realizes real-time target identity recognition and cross-camera track generation.

Description

Intelligent monitoring system and method based on identity recognition and cross-camera target tracking
Technical Field
The invention belongs to the technical field of intelligent video monitoring, and particularly relates to an intelligent monitoring system and method based on identity recognition and cross-camera target tracking.
Background
Cross-camera multi-target tracking aims to realize long-duration, wide-area tracking of different targets using a monitoring network formed by multiple cameras, generating a complete tracking track for each target. Compared with single-camera tracking, cross-camera tracking can follow targets continuously over a much wider range; it is an important basic technology in fields such as video monitoring, behavior analysis and intelligent transportation, and brings significant practical value to the development of smart cities. For example, in a shopping mall, the consumption behavior of customers can be analyzed by tracking their shopping trajectories, enabling the optimization and upgrading of the consumption structure and yielding greater economic benefit.
Compared with single-camera multi-target tracking, cross-camera multi-target tracking mainly focuses on track matching between different cameras, and its difficulty lies mainly in three aspects: first, targets exhibit significant appearance differences under different cameras due to factors such as shooting angle, shooting distance and lighting conditions; second, blind areas exist between camera fields of view, so target motion information between different cameras is lost; third, because the total number of targets is unknown, the matching relationship between cross-camera tracks is uncertain.
At present, cross-camera target tracking methods are generally built on three core algorithms: target detection, data association under a single camera, and cross-camera track matching. For cross-camera track matching, camera-calibration and feature-matching methods are mainly used. As mentioned above, for tracking multiple targets in complex scenes, feature-matching methods face great challenges due to the influence of shooting angles, distances and other factors, while camera-calibration methods place high demands on camera placement, requiring overlapping regions between the fields of view of different cameras. Given the precision, robustness and real-time requirements of tracking systems in practical applications, current algorithms fall short, and further optimization is needed in both tracking accuracy and resource overhead.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides an intelligent monitoring system and method based on identity recognition and cross-camera target tracking, which can track targets accurately while using computing resources as efficiently as possible.
The invention is realized by adopting the following technical scheme:
the intelligent monitoring method based on identity recognition and cross-camera target tracking comprises the following steps:
1) the data reading module acquires video stream data of a plurality of paths of cameras in parallel;
2) the identity registration module divides the video stream acquired by each camera in the data reading module according to frames, performs target detection and face recognition on each frame of image, and binds face information with a pedestrian detection frame by adopting a central point distance greedy matching strategy to obtain an initial track with a real identity;
3) the single-camera pedestrian tracking module performs target detection and pedestrian feature extraction in parallel under multiple paths of cameras and performs data association based on appearance information and motion information by adopting a Hungarian algorithm to obtain a motion track of a target under a single camera;
4) the cross-camera pedestrian track matching module matches tracks and targets according to the appearance features, the motion features and the position features of the local tracks, and generates a global track under a cross-camera for each target.
A further improvement of the invention is that in step 1), the data reading module creates a multi-process read queue for each camera based on a multi-process strategy: sub-process A continuously reads each frame image from the video stream via the RTSP protocol and puts it into the multi-process read queue, and sub-process B continuously takes images out of the read queue, processes them, and sends them to the subsequent sub-process modules.
A further improvement of the invention is that in step 2), for the frame-by-frame images under each camera obtained by the data reading module, the identity registration module obtains the pedestrian detection frames, face detection frames and face recognition results in each image based on a target detection device and a face recognition device; after filtering out unreasonable matches by computing the intersection-over-union and position information of the pedestrian and face detection frames, greedy matching is performed based on the Euclidean distance between the upper midpoint of each pedestrian detection frame and the center point of each face detection frame, and the face recognition results are bound to the pedestrian detection frames, yielding initialized tracks with real identities.
The invention is further improved in that the target detection device obtains the pedestrian detection frames in the image using the YOLOv4 algorithm, where each pedestrian detection frame comprises parameters (x_p, y_p, w_p, h_p), with (x_p, y_p) the coordinates of the top-left point of the frame and w_p and h_p its width and height, respectively; the face recognition device obtains the face detection frames and face recognition results using the InsightFace algorithm, where each face detection frame comprises parameters (x_f, y_f, w_f, h_f) and r, with (x_f, y_f) the coordinates of the top-left point of the frame, w_f and h_f its width and height, and r the face recognition result. The greedy matching based on the Euclidean distance between the upper midpoint of the pedestrian detection frame and the center point of the face detection frame comprises the following steps:
S41: constructing a two-dimensional matrix D with M rows and N columns representing the position distances between the face detection frames and the pedestrian detection frames, where M is the number of pedestrian detection frames in the current frame and N is the number of face detection frames; every entry of the distance matrix D is initialized to infinity. Let d_i denote the i-th pedestrian detection frame and f_j the j-th face detection frame, where i ∈ [1, M] and j ∈ [1, N];
S42: calculating the intersection-over-union IOU_{i,j} of pedestrian detection frame d_i and face detection frame f_j, and filtering out unreasonable matching relationships by judging whether the face detection frame f_j is contained within the pedestrian detection frame d_i in the horizontal direction;
S43: adjusting the distance matrix D according to the degree of overlap between the face detection frames and the pedestrian detection frames:
D_{i,j} = ‖c^p_i − c^f_j‖_2 if IOU_{i,j} > 0, and D_{i,j} = ∞ otherwise,
where c^p_i = (x_p + w_p/2, y_p) is the upper midpoint of pedestrian detection frame d_i and c^f_j = (x_f + w_f/2, y_f + h_f/2) is the center point of face detection frame f_j;
S44: performing steps S42 and S43 for all pedestrian detection frames and face detection frames; based on the adjusted distance matrix D, using a greedy matching strategy to select, for each pedestrian detection frame, the face detection frame with the minimum distance as the matching result, and adding it to a matching list M_list;
S45: binding the pedestrian detection frames to the face recognition results according to the matching list M_list, completing track initialization with real identities.
The invention has the further improvement that in the step 3), the single-camera pedestrian tracking module adopts a multi-process strategy, so that multiple paths of cameras can be mutually independent and perform multi-target tracking under the single camera in parallel; distributing three sub-processes for each camera, respectively carrying out pedestrian target detection and pedestrian identity re-identification, and carrying out inter-frame data association based on appearance information and motion information by using a Hungarian algorithm, wherein the sub-processes carry out message transmission through a multi-process message queue:
s51: the pedestrian target detection subprocess continuously obtains frame-by-frame image information from the reading queue, and stores a frame-by-frame detection result and the image information into a detection queue after target detection is completed;
s52: continuously obtaining frame-by-frame detection results and image information from the detection queue by a pedestrian identity re-identification subprocess, and storing the frame-by-frame detection results and pedestrian features into a feature queue after feature extraction is completed;
S53: the data association sub-process continuously obtains frame-by-frame detection results and pedestrian features from the feature queue, passes the tracking result of the previous frame through a Kalman filter to obtain a prediction for the current frame, and uses the Hungarian algorithm to associate the tracks of the previous frame with the detections of the current frame, combining motion and appearance information;
s54: and completing the steps of S51, S52 and S53 under each camera to obtain the motion track of the pedestrian under a single camera, and transmitting the tracking result under each camera into a shared folder for a cross-camera pedestrian track matching module to use, wherein the tracking result under each camera comprises a camera number, a current frame number, a track number, identity information, a tracking frame position, pedestrian characteristics and human face characteristics.
The further improvement of the invention is that in the step 4), the specific implementation method is as follows:
s61: a multi-process strategy is adopted to obtain local tracks tracked under each camera in parallel;
S62: representing the appearance features of the local tracks under each camera using ReID features;
s63: projecting the local track under each camera to the same reference coordinate system, and representing the position characteristics of the local track by using the position in the same reference coordinate system;
s64: acquiring all targets needing to be tracked in the current scene according to an identity registration module;
s65: based on the appearance characteristics and the position characteristics, the local track is matched with the target by using a cascade matching strategy, and the global track of the pedestrian crossing the camera is obtained.
In a further improvement of the present invention, in step S62, the appearance of a local track is characterized as follows: let T^i_u denote the u-th track in the local track set T^i = {T^i_1, ..., T^i_{U_i}} under the i-th camera, and let w^{i,u}_t and h^{i,u}_t denote the width and height of the tracking frame of local track T^i_u at time t; the appearance of T^i_u is characterized by the pedestrian feature of its tracking frame at the time t* given by
t* = argmax_t ( w^{i,u}_t · h^{i,u}_t );
in step S63, the local track under each camera is projected into the reference coordinate system as follows: let (x^{i,u}_t, y^{i,u}_t) denote the position of the top-left vertex of the tracking frame of the u-th local track T^i_u under the i-th camera in the image at time t; the projected position (x̂^{i,u}_t, ŷ^{i,u}_t) of the tracking frame in the reference coordinate system is obtained from
λ (x̂^{i,u}_t, ŷ^{i,u}_t, 1)^T = H_i^{-1} (x^{i,u}_t, y^{i,u}_t, 1)^T,
wherein H_i is the mapping matrix between the i-th camera and the reference plane, calculated from the calibration parameters of the camera, i.e. H_i = R(K_i[R_i, T_i], [1,2,4]), where [·,·] is a matrix column-splicing function, R(·, [1,2,4]) denotes splicing columns 1, 2 and 4 of the input matrix into a new matrix, and K_i, R_i and T_i are the intrinsic matrix, extrinsic rotation matrix and extrinsic translation vector of the i-th camera, respectively;
in step S65, the cascade matching comprises the following steps:
(1) establishing a position similarity measurement matrix between each local track and each target based on the position features of step S63;
(2) matching the local tracks with the targets using a greedy matching strategy according to the position similarity measurement matrix;
(3) matching the tracks that were not successfully matched in step (2) again according to the appearance features obtained in step S62.
The intelligent monitoring system based on identity recognition and cross-camera target tracking includes:
the control unit is used for controlling the start and the stop of the whole intelligent monitoring system based on the identity recognition and the cross-camera target tracking and controlling the scheduling among different units in the tracking system;
the image pickup unit is used for acquiring relevant video stream data of pedestrians;
the computing unit is used for performing operations on the acquired video stream data such as image preprocessing, face recognition, target detection, feature extraction and target tracking;
and the display unit is used for displaying a real-time tracking result of the pedestrian under the single camera, a cross-camera tracking track of the pedestrian and relevant information extracted from the tracking track.
A further improvement of the invention is that the system further comprises:
a switch for communicatively coupling the control unit with the camera unit, the computing unit and the display unit, and for communicatively coupling the computing unit with the camera unit and the display unit.
A further improvement of the invention is that the camera unit comprises an RGB camera network and a thermal infrared camera network. The RGB camera network is used for acquiring RGB optical images of the current scene; it is arranged so that the fields of view of the multiple cameras cover the whole scene, the cameras are evenly distributed, and the fields of view of different cameras overlap. The thermal infrared camera network is used for acquiring thermal infrared images of the current scene at night and under insufficient illumination, compensating for the weakness of RGB cameras in low light; its arrangement is consistent with that of the RGB camera network;
the computing unit is realized by a plurality of GPU-equipped computing nodes deployed in a distributed manner via the switch;
the display unit includes various types of displays including a touch screen.
The invention has at least the following beneficial technical effects:
the invention provides an intelligent monitoring method based on identity recognition and cross-camera target tracking, which comprises the steps of segmenting a plurality of paths of video streams acquired in parallel according to frames, carrying out target detection and face recognition on each frame of image, binding face information with a pedestrian detection frame by adopting a central point distance greedy matching strategy, and obtaining an initial track with a real identity; then, performing target detection and pedestrian feature extraction in parallel under multiple paths of cameras, and performing data association based on appearance information and motion information by adopting a Hungarian algorithm to obtain a motion track of a target under a single camera; and finally, matching the tracks and the targets according to the appearance characteristics, the motion characteristics and the position characteristics of the local tracks, and generating a global track crossing the camera for each target. The invention utilizes the visual information and the motion information of the target to accurately track the pedestrian under a single camera and generate the global track of the pedestrian across the cameras under the condition of utilizing the computing resources as efficiently as possible.
The intelligent monitoring system based on identity recognition and cross-camera target tracking can identify tracked targets through the face recognition system, and long-term collection and analysis of personnel information enables statistics on the behavior habits and interests of targets. Meanwhile, with cross-camera target tracking, persons in the scene can be tracked seamlessly and without interruption throughout the whole course, which has important application value in unmanned supermarkets and large stores. On the cross-camera multi-target tracking dataset EPFL, the precision of the monitoring system greatly exceeds that of existing methods.
drawings
Fig. 1 is a schematic structural diagram of an intelligent monitoring system based on identification and cross-camera target tracking according to an embodiment of the present invention.
Fig. 2 is a schematic flowchart of an intelligent monitoring method based on identification and cross-camera target tracking according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of a physical architecture of an intelligent monitoring system based on identification and cross-camera target tracking according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described in detail below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a schematic structural diagram of an intelligent monitoring system based on identity recognition and cross-camera target tracking in an embodiment of the present invention; the system includes: a data reading module, an identity registration module, a single-camera pedestrian tracking module and a cross-camera pedestrian track matching module.
1) The data reading module is specifically implemented as follows:
based on a multi-process strategy, a multi-process reading queue is established for each path of camera, a subprocess A continuously reads each frame image from a video stream through an rtsp protocol and puts the image into the multi-process reading queue, and a subprocess B continuously takes out the image from the multi-process reading queue for processing and then sends the image into a subsequent subprocess module.
2) The identity registration module is specifically implemented in the following manner:
for the frame-by-frame images under each camera obtained by the data reading module, a pedestrian detection frame, a face detection frame and a face recognition result in the images are respectively obtained on the basis of a target detection device and a face recognition device, after unreasonable matching is carried out by calculating intersection and comparison of the pedestrian detection frame and the face detection frame and filtering position information, greedy matching is carried out on the basis of Euclidean distance between the upper midpoint of the pedestrian detection frame and the central point of the face detection frame, the face recognition result and the pedestrian detection frame are bound, and therefore an initialization track with real identity can be obtained.
The target detection device obtains the pedestrian detection frames in the image using the YOLOv4 algorithm, where each pedestrian detection frame comprises parameters (x_p, y_p, w_p, h_p), with (x_p, y_p) the coordinates of the top-left point of the frame and w_p and h_p its width and height, respectively; the face recognition device obtains the face detection frames and face recognition results using the InsightFace algorithm, where each face detection frame comprises parameters (x_f, y_f, w_f, h_f) and r, with (x_f, y_f) the coordinates of the top-left point of the frame, w_f and h_f its width and height, and r the face recognition result. The center-point-distance greedy matching strategy comprises the following steps:
S21: constructing a two-dimensional matrix D with M rows and N columns representing the position distances between the face detection frames and the pedestrian detection frames, where M is the number of pedestrian detection frames in the current frame and N is the number of face detection frames; every entry of the distance matrix D is initialized to infinity. Let d_i denote the i-th pedestrian detection frame and f_j the j-th face detection frame, where i ∈ [1, M] and j ∈ [1, N];
S22: calculating the intersection-over-union IOU_{i,j} of pedestrian detection frame d_i and face detection frame f_j, and filtering out unreasonable matching relationships by judging whether the face detection frame f_j is contained within the pedestrian detection frame d_i in the horizontal direction;
S23: adjusting the distance matrix D according to the degree of overlap between the face detection frames and the pedestrian detection frames:
D_{i,j} = ‖c^p_i − c^f_j‖_2 if IOU_{i,j} > 0, and D_{i,j} = ∞ otherwise,
where c^p_i = (x_p + w_p/2, y_p) is the upper midpoint of pedestrian detection frame d_i and c^f_j = (x_f + w_f/2, y_f + h_f/2) is the center point of face detection frame f_j;
S24: performing steps S22 and S23 for all pedestrian detection frames and face detection frames; based on the adjusted distance matrix D, using a greedy matching strategy to select, for each pedestrian detection frame, the face detection frame with the minimum distance as the matching result, and adding it to the matching list M_list;
S25: binding the pedestrian detection frames to the face recognition results according to the matching list M_list, completing track initialization with real identities.
3) The single-camera pedestrian tracking module is specifically implemented as follows:
a multi-process strategy is adopted, so that multiple paths of cameras can be mutually independent and can perform multi-target tracking under a single camera in parallel; distributing three sub-processes for each camera, respectively carrying out pedestrian target detection and pedestrian identity re-identification, and carrying out inter-frame data association based on appearance information and motion information by using a Hungarian algorithm, wherein the sub-processes carry out message transmission through a multi-process message queue:
s31: the pedestrian target detection subprocess continuously obtains frame-by-frame image information from the reading queue, and stores a frame-by-frame detection result and the image information into a detection queue after target detection is completed;
s32: continuously obtaining frame-by-frame detection results and image information from the detection queue by a pedestrian identity re-identification subprocess, and storing the frame-by-frame detection results and pedestrian features into a feature queue after feature extraction is completed;
S33: the data association sub-process continuously obtains frame-by-frame detection results and pedestrian features from the feature queue, passes the tracking result of the previous frame through a Kalman filter to obtain a prediction for the current frame, and uses the Hungarian algorithm to associate the tracks of the previous frame with the detections of the current frame, combining motion and appearance information.
S34: and completing the previous three steps under each camera to obtain the motion track of the pedestrian under a single camera, and transmitting the tracking result (camera number, current frame number, track number, identity information, tracking frame position, pedestrian characteristic and face characteristic) under each camera into a shared folder for a cross-camera pedestrian track matching module to use.
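The association in step S33 can be sketched as below. For brevity, a brute-force minimum-cost assignment over permutations stands in for the Hungarian algorithm (it yields the same optimal assignment for small target counts), and tracks and detections are reduced to 2-D box centers; the Kalman-predicted positions shown are illustrative values, not patent data.

```python
from itertools import permutations
import math

def associate(tracks, detections):
    """Frame-to-frame data association, a minimal sketch of step S33.
    `tracks` holds Kalman-predicted box centers, `detections` the
    current-frame centers; brute-force minimum-cost assignment stands in
    for the Hungarian algorithm (fine for small numbers of targets)."""
    if not tracks or not detections:
        return []
    n = min(len(tracks), len(detections))
    best, best_cost = None, math.inf
    for perm in permutations(range(len(detections)), n):
        cost = sum(math.dist(tracks[i], detections[j])
                   for i, j in enumerate(perm))
        if cost < best_cost:
            best, best_cost = perm, cost
    return list(enumerate(best))  # pairs (track index, detection index)

tracks = [(10.0, 10.0), (50.0, 50.0)]      # Kalman-predicted positions
dets = [(52.0, 49.0), (11.0, 9.0)]         # current-frame detections
print(associate(tracks, dets))             # → [(0, 1), (1, 0)]
```

In the full system the cost would also mix in appearance (ReID feature) distance rather than position alone, as the patent combines motion and appearance information.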
4) The cross-camera pedestrian track matching module is specifically implemented as follows:
s41: and acquiring local tracks tracked under each camera in parallel from the shared folder by adopting a multi-process strategy.
S42: the appearance features of the local tracks under each camera are characterized using ReID features.
Specifically, let T^i_u denote the u-th track in the local track set T^i = {T^i_1, ..., T^i_{U_i}} under the i-th camera, and let w^{i,u}_t and h^{i,u}_t denote the width and height of the tracking frame of T^i_u at time t. The appearance of the local track T^i_u is characterized by the pedestrian feature of its tracking frame at the time t* given by
t* = argmax_t ( w^{i,u}_t · h^{i,u}_t )
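As a sketch of this appearance characterization, assuming t* is the frame whose tracking box has the largest area w_t · h_t (as the width and height definitions suggest); the track data below is illustrative.

```python
def representative_feature(track):
    """Pick the ReID feature of the largest tracking box in a local track,
    i.e. the frame t* maximizing w_t * h_t.
    `track` maps time t -> (width, height, feature)."""
    t_star = max(track, key=lambda t: track[t][0] * track[t][1])
    return track[t_star][2]

# Toy local track: three frames with different box sizes.
track = {0: (20, 50, "feat_small"), 1: (40, 100, "feat_big"), 2: (30, 60, "feat_mid")}
print(representative_feature(track))  # → feat_big
```

Picking the largest box is a common heuristic because bigger crops tend to produce more reliable ReID features.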
S43: the local track under each camera is projected into the same reference coordinate system, and its position in that coordinate system is used to characterize the position features of the local track.
Specifically, the local tracks generated under each camera are mapped into the world-coordinate-system plane using the calibration information of each camera, so that tracks belonging to the same ID lie close together in the reference plane while tracks belonging to different IDs lie far apart. The operation is as follows: let $(x_t^{i,u}, y_t^{i,u})$ denote the position of the top-left vertex of the tracking frame of the $u$-th local track $T_i^u$ under the $i$-th camera in the image at time $t$. The projected position $(\hat{x}_t^{i,u}, \hat{y}_t^{i,u})$ of the tracking frame in the reference coordinate system is obtained by the following formula:

$$[\hat{x}_t^{i,u},\; \hat{y}_t^{i,u},\; 1]^{\top} \sim H_i\, [x_t^{i,u},\; y_t^{i,u},\; 1]^{\top}$$

where $H_i$ is the mapping matrix between the $i$-th camera and the reference plane, calculated from the camera's calibration parameters as $H_i = R(K_i[R_i, T_i], [1,2,4])$, in which $[\cdot\,,\cdot]$ is a matrix column-concatenation function, $R(\cdot\,,[1,2,4])$ splices columns 1, 2 and 4 of the input matrix into a new matrix, and $K_i$, $R_i$ and $T_i$ are respectively the intrinsic matrix, extrinsic rotation matrix and extrinsic translation vector of the $i$-th camera.
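The construction $R(K_i[R_i, T_i], [1,2,4])$ amounts to taking columns 1, 2 and 4 of the 3x4 projection matrix $K_i[R_i \mid T_i]$, which is the standard ground-plane (Z = 0) homography. A hedged sketch with toy calibration values; mapping an image point into the reference plane via the inverse homography is one conventional reading of the projection described above:

```python
import numpy as np

def homography_from_calibration(K, R, T):
    # H_i = R(K_i [R_i, T_i], [1,2,4]): build the 3x4 projection matrix
    # K [R | T] and splice its 1st, 2nd and 4th columns into a 3x3 matrix
    P = K @ np.hstack([R, T.reshape(3, 1)])
    return P[:, [0, 1, 3]]          # columns 1, 2, 4 (1-indexed)

def project_to_plane(H, x, y):
    # map an image point (e.g. the top-left vertex of a tracking frame)
    # into the reference plane via the inverse homography, then dehomogenize
    p = np.linalg.solve(H, np.array([x, y, 1.0]))
    return p[0] / p[2], p[1] / p[2]

# toy calibration: K = I, R = I, T = (0, 0, 1) gives H = I
K, R, T = np.eye(3), np.eye(3), np.array([0.0, 0.0, 1.0])
H = homography_from_calibration(K, R, T)
```

With real calibration data, points from all cameras land in one shared plane, which is what makes the Euclidean position distance of step S45 meaningful across cameras.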
S44: acquiring all targets needing to be tracked in the current scene according to a registration module;
s45: based on the appearance characteristics and the position characteristics, the local track is matched with the target by using a cascade matching strategy to obtain the global track of the pedestrian crossing the camera, and the specific cascade matching steps are as follows:
(1) and establishing a similarity measurement matrix between each local track and each target based on the position characteristics of S43, and specifically measuring the distance between two positions in the same reference frame by using Euclidean distance.
(2) According to the position similarity measurement matrix, matching the local track with the target by using a greedy matching strategy;
(3) and matching the tracks which are not successfully matched in the previous step again according to the appearance characteristics obtained in the step S42, and specifically measuring the distance between the appearance characteristics by using the Mahalanobis distance.
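The two-stage cascade above can be sketched as follows; the threshold values and the greedy rule (always take the smallest remaining distance, each track and target used at most once) are illustrative assumptions, not values from the patent:

```python
import numpy as np

def greedy_match(dist, threshold):
    # greedy matching: repeatedly take the smallest remaining entry of the
    # distance matrix that is still below the threshold
    dist = dist.astype(float).copy()
    matches = []
    while True:
        i, j = np.unravel_index(np.argmin(dist), dist.shape)
        if dist[i, j] >= threshold:
            break
        matches.append((int(i), int(j)))
        dist[i, :] = np.inf      # each track is matched at most once
        dist[:, j] = np.inf      # each target is matched at most once
    return matches

def cascade_match(pos_dist, app_dist, pos_thr=2.0, app_thr=0.5):
    # stage 1: match on positions in the shared reference plane
    matched = greedy_match(pos_dist, pos_thr)
    # stage 2: rematch the leftovers on appearance distances
    left_tracks = [i for i in range(pos_dist.shape[0]) if i not in {m[0] for m in matched}]
    left_targets = [j for j in range(pos_dist.shape[1]) if j not in {m[1] for m in matched}]
    sub = app_dist[np.ix_(left_tracks, left_targets)]
    if sub.size:
        for a, b in greedy_match(sub, app_thr):
            matched.append((left_tracks[a], left_targets[b]))
    return sorted(matched)

pos = np.array([[0.1, 5.0], [5.0, 9.0]])   # track-target position distances
app = np.array([[0.9, 0.9], [0.9, 0.2]])   # track-target appearance distances
result = cascade_match(pos, app)
```

In the toy data, track 0 is matched to target 0 by position alone, while track 1 only pairs with target 1 in the appearance stage.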
Fig. 3 is a schematic physical architecture diagram of an intelligent monitoring system 100 based on identity recognition and cross-camera target tracking according to an embodiment of the present invention, which can automatically track pedestrians across cameras in all-weather indoor or outdoor scenes.
The control unit 102 controls the start and stop of the whole intelligent monitoring system 100 based on identity recognition and cross-camera target tracking, and schedules the different units of the tracking system. The control unit 102 is communicatively coupled to the camera unit 104 and provides corresponding control instructions or control signals to the camera unit 104 to control image acquisition by the system. The control unit 102 is also communicatively coupled to the computing unit 110 to control the system's processing operations on the images.
The camera unit 104 is configured to obtain video stream data related to pedestrians in all-weather indoor or outdoor scenes, and transmit the video stream data to the computing unit 110 through the switch for processing.
The computing unit 110 performs the computational processing of the video data acquired by the camera unit 104, including image preprocessing, face recognition, target detection, feature extraction, single-camera target tracking and cross-camera trajectory matching. The computing unit 110 is implemented by a plurality of GPU-equipped computing nodes deployed in a distributed manner through the switch. The results of the processing performed by the computing unit 110 may be provided to the display unit 112 via the control unit 102 for display.
The display unit 112 displays the results, mainly the real-time tracking result of each pedestrian under a single camera, the cross-camera tracking track of each person, and related information extracted from the tracking tracks. The control unit 102 sends control instructions and control signals and provides the results processed by the computing unit 110 to the display unit 112 for display. The display unit 112 includes various types of displays, including touch screens.
As shown in fig. 3, the intelligent monitoring system based on identification and cross-camera object tracking further includes a switch 114 for communication coupling between the control unit 102 and the camera unit 104, the computing unit 110, and the display unit 112, and also for communication coupling between the computing unit 110 and the camera unit 104 and the display unit 112, so as to provide coordination and relay services for communication between different devices in the intelligent monitoring system 100 based on identification and cross-camera object tracking.
As shown in fig. 3, the camera unit 104 includes an RGB camera network 106 and a thermal infrared camera network 108 for capturing images of all-weather scenes and providing usable data for all-weather pedestrian tracking. The RGB camera network 106 acquires RGB optical images of the current scene; it is laid out so that the fields of view of the multiple cameras cover the whole scene, the cameras are evenly distributed, and the fields of view of different cameras overlap. The thermal infrared camera network 108 acquires thermal infrared images of the current scene at night and under insufficient illumination, compensating for the shortcomings of the RGB cameras in low light; its layout is consistent with that of the RGB camera network.
Implementation example:
The invention has been applied in practice in a science and technology exhibition hall to monitor and analyze visitor behavior. Face recognition is first performed on people entering the hall, and their movement tracks inside the hall are then tracked throughout their visit. By tracking people in each exhibition area, the number of visitors per area in different time periods can be obtained; by analyzing the tracks of people across the whole hall, on-site snapshots, analysis of historical visit information and corresponding recommendations of content of interest can further be provided.
On the cross-camera multi-target tracking dataset EPFL, the precision of the monitoring system greatly exceeds that of existing methods, as shown in Table 1:
Table 1: Performance comparison on the EPFL dataset. The present invention is significantly superior to existing methods.
Although the invention has been described in detail hereinabove with respect to a general description and specific embodiments thereof, it will be apparent to those skilled in the art that modifications or improvements may be made thereto based on the invention. Accordingly, such modifications and improvements are intended to be within the scope of the invention as claimed.

Claims (10)

1. The intelligent monitoring method based on identity recognition and cross-camera target tracking is characterized by comprising the following steps of:
1) the data reading module acquires video stream data of a plurality of paths of cameras in parallel;
2) the identity registration module divides the video stream acquired by each camera in the data reading module according to frames, performs target detection and face recognition on each frame of image, and binds face information with a pedestrian detection frame by adopting a central point distance greedy matching strategy to obtain an initial track with a real identity;
3) the single-camera pedestrian tracking module performs target detection and pedestrian feature extraction in parallel under multiple paths of cameras and performs data association based on appearance information and motion information by adopting a Hungarian algorithm to obtain a motion track of a target under a single camera;
4) the cross-camera pedestrian track matching module matches tracks and targets according to the appearance characteristics, the motion characteristics and the position characteristics of the local tracks, and generates a global track under a cross-camera for each target.
2. The intelligent monitoring method based on identity recognition and cross-camera target tracking according to claim 1, wherein in step 1) the data reading module creates a multi-process reading queue for each camera based on a multi-process strategy: sub-process A continuously reads each frame of image from the video stream over the RTSP protocol and puts it into the multi-process reading queue, and sub-process B continuously takes images out of the multi-process reading queue for processing and then sends them to the subsequent sub-process modules.
3. The intelligent monitoring method based on identity recognition and cross-camera target tracking according to claim 1, wherein in step 2) the identity registration module obtains the frame-by-frame images under each camera from the data reading module, obtains the pedestrian detection frames, face detection frames and face recognition results in the images based on the target detection device and the face recognition device respectively, filters out unreasonable matches by calculating the intersection-over-union between the pedestrian and face detection frames together with their position information, performs greedy matching based on the Euclidean distance between the upper midpoint of each pedestrian detection frame and the center point of each face detection frame, and binds the face recognition results to the pedestrian detection frames, thereby obtaining initialization tracks with real identities.
4. The intelligent monitoring method based on identity recognition and cross-camera target tracking according to claim 3, wherein the target detection device obtains the pedestrian detection frames in the image using the YOLOv4 algorithm, each pedestrian detection frame comprising parameters $(x_p, y_p, w_p, h_p)$, where $(x_p, y_p)$ are the coordinates of the top-left point of the pedestrian detection frame and $w_p$ and $h_p$ are respectively its width and height; the face recognition device obtains the face detection frames and face recognition results using the InsightFace algorithm, each face detection frame comprising parameters $(x_f, y_f, w_f, h_f)$ and $r$, where $(x_f, y_f)$ are the coordinates of the top-left point of the face detection frame, $w_f$ and $h_f$ are respectively its width and height, and $r$ is the face recognition result; and the greedy matching based on the Euclidean distance between the upper midpoint of the pedestrian detection frame and the center point of the face detection frame comprises the following steps:
S41: construct a two-dimensional matrix $D$ with $M$ rows and $N$ columns representing the position distances between face detection frames and pedestrian detection frames, where $M$ is the number of pedestrian detection frames in the current frame, $N$ is the number of face detection frames, and every entry of $D$ is initialized to infinity; denote by $d_i$ the $i$-th pedestrian detection frame and by $f_j$ the $j$-th face detection frame, where $i \in [1, M]$, $j \in [1, N]$;
S42: calculate the intersection-over-union $IOU_{i,j}$ of pedestrian detection frame $d_i$ and face detection frame $f_j$, and filter out unreasonable matching relationships by judging in the horizontal direction whether face detection frame $f_j$ is contained in pedestrian detection frame $d_i$;
S43: adjust the distance matrix $D$ according to the degree of overlap between the face and pedestrian detection frames:

$$D_{i,j} = \begin{cases} \left\lVert c_i^{d} - c_j^{f} \right\rVert_2, & IOU_{i,j} > 0 \text{ and the pair } (d_i, f_j) \text{ is not filtered out} \\ \infty, & \text{otherwise} \end{cases}$$

where $c_i^{d}$ is the upper midpoint of pedestrian detection frame $d_i$ and $c_j^{f}$ is the center point of face detection frame $f_j$;
S44: perform steps S42 and S43 for all pedestrian and face detection frames; based on the adjusted distance matrix $D$, use a greedy matching strategy to select for each pedestrian detection frame the face detection frame with the minimum distance as the matching result, and add it to the matching list $M_{list}$;
S45: bind the pedestrian detection frames to the face recognition results according to the matching list $M_{list}$, completing the initialization of tracks with real identities.
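As an illustration only (not part of the claims), the greedy binding of steps S41-S45 can be sketched as follows; the box layout, the horizontal-containment filter and the distance threshold are simplified assumptions:

```python
import numpy as np

def bind_faces(ped_boxes, face_boxes, max_dist=50.0):
    # boxes are (x, y, w, h); build the distance matrix D between the upper
    # midpoint of each pedestrian box and the centre of each face box,
    # leaving infinity where the face does not lie horizontally inside the
    # pedestrian box, then match greedily by smallest distance
    M, N = len(ped_boxes), len(face_boxes)
    D = np.full((M, N), np.inf)
    for i, (xp, yp, wp, hp) in enumerate(ped_boxes):
        for j, (xf, yf, wf, hf) in enumerate(face_boxes):
            if xp <= xf and xf + wf <= xp + wp:      # horizontal containment
                top_mid = np.array([xp + wp / 2.0, yp])
                centre = np.array([xf + wf / 2.0, yf + hf / 2.0])
                D[i, j] = np.linalg.norm(top_mid - centre)
    matches = []
    while np.isfinite(D).any():
        i, j = np.unravel_index(np.argmin(D), D.shape)
        if D[i, j] > max_dist:
            break
        matches.append((int(i), int(j)))
        D[i, :] = np.inf     # each pedestrian box bound to at most one face
        D[:, j] = np.inf     # each face bound to at most one pedestrian
    return matches

peds = [(0, 0, 40, 120), (100, 0, 40, 120)]     # two pedestrian boxes
faces = [(110, 5, 20, 20), (10, 5, 20, 20)]     # two face boxes, swapped order
result = bind_faces(peds, faces)
```

The containment filter prevents a face from being bound to a pedestrian box it does not overlap, even if the two are geometrically close.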
5. The intelligent monitoring method based on identity recognition and cross-camera target tracking according to claim 1, wherein in step 3), a single-camera pedestrian tracking module adopts a multi-process strategy, so that multiple paths of cameras can be independent of each other and perform multi-target tracking under the condition of single camera in parallel; distributing three sub-processes for each camera, respectively carrying out pedestrian target detection and pedestrian identity re-identification, and carrying out inter-frame data association based on appearance information and motion information by using a Hungarian algorithm, wherein the sub-processes carry out message transmission through a multi-process message queue:
S51: the pedestrian target detection sub-process continuously obtains frame-by-frame image information from the reading queue and, after target detection is completed, stores the frame-by-frame detection results together with the image information into a detection queue;
S52: the pedestrian identity re-identification sub-process continuously obtains the frame-by-frame detection results and image information from the detection queue and, after feature extraction is completed, stores the frame-by-frame detection results and pedestrian features into a feature queue;
S53: the data association sub-process continuously obtains the frame-by-frame detection results and pedestrian features from the feature queue, passes the tracking result of the previous frame through a Kalman filter to obtain a prediction for the current frame, and associates the tracks of the previous frame with the detections of the current frame using the Hungarian algorithm, combining motion and appearance information;
S54: complete steps S51, S52 and S53 under each camera to obtain the motion track of each pedestrian under a single camera, and write the tracking result under each camera, comprising the camera number, current frame number, track number, identity information, tracking frame position, pedestrian features and face features, into a shared folder for use by the cross-camera pedestrian track matching module.
6. The intelligent monitoring method based on identity recognition and cross-camera target tracking according to claim 1, wherein step 4) is implemented as follows:
S61: acquire the local tracks tracked under each camera in parallel using a multi-process strategy;
S62: characterize the appearance of the local tracks under each camera using ReID features;
S63: project the local tracks under each camera into a common reference coordinate system, and represent the position features of the local tracks by their positions in that coordinate system;
S64: acquire all targets that need to be tracked in the current scene from the identity registration module;
S65: based on the appearance and position features, match the local tracks to the targets using a cascade matching strategy to obtain the global cross-camera tracks of the pedestrians.
7. The intelligent monitoring method based on identity recognition and cross-camera target tracking according to claim 6, wherein in step S62 the appearance of a local track is characterized as follows: let $\mathcal{T}_i = \{T_i^1, T_i^2, \ldots\}$ denote the local track set under the $i$-th camera and $T_i^u$ its $u$-th track, and let $w_t^{i,u}$ and $h_t^{i,u}$ respectively denote the width and height of the tracking frame of the local track $T_i^u$ at time $t$; the appearance of the local track $T_i^u$ is characterized by the pedestrian feature of the tracking frame at time $t^*$, obtained by the following formula:

$$t^{*} = \arg\max_{t}\; w_t^{i,u} \cdot h_t^{i,u}$$

in step S63, the local track under each camera is projected into the reference coordinate system as follows: let $(x_t^{i,u}, y_t^{i,u})$ denote the position of the top-left vertex of the tracking frame of the $u$-th local track $T_i^u$ under the $i$-th camera in the image at time $t$; the projected position $(\hat{x}_t^{i,u}, \hat{y}_t^{i,u})$ of the tracking frame in the reference coordinate system is obtained by the following formula:

$$[\hat{x}_t^{i,u},\; \hat{y}_t^{i,u},\; 1]^{\top} \sim H_i\, [x_t^{i,u},\; y_t^{i,u},\; 1]^{\top}$$

where $H_i$ is the mapping matrix between the $i$-th camera and the reference plane, calculated from the camera's calibration parameters as $H_i = R(K_i[R_i, T_i], [1,2,4])$, in which $[\cdot\,,\cdot]$ is a matrix column-concatenation function, $R(\cdot\,,[1,2,4])$ splices columns 1, 2 and 4 of the input matrix into a new matrix, and $K_i$, $R_i$ and $T_i$ are respectively the intrinsic matrix, extrinsic rotation matrix and extrinsic translation vector of the $i$-th camera;
in step S65, the cascade matching steps are:
(1) establish a position similarity measurement matrix between each local track and each target based on the position features of step S63;
(2) match the local tracks to the targets using a greedy matching strategy according to the position similarity matrix;
(3) rematch the tracks not successfully matched in step (2) according to the appearance features obtained in step S62.
8. Intelligent monitoring system based on identity recognition and cross-camera target tracking, characterized by comprising:
the control unit (102), used for controlling the starting and stopping of the whole intelligent monitoring system (100) based on identity recognition and cross-camera target tracking, and for controlling the scheduling among different units in the tracking system;
the image pickup unit (104) is used for acquiring relevant video stream data of pedestrians;
the computing unit (110), used for performing operations including image preprocessing, face recognition, target detection, feature extraction and target tracking on the acquired video stream data;
and the display unit (112) is used for displaying a real-time tracking result of the pedestrian under the single camera, a cross-camera tracking track of the pedestrian and related information extracted from the tracking track.
9. The intelligent monitoring system based on identification and cross-camera target tracking of claim 8, further comprising:
and the switch (114), used for the communication coupling between the control unit (102) and the image pickup unit (104), the computing unit (110) and the display unit (112), and also for the communication coupling between the computing unit (110) and the image pickup unit (104) and the display unit (112).
10. The intelligent monitoring system based on identification and cross-camera target tracking of claim 9, wherein the camera unit (104) comprises an RGB camera network (106) and a thermal infrared camera network (108); the RGB camera network (106) is used for acquiring RGB optical images of a current scene, the RGB camera network is arranged in a mode that the visual fields of a plurality of cameras can cover the whole scene, the cameras are uniformly distributed, and the visual field ranges of different cameras are overlapped; the thermal infrared camera network (108) is used for acquiring a thermal infrared image of a current scene at night under the condition of insufficient illumination, and is used for making up for the defect of the RGB camera under the condition of insufficient illumination, and the arrangement mode of the thermal infrared camera network is consistent with that of the RGB camera network;
the computing unit (110) is realized by a plurality of computing nodes with GPUs in a distributed deployment mode through a switch (114);
the display unit (112) includes various types of displays including a touch screen.
CN202210335008.8A 2022-03-31 2022-03-31 Intelligent monitoring system and method based on identity recognition and cross-camera target tracking Pending CN114693746A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210335008.8A CN114693746A (en) 2022-03-31 2022-03-31 Intelligent monitoring system and method based on identity recognition and cross-camera target tracking

Publications (1)

Publication Number Publication Date
CN114693746A true CN114693746A (en) 2022-07-01

Family

ID=82140864

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210335008.8A Pending CN114693746A (en) 2022-03-31 2022-03-31 Intelligent monitoring system and method based on identity recognition and cross-camera target tracking

Country Status (1)

Country Link
CN (1) CN114693746A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116542858A (en) * 2023-07-03 2023-08-04 众芯汉创(江苏)科技有限公司 Data splicing analysis system based on space track
CN116580063A (en) * 2023-07-14 2023-08-11 深圳须弥云图空间科技有限公司 Target tracking method, target tracking device, electronic equipment and storage medium
CN117241133A (en) * 2023-11-13 2023-12-15 武汉益模科技股份有限公司 Visual work reporting method and system for multi-task simultaneous operation based on non-fixed position
CN117495913A (en) * 2023-12-28 2024-02-02 中电科新型智慧城市研究院有限公司 Cross-space-time correlation method and device for night target track
CN117576764A (en) * 2024-01-15 2024-02-20 四川大学 Video irrelevant person automatic identification method based on multi-target tracking
CN117576167A (en) * 2024-01-16 2024-02-20 杭州华橙软件技术有限公司 Multi-target tracking method, multi-target tracking device, and computer storage medium
CN117953580A (en) * 2024-01-29 2024-04-30 浙江大学 Behavior recognition method and system based on cross-camera multi-target tracking and electronic equipment
CN118115927A (en) * 2024-04-30 2024-05-31 山东云海国创云计算装备产业创新中心有限公司 Target tracking method, apparatus, computer device, storage medium and program product
CN118115927B (en) * 2024-04-30 2024-07-09 山东云海国创云计算装备产业创新中心有限公司 Target tracking method, apparatus, computer device, storage medium and program product



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination