CN113658223B - Multi-pedestrian detection and tracking method and system based on deep learning - Google Patents
Multi-pedestrian detection and tracking method and system based on deep learning
- Publication number: CN113658223B
- Application number: CN202110917108.7A
- Authority
- CN
- China
- Prior art keywords
- pedestrian
- frame
- image
- deep learning
- detection
- Prior art date
- Legal status: Active (assumption by Google Patents, not a legal conclusion)
Classifications
- G06T7/246 — Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G06N3/045 — Combinations of networks
- G06N3/08 — Learning methods
- G06T2207/10016 — Video; Image sequence
- G06T2207/20081 — Training; Learning
- G06T2207/30196 — Human being; Person
- G06T2207/30241 — Trajectory
- Y02T10/40 — Engine management systems
Abstract
A multi-pedestrian detection and tracking method and system based on deep learning, which detects pedestrians at frame intervals with a CenterNet network model to obtain pedestrian-frame information, meeting both the robustness and the real-time requirements of pedestrian detection. The tracking method, which matches pedestrian ID numbers using the color, texture, and motion direction of each pedestrian's feature region, not only resolves tracking errors caused by pedestrians overlapping and occluding one another in surveillance video, but also speeds up matching, further improving the real-time performance and accuracy of target tracking.
Description
Technical Field
The invention relates to the field of computer vision, and in particular to a multi-pedestrian detection and tracking method and system based on deep learning.
Background
Intelligent surveillance-video analysis is an emerging application direction and prominent research topic in computer vision. With the rapid development of network and digital-video technology, intelligent monitoring has genuinely transformed video surveillance systems. An intelligent monitoring system based on computer-vision algorithms can automatically capture the important information in surveillance video, greatly reducing the manpower and material resources required. Moving-target detection and tracking, as an important basis of intelligent surveillance analysis, plays a significant role in personnel statistics, campus security, and similar tasks, and is widely applied in industry.
Over many years of research on moving-target detection, new algorithms have been proposed continuously and the technology has matured. In the region-proposal-based R-CNN family of detectors, Wang et al. proposed the Guided Anchoring method, which guides anchor generation with image features, but it is too slow to run in real time. To address this, Hei Law et al. proposed the CornerNet algorithm, which uses an Hourglass-104 network as the feature extractor to directly predict the top-left and bottom-right corners of a human body to obtain the detection box, recasting object detection as keypoint detection. Zhou et al. proposed the ExtremeNet algorithm, innovating on how keypoints are selected and combined: the four extreme points (top, bottom, left, right) of a human target plus its center point are chosen as keypoints, so edge and interior information of the target is attended to more directly and detection is more stable. Among moving-target tracking algorithms, the Kalman filter, mean-shift, and particle-filter algorithms are the most influential. These algorithms achieve moving-target detection and tracking to some extent, but traditional detection and tracking algorithms used in isolation are computationally heavy, insufficiently robust, and cannot adapt to changes of the tracked target. With the development of deep learning, sample training and classification have gradually been introduced into moving-target detection and tracking, but detection remains time-consuming, making the real-time requirement of subsequent target tracking hard to meet.
The prior art does not fully exploit the relationship between moving targets in consecutive frames of surveillance video for detection and tracking, and most existing methods cannot balance the two requirements of real-time performance and accuracy.
Disclosure of Invention
To overcome these shortcomings, the invention provides a multi-pedestrian detection and tracking method based on deep learning that effectively avoids tracking errors caused by overlapping occlusion among multiple pedestrians.
The technical scheme adopted to solve the above technical problem is as follows:
a multi-pedestrian detection and tracking method and system based on deep learning, comprising the following steps:
a) A computer collects surveillance video in real time, preprocesses it, and inputs the preprocessed video to a CenterNet pedestrian detection model;
b) Pedestrian detection is performed on the preprocessed surveillance video by deep learning: the pedestrian regions in each image are detected with the CenterNet pedestrian detection model;
c) Whether pedestrians appear is judged; if so, a tracking model is established based on the first frame of pedestrian image and the pedestrian ID of each pedestrian frame is marked; if not, the next frame is detected;
d) The centroid coordinates of each pedestrian are calculated from the coordinates of each pedestrian frame, and each pedestrian's motion direction is calculated from the centroid coordinates;
e) If the distance between the pedestrian centroids of different pedestrian frames is smaller than or equal to a threshold th1, overlapping occlusion exists between the pedestrians; if it is larger than th1, no overlapping occlusion exists;
f) If overlapping occlusion occurs between pedestrian targets of the current frame, the non-overlapping area of each pedestrian frame obtained in step c) is selected as a feature region, color and texture features are extracted from it, and the current frame's color and texture features and motion direction are matched against those of the previous frame; the matched pedestrian ID of the previous frame is taken as the pedestrian ID of the current frame, and the centroids with the same ID number in two adjacent frames are connected, realizing pedestrian tracking;
g) If no overlapping occlusion exists between pedestrian targets of the current frame, the motion direction of the current frame is matched against that of the previous frame; the matched pedestrian ID of the previous frame is taken as the pedestrian ID of the current frame, and the centroids with the same ID number in two adjacent frames are connected, realizing pedestrian tracking.
Further, preprocessing the surveillance video in step a) comprises: converting the surveillance video into images and, starting from the first frame that contains a pedestrian, selecting one frame every k frames as input to the pedestrian detection model.
Further, pedestrian detection on the preprocessed surveillance video in step b) comprises: scaling each input frame to a uniform size and feeding it to the CenterNet pedestrian detection model, which models a pedestrian target as a single point, finds the center point of each pedestrian frame via a keypoint heatmap, and regresses the coordinates and size of the pedestrian frame from the image features at the center point.
Further, marking pedestrian-frame IDs in step c) comprises: if a pedestrian frame exists in the first frame of the surveillance video, a tracking model is established based on that first pedestrian image, and the pedestrian IDs of the n pedestrian frames are marked ID1, ID2, …, IDn.
Further, step d) comprises the steps of:
d-1) The coordinates of two opposite corners of the i-th rectangular pedestrian frame IDi in the k-th frame pedestrian image are (x_i1,k, y_i1,k) and (x_i2,k, y_i2,k), and the coordinates of two opposite corners of the j-th rectangular pedestrian frame IDj in the k-th frame are (x_j1,k, y_j1,k) and (x_j2,k, y_j2,k); in the (k−1)-th frame the corner coordinates of frame IDi are (x_i1,k-1, y_i1,k-1) and (x_i2,k-1, y_i2,k-1), and those of frame IDj are (x_j1,k-1, y_j1,k-1) and (x_j2,k-1, y_j2,k-1);
d-2) The centroid of pedestrian frame IDi in the k-th frame is established as C_i = (x_i,k, y_i,k), i ∈ 1,…,n, with x_i,k = (x_i1,k + x_i2,k)/2 and y_i,k = (y_i1,k + y_i2,k)/2; its centroid in the (k−1)-th frame, C_i = (x_i,k-1, y_i,k-1), is established analogously;
d-3) The centroid of pedestrian frame IDj in the k-th frame is established as C_j = (x_j,k, y_j,k), j ∈ 1,…,n, with x_j,k = (x_j1,k + x_j2,k)/2 and y_j,k = (y_j1,k + y_j2,k)/2; its centroid in the (k−1)-th frame, C_j = (x_j,k-1, y_j,k-1), is established analogously;
d-4) The change of the centroid of pedestrian frame IDi from the (k−1)-th to the k-th frame is Δx_i = x_i,k − x_i,k-1 and Δy_i = y_i,k − y_i,k-1, i ∈ 1,…,n; likewise, the change for pedestrian frame IDj is Δx_j = x_j,k − x_j,k-1 and Δy_j = y_j,k − y_j,k-1, j ∈ 1,…,n;
d-5) The motion-direction angle of pedestrian i is calculated as θ_i = arctan(Δy_i / Δx_i), and the motion-direction angle of pedestrian j as θ_j = arctan(Δy_j / Δx_j).
Further, step e) calculates the distance between pedestrian i and pedestrian j in the k-th frame pedestrian image as d_i,j = √((x_i,k − x_j,k)² + (y_i,k − y_j,k)²).
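Steps d) and e) above can be sketched directly from the stated formulas; the box coordinates, example frames, and the threshold value th1 below are illustrative assumptions, not values fixed by the patent.

```python
import math

def centroid(box):
    # box = (x1, y1, x2, y2): two opposite corners of a pedestrian frame
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

def direction_angle(c_prev, c_curr):
    # theta = arctan(dy / dx); atan2 also handles the vertical case dx == 0
    dx = c_curr[0] - c_prev[0]
    dy = c_curr[1] - c_prev[1]
    return math.atan2(dy, dx)

def centroid_distance(c_i, c_j):
    # d_ij = sqrt((x_i - x_j)^2 + (y_i - y_j)^2)
    return math.hypot(c_i[0] - c_j[0], c_i[1] - c_j[1])

# Hypothetical frames k-1 and k for pedestrians i and j
box_i_prev, box_i_curr = (0, 0, 10, 20), (4, 0, 14, 20)
c_i_prev, c_i_curr = centroid(box_i_prev), centroid(box_i_curr)
theta_i = direction_angle(c_i_prev, c_i_curr)   # moving left to right
c_j = centroid((20, 0, 30, 20))
d_ij = centroid_distance(c_i_curr, c_j)

TH1 = 30.0                 # illustrative occlusion threshold th1
occluded = d_ij <= TH1     # step e): overlap/occlusion decision
```

With the sample boxes, the centroid of pedestrian i moves 4 pixels to the right, giving θ_i = 0 (pure left-to-right motion), and the centroid distance to pedestrian j falls below th1, so the occlusion branch of step f) would be taken.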
A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the deep-learning-based multi-pedestrian detection and tracking method of any of the above steps.
A computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the deep-learning-based multi-pedestrian detection and tracking method described above.
The beneficial effects of the invention are as follows: detecting pedestrians at frame intervals with the CenterNet network model obtains pedestrian-frame information while meeting both the robustness and the real-time requirements of pedestrian detection. The tracking method, which matches pedestrian ID numbers using the color, texture, and motion direction of each pedestrian's feature region, not only resolves tracking errors caused by pedestrians overlapping and occluding one another in surveillance video, but also speeds up matching, further improving the real-time performance and accuracy of target tracking.
Drawings
FIG. 1 is a flow chart of the multi-pedestrian detection and tracking method of the present invention;
FIG. 2 is a schematic view of the initial motion direction of a pedestrian object in accordance with the present invention;
FIG. 3 is a schematic diagram of the centroid coordinates of a k-frame pedestrian in accordance with the present invention;
FIG. 4 is a schematic diagram of a pedestrian crossing overlapping occlusion tracking process of the present invention;
FIG. 5 is a schematic view of a pedestrian feature region selected for a kth frame of the present invention;
fig. 6 is a schematic diagram of a pedestrian non-occlusion tracking process of the present invention.
Detailed Description
The invention is further described with reference to fig. 1 to 6.
As shown in fig. 1, a multi-pedestrian detection and tracking method based on deep learning includes:
a) A computer collects surveillance video in real time, preprocesses it, and inputs the preprocessed video to a CenterNet pedestrian detection model;
b) Pedestrian detection is performed on the preprocessed surveillance video by deep learning: the pedestrian regions in each image are detected with the CenterNet pedestrian detection model;
c) Whether pedestrians appear is judged; if so, a tracking model is established based on the first frame of pedestrian image and the pedestrian ID of each pedestrian frame is marked; if not, the next frame is detected;
d) The centroid coordinates of each pedestrian are calculated from the coordinates of each pedestrian frame, and each pedestrian's motion direction is calculated from the centroid coordinates;
e) If the distance between the pedestrian centroids of different pedestrian frames is smaller than or equal to a threshold th1, overlapping occlusion exists between the pedestrians; if it is larger than th1, no overlapping occlusion exists;
f) If overlapping occlusion occurs between pedestrian targets of the current frame, the non-overlapping area of each pedestrian frame obtained in step c) is selected as a feature region, color and texture features are extracted from it, and the current frame's color and texture features and motion direction are matched against those of the previous frame; the matched pedestrian ID of the previous frame is taken as the pedestrian ID of the current frame, and the centroids with the same ID number in two adjacent frames are connected, realizing pedestrian tracking;
g) If no overlapping occlusion exists between pedestrian targets of the current frame, the motion direction of the current frame is matched against that of the previous frame; the matched pedestrian ID of the previous frame is taken as the pedestrian ID of the current frame, and the centroids with the same ID number in two adjacent frames are connected, realizing pedestrian tracking.
Detecting pedestrians at frame intervals with the CenterNet network model obtains pedestrian-frame information while meeting both the robustness and the real-time requirements of pedestrian detection. Matching pedestrian ID numbers using the color, texture, and motion direction of each pedestrian's feature region not only resolves tracking errors caused by overlapping occlusion among pedestrians in surveillance video but also speeds up matching, further improving the real-time performance and accuracy of target tracking.
Example 1:
the step of preprocessing the surveillance video in step a) is: converting the surveillance video into images and, starting from the first frame that contains a pedestrian, selecting one frame every k frames as input to the pedestrian detection model.
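A minimal sketch of this interval-sampling preprocessing: every k-th frame, starting from the first pedestrian-containing frame, is forwarded to the detector. The decoded-frame list and the value of k are illustrative; the patent does not fix a particular k.

```python
def sample_frames(frames, first_pedestrian_idx, k):
    """Select every k-th frame starting from the first frame containing a pedestrian."""
    return frames[first_pedestrian_idx::k]

# Hypothetical decoded video of 10 frames; the first pedestrian appears in frame 2, k = 3
frames = list(range(10))
selected = sample_frames(frames, first_pedestrian_idx=2, k=3)  # frames 2, 5, 8
```

In a real pipeline the `frames` list would hold decoded video frames (e.g. from a capture device) rather than integers; the slicing logic is the same.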
Example 2:
the step of pedestrian detection on the preprocessed surveillance video in step b) comprises: scaling each input frame to a uniform size and feeding it to the CenterNet pedestrian detection model, which models a pedestrian target as a single point, finds the center point of each pedestrian frame via a keypoint heatmap, and regresses the coordinates and size of the pedestrian frame from the image features at the center point. The specific steps are as follows:
(1) First, the preprocessed surveillance video is input into a fully convolutional neural network, which generates a heatmap for each frame; the top 100 peak points on the heatmap are then taken as candidate pedestrian center points, and a threshold is applied to screen them into the final pedestrian target center points; finally, the image features at each center point on the heatmap are fed into the prediction network to predict the size of the pedestrian frame. The fully convolutional network of the CenterNet pedestrian detection model is an Hourglass network: two Hourglass modules are cascaded into an Hourglass-104 detection network that extracts multi-scale features of pedestrian targets.
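The peak-picking and box decoding described above can be sketched in NumPy. The 3×3 local-maximum suppression, the score threshold, and the array shapes are assumptions for illustration (CenterNet keeps the top 100 peaks, as the patent states):

```python
import numpy as np

def decode_centers(heatmap, size_map, score_thresh=0.3, top_k=100):
    """Pick local-maximum peaks of a keypoint heatmap and decode pedestrian boxes.
    heatmap: (H, W) scores in [0, 1]; size_map: (H, W, 2) predicted (w, h) per cell."""
    H, W = heatmap.shape
    padded = np.pad(heatmap, 1, constant_values=-1.0)
    # a cell is a peak if it is >= all cells in its 3x3 neighbourhood (max-pool NMS)
    neigh = np.stack([padded[dy:dy + H, dx:dx + W]
                      for dy in range(3) for dx in range(3)])
    peaks = (heatmap >= neigh.max(axis=0)) & (heatmap >= score_thresh)
    ys, xs = np.nonzero(peaks)
    order = np.argsort(-heatmap[ys, xs])[:top_k]   # keep the top_k strongest peaks
    boxes = []
    for y, x in zip(ys[order], xs[order]):
        w, h = size_map[y, x]
        boxes.append((x - w / 2, y - h / 2, x + w / 2, y + h / 2))
    return boxes

# One synthetic peak at (y=4, x=6) with predicted size (w=4, h=8)
hm = np.zeros((16, 16)); hm[4, 6] = 0.9
sz = np.zeros((16, 16, 2)); sz[4, 6] = (4.0, 8.0)
boxes = decode_centers(hm, sz)
```

Decoding a single synthetic peak yields one box centered on the peak with the predicted width and height; in the real model the heatmap and size map would come from the Hourglass-104 network heads.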
(2) The pedestrian-detection training data set comprises the INRIA pedestrian dataset and a pedestrian dataset built from the collected surveillance video. Pedestrian positions in the images are annotated with the LabelImg target-detection annotation tool in bounding-box label format, following the MSCOCO dataset format used in the original CenterNet experiments, and the corresponding .json files are generated.
(3) During network training, let the ground-truth keypoint of a training sample be p; its corresponding downsampled center point is p̃ = ⌊p/R⌋, where R is the downsampling stride, R = 4. The ground-truth keypoints are mapped onto a ground-truth keypoint heatmap Y ∈ [0,1]^(W/R × H/R) through a Gaussian kernel, where W is the image width and H is the image height, via Y_xy = exp(−((x − p̃_x)² + (y − p̃_y)²) / (2σ_p²)), where σ_p is a standard deviation adapted to the target size, (x, y) are heatmap coordinates, and (p̃_x, p̃_y) is the center point after downsampling the ground-truth keypoint. The focal loss is used as the training objective: L_k = −(1/N) Σ_xy (1 − Ŷ_xy)^α log(Ŷ_xy) when Y_xy = 1, and −(1/N) Σ_xy (1 − Y_xy)^β (Ŷ_xy)^α log(1 − Ŷ_xy) otherwise, where α = 2 and β = 4 are hyperparameters, N is the number of keypoints in the input image, and Ŷ_xy is the keypoint heatmap value predicted by the CenterNet detection network.
Since the true center points incur a discretization error when the image is downsampled, a local offset prediction Ô is added for each center point. The offset is trained with an L1 loss: L_off = (1/N) Σ_p |Ô_p̃ − (p/R − p̃)|, where Ô_p̃ is the predicted local offset at center point p̃.
After center-point prediction, a size regression is performed for each target. Let the Bbox of target k have position coordinates (x1^(k), y1^(k)) and (x2^(k), y2^(k)); the regression target size is s_k = (x2^(k) − x1^(k), y2^(k) − y1^(k)), and the predicted Bbox size at the center point is Ŝ_pk. The Bbox size loss is L_size = (1/N) Σ_k |Ŝ_pk − s_k|, where s_k is the Bbox size of target k and Ŝ_pk is its predicted size.
The total loss function is L = L_k + λ_size·L_size + λ_off·L_off, where λ_size = 0.1 and λ_off = 0.1; the CenterNet pedestrian detection model is optimally trained using the total loss L.
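The training losses above — the penalty-reduced focal loss over a Gaussian ground-truth heatmap, the L1 offset and size losses, and their weighted sum — can be sketched in NumPy. The map shape, peak location, and sample predictions are illustrative assumptions following the standard CenterNet formulation:

```python
import numpy as np

def gaussian_heatmap(shape, center, sigma):
    # Y_xy = exp(-((x - cx)^2 + (y - cy)^2) / (2 sigma^2)); peak value is 1 at the center
    H, W = shape
    y, x = np.ogrid[:H, :W]
    cy, cx = center
    return np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))

def focal_loss(pred, gt, alpha=2, beta=4):
    # penalty-reduced pixel-wise focal loss; N = number of ground-truth peaks (gt == 1)
    pos = gt == 1
    n = max(pos.sum(), 1)
    pos_term = ((1 - pred[pos]) ** alpha * np.log(pred[pos])).sum()
    neg = ~pos
    neg_term = ((1 - gt[neg]) ** beta * pred[neg] ** alpha * np.log(1 - pred[neg])).sum()
    return -(pos_term + neg_term) / n

def l1_loss(pred, gt, n):
    return np.abs(pred - gt).sum() / n

# One object centered at (y=4, x=6) on an 8x12 downsampled map
gt_hm = gaussian_heatmap((8, 12), (4, 6), sigma=1.5)
pred_hm = np.clip(gt_hm * 0.9 + 0.01, 1e-4, 1 - 1e-4)  # imperfect prediction
L_k = focal_loss(pred_hm, gt_hm)
L_off = l1_loss(np.array([0.3, 0.2]), np.array([0.25, 0.25]), n=1)     # offset L1
L_size = l1_loss(np.array([20.0, 44.0]), np.array([22.0, 46.0]), n=1)  # size L1
L_total = L_k + 0.1 * L_size + 0.1 * L_off   # lambda_size = lambda_off = 0.1
```

The Gaussian heatmap equals exactly 1 at the object center, so that pixel is the single positive in the focal loss; the weighted sum mirrors the total loss L defined above.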
Example 3:
the step of marking pedestrian-frame IDs in step c) is: if a pedestrian frame exists in the first frame of the surveillance video, a tracking model is established based on that first pedestrian image, and the pedestrian IDs of the n pedestrian frames are marked ID1, ID2, …, IDn.
Example 4:
step d) comprises the steps of:
(1) Initializing a motion direction:
taking a surveillance scene in a corridor as an example, the boundary of the monitored plan is generally a wall, door, or passageway, and a moving target appearing from the boundary has four main entry cases, as shown in fig. 2. A coordinate system is determined from the plan of the monitored area; the ID of each pedestrian frame is marked on the first video frame in which a pedestrian target is detected; the pedestrian's centroid coordinates are calculated; and the initial motion direction is determined from the position at which the target enters along the positive or negative x-axis or y-axis of the plan.
(2) Updating the movement direction:
the motion-direction update of a pedestrian target marked in the k-th frame is based on the change of its centroid coordinates from the (k−1)-th frame to the k-th frame. First, the pedestrian's centroid is computed from the detected pedestrian-frame coordinates, as shown in fig. 3. The coordinates of two opposite corners of pedestrian frame IDi in the k-th frame are (x_i1,k, y_i1,k) and (x_i2,k, y_i2,k), and those of pedestrian frame IDj in the k-th frame are (x_j1,k, y_j1,k) and (x_j2,k, y_j2,k); in the (k−1)-th frame the corners of IDi are (x_i1,k-1, y_i1,k-1) and (x_i2,k-1, y_i2,k-1), and those of IDj are (x_j1,k-1, y_j1,k-1) and (x_j2,k-1, y_j2,k-1).
The centroid of pedestrian frame IDi in the k-th frame is established as C_i = (x_i,k, y_i,k), i ∈ 1,…,n, with x_i,k = (x_i1,k + x_i2,k)/2 and y_i,k = (y_i1,k + y_i2,k)/2; its centroid in the (k−1)-th frame, C_i = (x_i,k-1, y_i,k-1), is established analogously.
The centroid of pedestrian frame IDj in the k-th frame is established as C_j = (x_j,k, y_j,k), j ∈ 1,…,n, with x_j,k = (x_j1,k + x_j2,k)/2 and y_j,k = (y_j1,k + y_j2,k)/2; its centroid in the (k−1)-th frame, C_j = (x_j,k-1, y_j,k-1), is established analogously.
The change of the centroid of pedestrian frame IDi from the (k−1)-th to the k-th frame is Δx_i = x_i,k − x_i,k-1 and Δy_i = y_i,k − y_i,k-1, i ∈ 1,…,n; likewise, for pedestrian frame IDj, Δx_j = x_j,k − x_j,k-1 and Δy_j = y_j,k − y_j,k-1, j ∈ 1,…,n. Positive Δx_i and Δy_i indicate that pedestrian i moves left to right and top to bottom respectively; negative values indicate right to left and bottom to top. If pedestrian target i moves purely horizontally (along the x-axis), Δy_i = 0; if it moves purely vertically (along the y-axis), Δx_i = 0.
To present the motion direction more clearly, the motion angle of pedestrian i is computed from Δx_i and Δy_i as θ_i = arctan(Δy_i / Δx_i), and likewise the motion angle of pedestrian j as θ_j = arctan(Δy_j / Δx_j).
Further, step e) calculates the distance between pedestrian i and pedestrian j in the k-th frame pedestrian image as d_i,j = √((x_i,k − x_j,k)² + (y_i,k − y_j,k)²), where (x_i,k, y_i,k) is the centroid C_i of pedestrian i and (x_j,k, y_j,k) is the centroid C_j of pedestrian j.
(1) For the target-tracking errors that can occur when crossing, overlapping pedestrians are tracked, as shown in fig. 4(b): if d_i,j is smaller than or equal to the preset threshold th1, crossing overlap and occlusion exist between pedestrian i and pedestrian j; as shown in fig. 4(c), a tracking error between pedestrians i and j may then occur. First, the crossing overlap area is removed from each pedestrian frame, giving the non-overlapping regions of the frames of pedestrians i and j, which are taken as the pedestrian feature regions; fig. 5 shows the feature regions selected for pedestrians i and j in the k-th frame. A gray-level co-occurrence matrix and color moments are then used to represent the texture and color features of each feature region, and the motion directions θ_i and θ_j of pedestrians i and j from the (k−1)-th to the k-th frame are computed, so that pedestrian tracking combines feature-region color texture with pedestrian motion. Specifically, the feature-region color texture and motion direction of pedestrians i and j in the k-th frame are matched against the color texture and motion direction of each pedestrian in the (k−1)-th frame; when the similarity exceeds a preset threshold th2, the match succeeds: the ID number IDi of pedestrian i in the (k−1)-th frame is taken as the ID number of pedestrian i in the k-th frame, and IDj likewise for pedestrian j. The centroids with the same ID number in the two adjacent frames are connected, completing the tracking of each pedestrian.
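A simplified sketch of the occlusion-case matching: color moments (mean, standard deviation, and skewness per channel) are computed over each pedestrian's non-overlapping feature region and combined with the motion direction to carry IDs across frames. The region contents, cost function, and direction weight are illustrative assumptions; the patent additionally uses gray-level co-occurrence texture features, which are omitted here for brevity.

```python
import numpy as np

def color_moments(region):
    """First three color moments per channel of an (H, W, 3) feature region."""
    px = region.reshape(-1, 3).astype(float)
    mean = px.mean(axis=0)
    std = px.std(axis=0)
    skew = np.cbrt(((px - mean) ** 3).mean(axis=0))  # signed cube root of 3rd moment
    return np.concatenate([mean, std, skew])

def match_id(curr_feat, curr_theta, prev_tracks, w_dir=10.0):
    """Return the previous-frame ID whose appearance and direction are closest."""
    best_id, best_cost = None, float("inf")
    for pid, (feat, theta) in prev_tracks.items():
        cost = np.linalg.norm(curr_feat - feat) + w_dir * abs(curr_theta - theta)
        if cost < best_cost:
            best_id, best_cost = pid, cost
    return best_id

# Hypothetical feature regions: pedestrian 1 wears red, pedestrian 2 wears blue
red = np.zeros((8, 4, 3)); red[..., 0] = 200
blue = np.zeros((8, 4, 3)); blue[..., 2] = 200
prev = {1: (color_moments(red), 0.0), 2: (color_moments(blue), np.pi)}
# Current-frame non-overlap region of the red pedestrian, slightly perturbed
curr = red.copy(); curr[0, 0, 0] = 190
assigned = match_id(color_moments(curr), curr_theta=0.05, prev_tracks=prev)
```

Here the slightly perturbed red region, moving in nearly the same direction as track 1, is assigned that track's ID; a production system would also threshold the best cost (the patent's th2) rather than always accepting the nearest track.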
(2) For tracking pedestrians without crossing overlap, as shown in fig. 6(b): if the distance d_i,j between pedestrians i and j in the k-th frame is larger than the preset threshold th1, no overlapping occlusion exists between them; as shown in fig. 6(c), pedestrians i and j can then be tracked correctly. Only the motion directions θ_i and θ_j of pedestrians i and j from the (k−1)-th to the k-th frame need be computed and matched against the motion directions of the pedestrians in the (k−1)-th frame; the ID number IDi of pedestrian i in the (k−1)-th frame is taken as the ID number of pedestrian i in the k-th frame, and IDj likewise for pedestrian j. The centroids with the same ID number in the two adjacent frames are connected, completing the tracking of each pedestrian.
The invention also relates to a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the deep-learning-based multi-pedestrian detection and tracking method described in any of the above steps.
The invention also relates to a computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the deep-learning-based multi-pedestrian detection and tracking method described in any of the above steps.
Finally, it should be noted that the foregoing description is only a preferred embodiment of the present invention, and the invention is not limited thereto. Although the invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described therein or substitute equivalents for some of their technical features. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall be included in its scope of protection.
Claims (8)
1. A multi-row person detection and tracking method based on deep learning, comprising:
a) A computer collects the monitoring video in real time, preprocesses the monitoring video, and inputs the preprocessed monitoring video to a CenterNet pedestrian detection model;
b) Pedestrian detection is carried out on the preprocessed monitoring video through a deep learning technology, and a pedestrian area in an image is detected by using a CenterNet pedestrian detection model;
c) Judging whether pedestrians appear, if so, establishing a tracking model based on the first frame of pedestrian image and marking the pedestrian ID of each pedestrian frame, and if not, detecting the next frame of pedestrian image;
d) Calculating according to the coordinates of each pedestrian frame to obtain the barycenter coordinates of each pedestrian, and calculating according to the barycenter coordinates to obtain the pedestrian direction;
e) If the distance between the pedestrian mass centers in different pedestrian frames is smaller than or equal to a threshold th1, overlapping occlusion exists between the pedestrians; if the distance between the pedestrian mass centers in different pedestrian frames is larger than the threshold th1, no overlapping occlusion exists between the pedestrians;
f) If overlapping occlusion occurs between the pedestrian targets of the current frame, the non-overlapping area of each pedestrian frame is selected as a feature area according to the pedestrian frames of the current-frame pedestrian image obtained in step c); color and texture features of the feature area are extracted; the color and texture features and the pedestrian direction of the current frame are matched with the color and texture features and the pedestrian direction of the previous frame; the pedestrian ID of the previous frame is taken as the pedestrian ID of the current frame; and the centroids with the same ID number in two adjacent frames are connected, thereby realizing pedestrian tracking;
g) If no overlapping occlusion exists between the pedestrian targets of the current frame, the pedestrian direction of the current frame is matched with the pedestrian direction of the previous frame, the pedestrian ID of the previous frame is taken as the pedestrian ID of the current frame, and the centroids with the same ID number in two adjacent frames are connected, thereby realizing pedestrian tracking.
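Steps e) and f) of claim 1 can be illustrated with a minimal sketch: when two pedestrians occlude each other, a non-overlapping sub-region of each frame supplies the appearance features used for matching. The horizontal-only region splitting and the color-histogram feature below are simplified assumptions, not the claimed implementation:

```python
import numpy as np

def non_overlap_region(box, other):
    """Crop of `box` that excludes its horizontal overlap with `other`.
    Boxes are (x1, y1, x2, y2); vertical overlap is ignored for simplicity."""
    x1, y1, x2, y2 = box
    ox1, _, ox2, _ = other
    if ox1 <= x1 <= ox2 <= x2:      # `other` covers the left part of `box`
        return (ox2, y1, x2, y2)
    if x1 <= ox1 <= x2 <= ox2:      # `other` covers the right part of `box`
        return (x1, y1, ox1, y2)
    return box                       # no horizontal overlap

def color_histogram(image, box, bins=8):
    """Normalized per-channel color histogram of the pixels inside `box`."""
    x1, y1, x2, y2 = box
    patch = image[y1:y2, x1:x2]
    hist = np.concatenate([
        np.histogram(patch[..., c], bins=bins, range=(0, 256))[0]
        for c in range(3)
    ]).astype(float)
    return hist / max(hist.sum(), 1.0)

def histogram_similarity(h1, h2):
    """Histogram intersection in [0, 1]; higher means more similar."""
    return float(np.minimum(h1, h2).sum())
```

An occluded pedestrian's visible strip is compared against the previous frame's stored histogram; the highest-similarity match (combined with the direction match of step g) decides which previous-frame ID the detection inherits.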
2. The deep-learning-based multi-row person detection and tracking method according to claim 1, wherein preprocessing the surveillance video in step a) comprises: converting the surveillance video into images and, starting from the first frame image containing a pedestrian, selecting one frame image every k frames as the input of the pedestrian detection model.
3. The deep-learning-based multi-row person detection and tracking method according to claim 1, wherein performing pedestrian detection on the preprocessed surveillance video by the deep learning technique in step b) comprises: scaling the input frame images to a uniform size and inputting them into the CenterNet pedestrian detection model; modeling each pedestrian target as a single point; finding the center point of each pedestrian frame through a Keypoint Heatmap (key-point thermodynamic diagram); and regressing the coordinates and size information of the pedestrian frame from the image features at the center point.
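The CenterNet-style decoding in claim 3 — treating heatmap peaks as pedestrian-frame centers and regressing box sizes at those points — can be sketched as follows. This is a simplified illustration under stated assumptions: in the real model a CNN predicts both maps, and the function name and 3x3 peak-picking are not from the patent:

```python
import numpy as np

def extract_boxes(heatmap, size_map, score_thresh=0.3):
    """Decode pedestrian boxes from a center-point heatmap plus a size map.

    heatmap:  (H, W) center-point confidences in [0, 1]
    size_map: (H, W, 2) predicted (width, height) at each location
    Returns a list of (x1, y1, x2, y2, score) boxes.
    """
    boxes = []
    H, W = heatmap.shape
    for y in range(H):
        for x in range(W):
            s = heatmap[y, x]
            if s < score_thresh:
                continue
            # keep only local maxima in a 3x3 neighbourhood (simple NMS)
            patch = heatmap[max(0, y - 1):y + 2, max(0, x - 1):x + 2]
            if s < patch.max():
                continue
            w, h = size_map[y, x]
            boxes.append((x - w / 2, y - h / 2, x + w / 2, y + h / 2, float(s)))
    return boxes
```

Each retained peak yields one pedestrian frame centered on that point, with the regressed width and height spread symmetrically around it.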
4. The deep-learning-based multi-row person detection and tracking method according to claim 1, wherein marking the pedestrian ID of each pedestrian frame in step c) comprises: if a pedestrian frame exists in the first frame image of the surveillance video, building a tracking model based on the first frame pedestrian image and marking the pedestrian IDs of the n pedestrian frames as ID1, ID2, ..., IDn.
5. The deep-learning-based multi-row person detection and tracking method according to claim 4, wherein step d) comprises the following steps:
d-1) The coordinates of two opposite corners of the i-th rectangular pedestrian frame IDi in the k-th frame pedestrian image are (x_{i1,k}, y_{i1,k}) and (x_{i2,k}, y_{i2,k}); the coordinates of two opposite corners of the j-th rectangular pedestrian frame IDj in the k-th frame pedestrian image are (x_{j1,k}, y_{j1,k}) and (x_{j2,k}, y_{j2,k}); the coordinates of two opposite corners of the i-th rectangular pedestrian frame IDi in the (k-1)-th frame pedestrian image are (x_{i1,k-1}, y_{i1,k-1}) and (x_{i2,k-1}, y_{i2,k-1}); the coordinates of two opposite corners of the j-th rectangular pedestrian frame IDj in the (k-1)-th frame pedestrian image are (x_{j1,k-1}, y_{j1,k-1}) and (x_{j2,k-1}, y_{j2,k-1});
d-2) The centroid C_i = (x_{i,k}, y_{i,k}), i ∈ 1, ..., n, of the pedestrian frame IDi in the k-th frame pedestrian image is established, where x_{i,k} = (x_{i1,k} + x_{i2,k})/2 and y_{i,k} = (y_{i1,k} + y_{i2,k})/2; the centroid C_i = (x_{i,k-1}, y_{i,k-1}), i ∈ 1, ..., n, of the pedestrian frame IDi in the (k-1)-th frame pedestrian image is established, where x_{i,k-1} = (x_{i1,k-1} + x_{i2,k-1})/2 and y_{i,k-1} = (y_{i1,k-1} + y_{i2,k-1})/2;
d-3) The centroid C_j = (x_{j,k}, y_{j,k}), j ∈ 1, ..., n, of the pedestrian frame IDj in the k-th frame pedestrian image is established, where x_{j,k} = (x_{j1,k} + x_{j2,k})/2 and y_{j,k} = (y_{j1,k} + y_{j2,k})/2; the centroid C_j = (x_{j,k-1}, y_{j,k-1}), j ∈ 1, ..., n, of the pedestrian frame IDj in the (k-1)-th frame pedestrian image is established, where x_{j,k-1} = (x_{j1,k-1} + x_{j2,k-1})/2 and y_{j,k-1} = (y_{j1,k-1} + y_{j2,k-1})/2;
d-4) The variation Δx_i of the centroid abscissa and the variation Δy_i of the centroid ordinate of the pedestrian frame IDi from the (k-1)-th frame to the k-th frame are calculated by Δx_i = x_{i,k} - x_{i,k-1} and Δy_i = y_{i,k} - y_{i,k-1}, i ∈ 1, ..., n; the variation Δx_j of the centroid abscissa and the variation Δy_j of the centroid ordinate of the pedestrian frame IDj from the (k-1)-th frame to the k-th frame are calculated by Δx_j = x_{j,k} - x_{j,k-1} and Δy_j = y_{j,k} - y_{j,k-1}, j ∈ 1, ..., n;
d-5) The angle θ_i of the motion direction of pedestrian i is calculated by θ_i = arctan(Δy_i / Δx_i), and the angle θ_j of the motion direction of pedestrian j is calculated by θ_j = arctan(Δy_j / Δx_j).
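A minimal sketch of the centroid and motion-direction computations in steps d-2) through d-5). The function names are assumptions, and atan2 is used in place of a bare arctangent so that Δx = 0 and opposite quadrants are handled correctly:

```python
import math

def centroid(box):
    """Centroid of a rectangle given two opposite corners (x1, y1, x2, y2):
    the midpoint of the diagonal."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

def motion_direction(c_prev, c_curr):
    """Angle theta of the centroid displacement from frame k-1 to frame k,
    in radians, measured from the positive x-axis."""
    dx = c_curr[0] - c_prev[0]   # delta-x of the centroid
    dy = c_curr[1] - c_prev[1]   # delta-y of the centroid
    return math.atan2(dy, dx)
```

For example, a centroid moving from (0, 0) to (1, 1) yields θ = π/4, i.e. motion toward the lower-right in image coordinates.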
6. The deep-learning-based multi-row person detection and tracking method according to claim 5, wherein in step e) the distance d_{i,j} between pedestrian i and pedestrian j in the k-th frame pedestrian image is calculated by d_{i,j} = sqrt((x_{i,k} - x_{j,k})^2 + (y_{i,k} - y_{j,k})^2).
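The claim-6 Euclidean distance and the claim-1 step-e) occlusion test can be sketched together (illustrative only; the function names and the choice of `th1` are assumptions):

```python
import math

def centroid_distance(ci, cj):
    """Euclidean distance d_ij between two pedestrian centroids (x, y)."""
    return math.hypot(ci[0] - cj[0], ci[1] - cj[1])

def is_overlapping(ci, cj, th1):
    """Overlapping occlusion is assumed when d_ij <= th1 (step e)."""
    return centroid_distance(ci, cj) <= th1
```

When `is_overlapping` returns True the appearance-based matching of step f) is used; otherwise the direction-only matching of step g) suffices.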
7. A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the deep-learning-based multi-row person detection and tracking method of any one of claims 1-6.
8. A computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the deep-learning-based multi-row person detection and tracking method of any one of claims 1-6 when executing the program.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110917108.7A CN113658223B (en) | 2021-08-11 | 2021-08-11 | Multi-row person detection and tracking method and system based on deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113658223A CN113658223A (en) | 2021-11-16 |
CN113658223B true CN113658223B (en) | 2023-08-04 |
Family
ID=78491350
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110917108.7A Active CN113658223B (en) | 2021-08-11 | 2021-08-11 | Multi-row person detection and tracking method and system based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113658223B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115695818B (en) * | 2023-01-05 | 2023-04-07 | 广东瑞恩科技有限公司 | Efficient management method for intelligent park monitoring data based on Internet of things |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20030073879A (en) * | 2002-03-13 | 2003-09-19 | 주식회사 엘지이아이 | Realtime face detection and moving tracing method |
CN102104771A (en) * | 2010-12-14 | 2011-06-22 | 浙江工业大学 | Multi-channel people stream rate monitoring system based on wireless monitoring |
CN104537647A (en) * | 2014-12-12 | 2015-04-22 | 中安消技术有限公司 | Target detection method and device |
CN104715238A (en) * | 2015-03-11 | 2015-06-17 | 南京邮电大学 | Pedestrian detection method based on multi-feature fusion |
CN107564034A (en) * | 2017-07-27 | 2018-01-09 | 华南理工大学 | The pedestrian detection and tracking of multiple target in a kind of monitor video |
CN110688987A (en) * | 2019-10-16 | 2020-01-14 | 山东建筑大学 | Pedestrian position detection and tracking method and system |
WO2020155873A1 (en) * | 2019-02-02 | 2020-08-06 | 福州大学 | Deep apparent features and adaptive aggregation network-based multi-face tracking method |
CN111783576A (en) * | 2020-06-18 | 2020-10-16 | 西安电子科技大学 | Pedestrian re-identification method based on improved YOLOv3 network and feature fusion |
Also Published As
Publication number | Publication date |
---|---|
CN113658223A (en) | 2021-11-16 |
Legal Events
Date | Code | Title | Description
---|---|---|---
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||