CN113658223B - Multi-pedestrian detection and tracking method and system based on deep learning - Google Patents

Multi-pedestrian detection and tracking method and system based on deep learning

Info

Publication number
CN113658223B
CN113658223B
Authority
CN
China
Prior art keywords
pedestrian
frame
image
deep learning
detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110917108.7A
Other languages
Chinese (zh)
Other versions
CN113658223A (en)
Inventor
曹建荣
朱亚琴
张玉婷
韩发通
庄园
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Jianzhu University
Original Assignee
Shandong Jianzhu University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Jianzhu University
Priority to CN202110917108.7A
Publication of CN113658223A
Application granted
Publication of CN113658223B
Legal status: Active (current)
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30241Trajectory
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

A multi-pedestrian detection and tracking method and system based on deep learning. Pedestrians are detected at frame intervals with a CenterNet network model to obtain pedestrian frame information, satisfying both the robustness and real-time requirements of pedestrian detection. The tracking method, which matches pedestrian ID numbers using the color, texture, and movement direction of pedestrian feature regions, not only solves the problem of pedestrian tracking errors caused by overlapping occlusion of pedestrians in the monitoring video but also improves matching speed, further improving the real-time performance and accuracy of target tracking.

Description

Multi-pedestrian detection and tracking method and system based on deep learning
Technical Field
The invention relates to the field of computer vision, in particular to a multi-pedestrian detection and tracking method and system based on deep learning.
Background
Intelligent monitoring video analysis is an emerging application direction and a prominent research topic in the field of computer vision. With the rapid development of network technology and digital video technology, intelligent monitoring technology has brought real change to video surveillance systems. An intelligent monitoring system based on computer vision algorithms can automatically capture important information in the monitoring video, greatly reducing the required manpower and material resources. Detection and tracking of moving targets are important foundations of intelligent monitoring analysis, play an important role in people counting, campus security, and the like, and are widely applied in industry.
Over many years of research and development of moving target detection technology, new algorithms have been proposed continuously, and the technology has matured. For example, in the R-CNN family of region-based target detection methods, Wang et al. proposed the Guided Anchoring method, which guides the generation of anchors through image features, but it is too slow to meet real-time requirements. To address this problem, Hei Law et al. proposed the CornerNet algorithm, which uses the Hourglass-104 network as the feature extraction network to directly predict the upper-left and lower-right corners of the human body to obtain the detection frame, treating the object detection problem as a keypoint detection problem. Zhou et al. proposed the ExtremeNet algorithm, innovating on the selection and combination of keypoints: the four extreme points (top, bottom, left, and right) of the human target and a center point are selected as keypoints, so that the edge and interior information of the target is attended to more directly and detection is more stable. Among moving target tracking algorithms, the Kalman filtering algorithm, the mean-shift algorithm, and the particle filtering algorithm have been the most influential. These algorithms achieve moving target detection and tracking to a certain extent, but the traditional detection and tracking algorithms are used in isolation, require a large amount of computation, lack robustness, and cannot adapt to changes in the tracked target. With the development of deep learning, sample training and classification have gradually been introduced into moving target detection and tracking, but detection remains time-consuming, making it difficult to meet the real-time requirements of subsequent target tracking.
The prior art does not fully exploit the relationship between moving targets in consecutive frames of the monitoring video for moving target detection and tracking, and most existing methods cannot balance the two requirements of real-time performance and accuracy.
Disclosure of Invention
In order to overcome the above shortcomings, the invention provides a multi-pedestrian detection and tracking method based on deep learning, which effectively avoids tracking errors caused by overlapping occlusion among multiple pedestrians.
The technical solution adopted to overcome the above technical problems is as follows:
a multi-line person detection and tracking method and system based on deep learning comprises the following steps:
a) The computer collects the monitoring video in real time, preprocesses the monitoring video, and inputs the preprocessed monitoring video to the CenterNet pedestrian detection model;
b) Pedestrian detection is carried out on the preprocessed monitoring video through a deep learning technology, and a pedestrian area in an image is detected by using a CenterNet pedestrian detection model;
c) Judging whether pedestrians appear, if so, establishing a tracking model based on the first frame of pedestrian image and marking the pedestrian ID of each pedestrian frame, and if not, detecting the next frame of pedestrian image;
d) Calculating the centroid coordinates of each pedestrian from the coordinates of each pedestrian frame, and calculating the pedestrian movement direction from the centroid coordinates;
e) If the distance between the centroids of pedestrians in different pedestrian frames is smaller than or equal to a threshold th1, overlapping occlusion exists between the pedestrians; if the distance between the centroids of pedestrians in different pedestrian frames is larger than the threshold th1, no overlapping occlusion exists between the pedestrians;
f) If overlapping occlusion occurs between the pedestrian targets of the current frame, selecting the non-overlapping region of each pedestrian frame as a feature region according to the pedestrian frames of the current-frame pedestrian image obtained in step c), extracting the color and texture features of the feature regions, matching the color and texture features and movement direction of each current-frame pedestrian with the color and texture features and movement directions of the pedestrians in the previous frame, taking the matched pedestrian ID of the previous frame as the pedestrian ID of the current frame, and connecting the centroids of the same ID number in two adjacent frames, realizing pedestrian tracking;
g) If no overlapping occlusion exists between the pedestrian targets of the current frame, matching the movement direction of each current-frame pedestrian with the movement directions of the pedestrians in the previous frame, taking the matched pedestrian ID of the previous frame as the pedestrian ID of the current frame, and connecting the centroids of the same ID number in two adjacent frames, realizing pedestrian tracking.
Further, the step of preprocessing the monitoring video in step a) is as follows: converting the monitoring video into images and, starting from the first frame image that contains a pedestrian, selecting one frame image every k frames as the input of the pedestrian detection model.
Further, the step of performing pedestrian detection on the preprocessed monitoring video by the deep learning technique in step b) is as follows: scaling the input frame images to a uniform size and inputting them into the CenterNet pedestrian detection model, modeling the pedestrian target as a single point, finding the center point of the pedestrian frame through a keypoint heatmap, and regressing the coordinates and size information of the pedestrian frame from the image features at the center point.
Further, the step of marking the ID of each pedestrian frame in step c) is as follows: if a pedestrian frame exists in the first frame image of the monitoring video, a tracking model is built based on the first-frame pedestrian image, and the pedestrian IDs of the n pedestrian frames are marked as ID1, ID2, …, IDn.
Further, step d) comprises the steps of:
d-1) the coordinates of two opposite corners of the i-th rectangular pedestrian frame IDi in the k-th frame pedestrian image are (x_{i1,k}, y_{i1,k}) and (x_{i2,k}, y_{i2,k}); the coordinates of two opposite corners of the j-th rectangular pedestrian frame IDj in the k-th frame pedestrian image are (x_{j1,k}, y_{j1,k}) and (x_{j2,k}, y_{j2,k}); the coordinates of two opposite corners of pedestrian frame IDi in the (k-1)-th frame pedestrian image are (x_{i1,k-1}, y_{i1,k-1}) and (x_{i2,k-1}, y_{i2,k-1}); and the coordinates of two opposite corners of pedestrian frame IDj in the (k-1)-th frame pedestrian image are (x_{j1,k-1}, y_{j1,k-1}) and (x_{j2,k-1}, y_{j2,k-1});
d-2) the centroid C_i = (x_{i,k}, y_{i,k}), i ∈ 1,…,n, of pedestrian frame IDi in the k-th frame pedestrian image is established with x_{i,k} = (x_{i1,k} + x_{i2,k})/2 and y_{i,k} = (y_{i1,k} + y_{i2,k})/2, and the centroid C_i = (x_{i,k-1}, y_{i,k-1}), i ∈ 1,…,n, of pedestrian frame IDi in the (k-1)-th frame pedestrian image is established with x_{i,k-1} = (x_{i1,k-1} + x_{i2,k-1})/2 and y_{i,k-1} = (y_{i1,k-1} + y_{i2,k-1})/2;
d-3) likewise, the centroid C_j = (x_{j,k}, y_{j,k}), j ∈ 1,…,n, of pedestrian frame IDj in the k-th frame pedestrian image and the centroid C_j = (x_{j,k-1}, y_{j,k-1}), j ∈ 1,…,n, of pedestrian frame IDj in the (k-1)-th frame pedestrian image are established;
d-4) the change Δx_i = x_{i,k} − x_{i,k-1} of the centroid abscissa and the change Δy_i = y_{i,k} − y_{i,k-1} of the centroid ordinate of pedestrian frame IDi from the (k-1)-th frame to the k-th frame are calculated, i ∈ 1,…,n; likewise, the changes Δx_j = x_{j,k} − x_{j,k-1} and Δy_j = y_{j,k} − y_{j,k-1} of pedestrian frame IDj are calculated, j ∈ 1,…,n;
d-5) the movement direction angle of pedestrian i is calculated as θ_i = arctan(Δy_i / Δx_i), and the movement direction angle of pedestrian j as θ_j = arctan(Δy_j / Δx_j).
Further, in step e), the distance d_{i,j} between pedestrian i and pedestrian j in the k-th frame pedestrian image is calculated by the formula d_{i,j} = √((x_{i,k} − x_{j,k})² + (y_{i,k} − y_{j,k})²).
A computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the deep learning based multi-pedestrian detection and tracking method described in any of the above steps.
A computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the deep learning based multi-pedestrian detection and tracking method described in the above steps.
The beneficial effects of the invention are as follows: detecting pedestrians at frame intervals with the CenterNet network model obtains pedestrian frame information while satisfying both the robustness and real-time requirements of pedestrian detection. The tracking method, which matches pedestrian ID numbers using the color, texture, and movement direction of pedestrian feature regions, not only solves the problem of pedestrian tracking errors caused by overlapping occlusion of pedestrians in the monitoring video but also improves matching speed, further improving the real-time performance and accuracy of target tracking.
Drawings
FIG. 1 is a flow chart of the multi-pedestrian detection and tracking method of the present invention;
FIG. 2 is a schematic view of the initial motion direction of a pedestrian object in accordance with the present invention;
FIG. 3 is a schematic diagram of the centroid coordinates of a pedestrian in the k-th frame in accordance with the present invention;
FIG. 4 is a schematic diagram of the tracking process for pedestrians with crossing and overlapping occlusion in accordance with the present invention;
FIG. 5 is a schematic view of the pedestrian feature regions selected in the k-th frame in accordance with the present invention;
FIG. 6 is a schematic diagram of the tracking process for pedestrians without occlusion in accordance with the present invention.
Detailed Description
The invention is further described with reference to fig. 1 to 6.
As shown in FIG. 1, a multi-pedestrian detection and tracking method based on deep learning includes:
a) The computer collects the monitoring video in real time, preprocesses the monitoring video, and inputs the preprocessed monitoring video to the CenterNet pedestrian detection model;
b) Pedestrian detection is carried out on the preprocessed monitoring video through a deep learning technology, and a pedestrian area in an image is detected by using a CenterNet pedestrian detection model;
c) Judging whether pedestrians appear, if so, establishing a tracking model based on the first frame of pedestrian image and marking the pedestrian ID of each pedestrian frame, and if not, detecting the next frame of pedestrian image;
d) Calculating the centroid coordinates of each pedestrian from the coordinates of each pedestrian frame, and calculating the pedestrian movement direction from the centroid coordinates;
e) If the distance between the centroids of pedestrians in different pedestrian frames is smaller than or equal to a threshold th1, overlapping occlusion exists between the pedestrians; if the distance between the centroids of pedestrians in different pedestrian frames is larger than the threshold th1, no overlapping occlusion exists between the pedestrians;
f) If overlapping occlusion occurs between the pedestrian targets of the current frame, selecting the non-overlapping region of each pedestrian frame as a feature region according to the pedestrian frames of the current-frame pedestrian image obtained in step c), extracting the color and texture features of the feature regions, matching the color and texture features and movement direction of each current-frame pedestrian with the color and texture features and movement directions of the pedestrians in the previous frame, taking the matched pedestrian ID of the previous frame as the pedestrian ID of the current frame, and connecting the centroids of the same ID number in two adjacent frames, realizing pedestrian tracking;
g) If no overlapping occlusion exists between the pedestrian targets of the current frame, matching the movement direction of each current-frame pedestrian with the movement directions of the pedestrians in the previous frame, taking the matched pedestrian ID of the previous frame as the pedestrian ID of the current frame, and connecting the centroids of the same ID number in two adjacent frames, realizing pedestrian tracking.
Detecting pedestrians at frame intervals with the CenterNet network model obtains pedestrian frame information while satisfying both the robustness and real-time requirements of pedestrian detection. The tracking method, which matches pedestrian ID numbers using the color, texture, and movement direction of pedestrian feature regions, not only solves the problem of pedestrian tracking errors caused by overlapping occlusion of pedestrians in the monitoring video but also improves matching speed, further improving the real-time performance and accuracy of target tracking.
Example 1:
The step of preprocessing the monitoring video in step a) is as follows: converting the monitoring video into images and, starting from the first frame image that contains a pedestrian, selecting one frame image every k frames as the input of the pedestrian detection model.
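For illustration, the following is a minimal Python sketch of this preprocessing step using OpenCV (the patent does not name an implementation library; the detect_pedestrians callback is a hypothetical stand-in for the CenterNet detector described below):

```python
import cv2

def sample_frames(video_path, k, detect_pedestrians):
    """Yield every k-th frame, starting from the first frame
    in which the detector reports at least one pedestrian."""
    cap = cv2.VideoCapture(video_path)
    started = False   # becomes True at the first frame containing a pedestrian
    index = 0         # frame counter used for the every-k sampling
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if not started and detect_pedestrians(frame):
            started = True
            index = 0                 # restart counting from the first pedestrian frame
        if started:
            if index % k == 0:
                yield frame           # this frame is fed to the detection model
            index += 1
    cap.release()
```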
Example 2:
The step of performing pedestrian detection on the preprocessed monitoring video by the deep learning technique in step b) is as follows: scaling the input frame images to a uniform size and inputting them into the CenterNet pedestrian detection model, modeling the pedestrian target as a single point, finding the center point of the pedestrian frame through a keypoint heatmap, and regressing the coordinates and size information of the pedestrian frame from the image features at the center point. The method comprises the following specific steps:
(1) First, the preprocessed monitoring video is input into a fully convolutional neural network, and a heatmap is generated for each frame image; then the first 100 peak points on the heatmap are taken as detected pedestrian center points, and a threshold is set for screening to obtain the final pedestrian target center points; finally, the image features corresponding to the center points on the heatmap are input into the prediction network to predict the size information of the pedestrian frames. The fully convolutional neural network of the CenterNet pedestrian detection model is an Hourglass network: the Hourglass-104 detection network is formed by cascading two Hourglass modules and extracts multi-scale features of the pedestrian target.
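A minimal PyTorch sketch of the center-point screening just described (top 100 heatmap peaks followed by threshold screening) is given below; the 3×3 max-pooling peak test and the threshold value 0.3 are assumptions of this sketch, as the patent does not fix them:

```python
import torch
import torch.nn.functional as F

def decode_centers(heatmap, top_k=100, score_thresh=0.3):
    """heatmap: (H, W) tensor of center-point confidences in [0, 1].
    Returns the (y, x) coordinates and scores of the retained center points."""
    hm = heatmap.unsqueeze(0).unsqueeze(0)        # (1, 1, H, W)
    # A location counts as a peak if it equals the max of its 3x3 neighborhood.
    peaks = (hm == F.max_pool2d(hm, kernel_size=3, stride=1, padding=1)).float() * hm
    scores, idx = peaks.view(-1).topk(top_k)      # first 100 peak points
    keep = scores > score_thresh                  # threshold screening
    idx, scores = idx[keep], scores[keep]
    w = heatmap.shape[1]
    ys = torch.div(idx, w, rounding_mode="floor") # row index of each center point
    xs = idx % w                                  # column index of each center point
    return ys, xs, scores
```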
(2) The pedestrian detection training data set consists of the INRIA pedestrian data set and a pedestrian data set made from the collected monitoring video. The target detection annotation tool LabelImg is then used to mark the positions of pedestrians in the images with bounding-box labels, and the annotations are converted to the MSCOCO data set format used in the original CenterNet experiments to generate the corresponding .json files.
(3) In the network training process, let the real keypoint of a training sample be p; after downsampling, the corresponding center point is p̃ = ⌊p/R⌋, where R is the downsampling stride, R = 4. The real keypoints are mapped onto the real keypoint heatmap Y ∈ [0,1]^{(W/R)×(H/R)} through a Gaussian kernel, where W is the width of the image and H is the height of the image:

Y_{xy} = exp(−((x − p̃_x)² + (y − p̃_y)²) / (2σ_p²)),

where σ_p is a standard deviation related to the target size, (x, y) are the heatmap coordinates, and (p̃_x, p̃_y) are the coordinates of the center point obtained by downsampling the real keypoint. The focal loss is used as the training target loss function:

L_k = −(1/N) Σ_{xy} (1 − Ŷ_{xy})^α · log(Ŷ_{xy})                      if Y_{xy} = 1,
L_k = −(1/N) Σ_{xy} (1 − Y_{xy})^β · (Ŷ_{xy})^α · log(1 − Ŷ_{xy})     otherwise,

where α and β are hyperparameters, α = 2, β = 4, N is the number of keypoints in the input image, and Ŷ_{xy} is the keypoint heatmap value predicted by the CenterNet detection network.

Since the real center points carry a discretization offset after the image is downsampled, a local offset prediction Ô ∈ R^{(W/R)×(H/R)×2} is added for each center point and trained with the L1 loss

L_off = (1/N) Σ_p |Ô_{p̃} − (p/R − p̃)|,

where Ô_{p̃} is the local offset prediction at center point p̃.

After center-point prediction, size regression is performed for each target. Let the Bbox of target k have position coordinates (x₁^{(k)}, y₁^{(k)}) and (x₂^{(k)}, y₂^{(k)}); the regression target size is s_k = (x₂^{(k)} − x₁^{(k)}, y₂^{(k)} − y₁^{(k)}), and the predicted Bbox size at the center point of target k is Ŝ_{p_k}. The Bbox size loss function is

L_size = (1/N) Σ_{k=1}^{N} |Ŝ_{p_k} − s_k|.

The total loss function is calculated as L = L_k + λ_size·L_size + λ_off·L_off, where λ_size = 0.1 and λ_off = 0.1, and the CenterNet pedestrian detection model is trained by optimizing the total loss function L.
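The three loss terms above can be sketched as follows (a PyTorch illustration under the stated hyperparameters α = 2, β = 4, λ_size = λ_off = 0.1; the tensor layout, with offset and size values already gathered at the N ground-truth center points, is an assumption of this sketch):

```python
import torch

def centernet_loss(pred_hm, gt_hm, pred_off, gt_off, pred_size, gt_size,
                   alpha=2.0, beta=4.0, lam_size=0.1, lam_off=0.1):
    """pred_hm, gt_hm: (H, W) keypoint heatmaps (gt_hm is the Gaussian-splatted Y).
    pred_off/gt_off and pred_size/gt_size: (N, 2) values gathered at the N
    ground-truth center points (an assumed layout)."""
    eps = 1e-6
    pos = gt_hm.eq(1).float()            # locations where Y_xy == 1
    neg = 1.0 - pos
    n = max(pos.sum().item(), 1.0)       # N, number of keypoints
    # Focal loss L_k on the heatmap.
    pos_loss = ((1 - pred_hm) ** alpha * torch.log(pred_hm + eps) * pos).sum()
    neg_loss = ((1 - gt_hm) ** beta * pred_hm ** alpha
                * torch.log(1 - pred_hm + eps) * neg).sum()
    l_k = -(pos_loss + neg_loss) / n
    # L1 losses for the local offset and the Bbox size, averaged over N.
    l_off = torch.abs(pred_off - gt_off).sum() / n
    l_size = torch.abs(pred_size - gt_size).sum() / n
    return l_k + lam_size * l_size + lam_off * l_off
```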
Example 3:
The step of marking the ID of each pedestrian frame in step c) is as follows: if a pedestrian frame exists in the first frame image of the monitoring video, a tracking model is built based on the first-frame pedestrian image, and the pedestrian IDs of the n pedestrian frames are marked as ID1, ID2, …, IDn.
Example 4:
step d) comprises the steps of:
(1) Initializing a motion direction:
Taking a monitoring scene in a corridor environment as an example, the boundaries of the monitored floor plan are generally walls, doors, passageways, and the like, and a moving target appearing from the boundary enters in mainly four movement cases, as shown in FIG. 2. A coordinate system is determined from the floor plan of the monitored area, the ID of each pedestrian frame is marked on the first video frame in which a pedestrian target is detected, the centroid coordinates of the pedestrians are calculated, and the initial movement direction is determined by the position at which the target enters from the positive or negative direction of the x-axis or the y-axis in the plan.
(2) Updating the movement direction:
The movement-direction update of a pedestrian target marked in the k-th frame is based on the change of its centroid coordinates from the (k-1)-th frame to the k-th frame. First, the centroid coordinates of each pedestrian are calculated from the detected pedestrian-frame coordinates. As shown in FIG. 3, the coordinates of two opposite corners of pedestrian frame IDi in the k-th frame are (x_{i1,k}, y_{i1,k}) and (x_{i2,k}, y_{i2,k}); the coordinates of two opposite corners of the j-th rectangular pedestrian frame IDj in the k-th frame pedestrian image are (x_{j1,k}, y_{j1,k}) and (x_{j2,k}, y_{j2,k}); the coordinates of two opposite corners of pedestrian frame IDi in the (k-1)-th frame pedestrian image are (x_{i1,k-1}, y_{i1,k-1}) and (x_{i2,k-1}, y_{i2,k-1}); and the coordinates of two opposite corners of pedestrian frame IDj in the (k-1)-th frame pedestrian image are (x_{j1,k-1}, y_{j1,k-1}) and (x_{j2,k-1}, y_{j2,k-1}).
The centroid C_i = (x_{i,k}, y_{i,k}), i ∈ 1,…,n, of pedestrian frame IDi in the k-th frame pedestrian image is established with x_{i,k} = (x_{i1,k} + x_{i2,k})/2 and y_{i,k} = (y_{i1,k} + y_{i2,k})/2; the centroid C_i = (x_{i,k-1}, y_{i,k-1}) in the (k-1)-th frame pedestrian image is established in the same way.
Likewise, the centroid C_j = (x_{j,k}, y_{j,k}), j ∈ 1,…,n, of pedestrian frame IDj in the k-th frame pedestrian image and the centroid C_j = (x_{j,k-1}, y_{j,k-1}) in the (k-1)-th frame pedestrian image are established.
The change of the centroid abscissa of pedestrian frame IDi from the (k-1)-th frame to the k-th frame is Δx_i = x_{i,k} − x_{i,k-1} and the change of the ordinate is Δy_i = y_{i,k} − y_{i,k-1}, i ∈ 1,…,n; likewise, Δx_j = x_{j,k} − x_{j,k-1} and Δy_j = y_{j,k} − y_{j,k-1}, j ∈ 1,…,n, for pedestrian frame IDj. Positive Δx_i and Δy_i indicate that pedestrian i moves from left to right and from top to bottom, respectively; negative Δx_i and Δy_i indicate movement from right to left and from bottom to top. If pedestrian target i moves in the horizontal direction (the x-axis direction), Δy_i = 0; if pedestrian target i moves in the vertical direction (the y-axis direction), Δx_i = 0.
To show the movement direction of the pedestrian more clearly, the movement angle of pedestrian i is calculated from Δx_i and Δy_i, specifically θ_i = arctan(Δy_i / Δx_i), and likewise the movement direction angle of pedestrian j is θ_j = arctan(Δy_j / Δx_j).
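As a small illustration of the centroid, displacement, and angle computations above, the Python sketch below processes one pedestrian frame; atan2 is used so the angle remains defined when Δx_i = 0, an implementation choice this sketch assumes:

```python
import math

def centroid(box):
    """box: ((x1, y1), (x2, y2)), two opposite corners of a pedestrian frame."""
    (x1, y1), (x2, y2) = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

def direction_angle(box_prev, box_curr):
    """Movement angle theta of one pedestrian frame from frame k-1 to frame k."""
    x_prev, y_prev = centroid(box_prev)
    x_curr, y_curr = centroid(box_curr)
    dx = x_curr - x_prev          # positive: left-to-right movement
    dy = y_curr - y_prev          # positive: top-to-bottom (image coordinates)
    return math.atan2(dy, dx)     # atan2 avoids division by zero when dx == 0
```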
Further, in step e), the distance between pedestrian i and pedestrian j in the k-th frame pedestrian image is calculated by the formula d_{i,j} = √((x_{i,k} − x_{j,k})² + (y_{i,k} − y_{j,k})²), where (x_{i,k}, y_{i,k}) is the centroid coordinate C_i of pedestrian i and (x_{j,k}, y_{j,k}) is the centroid coordinate C_j of pedestrian j.
(1) For the target tracking errors that can occur when tracking pedestrians that cross and overlap: as shown in FIG. 4(b), if the value of d_{i,j} is smaller than or equal to the preset threshold th1, crossing and overlapping occlusion exists between pedestrian i and pedestrian j, and as shown in FIG. 4(c), a tracking error between pedestrians i and j may occur in this case. First, the crossing overlap region is removed according to the pedestrian frames to obtain the non-overlapping regions of the frames of pedestrian i and pedestrian j, and these non-overlapping regions are taken as the pedestrian feature regions; the feature regions selected for pedestrian i and pedestrian j in the k-th frame image are shown in FIG. 5. Then the gray-level co-occurrence matrix and color moments are used to represent the texture and color features of the feature regions, the movement directions θ_i and θ_j of pedestrian i and pedestrian j from the (k-1)-th frame to the k-th frame are calculated, and pedestrian tracking is realized by the method combining the feature-region color and texture with the pedestrian movement. Specifically, the feature-region color, texture, and movement direction of pedestrians i and j in the k-th frame image are matched against the color, texture, and movement direction of each pedestrian in the (k-1)-th frame; when the similarity exceeds the preset threshold th2, the match succeeds, the ID number IDi of pedestrian i in the (k-1)-th frame is taken as the ID number of pedestrian i in the k-th frame and the ID number IDj of pedestrian j as the ID number of pedestrian j in the k-th frame, and the centroids of the same ID number in two adjacent frames are connected, completing the tracking of each pedestrian.
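A sketch of the feature extraction and matching used in this occlusion branch is given below, with texture taken from a gray-level co-occurrence matrix (via scikit-image, an assumed dependency) and color from per-channel color moments; the equal weighting of the color, texture, and direction similarities against th2 is an assumption, since the patent only states that the three cues are matched jointly:

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def region_features(region_bgr, region_gray):
    """Color moments and GLCM texture of a pedestrian feature region.
    region_bgr: (h, w, 3) uint8 color patch; region_gray: (h, w) uint8 patch."""
    # Color moments: per-channel mean and standard deviation.
    pixels = region_bgr.reshape(-1, 3).astype(np.float64)
    color = np.concatenate([pixels.mean(axis=0), pixels.std(axis=0)])
    # Gray-level co-occurrence matrix and standard texture statistics.
    glcm = graycomatrix(region_gray, distances=[1], angles=[0],
                        levels=256, symmetric=True, normed=True)
    texture = np.array([graycoprops(glcm, prop)[0, 0] for prop in
                        ("contrast", "homogeneity", "energy", "correlation")])
    return color, texture

def similarity(feat_a, feat_b, theta_a, theta_b):
    """Combined color/texture/direction similarity in [0, 1] (weights assumed equal)."""
    (col_a, tex_a), (col_b, tex_b) = feat_a, feat_b
    s_col = 1.0 / (1.0 + np.linalg.norm(col_a - col_b))
    s_tex = 1.0 / (1.0 + np.linalg.norm(tex_a - tex_b))
    d = abs(theta_a - theta_b)
    d = min(d, 2.0 * np.pi - d)          # wrap the angle difference
    s_dir = 1.0 - d / np.pi
    return (s_col + s_tex + s_dir) / 3.0
```

In such a sketch, a current-frame pedestrian would keep the previous-frame ID with the highest similarity, provided that similarity exceeds th2.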
(2) For pedestrian tracking without crossing overlap: as shown in FIG. 6(b), if the distance d_{i,j} between pedestrian i and pedestrian j in the k-th frame image is larger than the preset threshold th1, no overlapping occlusion exists between pedestrians i and j, and as shown in FIG. 6(c), pedestrians i and j can be tracked correctly in this case. Only the movement directions θ_i and θ_j of pedestrian i and pedestrian j from the (k-1)-th frame to the k-th frame need to be calculated; the movement directions of pedestrians i and j in the k-th frame image are matched against the movement directions of the pedestrians in the (k-1)-th frame, the ID number IDi of pedestrian i in the (k-1)-th frame is taken as the ID number of pedestrian i in the k-th frame and the ID number IDj of pedestrian j as the ID number of pedestrian j in the k-th frame, and the centroids of the same ID number in two adjacent frames are connected, completing the tracking of each pedestrian.
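For this unoccluded branch, a corresponding sketch of direction-only ID matching follows; the greedy closest-direction assignment is an assumption, as the patent does not specify how competing matches are resolved:

```python
import math

def match_by_direction(prev_ids, prev_thetas, curr_thetas):
    """Assign each current-frame pedestrian the ID of the unused previous-frame
    pedestrian whose movement direction is closest (greedy, assumed)."""
    assigned, used = {}, set()
    for ci, theta_c in enumerate(curr_thetas):
        best, best_d = None, math.inf
        for pi, theta_p in enumerate(prev_thetas):
            if pi in used:
                continue
            d = abs(theta_c - theta_p)
            d = min(d, 2.0 * math.pi - d)   # wrap the angle difference
            if d < best_d:
                best, best_d = pi, d
        if best is not None:
            assigned[ci] = prev_ids[best]   # previous-frame ID carries over
            used.add(best)
    return assigned
```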
The invention also relates to a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the deep learning based multi-pedestrian detection and tracking method described in any of the above steps.
The invention also relates to a computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor, when executing the program, implementing the steps of the deep learning based multi-pedestrian detection and tracking method described in any of the above steps.
Finally, it should be noted that the foregoing description is only a preferred embodiment of the present invention, and the present invention is not limited thereto. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing embodiments or substitute equivalents for some of the technical features. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall be included in the protection scope of the present invention.

Claims (8)

1. A multi-pedestrian detection and tracking method based on deep learning, comprising:
a) The computer collects the monitoring video in real time, preprocesses the monitoring video, and inputs the preprocessed monitoring video to the CenterNet pedestrian detection model;
b) Pedestrian detection is carried out on the preprocessed monitoring video through a deep learning technology, and a pedestrian area in an image is detected by using a CenterNet pedestrian detection model;
c) Judging whether pedestrians appear, if so, establishing a tracking model based on the first frame of pedestrian image and marking the pedestrian ID of each pedestrian frame, and if not, detecting the next frame of pedestrian image;
d) Calculating the centroid coordinates of each pedestrian from the coordinates of each pedestrian frame, and calculating the pedestrian movement direction from the centroid coordinates;
e) If the distance between the centroids of pedestrians in different pedestrian frames is smaller than or equal to a threshold th1, overlapping occlusion exists between the pedestrians; if the distance between the centroids of pedestrians in different pedestrian frames is larger than the threshold th1, no overlapping occlusion exists between the pedestrians;
f) If overlapping occlusion occurs between the pedestrian targets of the current frame, selecting the non-overlapping region of each pedestrian frame as a feature region according to the pedestrian frames of the current-frame pedestrian image obtained in step c), extracting the color and texture features of the feature regions, matching the color and texture features and movement direction of each current-frame pedestrian with the color and texture features and movement directions of the pedestrians in the previous frame, taking the matched pedestrian ID of the previous frame as the pedestrian ID of the current frame, and connecting the centroids of the same ID number in two adjacent frames, realizing pedestrian tracking;
g) If no overlapping occlusion exists between the pedestrian targets of the current frame, matching the movement direction of each current-frame pedestrian with the movement directions of the pedestrians in the previous frame, taking the matched pedestrian ID of the previous frame as the pedestrian ID of the current frame, and connecting the centroids of the same ID number in two adjacent frames, realizing pedestrian tracking.
2. The deep learning based multi-pedestrian detection and tracking method according to claim 1, wherein the step of preprocessing the monitoring video in step a) is as follows: converting the monitoring video into images and, starting from the first frame image that contains a pedestrian, selecting one frame image every k frames as the input of the pedestrian detection model.
3. The deep learning based multi-pedestrian detection and tracking method according to claim 1, wherein the step of performing pedestrian detection on the preprocessed monitoring video by the deep learning technique in step b) is as follows: scaling the input frame images to a uniform size and inputting them into the CenterNet pedestrian detection model, modeling the pedestrian target as a single point, finding the center point of the pedestrian frame through a keypoint heatmap, and regressing the coordinates and size information of the pedestrian frame from the image features at the center point.
4. The deep learning based multi-pedestrian detection and tracking method of claim 1, wherein the step of marking the ID of each pedestrian frame in step c) is: if a pedestrian frame exists in the first frame image of the monitoring video, a tracking model is built based on the first-frame pedestrian image, and the pedestrian IDs of the n pedestrian frames are marked as ID1, ID2, …, IDn.
5. The deep learning based multi-pedestrian detection and tracking method of claim 4, wherein step d) comprises the steps of:
d-1) the coordinates of two opposite corners of the i-th rectangular pedestrian frame IDi in the k-th frame pedestrian image are (x_{i1,k}, y_{i1,k}) and (x_{i2,k}, y_{i2,k}); the coordinates of two opposite corners of the j-th rectangular pedestrian frame IDj in the k-th frame pedestrian image are (x_{j1,k}, y_{j1,k}) and (x_{j2,k}, y_{j2,k}); the coordinates of two opposite corners of pedestrian frame IDi in the (k-1)-th frame pedestrian image are (x_{i1,k-1}, y_{i1,k-1}) and (x_{i2,k-1}, y_{i2,k-1}); and the coordinates of two opposite corners of pedestrian frame IDj in the (k-1)-th frame pedestrian image are (x_{j1,k-1}, y_{j1,k-1}) and (x_{j2,k-1}, y_{j2,k-1});
d-2) the centroid C_i = (x_{i,k}, y_{i,k}), i ∈ 1,…,n, of pedestrian frame IDi in the k-th frame pedestrian image is established with x_{i,k} = (x_{i1,k} + x_{i2,k})/2 and y_{i,k} = (y_{i1,k} + y_{i2,k})/2, and the centroid C_i = (x_{i,k-1}, y_{i,k-1}), i ∈ 1,…,n, of pedestrian frame IDi in the (k-1)-th frame pedestrian image is established with x_{i,k-1} = (x_{i1,k-1} + x_{i2,k-1})/2 and y_{i,k-1} = (y_{i1,k-1} + y_{i2,k-1})/2;
d-3) likewise, the centroid C_j = (x_{j,k}, y_{j,k}), j ∈ 1,…,n, of pedestrian frame IDj in the k-th frame pedestrian image and the centroid C_j = (x_{j,k-1}, y_{j,k-1}), j ∈ 1,…,n, of pedestrian frame IDj in the (k-1)-th frame pedestrian image are established;
d-4) the change Δx_i = x_{i,k} − x_{i,k-1} of the centroid abscissa and the change Δy_i = y_{i,k} − y_{i,k-1} of the centroid ordinate of pedestrian frame IDi from the (k-1)-th frame to the k-th frame are calculated, i ∈ 1,…,n; likewise, the changes Δx_j = x_{j,k} − x_{j,k-1} and Δy_j = y_{j,k} − y_{j,k-1} of pedestrian frame IDj are calculated, j ∈ 1,…,n;
d-5) the movement direction angle of pedestrian i is calculated as θ_i = arctan(Δy_i / Δx_i), and the movement direction angle of pedestrian j as θ_j = arctan(Δy_j / Δx_j).
6. The deep learning based multi-pedestrian detection and tracking method of claim 5, wherein in step e) the distance d_{i,j} between pedestrian i and pedestrian j in the k-th frame pedestrian image is calculated by the formula d_{i,j} = √((x_{i,k} − x_{j,k})² + (y_{i,k} − y_{j,k})²).
7. A computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the deep learning based multi-pedestrian detection and tracking method of any one of claims 1-6.
8. A computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the deep learning based multi-pedestrian detection and tracking method of any one of claims 1-6 when executing the program.
CN202110917108.7A 2021-08-11 2021-08-11 Multi-pedestrian detection and tracking method and system based on deep learning Active CN113658223B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110917108.7A CN113658223B (en) 2021-08-11 2021-08-11 Multi-pedestrian detection and tracking method and system based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110917108.7A CN113658223B (en) 2021-08-11 2021-08-11 Multi-pedestrian detection and tracking method and system based on deep learning

Publications (2)

Publication Number Publication Date
CN113658223A CN113658223A (en) 2021-11-16
CN113658223B (en) 2023-08-04

Family

ID=78491350

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110917108.7A Active CN113658223B (en) 2021-08-11 2021-08-11 Multi-pedestrian detection and tracking method and system based on deep learning

Country Status (1)

Country Link
CN (1) CN113658223B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115695818B (en) * 2023-01-05 2023-04-07 广东瑞恩科技有限公司 Efficient management method for intelligent park monitoring data based on Internet of things

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20030073879A (en) * 2002-03-13 2003-09-19 주식회사 엘지이아이 Realtime face detection and moving tracing method
CN102104771A (en) * 2010-12-14 2011-06-22 浙江工业大学 Multi-channel people stream rate monitoring system based on wireless monitoring
CN104537647A (en) * 2014-12-12 2015-04-22 中安消技术有限公司 Target detection method and device
CN104715238A (en) * 2015-03-11 2015-06-17 南京邮电大学 Pedestrian detection method based on multi-feature fusion
CN107564034A (en) * 2017-07-27 2018-01-09 华南理工大学 The pedestrian detection and tracking of multiple target in a kind of monitor video
CN110688987A (en) * 2019-10-16 2020-01-14 山东建筑大学 Pedestrian position detection and tracking method and system
WO2020155873A1 (en) * 2019-02-02 2020-08-06 福州大学 Deep apparent features and adaptive aggregation network-based multi-face tracking method
CN111783576A (en) * 2020-06-18 2020-10-16 西安电子科技大学 Pedestrian re-identification method based on improved YOLOv3 network and feature fusion

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20030073879A (en) * 2002-03-13 2003-09-19 주식회사 엘지이아이 Realtime face detection and moving tracing method
CN102104771A (en) * 2010-12-14 2011-06-22 浙江工业大学 Multi-channel people stream rate monitoring system based on wireless monitoring
CN104537647A (en) * 2014-12-12 2015-04-22 中安消技术有限公司 Target detection method and device
CN104715238A (en) * 2015-03-11 2015-06-17 南京邮电大学 Pedestrian detection method based on multi-feature fusion
CN107564034A (en) * 2017-07-27 2018-01-09 华南理工大学 The pedestrian detection and tracking of multiple target in a kind of monitor video
WO2020155873A1 (en) * 2019-02-02 2020-08-06 福州大学 Deep apparent features and adaptive aggregation network-based multi-face tracking method
CN110688987A (en) * 2019-10-16 2020-01-14 山东建筑大学 Pedestrian position detection and tracking method and system
CN111783576A (en) * 2020-06-18 2020-10-16 西安电子科技大学 Pedestrian re-identification method based on improved YOLOv3 network and feature fusion

Also Published As

Publication number Publication date
CN113658223A (en) 2021-11-16

Similar Documents

Publication Publication Date Title
Sun et al. Research on the hand gesture recognition based on deep learning
CN110059558B (en) Orchard obstacle real-time detection method based on improved SSD network
CN108564097B (en) Multi-scale target detection method based on deep convolutional neural network
CN110334762B (en) Feature matching method based on quad tree combined with ORB and SIFT
CN107480585B (en) Target detection method based on DPM algorithm
CN103310194A (en) Method for detecting head and shoulders of pedestrian in video based on overhead pixel gradient direction
CN110929593A (en) Real-time significance pedestrian detection method based on detail distinguishing and distinguishing
CN109145756A (en) Object detection method based on machine vision and deep learning
CN105184265A (en) Self-learning-based handwritten form numeric character string rapid recognition method
CN109886159B (en) Face detection method under non-limited condition
CN111753682B (en) Hoisting area dynamic monitoring method based on target detection algorithm
CN107146219B (en) Image significance detection method based on manifold regularization support vector machine
CN105405138A (en) Water surface target tracking method based on saliency detection
Yang et al. An improved algorithm for the detection of fastening targets based on machine vision
CN113658223B (en) Multi-pedestrian detection and tracking method and system based on deep learning
CN112784722B (en) Behavior identification method based on YOLOv3 and bag-of-words model
Zhu et al. Human detection under UAV: an improved faster R-CNN approach
CN107679467B (en) Pedestrian re-identification algorithm implementation method based on HSV and SDALF
CN110334703B (en) Ship detection and identification method in day and night image
CN117079125A (en) Kiwi fruit pollination flower identification method based on improved YOLOv5
CN108985294B (en) Method, device and equipment for positioning tire mold picture and storage medium
CN108985216B (en) Pedestrian head detection method based on multivariate logistic regression feature fusion
CN110826575A (en) Underwater target identification method based on machine learning
CN103020631A (en) Human movement identification method based on star model
CN116912670A (en) Deep sea fish identification method based on improved YOLO model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant