CN111353448A - Pedestrian multi-target tracking method based on relevance clustering and space-time constraint - Google Patents

Pedestrian multi-target tracking method based on relevance clustering and space-time constraint

Info

Publication number
CN111353448A
CN111353448A (application CN202010148317.5A)
Authority
CN
China
Prior art keywords
pedestrian
matrix
space
frame
time constraint
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202010148317.5A
Other languages
Chinese (zh)
Inventor
李旻先
桑毅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Science and Technology
Original Assignee
Nanjing University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Science and Technology filed Critical Nanjing University of Science and Technology
Priority to CN202010148317.5A priority Critical patent/CN111353448A/en
Publication of CN111353448A publication Critical patent/CN111353448A/en
Withdrawn legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V20/42 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a pedestrian multi-target tracking method based on relevance clustering and space-time constraint, comprising: pedestrian visual feature extraction; pedestrian track association under a single camera based on relevance clustering of visual features; and pedestrian track matching across cameras using a space-time constraint method. To address the problem that pedestrian tracking under a single camera is easily interrupted, the invention introduces a spatio-temporal sliding window; meanwhile, a space-time constraint method is introduced to associate the same pedestrian across cameras, realizing multi-target pedestrian tracking in cross-camera scenes. The proposed method consistently improves tracking indexes such as MOTA, MOTP and recall rate in pedestrian tracking.

Description

Pedestrian multi-target tracking method based on relevance clustering and space-time constraint
Technical Field
The invention relates to the field of computer vision and pattern recognition, in particular to a pedestrian multi-target tracking method based on relevance clustering and space-time constraint.
Background
Computer Vision is the science of how to make machines "see": more specifically, the computer technology that uses cameras and computers, in place of the human eye, to identify, track and measure targets. Computer vision aims at creating artificial intelligence systems that can extract information from images or multidimensional data, and has therefore been a hotspot and a difficulty of research in computer science in recent years.
The multi-target tracking problem, as the basis of higher-level visual tasks, has become a crucial research problem in the field of computer vision. The main task of Multiple Object Tracking (MOT) is, given an image sequence, to find the moving objects in it, to associate the moving objects in different frames one to one, and then to output the motion trajectory of each object. These objects may be arbitrary, such as pedestrians, vehicles, athletes, various animals, etc., and the most studied case is pedestrian tracking. This is because, firstly, a pedestrian is a typical non-rigid target, which is harder to track than a rigid one, and secondly, pedestrian detection and tracking has greater commercial value in practical applications. According to incomplete statistics, at least 75% of multi-target tracking studies concern pedestrian tracking.
In addition, video surveillance is now deployed very widely, and effectively extracting pedestrian trajectory information from multiple surveillance feeds is of great value to public security systems. Cross-camera pedestrian tracking has therefore become an important research topic in computer vision. However, it faces problems that are hard to handle in practice, mainly in two respects. On the one hand, most existing tracking algorithms judge whether two adjacent pedestrian detection frames belong to the same pedestrian target by computing their overlapping area, but because surveillance video contains many objects, occlusion between them interrupts the tracking computation. On the other hand, for cross-camera pedestrian tracking, most current methods resemble the retrieval methods used in pedestrian re-identification: they rely on the appearance features of pedestrian targets but do not use the spatio-temporal information in the data set.
Disclosure of Invention
The invention aims to provide a pedestrian multi-target tracking method based on relevance clustering and space-time constraint, which is used for completing the tracking of pedestrians in different scenes.
The technical solution for realizing the purpose of the invention is as follows: a pedestrian multi-target tracking method based on relevance clustering and space-time constraint comprises the following steps:
1) decompressing the video into frames by inputting video streams and formulating a video data set according to the selected interval;
2) detecting each image of the video data set with a pedestrian detection algorithm to obtain pedestrian detection data; the detection data are the minimum matrix frame (bounding box) information containing each pedestrian;
3) cutting the video data set according to the matrix frame information of the pedestrian detection data to generate a pedestrian picture set, and formulating a training set and a test set according to a selected interval;
4) training a deep convolutional neural network for pedestrian re-identification by utilizing a training set, and outputting the trained deep convolutional neural network for extracting the visual characteristics of pedestrians, wherein the loss of the deep convolutional neural network is formed by triple loss; inputting the test set picture into a trained deep convolution neural network for visual feature extraction to obtain a pedestrian visual feature matrix;
5) calculating pedestrian appearance characteristic correlation and motion correlation according to the pedestrian visual characteristic matrix and pedestrian detection information, finishing correlation clustering, and realizing pedestrian track correlation under a single camera by using a correlation clustering result so as to finish pedestrian multi-target tracking under the single camera;
6) and according to the multi-target tracking result of the pedestrian under the single camera, correlating the pedestrian track under the cross camera by using a space-time constraint method so as to complete the multi-target tracking of the pedestrian under the cross camera.
Compared with the prior art, the invention has the following remarkable advantages: (1) the method of the relevance clustering can solve the problem of tracking interruption caused by the shielding of objects or pedestrians, is accurate and stable, and has obviously better performance in a test data set than other algorithms; (2) the pedestrian tracking under the cross-camera is completed by using a space-time constraint method, the characteristic information in the pedestrian detection data is used, and the space-time information of the data set is fully used, so that the pedestrian tracking result under the cross-camera is more accurate, and the evaluation index is higher than that of other algorithms. Similarly, the method is also suitable for pedestrian re-identification, and the accuracy of pedestrian retrieval is improved.
Drawings
FIG. 1 is a flow chart of a pedestrian multi-target tracking method based on relevance clustering and space-time constraint.
Fig. 2 is an exemplary cross-camera pedestrian tracking diagram.
Detailed Description
The invention provides a pedestrian multi-target tracking method based on relevance clustering and space-time constraint, consisting of three main parts: feature extraction using a deep convolutional neural network, relevance clustering to complete pedestrian multi-target tracking under a single camera, and a space-time constraint method to complete pedestrian multi-target tracking across cameras. With reference to FIG. 1, the method comprises the following steps:
1) decompressing the video into frames by inputting video streams and formulating a video data set according to the selected interval;
2) detecting each image of the video data set with a pedestrian detection algorithm to obtain pedestrian detection data; the detection data are the minimum matrix frame (bounding box) information containing each pedestrian;
3) cutting the video data set according to the matrix frame information of the pedestrian detection data to generate a pedestrian picture set, and formulating a training set and a test set according to a selected interval;
4) training a deep convolutional neural network for pedestrian re-identification by utilizing a training set, and outputting the trained deep convolutional neural network for extracting the visual characteristics of pedestrians, wherein the loss of the deep convolutional neural network is formed by triple loss; inputting the test set picture into a trained deep convolution neural network for visual feature extraction to obtain a pedestrian visual feature matrix;
5) calculating pedestrian appearance characteristic correlation and motion correlation according to the pedestrian visual characteristic matrix and pedestrian detection information, finishing correlation clustering, and realizing pedestrian track correlation under a single camera by using a correlation clustering result so as to finish pedestrian multi-target tracking under the single camera;
6) and according to the multi-target tracking result of the pedestrian under the single camera, correlating the pedestrian track under the cross camera by using a space-time constraint method so as to complete the multi-target tracking of the pedestrian under the cross camera.
Further, in step 1), the input video stream is decompressed into frames according to a frame rate of 60fps, and the decompressed pictures are named according to a specified naming rule to form a video data set.
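As a concrete illustration, the decompression and naming step might be sketched as below. The `cam<id>_<frame>.jpg` naming rule and the `sample_frames` helper are assumptions for illustration only; the patent states only that frames are named "according to a specified naming rule" without giving it.

```python
def frame_name(camera_id: int, frame_idx: int) -> str:
    """Hypothetical naming rule: cam<id>_<6-digit frame index>.jpg.
    The exact format is an assumption, not taken from the patent."""
    return f"cam{camera_id}_{frame_idx:06d}.jpg"

def sample_frames(total_frames: int, interval: int) -> list:
    """Indices of the frames kept when forming the video data set
    at a selected sampling interval."""
    return list(range(0, total_frames, interval))

print(frame_name(1, 42))     # cam1_000042.jpg
print(sample_frames(10, 3))  # [0, 3, 6, 9]
```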
Further, the specific method in step 2) is as follows: pedestrian detection is performed on the video data set with the pedestrian detection algorithm OpenPose to obtain pedestrian keypoint data. The keypoint information is then converted into the matrix frame (bounding box) information used in pedestrian detection, giving the required pedestrian detection information; the detection result is the coordinate information of the upper-left and lower-right corners of the rectangular frame.
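The keypoint-to-box conversion can be sketched as follows, assuming each pedestrian's keypoints are given as (x, y) pairs; real OpenPose output also carries a per-keypoint confidence score, which a practical version would threshold before taking the extremes.

```python
import numpy as np

def keypoints_to_box(keypoints):
    """Minimum bounding rectangle of a set of (x, y) keypoints:
    returns the upper-left and lower-right corner coordinates,
    matching the detection format described in the text."""
    pts = np.asarray(keypoints, dtype=float)
    x1, y1 = pts.min(axis=0)   # upper-left corner
    x2, y2 = pts.max(axis=0)   # lower-right corner
    return (float(x1), float(y1)), (float(x2), float(y2))

top_left, bottom_right = keypoints_to_box([(10, 20), (30, 5), (25, 40)])
print(top_left, bottom_right)  # (10.0, 5.0) (30.0, 40.0)
```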
Further, in the step 4), the deep convolutional neural network framework is ResNet50, and each parameter in the network is iteratively updated by using Adam in the adaptive learning rate gradient descent optimization algorithm until the parameter converges, so as to obtain a trained feature learning network.
The deep convolutional neural network uses a triplet loss function, and the expression of the triplet loss L is:

L = [ d_{a,p} - d_{a,n} + α ]_+

where d_{a,p} is the distance between the anchor and the positive sample, d_{a,n} is the distance between the anchor and the negative sample, α denotes a minimum margin between the two distances, and the subscript + indicates that the bracketed value is taken as the loss when it is greater than zero, and the loss is zero otherwise.
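The triplet loss can be sketched numerically as follows (a NumPy sketch for a single triplet; during training it would be applied batch-wise inside the network framework, and the margin value here is an illustrative assumption):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, alpha=0.3):
    """L = [d(a, p) - d(a, n) + alpha]_+ with Euclidean distances.
    alpha is the minimum margin; 0.3 is an illustrative choice."""
    d_ap = np.linalg.norm(anchor - positive)  # anchor-positive distance
    d_an = np.linalg.norm(anchor - negative)  # anchor-negative distance
    return max(d_ap - d_an + alpha, 0.0)      # hinge: zero when the negative is far enough

a = np.array([0.0, 0.0])
p = np.array([0.1, 0.0])   # positive close to the anchor
n = np.array([1.0, 0.0])   # negative far from the anchor
print(triplet_loss(a, p, n))  # 0.0: the margin is already satisfied
```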
According to the trained deep convolutional neural network model, pedestrian features are extracted from the test set, the appearance feature matrix of the pedestrians is obtained, and the feature correlation is calculated as:

W_a(i, j) = t_a - d(x_i, x_j)

where d(x_i, x_j) is the feature distance between the i-th and j-th pedestrians, calculated here using the Euclidean distance, and t_a is the average of the distances between positive and negative samples in the training set.
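A sketch of the feature correlation computation, assuming a threshold-minus-distance form W_a(i, j) = t_a - d(x_i, x_j); this form is consistent with the description of d(x_i, x_j) and t_a but is an assumption about the exact published formula:

```python
import numpy as np

def feature_correlation(features, t_a):
    """Pairwise appearance correlation under the assumed form
    W_a(i, j) = t_a - d(x_i, x_j), with d the Euclidean distance and
    t_a the mean positive/negative sample distance from the training set.
    Positive entries suggest the same pedestrian."""
    diff = features[:, None, :] - features[None, :, :]
    d = np.linalg.norm(diff, axis=-1)  # pairwise Euclidean distances
    return t_a - d

X = np.array([[0.0, 0.0],   # pedestrian 0
              [0.1, 0.0],   # pedestrian 1 (visually similar to 0)
              [2.0, 0.0]])  # pedestrian 2 (dissimilar)
W_a = feature_correlation(X, t_a=1.0)
```

A positive entry W_a[i, j] marks the pair as a candidate for the same identity; a negative entry rules it out.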
Further, in the step 5), a pedestrian motion correlation is calculated using a linear motion model according to the pedestrian detection data.
First, the per-frame offset of a pedestrian's matrix frame, which we call the velocity v, is calculated as:

v = (c_j^b - c_i^a) / (j - i),  0 < i < j < t_w

where t_w is the set window size, c_j^b is the center-point coordinate of the b-th matrix frame in the j-th frame, and c_i^a is the center-point coordinate of the a-th matrix frame in the i-th frame.
Secondly, the motion correlation error e_m is calculated; it represents the error between the predicted position of a nearby matrix frame, obtained from a matrix frame's velocity, and its actual position:

e_m = e_f + e_b

where e_f is the forward error and e_b is the backward error.

The forward error e_f is expressed as:

e_f(i, j) = c_i + v_i f_ij - c_j  (j > i)

where c_i is the center-point coordinate of the i-th matrix frame in a given frame, v_i is the velocity of the i-th matrix frame, f_ij is the gap between the frame numbers of the i-th and j-th matrix frames, and c_j is the center-point coordinate of the j-th matrix frame.

The backward error is expressed as:

e_b(i, j) = c_i - v_i f_ij - c_j  (j < i)

The motion correlation error is expressed as:

e_m = e_f + e_b
finally, a motion correlation W is calculatedmThe meaning of the method represents the similarity degree between two adjacent matrix boxes, and the formula is as follows:
wm=α(tm-em)
wherein, α, tmParameters set for the algorithm, emIs a motion dependent error.
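The linear motion model can be sketched as follows; the error terms are taken here as vector-norm magnitudes, and the parameter values passed to `motion_correlation` are illustrative assumptions:

```python
import numpy as np

def velocity(c_i, c_j, frame_i, frame_j):
    """Per-frame offset (velocity) of a box centre between frames i < j."""
    return (c_j - c_i) / (frame_j - frame_i)

def forward_error(c_i, v_i, f_ij, c_j):
    """|c_i + v_i * f_ij - c_j|: predict box j forward from box i."""
    return np.linalg.norm(c_i + v_i * f_ij - c_j)

def backward_error(c_i, v_i, f_ij, c_j):
    """|c_i - v_i * f_ij - c_j|: predict backward when j < i."""
    return np.linalg.norm(c_i - v_i * f_ij - c_j)

def motion_correlation(e_m, alpha=1.0, t_m=5.0):
    """w_m = alpha * (t_m - e_m); alpha and t_m are algorithm parameters
    (these defaults are illustrative, not from the patent)."""
    return alpha * (t_m - e_m)

# A box moving one pixel per frame along x:
v = velocity(np.array([0.0, 0.0]), np.array([3.0, 0.0]), 0, 3)
e_f = forward_error(np.array([0.0, 0.0]), v, 3, np.array([3.0, 0.0]))
print(motion_correlation(e_f))  # 5.0: zero prediction error, maximal correlation
```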
Further, the relevance clustering is determined by combining the feature correlation and the motion correlation:

W = (W_a + W_m) ⊙ D

where W is the correlation matrix, W_a the feature correlation matrix, W_m the motion correlation matrix, D the discount matrix, and ⊙ the element-wise (Hadamard) product. The discount matrix is:

D = e^{-Δt} ∈ [0, 1]

where Δt indicates the degree of attenuation; as Δt increases, the correlation tends to 0.

According to the correlation matrix, a graph G = (V, E, W) is constructed, where V is the node set (in this patent, each matrix frame is regarded as a node), E is the edge set (an edge is constructed between two positively correlated matrix frames), and W is the edge weight, i.e., the correlation value. After the graph is constructed, its connected components are computed, and all nodes within one connected component are assigned the same ID: they are the same pedestrian.
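The ID-assignment step can be sketched with a small union-find: each matrix frame (detection box) is a node, positively correlated pairs are joined, and each connected component receives one pedestrian ID. This is a minimal stand-in for the graph construction described in the text.

```python
def assign_ids(n_boxes, correlations):
    """Union-find over detection boxes: edges exist only between
    positively correlated pairs; each connected component gets one ID."""
    parent = list(range(n_boxes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j, w in correlations:
        if w > 0:                          # edge only for positive correlation
            parent[find(i)] = find(j)

    roots, ids = {}, []
    for x in range(n_boxes):
        ids.append(roots.setdefault(find(x), len(roots)))  # compact IDs 0, 1, ...
    return ids

print(assign_ids(4, [(0, 1, 0.9), (1, 2, 0.4), (2, 3, -0.7)]))  # [0, 0, 0, 1]
```

Boxes 0, 1 and 2 are chained by positive correlations and share one ID; box 3 is only negatively correlated and keeps its own.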
Further, the short and long tracks of pedestrians under a single camera are generated by setting the window size. Short tracks are produced by applying the relevance clustering algorithm within a window of m_1 frames. Long tracks are built on the short tracks: the window size is set to m_2 frames with an overlap of m_3 frames between adjacent windows, and relevance clustering is again applied within each window.
Table 1 shows the index results of the relevance clustering algorithm in 8 single-camera scenes, where MOTA is the multi-target tracking accuracy, and IDF1, IDP and IDR respectively represent the F value, precision and recall of pedestrian ID prediction. IDS represents the number of identity switches caused by interaction or interruption, and FM represents the number of fragmentations of the tracking trajectory.
TABLE 1 Index results of the relevance clustering algorithm in 8 single-camera scenes
(The table data appear as images in the original publication and are not reproduced here.)
According to Table 1, the MOTA values of most of the 8 cameras exceed 80 (camera 4, which has few pedestrians, even exceeds 90), the three ID-prediction values exceed 80, and IDS and FM are only around 100 even though the number of tracks exceeds 10,000. Taken together, the relevance clustering algorithm achieves very high accuracy and tracking efficiency in single-camera scenes.
Further, the cross-camera pedestrian tracking algorithm is implemented in the following steps:
1) On the basis of the pedestrian tracks under a single camera, n images are selected from each pedestrian track as representatives and put into the deep convolutional neural network to extract features, and their mean is taken as the pedestrian's appearance feature. Then each pedestrian is selected in turn as the query target, and the tracks of the remaining pedestrians form the pedestrian pool.
2) For each query target, the distances to the other targets in the pedestrian pool are calculated using the cosine distance and ranked.
3) The ranked screening result is then corrected using the space-time constraint method.
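Steps 1) and 2) above can be sketched as follows, assuming each track is represented by the mean of its per-image features:

```python
import numpy as np

def track_feature(image_features):
    """Mean of the n representative per-image features:
    the track's appearance feature."""
    return np.mean(image_features, axis=0)

def cosine_distance(a, b):
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def rank_pool(query, pool):
    """Indices of pool tracks sorted by cosine distance to the query."""
    d = [cosine_distance(query, p) for p in pool]
    return sorted(range(len(pool)), key=lambda k: d[k])

query = track_feature(np.array([[1.0, 0.0], [1.0, 0.1]]))
pool = [np.array([0.9, 0.1]), np.array([0.0, 1.0]), np.array([1.0, 0.05])]
print(rank_pool(query, pool))  # [2, 0, 1]: pool track 2 is the closest match
```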
Further, for the temporal relation in the space-time constraint, experiments show that pedestrian movement between different areas in the training set is concentrated within a certain time period. For this case we propose the concept of spatio-temporal similarity and use a spatio-temporal model to represent the spatio-temporal similarity of pedestrians.
Further, for the spatio-temporal model, a time difference variable t̂ is set up to represent the time gap between two new pictures, and maximum likelihood estimation is then used: within the m_4 frames before and after t̂, the probability P of the target appearing is computed. P is calculated over the training set, and the value of t̂ that maximizes P is taken.
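A histogram-style stand-in for the maximum likelihood estimate just described: over the training set, pick the time difference whose surrounding window of m_4 frames captures the largest fraction of observed cross-camera reappearance gaps. The function and its inputs are illustrative assumptions about the estimation procedure.

```python
def estimate_transition_time(observed_gaps, m4):
    """Choose t_hat maximizing P = fraction of observed reappearance
    gaps (in frames) within [t_hat - m4, t_hat + m4]."""
    def prob(t):
        return sum(1 for g in observed_gaps if abs(g - t) <= m4) / len(observed_gaps)
    return max(sorted(set(observed_gaps)), key=prob)

# Gaps between a pedestrian leaving camera A and appearing in camera B:
print(estimate_transition_time([100, 105, 110, 400], m4=5))  # 105
```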
Further, from the screening result of the query target, the same target pedestrian in the scenes connected with the query target can be obtained according to the spatio-temporal model and the query target's spatio-temporal information. This result is then taken as a new query target, and target pedestrians connected to it continue to be obtained under the space-time constraint until the complete track of the pedestrian under different cameras is obtained; the tracking effect is shown in the cross-camera tracking example of FIG. 2.
Table 2 shows the index results of the space-time constraint algorithm in the cross-camera scene; in cross-camera pedestrian multi-target tracking, the ID-related indexes are the most important.
TABLE 2 index results of space-time constraint algorithm in cross-camera scene
IDF1 IDP IDR
75.4 76.0 74.8
According to Table 2, IDF1, IDP and IDR all score around 75, indicating that the space-time constraint algorithm achieves good pedestrian tracking performance in the cross-camera scene.
In summary, the invention discloses a pedestrian multi-target tracking method based on relevance clustering and space-time constraint. It starts from pedestrian multi-target tracking under a single camera, completes the matching of pedestrian tracks under a single camera through relevance clustering, and on that basis completes cross-camera pedestrian multi-target tracking using the space-time constraint method. Combining the appearance and motion features of pedestrians improves the tracking effect under a single camera and raises the accuracy of pedestrian matching in the cross-camera scene.
Although the present invention has been described with reference to the preferred embodiments, it is not intended to be limited thereto. Those skilled in the art can make various changes and modifications without departing from the spirit and scope of the invention. Therefore, the protection scope of the present invention should be determined by the appended claims.

Claims (9)

1. A pedestrian multi-target tracking method based on relevance clustering and space-time constraint is characterized by comprising the following steps:
1) decompressing the video into frames by inputting video streams and formulating a video data set according to the selected interval;
2) detecting each image of the video data set with a pedestrian detection algorithm to obtain pedestrian detection data; the detection data are the minimum matrix frame (bounding box) information containing each pedestrian;
3) cutting the video data set according to the matrix frame information of the pedestrian detection data to generate a pedestrian picture set, and formulating a training set and a test set according to a selected interval;
4) training a deep convolutional neural network for pedestrian re-identification by utilizing a training set, and outputting the trained deep convolutional neural network for extracting the visual characteristics of pedestrians, wherein the loss of the deep convolutional neural network is formed by triple loss; inputting the test set picture into a trained deep convolution neural network for visual feature extraction to obtain a pedestrian visual feature matrix;
5) calculating pedestrian appearance characteristic correlation and motion correlation according to the pedestrian visual characteristic matrix and pedestrian detection information, finishing correlation clustering, and realizing pedestrian track correlation under a single camera by using a correlation clustering result so as to finish pedestrian multi-target tracking under the single camera;
6) and according to the multi-target tracking result of the pedestrian under the single camera, correlating the pedestrian track under the cross camera by using a space-time constraint method so as to complete the multi-target tracking of the pedestrian under the cross camera.
2. The pedestrian multi-target tracking method based on relevance clustering and spatio-temporal constraints as claimed in claim 1, wherein in step 1), the input video stream is decompressed into pictures at a frame rate of 60fps, and the decompressed pictures are named according to a specified naming rule to form a video data set.
3. The pedestrian multi-target tracking method based on relevance clustering and space-time constraint according to claim 1, wherein the step 2) specifically comprises the following steps:
21) carrying out pedestrian detection on the video data set based on the pedestrian detection algorithm OpenPose to obtain pedestrian keypoint data;
22) after the key point data is obtained, converting the key point data into matrix frame information containing pedestrians, and thus obtaining required pedestrian detection data; the detection information is specifically coordinate information of the upper left corner and the lower right corner of the rectangular frame.
4. The pedestrian multi-target tracking method based on relevance clustering and space-time constraint according to claim 1, characterized in that in step 4), the deep convolutional neural network framework is ResNet50, and each parameter in the network is iteratively updated by using Adam in an adaptive learning rate gradient descent optimization algorithm until the parameters converge, so as to obtain a trained feature learning network; the method specifically comprises the following steps:
41) the deep convolutional neural network uses a triplet loss function, and the expression of the triplet loss L is:

L = [ d_{a,p} - d_{a,n} + α ]_+

where d_{a,p} is the distance between the anchor and the positive sample, d_{a,n} is the distance between the anchor and the negative sample, α denotes a minimum margin between the two distances, and the subscript + indicates that the bracketed value is taken as the final value of the loss function when it is greater than zero, and the loss is zero otherwise;
42) according to the deep convolutional neural network model, pedestrian appearance features are extracted using the test set, the appearance feature matrix of the pedestrians is obtained, and the feature correlation is calculated as:

W_a(i, j) = t_a - d(x_i, x_j)

where d(x_i, x_j) is the feature distance between the i-th and j-th pedestrians, calculated using the Euclidean distance, and t_a is the average of the distances between positive and negative samples in the training set.
5. The pedestrian multi-target tracking method based on relevance clustering and space-time constraint according to claim 4, wherein in the step 5), the pedestrian motion relevance is calculated by using a linear motion model according to the pedestrian detection data, and the method specifically comprises the following steps:
51) calculating the per-frame offset of the pedestrian matrix frame, called the velocity v, as:

v = (c_j^b - c_i^a) / (j - i),  0 < i < j < t_w

where t_w is the set window size, c_j^b is the center-point coordinate of the b-th matrix frame in the j-th frame, and c_i^a is the center-point coordinate of the a-th matrix frame in the i-th frame;
52) calculating the motion correlation error e_m, which represents the error between the predicted position of a nearby matrix frame, obtained from a matrix frame's velocity, and its actual position:

e_m = e_f + e_b

where e_f is the forward error and e_b is the backward error;

the forward error e_f is expressed as:

e_f(i, j) = c_i + v_i f_ij - c_j  (j > i)

where c_i is the center-point coordinate of the i-th matrix frame in a given frame, v_i is the velocity of the i-th matrix frame, f_ij is the gap between the frame numbers of the i-th and j-th matrix frames, and c_j is the center-point coordinate of the j-th matrix frame;

the backward error is expressed as:

e_b(i, j) = c_i - v_i f_ij - c_j  (j < i)

the motion correlation error is expressed as:

e_m = e_f + e_b

53) calculating the motion correlation w_m, which represents the degree of similarity between two adjacent matrix frames:

w_m = α(t_m - e_m)

where α and t_m are parameters set by the algorithm and e_m is the motion correlation error.
6. The pedestrian multi-target tracking method based on relevance clustering and space-time constraint according to claim 1 or 5, wherein the relevance clustering is determined by feature relevance and motion relevance, and the formula is as follows:
W = (W_a + W_m) ⊙ D

where W is the correlation matrix, W_a the feature correlation matrix, W_m the motion correlation matrix, D the discount matrix, and ⊙ the element-wise (Hadamard) product; the discount matrix is:

D = e^{-Δt} ∈ [0, 1]

where Δt denotes the degree of attenuation;

a graph G = (V, E, W) is constructed from the correlation matrix, where V is the node set (each matrix frame is regarded as a node), E is the edge set (an edge is constructed between two positively correlated matrix frames), and W is the edge weight, i.e., the correlation value; after the graph is constructed, its connected components are computed, and all nodes within one connected component are assigned the same ID: they are the same pedestrian.
7. The pedestrian multi-target tracking method based on relevance clustering and space-time constraint according to claim 1, characterized in that the short and long tracks of pedestrians under a single camera are generated by setting the window size, specifically: short tracks are produced by applying the relevance clustering algorithm within a window of m_1 frames; long tracks are built on the short tracks, with the window size set to m_2 frames and an overlap of m_3 frames between adjacent windows, and relevance clustering is again applied within each window.
8. The pedestrian multi-target tracking method based on relevance clustering and space-time constraint according to claim 1, wherein the step 6) specifically comprises the following steps:
61) on the basis of the pedestrian tracks under a single camera, selecting n images from each pedestrian track as representatives, feeding them into a deep convolutional neural network to extract features, and taking the mean of the features as the pedestrian's appearance feature; then taking each pedestrian in turn as the query target, with the tracks of the remaining pedestrians forming the pedestrian pool;
62) for each query target, computing its distance to every other target in the pedestrian pool using the cosine distance, and ranking the results;
63) refining the ranked screening result using the space-time constraint method.
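Steps 61) and 62) can be sketched as follows, assuming per-frame CNN features are already available; the helper names are hypothetical:

```python
import numpy as np

def track_feature(frame_feats, n=4):
    """Pick n evenly spaced representative frames of a track, average
    their CNN features, and L2-normalise the mean (appearance feature)."""
    feats = np.asarray(frame_feats, dtype=float)
    idx = np.linspace(0, len(feats) - 1, num=min(n, len(feats)), dtype=int)
    f = feats[idx].mean(axis=0)
    return f / np.linalg.norm(f)

def rank_gallery(query, gallery):
    """Rank gallery tracks by cosine distance to the query (ascending).
    Features are L2-normalised, so 1 - dot product = cosine distance."""
    dists = [1.0 - float(query @ g) for g in gallery]
    return np.argsort(dists)
```

Normalising once up front means the ranking step needs only dot products, which is the usual trick for cosine-distance retrieval.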
9. The pedestrian multi-target tracking method based on relevance clustering and space-time constraint according to claim 8, wherein the space-time constraint method specifically comprises:
for the temporal relation in the space-time constraint, experiments on the training set show that the movements of pedestrians between different regions are concentrated within a time period, so a spatio-temporal model is used to represent the spatio-temporal similarity of pedestrians;
for the spatio-temporal model, a time-difference variable τ is set up to represent the time difference between two images, and maximum likelihood estimation is then used: the probability that the target appears in the earlier and later frames at τ is calculated and denoted P; P is computed over the training set, and the value of τ at which P is maximal is obtained;
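One way to realise the maximum-likelihood step is a histogram estimate over the observed transfer times of matched training pairs; this is a sketch under that assumption, with hypothetical names:

```python
import numpy as np

def fit_transition_time(deltas, bin_width=1.0):
    """Histogram-based maximum-likelihood estimate of the most probable
    transfer time between two connected camera regions.

    deltas: observed time gaps between a pedestrian leaving one region
    and appearing in the connected one (training set).
    Returns (bin centre with the highest empirical probability, that P).
    """
    deltas = np.asarray(deltas, dtype=float)
    bins = np.arange(deltas.min(), deltas.max() + bin_width, bin_width)
    counts, edges = np.histogram(deltas, bins=bins)
    P = counts / counts.sum()
    k = int(np.argmax(P))
    return 0.5 * (edges[k] + edges[k + 1]), float(P[k])
```

At query time, a candidate whose observed time gap falls far from the high-probability bins can be rejected, which is the "correction" of step 63).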
for the screening result of the query target, the same target pedestrian in the scenes connected to the query target is obtained from the spatio-temporal model and the spatio-temporal information of the query target; this result is then taken as the new query target, and the connected target pedestrian is obtained again under the space-time constraint, until the complete tracks of the pedestrian under the different cameras are obtained;
and outputting the cross-camera pedestrian tracking result.
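The query-and-chain loop above can be sketched as follows; `match_next` stands in for the spatio-temporally constrained re-identification step (all names hypothetical):

```python
def chain_tracks(start_id, match_next):
    """Follow space-time-constrained matches camera by camera.

    match_next(track_id) -> the matched track id in a connected camera,
    or None when no candidate passes the spatio-temporal model.
    Stops on None or on a repeat, so the loop always terminates.
    """
    chain, seen = [start_id], {start_id}
    nxt = match_next(start_id)
    while nxt is not None and nxt not in seen:
        chain.append(nxt)
        seen.add(nxt)
        nxt = match_next(nxt)
    return chain
```

The `seen` set guards against a pair of cameras matching each other back and forth forever.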
CN202010148317.5A 2020-03-05 2020-03-05 Pedestrian multi-target tracking method based on relevance clustering and space-time constraint Withdrawn CN111353448A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010148317.5A CN111353448A (en) 2020-03-05 2020-03-05 Pedestrian multi-target tracking method based on relevance clustering and space-time constraint

Publications (1)

Publication Number Publication Date
CN111353448A true CN111353448A (en) 2020-06-30

Family

ID=71196034


Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112380461A (en) * 2020-11-20 2021-02-19 华南理工大学 Pedestrian retrieval method based on GPS track
CN114372996A (en) * 2021-12-02 2022-04-19 北京航空航天大学 Pedestrian track generation method oriented to indoor scene
CN114530043A (en) * 2022-03-03 2022-05-24 上海闪马智能科技有限公司 Event detection method and device, storage medium and electronic device
CN114820688A (en) * 2021-01-21 2022-07-29 四川大学 Public space social distance measuring and analyzing method based on space-time trajectory
CN117495913A (en) * 2023-12-28 2024-02-02 中电科新型智慧城市研究院有限公司 Cross-space-time correlation method and device for night target track
CN117576146A (en) * 2023-11-09 2024-02-20 中国矿业大学(北京) Method and system for restoring inter-view pedestrian track of multi-path camera in building
CN113627497B (en) * 2021-07-27 2024-03-12 武汉大学 Space-time constraint-based cross-camera pedestrian track matching method



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20200630