CN114373154A - Appearance characteristic updating method and system for multi-target tracking in dense crowd scene - Google Patents

Publication number: CN114373154A
Authority: CN (China)
Prior art keywords: target, current frame, appearance, frame image, targets
Legal status: Pending
Application number: CN202210037157.6A
Other languages: Chinese (zh)
Inventors: 梁栋, 徐标异, 权荣, 高攀, 李铃, 杜云
Current Assignee: Nanjing University of Aeronautics and Astronautics
Original Assignee: Nanjing University of Aeronautics and Astronautics
Application filed by Nanjing University of Aeronautics and Astronautics
Priority to CN202210037157.6A
Publication of CN114373154A (legal status: Pending)

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 — Pattern recognition
    • G06F 18/20 — Analysing
    • G06F 18/22 — Matching criteria, e.g. proximity measures

Abstract

The invention relates to a method and a system for updating appearance characteristics of multi-target tracking in a dense crowd scene, wherein the method comprises the following steps: acquiring a current frame image, and obtaining a detection frame of a plurality of targets in the current frame image; extracting the features of the targets in the detection frames, and determining the appearance features of the targets in the current frame image and the motion features of the targets in the current frame image; determining a cost matrix according to the extracted features; judging whether the target of the previous frame and the target of the current frame are successfully matched by using the cost matrix; if yes, updating the appearance characteristic of the current frame target according to the distribution weight; and if not, not updating the appearance characteristic of the current frame target. The invention updates the appearance characteristics by using the distribution weight, thereby improving the accuracy of multi-target tracking.

Description

Appearance characteristic updating method and system for multi-target tracking in dense crowd scene
Technical Field
The invention relates to the field of multi-target tracking, in particular to a multi-target tracking appearance characteristic updating method and system in a dense crowd scene.
Background
Multi-target tracking has long been a hot issue in computer vision research, and multi-target tracking in dense scenes is a very challenging task. Since mutual occlusion between targets, and occlusion of targets by the surrounding environment, are common in dense scenes, keeping a target's identity unchanged and tracking it accurately after it has been occluded is a difficult problem. From the application perspective, tracking multiple targets in a dense scene is also a very common task in practical industrial scenarios such as intelligent traffic monitoring and unmanned driving. The task of multi-target tracking mainly consists of locating multiple objects, maintaining their identities, and generating their trajectories from the input video. The object detection module and the re-identification module are the two key modules of a multi-target tracking algorithm: the object detection module is responsible for locating targets in the image and obtaining their detection frames, while the re-identification module extracts the appearance features (also called Re-ID features) of the targets inside the detection frames and uses them to identify each target and assign it an ID.
The appearance updating methods of current multi-target tracking algorithms, whether two-stage trackers such as DeepSORT (N. Wojke, A. Bewley, and D. Paulus, "Simple online and realtime tracking with a deep association metric," in 2017 IEEE International Conference on Image Processing (ICIP), IEEE, 2017, pp. 3645-3649) and POI (F. Yu, W. Li, Q. Li, Y. Liu, X. Shi, and J. Yan, "POI: Multiple object tracking with high performance detection and appearance feature," in ECCV Workshops, Springer, 2016, pp. 36-42), or single-stage trackers such as JDE (Z. Wang, L. Zheng, Y. Liu, Y. Li, and S. Wang, "Towards real-time multi-object tracking," arXiv:1909.12605, 2019) and FairMOT (Y. Zhang, C. Wang, X. Wang, W. Zeng, and W. Liu, "FairMOT: On the fairness of detection and re-identification in multiple object tracking," International Journal of Computer Vision, pp. 1-19, 2021), obtain the appearance template feature by the formula f = λf_1 + (1 − λ)f_2, where f represents the updated appearance template feature of the target in the current frame image, f_1 represents the appearance feature of the target in the current frame image, f_2 represents the appearance template feature of the target in the previous frame image, and λ is a hyper-parameter.
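As a minimal sketch of this conventional per-frame exponential update (the function name and the λ value below are illustrative assumptions, not taken from the cited papers):

```python
import numpy as np

def ema_template_update(f1, f2, lam=0.9):
    """Conventional template update f = lam*f1 + (1-lam)*f2, where f1 is the
    current-frame appearance feature, f2 is the previous frame's appearance
    template, and lam is a hyper-parameter (0.9 here is illustrative)."""
    return lam * f1 + (1.0 - lam) * f2
```

Applied once per frame, this compounds into an exponential decay: after k frames, the start frame's contribution has shrunk by a factor of (1 − λ)^k, which is the behaviour the invention replaces.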
This update mechanism linearly weights the historical frames once each time a new frame image arrives, and the weight of each historical frame decays continuously as time goes on. In essence, the update method assumes that the frame closest in time to the current frame deserves the greatest update weight, while the influence of frames further away in time (e.g., the start frame) progressively dies out. This mechanism is flawed in dense crowd scenes: if the target in a previous frame was occluded, the appearance features obtained through it are doped with many appearance feature components that do not belong to the target, causing the tracker to misjudge, assign the wrong identity to the target, and fail to track multiple targets accurately.
Disclosure of Invention
The invention aims to provide a method and a system for updating appearance characteristics of multi-target tracking in a dense crowd scene, so as to solve the problem that the multi-target tracking method in the prior art is inaccurate in target tracking.
In order to achieve the purpose, the invention provides the following scheme:
a multi-target tracking appearance feature updating method in a dense crowd scene comprises the following steps:
acquiring a current frame image; the current frame image is an image with a plurality of targets;
inputting the current frame image into a target detection network to obtain a plurality of target detection frames in the current frame image;
inputting the targets in the detection frames into a feature extraction network, and determining appearance features of the targets in the current frame image and motion features of the targets in the current frame image;
determining a cost matrix according to the appearance characteristics of a plurality of targets in the current frame image, the motion characteristics of a plurality of targets in the current frame image, the appearance template characteristics of a plurality of targets in the previous frame image and the updated motion characteristics of a plurality of targets in the previous frame image;
judging whether the target of the previous frame and the target of the current frame are successfully matched by using the cost matrix to obtain a judgment result;
if the judgment result is that the target of the previous frame is successfully matched with the target of the current frame, assigning the weight of the appearance feature of the target in the current frame image as a preset weight; the total weight of the appearance feature of the target in the current frame image and the appearance template features of the target in the historical frame images is 1, the preset weight is less than 1, and the formula

f'_n = γf_n + Σ_{i=0}^{n−1} μ_i f_i

is used to update the appearance feature of the current frame target to obtain the appearance template feature of the target in the current frame image; the motion feature of the target in the current frame image is updated by Kalman filtering to obtain the updated motion feature of the target in the current frame image; wherein f'_n represents the updated appearance template feature of the target in the current frame image; f_n represents the appearance feature of the current frame target; f_i represents the appearance template feature of the target in the i-th frame before the current frame; μ_i represents the assigned weight of the appearance template feature of the i-th frame target; γ represents the assigned weight of the appearance feature of the current frame target; and n represents the total number of frames used for updating;
and if the judgment result is that the target of the previous frame is not successfully matched with the target of the current frame, not updating the appearance characteristic of the target in the image of the current frame and the motion characteristic of the target in the image of the current frame.
Optionally, before the obtaining of the current frame image, the method further includes:
appearance template features of the plurality of objects in the historical frame image and updated motion features of the plurality of objects in the historical frame image are obtained.
Optionally, the determining the cost matrix according to the appearance features of the multiple targets in the current frame image, the motion features of the multiple targets in the current frame image, the appearance template features of the multiple targets in the previous frame image, and the updated motion features of the multiple targets in the previous frame image specifically includes:
determining the similarity of the appearance characteristics of the targets according to the appearance characteristics of the targets in the current frame image and the appearance template characteristics of the targets in the previous frame image;
determining motion characteristic similarity of the targets according to the motion characteristics of the targets in the current frame image and the updated motion characteristics of the targets in the previous frame image;
using the formula costmatrix = λd_1 + (1 − λ)d_2 to determine the cost matrix; wherein costmatrix represents the cost matrix; d_1 is the appearance feature similarity; d_2 is the motion feature similarity; λ is a hyper-parameter.
A multi-target tracking appearance characteristic updating system in a dense crowd scene comprises:
the image acquisition module is used for acquiring a current frame image; the current frame image is an image with a plurality of targets;
the target detection frame determining module is used for inputting the current frame image into a target detection network to obtain a plurality of target detection frames in the current frame image;
the feature extraction module is used for inputting the targets in the detection frames into a feature extraction network and determining appearance features of the targets in the current frame image and motion features of the targets in the current frame image;
the cost matrix determining module is used for determining a cost matrix according to the appearance characteristics of the targets in the current frame image, the motion characteristics of the targets in the current frame image, the appearance template characteristics of the targets in the previous frame image and the updated motion characteristics of the targets in the previous frame image;
the judging module is used for judging whether the target of the previous frame is successfully matched with the target of the current frame by utilizing the cost matrix to obtain a judging result;
the first execution module is used for assigning the weight of the appearance feature of the target in the current frame image as a preset weight if the judgment result is that the target in the previous frame is successfully matched with the target in the current frame; the total weight of the appearance feature of the target in the current frame image and the appearance template features of the target in the historical frame images is 1, the preset weight is less than 1, and the formula

f'_n = γf_n + Σ_{i=0}^{n−1} μ_i f_i

is used to update the appearance feature of the current frame target to obtain the appearance template feature of the target in the current frame image; Kalman filtering is used to update the motion feature of the target in the current frame image to obtain the updated motion feature; wherein f'_n represents the updated appearance template feature of the target in the current frame image; f_n represents the appearance feature of the current frame target; f_i represents the appearance template feature of the target in the i-th frame before the current frame; μ_i represents the assigned weight of the appearance template feature of the i-th frame target; γ represents the assigned weight of the appearance feature of the current frame target; n represents the total number of frames used for updating;
and the second execution module is used for not updating the appearance characteristics of the target in the current frame image and the motion characteristics of the target in the current frame image if the judgment result shows that the target of the previous frame and the target of the current frame are not successfully matched.
Optionally, the method further includes:
and the historical frame updating characteristic acquisition module is used for acquiring the appearance template characteristics of the plurality of targets in the historical frame image and the updated motion characteristics of the plurality of targets in the historical frame image.
Optionally, the cost matrix determining module specifically includes:
the appearance feature similarity determining unit is used for determining the appearance feature similarity of the targets according to the appearance features of the targets in the current frame image and the appearance template features of the targets in the previous frame image;
the motion characteristic similarity determining unit is used for determining the motion characteristic similarity of the targets according to the motion characteristics of the targets in the current frame image and the updated motion characteristics of the targets in the previous frame image;
a cost matrix determination unit, used for determining the cost matrix using the formula costmatrix = λd_1 + (1 − λ)d_2; wherein costmatrix represents the cost matrix; d_1 is the appearance feature similarity; d_2 is the motion feature similarity; λ is a hyper-parameter.
According to the specific embodiment provided by the invention, the invention discloses the following technical effects:
the method comprises the steps of detecting a plurality of targets in a current frame image and extracting appearance characteristics and motion characteristics of the plurality of targets in the current frame image by acquiring the current frame image and inputting the current frame image into a target detection network; determining a cost matrix according to the appearance characteristics of a plurality of targets in the current frame image, the motion characteristics of a plurality of targets in the current frame image, the appearance template characteristics of a plurality of targets in the previous frame image and the updated motion characteristics of a plurality of targets in the previous frame image; and solving the cost matrix, and updating the appearance characteristics of the current frame target according to preset distribution weight so as to replace an appearance updating scheme of historical frame linear attenuation. By using the scheme, even if the target is instantly shielded in the updating process, the ratio of the shielded appearance features to the updated appearance template features is controlled, which is usually not enough to cause the misjudgment of the target in the matching process, and the tracker can more accurately track the target, thereby improving the accuracy of multi-target tracking.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings without inventive exercise.
FIG. 1 is a flowchart of an appearance feature updating method for multi-target tracking in a dense crowd scene according to the present invention;
FIG. 2 is a general flowchart of a method for updating appearance characteristics of multi-target tracking in a dense crowd scenario according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of an appearance feature update strategy provided by the present invention;
FIG. 4 is a graph comparing the tracking results using the appearance feature update mechanism of the present invention with the tracking results of the existing appearance feature update mechanism;
fig. 5 is a structural diagram of an appearance feature updating system for multi-target tracking in a dense crowd scene provided by the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention aims to provide a method and a system for updating appearance characteristics of multi-target tracking in a dense crowd scene, so as to solve the problem that the multi-target tracking method in the prior art is inaccurate in target tracking.
In the invention, considering that mutual occlusion between people, or occlusion of people by the environment, is a common phenomenon in dense crowd scenes, updating with only the Re-ID feature (appearance feature) of the current frame target and the appearance template feature of the previous frame target leaves the resulting appearance template feature of the current frame target doped with a large proportion of features that do not belong to the target itself, causing the target to be misjudged during matching. The invention therefore proposes a new appearance template update mechanism: the appearance feature of the current frame target and the appearance template features of the historical frame targets are used together to obtain the appearance template feature of the target in the current frame image. For the weight allocation over the appearance template features of different frames, the weight of the current frame's Re-ID feature is fixed at 0.1, the weight of the initial frame's appearance template feature is set as large as possible, and the remaining weight is distributed equally over the appearance template features of the remaining frames. In this way a more uniform appearance template feature is obtained, so that even if the target is occluded midway, it is not misidentified. The appearance features are extracted directly, while the appearance template features are obtained by updating the appearance features.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
Fig. 1 is a flowchart of an appearance feature updating method for multi-target tracking in a dense crowd scene provided by the present invention, and fig. 2 is a general flowchart of an appearance feature updating method for multi-target tracking in a dense crowd scene in an embodiment of the present invention, as shown in fig. 1 and fig. 2, an appearance feature updating method for multi-target tracking in a dense crowd scene includes:
step 101: and acquiring a current frame image. The current frame image is an image having a plurality of objects.
In a specific embodiment, before the step 101, the method further includes:
appearance template features of the plurality of objects in the historical frame image and updated motion features of the plurality of objects in the historical frame image are obtained.
In practical applications, whether the current frame image or the historical frame image, the image includes a plurality of targets, and the targets may be people, vehicles, and the like.
Step 102: and inputting the current frame image into a target detection network to obtain a plurality of target detection frames in the current frame image.
In practical application, the current frame T_n is input into the target detection network (a DLA-34 network) to obtain the detection frame of each target in the current frame, preparing for accurately extracting each target's features later.
Step 103: and inputting the targets in the detection frames into a feature extraction network, and determining appearance features of the targets in the current frame image and motion features of the targets in the current frame image.
In practical application, the target in each detection frame, i.e., the person in the picture, is cropped out and input into the feature extraction network to obtain each target's Re-ID feature and motion feature. The Re-ID feature is the target's appearance feature, a 128-dimensional feature vector. The motion feature is also a feature vector, containing information such as the target's motion direction, speed, and position.
Step 104: and determining a cost matrix according to the appearance characteristics of the multiple targets in the current frame image, the motion characteristics of the multiple targets in the current frame image, the appearance template characteristics of the multiple targets in the previous frame image and the updated motion characteristics of the multiple targets in the previous frame image.
In a specific embodiment, the step 104 specifically includes:
determining the similarity of the appearance characteristics of the targets according to the appearance characteristics of the targets in the current frame image and the appearance template characteristics of the targets in the previous frame image;
determining motion characteristic similarity of the targets according to the motion characteristics of the targets in the current frame image and the updated motion characteristics of the targets in the previous frame image;
using the formula costmatrix = λd_1 + (1 − λ)d_2 to determine the cost matrix; wherein costmatrix represents the cost matrix; d_1 is the appearance feature similarity; d_2 is the motion feature similarity; λ is a hyper-parameter.
In practical application, the cost matrix is determined by combining the Re-ID characteristic and the motion characteristic of the current frame target with the appearance template characteristic and the updated motion characteristic of the previous frame target.
Specifically, the appearance feature similarity of each target in the current frame and the previous frame is calculated by the cosine distance formula, as follows:

cos(θ) = Σ_{i=1}^{c} A_j^i B_k^i / ( sqrt(Σ_{i=1}^{c} (A_j^i)²) · sqrt(Σ_{i=1}^{c} (B_k^i)²) )

d_1(j, k) = 1 − cos(θ)

wherein A_j represents the Re-ID feature of the j-th target of the current frame, and B_k represents the appearance template feature of the k-th target of the previous frame; cos(θ) represents the cosine similarity between the Re-ID feature of the j-th target of the current frame and the appearance template feature of the k-th target of the previous frame; c is the feature dimension; A_j^i represents the i-th dimension of the j-th current-frame target's feature; B_k^i represents the i-th dimension of the k-th previous-frame target's appearance template feature; d_1 is the appearance feature similarity (cosine distance) between previous-frame and current-frame targets; m is the number of targets in the current frame image, and n is the number of targets in the previous frame image.
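A minimal sketch of the pairwise cosine-distance computation (function name and array shapes are assumptions for illustration):

```python
import numpy as np

def cosine_distance_matrix(A, B):
    """A: (m, c) Re-ID features of the m current-frame targets.
    B: (n, c) appearance template features of the n previous-frame targets.
    Returns d1, an (m, n) matrix of cosine distances 1 - cos(theta)."""
    A_n = A / np.linalg.norm(A, axis=1, keepdims=True)  # row-wise L2 norm
    B_n = B / np.linalg.norm(B, axis=1, keepdims=True)
    return 1.0 - A_n @ B_n.T
```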
The motion feature similarity of each target in the current frame and the previous frame is calculated by the Mahalanobis distance formula:

D_M(x, y) = sqrt( (x − y)^T Σ^{−1} (x − y) )

wherein x represents the updated motion feature of the target in the previous frame, and y represents the motion feature of the target in the current frame; Σ is the covariance matrix of the multidimensional random variable; D_M(x, y) denotes the Mahalanobis distance; d_2 is the motion feature similarity between the previous-frame and current-frame targets.
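The Mahalanobis distance above can be sketched directly (the covariance argument here is assumed to come from the tracker's Kalman filter state, which the patent does not spell out):

```python
import numpy as np

def mahalanobis_distance(x, y, cov):
    """D_M(x, y) = sqrt((x - y)^T Sigma^{-1} (x - y)); cov is the covariance
    matrix Sigma. np.linalg.solve avoids forming the explicit inverse."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))
```

With an identity covariance the formula reduces to the ordinary Euclidean distance, which is a convenient sanity check.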
The appearance feature similarity and motion feature similarity of each target are combined through the hyper-parameter λ to obtain the cost matrix. The specific formula is: costmatrix = λd_1 + (1 − λ)d_2, where costmatrix represents the cost matrix; d_1 and d_2 are both m × n matrices, so costmatrix is an m × n matrix; λ is here set to 0.98.
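The combination step is a one-line weighted sum; a sketch with λ = 0.98 as stated above (the function name is an assumption):

```python
import numpy as np

def cost_matrix(d1, d2, lam=0.98):
    """costmatrix = lam*d1 + (1-lam)*d2: d1 is the (m, n) appearance (cosine)
    distance matrix, d2 the (m, n) motion (Mahalanobis) distance matrix."""
    return lam * np.asarray(d1) + (1.0 - lam) * np.asarray(d2)
```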
Step 105: judging whether the target of the previous frame is successfully matched with the target of the current frame by using the cost matrix, if so, executing a step 106; if not, step 107 is performed.
Step 106: distributing the weight of the appearance characteristic of the target in the current frame image as a preset weight; the total weight of the appearance characteristic of the target in the current frame image and the appearance template characteristic of the target in the historical frame image is 1, the preset weight is less than 1, and a formula is utilized
Figure BDA0003468449780000086
Updating the appearance characteristic of the current frame target to obtain the appearance template characteristic of the target in the current frame image; and updating the motion characteristics of the target in the current frame image by using Kalman filtering to obtain the updated motion characteristics of the target in the current frame image. Wherein, f'nRepresenting the appearance template characteristics of the target in the current frame image obtained after updating; f. ofnRepresenting the appearance characteristics of the current frame target; f. ofiRepresenting the appearance template characteristic of the target of the ith frame before the current frame; mu.siThe assigned weight of the appearance template characteristic representing the ith frame target; gamma represents the assigned weight of the appearance feature of the current frame target; n denotes the total number of frames used for updating.
Step 107: and not updating the appearance characteristic of the target in the current frame image and the motion characteristic of the target in the current frame image.
In practical application, the cost matrix is solved through the Hungarian algorithm, and whether the targets of the front frame and the rear frame belong to the same target or not is judged. The specific process comprises the following steps:
the method comprises the following steps: and subtracting the minimum value of the line from each line of the cost matrix, and entering the step two.
Step two: and subtracting the minimum value of the new matrix from each column, and entering the step three.
Step three: all 0 elements in the matrix are covered with the least number of row and column lines and it is checked whether the current individual 0 element is equal to the order of the cost matrix and if so, the optimal allocation. The independent 0 element means that the row and the column where the 0 element is located have only one 0 element, if the row line and the column line do not cover all the elements of the matrix, the fourth step is entered, otherwise, the fifth step is entered.
Step four: the smallest element is found among the elements not covered by the row and column lines, the smallest element is subtracted from the remaining elements, and the smallest element is added to the element in the intersection point in the corresponding row and column line. And if the optimal distribution is obtained, entering the step five, and if the optimal distribution is not obtained, entering the step four.
Step five: and finding out all independent 0 elements in the matrix, wherein the row and the column where the independent 0 element is located are targets matched on the previous frame and the current frame.
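The five steps above describe the Hungarian method. As a hedged sketch, a brute-force search over assignments (feasible only for small matrices, and assuming a square cost matrix) yields the same minimum-cost matching the Hungarian algorithm computes in polynomial time:

```python
import itertools

def optimal_assignment(costmatrix):
    """Exhaustively find the row-to-column assignment minimising total cost.
    Rows index previous-frame targets, columns current-frame targets; returns
    (matched (row, col) pairs, total cost)."""
    n = len(costmatrix)
    best_cost, best_perm = float("inf"), None
    for perm in itertools.permutations(range(n)):
        cost = sum(costmatrix[r][c] for r, c in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return list(enumerate(best_perm)), best_cost
```

In practice a library implementation of the Hungarian algorithm would be used instead, since brute force is factorial in the number of targets.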
If the same target is found, its appearance feature is updated according to the update strategy of fig. 3, and its motion features are updated with Kalman filtering. If the targets are not matched, neither the appearance features nor the motion features are updated. A more balanced appearance template feature can be obtained through this update mode: even if the target is occluded midway, the proportion of the occluded appearance features within the appearance template feature stays small, so the judgment of the target during matching is not affected.
Fig. 3 is a schematic diagram of an appearance feature update policy provided by the present invention, and as shown in fig. 3, the update policy is specifically as follows:
(1) The T_0 frame is input into the feature extraction network to obtain the Re-ID features of the T_0 frame targets.
(2) The T_i frame is input into the feature extraction network to obtain the Re-ID features of the T_i frame targets.
(3) The appearance feature similarity of each target in the previous frame and the current frame is calculated with the cosine distance formula, and the motion feature similarity with the Mahalanobis distance formula; the two are combined through the hyper-parameter λ to obtain the cost matrix, which is solved by the Hungarian algorithm to match targets between the previous frame and the current frame. If a target is matched, the weight u_i of the T_i frame's Re-ID feature is set to 0.1 and the weight u_0 of the T_0 frame's appearance template feature is set to 0.9. The appearance template feature of the T_i frame target, i.e., of the matched target in the current frame, is then obtained by adding the T_0 frame's appearance template feature multiplied by its weight u_0 to the T_i frame's appearance feature multiplied by its weight u_i. The appearance template feature of the current frame target is essentially also a 128-dimensional feature vector. If the target is not matched, its appearance feature is not updated.
(4) Input frame Tn into the feature extraction network to obtain the Re-ID features of the objects in frame Tn, and repeat the matching process of step (3). If a target in Tn is matched, the appearance template feature of frame Tn is obtained by combining the Re-ID feature of the Tn frame target with the appearance template features on frames Ti and T0, weighted by the weights assigned to the respective frames. In practice, the weight γ of frame Tn is fixed at 0.1, so that the resulting appearance template feature can adapt to some changes in the target's appearance, while the weight u0 of frame T0 is set as large as possible, so that over time a portion of the initial appearance features remains in the resulting appearance template feature. The remaining weight is distributed evenly over the appearance template features of the other frames. This update mechanism yields a balanced appearance template feature, so the target is not misjudged during matching and its identity label remains unchanged. The specific update mechanism is formulated as follows:
f'n = γ·fn + Σ_{i=0}^{n-1} μi·fi

γ + Σ_{i=0}^{n-1} μi = 1
wherein f'n represents the appearance template feature of the target in the current frame image obtained after updating; fn represents the appearance feature of the current frame target; fi represents the appearance template feature of the target in the i-th frame before the current frame; μi represents the weight assigned to the appearance template feature of the i-th frame target; γ represents the weight assigned to the appearance feature of the current frame target, with γ = 0.1; and n denotes the total number of frames used for updating.
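The update mechanism above can be sketched in a few lines. The even split of the residual weight over the non-initial history frames follows the description in the text; the function and argument names, and the fallback when only the initial frame is in the history, are illustrative assumptions.

```python
import numpy as np

def update_template(history, f_n, gamma=0.1, u0=0.6):
    """Weighted appearance-template update: f'_n = gamma*f_n + sum(mu_i * f_i).

    history : list of appearance template features [f_0, ..., f_{n-1}],
              with f_0 the initial frame's template feature
    f_n     : appearance (Re-ID) feature of the matched current-frame target
    gamma   : fixed weight of the current-frame feature
    u0      : weight of the initial frame's template feature; the remaining
              weight (1 - gamma - u0) is shared evenly by the other history
              frames, so all weights sum to 1
    """
    rest = len(history) - 1
    if rest > 0:
        mu = [u0] + [(1.0 - gamma - u0) / rest] * rest
    else:
        mu = [1.0 - gamma]  # only the initial frame in the history
    f_new = gamma * np.asarray(f_n, dtype=float)
    for w, f in zip(mu, history):
        f_new = f_new + w * np.asarray(f, dtype=float)
    return f_new
```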
In actual experiments, a large number of trials were carried out to find how many frames should be used for updating to give the best tracking effect, and what value of the T0 weight u0 is most beneficial to the update mechanism of the present invention. According to the experimental results, setting the number of frames to 3 and u0 to 0.6 is most advantageous for the update mechanism of the present invention; with γ = 0.1, the remaining weight of 0.3 is then split evenly over the other two history frames, giving μ1 = μ2 = 0.15.
Fig. 4 compares the tracking result using the original appearance feature update mechanism with the tracking result using the appearance feature update mechanism of the present invention. Figs. 4(a)-(c) show the tracking results of the original update mechanism, with the arrow pointing to the tracked target; it can be seen that after the target is occluded by the crowd, its identity changes from 363 to 217.
Figs. 4(d)-(f) show the tracking results obtained with the appearance feature update mechanism of the present invention; it can be seen that after the target is occluded, its identity does not change and remains label 366. The appearance feature update mechanism of the present invention therefore maintains the identity of the target well.
Fig. 5 is a structural diagram of an appearance feature updating system for multi-target tracking in a dense crowd scene, as shown in fig. 5, the appearance feature updating system for multi-target tracking in a dense crowd scene includes:
an image obtaining module 501, configured to obtain a current frame image. The current frame image is an image with a plurality of target persons.
A target detection frame determining module 502, configured to input the current frame image into a target detection network, so as to obtain detection frames of multiple targets in the current frame image.
The feature extraction module 503 is configured to input the objects in the detection frames into a feature extraction network, and determine appearance features of the objects in the current frame image and motion features of the objects in the current frame image.
In one embodiment, the appearance feature updating system for multi-target tracking in dense crowd scenes further includes:
and the historical frame updating characteristic acquisition module is used for acquiring the appearance template characteristics of the plurality of targets in the historical frame image and the updated motion characteristics of the plurality of targets in the historical frame image.
A cost matrix determining module 504, configured to determine a cost matrix according to the appearance features of the multiple targets in the current frame image, the motion features of the multiple targets in the current frame image, the appearance template features of the multiple targets in the previous frame image, and the updated motion features of the multiple targets in the previous frame image.
In a specific embodiment, the cost matrix determining module 504 specifically includes:
and the appearance feature similarity determining unit is used for determining the appearance feature similarity of the targets according to the appearance features of the targets in the current frame image and the appearance template features of the targets in the previous frame image.
And the motion characteristic similarity determining unit is used for determining the motion characteristic similarity of the targets according to the motion characteristics of the targets in the current frame image and the updated motion characteristics of the targets in the previous frame image.
a cost matrix determination unit, configured to determine the cost matrix using the formula costmatrix = λ·d1 + (1-λ)·d2, wherein costmatrix represents the cost matrix, d1 is the appearance feature similarity, d2 is the motion feature similarity, and λ is a hyper-parameter.
And the judging module 505 is configured to judge whether the previous frame target and the current frame target are successfully matched by using the cost matrix, so as to obtain a judgment result.
A first executing module 506, configured to, if the determination result is that the previous frame target and the current frame target are successfully matched, assign the weight of the appearance feature of the target in the current frame image as a preset weight, wherein the total weight of the appearance feature of the target in the current frame image and the appearance template features of the target in the historical frame images is 1 and the preset weight is less than 1; to update the appearance feature of the current frame target using the formula

f'n = γ·fn + Σ_{i=0}^{n-1} μi·fi

to obtain the appearance template feature of the target in the current frame image; and to update the motion feature of the target in the current frame image using Kalman filtering to obtain the updated motion feature of the target in the current frame image. Wherein f'n represents the appearance template feature of the target in the current frame image obtained after updating; fn represents the appearance feature of the current frame target; fi represents the appearance template feature of the target in the i-th frame before the current frame; μi represents the weight assigned to the appearance template feature of the i-th frame target; γ represents the weight assigned to the appearance feature of the current frame target; and n denotes the total number of frames used for updating.
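The "update the motion feature with Kalman filtering" step can be illustrated with a minimal constant-velocity filter on a 1-D position. This is a generic sketch: the state layout and noise magnitudes are illustrative and are not taken from the patent, which does not specify its filter parameters.

```python
import numpy as np

class ConstantVelocityKF:
    """Minimal constant-velocity Kalman filter for a 1-D position track."""

    def __init__(self, pos, dt=1.0):
        self.x = np.array([pos, 0.0])               # state: [position, velocity]
        self.P = np.eye(2)                          # state covariance
        self.F = np.array([[1.0, dt], [0.0, 1.0]])  # constant-velocity transition
        self.H = np.array([[1.0, 0.0]])             # we measure position only
        self.Q = 0.01 * np.eye(2)                   # process noise (illustrative)
        self.R = np.array([[0.1]])                  # measurement noise (illustrative)

    def predict(self):
        # propagate the state and covariance one frame forward
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[0]

    def update(self, z):
        # correct the prediction with the detected position z
        y = z - self.H @ self.x                     # innovation
        S = self.H @ self.P @ self.H.T + self.R     # innovation covariance
        K = self.P @ self.H.T @ np.linalg.inv(S)    # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(2) - K @ self.H) @ self.P
        return self.x[0]
```

In a tracker, `predict` supplies the motion state used in the Mahalanobis distance, and `update` is called only for targets that matched a detection, mirroring the matched/unmatched branching of the modules above.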
The second executing module 507 is configured to not update the appearance feature of the target in the current frame image and the motion feature of the target in the current frame image if the determination result is that the previous frame target and the current frame target are not successfully matched.
The invention has the following advantages:
1. Unlike the appearance update mechanism of existing multi-target tracking algorithms, which simply uses the Re-ID feature of the current frame and the appearance template feature of the previous frame to obtain the target's appearance template feature, the present invention updates with the appearance features of multiple frames of the target to obtain the appearance template feature of the current frame, which alleviates the influence of occluded targets on the updated appearance template feature.
2. By fixing the weight of the current frame's Re-ID feature at 0.1 during the update, the resulting appearance template feature of the target can adapt to changes in the target's appearance.
3. By assigning as much weight as possible to the appearance template feature of the initial frame, a portion of the initial appearance features is retained in the resulting appearance template feature as time passes, and the remaining weight is distributed evenly over the appearance template features of the other frames. Thus, even if the target is occluded midway, the occluded appearance features account for only a small proportion of the resulting appearance template feature, which is not enough to cause misjudgment of the target during matching, so the tracker can track the target accurately.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description.
The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.

Claims (6)

1. A multi-target tracking appearance characteristic updating method in a dense crowd scene is characterized by comprising the following steps:
acquiring a current frame image; the current frame image is an image with a plurality of targets;
inputting the current frame image into a target detection network to obtain a plurality of target detection frames in the current frame image;
inputting the targets in the detection frames into a feature extraction network, and determining appearance features of the targets in the current frame image and motion features of the targets in the current frame image;
determining a cost matrix according to the appearance characteristics of a plurality of targets in the current frame image, the motion characteristics of a plurality of targets in the current frame image, the appearance template characteristics of a plurality of targets in the previous frame image and the updated motion characteristics of a plurality of targets in the previous frame image;
judging whether the target of the previous frame and the target of the current frame are successfully matched by using the cost matrix to obtain a judgment result;
if the judgment result is that the target of the previous frame is successfully matched with the target of the current frame, assigning the weight of the appearance feature of the target in the current frame image as a preset weight, wherein the total weight of the appearance feature of the target in the current frame image and the appearance template features of the target in the historical frame images is 1 and the preset weight is less than 1; updating the appearance feature of the current frame target using the formula

f'n = γ·fn + Σ_{i=0}^{n-1} μi·fi

to obtain the appearance template feature of the target in the current frame image; and updating the motion feature of the target in the current frame image by using Kalman filtering to obtain the updated motion feature of the target in the current frame image; wherein f'n represents the appearance template feature of the target in the current frame image obtained after updating; fn represents the appearance feature of the current frame target; fi represents the appearance template feature of the target in the i-th frame before the current frame; μi represents the weight assigned to the appearance template feature of the i-th frame target; γ represents the weight assigned to the appearance feature of the current frame target; and n represents the total number of frames used for updating;
and if the judgment result is that the target of the previous frame is not successfully matched with the target of the current frame, not updating the appearance characteristic of the target in the image of the current frame and the motion characteristic of the target in the image of the current frame.
2. The method for updating appearance features of multi-target tracking in dense crowd scenes as claimed in claim 1, further comprising:
appearance template features of the plurality of objects in the historical frame image and updated motion features of the plurality of objects in the historical frame image are obtained.
3. The method as claimed in claim 2, wherein the determining the cost matrix according to the appearance features of the plurality of targets in the current frame image, the motion features of the plurality of targets in the current frame image, the appearance template features of the plurality of targets in the previous frame image, and the updated motion features of the plurality of targets in the previous frame image specifically includes:
determining the similarity of the appearance characteristics of the targets according to the appearance characteristics of the targets in the current frame image and the appearance template characteristics of the targets in the previous frame image;
determining motion characteristic similarity of the targets according to the motion characteristics of the targets in the current frame image and the updated motion characteristics of the targets in the previous frame image;
determining the cost matrix using the formula costmatrix = λ·d1 + (1-λ)·d2, wherein costmatrix represents the cost matrix, d1 is the appearance feature similarity, d2 is the motion feature similarity, and λ is a hyper-parameter.
4. A multi-target tracking appearance feature updating system in a dense crowd scene is characterized by comprising:
the image acquisition module is used for acquiring a current frame image; the current frame image is an image with a plurality of targets;
the target detection frame determining module is used for inputting the current frame image into a target detection network to obtain a plurality of target detection frames in the current frame image;
the feature extraction module is used for inputting the targets in the detection frames into a feature extraction network and determining appearance features of the targets in the current frame image and motion features of the targets in the current frame image;
the cost matrix determining module is used for determining a cost matrix according to the appearance characteristics of the targets in the current frame image, the motion characteristics of the targets in the current frame image, the appearance template characteristics of the targets in the previous frame image and the updated motion characteristics of the targets in the previous frame image;
the judging module is used for judging whether the target of the previous frame is matched with the target of the current frame by utilizing the cost matrix to obtain a judging result;
the first execution module is used for distributing the weight of the appearance characteristic of the target in the current frame image as a preset weight if the judgment result is that the target in the previous frame is successfully matched with the target in the current frame; the total weight of the appearance characteristic of the target in the current frame image and the appearance template characteristic of the target in the historical frame image is 1, the preset weight is less than 1, and a formula is utilized
Figure FDA0003468449770000031
Updating the appearance characteristics of the current frame target to obtain the appearance template characteristics of the target in the current frame image; updating the motion characteristics of the target in the current frame image by using Kalman filtering to obtain updated motion characteristics of the target in the current frame image; wherein, f'nRepresenting the appearance template characteristics of the target in the current frame image obtained after updating; f. ofnRepresenting the appearance characteristics of the current frame target; f. ofiRepresenting the appearance template characteristic of the target of the ith frame before the current frame; mu.siThe assigned weight of the appearance template characteristic representing the ith frame target; gamma represents the assigned weight of the appearance feature of the current frame target; n represents the total number of frames used for updating;
and the second execution module is used for not updating the appearance characteristics of the target in the current frame image and the motion characteristics of the target in the current frame image if the judgment result shows that the target of the previous frame and the target of the current frame are not successfully matched.
5. The multi-target tracking appearance feature updating system in the dense crowd scene as claimed in claim 4, further comprising:
and the historical frame updating characteristic acquisition module is used for acquiring the appearance template characteristics of the plurality of targets in the historical frame image and the updated motion characteristics of the plurality of targets in the historical frame image.
6. The system for updating appearance characteristics of multi-target tracking in dense crowd scenes as claimed in claim 5, wherein said cost matrix determining module specifically comprises:
the appearance feature similarity determining unit is used for determining the appearance feature similarity of the targets according to the appearance features of the targets in the current frame image and the appearance template features of the targets in the previous frame image;
the motion characteristic similarity determining unit is used for determining the motion characteristic similarity of the targets according to the motion characteristics of the targets in the current frame image and the updated motion characteristics of the targets in the previous frame image;
a cost matrix determination unit, configured to determine the cost matrix using the formula costmatrix = λ·d1 + (1-λ)·d2, wherein costmatrix represents the cost matrix, d1 is the appearance feature similarity, d2 is the motion feature similarity, and λ is a hyper-parameter.
CN202210037157.6A 2022-01-13 2022-01-13 Appearance characteristic updating method and system for multi-target tracking in dense crowd scene Pending CN114373154A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210037157.6A CN114373154A (en) 2022-01-13 2022-01-13 Appearance characteristic updating method and system for multi-target tracking in dense crowd scene


Publications (1)

Publication Number Publication Date
CN114373154A true CN114373154A (en) 2022-04-19

Family

ID=81187772

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210037157.6A Pending CN114373154A (en) 2022-01-13 2022-01-13 Appearance characteristic updating method and system for multi-target tracking in dense crowd scene

Country Status (1)

Country Link
CN (1) CN114373154A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114943943A (en) * 2022-05-16 2022-08-26 中国电信股份有限公司 Target track obtaining method, device, equipment and storage medium
CN114943943B (en) * 2022-05-16 2023-10-03 中国电信股份有限公司 Target track obtaining method, device, equipment and storage medium
CN114690176A (en) * 2022-06-01 2022-07-01 南京隼眼电子科技有限公司 Moving target tracking method and device, electronic equipment and storage medium
CN116862952A (en) * 2023-07-26 2023-10-10 合肥工业大学 Video tracking method for substation operators under similar background conditions
CN116862952B (en) * 2023-07-26 2024-02-27 合肥工业大学 Video tracking method for substation operators under similar background conditions

Similar Documents

Publication Publication Date Title
CN114373154A (en) Appearance characteristic updating method and system for multi-target tracking in dense crowd scene
CN111325089B (en) Method and apparatus for tracking object
CN112308881B (en) Ship multi-target tracking method based on remote sensing image
CN113034548B (en) Multi-target tracking method and system suitable for embedded terminal
CN107423702B (en) Video target tracking method based on TLD tracking system
CN116309757B (en) Binocular stereo matching method based on machine vision
CN111626194B (en) Pedestrian multi-target tracking method using depth correlation measurement
CN113191180B (en) Target tracking method, device, electronic equipment and storage medium
CN104865570B (en) Tracking before a kind of quick Dynamic Programming detection
CN108596157B (en) Crowd disturbance scene detection method and system based on motion detection
CN109816051B (en) Hazardous chemical cargo feature point matching method and system
CN112734858B (en) Binocular calibration precision online detection method and device
CN112444374B (en) Tracking evaluation method based on optical tracking measurement equipment servo system
CN116311063A (en) Personnel fine granularity tracking method and system based on face recognition under monitoring video
CN106033613B (en) Method for tracking target and device
CN111062954B (en) Infrared image segmentation method, device and equipment based on difference information statistics
CN115131705A (en) Target detection method and device, electronic equipment and storage medium
CN110969657B (en) Gun ball coordinate association method and device, electronic equipment and storage medium
AU2016342547A1 (en) Improvements in and relating to missile targeting
CN111524161B (en) Method and device for extracting track
CN113657169B (en) Gait recognition method, device and system and computer readable storage medium
CN110706202B (en) Atypical target detection method, atypical target detection device and computer readable storage medium
CN110427982B (en) Automatic wiring machine route correction method and system based on image processing
CN112365526A (en) Binocular detection method and system for weak and small targets
CN114091519A (en) Shielded pedestrian re-identification method based on multi-granularity shielding perception

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination