CN115994928B - Target tracking method, device, equipment and medium


Info

Publication number
CN115994928B
Authority
CN
China
Prior art keywords
detection
result
prediction result
detection result
matching
Prior art date
Legal status
Active
Application number
CN202310293456.0A
Other languages
Chinese (zh)
Other versions
CN115994928A (en)
Inventor
Xue Wei
Zhao Shiyu
Current Assignee
Imotion Automotive Technology Suzhou Co Ltd
Original Assignee
Imotion Automotive Technology Suzhou Co Ltd
Priority date
Filing date
Publication date
Application filed by Imotion Automotive Technology Suzhou Co Ltd
Priority to CN202310293456.0A
Publication of CN115994928A
Application granted
Publication of CN115994928B

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00: Reducing energy consumption in communication networks
    • Y02D30/70: Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Image Analysis (AREA)

Abstract

The application discloses a target tracking method, apparatus, device and medium in the technical field of computer vision, comprising the following steps: performing target detection on the current frame to obtain a detection result set, and determining a high-score detection result set and a low-score detection result set within it; matching the high-score detection result set with the prediction results of the previous frame, updating each matched prediction result with its detection result and putting it into the tracking set, and putting the detection results and prediction results that failed to match into a first detection legacy set and a first prediction result set, respectively; matching the low-score detection result set with the first prediction result set, likewise updating and collecting matched prediction results, discarding detection results that failed to match, and putting prediction results that failed to match into a second prediction result set; matching the first detection legacy set with the second prediction result set, updating and collecting matched prediction results, and putting detection results that failed to match into a second detection legacy set; and initializing the second detection legacy set and putting it into the tracking set so as to return the target tracking result.

Description

Target tracking method, device, equipment and medium
Technical Field
The present invention relates to the field of computer vision, and in particular, to a target tracking method, apparatus, device, and medium.
Background
The most mature tracking paradigm in current use is tracking-by-detection. In the first stage, a detection model produces detection results; in the second stage, a distance- or feature-based matching method associates the detection results with the tracking results of the previous frame. However, the existing tracking method has the following problem: after a target has been lost for several consecutive frames, its detection box may have no intersection with the tracking prediction when it reappears, so the two cannot be matched, the lost target can never be re-matched, and tracking ID switches result.
In summary, how to increase the matching success rate so that targets can be tracked continuously, avoiding lost targets that can never be re-matched, is a problem that remains to be solved.
Disclosure of Invention
Accordingly, the present invention aims to provide a target tracking method, apparatus, device and medium that increase the matching success rate so that targets can be tracked continuously and lost targets do not remain unmatched. The specific scheme is as follows:
in a first aspect, the present application discloses a target tracking method, including:
detecting each target in the current frame to obtain a detection result set, and determining a high-score detection result set and a low-score detection result set in the detection result set;
matching the high-score detection result set with the prediction results of the previous frame, updating the corresponding prediction result with each successfully matched high-score detection result, putting the updated prediction result into a tracking set, and putting the high-score detection results and prediction results that failed to match into a first detection legacy set and a first prediction result set, respectively;
matching the low-score detection result set with the first prediction result set, updating the corresponding prediction result with each successfully matched low-score detection result, putting the updated prediction result into the tracking set, discarding low-score detection results that failed to match, and putting prediction results that failed to match into a second prediction result set;
matching the first detection legacy set with the second prediction result set, updating the corresponding prediction result with each successfully matched high-score detection result, putting the updated prediction result into the tracking set, and putting the high-score detection results and prediction results that failed to match into a second detection legacy set and a lost tracking set, respectively;
initializing the high-score detection results in the second detection legacy set, and placing the initialized high-score detection results into the tracking set, so as to return a target tracking result based on the tracking set;
wherein said matching the first detection legacy set with the second prediction result set comprises:
determining a CIoU distance cost between a high-score detection result in the first detection legacy set and a prediction result in the second prediction result set based on the CIoU method;
determining an appearance feature distance cost between a high-score detection result in the first detection legacy set and a prediction result in the second prediction result set based on cosine similarity;
and determining a target cost value from the CIoU distance cost and the appearance feature distance cost, and determining a matching result based on the target cost value.
Optionally, the detecting each target in the current frame to obtain a detection result set includes:
detecting the current frame with a target detector to obtain the detection result and detection score corresponding to each target, and constructing the detection result set based on the detection results.
Optionally, the determining the high-score detection result set and the low-score detection result set in the detection result set includes:
determining a preset detection score threshold, and determining the high-score detection result set and the low-score detection result set in the detection result set based on the detection score threshold and the detection scores corresponding to the targets.
Optionally, before the matching the high-score detection result set with the prediction results of the previous frame, the method further includes:
predicting the tracking result of the previous frame with a Kalman prediction method to obtain the prediction results of the previous frame.
Optionally, the determining a target cost value from the CIoU distance cost and the appearance feature distance cost, and determining a matching result based on the target cost value includes:
taking the smaller of the CIoU distance cost and the appearance feature distance cost as the target cost value, and judging whether the target cost value is smaller than a preset cost threshold;
if the target cost value is smaller than the preset cost threshold value, the matching result is successful;
and if the target cost value is not smaller than the preset cost threshold, the matching result is a matching failure.
Optionally, the determining, based on cosine similarity, an appearance feature distance cost between a high-score detection result in the first detection legacy set and a prediction result in the second prediction result set includes:
determining the cosine similarity between the high-score detection result and the prediction result;
judging whether the cosine similarity and the CIoU distance are smaller than the preset appearance feature threshold and the preset CIoU threshold, respectively;
if yes, determining the appearance feature distance cost between the high-score detection result in the first detection legacy set and the prediction result in the second prediction result set by using the cosine similarity and a preset coefficient;
and if not, determining the appearance feature distance cost between the high-score detection result in the first detection legacy set and the prediction result in the second prediction result set as 1.
In a second aspect, the present application discloses a target tracking apparatus comprising:
the detection result acquisition module is used for detecting each target in the current frame to obtain a detection result set, and determining a high-score detection result set and a low-score detection result set in the detection result set;
the first matching module is used for matching the high-score detection result set with the prediction results of the previous frame, updating the corresponding prediction result with each successfully matched high-score detection result, putting the updated prediction result into the tracking set, and putting the high-score detection results and prediction results that failed to match into the first detection legacy set and the first prediction result set, respectively;
the second matching module is used for matching the low-score detection result set with the first prediction result set, updating the corresponding prediction result with each successfully matched low-score detection result, putting the updated prediction result into the tracking set, discarding low-score detection results that failed to match, and putting prediction results that failed to match into the second prediction result set;
the third matching module is used for matching the first detection legacy set with the second prediction result set, updating the corresponding prediction result with each successfully matched high-score detection result, putting the updated prediction result into the tracking set, and putting the high-score detection results and prediction results that failed to match into the second detection legacy set and the lost tracking set, respectively;
the tracking result acquisition module is used for initializing the high-score detection results in the second detection legacy set and placing the initialized high-score detection results into the tracking set, so as to return a target tracking result based on the tracking set;
the third matching module is specifically configured to:
determining a CIoU distance cost between a high-score detection result in the first detection legacy set and a prediction result in the second prediction result set based on the CIoU method;
determining an appearance feature distance cost between a high-score detection result in the first detection legacy set and a prediction result in the second prediction result set based on cosine similarity;
and determining a target cost value from the CIoU distance cost and the appearance feature distance cost, and determining a matching result based on the target cost value.
In a third aspect, the present application discloses an electronic device comprising:
a memory for storing a computer program;
a processor for executing the computer program to implement the steps of the previously disclosed target tracking method.
In a fourth aspect, the present application discloses a computer-readable storage medium for storing a computer program; wherein the computer program, when executed by a processor, implements the steps of the previously disclosed target tracking method.
Therefore, the method and the device detect each target in the current frame to obtain a detection result set, and determine a high-score detection result set and a low-score detection result set in the detection result set; match the high-score detection result set with the prediction results of the previous frame, update the corresponding prediction result with each successfully matched high-score detection result, put the updated prediction result into a tracking set, and put the high-score detection results and prediction results that failed to match into a first detection legacy set and a first prediction result set, respectively; match the low-score detection result set with the first prediction result set, update the corresponding prediction result with each successfully matched low-score detection result, put the updated prediction result into the tracking set, discard low-score detection results that failed to match, and put prediction results that failed to match into a second prediction result set; match the first detection legacy set with the second prediction result set, update the corresponding prediction result with each successfully matched high-score detection result, put the updated prediction result into the tracking set, and put the high-score detection results and prediction results that failed to match into a second detection legacy set and a lost tracking set, respectively; and initialize the high-score detection results in the second detection legacy set and place them into the tracking set, so as to return a target tracking result based on the tracking set. Matching the first detection legacy set with the second prediction result set comprises: determining a CIoU distance cost between a high-score detection result in the first detection legacy set and a prediction result in the second prediction result set based on the CIoU method; determining an appearance feature distance cost between them based on cosine similarity; and determining a target cost value from the CIoU distance cost and the appearance feature distance cost, and determining the matching result based on the target cost value.
It can be seen that, after the high-score and low-score detection result sets of the current frame have been matched in turn against the prediction results of the previous frame, a first detection legacy set of high-score detections that were not successfully matched and a second prediction result set of prediction results that were not successfully matched remain. These two sets are then matched once more: on success the prediction result is updated with the corresponding high-score detection result and placed into the tracking set; on failure the high-score detection result and the prediction result are placed into the second detection legacy set and the lost tracking set, respectively, after which the high-score detection results in the second detection legacy set are initialized and placed into the tracking set, so that the final target tracking result can be returned based on the tracking set. Specifically, when the first detection legacy set is matched with the second prediction result set, the distance is computed with the CIoU method, the appearance feature distance between the detection target and the tracking target is computed with cosine similarity, the final target cost value is determined from the CIoU distance cost and the appearance feature distance cost, and the matching result is decided from that target cost value; the target cost value therefore accurately reflects the final cost between the detection target and the tracking target of the previous frame, which raises the matching success rate and reduces the ID switching of tracked targets. In other words, an additional matching pass between the first detection legacy set and the second prediction result set is added on top of the original tracking pipeline, which widens the matching range to the surrounding area, raises the matching success rate, solves the problem that lost targets cannot be re-matched and therefore cannot be tracked continuously, and reduces the ID switching of tracked targets.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings required by the embodiments or the prior-art description are briefly introduced below. It is apparent that the drawings described below are only embodiments of the present invention, and that a person of ordinary skill in the art may obtain other drawings from them without inventive effort.
FIG. 1 is a flow chart of a target tracking method disclosed in the present application;
FIG. 2 is a specific matching flow chart disclosed herein;
FIG. 3 is a schematic diagram of the positions of a detection target and a tracking target disclosed in the present application;
FIG. 4 is a schematic diagram of a target tracking apparatus disclosed in the present application;
fig. 5 is a block diagram of an electronic device disclosed in the present application.
Detailed Description
The following description of the technical solutions in the embodiments of the present application will be made clearly and completely with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The existing tracking method has the following problem: after a target has been lost for several consecutive frames, its detection box may have no intersection with the tracking prediction when it reappears, so the two cannot be matched, the lost target can never be re-matched, and tracking ID switches result. Therefore, the embodiments of the present application disclose a target tracking method, apparatus, device and medium that increase the matching success rate so that targets can be tracked continuously and lost targets do not remain unmatched.
Referring to fig. 1, an embodiment of the present application discloses a target tracking method, which includes:
step S11: and detecting each target in the current frame to obtain a detection result set, and determining a high-resolution detection result set and a low-resolution detection result set in the detection result set.
In this embodiment, a video V to be processed is acquired, and each frame f of the video V is traversed. Taking the current frame f_k as an example, target detection is performed on the current frame to obtain the corresponding detection result set D_k, and a high-score detection result set D_high and a low-score detection result set D_low are determined in the detection result set.
In a specific embodiment, the detecting each target in the current frame to obtain a detection result set specifically includes: detecting the current frame with a target detector to obtain the detection result and detection score corresponding to each target, and constructing the detection result set based on the detection results.
Further, in a specific embodiment, the determining the high-score detection result set and the low-score detection result set in the detection result set includes: determining a preset detection score threshold, and determining the high-score detection result set and the low-score detection result set in the detection result set based on the detection score threshold and the detection scores corresponding to the targets.
It will be appreciated that this embodiment detects the current frame with a target detector Det to obtain the detection result and detection score corresponding to each target, and constructs the detection result set based on the detection results. The detection result set is then divided into the high-score detection result set and the low-score detection result set according to the preset detection score threshold and the detection score of each target: when a detection score is below the threshold, the corresponding detection result is put into the low-score detection result set; when it is not below the threshold, the detection result is put into the high-score detection result set.
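To make the split concrete, a minimal Python sketch follows; the threshold value and all names are illustrative assumptions, not taken from the patent.

```python
# Minimal sketch of the score-based split; the threshold value is illustrative.
SCORE_THRESHOLD = 0.5  # preset detection score threshold

def split_detections(detections):
    """Split (box, score) detections into high-score and low-score sets."""
    d_high = [d for d in detections if d[1] >= SCORE_THRESHOLD]  # score not below threshold
    d_low = [d for d in detections if d[1] < SCORE_THRESHOLD]    # score below threshold
    return d_high, d_low
```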
Step S12: matching the high-score detection result set with the prediction results of the previous frame, updating the corresponding prediction result with each successfully matched high-score detection result, putting the updated prediction result into a tracking set, and putting the high-score detection results and prediction results that failed to match into a first detection legacy set and a first prediction result set, respectively.
In this embodiment, the high-score detection result set D_high is matched with the prediction results P_k of the previous frame. If a match succeeds, the corresponding prediction result is updated with the matched high-score detection result, and the updated prediction result is put into the tracking set T. If a match fails, the unmatched high-score detection results are put into the first detection legacy set D_remain, and the unmatched prediction results are put into the first prediction result set P_remain.
It should be noted that, in a specific embodiment, before the matching of the high-score detection result set with the prediction results of the previous frame, the method further includes: predicting the tracking result of the previous frame with a Kalman prediction method to obtain the prediction results of the previous frame. It will be appreciated that the prediction results of the previous frame are obtained by applying Kalman prediction to the tracking set T_{k-1} of the previous frame.
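The patent does not specify the motion model, so the sketch below assumes a standard constant-velocity Kalman filter over the box center, which is the common choice in detection-based trackers; all parameter values are illustrative.

```python
import numpy as np

def kalman_predict(x, P, dt=1.0, q=1e-2):
    """One constant-velocity Kalman prediction step (model assumed, not given by the patent).

    x: state vector [cx, cy, vx, vy]; P: 4x4 state covariance; dt: frame interval.
    """
    F = np.array([[1.0, 0.0, dt,  0.0],
                  [0.0, 1.0, 0.0, dt ],
                  [0.0, 0.0, 1.0, 0.0],
                  [0.0, 0.0, 0.0, 1.0]])
    Q = q * np.eye(4)             # process noise, illustrative magnitude
    x_pred = F @ x                # propagate the state one frame ahead
    P_pred = F @ P @ F.T + Q      # propagate the uncertainty
    return x_pred, P_pred
```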
Step S13: matching the low-score detection result set with the first prediction result set, updating the corresponding prediction result with each successfully matched low-score detection result, putting the updated prediction result into the tracking set, discarding low-score detection results that failed to match, and putting prediction results that failed to match into the second prediction result set.
In this embodiment, the low-score detection result set D_low is matched with the first prediction result set P_remain. If a match succeeds, the corresponding prediction result is updated with the matched low-score detection result, and the updated prediction result is put into the tracking set T. If a match fails, the unmatched low-score detection results are discarded, and the unmatched prediction results are put into the second prediction result set P_re-remain.
Step S14: matching the first detection legacy set with the second prediction result set, updating the corresponding prediction result with each successfully matched high-score detection result, putting the updated prediction result into the tracking set, and putting the high-score detection results and prediction results that failed to match into the second detection legacy set and the lost tracking set, respectively.
In this embodiment, the first detection legacy set D_remain is matched once more with the second prediction result set P_re-remain. If a match succeeds, the corresponding prediction result is updated with the matched high-score detection result, and the updated prediction result is put into the tracking set T. If a match fails, the unmatched high-score detection results are put into the second detection legacy set D_re-remain, and the unmatched prediction results are put into the lost tracking set T_loss.
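Taken together, steps S12 to S14 amount to a three-stage cascade over the sets above. The sketch below shows only the bookkeeping; `match` stands in for any assignment routine that returns matched pairs plus the unmatched leftovers, and `update`/`init_track` are hypothetical placeholders.

```python
def update(pred, det):
    """Refresh a matched track with its detection (placeholder)."""
    pred["box"] = det["box"]
    return pred

def init_track(det):
    """Start a new track from a leftover high-score detection (placeholder)."""
    return {"box": det["box"], "confirmed": False}

def track_frame(d_high, d_low, p_k, match):
    """Bookkeeping sketch of steps S12-S14; `match` returns (pairs, unmatched_dets, unmatched_preds)."""
    tracking = []
    pairs, d_remain, p_remain = match(d_high, p_k)             # S12: high-score vs. predictions
    tracking += [update(p, d) for d, p in pairs]
    pairs, _, p_re_remain = match(d_low, p_remain)             # S13: low-score vs. first leftovers
    tracking += [update(p, d) for d, p in pairs]               # unmatched low-score dets are dropped
    pairs, d_re_remain, t_loss = match(d_remain, p_re_remain)  # S14: extra CIoU + appearance pass
    tracking += [update(p, d) for d, p in pairs]
    tracking += [init_track(d) for d in d_re_remain]           # S15: leftovers start new tracks
    return tracking, t_loss                                    # t_loss: predictions still unmatched
```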
Specifically, referring to fig. 2, the embodiment of the present application provides the specific steps for matching the first detection legacy set with the second prediction result set, including:
step S141: determining CIoU distance cost between a high-resolution detection result in the first detection legacy set and a prediction result in the second prediction result set based on a CIoU method; and determining appearance feature distance cost between a high-resolution detection result in the first detection legacy set and a prediction result in the second prediction result set based on a cosine similarity method.
In this embodiment, when the first detection legacy set is matched with the second prediction result set, the distance metric used is the CIoU method. CIoU reflects the distance between the detection target box and the tracking target box better than plain IoU, and still yields a usable distance value for matching when the two boxes do not intersect. First, the CIoU distance cost between a high-score detection result in the first detection legacy set and a prediction result in the second prediction result set is determined based on the CIoU method, and the appearance feature distance cost between them is determined based on cosine similarity. It can be understood that the matching value has two parts: the first is the distance computation, for which the additional matching pass in this embodiment uses CIoU; the second is the appearance feature distance between the detection target and the tracking target, for which this embodiment uses cosine similarity.
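For reference, the sketch below implements the standard CIoU definition (overlap term minus a normalized center-distance term minus an aspect-ratio term) and returns 1 - CIoU as the cost; the patent does not fix this exact cost convention, so treat it as one reasonable reading. Because CIoU keeps decreasing as two disjoint boxes move apart, the cost stays informative even when plain IoU is exactly 0, which is the property this matching pass relies on.

```python
import math

def ciou_distance(box_a, box_b):
    """CIoU distance cost between boxes given as (x1, y1, x2, y2); lower means closer."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Plain IoU.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0
    # Squared center distance over the squared diagonal of the enclosing box.
    rho2 = ((ax1 + ax2 - bx1 - bx2) ** 2 + (ay1 + ay2 - by1 - by2) ** 2) / 4.0
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 + (max(ay2, by2) - min(ay1, by1)) ** 2
    # Aspect-ratio consistency term.
    v = (4.0 / math.pi ** 2) * (math.atan((ax2 - ax1) / (ay2 - ay1))
                                - math.atan((bx2 - bx1) / (by2 - by1))) ** 2
    alpha = v / (1.0 - iou + v) if (1.0 - iou + v) > 0 else 0.0
    ciou = iou - (rho2 / c2 if c2 > 0 else 0.0) - alpha * v
    return 1.0 - ciou
```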
In a specific embodiment, the determining, based on cosine similarity, the appearance feature distance cost between a high-score detection result in the first detection legacy set and a prediction result in the second prediction result set includes: determining the cosine similarity between the high-score detection result and the prediction result; judging whether the cosine similarity and the CIoU distance are smaller than the preset appearance feature threshold and the preset CIoU threshold, respectively; if yes, determining the appearance feature distance cost between the high-score detection result in the first detection legacy set and the prediction result in the second prediction result set by using the cosine similarity and a preset coefficient; if not, setting the appearance feature distance cost between them to 1. In this embodiment, the feature similarity is computed as the cosine distance of the features. Because ByteTrack computes only distances and does not use feature similarity, the matching rule of the ByteTrack pipeline is not simply carried over: since the matching range has been widened, computing CIoU alone might also match the detection boxes of other targets, so the feature cosine similarity is used to further constrain the matching. When the cosine similarity is smaller than the preset appearance feature threshold and the CIoU distance is also smaller than the preset CIoU threshold, the appearance feature distance cost is the product of the preset coefficient and the cosine similarity; otherwise, the appearance feature distance cost is set to 1. It can be understood that, when the cosine similarity and the CIoU distance are not both smaller than their respective thresholds, the detection result and the tracking result are not the same target; setting the cost value to 1 therefore guarantees that the subsequent matching result is a matching failure.
The specific formula is as follows:

$$
c_{i,j}^{app} =
\begin{cases}
0.5 \cdot d_{i,j}^{cos}, & d_{i,j}^{cos} < \theta_{emb} \ \text{and} \ d_{i,j}^{CIoU} < \theta_{CIoU} \\
1, & \text{otherwise}
\end{cases}
$$

where $c_{i,j}^{app}$ is the appearance feature distance cost between the $i$-th prediction result and the $j$-th detection result, $d_{i,j}^{cos}$ is the cosine similarity between the $i$-th prediction result and the $j$-th detection result, $\theta_{emb}$ is the preset appearance feature threshold, $\theta_{CIoU}$ is the preset CIoU threshold, $d_{i,j}^{CIoU}$ is the CIoU distance between the $i$-th prediction result and the $j$-th detection result, and 0.5 is the preset coefficient, which can be set according to the actual situation. The cosine similarity itself is computed from the appearance features of the $i$-th prediction result and the $j$-th detection result.
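A direct transcription of this rule might look as follows. Note one interpretive assumption: the patent calls $d_{i,j}^{cos}$ the cosine similarity, yet gates on it being small and uses it as a cost, so the sketch computes it as the cosine distance 1 - sim; the threshold values are illustrative.

```python
import numpy as np

THETA_EMB = 0.25   # preset appearance feature threshold (illustrative)
THETA_CIOU = 0.9   # preset CIoU threshold (illustrative)

def cosine_distance(feat_a, feat_b):
    """1 - cosine similarity between two appearance feature vectors."""
    sim = float(np.dot(feat_a, feat_b)
                / (np.linalg.norm(feat_a) * np.linalg.norm(feat_b) + 1e-12))
    return 1.0 - sim

def appearance_cost(d_cos, d_ciou, coeff=0.5):
    """Gated appearance cost from the formula above: pass both gates or force a non-match."""
    if d_cos < THETA_EMB and d_ciou < THETA_CIOU:
        return coeff * d_cos
    return 1.0
```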
For example, referring to fig. 3, when a tracked target is lost and later reappears, the prediction box and the corresponding detection box do not necessarily overlap, owing to prediction or detection error, so the IoU value is 0 and the two cannot be matched by IoU alone. In the figure, solid boxes denote detection targets, labeled 1', 2', and 3'; dashed boxes denote tracking targets, labeled 1, 2, and 3; identical numbers denote the same target. Even when a solid box and a dashed box do not intersect, matching can still be performed by setting a CIoU threshold. In addition, the additional matching pass does not interfere with the matching logic of the original tracking pipeline, so it suits most current tracking pipelines in a plug-and-play fashion.
Step S142: determining a target cost value from the CIoU distance cost and the appearance feature distance cost, and determining a matching result based on the target cost value.
In this embodiment, the determining a target cost value from the CIoU distance cost and the appearance feature distance cost, and determining a matching result based on the target cost value, specifically includes: taking the smaller of the CIoU distance cost and the appearance feature distance cost as the target cost value, and judging whether the target cost value is smaller than a preset cost threshold; if the target cost value is smaller than the preset cost threshold, the matching result is a matching success; if not, the matching result is a matching failure. It can be understood that this embodiment fuses the CIoU distance cost and the appearance feature distance cost by selecting the smaller of the two as the final target cost value, and a match succeeds only when the target cost value satisfies the cost threshold condition. Compared with prior-art schemes that use a weighted average, i.e. a weighted sum of the distance cost and the feature cost, as the final cost value, the target cost value obtained in this way more accurately reflects the final cost between the detection target and the tracking target of the previous frame, which effectively raises the matching success rate and reduces the ID switching of tracked targets.
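As a sketch, the fusion and decision reduce to a few lines; the cost threshold is illustrative, and in practice the fused costs for all pairs would be assembled into a matrix and solved with an assignment routine such as the Hungarian algorithm, which the patent leaves to the surrounding pipeline.

```python
COST_THRESHOLD = 0.8  # preset cost threshold (illustrative)

def target_cost(ciou_cost, app_cost):
    """Fuse the two costs by taking the smaller value, per the rule above."""
    return min(ciou_cost, app_cost)

def is_match(ciou_cost, app_cost):
    """A pair matches only when its fused cost falls below the preset threshold."""
    return target_cost(ciou_cost, app_cost) < COST_THRESHOLD
```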
Step S15: initializing high-score detection results in the second detection legacy set, and placing the initialized high-score detection results into the tracking set so as to return target tracking results based on the tracking set.
In this embodiment, the high-score detection results in the second detection legacy set D_re-remain are initialized as new trackers and put into the tracking set T, so that the target tracking result can be returned based on the resulting tracking set. It can be understood that the high-score detection results in the second detection legacy set correspond to targets in the current frame that were not successfully matched, so the purpose of initialization is to treat each of them as a new target to be matched against the next frame, further improving the matching success rate. During initialization, a new ID value is assigned, the position of the high-score detection result is recorded, and its state is set to unconfirmed before it is put into the tracking set of the current frame. Further, it should be noted that when a track in the lost tracking set T_loss has been lost continuously for more than a preset number of frames, it is deleted from the lost tracking set; the preset number of frames may be set to 30.
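The initialization and lost-track pruning described here can be sketched as below, refining the `init_track` placeholder from the earlier cascade sketch; field names are hypothetical.

```python
import itertools

_next_id = itertools.count(1)  # running track-ID generator (illustrative)
MAX_LOST_FRAMES = 30           # preset number of frames before a lost track is deleted

def init_track(det_box):
    """Turn a leftover high-score detection into a new, unconfirmed track with a fresh ID."""
    return {"id": next(_next_id), "box": det_box,
            "confirmed": False, "lost_frames": 0}

def prune_lost(t_loss):
    """Remove tracks that have been lost for more than the preset number of frames."""
    return [t for t in t_loss if t["lost_frames"] <= MAX_LOST_FRAMES]
```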
Therefore, the method and the device detect each target in the current frame to obtain a detection result set, and determine a high-score detection result set and a low-score detection result set in the detection result set; match the high-score detection result set with the prediction results of the previous frame, update the corresponding prediction result with each successfully matched high-score detection result, put the updated prediction result into a tracking set, and put the high-score detection results and prediction results that failed to match into a first detection legacy set and a first prediction result set, respectively; match the low-score detection result set with the first prediction result set, update the corresponding prediction result with each successfully matched low-score detection result, put the updated prediction result into the tracking set, discard low-score detection results that failed to match, and put prediction results that failed to match into a second prediction result set; match the first detection legacy set with the second prediction result set, update the corresponding prediction result with each successfully matched high-score detection result, put the updated prediction result into the tracking set, and put the high-score detection results and prediction results that failed to match into a second detection legacy set and a lost tracking set, respectively; and initialize the high-score detection results in the second detection legacy set and place them into the tracking set, so as to return a target tracking result based on the tracking set. Matching the first detection legacy set with the second prediction result set comprises: determining a CIoU distance cost between a high-score detection result in the first detection legacy set and a prediction result in the second prediction result set based on the CIoU method; determining an appearance feature distance cost between them based on cosine similarity; and determining a target cost value from the CIoU distance cost and the appearance feature distance cost, and determining the matching result based on the target cost value.
It can be seen that, after the high-score and low-score detection result sets of the current frame have been matched in turn against the prediction results of the previous frame, a first detection legacy set of high-score detections that were not successfully matched and a second prediction result set of prediction results that were not successfully matched remain. These two sets are then matched once more: on success the prediction result is updated with the corresponding high-score detection result and placed into the tracking set; on failure the high-score detection result and the prediction result are placed into the second detection legacy set and the lost tracking set, respectively, after which the high-score detection results in the second detection legacy set are initialized and placed into the tracking set, so that the final target tracking result can be returned based on the tracking set. Specifically, when the first detection legacy set is matched with the second prediction result set, the distance is computed with the CIoU method, the appearance feature distance between the detection target and the tracking target is computed with cosine similarity, the final target cost value is determined from the CIoU distance cost and the appearance feature distance cost, and the matching result is decided from that target cost value; the target cost value therefore accurately reflects the final cost between the detection target and the tracking target of the previous frame, which raises the matching success rate and reduces the ID switching of tracked targets. In other words, an additional matching pass between the first detection legacy set and the second prediction result set is added on top of the original tracking pipeline, which widens the matching range to the surrounding area, raises the matching success rate, solves the problem that lost targets cannot be re-matched and therefore cannot be tracked continuously, and reduces the ID switching of tracked targets.
Referring to fig. 4, an embodiment of the present application discloses a target tracking apparatus, which includes:
a detection result obtaining module 11, configured to detect each target in the current frame to obtain a detection result set, and determine a high-resolution detection result set and a low-resolution detection result set in the detection result set;
a first matching module 12, configured to match the high-resolution detection result set with the prediction result of the previous frame, update the high-resolution detection result that is successfully matched with the corresponding prediction result, put the updated prediction result into the tracking set, and put the high-resolution detection result and the prediction result that are failed to be matched into the first detection legacy set and the first prediction result set, respectively;
a second matching module 13, configured to match the low-resolution detection result set with the first prediction result set, update the low-resolution detection result that is successfully matched with the corresponding prediction result, put the updated prediction result into the tracking set, discard the low-resolution detection result that is failed to be matched, and put the prediction result that is failed to be matched into a second prediction result set;
a third matching module 14, configured to match the first detection legacy set with the second prediction result set, update the high-resolution detection result that is successfully matched with the corresponding prediction result, and put the updated prediction result into the tracking set, and put the high-resolution detection result and the prediction result that are failed to match into the second detection legacy set and the lost tracking set, respectively;
A tracking result obtaining module 15, configured to initialize a high-score detection result in the second detection legacy set, and put the initialized high-score detection result into the tracking set, so as to return a target tracking result based on the tracking set;
wherein, the third matching module 14 is specifically configured to:
determining a CIoU distance cost between a high-score detection result in the first detection legacy set and a prediction result in the second prediction result set based on the CIoU method;
determining an appearance feature distance cost between a high-score detection result in the first detection legacy set and a prediction result in the second prediction result set based on cosine similarity;
and determining a target cost value from the CIoU distance cost and the appearance feature distance cost, and determining a matching result based on the target cost value.
Therefore, the method and the device detect each target in the current frame to obtain a detection result set, and determine a high-score detection result set and a low-score detection result set in the detection result set; match the high-score detection result set with the prediction results of the previous frame, update the corresponding prediction result with each successfully matched high-score detection result, put the updated prediction result into a tracking set, and put the high-score detection results and prediction results that failed to match into a first detection legacy set and a first prediction result set, respectively; match the low-score detection result set with the first prediction result set, update the corresponding prediction result with each successfully matched low-score detection result, put the updated prediction result into the tracking set, discard low-score detection results that failed to match, and put prediction results that failed to match into a second prediction result set; match the first detection legacy set with the second prediction result set, update the corresponding prediction result with each successfully matched high-score detection result, put the updated prediction result into the tracking set, and put the high-score detection results and prediction results that failed to match into a second detection legacy set and a lost tracking set, respectively; and initialize the high-score detection results in the second detection legacy set and place them into the tracking set, so as to return a target tracking result based on the tracking set. Matching the first detection legacy set with the second prediction result set comprises: determining a CIoU distance cost between a high-score detection result in the first detection legacy set and a prediction result in the second prediction result set based on the CIoU method; determining an appearance feature distance cost between them based on cosine similarity; and determining a target cost value from the CIoU distance cost and the appearance feature distance cost, and determining the matching result based on the target cost value.
It can be seen that, after the high-score and low-score detection result sets of the current frame have been matched in turn against the prediction results of the previous frame, a first detection legacy set of high-score detections that were not successfully matched and a second prediction result set of prediction results that were not successfully matched remain. These two sets are then matched once more: on success the prediction result is updated with the corresponding high-score detection result and placed into the tracking set; on failure the high-score detection result and the prediction result are placed into the second detection legacy set and the lost tracking set, respectively, after which the high-score detection results in the second detection legacy set are initialized and placed into the tracking set, so that the final target tracking result can be returned based on the tracking set. Specifically, when the first detection legacy set is matched with the second prediction result set, the distance is computed with the CIoU method, the appearance feature distance between the detection target and the tracking target is computed with cosine similarity, the final target cost value is determined from the CIoU distance cost and the appearance feature distance cost, and the matching result is decided from that target cost value; the target cost value therefore accurately reflects the final cost between the detection target and the tracking target of the previous frame, which raises the matching success rate and reduces the ID switching of tracked targets. In other words, an additional matching pass between the first detection legacy set and the second prediction result set is added on top of the original tracking pipeline, which widens the matching range to the surrounding area, raises the matching success rate, solves the problem that lost targets cannot be re-matched and therefore cannot be tracked continuously, and reduces the ID switching of tracked targets.
Fig. 5 is a schematic structural diagram of an electronic device 20 according to an embodiment of the present application. Specifically, the electronic device includes: at least one processor 21, at least one memory 22, a power supply 23, a communication interface 24, an input/output interface 25, and a communication bus 26. The memory 22 is configured to store a computer program, which is loaded and executed by the processor 21 to implement the relevant steps of the target tracking method performed by the electronic device as disclosed in any of the foregoing embodiments.
In this embodiment, the power supply 23 is configured to provide an operating voltage for each hardware device on the electronic device 20; the communication interface 24 can create a data transmission channel between the electronic device 20 and an external device, and the communication protocol to be followed is any communication protocol applicable to the technical solution of the present application, which is not specifically limited herein; the input/output interface 25 is used for acquiring external input data or outputting external output data, and the specific interface type thereof may be selected according to the specific application requirement, which is not limited herein.
Processor 21 may include one or more processing cores, such as a 4-core or 8-core processor. The processor 21 may be implemented in at least one hardware form among DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), and PLA (Programmable Logic Array). The processor 21 may also comprise a main processor and a coprocessor: the main processor, also called the CPU (Central Processing Unit), processes data in the awake state; the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, the processor 21 may integrate a GPU (Graphics Processing Unit) responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 21 may also include an AI (Artificial Intelligence) processor for handling computing operations related to machine learning.
The memory 22 may be any carrier for storing resources, such as read-only memory, random access memory, a magnetic disk, or an optical disk. The resources stored thereon include an operating system 221, a computer program 222, and data 223, and the storage may be temporary or permanent.
The operating system 221 is used for managing and controlling the hardware devices on the electronic device 20 and the computer program 222, so that the processor 21 can operate on and process the mass data 223 in the memory 22; it may be Windows, Unix, Linux, or the like. The computer program 222 may further include, in addition to the computer program that performs the target tracking method executed by the electronic device 20 as disclosed in any of the foregoing embodiments, computer programs that perform other specific tasks. The data 223 may include, in addition to data received by the electronic device and transmitted from external devices, data collected through its own input/output interface 25, and so on.
Further, the embodiment of the application also discloses a computer readable storage medium, wherein the storage medium stores a computer program, and when the computer program is loaded and executed by a processor, the method steps executed in the target tracking process disclosed in any of the previous embodiments are realized.
In this specification, each embodiment is described in a progressive manner, and each embodiment is mainly described in a different point from other embodiments, so that the same or similar parts between the embodiments are referred to each other. For the device disclosed in the embodiment, since it corresponds to the method disclosed in the embodiment, the description is relatively simple, and the relevant points refer to the description of the method section.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software modules may be disposed in Random Access Memory (RAM), memory, read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
Finally, it should also be noted that relational terms such as first and second are used herein solely to distinguish one entity or action from another, and do not necessarily require or imply any actual relationship or order between such entities or actions. Moreover, the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but possibly also other elements not expressly listed, or elements inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises that element.
The target tracking method, apparatus, device, and storage medium provided by the present invention have been described above in detail, with specific examples used to illustrate the principles and embodiments of the invention; the above description of the embodiments is only intended to help understand the method of the invention and its core idea. At the same time, a person skilled in the art may vary the specific embodiments and the application scope in accordance with the idea of the invention; in view of the above, the contents of this description should not be construed as limiting the invention.

Claims (8)

1. A target tracking method, comprising:
detecting each target in the current frame to obtain a detection result set, and determining a high-score detection result set and a low-score detection result set in the detection result set;
matching the high-score detection result set with the prediction result of the previous frame, updating the corresponding prediction result with each successfully matched high-score detection result and putting the updated prediction result into a tracking set, and putting the high-score detection results and the prediction results that fail to match into a first detection legacy set and a first prediction result set, respectively;
matching the low-score detection result set with the first prediction result set, updating the corresponding prediction result with each successfully matched low-score detection result and putting the updated prediction result into the tracking set, discarding the low-score detection results that fail to match, and putting the prediction results that fail to match into a second prediction result set;
matching the first detection legacy set with the second prediction result set, updating the corresponding prediction result with each successfully matched high-score detection result and putting the updated prediction result into the tracking set, and putting the high-score detection results and the prediction results that fail to match into a second detection legacy set and a lost tracking set, respectively;
initializing the high-score detection results in the second detection legacy set, and putting the initialized high-score detection results into the tracking set, so as to return a target tracking result based on the tracking set;
wherein said matching the first detection legacy set with the second prediction result set comprises:
determining a CIoU distance cost between a high-score detection result in the first detection legacy set and a prediction result in the second prediction result set based on a CIoU method;
determining an appearance feature distance cost between a high-score detection result in the first detection legacy set and a prediction result in the second prediction result set based on a cosine similarity method;
and determining a target cost value from the CIoU distance cost and the appearance feature distance cost, and determining a matching result based on the target cost value;
and wherein said determining the appearance feature distance cost between the high-score detection result in the first detection legacy set and the prediction result in the second prediction result set based on the cosine similarity method comprises:
determining a cosine similarity using the high-score detection result in the first detection legacy set and the prediction result in the second prediction result set;
judging whether the cosine similarity and the CIoU distance cost are respectively smaller than a preset appearance feature threshold and a preset CIoU threshold;
if yes, determining the appearance feature distance cost between the high-score detection result in the first detection legacy set and the prediction result in the second prediction result set using the cosine similarity and a preset coefficient;
and if not, setting the appearance feature distance cost between the high-score detection result in the first detection legacy set and the prediction result in the second prediction result set to 1.
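For illustration only, a minimal Python sketch of the appearance-cost gating recited at the end of claim 1. The embeddings are assumed to be NumPy vectors; the threshold values and the coefficient are illustrative assumptions (the claim only states that they are preset), and the translated wording leaves it open whether the similarity itself or a distance derived from it is thresholded.

import numpy as np

def appearance_cost(det_feat, pred_feat, ciou_cost,
                    feat_thresh=0.25, ciou_thresh=0.9, coeff=0.5):
    # Appearance-feature distance cost between one leftover high-score
    # detection and one unmatched prediction. Threshold and coefficient
    # values are assumptions, not taken from the patent.
    sim = float(np.dot(det_feat, pred_feat) /
                (np.linalg.norm(det_feat) * np.linalg.norm(pred_feat) + 1e-12))
    # Gate on BOTH quantities being below their preset thresholds, as recited.
    if sim < feat_thresh and ciou_cost < ciou_thresh:
        return coeff * sim   # cost from the cosine similarity and coefficient
    return 1.0               # otherwise the cost is fixed to 1 (pair gated out)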
2. The target tracking method according to claim 1, wherein said detecting each target in the current frame to obtain a detection result set comprises:
detecting the current frame with a target detector to obtain a detection result and a detection score corresponding to each target, and constructing the detection result set based on the detection results.
3. The target tracking method according to claim 2, wherein said determining a high-score detection result set and a low-score detection result set in the detection result set comprises:
determining a preset detection score threshold, and determining the high-score detection result set and the low-score detection result set in the detection result set based on the detection score threshold and the detection score corresponding to each target.
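As an illustrative sketch of claims 2 and 3, splitting detector output by a preset score threshold; the tuple layout of a detection and the 0.5 value are assumptions, not taken from the patent.

def split_detections(detections, score_thresh=0.5):
    # Each detection is assumed to be a (box, score, feature) tuple.
    # Scores at or above the preset threshold go to the high-score set,
    # the rest to the low-score set.
    high = [d for d in detections if d[1] >= score_thresh]
    low = [d for d in detections if d[1] < score_thresh]
    return high, low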
4. The target tracking method according to claim 1, further comprising, before said matching the high-score detection result set with the prediction result of the previous frame:
predicting the tracking result of the previous frame using a Kalman prediction method to obtain the prediction result of the previous frame.
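Claim 4 only requires a Kalman prediction step. As an illustrative sketch, a constant-velocity model over box center and size is a common choice; the 8-dimensional state layout and the noise level are assumptions.

import numpy as np

def kalman_predict(mean, cov, dt=1.0, q=1e-2):
    # One constant-velocity prediction step. Assumed state layout:
    # [cx, cy, w, h, vx, vy, vw, vh].
    n = 4
    F = np.eye(2 * n)
    F[:n, n:] = dt * np.eye(n)   # positions advance by velocity * dt
    Q = q * np.eye(2 * n)        # assumed process-noise covariance
    return F @ mean, F @ cov @ F.T + Q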
5. The target tracking method according to claim 1, wherein said determining a target cost value from the CIoU distance cost and the appearance feature distance cost, and determining a matching result based on the target cost value, comprises:
taking the smaller of the CIoU distance cost and the appearance feature distance cost as the target cost value, and judging whether the target cost value is smaller than a preset cost threshold;
if the target cost value is smaller than the preset cost threshold, the matching result is a successful match;
and if the target cost value is not smaller than the preset cost threshold, the matching result is a failed match.
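A sketch of claim 5's fusion rule, for illustration only: the target cost is the element-wise minimum of the two cost matrices, and pairs whose fused cost is not below the preset threshold are rejected. The claim does not name an assignment solver; the Hungarian algorithm via SciPy is an assumed choice, as is the 0.8 threshold.

import numpy as np
from scipy.optimize import linear_sum_assignment

def match_by_fused_cost(ciou_costs, app_costs, cost_thresh=0.8):
    # Element-wise minimum of the CIoU and appearance cost matrices
    # gives the target cost values (claim 5).
    fused = np.minimum(ciou_costs, app_costs)
    rows, cols = linear_sum_assignment(fused)   # assumed Hungarian matching
    # Keep only pairs whose target cost is below the preset cost threshold;
    # the rest count as match failures.
    return [(r, c) for r, c in zip(rows, cols) if fused[r, c] < cost_thresh]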
6. A target tracking device, comprising:
a detection result acquisition module, configured to detect each target in the current frame to obtain a detection result set, and to determine a high-score detection result set and a low-score detection result set in the detection result set;
a first matching module, configured to match the high-score detection result set with the prediction result of the previous frame, update the corresponding prediction result with each successfully matched high-score detection result and put the updated prediction result into a tracking set, and put the high-score detection results and the prediction results that fail to match into a first detection legacy set and a first prediction result set, respectively;
a second matching module, configured to match the low-score detection result set with the first prediction result set, update the corresponding prediction result with each successfully matched low-score detection result and put the updated prediction result into the tracking set, discard the low-score detection results that fail to match, and put the prediction results that fail to match into a second prediction result set;
a third matching module, configured to match the first detection legacy set with the second prediction result set, update the corresponding prediction result with each successfully matched high-score detection result and put the updated prediction result into the tracking set, and put the high-score detection results and the prediction results that fail to match into a second detection legacy set and a lost tracking set, respectively;
a tracking result acquisition module, configured to initialize the high-score detection results in the second detection legacy set and put the initialized high-score detection results into the tracking set, so as to return a target tracking result based on the tracking set;
wherein the third matching module is specifically configured to:
determine a CIoU distance cost between a high-score detection result in the first detection legacy set and a prediction result in the second prediction result set based on a CIoU method;
determine an appearance feature distance cost between a high-score detection result in the first detection legacy set and a prediction result in the second prediction result set based on a cosine similarity method;
and determine a target cost value from the CIoU distance cost and the appearance feature distance cost, and determine a matching result based on the target cost value;
and wherein the third matching module is further configured to:
determine a cosine similarity using the high-score detection result in the first detection legacy set and the prediction result in the second prediction result set;
judge whether the cosine similarity and the CIoU distance cost are respectively smaller than a preset appearance feature threshold and a preset CIoU threshold;
if yes, determine the appearance feature distance cost between the high-score detection result in the first detection legacy set and the prediction result in the second prediction result set using the cosine similarity and a preset coefficient;
and if not, set the appearance feature distance cost between the high-score detection result in the first detection legacy set and the prediction result in the second prediction result set to 1.
7. An electronic device, comprising:
a memory for storing a computer program;
a processor for executing the computer program to implement the steps of the target tracking method according to any one of claims 1 to 5.
8. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the target tracking method according to any one of claims 1 to 5.
CN202310293456.0A 2023-03-24 2023-03-24 Target tracking method, device, equipment and medium Active CN115994928B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310293456.0A CN115994928B (en) 2023-03-24 2023-03-24 Target tracking method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN115994928A (en) 2023-04-21
CN115994928B (en) 2023-06-09

Family

ID=85995395

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310293456.0A Active CN115994928B (en) 2023-03-24 2023-03-24 Target tracking method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN115994928B (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114972443A (en) * 2022-06-28 2022-08-30 深圳一清创新科技有限公司 Target tracking method and device and unmanned vehicle
CN114973205A (en) * 2022-06-28 2022-08-30 深圳一清创新科技有限公司 Traffic light tracking method and device and unmanned automobile
CN115830075A (en) * 2023-02-20 2023-03-21 武汉广银飞科技发展有限公司 Hierarchical association matching method for pedestrian multi-target tracking

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022205936A1 (en) * 2021-03-30 2022-10-06 深圳市优必选科技股份有限公司 Multi-target tracking method and apparatus, and electronic device and readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Online target tracking algorithm combined with feature point matching; Liu Xingyun; Dai Shengkui; Journal of Huaqiao University (Natural Science Edition), Issue 03; full text *

Also Published As

Publication number Publication date
CN115994928A (en) 2023-04-21

Similar Documents

Publication Publication Date Title
KR102581429B1 (en) Method and apparatus for detecting obstacle, electronic device, storage medium and program
CN107886048B (en) Target tracking method and system, storage medium and electronic terminal
WO2020098708A1 (en) Lane line detection method and apparatus, driving control method and apparatus, and electronic device
EP3882820A1 (en) Node classification method, model training method, device, apparatus, and storage medium
US11783588B2 (en) Method for acquiring traffic state, relevant apparatus, roadside device and cloud control platform
WO2023273041A1 (en) Target detection method and apparatus in vehicle-road coordination, and roadside device
CN113012176A (en) Sample image processing method and device, electronic equipment and storage medium
CN112528927B (en) Confidence determining method based on track analysis, road side equipment and cloud control platform
EP4170561A1 (en) Method and device for improving performance of data processing model, storage medium and electronic device
CN113112542A (en) Visual positioning method and device, electronic equipment and storage medium
CN116645396A (en) Track determination method, track determination device, computer-readable storage medium and electronic device
CN113569657A (en) Pedestrian re-identification method, device, equipment and storage medium
CN112917467B (en) Robot positioning and map building method and device and terminal equipment
US20230245429A1 (en) Method and apparatus for training lane line detection model, electronic device and storage medium
CN115205855A (en) Vehicle target identification method, device and equipment fusing multi-scale semantic information
CN109242882B (en) Visual tracking method, device, medium and equipment
CN115994928B (en) Target tracking method, device, equipment and medium
CN112819889A (en) Method and device for determining position information, storage medium and electronic device
CN117274370A (en) Three-dimensional pose determining method, three-dimensional pose determining device, electronic equipment and medium
CN111626990A (en) Target detection frame processing method and device and electronic equipment
CN112561956B (en) Video target tracking method and device, electronic equipment and storage medium
US20210390334A1 (en) Vehicle association method and device, roadside equipment and cloud control platform
CN113688920A (en) Model training and target detection method and device, electronic equipment and road side equipment
CN111401285A (en) Target tracking method and device and electronic equipment
CN114049615B (en) Traffic object fusion association method and device in driving environment and edge computing equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant