CN111429477B - Target tracking method and device, storage medium and computer equipment - Google Patents

Target tracking method and device, storage medium and computer equipment Download PDF

Info

Publication number
CN111429477B
CN111429477B (application CN202010286778.9A)
Authority
CN
China
Prior art keywords
frame
target
tracking
frames
tracking target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010286778.9A
Other languages
Chinese (zh)
Other versions
CN111429477A (en)
Inventor
王非凡
何佳伟
熊佳
刘艳艳
邓麟
彭晓峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Spreadtrum Communications Shanghai Co Ltd
Original Assignee
Spreadtrum Communications Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Spreadtrum Communications Shanghai Co Ltd filed Critical Spreadtrum Communications Shanghai Co Ltd
Priority to CN202010286778.9A priority Critical patent/CN111429477B/en
Publication of CN111429477A publication Critical patent/CN111429477A/en
Application granted granted Critical
Publication of CN111429477B publication Critical patent/CN111429477B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence

Abstract

A target tracking method and device, a storage medium and computer equipment are provided. The method comprises the following steps: acquiring a target video stream, wherein the target video stream comprises a plurality of frame images; identifying a tracking target from each frame image of the target video stream; when multi-frame stable tracking is maintained, judging whether the tracking target remains stationary; and if so, setting the position of the tracking target in the current frame to be the same as the position of the tracking target in the previous frame of the current frame. By the method, target tracking performance can be effectively improved.

Description

Target tracking method and device, storage medium and computer equipment
Technical Field
The invention relates to the field of computer vision, in particular to a target tracking method and device, a storage medium and computer equipment.
Background
Moving target tracking is one of the important subjects in the field of computer vision and plays an important role in intelligent transportation, video monitoring, human-computer interaction, medical research and the like. Under complex tracking conditions, achieving real-time and stable tracking of a moving target still faces considerable challenges, mainly including: scale change and shape change of the tracking target, occlusion or disappearance of the tracking target, blurring of the captured images, and the like. Scale change of the tracking target means that when the target size increases or decreases, if the tracking frame cannot scale adaptively, the information of the target model is updated incorrectly or incompletely. Shape change of the tracking target means that when the moving target undergoes a posture change, such as rotation or slight deformation, tracking easily fails. Occlusion or disappearance of the tracking target means that the target may be occluded or temporarily disappear during motion, so that the tracking frame comes to contain the occluder or other background information, causing the tracking frame in subsequent frames to drift onto the occluder or tracking to fail. Blurring of the captured images means that when the illumination intensity changes, the target moves rapidly, or the moving target resembles the background, a distinctive feature must be selected to separate the target from the background.
To overcome the above challenges, various algorithms have been proposed in the prior art, among which the Mean Shift algorithm is a classic target tracking algorithm based on non-parametric estimation of density gradient ascent. Although target tracking algorithms built around mean shift further improve tracking accuracy and robustness by adding features, weights and other refinements, in practical applications there remain the problem of visual jitter of the tracking frame caused by scale or background changes, and the problem of erroneous drift or tracking failure caused by occlusion or disappearance, which degrade target tracking performance.
Disclosure of Invention
The invention provides a target tracking method and device, a storage medium and computer equipment to improve the target tracking performance.
To solve the above technical problem, an embodiment of the present invention provides a target tracking method, where the method includes: acquiring a target video stream, wherein the target video stream comprises a plurality of frame images; identifying a tracking target from each frame image of the target video stream; when multi-frame stable tracking is maintained, judging whether the tracking target remains stationary; and if so, setting the position of the tracking target in the current frame to be the same as the position of the tracking target in the previous frame of the current frame.
Optionally, the determining whether the tracking target remains stationary includes: acquiring the last several frames including the current frame, and judging whether the sizes of the tracking target in those frames are close; when the sizes of the tracking target in those frames are close, calculating the brightness difference values between the frames; and when the brightness difference value is smaller than the brightness threshold value, determining that the tracking target has remained stationary.
Optionally, the determining whether the sizes of the tracking target in the last several frames are close includes: for each of those frames, acquiring the long axis and the short axis of the tracking target; when the difference between the long axis of the tracking target in each frame and the long axis of the tracking target in the other frames is within a preset range, and the difference between the short axis of the tracking target in each frame and the short axis of the tracking target in the other frames is within a preset range, determining that the sizes of the tracking target in the last several frames are close.
Optionally, the calculating of the brightness difference values between the last several frames includes: dividing each of those frames into a preset number of regions, and calculating the brightness value of each divided region; for the last several frames, respectively calculating the brightness difference value between corresponding regions of the frames; and obtaining the brightness difference values between the frames from the brightness difference values between the regions.
Optionally, the luminance difference value between two frames of the last several frames is calculated according to the following formulas (7) to (9):

BlkSum_j = Σ_{i=0}^{m-1} (avg_ji × w_ji) (7)

wherein j is the frame index and takes the value 0 or 1: j = 0 denotes the 1st of the two frames and j = 1 the 2nd; m is the number of divided regions of each frame and is a constant; BlkSum_j is the weighted luminance sum of the j-th frame; avg_ji is the luminance value of the i-th region in the j-th frame; and w_ji is the preset weight of the i-th region in the j-th frame;

BlkRatio_j = BlkSum_j / (BlkSum_0 + BlkSum_1) (8)

wherein BlkRatio_j is the normalized value of BlkSum_j;

RatioDiff = |BlkRatio_0 - BlkRatio_1| (9)

wherein RatioDiff is the luminance difference value between the two frames, BlkRatio_0 is the value of BlkRatio_j when j = 0, and BlkRatio_1 is the value of BlkRatio_j when j = 1.
Optionally, the method further includes: when multi-frame stable tracking is maintained, if the tracking target disappears, calculating the target position of the tracking target in the first frame in which it disappears according to a mean shift algorithm, and beginning to accumulate the number of failed frames; when the tracking target is not successfully tracked in the next frame, keeping the target position into that frame, and adding one to the number of failed frames; and when the number of failed frames exceeds a frame-count threshold, determining that tracking of the target has failed.
Optionally, the method further includes: when the tracking target is successfully tracked in the next frame, the tracking target is retrieved.
Optionally, the position of the tracking target in each frame is represented by a target box, and after the target position is maintained to the next frame, the method further includes: and expanding the target frame of the tracking target.
An embodiment of the present invention further provides a target tracking apparatus, where the apparatus includes: a video stream acquisition module, used for acquiring a target video stream, the target video stream comprising a plurality of frame images; a tracking module, used for identifying a tracking target from each frame image of the target video stream; a static judging module, used for judging whether the tracking target remains stationary when multi-frame stable tracking is maintained; and a static anti-shake module, used for setting the position of the tracking target in the current frame to be the same as the position of the tracking target in the previous frame of the current frame if the judgment result is yes.
An embodiment of the present invention further provides a storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of any one of the above target tracking methods.
An embodiment of the invention further provides computer equipment, which comprises a memory and a processor, wherein the memory stores a computer program, and the processor implements the steps of any one of the above target tracking methods when executing the computer program.
Compared with the prior art, the technical scheme of the embodiment of the invention has the following beneficial effects:
the embodiment of the invention provides a target tracking method, which comprises the following steps: acquiring a target video stream, wherein the target video stream comprises a plurality of frame images; identifying a tracking target from each frame image of the target video stream; when the dimensional multiframe is stably tracked, judging whether the tracking target keeps still or not; if the judgment result is yes, setting the position of the tracking target in the current frame to be the same as the position of the tracking target in the previous frame of the current frame. Compared with the prior art, in the scheme of the invention, when the target video stream is tracked frame by frame, the position of a static tracking target can be maintained, so that the anti-shake effect of continuous frame tracking is realized. When the target is tracked by the mean shift method, the problem of visual jitter of the tracking frame when the relative position of the tracked target is unchanged in a complex environment is solved.
Furthermore, for the last several frames of the stably tracked sequence, the size of the tracking target and the brightness between the frames are compared to judge whether the tracking target is identified consistently across those frames; if so, the tracking target can be judged to be stationary.
Furthermore, the last several frames can each be divided into a plurality of regions, the brightness value of each frame can be calculated taking the specificity of each region into account, and the brightness comparison can then be carried out to accurately obtain the brightness comparison result between those frames, thereby accurately judging whether the tracking target is stationary.
Furthermore, a re-tracking function is realized for when the tracking target reappears after being occluded or temporarily lost, further increasing tracking robustness and meeting the long-term requirement of visual monitoring applications.
Drawings
FIG. 1 is a schematic flow chart illustrating a target tracking method according to an embodiment of the present invention;
FIG. 2 is a partial flow chart illustrating a target tracking method according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating region segmentation for the last two frames according to an embodiment of the present invention;
FIG. 4 is a partial flow chart of another target tracking method according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of a target tracking device according to an embodiment of the present invention.
Detailed Description
As described in the background art, the prior art employs a mean shift algorithm for target tracking; however, in practical applications, the problem of visual jitter of the tracking frame due to scale or background changes still exists, thereby reducing target tracking performance.
The visual jitter problem arises because, owing to the real-time nature of the mean shift algorithm, a new position of the object is predicted at every frame; when the relative position of the target object is almost unchanged while the scale changes slightly or the background changes, the calculated position fluctuates within a small range, continuously causing visual jitter in the video.
In order to solve the above problems in the prior art, an embodiment of the present invention provides a target tracking method, where the method includes: acquiring a target video stream, wherein the target video stream comprises a plurality of frame images; identifying a tracking target from each frame image of the target video stream; when multi-frame stable tracking is maintained, judging whether the tracking target remains stationary; and if so, setting the position of the tracking target in the current frame to be the same as the position of the tracking target in the previous frame of the current frame.
Through the scheme, the performance of target tracking can be effectively improved.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below.
Referring to fig. 1, fig. 1 is a schematic flow chart of a target tracking method according to an embodiment of the present invention, where the method includes the following steps:
step S101, acquiring a target video stream, and identifying a tracking target frame by frame for the target video stream, wherein the target video stream comprises a plurality of frame images;
the target video stream is a video stream captured when performing target tracking, and the video stream includes a number of frames of captured images. The target video stream may be a video stream acquired by a photographing apparatus for a certain area.
Step S102, identifying a tracking target from each frame image of the target video stream;
optionally, a mean shift method may be used to identify the tracking target from frame to frame in the target video stream to obtain the position of the tracking target in each frame of image, so as to continuously track the tracking target.
The main steps of the mean shift method are as follows: determine the position, shape, size and features of the initial tracking frame of the tracking target in the first frame; calculate the feature information of the initial frame as the target sample; and compare the information of the current frame's tracking frame with the initial frame, calculate the mean shift vector, and keep moving from the shifted mean as the new starting point until a convergence condition is met.
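For illustration only (not part of the claimed method), the loop below sketches this mean shift procedure using OpenCV's built-in meanShift; the video path, initial target box and hue-histogram feature are assumed placeholder choices:

    import cv2

    cap = cv2.VideoCapture("target_video.mp4")      # assumed video source
    ok, frame = cap.read()
    x, y, w, h = 300, 200, 100, 50                  # assumed initial target box
    track_window = (x, y, w, h)

    # Feature information of the initial frame: a hue histogram of the target ROI.
    roi = frame[y:y + h, x:x + w]
    hsv_roi = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
    roi_hist = cv2.calcHist([hsv_roi], [0], None, [180], [0, 180])
    cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)

    # Stop when the shift is below 1 pixel or after 10 iterations.
    criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1.0)

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        back_proj = cv2.calcBackProject([hsv], [0], roi_hist, [0, 180], 1)
        # Iterate the window toward the local density peak (the offset-mean step).
        _, track_window = cv2.meanShift(back_proj, track_window, criteria)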
Step S103, when multi-frame stable tracking is maintained, judging whether the tracking target remains stationary;
when the tracking target can be stably tracked in continuous multiframes in the target video stream, whether the tracking target is in a static state or not can be judged, and if yes, a static anti-shake algorithm can be started.
When the tracking frame of the current frame is determined by the mean shift algorithm, if the similarity between the current tracking frame and the initial frame satisfies a certain condition, a stability flag is output for that frame; if this holds over consecutive frames, the number of stably tracked frames is continuously recorded in a counter, which can be denoted sta_cout. When the number of stably tracked frames is greater than a frame-count threshold (such as thrd0), i.e., sta_cout > thrd0, multi-frame stable tracking is maintained for the tracking target.
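A minimal sketch of this counter logic; thrd0 = 15 and the per-frame stability flag are assumed inputs, not values fixed by the embodiment:

    THRD0 = 15  # assumed frame-count threshold; the embodiment leaves thrd0 free

    class StabilityCounter:
        """Counts consecutive frames whose tracking frame is similar enough to
        the initial frame (the sta_cout counter described above)."""

        def __init__(self) -> None:
            self.sta_cout = 0

        def update(self, stable_flag: bool) -> bool:
            # Reset on any unstable frame; otherwise extend the streak.
            self.sta_cout = self.sta_cout + 1 if stable_flag else 0
            return self.sta_cout > THRD0  # multi-frame stable tracking held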
And step S104, if the judgment result is yes, setting the position of the tracking target in the current frame to be the same as the position of the tracking target in the previous frame of the current frame.
And when the tracking target is judged to be in a static state, setting the position of the tracking target in the current frame and the position of the tracking target in the previous frame to be the same position.
For the mean shift method, the position of the tracking target in each frame is the tracking frame in each frame, that is, the positions of the tracking frames of the static tracking target in continuous multiple frames are kept consistent, so that the anti-shake effect is realized.
By the target tracking method, when the target video stream is tracked frame by frame, the position of a static tracking target can be maintained, so that the anti-shake effect of continuous frame tracking is realized. When the target is tracked by the mean shift method, the problem of visual jitter of the tracking frame when the relative position of the tracked target is unchanged in a complex environment is solved.
In an embodiment, with continued reference to fig. 1, the determining whether the tracking target remains stationary in step S103 of fig. 1 may include: acquiring the last several frames including the current frame, and judging whether the sizes of the tracking target in those frames are close; when the sizes are close, calculating the brightness difference values between those frames; and when the brightness difference value is smaller than the brightness threshold value, determining that the tracking target has remained stationary.
When multi-frame stable tracking continues, the last several frames of the stably tracked sequence (i.e., the current frame and several frames immediately preceding it, the current frame being the last of them) can be analyzed to determine whether the tracking target has remained stationary.
When the tracking target remains stationary, the size of the tracking target in the last several consecutively tracked frames should be close and the brightness of those frames should be similar; the analysis can therefore proceed from two aspects: the size of the tracking target and the brightness difference between adjacent frames.
The brightness threshold is a preset value for determining the brightness similarity of the last frames, and can be set to a constant value according to the requirement.
In this embodiment, the sizes of the tracking targets and the brightness between the multiple frames are compared for the last multiple frames of the multiple frames that are stably tracked to determine whether the identified situations of the tracking targets in the last multiple frames are close, and if so, it can be determined that the tracking target remains still.
In an embodiment, the determining whether the sizes of the tracking target in the last several frames are close may include: for each of those frames, acquiring the long axis and the short axis of the tracking target; when the difference between the long axis of the tracking target in each frame and the long axis of the tracking target in the other frames is within a preset range, and the difference between the short axis in each frame and the short axis in the other frames is within a preset range, judging that the sizes of the tracking target in the last several frames are close.
The size of the target calculated by the mean shift algorithm is represented by a covariance matrix (i.e., an ellipse), and the major axis and minor axis of the ellipse can be obtained to represent the size of the tracking target. When the difference between the major axis (or minor axis) of the tracking target in each of the last several frames and that of the other frames does not exceed a preset range, the sizes of the tracking target in those frames are judged to be close.
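For illustration, the axes can be recovered from the 2×2 covariance matrix via its eigenvalues; the factor of 2 converting standard deviations into full axis lengths is an assumption, not specified by the embodiment:

    import numpy as np

    def ellipse_axes(cov: np.ndarray) -> tuple[float, float]:
        """Major and minor axes of the ellipse described by a 2x2 covariance."""
        eigvals = np.linalg.eigvalsh(cov)      # ascending; non-negative for a covariance
        minor, major = 2.0 * np.sqrt(eigvals)  # assumed scale: 2 standard deviations
        return major, minor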
For example, when the last several frames are the last three frames, denote them as the previous frame, the last frame and the current frame. Denote the major axis of the tracking target in each frame as axis0 and the minor axis as axis1, so the major and minor axes of the three frames are written (prv_axis0, prv_axis1), (last_axis0, last_axis1) and (cur_axis0, cur_axis1). The size relationship of the tracking target between the last three frames should satisfy the following formulas (1) to (6) to ensure that the differences between the major axes (and minor axes) of the last three frames are within a preset range:
prv_axis0×thrd1<last_axis0<prv_axis0×thrd2 (1)
last_axis0×thrd1<cur_axis0<last_axis0×thrd2 (2)
cur_axis0×thrd1<prv_axis0<cur_axis0×thrd2 (3)
prv_axis1×thrd1<last_axis1<prv_axis1×thrd2 (4)
last_axis1×thrd1<cur_axis1<last_axis1×thrd2 (5)
cur_axis1×thrd1<prv_axis1<cur_axis1×thrd2 (6)
wherein thrd1 and thrd2 are preset coefficients and can be set as any constant according to requirements.
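A sketch of the pairwise checks in formulas (1) to (6), with thrd1 = 0.9 and thrd2 = 1.1 as assumed example coefficients:

    THRD1, THRD2 = 0.9, 1.1  # assumed example coefficients

    def axes_close(axes: list[tuple[float, float]]) -> bool:
        """axes holds (axis0, axis1) for the previous, last and current frames.
        True when every frame's major and minor axes fall within the preset
        range of every other frame's, as in formulas (1) to (6)."""
        for a0, a1 in axes:
            for b0, b1 in axes:
                if not (b0 * THRD1 < a0 < b0 * THRD2):
                    return False
                if not (b1 * THRD1 < a1 < b1 * THRD2):
                    return False
        return True

    # Usage: axes_close([(prv_axis0, prv_axis1), (last_axis0, last_axis1),
    #                    (cur_axis0, cur_axis1)])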
In an embodiment, the above calculating the luminance difference values between the last frames may include steps S201 to S203 in fig. 2, where:
step S201, dividing each frame in the last frames into preset areas, and calculating the brightness value of each divided area;
alternatively, the picture of each of the last frames is divided into 3 × 3 — 9 regions (blocks), and the average luminance of each region is calculated.
Step S202, for the last frames, respectively calculating the brightness difference value between each area of each frame;
step S203, the brightness difference value among the last frames is obtained according to the brightness difference value among each area of each frame.
When the last several frames are compared in brightness, the brightness difference (or average brightness difference) between corresponding regions of the frames can be computed to obtain the brightness comparison result between corresponding regions, and from these the brightness comparison result between the frames.
Optionally, a weight may be added to each region of each frame to increase the specificity of the region, and the weight of each region may be specifically set according to the actual brightness situation. For example, of the 9 divided regions per frame, the middle region has the largest weight, i.e., the middle region has the strongest specificity.
Alternatively, the luminance difference value between the two frames is calculated according to the following formula (7), formula (8) and formula (9); for example, the last two frames (the last frame and the current frame) can be selected as the two frames for luminance comparison:

BlkSum_j = Σ_{i=0}^{m-1} (avg_ji × w_ji) (7)

wherein j is the frame index and takes the value 0 or 1: j = 0 denotes the 1st of the two frames and j = 1 the 2nd; m is the number of divided regions of each frame and is a constant; BlkSum_j is the weighted luminance sum of the j-th frame; avg_ji is the luminance value of the i-th region in the j-th frame; and w_ji is the preset weight of the i-th region in the j-th frame;

BlkRatio_j = BlkSum_j / (BlkSum_0 + BlkSum_1) (8)

wherein BlkRatio_j is the normalized value of BlkSum_j;

RatioDiff = |BlkRatio_0 - BlkRatio_1| (9)

wherein RatioDiff is the luminance difference value between the two frames, BlkRatio_0 is the value of BlkRatio_j when j = 0, and BlkRatio_1 is the value of BlkRatio_j when j = 1.
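The following sketch implements formulas (7) to (9) as written above for a 3 × 3 grid; the center-heavy weight mask and the normalization by the two frames' sum are assumptions:

    import numpy as np

    # Center-heavy 3x3 weight mask (an assumption; the middle region is said
    # to carry the largest weight).
    WEIGHTS = np.array([[1, 1, 1],
                        [1, 2, 1],
                        [1, 1, 1]], dtype=float).ravel()

    def block_means(gray: np.ndarray, grid: int = 3) -> np.ndarray:
        """Average luminance avg_ji of each of the grid x grid regions."""
        h, w = gray.shape
        return np.array([gray[r * h // grid:(r + 1) * h // grid,
                              c * w // grid:(c + 1) * w // grid].mean()
                         for r in range(grid) for c in range(grid)])

    def ratio_diff(gray0: np.ndarray, gray1: np.ndarray) -> float:
        blk_sum = [WEIGHTS @ block_means(g) for g in (gray0, gray1)]  # formula (7)
        blk_ratio = [s / (blk_sum[0] + blk_sum[1]) for s in blk_sum]  # formula (8)
        return abs(blk_ratio[0] - blk_ratio[1])                      # formula (9)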
Referring to fig. 3, fig. 3 provides a schematic diagram of region segmentation for the last two frames, where each segmented region is denoted avg_ji, with j = 0 or 1 (j = 0 denoting the 1st of the two frames and j = 1 the 2nd) and i = 0, 1, 2, …, 8 indexing the 9 regions into which each frame is divided.
In this embodiment, the last frames may be divided into a plurality of regions, the luminance value of each frame may be calculated by combining the specificity of each region, and then the luminance comparison may be performed to accurately obtain the luminance comparison result between the last frames, thereby accurately determining whether the tracking target is still.
Optionally, for the target tracking method in fig. 1 and fig. 2, a confidence threshold calculation may be introduced, in which the size of the tracking target in the last several frames and the brightness difference between those frames each serve as a condition for the confidence calculation; each time a condition is met, the confidence is incremented by one, and when the confidence exceeds a preset value, the tracking target is determined to be stationary and the still-image anti-shake processing is performed, that is, the operation of step S104 in fig. 1.
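A sketch of this confidence gate; the preset value is an assumed example, and the two stillness conditions are supplied by the size and brightness checks above:

    CONF_THRESH = 1  # assumed preset value that the confidence must exceed

    def hold_if_static(sizes_close: bool, brightness_close: bool,
                       prev_window: tuple, cur_window: tuple) -> tuple:
        """Still-image anti-shake: keep the previous frame's tracking window
        when enough stillness conditions vote the target static."""
        confidence = int(sizes_close) + int(brightness_close)
        return prev_window if confidence > CONF_THRESH else cur_window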
Optionally, please continue to refer to fig. 1, after step S103 in fig. 1 determines whether the tracked target remains stationary, if the determination result is negative, the subsequent processing flow is exited, and the target tracking is continued to be performed by using the mean shift algorithm.
On the other hand, for a tracking target that temporarily disappears or is partially occluded, the prior-art mean shift method continuously searches for a region similar to the initial-frame target sample during its real-time iterative calculation; if some similarity condition is met, tracking mistakenly shifts to a new (false) target region, so that subsequent tracking goes wrong, and if no such region is found, tracking failure is declared.
To solve the problem, referring to fig. 4, the target tracking method according to the embodiment of the present invention may further include the following steps S401 to S403, where:
step S401, when multi-frame stable tracking is maintained, if the tracking target disappears, calculating the target position of the tracking target in the disappeared first frame according to a mean shift algorithm, and beginning to accumulate the number of failure frames;
on the premise of multi-frame stable tracking, that is, when the number of continuously tracked frames is judged to be greater than the preset number of frames (thrd4), if the tracked target disappears in the tracked picture, the tracking is not finished, but the tracked target enters a lost and to-be-recovered state, at this time, the position of the tracked target is continuously predicted according to the mean shift algorithm, and the number of failed frames, that is, the number of failed frames (which can be set as the stay _ cout), starts to be accumulated.
Optionally, at this time, a temporary failure status may be reported.
Step S402, when the tracking target is not successfully tracked in the next frame, keeping the target position to the next frame, and adding one to the number of failed frames;
and then, when the tracking target is not successfully tracked by one more frame, adding one to the number of failed frames.
Step S403, when the number of failed frames exceeds the frame number threshold, failing to track the tracking target.
When the number of failed frames exceeds a frame-count threshold (which may be denoted thrd), i.e., stay_cout > thrd, it is determined that tracking has failed.
Optionally, after the tracking fails, the mean shift algorithm is exited, and the tracking failure is reported.
Optionally, the maintaining the target position until after the next frame further includes: when the tracking target is successfully tracked in the next frame, the tracking target is retrieved.
If the tracked target is identified again within the frame number threshold, the function of finding the tracked target after occlusion/temporary disappearance is realized.
Optionally, the position of the tracking target in each frame is represented by a target box, and the keeping the target position until after the next frame further includes: and expanding the target frame of the tracking target.
After entering the lost, to-be-retrieved state, feature information is extracted at the target-frame position of the next frame (which still holds the kept target position) and compared with the initial frame; at the same time, feature information is also extracted at the position of the expanded target frame and compared with a standby initial frame (the initial frame expanded by the same ratio).
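A sketch of this lost-and-to-be-retrieved state machine (steps S401 to S403); the frame threshold thrd and the expansion ratio are assumed values, and `matched` stands for the feature comparison against the initial frame or the standby initial frame:

    THRD = 30     # assumed failure-frame threshold
    EXPAND = 1.5  # assumed expansion ratio for the standby search box

    def expand_box(box: tuple, ratio: float = EXPAND) -> tuple:
        """Grow the target frame about its center by the given ratio."""
        x, y, w, h = box
        return (x - w * (ratio - 1) / 2, y - h * (ratio - 1) / 2,
                w * ratio, h * ratio)

    class LostState:
        """Lost, to-be-retrieved state: the target position is kept frame to
        frame while the failure-frame counter (stay_cout) accumulates."""

        def __init__(self, last_box: tuple) -> None:
            self.box = last_box
            self.stay_cout = 0

        def step(self, matched: bool) -> str:
            # `matched`: this frame re-identified the target, either at
            # self.box against the initial frame or inside
            # expand_box(self.box) against the standby initial frame.
            if matched:
                return "retrieved"
            self.stay_cout += 1
            return "failed" if self.stay_cout > THRD else "pending"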
In this embodiment, the re-tracking function for when the tracking target reappears after being occluded or temporarily lost is realized, further increasing tracking robustness and meeting the long-term requirement of visual monitoring applications.
Referring to fig. 5, the present invention further provides a target tracking device, which includes:
a video stream obtaining module 501, configured to obtain a target video stream, where the target video stream includes a plurality of frame images;
a tracking module 502, configured to identify a tracking target from each frame image of the target video stream;
a stillness determination module 503, configured to determine whether the tracking target remains stationary when multi-frame stable tracking is maintained;
and the static anti-shake module 504 is configured to set the position of the tracking target in the current frame to be the same as the position of the tracking target in the previous frame of the current frame if the determination result is yes.
In one embodiment, the stillness determination module 503 includes:
the size comparison unit is used for acquiring a plurality of last frames including a current frame and judging whether the sizes of the tracking targets in the plurality of last frames are close or not;
the brightness comparison unit is used for calculating the brightness difference values among the last frames when the sizes of the tracking targets in the last frames are close;
and the static determining unit is used for determining that the tracking target remains stationary when the brightness difference value is smaller than the brightness threshold value.
In one embodiment, the size comparing unit includes:
a long and short axis obtaining subunit, configured to obtain, for each of the last frames, a long axis and a short axis of the tracking target;
a size comparing subunit, configured to determine that the sizes of the tracking target in the last several frames are close when the difference between the long axis of the tracking target in each frame and the long axis of the tracking target in the other frames is within a preset range, and the difference between the short axis of the tracking target in each frame and the short axis of the tracking target in the other frames is within a preset range.
In one embodiment, the brightness contrast unit includes:
the region dividing subunit is used for dividing each frame in the last frames into preset regions and calculating the brightness value of each divided region;
the regional brightness contrast subunit is used for respectively calculating the brightness difference value between each region of each frame for the last frames;
and the frame brightness contrast subunit is used for obtaining the brightness difference values among the last frames according to the brightness difference value among each area of each frame.
Optionally, the luminance difference value between two frames of the last several frames is calculated according to the following formulas (7) to (9):

BlkSum_j = Σ_{i=0}^{m-1} (avg_ji × w_ji) (7)

wherein j is the frame index and takes the value 0 or 1: j = 0 denotes the 1st of the two frames and j = 1 the 2nd; m is the number of divided regions of each frame and is a constant; BlkSum_j is the weighted luminance sum of the j-th frame; avg_ji is the luminance value of the i-th region in the j-th frame; and w_ji is the preset weight of the i-th region in the j-th frame;

BlkRatio_j = BlkSum_j / (BlkSum_0 + BlkSum_1) (8)

wherein BlkRatio_j is the normalized value of BlkSum_j;

RatioDiff = |BlkRatio_0 - BlkRatio_1| (9)

wherein RatioDiff is the luminance difference value between the two frames, BlkRatio_0 is the value of BlkRatio_j when j = 0, and BlkRatio_1 is the value of BlkRatio_j when j = 1.
In one embodiment, the target tracking apparatus of the embodiment of the present invention further includes:
the disappearance detection module is used for calculating the target position of the tracking target in a disappeared first frame according to a mean shift algorithm and starting to accumulate the number of failure frames if the tracking target disappears when the multi-frame stable tracking is maintained;
the tracking maintaining module is used for maintaining the target position to the next frame when the tracking target is not successfully tracked in the next frame, and adding one to the number of failed frames;
and the tracking failure module is used for failing to track the tracking target when the number of the failure frames exceeds a frame number threshold value.
In one embodiment, the target tracking apparatus of the embodiment of the present invention further includes:
and the tracking and retrieving module is used for retrieving the tracking target when the tracking target is successfully tracked in the next frame.
In one embodiment, the position of the tracking target in each frame is represented by a target frame, and the target tracking apparatus of the embodiment of the present invention further includes:
and the target frame expanding module is used for expanding the target frame of the tracking target.
For more details of the working principle and working mode of the target tracking device, reference may be made to the description of the target tracking method in fig. 1 to 4, which is not repeated herein.
Further, an embodiment of the present invention further discloses a storage medium, on which computer instructions are stored, and when the computer instructions are executed, the technical solution of the target tracking method in the embodiments shown in fig. 1 to fig. 4 is executed. Preferably, the storage medium may include a computer-readable storage medium such as a non-volatile (non-volatile) memory or a non-transitory (non-transient) memory. The storage medium may include ROM, RAM, magnetic or optical disks, and the like.
Further, an embodiment of the present invention further discloses a computer device, which includes a memory and a processor, where the memory stores computer instructions executable on the processor, and the processor executes the technical solution of the target tracking method in the embodiments shown in fig. 1 to 4 when executing the computer instructions. The computer device may be a mobile phone, a server, a tablet computer, a smart watch or other equipment.
The technical solution of the invention can be applied to 5G (5th Generation) communication systems, to 4G and 3G communication systems, and to various future communication systems such as 6G and 7G.
It should be understood that the term "and/or" herein merely describes an association relationship between associated objects, meaning that three relationships may exist; e.g., A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein indicates that the former and latter associated objects are in an "or" relationship.
The "plurality" appearing in the embodiments of the present application means two or more.
The descriptions of the first, second, etc. appearing in the embodiments of the present application are only for illustrating and differentiating the objects, and do not represent the order or the particular limitation of the number of the devices in the embodiments of the present application, and do not constitute any limitation to the embodiments of the present application.
The term "connect" in the embodiments of the present application refers to various connection manners, such as direct connection or indirect connection, to implement communication between devices, which is not limited in this embodiment of the present application.
Although the present invention is disclosed above, the present invention is not limited thereto. Various changes and modifications may be effected therein by one skilled in the art without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (9)

1. A method of target tracking, the method comprising:
acquiring a target video stream, wherein the target video stream comprises a plurality of frame images;
identifying a tracking target from each frame image of the target video stream;
when multi-frame stable tracking is maintained, judging whether the tracking target remains stationary;
if the judgment result is yes, setting the position of the tracking target in the current frame to be the same as the position of the tracking target in the previous frame of the current frame;
wherein the determining whether the tracking target remains stationary comprises:
acquiring a plurality of last frames including a current frame, and judging whether the sizes of the tracking targets in the plurality of last frames are close;
when the sizes of the tracking targets in the last frames are close, calculating brightness difference values among the last frames;
when the brightness difference value is smaller than a brightness threshold value, determining that the brightness of the last frames is similar and that the tracking target has remained stationary;
the determining whether the sizes of the tracking targets in the last frames are close includes:
for each frame in the last frames, acquiring a long axis and a short axis of the tracking target;
when the difference between the long axis of the tracking target in each frame and the long axis of the tracking target in other frames is within a preset range, and the difference between the short axis of the tracking target in each frame and the short axis of the tracking target in other frames is within a preset range, determining that the sizes of the tracking targets in the last frames are close.
2. The method of claim 1, wherein said calculating the luminance difference between the last frames comprises:
dividing each frame in the last frames into preset regions, and calculating the brightness value of each divided region;
for the last frames, respectively calculating the brightness difference value between each region of each frame;
and obtaining the brightness difference values among the last frames according to the brightness difference value among each area of each frame.
3. The method of claim 2, wherein calculating the luminance difference between two of the last frames according to the following formula comprises:
BlkSum_j = Σ_{i=0}^{m-1} (avg_ji × w_ji) (7)

wherein j is the frame index and takes the value 0 or 1: j = 0 denotes the 1st of the two frames and j = 1 the 2nd; m is the number of divided regions of each frame and is a constant; BlkSum_j is the weighted luminance sum of the j-th frame; avg_ji is the luminance value of the i-th region in the j-th frame; and w_ji is the preset weight of the i-th region in the j-th frame;

BlkRatio_j = BlkSum_j / (BlkSum_0 + BlkSum_1) (8)

wherein BlkRatio_j is the normalized value of BlkSum_j;

RatioDiff = |BlkRatio_0 - BlkRatio_1| (9)

wherein RatioDiff is the luminance difference value between the two frames, BlkRatio_0 is the value of BlkRatio_j when j = 0, and BlkRatio_1 is the value of BlkRatio_j when j = 1.
4. The method of claim 1, further comprising:
when multi-frame stable tracking is maintained, if the tracking target disappears, calculating the target position of the tracking target in the disappeared first frame according to a mean shift algorithm, and beginning to accumulate the number of failure frames;
when the tracking target is not successfully tracked in the next frame, keeping the target position to the next frame, and adding one to the number of failed frames;
and when the number of the failed frames exceeds a frame number threshold, the tracking of the tracking target is failed.
5. The method of claim 4, further comprising:
when the tracking target is successfully tracked in the next frame, the tracking target is retrieved.
6. The method of claim 4, wherein the position of the tracking target in each frame is represented by a target box, and wherein maintaining the target position until after the next frame further comprises:
and expanding the target frame of the tracking target.
7. An object tracking device, the device comprising:
the video stream acquisition module is used for acquiring a target video stream, and the target video stream comprises a plurality of frame images;
the tracking module is used for identifying a tracking target from each frame image of the target video stream;
the static judging module is used for judging whether the tracking target remains stationary when multi-frame stable tracking is maintained;
the static anti-shake module is used for setting the position of the tracking target in the current frame to be the same as the position of the tracking target in the previous frame of the current frame if the judgment result is yes;
the standstill determination module includes: the size comparison unit is used for acquiring a plurality of last frames including a current frame and judging whether the sizes of the tracking targets in the plurality of last frames are close or not; the brightness comparison unit is used for calculating the brightness difference values among the last frames when the sizes of the tracking targets in the last frames are close; a still determination unit, configured to, when the brightness difference is smaller than a brightness threshold, make brightness of the last frames similar, and keep the tracking target still;
the size comparison unit includes: a long and short axis obtaining subunit, configured to obtain, for each of the last several frames, a long axis and a short axis of the tracking target; a size comparing subunit, configured to, when a difference between the long axis of the tracking target in each frame and the long axis of the tracking target in another frame is within a preset range, and a difference between the short axis of the tracking target in each frame and the short axis of the tracking target in another frame is within a preset range, make the sizes of the tracking targets in the last frames similar.
8. A storage medium on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 6.
9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 6 when executing the computer program.
CN202010286778.9A 2020-04-13 2020-04-13 Target tracking method and device, storage medium and computer equipment Active CN111429477B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010286778.9A CN111429477B (en) 2020-04-13 2020-04-13 Target tracking method and device, storage medium and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010286778.9A CN111429477B (en) 2020-04-13 2020-04-13 Target tracking method and device, storage medium and computer equipment

Publications (2)

Publication Number Publication Date
CN111429477A CN111429477A (en) 2020-07-17
CN111429477B (en) 2022-08-26

Family

ID=71556258

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010286778.9A Active CN111429477B (en) 2020-04-13 2020-04-13 Target tracking method and device, storage medium and computer equipment

Country Status (1)

Country Link
CN (1) CN111429477B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112070036A (en) * 2020-09-11 2020-12-11 联通物联网有限责任公司 Target detection method and device based on multi-frame pictures and storage medium
CN113392820B (en) * 2021-08-17 2021-11-30 南昌虚拟现实研究院股份有限公司 Dynamic gesture recognition method and device, electronic equipment and readable storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106875415B (en) * 2016-12-29 2020-06-02 北京理工雷科电子信息技术有限公司 Continuous and stable tracking method for small and weak moving targets in dynamic background
CN109118516A (en) * 2018-07-13 2019-01-01 高新兴科技集团股份有限公司 A kind of target is from moving to static tracking and device
CN109325967B (en) * 2018-09-14 2023-04-07 腾讯科技(深圳)有限公司 Target tracking method, device, medium, and apparatus
CN109377518A (en) * 2018-09-29 2019-02-22 佳都新太科技股份有限公司 Target tracking method, device, target tracking equipment and storage medium
CN110555862A (en) * 2019-08-23 2019-12-10 北京数码视讯技术有限公司 Target tracking method, device, electronic equipment and computer-readable storage medium

Also Published As

Publication number Publication date
CN111429477A (en) 2020-07-17

Similar Documents

Publication Publication Date Title
CN107886048B (en) Target tracking method and system, storage medium and electronic terminal
US11164323B2 (en) Method for obtaining image tracking points and device and storage medium thereof
KR100544677B1 (en) Apparatus and method for the 3D object tracking using multi-view and depth cameras
AU2016352215B2 (en) Method and device for tracking location of human face, and electronic equipment
US9785245B2 (en) Image processing apparatus, image processing method, and program for recognizing a gesture
US8289402B2 (en) Image processing apparatus, image pickup apparatus and image processing method including image stabilization
CN113286194A (en) Video processing method and device, electronic equipment and readable storage medium
US8311351B2 (en) Apparatus and method for improving frame rate using motion trajectory
CN109033955B (en) Face tracking method and system
JP5484184B2 (en) Image processing apparatus, image processing method, and program
US20210124928A1 (en) Object tracking methods and apparatuses, electronic devices and storage media
CN111429477B (en) Target tracking method and device, storage medium and computer equipment
CN111242973A (en) Target tracking method and device, electronic equipment and storage medium
JP2011134114A (en) Pattern recognition method and pattern recognition apparatus
CN113723190A (en) Multi-target tracking method for synchronous moving target
JP2005309746A (en) Method and program for tracking moving body, recording medium therefor, and moving body tracking device
CN109543534B (en) Method and device for re-detecting lost target in target tracking
CN114708300A (en) Anti-blocking self-adaptive target tracking method and system
CN115049954A (en) Target identification method, device, electronic equipment and medium
CN111723713B (en) Video key frame extraction method and system based on optical flow method
US10872423B2 (en) Image detection device, image detection method and storage medium storing program
CN111429478B (en) Target tracking method and related equipment
TWI736063B (en) Object detection method for static scene and associated electronic device
CN104754248A (en) Method and device for acquiring target snapshot
CN114511803A (en) Target occlusion detection method for visual tracking task

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant