CN116309729A - Target tracking method, device, terminal, system and readable storage medium - Google Patents

Target tracking method, device, terminal, system and readable storage medium Download PDF

Info

Publication number
CN116309729A
Authority
CN
China
Prior art keywords
current frame
historical
target
area
tracking
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310147512.XA
Other languages
Chinese (zh)
Inventor
潘颢文
张勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhuhai Shixi Technology Co Ltd
Original Assignee
Zhuhai Shixi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhuhai Shixi Technology Co Ltd filed Critical Zhuhai Shixi Technology Co Ltd
Priority to CN202310147512.XA priority Critical patent/CN116309729A/en
Publication of CN116309729A publication Critical patent/CN116309729A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/254Analysis of motion involving subtraction of images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/50Depth or shape recovery
    • G06T7/55Depth or shape recovery from multiple images
    • G06T7/579Depth or shape recovery from multiple images from motion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20216Image averaging
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20224Image subtraction

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a target tracking method, device, terminal, system, and readable storage medium, wherein the target tracking method comprises the following steps: acquiring a target image set, wherein the target image set comprises consecutive multi-frame depth maps shot by a camera; determining a target tracking result of the current frame according to the historical tracking result and the depth information of each frame image, wherein the historical tracking result is the target tracking result determined by the previous frame; and updating the historical tracking result according to the target tracking result of the current frame. The method has simple logic, a small computational load, and a high computation speed; it offers high real-time performance, fine granularity, and a wide application range; and it can be used in scenes where the camera moves, which significantly expands the application scenarios of target detection.

Description

Target tracking method, device, terminal, system and readable storage medium
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to a target tracking method, device, terminal, system, and readable storage medium.
Background
Tracking of moving targets is applied in various scenes, such as passenger flow statistics and living body posture recognition. At present, there are two approaches to target tracking. The first adopts a deep learning algorithm, but such an algorithm can only recognize what it has been trained on, so a great deal of learning and training is needed. The second adopts background modeling, but a background modeling algorithm requires a fixed scene: a target such as a person or a car cannot be detected if it remains stationary during background modeling, and background modeling also involves a large amount of computation and has low accuracy.
Disclosure of Invention
In order to solve the above problems, embodiments of the present application provide a target tracking method, device, terminal, system and readable storage medium, so as to provide a target tracking method that is fast, efficient and applicable to mobile scenes.
In a first aspect, an embodiment of the present application provides a target tracking method, including:
acquiring a target image set, wherein the target image set comprises consecutive multi-frame depth maps shot by a camera;
determining a target tracking result of the current frame according to the historical tracking result and the depth information of each frame image, wherein the historical tracking result is the target tracking result determined by the previous frame;
and updating the historical tracking result according to the target tracking result of the current frame.
Optionally, in the above method, the historical tracking result includes a historical motion region, a historical connected region, and historical target identity information;
the determining the target tracking result of the current frame according to the historical tracking result and the depth information of each frame image comprises the following steps:
constructing a background model according to multi-frame historical images in the target image set;
determining a motion region of the current frame according to the depth value of each pixel point of the current frame, the background model, the historical motion region, and the historical connected region;
determining at least one connected region of the current frame according to the motion region of the current frame;
and determining the target tracking result of the current frame according to the connected region, the historical connected region, and the historical target identity information.
Optionally, in the above method, the constructing a background model according to multiple frames of historical images in the target image set includes:
constructing an initial background model;
determining an average value of corresponding pixel points of each frame of historical image;
and assigning the average value of each pixel point to the corresponding pixel point of the initial background model to obtain the background model.
Optionally, in the above method, the determining the motion region of the current frame according to the depth value of each pixel point of the current frame, the background model, the historical motion region, and the historical connected region includes:
determining the difference between the depth value of each pixel point of the current frame and the depth value of the corresponding pixel point of the background model;
dividing the current frame into a first motion region and a historical trace region according to the sign of the difference and the relative sizes of the difference and a plurality of preset thresholds;
subtracting the historical trace region from the historical motion region to obtain a common region;
determining the motion region of the current frame according to the first motion region and the common region;
and multiplying the motion region of the current frame, serving as a template, by each pixel of the current frame to determine the motion pixels of the current frame.
Optionally, in the above method, the plurality of preset thresholds include a first threshold and a second threshold;
the dividing the current frame into a first motion region and a historical trace region according to the sign of the difference and the relative sizes of the difference and a plurality of preset thresholds comprises the following steps:
if the difference corresponding to a pixel point is negative and its absolute value is greater than or equal to the first threshold, determining that the corresponding pixel point in the current frame belongs to the first motion region;
and if the difference corresponding to a pixel point is positive and greater than or equal to the second threshold, determining that the corresponding pixel point in the current frame belongs to the historical trace region.
Optionally, in the above method, the determining the motion region of the current frame according to the first motion region and the common region includes:
summing the first motion region and the common region to obtain a high-noise motion region;
and performing median filtering on the high-noise motion region to obtain the motion region of the current frame.
Optionally, the method further comprises:
determining a transfer threshold according to a preset motion region transfer model;
and determining the product of the transfer threshold and the historical motion region as the final historical motion region; wherein the motion region transfer model is:
thr = f(n, N, t)   (the exact expression of the motion region transfer model is published as an image and is not reproduced here)
where thr represents the transfer threshold, N represents the number of pixels of the previous frame image, n represents the number of pixels of the moving target region of the previous frame image, and t is a constant (t > 0) to be determined.
Optionally, in the above method, the determining the target tracking result of the current frame according to the connected region, the historical connected region, and the historical target identity information includes:
if the connected region and the historical connected region have an intersection, transmitting the historical target identity information to the connected region;
and if the connected region and the historical connected region have no intersection, assigning new identity information to the connected region.
In a second aspect, embodiments of the present application further provide an object tracking device, including:
the acquisition unit is used for acquiring a target image set, wherein the target image set comprises a plurality of consecutive frames of depth maps shot by a camera;
the tracking unit is used for determining a target tracking result of the current frame according to the historical tracking result and the depth information of each frame image, wherein the historical tracking result is the target tracking result determined by the previous frame;
and the updating unit is used for updating the historical tracking result according to the target tracking result of the current frame.
In a third aspect, an embodiment of the present application further provides a target tracking terminal, where the target tracking terminal is deployed with the target tracking device.
In a fourth aspect, an embodiment of the present application further provides a target tracking system, where the system includes a tracking server and a plurality of acquisition terminals, each of which is communicatively connected to the tracking server; the tracking server is deployed with the target tracking device;
the acquisition terminal is used for acquiring a target image set and sending the target image set to the tracking server.
In a fifth aspect, a computer device is provided, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the above-described object tracking method when executing the computer program.
In a sixth aspect, a computer readable storage medium is provided, the computer readable storage medium storing a computer program which, when executed by a processor, implements the above-described target tracking method.
At least one of the technical solutions adopted in the embodiments of the present application can achieve the following beneficial effects:
the method combines the depth information of each frame image in the target image set with the historical tracking result formed by the previous frame, so that the target tracking result of the current frame is determined. The application has simple logic, small calculated amount and high calculation speed; the real-time performance is high, the fineness is high, and the application range is wide; the method and the device can be used for the scene of camera movement, and the application scene of target detection is obviously expanded.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute an undue limitation to the application. In the drawings:
FIG. 1 shows a flow diagram of a target tracking method according to one embodiment of the present application;
FIG. 2 illustrates a schematic diagram of target movement according to one embodiment of the present application;
FIG. 3 illustrates a schematic view of a movement region according to one embodiment of the present application;
FIG. 4 shows a flow diagram of a target tracking method according to another embodiment of the present application;
FIG. 5 shows a schematic diagram of the structure of an object tracking device according to one embodiment of the present application;
FIG. 6 shows a schematic diagram of the structure of a target tracking terminal according to one embodiment of the present application;
FIG. 7 illustrates a schematic diagram of a target tracking system according to one embodiment of the present application;
fig. 8 shows a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
To make the purposes, technical solutions, and advantages of the present application clearer, the technical solutions of the present application will be clearly and completely described below with reference to specific embodiments of the present application and the corresponding drawings. It is apparent that the described embodiments are only some, but not all, of the embodiments of the present application. All other embodiments obtained by one of ordinary skill in the art without creative effort based on the present disclosure fall within the protection scope of the present application.
In order to make the technical solutions provided by the embodiments of the present application more clearly understood by those skilled in the art, the technical solution concept of the present application will be described first.
In view of the shortcomings of the prior art, the present application provides a target tracking method, fig. 1 shows a schematic flow chart of the target tracking method according to an embodiment of the present application, and as can be seen from fig. 1, the present application at least includes steps S110 to S130:
step S110: a target image set is acquired, wherein the target image set comprises a plurality of continuous frames of depth maps shot by a camera.
The target image set may be a video stream stored in a database and obtained through shooting at a historical time, or may be a video stream obtained through real-time shooting by a camera. In this application, the camera may be a depth camera, such as a TOF camera.
If the camera is installed at a designated place, it can shoot a target area to obtain a video stream; in this application, the video stream may be a video composed of depth maps in chronological order.
Step S120: a target tracking result of the current frame is determined according to the historical tracking result and the depth information of each frame image, wherein the historical tracking result is the target tracking result determined by the previous frame.
In this application, the target is tracked continuously and each frame outputs a target tracking result. The target tracking result determined by the frame preceding the current frame is recorded as the historical tracking result; by combining the target tracking result of the previous frame with the depth information of each frame image, the target tracking result of the current frame can be determined according to a preset tracking strategy.
Step S130: the historical tracking result is updated according to the target tracking result of the current frame.
Finally, the historical tracking result of the previous frame is updated according to the current round, namely the target tracking result of the current frame, so as to facilitate the target tracking of the next frame. The tracking result may include, but is not limited to, the tracked targets, and each target is assigned identity information, recorded as target identity information, such as an identity ID.
As can be seen from the method shown in fig. 1, the present application combines the depth information of each frame image in the target image set with the historical tracking result formed from the previous frame, so as to determine the target tracking result of the current frame. The method has simple logic, a small computational load, and a high computation speed; it offers high real-time performance, fine granularity, and a wide application range; and it can be used in scenes where the camera moves, which significantly expands the application scenarios of target detection.
In some embodiments of the present application, in the above method, the historical tracking result includes a historical motion region, a historical connected region, and historical target identity information; the determining the target tracking result of the current frame according to the historical tracking result and the depth information of each frame image comprises the following steps: constructing a background model according to multi-frame historical images in the target image set; determining a motion region of the current frame according to the depth value of each pixel point of the current frame, the background model, the historical motion region, and the historical connected region; determining at least one connected region of the current frame according to the motion region; and determining the target tracking result of the current frame according to the connected region, the historical connected region, and the historical target identity information.
The motion region is defined as the pixel region of the one or more targets identified in the current frame, and the historical motion region is defined as the pixel region of the one or more targets identified in the previous frame. A connected region is one of the regions obtained by dividing the motion region of the current frame: the regions are independent of one another, but the pixels within each region are connected. A historical connected region is likewise one of the regions obtained by dividing the motion region of the frame preceding the current frame (for the effect of connected regions, refer to fig. 3). The identity information is the unique identity identifier allocated to a target detected in the current frame; the historical target identity information is the unique identity identifier, such as an identity ID, allocated to a target detected in the previous frame. These definitions apply throughout and are not repeated below.
Firstly, constructing a background model according to a plurality of frames of historical images in the target image set, specifically, in some embodiments of the application, the constructing a background model according to a plurality of frames of historical images in the target image set includes: constructing an initial background model; determining an average value of corresponding pixel points of each frame of historical image; and assigning the average value of each pixel point to the corresponding pixel point of the initial background model to obtain the background model.
The background model may be constructed from consecutive multi-frame historical images. For example, suppose a video stream comprises 4 consecutive depth maps, denoted the 1st, 2nd, 3rd, and 4th frames, where the 4th frame is the current frame. When constructing the background model, the depth values of the pixel points in the 1st, 2nd, and 3rd frames may be used. Specifically, an initial background model is first constructed; the initial background model may be an image whose pixel values are null or preset, and whose size is consistent with the depth map of each frame. Then the average value of the corresponding pixel points in the 1st, 2nd, and 3rd frames is calculated, and the obtained average value of each pixel point is assigned to the corresponding pixel point of the initial background model to obtain the background model.
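As an illustrative sketch of this averaging step (assuming the depth maps are NumPy arrays of equal size; the function and variable names are hypothetical, not from the publication):

```python
import numpy as np

def build_background_model(history_frames):
    """Pixel-wise average of the historical depth maps.

    history_frames: list of H x W depth maps (e.g. frames 1-3 when
    frame 4 is the current frame), all of the same size.
    """
    stack = np.stack(history_frames, axis=0)   # shape (T, H, W)
    return stack.mean(axis=0)                  # background model, H x W
```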
Referring to fig. 2, which shows a schematic diagram of target movement according to an embodiment of the present application: the moving target is assumed to be a circle, the first circle is the area where the circle was located in the previous frame, the second circle is the area where the circle is located in the current frame, the intersection of the two circles (region C) is the common area of the circles in the current frame and the previous frame, and the rest is the background.
According to the characteristics of the depth camera, the depth value of the foreground is smaller than that of the background, and when the depth value of each pixel point of the current frame is differenced with that of the corresponding pixel point in the background model, the following rules are obtained:
as can be seen from fig. 2, the second circle is a motion area of the current frame, where the area a in fig. 2 represents a portion of the motion area formed by the second circle with the common portion of the first circle removed, and the difference between the pixels is negative and the absolute value is large, because the area a in the previous historical frames is that only the background does not have the motion object, the area a of the current frame is a portion of the motion object, and the depth is smaller than the background, so the difference is negative.
Region B in fig. 2 is the part of the area occupied by the moving target in the previous frame, i.e. the part of the historical motion region formed by the first circle with the common part of the second circle removed; the difference corresponding to its pixels is positive with a large absolute value, because the previous historical frames contained the moving target there but the current frame does not, so the depth of region B in the current frame is larger than that of region B in the background model.
Region C is the common area of the first circle and the second circle; this part belongs to both the motion region of the previous frame and the motion region of the current frame, and the absolute value of the corresponding difference is small, because both the current frame and the background model contain this part, so their depths are close.
Region D (all regions except the two circles) is the background region of both the previous frame and the current frame, and the absolute value of the difference corresponding to its pixels is also small, because region D is always background.
Thus, the motion region of the current frame may be determined according to the depth value of each pixel point of the current frame, the background model, the historical motion region, and the historical connected region. In some embodiments of the present application, the determining the motion region of the current frame includes: determining the difference between the depth value of each pixel point of the current frame and the depth value of the corresponding pixel point of the background model; dividing the current frame into a first motion region and a historical trace region according to the sign of the difference and the relative sizes of the difference and a plurality of preset thresholds; subtracting the historical trace region from the historical motion region to obtain a common region; determining the motion region of the current frame according to the first motion region and the common region; and multiplying the motion region of the current frame, serving as a template, by each pixel of the current frame to determine the motion pixels of the current frame.
Here the historical motion region is as defined above and is not repeated; the common region refers to the intersection of the motion region of the previous frame and the motion region of the current frame, and can be understood as the overlapping portion (e.g., region C of fig. 2).
In some embodiments, the overall idea of tracking a target is: construct a background model, separate the motion region, extract the connected regions, and allocate identity information.
Referring to fig. 2, the second circle is the motion region of the current frame; that is, the union of region A and region C is recorded as the motion region of the current frame.
Specifically, after the depth value of the background model is subtracted from the depth value of the current frame, the difference of each pixel point is obtained. Then, according to the rules above, the current frame can be divided into a first motion region and a historical trace region according to the sign of the difference and the relative sizes of the difference and a plurality of preset thresholds, where the first motion region refers to region A in fig. 2 and the historical trace region refers to region B in fig. 2.
For the first motion region, a threshold can be set to separate region A according to the characteristic that the difference of its pixel points is negative with a large absolute value. Here, the dividing the current frame into a first motion region and a historical trace region according to the sign of the difference and the relative sizes of the difference and a plurality of preset thresholds includes: if the difference corresponding to a pixel point is negative and its absolute value is greater than or equal to the first threshold, determining that the corresponding pixel point in the current frame belongs to the first motion region.
That is, if the difference between the depth value of a pixel point of the current frame and the depth value of the corresponding pixel point of the background model is negative and its absolute value is greater than or equal to the first threshold, the pixel point of the current frame is determined to belong to the first motion region; processing each pixel point in this way separates out the first motion region, namely region A.
Region C cannot be directly distinguished from region D by the differences of their pixel points, because the characteristics of the differences are similar: region C belongs to the motion region but its difference is small, while region D is background and its difference is also small, so the two cannot be told apart directly from the difference.
For the circles in fig. 2, to separate region C, the coverage area of the moving target in the previous frame is recorded (assuming the target was found in the previous frame). Because region C is the common area of the moving target in the two frames, the motion region of the previous frame contains region C; it is therefore only necessary to separate region B and subtract it from the coverage area of the moving target in the previous frame, and the remaining part is region C. That is, the first circle is known, region B is obtained from the difference, and region C is obtained by subtracting region B from the first circle.
Region B is recorded as the historical trace region, and it can be divided according to the sign of the difference and the relative sizes of the difference and a plurality of preset thresholds, specifically: if the difference corresponding to a pixel point is positive and greater than or equal to the second threshold, determining that the corresponding pixel point in the current frame belongs to the historical trace region.
That is, if the difference between the depth value of a pixel point of the current frame and the depth value of the corresponding pixel point of the background model is positive and greater than or equal to the second threshold, the pixel point of the current frame is determined to belong to the historical trace region; processing each pixel point in this way separates out the historical trace region, namely region B.
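A minimal sketch of the two threshold tests just described, where `thr_neg` and `thr_pos` stand for the magnitudes of the first and second thresholds (the names and the vectorized NumPy formulation are assumptions, not from the publication):

```python
def split_motion_and_trace(current, background, thr_neg, thr_pos):
    """Classify pixels by the sign and magnitude of (current - background).

    Returns two boolean masks:
      neg -- first motion region (region A: the target moved in, the
             depth dropped, so the difference is strongly negative)
      pos -- historical trace region (region B: the target moved away,
             the depth rose, so the difference is strongly positive)
    """
    diff = current.astype(float) - background
    neg = diff <= -thr_neg   # negative difference, |diff| >= first threshold
    pos = diff >= thr_pos    # positive difference, diff >= second threshold
    return neg, pos
```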
The historical motion region is the first circle and the historical trace region is region B. Subtracting the historical trace region from the historical motion region yields the common region, namely region C, and adding the first motion region then yields the motion region of the current frame, as expressed by the following formula (1):

tmp_n = tmp_(n-1) + neg - pos    (1)

where tmp_n represents the motion region of the current frame, tmp_(n-1) represents the motion region of the previous frame, pos represents the historical trace region, and neg represents the first motion region.
The motion region of the current frame is then determined according to the first motion region and the common region; that is, the first motion region and the common region are added to obtain the motion region of the current frame, namely the second circle.
In some embodiments of the present application, the region obtained by adding the first motion region and the common region is recorded as a high-noise motion region, and median filtering is performed on it to remove noise points and obtain the motion region of the current frame.
The motion pixels of the current frame are then determined: specifically, the motion region of the current frame is used as a template and multiplied by each pixel of the current frame.
Specifically, the pixel values of the motion region can be set to 1 and the other pixel values set to 0, forming a binary image. The motion region of the current frame is used as a template (mask) and multiplied by each pixel of the current frame; that is, each pixel value of the binary image is multiplied by the corresponding pixel in the current frame, yielding the motion pixels of the current frame.
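On boolean masks, formula (1), the median filtering, and the template multiplication might be sketched as follows (an illustration under the same assumptions as above; SciPy's median filter is used for the denoising step):

```python
import numpy as np
from scipy import ndimage

def update_motion_region(prev_motion, neg, pos, current):
    """Formula (1) on masks: tmp_n = tmp_(n-1) + neg - pos, i.e. union
    with the new motion pixels, minus the vacated trace pixels; then
    denoise and use the result as a template on the current frame."""
    high_noise = (prev_motion | neg) & ~pos    # high-noise motion region
    motion = ndimage.median_filter(high_noise.astype(np.uint8), size=3) > 0
    motion_pixels = current * motion           # binary template * each pixel
    return motion, motion_pixels
```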
Then the motion pixels are separated. If there are multiple moving targets in the image, there may be multiple non-connected areas in the image, as shown in fig. 3, which shows a schematic diagram of the motion regions according to an embodiment of the present application. As can be seen from fig. 3, the areas in the gray frames are all motion regions, but region E and region F are not connected, which means the two parts are two moving targets. At this time, the motion pixels can be separated to obtain one or more connected regions; referring to fig. 3, region E may be regarded as one connected region and region F as another.
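Extracting the connected regions can be illustrated with SciPy's labeling routine (a sketch; the use of 8-connectivity is an assumption, since the publication does not specify the connectivity):

```python
import numpy as np
from scipy import ndimage

def extract_connected_regions(motion_mask):
    """Split the motion mask into its connected regions (e.g. E and F)."""
    structure = np.ones((3, 3), dtype=bool)          # 8-connectivity
    labels, num = ndimage.label(motion_mask, structure=structure)
    return [labels == i for i in range(1, num + 1)]  # one boolean mask each
```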
Finally, the target tracking result of the current frame is determined according to the connected region, the historical connected region, and the historical target identity information. Specifically, a connected region may be treated as one target; since a target may stay in the picture for a period of time, a new target may appear in the picture, or a target may disappear, the results of the previous and current frames are usually combined when tracking.
Specifically, in some embodiments of the present application, the determining the target tracking result of the current frame according to the connected region, the historical connected region, and the historical target identity information includes: if the connected region and the historical connected region have an intersection, transmitting the historical target identity information to the connected region; and if the connected region and the historical connected region have no intersection, assigning new identity information to the connected region.
That is, for a connected region, if it has no intersection (common area) with any connected region of the previous frame, an independent identity ID is assigned, i.e. a new tracking target is added; if it has an intersection with a connected region of the previous frame, the identity ID of that connected region is transmitted to it, thus realizing continuous tracking of a target.
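The intersection test and ID propagation might be sketched as follows (illustrative only; the global counter is a hypothetical stand-in for however identity IDs are allocated in practice):

```python
import itertools

new_ids = itertools.count(1)  # hypothetical source of fresh identity IDs

def assign_ids(regions, prev_regions, prev_ids):
    """Inherit the ID of an intersecting previous region, else issue a new one."""
    ids = []
    for region in regions:
        inherited = next(
            (pid for prev, pid in zip(prev_regions, prev_ids)
             if (region & prev).any()),  # intersection with a previous region
            None,
        )
        ids.append(inherited if inherited is not None else next(new_ids))
    return ids
```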
In addition, during target tracking, one of the following two situations may occur. First, an object that originally did not move and is not of interest may, for accidental reasons, suddenly move under an external force and then stop; although it no longer moves afterwards, it has moved in front of the lens and is therefore detected, causing a false detection. Second, the lens position may change for some unknown reason, causing many stationary objects to be detected, but it is not desirable for these stationary objects to stay in the picture all the time.
In this regard, the present application may use a transfer threshold to control how much of the motion region of the previous frame is transferred to the next frame; that is, the historical motion region is multiplied by a transfer threshold, and the resulting region is taken as the final historical motion region used in detecting the current frame.
In some embodiments of the present application, a fixed transfer threshold is set so that the proportion of moving pixels transferred from the previous frame is constant, but this does not significantly improve the detection accuracy.
The more motion pixels one wishes to detect, the smaller this threshold should be and the fewer pixels are transferred; conversely, the fewer motion pixels one wishes to detect, the larger this threshold should be and the more pixels are transferred. In this application, therefore, a motion region transfer model is set in advance, as shown in formula (2):
thr = f(n, N, t)   (the exact expression of the motion region transfer model is published as an image and is not reproduced here)
where thr represents the transfer threshold, N represents the number of pixels of the previous frame image, n represents the number of pixels of the moving target region of the previous frame image, and t is a constant (t > 0) to be determined.
That is, the transfer threshold is dynamic. The reason for setting it this way is to account for the normal proportion of motion pixels: t should be set smaller if the motion pixel proportion is high, and larger if it is low. In some embodiments of the present application, t = 2 is set, and the estimated motion pixel proportion is 5%.
On the basis of the transfer threshold, the method of the application further comprises the following steps: determining the transfer threshold according to the preset motion region transfer model; and determining the product of the transfer threshold and the historical motion region as the final historical motion region.
With the transfer threshold in place, the calculation of the motion region of the current frame can be represented by the following formula (3):

tmp_n = thr * tmp_(n-1) + neg - pos    (3)

where tmp_n represents the motion region of the current frame, tmp_(n-1) represents the motion region of the previous frame, thr represents the transfer threshold, pos represents the historical trace region, and neg represents the first motion region.
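Because the published transfer-threshold expression is available only as an image, the sketch below substitutes a stand-in decreasing function of the motion-pixel ratio n/N that merely matches the stated behavior (a larger ratio, scaled by t, gives a smaller threshold); both the function and the reading of "multiplying" a binary mask by thr as keeping a thr-fraction of its pixels are assumptions, not the patent's formula:

```python
import numpy as np

def transfer_threshold(n, N, t=2.0):
    """Hypothetical stand-in for the motion region transfer model: the
    threshold falls as the motion-pixel ratio n/N rises, scaled by the
    constant t. The published expression is NOT reproduced here."""
    return max(0.0, 1.0 - t * (n / N))

def transfer_motion_region(prev_motion, t=2.0, rng=None):
    """One reading of thr * tmp_(n-1): randomly keep a thr-fraction of
    the previous frame's motion pixels."""
    rng = np.random.default_rng() if rng is None else rng
    thr = transfer_threshold(int(prev_motion.sum()), prev_motion.size, t)
    keep = rng.random(prev_motion.shape) < thr
    return prev_motion & keep
```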
Fig. 4 shows a flow chart of a target tracking method according to another embodiment of the present application, and as can be seen from fig. 4, the present embodiment includes:
Acquire a real-time depth video stream of the target area.
For the current target frame depth map, acquire multiple frames of historical depth maps, and construct a background model from the average of the depth values of each frame of historical depth map.
Read the historical tracking result of the previous frame, which includes the historical motion region, at least one historical connected region, and the identity ID of each historical connected region.
Traverse each pixel point of the target frame depth map and determine the difference between the depth value of each pixel point and the background model. If the difference of a pixel point is negative and its absolute value is greater than or equal to the first threshold, determine that the corresponding pixel point in the current frame depth map belongs to the first motion region; if the difference of a pixel point is positive and greater than or equal to the second threshold, determine that the corresponding pixel point in the current frame depth map belongs to the historical trace region.
Determine the transfer threshold according to the preset motion region transfer model, and determine the product of the transfer threshold and the historical motion region as the final historical motion region transferred from the historical tracking result of the previous frame.
Subtract the historical trace region from the final historical motion region to obtain the common region; sum the first motion region and the common region to obtain the high-noise motion region; and perform median filtering on the high-noise motion region to obtain the motion region of the current frame depth map.
Extract at least one current connected region from the motion region.
For a current connected region, if it has an intersection with any historical connected region, transmit the identity ID of that historical connected region to it, and then traverse the next current connected region; if it has no intersection with any historical connected region, assign a new identity ID to it, and then traverse the next current connected region.
Update the historical tracking result of the previous frame according to the tracking result of the current frame depth map.
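Tying the illustrative helpers above together, one iteration of the fig. 4 pipeline might read as follows (a sketch only; the `state` dictionary is a hypothetical container for the previous frame's historical tracking result):

```python
import numpy as np
from scipy import ndimage

def track_frame(state, current, history_frames, thr_neg, thr_pos, t=2.0):
    """One iteration of the fig. 4 flow, reusing the sketches above.

    state: {"motion": mask, "regions": [masks], "ids": [ints]} from the
    previous frame; the return value is the next frame's historical result.
    """
    background = build_background_model(history_frames)
    neg, pos = split_motion_and_trace(current, background, thr_neg, thr_pos)
    final_hist = transfer_motion_region(state["motion"], t)  # thr * tmp_(n-1)
    high_noise = (final_hist | neg) & ~pos                   # formula (3)
    motion = ndimage.median_filter(high_noise.astype(np.uint8), size=3) > 0
    regions = extract_connected_regions(motion)
    ids = assign_ids(regions, state["regions"], state["ids"])
    return {"motion": motion, "regions": regions, "ids": ids}
```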
Fig. 5 shows a schematic structural diagram of an object tracking device according to an embodiment of the present application, and as can be seen from fig. 5, the device includes:
an obtaining unit 510, configured to obtain a target image set, where the target image set includes a plurality of consecutive frames of depth maps captured by a camera;
the tracking unit 520 is configured to determine a target tracking result of the current frame according to the history tracking result and depth information of each frame image, where the history tracking result is the target tracking result determined by the previous frame;
an updating unit 530, configured to update the historical tracking result according to the target tracking result of the current frame.
in some embodiments of the present application, in the above apparatus, the tracking unit 520 is configured to construct a background model according to a plurality of frames of history images in the target image set; determining a motion area of the current frame according to the depth value of each pixel point of the current frame, the background model, the historical motion area and the historical communication area; determining at least one connected region of the current frame according to the motion region of the current frame; and determining a target tracking result of the current frame according to the connected region, the history connected region and the history target identity information.
In some embodiments of the present application, in the above apparatus, the tracking unit 520 is configured to construct an initial background model; determining an average value of corresponding pixel points of each frame of historical image; and assigning the average value of each pixel point to the corresponding pixel point of the initial background model to obtain the background model.
In some embodiments of the present application, in the foregoing apparatus, the tracking unit 520 is configured to determine the difference between the depth value of each pixel point of the current frame and the depth value of the corresponding pixel point of the background model; divide the current frame into a first motion region and a historical trace region according to the sign of the difference and the relative sizes of the difference and a plurality of preset thresholds; subtract the historical trace region from the historical motion region to obtain a common region; determine the motion region of the current frame according to the first motion region and the common region; and multiply the motion region of the current frame, serving as a template, by each pixel of the current frame to determine the motion pixels of the current frame.
In some embodiments of the present application, in the foregoing apparatus, the plurality of preset thresholds includes a first threshold and a second threshold; the tracking unit 520 is configured to determine that a corresponding pixel point in the current frame belongs to the first motion region if the difference corresponding to the pixel point is negative and its absolute value is greater than or equal to the first threshold, and to determine that the corresponding pixel point in the current frame belongs to the historical trace region if the difference corresponding to a pixel point is positive and greater than or equal to the second threshold.
In some embodiments of the present application, in the foregoing apparatus, the tracking unit 520 is configured to sum the first motion region and the common region to obtain a high-noise motion region, and perform median filtering on the high-noise motion region to obtain the motion region of the current frame.
In some embodiments of the present application, in the above apparatus, the tracking unit 520 is further configured to determine a transfer threshold according to a preset motion region transfer model, and determine the product of the transfer threshold and the historical motion region as the final historical motion region; where the motion region transfer model is:
thr = f(n, N, t)   (the exact expression of the motion region transfer model is published as an image and is not reproduced here)
where thr represents the transfer threshold, N represents the number of pixels of the previous frame image, n represents the number of pixels of the moving target region of the previous frame image, and t is a constant (t > 0) to be determined.
In some embodiments of the present application, in the foregoing apparatus, the tracking unit 520 is configured to transmit the historical target identity information to the connected region if the connected region and the historical connected region have an intersection, and to assign new identity information to the connected region if they have no intersection.
It should be noted that, the target tracking apparatus 500 may implement the target tracking method described above, which is not described herein.
Fig. 6 shows a schematic structural diagram of a target tracking terminal according to an embodiment of the present application. As can be seen from fig. 6, a target tracking terminal 600 is deployed with the above target tracking device 500 for implementing any of the methods described in the present application. The target tracking terminal may be an electronic device with a depth image shooting function, such as a TOF camera or a depth passenger-flow camera, and may be mounted at a designated position of the target area so as to shoot the target area.
Fig. 7 shows a schematic structural diagram of a target tracking system according to an embodiment of the present application, and as can be seen from fig. 7, the target tracking system 700 includes a tracking server 710 and a plurality of target tracking terminals 720, where each of the target tracking terminals 720 is communicatively connected to the tracking server 710; the tracking server 710 is configured with any one of the target tracking devices 500 described above, and the target tracking terminal 720 is configured to collect a target image set and send the obtained target image set to the tracking server 710, where the tracking server 710 performs any one of the methods described above according to the received target image set, so as to track the target.
As can be seen from fig. 6 and fig. 7, the target tracking device 500 may be deployed on a target tracking terminal or on a tracking server; this application does not limit the choice, which may be determined according to actual needs and the computing power of the target tracking terminal and the tracking server. It should be noted, however, that the algorithm is simple and does not demand much computing power, so deploying it directly in the target tracking terminal saves hardware cost.
Fig. 8 shows a schematic structural diagram of a computer device according to an embodiment of the present application, which according to fig. 8 comprises a processor, a memory, a network interface and a database connected via a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes non-volatile and/or volatile storage media and internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The network interface of the computer device is for communicating with an external client via a network connection. The computer program is executed by a processor to perform the functions or steps of the object tracking method.
In one embodiment, the computer device provided in the present application includes a memory and a processor, the memory storing a database and a computer program executable on the processor, the processor executing the computer program to perform the steps of:
acquiring a target image set, wherein the target image set comprises consecutive multi-frame depth maps shot by a camera;
determining a target tracking result of the current frame according to the historical tracking result and the depth information of each frame image, wherein the historical tracking result is the target tracking result determined by the previous frame;
and updating the historical tracking result according to the target tracking result of the current frame.
The method performed by the object tracking device disclosed in the embodiment shown in fig. 5 of the present application may be applied to a processor or implemented by a processor. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in a processor or by instructions in the form of software. The processor may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), etc.; but also digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components. The steps of a method disclosed in connection with the embodiments of the present application may be embodied directly in hardware, in a decoded processor, or in a combination of hardware and software modules in a decoded processor. The software modules may be located in a random access memory, flash memory, read only memory, programmable read only memory, or electrically erasable programmable memory, registers, etc. as well known in the art. The storage medium is located in a memory, and the processor reads the information in the memory and, in combination with its hardware, performs the steps of the above method.
In one embodiment, there is also provided a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of:
acquiring a target image set, wherein the target image set comprises consecutive multi-frame depth maps shot by a camera;
determining a target tracking result of the current frame according to the historical tracking result and the depth information of each frame image, wherein the historical tracking result is the target tracking result determined by the previous frame;
and updating the historical tracking result according to the target tracking result of the current frame.
It should be noted that, the functions or steps that can be implemented by the computer device or the computer readable storage medium may correspond to the relevant descriptions in the foregoing method embodiments, and are not described herein for avoiding repetition.
Those skilled in the art will appreciate that implementing all or part of the above described methods may be accomplished by way of a computer program stored on a non-transitory computer readable storage medium, which when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the various embodiments provided herein may include non-volatile and/or volatile memory. The nonvolatile memory can include Read Only Memory (ROM), programmable ROM (PROM), electrically Programmable ROM (EPROM), electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double Data Rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), synchronous Link DRAM (SLDRAM), memory bus direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM), among others.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions.
The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and are intended to be included in the scope of the present invention.

Claims (10)

1. A method of tracking a target, comprising:
acquiring a target image set, wherein the target image set comprises consecutive multi-frame depth maps shot by a camera;
determining a target tracking result of the current frame according to the historical tracking result and the depth information of each frame image, wherein the historical tracking result is the target tracking result determined by the previous frame;
and updating the historical tracking result according to the target tracking result of the current frame.
2. The method of claim 1, wherein the historical tracking results include a historical motion region, a historical connected region, and historical target identity information;
the determining the target tracking result of the current frame according to the history tracking result and the depth information of each frame image comprises the following steps:
constructing a background model according to multi-frame historical images in the target image set;
determining a motion region of the current frame according to the depth value of each pixel point of the current frame, the background model, the historical motion region, and the historical connected region;
determining at least one connected region of the current frame according to the motion region of the current frame;
and determining the target tracking result of the current frame according to the connected region, the historical connected region, and the historical target identity information.
3. The method of claim 2, wherein said constructing a background model from a plurality of frames of historical images in the target image set comprises:
constructing an initial background model;
determining an average value of corresponding pixel points of each frame of historical image;
and assigning the average value of each pixel point to the corresponding pixel point of the initial background model to obtain the background model.
4. The method of claim 2, wherein determining the motion region of the current frame based on the depth value of each pixel of the current frame, the background model, the historical motion region, and the historical connected region comprises:
determining the difference between the depth value of each pixel point of the current frame and the depth value of the corresponding pixel point of the background model;
dividing the current frame into a first motion region and a historical trace region according to the sign of the difference and the relative sizes of the difference and a plurality of preset thresholds;
subtracting the historical trace region from the historical motion region to obtain a common region;
determining the motion region of the current frame according to the first motion region and the common region;
and multiplying the motion region of the current frame, serving as a template, by each pixel of the current frame to determine the motion pixels of the current frame.
5. The method of claim 4, wherein the plurality of preset thresholds comprises a first threshold and a second threshold;
and the dividing of the current frame into a first motion region and a history trace region according to the sign of the difference value and its magnitude relative to the plurality of preset thresholds comprises:
if the difference value corresponding to a pixel point is negative and its absolute value is greater than or equal to the first threshold, determining that the corresponding pixel point in the current frame belongs to the first motion region;
and if the difference value corresponding to a pixel point is positive and greater than or equal to the second threshold, determining that the corresponding pixel point in the current frame belongs to the history trace region.
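Claims 4 and 5 together suggest the sketch below. The sign convention (`diff = depth - background`), the reading of "negative and greater than or equal to the first threshold" as a magnitude test, and the interpretation of the common region as a set difference and of the final combination as a union are all assumptions drawn only from the claim wording:

```python
import numpy as np

def motion_region(depth, background, hist_motion, t1, t2):
    """Sketch of claims 4-5 (assumed reading, not a verified implementation).

    depth, background: HxW float arrays; hist_motion: HxW bool mask
    (historical motion region); t1, t2: positive preset thresholds."""
    diff = depth - background
    # Negative difference with magnitude >= t1: the scene is closer than
    # the background model here -> first motion region (claim 5, branch 1).
    first_motion = (diff < 0) & (np.abs(diff) >= t1)
    # Positive difference >= t2: the scene is farther than the model,
    # e.g. a target moved away -> history trace region (claim 5, branch 2).
    trace = diff >= t2
    # Difference of the historical motion region and the history trace
    # region gives the common region (assumed set-difference reading).
    common = hist_motion & ~trace
    # Motion region of the current frame from the first motion region
    # and the common region (assumed union).
    motion = first_motion | common
    # Use the motion region as a template: multiply it with each pixel
    # to keep only the motion pixels of the current frame.
    motion_pixels = depth * motion
    return motion, motion_pixels
```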
6. A target tracking device, the device comprising:
an acquisition unit, configured to acquire a target image set, wherein the target image set comprises a plurality of consecutive depth-map frames captured by a camera;
a tracking unit, configured to determine a target tracking result of the current frame according to a historical tracking result and the depth information of each frame image, wherein the historical tracking result is the target tracking result determined for the previous frame;
and an updating unit, configured to update the historical tracking result according to the target tracking result of the current frame.
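A toy decomposition of the claim-6 device into its three units, mapped onto methods of a Python class; `read_depth_frames` is a hypothetical camera API used only for illustration:

```python
class TargetTrackingDevice:
    """Illustrative mapping of the claim-6 units onto one class."""

    def __init__(self, determine_tracking_result):
        self.determine_tracking_result = determine_tracking_result
        self.history = None  # historical tracking result

    def acquire(self, camera):
        # Acquisition unit: obtain the consecutive depth-map frames.
        # `read_depth_frames` is a hypothetical camera method.
        return camera.read_depth_frames()

    def track(self, frame):
        # Tracking unit: current result from history + depth information.
        return self.determine_tracking_result(frame, self.history)

    def update(self, result):
        # Updating unit: replace the history with the current result.
        self.history = result
```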
7. A target tracking terminal, characterized in that the target tracking device of claim 6 is deployed on the target tracking terminal.
8. A target tracking system, characterized by comprising a tracking server and a plurality of acquisition terminals, wherein each acquisition terminal is in communication connection with the tracking server, and the tracking server is deployed with the target tracking device of claim 6;
and the acquisition terminals are configured to acquire a target image set and send the target image set to the tracking server.
9. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the target tracking method according to any one of claims 1-5 when executing the computer program.
10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the target tracking method according to any one of claims 1-5.
CN202310147512.XA 2023-02-20 2023-02-20 Target tracking method, device, terminal, system and readable storage medium Pending CN116309729A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310147512.XA CN116309729A (en) 2023-02-20 2023-02-20 Target tracking method, device, terminal, system and readable storage medium

Publications (1)

Publication Number Publication Date
CN116309729A true CN116309729A (en) 2023-06-23

Family

ID=86791614

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310147512.XA Pending CN116309729A (en) 2023-02-20 2023-02-20 Target tracking method, device, terminal, system and readable storage medium

Country Status (1)

Country Link
CN (1) CN116309729A (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2161123A1 (en) * 2008-09-09 2010-03-10 Fujifilm Corporation Method for producing polarizing plate, and automobile's windshield
US20170102467A1 (en) * 2013-11-20 2017-04-13 Certusview Technologies, Llc Systems, methods, and apparatus for tracking an object
CN107808392A (en) * 2017-10-31 2018-03-16 中科信达(福建)科技发展有限公司 The automatic method for tracking and positioning of safety check vehicle and system of open scene
CN110895819A (en) * 2018-09-12 2020-03-20 长沙智能驾驶研究院有限公司 Target tracking method, target tracking device, computer-readable storage medium and computer equipment
CN110910421A (en) * 2019-11-11 2020-03-24 西北工业大学 Weak and small moving object detection method based on block characterization and variable neighborhood clustering
CN113936029A (en) * 2021-11-10 2022-01-14 上海鹰觉科技有限公司 Video-based illegal fishing automatic detection method and system
CN115345800A (en) * 2022-10-18 2022-11-15 极限人工智能有限公司 Self-adaptive noise reduction method and system for medical endoscope moving image

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
PING-YUE LV et al.: "Space moving target detection and tracking method in complex background", Infrared Physics & Technology, vol. 91, pages 107-118 *
梁硕: "Research on moving target detection and tracking algorithms based on background subtraction" (基于背景减除法的运动目标检测与跟踪算法研究), 《信息科技辑》, no. 09, pages 138-1119 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117132623A (en) * 2023-10-26 2023-11-28 湖南苏科智能科技有限公司 Article tracking method, apparatus, electronic device and storage medium
CN117132623B (en) * 2023-10-26 2024-02-23 湖南苏科智能科技有限公司 Article tracking method, apparatus, electronic device and storage medium

Similar Documents

Publication Publication Date Title
CN111629262B (en) Video image processing method and device, electronic equipment and storage medium
CN109272509B (en) Target detection method, device and equipment for continuous images and storage medium
US11093737B2 (en) Gesture recognition method and apparatus, electronic device, and computer-readable storage medium
CN110136055B (en) Super resolution method and device for image, storage medium and electronic device
US20140064626A1 (en) Adaptive image processing apparatus and method based in image pyramid
CN110176024B (en) Method, device, equipment and storage medium for detecting target in video
CN113191180B (en) Target tracking method, device, electronic equipment and storage medium
CN111401196A (en) Method, computer device and computer readable storage medium for self-adaptive face clustering in limited space
CN116309729A (en) Target tracking method, device, terminal, system and readable storage medium
CN111882578A (en) Foreground image acquisition method, foreground image acquisition device and electronic equipment
Wong et al. A smart moving vehicle detection system using motion vectors and generic line features
CN111445487A (en) Image segmentation method and device, computer equipment and storage medium
CN113160272B (en) Target tracking method and device, electronic equipment and storage medium
CN112204957A (en) White balance processing method and device, movable platform and camera
CN108734712B (en) Background segmentation method and device and computer storage medium
JP2022027464A (en) Method and device related to depth estimation of video
CN111539975B (en) Method, device, equipment and storage medium for detecting moving object
CN109271854B (en) Video processing method and device, video equipment and storage medium
CN115249269A (en) Object detection method, computer program product, storage medium, and electronic device
CN115115546A (en) Image processing method, system, electronic equipment and readable storage medium
CN113673362A (en) Method and device for determining motion state of object, computer equipment and storage medium
CN114241011A (en) Target detection method, device, equipment and storage medium
CN113516685A (en) Target tracking method, device, equipment and storage medium
KR20230030996A (en) Object tracking apparatus and method
KR20220082433A (en) Method and apparatus for analyzing object information in crowdsourcing environments

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination