WO2022227761A1 - Target tracking method and apparatus, electronic device, and storage medium - Google Patents


Info

Publication number
WO2022227761A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
position coordinates
target object
current moment
coordinates
Prior art date
Application number
PCT/CN2022/074956
Other languages
French (fr)
Chinese (zh)
Inventor
关英妲
刘文韬
钱晨
Original Assignee
上海商汤智能科技有限公司 (Shanghai SenseTime Intelligent Technology Co., Ltd.)
Priority date
Filing date
Publication date
Application filed by 上海商汤智能科技有限公司 (Shanghai SenseTime Intelligent Technology Co., Ltd.)
Publication of WO2022227761A1 publication Critical patent/WO2022227761A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T7/248 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving reference images or patches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30241 Trajectory

Definitions

  • the present disclosure relates to the field of computer vision technology, and in particular, to a target tracking method, apparatus, electronic device, and storage medium.
  • Artificial intelligence technology is playing an increasingly important role in creating intelligent education, entertainment and life.
  • Computer vision, as one of the key technologies, is widely used.
  • the positioning technology based on computer vision can track the target object in the target place in different scenarios, and determine the trajectory of the target object in the target place.
  • Images of the target site collected by a camera can be used to determine the position of the target object in the image at different times, and the target object can then be tracked according to its positions at those times.
  • the embodiments of the present disclosure provide at least one target tracking solution.
  • In a first aspect, an embodiment of the present disclosure provides a target tracking method, including: acquiring video images at the current moment collected by multiple collection devices set in a target site, the multiple collection devices having different collection perspectives in the target site, and the video images including the region of interest of the target object in the target site;
  • determining the first position coordinates of each target object at the current moment based on the video images at the current moment collected by the multiple collection devices;
  • for each target object, determining the second position coordinates of the target object at the current moment based on the first position coordinates of the target object and the second position coordinates of the target object at the previous moment.
  • an embodiment of the present disclosure provides a target tracking device, including:
  • an acquisition module configured to acquire the video images at the current moment collected by multiple collection devices set in the target place; the multiple collection devices have different collection perspectives in the target place, and the video images include the region of interest of the target object in the target place;
  • a determination module configured to determine the first position coordinates of each of the target objects at the current moment based on the video images at the current moment collected by the multiple collection devices;
  • the tracking module is configured to, for each of the target objects, determine the second position coordinates of the target object at the current moment based on the first position coordinates of the target object and the second position coordinates of the target object at the previous moment.
  • In another aspect, embodiments of the present disclosure provide an electronic device, including: a processor, a memory, and a bus, where the memory stores machine-readable instructions executable by the processor; when the electronic device runs, the processor and the memory communicate through the bus, and when the machine-readable instructions are executed by the processor, the steps of the target tracking method according to the first aspect are performed.
  • An embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the target tracking method according to the first aspect are performed.
  • An embodiment of the present disclosure further provides a computer program product, which includes a computer program stored on a storage medium; when the computer program is executed by a processor, the steps of the target tracking method according to the first aspect are performed.
  • FIG. 1 shows a flowchart of a target tracking method provided by an embodiment of the present disclosure
  • FIG. 2 shows a flowchart of a method for determining a first position coordinate of a target object provided by an embodiment of the present disclosure
  • FIG. 3 shows a schematic diagram of a detection frame obtained by performing target detection on a video image at the current moment according to an embodiment of the present disclosure
  • FIG. 4 shows a flowchart of a specific method for determining the first position coordinates of the same target object at the current moment provided by an embodiment of the present disclosure
  • FIG. 5 shows a flowchart of a method for determining the second position coordinates of a target object at the current moment provided by an embodiment of the present disclosure
  • FIG. 6 shows a flowchart of a method for determining the coordinates of the observed position of a target object that is missed at the current moment provided by an embodiment of the present disclosure
  • FIG. 7 shows a flowchart of a method for determining trajectory data of each target object provided by an embodiment of the present disclosure
  • FIG. 8 shows a flowchart of a method for revising an identity identifier of a target object that deviates from a target group provided by an embodiment of the present disclosure
  • FIG. 9 shows a flowchart of an early warning method provided by an embodiment of the present disclosure;
  • FIG. 10 shows a schematic diagram of a scene for tracking a target object provided by an embodiment of the present disclosure
  • FIG. 11 shows a schematic structural diagram of a target tracking device provided by an embodiment of the present disclosure
  • FIG. 12 shows a schematic diagram of an electronic device provided by an embodiment of the present disclosure.
  • In many application scenarios, it is usually necessary to track a target object in a place. For example, in a factory it is necessary to track whether employees tend to enter dangerous areas, and in shopping malls the movements of customers can be tracked. The position of the target object can be determined from images captured by a camera, and tracking can then be completed on that basis. However, for target sites that are complex and large in area, the target object may fail to be captured during camera-based tracking, that is, tracking may be interrupted; in addition, some regions may be occluded, and target objects in these occluded regions cannot be tracked.
  • In view of this, the present disclosure provides a target tracking method in which the collection perspectives of the collection devices set in the target site are different, and each target object in the target site is simultaneously captured by at least two collection devices.
  • In this way, a comprehensive and accurate positioning of the target object in the target site can be completed, and the first position coordinates of the target object at the current moment can be obtained.
  • Further combining the higher-accuracy second position coordinates of the target object at the previous moment with the first position coordinates at the current moment, the second position coordinates of the target object at the current moment are accurately determined, thereby completing the tracking of the target object entering the target site.
  • The execution body of the target tracking method provided by the embodiments of the present disclosure is a computer device with computing capability; the computer device includes, for example, a server or other processing device.
  • the target tracking method may be implemented by a processor invoking computer-readable instructions stored in a memory.
  • the method includes steps S101-S103.
  • S101: Acquire video images at the current moment collected by multiple collection devices set in the target site; the multiple collection devices have different collection perspectives in the target site, and the video images include the region of interest of the target object in the target site.
  • The target site may be a site corresponding to the application scenario. For example, if employees in a factory need to be located, the target site can be the factory; if customers in a mall need to be located, the target site can be the shopping mall; if athletes in a gymnasium need to be located, the target site can be the gymnasium.
  • The region of interest is the region in the video image where the objects to be positioned in the target venue (such as the aforementioned employees, customers, and athletes) are located.
  • the acquisition device may be a monocular camera or a binocular camera.
  • Multiple acquisition devices can be set up in the target site.
  • The installation positions of the multiple collection devices can be determined according to the actual conditions of the target site.
  • The acquisition angles of the acquisition devices in the target site can be made different, so as to cover the entire area of the target site without dead angles.
  • However, too many acquisition devices will produce too many video images captured at the same moment, which will affect the efficiency of processing those video images. Therefore, when installing acquisition equipment in the target site, both the installation angle and the number of acquisition devices need to be considered.
  • In addition, each target object entering the target site should be captured by at least two acquisition devices at the same time, so that the multiple acquisition devices set up in the target site can completely capture the video images of the entire area of the target site at the current moment.
  • S102 Determine the first position coordinates of each target object at the current moment based on the video images at the current moment collected by multiple collection devices.
  • the target objects are objects that need to be positioned in the target site, such as the aforementioned employees, customers and athletes.
  • Taking the target site as a factory as an example, each employee in the factory is a target object. Since multi-view acquisition of the target object is considered when the acquisition devices are arranged, the same target object can be captured by at least two acquisition devices at the same time, yielding at least two video images, each of which contains a region of interest of the target object.
  • Again taking the target site as a factory as an example, suppose the number of employees in the factory is 2, that is, there are 2 target objects. Since the same employee needs to be captured by at least 2 capture devices, it can be assumed that there are 2 cameras in the factory. In this case, one possible situation is that at 9:00 am each of the two cameras captures a video image, for example video image 1 and video image 2, and each video image contains 2 regions of interest: in video image 1, region of interest 1 corresponding to employee 1 and region of interest 2 corresponding to employee 2; in video image 2, region of interest 3 corresponding to employee 1 and region of interest 4 corresponding to employee 2.
  • Since each target object in the target site is captured by at least two collection devices at the same time, for each target object, the video images at the current moment collected by the multiple collection devices that contain the region of interest of that target object can be used to determine the first position coordinates of the target object at the current moment.
  • the first position coordinates of the target object may refer to the position coordinates of the target object in a world coordinate system pre-built for the target location.
  • For example, the pixel coordinates, in the video image, of the region of interest in the video image at the current moment, together with the parameter information of the acquisition device that collected the video image, can be used to determine the initial position coordinates, in the target site, of the target object corresponding to the region of interest in the video image.
  • The initial position coordinates of the same target object, determined from the video images collected by different collection devices at the current moment, will differ somewhat.
  • the initial position coordinates belonging to the same target object may be fused, and then the first position coordinates of the target object at the current moment are determined according to multiple initial position coordinates associated with the same target object at the current moment.
  • the world coordinate system corresponding to the target site may take a fixed position in the target site as the coordinate origin to establish a unique world coordinate system.
  • S103: For each target object, determine the second position coordinates of the target object at the current moment based on the first position coordinates of the target object and the second position coordinates of the target object at the previous moment.
  • After the first position coordinates of the target object at the current moment are obtained, the positions of the same target object at different times can be temporally associated based on the second position coordinates of the target object in the target site at the previous moment and the first position coordinates of the target object in the target site at the current moment, thereby determining the second position coordinates of the target object in the target site at the current moment. The second position coordinates of each target object in the target site at different times are then associated to obtain the movement track of each target object in the target site.
  • Since the acquisition devices have certain errors, the first position coordinates of the target object at the current moment, determined from the video images collected by the acquisition devices, also have certain errors. Therefore, in the process of associating the positions of the same target object at different times, the first position coordinates of the target object are corrected to determine second position coordinates of higher accuracy at the current moment.
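The correction of the first position coordinates described above is not pinned to a specific filter in this disclosure. As a hedged illustration, the sketch below blends the second position coordinates at the previous moment with the observed first position coordinates at the current moment; the function name and the weight `alpha` are assumptions, not taken from the text (a full implementation might instead use a Kalman-style filter).

```python
def update_second_position(prev_second, current_first, alpha=0.7):
    """Return corrected second position coordinates at the current moment.

    `alpha` weights the new observation (first position coordinates)
    against the previous track state; its value is an assumption.
    """
    return tuple(alpha * obs + (1.0 - alpha) * prev
                 for prev, obs in zip(prev_second, current_first))

prev_second = (2.0, 3.0)    # second position coordinates, previous moment
current_first = (2.4, 3.2)  # first position coordinates, current moment
second_now = update_second_position(prev_second, current_first)
```

Associating `second_now` across successive moments yields the movement track described above.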
  • the manner of determining the second position coordinates of the target object at the previous moment is similar to the manner of determining the second position coordinates of the target object at the current moment. Therefore, the present disclosure mainly describes the process of determining the second position coordinates of the target object at the current moment.
  • In the embodiments of the present disclosure, the collection perspectives of the collection devices set in the target site are different, so that the target object in the target site can be positioned comprehensively and accurately, and the first position coordinates of the target object at the current moment can be obtained. Further combining the higher-accuracy second position coordinates of the target object at the previous moment with the first position coordinates at the current moment, the second position coordinates of the target object at the current moment are accurately determined, thereby completing the tracking of the target object entering the target site.
  • S201 Acquire pixel coordinates of a region of interest in a video image at the current moment collected by multiple collection devices respectively.
  • The region of interest of the target object in the video image at the current moment may be identified by a pre-trained neural network for target detection. Further, the pixel coordinates, in the image coordinate system corresponding to the video image, of a set position point in the region of interest can be read, and the pixel coordinates of the set position point can be used as the pixel coordinates of the region of interest.
  • Acquiring the pixel coordinates of the regions of interest in the video images at the current moment respectively collected by the multiple collection devices may include steps S2011 to S2012:
  • S2011: Input the multiple video images at the current moment into a pre-trained neural network to obtain the target detection frame in each video image; the neural network includes multiple target detection sub-networks for detecting regions of interest of target objects of different sizes, and the region where a target detection frame is located in the video image is a region of interest.
  • the neural network can detect the region of interest of each target object contained in the video picture at the current moment, and mark each target detection frame.
  • As shown in FIG. 3, which is a schematic diagram of the target detection frames contained in the video image at the current moment, the video image includes two target detection frames corresponding to regions of interest: the target detection frame A1B1C1D1 of region of interest 1 and the target detection frame A2B2C2D2 of region of interest 2.
  • A position point on the target detection frame of each region of interest can be extracted as the target position point.
  • For example, the midpoint of the bottom edge of the detection frame is extracted as the target position point, as shown in FIG. 3: the pixel coordinates of region of interest 1 are represented by the pixel coordinates of the midpoint K1 of the bottom edge D1C1 of the target detection frame A1B1C1D1, and the pixel coordinates of region of interest 2 are represented by the pixel coordinates of the midpoint K2 of the bottom edge D2C2 of the target detection frame A2B2C2D2.
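The bottom-edge midpoint described above (such as K2 on edge D2C2 of frame A2B2C2D2) can be computed directly from a detection frame. A minimal sketch, assuming boxes are given as `(x_min, y_min, x_max, y_max)` in image coordinates with y increasing downward; the numeric values are illustrative, not from the text.

```python
def bottom_edge_midpoint(box):
    """Midpoint of the bottom edge of a detection frame.

    `box` is assumed to be (x_min, y_min, x_max, y_max) in image
    coordinates with y increasing downward, so the bottom edge of the
    frame lies at y_max.
    """
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, float(y_max))

# Detection frame of a region of interest (illustrative values):
k2 = bottom_edge_midpoint((100, 50, 180, 260))  # -> (140.0, 260.0)
```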
  • the neural network used in the embodiments of the present disclosure may include multiple target detection sub-networks for detecting regions of interest of target objects of different sizes.
  • For example, the neural network can be a feature pyramid network.
  • Each target detection sub-network in the feature pyramid network is used to detect and identify regions of interest of target objects of the size corresponding to that sub-network in the video image at the current moment.
  • Because the neural network includes multiple target detection sub-networks for detecting regions of interest of target objects of different sizes, when the neural network is used to perform target detection on a video image, regions of interest of target objects of different sizes in the same video image can be accurately detected.
  • The parameter information of each acquisition device may include a homography matrix of the acquisition device, where the homography matrix represents the transformation relationship between the image coordinate system corresponding to the video image at the current moment collected by the acquisition device and the world coordinate system corresponding to the target site where the acquisition device is located. In this way, after the pixel coordinates of the region of interest in the image coordinate system corresponding to the video image at the current moment are obtained, the position coordinates of the region of interest in the world coordinate system corresponding to the target site can be determined according to the parameter information of the acquisition device, and these position coordinates are taken as the initial position coordinates, in the world coordinate system, of the target object corresponding to the region of interest.
  • Determining the initial position coordinates of the target object in the target site at the current moment may include the following S2021 to S2022:
  • The internal parameter matrix of the acquisition device contains (f_x, f_y), representing the focal length of the acquisition device, and (c_x, c_y), representing the pixel coordinates, in the image coordinate system, of the center point of the video image captured by the acquisition device at the current moment.
  • The distortion parameters of the acquisition device include radial distortion coefficients and tangential distortion coefficients.
  • The internal parameter matrix and distortion parameters of each acquisition device may be determined in advance using Zhang's checkerboard calibration method. For example, multiple checkerboard images can be captured from different angles, the feature points in the images detected, and the internal parameter matrix and distortion parameters of the acquisition device solved from the pixel coordinates of these feature points in the checkerboard images; the internal parameter matrix and distortion parameters are then optimized iteratively.
  • During optimization, the same pixel coordinate can be corrected using the internal parameter matrix and distortion parameters obtained in two adjacent iterations, and whether to end the optimization is decided by the difference between the two corrected pixel coordinates. For example, once the difference no longer decreases, the optimization can be ended to obtain the internal parameter matrix and distortion parameters of the acquisition device.
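The disclosure does not spell out the pixel-correction step itself. As a hedged sketch, the function below corrects a pixel coordinate given the internal parameters (f_x, f_y, c_x, c_y) and distortion coefficients (radial k1, k2; tangential p1, p2), using the standard Brown-Conrady model inverted by fixed-point iteration, which is one common implementation choice; all numeric values are illustrative assumptions.

```python
def undistort_pixel(u, v, fx, fy, cx, cy, k1, k2, p1, p2, iters=10):
    """Correct a distorted pixel coordinate (u, v)."""
    # pixel coordinates -> normalized image coordinates
    x = (u - cx) / fx
    y = (v - cy) / fy
    x0, y0 = x, y
    for _ in range(iters):  # invert the distortion model by fixed point
        r2 = x * x + y * y
        radial = 1.0 + k1 * r2 + k2 * r2 * r2
        dx = 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
        dy = p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y
        x = (x0 - dx) / radial
        y = (y0 - dy) / radial
    # normalized coordinates -> corrected pixel coordinates
    return x * fx + cx, y * fy + cy

# With zero distortion the correction leaves the pixel unchanged:
u_c, v_c = undistort_pixel(400.0, 300.0, 800.0, 800.0, 320.0, 240.0, 0, 0, 0, 0)
```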
  • the homography matrix may represent the conversion relationship between the image coordinate system corresponding to the video frame at the current moment collected by the collecting device and the world coordinate system corresponding to the target location where the collecting device is located.
  • The homography matrix can also be determined when the acquisition device is calibrated in advance. For example, a sample video image containing multiple markers can be collected by the acquisition device, and the world coordinates, in the world coordinate system corresponding to the target site, of the intersections of the multiple markers with the ground (the plane in which the X and Y axes of the world coordinate system lie) can be measured. The corrected pixel coordinates corresponding to these intersections in the sample video image are then determined according to the method above. Further, the homography matrix of the acquisition device may be determined based on the corrected pixel coordinates and world coordinates corresponding to the multiple markers.
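The disclosure does not name an algorithm for computing the homography from the marker correspondences. The standard Direct Linear Transform (DLT) over four or more pixel/world point pairs is one common choice and is sketched here under that assumption; H maps homogeneous pixel coordinates to homogeneous ground-plane world coordinates, and the point values are illustrative.

```python
import numpy as np

def estimate_homography(pixel_pts, world_pts):
    """Estimate a 3x3 homography H from >= 4 correspondences via DLT."""
    rows = []
    for (u, v), (X, Y) in zip(pixel_pts, world_pts):
        # From X = (h11*u + h12*v + h13) / (h31*u + h32*v + h33), etc.
        rows.append([u, v, 1, 0, 0, 0, -X * u, -X * v, -X])
        rows.append([0, 0, 0, u, v, 1, -Y * u, -Y * v, -Y])
    _, _, Vt = np.linalg.svd(np.asarray(rows, dtype=float))
    H = Vt[-1].reshape(3, 3)  # null-space vector = flattened H
    return H / H[2, 2]        # fix the scale so H[2, 2] == 1

# Four marker/ground intersections (illustrative values):
pixel_pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
world_pts = [(2.0, 3.0), (3.0, 3.0), (2.0, 4.0), (3.0, 4.0)]
H = estimate_homography(pixel_pts, world_pts)
```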
  • When determining the initial position coordinates of the target object corresponding to a region of interest in a video image, the position coordinates of the region of interest in the world coordinate system corresponding to the target site can be obtained based on the corrected pixel coordinates of the region of interest and the homography matrix of the acquisition device that collected the video image at the current moment, and these position coordinates are determined as the initial position coordinates of the target object corresponding to the region of interest.
  • In the embodiments of the present disclosure, the pixel coordinates are first corrected based on the internal parameter matrix and distortion coefficients of the acquisition device that captured the video image, so that corrected pixel coordinates of higher accuracy can be obtained, further improving the accuracy of the resulting initial position coordinates, in the target site, of the target object corresponding to the region of interest.
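The mapping described above can be sketched as applying the acquisition device's homography to the corrected pixel coordinates to obtain the initial position coordinates on the ground plane of the world coordinate system; the matrix and pixel values below are illustrative assumptions.

```python
import numpy as np

def pixel_to_world(H, u, v):
    """Map corrected pixel coordinates (u, v) to world coordinates via H."""
    w = H @ np.array([u, v, 1.0])    # homogeneous transform
    return w[0] / w[2], w[1] / w[2]  # dehomogenize to (X, Y)

H = np.array([[0.01, 0.0, -1.0],
              [0.0, 0.01, -2.0],
              [0.0, 0.0, 1.0]])      # illustrative homography
X, Y = pixel_to_world(H, 300.0, 400.0)  # approximately (2.0, 2.0)
```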
  • the initial position coordinates of the same target object may be fused to obtain the first position coordinates of the target object.
  • the pixel coordinates of the region of interest in the video picture can be determined first, and then the initial position coordinates of the target object corresponding to the region of interest in the target place can be obtained according to the parameter information of the acquisition device.
  • the initial position coordinates belonging to the same target object among the initial position coordinates in different video pictures are further fused to obtain the first position coordinates with higher accuracy of the target object.
  • Each target object is captured by at least two capture devices at the same time. For each target object captured by different capture devices at the same time, the parameter information of each capture device has a certain error, and the errors differ between devices; therefore, the initial position coordinates of the same target object determined from the video images at the current moment of different devices may differ. Before fusing the initial position coordinates of the same target object, the multiple initial position coordinates associated with the same target object must first be determined.
  • When fusing, if a target object is associated with, for example, three initial position coordinates, the first two may be fused first to obtain a fused initial position coordinate, which is then fused with the third initial position coordinate; the position coordinate obtained by the final fusion is used as the first position coordinate of the target object.
  • In the embodiments of the present disclosure, the initial position coordinates of the same target object collected by multiple collection devices can be fused; thus, first position coordinates of higher accuracy can be obtained for the target object.
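The pairwise fusion described above (fuse the first two initial position coordinates, then fuse the result with the next, and so on) can be sketched as follows. The disclosure does not define the fusion operation itself; simple averaging of each pair is assumed here for illustration, and the coordinate values are made up.

```python
def fuse_pair(p, q):
    """Fuse two 2D position coordinates; plain averaging is assumed."""
    return ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)

def fuse_initial_coordinates(coords):
    """Fold a list of initial position coordinates into one, pairwise."""
    fused = coords[0]
    for c in coords[1:]:
        fused = fuse_pair(fused, c)  # fuse result with the next coordinate
    return fused

# Three initial position coordinates associated with one target object:
first_position = fuse_initial_coordinates([(1.0, 1.0), (1.2, 0.8), (1.1, 1.1)])
```

Note that sequential pairwise fusion weights later coordinates more heavily than a plain mean would; whether that is intended is not specified in the text.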
  • For any two video images, the target object corresponding to each region of interest in the first video image is called a first target object, and the target object corresponding to each region of interest in the second video image is called a second target object. For the initial position coordinates of each first target object, a second distance between the initial position coordinates of the first target object and the initial position coordinates of each second target object at the current moment is determined;
  • the initial position coordinates of the first target object and the initial position coordinates of the second target object having the minimum second distance to the first target object are taken as multiple initial position coordinates associated with the same target object, where the minimum second distance must be less than the second preset fusion distance threshold.
  • Suppose A collection devices are set up in the target site, and the video images at the current moment collected by the A collection devices at the same time all contain the region of interest of at least one target object; then A groups of initial position coordinates can be obtained at this moment.
  • Each group is the set of initial position coordinates of the target objects corresponding to the regions of interest in the video image captured by the N-th collection device at the current moment, where N is an integer greater than or equal to 1 and less than or equal to A.
  • Suppose S1 includes the initial position coordinates of a first target objects (also referred to as first initial position coordinates), and S2 includes the initial position coordinates of b second target objects (also referred to as second initial position coordinates).
  • the Euclidean distance between each first initial position coordinate and each second initial position coordinate can be determined to obtain a distance matrix:
  • In the matrix, d_11 represents the second distance between the 1st first initial position coordinate in S1 and the 1st second initial position coordinate in S2; d_1b represents the second distance between the 1st first initial position coordinate in S1 and the b-th second initial position coordinate in S2; d_ij represents the second distance between the i-th first initial position coordinate in S1 and the j-th second initial position coordinate in S2; d_a1 represents the second distance between the a-th first initial position coordinate in S1 and the 1st second initial position coordinate in S2; and d_ab represents the second distance between the a-th first initial position coordinate in S1 and the b-th second initial position coordinate in S2.
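The distance matrix above can be computed directly: element d_ij is the Euclidean distance between the i-th first initial position coordinate in S1 and the j-th second initial position coordinate in S2. A minimal sketch with made-up coordinate values:

```python
import numpy as np

def distance_matrix(S1, S2):
    """Matrix of Euclidean (second) distances between S1 and S2."""
    S1 = np.asarray(S1, dtype=float)          # shape (a, 2)
    S2 = np.asarray(S2, dtype=float)          # shape (b, 2)
    diff = S1[:, None, :] - S2[None, :, :]    # shape (a, b, 2)
    return np.sqrt((diff ** 2).sum(axis=-1))  # shape (a, b)

S1 = [(0.0, 0.0), (5.0, 0.0)]
S2 = [(0.0, 3.0), (5.0, 4.0)]
D = distance_matrix(S1, S2)  # D[0, 0] = 3.0, D[1, 1] = 4.0
```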
  • Multiple initial position coordinates associated with the same target object in S1 and S2 can be determined in the following manner, including S30121 to S30124:
  • S30121: Find the current minimum second distance in the current distance matrix; the elements of the current distance matrix are the Euclidean distances between the initial position coordinates of each first target object in S1 and the initial position coordinates of each second target object in S2.
  • S30122 Determine whether the current minimum second distance is less than a second preset fusion distance threshold.
  • The second preset fusion distance threshold may be set empirically. For example, the same target object is photographed in advance by different collection devices, multiple initial position coordinates of the same target object in the target site are determined from the video images collected by the different devices, and the second preset fusion distance threshold is determined according to the distances between these initial position coordinates.
  • S30123: If the current minimum second distance is less than the second preset fusion distance threshold, the initial position coordinates of the first target object in S1 and of the second target object in S2 that attain the current minimum second distance can be taken as initial position coordinates associated with the same target object.
  • S30124: In the current distance matrix, set the current minimum second distance, and all other second distances computed from either of the two initial position coordinates associated with it, to the second preset fusion distance threshold; then return to S30121, until the current minimum second distance in the current distance matrix is greater than or equal to the second preset fusion distance threshold, at which point all initial position coordinates in S1 and S2 associated with the same target object have been obtained.
• for example, the current distance matrix is calculated from the initial position coordinates in S1 and S2 and is specifically the following 3×3 matrix:
• D = [[d_11, d_12, d_13], [d_21, d_22, d_23], [d_31, d_32, d_33]]
• the second preset fusion distance threshold is d_th. Assuming that d_11 is the minimum distance in the current matrix and is less than d_th, the 1st first initial position coordinate in S1 and the 1st second initial position coordinate in S2 are the initial position coordinates associated with the same target object. In the current distance matrix, all other distances calculated from either of these two initial position coordinates are d_12, d_13, d_21, and d_31. Therefore, according to S30124, d_11, d_12, d_13, d_21, and d_31 in the current matrix need to be set to d_th; the resulting matrix is:
• D' = [[d_th, d_th, d_th], [d_th, d_22, d_23], [d_th, d_32, d_33]]
• after the current minimum second distance in the current distance matrix, and all other distances calculated from either of the two initial position coordinates associated with it, are set to the second preset fusion distance threshold, the elements set to the threshold can be excluded when searching for the next minimum, thereby improving search efficiency.
• after the multiple initial position coordinates associated with the same target object in S1 and S2 are obtained, it can continue to be determined, based on any two other video images at the current moment, whether there are initial position coordinates associated with the same target object. After the video images of the current moment collected at the same time by the A collection devices have all been judged, the different initial position coordinates of each target object involved in those video images can be obtained. The initial position coordinates associated with the same target object are then fused to obtain the first position coordinates of each target object in the target site from the A video images captured by the A collection devices at the same time.
• coordinate fusion can be performed on the multiple initial position coordinates to obtain the updated initial position coordinates of the same target object. For the initial position coordinates to be fused in S1 and S2, a set S2' can be formed from the updated initial position coordinates. A new current distance matrix is then formed from the initial position coordinates in S2' and S3, and steps S30121 to S30124 are repeated to obtain the multiple initial position coordinates associated with the same target object in S2' and S3; S3' is obtained in the same way. A new current distance matrix is further formed from the initial position coordinates in S3' and S4, and steps S30121 to S30124 are repeated, until fusion with the initial position coordinates in the last element (that is, SA) of the initial coordinate set is completed and the first position coordinates of each target object in the target site in the A video images captured by the A collection devices at the same time are obtained.
• if any initial position coordinate is detected never to have participated in fusion from beginning to end, then, considering that each target object in the target site is simultaneously captured by at least two collection devices, that initial position coordinate can be treated as an erroneous initial position coordinate and filtered out.
• in this way, the initial position coordinates associated with the same target object can be quickly determined, providing a basis for the subsequent determination of the first position coordinates of each target object.
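The greedy matching loop of S30121 to S30124 can be sketched in Python as follows. This is a minimal illustration, not the disclosed implementation: the function name `greedy_associate`, the use of NumPy, and the assumption of 2-D position coordinates are all choices made here for clarity.

```python
import numpy as np

def greedy_associate(coords_a, coords_b, d_th):
    """Greedily pair coordinates from two camera views whose mutual
    Euclidean (second) distance is below the fusion threshold d_th."""
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    # Current distance matrix: entry (i, j) is the second distance d_ij.
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    pairs = []
    while dist.size and dist.min() < d_th:
        # Current minimum second distance and its row/column indices.
        i, j = np.unravel_index(np.argmin(dist), dist.shape)
        pairs.append((i, j))
        # Exclude the matched row and column from further search by
        # setting their distances to the threshold, as in S30124.
        dist[i, :] = d_th
        dist[:, j] = d_th
    return pairs
```

Coordinates left unmatched at the end would correspond to the "to be fused" coordinates discussed below, or, if a coordinate is never matched against any view, to an erroneous detection that can be filtered out.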
• here, the initial position coordinates to be fused refer to the initial position coordinates that have not yet participated in fusion.
  • the steps include:
• for example, the multiple initial position coordinates associated with target object A include N coordinates. Any one of them may be used as the first intermediate fused position coordinate, and the midpoint of that coordinate and any other initial position coordinate to be fused is determined. The midpoint is then used as the updated first intermediate fused position coordinate and continues to be fused with any other initial position coordinate to be fused, until none of the N initial position coordinates remains to be fused, at which point the first position coordinates of target object A are obtained.
  • multiple initial position coordinates associated with the same target object may be fused in a manner of taking midpoints in sequence, so as to obtain first position coordinates with higher accuracy.
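The sequential-midpoint fusion described above can be sketched as follows; a minimal illustration assuming 2-D coordinates, with the name `fuse_midpoints` chosen here for illustration.

```python
def fuse_midpoints(coords):
    """Sequentially fuse N coordinates of one target object by repeatedly
    taking the midpoint of the running fused coordinate and the next
    coordinate still to be fused."""
    fused = list(coords[0])          # first intermediate fused position
    for x, y in coords[1:]:
        # Midpoint becomes the updated intermediate fused position.
        fused = [(fused[0] + x) / 2.0, (fused[1] + y) / 2.0]
    return fused
```

Note that sequential midpoints weight later coordinates more heavily than a plain average; the disclosure specifies the midpoint scheme, so that behavior is reproduced here.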
  • a Kalman filter may be introduced here to determine the second position coordinates of the target object at the current moment by means of Kalman filtering.
  • the predicted position coordinates refer to the position coordinates of the target object at the current moment that can be predicted based on the second position coordinates of the previous moment.
  • the coordinates of the observation position may be determined according to the video image at the current moment collected by the collection device, such as the first position coordinates of the target object at the current moment determined above.
  • the embodiment of the present disclosure proposes to jointly determine the observation position coordinates by combining the predicted position coordinates and the first position coordinates determined based on the video images collected by the collection device at the current moment. Finally, the observed position coordinates and the predicted position coordinates can be combined to obtain the second position coordinates of the target object at the current moment.
• the predicted position coordinates of the target object at the current moment can be determined according to the following formula (1):
• Trk(t|t-1) = A·Trk(t-1|t-1) + B·u(t-1)    (1)
• where Trk(t|t-1) indicates the predicted position coordinates of the target object at the current moment, determined according to the second position coordinates of the target object at the previous moment; Trk(t-1|t-1) indicates the second position coordinates of the target object at the previous moment; A and B represent parameter matrices of the Kalman filter, where A represents the state transition matrix; and u(t-1) represents the control input of the Kalman filter at the previous moment, which can be 0.
  • the covariance matrix of the observation position coordinates of the target object at the current moment can be determined according to the following formula (2):
• P(t|t-1) = A·P(t-1|t-1)·A^T + Q    (2)
• where P(t|t-1) represents the covariance matrix of the observed position coordinates of the target object at the current moment, which can represent the uncertainty of those coordinates; P(t-1|t-1) represents the covariance matrix of the second position coordinates of the target object at the previous moment, which can represent the uncertainty of those coordinates; and Q represents the system process covariance matrix introduced by the Kalman filter, used to represent the error of the state transition matrix relative to the actual process.
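Formulas (1) and (2) together amount to the standard Kalman prediction step. A minimal sketch, assuming NumPy arrays and hypothetical names (`kalman_predict`, `x_prev`, `P_prev`) not taken from the disclosure:

```python
import numpy as np

def kalman_predict(x_prev, P_prev, A, Q, B=None, u=None):
    """Kalman prediction step.
    Formula (1): x_pred = A x_prev + B u
    Formula (2): P_pred = A P_prev A^T + Q
    """
    x_pred = A @ x_prev
    if B is not None and u is not None:   # control input, may be 0/absent
        x_pred = x_pred + B @ u
    P_pred = A @ P_prev @ A.T + Q
    return x_pred, P_pred
```

With a constant-position model (A equal to the identity), the prediction keeps the previous second position coordinates and simply inflates the covariance by Q, matching the intuition that uncertainty grows between observations.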
• after the predicted position coordinates of the target object at the current moment are obtained, they can be combined with the first position coordinates to determine the observed position coordinates of the target object at the current moment, which will be described in detail later.
  • the second position coordinates of the target object at the current moment can be determined according to the following formula (3) in the Kalman filter formula:
• Trk(t|t) = Trk(t|t-1) + K_g(t)·(Z(t) - H·Trk(t|t-1))    (3)
• where Trk(t|t) represents the second position coordinates of the target object at the current moment; Z(t) represents the observed position coordinates of the target object at the current moment; and K_g(t) represents the filter gain matrix in the Kalman filter.
• the filter gain matrix can be determined by the following formula (4):
• K_g(t) = P(t|t-1)·H^T·(H·P(t|t-1)·H^T + R)^(-1)    (4)
• where H represents the observation matrix, a parameter matrix in the Kalman filter, and R represents the known measurement noise covariance in the Kalman filter.
• in addition, the covariance matrix of the observed position coordinates of the target object at the next moment can be determined based on this covariance matrix, so as to prepare for determining the second position coordinates of the target object at the next moment.
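Formulas (3) and (4) form the standard Kalman update step. A minimal sketch under the same assumptions as the prediction sketch (NumPy, hypothetical names chosen here):

```python
import numpy as np

def kalman_update(x_pred, P_pred, z, H, R):
    """Kalman update step.
    Formula (4): K = P_pred H^T (H P_pred H^T + R)^-1
    Formula (3): x_new = x_pred + K (z - H x_pred)
    Also propagates the covariance for use at the next moment.
    """
    S = H @ P_pred @ H.T + R                 # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)      # filter gain matrix K_g(t)
    x_new = x_pred + K @ (z - H @ x_pred)    # second position coordinates
    P_new = (np.eye(P_pred.shape[0]) - K @ H) @ P_pred
    return x_new, P_new
```

When prediction and measurement are equally uncertain (P_pred = R with H the identity), the gain is 0.5 and the result is the midpoint of prediction and observation, which is consistent with the midpoint-based observation fusion described elsewhere in this disclosure.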
• when the target object does not have second position coordinates at the previous moment, the first position coordinates of the target object at the current moment may be directly determined as its second position coordinates at the current moment.
• in this way, the predicted position coordinates of the target object at the current moment can be determined according to its second position coordinates at the previous moment and, further combined with its first position coordinates at the current moment, the second position coordinates of the target object at the current moment can be obtained with higher accuracy.
  • S4022 Determine the predicted position coordinates associated with the same target object and the first midpoint coordinates of the first position coordinates, and use the first midpoint coordinates as the observed position coordinates of the target object at the current moment.
  • the predicted position coordinates of the N target objects at the current moment can be obtained.
  • the first position coordinates of the M target objects in the target place at the current moment can be obtained.
  • the predicted position coordinates and the first position coordinates associated with the same target object may be determined by a distance-based greedy algorithm. Then, the midpoint coordinates of the predicted position coordinates associated with the same target object and the first position coordinates can be further used as the observed position coordinates of the same target object at the current moment.
  • N may be greater than or equal to M.
• when N is greater than M, there may be a target object that is missed in the video images at the current moment collected by the collection devices. For example, the video image of a certain target object cannot be captured because of occlusion by an obstacle in the target site, so missed detections may occur when the first position coordinates of target objects in the target site are determined based on the video images at the current moment.
  • the observed position coordinates of the target object at the current moment can be determined by the predicted position coordinates of the target object.
• determining the observed position coordinates of the target object at the current moment based on the predicted position coordinates and the first position coordinates of the target object at the current moment may include: determining, based on the multiple first position coordinates of the target objects at the current moment and the predicted position coordinates of the target object at the current moment, the first position coordinates of that target object; and determining the first midpoint coordinates of the predicted position coordinates of the target object and its first position coordinates, and taking the first midpoint coordinates as the observed position coordinates of the target object at the current moment.
• S40212: Take the predicted position coordinates and the first position coordinates having the minimum first distance from those predicted position coordinates as the predicted position coordinates and first position coordinates associated with the same target object, where the minimum first distance is smaller than the first preset fusion distance threshold.
• for example, the current moment includes N predicted position coordinates and M observed position coordinates, and the Euclidean distance between each predicted position coordinate and each observed position coordinate is determined according to the N predicted position coordinates and M observed position coordinates, where:
• l_11 represents the first distance between the 1st predicted position coordinate among the N predicted position coordinates and the 1st observed position coordinate among the M observed position coordinates;
• l_1M represents the first distance between the 1st predicted position coordinate and the M-th observed position coordinate;
• l_nm represents the first distance between the n-th predicted position coordinate and the m-th observed position coordinate;
• l_N1 represents the first distance between the N-th predicted position coordinate and the 1st observed position coordinate;
• l_NM represents the first distance between the N-th predicted position coordinate and the M-th observed position coordinate.
  • predicted position coordinates and the first position coordinates associated with the same target object may be determined according to the above-mentioned method of determining multiple initial position coordinates associated with the same target object, and the specific process will not be repeated here.
• determining the first position coordinates of the target object based on the multiple first position coordinates of the target objects at the current moment and the predicted position coordinates of the target object at the current moment may include: determining the first distance between the predicted position coordinates of the target object and each of the first position coordinates; and taking, among the multiple first position coordinates, the first position coordinate with the smallest first distance from the predicted position coordinates of the target object as the first position coordinate associated with that target object, where the minimum first distance is smaller than the first preset fusion distance threshold.
• by combining the position coordinates of the target object at the current moment predicted from its position coordinates at historical moments with the first position coordinates determined from the video images collected by the collection devices at the current moment, the position coordinates of the same target object at different moments can, on the one hand, be obtained quickly and, on the other hand, observed position coordinates with high accuracy can be obtained.
  • the target tracking method provided by the embodiment of the present disclosure further includes the following S501 to S502:
  • the predicted position coordinates of the missed target object at the current moment are taken as the observed position coordinates of the missed target object at the current moment.
• for example, target object A is occluded in the video images at the current moment collected by capture device 1 and capture device 2 among the multiple capture devices. In that case, the first position coordinates of target object A are marked as empty, and target object A is regarded as a missed target object. The second position coordinates of target object A at historical moments are then used: since target object A was captured by the collection devices while entering the target site, its second position coordinates at historical moments can be determined, so that the predicted position coordinates of target object A at the current moment can be determined by Kalman filtering. If the first position coordinates of target object A at the current moment are empty, its predicted position coordinates at the current moment can be directly used as its observed position coordinates at the current moment.
• in this way, when an occluded target object exists in the video images at the current moment collected by the collection devices, the observed position coordinates of the occluded target object at the current moment may be determined based on its second position coordinates at historical moments, so that second position coordinates with higher accuracy can be determined for the target object at the current moment.
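The fallback logic above, that a missed or occluded target uses its prediction directly while a detected target uses the midpoint of prediction and detection, can be sketched as follows. `None` stands in for an empty first position coordinate; the function name is chosen here for illustration.

```python
def observation_for(first_pos, predicted_pos):
    """Return the observed position coordinates for one target at the
    current moment: the Kalman prediction when the detector missed the
    target, otherwise the midpoint of prediction and detection."""
    if first_pos is None:            # missed / occluded target object
        return tuple(predicted_pos)
    return ((first_pos[0] + predicted_pos[0]) / 2.0,
            (first_pos[1] + predicted_pos[1]) / 2.0)
```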
  • the target object includes multiple objects.
  • the target tracking method provided by the embodiment of the present disclosure further includes the following S601 to S602:
  • a capturing device for capturing images of employees may be set at the entrance of the factory.
  • Feature extraction is performed based on the collected employee images, for example, facial features and/or human body features in the employee images are extracted.
  • the identity of each employee entering the factory is determined based on the extracted characteristic information and the pre-stored characteristic information of each employee in the employee identity database.
  • the identity identifier associated with the target object may be marked in the map position indicated by the second position coordinates. Then, by connecting the second position coordinates of multiple moments with the same identity identifier, the movement trajectories of different target objects in the map can be obtained.
• the map may be a pre-built high-precision map.
• the pre-built high-precision map has a corresponding relationship with the target site, and the two can be presented 1:1 in the same coordinate system. Therefore, based on the second position coordinates, at multiple moments, of the target objects marked with the same identity identifier, trajectory data representing the movement trajectory of each target object in the target site can be generated.
  • the movement trajectory of each target object in the target place can be quickly determined according to the identity identifier of the target object and the second position coordinates at different times.
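Connecting the second position coordinates that share an identity identifier across moments can be sketched as a simple grouping step; the data layout (per-frame lists of `(identity, position)` records) and the function name are assumptions made here for illustration.

```python
def build_trajectories(frames):
    """Group per-frame (identity, (x, y)) records into per-identity
    trajectories ordered by moment index, i.e. the movement trajectory
    of each target object in the map."""
    tracks = {}
    for t, records in enumerate(frames):
        for ident, pos in records:
            tracks.setdefault(ident, []).append((t, pos))
    return tracks
```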
• clustering these target objects can form a target group. Errors may occur when marking identity identifiers against the second position coordinates of target objects in the same target group; for example, the identity identifier of target object A in the target group is marked to target object B and that of target object B to target object A, that is, a serial-number (identity switch) problem occurs.
• when target object A and target object B belong to the same target group and a serial-number problem occurs, the target objects involved are relatively close to each other, so the impact on the trajectory data is small.
  • the target tracking method further includes the following S701 to S703:
  • S701 based on the second position coordinates of the multiple target objects at the current moment, detect whether there are target objects that deviate from the target group; the target group is obtained by clustering according to the second position coordinates of the multiple target objects at the previous moment.
  • the second position coordinates of the multiple target objects at the previous moment can be clustered to obtain the target group.
  • the distance of the second position coordinates between different target objects in the target group is less than a preset distance threshold for entering the target group.
• when detecting whether the identity identifier associated with the target object that deviates from the target group is accurate, the method includes:
  • a video picture of the current moment of the target object deviating from the target group is acquired. Based on the video picture at the current moment, the feature information of the target object deviating from the target group is extracted. Based on the feature information and the pre-stored identity identifier of the target object and the corresponding feature information, it is determined whether the identity identifier associated with the target object deviating from the target group is accurate. For example, the similarity between the feature information of the currently extracted target object that deviates from the target group and the pre-marked feature information associated with the identity identifier of the target object can be determined.
  • the pre-marked identity identifier of the target object is 001.
• if the similarity between the feature information of the deviating target object extracted from the video image at the current moment and the feature information associated with identity identifier 001 is less than the preset similarity threshold, it is determined that the identity identifier 001 of the target object is inaccurate.
• when the identity identifier of the target object deviating from the target group is inaccurate, the identity identifier of that target object can be re-determined based on the extracted feature information of the deviating target object and the pre-stored feature information of each employee in the employee identity database.
• in this way, the identity identifier of a target object leaving the target group is re-verified, which can improve the accuracy of the identity identifiers marked for the target object at different moments, thereby improving the accuracy of the target object's trajectory data.
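The re-verification step can be sketched as follows. The disclosure only specifies comparing extracted feature information against pre-stored feature information under a similarity threshold; cosine similarity, the function name, and the dictionary layout of the identity database are assumptions chosen here for illustration.

```python
import math

def verify_identity(features, stored, threshold=0.8):
    """Cosine-compare the deviating target's extracted feature vector
    against each enrolled identity; return the best-matching identity
    identifier, or None if no similarity reaches the threshold."""
    def cosine(u, v):
        num = sum(a * b for a, b in zip(u, v))
        den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return num / den if den else 0.0

    best_id, best_sim = None, threshold
    for ident, vec in stored.items():
        sim = cosine(features, vec)
        if sim >= best_sim:
            best_id, best_sim = ident, sim
    return best_id
```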
  • the target tracking method proposed in the embodiment of the present disclosure can accurately determine the second position coordinates of each target object in the target place at the current moment, and this method can be applied to various application scenarios. Taking the application in a factory as an example, after obtaining the second position coordinates of the target object in the target place, as shown in FIG. 9 , the positioning method provided by the embodiment of the present disclosure further includes the following S801 to S802:
• for example, a coordinate range corresponding to a dangerous target area in the factory may be set in advance in the world coordinate system corresponding to the target site. Whether any target object has entered the target area is then determined according to the second position coordinates of each target object in the target site at the current moment and the coordinate range of the target area. Further, when it is determined that a target object has entered the target area, an early warning prompt is issued.
  • the early warning prompts may include, but are not limited to, sound and light alarm prompts, voice alarm prompts, and the like. Through the early warning prompts, the safety of employees in the target site can be guaranteed and the safety of the target site can be improved.
• in this way, whether a target object in the target site has entered a preset target area, such as a preset dangerous area, can be determined, so as to issue an early warning prompt and improve the safety of the target site.
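The danger-area check described above can be sketched as a point-in-region test. Axis-aligned rectangular zones and the function name are assumptions made here; the disclosure only requires a preset coordinate range and a warning when a second position coordinate falls inside it.

```python
def check_danger_zones(positions, zones):
    """Return identifiers of target objects whose second position
    coordinates fall inside any danger zone, where each zone is an
    axis-aligned rectangle (x_min, y_min, x_max, y_max)."""
    alarms = []
    for ident, (x, y) in positions.items():
        for (x0, y0, x1, y1) in zones:
            if x0 <= x <= x1 and y0 <= y <= y1:
                alarms.append(ident)   # target entered the target area
                break
    return alarms
```

The returned identifiers could then trigger the sound-and-light or voice alarm prompts mentioned above.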
  • the target tracking process provided by the embodiment of the present disclosure is described by taking the target site as a factory and the target object as an employee as an example:
  • the pixel coordinates of the employees involved in the video picture are corrected to obtain the corrected pixel coordinates of the employees involved in the video picture;
• the employee's associated identity identifier can be marked at the map position indicated by the employee's second position coordinates; by connecting the second position coordinates at successive moments, the trajectory data of each employee is generated.
• the writing order of the steps does not imply a strict execution order or constitute any limitation on the implementation process; the specific execution order of each step should be determined by its function and possible internal logic.
• the embodiment of the present disclosure also provides a target tracking apparatus corresponding to the target tracking method. Reference may be made to the implementation of the method, and repeated descriptions will not be given.
  • the target tracking apparatus includes:
  • the acquisition module 901 is used to acquire the video images at the current moment collected by multiple collection devices set in the target site; the collection perspectives of the multiple collection devices in the target site are different, and the video images include the region of interest of the target object in the target site ;
  • a determination module 902 configured to determine the first position coordinates of each of the target objects at the current moment based on the video images at the current moment collected by multiple collection devices;
  • the tracking module 903 is configured to, for each of the target objects, determine the second position coordinates of the target object at the current moment based on the first position coordinates of the target object and the second position coordinates of the target object at the previous moment.
• in some embodiments, when the tracking module 903 is configured to determine, for each target object, the second position coordinates of the target object at the current moment based on the first position coordinates of the target object and its second position coordinates at the previous moment, the operation includes:
  • the second position coordinates of the target object at the current moment are determined.
• in some embodiments, when the tracking module 903 is configured to determine the observed position coordinates of the target object at the current moment based on the predicted position coordinates and the first position coordinates of the target object at the current moment, the operation includes:
• in some embodiments, when the tracking module 903 is configured to determine the predicted position coordinates and first position coordinates associated with the same target object based on the predicted position coordinates and first position coordinates of the multiple target objects at the current moment, the operation includes:
• taking the predicted position coordinates and the first position coordinates with the minimum first distance from those predicted position coordinates as the predicted position coordinates and first position coordinates associated with the same target object, where the minimum first distance is smaller than the first preset fusion distance threshold.
  • the tracking module 903 is further configured to:
  • the predicted position coordinates of the missed target object at the current moment are taken as the observed position coordinates of the missed target object at the current moment.
  • the target object includes multiple objects
  • the tracking module 903 is further configured to:
  • the trajectory data of each target object is generated.
  • the tracking module 903 is further configured to:
  • the target group is obtained by clustering according to the second position coordinates of the multiple target objects at the previous moment;
• when the identity identifier of a target object deviating from the target group is inaccurate, the identity identifier associated with that target object is corrected.
  • the method includes:
• in some embodiments, when the determining module 902 is configured to determine the first position coordinates of each target object at the current moment based on the video images at the current moment collected by multiple collection devices, the operation includes:
• determining the initial position coordinates, in the target site at the current moment, of the target object corresponding to the region of interest in the video picture;
  • the initial position coordinates belonging to the same target object in the initial position coordinates are fused to obtain the first position coordinates of the target object in the target place at the current moment.
  • the method includes:
• in some embodiments, when the determining module 902 is configured to determine, based on the pixel coordinates of the region of interest in the video image at the current moment collected by each collection device and the parameter information of that collection device, the initial position coordinates of the target object corresponding to the region of interest in the target site at the current moment, the operation includes:
• in some embodiments, when the determining module 902 is configured to fuse the initial position coordinates belonging to the same target object among the initial position coordinates to obtain the first position coordinates of the target object in the target site at the current moment, the operation includes:
  • the multiple initial position coordinates associated with the same target object are sequentially fused to obtain the first position coordinates of the target object in the target place at the current moment.
• in some embodiments, when the determining module 902 is configured to sequentially fuse the multiple initial position coordinates associated with the same target object to obtain the first position coordinates of the target object in the target site at the current moment, the operation includes:
• fusing the first intermediate fused position coordinates with any other initial position coordinates to be fused to generate second intermediate fused position coordinates, taking the second intermediate fused position coordinates as the updated first intermediate fused position coordinates, and returning to the step of generating the second intermediate fused position coordinates until no initial position coordinates remain to be fused.
• in some embodiments, when the determining module 902 is configured to fuse the first intermediate fused position coordinates with any other initial position coordinates to be fused to generate the second intermediate fused position coordinates, the operation includes:
  • the method includes:
• taking the target object corresponding to each region of interest in the first video picture of the arbitrary two video pictures as a first target object, and the target object corresponding to each region of interest in the second video picture as a second target object, and determining the second distance between the initial position coordinates of each first target object and the initial position coordinates of each second target object;
• taking the initial position coordinates of the first target object and the initial position coordinates of the second target object having the smallest second distance from the first target object as multiple initial position coordinates associated with the same target object, where the minimum second distance is less than the second preset fusion distance threshold.
  • the determining module 902 is further configured to:
  • an early warning prompt is given.
  • an embodiment of the present disclosure further provides an electronic device 1100 .
  • a schematic structural diagram of the electronic device 1100 provided by an embodiment of the present disclosure includes:
  • Embodiments of the present disclosure further provide a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is run by a processor, the steps of the target tracking method described in the above method embodiments are executed.
  • the storage medium may be a volatile or non-volatile computer-readable storage medium.
  • Embodiments of the present disclosure further provide a computer program product, where the computer program product carries program codes, and the instructions included in the program codes can be used to execute the steps of the target tracking method described in the foregoing method embodiments.
  • for details, please refer to the foregoing method embodiments, which are not repeated here.
  • the above-mentioned computer program product can be specifically implemented by hardware, software or a combination thereof.
  • in an optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK).
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the functions, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a processor-executable non-volatile computer-readable storage medium.
  • the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present disclosure.
  • the aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, an optical disc, or other media that can store program codes.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Studio Devices (AREA)

Abstract

The present disclosure provides a target tracking method and apparatus, an electronic device, and a storage medium. The target tracking method comprises: obtaining video pictures at the current moment acquired by a plurality of acquisition devices provided in a target place, the plurality of acquisition devices having different acquisition angles in the target place, and the video pictures comprising regions of interest of target objects in the target place; on the basis of the video pictures at the current moment acquired by the plurality of acquisition devices, determining first position coordinates of each target object at the current moment; and for each target object, on the basis of the first position coordinates of the target object and second position coordinates of the target object at the previous moment, determining second position coordinates of the target object at the current moment.

Description

Target tracking method, apparatus, electronic device and storage medium
CROSS-REFERENCE TO RELATED APPLICATIONS
This patent application claims priority to Chinese Patent Application No. 202110467650.7, filed on April 28, 2021 and titled "A target tracking method, apparatus, electronic device and storage medium", which is incorporated herein by reference.
Technical Field
The present disclosure relates to the field of computer vision technology, and in particular, to a target tracking method, apparatus, electronic device, and storage medium.
Background
Artificial intelligence technology plays an increasingly important role in building intelligent education, entertainment, and daily life. Among these technologies, computer vision is one of the key technologies and is widely used. For example, positioning technology based on computer vision can track target objects within target places in different scenarios and determine the trajectories of the target objects in those places.
In the process of tracking a target object based on computer vision, images of the target place captured by a camera can be used to determine the position of the target object in the images at different moments, and the target object can then be tracked according to its positions at those moments.
SUMMARY OF THE INVENTION
The embodiments of the present disclosure provide at least one target tracking solution.
In a first aspect, an embodiment of the present disclosure provides a target tracking method, including:
acquiring video pictures at the current moment captured by a plurality of acquisition devices provided in a target place, where the plurality of acquisition devices have different acquisition angles of view in the target place, and the video pictures include regions of interest of target objects in the target place;
determining first position coordinates of each target object at the current moment on the basis of the video pictures at the current moment acquired by the plurality of acquisition devices; and
for each target object, determining second position coordinates of the target object at the current moment on the basis of the first position coordinates of the target object and second position coordinates of the target object at the previous moment.
In a second aspect, an embodiment of the present disclosure provides a target tracking apparatus, including:
an acquiring module, configured to acquire video pictures at the current moment captured by a plurality of acquisition devices provided in a target place, where the plurality of acquisition devices have different acquisition angles of view in the target place, and the video pictures include regions of interest of target objects in the target place;
a determining module, configured to determine first position coordinates of each target object at the current moment on the basis of the video pictures at the current moment acquired by the plurality of acquisition devices; and
a tracking module, configured to, for each target object, determine second position coordinates of the target object at the current moment on the basis of the first position coordinates of the target object and second position coordinates of the target object at the previous moment.
In a third aspect, an embodiment of the present disclosure provides an electronic device, including a processor, a memory, and a bus. The memory stores machine-readable instructions executable by the processor. When the electronic device runs, the processor and the memory communicate via the bus, and when the machine-readable instructions are executed by the processor, the steps of the target tracking method according to the first aspect are executed.
In a fourth aspect, an embodiment of the present disclosure provides a computer-readable storage medium on which a computer program is stored; when the computer program is run by a processor, the steps of the target tracking method according to the first aspect are executed.
In a fifth aspect, an embodiment of the present disclosure provides a computer program product, which includes a computer program and is stored on a storage medium; when the computer program is executed by a processor, the steps of the target tracking method according to the first aspect are executed.
In order to make the above objects, features, and advantages of the present disclosure more comprehensible, some embodiments are described in detail below with reference to the accompanying drawings.
Description of the Drawings
FIG. 1 shows a flowchart of a target tracking method provided by an embodiment of the present disclosure;
FIG. 2 shows a flowchart of a method for determining the first position coordinates of a target object provided by an embodiment of the present disclosure;
FIG. 3 shows a schematic diagram of detection frames obtained by performing target detection on a video picture at the current moment according to an embodiment of the present disclosure;
FIG. 4 shows a flowchart of a specific method for determining the first position coordinates of the same target object at the current moment provided by an embodiment of the present disclosure;
FIG. 5 shows a flowchart of a method for determining the second position coordinates of a target object at the current moment provided by an embodiment of the present disclosure;
FIG. 6 shows a flowchart of a method for determining the observed position coordinates, at the current moment, of a target object missed in detection, provided by an embodiment of the present disclosure;
FIG. 7 shows a flowchart of a method for determining the trajectory data of each target object provided by an embodiment of the present disclosure;
FIG. 8 shows a flowchart of a method for correcting the identity identifier of a target object that deviates from a target group, provided by an embodiment of the present disclosure;
FIG. 9 shows an early warning method provided by an embodiment of the present disclosure;
FIG. 10 shows a schematic diagram of a scene in which a target object is tracked, provided by an embodiment of the present disclosure;
FIG. 11 shows a schematic structural diagram of a target tracking apparatus provided by an embodiment of the present disclosure;
FIG. 12 shows a schematic diagram of an electronic device provided by an embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments of the present disclosure will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present disclosure. The components of the embodiments of the present disclosure, as generally described and illustrated in the drawings herein, may be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments of the present disclosure provided in the accompanying drawings is not intended to limit the scope of the claimed disclosure, but merely represents selected embodiments of the present disclosure. All other embodiments obtained by those skilled in the art based on the embodiments of the present disclosure without creative effort shall fall within the protection scope of the present disclosure.
It should be noted that similar reference numerals and letters denote similar items in the following figures; therefore, once an item is defined in one figure, it does not need to be further defined or explained in subsequent figures.
The term "and/or" herein merely describes an association relationship and indicates that three relationships may exist; for example, A and/or B may indicate three cases: A exists alone, both A and B exist, and B exists alone. In addition, the term "at least one" herein indicates any one of a plurality of items or any combination of at least two of a plurality of items; for example, including at least one of A, B, and C may indicate including any one or more elements selected from the set consisting of A, B, and C.
In many application scenarios, it is usually necessary to track target objects within a place. For example, in a factory, it is necessary to track whether employees tend to enter dangerous areas; in a shopping mall, the movement trajectories of customers can be tracked. The position of a target object can be determined from images captured by a camera, and the target object can then be tracked. However, for some target places with complex layouts and large areas, when a target object is tracked based on cameras, the target object may not be captured, that is, tracking may be interrupted; in addition, because occluded areas exist, tracking of target objects in those occluded areas may not be possible.
Based on the above research, the present disclosure provides a target tracking method in which the acquisition devices provided in the target place have different acquisition angles of view, and each target object in the target place is simultaneously captured by at least two acquisition devices. In this way, the target objects in the target place can be positioned comprehensively and relatively accurately, and the first position coordinates of each target object at the current moment can be obtained. By further combining the relatively accurate second position coordinates of the target object at the previous moment with the first position coordinates at the current moment, the second position coordinates of the target object at the current moment can be determined accurately, thereby completing the tracking of the target objects entering the target place.
To facilitate the understanding of this embodiment, the target tracking method disclosed in the embodiments of the present disclosure is first introduced in detail. The execution body of the method provided by the embodiments of the present disclosure is a computer device with computing capability, which includes, for example, a server or another processing device. In some possible implementations, the target tracking method may be implemented by a processor invoking computer-readable instructions stored in a memory.
Referring to FIG. 1, which is a flowchart of the target tracking method provided by an embodiment of the present disclosure, the method includes steps S101 to S103.
S101: Acquire video pictures at the current moment captured by a plurality of acquisition devices provided in the target place; the plurality of acquisition devices have different acquisition angles of view in the target place, and the video pictures include regions of interest of target objects in the target place.
For example, for different application scenarios, the target place may be a place corresponding to the application scenario. If employees in a factory need to be located, the target place may be the factory; if customers in a shopping mall need to be located, the target place may be the shopping mall; if athletes in a gymnasium need to be located, the target place may be the gymnasium.
For example, a region of interest is the region, in a video picture, where an object to be located in the target place (such as the aforementioned employees, customers, and athletes) is located.
For example, an acquisition device may be a monocular camera or a binocular camera, and a plurality of acquisition devices may be provided in the target place. For different target places, the installation positions of the plurality of acquisition devices may be determined according to the actual site of the target place. For example, the acquisition devices may have different acquisition angles of view in the target place so as to cover the entire area of the target place without blind spots. In addition, because too many acquisition devices would result in too many video pictures being captured at the same moment, which would affect the processing speed of the video pictures, both the installation angles and the number of the acquisition devices need to be considered when installing them in the target place. For example, each target object entering the target place may be simultaneously captured by two acquisition devices, so that the plurality of acquisition devices provided in the target place can completely capture the current video pictures of the entire area of the target place.
S102: Determine the first position coordinates of each target object at the current moment on the basis of the video pictures at the current moment acquired by the plurality of acquisition devices.
For example, the target objects are objects to be located in the target place, such as the aforementioned employees, customers, and athletes. Taking a factory as the target place as an example, each employee in the factory is a target object. Because multi-view acquisition of the target objects is considered when the acquisition devices are arranged, the same target object can be captured by at least two acquisition devices at the same moment, yielding at least two video pictures, each of which involves one region of interest.
For example, take a factory as the target place. Suppose there are 2 employees in the factory, that is, there are 2 target objects. Because the same employee needs to be captured by at least 2 acquisition devices, it can be assumed that 2 cameras are provided in the factory. In this case, one possible situation is that at 9:00 a.m., each of the two cameras captures one video picture, for example, video picture 1 and video picture 2, and each video picture involves 2 regions of interest. For example, in video picture 1, region of interest 1 corresponds to employee 1 and region of interest 2 corresponds to employee 2; in video picture 2, region of interest 3 corresponds to employee 1 and region of interest 4 corresponds to employee 2.
For example, considering that each target object in the target place is simultaneously captured by at least two acquisition devices, for each target object in the target place, the first position coordinates of the target object at the current moment can be determined from the video pictures at the current moment, captured by the plurality of acquisition devices, that include the region of interest of the target object.
For example, the first position coordinates of a target object may refer to the position coordinates of the target object in a world coordinate system pre-built for the target place. In this way, when the first position coordinates of the target object are determined on the basis of the video pictures at the current moment, the initial position coordinates, in the target place, of the target object corresponding to the region of interest in each video picture can be determined on the basis of the pixel coordinates of the region of interest in that video picture and the parameter information of the acquisition device that captured the video picture. Considering that there are some errors between the parameter information of different acquisition devices, the initial position coordinates that belong to the same target object, determined from the video pictures captured by different acquisition devices at the current moment, will differ somewhat. The initial position coordinates belonging to the same target object can therefore be fused, and the first position coordinates of the target object at the current moment can then be determined according to the plurality of initial position coordinates associated with that target object at the current moment.
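The association and fusion step described above can be sketched in code. The disclosure does not fix a concrete association or fusion rule at this point, so the following is a minimal illustration under assumed choices: greedy nearest-neighbour matching of per-view initial coordinates under a preset fusion distance threshold, with each matched pair averaged into one fused position; the function name and threshold value are hypothetical.

```python
import math

def fuse_initial_coordinates(coords_view_a, coords_view_b, fusion_threshold):
    """Associate initial position coordinates from two views that are
    nearest to each other and closer than the threshold, then average
    each matched pair into one fused position. Unmatched points are
    kept as-is (a target may be visible in only one of the two views)."""
    fused, used_b = [], set()
    for pa in coords_view_a:
        best_j, best_d = None, fusion_threshold
        for j, pb in enumerate(coords_view_b):
            if j in used_b:
                continue
            d = math.dist(pa, pb)  # Euclidean distance in world coordinates
            if d < best_d:
                best_j, best_d = j, d
        if best_j is not None:
            used_b.add(best_j)
            pb = coords_view_b[best_j]
            fused.append(((pa[0] + pb[0]) / 2, (pa[1] + pb[1]) / 2))
        else:
            fused.append(pa)
    # points of view B that matched nothing in view A pass through unchanged
    fused.extend(pb for j, pb in enumerate(coords_view_b) if j not in used_b)
    return fused
```

For instance, two cameras that locate the same employee at (1.0, 2.0) and (1.2, 2.2) would yield the single fused coordinate (1.1, 2.1), while a point seen by only one camera survives on its own.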
For example, the world coordinate system corresponding to the target place may take a fixed position in the target place as the coordinate origin to establish a unique world coordinate system. For instance, the center point of the ground of the target place may be taken as the origin, a direction on the ground may be set as the positive direction of the X-axis, the direction on the ground perpendicular to the X-axis may be set as the positive direction of the Y-axis, and the direction perpendicular to the ground and pointing upward may be taken as the positive direction of the Z-axis.
S103: For each target object, determine the second position coordinates of the target object at the current moment on the basis of the first position coordinates of the target object and the second position coordinates of the target object at the previous moment.
For example, after the first position coordinates of the target objects at the current moment are obtained, the positions of the same target object at different moments can be temporally associated on the basis of the second position coordinates of the target objects in the target place at the previous moment and the first position coordinates at the current moment. The second position coordinates of the target objects at the current moment are thus determined, and by subsequently associating the second position coordinates of each target object at different moments, the movement trajectory of each target object in the target place can be obtained.
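Linking the per-moment second position coordinates into a trajectory can be sketched as follows. The disclosure does not prescribe a data structure for this, so this sketch simply keys positions by a caller-supplied target identifier (the identity-assignment logic itself is out of scope here and the class name is illustrative).

```python
from collections import defaultdict

class TrajectoryStore:
    """Accumulate each target object's second position coordinates over
    successive moments to form its movement trajectory in the target place."""

    def __init__(self):
        self._tracks = defaultdict(list)

    def add(self, target_id, moment, second_position):
        # record the corrected (second) position observed at this moment
        self._tracks[target_id].append((moment, second_position))

    def trajectory(self, target_id):
        # positions ordered by the moment they were recorded
        return [pos for _, pos in sorted(self._tracks[target_id])]
```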
For example, because the first position coordinates are determined on the basis of what the acquisition devices capture, and there is a certain error in the extrinsic parameter information of the acquisition devices, the first position coordinates of a target object at the current moment determined from the video pictures at the current moment also contain a certain error. Therefore, in the process of temporally associating the positions of the same target object at different moments, the first position coordinates of the target object are corrected to determine second position coordinates with higher accuracy at the current moment.
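The correction step can be illustrated with a deliberately simple sketch. The disclosure does not specify which filter is used to blend the noisy first position coordinates with the previous second position coordinates; a Kalman filter would be a typical choice, and the constant-gain blend below (with a hypothetical `gain` parameter) is only a stand-in for it.

```python
def update_second_position(prev_second, observed_first, gain=0.6):
    """Blend the previous corrected (second) position with the newly
    observed (first) position; `gain` weights the new observation.
    This constant-gain blend stands in for a proper filter update."""
    return tuple(gain * o + (1.0 - gain) * p
                 for p, o in zip(prev_second, observed_first))
```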
For example, the manner of determining the second position coordinates of the target object at the previous moment is similar to the manner of determining them at the current moment. Therefore, the present disclosure mainly describes the process of determining the second position coordinates of the target object at the current moment.
In the embodiments of the present disclosure, the acquisition devices provided in the target place have different acquisition angles of view, so that the target objects in the target place can be positioned comprehensively and relatively accurately, and the first position coordinates of each target object at the current moment can be obtained. By further combining the relatively accurate second position coordinates of the target object at the previous moment with the first position coordinates at the current moment, the second position coordinates of the target object at the current moment can be determined accurately, thereby completing the tracking of the target objects entering the target place.
The above S101 to S103 will be described below with reference to specific embodiments.
For the above S102, when the first position coordinates of each target object at the current moment are determined on the basis of the video pictures at the current moment acquired by the plurality of acquisition devices, as shown in FIG. 2, the following S201 to S203 are included:
S201: Acquire the pixel coordinates of the regions of interest in the video pictures at the current moment captured by the plurality of acquisition devices.
For example, the region of interest of a target object in a video picture at the current moment can be identified on the basis of a pre-trained neural network for target detection. Further, the pixel coordinates of a set position point in the region of interest, in the image coordinate system corresponding to the video picture, can be read, and the pixel coordinates corresponding to the set position point can be used as the pixel coordinates of the region of interest.
Specifically, acquiring the pixel coordinates of the regions of interest in the video pictures at the current moment captured by the plurality of acquisition devices may include the following S2011 to S2012:
S2011: Input the plurality of video pictures at the current moment into a pre-trained neural network to obtain a target detection frame in each video picture; the neural network includes a plurality of target detection sub-networks for detecting regions of interest of target objects of different sizes, and the region where a target detection frame is located in a video picture is a region of interest.
S2012: Extract the pixel coordinates, in each video picture, of the target position point on the target detection frame in that video picture, to obtain the pixel coordinates of the region of interest in the video picture.
For example, the neural network can detect the region of interest of each target object contained in a video picture at the current moment and mark each target detection frame. FIG. 3 is a schematic diagram of the target detection frames contained in a video picture at the current moment. The video picture contains target detection frames corresponding to two regions of interest: the target detection frame A1B1C1D1 of region of interest 1 and the target detection frame A2B2C2D2 of region of interest 2. A position point can be extracted on the target detection frame of each region of interest as the target position point. For example, the midpoint of the bottom edge of the detection frame is extracted as the target position point; as shown in FIG. 3, the pixel coordinates of region of interest 1 are represented by the pixel coordinates of the midpoint K1 of the bottom edge D1C1 of the target detection frame A1B1C1D1, and the pixel coordinates of region of interest 2 are represented by the pixel coordinates of the midpoint K2 of the bottom edge D2C2 of the target detection frame A2B2C2D2.
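Extracting the target position point from a detection frame, as in the K1/K2 example above, reduces to a one-line computation. A sketch, assuming boxes are given as `(x_min, y_min, x_max, y_max)` in image coordinates with the origin at the top-left corner (so `y_max` is the bottom edge); the function name is illustrative:

```python
def bottom_midpoint(box):
    """Return the pixel coordinates of the midpoint of a detection
    frame's bottom edge (the 'target position point'). The box is
    (x_min, y_min, x_max, y_max) with the image origin at the top-left."""
    x_min, _, x_max, y_max = box
    return ((x_min + x_max) / 2.0, y_max)
```

For a frame spanning pixels (10, 20) to (30, 60), the target position point is (20.0, 60), i.e. the midpoint of the bottom edge.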
For example, considering that the position of a target object in the target place changes and the plurality of acquisition devices provided in the target place have different acquisition angles of view, the sizes of the regions of interest of a target object contained in the video pictures captured by different acquisition devices at the same moment may differ. In order to accurately mark the detection frames of regions of interest of target objects of different sizes, the neural network used in the embodiments of the present disclosure may include a plurality of target detection sub-networks for detecting regions of interest of target objects of different sizes, for example, a feature pyramid network. Each target detection sub-network in the feature pyramid network is used to detect and identify, in the video picture at the current moment, the regions of interest of target objects of the size corresponding to that sub-network. Through this neural network, the regions of interest of target objects of different sizes in a video picture at the current moment captured by the same acquisition device can be accurately detected.
In the embodiments of the present disclosure, the neural network includes multiple target detection sub-networks for detecting the regions of interest of target objects of different sizes, so that when the neural network performs target detection on the regions of interest of the target objects in a video picture, the regions of interest of target objects of different sizes in the same video picture can be accurately detected.
S202: For the video picture at the current moment captured by each capture device, determine, based on the pixel coordinates of the region of interest in the video picture and the parameter information of the capture device, the initial position coordinates, in the target place at the current moment, of the target object corresponding to the region of interest in the video picture.
Exemplarily, the parameter information of each capture device may include the homography matrix of the capture device, where the homography matrix may represent the transformation relationship between the image coordinate system corresponding to the video picture at the current moment captured by the capture device and the world coordinate system corresponding to the target place where the capture device is located. In this way, after the pixel coordinates of a region of interest in the image coordinate system corresponding to the video picture at the current moment are obtained, the position coordinates of the region of interest in the world coordinate system corresponding to the target place can be determined according to the parameter information of the capture device, and the position coordinates of the region of interest in the world coordinate system are taken as the initial position coordinates, in the world coordinate system, of the target object corresponding to that region of interest.
Specifically, determining, for the video picture at the current moment captured by each capture device and based on the pixel coordinates of the region of interest in the video picture and the parameter information of the capture device, the initial position coordinates, in the target place at the current moment, of the target object corresponding to the region of interest includes the following S2021 to S2022:
S2021: For each capture device, correct, based on the intrinsic parameter matrix and the distortion parameters of the capture device, the pixel coordinates of the region of interest in the video picture captured by the capture device, to obtain the corrected pixel coordinates of the region of interest in the video picture.
Exemplarily, the intrinsic parameter matrix of the capture device is

$$K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}$$

where $(f_x, f_y)$ represents the focal length of the capture device, and $(c_x, c_y)$ represents the pixel coordinates, in the image coordinate system, of the center point of the video picture captured by the capture device at the current moment. The distortion parameters of the capture device include radial distortion coefficients and tangential distortion coefficients. After the intrinsic parameter matrix and distortion coefficients of each capture device are obtained in advance, the pixel coordinates of the region of interest in the video picture at the current moment captured by that capture device can be undistorted according to its intrinsic parameter matrix and distortion coefficients, for example through the undistortion function in the OpenCV software, to obtain the corrected pixel coordinates of the region of interest in the video picture at the current moment captured by the capture device.
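A sketch of this undistortion step, under the assumption that the capture device follows the standard radial/tangential distortion model with coefficients k1, k2, p1, p2 (the model behind OpenCV undistortion functions such as `cv2.undistortPoints`); the model is inverted by fixed-point iteration:

```python
def undistort_pixel(u, v, fx, fy, cx, cy, k1, k2, p1, p2, iters=10):
    """Invert the radial/tangential distortion model for one pixel by
    fixed-point iteration, returning corrected pixel coordinates."""
    # Normalize to camera coordinates using the intrinsic matrix.
    xd = (u - cx) / fx
    yd = (v - cy) / fy
    x, y = xd, yd
    for _ in range(iters):
        r2 = x * x + y * y
        radial = 1.0 + k1 * r2 + k2 * r2 * r2        # radial term
        dx = 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)  # tangential term
        dy = p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y
        x = (xd - dx) / radial
        y = (yd - dy) / radial
    # Re-project with the intrinsic matrix.
    return fx * x + cx, fy * y + cy

# Hypothetical intrinsics and barrel distortion (k1 < 0):
u, v = undistort_pixel(660.0, 500.0, 1000.0, 1000.0, 640.0, 480.0,
                       -0.1, 0.01, 0.0, 0.0)
```

With barrel distortion the observed point is pulled toward the image center, so the corrected point lies slightly farther from the center than the raw pixel.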
Exemplarily, the intrinsic parameter matrix and distortion parameters of each capture device may be determined in advance by Zhang Zhengyou's checkerboard calibration method. For example, multiple checkerboard images can be captured from different angles and the feature points in the images detected; the intrinsic parameter matrix and distortion parameters of the capture device are solved according to the pixel coordinates of these feature points in the checkerboard images, and the intrinsic parameter matrix and distortion parameters are then continuously optimized. During the optimization, the same pixel coordinates can be corrected according to the intrinsic parameter matrices and distortion parameters obtained in two adjacent iterations, and whether to end the optimization is determined by the difference between the two sets of corrected pixel coordinates; for example, once this difference no longer decreases, the optimization can be ended, yielding the intrinsic parameter matrix and distortion parameters of the capture device.
S2022: Determine, based on the predetermined homography matrix of the capture device and the corrected pixel coordinates of the region of interest in the video picture, the initial position coordinates of the target object corresponding to the region of interest in the video picture.
Exemplarily, the homography matrix may represent the transformation relationship between the image coordinate system corresponding to the video picture at the current moment captured by the capture device and the world coordinate system corresponding to the target place where the capture device is located. The homography matrix can likewise be determined in advance when the capture device is calibrated. For example, a sample video picture containing multiple markers can be captured by the capture device, and the world coordinates, in the world coordinate system corresponding to the target place, of the intersections of the markers with the ground (the plane of the X and Y axes of the world coordinate system) are determined in advance. The corrected pixel coordinates of the intersections of the multiple markers with the ground in the sample video picture are then determined in the manner described above. The homography matrix of the capture device can further be determined based on the corrected pixel coordinates and the world coordinates respectively corresponding to the multiple markers.
Exemplarily, when determining the initial position coordinates of the target object corresponding to the region of interest in the video picture, the position coordinates of the region of interest in the world coordinate system corresponding to the target place can be obtained according to the corrected pixel coordinates of the region of interest in the video picture and the homography matrix of the capture device that captured the video picture at the current moment, and the position coordinates of the region of interest in the world coordinate system are determined as the initial position coordinates of the target object corresponding to that region of interest.
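The pixel-to-world mapping via the homography matrix can be sketched as follows; the specific matrix `H` is hypothetical (in practice it could be estimated from at least four marker correspondences, e.g. with OpenCV's `cv2.findHomography`):

```python
import numpy as np

def pixel_to_world(H, u, v):
    """Map corrected pixel coordinates to ground-plane world coordinates
    with a 3x3 homography H (image plane -> world Z=0 plane), applying
    homogeneous normalization."""
    p = H @ np.array([u, v, 1.0])
    return p[0] / p[2], p[1] / p[2]

# Hypothetical homography: 100 px per metre, world origin at pixel (0, 0).
H = np.array([[0.01, 0.0, 0.0],
              [0.0, 0.01, 0.0],
              [0.0, 0.0, 1.0]])
x, y = pixel_to_world(H, 250.0, 400.0)  # approximately (2.5, 4.0) metres
```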
In the embodiments of the present disclosure, after the pixel coordinates of the region of interest in the video picture are obtained, the pixel coordinates are first corrected based on the intrinsic parameter matrix and distortion coefficients of the capture device that captured the video picture, so that corrected pixel coordinates of higher accuracy can be obtained, further improving the accuracy of the resulting initial position coordinates, in the target place, of the target object corresponding to the region of interest.
S203: Fuse the initial position coordinates that belong to the same target object among the initial position coordinates, to obtain the first position coordinates of the target object in the target place at the current moment.
Exemplarily, considering that there are some errors between the parameter information of different capture devices, the initial position coordinates of the same target object determined based on the video pictures at the current moment captured by different capture devices will differ somewhat. The initial position coordinates belonging to the same target object can be fused to obtain the first position coordinates of that target object.
In the embodiments of the present disclosure, it is proposed that the pixel coordinates of the region of interest in the video picture can be determined first, and the initial position coordinates, in the target place, of the target object corresponding to the region of interest then obtained according to the parameter information of the capture device. The initial position coordinates belonging to the same target object among the initial position coordinates from different video pictures are further fused, to obtain first position coordinates of higher accuracy for that target object.
Specifically, for the above S203, fusing the initial position coordinates that belong to the same target object among the initial position coordinates to obtain the first position coordinates of the target object in the target place at the current moment may, as shown in FIG. 4, include the following S301 to S302:
S301: Based on the initial position coordinates of the target objects corresponding to the regions of interest determined from the multiple video pictures at the current moment, determine, among the initial position coordinates, the multiple initial position coordinates associated with the same target object.
Exemplarily, as mentioned above, each target object in the target place is captured by at least two capture devices simultaneously. When the same target object is captured by different capture devices at the same moment, the parameter information of each capture device contains a certain error, and the errors differ between the parameter information of different capture devices. Therefore, the initial position coordinates of the same target object determined based on different video pictures at the current moment may differ. Before the initial position coordinates of the same target object are fused, the multiple initial position coordinates associated with the same target object need to be determined first.
S302: Sequentially fuse the multiple initial position coordinates associated with the same target object, to obtain the first position coordinates of that target object in the target place at the current moment.
Exemplarily, assuming the same target object is associated with N initial position coordinates, the first two can be fused first to obtain a fused initial position coordinate, which is then fused with the third initial position coordinate, and so on; after the fusion with the last initial position coordinate, the finally fused position coordinate is taken as the first position coordinate of the target object.
In the embodiments of the present disclosure, considering that the initial position coordinates of the same target object determined based on the video pictures captured by different capture devices will contain some errors, the initial position coordinates of the same target object obtained from multiple capture devices can be fused, so that first position coordinates of higher accuracy can be obtained for that target object.
In one implementation, for the above S301, determining, based on the initial position coordinates of the target objects corresponding to the regions of interest determined from the multiple video pictures at the current moment, the multiple initial position coordinates associated with the same target object among the initial position coordinates includes the following S3011 to S3012:
S3011: For any two video pictures at the current moment, take the target object corresponding to each region of interest in the first of the two video pictures as a first target object, and the target object corresponding to each region of interest in the second of the two video pictures as a second target object; for the initial position coordinates of each first target object, determine the second distances between the initial position coordinates of that first target object and the initial position coordinates of each second target object in the second video picture;
S3012: Take the initial position coordinates of the first target object and the initial position coordinates of the second target object having the minimum second distance from that first target object as multiple initial position coordinates associated with the same target object, where the minimum second distance is less than a second preset fusion distance threshold.
Exemplarily, suppose A capture devices are set in the target place, and suppose that the video pictures at the current moment captured by the A capture devices at the same moment all contain the region of interest of at least one target object. At this moment, A groups of initial position coordinates can be obtained, forming the initial coordinate set s = {S1, S2, S3, ..., SA}, where S1, S2, S3, ..., SA respectively denote the sets of initial position coordinates of the target objects in the video pictures at the current moment captured by the first, second, third, ..., A-th capture devices; in short, SN denotes the set of initial position coordinates of the target objects corresponding to the regions of interest in the video picture at the current moment captured by the N-th capture device, where N is an integer greater than or equal to 1 and less than or equal to A. Taking the video pictures at the current moment captured by the first and second capture devices as the two video pictures, the following illustrates how to determine the multiple initial position coordinates associated with the same target object:
Exemplarily, S1 contains the initial position coordinates of a first target objects (also called first initial position coordinates), and S2 contains the initial position coordinates of b second target objects (also called second initial position coordinates). The Euclidean distance between each first initial position coordinate and each second initial position coordinate can be determined, yielding the distance matrix:
$$d = \begin{bmatrix} d_{11} & \cdots & d_{1b} \\ \vdots & d_{ij} & \vdots \\ d_{a1} & \cdots & d_{ab} \end{bmatrix}$$
where $d_{11}$ denotes the second distance between the 1st first initial position coordinate in S1 and the 1st second initial position coordinate in S2; $d_{1b}$ denotes the second distance between the 1st first initial position coordinate in S1 and the b-th second initial position coordinate in S2; $d_{ij}$ denotes the second distance between the i-th first initial position coordinate in S1 and the j-th second initial position coordinate in S2; $d_{a1}$ denotes the second distance between the a-th first initial position coordinate in S1 and the 1st second initial position coordinate in S2; and $d_{ab}$ denotes the second distance between the a-th first initial position coordinate in S1 and the b-th second initial position coordinate in S2.
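A sketch of building this distance matrix with NumPy broadcasting (the coordinate tuples are illustrative only):

```python
import numpy as np

def distance_matrix(S1, S2):
    """Pairwise Euclidean distances: entry (i, j) is the second distance
    between the i-th first initial position coordinate and the j-th
    second initial position coordinate."""
    A = np.asarray(S1, dtype=float)            # shape (a, 2)
    B = np.asarray(S2, dtype=float)            # shape (b, 2)
    diff = A[:, None, :] - B[None, :, :]       # shape (a, b, 2)
    return np.sqrt((diff ** 2).sum(axis=2))    # shape (a, b)

D = distance_matrix([(0.0, 0.0), (3.0, 4.0)], [(0.0, 0.0)])
# D[0, 0] = 0.0, D[1, 0] = 5.0
```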
Exemplarily, in operation, the multiple initial position coordinates associated with the same target object in S1 and S2 can be determined in the following manner, including S30121 to S30124:
S30121: Find the current minimum second distance among the elements of the current distance matrix.
Exemplarily, when the minimum second distance is searched for the first time, the elements of the current distance matrix contain the Euclidean distances between the initial position coordinates of each first target object in S1 and the initial position coordinates of each second target object in S2.
S30122: Determine whether the current minimum second distance is less than the second preset fusion distance threshold.
Exemplarily, the second preset fusion distance threshold can be set empirically. For example, the same target object can be photographed in advance by different capture devices, multiple initial position coordinates of that target object in the target place are then determined respectively according to the video pictures captured by the different capture devices, and the second preset fusion distance threshold is determined according to the distances between these initial position coordinates.
S30123: When it is determined that the current minimum second distance is less than the second preset fusion distance threshold, determine that the two initial position coordinates associated with the current minimum second distance are initial position coordinates associated with the same target object.
Exemplarily, if it is determined that $d_{a1}$ is the current minimum distance and $d_{a1}$ is less than the second preset fusion distance threshold, the initial position coordinates of the a-th first target object in S1 and the initial position coordinates of the 1st second target object in S2 can be taken as initial position coordinates associated with the same target object.
S30124: After setting the current minimum second distance in the current distance matrix, as well as all other second distances involving either of the two initial position coordinates associated with the current minimum second distance, to the second preset fusion distance threshold, return to S30121; when the current minimum second distance in the current distance matrix is greater than or equal to the second preset fusion distance threshold, all initial position coordinates in S1 and S2 associated with the same target object have been obtained.
Exemplarily, assume the current distance matrix is calculated from the initial position coordinates in S1 and S2 and is specifically a 3×3 matrix:
$$\begin{bmatrix} d_{11} & d_{12} & d_{13} \\ d_{21} & d_{22} & d_{23} \\ d_{31} & d_{32} & d_{33} \end{bmatrix}$$
The second preset fusion distance threshold is $d_{th}$. Assuming $d_{11}$ is the minimum distance in the current matrix and is less than $d_{th}$, the 1st first initial position coordinate in S1 and the 1st second initial position coordinate in S2 are associated initial position coordinates of the same target object. In the current distance matrix, all other distances calculated from either of these two initial position coordinates are $d_{12}$, $d_{13}$, $d_{21}$, and $d_{31}$. Therefore, according to S30124, $d_{11}$, $d_{12}$, $d_{13}$, $d_{21}$, and $d_{31}$ all need to be set to $d_{th}$ in the current matrix; the resulting matrix is:
$$\begin{bmatrix} d_{th} & d_{th} & d_{th} \\ d_{th} & d_{22} & d_{23} \\ d_{th} & d_{32} & d_{33} \end{bmatrix}$$
Execution then returns to S30121.
Exemplarily, after the current minimum second distance in the current distance matrix, as well as all other distances involving either of the two initial position coordinates associated with the current minimum second distance, are set to the second preset fusion distance threshold, the elements set to the second preset fusion distance threshold can be excluded when continuing to search for the current minimum second distance, thereby improving search efficiency.
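Steps S30121 to S30124 amount to a greedy nearest-pair association; a minimal sketch (the matrix values and threshold are illustrative only):

```python
import numpy as np

def greedy_associate(D, d_th):
    """Greedy association corresponding to S30121-S30124: repeatedly take
    the smallest entry of the distance matrix; if it is below the
    threshold, record the pair of coordinates as belonging to the same
    target object, then overwrite its row and column with the threshold
    so that neither coordinate is matched again."""
    D = np.array(D, dtype=float)
    pairs = []
    while True:
        i, j = np.unravel_index(np.argmin(D), D.shape)
        if D[i, j] >= d_th:          # S30122 fails: stop
            break
        pairs.append((i, j))         # S30123: same-object association
        D[i, :] = d_th               # S30124: suppress row i ...
        D[:, j] = d_th               # ... and column j
    return pairs

pairs = greedy_associate([[1.0, 9.0, 9.0],
                          [9.0, 2.0, 9.0],
                          [9.0, 9.0, 9.0]], d_th=5.0)
# -> [(0, 0), (1, 1)]
```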
Exemplarily, in one implementation, after the multiple initial position coordinates associated with the same target object in S1 and S2 are obtained, it can continue to be determined, based on any other two video pictures at the current moment, whether there are initial position coordinates associated with the same target object. After the video pictures at the current moment captured by the A capture devices at the same moment have all been processed, the different initial position coordinates of each target object involved in these video pictures can be obtained. The initial position coordinates associated with the same target object among the different initial position coordinates are then fused, to obtain the first position coordinates, in the target place, of each target object in the A video pictures at the current moment captured by the A capture devices at the same moment.
Exemplarily, in another implementation, after the multiple initial position coordinates associated with the same target object in S1 and S2 are obtained, coordinate fusion can be performed on these initial position coordinates to obtain updated initial position coordinates of that target object. The initial position coordinates in S1 and S2 that remain to be fused, together with the updated initial position coordinates, form S2'. A new current distance matrix is further formed from the initial position coordinates in S2' and S3, and steps S30121 to S30124 are repeated to obtain the multiple initial position coordinates associated with the same target object in S2' and S3, from which S3' is obtained in the same way. A new current distance matrix is further formed from the initial position coordinates in S3' and S4, and steps S30121 to S30124 are repeated until the fusion with the initial position coordinates in the last element of the initial coordinate set (i.e., SA) is completed, yielding the first position coordinates, in the target place, of each target object in the A video pictures captured by the A capture devices at the same moment.
In particular, after the fusion with the initial position coordinates in the last element of the initial coordinate set (i.e., SA) is completed, if it is detected that any initial position coordinate has not participated in any fusion from beginning to end, then, considering that each target object in the target place is captured by at least two capture devices simultaneously, that initial position coordinate can be filtered out as an erroneous initial position coordinate.
In the embodiments of the present disclosure, according to the initial position coordinates of the target objects corresponding to the regions of interest in any two video pictures at the current moment and the second preset fusion distance threshold, the initial position coordinates associated with the same target object can be quickly determined, providing a basis for the subsequent determination of the first position coordinates of each target object.
For the above S302, sequentially fusing the multiple initial position coordinates associated with the same target object to obtain the first position coordinates of that target object in the target place at the current moment may include the following S3021 to S3022:
S3021: Select any initial position coordinate from the multiple initial position coordinates associated with the same target object, and take the selected initial position coordinate as the first intermediate fused position coordinate.
S3022: Fuse the first intermediate fused position coordinate with any other initial position coordinate to be fused, to generate a second intermediate fused position coordinate; take the second intermediate fused position coordinate as the updated first intermediate fused position coordinate, and return to the step of generating a second intermediate fused position coordinate, until no initial position coordinate to be fused remains.
Here, the initial position coordinates to be fused refer to the initial position coordinates that have not yet participated in fusion.
Exemplarily, fusing the first intermediate fused position coordinate with any other initial position coordinate to be fused to generate the second intermediate fused position coordinate includes:
determining the midpoint coordinate of the first intermediate fused position coordinate and the other initial position coordinate to be fused, and taking the midpoint coordinate as the generated second intermediate fused position coordinate.
Exemplarily, in connection with the above embodiments, suppose it is determined that the multiple initial position coordinates associated with target object A comprise N coordinates. Any one of them can be taken as the first intermediate fused position coordinate, and the midpoint coordinate of the first intermediate fused position coordinate and any other initial position coordinate to be fused is determined. The midpoint coordinate is then taken as the updated first intermediate fused position coordinate and continues to be fused with any other initial position coordinate to be fused, until none of the N initial position coordinates remains to be fused, at which point the first position coordinates of target object A are obtained.
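The sequential midpoint fusion of S3021 to S3022 can be sketched as follows (note one consequence of the scheme: repeatedly taking midpoints weights later coordinates more heavily, so the result is a weighted average rather than the arithmetic mean):

```python
def fuse_coordinates(coords):
    """Sequentially fuse the initial position coordinates of one target
    object by repeatedly taking the midpoint of the current fused
    coordinate and the next coordinate to be fused (S3021-S3022)."""
    fx, fy = coords[0]                 # first intermediate fused coordinate
    for x, y in coords[1:]:
        fx, fy = (fx + x) / 2.0, (fy + y) / 2.0
    return fx, fy

# Three observations of the same target object:
p = fuse_coordinates([(0.0, 0.0), (2.0, 0.0), (3.0, 4.0)])
# midpoint((0,0), (2,0)) = (1, 0); midpoint((1,0), (3,4)) = (2.0, 2.0)
```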
In the embodiments of the present disclosure, the multiple initial position coordinates associated with the same target object can be fused by sequentially taking midpoints, so that first position coordinates of higher accuracy are obtained.
In a possible implementation, determining the second position coordinates of the target object at the current moment based on the obtained first position coordinates of the target object and the second position coordinates of the target object at the previous moment may, as shown in FIG. 5, include the following S401 to S403:
S401: Determine, based on the second position coordinates of the target object at the previous moment, the predicted position coordinates of the target object at the current moment;
S402: Determine, based on the predicted position coordinates and the first position coordinates of the target object at the current moment, the observed position coordinates of the target object at the current moment;
S403,基于目标对象在当前时刻的预测位置坐标和观测位置坐标,确定目标对象在当前时刻的第二位置坐标。S403 , based on the predicted position coordinates and the observed position coordinates of the target object at the current moment, determine the second position coordinates of the target object at the current moment.
For example, a Kalman filter may be introduced here to determine the second position coordinates of the target object at the current moment. Determining a high-accuracy second position coordinate by Kalman filtering requires both an observed position coordinate and a predicted position coordinate. The predicted position coordinate is the position of the target object at the current moment as predicted from its second position coordinate at the previous moment. The observed position coordinate can be determined from the video frames captured by the capture devices at the current moment, for example the first position coordinate of the target object determined above. However, since the first position coordinate may contain errors, the embodiments of the present disclosure propose determining the observed position coordinate jointly from the predicted position coordinate and the first position coordinate derived from the current video frames. Finally, the observed and predicted position coordinates are combined to obtain the second position coordinate of the target object at the current moment.
Specifically, the predicted position coordinates of the target object at the current moment can be determined from its second position coordinates at the previous moment according to the following Kalman filter formula (1):
Trk(t|t-1) = A·Trk(t-1|t-1) + B·u(t-1) + W(t-1)    (1)
Here, Trk(t|t-1) denotes the predicted position coordinates of the target object at the current moment, determined from its second position coordinates at the previous moment; Trk(t-1|t-1) denotes the second position coordinates of the target object at the previous moment; W(t-1) denotes the white noise in predicting the target object's position at the current moment, i.e. the error of the predicted position coordinates; A and B are parameter matrices of the Kalman filter, where A is the state transition matrix; u(t-1) is the control input to the Kalman filter at the previous moment, which may be 0.
Further, the covariance matrix of the predicted position coordinates of the target object at the current moment can be determined according to the following formula (2):
P(t|t-1) = A·P(t-1|t-1)·A^T + Q    (2)
Here, P(t|t-1) is the covariance matrix of the predicted position coordinates of the target object at the current moment, representing their uncertainty; P(t-1|t-1) is the covariance matrix of the second position coordinates of the target object at the previous moment, representing their uncertainty; Q is the process covariance matrix introduced by the Kalman filter, representing the error of the state transition model relative to the actual process.
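A minimal sketch of the prediction step, formulas (1) and (2), using NumPy. The zero-mean white noise W(t-1) is modeled through Q rather than sampled explicitly, and the control term B·u(t-1) is dropped when u is 0, as the text allows; the function and argument names are assumptions.

```python
import numpy as np

def kalman_predict(trk_prev, P_prev, A, Q, B=None, u=None):
    """Prediction step of the Kalman filter described in the text."""
    trk_pred = A @ trk_prev                # A·Trk(t-1|t-1), formula (1)
    if B is not None and u is not None:
        trk_pred = trk_pred + B @ u        # + B·u(t-1) when a control input exists
    P_pred = A @ P_prev @ A.T + Q          # formula (2): a priori covariance
    return trk_pred, P_pred
```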
For example, after the predicted position coordinates of the target object at the current moment are obtained, they can be combined with the first position coordinates to determine its observed position coordinates at the current moment, as elaborated later.
After the predicted and observed position coordinates of the target object at the current moment are obtained, its second position coordinates at the current moment can be determined according to the following Kalman filter formula (3):
Trk(t|t) = Trk(t|t-1) + K_g(t)·(z(t) - H·Trk(t|t-1))    (3)
Here, Trk(t|t) denotes the second position coordinates of the target object at the current moment; z(t) denotes its observed position coordinates at the current moment; K_g(t) is the filter gain matrix of the Kalman filter, which can be determined by the following formula (4):
K_g(t) = P(t|t-1)·H^T·(H·P(t|t-1)·H^T + R)^(-1)    (4)
Here, H is a parameter matrix of the Kalman filter, namely the observation matrix; R is the known measurement noise covariance of the Kalman filter.
Further, since the filter gain matrix is also needed when determining the second position coordinates of the target object at the next moment, the covariance matrix P(t|t) of the second position coordinates at the current moment must be determined, specifically by the following formula (5):
P(t|t) = (I - K_g(t)·H)·P(t|t-1)    (5)
After the covariance matrix of the second position coordinates at the current moment is obtained, the covariance matrix of the predicted position coordinates at the next moment can be determined from it, in preparation for determining the second position coordinates of the target object at the next moment.
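Formulas (3) to (5) can be sketched together as the update step; the matrix shapes and function name are illustrative assumptions.

```python
import numpy as np

def kalman_update(trk_pred, P_pred, z, H, R):
    """Combine the predicted and observed position coordinates."""
    S = H @ P_pred @ H.T + R                         # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)              # formula (4): filter gain matrix
    trk = trk_pred + K @ (z - H @ trk_pred)          # formula (3): second position
    P = (np.eye(len(trk_pred)) - K @ H) @ P_pred     # formula (5): posterior covariance
    return trk, P
```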
For example, if the current moment is the initial moment of acquisition, the target object has no second position coordinates from a previous moment. In this case, the first position coordinates of the target object at the current moment may be taken directly as its second position coordinates at the current moment.
In this embodiment of the present disclosure, the predicted position coordinates of the target object at the current moment can be determined from its second position coordinates at the previous moment, and then combined with its first position coordinates at the current moment to obtain second position coordinates of higher accuracy.
Specifically, where there are multiple target objects, the above S402 of determining the observed position coordinates of a target object at the current moment based on its predicted position coordinates and first position coordinates may include the following S4021 to S4022:
S4021: determine the predicted position coordinates and first position coordinates associated with the same target object, based on the predicted position coordinates and first position coordinates of the multiple target objects at the current moment.
S4022: determine the first midpoint coordinate of the predicted position coordinates and first position coordinates associated with the same target object, and take this first midpoint coordinate as the observed position coordinates of that target object at the current moment.
For example, from the second position coordinates of the N target objects present in the target site at the previous moment, the predicted position coordinates of those N target objects at the current moment can be obtained. In addition, from the video frames captured at the current moment by the multiple capture devices, the first position coordinates of the M target objects in the target site can be obtained. Among the N predicted position coordinates and M first position coordinates, the pairs associated with the same target object can be determined by a distance-based greedy algorithm. The midpoint of the predicted position coordinate and first position coordinate associated with the same target object is then taken as that target object's observed position coordinate at the current moment.
For example, N may be greater than or equal to M. When N is greater than M, some target objects may have been missed in the video frames captured at the current moment. For instance, an obstacle in the target site may block a target object from being captured, so that determining first position coordinates from the current video frames misses it. In that case, the observed position coordinates of the missed target object at the current moment can be determined from its predicted position coordinates.
In this embodiment of the present disclosure, determining the observed position coordinates of a target object at the current moment based on its predicted position coordinates and first position coordinates may include: determining the first position coordinates of the target object based on the multiple first position coordinates of the multiple target objects at the current moment and the predicted position coordinates of that target object at the current moment; and determining the first midpoint coordinate of that target object's predicted position coordinates and first position coordinates, taking this first midpoint coordinate as the observed position coordinates of that target object at the current moment.
Specifically, for the above S4021, determining the predicted position coordinates and first position coordinates associated with the same target object based on the predicted position coordinates and first position coordinates of the multiple target objects at the current moment may include the following S40211 to S40212:
S40211: for each predicted position coordinate, determine the first distance between it and each first position coordinate;
S40212: take the predicted position coordinate and the first position coordinate at the minimum first distance from it as the predicted position coordinate and first position coordinate associated with the same target object, where the minimum first distance is smaller than a first preset fusion distance threshold.
For example, suppose the current moment has N predicted position coordinates and M first position coordinates. The Euclidean distance between each predicted position coordinate and each first position coordinate is computed, giving the distance matrix:
L = [ l_11  l_12  ...  l_1M ]
    [ ...   ...   ...  ...  ]
    [ l_N1  l_N2  ...  l_NM ]
Here, l_11 is the first distance between the 1st of the N predicted position coordinates and the 1st of the M first position coordinates; l_1M is the first distance between the 1st predicted position coordinate and the Mth first position coordinate; l_nm is the first distance between the nth predicted position coordinate and the mth first position coordinate; l_N1 is the first distance between the Nth predicted position coordinate and the 1st first position coordinate; l_NM is the first distance between the Nth predicted position coordinate and the Mth first position coordinate.
Further, the predicted position coordinates and first position coordinates associated with the same target object can be determined in the same way as the multiple initial position coordinates associated with the same target object were determined above; the details are not repeated here.
In this embodiment of the present disclosure, determining the first position coordinates of a target object based on the multiple first position coordinates of the multiple target objects at the current moment and that target object's predicted position coordinates at the current moment may include: determining the first distance between the target object's predicted position coordinates and each first position coordinate; and taking, among the multiple first position coordinates, the one at the minimum first distance from the target object's predicted position coordinates as that target object's first position coordinates, where the minimum first distance is smaller than the first preset fusion distance threshold.
In this embodiment of the present disclosure, combining the predicted position coordinates of a target object at the current moment (predicted from its position coordinates at historical moments) with its first position coordinates (determined from the video frames captured at the current moment) makes it possible, on the one hand, to quickly obtain the position coordinates of the same target object at different moments and, on the other hand, to obtain observed position coordinates with higher accuracy.
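The distance-based greedy association of S40211 to S40212 can be sketched as follows. Pairing by repeatedly taking the globally smallest remaining distance below the fusion threshold is one plausible reading of the greedy step; the function and variable names are assumptions.

```python
import math

def greedy_associate(predicted, first_positions, fusion_threshold):
    """Pair each predicted position coordinate with at most one first
    position coordinate, smallest Euclidean distance first, discarding
    pairs at or beyond the first preset fusion distance threshold."""
    candidates = sorted(
        (math.dist(p, f), i, j)
        for i, p in enumerate(predicted)
        for j, f in enumerate(first_positions)
    )
    used_pred, used_first, pairs = set(), set(), []
    for d, i, j in candidates:
        if d >= fusion_threshold:
            break  # candidates are sorted, so all remaining distances are larger
        if i not in used_pred and j not in used_first:
            pairs.append((i, j))  # (predicted index, first-position index)
            used_pred.add(i)
            used_first.add(j)
    return pairs
```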
In one implementation, as shown in FIG. 6, the target tracking method provided by the embodiments of the present disclosure further includes the following S501 to S502:
S501: determine whether a missed target object exists in the video frames at the current moment, where a missed target object has predicted position coordinates at the current moment but its first position coordinates at the current moment are empty;
S502: if a missed target object is determined to exist, take the missed target object's predicted position coordinates at the current moment as its observed position coordinates at the current moment.
For example, when many target objects are present in the target site, crowding can easily occur between them, so at some moment different target objects may occlude one another and cause missed detections in the video frames captured by the capture devices. For instance, if target object A is occluded in the current video frames captured by both capture device 1 and capture device 2 among the multiple capture devices, the first position coordinates of target object A determined from those frames can be marked as empty, and target object A is then treated as a missed target object.
For example, when Kalman filtering is used to determine the predicted position coordinates of target object A at the current moment, the second position coordinates of target object A at historical moments are used. Since target object A was captured by the capture devices while entering the target site, its second position coordinates at historical moments can be determined, so its predicted position coordinates at the current moment can be determined by Kalman filtering. If the first position coordinates of target object A at the current moment are empty, its predicted position coordinates at the current moment can be taken directly as its observed position coordinates at the current moment.
In this embodiment of the present disclosure, when an occluded target object exists in the video frames captured at the current moment, its observed position coordinates at the current moment can be determined based on its second position coordinates at historical moments, so that a higher-accuracy second position coordinate at the current moment can then be determined.
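Steps S4022 and S501 to S502 combine into a single observation rule: take the midpoint when a first position exists, and fall back to the prediction when the detection was missed. A sketch, with None standing in for an empty first position coordinate:

```python
def observation_coordinate(predicted, first=None):
    """Return the observed position coordinate for the current moment."""
    if first is None:          # missed detection: first position is empty
        return predicted       # S502: use the predicted position directly
    return ((predicted[0] + first[0]) / 2.0,   # S4022: first midpoint of the
            (predicted[1] + first[1]) / 2.0)   # prediction and first position
```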
In one implementation, where there are multiple target objects, as shown in FIG. 7, the target tracking method provided by the embodiments of the present disclosure further includes the following S601 to S602:
S601: after determining the second position coordinates of a target object at the current moment, mark the identity identifier associated with that target object at the map position indicated by the second position coordinates;
S602: generate trajectory data for each target object based on the second position coordinates at multiple moments that are marked with the same identity identifier.
For example, taking a factory as the target site and employees entering the factory as the target objects, a capture device for collecting employee images can be installed at the factory entrance. Features are extracted from the collected employee images, such as facial features and/or body features. Based on the extracted feature information and the feature information of each employee pre-stored in an employee identity database, the identity of each employee entering the factory is determined. During tracking, after the second position coordinates of a target object at the current moment are determined, the identity identifier associated with that target object can be marked at the map position indicated by those coordinates. Connecting the second position coordinates at multiple moments that share the same identity identifier then yields the movement trajectory of each target object on the map.
For example, the map may be a pre-built high-precision map. The pre-built high-precision map corresponds to the target site, and the two can be presented at 1:1 scale in the same coordinate system. Trajectory data representing the movement trajectory of each target object within the target site can therefore be generated from the second position coordinates at multiple moments marked with the same identity identifier.
In this embodiment of the present disclosure, the movement trajectory of each target object within the target site can be quickly determined from the target object's identity identifier and its second position coordinates at different moments.
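Connecting second position coordinates that share an identity identifier (S601 to S602) can be sketched as follows; the per-frame dictionary representation is an assumption made for illustration.

```python
from collections import defaultdict

def build_trajectories(frames):
    """frames: time-ordered list of {identity_id: (x, y)} mappings of second
    position coordinates. Returns trajectory data per identity identifier."""
    trajectories = defaultdict(list)
    for frame in frames:
        for identity, position in frame.items():
            trajectories[identity].append(position)
    return dict(trajectories)
```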
For example, when some target objects in the target site are close to one another, clustering those target objects forms a target group. Errors can occur when identity identifiers are marked at the second position coordinates of target objects within the same group. For instance, the identifier of target object A in the group may be marked on target object B and the identifier of B on A, causing an identity-swap problem. When target objects A and B belong to the same group, an identity swap has little effect on the trajectory data because the swapped targets are close together. But when a swapped target object moves away from the group, an erroneous identity identifier makes the finally determined trajectory data for that target erroneous as well. Therefore, in one implementation, after the second position coordinates of the target objects at the current moment are determined, as shown in FIG. 8, the target tracking method provided by the embodiments of the present disclosure further includes the following S701 to S703:
S701: based on the second position coordinates of the multiple target objects at the current moment, detect whether any target object deviates from its target group, where the target group is obtained by clustering the second position coordinates of the multiple target objects at the previous moment.
For example, the second position coordinates of the multiple target objects at the previous moment can be clustered with the DBSCAN algorithm (Density-Based Spatial Clustering of Applications with Noise) to obtain the target group. The distances between the second position coordinates of different target objects within a group are smaller than a preset distance threshold for joining the group.
For example, based on the second position coordinates of the multiple target objects at the current moment and a preset distance threshold for leaving the group, it can be determined whether any target object deviates from the group it belongs to.
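A sketch of the deviation check in S701. The disclosure only specifies a preset leave-distance threshold; measuring each member against the group centroid is an assumption made here for concreteness, and the names are illustrative.

```python
import math

def deviating_targets(group_positions, leave_threshold):
    """group_positions: {identity_id: (x, y)} second position coordinates of
    one target group at the current moment. Flag members whose distance from
    the group centroid exceeds the preset leave-distance threshold."""
    xs = [p[0] for p in group_positions.values()]
    ys = [p[1] for p in group_positions.values()]
    centroid = (sum(xs) / len(xs), sum(ys) / len(ys))
    return [tid for tid, pos in group_positions.items()
            if math.dist(pos, centroid) > leave_threshold]
```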
S702: if a target object deviating from the target group is determined to exist, detect whether the identity identifier associated with the deviating target object is accurate.
Specifically, detecting whether the identity identifier associated with the deviating target object is accurate includes:
S7021: extracting the feature information of the target object deviating from the target group;
S7022: detecting whether the identity identifier associated with the deviating target object is accurate, based on its feature information and the pre-stored mapping between the feature information and identity identifiers of the target objects that entered the target site.
For example, when a deviating target object is determined to exist, the current video frames of that target object are acquired, and its feature information is extracted from them. Based on this feature information and the pre-stored identity identifier and corresponding feature information of the target object, it is determined whether the identifier associated with the deviating target is accurate. For instance, the similarity between the currently extracted feature information of the deviating target and the feature information associated with its pre-marked identity identifier can be computed. If the similarity reaches a preset similarity threshold, the identifier associated with the deviating target is determined to be accurate; otherwise, it is determined to be inaccurate.
For example, suppose a deviating target object is detected whose pre-marked identity identifier is 001. If the similarity between the feature information extracted from the current video frames and the feature information associated with identifier 001 is below the preset similarity threshold, the identity identifier 001 of that target object is determined to be inaccurate.
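The similarity check in S7021 to S7022 can be sketched with cosine similarity; the disclosure does not fix a similarity measure, so this choice and the function name are assumptions.

```python
def identity_is_accurate(extracted, stored, threshold):
    """Compare the feature vector extracted from the deviating target's
    current video frames against the features stored for its marked identity
    identifier; accurate iff the similarity reaches the preset threshold."""
    dot = sum(a * b for a, b in zip(extracted, stored))
    norm_e = sum(a * a for a in extracted) ** 0.5
    norm_s = sum(b * b for b in stored) ** 0.5
    return dot / (norm_e * norm_s) >= threshold
```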
S703: if the identity identifier of the deviating target object is determined to be inaccurate, correct the identity identifier associated with that target object.
For example, when the identity identifier of the deviating target object is determined to be inaccurate, its identifier can be re-determined based on the extracted feature information of the deviating target object and the pre-stored feature information of each employee in the employee identity database.
In this embodiment of the present disclosure, when a target object leaving its target group is detected, re-verifying the identity identifier of that target object improves the accuracy of the identifiers marked at different moments, and thus the accuracy of the target object's trajectory data.
The target tracking method proposed by the embodiments of the present disclosure can accurately determine the second position coordinates of each target object in the target site at the current moment, and can be applied in many scenarios. Taking application in a factory as an example, after the second position coordinates of the target objects in the target site are obtained, as shown in FIG. 9, the positioning method provided by the embodiments of the present disclosure further includes the following S801 to S802:
S801: determine whether any target object has entered a preset target area, based on the second position coordinates of each target object in the target site and the preset target area;
S802: if a target object is determined to have entered the target area, issue an early-warning prompt.
示例性地，目标场所为工厂的情况下，可以预先在目标场所对应的世界坐标系中设定工厂内存在危险的目标区域对应的坐标范围。然后根据确定的目标场所中的各目标对象在当前时刻分别对应的第二位置坐标以及目标区域对应的坐标范围，确定是否存在进入目标区域的目标对象。进一步在确定存在进入目标区域的目标对象的情况下，进行预警提示。Exemplarily, where the target place is a factory, the coordinate range corresponding to a dangerous target area in the factory may be set in advance in the world coordinate system corresponding to the target place. Whether there is a target object entering the target area is then determined from the determined second position coordinates of each target object in the target place at the current moment and the coordinate range corresponding to the target area. Further, when it is determined that such a target object exists, an early warning prompt is issued.
示例性地,预警提示可以包括但不限于声光报警提示、语音报警提示等。通过预警提示,可以保障目标场所中员工的安全,提高目标场所的安全性。Exemplarily, the early warning prompts may include, but are not limited to, sound and light alarm prompts, voice alarm prompts, and the like. Through the early warning prompts, the safety of employees in the target site can be guaranteed and the safety of the target site can be improved.
本公开实施例中，在得到目标场所中的各目标对象准确度较高的第二位置坐标后，可以基于预先设定的目标区域，比如预先设定的危险区域，判断目标场所中的目标对象是否进入目标区域，以便及时预警提示，提高目标场所的安全性。In the embodiment of the present disclosure, after the comparatively accurate second position coordinates of each target object in the target place are obtained, whether a target object in the target place has entered a preset target area, such as a preset dangerous area, can be determined, so that an early warning prompt is issued in time and the safety of the target place is improved.
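As a minimal illustration of S801 to S802, the sketch below checks each target object's fused world coordinate against a rectangular danger zone. The function names, zone bounds, and coordinates are hypothetical; the disclosure does not prescribe any particular representation of the target area.

```python
def in_target_area(pos, area):
    """area = (x_min, y_min, x_max, y_max) in the world coordinate system
    of the target place (units are illustrative, e.g. metres)."""
    x, y = pos
    x_min, y_min, x_max, y_max = area
    return x_min <= x <= x_max and y_min <= y <= y_max


def check_alerts(second_positions, area):
    """Return the identity identifiers of target objects whose second
    position coordinates at the current moment fall inside the area."""
    return [obj_id for obj_id, pos in second_positions.items()
            if in_target_area(pos, area)]


# Hypothetical data: one employee inside the preset danger zone.
danger_zone = (10.0, 5.0, 14.0, 9.0)
positions = {"emp_01": (12.0, 7.0), "emp_02": (2.0, 3.0)}
print(check_alerts(positions, danger_zone))  # prints ['emp_01']
```

In practice the area need not be axis-aligned; a polygon test would replace `in_target_area` without changing the alerting logic.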
下面结合图10，以目标场所为工厂，目标对象为员工为例，对本公开实施例提供的目标追踪过程进行阐述：The following describes the target tracking process provided by the embodiment of the present disclosure with reference to FIG. 10, taking a factory as the target place and employees as the target objects:
1)针对工厂进行采集设备安装，比如在工厂安装多个相机；为了实现对场景内目标的精准定位，并保证算法通用性及鲁棒性，使得不同的采集设备在工厂中的采集视角不同，确保进入工厂中的每个员工至少被两个采集设备同时采集到；1) Install acquisition devices for the factory, for example multiple cameras; to achieve accurate positioning of targets in the scene and to ensure the generality and robustness of the algorithm, the acquisition devices are arranged with different viewing angles in the factory, ensuring that each employee entering the factory is captured by at least two acquisition devices simultaneously;
2)使用张正友标定方式确定每个相机的内参矩阵和畸变系数;2) Use Zhang Zhengyou's calibration method to determine the internal parameter matrix and distortion coefficient of each camera;
3)在工厂内设置多个标志物，确定标志物与地面的交点在工厂对应的世界坐标系中的位置坐标；并根据相机的内参矩阵和畸变系数确定标志物与地面的交点在样本视频画面中的修正像素坐标；并根据交点在世界坐标系中的位置坐标和在样本视频画面中的修正像素坐标，确定每个相机的单应性矩阵；3) Set up multiple markers in the factory and determine the position coordinates, in the world coordinate system corresponding to the factory, of the intersection of each marker with the ground; determine the corrected pixel coordinates of each intersection in the sample video frame according to the camera's intrinsic parameter matrix and distortion coefficients; and determine the homography matrix of each camera from the intersections' position coordinates in the world coordinate system and their corrected pixel coordinates in the sample video frame;
4)针对进入工厂中的每个员工进行特征检测，比如可以包含图10中的人体检测和人脸识别，得到每个员工的特征信息；并基于提取的特征信息和预先构建并存储的员工身份库中每个员工的特征信息，确定进入工厂的每个员工的身份标识符；4) Perform feature detection for each employee entering the factory, which may include, for example, the human body detection and face recognition in FIG. 10, to obtain each employee's feature information; and determine the identity identifier of each employee entering the factory based on the extracted feature information and the feature information of each employee in the pre-built and pre-stored employee identity database;
5)针对采集设备采集的当前时刻的视频画面，使用加入特征金字塔的神经网络进行目标检测，得到每张当前时刻的视频画面中涉及的员工的像素坐标；5) For the video frames captured at the current moment by the acquisition devices, perform target detection with a neural network incorporating a feature pyramid, obtaining the pixel coordinates of the employees appearing in each current-moment video frame;
6)根据采集该张视频画面的相机的内参矩阵和畸变系数,对该张视频画面中涉及的员工的像素坐标进行修正,得到该张视频画面中涉及的员工的修正像素坐标;6) According to the internal parameter matrix and the distortion coefficient of the camera that collects the video picture, the pixel coordinates of the employees involved in the video picture are corrected to obtain the corrected pixel coordinates of the employees involved in the video picture;
7)根据采集该张视频画面的相机的单应性矩阵和该张视频画面中涉及的员工的修正像素坐标,确定该张视频画面中涉及的员工在工厂中的初始位置坐标;7) According to the homography matrix of the camera that collects the video picture and the corrected pixel coordinates of the employee involved in the video picture, determine the initial position coordinates of the employee involved in the video picture in the factory;
8)对同一时刻采集到的视频画面中涉及同一员工的初始位置坐标进行融合，得到工厂中的员工在该时刻的第一位置坐标；8) Fuse the initial position coordinates relating to the same employee that were determined from video frames captured at the same moment, to obtain the first position coordinates of the employees in the factory at that moment;
9)根据确定的员工在上一时刻的第二位置坐标和员工在当前时刻的第一位置坐标,确定员工在当前时刻的第二位置坐标,具体过程详见上文;9) According to the determined second position coordinates of the employee at the previous moment and the first position coordinates of the employee at the current moment, determine the second position coordinates of the employee at the current moment. For the specific process, please refer to the above;
10)每次在确定员工在当前时刻的第二位置坐标的同时，可以在员工的第二位置坐标指示的地图位置中标记员工关联的身份标识符；进一步基于标记同一身份标识符的员工在多个时刻的第二位置坐标，生成每个员工的轨迹数据。10) Each time the second position coordinates of an employee at the current moment are determined, the employee's associated identity identifier can be marked at the map position indicated by those second position coordinates; further, the trajectory data of each employee is generated based on the second position coordinates, at multiple moments, of the employees marked with the same identity identifier.
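Step 3) above determines each camera's homography matrix from marker/ground-intersection correspondences. The disclosure does not fix a solver for this sub-step; a common choice is the direct linear transform (DLT), sketched below with NumPy. This is illustrative only — in practice a routine such as OpenCV's `findHomography` is typical.

```python
import numpy as np

def homography_from_points(pixel_pts, world_pts):
    """Estimate the 3x3 homography H mapping corrected pixel coordinates
    (u, v) to ground-plane world coordinates (x, y), from >= 4
    marker/ground intersection correspondences, via the standard DLT."""
    A = []
    for (u, v), (x, y) in zip(pixel_pts, world_pts):
        A.append([u, v, 1, 0, 0, 0, -x * u, -x * v, -x])
        A.append([0, 0, 0, u, v, 1, -y * u, -y * v, -y])
    # The homography is the null vector of A, i.e. the right singular
    # vector associated with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # fix the arbitrary projective scale
```

With `H` determined per camera, step 7) reduces to multiplying `H` by the homogeneous corrected pixel coordinate and dividing by the third component.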
本领域技术人员可以理解，在具体实施方式的上述方法中，各步骤的撰写顺序并不意味着严格的执行顺序而对实施过程构成任何限定，各步骤的具体执行顺序应当以其功能和可能的内在逻辑确定。Those skilled in the art can understand that, in the above methods of the specific implementations, the order in which the steps are written does not imply a strict execution order or constitute any limitation on the implementation process; the specific execution order of the steps should be determined by their functions and possible internal logic.
基于同一技术构思，本公开实施例中还提供了与目标追踪方法对应的目标追踪装置，由于本公开实施例中的装置解决问题的原理与本公开实施例上述目标追踪方法相似，因此装置的实施可以参见方法的实施，重复之处不再赘述。Based on the same technical concept, the embodiments of the present disclosure further provide a target tracking apparatus corresponding to the target tracking method; since the principle by which the apparatus in the embodiments of the present disclosure solves the problem is similar to that of the above target tracking method, the implementation of the apparatus may refer to the implementation of the method, and repeated descriptions are omitted.
参照图11所示,为本公开实施例提供的一种目标追踪装置900的示意图,该目标追踪装置包括:Referring to FIG. 11 , which is a schematic diagram of a target tracking apparatus 900 according to an embodiment of the present disclosure, the target tracking apparatus includes:
获取模块901,用于获取目标场所内设置的多个采集设备采集的当前时刻的视频画面;多个采集设备在目标场所中的采集视角不同,视频画面中包括目标场所中目标对象的感兴趣区域;The acquisition module 901 is used to acquire the video images at the current moment collected by multiple collection devices set in the target site; the collection perspectives of the multiple collection devices in the target site are different, and the video images include the region of interest of the target object in the target site ;
确定模块902,用于基于多个采集设备采集的当前时刻的视频画面,确定各个所述目标对象在当前时刻的第一位置坐标;A determination module 902, configured to determine the first position coordinates of each of the target objects at the current moment based on the video images at the current moment collected by multiple collection devices;
追踪模块903,用于针对各个所述目标对象,基于该目标对象的第一位置坐标和该目标对象在上一时刻的第二位置坐标,确定该目标对象在当前时刻的第二位置坐标。The tracking module 903 is configured to, for each of the target objects, determine the second position coordinates of the target object at the current moment based on the first position coordinates of the target object and the second position coordinates of the target object at the previous moment.
在一种可能的实施方式中，追踪模块903在用于针对各个所述目标对象，基于该目标对象的第一位置坐标和该目标对象在上一时刻的第二位置坐标，确定该目标对象在当前时刻的第二位置坐标时，包括：In a possible implementation manner, when the tracking module 903 is used to determine, for each of the target objects, the second position coordinates of the target object at the current moment based on the first position coordinates of the target object and the second position coordinates of the target object at the previous moment, the operations include:
基于目标对象在上一时刻的第二位置坐标,确定目标对象在当前时刻的预测位置坐标;Determine the predicted position coordinates of the target object at the current moment based on the second position coordinates of the target object at the previous moment;
基于该目标对象在当前时刻的预测位置坐标和第一位置坐标,确定该目标对象在当前时刻的观测位置坐标;Based on the predicted position coordinates and the first position coordinates of the target object at the current moment, determine the observed position coordinates of the target object at the current moment;
基于该目标对象在当前时刻的预测位置坐标和观测位置坐标,确定该目标对象在当前时刻的第二位置坐标。Based on the predicted position coordinates and the observed position coordinates of the target object at the current moment, the second position coordinates of the target object at the current moment are determined.
在一种可能的实施方式中,追踪模块903在用于基于该目标对象在当前时刻的预测位置坐标和第一位置坐标,确定该目标对象在当前时刻的观测位置坐标时,包括:In a possible implementation manner, when the tracking module 903 is used to determine the observed position coordinates of the target object at the current moment based on the predicted position coordinates and the first position coordinates of the target object at the current moment, the method includes:
基于多个目标对象在当前时刻的预测位置坐标和第一位置坐标,确定与同一目标对象关联的预测位置坐标和第一位置坐标;Determine the predicted position coordinates and the first position coordinates associated with the same target object based on the predicted position coordinates and the first position coordinates of the multiple target objects at the current moment;
确定与同一目标对象关联的预测位置坐标和第一位置坐标的第一中点坐标，将该第一中点坐标作为该目标对象在当前时刻的观测位置坐标。Determine the first midpoint coordinate of the predicted position coordinates and the first position coordinates associated with the same target object, and take that first midpoint coordinate as the observed position coordinates of the target object at the current moment.
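The predict/observe/update split described for the tracking module can be sketched as follows. The motion model and the final blending rule are not fixed by the disclosure, so the constant-velocity prediction and the fixed gain below are hypothetical choices; only the midpoint observation rule follows the text directly.

```python
def predict(prev_second_pos, prev_velocity, dt=1.0):
    """Predicted position coordinates at the current moment, extrapolated
    from the second position coordinates at the previous moment
    (constant-velocity assumption; the disclosure leaves the model open)."""
    return (prev_second_pos[0] + prev_velocity[0] * dt,
            prev_second_pos[1] + prev_velocity[1] * dt)


def observe(predicted, first_pos):
    """Observed position coordinates: the first midpoint coordinate of the
    predicted position coordinates and the first position coordinates."""
    return ((predicted[0] + first_pos[0]) / 2.0,
            (predicted[1] + first_pos[1]) / 2.0)


def update(predicted, observed, gain=0.5):
    """Second position coordinates at the current moment: the prediction
    corrected towards the observation by a fixed gain (a Kalman-style
    update would instead compute this gain from the covariances)."""
    return (predicted[0] + gain * (observed[0] - predicted[0]),
            predicted[1] + gain * (observed[1] - predicted[1]))
```

Chaining the three calls each frame yields the smoothed trajectory: the update output becomes the previous-moment second position for the next frame's prediction.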
在一种可能的实施方式中,追踪模块903在用于基于多个目标对象在当前时刻的预测位置坐标和第一位置坐标,确定与同一目标对象关联的预测位置坐标和第一位置坐标时,包括:In a possible implementation manner, when the tracking module 903 is used to determine the predicted position coordinates and the first position coordinates associated with the same target object based on the predicted position coordinates and the first position coordinates of the multiple target objects at the current moment, include:
针对每个预测位置坐标,确定该预测位置坐标和各个第一位置坐标之间的第一距离;For each predicted position coordinate, determine a first distance between the predicted position coordinate and each first position coordinate;
将该预测位置坐标和与该预测位置坐标具有最小第一距离的第一位置坐标，作为与同一目标对象关联的预测位置坐标和第一位置坐标，最小第一距离小于第一预设融合距离阈值。Take the predicted position coordinates and the first position coordinates having the minimum first distance to them as the predicted position coordinates and the first position coordinates associated with the same target object, where the minimum first distance is smaller than the first preset fusion distance threshold.
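A minimal greedy sketch of this minimum-first-distance association follows. The structure is hypothetical; the disclosure only specifies the nearest-coordinate rule and the first preset fusion distance threshold, not a matching algorithm.

```python
import math

def associate(predictions, detections, fusion_threshold):
    """Pair each predicted position coordinate with the first position
    coordinate at minimum first distance, provided that minimum distance
    is below the first preset fusion distance threshold."""
    matches = {}
    used = set()
    for obj_id, (px, py) in predictions.items():
        best_idx, best_dist = None, fusion_threshold
        for idx, (dx, dy) in enumerate(detections):
            if idx in used:
                continue
            dist = math.hypot(px - dx, py - dy)
            if dist < best_dist:
                best_idx, best_dist = idx, dist
        if best_idx is not None:
            used.add(best_idx)
            matches[obj_id] = detections[best_idx]
    return matches
```

Greedy matching is order-dependent; a globally optimal assignment (e.g. Hungarian algorithm) is a common alternative when targets are close together.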
在一种可能的实施方式中,追踪模块903还用于:In a possible implementation manner, the tracking module 903 is further configured to:
确定当前时刻的视频画面中是否存在漏检的目标对象,其中,漏检的目标对象在当前时刻具有预测位置坐标,且在当前时刻的第一位置坐标为空;Determine whether there is an undetected target object in the video picture at the current moment, wherein the missed target object has a predicted position coordinate at the current moment, and the first position coordinate at the current moment is empty;
在确定存在漏检的目标对象的情况下,将漏检的目标对象在当前时刻的预测位置坐标作为漏检的目标对象在当前时刻的观测位置坐标。When it is determined that there is an undetected target object, the predicted position coordinates of the missed target object at the current moment are taken as the observed position coordinates of the missed target object at the current moment.
在一种可能的实施方式中,目标对象包含多个,追踪模块903还用于:In a possible implementation manner, the target object includes multiple objects, and the tracking module 903 is further configured to:
在确定目标对象在当前时刻的第二位置坐标后,在第二位置坐标指示的地图位置中标记与目标对象关联的身份标识符;After determining the second position coordinates of the target object at the current moment, marking the identity identifier associated with the target object in the map position indicated by the second position coordinates;
基于标记同一身份标识符的目标对象在多个时刻的第二位置坐标,生成每个目标对象的轨迹数据。Based on the second position coordinates of the target objects marked with the same identity identifier at multiple times, the trajectory data of each target object is generated.
在一种可能的实施方式中,在确定目标对象在当前时刻的第二位置坐标之后,追踪模块903还用于:In a possible implementation manner, after determining the second position coordinates of the target object at the current moment, the tracking module 903 is further configured to:
基于多个目标对象在当前时刻的第二位置坐标,检测是否存在偏离目标群的目标对象;目标群为根据多个目标对象在上一时刻的第二位置坐标进行聚类得到的;Based on the second position coordinates of the multiple target objects at the current moment, detect whether there are target objects that deviate from the target group; the target group is obtained by clustering according to the second position coordinates of the multiple target objects at the previous moment;
在确定存在偏离目标群的目标对象情况下,检测偏离目标群的目标对象关联的身份标识符是否准确;In the case of determining that there is a target object deviating from the target group, detecting whether the identity identifier associated with the target object deviating from the target group is accurate;
在确定偏离目标群的目标对象的身份标识符不准确的情况下,对偏离目标群的目标对象关联的身份标识符进行修正。If it is determined that the identity identifiers of the target objects deviating from the target group are inaccurate, the identity identifiers associated with the target objects deviating from the target group are corrected.
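A minimal sketch of the deviation check follows. It is a simplification: the disclosure obtains the target group by clustering the previous moment's second position coordinates, while this sketch reduces the group to a single centroid plus a hypothetical radius.

```python
import math

def centroid(points):
    """Centre of a target group, here the mean of its member coordinates."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def deviating_objects(second_positions, group_center, radius):
    """Return the identity identifiers whose second position coordinates at
    the current moment lie farther than `radius` from the group centre."""
    return [obj_id for obj_id, (x, y) in second_positions.items()
            if math.hypot(x - group_center[0], y - group_center[1]) > radius]
```

Any identifier returned here would then be re-verified against the stored feature/identifier mapping, as described above.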
在一种可能的实施方式中,追踪模块903在用于检测偏离目标群的目标对象关联的身份标识符是否准确时,包括:In a possible implementation manner, when the tracking module 903 is used to detect whether the identity identifier associated with the target object deviating from the target group is accurate, the method includes:
提取偏离目标群的目标对象的特征信息;Extract feature information of target objects that deviate from the target group;
基于偏离目标群的目标对象的特征信息，以及预先保存的进入目标场所中的各目标对象的特征信息和身份标识符之间的映射关系，检测偏离目标群的目标对象关联的身份标识符是否准确。Based on the feature information of the target object deviating from the target group, and the pre-saved mapping relationship between the feature information and the identity identifier of each target object entering the target place, detect whether the identity identifier associated with the target object deviating from the target group is accurate.
在一种可能的实施方式中,确定模块902在用于基于多个采集设备采集的当前时刻的视频画面,确定各个所述目标对象在当前时刻的第一位置坐标时,包括:In a possible implementation manner, when the determining module 902 is used to determine the first position coordinates of each target object at the current moment based on the video images at the current moment collected by multiple collection devices, the method includes:
获取多个采集设备分别采集的当前时刻的视频画面中的感兴趣区域的像素坐标;Acquiring the pixel coordinates of the region of interest in the video image at the current moment collected by multiple collection devices respectively;
针对每个采集设备，基于该采集设备采集的当前时刻的视频画面中的感兴趣区域的像素坐标和该采集设备的参数信息，确定该视频画面中感兴趣区域对应的目标对象在当前时刻下在目标场所中的初始位置坐标；For each acquisition device, determine, based on the pixel coordinates of the region of interest in the current-moment video frame captured by that acquisition device and the parameter information of that acquisition device, the initial position coordinates in the target place, at the current moment, of the target object corresponding to the region of interest in that video frame;
对初始位置坐标中属于同一目标对象的初始位置坐标进行融合,得到该目标对象在当前时刻下在目标场所中的第一位置坐标。The initial position coordinates belonging to the same target object in the initial position coordinates are fused to obtain the first position coordinates of the target object in the target place at the current moment.
在一种可能的实施方式中,确定模块902在用于获取多个采集设备分别采集的当前时刻的视频画面中的感兴趣区域的像素坐标时,包括:In a possible implementation manner, when the determining module 902 is used to acquire the pixel coordinates of the region of interest in the video images at the current moment respectively collected by multiple collection devices, the method includes:
将多个当前时刻的视频画面输入预先训练的神经网络，得到每个当前时刻的视频画面中的目标检测框；其中，神经网络包含多个用于检测不同尺寸的目标对象的感兴趣区域的目标检测子网络；Input the multiple current-moment video frames into a pre-trained neural network to obtain the target detection frames in each current-moment video frame; wherein the neural network includes multiple target detection sub-networks for detecting regions of interest of target objects of different sizes;
提取每个当前时刻的视频画面中的目标检测框上的目标位置点在该当前时刻的视频画面中的像素坐标,得到该当前时刻的视频画面中的感兴趣区域的像素坐标。Extract the pixel coordinates of the target position point on the target detection frame in the video picture at the current moment in the video picture at the current moment, and obtain the pixel coordinates of the region of interest in the video picture at the current moment.
在一种可能的实施方式中，确定模块902在用于基于每个采集设备采集的当前时刻的视频画面中的感兴趣区域的像素坐标和该采集设备的参数信息，确定感兴趣区域对应的目标对象在当前时刻下在目标场所中的初始位置坐标时，包括：In a possible implementation manner, when the determining module 902 is used to determine, based on the pixel coordinates of the region of interest in the current-moment video frame captured by each acquisition device and the parameter information of that acquisition device, the initial position coordinates in the target place, at the current moment, of the target object corresponding to the region of interest, the operations include:
基于每个采集设备的内参矩阵和畸变参数,对该采集设备采集的视频画面中感兴趣区域的像素坐标进行修正,得到该视频画面中的感兴趣区域的修正像素坐标;Based on the internal parameter matrix and distortion parameters of each acquisition device, correct the pixel coordinates of the region of interest in the video picture collected by the acquisition device, and obtain the corrected pixel coordinates of the region of interest in the video picture;
基于预先确定的该采集设备的单应性矩阵和该采集设备采集的当前时刻的视频画面中的感兴趣区域的修正像素坐标，确定该视频画面中的感兴趣区域对应的目标对象的初始位置坐标。Based on the predetermined homography matrix of the acquisition device and the corrected pixel coordinates of the region of interest in the current-moment video frame captured by that acquisition device, determine the initial position coordinates of the target object corresponding to the region of interest in that video frame.
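Applying the predetermined homography to a corrected pixel coordinate is a homogeneous multiplication followed by normalization, as in the sketch below. The distortion correction itself, based on the intrinsic matrix and distortion parameters, is typically performed with a routine such as OpenCV's `undistortPoints` and is omitted here.

```python
import numpy as np

def pixel_to_world(H, corrected_uv):
    """Map a corrected pixel coordinate (u, v) to the initial position
    coordinate (x, y) on the ground plane of the target place."""
    u, v = corrected_uv
    # Homogeneous projection, then divide out the scale factor w.
    x, y, w = H @ np.array([u, v, 1.0])
    return (x / w, y / w)
```

For a pure translation homography such as `[[1, 0, 5], [0, 1, 7], [0, 0, 1]]`, the pixel `(2, 3)` maps to the world coordinate `(7, 10)`.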
在一种可能的实施方式中,确定模块902在用于对初始坐标位置中属于同一目标对象的初始位置坐标进行融合,得到该目标对象在当前时刻下在目标场所中的第一位置坐标时,包括:In a possible implementation manner, when the determining module 902 is used to fuse the initial position coordinates belonging to the same target object in the initial coordinate positions to obtain the first position coordinates of the target object in the target place at the current moment, include:
基于多个当前时刻的视频画面确定的初始位置坐标,确定与同一目标对象关联的多个初始位置坐标;Determine a plurality of initial position coordinates associated with the same target object based on the initial position coordinates determined by a plurality of video images at the current moment;
将与同一目标对象关联的多个初始位置坐标进行依次融合,得到该目标对象在当前时刻下在目标场所中的第一位置坐标。The multiple initial position coordinates associated with the same target object are sequentially fused to obtain the first position coordinates of the target object in the target place at the current moment.
在一种可能的实施方式中,确定模块902在用于将与同一目标对象关联的多个初始位置坐标进行依次融合,得到该目标对象在当前时刻下在目标场所中的第一位置坐标时,包括:In a possible implementation manner, when the determining module 902 is used to sequentially fuse multiple initial position coordinates associated with the same target object to obtain the first position coordinates of the target object in the target place at the current moment, include:
从该目标对象关联的多个初始位置坐标中选取任一初始位置坐标,将选取的任一初始位置坐标作为第一中间融合位置坐标;Select any initial position coordinate from a plurality of initial position coordinates associated with the target object, and use any selected initial position coordinate as the first intermediate fusion position coordinate;
将第一中间融合位置坐标与其它任一待融合的初始位置坐标进行融合，生成第二中间融合位置坐标，将第二中间融合位置坐标作为更新后的第一中间融合位置坐标，并返回生成第二中间融合位置坐标的步骤，直到不存在待融合的初始位置坐标。Fuse the first intermediate fused position coordinates with any other initial position coordinates to be fused to generate second intermediate fused position coordinates; take the second intermediate fused position coordinates as the updated first intermediate fused position coordinates, and return to the step of generating the second intermediate fused position coordinates until no initial position coordinates to be fused remain.
在一种可能的实施方式中,确定模块902在用于将第一中间融合位置坐标与其它任一待融合的初始位置坐标进行融合,生成第二中间融合位置坐标时,包括:In a possible implementation manner, when the determining module 902 is used to fuse the first intermediate fused position coordinates with any other initial position coordinates to be fused to generate the second intermediate fused position coordinates, it includes:
确定第一中间融合位置坐标与其它任一待融合的初始位置坐标的中点坐标,将该中点坐标作为生成的第二中间融合位置坐标。Determine the midpoint coordinate of the first intermediate fusion position coordinate and any other initial position coordinate to be fused, and use the midpoint coordinate as the generated second intermediate fusion position coordinate.
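The sequential midpoint fusion described above can be sketched directly (the helper name is hypothetical):

```python
def fuse_initial_coords(initial_coords):
    """Sequentially fuse one target object's initial position coordinates
    from several views: pick any one as the first intermediate fused
    position coordinate, then repeatedly replace it with the midpoint of
    itself and the next coordinate to be fused."""
    coords = iter(initial_coords)
    fused = next(coords)  # first intermediate fused position coordinate
    for cx, cy in coords:
        # second intermediate fused position coordinate = midpoint
        fused = ((fused[0] + cx) / 2.0, (fused[1] + cy) / 2.0)
    return fused
```

Note that with pairwise midpoints the later-fused coordinates receive larger effective weight than earlier ones; a plain arithmetic mean over all views would weight them equally, which is a design trade-off the scheme leaves open.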
在一种可能的实施方式中,确定模块902在用于基于多个当前时刻的视频画面确定的初始位置坐标,确定与同一目标对象关联的多个初始位置坐标时,包括:In a possible implementation manner, when the determining module 902 is used to determine a plurality of initial position coordinates associated with the same target object based on the initial position coordinates determined based on a plurality of video images at the current moment, the method includes:
针对当前时刻的任意两个视频画面，确定任意两个视频画面中第一视频画面中每个感兴趣区域对应的目标对象为第一目标对象，该任意两个视频画面中第二视频画面中每个感兴趣区域对应的目标对象为第二目标对象，确定该第一目标对象的初始位置坐标与第二视频画面中各个第二目标对象的初始位置坐标之间的第二距离；For any two video frames at the current moment, take the target object corresponding to each region of interest in the first of the two video frames as a first target object and the target object corresponding to each region of interest in the second of the two video frames as a second target object, and determine the second distance between the initial position coordinates of each first target object and the initial position coordinates of each second target object in the second video frame;
针对每个第一目标对象的初始位置坐标，将该第一目标对象的初始位置坐标和与该第一目标对象具有最小第二距离的第二目标对象的初始位置坐标，作为与同一目标对象关联的多个初始位置坐标；最小第二距离小于第二预设融合距离阈值。For the initial position coordinates of each first target object, take the initial position coordinates of that first target object and the initial position coordinates of the second target object having the minimum second distance to it as multiple initial position coordinates associated with the same target object, where the minimum second distance is less than the second preset fusion distance threshold.
在一种可能的实施方式中,在追踪模块903确定目标对象在当前时刻的第二位置坐标后,确定模块902还用于:In a possible implementation manner, after the tracking module 903 determines the second position coordinates of the target object at the current moment, the determining module 902 is further configured to:
基于目标场所中的各目标对象分别对应的第二位置坐标,以及预先设定的目标区域,确定是否存在进入目标区域的目标对象;Determine whether there is a target object entering the target area based on the second position coordinates corresponding to each target object in the target place and the preset target area;
在确定存在进入目标区域的目标对象的情况下,进行预警提示。When it is determined that there is a target object entering the target area, an early warning prompt is given.
关于装置中的各模块的处理流程、以及各模块之间的交互流程的描述可以参照上述方法实施例中的相关说明,这里不再详述。For the description of the processing flow of each module in the apparatus and the interaction flow between the modules, reference may be made to the relevant descriptions in the foregoing method embodiments, which will not be described in detail here.
对应于图1中的目标追踪方法,本公开实施例还提供了一种电子设备1100,如图12所示,为本公开实施例提供的电子设备1100结构示意图,包括:Corresponding to the target tracking method in FIG. 1 , an embodiment of the present disclosure further provides an electronic device 1100 . As shown in FIG. 12 , a schematic structural diagram of the electronic device 1100 provided by an embodiment of the present disclosure includes:
处理器111、存储器112、和总线113；存储器112用于存储执行指令，包括内存1121和外部存储器1122；这里的内存1121也称内存储器，用于暂时存放处理器111中的运算数据，以及与硬盘等外部存储器1122交换的数据；处理器111通过内存1121与外部存储器1122进行数据交换；当电子设备1100运行时，处理器111与存储器112之间通过总线113通信，使得处理器111执行以下指令：获取目标场所内设置的多个采集设备采集的当前时刻的视频画面；多个采集设备在目标场所中的采集视角不同，视频画面中包括目标场所中目标对象的感兴趣区域；基于多个采集设备采集的当前时刻的视频画面，确定目标场所中的目标对象在当前时刻的第一位置坐标；针对各个目标对象，基于该目标对象的第一位置坐标和该目标对象在上一时刻的第二位置坐标，确定该目标对象在当前时刻的第二位置坐标。a processor 111, a memory 112, and a bus 113; the memory 112 is used to store execution instructions and includes an internal memory 1121 and an external memory 1122; the internal memory 1121, also called memory, is used to temporarily store operation data in the processor 111 and data exchanged with the external memory 1122, such as a hard disk; the processor 111 exchanges data with the external memory 1122 through the internal memory 1121; when the electronic device 1100 runs, the processor 111 communicates with the memory 112 through the bus 113, so that the processor 111 executes the following instructions: obtain the video frames at the current moment captured by multiple acquisition devices set in a target place, where the multiple acquisition devices have different capture angles in the target place and the video frames include regions of interest of target objects in the target place; determine the first position coordinates of the target objects in the target place at the current moment based on the current-moment video frames captured by the multiple acquisition devices; and, for each target object, determine the second position coordinates of the target object at the current moment based on the first position coordinates of the target object and the second position coordinates of the target object at the previous moment.
本公开实施例还提供一种计算机可读存储介质,该计算机可读存储介质上存储有计算机程序,该计算机程序被处理器运行时执行上述方法实施例中所述的目标追踪方法的步骤。其中,该存储介质可以是易失性或非易失的计算机可读取存储介质。Embodiments of the present disclosure further provide a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is run by a processor, the steps of the target tracking method described in the above method embodiments are executed. Wherein, the storage medium may be a volatile or non-volatile computer-readable storage medium.
本公开实施例还提供一种计算机程序产品,该计算机程序产品承载有程序代码,所述程序代码包括的指令可用于执行上述方法实施例中所述的目标追踪方法的步骤,具体可参见上述方法实施例,在此不再赘述。Embodiments of the present disclosure further provide a computer program product, where the computer program product carries program codes, and the instructions included in the program codes can be used to execute the steps of the target tracking method described in the foregoing method embodiments. For details, please refer to the foregoing method. The embodiments are not repeated here.
其中,上述计算机程序产品可以具体通过硬件、软件或其结合的方式实现。在一个可选实施例中,所述计算机程序产品具体体现为计算机存储介质,在另一个可选实施例中,计算机程序产品具体体现为软件产品,例如软件开发包(Software Development Kit,SDK)等等。Wherein, the above-mentioned computer program product can be specifically implemented by hardware, software or a combination thereof. In an optional embodiment, the computer program product is embodied as a computer storage medium, and in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK), etc. Wait.
所属领域的技术人员可以清楚地了解到，为描述的方便和简洁，上述描述的系统和装置的具体工作过程，可以参考前述方法实施例中的对应过程，在此不再赘述。在本公开所提供的几个实施例中，应该理解到，所揭露的系统、装置和方法，可以通过其它的方式实现。以上所描述的装置实施例仅仅是示意性的，例如，所述单元的划分，仅仅为一种逻辑功能划分，实际实现时可以有另外的划分方式，又例如，多个单元或组件可以结合或者可以集成到另一个系统，或一些特征可以忽略，或不执行。另一点，所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些通信接口，装置或单元的间接耦合或通信连接，可以是电性，机械或其它的形式。Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems and apparatuses described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here. In the several embodiments provided by the present disclosure, it should be understood that the disclosed systems, apparatuses and methods may be implemented in other manners. The apparatus embodiments described above are merely illustrative; for example, the division of the units is only a division by logical function, and there may be other division manners in actual implementation; for another example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual couplings, direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some communication interfaces, apparatuses or units, and may be electrical, mechanical or in other forms.
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。The units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
另外,在本公开各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。In addition, each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
所述功能如果以软件功能单元的形式实现并作为独立的产品销售或使用时，可以存储在一个处理器可执行的非易失的计算机可读取存储介质中。基于这样的理解，本公开的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来，该计算机软件产品存储在一个存储介质中，包括若干指令用以使得一台计算机设备（可以是个人计算机，服务器，或者网络设备等）执行本公开各个实施例所述方法的全部或部分步骤。而前述的存储介质包括：U盘、移动硬盘、只读存储器（Read-Only Memory，ROM）、随机存取存储器（Random Access Memory，RAM）、磁碟或者光盘等各种可以存储程序代码的介质。If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a processor-executable non-volatile computer-readable storage medium. Based on such understanding, the technical solutions of the present disclosure essentially, or the part contributing to the prior art, or a part of the technical solutions, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present disclosure. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
最后应说明的是：以上所述实施例，仅为本公开的具体实施方式，用以说明本公开的技术方案，而非对其限制，本公开的保护范围并不局限于此，尽管参照前述实施例对本公开进行了详细的说明，本领域的普通技术人员应当理解：任何熟悉本技术领域的技术人员在本公开揭露的技术范围内，其依然可以对前述实施例所记载的技术方案进行修改或可轻易想到变化，或者对其中部分技术特征进行等同替换；而这些修改、变化或者替换，并不使相应技术方案的本质脱离本公开实施例技术方案的精神和范围，都应涵盖在本公开的保护范围之内。因此，本公开的保护范围应以权利要求的保护范围为准。Finally, it should be noted that the above embodiments are merely specific implementations of the present disclosure, used to illustrate rather than limit its technical solutions, and the protection scope of the present disclosure is not limited thereto. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that any person familiar with the art may, within the technical scope disclosed herein, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes, or make equivalent replacements of some of the technical features; such modifications, changes or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present disclosure, and shall all be covered by the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (20)

  1. A target tracking method, comprising:
    acquiring video frames captured at a current moment by a plurality of capture devices arranged in a target place, wherein the plurality of capture devices have different capture viewing angles in the target place, and the video frames include regions of interest of target objects in the target place;
    determining, based on the video frames captured at the current moment by the plurality of capture devices, first position coordinates of each target object at the current moment; and
    for each target object, determining second position coordinates of the target object at the current moment based on the first position coordinates of the target object and second position coordinates of the target object at a previous moment.
  2. The target tracking method according to claim 1, wherein determining the second position coordinates of the target object at the current moment based on the first position coordinates of the target object and the second position coordinates of the target object at the previous moment comprises:
    determining predicted position coordinates of the target object at the current moment based on the second position coordinates of the target object at the previous moment;
    determining observed position coordinates of the target object at the current moment based on the predicted position coordinates and the first position coordinates of the target object at the current moment; and
    determining the second position coordinates of the target object at the current moment based on the predicted position coordinates and the observed position coordinates of the target object at the current moment.
  3. The target tracking method according to claim 2, wherein determining the observed position coordinates of the target object at the current moment based on the predicted position coordinates and the first position coordinates of the target object at the current moment comprises:
    determining the first position coordinates of the target object based on a plurality of first position coordinates at the current moment and the predicted position coordinates of the target object at the current moment; and
    determining first midpoint coordinates between the predicted position coordinates and the first position coordinates of the target object, and taking the first midpoint coordinates as the observed position coordinates of the target object at the current moment.
  4. The target tracking method according to claim 3, wherein determining the first position coordinates of the target object based on the plurality of first position coordinates at the current moment and the predicted position coordinates of the target object at the current moment comprises:
    determining a first distance between the predicted position coordinates of the target object and each of the first position coordinates; and
    taking, from the plurality of first position coordinates, the first position coordinates forming a minimum first distance with the predicted position coordinates of the target object as the first position coordinates of the target object, wherein the minimum first distance is less than a first preset fusion distance threshold.
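Purely as an illustration, the tracking step of claims 2 to 4 can be sketched as a single function. The claims do not fix the prediction model or the final fusion rule, so a constant-velocity prediction and a fixed-gain fusion are assumed here; the threshold and gain values are hypothetical:

```python
import math

def track_step(prev_pos, prev_vel, detections, fuse_threshold=1.0, gain=0.5):
    """One tracker update in the spirit of claims 2-4.

    prev_pos: the object's second position coordinates at the previous moment.
    prev_vel: an assumed constant-velocity estimate (illustrative choice; the
    claims do not fix the prediction model).
    detections: candidate first position coordinates at the current moment.
    """
    # Claim 2: predict the current position from the previous second position.
    pred = (prev_pos[0] + prev_vel[0], prev_pos[1] + prev_vel[1])

    # Claim 4: pick the detection with the minimum first distance to the
    # prediction, provided it is below the preset fusion distance threshold.
    best, best_d = None, fuse_threshold
    for det in detections:
        d = math.dist(pred, det)
        if d < best_d:
            best, best_d = det, d
    if best is None:
        return pred  # no match: claim 5 falls back to the prediction

    # Claim 3: the observed position is the midpoint of prediction and match.
    obs = ((pred[0] + best[0]) / 2, (pred[1] + best[1]) / 2)

    # Claim 2, last step: fuse prediction and observation into the second
    # position; the fusion rule is unspecified, so a fixed gain is assumed.
    return (pred[0] + gain * (obs[0] - pred[0]),
            pred[1] + gain * (obs[1] - pred[1]))
```

With a Kalman filter in place of the fixed gain, the same structure would weight prediction and observation by their estimated uncertainties.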
  5. The target tracking method according to any one of claims 2 to 4, further comprising:
    determining whether a missed target object exists in the video frames at the current moment, wherein the missed target object has predicted position coordinates at the current moment and its first position coordinates at the current moment are empty; and
    in a case where it is determined that a missed target object exists, taking the predicted position coordinates of the missed target object at the current moment as the observed position coordinates of the missed target object at the current moment.
  6. The target tracking method according to any one of claims 1 to 5, further comprising:
    after the second position coordinates of each target object at the current moment are determined, marking an identity identifier associated with each target object at the map position indicated by the second position coordinates; and
    generating trajectory data of each target object based on second position coordinates, at a plurality of moments, of target objects marked with the same identity identifier.
  7. The target tracking method according to claim 6, wherein after the second position coordinates of each target object at the current moment are determined, the method further comprises:
    detecting, based on the second position coordinates of each target object at the current moment, whether a target object deviating from a target group exists, wherein the target group is obtained by clustering the second position coordinates of the target objects at the previous moment;
    in a case where it is determined that a target object deviating from the target group exists, detecting whether the identity identifier associated with the deviating target object is accurate; and
    in a case where it is determined that the identity identifier of the deviating target object is inaccurate, correcting the identity identifier associated with the deviating target object.
  8. The target tracking method according to claim 7, wherein detecting whether the identity identifier associated with the target object deviating from the target group is accurate comprises:
    extracting feature information of the target object deviating from the target group; and
    detecting whether the identity identifier associated with the deviating target object is accurate, based on the feature information of the deviating target object and a pre-stored mapping relationship between the feature information and the identity identifiers of objects entering the target place.
  9. The target tracking method according to claim 1, wherein determining the first position coordinates of each target object at the current moment based on the video frames captured at the current moment by the plurality of capture devices comprises:
    acquiring pixel coordinates of the regions of interest in the video frames captured at the current moment by the plurality of capture devices;
    for each of the plurality of capture devices, determining initial position coordinates, in the target place at the current moment, of the target object corresponding to the region of interest in the video frame captured by the capture device, based on the pixel coordinates of the region of interest in that video frame and parameter information of the capture device; and
    fusing initial position coordinates belonging to the same target object among the initial position coordinates, to obtain first position coordinates of the target object in the target place at the current moment.
  10. The target tracking method according to claim 9, wherein acquiring the pixel coordinates of the regions of interest in the video frames captured at the current moment by the plurality of capture devices comprises:
    inputting the video frames at the current moment into a pre-trained neural network respectively; and
    for each of the video frames at the current moment:
    obtaining a target detection frame in the video frame; and
    extracting pixel coordinates, in the video frame, of a target position point on the target detection frame, to obtain the pixel coordinates of the region of interest in the video frame.
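Which point on the detection frame serves as the target position point is not fixed by claim 10. A common choice for ground-plane tracking, assumed here purely for illustration, is the bottom centre of the box, since for an upright person it is the pixel closest to the floor:

```python
def roi_pixel_point(box):
    """Derive region-of-interest pixel coordinates from a detection box
    given as (x1, y1, x2, y2) with y growing downwards.

    The bottom-centre choice is an assumption for illustration; the claim
    only requires some target position point on the detection frame.
    """
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, float(y2))  # horizontal centre, bottom edge
```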
  11. The target tracking method according to claim 9 or 10, wherein determining the initial position coordinates, in the target place at the current moment, of the target object corresponding to the region of interest in the video frame, based on the pixel coordinates of the region of interest in the video frame captured by the capture device and the parameter information of the capture device, comprises:
    correcting the pixel coordinates of the region of interest in the video frame based on an intrinsic parameter matrix and distortion parameters of the capture device, to obtain corrected pixel coordinates of the region of interest in the video frame; and
    determining the initial position coordinates of the target object corresponding to the region of interest in the video frame, based on a predetermined homography matrix of the capture device and the corrected pixel coordinates of the region of interest in the video frame.
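The homography step of claim 11 can be illustrated as follows. The 3×3 matrix `H` is assumed to have been calibrated in advance (for example, from at least four pixel/floor point correspondences), and the distortion correction of the first step is assumed to have already been applied to the input coordinates:

```python
def pixel_to_ground(u, v, H):
    """Map corrected pixel coordinates (u, v) to initial position
    coordinates on the ground plane via a homography matrix H
    (3x3 nested list, image plane -> floor plane).
    """
    # Apply H to the homogeneous pixel coordinate [u, v, 1].
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    # Divide out the homogeneous scale to get Euclidean floor coordinates.
    return (x / w, y / w)
```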
  12. The target tracking method according to claim 9 or 10, wherein fusing the initial position coordinates belonging to the same target object among the initial position coordinates, to obtain the first position coordinates of the target object in the target place at the current moment, comprises:
    determining, based on the initial position coordinates, a plurality of initial position coordinates associated with the same target object; and
    sequentially fusing the plurality of initial position coordinates associated with the target object, to obtain the first position coordinates of the target object in the target place at the current moment.
  13. The target tracking method according to claim 12, wherein sequentially fusing the plurality of initial position coordinates associated with the target object, to obtain the first position coordinates of the target object in the target place at the current moment, comprises:
    selecting any initial position coordinates from the plurality of initial position coordinates associated with the target object, and taking the selected initial position coordinates as first intermediate fusion position coordinates; and
    fusing the first intermediate fusion position coordinates with any other initial position coordinates to be fused among the plurality of initial position coordinates to generate second intermediate fusion position coordinates, taking the second intermediate fusion position coordinates as updated first intermediate fusion position coordinates, and returning to the step of generating the second intermediate fusion position coordinates, until no initial position coordinates to be fused remain among the plurality of initial position coordinates.
  14. The target tracking method according to claim 13, wherein fusing the first intermediate fusion position coordinates with any other initial position coordinates to be fused among the plurality of initial position coordinates to generate the second intermediate fusion position coordinates comprises:
    determining midpoint coordinates between the first intermediate fusion position coordinates and the other initial position coordinates to be fused, and taking the midpoint coordinates as the second intermediate fusion position coordinates.
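The sequential fusion of claims 13 and 14 amounts to folding the coordinate list with repeated midpoints. A minimal sketch (the starting coordinate and fusion order are arbitrary, as the claims permit):

```python
def fuse_sequentially(coords):
    """Fold a list of initial position coordinates for one target object
    into a single point by repeated midpoint fusion (claims 13-14).

    Note: repeated pairwise midpoints weight later coordinates more heavily
    than a plain mean would; the claims specify the procedure itself, not a
    particular weighting.
    """
    coords = list(coords)
    fused = coords.pop(0)   # claim 13: pick any coordinate as the start
    while coords:           # repeat until none remain to be fused
        nxt = coords.pop(0)
        # claim 14: the new intermediate fusion position is the midpoint
        fused = ((fused[0] + nxt[0]) / 2, (fused[1] + nxt[1]) / 2)
    return fused
```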
  15. The target tracking method according to any one of claims 12 to 14, wherein determining, based on the initial position coordinates, the plurality of initial position coordinates associated with the same target object comprises:
    for any two video frames among the video frames at the current moment, determining the target object corresponding to the region of interest in a first video frame of the two as a first target object, and determining the target object corresponding to the region of interest in a second video frame of the two as a second target object;
    determining the initial position coordinates of each first target object in the first video frame;
    determining the initial position coordinates of each second target object in the second video frame; and
    for the initial position coordinates of each first target object:
    determining a second distance between the initial position coordinates of the first target object and the initial position coordinates of each second target object;
    determining that the second target object having a minimum second distance from the first target object and the first target object are the same target object, wherein the minimum second distance is less than a second preset fusion distance threshold; and
    taking the initial position coordinates of the first target object and the initial position coordinates of the second target object having the minimum second distance from the first target object as the plurality of initial position coordinates associated with the same target object.
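A minimal sketch of the pairwise cross-view association in claim 15, assuming Euclidean second distances and a hypothetical threshold value:

```python
import math

def associate_views(first_coords, second_coords, fuse_threshold=0.5):
    """For each initial position from the first video frame, find the
    nearest initial position from the second frame; if that minimum second
    distance is below the second preset fusion distance threshold, the two
    coordinates are treated as belonging to the same target object.
    Returns a list of coordinate groups, one per matched target object.
    """
    groups = []
    for p in first_coords:
        best, best_d = None, fuse_threshold
        for q in second_coords:
            d = math.dist(p, q)   # second distance between the two views
            if d < best_d:
                best, best_d = q, d
        if best is not None:
            groups.append([p, best])  # claim 15, last step
    return groups
```

Each resulting group would then be fed to the sequential fusion of claims 12 to 14.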
  16. The target tracking method according to any one of claims 1 to 15, further comprising, after the second position coordinates of each target object at the current moment are determined:
    determining, based on the second position coordinates corresponding to each target object in the target place and a preset target area, whether a target object entering the target area exists; and
    issuing an early warning prompt in a case where it is determined that a target object entering the target area exists.
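The area check of claim 16 can be illustrated with a standard ray-casting point-in-polygon test. The polygonal shape of the target area and the form of the warning are assumptions for illustration; the claim fixes neither:

```python
def inside_polygon(pt, poly):
    """Ray-casting point-in-polygon test; poly is a list of (x, y) vertices."""
    x, y = pt
    inside = False
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        # Toggle on each polygon edge that a rightward ray from pt crosses.
        if (y1 > y) != (y2 > y) and x < (x2 - x1) * (y - y1) / (y2 - y1) + x1:
            inside = not inside
    return inside

def warn_entries(positions, target_area):
    """Given each object's second position coordinates (a dict mapping a
    hypothetical identifier to coordinates), report which objects are inside
    the preset target area; the surrounding system could then raise the
    early warning prompt for them."""
    return [obj_id for obj_id, pos in positions.items()
            if inside_polygon(pos, target_area)]
```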
  17. A target tracking apparatus, comprising:
    an acquisition module, configured to acquire video frames captured at a current moment by a plurality of capture devices arranged in a target place, wherein the plurality of capture devices have different capture viewing angles in the target place, and the video frames include regions of interest of target objects in the target place;
    a determination module, configured to determine first position coordinates of each target object at the current moment based on the video frames captured at the current moment by the plurality of capture devices; and
    a tracking module, configured to, for each target object, determine second position coordinates of the target object at the current moment based on the first position coordinates of the target object and second position coordinates of the target object at a previous moment.
  18. An electronic device, comprising a processor, a memory, and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the electronic device runs, the processor communicates with the memory via the bus; and when the machine-readable instructions are executed by the processor, the steps of the target tracking method according to any one of claims 1 to 16 are performed.
  19. A computer-readable storage medium, storing a computer program which, when run by a processor, performs the steps of the target tracking method according to any one of claims 1 to 16.
  20. A computer program product, comprising a computer program stored in a storage medium, wherein the computer program, when executed by a processor, performs the target tracking method according to any one of claims 1 to 16.
PCT/CN2022/074956 2021-04-28 2022-01-29 Target tracking method and apparatus, electronic device, and storage medium WO2022227761A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110467650.7 2021-04-28
CN202110467650.7A CN113129339B (en) 2021-04-28 2021-04-28 Target tracking method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2022227761A1 true WO2022227761A1 (en) 2022-11-03

Family

ID=76781059

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/074956 WO2022227761A1 (en) 2021-04-28 2022-01-29 Target tracking method and apparatus, electronic device, and storage medium

Country Status (3)

Country Link
CN (1) CN113129339B (en)
TW (1) TW202244847A (en)
WO (1) WO2022227761A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116453029A (en) * 2023-06-16 2023-07-18 济南东庆软件技术有限公司 Building fire environment detection method based on image data

Families Citing this family (3)

Publication number Priority date Publication date Assignee Title
CN113129339B (en) * 2021-04-28 2023-03-10 北京市商汤科技开发有限公司 Target tracking method and device, electronic equipment and storage medium
CN113759314A (en) * 2021-09-01 2021-12-07 浙江讯飞智能科技有限公司 Sound source visualization method, device and system and computer readable storage medium
CN113823029A (en) * 2021-10-29 2021-12-21 北京市商汤科技开发有限公司 Video processing method and device, electronic equipment and storage medium

Citations (5)

Publication number Priority date Publication date Assignee Title
US20170344832A1 (en) * 2012-11-28 2017-11-30 Innovative Alert Systems Inc. System and method for event monitoring and detection
CN111277745A (en) * 2018-12-04 2020-06-12 北京奇虎科技有限公司 Target person tracking method and device, electronic equipment and readable storage medium
CN111563919A (en) * 2020-04-03 2020-08-21 深圳市优必选科技股份有限公司 Target tracking method and device, computer readable storage medium and robot
CN112653848A (en) * 2020-12-23 2021-04-13 北京市商汤科技开发有限公司 Display method and device in augmented reality scene, electronic equipment and storage medium
CN113129339A (en) * 2021-04-28 2021-07-16 北京市商汤科技开发有限公司 Target tracking method and device, electronic equipment and storage medium

Family Cites Families (9)

Publication number Priority date Publication date Assignee Title
CN107481327B (en) * 2017-09-08 2019-03-15 腾讯科技(深圳)有限公司 About the processing method of augmented reality scene, device, terminal device and system
CN110544273B (en) * 2018-05-29 2022-08-05 杭州海康机器人技术有限公司 Motion capture method, device and system
CN110544278B (en) * 2018-05-29 2022-09-16 杭州海康机器人技术有限公司 Rigid body motion capture method and device and AGV pose capture system
CN110517292A (en) * 2019-08-29 2019-11-29 京东方科技集团股份有限公司 Method for tracking target, device, system and computer readable storage medium
CN110738143B (en) * 2019-09-27 2023-06-02 Oppo广东移动通信有限公司 Positioning method and device, equipment and storage medium
CN110728717B (en) * 2019-09-27 2022-07-15 Oppo广东移动通信有限公司 Positioning method and device, equipment and storage medium
CN111539991B (en) * 2020-04-28 2023-10-20 北京市商汤科技开发有限公司 Target tracking method and device and storage medium
CN112330717B (en) * 2020-11-11 2023-03-10 北京市商汤科技开发有限公司 Target tracking method and device, electronic equipment and storage medium
CN112330721B (en) * 2020-11-11 2023-02-17 北京市商汤科技开发有限公司 Three-dimensional coordinate recovery method and device, electronic equipment and storage medium

Patent Citations (5)

Publication number Priority date Publication date Assignee Title
US20170344832A1 (en) * 2012-11-28 2017-11-30 Innovative Alert Systems Inc. System and method for event monitoring and detection
CN111277745A (en) * 2018-12-04 2020-06-12 北京奇虎科技有限公司 Target person tracking method and device, electronic equipment and readable storage medium
CN111563919A (en) * 2020-04-03 2020-08-21 深圳市优必选科技股份有限公司 Target tracking method and device, computer readable storage medium and robot
CN112653848A (en) * 2020-12-23 2021-04-13 北京市商汤科技开发有限公司 Display method and device in augmented reality scene, electronic equipment and storage medium
CN113129339A (en) * 2021-04-28 2021-07-16 北京市商汤科技开发有限公司 Target tracking method and device, electronic equipment and storage medium

Non-Patent Citations (2)

Title
GUAN YINGDA; YE ZUOCHANG; WANG YAN: "Object Detection with Extended Attention and Spatial Information", 2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), IEEE, 19 July 2020 (2020-07-19), pages 1 - 8, XP033833937, DOI: 10.1109/IJCNN48605.2020.9206729 *
YANG GUANG: "Geometric Features Recognition of UAV Swarm Targets", CHINESE MASTER'S THESES FULL-TEXT DATABASE - UNIVERSITY OF ELECTRONIC SCIENCE AND TECHNOLOGY OF CHINA, 15 August 2020 (2020-08-15), pages 1 - 85, XP055981058 *

Cited By (2)

Publication number Priority date Publication date Assignee Title
CN116453029A (en) * 2023-06-16 2023-07-18 济南东庆软件技术有限公司 Building fire environment detection method based on image data
CN116453029B (en) * 2023-06-16 2023-08-29 济南东庆软件技术有限公司 Building fire environment detection method based on image data

Also Published As

Publication number Publication date
CN113129339B (en) 2023-03-10
TW202244847A (en) 2022-11-16
CN113129339A (en) 2021-07-16

Similar Documents

Publication Publication Date Title
WO2022227761A1 (en) Target tracking method and apparatus, electronic device, and storage medium
Walch et al. Image-based localization using lstms for structured feature correlation
LU102028B1 (en) Multiple view multiple target tracking method and system based on distributed camera network
Torabi et al. An iterative integrated framework for thermal–visible image registration, sensor fusion, and people tracking for video surveillance applications
JP4852765B2 (en) Estimating connection relationship between distributed cameras and connection relationship estimation program
US9165190B2 (en) 3D human pose and shape modeling
Nam et al. Intelligent video surveillance system: 3-tier context-aware surveillance system with metadata
Calderara et al. Bayesian-competitive consistent labeling for people surveillance
CN110969118B (en) Track monitoring system and method
TW201118803A (en) Person-tracing apparatus and person-tracing program
CN102612704A (en) Method of providing a descriptor for at least one feature of an image and method of matching features
CN102243765A (en) Multi-camera-based multi-objective positioning tracking method and system
García et al. Tracking people motion based on extended condensation algorithm
CN111860352A (en) Multi-lens vehicle track full-tracking system and method
WO2022227462A1 (en) Positioning method and apparatus, electronic device, and storage medium
JP2013242728A (en) Image monitoring device
CN112950717A (en) Space calibration method and system
WO2023015938A1 (en) Three-dimensional point detection method and apparatus, electronic device, and storage medium
Choe et al. Traffic analysis with low frame rate camera networks
Wang et al. Effective multiple pedestrian tracking system in video surveillance with monocular stationary camera
Bazzani et al. A comparison of multi hypothesis kalman filter and particle filter for multi-target tracking
CN113793362A (en) Pedestrian track extraction method and device based on multi-lens video
JP2019121019A (en) Information processing device, three-dimensional position estimation method, computer program, and storage medium
Prokaj Exploitation of wide area motion imagery
WO2022126668A1 (en) Method for pedestrian identification in public places and human flow statistics system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22794242

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22794242

Country of ref document: EP

Kind code of ref document: A1