CN107240120A - Method and device for tracking a moving target in a video - Google Patents
Method and device for tracking a moving target in a video
- Publication number
- CN107240120A (application CN201710254328.XA / CN201710254328A)
- Authority
- CN
- China
- Prior art keywords
- tracking
- moving target
- target
- moving
- video
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
- 230000000007 visual effect Effects 0.000 claims abstract description 70
- 238000000034 method Methods 0.000 claims description 42
- 238000004364 calculation method Methods 0.000 claims description 17
- 238000001514 detection method Methods 0.000 claims description 4
- 230000009466 transformation Effects 0.000 description 17
- 239000011159 matrix material Substances 0.000 description 14
- 230000000903 blocking effect Effects 0.000 description 11
- 238000010586 diagram Methods 0.000 description 9
- 230000008569 process Effects 0.000 description 7
- 238000004458 analytical method Methods 0.000 description 4
- 238000006243 chemical reaction Methods 0.000 description 4
- 230000006870 function Effects 0.000 description 4
- 238000005516 engineering process Methods 0.000 description 3
- 230000002123 temporal effect Effects 0.000 description 3
- 238000007476 Maximum Likelihood Methods 0.000 description 2
- 230000008859 change Effects 0.000 description 2
- 238000012937 correction Methods 0.000 description 2
- 238000010801 machine learning Methods 0.000 description 2
- 238000012545 processing Methods 0.000 description 2
- 239000003086 colorant Substances 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- 238000013178 mathematical model Methods 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 230000001131 transforming effect Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
Abstract
The present invention relates to a method and a device for tracking a moving target in a video. The tracking method comprises the following steps: calculating the occlusion rate of the moving target of the current frame in the video at a first view angle; calculating the learning rate of a spatio-temporal context model according to the occlusion rate of the moving target, and updating the spatio-temporal context model of the moving target according to the learning rate; acquiring the image feature value of the moving target in the current frame, and updating the context prior model of the moving target according to the image feature value; and performing a convolution operation on the updated spatio-temporal context model and the context prior model to obtain the tracking position of the moving target in the next frame of the video at the first view angle. The method has low computational complexity, high tracking efficiency and high tracking accuracy. Correspondingly, the present invention also provides a device for tracking a moving target in a video.
Description
Technical Field
The invention relates to the technical field of video tracking, in particular to a method and a device for tracking a moving target in a video.
Background
With the rapid development of information technology, computer vision is applied ever more widely to video tracking, especially in the video analysis of sports events, where tracking moving targets by computer vision can greatly reduce labor cost and improve analysis accuracy. In recent years, tracking algorithms based on online machine learning, such as the online Boosting algorithm, the tracking-learning-detection algorithm and tracking algorithms based on compressed sensing, have developed rapidly. However, because these online machine-learning-based moving target tracking methods must continuously learn new models, they have high computational complexity, which affects tracking efficiency, and they easily produce tracking drift, which lowers tracking accuracy.
Disclosure of Invention
Therefore, it is necessary to provide a fast, accurate and effective method and apparatus for tracking a moving target in a video, aiming at the problems of low tracking efficiency and low tracking accuracy of the conventional moving target tracking method.
A method for tracking a moving target in a video comprises the following steps:
calculating the occlusion rate of a moving target of the current frame in a video at a first view angle;
calculating the learning rate of a spatio-temporal context model according to the occlusion rate of the moving target, and updating the spatio-temporal context model of the moving target according to the learning rate;
acquiring an image feature value of the moving target in the current frame, and updating the context prior model of the moving target according to the image feature value;
and performing a convolution operation on the updated spatio-temporal context model and the context prior model to obtain the tracking position of the moving target of the next frame in the video at the first view angle.
The method calculates the occlusion rate of the moving target of the current frame in the video at the first view angle, calculates the learning rate of the spatio-temporal context model from that occlusion rate, and updates the spatio-temporal context model of the moving target according to the learning rate; it updates the context prior model of the moving target according to the image feature value; and it performs a convolution operation on the updated spatio-temporal context model and the context prior model to obtain the tracking position of the moving target in the next frame of the video at the first view angle. Tracking and positioning the moving target in the next frame is realized simply by updating the spatio-temporal context model and the context prior model of the moving target: only the models are updated and no new model has to be learned all the time, which effectively reduces the computational complexity and improves the tracking efficiency. In addition, the method dynamically determines the learning rate of the spatio-temporal context model according to the occlusion state of the moving target before updating the model, so learning an erroneous model when the moving target is occluded by other objects is avoided, tracking drift is effectively prevented, and the tracking accuracy is greatly improved.
In one embodiment, the step of calculating the occlusion rate of the moving target of the current frame in the video at the first view angle comprises:
detecting whether the tracking frames of different moving targets in the current frame share intersection points;
when the tracking frames of different moving targets share intersection points, calculating the length and width of the overlapping portion between the tracking frames of the different moving targets, and calculating the occluded area of the moving target from the length and the width;
and acquiring the pre-stored area of the tracking frame of the moving target, and calculating the occlusion rate of the moving target as the ratio of the occluded area to the tracking-frame area.
In one embodiment, the learning rate is calculated using the following formula:
where:
e is the base of the natural logarithm;
ΔS is the occlusion rate of the moving target;
k and the other parameter in the formula are constant parameters.
In one embodiment, the step of obtaining the image feature value of the moving object in the current frame comprises:
acquiring the color intensity of a moving object in the current frame on a red channel, the color intensity on a green channel and the color intensity on a blue channel;
assigning corresponding color intensity weighted values to the color intensity of the moving object on the red channel, the color intensity of the moving object on the green channel and the color intensity of the moving object on the blue channel;
and carrying out weighted summation on the color intensity on each channel to obtain an image characteristic value of the moving object in the current frame.
In one embodiment, the method for tracking a moving object in a video further includes:
and extracting a sideline area of the tracking field, establishing a tracking field overlook two-dimensional model, and projecting the tracking position to a first projection coordinate in the tracking field overlook two-dimensional model.
In one embodiment, the method for tracking a moving object in a video further includes:
acquiring a video under a second visual angle, and calculating a second projection coordinate of a tracking position of a moving target of a video frame corresponding to a next frame in the video under the second visual angle in the tracking field overlooking two-dimensional model;
respectively comparing the occlusion rate of the current frame moving target in the video under the first visual angle and the occlusion rate of the current frame moving target in the video under the second visual angle with a preset occlusion rate threshold value;
when the shielding rate of the current frame moving target in the video at the first view angle and the shielding rate of the current frame moving target in the video at the second view angle are both smaller than or equal to a preset shielding rate threshold value, calculating target projection coordinates of the moving target in a tracking field overlook two-dimensional model according to the first projection coordinate and the second projection coordinate;
when the shielding rate of a current frame moving target in the video under the first visual angle is greater than a preset shielding rate threshold value, selecting a second projection coordinate as a target projection coordinate of the moving target in a tracking field overlooking two-dimensional model; and when the shielding rate of the current frame moving target in the video under the second visual angle is greater than a preset shielding rate threshold value, selecting the first projection coordinate as a target projection coordinate of the moving target in the tracking field overlooking two-dimensional model.
In one embodiment, after the step of selecting the second projection coordinate as the target projection coordinate of the moving target in the tracking field top view two-dimensional model, the method further includes: correcting the first projection coordinate according to the second projection coordinate;
after the step of selecting the first projection coordinate as the target projection coordinate of the moving target in the tracking field overlooking two-dimensional model, the method further comprises the following steps: and correcting the second projection coordinate according to the first projection coordinate.
An apparatus for tracking a moving object in a video, comprising:
the occlusion rate calculation module is used for calculating the occlusion rate of a moving target of a current frame in the video at a first visual angle;
the space-time context model updating module is used for calculating the learning rate of the space-time context model according to the shielding rate of the moving target and updating the space-time context model of the moving target according to the learning rate;
the context prior model updating module is used for acquiring the image characteristic value of the moving target in the current frame and updating the context prior model of the moving target according to the image characteristic value;
and the tracking module is used for carrying out convolution operation on the updated space-time context model and the context prior model to obtain the tracking position of the moving target of the next frame in the video under the first visual angle.
In one embodiment, the spatio-temporal context model update module comprises:
the intersection point detection submodule is used for detecting whether the tracking frames of different moving targets of the current frame comprise intersection points or not;
the occlusion area calculation submodule is used for calculating the length and the width of an overlapped part between the tracking frames of different moving targets when the tracking frames of different moving targets comprise intersection points, and calculating the occlusion area of the moving target for occlusion according to the length and the width;
and the shielding rate calculation submodule is used for acquiring the pre-stored tracking frame area of the moving target and calculating the shielding rate of the moving target as the ratio of the shielding area to the tracking frame area.
In one embodiment, the learning rate calculation module calculates the learning rate using the following formula:
where:
e is the base of the natural logarithm;
ΔS is the occlusion rate of the moving target;
k and the other parameter in the formula are constant parameters.
Drawings
FIG. 1 is a flow diagram of a method for tracking a moving object in a video according to one embodiment;
FIG. 2 is a flow diagram of calculating an occlusion rate of a moving object in one embodiment;
FIG. 3 is a schematic diagram illustrating a principle of calculating an occlusion area of a moving object according to an embodiment;
FIG. 4 is a flowchart of a method for tracking a moving object in a video according to yet another embodiment;
FIG. 5 is a diagram illustrating a spatiotemporal context information interface display of a moving object in one embodiment;
FIG. 6 is a schematic diagram of a two-dimensional model display of a top view of a tracking field in one embodiment;
FIG. 7 is a diagram illustrating an exemplary embodiment of an apparatus for tracking a moving object in a video;
FIG. 8 is a block diagram that illustrates a spatiotemporal context model update module in accordance with an embodiment;
FIG. 9 is a block diagram that illustrates a context prior model update module in one embodiment;
FIG. 10 is a block diagram of a spatiotemporal context model update module in accordance with yet another embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Referring to fig. 1, a method for tracking a moving object in a video includes the following steps:
step 102: and calculating the occlusion rate of the moving object of the current frame in the video under the first visual angle.
Specifically, the occlusion rate of the moving target represents the degree to which the moving target is occluded and is calculated from the area of the moving target over which occlusion occurs. The terminal detects whether the moving target is occluded; when the moving target is occluded, its occlusion rate is calculated, otherwise the occlusion rate of the moving target is 0.
Step 104: and calculating the learning rate of the space-time context model according to the shielding rate of the moving target, and updating the space-time context model of the moving target according to the learning rate.
Specifically, the spatial context model focuses on the spatial position relationship, including distance and direction, between the moving target and its local context. Because the video sequence is continuous, the temporal context correlation is very important for the tracking result, and the spatio-temporal context model of the moving target in each frame is obtained by learning, at a given learning rate, from the spatio-temporal context model and the spatial context model of the tracking target in the previous frame. When the moving target is occluded by other objects, its appearance model changes and the reliability of its spatial context model decreases. Therefore, the learning rate of the spatio-temporal context model is dynamically determined according to the occlusion state of the moving target, and the spatio-temporal context model of the moving target for the next frame in the video at the first view angle is updated according to this learning rate.
Step 106: and acquiring an image characteristic value of the moving target in the current frame, and updating the context prior model of the moving target according to the image characteristic value.
Specifically, the context prior model reflects the spatial composition of the current local context of the moving object, and is related to the image characteristics of the context space and the spatial position structure of the moving object.
Step 108: and performing convolution operation on the updated space-time context model and the context prior model to obtain the tracking position of the moving target of the next frame in the video under the first visual angle.
The method calculates the occlusion rate of the moving target of the current frame in the video at the first view angle, calculates the learning rate of the spatio-temporal context model from that occlusion rate, and updates the spatio-temporal context model of the moving target according to the learning rate; it updates the context prior model of the moving target according to the image feature value; and it performs a convolution operation on the updated spatio-temporal context model and the context prior model to obtain the tracking position of the moving target in the next frame of the video at the first view angle. Tracking and positioning the moving target in the next frame is realized simply by updating the spatio-temporal context model and the context prior model of the moving target: only the models are updated and no new model has to be learned all the time, which effectively reduces the computational complexity and improves the tracking efficiency. In addition, the method dynamically determines the learning rate of the spatio-temporal context model according to the occlusion state of the moving target before updating the model, so learning an erroneous model when the moving target is occluded by other objects is avoided, tracking drift is effectively prevented, and the tracking accuracy is greatly improved.
As shown in FIG. 2, in one embodiment, step 102 comprises:
step 1022: and detecting whether the intersection points are included between the tracking frames of different moving targets of the current frame.
To ensure the accuracy of the initial position of the moving target and lay a good foundation for subsequent tracking, in this embodiment the initial position of each moving target in the first frame of the video at the first view angle is manually calibrated through human-computer interaction: a tracking frame is selected by hand and the initial position of each moving target is thereby determined. During tracking, the terminal detects in real time whether the tracking frames of different moving targets share intersection points. If they do, occlusion occurs between the moving targets and step 1024 is executed; otherwise the moving target is not occluded and its occlusion rate is directly taken as 0.
Step 1024: when intersection points are included among the tracking frames of different moving targets, the length and the width of the overlapped part among the tracking frames of different moving targets are calculated, and the shielding area of the moving target shielded is calculated according to the length and the width.
As shown in fig. 3, in this embodiment a coordinate system is established with the upper-left corner of the frame as the origin, the X axis pointing right and the Y axis pointing down. When the tracking of the current frame is finished, the tracking position of the moving target, i.e. the vertex coordinates of its tracking frame, is obtained. For convenience of calculation, the top-left and bottom-right vertex coordinates of each tracking frame are used: the top-left vertex of tracking frame K1 is (minX1, minY1) and its bottom-right vertex is (maxX1, maxY1); the top-left vertex of tracking frame K2 is (minX2, minY2) and its bottom-right vertex is (maxX2, maxY2). Tracking frame K1 and tracking frame K2 share two intersection points, intersection point E and intersection point F. The coordinates of intersection point E are obtained from the abscissa of the bottom-right vertex of K1 and the ordinate of the top-left vertex of K2 as (maxX1, minY2); similarly, the coordinates of intersection point F are obtained from the abscissa of the top-left vertex of K2 and the ordinate of the bottom-right vertex of K1 as (minX2, maxY1). With the coordinates of E and F, the length and width of the overlapping portion of K1 and K2 can be calculated: the difference between the abscissa of intersection point E and the abscissa of the top-left vertex of K2 gives the width of the overlap, and the difference between the ordinate of intersection point F and the ordinate of the top-left vertex of K2 gives the length of the overlap. The occluded area of the moving target is then the product of the length and the width: Soverlap = (maxX1 − minX2) × (maxY1 − minY2).
In this embodiment, the occluded area is calculated by first computing the coordinates of the intersection points and then the length and width of the overlapping portion between the tracking frames. The above embodiment, however, does not limit how the occluded area may be calculated. For example, in other embodiments the occluded area may also be calculated directly from the top-left and bottom-right vertices of tracking frame K1 and of tracking frame K2. For convenience of illustration, still referring to fig. 3, in one embodiment define minX = max(minX1, minX2), i.e. minX is the larger of minX1 and minX2; maxX = min(maxX1, maxX2), i.e. maxX is the smaller of maxX1 and maxX2; minY = max(minY1, minY2), i.e. minY is the larger of minY1 and minY2; and maxY = min(maxY1, maxY2), i.e. maxY is the smaller of maxY1 and maxY2. During tracking, the terminal judges in real time whether occlusion occurs between tracking frames by comparing minX with maxX and minY with maxY, and calculates the occluded area when occlusion occurs. Specifically, if minX < maxX and minY < maxY, tracking frame K1 and tracking frame K2 overlap, the moving target is occluded, and the occluded area is Soverlap = (maxX − minX) × (maxY − minY). As shown in fig. 3, in this example minX = minX2, minY = minY2, maxX = maxX1 and maxY = maxY1, so the occluded area is Soverlap = (maxX1 − minX2) × (maxY1 − minY2).
Step 1026: and acquiring the area of a tracking frame of the pre-stored moving target, and calculating the shielding rate of the moving target as the ratio of the shielding area to the area of the tracking frame.
Specifically, after the tracking frame of the moving target is calibrated in step 1022, the area of the tracking frame is calculated and stored. After the occluded area of the moving target is calculated in step 1024, the terminal reads the stored tracking-frame area and calculates the occlusion rate of the moving target as the ratio of the occluded area to the tracking-frame area:
ΔS = Soverlap / S0
where Soverlap is the occluded area of the moving target and S0 is the tracking-frame area.
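A compact illustrative sketch of this occlusion-rate computation (using the min/max formulation of the previous paragraph; the function name, variable names and example coordinates are ours, not from the patent):

```python
def occlusion_rate(box1, box2):
    """Occlusion rate of the target tracked by box1 when overlapped by box2.
    Each box is (minX, minY, maxX, maxY); returns S_overlap / S_0, or 0 if no overlap."""
    minX = max(box1[0], box2[0])
    minY = max(box1[1], box2[1])
    maxX = min(box1[2], box2[2])
    maxY = min(box1[3], box2[3])
    if minX >= maxX or minY >= maxY:                  # tracking frames do not intersect
        return 0.0
    s_overlap = (maxX - minX) * (maxY - minY)         # occluded area
    s_0 = (box1[2] - box1[0]) * (box1[3] - box1[1])   # pre-stored tracking-frame area
    return s_overlap / s_0

# Hypothetical tracking frames K1 and K2 in the spirit of fig. 3
print(occlusion_rate((100, 80, 180, 240), (150, 200, 230, 360)))  # -> 0.09375
```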
In one embodiment, in step 104, the learning rate is calculated using the following formula:
where: e is the base of the natural logarithm; ΔS is the occlusion rate of the moving target; and k together with the other parameter of the formula are constant parameters. Specifically, the value of k ranges from 2 to 4, and the value of the other parameter ranges from 1.5 to 2.5. In one embodiment, k = 3.
in one embodiment, in step 106, the step of obtaining the image feature value of the moving object in the current frame includes: acquiring the color intensity of a moving object in the current frame on a red channel, the color intensity on a green channel and the color intensity on a blue channel; assigning corresponding color intensity weighted values to the color intensity of the moving target on the red channel, the color intensity of the moving target on the green channel and the color intensity of the moving target on the blue channel; and carrying out weighted summation on the color intensity on each channel to obtain the image characteristic value of the moving target in the current frame.
Specifically, the color intensity of the moving object on the red channel, the color intensity on the green channel, and the color intensity on the blue channel are assigned with corresponding color intensity weighted values according to the difference of the color intensities of different moving objects on the red channel, the green channel, and the blue channel, and the larger the color intensity difference is, the larger the color intensity weighted value on the channel is. In the embodiment, the color characteristic value of each moving target is determined through the color difference between different moving targets and is used for updating the context prior model, so that the tracking accuracy of the context prior model is ensured, and the tracking accuracy is further improved.
In an embodiment, the method for tracking a moving target in a video further includes: extracting the sideline area of the tracking field, establishing a top-view two-dimensional model of the tracking field, and projecting the tracking position to a first projection coordinate in the top-view two-dimensional model of the tracking field.
To visualize and display the tracking result for tracking analysis, in this embodiment a top-view two-dimensional model of the tracking field is established to display the tracking position of each moving target synchronously. Each moving target in the top-view two-dimensional model of the tracking field is given a target identifier; after the tracking of each frame is completed, the target identifier of the moving target is moved from its first projection coordinate in the previous frame to the first projection coordinate corresponding to the currently determined tracking position.
Generally, the two-dimensional model of the tracking field is a top view, while the original video is shot from a side view at a certain angle. In this embodiment, the view angle and the data scale are converted according to the position and angle of the camera, and the tracking position of the moving target is displayed synchronously on the top-view two-dimensional model of the tracking field. Specifically, this embodiment establishes the conversion relationship between the original video image and the two-dimensional model through a homography. The projective transformation of a two-dimensional plane is expressed in homogeneous coordinates as the product of a vector and a 3x3 matrix, i.e. x′ = Hx, where H is the homography transformation matrix, a 3x3 matrix whose last element can be fixed to 1.
according to the homography transformation matrix, the plane homography is transformed into eight degrees of freedom, and the homography transformation matrix can be obtained by solving eight unknowns in the transformation matrix, so that the target projection transformation is completed. Because one set of corresponding point coordinates can obtain two equations by the matrix multiplication formula, all unknowns in the original transformation matrix are required, and four sets of equations are required, so that if a homography transformation matrix is required, only four sets of corresponding point coordinates are required to be known. Specifically, in this embodiment, four sets of vertex coordinates of the tracking field are determined by extracting the edge line area of the tracking field to obtain a transformation matrix, so as to implement two-dimensional projection transformation. According to the embodiment, the two-dimensional projection transformation of the three-dimensional video image is calculated through the homography transformation matrix, the parameter information of the camera equipment does not need to be acquired, the video analysis system is simple and easy to use, and the transformation flexibility is high.
In an embodiment, the method for tracking a moving target in a video further includes: acquiring a video at a second view angle, and calculating a second projection coordinate, in the top-view two-dimensional model of the tracking field, of the tracking position of the moving target in the video frame at the second view angle corresponding to the next frame; respectively comparing the occlusion rate of the current-frame moving target in the video at the first view angle and the occlusion rate of the current-frame moving target in the video at the second view angle with a preset occlusion rate threshold; when both occlusion rates are smaller than or equal to the preset occlusion rate threshold, calculating the target projection coordinates of the moving target in the top-view two-dimensional model of the tracking field from the first projection coordinate and the second projection coordinate; when the occlusion rate of the current-frame moving target in the video at the first view angle is greater than the preset occlusion rate threshold, selecting the second projection coordinate as the target projection coordinate of the moving target in the top-view two-dimensional model of the tracking field; and when the occlusion rate of the current-frame moving target in the video at the second view angle is greater than the preset occlusion rate threshold, selecting the first projection coordinate as the target projection coordinate of the moving target in the top-view two-dimensional model of the tracking field.
Specifically, the determination of the tracking position of the moving target in the next frame of the video at the second view angle, and the conversion of that tracking position into the second projection coordinate in the top-view two-dimensional model of the tracking field, follow the same process and principle as the determination of the tracking position of the moving target in the next frame of the video at the first view angle and its conversion into the first projection coordinate, and are not repeated here.
In a multi-target tracking scene, because the targets move in complex ways, large-area or even complete occlusion easily occurs; if two tracking frames coincide, drift and jumps occur. Moreover, during tracking, even when a heavily occluded moving target does not drift or jump, the coordinate information obtained for it is inaccurate, because the distance between the occluding and occluded objects cannot be judged from that view angle. Therefore, in this embodiment, whether the first projection coordinate of the moving target obtained at the first view angle or the second projection coordinate obtained at the second view angle is erroneous is judged from the occlusion state of the moving target. If the moving target is severely occluded at one view angle, i.e. its occlusion rate is greater than the preset occlusion rate threshold, the projection coordinate obtained at that view angle is considered erroneous and the projection coordinate obtained at the other view angle is selected as the final target projection coordinate. If the occlusion rates at both view angles are smaller than or equal to the preset occlusion rate threshold, the projection coordinates obtained at both view angles are correct; in this case, weight values are assigned to the first projection coordinate and the second projection coordinate, the target projection coordinate of the moving target in the top-view two-dimensional model of the tracking field is obtained as their weighted sum, and the tracking result is thereby optimized from both projection coordinates, ensuring the tracking accuracy.
Specifically, since the size of the tracking frame is fixed during tracking, while under the camera view angle the tracked moving target appears larger when near and smaller when far, the preset occlusion rate threshold that defines the occlusion condition is related to the distance of the target from the camera on the two-dimensional model. The preset occlusion rate threshold is calculated from the distance of the moving target from the camera in the top-view two-dimensional model of the tracking field, and the weights of the tracking position of the next-frame moving target in the first-view video and of the tracking position of the next-frame moving target in the second-view video are calculated from the distances of the moving target from the camera in the first-view and second-view videos.
In this embodiment, videos at different view angles are tracked simultaneously, the tracking position of the moving target obtained at each view angle is projected onto the top-view two-dimensional model of the tracking field, and the tracking results of the same moving target at the two view angles are unified according to the occlusion state of the moving target. Optimizing the tracking result on the basis of video tracking at two view angles ensures and greatly improves the tracking accuracy.
In one embodiment, after the step of selecting the second projection coordinate as the target projection coordinate of the moving target in the two-dimensional model of the top view of the tracking field, the method further includes: correcting the first projection coordinate according to the second projection coordinate; after the step of selecting the first projection coordinate as the target projection coordinate of the moving target in the tracking field overlooking two-dimensional model, the method further comprises the following steps: and correcting the second projection coordinate according to the first projection coordinate. In this embodiment, when the tracking result at a certain viewing angle is incorrect, the incorrect tracking result is corrected by the tracking result at another viewing angle, and the spatiotemporal context model is updated according to the corrected tracking result, so as to ensure the accuracy of the subsequent tracking result, and further improve the tracking accuracy.
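The sketch below (not from the patent; the distance-dependent threshold follows the form used in the soccer example of steps 7) to 9) further on, and the inverse-distance weighting is our assumption, since the patent only states that the closer camera receives the larger weight) illustrates how the two per-view projections might be combined:

```python
import math

def occlusion_threshold(dist_to_camera, gamma=0.5, mu=0.01):
    """Distance-dependent occlusion-rate threshold, threshold = gamma * exp(-mu * d);
    the gamma and mu values here are illustrative placeholders."""
    return gamma * math.exp(-mu * dist_to_camera)

def fuse_projections(p1, p2, occ1, occ2, d1, d2):
    """Combine the projections of one target obtained from two camera views.
    p1, p2: (x, y) projections in the top-view model from view 1 and view 2.
    occ1, occ2: occlusion rates of the target in the current frame of each view.
    d1, d2: distances from the target to camera 1 and camera 2 in the model."""
    t1, t2 = occlusion_threshold(d1), occlusion_threshold(d2)
    if occ1 > t1:                      # view 1 unreliable -> trust view 2
        return p2
    if occ2 > t2:                      # view 2 unreliable -> trust view 1
        return p1
    # Both views reliable: weighted sum, the closer camera gets the larger weight
    # (inverse-distance normalization used here as an assumption).
    w1 = d2 / (d1 + d2)
    w2 = d1 / (d1 + d2)
    return (w1 * p1[0] + w2 * p2[0], w1 * p1[1] + w2 * p2[1])
```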
Further, to facilitate understanding of the technical solution of the present invention, the tracking method is described in detail below, taking soccer video tracking as an example with reference to figs. 4 to 6. For ease of illustration, the two soccer teams are denoted team a and team b; players of team a are shown as rectangles in the top-view two-dimensional model of the court and players of team b are shown as circles.
A method for tracking a moving object in a video comprises the following steps:
1) and determining the initial position of the moving target and calibrating the moving target tracking frame.
Firstly, reading the image of the t-th frame, and determining the initial position of each player (namely the moving object) in the t-th frame through a manual calibration tracking frame. Specifically, when the player tracking frame is manually calibrated, a mouse can be used for selecting the tracking frame, the initial positions of players in the first frame in the video at the first visual angle and the first frame in the video at the second visual angle are respectively calibrated, and the initial positions of the players in the first frame in the video at the first visual angle and the first frame in the video at the second visual angle are determined. Further, after the initial positions of the players are calibrated, the terminal further calculates and stores the area of the tracking frame of each player tracking frame.
2) And calculating the shielding rate of the moving target of the current frame.
Specifically, the occlusion rate of each player of the current frame corresponding to the video at the first view angle and the video at the second view angle is calculated respectively, the occlusion rate of each player of the current frame corresponding to the video at the first view angle and the occlusion rate of each player of the current frame corresponding to the video at the second view angle are calculated according to the tracking frame area of each player tracking frame in the current frame and the occlusion area of the current player, and the calculation principle and process of the occlusion rate of the specific moving object are described in detail in the foregoing embodiments, and are not described herein again.
3) And calculating the learning rate of the space-time context model and updating the space-time context model.
The temporal context information is the temporal correlation between successive frames, and the spatial context information is the combination of the tracking target and the background image within a certain range around it. Tracking a target using spatio-temporal context information first requires establishing a tracking model; specifically, the target tracking problem is a probability problem about the position of the target. Let o be the target to be tracked, let x be a two-dimensional coordinate point in the image, and let P(x|o) represent the probability that coordinate x occurs for target o; target tracking is thus transformed into the problem of computing the maximum confidence.
Let:
m(x) = P(x|o); formula (4)
When the confidence map m(x) takes its maximum value, the corresponding coordinate x can be regarded as the most likely position of the target o. As shown in fig. 5, the area within the solid-line frame is the target area, and the area within the outer dashed-line frame is the local context area. The target center position coordinate x* represents the location of the target, and z is a point within the local context area. The local context region of the target x* is defined as Ωc(x*), and the context feature set of the local area is defined as Xc = {c(z) = (I(z), z) | z ∈ Ωc(x*)}, where I(z) is the image feature value at coordinate z. Using the total probability formula with the local context features as the intermediate quantity, formula (4) is expanded to obtain:
m(x) = P(x|o) = Σ_{c(z)∈Xc} P(x|c(z), o)·P(c(z)|o); formula (5)
Here, P(x|c(z), o) represents the probability that the target appears at point x given the target o and its local context feature c(z); it establishes the spatial context model that captures the spatial relationship between the position of the target and its context information. P(c(z)|o) represents the probability that a certain context feature c(z) appears for the target o; it is the context prior probability of the target o, i.e. the appearance prior model of the current local context. The context prior model ensures that, when the target position is predicted by calculating the confidence map m(x), the selected context is similar in appearance to the position of the target in the previous frame, while the spatial context model ensures that the selected new target position is not only similar in appearance to the original target but also reasonable in spatial position, which to a certain extent avoids interference from other objects with similar appearance and avoids the drift phenomenon in tracking.
Based on the above, in the present embodiment, specific mathematical model establishment is performed on each part in the formula (5) in advance, specifically including confidence map modeling, spatial context model modeling, and context prior model modeling.
First, the confidence map is modeled as follows. Since the target position in the first frame of the video is known (it is calibrated from the tracking frame of the initial frame), the confidence map m(x) should satisfy the property that the closer x is to the target position, the greater its confidence level. Therefore, let:
m(x) = b·e^(−|(x − x*)/α|^β); formula (6)
where b is a normalization constant parameter, α is a scale constant parameter, and β is a constant parameter controlling the shape of the function curve. α is related to the size of the tracked target and ranges from 1.75 to 2.75; β ranges from 0.5 to 1.5. In one embodiment, α = 2.25 and β = 1.
Second, the spatial context model P(x|c(z), o) is modeled as follows. Since the spatial context model focuses on the spatial position relationship, including distance and direction, between the tracking target and its local context, P(x|c(z), o) is defined as a non-radially-symmetric function:
P(x|c(z), o) = h^sc(x − z); formula (7)
where x is the position of the target and z is a position in the local context. Even if two points z1 and z2 are at the same distance from the target center position x*, because their directions relative to x* differ, h^sc(x* − z1) is not equal to h^sc(x* − z2). This shows that the two points represent different contexts for x*, so different spatial relationships can be effectively distinguished and ambiguity is prevented.
Finally, the context prior model P(c(z)|o) is modeled as follows. The context prior model reflects the spatial composition of the current local context itself, and is intuitively related to the image features of the context space and to its spatial position structure. Therefore, let:
P(c(z)|o) = I(z)·ωσ(z − x*); formula (8)
where I(z) is the image feature value at point z in the local context region and ωσ(·) is a weight function.
Specifically, the tracking process is similar to the way human eyes track something: a context area closer to the tracking target can be considered more relevant to it and therefore more important, while a context area farther from the tracking target is less relevant and therefore less important. Accordingly, define:
ωσ(Δ) = λ·e^(−|Δ|²/σ²); formula (9)
where Δ is the distance between the two points, λ is a normalization constant parameter used to keep the value of P(c(z)|o) between 0 and 1 so that it conforms to the definition of a probability function, and σ is a scale parameter related to the size of the tracked target.
Substituting formula (9) into formula (8) yields the context prior model:
P(c(z)|o) = I(z)·λ·e^(−|z − x*|²/σ²); formula (10)
That is, the spatial composition of the local context is modeled as a Gaussian-weighted sum of the image feature values of the points in the region.
Further, in this embodiment, after completing the confidence map modeling, the spatial context model modeling, and the context prior model modeling, the spatio-temporal context model is further updated according to the confidence map, the spatial context model, and the context prior model:
first, formula (5) is substituted into formula (6), formula (7), and formula (10) to obtain:
where hsc (x-z) is a spatial context model, i.e., the object to be computed and learned for each frame of image.
According to a convolutionDefinition of (1):
equation (11) can be changed to:
according to the convolution theorem, there are:
then:
wherein,andrespectively, representing fourier and inverse fourier transforms.
Assume that at the t-th frame the target center position x*_t and the local context region Ωc(x*_t) of the target are known; the spatial context model of the tracking target and its local context region in the t-th frame can then be calculated and recorded as h^sc_t. Since a continuous video sequence is processed, temporal context correlation is also crucial for the tracking result. To take this dimension into account, a spatio-temporal context model learning rate ρ is set, and the spatio-temporal context model of the tracking target in each frame is expressed as two parts, a historical spatio-temporal context model and a newly learned spatial context model, as follows:
H^stc_{t+1} = (1 − ρ)·H^stc_t + ρ·h^sc_t; formula (16)
where h^sc_t is the spatial context model of the t-th frame, H^stc_t is the spatio-temporal context model of the t-th frame, and H^stc_{t+1} is the spatio-temporal context model of the (t+1)-th frame.
Generally, when multiple tracking targets with similar appearance are present and occlusion occurs, i.e. the appearance model of a target has changed greatly, continuing to learn and update the spatio-temporal context model at the same rate means an erroneous model keeps being learned and the tracking target is finally lost. In this embodiment, the learning rate is dynamically determined according to the occlusion state of the moving target, so the learning rate ρ is a dynamic, automatically updated value, which also effectively prevents historical model information from being lost entirely through overly fast updating. Specifically, when the tracking target is occluded by another object, the appearance model of the target changes and the reliability of the spatial context model decreases, so the learning rate needs to be reduced to prevent an erroneous model from being learned and to ensure tracking accuracy. In this embodiment, the spatio-temporal context model of the moving target is updated using the learning rate calculated from the occlusion rate of the moving target according to formula (3), which is not repeated here.
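As a minimal sketch (not part of the patent; it assumes the spatio-temporal-context formulation given in formulas (6) to (16) above, and the parameter defaults are illustrative), the per-frame learning of the spatial context model and the occlusion-adaptive model update could look like:

```python
import numpy as np

def gaussian_weight(shape, center, sigma):
    """omega_sigma(z - x*): Gaussian weighting of the local context around the target center."""
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    return np.exp(-((xs - center[0]) ** 2 + (ys - center[1]) ** 2) / sigma ** 2)

def confidence_map(shape, center, alpha=2.25, beta=1.0):
    """m(x) = b * exp(-|(x - x*)/alpha|^beta), with the normalization b folded in."""
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    dist = np.sqrt((xs - center[0]) ** 2 + (ys - center[1]) ** 2)
    return np.exp(-(dist / alpha) ** beta)

def learn_spatial_context(I, center, sigma):
    """Formula (15): h_sc = IFFT( FFT(m) / FFT(I * omega) ), learned per frame."""
    prior = I * gaussian_weight(I.shape, center, sigma)   # context prior, formula (10)
    m = confidence_map(I.shape, center)                   # confidence map, formula (6)
    eps = 1e-6                                            # avoid division by zero
    return np.real(np.fft.ifft2(np.fft.fft2(m) / (np.fft.fft2(prior) + eps)))

def update_stc_model(H_stc, h_sc, rho):
    """Formula (16): blend the historical model with the newly learned one.
    rho is the occlusion-adaptive learning rate (smaller when the target is occluded)."""
    return (1.0 - rho) * H_stc + rho * h_sc
```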
4) And obtaining the image characteristic value of the moving target, and updating the context prior model of the moving target.
Specifically, the image characteristic value of the moving object is calculated by the following formula:
I(x) = w1·IR(x) + w2·IG(x) + w3·IB(x); formula (17)
where IR(x) is the color intensity of x on the red channel, IG(x) is the color intensity of x on the green channel, IB(x) is the color intensity of x on the blue channel, and w1, w2, w3 are weight values with w1 + w2 + w3 = 1. In one embodiment, the colors of the two teams' uniforms differ most obviously on the R channel, so IR(x) is given a greater weight: w1 = 0.4, w2 = 0.3, w3 = 0.3.
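A brief illustrative sketch of formula (17) (assuming 8-bit RGB frames and the example weights above; the function name is ours):

```python
import numpy as np

def image_feature_value(frame_rgb, w=(0.4, 0.3, 0.3)):
    """I(x) = w1*I_R(x) + w2*I_G(x) + w3*I_B(x) over a frame or a cropped target region.
    frame_rgb: HxWx3 array with red, green and blue channels; w: channel weights summing to 1."""
    r, g, b = frame_rgb[..., 0], frame_rgb[..., 1], frame_rgb[..., 2]
    return w[0] * r.astype(float) + w[1] * g.astype(float) + w[2] * b.astype(float)
```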
5) And carrying out convolution operation on the updated space-time context model and the context prior model to obtain the tracking position of the moving target of the next frame.
Specifically, at the (t+1)-th frame, the updated spatio-temporal context model H^stc_{t+1} is known; the context prior model of the (t+1)-th frame is then calculated, and the confidence map of the (t+1)-th frame is obtained by the convolution calculation of formula (7) to formula (11), as follows:
m_{t+1}(x) = F⁻¹( F(H^stc_{t+1}) ⊙ F(I_{t+1}(x)·ωσ(x − x*_t)) ); formula (18)
When the confidence map m_{t+1}(x) of the (t+1)-th frame takes its maximum value, the corresponding x is regarded as the center position x*_{t+1} of the moving target in the (t+1)-th frame, which determines the tracking position of the moving target, i.e. the tracking position of the tracked player in the next frame of the video at the first view angle and the tracking position of the tracked player in the next frame of the video at the second view angle, respectively.
6) Establishing a top-view two-dimensional model of the court, projecting the tracking position of the next-frame moving target in the video at the first view angle to a first projection coordinate in the top-view two-dimensional model of the court, and projecting the tracking position of the next-frame moving target in the video at the second view angle to a second projection coordinate in the top-view two-dimensional model of the court.
Specifically, in this embodiment, four corner points of one half of the soccer field are selected as the four reference points for calculating the planar homography transformation matrix. First, the field sideline area is extracted through a series of digital image processing steps, namely thresholding and Hough-transform straight-line detection; then the scattered line segments are merged to obtain the straight-line equations of the field sidelines and four sets of calibration point coordinates; finally, the transformation matrices of the two view angles are obtained from the four sets of calibration point coordinates. The specific top-view two-dimensional model of the field is shown in fig. 6.
7) And detecting whether the first projection coordinate and the second projection coordinate are wrong.
Specifically, the occlusion rate of the current-frame player in the video at the first view angle and the occlusion rate of the current-frame player in the video at the second view angle are compared with a preset occlusion rate threshold, respectively, to judge whether the first projection coordinate and the second projection coordinate are erroneous. If the occlusion rate of the current-frame player in the video at the first view angle is greater than the preset occlusion rate threshold, the first projection coordinate is erroneous; if the occlusion rate of the current-frame player in the video at the second view angle is greater than the preset occlusion rate threshold, the second projection coordinate is erroneous. If both occlusion rates are smaller than or equal to the preset occlusion rate threshold, the first projection coordinate and the second projection coordinate are both correct. In this embodiment, the preset occlusion rate threshold is calculated according to the distance from the player to the camera in the top-view two-dimensional model of the court, where the distance from the player to the camera in the top-view two-dimensional model of the court is:
where [x, y] is the coordinate of the current-frame player on the top-view two-dimensional model of the court, and height and width are the height and width of the top-view two-dimensional model of the court, respectively.
Then, the preset occlusion rate threshold is:
threshold = γ·e^(−μ·Δd); formula (20)
where threshold is the preset occlusion rate threshold, and γ and μ are constant parameters, with γ adjusting the variation range of the preset occlusion rate threshold and μ adjusting how fast it varies.
8) And when the first projection coordinate or the second projection coordinate is wrong, selecting the projection coordinate of the other visual angle as the target projection coordinate of the player.
Specifically, because the shooting angles of the first-view video and the second-view video are different, a player who is occluded in the video shot at the first view angle is generally not occluded at that moment in the video shot at the second view angle, so the first projection coordinate and the second projection coordinate are not erroneous at the same time. Therefore, when the first projection coordinate is erroneous, the second projection coordinate is selected as the target projection coordinate of the player and the tracking of the t-th frame ends; when the second projection coordinate is erroneous, the first projection coordinate is selected as the target projection coordinate of the player and the tracking of the t-th frame ends.
Further, in one embodiment, when the tracking result at one view angle is erroneous, the erroneous tracking result is also corrected by the tracking result at the other view angle to ensure that the subsequent tracking results are accurate. Suppose the first projection coordinate is erroneous; at the first view angle, the maximum-likelihood position given by the confidence map of the tracked player whose tracking has drifted is P1, and the projection matrix from the first-view video to the top-view two-dimensional model of the court is H1. At that moment, at the second view angle, the maximum-likelihood position of the tracked player is P2 and the projection matrix from the second-view video to the top-view two-dimensional model of the court is H2, so the second projection coordinate of P2 on the top-view two-dimensional model is H2·P2. Since P2 is a correct tracking result, the erroneously tracked position P1 at the first view angle is updated to the correct tracking position as:
P1 = H1⁻¹·H2·P2; formula (21)
Similarly, if the second projection coordinate is erroneous, it is corrected according to the first projection coordinate; the correction principle is the same as that for the first projection coordinate and is not repeated. After the first projection coordinate or the second projection coordinate is corrected, the spatio-temporal context model at the corresponding view angle is updated according to the corrected tracking result, ensuring the accuracy of the subsequent tracking results.
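A short sketch of this cross-view correction (matrix names follow formula (21); the helper is ours and assumes homogeneous coordinates as in the earlier homography sketch):

```python
import numpy as np

def correct_projection(H_bad, H_good, p_good):
    """Formula (21): map the reliable view's image position back into the drifting view,
    P_bad = H_bad^{-1} * H_good * P_good (homogeneous coordinates)."""
    p = np.array([p_good[0], p_good[1], 1.0])
    q = np.linalg.inv(H_bad) @ (H_good @ p)
    return q[:2] / q[2]
```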
9) And when the first projection coordinate and the second projection coordinate are both correct, calculating the target projection coordinate of the player according to the first projection coordinate and the second projection coordinate.
When the first projection coordinate and the second projection coordinate are both correct, they assist and adjust each other to determine the target projection coordinate of the player, and the tracking of the t-th frame is finished once the target projection coordinate is determined. Specifically, in the image after projection transformation, the position of a player close to the camera is clearer, while the specific position of a player far from the camera is more blurred because of deformation and stretching. Therefore, when the target is closer to the camera at one visual angle, the tracking result in the video shot by that camera is considered more reliable; that is, the tracking result obtained at that visual angle is given a larger weight when the target position is finally determined. Accordingly, the weight values of the first projection coordinate and the second projection coordinate are determined according to the distance between the target and the camera.
Assume that at the first visual angle the camera is at the position shown in FIG. 6. The shooting position at the first visual angle is defined as the origin, and the coordinate of player M on the top-view two-dimensional court model is pos_model1 = [x1, y1]; the distance of player M from the camera at the first visual angle is then obtained from this coordinate.
As shown in FIG. 6, the camera at the second visual angle is opposite to the camera at the first visual angle. The tracking result of player M obtained by the camera at the second visual angle is converted onto the top-view two-dimensional court model as pos_model2 = [x2, y2], and the distance of player M from the camera at the second visual angle is obtained from this coordinate,
wherein width and height are the width and height of the top-view two-dimensional court model, respectively.
Then, after the first projection coordinate and the second projection coordinate are fused according to these distances, the final position of player M on the top-view two-dimensional court model is pos_final = [x, y].
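A hedged sketch of this distance-weighted fusion is given below. Because the exact distance and weighting formulas are not reproduced in this text, inverse-distance weights and assumed camera positions (the origin for the first visual angle, the opposite corner of the model for the second) are used purely for illustration.

```python
import numpy as np

def fuse_projections(pos_model1, pos_model2, cam1_xy, cam2_xy):
    """Distance-weighted fusion of the two projection coordinates."""
    p1 = np.asarray(pos_model1, dtype=float)    # first projection coordinate [x1, y1]
    p2 = np.asarray(pos_model2, dtype=float)    # second projection coordinate [x2, y2]
    d1 = np.linalg.norm(p1 - np.asarray(cam1_xy, dtype=float))
    d2 = np.linalg.norm(p2 - np.asarray(cam2_xy, dtype=float))
    w1 = d2 / (d1 + d2)                         # closer to camera 1 -> larger weight w1
    w2 = d1 / (d1 + d2)
    return w1 * p1 + w2 * p2                    # fused pos_final = [x, y]

# Example: first camera assumed at the origin, second camera assumed at the
# opposite corner of a court model of size width x height.
width, height = 1.0, 1.0
pos_final = fuse_projections([0.30, 0.60], [0.32, 0.58],
                             cam1_xy=[0.0, 0.0], cam2_xy=[width, height])
```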
Further, in one embodiment, the football videos at the two visual angles are tracked according to the above steps (1) to (10). The tracking operation is implemented on a PC with the following hardware environment: CPU Intel Core i5, main frequency 2.5 GHz, memory 8 GB; the programming environment is Matlab 2014a. The original videos at the two visual angles are in AVI format, each frame is 1696×1080, each video is about 20 MB, and the two videos are each about 18 seconds long at 30 frames per second, about 540 frames in total.
Referring to fig. 7, an apparatus 700 for tracking a moving object in a video includes:
an occlusion rate calculating module 702, configured to calculate an occlusion rate of a moving object of a current frame in a video at a first view angle.
A spatiotemporal context model updating module 704, configured to calculate the learning rate of the spatiotemporal context model according to the occlusion rate of the moving object, and to update the spatiotemporal context model of the moving object according to the learning rate.
A context prior model updating module 706, configured to obtain an image feature value of the moving object in the current frame, and update the context prior model of the moving object according to the image feature value.
The tracking module 708 is configured to perform convolution operation on the updated spatio-temporal context model and the context prior model to obtain a tracking position of a moving target of a next frame in the video at the first view angle.
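The convolution performed by the tracking module 708 can be sketched as follows. Evaluating the convolution through the FFT is one standard way to do it; the model arrays and the handling of the local context window here are illustrative assumptions rather than the patent's exact implementation.

```python
import numpy as np

def track_next_position(stc_model, prior_model):
    """Convolve the updated spatio-temporal context model with the context
    prior model (via the convolution theorem) and return the peak of the
    resulting confidence map within the local context window."""
    conf = np.real(np.fft.ifft2(np.fft.fft2(stc_model) * np.fft.fft2(prior_model)))
    dy, dx = np.unravel_index(np.argmax(conf), conf.shape)  # confidence-map peak
    return dx, dy
```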
As shown in FIG. 8, in one embodiment, the spatiotemporal context model update module 704 includes:
An intersection detection sub-module 7042, configured to detect whether the tracking frames of different moving objects in the current frame include intersection points.
An occlusion area calculating sub-module 7044, configured to calculate, when the tracking frames of different moving objects include intersection points, the length and width of the overlapping portion between the tracking frames of the different moving objects, and to calculate the occluded area of the moving object according to the length and width.
An occlusion rate calculating sub-module 7046, configured to obtain the pre-stored area of the tracking frame of the moving target and to calculate the occlusion rate of the moving target as the ratio of the occluded area to the tracking-frame area.
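These three sub-modules can be illustrated together by the following sketch, which assumes axis-aligned (x, y, w, h) tracking frames; the function and the example boxes are hypothetical.

```python
def occlusion_rate(target_box, other_box):
    """Occlusion rate of target_box caused by other_box; boxes are (x, y, w, h)."""
    x1, y1, w1, h1 = target_box
    x2, y2, w2, h2 = other_box
    overlap_w = min(x1 + w1, x2 + w2) - max(x1, x2)   # width of the overlapping part
    overlap_h = min(y1 + h1, y2 + h2) - max(y1, y2)   # height of the overlapping part
    if overlap_w <= 0 or overlap_h <= 0:              # tracking frames do not intersect
        return 0.0
    occluded_area = overlap_w * overlap_h
    return occluded_area / float(w1 * h1)             # ratio to the tracking-frame area

# Example: two players whose tracking frames partially overlap.
print(occlusion_rate((100, 100, 40, 80), (120, 130, 40, 80)))  # roughly 0.31
```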
In one embodiment, the spatiotemporal context model updating module 704 calculates the learning rate using the following formula:
wherein e is the base of the natural logarithm, ΔS is the occlusion rate of the moving target, and k and the remaining coefficient are constant parameters.
As shown in FIG. 9, in one embodiment, the context prior model update module 706 includes:
color intensity acquisition sub-module 7062: for obtaining the color intensity of the moving object in the current frame on the red channel, the color intensity on the green channel and the color intensity on the blue channel.
The color intensity weight value selecting submodule 7064 is configured to assign corresponding color intensity weight values to the color intensity of the moving object on the red channel, the color intensity on the green channel, and the color intensity on the blue channel.
An image feature value calculating sub-module 7066, configured to perform a weighted summation of the color intensities on the channels to obtain the image feature value of the moving object in the current frame.
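A minimal sketch of this weighted RGB feature is shown below; the specific weight values are assumptions, since the text only states that each channel is assigned its own weight.

```python
import numpy as np

def weighted_rgb_feature(region_rgb, weights=(0.4, 0.4, 0.2)):
    """region_rgb: HxWx3 array of the target region; weights: (w_r, w_g, w_b)."""
    r, g, b = region_rgb[..., 0], region_rgb[..., 1], region_rgb[..., 2]
    return weights[0] * r + weights[1] * g + weights[2] * b   # per-pixel feature value
```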
As shown in fig. 10, in one embodiment, the apparatus 700 for tracking a moving object in a video further includes:
A two-dimensional model projection module 710, configured to extract the sideline area of the tracking field, establish a top-view two-dimensional model of the tracking field, and project the tracking position to a first projection coordinate in the top-view two-dimensional model of the tracking field.
In one embodiment, the tracking device 700 for the moving object in the video is configured to acquire the video from the second view angle, and calculate a second projection coordinate of the tracking position of the moving object in the video frame corresponding to the next frame in the video from the second view angle in the two-dimensional model of the top view of the tracking field. As shown in fig. 10, the apparatus 700 for tracking a moving object in a video further includes:
and the occlusion rate comparing module 712 is configured to compare the occlusion rate of the current frame moving object in the video at the first view angle and the occlusion rate of the current frame moving object in the video at the second view angle with a preset occlusion rate threshold value, respectively.
And the first target projection coordinate selecting module 714 is configured to calculate a target projection coordinate of the moving target in the two-dimensional model of the top view of the tracking field according to the first projection coordinate and the second projection coordinate when both the occlusion rate of the current frame moving target in the video at the first view angle and the occlusion rate of the current frame moving target in the video at the second view angle are less than or equal to a preset occlusion rate threshold value.
The second target projection coordinate selecting module 716 is configured to select a second projection coordinate as a target projection coordinate of the moving target in the tracking field overlooking two-dimensional model when the occlusion rate of the current frame moving target in the video at the first view angle is greater than a preset occlusion rate threshold; and when the shielding rate of the current frame moving target in the video under the second visual angle is greater than a preset shielding rate threshold value, selecting the first projection coordinate as a target projection coordinate of the moving target in the two-dimensional model overlooked in the tracking field.
As shown in fig. 10, in one embodiment, the apparatus 700 for tracking a moving object in a video further includes:
the projection coordinate correction module 718 is configured to correct the first projection coordinate according to the second projection coordinate when the blocking rate of the current frame moving object in the video at the first view angle is greater than a preset blocking rate threshold; and when the shielding rate of the current frame moving object in the video under the second visual angle is greater than the preset shielding rate threshold value, correcting the second projection coordinate according to the first projection coordinate.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features of the above embodiments are described; nevertheless, as long as a combination of these technical features contains no contradiction, it should be considered to be within the scope of this specification.
The above embodiments express only several implementations of the present invention, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, and these all fall within the protection scope of the present invention. Therefore, the protection scope of this patent shall be subject to the appended claims.
Claims (10)
1. A method for tracking a moving object in a video is characterized by comprising the following steps:
calculating the shielding rate of a moving target of a current frame in a video under a first visual angle;
calculating the learning rate of the space-time context model according to the shielding rate of the moving target, and updating the space-time context model of the moving target according to the learning rate;
acquiring an image characteristic value of a moving target in a current frame, and updating a context prior model of the moving target according to the image characteristic value;
and carrying out convolution operation on the updated space-time context model and the context prior model to obtain the tracking position of the moving target of the next frame in the video under the first visual angle.
2. The method according to claim 1, wherein the step of calculating the occlusion rate of the moving object of the current frame in the video at the first view comprises:
detecting whether the tracking frames of different moving targets of the current frame comprise an intersection point or not;
when the tracking frames of different moving targets comprise intersection points, calculating the length and the width of an overlapped part between the tracking frames of different moving targets, and calculating the shielding area of the moving target shielded according to the length and the width;
and obtaining the pre-stored area of the tracking frame of the moving target, and calculating the shielding rate of the moving target as the ratio of the shielding area to the area of the tracking frame.
3. The method of claim 1, wherein the learning rate is calculated using the following formula:
wherein:
e is the base of the natural logarithm;
ΔS is the shielding rate of the moving target;
k and the remaining coefficient are both constant parameters.
4. The method of claim 1, wherein the step of obtaining the image feature value of the moving object in the current frame comprises:
acquiring the color intensity of a moving object in the current frame on a red channel, the color intensity on a green channel and the color intensity on a blue channel;
assigning corresponding color intensity weighted values to the color intensity of the moving object on a red channel, the color intensity of the moving object on a green channel and the color intensity of the moving object on a blue channel;
and carrying out weighted summation on the color intensity on each channel to obtain the image characteristic value of the moving object in the current frame.
5. The method of claim 1, further comprising:
and extracting a sideline area of the tracking field, establishing a tracking field overlook two-dimensional model, and projecting the tracking position to a first projection coordinate in the tracking field overlook two-dimensional model.
6. The method of claim 5, further comprising:
acquiring a video under a second visual angle, and calculating a second projection coordinate of a tracking position of a moving target of a video frame corresponding to a next frame in the video under the second visual angle in the tracking field overlooking two-dimensional model;
respectively comparing the occlusion rate of the current frame moving target in the video under the first visual angle and the occlusion rate of the current frame moving target in the video under the second visual angle with a preset occlusion rate threshold value;
when the shielding rate of the current frame moving target in the video under the first visual angle and the shielding rate of the current frame moving target in the video under the second visual angle are both smaller than or equal to the preset shielding rate threshold value, calculating target projection coordinates of the moving target in the two-dimensional model of the tracking field overlook according to the first projection coordinates and the second projection coordinates;
when the shielding rate of a current frame moving target in the video under the first visual angle is greater than the preset shielding rate threshold value, selecting the second projection coordinate as a target projection coordinate of the moving target in the tracking field overlooking two-dimensional model; and when the shielding rate of the current frame moving target in the video under the second visual angle is greater than the preset shielding rate threshold value, selecting the first projection coordinate as a target projection coordinate of the moving target in the tracking field overlooking two-dimensional model.
7. The method of claim 6,
after the step of selecting the second projection coordinate as the target projection coordinate of the moving target in the tracking field overlooking two-dimensional model, the method further comprises the following steps:
correcting the first projection coordinate according to the second projection coordinate;
after the step of selecting the first projection coordinate as the target projection coordinate of the moving target in the tracking field overlooking two-dimensional model, the method further comprises the following steps:
and correcting the second projection coordinate according to the first projection coordinate.
8. An apparatus for tracking a moving object in a video, comprising:
the occlusion rate calculation module is used for calculating the occlusion rate of a moving target of a current frame in the video under the first visual angle;
the space-time context model updating module is used for calculating the learning rate of the space-time context model according to the shielding rate of the moving target and updating the space-time context model of the moving target according to the learning rate;
the context prior model updating module is used for acquiring the image characteristic value of the moving target in the current frame and updating the context prior model of the moving target according to the image characteristic value;
and the tracking module is used for carrying out convolution operation on the updated space-time context model and the context prior model to obtain the tracking position of the moving target of the next frame in the video under the first visual angle.
9. The apparatus of claim 8, wherein the spatiotemporal context model update module comprises:
the intersection point detection submodule is used for detecting whether intersection points are included among tracking frames of different moving targets of the current frame;
the occlusion area calculation submodule is used for calculating the length and the width of an overlapped part between the tracking frames of different moving targets when the tracking frames of different moving targets comprise intersection points, and calculating the occlusion area of the moving target in occlusion according to the length and the width;
and the shielding rate calculation submodule is used for acquiring the pre-stored tracking frame area of the moving target and calculating the shielding rate of the moving target as the ratio of the shielding area to the tracking frame area.
10. The apparatus of claim 8, wherein the learning rate calculation module calculates the learning rate using the following equation:
wherein:
e is the base of the natural logarithm;
ΔS is the shielding rate of the moving target;
k and the remaining coefficient are both constant parameters.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710254328.XA CN107240120B (en) | 2017-04-18 | 2017-04-18 | Method and device for tracking moving target in video |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710254328.XA CN107240120B (en) | 2017-04-18 | 2017-04-18 | Method and device for tracking moving target in video |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107240120A true CN107240120A (en) | 2017-10-10 |
CN107240120B CN107240120B (en) | 2019-12-17 |
Family
ID=59983446
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710254328.XA Active CN107240120B (en) | 2017-04-18 | 2017-04-18 | Method and device for tracking moving target in video |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107240120B (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108022254A (en) * | 2017-11-09 | 2018-05-11 | 华南理工大学 | A kind of space-time contextual target tracking based on sign point auxiliary |
CN109636828A (en) * | 2018-11-20 | 2019-04-16 | 北京京东尚科信息技术有限公司 | Object tracking methods and device based on video image |
CN111223104A (en) * | 2018-11-23 | 2020-06-02 | 杭州海康威视数字技术股份有限公司 | Package extraction and tracking method and device and electronic equipment |
CN111241872A (en) * | 2018-11-28 | 2020-06-05 | 杭州海康威视数字技术股份有限公司 | Video image shielding method and device |
CN112489086A (en) * | 2020-12-11 | 2021-03-12 | 北京澎思科技有限公司 | Target tracking method, target tracking device, electronic device, and storage medium |
CN115712354A (en) * | 2022-07-06 | 2023-02-24 | 陈伟 | Man-machine interaction system based on vision and algorithm |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11889227B2 (en) | 2020-10-05 | 2024-01-30 | Samsung Electronics Co., Ltd. | Occlusion processing for frame rate conversion using deep learning |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105631895A (en) * | 2015-12-18 | 2016-06-01 | 重庆大学 | Temporal-spatial context video target tracking method combining particle filtering |
CN105976401A (en) * | 2016-05-20 | 2016-09-28 | 河北工业职业技术学院 | Target tracking method and system based on partitioned multi-example learning algorithm |
CN106127798A (en) * | 2016-06-13 | 2016-11-16 | 重庆大学 | Dense space-time contextual target tracking based on adaptive model |
CN106485732A (en) * | 2016-09-09 | 2017-03-08 | 南京航空航天大学 | A kind of method for tracking target of video sequence |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105631895A (en) * | 2015-12-18 | 2016-06-01 | 重庆大学 | Temporal-spatial context video target tracking method combining particle filtering |
CN105976401A (en) * | 2016-05-20 | 2016-09-28 | 河北工业职业技术学院 | Target tracking method and system based on partitioned multi-example learning algorithm |
CN106127798A (en) * | 2016-06-13 | 2016-11-16 | 重庆大学 | Dense space-time contextual target tracking based on adaptive model |
CN106485732A (en) * | 2016-09-09 | 2017-03-08 | 南京航空航天大学 | A kind of method for tracking target of video sequence |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108022254A (en) * | 2017-11-09 | 2018-05-11 | 华南理工大学 | A kind of space-time contextual target tracking based on sign point auxiliary |
CN108022254B (en) * | 2017-11-09 | 2022-02-15 | 华南理工大学 | Feature point assistance-based space-time context target tracking method |
CN109636828A (en) * | 2018-11-20 | 2019-04-16 | 北京京东尚科信息技术有限公司 | Object tracking methods and device based on video image |
CN111223104A (en) * | 2018-11-23 | 2020-06-02 | 杭州海康威视数字技术股份有限公司 | Package extraction and tracking method and device and electronic equipment |
CN111223104B (en) * | 2018-11-23 | 2023-10-10 | 杭州海康威视数字技术股份有限公司 | Method and device for extracting and tracking package and electronic equipment |
CN111241872A (en) * | 2018-11-28 | 2020-06-05 | 杭州海康威视数字技术股份有限公司 | Video image shielding method and device |
CN111241872B (en) * | 2018-11-28 | 2023-09-22 | 杭州海康威视数字技术股份有限公司 | Video image shielding method and device |
CN112489086A (en) * | 2020-12-11 | 2021-03-12 | 北京澎思科技有限公司 | Target tracking method, target tracking device, electronic device, and storage medium |
CN115712354A (en) * | 2022-07-06 | 2023-02-24 | 陈伟 | Man-machine interaction system based on vision and algorithm |
Also Published As
Publication number | Publication date |
---|---|
CN107240120B (en) | 2019-12-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107240120B (en) | Method and device for tracking moving target in video | |
CN109903312B (en) | Football player running distance statistical method based on video multi-target tracking | |
JP7427188B2 (en) | 3D pose acquisition method and device | |
US9727787B2 (en) | System and method for deriving accurate body size measures from a sequence of 2D images | |
US9208395B2 (en) | Position and orientation measurement apparatus, position and orientation measurement method, and storage medium | |
US20180321776A1 (en) | Method for acting on augmented reality virtual objects | |
US20090262113A1 (en) | Image processing apparatus and image processing method | |
CN107240124A (en) | Across camera lens multi-object tracking method and device based on space-time restriction | |
WO2022191140A1 (en) | 3d position acquisition method and device | |
JPH1186004A (en) | Moving body tracking device | |
WO2021093275A1 (en) | Method for adaptively calculating size of gaussian kernel in crowd counting system | |
Taketomi et al. | Real-time and accurate extrinsic camera parameter estimation using feature landmark database for augmented reality | |
JP2014026429A (en) | Posture estimation device, posture estimation method and posture estimation program | |
US20170090586A1 (en) | User gesture recognition | |
JP2018004638A (en) | Method and system for measuring ball spin and non-transitory computer-readable recording medium | |
CN108629799B (en) | Method and equipment for realizing augmented reality | |
Ohashi et al. | Synergetic reconstruction from 2D pose and 3D motion for wide-space multi-person video motion capture in the wild | |
CN107240117A (en) | The tracking and device of moving target in video | |
KR20180048443A (en) | Optimal Spherical Image Acquisition Method Using Multiple Cameras | |
Shalnov et al. | Convolutional neural network for camera pose estimation from object detections | |
CN113706373A (en) | Model reconstruction method and related device, electronic equipment and storage medium | |
CN114120168A (en) | Target running distance measuring and calculating method, system, equipment and storage medium | |
JP2009236569A (en) | Ground point estimation device, ground point estimation method, flow line display system, and server | |
CN114037923A (en) | Target activity hotspot graph drawing method, system, equipment and storage medium | |
US20230405432A1 (en) | Device and method for sensing movement of sphere moving on plane surface using camera, and device and method for sensing golfball moving on putting mat |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |