WO2018209934A1 - Cross-camera multi-target tracking method and apparatus based on spatiotemporal constraints (基于时空约束的跨镜头多目标跟踪方法及装置) - Google Patents

Cross-camera multi-target tracking method and apparatus based on spatiotemporal constraints

Info

Publication number
WO2018209934A1
Authority
WO
WIPO (PCT)
Prior art keywords
tracking
target
imaging devices
information
camera
Prior art date
Application number
PCT/CN2017/115672
Other languages
English (en)
French (fr)
Inventor
鲁继文 (LU Jiwen)
周杰 (ZHOU Jie)
任亮亮 (REN Liangliang)
Original Assignee
清华大学 (Tsinghua University)
Priority date
Filing date
Publication date
Application filed by 清华大学 (Tsinghua University)
Publication of WO2018209934A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/20: Analysis of motion
    • G06T 7/292: Multi-camera tracking
    • G06T 7/246: Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T 7/70: Determining position or orientation of objects or cameras
    • G06T 7/73: Determining position or orientation of objects or cameras using feature-based methods
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/10: Image acquisition modality
    • G06T 2207/10016: Video; Image sequence
    • G06T 2207/30: Subject of image; Context of image processing
    • G06T 2207/30241: Trajectory

Definitions

  • the invention relates to the field of visual target tracking in computer image processing, and in particular to a cross-camera multi-target tracking method and apparatus based on spatiotemporal constraints.
  • video target tracking means taking the initial position of a given target in a video and then outputting the target's position at every subsequent moment in the video.
  • object tracking is an important problem in computer vision and is usually the first step of video analytics. Consequently, a large number of researchers work on object tracking, and a number of effective tracking algorithms have been proposed.
  • in complex scenes, multiple objects need to be tracked simultaneously.
  • mutual occlusion between multiple objects increases the difficulty of tracking; this is common in pedestrian tracking.
  • multi-target tracking methods fall mainly into two categories: methods based on a single camera and methods based on multiple cameras.
  • single-camera multi-target tracking mainly includes inter-frame tracklet-splicing methods and global optimization methods.
  • tracklet splicing and linear-programming-based (LP) tracking are two further methods that optimize all trajectories simultaneously over the whole sequence.
  • first, tracklets (short tracking fragments) are generated from conventional detection results.
  • these tracklets are then connected using the Hungarian algorithm.
  • this approach assumes that all tracklets are correct trajectories and is therefore hard to extend to cases with many false detections in the original trajectory segments.
  • alternatively, a subgraph is generated for each object's trajectory, with edges between them through which the objects interact.
  • a multipath search problem is then solved on the subgraph using approximate linear programming and rounding; this assumes that the relative positions of the objects are fairly stable and that the number of targets is fixed.
  • for multiple cameras, current work focuses on multi-camera data fusion, mainly through camera calibration and feature matching.
  • calibration-based methods mainly use each camera's projection matrix to project the pictures of different cameras onto the same plane.
  • feature-matching methods mainly improve the matching result by finding effective appearance features and spatiotemporal information.
  • multi-camera tracking is more challenging than single-camera tracking because of the large differences in illumination and viewing angle between lenses.
  • in a multi-camera surveillance system, however, the information of multiple cameras can be used to locate objects more accurately.
  • the multi-camera real-time tracking problem has two main parts: tracking within a camera and tracking across cameras. How to handle overlapping coverage areas and uncovered areas in cross-camera tracking is discussed in many papers.
  • multi-camera multi-target tracking is therefore very meaningful; at the same time, because of the complexity of the problem, it is also very challenging.
  • some researchers have proposed using various kinds of information from multiple cameras to improve tracking robustness, but they ignore geometric constraints, violate geometric assumptions, and need more complicated methods to correct the resulting errors.
  • the present invention aims to solve, at least to some extent, one of the technical problems in the related art.
  • an object of the present invention is to propose a cross-camera multi-target tracking method based on spatiotemporal constraints, which can improve tracking robustness, reduce tracking error, and improve tracking accuracy.
  • another object of the present invention is to provide a cross-camera multi-target tracking apparatus based on spatiotemporal constraints.
  • to this end, an embodiment of the present invention provides a cross-camera multi-target tracking method based on spatiotemporal constraints, comprising the following steps: performing image preprocessing in different color spaces so that the pictures are consistent in color temperature and hue, thereby acquiring imaging information from a plurality of cameras; establishing a correspondence between the cameras through their projection matrices with respect to the 3D world, thereby acquiring geometric information between the cameras; and performing human-body feature matching between the cameras according to the imaging information and the geometric information, so that each camera's picture and the real-time tracking results are obtained using the appearance and spatiotemporal features of the tracked targets.
  • the cross-camera multi-target tracking method based on spatiotemporal constraints performs human-body feature matching between multiple cameras using imaging information and geometric information to achieve target tracking. It effectively combines current multi-target tracking algorithms with multi-camera processing methods and uses the pose-relationship matrix of the camera network to realize multi-camera multi-target tracking; while improving the robustness of object tracking, it reduces tracking error and improves tracking accuracy.
  • in addition, the cross-camera multi-target tracking method based on spatiotemporal constraints may further have the following additional technical features:
  • performing the human-body feature matching between the cameras according to the imaging information and the geometric information further includes: when any one of the cameras detects a tracking target, projecting the position of the tracking target into the ground coordinate system through the projection matrix; and clustering all the projected points to find the same tracking target in the other cameras.
  • finding the same tracking target in the other cameras further includes: acquiring the optimal group among all results, the optimal group being the one observed by the most cameras with the smallest mutual position error; determining the 3D coordinates of the tracking target from the optimal group; removing from the group the points whose deviation from those 3D coordinates exceeds a first preset value; and adding the remaining points whose deviation is below a second preset value, repeating until every point has been assigned to a set.
  • further, the Hough voting method is employed, and the position of the pedestrian is determined from the detected positions of the human body in the plurality of cameras and the pose information of the cameras.
  • further, the method includes matching the tracking result with a pedestrian model to eliminate mismatches, occlusion, and missed detections, wherein the pedestrian model includes speed and current position.
  • to achieve the above object, another embodiment of the present invention provides a cross-camera multi-target tracking apparatus based on spatiotemporal constraints, comprising: a preprocessing module for performing image preprocessing in different color spaces so that the pictures are consistent in color temperature and hue, thereby acquiring imaging information from a plurality of cameras; an acquisition module configured to establish a correspondence between 2D points through the cameras' projection matrices with respect to the 3D world, thereby acquiring geometric information between the cameras; and a tracking module configured to perform human-body feature matching between the cameras according to the imaging information and the geometric information, to acquire each camera's picture and the real-time tracking results using the appearance and spatiotemporal features of the tracked targets.
  • the cross-camera multi-target tracking apparatus based on spatiotemporal constraints performs human-body feature matching between multiple cameras using imaging information and geometric information to achieve target tracking. It effectively combines current multi-target tracking algorithms with multi-camera processing methods and uses the pose-relationship matrix of the camera network to realize multi-camera multi-target tracking; while improving the robustness of object tracking, it reduces tracking error and improves tracking accuracy.
  • in addition, the cross-camera multi-target tracking apparatus based on spatiotemporal constraints may further have the following additional technical features:
  • the tracking module is further configured to: when any one of the cameras detects a tracking target, project the position of the tracking target into the ground coordinate system through the projection matrix, and cluster all the projected points to find the same tracking target in the other cameras.
  • the tracking module is further configured to acquire the optimal group among all results, the optimal group being the one observed by the most cameras with the smallest mutual position error; to determine the 3D coordinates of the tracking target from the optimal group; to remove from the group the points whose deviation from those 3D coordinates exceeds a first preset value; and to add the remaining points whose deviation is below a second preset value, repeating until every point has been assigned to a set.
  • further, the apparatus includes a positioning module configured to adopt the Hough voting method and determine the position of the pedestrian from the detected positions of the human body in the plurality of cameras and the pose information of the cameras.
  • further, the apparatus includes a matching module configured to match the tracking result with a pedestrian model to eliminate mismatches, occlusion, and missed detections, wherein the pedestrian model includes one or more parameters among speed, current position, color features, first appearance time, trajectory, and current state.
  • FIG. 1 is a flowchart of a cross-camera multi-target tracking method based on spatiotemporal constraints according to an embodiment of the present invention.
  • FIG. 2 is a flowchart of a cross-camera multi-target tracking method based on spatiotemporal constraints, in accordance with an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of detection results at a certain moment according to an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of positioning and clustering results according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of a camera detection result according to an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of positioning results according to an embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of a cross-camera multi-target tracking apparatus based on spatiotemporal constraints according to an embodiment of the present invention.
  • FIG. 1 is a flowchart of the cross-camera multi-target tracking method based on spatiotemporal constraints according to an embodiment of the present invention.
  • the cross-camera multi-target tracking method based on spatiotemporal constraints includes the following steps:
  • in step S101, image preprocessing is performed in different color spaces so that the pictures are consistent in color temperature and hue, thereby acquiring imaging information from the plurality of cameras.
  • the imaging devices (hereinafter, video cameras are taken as an example) require preprocessing to reduce the differences between different cameras.
  • in color science, a variety of color models can be used to describe a color; commonly used ones include the RGB color space, the Lab color space, the CMYK color space, and the HSV color space.
  • because the same object has different colors in the pictures of different cameras due to camera orientation, illumination, and device differences, and because the later target tracking uses pedestrians' color statistics as an important feature, the embodiment of the present invention performs image preprocessing in different color spaces.
  • the embodiment of the present invention adopts a simple and effective algorithm: normalizing to the same mean and the same variance in the Lab color space gives the best result, because the coupling between the three channels of the Lab color space is the smallest, and the processed image shows no noise and no serious color distortion.
  • the following formula is used to apply the same mean and variance normalization to each frame of each camera, so as to avoid the influence of pedestrians in the video on the normalization:
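  • as a hedged illustration (the patent's exact formula is given as an image in the original and is not reproduced here), per-channel same-mean/same-variance normalization can be sketched in Python as follows; the target statistics and the omitted Lab conversion are assumptions:

```python
import numpy as np

def match_mean_std(channels, target_mean, target_std, eps=1e-6):
    """Per-channel same-mean / same-variance normalization.

    channels: H x W x 3 float array, assumed to already be in Lab space
    (the colour-space conversion itself is omitted here).
    target_mean / target_std: length-3 statistics shared by all cameras,
    e.g. pooled over every camera's frames (assumed values).
    """
    flat = channels.reshape(-1, channels.shape[-1])
    mean = flat.mean(axis=0)
    std = flat.std(axis=0) + eps  # eps guards against flat channels
    return (channels - mean) / std * np.asarray(target_std) + np.asarray(target_mean)
```

  after this step every camera's frame shares the same channel-wise mean and variance, so color statistics become comparable across cameras.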
  • in step S102, a correspondence between 2D points is established through the cameras' projection matrices to acquire geometric information between the plurality of cameras, where each projection matrix is defined with respect to the 3D world.
  • in step S103, human-body feature matching between the cameras is performed according to the imaging information and the geometric information, so that each camera's picture and the real-time tracking results are obtained using the appearance and spatiotemporal features of the tracked targets.
  • specifically, performing the human-body feature matching between the cameras according to the imaging information and the geometric information further includes: when any one of the cameras detects a tracking target, projecting the position of the tracking target into the ground coordinate system through the projection matrix; and clustering all the projected points to find the same tracking target in the other cameras.
  • finding the same tracking target in the other cameras further includes: acquiring the optimal group among all results, the optimal group being the one observed by the most cameras with the smallest mutual position error; determining the 3D coordinates of the tracking target from the optimal group; removing from the group the points whose deviation from those 3D coordinates exceeds a first preset value; and adding the remaining points whose deviation is below a second preset value, repeating until every point has been assigned to a set.
  • after comprehensively comparing various object detection algorithms, the embodiment of the present invention uses Faster R-CNN for object detection; a 2D point in the image then corresponds to a 3D point in the world through the camera projection.
  • H is called the projection matrix of the camera:
  • two cameras can be related through their projection matrices with respect to the 3D world, that is, a correspondence between their 2D points can be established:
  • the ground can be regarded as one huge camera, and the projection matrix of every camera with respect to the ground is solved. Knowing the projection matrix H_{i→g} from camera i to the ground, the ground coordinates of any point (x_i, y_i) in camera i can be derived by the following formula:
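  • this ground projection is the standard planar homography applied in homogeneous coordinates; a minimal sketch follows (illustrative Python; the matrix values used below are assumptions, not calibration results from the patent):

```python
import numpy as np

def project_to_ground(H_i_to_g, x, y):
    """Map pixel (x, y) of camera i to ground coordinates.

    H_i_to_g: assumed 3x3 camera-to-ground homography. The point is
    lifted to homogeneous coordinates, multiplied by H, and
    dehomogenized by the third component.
    """
    p = H_i_to_g @ np.array([x, y, 1.0])
    return p[0] / p[2], p[1] / p[2]
```

  with the identity homography a point maps to itself; a real H_{i→g} would come from calibrating camera i against landmarks on the ground plane.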
  • let I be the total number of people detected in all cameras in the k-th frame.
  • the resulting optimization problem is an integer program; in a practical system the global optimum cannot be solved exactly, so the embodiment of the present invention designs a method for approximating the optimal solution:
  • the candidate set is first clustered using location and color information; the cluster-center feature information is then used for filtering, and the remaining reliable elements are used to compute the optimal position.
  • the specific algorithm is given in the following sections.
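  • the location-based clustering step can be sketched as a simple greedy grouping of the projected footpoints (illustrative only; the radius parameter and the omission of the color term are assumptions):

```python
import numpy as np

def cluster_ground_points(points, radius=50.0):
    """Greedy clustering of projected footpoints on the ground plane.

    Detections from different cameras that land within `radius` ground
    units of an existing cluster center are grouped as one candidate
    pedestrian (radius is an assumed parameter; the patent does not fix it).
    """
    points = np.asarray(points, dtype=float)
    clusters = []  # lists of point indices, one list per cluster
    centers = []   # running cluster centers
    for i, p in enumerate(points):
        for c, center in enumerate(centers):
            if np.linalg.norm(p - center) <= radius:
                clusters[c].append(i)
                centers[c] = points[clusters[c]].mean(axis=0)
                break
        else:
            clusters.append([i])
            centers.append(p.copy())
    return clusters, centers
```

  each resulting cluster is one candidate pedestrian, and its center feeds the subsequent filtering and position estimation.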
  • the Hough voting method is employed, and the position of the pedestrian is determined from the detected positions of the human body in the plurality of cameras and the pose information of the cameras.
  • the intersection of the line segments obtained by projecting the body direction onto the ground from two cameras is more likely to be the position of the real pedestrian on the ground.
  • the idea of Hough voting can therefore be adopted: the position of the pedestrian is determined by jointly considering the detected positions of the human body in the multiple cameras and the pose information of the cameras. All camera pictures are assumed to be horizontal, that is, in a camera picture the x coordinate of each person's head and feet is the same.
  • the footholds (x, y) and (x, y+Δ) in the camera picture are projected onto the ground to obtain (x'_1, y'_1) and (x'_2, y'_2). Then (x'_2 − x'_1, y'_2 − y'_1) is the direction after projection, and its length divided by Δ is the change in scale of the camera point (x, y) when projected onto the ground, which is used later to visualize the tracking results.
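  • once each camera contributes a projected foothold and direction, the vote can be resolved as a least-squares intersection of the ground rays; the sketch below illustrates the idea and is not the patent's exact voting procedure:

```python
import numpy as np

def vote_position(origins, directions):
    """Least-squares intersection of ground-projected view rays.

    Each camera contributes a ray (origin = projected foothold,
    direction = projected body direction); the returned point minimises
    the summed squared distance to all rays. The normal equations become
    singular if fewer than two non-parallel rays are given.
    """
    A = np.zeros((2, 2))
    b = np.zeros(2)
    for o, d in zip(origins, directions):
        d = np.asarray(d, float)
        d = d / np.linalg.norm(d)
        P = np.eye(2) - np.outer(d, d)  # projector onto the ray's normal space
        A += P
        b += P @ np.asarray(o, float)
    return np.linalg.solve(A, b)
```

  two perpendicular rays recover their exact intersection; with more rays, single-camera errors are averaged out.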
  • the circles indicate the Hough voting results
  • the stars indicate the results of the traditional method
  • cam1 is indicated by solid line 4
  • cam2 is indicated by solid line 3
  • cam3 is indicated by solid line 2
  • cam4 is indicated by solid line 1
  • the center of each line is the position where the foothold detected by that camera projects onto the ground. The results obtained by the Hough voting method generally appear where the projection directions of multiple cameras converge. For example, the person in the lower-left corner of the ground coordinates is detected in cam1, cam2, and cam3; the positions detected in cam1 and cam2 are accurate, while the foothold detected by cam3 has a large error. However, the direction of the detected body is accurate in every camera, that is, the left and right edges of the bounding box are reliable.
  • the algorithm uses the RANSAC idea: combining all the data is not necessarily optimal, and an optimal subset can be found whose data combination has the highest reliability and the smallest variance.
  • two lines determine a point, so the algorithm randomly selects two cameras from the set, solves for the corresponding position, evaluates the global loss function at that position, and then selects among the combinations the position with the smallest loss. This removes the influence of positioning information with large individual errors and is a good way to improve positioning accuracy when the number of cameras is limited (usually at most 4).
  • Figure 5 shows the situation with 7 people in the camera coverage area. Three of them are seen by all four cameras at the same time: the two people on the far right in the middle of the ground plane, and the person at the far left of cam1, who is the person at the far right of cam3. The remaining two people in the middle are seen by three cameras at the same time. Of the rest, the top person is detected only in cam4, and the other two each appear in two camera pictures. The positioning result in Fig. 6 shows that the result obtained by the Hough voting method is very accurate, as can be seen from the relative positions between the people and the degree of convergence of the projection lines. Except for the top person detected only by cam4, the projection lines of each pedestrian almost intersect at one point. Note that the positions of two detection boxes contain errors.
  • the first is the second, smaller bounding box on the left side of cam4: the recognized foothold is biased because the feet are occluded and the person is far from cam4. The error can be seen in the ground plane (the red line in the upper-left corner, whose center is the foothold projected onto the ground as estimated from cam4), which differs from the true position by more than 100 pixels. However, the error in its direction is small, and its extension line passes almost through the position determined by the other three cameras.
  • the second is the rightmost bounding box in cam2, whose foothold recognition also has a certain error. The error is amplified by the resolution, so this camera's ground-point estimate is off by more than 50 pixels in the actual ground coordinates; but the error in the projection direction is small, and the final pedestrian position is still located precisely by simultaneously using the information from the two camera pictures.
  • one term is the color feature of the i-th person in the k-th frame.
  • K is a correlation function; position and velocity each have such a correlation function.
  • three threshold parameters handle the cases where a pedestrian disappears or appears, and eliminate mismatches.
  • an adjacency matrix F_k represents the relationship between the current frame and the previous frame: if its entry for two pedestrians is one, the two are the same person; if it is zero, they are not.
  • the last constraint can be expressed as: every column of the matrix F_k has at least one element equal to one.
  • the above problem can be transformed into a minimum-cost-flow optimization problem, and the global optimal solution can be obtained by a minimum-cost-flow algorithm.
  • a practical tracking problem, however, requires real-time operation and causality, that is, the prediction for the current frame can only consider frames up to the current one and cannot be affected by later results.
  • the above method can find a feasible solution in fixed linear time, using only the information of the current frame and the previous frame.
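  • as a stand-in for the min-cost-flow solver, a greedy thresholded assignment over the frame-to-frame cost matrix illustrates the causal, two-frame matching (the cost entries and the threshold are assumptions; a true min-cost-flow formulation would return the global optimum):

```python
import numpy as np

def match_frame(cost, max_cost=1.0):
    """Greedy frame-to-frame assignment under a cost threshold.

    cost[i, j]: dissimilarity between pedestrian model i and current
    detection j (e.g. combining colour and position/velocity terms).
    Pairs whose cost exceeds max_cost stay unmatched, modelling
    pedestrians that appear or disappear.
    """
    cost = np.asarray(cost, float)
    matches = {}
    used_rows, used_cols = set(), set()
    # visit candidate pairs from cheapest to most expensive
    for i, j in sorted(np.ndindex(*cost.shape), key=lambda ij: cost[ij]):
        if i in used_rows or j in used_cols or cost[i, j] > max_cost:
            continue
        matches[i] = j
        used_rows.add(i)
        used_cols.add(j)
    return matches
```

  unmatched models are marked lost for this frame; unmatched detections start new pedestrian models.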
  • the method further includes: matching the tracking result with the pedestrian model to eliminate mismatches, occlusion, and missed detections, wherein the pedestrian model includes one or more of speed, current position, color features, first appearance time, trajectory, and current state.
  • the embodiment of the present invention proposes a pedestrian model that fully utilizes the previous tracking results, eliminates mismatches, and allows a target to disappear for a short time, in order to solve the occlusion and missed-detection problems.
  • each pedestrian model contains the following parameters:
  • the current frame is matched against the already constructed pedestrian models as described above.
  • each pedestrian model is updated after the final matching result is obtained.
  • the update has two cases: if a match is found in the current frame, the pedestrian is considered detected in this frame; if no match is found, the pedestrian is considered lost in this frame.
  • an exponential smoothing term smooths the pedestrian's speed, reducing the influence of per-frame estimation noise on the tracking result. Note also that the pedestrian's position is not updated directly with the position in the current frame; instead the speed is updated first, and the position is then advanced by the speed. This has the advantage of using the previous speed information and, together with the maximum-speed limit, reduces the problems caused by a mismatch in a single frame. There is, however, a certain lag: if the object's speed changes greatly, the model takes a long time to correct; but since such situations rarely arise in practical tracking, adopting this strategy is on balance beneficial.
  • the setting of the smoothing coefficient needs to be considered comprehensively: if it is too small, it cannot filter the noise; if it is too large, the response time to speed changes becomes very long. Values between 0.8 and 0.9 proved reasonable in the experiments.
  • the correction coefficient, applied over a long period, makes the color features in the model better match the person's original color characteristics; it should not be too small, and here it is set to 0.99.
  • the fourth item is the update of the state: the pedestrian model has two states, lost and active. If the pedestrian was in the lost state in the previous frame, the state is changed to active. Finally, the position in this frame is recorded in the trajectory information.
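  • the update rules above can be sketched as follows (a minimal sketch; the field names are assumptions, the smoothing coefficient of 0.85 sits in the stated 0.8-0.9 range, and the color coefficient of 0.99 is the value stated above):

```python
import numpy as np

class PedestrianModel:
    """Minimal sketch of the pedestrian model update.

    lam exponentially smooths the velocity; eta slowly adapts the colour
    feature. The position is advanced with the smoothed velocity rather
    than snapped to the raw detection, as described in the text.
    """
    def __init__(self, position, color, lam=0.85, eta=0.99):
        self.position = np.asarray(position, float)
        self.velocity = np.zeros_like(self.position)
        self.color = np.asarray(color, float)
        self.state = "active"
        self.track = [self.position.copy()]
        self.lam, self.eta = lam, eta

    def update_matched(self, detection_pos, detection_color):
        # smooth the velocity first, then advance the position by it
        measured_v = np.asarray(detection_pos, float) - self.position
        self.velocity = self.lam * self.velocity + (1 - self.lam) * measured_v
        self.position = self.position + self.velocity
        # slow exponential update of the colour feature
        self.color = self.eta * self.color + (1 - self.eta) * np.asarray(detection_color, float)
        self.state = "active"
        self.track.append(self.position.copy())

    def update_lost(self):
        self.state = "lost"
```

  a lost model keeps its last position and velocity, so a pedestrian who reappears after a brief occlusion can be re-matched instead of spawning a new identity.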
  • in summary, the cross-camera multi-target tracking method based on spatiotemporal constraints fuses the information of multiple cameras, considering the geometric information between the cameras and the appearance and spatiotemporal features of the targets to achieve more efficient data fusion; it uses Hough voting with camera priors to determine the pedestrian's 3D position, eliminating the inaccuracy of the foothold estimation on which traditional methods rely, and directly tracks the pedestrian's 3D position for more effective human analysis; and it introduces the pedestrian model and jointly considers the tracking results over multiple frames.
  • more robust multi-target tracking is thereby realized: human-body feature matching between the cameras is performed through the imaging information and the geometric information to achieve target tracking, effectively combining current multi-target tracking algorithms and multi-camera processing methods, and using the camera pose-relationship matrix to achieve multi-camera multi-target tracking, which improves the robustness of object tracking while reducing tracking error and improving tracking accuracy.
  • FIG. 8 is a schematic structural diagram of a cross-camera multi-target tracking apparatus based on spatiotemporal constraints according to an embodiment of the present invention.
  • the cross-camera multi-target tracking apparatus 10 based on spatiotemporal constraints includes a preprocessing module 100, an acquisition module 200, and a tracking module 300.
  • the preprocessing module 100 is configured to perform image preprocessing in different color spaces so that the pictures are consistent in color temperature and hue, thereby acquiring imaging information from the plurality of cameras.
  • the acquisition module 200 is configured to establish a correspondence between 2D points through the cameras' projection matrices to acquire geometric information between the cameras, where each projection matrix is defined with respect to the 3D world.
  • the tracking module 300 is configured to perform human-body feature matching between the cameras according to the imaging information and the geometric information, to acquire each camera's picture and the tracking results in real time using the appearance and spatiotemporal features of the tracked targets.
  • the apparatus 10 of the embodiment of the present invention combines current multi-target tracking algorithms with multi-camera processing methods and uses the pose-relationship matrix of the camera network to realize multi-camera multi-target tracking; while improving the robustness of object tracking, it reduces tracking error and improves tracking accuracy.
  • further, in an embodiment of the present invention, the tracking module 300 is further configured to, when any one of the plurality of cameras detects a tracking target, project the position of the tracking target into the ground coordinate system through the projection matrix, and cluster all the projected points to find the same tracking target in the other cameras.
  • further, the tracking module 300 is further configured to acquire the optimal group among all results, the optimal group being the one observed by the most cameras with the smallest mutual position error; to determine the 3D coordinates of the tracking target from the optimal group; to remove from the group the points whose deviation from those 3D coordinates exceeds a first preset value; and to add the remaining points whose deviation is below a second preset value, repeating until every point has been assigned to a set.
  • the apparatus 10 of the embodiment of the present invention further includes: a positioning module.
  • the positioning module is configured to use a Hough voting method, determining a pedestrian's position from the body's positions in the plurality of imaging devices and the pose information of those devices.
  • the apparatus 10 of the embodiment of the present invention further includes: a matching module.
  • the matching module is used to match the tracking results against a pedestrian model to eliminate mismatches, occlusions and missed detections, where the pedestrian model includes one or more of: velocity, current position, color features, time of first appearance, trajectory and current state.
  • the spatiotemporal-constrained cross-camera multi-target tracking device combines the information of multiple cameras, taking into account the geometric information between cameras and the appearance and spatiotemporal features of the targets to achieve more effective data fusion. It uses Hough voting to determine pedestrians' 3D positions, exploiting camera priors to eliminate the inaccuracy of traditional foothold-based estimation, and tracks pedestrians' 3D positions directly for more effective human analysis. A pedestrian model is introduced that aggregates tracking results over multiple frames, achieving more robust multi-target tracking. Human-body feature matching across cameras is performed using the imaging information and the geometric information, realizing target tracking that effectively combines current multi-target tracking algorithms with multi-camera processing methods; using the camera pose-relation matrix, multi-camera multi-target object tracking is achieved, improving the robustness of object tracking, reducing tracking error, and improving tracking accuracy.
  • the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance, or as implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, such as two or three, unless specifically defined otherwise.
  • the terms "installation", "connected", "coupled", "fixed" and the like shall be understood broadly unless explicitly stated and defined otherwise: a connection may be fixed, detachable, or integral; mechanical or electrical; direct, or indirect through an intermediate medium; and may be internal communication between two elements or an interaction between two elements. For those of ordinary skill in the art, the specific meanings of the above terms in the present invention can be understood on a case-by-case basis.
  • the first feature "on” or “under” the second feature may be a direct contact of the first and second features, or the first and second features may be indirectly through an intermediate medium, unless otherwise explicitly stated and defined. contact.
  • the first feature "above”, “above” and “above” the second feature may be that the first feature is directly above or above the second feature, or merely that the first feature level is higher than the second feature.
  • the first feature “below”, “below” and “below” the second feature may be that the first feature is directly below or obliquely below the second feature, or merely that the first feature level is less than the second feature.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

A spatiotemporal-constrained cross-camera multi-target tracking method and device. The method comprises: performing image preprocessing in different color spaces so that pictures are consistent in color temperature and tone, to obtain imaging information from a plurality of imaging devices (S101); establishing correspondences between 2D points through the imaging devices' projection matrices to obtain the geometric information among the plurality of imaging devices, where each projection matrix relates the image to the 3D world (S102); and performing human-body feature matching across the cameras according to the imaging information and the geometric information, to obtain each imaging device's view and real-time tracking results using the appearance and spatiotemporal features of the tracked targets (S103). By combining multi-target tracking algorithms with multi-camera processing methods and using the pose-relation matrix of the imaging-device network, the method achieves multi-camera multi-target object tracking, improving the robustness of object tracking while reducing tracking error and improving tracking accuracy.

Description

Spatiotemporal-constrained cross-camera multi-target tracking method and device
Cross-reference to related applications
This application claims priority to Chinese patent application No. 201710358354.7, entitled "Spatiotemporal-constrained cross-camera multi-target tracking method and device", filed by Tsinghua University on May 19, 2017.
Technical field
The present invention relates to the technical field of visual target tracking in computer image processing, and in particular to a spatiotemporal-constrained cross-camera multi-target tracking method and device.
Background
Video target tracking means that, given a target's initial position in a video, the target's position at every subsequent moment in the video is output. Object tracking is an important problem in computer vision and is usually the first step of video analysis; consequently a large number of researchers work on it and many effective tracking algorithms have been proposed. In some surveillance scenarios, multiple objects must be tracked simultaneously in a complex scene. Mutual occlusion between objects increases the difficulty of tracking, which arises frequently in pedestrian tracking: when a large crowd appears in a camera's view, people overlap one another, making it impossible to obtain their actual positions accurately. Current multi-target tracking methods fall mainly into two classes: single-camera multi-target tracking and multi-camera multi-target tracking.
Single-camera multi-target tracking methods mainly comprise inter-frame tracklet stitching and global optimization. Tracklet stitching and linear-programming (LP) tracking, which optimizes all trajectories over the entire sequence simultaneously, are two further approaches. Tracklets are first generated from conventional crowd-detection results, and these tracklets are then connected via the Hungarian algorithm. This approach assumes all tracklets are correct trajectories, so it is hard to extend to cases where each raw trajectory fragment contains many false detections. A subgraph can be generated over each object's trajectory and the edges between them, objects interacting through the edges; a multi-path search problem is then solved in the subgraph via approximate linear programming and rounding. This assumes that the objects' relative positions are stable and that the number of targets is fixed.
Multi-camera methods currently focus on how to fuse data across cameras, mainly via camera calibration or feature matching. Calibration-based methods use the cameras' projection matrices to project different camera views onto one common view. Feature-matching methods seek efficient appearance features and spatiotemporal information to improve matching results. Because of the large illumination and viewpoint differences between lenses, multi-camera tracking is more challenging than single-camera tracking.
For tracking multiple objects in complex scenes, one effective approach is a multi-camera surveillance system: in regions covered by several cameras, the information of multiple imaging devices can be combined to locate objects fairly accurately. As sensor and processor prices fall, deploying multiple cameras together has become increasingly common in many scenarios. Real-time multi-camera tracking comprises two parts: intra-camera tracking and cross-camera tracking; the handling of overlapping coverage and of uncovered regions in the latter has been discussed in many papers. Driven by security and pedestrian-analytics needs, multi-camera multi-target tracking is valuable, but the complexity of the problem also makes this work highly challenging. Recently, several methods have been proposed that use information from multiple cameras to improve the robustness of object tracking, but they ignore geometric constraints and violate geometric assumptions, requiring more complex methods to compensate for the resulting errors.
Summary of the invention
The present invention aims to solve, at least to some extent, one of the technical problems in the related art.
To this end, one object of the present invention is to propose a spatiotemporal-constrained cross-camera multi-target tracking method that improves the robustness of object tracking while reducing tracking error and improving tracking accuracy.
Another object of the present invention is to propose a spatiotemporal-constrained cross-camera multi-target tracking device.
To achieve the above objects, an embodiment of one aspect of the present invention proposes a spatiotemporal-constrained cross-camera multi-target tracking method comprising the following steps: performing image preprocessing in different color spaces so that pictures are consistent in color temperature and tone, to obtain imaging information from a plurality of imaging devices; establishing correspondences between 2D points through the imaging devices' projection matrices to obtain the geometric information among the plurality of imaging devices, where each projection matrix relates the image to the 3D world; and performing human-body feature matching across the cameras according to the imaging information and the geometric information, to obtain each imaging device's view and real-time tracking results using the appearance and spatiotemporal features of the tracked targets.
In the spatiotemporal-constrained cross-camera multi-target tracking method of the embodiment, human-body feature matching across cameras is performed using the imaging information and the geometric information to achieve target tracking; current multi-target tracking algorithms are effectively combined with multi-camera processing methods, and the pose-relation matrix of the imaging-device network is used to achieve multi-camera multi-target object tracking, improving the robustness of object tracking while reducing tracking error and improving tracking accuracy.
In addition, the spatiotemporal-constrained cross-camera multi-target tracking method of the above embodiment of the present invention may further have the following additional technical features:
Further, in one embodiment of the present invention, performing human-body feature matching across the cameras according to the imaging information and the geometric information further comprises: when any one of the plurality of imaging devices detects the tracking target, projecting the tracking target's position through the projection matrix into the coordinate system of the ground; and clustering all points to identify the same tracking target in the other imaging devices among the plurality of imaging devices.
Further, in one embodiment of the present invention, identifying the same tracking target in the other imaging devices further comprises: obtaining the optimal group among all results, the optimal group being the one with the largest number of imaging devices and the smallest relative position error; determining the tracking target's 3D coordinates from the optimal group; removing from the group the selected points whose deviation from those 3D coordinates exceeds a first preset value; selecting, among the remaining points, those whose deviation is below a second preset value; and removing that set, until all points have been assigned to sets.
Further, in one embodiment of the present invention, a Hough voting method is used, and a pedestrian's position is determined from the body's positions in the plurality of imaging devices and the devices' pose information.
Further, in one embodiment of the present invention, the tracking further comprises: matching the tracking results against a pedestrian model to eliminate mismatches, occlusions and missed detections, where the pedestrian model includes one or more of: velocity, current position, color features, time of first appearance, trajectory and current state.
To achieve the above objects, an embodiment of another aspect of the present invention proposes a spatiotemporal-constrained cross-camera multi-target tracking device comprising: a preprocessing module for performing image preprocessing in different color spaces so that pictures are consistent in color temperature and tone, to obtain imaging information from a plurality of imaging devices; an acquisition module for establishing correspondences between 2D points through the imaging devices' projection matrices to obtain the geometric information among the plurality of imaging devices, where each projection matrix relates the image to the 3D world; and a tracking module for performing human-body feature matching across the cameras according to the imaging information and the geometric information, to obtain each imaging device's view and real-time tracking results using the appearance and spatiotemporal features of the tracked targets.
In the spatiotemporal-constrained cross-camera multi-target tracking device of the embodiment, human-body feature matching across cameras is performed using the imaging information and the geometric information to achieve target tracking; current multi-target tracking algorithms are effectively combined with multi-camera processing methods, and the pose-relation matrix of the imaging-device network is used to achieve multi-camera multi-target object tracking, improving the robustness of object tracking while reducing tracking error and improving tracking accuracy.
In addition, the spatiotemporal-constrained cross-camera multi-target tracking device of the above embodiment of the present invention may further have the following additional technical features:
Further, in one embodiment of the present invention, the tracking module is further configured such that, when any one of the plurality of imaging devices detects the tracking target, the tracking target's position is projected through the projection matrix into the coordinate system of the ground, and all points are clustered to identify the same tracking target in the other imaging devices among the plurality of imaging devices.
Further, in one embodiment of the present invention, the tracking module is further configured to obtain the optimal group among all results, the optimal group being the one with the largest number of imaging devices and the smallest relative position error; to determine the tracking target's 3D coordinates from the optimal group; to remove from the group the selected points whose deviation from those 3D coordinates exceeds a first preset value; to select, among the remaining points, those whose deviation is below a second preset value; and to remove that set, until all points have been assigned to sets.
Further, in one embodiment of the present invention, the device further comprises: a positioning module for using a Hough voting method and determining a pedestrian's position from the body's positions in the plurality of imaging devices and the devices' pose information.
Further, in one embodiment of the present invention, the device further comprises: a matching module for matching the tracking results against a pedestrian model to eliminate mismatches, occlusions and missed detections, where the pedestrian model includes one or more of: velocity, current position, color features, time of first appearance, trajectory and current state.
Additional aspects and advantages of the present invention will be set forth in part in the following description, and in part will become apparent from the description or be learned by practice of the invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a flowchart of a spatiotemporal-constrained cross-camera multi-target tracking method according to an embodiment of the present invention;
Fig. 2 is a flowchart of a spatiotemporal-constrained cross-camera multi-target tracking method according to a specific embodiment of the present invention;
Fig. 3 is a schematic diagram of detection results at a given moment according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of localization and clustering results according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of camera detection results according to an embodiment of the present invention;
Fig. 6 is a schematic diagram of localization results according to an embodiment of the present invention;
Fig. 7 is a schematic diagram of actual tracking results according to an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of a spatiotemporal-constrained cross-camera multi-target tracking device according to an embodiment of the present invention.
Detailed description
Embodiments of the present invention are described in detail below; examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary, intended to explain the present invention, and are not to be construed as limiting it.
The spatiotemporal-constrained cross-camera multi-target tracking method and device proposed according to embodiments of the present invention are described below with reference to the drawings; the method is described first.
Fig. 1 is a flowchart of the spatiotemporal-constrained cross-camera multi-target tracking method of an embodiment of the present invention.
As shown in Fig. 1, the spatiotemporal-constrained cross-camera multi-target tracking method comprises the following steps:
In step S101, image preprocessing is performed in different color spaces so that pictures are consistent in color temperature and tone, to obtain imaging information from a plurality of imaging devices.
Specifically, the imaging devices (cameras are taken as the example below) first require preprocessing. In colorimetry, many color models can be used to describe a color, common ones being the RGB, Lab, CMYK and HSV color spaces; image preprocessing to reduce the differences between cameras is therefore necessary. In the raw views of multiple cameras, the same object appears with different colors in different camera views because of differences in camera orientation, illumination and hardware; and since the embodiment later uses pedestrians' color statistics as an important feature during target tracking, image preprocessing is performed in the different color spaces.
For example, although the four cameras show the same ground at the same moment, the four pictures differ considerably in color temperature and tone, which would affect subsequent human-body feature matching across the cameras. The embodiment therefore adopts a simple and effective algorithm; equalizing means and variances in the Lab color space gives the best results, because the three channels of the Lab space are the least coupled, and the processed images show neither noise nor severe color distortion.
First, target per-channel means and standard deviations m_t^c, σ_t^c (for channels c ∈ {L, a, b}) are fixed; at the same time, for each camera i the mean and standard deviation m_i^c, σ_i^c of its background frame (the first frame, or a background obtained with a background-construction algorithm) are recorded, so that pedestrians appearing in the video do not affect the normalization. Each frame of each camera is then normalized to the common mean and variance with the following formula (the original formulas are given as images; the standard mean–variance transfer consistent with the surrounding text is shown here):

    I'_i^c(x) = (I_i^c(x) - m_i^c) · σ_t^c / σ_i^c + m_t^c
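The per-channel mean–variance normalization described above can be sketched as follows. This is a minimal illustration, not part of the patent disclosure; the function name and the use of NumPy are assumptions, and the background statistics are supplied externally as the text prescribes.

```python
import numpy as np

def normalize_channels(frame, bg_stats, target_stats):
    """Match each channel's mean/std to fixed targets.

    frame: H x W x C float array (e.g. a Lab-converted image).
    bg_stats: per-channel (mean, std) measured on this camera's
        background frame, so pedestrians do not bias the statistics.
    target_stats: per-channel (mean, std) shared by all cameras.
    """
    out = np.empty_like(frame, dtype=np.float64)
    for c, ((m_i, s_i), (m_t, s_t)) in enumerate(zip(bg_stats, target_stats)):
        # standard mean-variance transfer: shift to zero mean, rescale, re-center
        out[..., c] = (frame[..., c] - m_i) / s_i * s_t + m_t
    return out
```

Applied per frame and per camera, this brings all views to a common color temperature and tone before feature matching.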
In step S102, correspondences between 2D points are established through the imaging devices' projection matrices to obtain the geometric information among the plurality of imaging devices, where each projection matrix relates the image to the 3D world.
In step S103, human-body feature matching across the cameras is performed according to the imaging information and the geometric information, to obtain each imaging device's view and real-time tracking results using the appearance and spatiotemporal features of the tracked targets.
In one embodiment of the present invention, performing human-body feature matching across the cameras according to the imaging information and the geometric information further comprises: when any one of the plurality of imaging devices detects the tracking target, projecting the tracking target's position through the projection matrix into the coordinate system of the ground; and clustering all points to identify the same tracking target in the other imaging devices among the plurality of imaging devices.
Further, in one embodiment of the present invention, identifying the same tracking target in the other imaging devices further comprises: obtaining the optimal group among all results, the optimal group being the one with the largest number of imaging devices and the smallest relative position error; determining the tracking target's 3D coordinates from the optimal group; removing from the group the selected points whose deviation from those 3D coordinates exceeds a first preset value; selecting, among the remaining points, those whose deviation is below a second preset value; and removing that set, until all points have been assigned to sets.
Specifically, for multi-camera based multi-target tracking, after comprehensively comparing several object-detection algorithms, the embodiment uses Faster R-CNN for object detection. A 2D point in the image and a 3D point in the world then satisfy the following correspondence, H being called the camera's projection matrix (the original formulas are given as images; the standard projective relations consistent with the surrounding text are shown here). For points on the ground plane,

    s · (x, y, 1)^T = H · (X, Y, 1)^T,

where s is a scale factor. Two cameras can be related through their projection matrices with respect to the 3D world, i.e., a correspondence between 2D points can be established:

    H_{i→j} = H_j · H_i^{-1}.

In the embodiment of the present invention, the ground can be regarded as one huge camera, and the projection matrices of all cameras with respect to the ground are solved for. Given the projection matrix H_{i→g} from camera i to the ground and any point (x_i, y_i) in camera i, its corresponding ground coordinates are obtained by

    (x', y', w')^T = H_{i→g} · (x_i, y_i, 1)^T,    (x_i^g, y_i^g) = (x'/w', y'/w').
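The camera-to-ground projection above reduces to applying a 3×3 homography with a homogeneous divide, which can be sketched as follows (a minimal illustration; the function name is an assumption, and H_i_g is taken as given):

```python
import numpy as np

def project_to_ground(H_i_g, points):
    """Apply the camera-to-ground homography H_i_g (3x3) to 2D pixel
    points, returning ground-plane coordinates after the homogeneous divide."""
    pts = np.asarray(points, dtype=np.float64)
    homog = np.hstack([pts, np.ones((len(pts), 1))])   # rows (x, y, 1)
    mapped = homog @ H_i_g.T                           # rows (x', y', w')
    return mapped[:, :2] / mapped[:, 2:3]              # (x'/w', y'/w')
```

In the method, each detected foothold (x_j^i, y_j^i) of camera i is mapped this way before the cross-camera clustering step.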
Suppose n_i people are detected in the i-th camera's view, at positions (x_j^i, y_j^i); through the corresponding projection matrix they are projected into the coordinate system of the ground. All points must then be clustered, i.e., the same person must be identified across the different cameras. This requires solving an optimization problem of the following form (the objective is given as a formula image in the original and is summarized here in prose). N_k denotes the total number of people detected by all cameras in frame k. S_{ij} denotes the degree of similarity between detections i and j and combines two factors. The first is the similarity between human-body features, K(Φ(i,k), Φ(j,k)), where Φ(i,k) is the color feature of the i-th person and K(a,b) computes a covariance coefficient. The second is positional similarity, expressed with the indicator function II(e) — II(e) = 1 if e is true and II(e) = 0 otherwise — and a distance-control coefficient δ. A_{ij} denotes the relation between detections i and j: if A_{ij} = 1 the two are the same person, and if A_{ij} = 0 they are not. The constraints encode that two people detected within the same camera cannot be the same person, and that each object appearing in one camera's view has at most one match in another camera's view. The triangle inequality in the last column expresses a loop constraint: if l and i are the same person and l and j are the same person, then i and j are also the same person. This optimization problem is an integer program whose global optimum cannot be solved exactly in a practical system, so the embodiment of the present invention designs an approximate method:
(1) First find the optimal group among all results (the one with the most cameras and a small relative position error). Concretely, the candidate set is first clustered using position and color information; the cluster-center feature information is then used for screening, and the remaining reliable elements are used to compute the optimal position. The concrete computation is given in the following section.
(2) Use the results in this group to determine the person's 3D coordinates, then use those coordinates to remove from the group the selected points with large deviation, select the points with small deviation among the remaining points, and remove the set. Concretely, the computation above yields the person's position and color features; elements in the remaining set that may belong to this person but were not assigned to this cluster by the earlier clustering are then found and removed from the candidate set, while elements in the cluster that do not belong to this person are removed — using color features and position — and returned to the candidate set.
(3) Repeat (1) and (2) until all points have been assigned to sets.
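The greedy grouping in steps (1)–(3) can be sketched as below. This is a hypothetical simplification for illustration only: appearance similarity is ignored and only ground-plane geometry is used, with at most one detection per camera admitted to a group.

```python
import numpy as np

def greedy_group(points, cams, radius):
    """Greedy approximation of the grouping step: repeatedly pick the
    point whose neighbourhood (within `radius`) covers the most distinct
    cameras, form a group with at most one point per camera (the closest),
    and remove the group from the pool.

    points: (N, 2) ground-plane positions; cams: length-N camera ids.
    """
    points = np.asarray(points, dtype=np.float64)
    pool = list(range(len(points)))
    groups = []
    while pool:
        best, best_members = None, []
        for i in pool:
            near = [j for j in pool
                    if np.linalg.norm(points[i] - points[j]) <= radius]
            members = {}
            # keep one point per camera, preferring the nearest
            for j in sorted(near, key=lambda j: np.linalg.norm(points[i] - points[j])):
                members.setdefault(cams[j], j)
            if best is None or len(members) > len(best_members):
                best, best_members = i, list(members.values())
        groups.append(best_members)
        pool = [j for j in pool if j not in best_members]
    return groups
```

In the full method, the group centroid would then give the person's 3D position, and outlier removal/re-admission would refine the group as described in step (2).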
Further, in one embodiment of the present invention, a Hough voting method is used, and the pedestrian's position is determined from the body's positions in the plurality of imaging devices and the devices' pose information.
Specifically, for the implementation of the Hough voting method: the intersection of the line segments obtained by projecting the body's direction from two cameras onto the ground is more likely to be the pedestrian's true position on the ground. Following this idea, the Hough voting principle can be adopted, jointly considering the body's positions in multiple cameras and those cameras' pose information to determine the pedestrian's position. Assume all camera views are level, i.e., in each view a person's head and feet have the same x coordinate. According to the projection formula, the foothold (x, y) and the point (x, y + ε) in the camera view are projected onto the ground, giving (x'_1, y'_1) and (x'_2, y'_2). The vector from (x'_1, y'_1) to (x'_2, y'_2) is then the projected direction, and the associated ratio gives the change of scale when camera point (x, y) is projected onto the ground, which will be used later when visualizing the tracking results. From this computation, for camera i with ground mapping matrix H_{i→g}, any point (x, y) has ground coordinates (x', y') and a projection direction obtained as above; letting ε → 0 gives w'_2 → w'_1 = w', and the projection direction follows in the limit. (The intermediate formulas are given as images in the original and are not reproduced here.)
As shown in Fig. 3, in the output of a real human detector the foothold estimate often carries some error — see the rectangular box for the second person from the right in the third camera.
In Fig. 4, circles denote Hough-voting results and stars the results of the traditional method; cam1 is drawn with solid line 4, cam2 with solid line 3, cam3 with solid line 2 and cam4 with solid line 1, the center of each line being the ground projection of the pedestrian's foothold in that camera. The results obtained by Hough voting generally appear where the projection directions of several cameras converge. For example, the person at the lower left in ground coordinates is detected in cam1, cam2 and cam3; the positions detected in cam1 and cam2 are accurate while the foothold detected by cam3 deviates considerably — yet in every camera the detected body direction is accurate, i.e., the left-right position of the bounding box is reliable. Note that on the ground plane the three cameras' center points do not coincide and differ widely in position, yet the three lines intersect almost at a single point, showing that the trustworthiness of the position determined by Hough voting is greatly improved. To handle cases such as the second box from the left in cam2, whose left-right localization is inaccurate, the algorithm uses the RANSAC idea: joining all the data together is not necessarily optimal; instead, an optimal combination of the data can be found whose result has the highest confidence and the smallest variance. In the plane, two lines determine a point, so the algorithm repeatedly selects two cameras from the set, solves for the corresponding position, evaluates the global loss function at that position, and keeps, over the combinations, the position with the smallest loss. This removes the influence of individual localization results with large errors and, when the number of cameras is limited (typically at most four), markedly improves localization accuracy.
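The two-line Hough voting with RANSAC-style pair selection can be sketched as follows. A minimal illustration under stated assumptions: each camera contributes a ground anchor point and a unit projection direction, every pair proposes a line intersection, and the proposal with the smallest total point-to-line distance wins; the exhaustive pair enumeration stands in for random sampling, and the function names are hypothetical.

```python
import itertools
import numpy as np

def line_point_distance(p, d, q):
    """Distance from point q to the line through p with unit direction d."""
    v = q - p
    return np.linalg.norm(v - (v @ d) * d)

def locate_by_voting(anchors, dirs):
    """Each camera yields a ground anchor and a projected body direction;
    every pair of lines proposes an intersection, and the proposal
    minimising the total distance to all lines wins.
    anchors: (N, 2); dirs: (N, 2) unit vectors."""
    best_pos, best_cost = None, np.inf
    for i, j in itertools.combinations(range(len(anchors)), 2):
        # solve anchors[i] + t*dirs[i] = anchors[j] + s*dirs[j]
        A = np.column_stack([dirs[i], -dirs[j]])
        if abs(np.linalg.det(A)) < 1e-9:
            continue  # near-parallel pair, no stable intersection
        t, _ = np.linalg.solve(A, anchors[j] - anchors[i])
        q = anchors[i] + t * dirs[i]
        cost = sum(line_point_distance(anchors[k], dirs[k], q)
                   for k in range(len(anchors)))
        if cost < best_cost:
            best_pos, best_cost = q, cost
    return best_pos
```

Because the winning position is evaluated against all lines, a single camera with a badly estimated foothold raises the cost of every proposal it participates in and is effectively outvoted.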
Fig. 5 shows the situation with seven people in the cameras' coverage area. Three of them are seen by all four cameras simultaneously: the two people right of center on the ground, and the man at the far left of cam1 and the far right of cam3. The remaining two people in the middle are seen by three cameras at once. Of the rest, apart from the topmost person, who is detected only in cam4, both appear in two camera views. The localization results in Fig. 6 show that the Hough voting method is very accurate, as can be seen from the relative positions of the people and the degree of convergence of the projection lines: except for the topmost person seen only by cam4, the projection lines of every camera for each pedestrian intersect almost at a single point. Two of the detection boxes carry larger errors. The first is the second, smaller box from the left in cam4: the feet are occluded, so the detection sits too high, and since this person is far from cam4 the error is magnified through the projection resolution — on the ground plane (the red line at the upper left, whose center is the result of projecting the cam4 foothold estimate to the ground) it differs from the true result by more than 100 pixels. Yet the directional error is small, and the line's extension almost passes through the position determined by the other three cameras. The rightmost box in cam2 also carries some foothold error; being far from cam2, the error is magnified through the resolution so that the foothold estimate is off by over 50 pixels in ground coordinates, but the projection-direction error is small. The final pedestrian localization therefore uses the information of two camera views simultaneously to achieve precise positioning.
Mathematical description: let E denote all pedestrians appearing in adjacent frames (given as a formula image in the original). Φ(i, k) is the color feature of the i-th person in frame k, K is a correlation function, and corresponding correlation functions are defined for position and velocity. θ1, θ2 and θ3 are threshold parameters that handle pedestrians disappearing and appearing and eliminate false matches. F^k denotes the adjacency matrix relating the current frame to the previous frame: if F^k_{ij} = 1 the two pedestrians are the same person, and if F^k_{ij} = 0 they are not. Note that the last constraint can be stated as: each row and each column of the matrix F^k contains at most one entry equal to 1.
The above problem can be converted into a minimum-cost-flow optimization, and the global optimum can be obtained with a minimum-cost-flow solver. Note, however, that a practical tracking problem requires real-time performance and causality: predicting the current frame may only consider earlier frames and must not be influenced by later results.
(1) First find the match with the highest confidence — the point with the least occlusion, where the crowd is sparse. Concretely, all pedestrians detected in the current frame are matched against the pedestrians of the previous frame, and the pair with the highest matching score is found.
(2) Remove it from the set E.
(3) Repeat the above on the remaining set.
(4) If the highest confidence in the current set falls below a given threshold, the remaining points are judged unrelated: either a previously tracked pedestrian has disappeared from the view, or a new person appears in the current frame.
This method obtains a feasible solution in fixed linear time and uses only the information of the current and preceding frames.
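Steps (1)–(4) above amount to greedy bipartite matching with a confidence cutoff, which can be sketched as follows (an illustrative simplification; the score matrix is assumed precomputed from the appearance/position/velocity correlations described earlier):

```python
def greedy_match(scores, threshold):
    """Causal greedy matching: scores[i][j] is the affinity between
    current detection i and previous-frame track j. Repeatedly take the
    highest-scoring unused pair; stop once scores fall below `threshold`.
    Leftover detections become new tracks; leftover tracks are marked lost."""
    pairs = sorted(((s, i, j)
                    for i, row in enumerate(scores)
                    for j, s in enumerate(row)), reverse=True)
    used_i, used_j, matches = set(), set(), []
    for s, i, j in pairs:
        if s < threshold:
            break  # remaining pairs are judged unrelated
        if i in used_i or j in used_j:
            continue
        matches.append((i, j))
        used_i.add(i)
        used_j.add(j)
    n_tracks = len(scores[0]) if scores else 0
    new_tracks = [i for i in range(len(scores)) if i not in used_i]
    lost = [j for j in range(n_tracks) if j not in used_j]
    return matches, new_tracks, lost
```

Each pass touches every pair once after the sort, so the per-frame cost is bounded and depends only on the current and previous frame, preserving causality.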
Further, in one embodiment of the present invention, the tracking further comprises: matching the tracking results against a pedestrian model to eliminate mismatches, occlusions and missed detections, where the pedestrian model includes one or more of: velocity, current position, color features, time of first appearance, trajectory and current state.
It will be appreciated that, since each tracking step only considers the relation between consecutive frames, false matches are fairly likely, and in real video the effects of occlusion, false detection and missed detection may cause problems such as tracking loss. For these problems, the embodiment of the present invention proposes a pedestrian model that makes full use of earlier tracking results, eliminates false matches, and allows a target to disappear for a short time, thereby handling occlusion and missed detection.
For example, each pedestrian model contains the following parameters:
(1) velocity: v
(2) current position: (x, y)
(3) color features: histogram statistics (hist)
(4) time of first appearance: T_appear
(5) trajectory (historical coordinates): the sequence of past positions (x_t, y_t)
(6) current state: state
Multi-target tracking then consists of matching the current frame against the already-constructed pedestrian models as described above. After the final matching result is obtained, each pedestrian model is updated. There are two cases: if a corresponding match is found in the current frame, the pedestrian is considered detected in this frame; if no acceptable match is found, the pedestrian is considered lost in this frame.
If detected, the information is updated as follows:
(1) velocity: v = α·v + (1-α)·v_new, where v_new = (x_new, y_new) - (x, y)
(2) position: (x, y) = (x, y) + v
(3) color features: hist = β·hist + (1-β)·hist_new
(4) current state: state = 1
(5) trajectory update: (x_t, y_t) = (x, y)
Here α is an exponential-smoothing term that smooths the pedestrian's velocity and reduces the influence of noise in each frame's estimation error on the tracking result. Note that the pedestrian's position is not updated directly with the current frame's position: the velocity is updated first and the position is then advanced by the velocity. The benefit is that earlier velocity information is used and, combined with a maximum-speed limit, problems caused by a mismatch in some frame are reduced; the cost is some lag — if the object's speed changes sharply, the model needs a longer time to correct itself — but since such situations rarely arise in practical tracking, this strategy does more good than harm. The choice of α requires balance: too small, and it provides no filtering effect; too large, and the response time to speed changes becomes very long. In experiments, values between 0.8 and 0.9 proved reasonable. The third rule corrects the color features of the pedestrian model; β is the correction coefficient, and long-term correction makes the model's color features better match the person's original ones. β should not be too small; in this experiment β = 0.99. The fourth item is the state update: the pedestrian model has two states, lost and active, and if the pedestrian was lost in the previous frame the state must be changed to active. Finally, this frame's position is recorded in the trajectory information.
If not detected, the information is updated as follows:
(1) velocity: v = γ·v, with 0 ≤ γ ≤ 1
(2) position: (x, y) = (x, y) + v
(3) color features: unchanged
(4) trajectory update: (x_t, y_t) = (x, y)
(5) current state: state = state - 1
There are two cases in which a pedestrian is not detected: the pedestrian has disappeared from the camera views, or occlusion, false detection or mismatching caused the pedestrian to be missed. In the first case it suffices to delete the pedestrian. In the second case all of the pedestrian's information must be kept, preparing as well as possible for the next matching. In practice, there is first a velocity-decay term γ: after being lost, the pedestrian continues moving at the previous velocity, so that at the next frame's matching it appears at a plausible position and is easily matched correctly; the velocity is additionally decayed, which increases the system's stability. In experiments γ should not be chosen too large: with no real observations to correct a lost pedestrian, moving too fast not only makes it hard for that person to be detected again but also disturbs the matching of others. Nor should it be too small, or the model soon halts in place after loss, causing the same problems; in practice γ = 0.9 is generally used. The position is then updated with the velocity, and likewise the current position is added to the trajectory. Last comes the very important state adjustment: after the updates above, state reflects the number of frames the pedestrian has been lost; if a pedestrian has not been activated for a fairly long period, the algorithm considers the pedestrian to have disappeared from the surveillance area permanently and removes the pedestrian from the list.
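The two update rules can be sketched together as a small state class (an illustration only; the class and method names are hypothetical, and the coefficient values follow those discussed in the text — α for velocity smoothing, β for color features, γ for decay):

```python
class PedestrianModel:
    """Minimal per-track state with the 'detected' and 'lost' update rules."""

    def __init__(self, pos, hist, alpha=0.85, beta=0.99, gamma=0.9):
        self.v = (0.0, 0.0)
        self.pos = pos
        self.hist = hist          # color histogram as a flat list
        self.traj = [pos]
        self.state = 1            # decreases while lost; prune when too low
        self.alpha, self.beta, self.gamma = alpha, beta, gamma

    def update_detected(self, new_pos, new_hist):
        # smooth the velocity, then advance the position by the smoothed
        # velocity rather than jumping to the raw detection
        vx = self.alpha * self.v[0] + (1 - self.alpha) * (new_pos[0] - self.pos[0])
        vy = self.alpha * self.v[1] + (1 - self.alpha) * (new_pos[1] - self.pos[1])
        self.v = (vx, vy)
        self.pos = (self.pos[0] + vx, self.pos[1] + vy)
        self.hist = [self.beta * h + (1 - self.beta) * n
                     for h, n in zip(self.hist, new_hist)]
        self.state = 1
        self.traj.append(self.pos)

    def update_lost(self):
        # coast at a decayed velocity; color features stay unchanged
        self.v = (self.gamma * self.v[0], self.gamma * self.v[1])
        self.pos = (self.pos[0] + self.v[0], self.pos[1] + self.v[1])
        self.traj.append(self.pos)
        self.state -= 1
```

A track whose state has fallen below a chosen cutoff would then be removed from the list, implementing the permanent-disappearance rule above.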
Finally, each camera's view is displayed together with the real-time tracking results, as shown in Fig. 7.
According to the spatiotemporal-constrained cross-camera multi-target tracking method proposed in the embodiments of the present invention, the information of multiple cameras is combined; the geometric information between cameras and the appearance and spatiotemporal features of the targets are considered jointly for more effective data fusion; Hough voting determines pedestrians' 3D positions, using camera priors to eliminate the inaccuracy of traditional foothold-based estimation; pedestrians' 3D positions are tracked directly for more effective human analysis; and a pedestrian model is introduced that aggregates multi-frame tracking results and considers pedestrians' spatial positions and walking trajectories, achieving more robust multi-target tracking. Human-body feature matching across cameras is performed with the imaging information and the geometric information to achieve target tracking, effectively combining current multi-target tracking algorithms with multi-camera processing methods and using the pose-relation matrix of the imaging-device network to achieve multi-camera multi-target object tracking, improving the robustness of object tracking while reducing tracking error and improving tracking accuracy.
The spatiotemporal-constrained cross-camera multi-target tracking device proposed according to embodiments of the present invention is described next with reference to the drawings.
Fig. 8 is a schematic structural diagram of the spatiotemporal-constrained cross-camera multi-target tracking device of an embodiment of the present invention.
As shown in Fig. 8, the spatiotemporal-constrained cross-camera multi-target tracking device 10 comprises: a preprocessing module 100, an acquisition module 200 and a tracking module 300.
The preprocessing module 100 performs image preprocessing in different color spaces so that pictures are consistent in color temperature and tone, to obtain imaging information from a plurality of imaging devices. The acquisition module 200 establishes correspondences between 2D points through the imaging devices' projection matrices to obtain the geometric information among the plurality of imaging devices, where each projection matrix relates the image to the 3D world. The tracking module 300 performs human-body feature matching across the cameras according to the imaging information and the geometric information, to obtain each imaging device's view and real-time tracking results using the appearance and spatiotemporal features of the tracked targets. The device 10 of the embodiment of the present invention combines current multi-target tracking algorithms with multi-camera processing methods and uses the pose-relation matrix of the imaging-device network to achieve multi-camera multi-target object tracking, improving the robustness of object tracking while reducing tracking error and improving tracking accuracy.
Further, in one embodiment of the present invention, the tracking module 300 is further configured such that, when any one of the plurality of imaging devices detects the tracking target, the tracking target's position is projected through the projection matrix into the coordinate system of the ground, and all points are clustered to identify the same tracking target in the other imaging devices among the plurality of imaging devices.
Further, in one embodiment of the present invention, the tracking module 300 is further configured to obtain the optimal group among all results, the optimal group being the one with the largest number of imaging devices and the smallest relative position error; to determine the tracking target's 3D coordinates from the optimal group; to remove from the group the selected points whose deviation from those 3D coordinates exceeds a first preset value; to select, among the remaining points, those whose deviation is below a second preset value; and to remove that set, until all points have been assigned to sets.
Further, in one embodiment of the present invention, the device 10 further comprises: a positioning module for using a Hough voting method and determining a pedestrian's position from the body's positions in the plurality of imaging devices and the devices' pose information.
Further, in one embodiment of the present invention, the device 10 further comprises: a matching module for matching the tracking results against a pedestrian model to eliminate mismatches, occlusions and missed detections, where the pedestrian model includes one or more of: velocity, current position, color features, time of first appearance, trajectory and current state.
It should be noted that the foregoing explanation of the method embodiments also applies to the spatiotemporal-constrained cross-camera multi-target tracking device of this embodiment and is not repeated here.
According to the spatiotemporal-constrained cross-camera multi-target tracking device proposed in the embodiments of the present invention, the information of multiple cameras is combined; the geometric information between cameras and the appearance and spatiotemporal features of the targets are considered jointly for more effective data fusion; Hough voting determines pedestrians' 3D positions, using camera priors to eliminate the inaccuracy of traditional foothold-based estimation; pedestrians' 3D positions are tracked directly for more effective human analysis; and a pedestrian model is introduced that aggregates multi-frame tracking results and considers pedestrians' spatial positions and walking trajectories, achieving more robust multi-target tracking. Human-body feature matching across cameras is performed with the imaging information and the geometric information to achieve target tracking, effectively combining current multi-target tracking algorithms with multi-camera processing methods and using the pose-relation matrix of the imaging-device network to achieve multi-camera multi-target object tracking, improving the robustness of object tracking while reducing tracking error and improving tracking accuracy.
In the description of the present invention, it should be understood that orientation or position terms such as "center", "longitudinal", "lateral", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise", "counterclockwise", "axial", "radial" and "circumferential" are based on the orientations or positions shown in the drawings, are used only for convenience and simplicity of description, and do not indicate or imply that the device or element referred to must have a particular orientation or be constructed and operated in a particular orientation; they are therefore not to be construed as limiting the present invention.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance, or as implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, such as two or three, unless specifically defined otherwise.
In the present invention, unless otherwise explicitly specified and defined, the terms "installation", "connected", "coupled", "fixed" and the like shall be understood broadly: a connection may, for example, be fixed, detachable, or integral; mechanical or electrical; direct, or indirect through an intermediate medium; and may be internal communication between two elements or an interaction between two elements. For those of ordinary skill in the art, the specific meanings of the above terms in the present invention can be understood on a case-by-case basis.
In the present invention, unless otherwise explicitly specified and defined, a first feature being "on" or "under" a second feature may mean that the two features are in direct contact, or in indirect contact through an intermediate medium. Moreover, a first feature being "over", "above" or "on top of" a second feature may mean that the first feature is directly above or obliquely above the second feature, or merely that the first feature is at a greater height than the second; a first feature being "under", "below" or "beneath" a second feature may mean that the first feature is directly below or obliquely below the second feature, or merely that the first feature is at a lesser height than the second.
In the description of this specification, reference to "one embodiment", "some embodiments", "an example", "a specific example" or "some examples" means that a particular feature, structure, material or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, such schematic expressions do not necessarily refer to the same embodiment or example, and the particular features, structures, materials or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. Moreover, without contradiction, those skilled in the art may join and combine the different embodiments or examples described in this specification and the features of those embodiments or examples.
Although embodiments of the present invention have been shown and described above, it will be understood that the above embodiments are exemplary and are not to be construed as limiting the present invention; those of ordinary skill in the art may make changes, modifications, substitutions and variants to the above embodiments within the scope of the present invention.

Claims (10)

  1. A spatiotemporal-constrained cross-camera multi-target tracking method, characterized by comprising the following steps:
    performing image preprocessing in different color spaces so that pictures are consistent in color temperature and tone, to obtain imaging information from a plurality of imaging devices;
    establishing correspondences between 2D points through the imaging devices' projection matrices to obtain the geometric information among the plurality of imaging devices, where each projection matrix relates the image to the 3D world; and
    performing human-body feature matching across the cameras according to the imaging information and the geometric information, to obtain each imaging device's view and real-time tracking results using the appearance and spatiotemporal features of the tracked targets.
  2. The spatiotemporal-constrained cross-camera multi-target tracking method according to claim 1, characterized in that performing human-body feature matching across the cameras according to the imaging information and the geometric information further comprises:
    when any one of the plurality of imaging devices detects a tracking target, projecting the tracking target's position through the projection matrix into the coordinate system of the ground; and
    clustering all points to identify the same tracking target in the other imaging devices among the plurality of imaging devices.
  3. The spatiotemporal-constrained cross-camera multi-target tracking method according to claim 2, characterized in that identifying the same tracking target in the other imaging devices further comprises:
    obtaining the optimal group among all results, the optimal group being the one with the largest number of imaging devices and the smallest relative position error; and
    determining the tracking target's 3D coordinates from the optimal group; removing from the group the selected points whose deviation from those 3D coordinates exceeds a first preset value; selecting, among the remaining points, those whose deviation is below a second preset value; and removing that set, until all points have been assigned to sets.
  4. The spatiotemporal-constrained cross-camera multi-target tracking method according to claim 1, characterized in that a Hough voting method is used, and a pedestrian's position is determined from the body's positions in the plurality of imaging devices and the devices' pose information.
  5. The spatiotemporal-constrained cross-camera multi-target tracking method according to any one of claims 1 to 4, characterized in that the tracking further comprises:
    matching the tracking results against a pedestrian model to eliminate mismatches, occlusions and missed detections, where the pedestrian model includes one or more of: velocity, current position, color features, time of first appearance, trajectory and current state.
  6. A spatiotemporal-constrained cross-camera multi-target tracking device, characterized by comprising:
    a preprocessing module for performing image preprocessing in different color spaces so that pictures are consistent in color temperature and tone, to obtain imaging information from a plurality of imaging devices;
    an acquisition module for establishing correspondences between 2D points through the imaging devices' projection matrices to obtain the geometric information among the plurality of imaging devices, where each projection matrix relates the image to the 3D world; and
    a tracking module for performing human-body feature matching across the cameras according to the imaging information and the geometric information, to obtain each imaging device's view and real-time tracking results using the appearance and spatiotemporal features of the tracked targets.
  7. The spatiotemporal-constrained cross-camera multi-target tracking device according to claim 6, characterized in that the tracking module is further configured such that, when any one of the plurality of imaging devices detects a tracking target, the tracking target's position is projected through the projection matrix into the coordinate system of the ground, and all points are clustered to identify the same tracking target in the other imaging devices among the plurality of imaging devices.
  8. The spatiotemporal-constrained cross-camera multi-target tracking device according to claim 7, characterized in that the tracking module is further configured to obtain the optimal group among all results, the optimal group being the one with the largest number of imaging devices and the smallest relative position error; to determine the tracking target's 3D coordinates from the optimal group; to remove from the group the selected points whose deviation from those 3D coordinates exceeds a first preset value; to select, among the remaining points, those whose deviation is below a second preset value; and to remove that set, until all points have been assigned to sets.
  9. The spatiotemporal-constrained cross-camera multi-target tracking device according to claim 6, characterized by further comprising:
    a positioning module for using a Hough voting method and determining a pedestrian's position from the body's positions in the plurality of imaging devices and the devices' pose information.
  10. The spatiotemporal-constrained cross-camera multi-target tracking device according to any one of claims 6 to 9, characterized by further comprising:
    a matching module for matching the tracking results against a pedestrian model to eliminate mismatches, occlusions and missed detections, where the pedestrian model includes one or more of: velocity, current position, color features, time of first appearance, trajectory and current state.
PCT/CN2017/115672 2017-05-19 2017-12-12 基于时空约束的跨镜头多目标跟踪方法及装置 WO2018209934A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710358354.7A CN107240124B (zh) 2017-05-19 2017-05-19 基于时空约束的跨镜头多目标跟踪方法及装置
CN201710358354.7 2017-05-19

Publications (1)

Publication Number Publication Date
WO2018209934A1 true WO2018209934A1 (zh) 2018-11-22

Family

ID=59985144

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/115672 WO2018209934A1 (zh) 2017-05-19 2017-12-12 基于时空约束的跨镜头多目标跟踪方法及装置

Country Status (2)

Country Link
CN (1) CN107240124B (zh)
WO (1) WO2018209934A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110942471A (zh) * 2019-10-30 2020-03-31 电子科技大学 一种基于时空约束的长时目标跟踪方法

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107240124B (zh) * 2017-05-19 2020-07-17 清华大学 基于时空约束的跨镜头多目标跟踪方法及装置
CN108921881A (zh) * 2018-06-28 2018-11-30 重庆邮电大学 一种基于单应性约束的跨摄像头目标跟踪方法
CN108876823B (zh) * 2018-07-02 2022-05-17 晋建志 基于时空连续性的单目跨相机多目标识别定位跟踪装置及方法
CN110969644B (zh) * 2018-09-28 2023-12-01 杭州海康威视数字技术股份有限公司 人员轨迹追踪方法、装置及系统
CN109558831B (zh) * 2018-11-27 2023-04-07 成都索贝数码科技股份有限公司 一种融合时空模型的跨摄像头行人定位方法
WO2020179730A1 (ja) * 2019-03-04 2020-09-10 日本電気株式会社 情報処理装置、情報処理方法、およびプログラム
CN110379050A (zh) * 2019-06-06 2019-10-25 上海学印教育科技有限公司 一种闸机控制方法、装置及系统
CN110428449B (zh) * 2019-07-31 2023-08-04 腾讯科技(深圳)有限公司 目标检测跟踪方法、装置、设备及存储介质
CN110728702B (zh) * 2019-08-30 2022-05-20 深圳大学 一种基于深度学习的高速跨摄像头单目标跟踪方法及系统
CN110706250B (zh) * 2019-09-27 2022-04-01 广东博智林机器人有限公司 一种对象的跟踪方法、装置、系统及存储介质
CN110807804B (zh) * 2019-11-04 2023-08-29 腾讯科技(深圳)有限公司 用于目标跟踪的方法、设备、装置和可读存储介质
CN111027462A (zh) * 2019-12-06 2020-04-17 长沙海格北斗信息技术有限公司 跨多摄像头的行人轨迹识别方法
CN111061825B (zh) * 2019-12-10 2020-12-18 武汉大学 一种蒙面和换装伪装身份的时空关系匹配关联识别方法
CN111738220B (zh) * 2020-07-27 2023-09-15 腾讯科技(深圳)有限公司 三维人体姿态估计方法、装置、设备及介质
CN111815682B (zh) * 2020-09-07 2020-12-22 长沙鹏阳信息技术有限公司 一种基于多轨迹融合的多目标跟踪方法
CN112907652B (zh) * 2021-01-25 2024-02-02 脸萌有限公司 相机姿态获取方法、视频处理方法、显示设备和存储介质
CN113223060B (zh) * 2021-04-16 2022-04-15 天津大学 基于数据共享的多智能体协同跟踪方法、装置及存储介质
CN113449627B (zh) * 2021-06-24 2022-08-09 深兰科技(武汉)股份有限公司 基于ai视频分析的人员跟踪方法及相关装置
CN114299120B (zh) * 2021-12-31 2023-08-04 北京银河方圆科技有限公司 补偿方法、注册方法和可读存储介质
CN115631464B (zh) * 2022-11-17 2023-04-04 北京航空航天大学 面向大时空目标关联的行人立体表示方法

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101226638A (zh) * 2007-01-18 2008-07-23 中国科学院自动化研究所 一种对多相机系统的标定方法及装置
CN104376577A (zh) * 2014-10-21 2015-02-25 南京邮电大学 基于粒子滤波的多摄像头多目标跟踪算法
CN104778690A (zh) * 2015-04-02 2015-07-15 中国电子科技集团公司第二十八研究所 一种基于摄像机网络的多目标定位方法
CN104899894A (zh) * 2014-03-05 2015-09-09 南京理工大学 一种采用多台摄像机进行运动目标跟踪的方法
CN106355604A (zh) * 2016-08-22 2017-01-25 湖南挚新科技发展有限公司 图像目标跟踪方法与系统
US20170109930A1 (en) * 2015-10-16 2017-04-20 Fyusion, Inc. Augmenting multi-view image data with synthetic objects using imu and image data
CN107240124A (zh) * 2017-05-19 2017-10-10 清华大学 基于时空约束的跨镜头多目标跟踪方法及装置

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102184242B (zh) * 2011-05-16 2013-08-14 天津大学 跨摄像头视频摘要提取方法
CN102831445B (zh) * 2012-08-01 2014-09-03 厦门大学 基于语义Hough变换和偏最小二乘法的目标检测方法
CN105631881B (zh) * 2015-12-30 2019-02-12 四川华雁信息产业股份有限公司 目标检测方法及装置

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101226638A (zh) * 2007-01-18 2008-07-23 中国科学院自动化研究所 一种对多相机系统的标定方法及装置
CN104899894A (zh) * 2014-03-05 2015-09-09 南京理工大学 一种采用多台摄像机进行运动目标跟踪的方法
CN104376577A (zh) * 2014-10-21 2015-02-25 南京邮电大学 基于粒子滤波的多摄像头多目标跟踪算法
CN104778690A (zh) * 2015-04-02 2015-07-15 中国电子科技集团公司第二十八研究所 一种基于摄像机网络的多目标定位方法
US20170109930A1 (en) * 2015-10-16 2017-04-20 Fyusion, Inc. Augmenting multi-view image data with synthetic objects using imu and image data
CN106355604A (zh) * 2016-08-22 2017-01-25 湖南挚新科技发展有限公司 图像目标跟踪方法与系统
CN107240124A (zh) * 2017-05-19 2017-10-10 清华大学 基于时空约束的跨镜头多目标跟踪方法及装置

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YAO, JIAN: "Multi-Camera Multi-Person 3D Space Tracking with MCMC in Surveillance Scenarios", M2SFA2 2008, 31 October 2008 (2008-10-31), pages 1 - 12, XP055612291 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110942471A (zh) * 2019-10-30 2020-03-31 电子科技大学 一种基于时空约束的长时目标跟踪方法
CN110942471B (zh) * 2019-10-30 2022-07-01 电子科技大学 一种基于时空约束的长时目标跟踪方法

Also Published As

Publication number Publication date
CN107240124B (zh) 2020-07-17
CN107240124A (zh) 2017-10-10

Similar Documents

Publication Publication Date Title
WO2018209934A1 (zh) 基于时空约束的跨镜头多目标跟踪方法及装置
WO2021196294A1 (zh) 一种跨视频人员定位追踪方法、系统及设备
US20220272313A1 (en) Methods for automatic registration of 3d image data
US9646212B2 (en) Methods, devices and systems for detecting objects in a video
Xu et al. A minimum error vanishing point detection approach for uncalibrated monocular images of man-made environments
US7583815B2 (en) Wide-area site-based video surveillance system
JP6273685B2 (ja) 追尾処理装置及びこれを備えた追尾処理システム並びに追尾処理方法
US9542753B2 (en) 3D reconstruction of trajectory
WO2020252974A1 (zh) 一种针对运动状态下的多目标对象追踪方法和装置
CN110009732B (zh) 基于gms特征匹配的面向复杂大尺度场景三维重建方法
CN108470356B (zh) 一种基于双目视觉的目标对象快速测距方法
CN108921881A (zh) 一种基于单应性约束的跨摄像头目标跟踪方法
Liu et al. Robust autocalibration for a surveillance camera network
CN111383204A (zh) 视频图像融合方法、融合装置、全景监控系统及存储介质
WO2022127181A1 (zh) 客流的监测方法、装置、电子设备及存储介质
CN107862713A (zh) 针对轮询会场的摄像机偏转实时检测预警方法及模块
CN106971381B (zh) 一种具有重叠视域的广角相机视野分界线生成方法
Lee et al. Vehicle counting based on a stereo vision depth maps for parking management
CN116152471A (zh) 基于视频流的厂区安全生产监管方法及其系统、电子设备
CN115880643A (zh) 一种基于目标检测算法的社交距离监测方法和装置
JP6548306B2 (ja) カメラの撮影画像に映る人物を追跡する画像解析装置、プログラム及び方法
JP2017182295A (ja) 画像処理装置
Zhang et al. 3D pedestrian tracking and frontal face image capture based on head point detection
TWI771857B (zh) 判斷人員進出場域的系統、方法及記錄媒體
Zhou et al. A spatiotemporal warping-based video synchronization method for video stitching

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17910317

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17910317

Country of ref document: EP

Kind code of ref document: A1