WO2016131300A1 - An adaptive cross-camera multi-target tracking method and system - Google Patents

An adaptive cross-camera multi-target tracking method and system

Info

Publication number
WO2016131300A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
target object
tracking
image
feature
Prior art date
Application number
PCT/CN2015/092765
Other languages
English (en)
French (fr)
Inventor
陆平
于慧敏
邓硕
郑伟伟
高燕
谢奕
汪东旭
Original Assignee
中兴通讯股份有限公司
Priority date
Filing date
Publication date
Application filed by 中兴通讯股份有限公司 (ZTE Corporation)
Publication of WO2016131300A1 publication Critical patent/WO2016131300A1/zh

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/20: Analysis of motion
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/10: Image acquisition modality
    • G06T2207/10016: Video; Image sequence
    • G06T2207/30: Subject of image; Context of image processing
    • G06T2207/30244: Camera pose

Definitions

  • Embodiments of the present invention relate to, but are not limited to, a cross-camera multi-target tracking technology in the field of computer vision, and in particular, to an adaptive cross-camera multi-target tracking method and system.
  • Multi-target matching and tracking is a hot and difficult problem in multi-camera surveillance networks with non-overlapping views and unknown topology. Cross-camera multi-target matching and tracking associates the same moving target across different surveillance cameras and is the basis for subsequent motion analysis and behavior understanding in the surveillance network; it is therefore an indispensable part of intelligent video surveillance.
  • Target tracking algorithms in the related art can generally be divided into generative model methods and discriminative model methods.
  • Tracking algorithms based on a generative model focus on accurately describing the features of the target and then search for the object with the greatest similarity as the target, while tracking algorithms based on a discriminative model focus on separating the target from the background, typically collecting background regions as negative samples and the target object as positive samples to train a binary classifier.
  • The advantage of this approach is that it uses background information to effectively weaken the influence of background objects similar to the target on the tracking algorithm, so tracking algorithms based on a discriminative model tend to achieve better tracking results.
  • However, neither kind of method considers judging the disappearance of the target, a situation that frequently arises in real scenes; most mainstream trackers only address tracking accuracy and lack an effective judgment mechanism for after the target disappears, so they are difficult to apply widely in practice.
  • Kalal et al., in "Tracking-learning-detection" (IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 34, pp. 1409-1422, 2012), proposed a method for judging the disappearance of a target object that first computes the optical flow of the target region and then judges whether the object has disappeared from the degree of chaos of the flow. However, optical flow estimation is an ill-posed problem and the computed flow is unreliable, so judging disappearance from optical flow is theoretically problematic, and the method also shows a high misjudgment rate in practical applications.
  • Traver et al., in "A review of log-polar imaging for visual perception in robotics" (Robotics and Autonomous Systems, Vol. 58, pp. 378-398, 2010), point out that log-polar transformed images closely match how the human eye perceives objects, and the human eye judges the disappearance of a target object very accurately; the observation properties of the human eye can therefore be imitated by using the log-polar transform to judge target disappearance. Meanwhile, Hare et al., in "Struck: Structured output tracking with kernels" (IEEE International Conference on Computer Vision, 2011), proposed an online SVM classifier framework based on structured output constraints that can effectively separate foreground targets from the background.
  • Embodiments of the present invention provide an adaptive cross-camera multi-target tracking method and system to solve the problem of low accuracy of cross-camera multi-target tracking in the related art.
  • An embodiment of the invention provides an adaptive cross-camera multi-target tracking method, the method comprising:
  • fixing the size of a tracking window and obtaining the position of the target object in the current video frame using a pre-established tracking model; changing the size of the tracking window at the obtained position and obtaining the scale of the target object using the tracking model; and updating the tracking model online according to the obtained scale of the target object;
  • according to the updated tracking model, performing a log-polar transformation on the image of the target object of the current video frame, performing mixture-of-Gaussians modeling on the log-polar transformed image of the target object, measuring the center offset and degree of change of the target object, and determining whether the target object has disappeared.
  • The foregoing method further includes:
  • extracting and saving features of the target object while tracking, and building a feature library of the target object; when the tracked target object leaves the video, matching and detecting the most similar target object in another video and continuing the tracking.
  • Performing the log-polar transformation on the image of the current-video-frame target object includes: transforming the image of the current-video-frame target object according to the following formula:

ρ = log√(x² + y²), θ = arctan(y/x)

  • where the horizontal axis and the vertical axis of the log-polar transformed image are ρ and θ.
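  • As an illustration only, the following sketch applies a log-polar mapping to a target patch with OpenCV; the function name, patch size and interpolation flags are assumptions of this sketch, not part of the claimed method.

```python
# Hedged sketch: log-polar transform of a target patch via OpenCV's warpPolar.
import cv2
import numpy as np

def to_log_polar(patch: np.ndarray) -> np.ndarray:
    """Map a patch to log-polar coordinates (rho, theta) about its center."""
    h, w = patch.shape[:2]
    center = (w / 2.0, h / 2.0)
    max_radius = float(np.hypot(w / 2.0, h / 2.0))
    return cv2.warpPolar(patch, (w, h), center, max_radius,
                         cv2.INTER_LINEAR + cv2.WARP_POLAR_LOG)

patch = np.random.randint(0, 256, (64, 64), dtype=np.uint8)  # stand-in target image
log_polar_patch = to_log_polar(patch)
```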
  • Performing mixture-of-Gaussians modeling on the log-polar transformed image of the target object includes: modeling the log-polar transformed image of the target object according to the following formula:

P(q_t(x, y)) = Σ_{i=1}^{H} w_{i,t}(x, y) · N(q_t(x, y); μ_{i,t}(x, y), σ²_{i,t}(x, y))

  • where q_t(x, y) is the pixel value of the target feature image at (x, y), H is the number of Gaussian models in the mixture, N() denotes a normal distribution, and w_{i,t}(x, y), μ_{i,t}(x, y) and σ²_{i,t}(x, y) are the weight, mean and variance of the i-th Gaussian model at coordinate (x, y) at time t.
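  • A minimal per-pixel mixture-of-Gaussians sketch in the spirit of the formula above is shown below; the number of components, the update rate and the initial variance are illustrative assumptions, and the usual ranking of components by weight/σ is simplified away.

```python
# Hedged sketch: per-pixel mixture of H Gaussians over a (log-polar) target image.
import numpy as np

H, LAM = 8, 0.05  # number of Gaussians per pixel and update rate (assumed values)

def init_mog(img):
    """One H-component mixture per pixel, seeded on the first frame."""
    shape = (H,) + img.shape
    weight = np.full(shape, 1.0 / H)
    mean = np.repeat(img[None].astype(np.float64), H, axis=0)
    var = np.full(shape, 900.0)  # broad initial variance (assumed)
    return weight, mean, var

def update_mog(img, weight, mean, var):
    """Match each pixel to a component within 2.5 sigma and update that component."""
    diff2 = (img[None] - mean) ** 2
    matched = diff2 < (2.5 ** 2) * var
    first = matched & (np.cumsum(matched, axis=0) == 1)  # first matching component only
    weight = (1 - LAM) * weight + LAM * first
    mean = np.where(first, (1 - LAM) * mean + LAM * img[None], mean)
    var = np.where(first, (1 - LAM) * var + LAM * diff2, var)
    foreground = ~matched.any(axis=0)  # no component explains the pixel
    return weight, mean, var, foreground
```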
  • Measuring the center offset and degree of change of the target object and determining whether the target object has disappeared includes:
  • using the mixture-of-Gaussians model of the target object to detect the image of the tracked target object in the next video frame and counting the ratio of non-target pixels to total pixels; if the ratio of non-target pixels to total pixels reaches or exceeds a preset threshold, the target object is judged to have disappeared.
  • The method further includes performing a temporal filtering operation when the ratio of non-target pixels to total pixels reaches or exceeds the preset threshold;
  • in that case, the target object is judged to have disappeared only when the ratio of non-target pixels to total pixels reaches or exceeds the preset threshold and the disappearance judgment condition of the temporal filtering is also satisfied.
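  • A hedged sketch of this two-stage test (per-frame ratio threshold followed by an M-of-K temporal filter) follows; the thresholds T1 and T2 and the window sizes K and M are illustrative values, not fixed by this passage.

```python
# Hedged sketch: disappearance decision with temporal filtering.
from collections import deque

T1, T2, K, M = 0.5, 0.5, 10, 6  # assumed thresholds and window sizes

class DisappearanceJudge:
    def __init__(self):
        self.window = deque(maxlen=K)
        self.triggered = False

    def step(self, n_foreground: int, n_total: int) -> bool:
        """Feed one frame's foreground count; return True once disappearance is confirmed."""
        ratio = n_foreground / max(n_total, 1)
        if not self.triggered:
            self.triggered = ratio >= T1  # suspicion only; start watching
            return False
        self.window.append(ratio >= T2)
        if len(self.window) == K:
            if sum(self.window) >= M:
                return True               # confirmed over the K-frame window
            self.triggered = False        # suspicion not confirmed; reset
            self.window.clear()
        return False
```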
  • Extracting and saving the features of the target object while tracking and building the feature library of the target object includes:
  • periodically extracting, from the image of the tracked target object, hue-saturation-value (HSV) block histogram features, edge orientation histogram features, dominant color features and histogram of oriented gradients (HOG) features, and fusing the four features to obtain the target feature, which is stored in the target feature library.
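  • The sketch below illustrates one plausible way to build and fuse such a four-part descriptor with OpenCV; the bin counts, block layout and dominant-color approximation are assumptions of this sketch rather than parameters given in the text.

```python
# Hedged sketch: fused HSV-block / edge-orientation / dominant-color / HOG descriptor.
import cv2
import numpy as np

def target_descriptor(bgr: np.ndarray) -> np.ndarray:
    bgr = cv2.resize(bgr, (64, 128))
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    parts = []
    # 1) HSV block histograms over a 2x2 grid
    for by in range(2):
        for bx in range(2):
            block = hsv[by * 64:(by + 1) * 64, bx * 32:(bx + 1) * 32]
            hist = cv2.calcHist([block], [0, 1, 2], None, [8, 4, 4],
                                [0, 180, 0, 256, 0, 256])
            parts.append(cv2.normalize(hist, None).flatten())
    # 2) edge-orientation histogram from Sobel gradients
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
    gx = cv2.Sobel(gray.astype(np.float32), cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray.astype(np.float32), cv2.CV_32F, 0, 1)
    eoh, _ = np.histogram(np.arctan2(gy, gx), bins=18, range=(-np.pi, np.pi),
                          weights=np.hypot(gx, gy))
    parts.append(eoh / (eoh.sum() + 1e-9))
    # 3) dominant colors, approximated by the mass of the three largest hue bins
    hue = cv2.calcHist([hsv], [0], None, [16], [0, 180]).flatten()
    parts.append(np.sort(hue)[-3:] / (hue.sum() + 1e-9))
    # 4) HOG over the whole 64x128 patch (OpenCV's default window size)
    parts.append(cv2.HOGDescriptor().compute(gray).flatten())
    return np.concatenate(parts)
```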
  • The foregoing method further includes:
  • when the capacity of the target feature library reaches its upper limit, replacing the oldest target feature in the target feature library with the new target feature.
  • When the tracked target object leaves the video, matching and detecting the most similar target object in another video and continuing the tracking includes:
  • extracting foreground objects from the entire matching video using a mixture-of-Gaussians model, computing the similarity between the tracked target and all foreground objects using the target feature library, and selecting the foreground object with the highest similarity as the target for cross-camera tracking.
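  • A hedged sketch of this matching rule follows; `target_descriptor` is the hypothetical feature function sketched earlier, and cosine similarity is one reasonable choice of similarity measure, not one mandated by the text.

```python
# Hedged sketch: pick the foreground object most similar to the target's templates.
import numpy as np

def cosine_similarity(f1: np.ndarray, f2: np.ndarray) -> float:
    return float(f1 @ f2 / (np.linalg.norm(f1) * np.linalg.norm(f2) + 1e-9))

def match_target(template_feats, foreground_patches):
    """Return (index, score) of the foreground object most similar to the target."""
    best_idx, best_score = -1, -1.0
    for i, patch in enumerate(foreground_patches):
        feat = target_descriptor(patch)
        # a foreground object is scored by its best match over all stored templates
        score = max(cosine_similarity(feat, t) for t in template_feats)
        if score > best_score:
            best_idx, best_score = i, score
    return best_idx, best_score
```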
  • An embodiment of the invention further provides an adaptive cross-camera multi-target tracking system, the system comprising:
  • an update module, configured to fix the size of the tracking window and obtain the position of the target object in the current video frame using a pre-established tracking model, change the size of the tracking window at the obtained position, obtain the scale of the target object using the tracking model, and update the tracking model online according to the obtained scale; and
  • a judgment module, configured to perform, according to the updated tracking model, a log-polar transformation on the image of the target object of the current video frame, perform mixture-of-Gaussians modeling on the log-polar transformed image of the target object, measure the center offset and degree of change of the target object, and determine whether the target object has disappeared.
  • The above system further includes:
  • a feature library building module of the target object, configured to extract and save features of the target object while tracking and build a feature library of the target object; and
  • a tracking module, configured to match and detect the most similar target object in another video when the tracked target object leaves the video, and continue the tracking.
  • The judgment module performing the log-polar transformation on the image of the current-video-frame target object includes: the judgment module transforms the image of the current-video-frame target object according to the formula ρ = log√(x² + y²), θ = arctan(y/x), where the horizontal axis and the vertical axis of the log-polar transformed image are ρ and θ.
  • The judgment module performing mixture-of-Gaussians modeling on the log-polar transformed image of the target object includes: the judgment module models the log-polar transformed image of the target object according to the formula P(q_t(x, y)) = Σ_{i=1}^{H} w_{i,t}(x, y) · N(q_t(x, y); μ_{i,t}(x, y), σ²_{i,t}(x, y)), where q_t(x, y) is the pixel value of the target feature image at (x, y), H is the number of Gaussian models, N() denotes a normal distribution, and w_{i,t}(x, y), μ_{i,t}(x, y) and σ²_{i,t}(x, y) are the weight, mean and variance of the i-th Gaussian model at (x, y) at time t.
  • The judgment module measuring the center offset and degree of change of the target object and determining whether the target object has disappeared includes:
  • the judgment module uses the mixture-of-Gaussians model of the target object to detect the image of the tracked target object in the next video frame and counts the ratio of non-target pixels to total pixels; if the ratio of non-target pixels to total pixels reaches or exceeds a preset threshold, the target object is judged to have disappeared.
  • The judgment module is further configured to perform a temporal filtering operation when the ratio of non-target pixels to total pixels reaches or exceeds the preset threshold;
  • in that case, the judgment module judges that the target object has disappeared only when the ratio of non-target pixels to total pixels reaches or exceeds the preset threshold and the disappearance judgment condition of the temporal filtering is also satisfied.
  • The feature library building module of the target object extracting and saving the features of the target object while tracking and building the feature library of the target object includes:
  • the feature library building module periodically extracts, from the image of the tracked target object, HSV block histogram features, edge orientation histogram features, dominant color features and HOG features, and fuses the four features to obtain the target feature, which is stored in the target feature library.
  • The feature library building module of the target object is further configured to replace the oldest target feature in the target feature library with the new target feature when the capacity of the target feature library reaches its upper limit.
  • The tracking module matching and detecting the most similar target object in another video and continuing the tracking includes:
  • the tracking module extracts foreground objects from the entire matching video using a mixture-of-Gaussians model, computes the similarity between the tracked target and all foreground objects using the target feature library, and selects the foreground object with the highest similarity as the target for cross-camera tracking.
  • An embodiment of the invention further provides a computer-readable storage medium storing program instructions which, when executed, implement the method described above.
  • The technical solution of the embodiments of the present application uses the online SVM framework to implement an adaptive scale-transform tracking algorithm and combines it with a disappearance judgment mechanism based on the log-polar transform and mixture-of-Gaussians modeling, effectively improving the accuracy and robustness of the tracking algorithm.
  • The technical solution of the embodiments of the present application can track multiple targets across cameras simultaneously, is highly robust, meets real-time requirements, and can be applied in real scenarios.
  • FIG. 1 is a schematic overall flow chart of a method according to an embodiment of the present invention;
  • FIG. 2 is a single-target tracking result diagram in Example 1;
  • FIG. 3 is a multi-target tracking result diagram in Example 2;
  • FIG. 4 is a single-target tracking result diagram in Example 3;
  • FIG. 5 is a cross-camera matching result diagram in Example 3;
  • FIG. 6 is a schematic structural diagram of a system according to an embodiment of the present invention.
  • This embodiment provides an adaptive cross-camera multi-target tracking method, which mainly includes the following operations:
  • first fixing the tracking window size and obtaining the position of the target object in the current video frame using a pre-established tracking model, then changing the size of the tracking window at the obtained position, obtaining the scale of the target object using the tracking model, and updating the tracking model online according to the obtained scale of the target object;
  • according to the updated tracking model, performing a log-polar transformation on the image of the target object of the current video frame, performing mixture-of-Gaussians modeling on the log-polar transformed image of the target object, measuring the center offset and degree of change of the target object, and determining whether the target object has disappeared.
  • In addition, when operating according to the above method, the features of the target object may also be extracted and saved while tracking to build a feature library of the target object; when the tracked target object leaves the video, the most similar target object is matched and detected in another video and tracking continues.
  • Measuring the center offset and degree of change of the target object and determining whether the target object has disappeared is divided into the following operations:
  • using the mixture-of-Gaussians model of the target object to detect the image of the tracked target object in the next video frame and counting the ratio of non-target pixels to total pixels; if the ratio of non-target pixels to total pixels reaches or exceeds a preset threshold, the target object is judged to have disappeared.
  • When the ratio of non-target pixels to total pixels reaches or exceeds the preset threshold, a temporal filtering operation may also be performed; in that case the target object is judged to have disappeared only when the ratio reaches or exceeds the preset threshold and the disappearance judgment condition of the temporal filtering is also satisfied, thereby improving the reliability of the judgment result.
  • Extracting and saving the features of the target object while tracking and building the feature library of the target object includes:
  • periodically extracting, from the image of the tracked target object, HSV (hue, saturation, value) block histogram features, edge orientation histogram features, dominant color features and HOG features, and fusing the four features to obtain the target feature, which is stored in the target feature library.
  • When the capacity of the target feature library reaches its upper limit, the oldest target feature in the target feature library may also be replaced with the new target feature.
  • When the tracked target object leaves the video, the most similar target object is matched and detected in another video and tracking continues as follows: a mixture-of-Gaussians model is used to extract foreground objects from the entire matching video, the target feature library is used to compute the similarity between the tracked target and all foreground objects, and the foreground object with the highest similarity is selected as the target for cross-camera tracking.
  • The technical solution of this embodiment mainly takes an online SVM classifier as the basic framework of the tracking algorithm. During tracking, the neighborhood around the target position of the previous frame is searched and the online SVM scores the regions within the search range; the region with the highest score is the position of the target in the current frame. The target scale is then tracked quickly and accurately by first searching for position and then searching for scale. After tracking, positive and negative samples are sampled around the target to update the target model online. Mixture-of-Gaussians modeling of the log-polar transformed target image measures the center offset and degree of change of the target, which is used to judge whether the target has been lost or has disappeared. During tracking, multiple target templates are extracted for each target and their features are computed for cross-camera matching. After the target leaves the video, the user specifies a video for cross-camera matching: each frame of the video to be matched is first processed with Gaussian modeling and foreground extraction, features are computed for every foreground object, the template features of each tracked target are then matched against the foreground features to obtain the corresponding similarities, and the object with the highest matching similarity in the whole video is taken as the target object and is tracked further. Repeating the above steps achieves cross-camera multi-target tracking.
  • the implementation process of the foregoing method is as shown in FIG. 1 and includes the following steps:
  • Step 100 Initialize input
  • the user enters the video to be tracked, the initial frame in which single or multiple tracking targets are located, and the initial rectangular frame.
  • Step 200 Tracking model establishment
  • The initial frame containing the target is read, and samples are collected within a range of R pixels from the target position; the rectangular box of each sample has the same size as the target box. Haar-like features are extracted for all samples and the box position of each sample is recorded. With the training samples obtained, the online SVM tracking model is obtained by solving the following convex optimization objective:

min_{w,ξ} (1/2)‖w‖² + C Σᵢ ξᵢ
s.t. ∀i, ∀y ≠ yᵢ: ⟨w, δΦᵢ(y)⟩ ≥ Δ(yᵢ, y) − ξᵢ, and ∀i: ξᵢ ≥ 0

  • where ξᵢ is a slack variable, C is a harmonic parameter, δΦᵢ(y) = Φ(xᵢ, yᵢ) − Φ(xᵢ, y), and the loss Δ(yᵢ, y) = 1 − s(yᵢ, y), with s(yᵢ, y) denoting the coverage (overlap) between the rectangular boxes at the two different locations.
  • h(x, y) in the resulting model computes a score; the higher the score, the higher the similarity to the target features and the more likely the candidate is the target object. A simplified training sketch follows.
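  • As a simplified stand-in for the structured-output SVM above (which the patent solves with SMO), the sketch below trains an ordinary linear SVM on boxes sampled around the target and labeled by overlap; the sampling radius, overlap cutoff and `feature_fn` are assumptions of this sketch.

```python
# Hedged sketch: a plain binary SVM as a stand-in for the structured-output tracker model.
import numpy as np
from sklearn.svm import LinearSVC

R = 40  # sampling radius in pixels (assumed)

def overlap(a, b):
    """Intersection-over-union between boxes given as (x, y, w, h)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[0] + a[2], b[0] + b[2]), min(a[1] + a[3], b[1] + b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    return inter / (a[2] * a[3] + b[2] * b[3] - inter)

def train_tracking_model(frame, target_box, feature_fn, n_samples=200, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    X, y = [], []
    for _ in range(n_samples):
        dx, dy = rng.integers(-R, R + 1, size=2)
        box = (target_box[0] + dx, target_box[1] + dy, target_box[2], target_box[3])
        if box[0] < 0 or box[1] < 0:
            continue  # sample fell outside the frame
        patch = frame[box[1]:box[1] + box[3], box[0]:box[0] + box[2]]
        if patch.shape[:2] != (box[3], box[2]):
            continue
        X.append(feature_fn(patch))
        y.append(1 if overlap(box, target_box) > 0.7 else 0)  # positive if near target
    clf = LinearSVC(C=100.0).fit(np.array(X), np.array(y))
    return clf  # clf.decision_function plays the role of the scorer h(x, y)
```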
  • Step 300 Tracking target template library establishment
  • For each tracked target, a target template library with capacity P is created. First, HSV block histogram features, edge orientation histogram features, dominant color features and HOG features are extracted from the initial target image; the four features are fused to obtain the target feature, which is added to the template library. Thereafter, the same features are extracted from the target image every Q frames and added to the library. When the number of templates in the library reaches the capacity limit, the oldest template other than the initial template is deleted and replaced with the newest one, as in the sketch below.
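  • A sketch of this capacity-limited template library follows; the capacity value and the class shape are assumptions of this sketch.

```python
# Hedged sketch: template library keeping the initial template, evicting the oldest otherwise.
class TemplateLibrary:
    def __init__(self, initial_feature, capacity=10):
        self.capacity = capacity
        self.templates = [initial_feature]  # slot 0 always holds the initial template

    def add(self, feature):
        if len(self.templates) == self.capacity:
            del self.templates[1]  # evict the oldest template except the initial one
        self.templates.append(feature)

# usage: every Q-th frame, library.add(target_descriptor(current_target_patch))
```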
  • Step 400 Disappearance judgment model establishment
  • After the image template of the initial target object is obtained, the target image is first transformed to log-polar coordinates, ρ = log√(x² + y²) and θ = arctan(y/x), where ρ and θ are the horizontal and vertical axes of the transformed image. The log-polar image of the target is then modeled with a mixture of Gaussians, P(q_t(x, y)) = Σ_{i=1}^{H} w_{i,t}(x, y)·N(q_t(x, y); μ_{i,t}(x, y), σ²_{i,t}(x, y)), where q_t(x, y) is the pixel value of the target feature image at (x, y), H is the number of Gaussian models, N() denotes a normal distribution, and w_{i,t}, μ_{i,t} and σ²_{i,t} are the weight, mean and variance of the i-th Gaussian model at (x, y) at time t.
  • Step 500 Input the next video frame
  • Step 600 Track the location and scale of the target
  • This step 600 further includes the following operations:
  • Step 600.1 Target position tracking
  • The scale of the search window is fixed to the target scale of the previous frame, and the initial search position is the target position of the previous frame. A traversal search is then performed in the current frame within a neighborhood of radius r around the initial search position; the online SVM scorer h(x, y) in the tracking model scores all search windows, and the highest-scoring window gives the location of the target in the current video frame.
  • Step 600.2 Target scale search
  • After the position of the target in the current frame is obtained, its scale, i.e. the length and width of the target rectangle, must be determined. The scale of the target in the previous frame is used as the initial search scale, and the initial scale is multiplied by a series of coefficients to obtain the scale search space. For each scale in the search space, an image is cropped at the target position as a candidate target image; haar-like features are extracted from these candidates, all candidates are scored with the online SVM scorer h(x, y), and the scale corresponding to the highest score is taken as the scale of the target in the current video frame. A sketch of this two-stage search follows.
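  • A hedged sketch of this position-then-scale search follows; `score_fn` stands in for the online SVM scorer h(x, y) and is assumed to extract features internally, and the stride and scale set are assumptions of this sketch.

```python
# Hedged sketch: search position at fixed scale, then rescore a few scales there.
import numpy as np

def track_position_then_scale(frame, prev_box, score_fn, r=30, stride=4,
                              scales=(0.95, 1.0, 1.05)):
    x0, y0, w, h = prev_box
    best, best_score = prev_box, -np.inf
    # stage 1: traverse a neighborhood of radius r at the previous frame's scale
    for dy in range(-r, r + 1, stride):
        for dx in range(-r, r + 1, stride):
            x, y = x0 + dx, y0 + dy
            if x < 0 or y < 0:
                continue
            patch = frame[y:y + h, x:x + w]
            if patch.shape[:2] != (h, w):
                continue
            s = score_fn(patch)
            if s > best_score:
                best, best_score = (x, y, w, h), s
    # stage 2: try a small set of scales at the best position found above
    x, y = best[:2]
    for sc in scales:
        nw, nh = int(round(w * sc)), int(round(h * sc))
        patch = frame[y:y + nh, x:x + nw]
        if patch.shape[:2] != (nh, nw):
            continue
        s = score_fn(patch)
        if s > best_score:
            best, best_score = (x, y, nw, nh), s
    return best
```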
  • Step 600.3 Tracking model update
  • The tracking model is updated using the online SVM classifier optimization method of step 200.
  • Step 600.4 Target template library update
  • The target template library is updated using the method described in step 300.
  • Step 700 Target disappearance judgment
  • Once the position and scale of the target in the current video frame are obtained, it must be judged whether the target has disappeared. The target image is first transformed to log-polar coordinates, and the Gaussian models in the disappearance judgment mechanism of step 400 then judge whether the target has disappeared: for each pixel, the H Gaussian models at that pixel are searched in order of rank; if the pixel value differs from a model's mean by less than 2.5 standard deviations it is assigned to that model, the pixel is labeled background if the summed weights of that model and all higher-ranked models exceed a threshold ε and foreground otherwise, and the matched model is updated. If no matching Gaussian model is found, the point is considered a foreground point, a new Gaussian model is constructed with that pixel value as its mean, and the lowest-ranked Gaussian model is replaced.
  • The Gaussian models are updated by:

w_{i,t+1} = (1 − λ)w_{i,t} + λM_{i,t}
μ_{t+1} = (1 − λ)μ_t + λq_t
σ²_{t+1} = (1 − λ)σ²_t + λ(q_t − μ_t)²

  • where λ is the update rate of the Gaussian models and M_{i,t} is 1 for the matched model and 0 otherwise.
  • At each frame, the ratio of pixels of the log-polar image judged as foreground to the total number of pixels is computed, η_t = n_t / N_t, where n_t is the number of foreground pixels of the log-polar image at time t and N_t is the total number of pixels of the log-polar image.
  • When this ratio exceeds a threshold T1, the disappearance judgment mechanism is triggered; at this point the target is suspected to have disappeared, but not with certainty, since the trigger may merely be a misjudgment caused by a change of the target's posture, so temporal filtering can be performed to reduce false positives: the following K frames are examined, and the target is considered to have truly disappeared only when in at least M (M ≤ K) of those K frames the ratio of foreground pixels to total pixels exceeds a threshold T2.
  • Once the target is judged to have truly disappeared, the updates of the tracking model and of the Gaussian models in the disappearance judgment mechanism are stopped to prevent errors from being introduced and accumulated, and the target position search range r of the tracking algorithm is enlarged to r2. The information that the target has disappeared can be returned to the upper-layer user. When the disappearance judgment model judges that the target has reappeared, i.e. when η_t < T3, the models of the tracking and disappearance judgment algorithms resume updating and the search range returns to normal.
  • Step 800 Return the target rectangle and the disappearance judgment result to the upper-layer user, and repeat steps 500 to 700 until the target leaves the video;
  • Step 900 Matching across camera targets
  • the step 900 further includes the following operations:
  • Step 900.1 Input of the next video to be matched
  • After the target has left the original video, the user specifies the next video that needs cross-camera matching.
  • Step 900.2 foreground target extraction
  • Background modeling and foreground extraction are performed on each frame of the input video using a mixed Gaussian model, and the connected regions are extracted from the foreground image to obtain all foreground objects in each frame.
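  • A sketch of this foreground-extraction step using OpenCV's built-in mixture-of-Gaussians background subtractor and connected components is shown below; the shadow threshold and minimum-area filter are assumptions of this sketch.

```python
# Hedged sketch: per-frame foreground object extraction for the video to be matched.
import cv2

def extract_foreground_objects(frames, min_area=200):
    """Yield (frame_index, [bounding box, ...]) for every frame of the matching video."""
    subtractor = cv2.createBackgroundSubtractorMOG2()
    for idx, frame in enumerate(frames):
        mask = subtractor.apply(frame)
        mask = cv2.threshold(mask, 127, 255, cv2.THRESH_BINARY)[1]  # drop shadow labels
        n, _, stats, _ = cv2.connectedComponentsWithStats(mask)
        boxes = [tuple(int(v) for v in stats[i, :4]) for i in range(1, n)
                 if stats[i, cv2.CC_STAT_AREA] >= min_area]
        yield idx, boxes
```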
  • Step 900.3 foreground target feature extraction and cross-camera matching
  • The features described in step 300 are extracted for all foreground objects. Then, for each tracked target, the target template library is matched against the features of these foreground objects to obtain matching similarities. Thus, for each tracked target there is a similarity curve over the video, any point of which represents the similarity measure between the tracked target and the most similar foreground object in the video at that time point. Finally, it suffices to find the foreground object and the time point corresponding to the peak of the curve, take that foreground object as the matched target object, and take that time point as the initial appearance time of the matched target.
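  • A hedged sketch of this peak-picking step follows; `match_target` is the hypothetical helper sketched earlier, and the curve is simply the best per-frame similarity.

```python
# Hedged sketch: build the per-frame similarity curve and take its peak.
def find_reappearance(template_feats, per_frame_foregrounds):
    """per_frame_foregrounds: iterable of (frame_index, [patch, ...]) pairs."""
    curve = []
    for idx, patches in per_frame_foregrounds:
        if not patches:
            curve.append((idx, None, -1.0))
            continue
        best_i, best_s = match_target(template_feats, patches)
        curve.append((idx, best_i, best_s))
    # the peak of the curve gives the matched object and its initial appearance time
    return max(curve, key=lambda item: item[2])
```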
  • Step 900.4 continue tracking
  • the target is tracked using the methods in steps 200 to 800 from the initial time at which the target appears.
  • Step 900.5 Continuous multi-camera target tracking
  • Repeating steps 900.1 to 900.5 allows the same target to be tracked continuously across the videos of multiple cameras.
  • Example 1: This example tracks a person's face in an indoor-scene video with severe occlusion, in which the person uses a book to occlude the face.
  • This embodiment includes the following steps:
  • Step 1 Initialize the input
  • The path of the video sequence and the initial tracking rectangle are input to the tracking algorithm; the target in the initial rectangle in this video is the person's face, as shown in FIG. 2a.
  • Step 2 Tracking model establishment
  • The initial frame containing the target is read, and samples are collected within a range of R = 40 pixels from the target position; the rectangular box of each sample has the same size as the target box. Haar-like features are extracted for all samples and the box position of each sample is recorded.
  • The online SVM tracking model is obtained by solving the convex optimization objective of step 200, with ξᵢ a slack variable, the harmonic parameter C = 100, and the Gaussian kernel k(x, x′) = e^(−γ‖x−x′‖²).
  • h(x, y) in the resulting model computes a score; the higher the score, the higher the similarity to the target features and the more likely the candidate is the target object.
  • Step 3 Disappearance judgment model establishment
  • The target image is transformed to log-polar coordinates, ρ = log√(x² + y²), θ = arctan(y/x), where ρ and θ are the horizontal and vertical axes of the transformed image, and the log-polar image of the target is modeled with a mixture of Gaussians as in step 400, where q_t(x, y) is the pixel value of the target feature image at (x, y), H is the number of Gaussian models and N() denotes a normal distribution.
  • In this example, H = 8, the initial weight w₀ = 0.05, the initial mean μ₀ is the pixel value of the target feature image, and the initial variance σ₀² = 900.
  • Step 4 Input the next video frame
  • Step 5 Track the position and scale of the target
  • The scale of the search window is fixed to the target scale of the previous frame and the initial search position is the target position of the previous frame; all search windows are scored with the online SVM scorer h(x, y) in the tracking model, and the highest-scoring window gives the position of the target in the current video frame.
  • After the position is obtained, the scale of the target, i.e. the length and width of the target rectangle, is determined: each scale of the scale search space is used to crop an image at the target position as a candidate target image, haar-like features are extracted from the candidates and scored with the online SVM scorer h(x, y), and the scale corresponding to the highest score is taken as the scale of the target in the current video frame.
  • The tracking model is updated using the online SVM classifier optimization method of step 2.
  • Step 6 Target disappearance judgment
  • The target image is first transformed to log-polar coordinates, and the Gaussian models in the disappearance judgment mechanism of step 3 then judge whether the target has disappeared: a pixel with no matching Gaussian model is considered a foreground point, a new Gaussian model is constructed with that pixel value as its mean, and the lowest-ranked Gaussian model is replaced.
  • The Gaussian models are updated by w_{i,t+1} = (1 − λ)w_{i,t} + λM_{i,t}, μ_{t+1} = (1 − λ)μ_t + λq_t and σ²_{t+1} = (1 − λ)σ²_t + λ(q_t − μ_t)², where λ is the update rate of the Gaussian models and M_{i,t} is 1 for the matched model and 0 otherwise; the foreground ratio η_t = n_t / N_t is then computed, where n_t is the number of pixels of the log-polar image judged as foreground by the Gaussian models and N_t is the total number of pixels of the log-polar image.
  • When this ratio exceeds the threshold T1, the disappearance judgment mechanism is triggered; at this point the target is suspected to have disappeared, but not with certainty, since the trigger may merely be a misjudgment caused by a change of the target's posture, so temporal filtering is performed to reduce false positives: the following K frames are examined, and the target is considered to have truly disappeared when in at least M (M ≤ K) of those frames the ratio of foreground pixels to total pixels exceeds the threshold T2.
  • Once the target is judged to have truly disappeared, the updates of the tracking model and of the Gaussian models in the disappearance judgment mechanism are stopped, and the target position search range r of the tracking algorithm is enlarged to r2. The information that the target has disappeared can be returned to the upper-layer user.
  • Step 7 Return the target rectangle and the disappearance judgment result to the upper-layer user, and repeat steps 4 to 6 until the video ends or the user interrupts.
  • FIG. 2 shows the result of this example; time runs from left to right and top to bottom, and the tracked target object (the person's face) is enclosed in the rectangular box.
  • A white rectangle indicates that the target object is being tracked normally, and a black rectangle indicates that the target has been judged to have disappeared. It can be seen that in this example the tracking algorithm tracks the target object accurately and, when the target is occluded over a large area by another object, correctly judges that the target has disappeared.
  • Example 2:
  • Step 1 Initialize the input
  • The video sequence is selected by the user and one or more targets to be tracked are selected; the tracking targets in this video are two pedestrians, as shown in FIG. 3a.
  • Step 2 Tracking model establishment
  • The initial frame containing the targets is read, and samples are collected within a range of R pixels from each target position; the rectangular box of each sample has the same size as the target box. Haar-like features are extracted for all samples and the box position of each sample is recorded.
  • The online SVM tracking model is obtained by solving the convex optimization objective of step 200, with ξᵢ a slack variable, δΦᵢ(y) = Φ(xᵢ, yᵢ) − Φ(xᵢ, y), a loss based on the coverage between rectangular boxes at two different locations, and the Gaussian kernel k(x, x′) = e^(−γ‖x−x′‖²).
  • h(x, y) in the resulting model computes a score; the higher the score, the higher the similarity to the target features and the more likely the candidate is the target object.
  • Step 3 Disappearance judgment model establishment
  • The target image is transformed to log-polar coordinates, ρ = log√(x² + y²), θ = arctan(y/x), where ρ and θ are the horizontal and vertical axes of the transformed image, and the log-polar image of the target is modeled with a mixture of Gaussians as in step 400, where q_t(x, y) is the pixel value of the target feature image at (x, y), H is the number of Gaussian models and N() denotes a normal distribution.
  • In this example, H = 8, the initial weight w₀ = 0.05, the initial mean μ₀ is the pixel value of the target feature image, and the initial variance σ₀² = 900.
  • Step 4 Input the next video frame
  • Step 5 Track the position and scale of the target
  • The scale of the search window is fixed to the target scale of the previous frame and the initial search position is the target position of the previous frame; all search windows are scored with the online SVM scorer h(x, y) in the tracking model, and the highest-scoring window gives the position of the target in the current video frame.
  • After the position is obtained, the scale of the target, i.e. the length and width of the target rectangle, is determined: each scale of the scale search space is used to crop an image at the target position as a candidate target image, haar-like features are extracted from the candidates and scored with the online SVM scorer h(x, y), and the scale corresponding to the highest score is taken as the scale of the target in the current video frame.
  • The tracking model is updated using the online SVM classifier optimization method of step 2.
  • Step 6 Target disappearance judgment
  • The target image is first transformed to log-polar coordinates, and the Gaussian models in the disappearance judgment mechanism of step 3 then judge whether the target has disappeared: a pixel with no matching Gaussian model is considered a foreground point, a new Gaussian model is constructed with that pixel value as its mean, and the lowest-ranked Gaussian model is replaced.
  • The Gaussian models are updated by w_{i,t+1} = (1 − λ)w_{i,t} + λM_{i,t}, μ_{t+1} = (1 − λ)μ_t + λq_t and σ²_{t+1} = (1 − λ)σ²_t + λ(q_t − μ_t)², where λ is the update rate of the Gaussian models and M_{i,t} is 1 for the matched model and 0 otherwise; the foreground ratio η_t = n_t / N_t is then computed, where n_t is the number of pixels of the log-polar image judged as foreground by the Gaussian models and N_t is the total number of pixels of the log-polar image.
  • When this ratio exceeds the threshold T1, the disappearance judgment mechanism is triggered; at this point the target is suspected to have disappeared, but not with certainty, since the trigger may merely be a misjudgment caused by a change of the target's posture, so temporal filtering is applied to reduce false positives: the following K frames are examined, and the target is considered to have truly disappeared when in at least M (M ≤ K) of those frames the ratio of foreground pixels to total pixels exceeds the threshold T2.
  • Once the target is judged to have truly disappeared, the updates of the tracking model and of the Gaussian models in the disappearance judgment mechanism are stopped, and the target position search range r of the tracking algorithm is enlarged to r2. The information that the target has disappeared is returned to the upper-layer user.
  • Step 7 Return the target rectangle and the disappearance judgment result to the upper-layer user, and repeat steps 4 to 6 until the video ends or the user interrupts.
  • FIG. 3 shows the result of this example; time runs from left to right and top to bottom, and the tracked target objects are enclosed in rectangular boxes.
  • The tracking targets are two pedestrians.
  • The tracking algorithm tracks both pedestrians accurately, and when the two target pedestrians occlude each other the tracker still does not drift, indicating that the tracking method of this embodiment is robust.
  • When one of the pedestrians leaves the video boundary, the tracking algorithm correctly detects that the target has left the boundary and stops tracking it. This example shows that the tracking method of the embodiment of the present invention can effectively track multiple targets and maintains tracking accuracy under occlusion.
  • Example 3: This example performs cross-camera tracking and matching of a student across two videos.
  • One video is an indoor scene in which one student is tracked; the other video is a corridor scene in which the tracking target from the previous video appears, along with many other students acting as distractors, and the target student is matched in this video.
  • This embodiment includes the following steps:
  • Step 1 Initialize the input
  • The video sequence is selected by the user, and the tracking target of interest is selected with a rectangular box in the video.
  • The tracking target is a male student, as shown in FIG. 4a.
  • Step 2 Tracking model establishment
  • The initial frame containing the target is read, and samples are collected within a range of R pixels from the target position; the rectangular box of each sample has the same size as the target box. Haar-like features are extracted for all samples and the box position of each sample is recorded.
  • The online SVM tracking model is obtained by solving the convex optimization objective of step 200, with ξᵢ a slack variable, δΦᵢ(y) = Φ(xᵢ, yᵢ) − Φ(xᵢ, y), a loss based on the coverage between rectangular boxes at two different locations, and the Gaussian kernel k(x, x′) = e^(−γ‖x−x′‖²).
  • h(x, y) in the resulting model computes a score; the higher the score, the higher the similarity to the target features and the more likely the candidate is the target object.
  • Step 3 Tracking target template library establishment
  • For each tracked target a template library with capacity P is created. HSV block histogram features, edge orientation histogram features, dominant color features and HOG features are extracted from the initial target image and fused to obtain the target feature, which is added to the template library; thereafter the same features are extracted every Q frames and added.
  • When the number of templates in the library reaches the capacity limit, the oldest template other than the initial template is deleted and replaced with the newest one.
  • Step 4 Disappearance judgment model establishment
  • The target image is transformed to log-polar coordinates, ρ = log√(x² + y²), θ = arctan(y/x), where ρ and θ are the horizontal and vertical axes of the transformed image, and the log-polar image of the target is modeled with a mixture of Gaussians as in step 400, where q_t(x, y) is the pixel value of the target feature image at (x, y), H is the number of Gaussian models and N() denotes a normal distribution.
  • In this example, H = 8, the initial weight w₀ = 0.05, the initial mean μ₀ is the pixel value of the target feature image, and the initial variance σ₀² = 900.
  • Step 5 Input the next video frame
  • Step 6 Track the position and scale of the target
  • The scale of the search window is fixed to the target scale of the previous frame and the initial search position is the target position of the previous frame; all search windows are scored with the online SVM scorer h(x, y) in the tracking model, and the highest-scoring window gives the position of the target in the current video frame.
  • After the position is obtained, the scale of the target, i.e. the length and width of the target rectangle, is determined: each scale of the scale search space is used to crop an image at the target position as a candidate target image, haar-like features are extracted from the candidates and scored with the online SVM scorer h(x, y), and the scale corresponding to the highest score is taken as the scale of the target in the current video frame.
  • The tracking model is updated using the online SVM classifier optimization method of step 2.
  • Step 7 Target disappearance judgment
  • The target image is first transformed to log-polar coordinates, and the Gaussian models in the disappearance judgment mechanism of step 4 then judge whether the target has disappeared: a pixel with no matching Gaussian model is considered a foreground point, a new Gaussian model is constructed with that pixel value as its mean, and the lowest-ranked Gaussian model is replaced.
  • The Gaussian models are updated by w_{i,t+1} = (1 − λ)w_{i,t} + λM_{i,t}, μ_{t+1} = (1 − λ)μ_t + λq_t and σ²_{t+1} = (1 − λ)σ²_t + λ(q_t − μ_t)², where λ is the update rate of the Gaussian models and M_{i,t} is 1 for the matched model and 0 otherwise; the foreground ratio η_t = n_t / N_t is then computed, where n_t is the number of pixels of the log-polar image judged as foreground by the Gaussian models and N_t is the total number of pixels of the log-polar image.
  • When this ratio exceeds the threshold T1, the disappearance judgment mechanism is triggered; at this point the target is suspected to have disappeared, but not with certainty, since the trigger may merely be a misjudgment caused by a change of the target's posture, so temporal filtering is performed to reduce false positives: the following K frames are examined, and the target is considered to have truly disappeared when in at least M (M ≤ K) of those frames the ratio of foreground pixels to total pixels exceeds the threshold T2.
  • Once the target is judged to have truly disappeared, the updates of the tracking model and of the Gaussian models in the disappearance judgment mechanism are stopped, and the target position search range r of the tracking algorithm is enlarged to r2. The information that the target has disappeared is returned to the upper-layer user.
  • Step 8 Return the target rectangle and the disappearance judgment result to the upper-layer user, and repeat steps 5 to 7 until the target leaves the video, as shown in FIG. 4d;
  • Step 9 Matching across camera targets
  • the user specifies the next video that needs to be matched across cameras, as shown in Figure 5a.
  • Background modeling and foreground extraction are performed on each frame of the input video using a mixed Gaussian model, and the connected regions are extracted from the foreground image to obtain all foreground objects in each frame.
  • The features described in step 3 are extracted for all foreground objects; then, for each tracked target, the target template library is matched against the features of these foreground objects to obtain matching similarities.
  • Thus there is a similarity curve over the video, any point of which represents the similarity measure between the tracked target and the most similar foreground object in the video at that time point; the foreground object and the time point corresponding to the peak of the curve give the matched target object and its initial appearance time.
  • the target is tracked using the methods in steps 2 to 8 from the initial time when the target appears.
  • FIG. 4 shows the tracking result of this example and FIG. 5 shows the cross-camera matching result; time runs from left to right and top to bottom.
  • The left side is the tracking video, in which the rectangular box encloses the tracked target object and the lower-left corner shows the disappearance judgment result for the tracked target; it can be seen that the tracking method of this example accurately tracks the student until he leaves the video boundary.
  • The upper-right corner shows four target template images; the right side is the video matched across cameras, in which the target student appears six times; the lower-right corner is the similarity curve, whose six peaks correspond to the six appearances of the tracked target in the video.
  • It can be seen that the cross-camera tracking method of this example tracks the target object accurately and successfully finds the target object in the other video through cross-camera matching; the tracking effect is good and the robustness is very strong.
  • This embodiment provides an adaptive cross-camera multi-target tracking system that can implement the method of Embodiment 1 above; the system includes at least the following modules:
  • an update module, configured to fix the tracking window size and obtain the position of the target object in the current video frame using a pre-established tracking model, change the size of the tracking window at the obtained position, obtain the scale of the target object using the tracking model, and update the tracking model online according to the obtained scale; and
  • a judgment module, configured to perform, according to the updated tracking model, a log-polar transformation on the image of the target object of the current video frame, perform mixture-of-Gaussians modeling on the log-polar transformed image of the target object, measure the center offset and degree of change of the target object, and determine whether the target object has disappeared.
  • The judgment module may transform the image of the current-video-frame target object to log-polar coordinates according to ρ = log√(x² + y²), θ = arctan(y/x), where ρ and θ are the horizontal and vertical axes of the transformed image.
  • The judgment module may also perform mixture-of-Gaussians modeling on the log-polar transformed image of the target object according to P(q_t(x, y)) = Σ_{i=1}^{H} w_{i,t}(x, y)·N(q_t(x, y); μ_{i,t}(x, y), σ²_{i,t}(x, y)), where q_t(x, y) is the pixel value of the target feature image at (x, y), H is the number of Gaussian models, N() denotes a normal distribution, and w_{i,t}, μ_{i,t} and σ²_{i,t} are the weight, mean and variance of the i-th Gaussian model at (x, y) at time t.
  • The judgment module measures the center offset and degree of change of the target object and determines whether the target object has disappeared as follows:
  • the judgment module uses the mixture-of-Gaussians model of the target object to detect the image of the tracked target object in the next video frame and counts the ratio of non-target pixels to total pixels; if the ratio of non-target pixels to total pixels reaches or exceeds a preset threshold, the target object is judged to have disappeared.
  • The judgment module is further configured to perform a temporal filtering operation when the ratio of non-target pixels to total pixels reaches or exceeds the preset threshold; that is, the target object is judged to have disappeared only when the ratio reaches or exceeds the preset threshold and the disappearance judgment condition of the temporal filtering is also satisfied.
  • The above system may further include a feature library building module of the target object, configured to extract and save features of the target object while tracking and build a feature library of the target object.
  • The feature library building module may periodically extract HSV block histogram features, edge orientation histogram features, dominant color features and HOG features from the image of the tracked target object, then fuse the four features to obtain the target feature and store it in the target feature library, thereby building the feature library of the target object.
  • The feature library building module of the target object is further configured to replace the oldest target feature in the target feature library with the new target feature when the capacity of the target feature library reaches its upper limit.
  • The above system may further include a tracking module configured to, when the tracked target object leaves the video, match and detect the most similar target object in another video and continue the tracking.
  • The tracking module extracts foreground objects from the entire matching video using a mixture-of-Gaussians model, computes the similarity between the tracked target and all foreground objects using the target feature library, and selects the foreground object with the highest similarity as the target for cross-camera tracking.
  • In implementation, the update module and the judgment module may constitute a target tracking unit and a cross-video target matching unit, and the feature library building module of the target object may be placed in a moving target detection unit.
  • The architecture of the entire system is shown in FIG. 6.
  • The technical solution of the embodiments of the present application achieves continuous tracking of multiple targets in a multi-camera surveillance network with non-overlapping views and unknown topology; it is highly robust and meets real-time requirements.
  • The technical solution organically combines the tracking algorithm with the disappearance judgment mechanism and uses the disappearance judgment to control the updates of the tracking model and of the disappearance judgment model, effectively reducing the introduction and accumulation of model errors and improving the accuracy and robustness of the tracking algorithm.
  • The technical solution uses the online SVM framework to implement an adaptive scale-transform tracking algorithm and combines it with the disappearance judgment mechanism based on the log-polar transform and mixture-of-Gaussians modeling, effectively improving the accuracy and robustness of the tracking algorithm.
  • The technical solution can track multiple targets across cameras simultaneously, is highly robust, meets real-time requirements, and can be applied in real scenarios.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

An adaptive cross-camera multi-target tracking method and system. The method includes: fixing the size of a tracking window and obtaining the position of the target object in the current video frame using a pre-established tracking model; changing the size of the tracking window at the obtained position and obtaining the scale of the target object using the tracking model; updating the tracking model online according to the obtained scale of the target object; and, according to the updated tracking model, performing a log-polar transformation on the image of the target object of the current video frame, performing mixture-of-Gaussians modeling on the log-polar transformed image of the target object, measuring the center offset and degree of change of the target object, and determining whether the target object has disappeared.

Description

An adaptive cross-camera multi-target tracking method and system

Technical Field

Embodiments of the present invention relate to, but are not limited to, cross-camera multi-target tracking technology in the field of computer vision, and in particular to an adaptive cross-camera multi-target tracking method and system.

Background

In intelligent video surveillance, the coverage of a single camera is very limited and cannot effectively cover an entire monitored scene. To achieve wider coverage, for example tracking a car through a city road network or searching for a suspicious target in a large railway station, multiple surveillance cameras usually need to work together. In multi-camera surveillance networks with non-overlapping views and unknown topology, multi-target matching and tracking is a hot and difficult problem. Cross-camera multi-target matching and tracking associates the same moving target across different surveillance cameras and is the basis of subsequent work in the surveillance network such as motion analysis and behavior understanding; it can therefore be said that cross-camera multi-target matching and tracking is an indispensable part of intelligent video surveillance.

Although cross-camera multi-target matching and tracking has received considerable attention and research, achieving an efficient and robust target tracking algorithm in real environments remains an extremely challenging problem. In real scenes the environment is complex and variable, and designing an efficient cross-camera target tracking algorithm requires solving many problems, including scene illumination changes, mutual occlusion, changes in target scale and pose, viewpoint differences between cameras, and parameter differences between cameras.

Target tracking algorithms in the related art can generally be divided into generative model methods and discriminative model methods. Tracking algorithms based on a generative model focus on accurately describing the features of the target and then search for the object with the greatest similarity as the target; tracking algorithms based on a discriminative model focus on separating the target from the background, typically collecting background regions as negative samples and the target object as positive samples to train a binary classifier. The advantage of this approach is that it uses background information and can effectively weaken the influence of background objects similar to the target on the tracking algorithm, so tracking algorithms based on a discriminative model tend to achieve better tracking results.

However, neither kind of method considers judging the disappearance of the target, a situation that frequently occurs in real scenes. At present, the vast majority of mainstream target tracking algorithms only consider tracking accuracy and lack an effective judgment mechanism for after the target disappears, so they are difficult to apply widely in real scenes. Kalal et al., in "Tracking-learning-detection" (IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 34, pp. 1409-1422, 2012), proposed a method for judging the disappearance of a target object that first computes the optical flow of the target region and then judges whether the object has disappeared from the degree of chaos of the flow. However, optical flow estimation is an ill-posed problem and the computed flow is unreliable, so judging disappearance from optical flow is theoretically problematic, and the method also shows a high misjudgment rate in practical applications. Traver et al., in "A review of log-polar imaging for visual perception in robotics" (Robotics and Autonomous Systems, Vol. 58, pp. 378-398, 2010), point out that log-polar transformed images closely match how the human eye perceives objects, and the human eye judges the disappearance of a target object very accurately; the observation properties of the human eye can therefore be imitated by using the log-polar transform to judge target disappearance. Meanwhile, Hare et al., in "Struck: Structured output tracking with kernels" (IEEE International Conference on Computer Vision, 2011), proposed an online SVM classifier framework based on structured output constraints that can effectively separate foreground targets from the background.
Summary of the Invention

The following is an overview of the subject matter described in detail herein. This overview is not intended to limit the scope of the claims.

Embodiments of the present invention provide an adaptive cross-camera multi-target tracking method and system to solve the problem of low cross-camera multi-target tracking accuracy in the related art.

An embodiment of the present invention provides an adaptive cross-camera multi-target tracking method, the method including:

fixing the size of a tracking window and obtaining the position of the target object in the current video frame using a pre-established tracking model; changing the size of the tracking window at the obtained position and obtaining the scale of the target object using the tracking model; updating the tracking model online according to the obtained scale of the target object;

according to the updated tracking model, performing a log-polar transformation on the image of the target object of the current video frame, performing mixture-of-Gaussians modeling on the log-polar transformed image of the target object, measuring the center offset and degree of change of the target object, and determining whether the target object has disappeared.

Optionally, the above method further includes:

extracting and saving features of the target object while tracking, and building a feature library of the target object; when the tracked target object leaves the video, matching and detecting the most similar target object in another video and continuing the tracking.
Optionally, in the above method, performing the log-polar transformation on the image of the current-video-frame target object includes: transforming the image of the current-video-frame target object according to the following formula:

ρ = log√(x² + y²), θ = arctan(y/x)

where the horizontal axis and the vertical axis of the log-polar transformed image are ρ and θ.

Optionally, in the above method, performing mixture-of-Gaussians modeling on the log-polar transformed image of the target object includes: modeling the log-polar transformed image of the target object according to the following formula:

P(q_t(x, y)) = Σ_{i=1}^{H} w_{i,t}(x, y) · N(q_t(x, y); μ_{i,t}(x, y), σ²_{i,t}(x, y))

where q_t(x, y) is the pixel value of the target feature image at (x, y), H is the number of Gaussian models, w_{i,t}(x, y) is the weight of the i-th Gaussian model at coordinate (x, y) at time t, N() denotes a normal distribution, μ_{i,t}(x, y) is the mean of the i-th Gaussian model at (x, y) at time t, and σ²_{i,t}(x, y) is its variance.
Optionally, in the above method, measuring the center offset and degree of change of the target object and determining whether the target object has disappeared includes:

using the mixture-of-Gaussians model of the target object to detect the image of the tracked target object in the next video frame and counting the ratio of non-target pixels to total pixels; if the ratio of non-target pixels to total pixels reaches or exceeds a preset threshold, judging that the target object has disappeared.

Optionally, the method further includes performing a temporal filtering operation when the ratio of non-target pixels to total pixels reaches or exceeds the preset threshold;

wherein judging that the target object has disappeared when the ratio of non-target pixels to total pixels reaches or exceeds the preset threshold includes: judging that the target object has disappeared only when the ratio of non-target pixels to total pixels reaches or exceeds the preset threshold and the disappearance judgment condition of the temporal filtering is satisfied.

Optionally, in the above method, extracting and saving the features of the target object while tracking and building the feature library of the target object includes:

periodically extracting, from the image of the tracked target object, hue-saturation-value (HSV) block histogram features, edge orientation histogram features, dominant color features and histogram of oriented gradients (HOG) features, and fusing the four features to obtain the target feature, which is stored in the target feature library.

Optionally, the above method further includes:

when the capacity of the target feature library reaches its upper limit, replacing the oldest target feature in the target feature library with the new target feature.

Optionally, in the above method, matching and detecting the most similar target object in another video when the tracked target object leaves the video and continuing the tracking includes:

extracting foreground objects from the entire matching video using a mixture-of-Gaussians model, computing the similarity between the tracked target and all foreground objects using the target feature library, and selecting the foreground object with the highest similarity as the target for cross-camera tracking.
An embodiment of the present invention further provides an adaptive cross-camera multi-target tracking system, the system including:

an update module, configured to fix the size of the tracking window and obtain the position of the target object in the current video frame using a pre-established tracking model, change the size of the tracking window at the obtained position, obtain the scale of the target object using the tracking model, and update the tracking model online according to the obtained scale of the target object; and

a judgment module, configured to perform, according to the updated tracking model, a log-polar transformation on the image of the target object of the current video frame, perform mixture-of-Gaussians modeling on the log-polar transformed image of the target object, measure the center offset and degree of change of the target object, and determine whether the target object has disappeared.

Optionally, the above system further includes:

a feature library building module of the target object, configured to extract and save features of the target object while tracking and build a feature library of the target object; and

a tracking module, configured to match and detect the most similar target object in another video when the tracked target object leaves the video, and continue the tracking.
Optionally, in the above system, the judgment module performing the log-polar transformation on the image of the current-video-frame target object includes: the judgment module transforms the image of the current-video-frame target object according to the following formula:

ρ = log√(x² + y²), θ = arctan(y/x)

where the horizontal axis and the vertical axis of the log-polar transformed image are ρ and θ.

Optionally, in the above system, the judgment module performing mixture-of-Gaussians modeling on the log-polar transformed image of the target object includes: the judgment module models the log-polar transformed image of the target object according to the following formula:

P(q_t(x, y)) = Σ_{i=1}^{H} w_{i,t}(x, y) · N(q_t(x, y); μ_{i,t}(x, y), σ²_{i,t}(x, y))

where q_t(x, y) is the pixel value of the target feature image at (x, y), H is the number of Gaussian models, w_{i,t}(x, y) is the weight of the i-th Gaussian model at coordinate (x, y) at time t, N() denotes a normal distribution, μ_{i,t}(x, y) is the mean of the i-th Gaussian model at (x, y) at time t, and σ²_{i,t}(x, y) is its variance.
Optionally, in the above system, the judgment module measuring the center offset and degree of change of the target object and determining whether the target object has disappeared includes:

the judgment module uses the mixture-of-Gaussians model of the target object to detect the image of the tracked target object in the next video frame and counts the ratio of non-target pixels to total pixels; if the ratio of non-target pixels to total pixels reaches or exceeds a preset threshold, the target object is judged to have disappeared.

Optionally, in the above system, the judgment module is further configured to perform a temporal filtering operation when the ratio of non-target pixels to total pixels reaches or exceeds the preset threshold;

the judgment module judging that the target object has disappeared when the ratio of non-target pixels to total pixels reaches or exceeds the preset threshold includes: the judgment module judges that the target object has disappeared only when the ratio of non-target pixels to total pixels reaches or exceeds the preset threshold and the disappearance judgment condition of the temporal filtering is satisfied.

Optionally, in the above system, the feature library building module of the target object extracting and saving the features of the target object while tracking and building the feature library of the target object includes:

the feature library building module of the target object periodically extracts, from the image of the tracked target object, hue-saturation-value (HSV) block histogram features, edge orientation histogram features, dominant color features and histogram of oriented gradients (HOG) features, and fuses the four features to obtain the target feature, which is stored in the target feature library.

Optionally, in the above system, the feature library building module of the target object is further configured to replace the oldest target feature in the target feature library with the new target feature when the capacity of the target feature library reaches its upper limit.

Optionally, in the above system, the tracking module matching and detecting the most similar target object in another video when the tracked target object leaves the video and continuing the tracking includes:

the tracking module extracts foreground objects from the entire matching video using a mixture-of-Gaussians model, computes the similarity between the tracked target and all foreground objects using the target feature library, and selects the foreground object with the highest similarity as the target for cross-camera tracking.

An embodiment of the present invention further provides a computer-readable storage medium storing program instructions which, when executed, implement the above method.
The technical solution of the embodiments of the present application uses the online SVM framework to implement an adaptive scale-transform tracking algorithm and combines it with a disappearance judgment mechanism based on the log-polar transform and mixture-of-Gaussians modeling, effectively improving the accuracy and robustness of the tracking algorithm. Moreover, the technical solution of the embodiments of the present application can track multiple targets across cameras simultaneously, is highly robust, meets real-time requirements, and can be applied in real scenarios.

Other aspects will become apparent upon reading and understanding the drawings and the detailed description.
Brief Description of the Drawings

FIG. 1 is a schematic overall flow chart of a method according to an embodiment of the present invention;

FIG. 2 is a single-target tracking result diagram in Example 1;

FIG. 3 is a multi-target tracking result diagram in Example 2;

FIG. 4 is a single-target tracking result diagram in Example 3;

FIG. 5 is a cross-camera matching result diagram in Example 3;

FIG. 6 is a schematic structural diagram of a system according to an embodiment of the present invention.
Detailed Description

The technical solutions of the embodiments of the present invention are described in detail below with reference to the drawings. It should be noted that, as long as there is no conflict, the embodiments of the present application and the features in the embodiments may be combined with one another arbitrarily.

Embodiment 1

This embodiment provides an adaptive cross-camera multi-target tracking method, which mainly includes the following operations:

first fixing the size of the tracking window and obtaining the position of the target object in the current video frame using a pre-established tracking model, then changing the size of the tracking window at the obtained position, obtaining the scale of the target object using the tracking model, and updating the tracking model online according to the obtained scale of the target object;

according to the updated tracking model, performing a log-polar transformation on the image of the target object of the current video frame, performing mixture-of-Gaussians modeling on the log-polar transformed image of the target object, measuring the center offset and degree of change of the target object, and determining whether the target object has disappeared.

In addition, when operating according to the above method, the features of the target object may also be extracted and saved while tracking to build a feature library of the target object; when the tracked target object leaves the video, the most similar target object is matched and detected in another video and tracking continues.

Measuring the center offset and degree of change of the target object and determining whether the target object has disappeared is divided into the following operations:

using the mixture-of-Gaussians model of the target object to detect the image of the tracked target object in the next video frame and counting the ratio of non-target pixels to total pixels; if the ratio of non-target pixels to total pixels reaches or exceeds a preset threshold, the target object is judged to have disappeared.

It should be noted that, when the ratio of non-target pixels to total pixels reaches or exceeds the preset threshold, a temporal filtering operation may also be performed; in that case the target object is judged to have disappeared only when the ratio reaches or exceeds the preset threshold and the disappearance judgment condition of the temporal filtering is satisfied, thereby improving the reliability of the judgment result.

Extracting and saving the features of the target object while tracking and building the feature library of the target object includes:

periodically extracting, from the image of the tracked target object, HSV (hue, saturation, value) block histogram features, edge orientation histogram features, dominant color features and HOG features, and fusing the four features to obtain the target feature, which is stored in the target feature library.

When the capacity of the target feature library reaches its upper limit, the oldest target feature in the target feature library may also be replaced with the new target feature.

When the tracked target object leaves the video, the most similar target object is matched and detected in another video and tracking continues as follows:

extracting foreground objects from the entire matching video using a mixture-of-Gaussians model, computing the similarity between the tracked target and all foreground objects using the target feature library, and selecting the foreground object with the highest similarity as the target for cross-camera tracking.

As can be seen from the above, the technical solution of this embodiment mainly takes an online SVM classifier as the basic framework of the tracking algorithm. During tracking, the neighborhood around the target position of the previous frame is searched and the online SVM scores the regions within the search range; the region with the highest score is the position of the target in the current frame. The target scale is then tracked quickly and accurately by first searching for position and then for scale. After tracking, positive and negative samples are sampled around the target to update the target model online. Mixture-of-Gaussians modeling of the log-polar transformed target image measures the center offset and degree of change of the target, which is used to judge whether the target has been lost or has disappeared. During tracking, multiple target templates are extracted for each target and their features are computed for cross-camera matching. After the target leaves the video, the user specifies a video for cross-camera matching: each frame of the video to be matched is first processed with Gaussian modeling and foreground extraction, features are computed for every foreground object, the template features of each tracked target are matched against the foreground features to obtain the corresponding similarities, and finally the object with the highest matching similarity in the whole video is taken as the target object and tracking of that target continues. Repeating the above steps achieves cross-camera multi-target tracking.
The implementation of the above method is shown in FIG. 1 and includes the following steps:

Step 100: Initialization input;

The user inputs the video to be tracked, the initial frame in which the single or multiple tracking targets are located, and the initial rectangular boxes.

Step 200: Tracking model establishment;

The initial frame containing the target is read, and samples are collected within a range of R pixels from the target position; the rectangular box of each sample has the same size as the target box. Haar-like features are extracted for all samples and the box position of each sample is recorded. With the training samples obtained, the online SVM tracking model is obtained by solving the following convex optimization objective:

min_{w,ξ} (1/2)‖w‖² + C Σᵢ ξᵢ

s.t. ∀i, ∀y ≠ yᵢ: ⟨w, δΦᵢ(y)⟩ ≥ Δ(yᵢ, y) − ξᵢ, and ∀i: ξᵢ ≥ 0

where ξᵢ is a slack variable, C is a harmonic parameter, δΦᵢ(y) = Φ(xᵢ, yᵢ) − Φ(xᵢ, y), and the loss Δ(yᵢ, y) = 1 − s(yᵢ, y), with s(yᵢ, y) denoting the coverage (overlap) between the rectangular boxes at the two different locations.
The structured-output SVM above is transformed with the Lagrangian dual method:

L(w, ξ, α, λ) = (1/2)‖w‖² + C Σᵢ ξᵢ − Σ_{i,y} αᵢ(y)[⟨w, δΦᵢ(y)⟩ − Δ(yᵢ, y) + ξᵢ] − Σᵢ λᵢ ξᵢ, with αᵢ(y) ≥ 0 and λᵢ ≥ 0.

Taking the derivatives of the objective with respect to the variables w and ξ and setting them to zero yields a set of conditions:

w = Σ_{i,y} αᵢ(y) δΦᵢ(y)

C = Σ_y αᵢ(y) + λᵢ, i.e. Σ_y αᵢ(y) ≤ C

Substituting these back into the Lagrangian dual problem gives the final dual problem:

max_α Σ_{i,y} Δ(yᵢ, y) αᵢ(y) − (1/2) Σ_{i,y} Σ_{j,ȳ} αᵢ(y) αⱼ(ȳ) ⟨δΦᵢ(y), δΦⱼ(ȳ)⟩

s.t. ∀i, ∀y: αᵢ(y) ≥ 0; ∀i: Σ_y αᵢ(y) ≤ C

Since this form differs somewhat from that of an ordinary SVM, a general-purpose SVM solver cannot be used directly. A parameter transformation is therefore applied so that it takes the same form as an ordinary SVM:

βᵢ(y) = −αᵢ(y) + C δ(y, yᵢ)

The dual problem obtained after this parameter transformation is:

max_β −Σ_{i,y} Δ(yᵢ, y) βᵢ(y) − (1/2) Σ_{i,y} Σ_{j,ȳ} βᵢ(y) βⱼ(ȳ) ⟨Φ(xᵢ, y), Φ(xⱼ, ȳ)⟩

s.t. ∀i, ∀y: βᵢ(y) ≤ C δ(y, yᵢ); ∀i: Σ_y βᵢ(y) = 0

where δ(y, yᵢ) = 1 if y = yᵢ, and δ(y, yᵢ) = 0 otherwise.

This optimization problem is solved with the classic SMO (sequential minimal optimization) algorithm. The tracking algorithm model finally obtained is:

h(x, y) = Σ_{i,ȳ} βᵢ(ȳ) ⟨Φ(xᵢ, ȳ), Φ(x, y)⟩

where the inner product is expressed by a kernel function, ⟨Φ(xᵢ, ȳ), Φ(x, y)⟩ = k((xᵢ, ȳ), (x, y)); this embodiment uses the Gaussian kernel k(x, x′) = e^(−γ‖x−x′‖²).

h(x, y) in the above formula computes a score; the higher the score, the higher the similarity to the target features and the more likely the candidate is the target object.
Step 300: Tracking target template library establishment;

For each tracked target, a target template library with capacity P is created. First, HSV block histogram features, edge orientation histogram features, dominant color features and HOG features are extracted from the initial target image; the four features are fused to obtain the target feature, which is added to the template library. Thereafter, the same features are extracted from the target image every Q frames and added to the library. When the number of templates in the library reaches the capacity limit, the oldest template other than the initial template is deleted and replaced with the newest one.
Step 400: disappearance judgment model establishment;
After the image template of the initial target object is obtained, the target image is first converted to log-polar coordinates according to:

$$\rho=\log\sqrt{(x-x_c)^2+(y-y_c)^2},\qquad \theta=\arctan\frac{y-y_c}{x-x_c}$$

where (x_c, y_c) is the center of the target image, and ρ and θ are the horizontal and vertical axes of the log-polar image.
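A direct sketch of this mapping, sampling the target patch on a (θ, ρ) grid centred on the patch; the output resolution is an assumption:

    import cv2
    import numpy as np

    def log_polar(img, out_size=(64, 64)):
        h, w = img.shape[:2]
        xc, yc = (w - 1) / 2.0, (h - 1) / 2.0
        rho_max = np.log(np.hypot(xc, yc) + 1e-6)
        rho = np.linspace(0.0, rho_max, out_size[1])          # log-radius axis
        theta = np.linspace(-np.pi, np.pi, out_size[0], endpoint=False)
        rr, tt = np.meshgrid(rho, theta)
        map_x = (xc + np.exp(rr) * np.cos(tt)).astype(np.float32)
        map_y = (yc + np.exp(rr) * np.sin(tt)).astype(np.float32)
        # sample the Cartesian image at the log-polar grid positions
        return cv2.remap(img, map_x, map_y, cv2.INTER_LINEAR,
                         borderMode=cv2.BORDER_CONSTANT)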
Gaussian mixture modeling is then applied to the log-polar image of the target:

$$P\big(q_t(x,y)\big)=\sum_{i=1}^{H}w_{i,t}(x,y)\,N\big(q_t(x,y);\ \mu_{i,t}(x,y),\ \sigma_{i,t}^2(x,y)\big)$$

where qt(x,y) is the pixel value of the target feature image at (x,y), H is the number of Gaussian models, wi,t(x,y) is the weight of the i-th Gaussian model at coordinate (x,y) at time t, N(·) is the normal distribution, μi,t(x,y) is the mean of the i-th Gaussian model at (x,y) at time t, and σ²i,t(x,y) is the variance of the i-th Gaussian model at (x,y) at time t.
Step 500: input the next video frame;
Step 600: track the position and scale of the target;
Step 600 comprises the following operations:
Step 600.1: target position tracking;
First fix the scale of the search window to the target scale of the previous frame, with the initial search position set to the target position of the previous frame. Then, in the current video frame, perform an exhaustive search within a neighborhood of size r around the initial search position, scoring all search windows with the online SVM scorer h(x,y) of the tracking model; the highest-scoring window is taken as the target position in the current video frame (see the sketch after Step 600.2).
Step 600.2: target scale search;
Once the position of the target in the current video frame is obtained, the scale of the target, i.e. the width and height of the target rectangle, must be determined. The target scale of the previous frame is used as the initial search scale and multiplied by a series of coefficients to form a scale search space; at the target position, a series of images is cropped at each scale of the search space as candidate target images. Haar-like features are then extracted from these candidate images and scored with the online SVM scorer h(x,y); the scale corresponding to the highest score is taken as the target scale in the current video frame.
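The following is a minimal sketch of this two-stage position-then-scale search, under stated assumptions: crop() extracts a window from the frame, haar_features() computes the Haar-like feature vector, and scorer() is the online SVM scorer h(x,y) from Step 200; all three are assumed helpers, and the scale-coefficient range is illustrative.

    import numpy as np

    def track_position_and_scale(frame, prev_box, scorer, r=20, coeffs=None):
        x, y, w, h = prev_box
        if coeffs is None:
            coeffs = [0.9 + 0.01 * i for i in range(21)]   # assumed 0.90 .. 1.10
        # Step 600.1: exhaustive position search at the previous frame's scale
        best_s, best_pos = -np.inf, (x, y)
        for dx in range(-r, r + 1):
            for dy in range(-r, r + 1):
                s = scorer(haar_features(crop(frame, (x + dx, y + dy, w, h))))
                if s > best_s:
                    best_s, best_pos = s, (x + dx, y + dy)
        # Step 600.2: scale search at the best position found above
        best_s, best_wh = -np.inf, (w, h)
        for c in coeffs:
            cw, ch = max(1, int(round(w * c))), max(1, int(round(h * c)))
            s = scorer(haar_features(crop(frame, (*best_pos, cw, ch))))
            if s > best_s:
                best_s, best_wh = s, (cw, ch)
        return (*best_pos, *best_wh)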
Step 600.3: tracking model update;
The tracking model is updated with the online SVM optimization procedure of Step 200.
Step 600.4: target template library update;
The target template library is updated with the method described in Step 300.
Step 700: target disappearance judgment;
Once the position and scale of the target in the current video frame have been obtained, it must be judged whether the target has disappeared. The target image is first converted to log-polar coordinates, and the Gaussian model of the disappearance judgment mechanism then decides whether the target has disappeared, as follows:
For each pixel of the input image, the H Gaussian models at that pixel are first sorted by

$$\frac{w_{i,t}(x,y)}{\sigma_{i,t}(x,y)}$$

and searched in descending order. If the difference between the pixel value and the mean of the current Gaussian model is less than 2.5 standard deviations, the pixel is considered to belong to that Gaussian model, and the sum of the weights

$$\sum_{k\,:\,\operatorname{rank}(k)\,\leq\,\operatorname{rank}(i)}w_{k,t}(x,y)$$

of this Gaussian model and of all Gaussian models ranked above it is computed. If this sum is greater than a threshold ε, the pixel is classified as a background point; otherwise it is classified as a foreground point, and the corresponding Gaussian model is updated. If no Gaussian model matches the pixel, the pixel is classified as a foreground point, a new Gaussian model is constructed with the pixel value as its mean, and it replaces the previously lowest-ranked Gaussian model.
The Gaussian models are updated as follows:

$$w_{i,t+1}=(1-\lambda)\,w_{i,t}+\lambda\,M_{i,t}$$

$$\mu_{i,t+1}=(1-\lambda)\,\mu_{i,t}+\lambda\,q_t$$

$$\sigma_{i,t+1}^2=(1-\lambda)\,\sigma_{i,t}^2+\lambda\,(q_t-\mu_{i,t})^2$$

where Mi,t equals 1 for the matched model and 0 otherwise, and the parameter λ is the update rate of the Gaussian models.
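A per-pixel sketch of this classification-and-update rule for a single pixel value q with component arrays w, mu and var; this is a minimal sketch, and the variance given to newly created components (900, as in the examples below) and the weight renormalisation after replacement are assumptions:

    import numpy as np

    def gmm_pixel_step(q, w, mu, var, lam=0.05, eps=0.7):
        order = np.argsort(-w / np.sqrt(var))        # rank components by w / sigma
        matched = -1
        for i in order:
            if abs(q - mu[i]) < 2.5 * np.sqrt(var[i]):
                matched = i
                break
        if matched < 0:
            worst = order[-1]                        # no match: replace lowest-ranked
            w[worst], mu[worst], var[worst] = lam, q, 900.0
            w /= w.sum()
            return True                              # foreground point
        rank = int(np.where(order == matched)[0][0])
        background = w[order[:rank + 1]].sum() > eps
        m = np.zeros_like(w)
        m[matched] = 1.0
        w[:] = (1.0 - lam) * w + lam * m             # weight update with M_{i,t}
        mu[matched] = (1.0 - lam) * mu[matched] + lam * q
        var[matched] = (1.0 - lam) * var[matched] + lam * (q - mu[matched]) ** 2
        return not background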
Then, in every frame, the number of pixels of the current log-polar image classified as foreground by the Gaussian model is counted and its proportion of the total number of pixels is computed:

$$\eta_t=\frac{n_t}{N_t}$$

where nt is the number of pixels of the log-polar image classified as foreground by the Gaussian model at time t, and Nt is the total number of pixels of the log-polar image.
When this proportion exceeds a threshold T1, the disappearance judgment mechanism is triggered: the target is suspected to have disappeared, but not with certainty, because the alarm may merely be a false judgment caused by a change in the target's pose. Temporal filtering can therefore be applied to reduce false judgments. The disappearance judgment condition is as follows: the subsequent K frames are examined, and the target is considered to have truly disappeared when, in at least M (M ≤ K) of those K frames, the proportion of foreground pixels exceeds a threshold T2:

$$\sum_{k=t+1}^{t+K}\mathbb{1}\big(\eta_k>T_2\big)\geq M$$

Once the target is judged to have truly disappeared, updates of the tracking model and of the Gaussian model in the disappearance judgment mechanism are suspended to prevent errors from being introduced and accumulated, and the target position search range r of the tracking algorithm is enlarged to r2. The information that the target has disappeared can also be returned to the upper-layer user. When the disappearance judgment model decides that the target has reappeared, i.e. when ηt < T3, the models of the tracking algorithm and the disappearance judgment algorithm resume updating, and the search range of the tracking algorithm returns to normal (see the sketch below).
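The trigger, temporal filter and freeze/resume behaviour can be sketched as a small state machine; the class layout is illustrative, with the thresholds named after T1, T2, T3, K and M above:

    from collections import deque

    class DisappearanceJudge:
        def __init__(self, T1=0.4, T2=0.3, T3=0.1, K=4, M=2):
            self.T1, self.T2, self.T3, self.K, self.M = T1, T2, T3, K, M
            self.window = None            # per-frame flags collected after a trigger
            self.disappeared = False      # while True, model updates are suspended

        def update(self, eta):
            # eta = n_t / N_t, foreground ratio of the current log-polar image
            if self.disappeared:
                if eta < self.T3:         # target re-appeared: resume updates
                    self.disappeared = False
                return self.disappeared
            if self.window is None:
                if eta > self.T1:         # suspicion triggered: start filtering
                    self.window = deque(maxlen=self.K)
                return False
            self.window.append(eta > self.T2)
            if len(self.window) == self.K:
                self.disappeared = sum(self.window) >= self.M
                self.window = None
            return self.disappeared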
Step 800: return the target rectangle and the disappearance judgment result to the upper-layer user, and repeat Steps 500 to 700 until the target leaves the video;
Step 900: cross-camera target matching;
Step 900 comprises the following operations:
Step 900.1: input of the next video to be matched;
After the target leaves the original video, the user designates the next video in which cross-camera matching is to be performed.
Step 900.2: foreground target extraction;
A Gaussian mixture model performs background modeling and foreground extraction on every frame of the input video, and connected regions are extracted from the foreground image to obtain all foreground objects in each frame.
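A minimal sketch of this step using OpenCV's built-in Gaussian-mixture background subtractor and connected-component analysis; the shadow threshold and minimum area are assumptions:

    import cv2

    subtractor = cv2.createBackgroundSubtractorMOG2()

    def extract_foreground_objects(frame, min_area=200):
        mask = subtractor.apply(frame)                       # GMM background model
        mask = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)[1]  # drop shadows
        n, labels, stats, _ = cv2.connectedComponentsWithStats(mask)
        boxes = []
        for i in range(1, n):                                # label 0 is background
            x, y, w, h, area = stats[i]
            if area >= min_area:
                boxes.append((x, y, w, h))
        return boxes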
Step 900.3: foreground feature extraction and cross-camera matching;
The features described in Step 300 are then extracted from all foreground objects. For each tracking target, the target template library is matched against these foreground features to obtain matching similarities. In this way, each tracking target has a similarity curve over the video, any point of which represents the similarity measure between the tracking target and the most similar foreground object in the video at that time point. Finally, it suffices to find the foreground object and time point corresponding to the peak of the curve: that foreground object is taken as the matched target object, and that time point as the initial appearance time of the matched target.
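A sketch of this peak-picking procedure, assuming a per-frame list of (feature, box) pairs for the foreground objects and a similarity() helper comparing two fused feature vectors; both are assumptions of the sketch:

    import numpy as np

    def match_across_camera(per_frame_foregrounds, templates, similarity):
        curve, best_boxes = [], []
        for objs in per_frame_foregrounds:         # objs: [(feature, box), ...]
            best_s, best_b = -np.inf, None
            for feat, box in objs:
                s = max(similarity(feat, t) for t in templates)
                if s > best_s:
                    best_s, best_b = s, box
            curve.append(best_s)
            best_boxes.append(best_b)
        t0 = int(np.argmax(curve))                 # curve peak = initial appearance
        return t0, best_boxes[t0], curve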
Step 900.4: continued tracking;
After the target object in this video has been matched, the target is tracked from its initial appearance time using the methods of Steps 200 to 800.
Step 900.5: continuous multi-camera target tracking.
Repeating Steps 900.1 to 900.5 yields continuous tracking of the same target across the videos of multiple cameras.
The implementation of the above method is illustrated below with application scenarios.
Example 1:
This example tracks a face in an indoor-scene video with severe occlusion, in which the person uses a book to occlude the face. The example comprises the following steps:
Step 1: initialization of input;
The path of the video sequence and the initial tracking rectangle are input to the tracking algorithm; the target in the initial rectangle of this video is the person's face, as shown in panel 2a of Figure 2.
Step 2: tracking model establishment;
This step follows Step 200 above: samples are collected within R = 40 pixels of the target position, with sample rectangles of the same size as the target rectangle; Haar-like features are extracted from all samples and each sample's rectangle position is recorded; the online SVM tracking model is then obtained by solving the same structured-output SVM problem, transformed by Lagrangian duality and solved with the SMO algorithm. In this example the trade-off parameter is C = 100, and the Gaussian kernel k(x,x′) = e^(−γ‖x−x′‖²) is used with γ = 0.2.
Step 3: disappearance judgment model establishment;
This step follows Step 400 above: the image template of the initial target object is converted to log-polar coordinates and a Gaussian mixture model is built on the log-polar image. In this example, H = 8 Gaussian models are used, with initial weight w0 = 0.05, initial mean μ0 equal to the pixel value of the target feature image, and initial variance σ0 = 900.
Step 4: input the next video frame;
Step 5: track the position and scale of the target;
Position tracking, scale search and tracking model update follow Steps 600.1 to 600.3 above, with an exhaustive position search within a neighborhood of size r = 20 and scale coefficients S = 0.9 + 0.01i, i = 0, 1, 2, …, which are multiplied by the previous frame's target scale to form the scale search space.
Step 6: target disappearance judgment;
The disappearance judgment follows Step 700 above: the target image is converted to log-polar coordinates, each pixel is classified by the sorted Gaussian models and the models are updated with update rate λ, the foreground proportion ηt = nt/Nt is computed, and temporal filtering is applied once ηt exceeds T1. In this example the thresholds are T1 = 0.4 and T2 = 0.3, with K = 4 and M = 2. Once the target is judged to have truly disappeared, model updates are suspended and the search range is enlarged, and they resume when ηt < T3, as described in Step 700. In this example the search ranges are r = 20 and r2 = 30, and the reappearance threshold is T3 = 0.1.
Step 7: return the target rectangle and the disappearance judgment result to the upper-layer user, and repeat Steps 4 to 6 until the video ends or the user interrupts.
Figure 2 shows the results of this example in temporal order from left to right and top to bottom. The rectangle encloses the tracked target object, in this example a person's face; a white rectangle indicates that the target is being tracked normally, and a black rectangle indicates that the target has disappeared. As can be seen, the tracking algorithm tracks the target object accurately, and when the target is occluded over a large area by another object, its disappearance is judged correctly.
Example 2:
This example tracks two pedestrians simultaneously in a road video with large illumination changes and low target resolution. The example comprises the following steps:
Step 1: initialization of input;
The user selects the video sequence and draws boxes around the one or more targets to be tracked; the tracking targets in this video are two pedestrians, as shown in Figure 3a.
Steps 2 and 3: tracking model and disappearance judgment model establishment;
These steps are identical to Steps 2 and 3 of Example 1, with the same parameters: samples collected within R = 40 pixels of the target position, C = 100, Gaussian kernel with γ = 0.2, and a mixture of H = 8 Gaussian models with initial weight w0 = 0.05, initial mean μ0 equal to the pixel value of the target feature image, and initial variance σ0 = 900.
Step 4: input the next video frame;
Step 5: track the position and scale of the target;
Position tracking, scale search and tracking model update proceed as in Example 1, with search neighborhood r = 20 and scale coefficients S = 0.9 + 0.01i, i = 0, 1, 2, ….
Step 6: target disappearance judgment;
The disappearance judgment proceeds as in Example 1, with thresholds T1 = 0.4, T2 = 0.3, K = 4 and M = 2; the search ranges are r = 20 and r2 = 30, and the reappearance threshold is T3 = 0.1.
Step 7: return the target rectangles and the disappearance judgment results to the upper-layer user, and repeat Steps 4 to 6 until the video ends or the user interrupts.
Figure 3 shows the results of this example in temporal order from left to right and top to bottom; the rectangles enclose the tracked target objects, here two pedestrians. As can be seen, the tracking algorithm tracks the two target pedestrians accurately; even when they occlude each other, neither is lost, which demonstrates the strong robustness of the tracking method of this example. In the last image of Figure 3, one pedestrian leaves the video boundary; the tracking algorithm correctly detects that the target has left the boundary and stops tracking it. This example shows that the tracking method of the embodiment of the present invention can effectively track multiple targets and remains accurate under occlusion.
Example 3:
This example performs cross-camera tracking and matching of students in two videos. One video is an indoor scene in which a student is tracked; the other is a corridor scene in which the tracking target of the first video appears, together with many other students acting as distractors, and the target student is matched in it. The example comprises the following steps:
Step 1: initialization of input;
The user selects the video sequence and draws a rectangle around the tracking target of interest; the tracking target in this video is a male student, as shown in Figure 4a.
Step 2: tracking model establishment;
This step is identical to Step 2 of Example 1: samples are collected within R = 40 pixels of the target position, Haar-like features are extracted, and the online SVM tracking model is obtained with the same dual formulation and SMO solver, with C = 100 and Gaussian kernel parameter γ = 0.2.
Step 3: tracking target template library establishment;
A target template library of capacity P = 4 is established for the tracking target. First, HSV block histogram features, edge orientation histogram features, main color features and HOG features are extracted from the initial target image, fused into a target feature and added to the template library; the same features are then extracted from the target image every Q = 20 frames and added to the library. When the number of templates reaches the capacity limit, the earliest template other than the initial template is deleted and replaced with the newest template.
Step 4: disappearance judgment model establishment;
This step is identical to Step 3 of Example 1: the initial target image is converted to log-polar coordinates and a Gaussian mixture model is built on it, with H = 8, initial weight w0 = 0.05, initial mean μ0 equal to the pixel value of the target feature image, and initial variance σ0 = 900.
Step 5: input the next video frame;
Step 6: track the position and scale of the target;
Position tracking, scale search and tracking model update proceed as in Example 1, with search neighborhood r = 20 and scale coefficients S = 0.9 + 0.01i, i = 0, 1, 2, ….
Step 7: target disappearance judgment;
The disappearance judgment proceeds as in Example 1, with thresholds T1 = 0.4, T2 = 0.3, K = 4 and M = 2; the search ranges are r = 20 and r2 = 30, and the reappearance threshold is T3 = 0.1.
Step 8: return the target rectangle and the disappearance judgment result to the upper-layer user, and repeat Steps 5 to 7 until the target leaves the video, as shown in Figure 4d;
Step 9: cross-camera target matching;
9.1 Input of the next video to be matched;
After the target leaves the original video, the user designates the next video in which cross-camera matching is to be performed, as shown in Figure 5a.
9.2 Foreground target extraction;
A Gaussian mixture model performs background modeling and foreground extraction on every frame of the input video, and connected regions are extracted from the foreground image to obtain all foreground objects in each frame.
9.3 Foreground feature extraction and cross-camera matching;
The features described in Step 3 are extracted from all foreground objects. For each tracking target, the target template library is matched against these foreground features to obtain matching similarities, so that each tracking target has a similarity curve over the video, any point of which represents the similarity measure between the tracking target and the most similar foreground object at that time point. Finally, the foreground object and time point corresponding to the peak of the curve are located; that foreground object is taken as the matched target object and that time point as the initial appearance time of the matched target.
9.4 Continued tracking;
After the target object in this video has been matched, the target is tracked from its initial appearance time using the methods of Steps 2 to 8.
9.5 Continuous multi-camera target tracking.
Repeating steps 9.1 to 9.5 yields continuous tracking of the same target across the videos of multiple cameras.
Figure 4 shows the tracking results of this example and Figure 5 the cross-camera matching results, in temporal order from left to right and top to bottom. In each image of Figure 4, the tracking video is on the left, with the tracked target object, here a student, inside the rectangle, and the disappearance judgment result for the tracking target in the lower-left corner; the tracking method of this example tracks the student accurately until he leaves the video boundary. In each image of Figure 5, the four target template images are in the upper-right corner; on the right is the cross-camera matching video, in which the target student appears six times; and in the lower-right corner is the similarity curve, whose six peaks correspond to the six appearances of the tracking target in this video. The results in Figures 4 and 5 show that the cross-camera tracking method of this example tracks the target object accurately and successfully finds the target object in the other video through cross-camera matching, with good tracking performance and very strong robustness.
Embodiment 2
This embodiment provides an adaptive cross-camera multi-target tracking system capable of implementing the method of Embodiment 1 above. The system comprises at least the following modules:
an updating module, configured to fix the tracking window size and obtain the position of the target object in the current video frame using a pre-established tracking model, vary the tracking window size at the obtained position, obtain the scale of the target object using the tracking model, and update the tracking model online according to the obtained scale of the target object; and
a judging module, configured to apply, according to the updated tracking model, a log-polar transformation to the image of the target object in the current video frame, perform Gaussian mixture modeling on the log-polar transformed image of the target object, and measure the center offset and degree of change of the target object to judge whether the target object has disappeared.
The judging module may apply the log-polar transformation to the image of the target object in the current video frame according to:

$$\rho=\log\sqrt{(x-x_c)^2+(y-y_c)^2},\qquad \theta=\arctan\frac{y-y_c}{x-x_c}$$

where ρ and θ are the horizontal and vertical axes of the log-polar image.
The judging module may also perform Gaussian mixture modeling on the log-polar transformed image of the target object according to:

$$P\big(q_t(x,y)\big)=\sum_{i=1}^{H}w_{i,t}(x,y)\,N\big(q_t(x,y);\ \mu_{i,t}(x,y),\ \sigma_{i,t}^2(x,y)\big)$$

where qt(x,y) is the pixel value of the target feature image at (x,y), H is the number of Gaussian models, wi,t(x,y) is the weight of the i-th Gaussian model at coordinate (x,y) at time t, N(·) is the normal distribution, μi,t(x,y) is the mean of the i-th Gaussian model at (x,y) at time t, and σ²i,t(x,y) is its variance.
The judging module measures the center offset and degree of change of the target object and judges whether it has disappeared as follows:
the judging module uses the Gaussian mixture model of the target object to examine the image of the tracked target object in the next video frame and computes the proportion of non-target pixels to total pixels; if the proportion of non-target pixels reaches or exceeds a preset threshold, the target object is judged to have disappeared.
Optionally, the judging module is further configured to perform a temporal filtering operation when the proportion of non-target pixels reaches or exceeds the preset threshold; that is, the target object is judged to have disappeared only when the proportion reaches or exceeds the preset threshold and the disappearance condition of the temporal filtering is also satisfied, which greatly improves the reliability of the judgment.
In addition, the above system may further comprise a target-object feature library establishing module, configured to extract and save the features of the target object during tracking and establish the feature library of the target object.
The target-object feature library establishing module may periodically extract HSV block histogram features, edge orientation histogram features, main color features and HOG features from images of the tracked target object and fuse the four features into a target feature stored in the target feature library, thereby completing the feature library of the target object. It is further configured so that, when the target feature library reaches its capacity limit, the oldest target feature in the library can be replaced with a new target feature.
The above system may further comprise a tracking module, configured to match and detect the most similar target object in another video and continue tracking when the tracked target object leaves the video. The tracking module extracts foreground objects from the entire video to be matched using a Gaussian mixture model, computes the similarity between the tracked target and all foreground objects using the target feature library, and selects the foreground object with the highest similarity as the cross-camera tracking target.
The updating module and the judging module may constitute a target tracking unit and a cross-video target matching unit, while the target-object feature library establishing module may be placed in a moving target detection unit; the architecture of the whole system is then as shown in Figure 6.
As the above embodiments show, the technical solution of the embodiments of the present application has the following advantages over other related techniques:
(1) it performs continuous cross-camera tracking of multiple targets in a multi-camera surveillance network with non-overlapping views and unknown topology, with very high robustness while meeting real-time requirements;
(2) it adopts a fast and effective target scale estimation method that obtains the target scale accurately in real time with minimal impact on the running speed of the tracking algorithm;
(3) it establishes the target disappearance judgment mechanism with a log-polar transformation and a Gaussian mixture model, and adds temporal filtering to strengthen the robustness of the disappearance judgment;
(4) it couples the tracking algorithm and the disappearance judgment mechanism organically, using the disappearance judgment mechanism to control the updates of the tracking model and the disappearance judgment model, which effectively reduces the introduction and accumulation of model errors and improves the accuracy and robustness of the tracking algorithm.
Those of ordinary skill in the art will appreciate that all or part of the steps of the above method may be performed by a program instructing the relevant hardware, the program being stored in a computer-readable storage medium such as a read-only memory, a magnetic disk or an optical disc. Optionally, all or part of the steps of the above embodiments may also be implemented with one or more integrated circuits. Accordingly, each module/unit in the above embodiments may be implemented in the form of hardware or in the form of a software functional module. The embodiments of the present application are not limited to any particular form of combination of hardware and software.
Industrial Applicability
The technical solution of the embodiments of the present application uses an online SVM framework to implement a tracking algorithm with adaptive scale estimation, and integrates a disappearance judgment mechanism based on log-polar transformation and Gaussian mixture modeling, which effectively improves the accuracy and robustness of the tracking algorithm. The solution can also track multiple targets across cameras simultaneously, is highly robust, meets real-time requirements, and can be applied in practical scenarios.

Claims (19)

  1. An adaptive cross-camera multi-target tracking method, comprising:
    fixing a tracking window size and obtaining the position of a target object in a current video frame using a pre-established tracking model, varying the tracking window size at the obtained position, obtaining the scale of the target object using the tracking model, and updating the tracking model online according to the obtained scale of the target object;
    according to the updated tracking model, applying a log-polar transformation to the image of the target object in the current video frame, performing Gaussian mixture modeling on the log-polar transformed image of the target object, and measuring the center offset and degree of change of the target object to judge whether the target object has disappeared.
  2. The method of claim 1, further comprising:
    extracting and saving features of the target object during tracking to establish a feature library of the target object, and, when the tracked target object leaves the video, matching and detecting the most similar target object in another video and continuing tracking.
  3. The method of claim 1 or 2, wherein applying the log-polar transformation to the image of the target object in the current video frame comprises:
    applying the log-polar transformation to the image of the target object in the current video frame according to:
    $$\rho=\log\sqrt{(x-x_c)^2+(y-y_c)^2},\qquad \theta=\arctan\frac{y-y_c}{x-x_c}$$
    where ρ and θ are the horizontal and vertical axes of the log-polar image.
  4. The method of claim 1 or 2, wherein performing Gaussian mixture modeling on the log-polar transformed image of the target object comprises:
    performing Gaussian mixture modeling on the log-polar transformed image of the target object according to:
    $$P\big(q_t(x,y)\big)=\sum_{i=1}^{H}w_{i,t}(x,y)\,N\big(q_t(x,y);\ \mu_{i,t}(x,y),\ \sigma_{i,t}^2(x,y)\big)$$
    where qt(x,y) is the pixel value of the target feature image at (x,y), H is the number of Gaussian models, wi,t(x,y) is the weight of the i-th Gaussian model at coordinate (x,y) at time t, N(·) is the normal distribution, μi,t(x,y) is the mean of the i-th Gaussian model at (x,y) at time t, and σ²i,t(x,y) is the variance of the i-th Gaussian model at (x,y) at time t.
  5. The method of claim 1 or 2, wherein measuring the center offset and degree of change of the target object and judging whether the target object has disappeared comprises:
    using the Gaussian mixture model of the target object to examine the image of the tracked target object in the next video frame and computing the proportion of non-target pixels to total pixels; if the proportion of non-target pixels to total pixels reaches or exceeds a preset threshold, judging that the target object has disappeared.
  6. The method of claim 5, further comprising performing a temporal filtering operation when the proportion of non-target pixels to total pixels reaches or exceeds the preset threshold;
    wherein judging that the target object has disappeared when the proportion of non-target pixels to total pixels reaches or exceeds the preset threshold comprises: judging that the target object has disappeared only when the proportion of non-target pixels to total pixels reaches or exceeds the preset threshold and the disappearance condition of the temporal filtering is satisfied.
  7. The method of claim 2, wherein extracting and saving features of the target object during tracking and establishing the feature library of the target object comprises:
    periodically extracting Hue-Saturation-Value (HSV) block histogram features, edge orientation histogram features, main color features and Histogram of Oriented Gradients (HOG) features from images of the tracked target object, fusing the four features into a target feature, and storing it in the target feature library.
  8. The method of claim 7, further comprising:
    replacing the oldest target feature in the target feature library with a new target feature when the target feature library reaches its capacity limit.
  9. The method of claim 2, 7 or 8, wherein, when the tracked target object leaves the video, matching and detecting the most similar target object in another video and continuing tracking comprises:
    extracting foreground objects from the entire video to be matched using a Gaussian mixture model, computing the similarity between the tracked target and all foreground objects using the target feature library, and selecting the foreground object with the highest similarity as the cross-camera tracking target.
  10. An adaptive cross-camera multi-target tracking system, comprising:
    an updating module, configured to fix a tracking window size and obtain the position of a target object in a current video frame using a pre-established tracking model, vary the tracking window size at the obtained position, obtain the scale of the target object using the tracking model, and update the tracking model online according to the obtained scale of the target object; and
    a judging module, configured to apply, according to the updated tracking model, a log-polar transformation to the image of the target object in the current video frame, perform Gaussian mixture modeling on the log-polar transformed image of the target object, and measure the center offset and degree of change of the target object to judge whether the target object has disappeared.
  11. The system of claim 10, further comprising:
    a target-object feature library establishing module, configured to extract and save features of the target object during tracking and establish a feature library of the target object; and
    a tracking module, configured to match and detect the most similar target object in another video and continue tracking when the tracked target object leaves the video.
  12. The system of claim 10 or 11, wherein the judging module applying the log-polar transformation to the image of the target object in the current video frame comprises:
    the judging module applies the log-polar transformation to the image of the target object in the current video frame according to:
    $$\rho=\log\sqrt{(x-x_c)^2+(y-y_c)^2},\qquad \theta=\arctan\frac{y-y_c}{x-x_c}$$
    where ρ and θ are the horizontal and vertical axes of the log-polar image.
  13. The system of claim 10 or 11, wherein the judging module performing Gaussian mixture modeling on the log-polar transformed image of the target object comprises:
    the judging module performs Gaussian mixture modeling on the log-polar transformed image of the target object according to:
    $$P\big(q_t(x,y)\big)=\sum_{i=1}^{H}w_{i,t}(x,y)\,N\big(q_t(x,y);\ \mu_{i,t}(x,y),\ \sigma_{i,t}^2(x,y)\big)$$
    where qt(x,y) is the pixel value of the target feature image at (x,y), H is the number of Gaussian models, wi,t(x,y) is the weight of the i-th Gaussian model at coordinate (x,y) at time t, N(·) is the normal distribution, μi,t(x,y) is the mean of the i-th Gaussian model at (x,y) at time t, and σ²i,t(x,y) is the variance of the i-th Gaussian model at (x,y) at time t.
  14. The system of claim 10 or 11, wherein the judging module measuring the center offset and degree of change of the target object and judging whether the target object has disappeared comprises:
    the judging module uses the Gaussian mixture model of the target object to examine the image of the tracked target object in the next video frame and computes the proportion of non-target pixels to total pixels; if the proportion of non-target pixels to total pixels reaches or exceeds a preset threshold, the target object is judged to have disappeared.
  15. The system of claim 14, wherein the judging module is further configured to perform a temporal filtering operation when the proportion of non-target pixels to total pixels reaches or exceeds the preset threshold;
    and the judging module judging that the target object has disappeared when the proportion of non-target pixels to total pixels reaches or exceeds the preset threshold comprises: the judging module judges that the target object has disappeared only when the proportion of non-target pixels to total pixels reaches or exceeds the preset threshold and the disappearance condition of the temporal filtering is satisfied.
  16. The system of claim 11, wherein the target-object feature library establishing module extracting and saving features of the target object during tracking and establishing the feature library of the target object comprises:
    the target-object feature library establishing module periodically extracts Hue-Saturation-Value (HSV) block histogram features, edge orientation histogram features, main color features and Histogram of Oriented Gradients (HOG) features from images of the tracked target object, fuses the four features into a target feature, and stores it in the target feature library.
  17. The system of claim 16, wherein
    the target-object feature library establishing module is further configured to replace the oldest target feature in the target feature library with a new target feature when the target feature library reaches its capacity limit.
  18. The system of claim 11, 16 or 17, wherein the tracking module matching and detecting the most similar target object in another video and continuing tracking when the tracked target object leaves the video comprises:
    the tracking module extracts foreground objects from the entire video to be matched using a Gaussian mixture model, computes the similarity between the tracked target and all foreground objects using the target feature library, and selects the foreground object with the highest similarity as the cross-camera tracking target.
  19. A computer-readable storage medium storing program instructions which, when executed, implement the method of any one of claims 1 to 9.