WO2021012757A1 - Real-time target detection and tracking method based on panoramic multichannel 4k video images - Google Patents

Real-time target detection and tracking method based on panoramic multichannel 4k video images

Info

Publication number
WO2021012757A1
WO2021012757A1 PCT/CN2020/090155 CN2020090155W WO2021012757A1 WO 2021012757 A1 WO2021012757 A1 WO 2021012757A1 CN 2020090155 W CN2020090155 W CN 2020090155W WO 2021012757 A1 WO2021012757 A1 WO 2021012757A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
image
track
frame
background
Prior art date
Application number
PCT/CN2020/090155
Other languages
French (fr)
Chinese (zh)
Inventor
朱伟
王扬红
苗锋
邱文嘉
王寿峰
马浩
白俊奇
Original Assignee
南京莱斯电子设备有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 南京莱斯电子设备有限公司
Publication of WO2021012757A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/20 Image enhancement or restoration by the use of local operators
    • G06T5/30 Erosion or dilatation, e.g. thinning
    • G06T5/70
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20024 Filtering details
    • G06T2207/20032 Median filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30241 Trajectory

Definitions

  • The invention relates to the technical field of digital image processing, and in particular to a real-time target detection and tracking method based on panoramic multi-channel 4k video images.
  • Target detection extracts targets of interest from an image using computer vision algorithms.
  • As an important branch of image processing, target detection has a very wide range of applications in various fields.
  • In real detection scenes, the complex and unstable external environment introduces many interferences, which pose many problems for target detection.
  • Achieving accurate, stable, and real-time target detection and tracking therefore has significant research value.
  • Zhang Tianyu proposed a multi-scale target detection method in the patent "Spatio-temporal multi-scale moving target detection method".
  • The image is divided into blocks, and the optimal difference interval within the moving area is used to achieve target detection and tracking.
  • This method has low robustness in complex scenes, and its criterion for determining significant differences is difficult to adapt to multiple scenarios.
  • Zdenek Kalal, Krystian Mikolajczyk and others, in "Tracking-Learning-Detection", proposed a method for detecting and tracking a single target in a video, which uses the information difference between frames to combine detection and tracking and thereby learn target samples online.
  • The median optical flow method proposed there requires target initialization, and because its tracking correction is fixed, synchronization with the detector is difficult to guarantee.
  • Yang Yanshuang and Pu Baoming, in "Moving Vehicle Detection Based on Improved SUSAN Algorithm", proposed an adaptive-threshold SUSAN method for detecting vehicle target boundaries.
  • It combines the histogram transform with the Hough transform to extract the connected domain of the target and separate the vehicle target from the background. The real-time performance of this method is poor, and in complex scenes the adaptive threshold struggles to complete target segmentation effectively.
  • To address the poor real-time performance and insufficient stability of existing target detection and tracking techniques, the present invention proposes a real-time target detection and tracking method based on panoramic multi-channel 4k video images. The method offers excellent target detection and tracking performance and is easy to implement in engineering.
  • The real-time target detection and tracking method based on panoramic multi-channel 4k video images includes the following steps:
  • Step 1 Divide the panoramic multi-channel 4k video image into n regions, perform multi-frame target statistics for each region, classify each region of the panoramic video into levels according to the target statistical probability, and set the background modeling parameter threshold according to the level of each region;
  • Step 2 Perform median filtering on the panoramic video image, initialize the background model, adaptively adjust the background modeling parameter threshold according to the degree of dynamic change of the background to complete the background update, then process the flickering pixels to complete the background image generation, and finally use a frame difference operation to generate the foreground candidate target area image;
  • Step 3 Perform median filtering on the candidate target area image, use morphological operations to extract the enhanced candidate target area, calculate the connected domains of the enhanced candidate target area and the minimum circumscribed rectangle of each connected domain, and eliminate false candidate target frames through target shape features to form the target point traces;
  • Step 4 Perform continuous multi-frame detection on the panoramic video image to obtain the target point traces, perform dynamic target track management by judging the absolute distance between the target point trace and the target track and the multi-channel video cross coverage state, and apply data correction to the track information of consecutive frames to complete stable multi-target tracking.
  • Step 1 includes:
  • Step 1-1 According to the panoramic video image size and scene coverage (the dividing criterion is that a single region does not exceed 1920*1080, so that the 4k video images divide exactly into 16 regions), divide the panoramic video image into n regions, the n-th region being denoted S_n.
  • The width of each region is less than or equal to 1920 pixels, and the height of each region is greater than or equal to 1080 pixels;
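The region division of Step 1-1 can be sketched as follows. This is a minimal illustration rather than the patented implementation: it assumes the panorama is four 3840×2160 channels placed side by side and cut on a fixed 1920×1080 grid, and the helper name `split_into_regions` is hypothetical.

```python
from typing import List, Tuple

Region = Tuple[int, int, int, int]  # (x, y, w, h): upper-left corner plus size


def split_into_regions(pano_w: int, pano_h: int,
                       max_w: int = 1920, max_h: int = 1080) -> List[Region]:
    """Cut a pano_w x pano_h panorama into tiles of at most max_w x max_h."""
    regions = []
    for y in range(0, pano_h, max_h):
        for x in range(0, pano_w, max_w):
            w = min(max_w, pano_w - x)  # last column may be narrower
            h = min(max_h, pano_h - y)  # last row may be shorter
            regions.append((x, y, w, h))
    return regions


# Assumed layout: four 3840x2160 channels side by side.
regions = split_into_regions(4 * 3840, 2160)
```

Under this assumed layout the grid has 8 columns and 2 rows, which matches the 16 regions mentioned in the text.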
  • Step 1-2 Use the frame difference method (reference: ZHOU Y, JI J, SONG K. A Moving Target Detection Method Based on Improved Frame Difference Background Modeling[J]. Open Cybernetics & Systemics Journal, 2014) to count how often moving targets appear in K frames of the panoramic video image.
  • According to this frequency of appearance, the n regions are divided into four levels: A, B, C, and D. A region in which moving targets exist in more than K_1 frames of video image is an A-level image region.
  • A region in which moving targets exist in more than K_2 frames and fewer than K_1 frames is a B-level image region, and a region in which moving targets exist in more than K_3 frames and fewer than K_2 frames is a C-level image region.
  • A region in which moving targets exist in more than K_4 frames and fewer than K_3 frames is a D-level image region;
  • Step 1-3 Merge the adjacent image regions, and record the corresponding panoramic position coordinates of each region.
  • The n-th region S_n corresponds to the panoramic position coordinates (x_n, y_n, w_n, h_n), where (x_n, y_n) is the upper-left corner position of the n-th region S_n, and w_n and h_n denote the width and height of the n-th region S_n.
  • Step 1-4 Set a corresponding background modeling parameter threshold for each of the n regions; the background modeling parameter threshold corresponding to the n-th region S_n is T_n.
  • Step 2 includes:
  • Step 2-1 perform fast median filtering on panoramic video images (ZHANG Li, CHEN Zhi-qiang, GAO Wen-huan, et al. Mean-based fast median filter[J]. Journal of Tsinghua University: Science and Technology, 2004, 44(9): 1157-1159.), to eliminate the influence of background noise;
  • Step 2-2 Initialize the background model of the panoramic video image.
  • The background model uses ViBE (Visual Background Extractor, BARNICH O, DROOGENBROECK M V. ViBe: A universal background subtraction algorithm for video sequences[J]. IEEE Transactions on Image Processing, 2011, 20(06): 1709-1724.), where the background modeling parameter threshold T_n is set as the Euclidean distance threshold in the ViBE algorithm.
  • Step 2-3 Adaptively adjust the background modeling parameter threshold T_n according to the degree of dynamic change of the background to complete the background model update.
  • The background modeling parameter threshold T_n is used to determine whether a pixel belongs to the background; a value that is too large or too small degrades the quality of background modeling.
  • The threshold is adjusted adaptively according to the degree of dynamic change of the background, for which the background transformation parameter ⁇(x,y) is defined as:
  • f(i,j) is the pixel value of the current frame at position (i,j)
  • d(i,j) is the pixel value of the background model at position (i,j)
  • M is the width of the current frame image
  • N is the height of the current frame image.
  • T_n' is the threshold after adaptive adjustment
  • is the dynamic adjustment factor
  • ⁇ and ⁇ are both fixed parameters.
  • Step 2-4 Process the flickering pixels in the background model to complete the generation of the background image.
  • The specific processing method for flickering pixels is as follows. A pixel in the background image generated during background modeling often jumps back and forth between being a background point and a foreground point, so an index level table of flickering pixels is constructed. If the pixel belongs to the edge contour points of the background image (Reference: Kass M, Witkin A, Terzopoulos D. Snakes: Active contour models[J].
  • its flicker frequency level is increased; otherwise its flicker frequency level is reduced. If the flicker frequency level of a pixel over K consecutive frames of background image is greater than S_NK, the pixel is determined to be a flickering pixel and is removed from the updated background image.
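The flicker bookkeeping of Step 2-4 can be illustrated with a small sketch. The amounts by which the level rises and falls are not preserved in this text, so the `inc` and `dec` steps of 1, the clamping at zero, and the function names are all assumptions; the sketch also simplifies the "K consecutive frames" condition to comparing the accumulated level with S_NK after K updates.

```python
from typing import List


def update_flicker_levels(levels: List[List[int]],
                          on_contour: List[List[bool]],
                          inc: int = 1, dec: int = 1) -> List[List[int]]:
    """Raise the level of pixels on the background edge contour, lower the rest.

    inc/dec are assumed step sizes; levels are clamped at zero (assumption).
    """
    return [[lv + inc if edge else max(0, lv - dec)
             for lv, edge in zip(row_levels, row_edges)]
            for row_levels, row_edges in zip(levels, on_contour)]


def flicker_mask(levels: List[List[int]], s_nk: int) -> List[List[bool]]:
    """Mark pixels whose accumulated level exceeds the threshold S_NK."""
    return [[lv > s_nk for lv in row] for row in levels]


# Toy example: a 1x2 image where only the first pixel keeps landing on a contour.
levels = [[0, 0]]
contour = [[True, False]]
for _ in range(3):  # K = 3 frames in this toy run
    levels = update_flicker_levels(levels, contour)
mask = flicker_mask(levels, s_nk=2)
```

Pixels flagged in `mask` would then be excluded from the updated background image.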
  • Step 2-5 Compute the difference between the panoramic video image and the background image obtained in step 2-4 to generate a candidate target image Im_obj; the candidate target area is this candidate target image.
  • Step 3 includes:
  • Step 3-1 Perform fast median filtering on the candidate target image Im_obj (ZHANG Li, CHEN Zhi-qiang, GAO Wen-huan, et al. Mean-based fast median filter[J]. Journal of Tsinghua University: Science and Technology, 2004, 44(9): 1157-1159.) to generate image Im_mf;
  • Step 3-2 Perform a morphological dilation operation on the filtered image Im_mf (Haralick R, Zhuang X. Image analysis using mathematical morphology[J]. IEEE Trans. on Pattern Analysis and Machine Intelligence, 1987, 9(4): 532-550.) to generate an image Im_do, and then perform an AND operation between the image Im_do and the candidate target image Im_obj to generate an enhanced candidate target image Im_obj2;
  • Step 3-3 Perform a morphological closing operation on the image Im_obj2 (Haralick R, Zhuang X. Image analysis using mathematical morphology[J]. IEEE Trans. on Pattern Analysis and Machine Intelligence, 1987, 9(4): 532-550.), extract the connected domains of the candidate targets, calculate the minimum bounding rectangle of each connected domain, and extract the candidate target frames;
  • Step 3-4 Calculate the shape characteristics of the candidate target frame, namely the width obj_w, height obj_h, and aspect ratio obj_wh of the target frame, and determine whether the current candidate target frame satisfies obj_w > w_0, obj_h > h_0, obj_wh ≥ wh_0, and obj_wh ≤ wh_1. If these requirements are not met, the current candidate target frame is judged to be a false target and deleted; a candidate target frame that meets the requirements is generated as a target point trace. Here w_0 is the target frame width threshold, h_0 is the target frame height threshold, and wh_1 and wh_0 are the high and low thresholds of the target aspect ratio, respectively. The target point trace includes the frame number, target position coordinates, target width, target height, target aspect ratio, and target area.
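The shape test of Step 3-4 can be sketched as a simple predicate. The threshold values used below are hypothetical (the text gives none), the `Box` type and function name are illustrative, and the inclusive comparisons on the aspect ratio follow the reading "low threshold ≤ obj_wh ≤ high threshold".

```python
from dataclasses import dataclass


@dataclass
class Box:
    x: int  # upper-left corner
    y: int
    w: int  # obj_w
    h: int  # obj_h


def is_plausible_target(box: Box, w0: float, h0: float,
                        wh0: float, wh1: float) -> bool:
    """Keep a candidate frame only if it is large enough and not too elongated."""
    if box.w <= w0 or box.h <= h0:
        return False
    aspect = box.w / box.h  # obj_wh; box.h > h0 >= 0 here, so no division by zero
    return wh0 <= aspect <= wh1


# Hypothetical thresholds, chosen only for illustration.
params = dict(w0=5, h0=5, wh0=0.2, wh1=5.0)
candidates = [Box(0, 0, 40, 20), Box(0, 0, 3, 20), Box(0, 0, 200, 10)]
kept = [b for b in candidates if is_plausible_target(b, **params)]
```

Only the first candidate survives: the second is too narrow and the third exceeds the high aspect-ratio threshold.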
  • Step 4 includes:
  • Step 4-1 Generate the target track Tr_i from the target point trace Po_i extracted from the first frame of panoramic video image.
  • The specific operation is: the automatically generated batch number BN of the target point trace structure is put into the target track structure vector; the batch number BN is accumulated automatically and satisfies 1 ≤ BN ≤ 9999; and the target track includes the frame number, target position coordinates, target width, target height, target aspect ratio, and target area;
  • Step 4-2 Calculate the absolute distance D_{i+1} between each target point trace Po_{i+1} extracted from the next frame of panoramic video image and the target track Tr_i. The calculation formula of the absolute distance D_{i+1} is:
  • Po_{i+1}(x) is the abscissa of the target point trace
  • Po_{i+1}(y) is the ordinate of the target point trace
  • Tr_i(x) is the abscissa of the target track
  • Tr_i(y) is the ordinate of the target track
  • Step 4-3 Determine from the track information whether the current target is in the multi-channel video cross coverage state, and use the fast correlation filtering method (Henriques J F, Rui C, Martins P, et al. High-speed tracking with kernelized correlation filters[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2015, 37(3): 583-596.) to manage the tracks of targets that cross screens.
  • In step 4-3, judging from the track information whether the current target is in the multi-channel video cross coverage state includes: when the horizontal position of the target in the i-th frame of panoramic video image I_i is greater than the threshold w_1 and the horizontal track speed of the target is positive, and at the same time the horizontal position of the target in the (i+1)-th frame of panoramic video image I_{i+1} is less than the threshold w_2 and the horizontal track speed of the target is negative, the target track is determined to have reached the edge of the image, that is, the target is in the multi-channel video cross coverage state. The panoramic video images I_i and I_{i+1} are adjacent consecutive images.
  • Step 4-4 Perform data correction on continuous multiple frames of track information to complete stable multi-target tracking.
  • Step 4-4 includes: storing the track data of N_k consecutive frames of panoramic video images, and taking a weighted average of the track data of the current frame and the predicted track data of its previous N_k − 1 frames to generate corrected track data.
  • The specific operations are as follows:
  • x is the target horizontal position coordinate in the track data
  • y is the target vertical position coordinate in the track data
  • w is the target width in the track data
  • h is the target height in the track data
  • The present invention discloses a real-time target detection and tracking method based on panoramic multi-channel 4k ultra-high-definition video images, which addresses the high false alarm rate and low robustness of panoramic target detection and tracking.
  • Region-wise block processing is used to set the background modeling thresholds, adaptive background modeling is then applied to extract candidate target regions and point traces, and finally dynamic track management is used to achieve stable multi-target tracking in panoramic video.
  • The present invention has been verified in multiple scenarios and shows excellent target detection and tracking performance: the target detection rate is greater than 90% and the average processing time is less than 40 ms, which verifies the effectiveness of the present invention.
  • Figure 1 is a flow chart of the method according to the invention.
  • A real-time target detection and tracking method based on multiple 4k video images includes the following steps:
  • Step 1 Divide the panoramic 4-channel 4k video image into 16 areas, perform multi-frame target statistics on each of the 16 areas, classify each area of the panoramic video into levels according to the target statistical probability, and set the background modeling parameter threshold of each of the 16 areas according to its level;
  • Step 2 Perform fast median filtering on the panoramic video image, initialize the background model, adaptively adjust the background modeling parameter threshold according to the degree of dynamic change of the background to complete the background update, then process the flickering pixels to complete the background image generation, and finally use a frame difference operation to extract the foreground target candidate area;
  • Step 3 Perform fast median filtering on the candidate target region image, use morphological operations to extract the enhanced target region, calculate the connected domains of the enhanced candidate target region and the minimum circumscribed rectangle of each connected domain, and eliminate false candidate target frames through target shape features to form the target point traces;
  • Step 4 Perform continuous multi-frame detection on the panoramic video to obtain the target point traces, perform dynamic target track management by judging the absolute distance between the target point trace and the target track and the multi-channel video cross coverage state, and apply data correction to the track information of consecutive frames to complete stable multi-target tracking.
  • step 1 includes:
  • Step 1-1 According to the panoramic 4-channel 4k video image size and scene coverage, the panoramic video image is divided into 16 regions, each of size W_n × H_n, where the region width W_n ≤ 1920 and the region height H_n ≥ 1080;
  • Step 1-2 Use the frame difference method (ZHOU Y, JI J, SONG K. A Moving Target Detection Method Based on Improved Frame Difference Background Modeling[J]. Open Cybernetics & Systemics Journal, 2014) to count how often moving targets appear in 200,000 frames of panoramic video image.
  • According to this frequency of appearance, each area S_n is assigned one of four levels: A, B, C, or D.
  • An area with moving targets in more than 20,000 frames of video image is an A-level image area.
  • An area with moving targets in more than 10,000 frames and fewer than 20,000 frames is a B-level image area.
  • An area with moving targets in more than 5,000 frames and fewer than 10,000 frames is a C-level image area.
  • An area with moving targets in more than 1,000 frames and fewer than 5,000 frames is a D-level image area.
  • Here n in S_n ranges over [1,16]; each area has exactly one level, and each level corresponds to a threshold, so the 16 areas have 16 thresholds;
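The level assignment of Step 1-2 in this embodiment can be sketched as a lookup over the stated frame counts (out of 200,000 frames). The function name is hypothetical, and two details are assumptions because the text leaves them open: counts exactly on a boundary fall into the lower level, and counts of 1,000 or fewer receive no level.

```python
from typing import Optional

# Minimum frame counts from the embodiment, highest level first.
LEVEL_THRESHOLDS = [("A", 20000), ("B", 10000), ("C", 5000), ("D", 1000)]


def region_level(frames_with_motion: int) -> Optional[str]:
    """Return the level of a region given how many frames contained motion."""
    for level, minimum in LEVEL_THRESHOLDS:
        if frames_with_motion > minimum:
            return level
    return None  # not covered by the text (assumption: no level assigned)


# Hypothetical per-region counts, for illustration only.
counts = {1: 25000, 2: 15000, 3: 7000, 4: 2000}
levels = {n: region_level(c) for n, c in counts.items()}
```

Each region's level would then select its background modeling parameter threshold T_n.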
  • Step 1-3 Merge the adjacent areas of the same level, and record the corresponding panoramic position coordinates (x_n, y_n, w_n, h_n) of each area S_n, where (x_n, y_n) is the upper-left corner coordinate of the area S_n,
  • and (w_n, h_n) is the width and height of the area S_n.
  • step 2 includes:
  • Step 2-1 perform fast median filtering on panoramic video images (ZHANG Li, CHEN Zhi-qiang, GAO Wen-huan, et al. Mean-based fast median filter[J]. Journal of Tsinghua University: Science and Technology, 2004, 44(9): 1157-1159.), to eliminate the influence of background noise;
  • Step 2-2 initialize the background model of the panoramic video, the background model modeling method adopts ViBE (Visual Background Extractor, BARNICH O, DROOGENBROECK M V.ViBe: A universal background subtraction algorithm for video sequences[J].IEEE Transactions on Image Processing , 2011, 20(06): 1709-1724.), where the background modeling parameter threshold T n is set as the Euclidean distance threshold in the ViBE algorithm, and the default value of T n is 20.
  • Step 2-3 Adaptively adjust the background modeling parameter threshold T_n according to the degree of dynamic change of the background to complete the background model update.
  • The background modeling parameter threshold T_n is used to determine whether a pixel belongs to the background; a value that is too large or too small degrades the quality of background modeling. To describe the motion state of the target accurately, the threshold is adjusted adaptively according to the degree of dynamic change of the background, for which the background transformation parameter ⁇(x,y) is defined as:
  • f(i,j) is the pixel value of the current frame at (i,j)
  • d(i,j) is the pixel value of the background model at (i,j)
  • M is the width of the current frame image
  • Set the background transformation factor parameter ⁇. When the current pixel value is successfully matched with the background model, calculate the value of ⁇(x, y). In a static scene, ⁇(x, y) tends toward a stable value; in a dynamic scene, ⁇(x, y) is larger. The background modeling parameter threshold T_n is then updated adaptively according to the following formula:
  • T_n' is the threshold after adaptive adjustment
  • is the dynamic adjustment factor
  • ⁇ and ⁇ are fixed parameters
  • is generally taken as 0.8
  • is generally taken as 0.2.
  • Step 2-4 Process the flickering pixels in the background model to complete the generation of the background image.
  • The specific processing method for flickering pixels is as follows. In the background image generated during background modeling, a pixel often jumps back and forth between being a background point and a foreground point, so an index level table of flickering pixels is constructed. When a pixel belongs to the edge contour points of the background image (Kass M, Witkin A, Terzopoulos D.
  • Step 2-5 Compute the difference between the original image and the background image extracted for a single frame to generate a candidate target image Im_obj, completing the candidate target extraction.
  • step 3 includes:
  • Step 3-1 Perform fast median filtering on the candidate target image Im_obj (ZHANG Li, CHEN Zhi-qiang, GAO Wen-huan, et al. Mean-based fast median filter[J]. Journal of Tsinghua University: Science and Technology, 2004, 44(9): 1157-1159.) to generate image Im_mf;
  • Step 3-2 Perform a morphological dilation operation on the filtered image Im_mf to generate an image Im_do, and then perform an AND operation between the image Im_do and the candidate target image Im_obj to generate an enhanced candidate target image Im_obj2;
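The dilate-then-AND enhancement of Step 3-2 can be sketched on small binary masks. This is a pure-Python illustration under assumptions not stated in the text: the images are 0/1 masks, the structuring element is a 3×3 square, and the function names are hypothetical.

```python
from typing import List

Mask = List[List[int]]  # binary image as rows of 0/1


def dilate3x3(img: Mask) -> Mask:
    """Binary dilation with an assumed 3x3 square structuring element."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w:
                            out[ny][nx] = 1
    return out


def enhance(im_mf: Mask, im_obj: Mask) -> Mask:
    """Im_obj2 = dilate(Im_mf) AND Im_obj."""
    im_do = dilate3x3(im_mf)
    return [[a & b for a, b in zip(row_do, row_obj)]
            for row_do, row_obj in zip(im_do, im_obj)]


# Candidate mask with a cross-shaped target and one isolated noise pixel.
im_obj = [[0, 1, 0, 0, 1],
          [1, 1, 1, 0, 0],
          [0, 1, 0, 0, 0]]
# Assumed result of median filtering: only the target core survives.
im_mf = [[0, 0, 0, 0, 0],
         [0, 1, 0, 0, 0],
         [0, 0, 0, 0, 0]]
im_obj2 = enhance(im_mf, im_obj)
```

The AND restores target pixels that the median filter eroded while leaving out the isolated pixel, since it lies outside the dilated core.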
  • Step 3-3 Perform a morphological closing operation on the image Im obj2 , extract the connected domain of the candidate target, calculate the minimum bounding rectangle of the connected domain, and extract the candidate target frame;
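The connected-domain extraction and bounding rectangles of Step 3-3 can be sketched with a breadth-first flood fill. The sketch assumes a binary mask, 8-connectivity, and axis-aligned rectangles; none of these choices is stated in the text, and the function name is hypothetical.

```python
from collections import deque
from typing import List, Tuple


def bounding_boxes(mask: List[List[int]]) -> List[Tuple[int, int, int, int]]:
    """Return (x, y, w, h) of each 8-connected component, in scan order."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y0 in range(h):
        for x0 in range(w):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            queue = deque([(y0, x0)])
            min_x = max_x = x0
            min_y = max_y = y0
            while queue:
                y, x = queue.popleft()
                min_x, max_x = min(min_x, x), max(max_x, x)
                min_y, max_y = min(min_y, y), max(max_y, y)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((min_x, min_y, max_x - min_x + 1, max_y - min_y + 1))
    return boxes


mask = [[1, 1, 0, 0, 0],
        [0, 1, 0, 0, 1],
        [0, 0, 0, 0, 1]]
boxes = bounding_boxes(mask)
```

Each returned rectangle corresponds to one candidate target frame, which the shape test then accepts or rejects.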
  • the track data includes frame number, target position coordinates, target width, target height, target aspect ratio and target area.
  • step 4 includes:
  • Step 4-1 Generate the target track Tr_i from the target point trace Po_i extracted from the first frame of video image.
  • The specific operation is as follows: the batch number BN of the target track structure is generated automatically and put into the target track structure vector.
  • The batch number BN is accumulated automatically and satisfies 1 ≤ BN ≤ 9999.
  • The target track includes the frame number, target position coordinates, target width, target height, target aspect ratio, and target area.
  • Step 4-2 Calculate the absolute distance D_{i+1} between each target point trace Po_{i+1} extracted from the next frame of video image and the target track Tr_i. The calculation method of the absolute distance D_{i+1} is:
  • Po_{i+1}(x) is the x coordinate of the target point trace
  • Po_{i+1}(y) is the y coordinate of the target point trace
  • Tr_i(x) is the x coordinate of the target track
  • Tr_i(y) is the y coordinate of the target track.
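A point-trace-to-track association based on the distance of Step 4-2 might look like the sketch below. The distance formula is not preserved in this text, so the Euclidean distance used here is an assumption, as are the gating threshold `gate`, the nearest-neighbour rule, and the function names.

```python
import math
from typing import Dict, Optional, Tuple

Point = Tuple[float, float]  # (x, y)


def distance(plot: Point, track: Point) -> float:
    """Assumed form of D: Euclidean distance between point trace and track."""
    return math.hypot(plot[0] - track[0], plot[1] - track[1])


def nearest_track(plot: Point, tracks: Dict[int, Point],
                  gate: float) -> Optional[int]:
    """Return the batch number BN of the closest track within `gate`, else None."""
    best_bn, best_d = None, gate
    for bn, pos in tracks.items():
        d = distance(plot, pos)
        if d <= best_d:
            best_bn, best_d = bn, d
    return best_bn


# Tracks keyed by batch number BN; positions are illustrative values.
tracks = {1: (100.0, 100.0), 2: (500.0, 120.0)}
matched = nearest_track((103.0, 104.0), tracks, gate=30.0)
unmatched = nearest_track((900.0, 900.0), tracks, gate=30.0)
```

A point trace with no track inside the gate could start a new track with the next batch number, though the text here does not spell out that rule.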
  • Step 4-3 Judge from the track information whether the current target is in the multi-channel video cross coverage state, and use the fast correlation filtering method (Henriques J F, Rui C, Martins P, et al. High-speed tracking with kernelized correlation filters[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2015, 37(3): 583-596.) to manage the tracks of targets that cross screens.
  • The specific method for determining the multi-channel video cross coverage state is: when the horizontal position of the target in image I_1 is greater than w_1 and the horizontal track speed of the target is positive, the target track is determined to have reached the edge of the image; at the same time, when the horizontal position of the target in image I_2 is less than w_2 and the horizontal track speed of the target is negative, the target track is determined to have also reached the edge of the image. w_1 is generally 3800, and w_2 is generally 50.
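The cross coverage test of Step 4-3 reduces to two edge checks, sketched below with the stated typical values w_1 = 3800 and w_2 = 50. The function names and the representation of a track as a horizontal position plus horizontal speed are illustrative assumptions.

```python
W1 = 3800  # typical right-edge threshold from the text, in pixels
W2 = 50    # typical left-edge threshold from the text, in pixels


def at_right_edge(x: float, vx: float, w1: float = W1) -> bool:
    """Target is past w1 with positive horizontal track speed."""
    return x > w1 and vx > 0


def at_left_edge(x: float, vx: float, w2: float = W2) -> bool:
    """Target is before w2 with negative horizontal track speed."""
    return x < w2 and vx < 0


def in_cross_coverage(x1: float, vx1: float, x2: float, vx2: float) -> bool:
    """Both edge conditions hold at once: one in image I_1, one in image I_2."""
    return at_right_edge(x1, vx1) and at_left_edge(x2, vx2)


crossing = in_cross_coverage(3810.0, 4.0, 30.0, -2.0)
not_crossing = in_cross_coverage(3000.0, 4.0, 30.0, -2.0)
```

Tracks flagged this way would be handed to the correlation-filter tracker for cross-screen management.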
  • Step 4-4 Perform data correction on continuous multiple frames of track information to complete stable multi-target tracking.
  • The data correction method is: store the track data of N_k consecutive frames of video images, and take a weighted average of the track data of the current frame and the predicted track data of its previous N_k − 1 frames to generate corrected track data.
  • The specific operations are as follows:
  • x is the target horizontal position coordinate in the track data
  • y is the target vertical position coordinate in the track data
  • w is the target width in the track data
  • h is the target height in the track data
  • ⁇ 1 and ⁇ 2 are weighting factors
  • N k is generally 25
  • ⁇ 1 is generally 0.3
  • the present invention provides a real-time target detection and tracking method based on panoramic multi-channel 4k video images.
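The data correction of Step 4-4 can be sketched as a weighted average over the track fields x, y, w, and h. The exact formula is not preserved in this text, so the form below (first weight on the current frame, second weight on the mean of the previous predicted tracks) is an assumption, as is the second weight of 0.7; only the first weight of 0.3 and N_k = 25 come from the text. The function name is hypothetical.

```python
from typing import List, Tuple

Track = Tuple[float, float, float, float]  # (x, y, w, h)


def correct_track(current: Track, previous_predicted: List[Track],
                  w1: float = 0.3, w2: float = 0.7) -> Track:
    """Blend the current track with the mean of earlier predicted tracks.

    Assumed form: corrected = w1 * current + w2 * mean(previous_predicted).
    """
    if not previous_predicted:
        return current  # nothing to blend with yet
    n = len(previous_predicted)
    means = [sum(p[i] for p in previous_predicted) / n for i in range(4)]
    blended = [w1 * c + w2 * m for c, m in zip(current, means)]
    return (blended[0], blended[1], blended[2], blended[3])


# With N_k = 25 the history would hold up to 24 predicted tracks; two are used here.
history = [(100.0, 50.0, 20.0, 10.0), (100.0, 50.0, 20.0, 10.0)]
corrected = correct_track((110.0, 50.0, 20.0, 10.0), history)
```

Weighting the history more heavily than the current frame smooths out single-frame jitter at the cost of a slower response to real motion.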

Abstract

A real-time target detection and tracking method based on panoramic multichannel 4k video images, mainly used for solving the problems in the prior art that the processing speed of panoramic multichannel 4k images is slow, a target crossing multichannel cameras is erroneously detected or missed, and the stability of detecting and tracking the target is low. The method comprises: first, performing long-time target probability counting on a panoramic video image, realizing region importance division, and setting a background modeling parameter threshold; next, performing adaptive background modeling on the panoramic video image and obtaining a foreground target candidate region of a scene; then, performing fusing and processing on the foreground target candidate region to form a candidate target plot; finally, performing dynamic flight track management to realize the multi-target stable tracking of a panoramic video. The present method can be used in the fields of the remote monitoring of an airport tower, panoramic video enhancement, road traffic vehicle detection and the like, and is excellent in target detection and tracking performance.

Description

Real-time target detection and tracking method based on panoramic multi-channel 4k video images
Technical field
The invention relates to the technical field of digital image processing, and in particular to a real-time target detection and tracking method based on panoramic multi-channel 4k video images.
Background art
Target detection extracts targets of interest from an image using computer vision algorithms. As an important branch of image processing, target detection has a very wide range of applications in various fields. In real detection scenes, the complex and unstable external environment introduces many interferences, which pose many problems for target detection. Achieving accurate, stable, and real-time target detection and tracking therefore has significant research value.
Zhang Tianyu proposed a multi-scale target detection method in the patent "Spatio-temporal multi-scale moving target detection method": the image is divided into blocks, and the optimal difference interval within the moving area is used to achieve target detection and tracking. This method has low robustness in complex scenes, and its criterion for determining significant differences is difficult to adapt to multiple scenarios. Zdenek Kalal, Krystian Mikolajczyk and others, in "Tracking-Learning-Detection", proposed a method for detecting and tracking a single target in a video, which uses the information difference between frames to combine detection and tracking and thereby learn target samples online. The median optical flow method proposed there requires target initialization, and because its tracking correction is fixed, synchronization with the detector is difficult to guarantee. Yang Yanshuang and Pu Baoming, in "Moving Vehicle Detection Based on Improved SUSAN Algorithm", proposed an adaptive-threshold SUSAN method for detecting vehicle target boundaries, which combines the histogram transform with the Hough transform to extract the connected domain of the target and separate the vehicle target from the background. The real-time performance of this method is poor, and in complex scenes the adaptive threshold struggles to complete target segmentation effectively.
Summary of the invention
To address the poor real-time performance and insufficient stability of existing target detection and tracking techniques, the present invention proposes a real-time target detection and tracking method based on panoramic multi-channel 4k video images. The method offers excellent detection and tracking performance and is easy to implement in engineering practice.
The real-time target detection and tracking method based on panoramic multi-channel 4k video images provided by the present invention includes the following steps:
Step 1: divide the panoramic multi-channel 4k video image into n regions, collect multi-frame target statistics for each region, grade each region of the panoramic video according to the statistical probability of targets, and set the background modeling parameter threshold according to each region's grade;
Step 2: apply median filtering to the panoramic video image, initialize the background model, adaptively adjust the background modeling parameter threshold according to the degree of dynamic change of the background to complete the background update, then process flickering pixels to complete background image generation, and finally use a frame difference operation to generate the foreground candidate target region image;
Step 3: apply median filtering to the candidate target region image, use morphological operations to extract the enhanced candidate target region, compute the connected domains of the enhanced candidate target region and the minimum bounding rectangle of each connected domain, and remove false candidate target boxes using target shape features to form target point traces;
Step 4: run detection on consecutive frames of the panoramic video image to obtain target point traces, manage target tracks dynamically by evaluating the absolute distance between each target point trace and the target track together with the multi-channel video cross-coverage state, and correct the track information over consecutive frames to achieve stable multi-target tracking.
Step 1 includes:
Step 1-1: divide the panoramic video image into n regions according to its size and scene coverage, denoting the n-th region as S_n. The dividing criterion is that a single region does not exceed 1920×1080 pixels, so the 4k video images divide into exactly 16 regions; the width of each region is at most 1920 pixels and its height is at most 1080 pixels;
Step 1-2: use the frame difference method (reference: ZHOU Y, JI J, SONG K. A Moving Target Detection Method Based on Improved Frame Difference Background Modeling[J]. Open Cybernetics & Systemics Journal, 2014) to count how often moving targets appear in the panoramic video image over K frames of video, and grade the n regions into four levels A, B, C and D by that frequency: a region in which K_1 or more frames contain a moving target is a level-A image region; a region with between K_2 and K_1 such frames is a level-B image region; a region with between K_3 and K_2 such frames is a level-C image region; and a region with between K_4 and K_3 such frames is a level-D image region;
Step 1-3: merge image regions of adjacent levels and record the panoramic position coordinates of each region. The n-th region S_n has panoramic position coordinates (x_n, y_n, w_n, h_n), where (x_n, y_n) is the top-left corner of S_n and w_n and h_n are its width and height.
Step 1-4: set a corresponding background modeling parameter threshold for each of the n regions; the threshold for the n-th region S_n is T_n.
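The region division of step 1-1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `split_into_regions`, the row-major ordering, and the example panorama size of 15360×2160 (four 3840×2160 channels assumed to sit side by side) are introduced here for illustration. Only the 1920×1080 upper bound on region size and the (x_n, y_n, w_n, h_n) record come from the text.

```python
import math

def split_into_regions(width, height, max_w=1920, max_h=1080):
    """Split a width x height image into tiles no larger than max_w x max_h.

    Returns a list of (x, y, w, h) tuples in row-major order, where (x, y)
    is the top-left corner of each region.
    """
    cols = math.ceil(width / max_w)
    rows = math.ceil(height / max_h)
    regions = []
    for r in range(rows):
        for c in range(cols):
            x = c * max_w
            y = r * max_h
            # The last column/row may be narrower if the size is not a multiple.
            w = min(max_w, width - x)
            h = min(max_h, height - y)
            regions.append((x, y, w, h))
    return regions

# Assumed layout: four 4k channels side by side (hypothetical panorama size).
regions = split_into_regions(15360, 2160)
```

Under the assumed layout this yields 8 columns by 2 rows, which agrees with the 16 regions mentioned in step 1-1.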
Step 2 includes:
Step 2-1: apply fast median filtering (ZHANG Li, CHEN Zhi-qiang, GAO Wen-huan, et al. Mean-based fast median filter[J]. Journal of Tsinghua University: Science and Technology, 2004, 44(9): 1157-1159.) to the panoramic video image to suppress background noise;
Step 2-2: initialize the background model of the panoramic video image using ViBE (Visual Background Extractor; BARNICH O, DROOGENBROECK M V. ViBe: A universal background subtraction algorithm for video sequences[J]. IEEE Transactions on Image Processing, 2011, 20(06): 1709-1724.), with the background modeling parameter threshold T_n used as the Euclidean distance threshold of the ViBE algorithm.
Step 2-3: adaptively adjust the background modeling parameter threshold T_n according to the degree of dynamic change of the background to complete the background model update. T_n decides whether a pixel belongs to the background, and a value that is too large or too small degrades the background model. To describe target motion accurately, the threshold is adjusted adaptively by the degree of dynamic change, and the background change parameter φ(x,y) is defined as:
Figure PCTCN2020090155-appb-000001
where f(i,j) is the pixel value of the current frame at position (i,j), d(i,j) is the pixel value of the background model at position (i,j), M is the width of the current frame image, and N is its height.
Set the background change factor parameter μ. When the current pixel value matches the background model, compute φ(x,y): in a static scene φ(x,y) settles to a stable value, whereas in a dynamic scene φ(x,y) is larger. The background modeling parameter threshold T_n is then updated adaptively according to:
Figure PCTCN2020090155-appb-000002
where T_n' is the adaptively adjusted threshold, β is the dynamic adjustment factor, and μ and β are both fixed parameters.
Step 2-4: process the flickering pixels in the background model to complete background image generation. A flickering pixel is a pixel of the background image produced by background modeling that frequently jumps back and forth between background point and foreground point. An index level table of flickering pixels is built: if a pixel is an edge contour point of the background image (reference: Kass M, Witkin A, Terzopoulos D. Snakes: Active contour models[J]. International Journal of Computer Vision, 1988, 1(4): 321-331.) but differs from the edge contour points of the previous frame's background image, its flicker frequency level is increased by
Figure PCTCN2020090155-appb-000003
otherwise its flicker frequency level is decreased by
Figure PCTCN2020090155-appb-000004
If the flicker frequency level of a pixel exceeds S_NK over K consecutive frames of background images, the pixel is judged to be a flickering pixel and is removed from the updated background image.
Step 2-5: subtract the background image obtained in step 2-4 from the panoramic video image to generate the candidate target image Im_obj; the candidate target region is this candidate target image.
Step 3 includes:
Step 3-1: apply fast median filtering (ZHANG Li, CHEN Zhi-qiang, GAO Wen-huan, et al. Mean-based fast median filter[J]. Journal of Tsinghua University: Science and Technology, 2004, 44(9): 1157-1159.) to the candidate target image Im_obj to generate the image Im_mf;
Step 3-2: apply morphological dilation (Haralick R, Zhuang X. Image analysis using mathematical morphology[J]. IEEE Trans. on Pattern Analysis and Machine Intelligence, 1987, 9(4): 532-550.) to the filtered image Im_mf to generate the image Im_do, then AND Im_do with the candidate target image Im_obj to generate the enhanced candidate target image Im_obj2;
Step 3-3: apply a morphological closing operation (Haralick R, Zhuang X. Image analysis using mathematical morphology[J]. IEEE Trans. on Pattern Analysis and Machine Intelligence, 1987, 9(4): 532-550.) to the image Im_obj2, extract the connected domains of the candidate targets, compute the minimum bounding rectangle of each connected domain, and extract the candidate target boxes;
Step 3-4: compute the shape features of each candidate target box, namely its width obj_w, height obj_h and aspect ratio obj_wh, and check whether they satisfy obj_w > w_0, obj_h > h_0, obj_wh ≥ wh_0 and obj_wh ≤ wh_1. A candidate target box that fails these conditions is judged a false target and deleted; a box that satisfies them generates a target point trace. Here w_0 is the target box width threshold, h_0 is the target box height threshold, and wh_1 and wh_0 are the upper and lower target aspect ratio thresholds. A target point trace comprises the frame number, target position coordinates, target width, target height, target aspect ratio and target area.
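The connected-domain and bounding-rectangle extraction of step 3-3 can be illustrated with a small pure-Python sketch. It assumes the enhanced candidate image has already been binarized into a list of rows of 0/1 values, uses 8-connectivity, and returns axis-aligned rectangles; these choices and the name `bounding_boxes` are assumptions made here, since the text does not fix them. A production system would more likely rely on an image processing library.

```python
from collections import deque

def bounding_boxes(mask):
    """Return (x, y, w, h) for each 8-connected component of nonzero pixels."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            # Breadth-first search over one component, tracking its extent.
            seen[r0][c0] = True
            queue = deque([(r0, c0)])
            rmin = rmax = r0
            cmin = cmax = c0
            while queue:
                r, c = queue.popleft()
                rmin, rmax = min(rmin, r), max(rmax, r)
                cmin, cmax = min(cmin, c), max(cmax, c)
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = r + dr, c + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and mask[nr][nc] and not seen[nr][nc]):
                            seen[nr][nc] = True
                            queue.append((nr, nc))
            boxes.append((cmin, rmin, cmax - cmin + 1, rmax - rmin + 1))
    return boxes

# Two separate blobs in a toy 4 x 6 binary image.
example_mask = [
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1, 0],
]
example_boxes = bounding_boxes(example_mask)
```

Each returned tuple uses the same (x, y, w, h) convention as the region coordinates in step 1-3, so the width and height can feed the shape test of step 3-4 directly.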
Step 4 includes:
Step 4-1: generate the target track Tr_i from the target point trace Po_i extracted from the first frame of the panoramic video image. Specifically, the batch number BN generated automatically for the target point trace structure is placed into the target track structure vector; BN increments automatically and satisfies 1 ≤ BN ≤ 9999. A target track comprises the frame number, target position coordinates, target width, target height, target aspect ratio and target area;
Step 4-2: compute the absolute distance D_{i+1} between each target point trace Po_{i+1} extracted from the next frame of the panoramic video image and the target track Tr_i, where D_{i+1} is calculated as:
Figure PCTCN2020090155-appb-000005
where Po_{i+1}(x) and Po_{i+1}(y) are the abscissa and ordinate of the target point trace, and Tr_i(x) and Tr_i(y) are the abscissa and ordinate of the target track;
If D_{i+1} ≤ DT, add the target point trace Po_{i+1} to the target track Tr_i; if D_{i+1} > DT, generate a new target track Tr_{i+1} from Po_{i+1} as in step 4-1, where DT is the absolute distance decision threshold;
Step 4-3: determine from the track information whether the current target is in the multi-channel video cross-coverage state, and use the fast correlation filter method (Henriques J F, Rui C, Martins P, et al. High-speed tracking with kernelized correlation filters[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2015, 37(3): 583-596.) to manage the tracks of cross-screen targets.
In step 4-3, determining from the track information whether the current target is in the multi-channel video cross-coverage state includes: when the horizontal position of the target in the i-th panoramic video frame I_i is greater than the threshold w_1 and its horizontal track velocity is positive, and at the same time the horizontal position of the target in the (i+1)-th panoramic video frame I_{i+1} is less than the threshold w_2 and its horizontal track velocity is negative, the target track is judged to have reached the image edge, that is, the target is in the multi-channel video cross-coverage state; I_i and I_{i+1} are adjacent consecutive images.
Step 4-4: correct the track information over consecutive frames to achieve stable multi-target tracking.
Step 4-4 includes: store the track data of N_k consecutive frames of panoramic video images, and take a weighted average of the track data of the current frame
Figure PCTCN2020090155-appb-000006
and the track data predicted from its preceding N_k-1 frames
Figure PCTCN2020090155-appb-000007
to generate the corrected track data
Figure PCTCN2020090155-appb-000008
as follows:
Figure PCTCN2020090155-appb-000009
where x is the horizontal position coordinate of the target in the track data, y is the vertical position coordinate of the target, w is the target width, h is the target height, and σ_1 and σ_2 are weighting factors satisfying σ_1 + σ_2 = 1.
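The weighted correction of step 4-4 might look like the sketch below. The exact formula is only available as an image in the source, so this is an assumption-laden illustration: the corrected state is taken as σ_1 times the current state plus σ_2 times a predicted state, and the prediction is assumed to be the plain mean of the preceding N_k-1 frames. The function name `correct_track` and the default weight 0.6 are hypothetical.

```python
def correct_track(current, history, sigma1=0.6):
    """Blend the current track state with a prediction from earlier frames.

    current: (x, y, w, h) of the current frame.
    history: list of (x, y, w, h) from the preceding N_k - 1 frames.
    sigma1:  weight of the current frame (hypothetical default);
             sigma2 = 1 - sigma1 so that the weights sum to 1.
    The prediction is the mean of `history`, which is an assumption here.
    """
    if not history:
        # Nothing to predict from: keep the current state unchanged.
        return tuple(float(v) for v in current)
    sigma2 = 1.0 - sigma1
    n = len(history)
    predicted = [sum(state[k] for state in history) / n for k in range(4)]
    return tuple(sigma1 * current[k] + sigma2 * predicted[k] for k in range(4))
```

A larger σ_1 follows the newest detection more closely, while a larger σ_2 smooths jitter at the cost of lag.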
Beneficial effects: the present invention discloses a real-time target detection and tracking method based on panoramic multi-channel 4k ultra-high-definition video images, which addresses the high false alarm rate and low robustness of panoramic target detection and tracking. Region-wise block processing sets the background modeling thresholds, adaptive background modeling then extracts candidate target regions and point traces, and dynamic track management finally achieves stable multi-target tracking in panoramic video. Verification tests in a variety of scenes show excellent detection and tracking performance, with a target detection rate above 90% and an average processing time below 40 ms, confirming the effectiveness of the invention.
Description of the drawings
The present invention is described in further detail below with reference to the accompanying drawing and specific embodiments, and the above and other advantages of the invention will become clearer.
Figure 1 is a flow chart of the method according to the present invention.
Detailed description
The present invention is further described below with reference to the drawing and embodiments.
As shown in Figure 1, according to an embodiment of the present invention, the real-time target detection and tracking method based on multi-channel 4k video images includes the following steps:
Step 1: divide the panoramic 4-channel 4k video image into 16 regions, collect multi-frame target statistics for each of the 16 regions, grade each region of the panoramic video according to the statistical probability of targets, and set the background modeling parameter thresholds of the 16 regions according to their grades;
Step 2: apply fast median filtering to the panoramic video image, initialize the background model, adaptively adjust the background modeling parameter threshold according to the degree of dynamic change of the background to complete the background update, then process flickering pixels to complete background image generation, and finally use a frame difference operation to extract the foreground candidate target region;
Step 3: apply fast median filtering to the candidate target region image, use morphological operations to extract the enhanced target region, compute the connected domains of the enhanced candidate target region and the minimum bounding rectangle of each connected domain, and remove false candidate target boxes using target shape features to form target point traces;
Step 4: run detection on consecutive frames of the panoramic video to obtain target point traces, manage target tracks dynamically by evaluating the absolute distance between each target point trace and the target track together with the multi-channel video cross-coverage state, and correct the track information over consecutive frames to achieve stable multi-target tracking.
In the present invention, step 1 includes:
Step 1-1: according to the size and scene coverage of the panoramic 4-channel 4k video image, divide the panoramic video image into 16 regions of width and height W_n × H_n, where the region width W_n ≤ 1920 and the region height H_n ≤ 1080;
Step 1-2: use the frame difference method (ZHOU Y, JI J, SONG K. A Moving Target Detection Method Based on Improved Frame Difference Background Modeling[J]. Open Cybernetics & Systemics Journal, 2014) to count how often moving targets appear in the panoramic video image over 200000 frames of video, and grade each region S_n into one of four levels A, B, C and D by that frequency: a region in which 20000 or more frames contain a moving target is a level-A image region; a region with between 10000 and 20000 such frames is a level-B image region; a region with between 5000 and 10000 such frames is a level-C image region; and a region with between 1000 and 5000 such frames is a level-D image region. The index n of region S_n ranges over [1, 16]. Each region has exactly one level and each level corresponds to one threshold, so the 16 regions have 16 thresholds in total;
Step 1-3: merge regions of adjacent levels and record the panoramic position coordinates (x_n, y_n, w_n, h_n) of each region S_n, where (x_n, y_n) is the top-left corner of S_n and (w_n, h_n) is its width and height.
Step 1-4: set the background modeling parameter threshold T_n for each region S_n according to its level. Typical values are T_nA = 30, T_nB = 25, T_nC = 20 and T_nD = 15, where T_nA, T_nB, T_nC and T_nD are the thresholds set for a region S_n of level A, B, C and D respectively. For example, if moving targets appear in 22000 of the 200000 video frames for region S_1, then T_1 = 30.
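The grading of steps 1-2 and 1-4 amounts to a table lookup, sketched below with the frame counts and thresholds of this embodiment. Two points are assumptions made for the sketch: each lower bound is treated as inclusive, and a region with fewer than 1000 target frames, which the text does not cover, falls back to the default T_n of 20 mentioned in step 2-2. The names `grade_region` and `background_threshold` are hypothetical.

```python
# Lower bounds (frames containing a moving target, out of 200000) for
# levels A-D, and the typical thresholds T_n given in the embodiment.
LEVEL_BOUNDS = [("A", 20000), ("B", 10000), ("C", 5000), ("D", 1000)]
LEVEL_THRESHOLD = {"A": 30, "B": 25, "C": 20, "D": 15}
DEFAULT_THRESHOLD = 20  # assumption: fallback when no level applies

def grade_region(frames_with_target):
    """Return 'A'..'D' for a region, or None if it is below the D bound."""
    for level, lower in LEVEL_BOUNDS:
        if frames_with_target >= lower:  # inclusive bound is an assumption
            return level
    return None

def background_threshold(frames_with_target):
    """Map a region's target-frame count to its background threshold T_n."""
    level = grade_region(frames_with_target)
    return LEVEL_THRESHOLD.get(level, DEFAULT_THRESHOLD)

# The worked example from step 1-4: 22000 target frames in region S_1.
t_1 = background_threshold(22000)
```

Busier regions receive a larger distance threshold, so more pixel variation is tolerated as background before a pixel is declared foreground.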
In the present invention, step 2 includes:
Step 2-1: apply fast median filtering (ZHANG Li, CHEN Zhi-qiang, GAO Wen-huan, et al. Mean-based fast median filter[J]. Journal of Tsinghua University: Science and Technology, 2004, 44(9): 1157-1159.) to the panoramic video image to suppress background noise;
Step 2-2: initialize the background model of the panoramic video using ViBE (Visual Background Extractor; BARNICH O, DROOGENBROECK M V. ViBe: A universal background subtraction algorithm for video sequences[J]. IEEE Transactions on Image Processing, 2011, 20(06): 1709-1724.), with the background modeling parameter threshold T_n used as the Euclidean distance threshold of the ViBE algorithm; T_n defaults to 20.
Step 2-3: adaptively adjust the background modeling parameter threshold T_n according to the degree of dynamic change of the background to complete the background model update. T_n decides whether a pixel belongs to the background, and a value that is too large or too small degrades the background model. To describe target motion accurately, the threshold is adjusted adaptively by the degree of dynamic change, and the background change parameter φ(x,y) is defined as:
Figure PCTCN2020090155-appb-000010
where f(i,j) is the pixel value of the current frame at (i,j), d(i,j) is the pixel value of the background model at (i,j), M is the width of the current frame image and N is its height, with M = 3840 and N = 2160. Set the background change factor parameter μ. When the current pixel value matches the background model, compute φ(x,y): in a static scene φ(x,y) settles to a stable value, whereas in a dynamic scene φ(x,y) is larger. The background modeling parameter threshold T_n is then updated adaptively according to:
Figure PCTCN2020090155-appb-000011
where T_n' is the adaptively adjusted threshold, β is the dynamic adjustment factor, and μ and β are both fixed parameters; typical values are μ = 0.8 and β = 0.2.
Step 2-4: process the flickering pixels in the background model to complete background image generation. A flickering pixel is a pixel of the background image produced by background modeling that frequently jumps back and forth between background point and foreground point. An index level table of flickering pixels is built: when an edge contour point of the background image (Kass M, Witkin A, Terzopoulos D. Snakes: Active contour models[J]. International Journal of Computer Vision, 1988, 1(4): 321-331.) differs from the edge contour points of the previous frame's background image, its flicker frequency level is increased by
Figure PCTCN2020090155-appb-000012
and when it is the same as the previous edge contour point, its flicker frequency level is decreased by
Figure PCTCN2020090155-appb-000013
If the flicker frequency level of a pixel exceeds S_NK over K consecutive background images, the pixel is judged to be a flickering pixel and is removed from the updated background image. Here K = 50,
Figure PCTCN2020090155-appb-000014
and S_NK = 10.
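The flicker bookkeeping of step 2-4 can be sketched as a per-pixel counter. The increment and decrement amounts are given only as images in the source, so `inc` and `dec` below are hypothetical parameters, and clamping the level at zero is likewise an assumption. The edge maps are assumed to be precomputed boolean grids marking edge contour points of the current and previous background images; only S_NK = 10 and K = 50 come from the text.

```python
def update_flicker_levels(levels, edge_now, edge_prev, inc=2, dec=1):
    """Update per-pixel flicker levels in place for one background frame.

    levels, edge_now and edge_prev are equally sized 2D lists. A pixel that
    is an edge contour point now but was not in the previous frame gains
    `inc`; every other pixel loses `dec` (never dropping below zero).
    inc and dec are hypothetical step sizes, not values from the patent.
    """
    for r, row in enumerate(levels):
        for c in range(len(row)):
            if edge_now[r][c] and not edge_prev[r][c]:
                row[c] += inc
            else:
                row[c] = max(0, row[c] - dec)
    return levels

def flicker_mask(levels, s_nk=10):
    """Mark pixels whose flicker level exceeds S_NK as flickering."""
    return [[level > s_nk for level in row] for row in levels]
```

In use, `update_flicker_levels` would be called once per background frame over a window of K = 50 frames, after which `flicker_mask` identifies the pixels to drop from the updated background image.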
Step 2-5: subtract the background image extracted for the single frame from the original image to generate the candidate target image Im_obj, completing candidate target extraction.
In the present invention, step 3 includes:
Step 3-1: apply fast median filtering (ZHANG Li, CHEN Zhi-qiang, GAO Wen-huan, et al. Mean-based fast median filter[J]. Journal of Tsinghua University: Science and Technology, 2004, 44(9): 1157-1159.) to the candidate target image Im_obj to generate the image Im_mf;
Step 3-2: apply morphological dilation to the filtered image Im_mf to generate the image Im_do, then AND Im_do with the candidate target image Im_obj to generate the enhanced candidate target image Im_obj2;
Step 3-3: apply a morphological closing operation to the image Im_obj2, extract the connected domains of the candidate targets, compute the minimum bounding rectangle of each connected domain, and extract the candidate target boxes;
Step 3-4: compute the shape features of each candidate target box, namely its width obj_w, height obj_h and aspect ratio obj_wh, and check whether they satisfy obj_w > w_0, obj_h > h_0, obj_wh ≥ wh_0 and obj_wh ≤ wh_1. A candidate target box that fails these conditions is judged a false target; a box that satisfies them generates a target point trace. Here w_0 is the target box width threshold, h_0 is the target box height threshold, and wh_1 and wh_0 are the upper and lower target aspect ratio thresholds; typically w_0 = 10, h_0 = 10, wh_1 = 5 and wh_0 = 1. The point trace data comprises the frame number, target position coordinates, target width, target height, target aspect ratio and target area.
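The shape test of step 3-4 is a direct set of comparisons and can be sketched as below, using the typical thresholds of this embodiment as defaults. The dictionary layout of a point trace and the function names are assumptions made for illustration; the fields mirror the list given in the text.

```python
def is_valid_box(obj_w, obj_h, w0=10, h0=10, wh0=1.0, wh1=5.0):
    """Return True if a candidate box passes the step 3-4 shape test."""
    if obj_h <= 0:
        return False  # guard against division by zero
    obj_wh = obj_w / obj_h  # aspect ratio: width over height
    return obj_w > w0 and obj_h > h0 and wh0 <= obj_wh <= wh1

def make_point_trace(frame_no, box):
    """Build a point-trace record from an (x, y, w, h) box (layout assumed)."""
    x, y, w, h = box
    return {"frame": frame_no, "x": x, "y": y, "w": w, "h": h,
            "aspect": w / h, "area": w * h}

def boxes_to_point_traces(frame_no, boxes):
    """Drop false targets and convert the remaining boxes to point traces."""
    return [make_point_trace(frame_no, b) for b in boxes
            if is_valid_box(b[2], b[3])]
```

With the default lower ratio wh_0 = 1, boxes taller than they are wide are rejected, which suits wide targets such as vehicles; other target classes would need different thresholds.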
In the present invention, step 4 includes:
Step 4-1: generate the target track Tr_i from the target point trace Po_i extracted from the first frame of video. Specifically, the batch number BN generated automatically for the target point trace structure is placed into the target track structure vector; BN increments automatically and satisfies 1 ≤ BN ≤ 9999. A target track comprises the frame number, target position coordinates, target width, target height, target aspect ratio and target area.
Step 4-2: compute the absolute distance D_{i+1} between each target point trace Po_{i+1} extracted from the next frame of video and the target track Tr_i, where D_{i+1} is calculated as:
Figure PCTCN2020090155-appb-000015
其中,Po i+1(x)为目标点迹的x坐标,Po i+1(y)为目标点迹的y坐标,Tr i(x)为目标航迹的x坐标,Tr i(y)为目标航迹的y坐标。 Among them, Po i+1 (x) is the x coordinate of the target point trace, Po i+1 (y) is the y coordinate of the target point trace, Tr i (x) is the x coordinate of the target track, Tr i (y) Is the y coordinate of the target track.
If D_{i+1} ≤ DT, the target point trace Po_{i+1} is added to the target track Tr_i. If D_{i+1} > DT, the target point trace Po_{i+1} starts a new target track Tr_{i+1} as in step 4-1. DT is the absolute distance decision threshold, typically 15.
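The association rule of step 4-2 can be sketched as below. The distance formula itself is only available as an image, so the Euclidean distance used here is an assumption based on the x and y terms named in the text. The dictionary layout, the nearest-track choice when several tracks are within DT, and the way new batch numbers are picked are likewise illustrative.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]
DT = 15.0  # typical absolute-distance threshold from step 4-2


def distance(trace_xy: Point, track_xy: Point) -> float:
    # ASSUMPTION: Euclidean distance; the source formula is an image.
    return math.hypot(trace_xy[0] - track_xy[0], trace_xy[1] - track_xy[1])


def associate(tracks: Dict[int, List[Point]], traces: List[Point],
              dt: float = DT) -> Dict[int, List[Point]]:
    """Attach each new point trace to the nearest track within dt, else start a track.

    `tracks` maps a batch number to the list of positions of that track;
    the last entry is the track's position in the previous frame.
    """
    next_bn = max(tracks, default=0) + 1  # hypothetical numbering scheme
    # Snapshot positions so traces of the same frame are not compared to each other.
    last_pos = {bn: pts[-1] for bn, pts in tracks.items()}
    for p in traces:
        best = min(last_pos, key=lambda bn: distance(p, last_pos[bn]), default=None)
        if best is not None and distance(p, last_pos[best]) <= dt:
            tracks[best].append(p)
        else:
            tracks[next_bn] = [p]
            next_bn += 1
    return tracks
```

A point trace about 9.4 pixels from an existing track joins that track, while one hundreds of pixels away from every track opens a new one.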
Step 4-3: Use the track information to decide whether the current target is in the multi-channel video cross-coverage state, and use the fast correlation filtering method (Henriques J F, Caseiro R, Martins P, et al. High-speed tracking with kernelized correlation filters[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2015, 37(3): 583-596.) to manage the tracks of targets that cross between screens. The cross-coverage state is determined as follows: when the horizontal position of the target in image I_1 is greater than w_1 and the horizontal track velocity of the target is positive, the target track is judged to have reached the image edge; at the same time, when the horizontal position of the target in image I_2 is less than w_2 and the horizontal track velocity of the target is negative, the target track is also judged to have reached the image edge. Typically w_1 = 3800 and w_2 = 50.
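The edge test of step 4-3 reduces to two comparisons per image, sketched below. The function names are hypothetical, and combining the two edge tests with a logical AND follows the "at the same time" wording. The correlation-filter tracking that follows the decision is not shown.

```python
W1, W2 = 3800, 50  # typical edge thresholds from step 4-3


def at_right_edge(x: float, vx: float, w1: float = W1) -> bool:
    """Target in image I_1 is beyond w1 and moving in the positive x direction."""
    return x > w1 and vx > 0


def at_left_edge(x: float, vx: float, w2: float = W2) -> bool:
    """Target in image I_2 is before w2 and moving in the negative x direction."""
    return x < w2 and vx < 0


def in_cross_coverage(x1: float, vx1: float, x2: float, vx2: float) -> bool:
    """Both edge conditions hold at the same time (assumed conjunction)."""
    return at_right_edge(x1, vx1) and at_left_edge(x2, vx2)
```

The strict inequalities mirror the text: a target exactly at w_1 or w_2, or one with zero horizontal velocity, is not treated as being at the edge.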
Step 4-4: Correct the track information over consecutive frames to achieve stable multi-target tracking. The correction method is: store the track data of N_k consecutive video frames, then take a weighted average of the track data of the current frame
Figure PCTCN2020090155-appb-000016
and the predicted track data of the preceding N_k - 1 frames
Figure PCTCN2020090155-appb-000017
to generate the corrected track data
Figure PCTCN2020090155-appb-000018
The specific operation is as follows:
Figure PCTCN2020090155-appb-000019
where
Figure PCTCN2020090155-appb-000020
is the corrected track data, x is the horizontal position coordinate of the target in the track data, y is the vertical position coordinate of the target in the track data, w is the target width in the track data, h is the target height in the track data, and σ_1 and σ_2 are weighting factors satisfying σ_1 + σ_2 = 1. Typical values are N_k = 25, σ_1 = 0.3, and σ_2 = 0.7.
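The weighted-average correction can be sketched as follows. The exact formula is only available as an image, so two things here are assumptions: that σ_1 weights the mean of the preceding N_k - 1 frames while σ_2 weights the current frame, and that the stored history holds the raw per-frame states. The `TrackSmoother` class name is hypothetical.

```python
from collections import deque
from typing import Deque, Tuple

State = Tuple[float, float, float, float]  # (x, y, w, h)


class TrackSmoother:
    """Blends the current track state with the mean of up to N_k - 1 earlier states."""

    def __init__(self, n_k: int = 25, sigma1: float = 0.3, sigma2: float = 0.7):
        if abs(sigma1 + sigma2 - 1.0) > 1e-9:
            raise ValueError("weights must satisfy sigma1 + sigma2 = 1")
        self.sigma1, self.sigma2 = sigma1, sigma2
        self.history: Deque[State] = deque(maxlen=n_k - 1)

    def update(self, current: State) -> State:
        if self.history:
            n = len(self.history)
            mean_prev = tuple(sum(s[k] for s in self.history) / n for k in range(4))
            # ASSUMPTION: sigma1 weights the history, sigma2 the current frame.
            corrected = tuple(
                self.sigma1 * m + self.sigma2 * c
                for m, c in zip(mean_prev, current)
            )
        else:
            corrected = tuple(float(c) for c in current)  # nothing to blend with yet
        self.history.append(current)
        return corrected
```

For example, if the previous state has x = 100 and the current one has x = 110, the corrected x is 0.3 × 100 + 0.7 × 110 = 107 under these assumptions.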
The present invention provides a real-time target detection and tracking method based on panoramic multi-channel 4k video images. There are many ways to implement this technical solution, and the above is only a preferred embodiment of the present invention. It should be noted that those of ordinary skill in the art can make several improvements and refinements without departing from the principle of the present invention, and such improvements and refinements shall also be regarded as falling within the protection scope of the present invention. Any component not explicitly described in this embodiment can be implemented with existing technology.

Claims (9)

  1. A real-time target detection and tracking method based on panoramic multi-channel 4k video images, characterized by comprising the following steps:
    Step 1: divide the panoramic multi-channel 4k video image into n regions, collect multi-frame target statistics for each region, grade each region of the panoramic video according to the statistical probability of targets, and set the background modeling parameter threshold of each region according to its grade;
    Step 2: apply median filtering to the panoramic video image, initialize the background model, adaptively adjust the background modeling parameter threshold according to the degree of dynamic change of the background, complete the background update, then process flickering pixels to complete background image generation, and finally use a frame difference operation to generate the foreground candidate target region image;
    Step 3: apply median filtering to the candidate target region image, use morphological operations to extract the enhanced candidate target region, compute the connected domains of the enhanced candidate target region and the minimum bounding rectangle of each connected domain, and eliminate false candidate target frames by target shape features to form target point traces;
    Step 4: perform detection on consecutive frames of the panoramic video image to obtain target point traces, manage target tracks dynamically by judging the absolute distance between target point traces and target tracks and the multi-channel video cross-coverage state, and correct the track information over consecutive frames to achieve stable multi-target tracking.
  2. The method of claim 1, wherein step 1 comprises the following steps:
    Step 1-1: according to the panoramic video image size and the scene coverage, divide the panoramic video image into n regions, the n-th region being denoted S_n, where the width of each region is less than or equal to 1920 and the height of each region is greater than or equal to 1080;
    Step 1-2: use the frame difference method to count how often moving targets appear in the panoramic video image over K video frames, and divide the n regions into four grades A, B, C, and D according to that frequency, where a region containing moving targets in more than K_1 frames is a grade A image region, a region containing moving targets in more than K_2 and fewer than K_1 frames is a grade B image region, a region containing moving targets in more than K_3 and fewer than K_2 frames is a grade C image region, and a region containing moving targets in more than K_4 and fewer than K_3 frames is a grade D image region;
    Step 1-3: merge adjacent image regions of the same grade and record the panoramic position coordinates of each region, the panoramic position coordinates of the n-th region S_n being (x_n, y_n, w_n, h_n), where (x_n, y_n) are the coordinates of the upper-left corner of the n-th region S_n, and w_n and h_n are the width and height of the n-th region S_n;
    Step 1-4: set a corresponding background modeling parameter threshold for each of the n regions, the background modeling parameter threshold of the n-th region S_n being T_n.
  3. The method of claim 2, wherein step 2 comprises the following steps:
    Step 2-1: apply fast median filtering to the panoramic video image to eliminate the influence of background noise;
    Step 2-2: initialize the background model of the panoramic video image, the background model being built with ViBE, where the background modeling parameter threshold T_n is used as the Euclidean distance threshold of the ViBE algorithm;
    Step 2-3: adaptively adjust the background modeling parameter threshold T_n according to the degree of dynamic change of the background to complete the background model update;
    Step 2-4: process the flickering pixels in the background model to complete background image generation;
    Step 2-5: subtract the background image obtained in step 2-4 from the panoramic video image to generate the candidate target image Im_obj, the candidate target region being the candidate target image.
  4. The method of claim 3, wherein step 2-3 comprises:
    the background modeling parameter threshold T_n is used to decide whether a pixel belongs to the background, and the background change parameter φ(x, y) is defined as:
    Figure PCTCN2020090155-appb-100001
    where f(i, j) is the pixel value of the current frame at position (i, j), d(i, j) is the pixel value of the background model at position (i, j), M is the width of the current frame image, and N is the height of the current frame image;
    a background change factor parameter μ is set; when the current pixel value is successfully matched to the background model, the value of φ(x, y) is calculated; for a static scene φ(x, y) tends toward a stable value, whereas for a dynamic scene φ(x, y) is large; the background modeling parameter threshold T_n is then adaptively updated according to:
    Figure PCTCN2020090155-appb-100002
    where T_n' is the adaptively adjusted threshold, β is the dynamic adjustment factor, and μ and β are both fixed parameters.
  5. The method of claim 4, wherein step 2-4 comprises:
    for a pixel in the background image generated by background modeling, if the pixel is an edge contour point of the background image but differs from the edge contour points in the previous frame's background image, the flicker frequency level is increased
    Figure PCTCN2020090155-appb-100003
    otherwise the flicker frequency level is decreased
    Figure PCTCN2020090155-appb-100004
    if the flicker frequency level over K consecutive frames of background image is greater than S_NK, the pixel is judged to be a flickering pixel, and the flickering pixel is removed from the updated background image.
  6. The method of claim 5, wherein step 3 comprises the following steps:
    Step 3-1: apply median filtering to the candidate target image Im_obj to generate the image Im_mf;
    Step 3-2: apply a morphological dilation operation to the image Im_mf to generate the image Im_do, then perform an AND operation between the image Im_do and the candidate target image Im_obj to generate the enhanced candidate target image Im_obj2;
    Step 3-3: apply a morphological closing operation to the image Im_obj2, extract the connected domains of the candidate targets, compute the minimum bounding rectangle of each connected domain, and extract the candidate target frames;
    Step 3-4: calculate the shape features of each candidate target frame, the shape features comprising the width obj_w, the height obj_h, and the aspect ratio obj_wh of the target frame, and check whether the current candidate target frame satisfies obj_w > w_0, obj_h > h_0, obj_wh ≥ wh_0, and obj_wh ≤ wh_1; if these conditions are not met, the current candidate target frame is judged to be a false target and is deleted; each candidate target frame that meets the conditions generates a target point trace, where w_0 is the target frame width threshold, h_0 is the target frame height threshold, and wh_1 and wh_0 are the high and low thresholds of the target aspect ratio, respectively; the target point trace comprises the frame number, target position coordinates, target width, target height, target aspect ratio, and target area.
  7. The method of claim 6, wherein step 4 comprises the following steps:
    Step 4-1: generate the target tracks Tr_i from the target point traces Po_i extracted from the first panoramic video frame, specifically: each target point trace structure is automatically assigned a batch number BN and placed into the vector of target track structures, BN being incremented automatically and satisfying 1 ≤ BN ≤ 9999, and the target track comprising the frame number, target position coordinates, target width, target height, target aspect ratio, and target area;
    Step 4-2: for each target point trace Po_{i+1} extracted from the next panoramic video frame, calculate its absolute distance D_{i+1} to the target track Tr_i, the absolute distance D_{i+1} being calculated as:
    Figure PCTCN2020090155-appb-100005
    where Po_{i+1}(x) is the abscissa of the target point trace, Po_{i+1}(y) is the ordinate of the target point trace, Tr_i(x) is the abscissa of the target track, and Tr_i(y) is the ordinate of the target track;
    if D_{i+1} ≤ DT, the target point trace Po_{i+1} is added to the target track Tr_i; if D_{i+1} > DT, the target point trace Po_{i+1} starts a new target track Tr_{i+1} as in step 4-1, where DT is the absolute distance decision threshold;
    Step 4-3: use the track information to decide whether the current target is in the multi-channel video cross-coverage state, and manage the tracks of targets that cross between screens;
    Step 4-4: correct the track information over consecutive frames to achieve stable multi-target tracking.
  8. The method of claim 7, wherein, in step 4-3, deciding from the track information whether the current target is in the multi-channel video cross-coverage state comprises:
    when the horizontal position of the target in the i-th panoramic video frame I_i is greater than the threshold w_1 and the horizontal track velocity of the target is positive, and at the same time the horizontal position of the target in the (i+1)-th panoramic video frame I_{i+1} is less than the threshold w_2 and the horizontal track velocity of the target is negative, the target track is determined to have reached the image edge, that is, to be in the multi-channel video cross-coverage state, where the panoramic video images I_i and I_{i+1} are adjacent consecutive images.
  9. The method of claim 8, wherein step 4-4 comprises:
    storing the track data of N_k consecutive panoramic video frames, and taking a weighted average of the track data of the current frame
    Figure PCTCN2020090155-appb-100006
    and the predicted track data of the preceding N_k - 1 frames
    Figure PCTCN2020090155-appb-100007
    to generate the corrected track data
    Figure PCTCN2020090155-appb-100008
    Figure PCTCN2020090155-appb-100009
    where x is the horizontal position coordinate of the target in the track data, y is the vertical position coordinate of the target in the track data, w is the target width in the track data, h is the target height in the track data, and σ_1 and σ_2 are weighting factors satisfying σ_1 + σ_2 = 1.
PCT/CN2020/090155 2019-07-23 2020-05-14 Real-time target detection and tracking method based on panoramic multichannel 4k video images WO2021012757A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910665691.XA CN110517288B (en) 2019-07-23 2019-07-23 Real-time target detection tracking method based on panoramic multi-path 4k video images
CN201910665691.X 2019-07-23

Publications (1)

Publication Number Publication Date
WO2021012757A1 true WO2021012757A1 (en) 2021-01-28

Family

ID=68623454

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/090155 WO2021012757A1 (en) 2019-07-23 2020-05-14 Real-time target detection and tracking method based on panoramic multichannel 4k video images

Country Status (2)

Country Link
CN (1) CN110517288B (en)
WO (1) WO2021012757A1 (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112967305A (en) * 2021-03-24 2021-06-15 南京莱斯电子设备有限公司 Image cloud background detection method under complex sky scene
CN113128342A (en) * 2021-03-19 2021-07-16 中国人民解放军战略支援部队信息工程大学 Track data preprocessing method and aerial target identification method
CN113283326A (en) * 2021-05-17 2021-08-20 南京航空航天大学 Video SAR target intelligent detection method based on simulation target bright line characteristics
CN113379761A (en) * 2021-05-25 2021-09-10 广州市东崇科技有限公司 Multi-AGV and automatic door linkage method and system based on artificial intelligence
CN113674259A (en) * 2021-08-26 2021-11-19 中冶赛迪重庆信息技术有限公司 Belt conveyor slip detection method and system, electronic equipment and medium
CN114090168A (en) * 2022-01-24 2022-02-25 麒麟软件有限公司 Self-adaptive adjusting method for image output window of QEMU (QEMU virtual machine)
CN114360296A (en) * 2021-12-15 2022-04-15 中国飞行试验研究院 Full-automatic airplane approach landing process monitoring method based on foundation photoelectric equipment
CN114612506A (en) * 2022-02-19 2022-06-10 西北工业大学 Simple, efficient and anti-interference high-altitude parabolic track identification and positioning method
CN114821542A (en) * 2022-06-23 2022-07-29 小米汽车科技有限公司 Target detection method, target detection device, vehicle and storage medium
CN117214617A (en) * 2022-07-12 2023-12-12 安徽省万企天成科技有限公司 Smart power grid fault real-time monitoring and positioning system and method

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110517288B (en) * 2019-07-23 2021-11-02 南京莱斯电子设备有限公司 Real-time target detection tracking method based on panoramic multi-path 4k video images
CN111833377B (en) * 2020-06-02 2023-09-29 杭州电子科技大学 TBD-based detection method for small moving target in complex environment
CN112257569B (en) * 2020-10-21 2021-11-19 青海城市云大数据技术有限公司 Target detection and identification method based on real-time video stream
CN112700657B (en) * 2020-12-21 2023-04-28 阿波罗智联(北京)科技有限公司 Method and device for generating detection information, road side equipment and cloud control platform
CN113191221B (en) * 2021-04-15 2022-04-19 浙江大华技术股份有限公司 Vehicle detection method and device based on panoramic camera and computer storage medium
CN114650453B (en) * 2022-04-02 2023-08-15 北京中庆现代技术股份有限公司 Target tracking method, device, equipment and medium applied to classroom recording and broadcasting

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102831620A (en) * 2012-08-03 2012-12-19 南京理工大学 Infrared dim target searching and tracking method based on multi-hypothesis tracking data association
US8542875B2 (en) * 2010-09-17 2013-09-24 Honeywell International Inc. Image processing based on visual attention and reduced search based generated regions of interest
CN103400117A (en) * 2013-07-29 2013-11-20 电子科技大学 Method for positioning and tracking personnel in well on basis of compute vision
CN110517288A (en) * 2019-07-23 2019-11-29 南京莱斯电子设备有限公司 Real-time target detecting and tracking method based on panorama multichannel 4k video image

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101783015B (en) * 2009-01-19 2013-04-24 北京中星微电子有限公司 Equipment and method for tracking video
US8482452B2 (en) * 2010-08-26 2013-07-09 Lawrence Livermore National Security, Llc Synthetic aperture integration (SAI) algorithm for SAR imaging
US8885885B2 (en) * 2012-10-05 2014-11-11 International Business Machines Corporation Multi-cue object association
CN105872370B (en) * 2016-03-31 2019-01-15 深圳力维智联技术有限公司 Video stabilization method and device
CN106251362B (en) * 2016-07-15 2019-02-01 南京莱斯电子设备有限公司 A kind of sliding window method for tracking target and system based on fast correlation neighborhood characteristics point
US10412395B2 (en) * 2017-03-10 2019-09-10 Raytheon Company Real time frame alignment in video data

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113128342A (en) * 2021-03-19 2021-07-16 中国人民解放军战略支援部队信息工程大学 Track data preprocessing method and aerial target identification method
CN112967305A (en) * 2021-03-24 2021-06-15 南京莱斯电子设备有限公司 Image cloud background detection method under complex sky scene
CN112967305B (en) * 2021-03-24 2023-10-13 南京莱斯电子设备有限公司 Image cloud background detection method under complex sky scene
CN113283326A (en) * 2021-05-17 2021-08-20 南京航空航天大学 Video SAR target intelligent detection method based on simulation target bright line characteristics
CN113283326B (en) * 2021-05-17 2024-04-19 南京航空航天大学 Video SAR target intelligent detection method based on simulation target bright line characteristics
CN113379761B (en) * 2021-05-25 2023-04-28 重庆顺多利机车有限责任公司 Linkage method and system of multiple AGVs and automatic doors based on artificial intelligence
CN113379761A (en) * 2021-05-25 2021-09-10 广州市东崇科技有限公司 Multi-AGV and automatic door linkage method and system based on artificial intelligence
CN113674259B (en) * 2021-08-26 2024-03-05 中冶赛迪信息技术(重庆)有限公司 Belt conveyor slip detection method, system, electronic equipment and medium
CN113674259A (en) * 2021-08-26 2021-11-19 中冶赛迪重庆信息技术有限公司 Belt conveyor slip detection method and system, electronic equipment and medium
CN114360296A (en) * 2021-12-15 2022-04-15 中国飞行试验研究院 Full-automatic airplane approach landing process monitoring method based on foundation photoelectric equipment
CN114360296B (en) * 2021-12-15 2024-04-09 中国飞行试验研究院 Full-automatic aircraft approach landing process monitoring method based on foundation photoelectric equipment
CN114090168A (en) * 2022-01-24 2022-02-25 麒麟软件有限公司 Self-adaptive adjusting method for image output window of QEMU (QEMU virtual machine)
CN114612506A (en) * 2022-02-19 2022-06-10 西北工业大学 Simple, efficient and anti-interference high-altitude parabolic track identification and positioning method
CN114612506B (en) * 2022-02-19 2024-03-15 西北工业大学 Simple, efficient and anti-interference high-altitude parabolic track identification and positioning method
CN114821542A (en) * 2022-06-23 2022-07-29 小米汽车科技有限公司 Target detection method, target detection device, vehicle and storage medium
CN117214617A (en) * 2022-07-12 2023-12-12 安徽省万企天成科技有限公司 Smart power grid fault real-time monitoring and positioning system and method

Also Published As

Publication number Publication date
CN110517288B (en) 2021-11-02
CN110517288A (en) 2019-11-29

Similar Documents

Publication Publication Date Title
WO2021012757A1 (en) Real-time target detection and tracking method based on panoramic multichannel 4k video images
CN103246896B (en) A kind of real-time detection and tracking method of robustness vehicle
CN110415208B (en) Self-adaptive target detection method and device, equipment and storage medium thereof
CN105654508B (en) Monitor video method for tracking moving target and system based on adaptive background segmentation
CN109086724B (en) Accelerated human face detection method and storage medium
CN106709938B (en) Based on the multi-target tracking method for improving TLD
CN109359563A (en) A kind of road occupying phenomenon real-time detection method based on Digital Image Processing
CN112927262B (en) Camera lens shielding detection method and system based on video
Lian et al. A novel method on moving-objects detection based on background subtraction and three frames differencing
CN105513053A (en) Background modeling method for video analysis
Landabaso et al. Foreground regions extraction and characterization towards real-time object tracking
CN112364865A (en) Method for detecting small moving target in complex scene
CN110363197B (en) Video region of interest extraction method based on improved visual background extraction model
CN103578121B (en) Method for testing motion based on shared Gauss model under disturbed motion environment
Almomani et al. Segtrack: A novel tracking system with improved object segmentation
Fu et al. An effective background subtraction method based on pixel change classification
Zheng et al. An automatic moving object detection algorithm for video surveillance applications
Li et al. Moving vehicle detection based on an improved interframe difference and a Gaussian model
CN113066077B (en) Flame detection method and device
CN109063600A (en) Human motion method for tracing and device based on face recognition
Wu et al. Adaptive Detection of Moving Vehicle Based on On-line Clustering.
JP7096175B2 (en) Object extraction method and device
CN106447685A (en) Infrared tracking method
Liang et al. Adaptive dual threshold based moving target detection algorithm
Zhen-Jie et al. Research on Detection and Tracking of Moving Vehicles in Complex Environment Based on Real-Time Surveillance Video

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20845052

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20845052

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 10/11/2022)
