WO2021136001A1 - Codebook principle-based efficient video moving object detection method - Google Patents

Codebook principle-based efficient video moving object detection method

Info

Publication number
WO2021136001A1
WO2021136001A1 (PCT/CN2020/137988, CN2020137988W)
Authority
WO
WIPO (PCT)
Prior art keywords
pixel
image
histogram
channel
codebook
Prior art date
Application number
PCT/CN2020/137988
Other languages
French (fr)
Chinese (zh)
Inventor
许野平
井焜
刘辰飞
陈英鹏
朱爱红
Original Assignee
神思电子技术股份有限公司
Priority date
Filing date
Publication date
Application filed by 神思电子技术股份有限公司
Publication of WO2021136001A1 publication Critical patent/WO2021136001A1/en

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/136Segmentation; Edge detection involving thresholding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/194Segmentation; Edge detection involving foreground-background segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10024Color image

Definitions

  • The invention belongs to the field of machine vision and particularly relates to an efficient video moving target detection method based on the Codebook principle.
  • The Codebook moving target detection method can effectively overcome video background interference.
  • Its main disadvantages are: (1) as the video picture changes, memory must be frequently allocated and released, and on unattended equipment memory reclamation affects system reliability and real-time performance; (2) when the video background drifts because of factors such as lighting, the Codebook method gradually fails, the background information has to be relearned, and moving targets cannot be detected during that period; (3) the Codebook method is slow, which makes it hard to run on low-end hardware.
  • "Image processing method, device and computer-readable storage medium" (Publication No. 109427067A) discloses an image processing method comprising: establishing a Codebook model in RGB space based on the codebook algorithm; using the established Codebook model to detect whether each pixel of the image under test belongs to the foreground or the background, obtaining a detection result; using a belief propagation algorithm to sum the message values passed to each pixel from multiple neighborhood directions and normalizing the sum to obtain a probability value, where a message value characterizes the continuity between a pixel and its neighboring pixels; and using the probability value to correct the detection result.
  • That invention can reduce the noise of the Codebook method and improve detection accuracy; neither the problem it solves nor the method it adopts is the same as in the present invention.
  • "Monitoring area intrusion method based on multi-layer Codebook" discloses a monitoring area intrusion method based on a multi-layer Codebook. The video image used for background modeling serves as a temporary background model. When the training time reaches a given value Tm, the eight-neighborhoods of the background pixels in the temporary background model are searched to form connected domains; when a connected domain satisfies the area threshold Sm and the access frequency Fm, all pixels of the connected domain are added to the permanent background model and deleted from the temporary background model. Each pixel of the image under test is then looked up in the permanent background model; if no corresponding pixel exists, the pixel is determined to be foreground.
  • That invention can effectively prevent isolated noise from being added to the permanent background model and effectively handles false alarms caused by sudden light changes from lightning and train headlights. It can reduce the noise of the Codebook method and improve detection accuracy; neither the problem it solves nor the method it adopts is the same as in the present invention.
  • "Multi-level dictionary set-based no-reference image quality evaluation method" (Application No. 201610273831.5) discloses a no-reference quality evaluation method based on multi-level dictionary coding, which mainly addresses the mismatch between computer evaluation of noisy images and human visual perception.
  • Its implementation steps are: 1. divide the image database; 2. extract the feature vector of a single experimental sample; 3. compute the feature-vector quality value of one degraded image in the training set; 4. compute the feature vectors of all training samples; 5. compute the quality values of the feature vectors of all degraded images in the training set; 6. build the first-level dictionary set from the feature vectors of the training-set reference images; 7. build the second-level dictionary set from the feature vectors of the training-set degraded images; 8. compute the quality value of each cluster center in the second-level dictionary set; 9. project the test sample onto the second-level dictionary set to compute its quality value; 10. judge sample quality from that quality value.
  • The evaluation results of that invention agree with human perception and can be used for image screening, transmission and compression on the Internet. It can reduce the noise of the Codebook method and improve detection accuracy; neither the problem it solves nor the method it adopts is the same as in the present invention.
  • "A foreground detection method fusing superpixels and background models" (Publication No. 105825234A) discloses a foreground detection method that fuses superpixels with a Codebook background model. Superpixel segmentation groups the pixels of the video image into superpixel blocks, and a Codebook background model is built only for the clustering center of each block, so no separate model is needed for every pixel, which effectively saves the memory required by the background model.
  • In the foreground detection stage only the clustering centers are examined, which greatly shortens detection time and meets the requirements of a real-time monitoring platform. Because it detects only clustering centers, however, that invention increases the possibility of missed targets and lowers the detection accuracy of the Codebook method.
  • The present invention provides an efficient video moving target detection method based on the Codebook principle, which mainly solves the following problems: (1) how to use memory of a fixed size, avoiding frequent allocation and release and eliminating the system lag caused by memory management; (2) how to cope with background model failure caused by gradual lighting changes over time, so that the device can work continuously for long periods without relearning the background; (3) how to simplify the calculation of the Codebook method and increase its running speed.
  • The invention discloses an efficient video moving target detection method based on the Codebook principle, comprising:
  • A video frame is composed of pixels, and each pixel is composed of several channel components; each channel of each pixel has a histogram of fixed size. For an image of resolution W x L with C channels per pixel, a statistical histogram H[W][L][C][D] is constructed and initialized to 0, where W is the image width, L is the image height, C is the number of channels per pixel, and D is the total number of channel brightness levels.
  • When the pixel histograms of a new image frame are updated, an increment factor serves as the histogram increment unit; specifically: H[x][y][c][d] = H[x][y][c][d] + T, where (x, y) is the coordinate of the pixel in the image, c is the channel number of the pixel, d is the brightness value of pixel (x, y) in channel c, T is the brightness increment factor, and R is the forgetting factor. T may be initialized to a small real value; R is determined by the desired forgetting speed: if the weight that the current image contributes to the histogram should drop to 1/m after n frames, then R^n = m, i.e. R = m^(1/n).
  • The codebook data structure of the Codebook method is replaced with a histogram of fixed memory size, which avoids frequent memory allocation and release operations.
  • The present invention adopts a 16x3-dimensional histogram structure per pixel, achieving direct addressing at low cost and higher running efficiency.
  • Figure 1 is a schematic flow diagram of the present invention.
  • The device hardware is a PC running Windows 7. The PC is connected to a network camera through a network cable, and the camera's video stream uses the H.264 encoding format.
  • A video frame is composed of pixels, and each pixel is composed of several channel components; each channel of each pixel has a histogram of fixed size.
  • A VGA-resolution black-and-white video image, such as that of a thermal imaging camera, has a resolution of 640x480 pixels; each pixel consists of a single brightness channel whose component value usually ranges from 0 to 255.
  • A color high-definition video image has a resolution of 1920x1080 pixels; each pixel consists of red, green and blue primary color channels, each of whose component values usually ranges from 0 to 255.
  • For the VGA black-and-white signal the histogram structure is H[640][480][1][256], which can be simplified to H[640][480][256]: image width 640 pixels, image height 480 pixels, 1 pixel channel, 256 channel brightness levels.
  • For the color high-definition signal the histogram structure is H[1920][1080][3][256]: image width 1920 pixels, image height 1080 pixels, 3 pixel channels, 256 channel brightness levels.
  • Each time a frame is received, the brightness value of every channel of every pixel is accumulated into the corresponding histogram cell: H[x][y][c][d] = H[x][y][c][d] + T, followed by T = T * R, where (x, y) is the coordinate of the pixel in the image, c is the channel number of the pixel, d is the brightness value of pixel (x, y) in channel c, T is the brightness accumulation factor, and R is the forgetting factor.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

A Codebook principle-based efficient video moving object detection method, mainly intended to eliminate the system delay caused by memory management, simplify the calculation of the Codebook method, and increase its running speed. The method comprises: acquiring video frames in real time from a video signal source, where a video frame consists of pixels, each pixel consists of several channel components, and each channel of each pixel has a histogram of fixed size; when the pixel histograms of a new image frame are updated, using an increment factor as the histogram increment unit; for each newly received image frame, determining whether each pixel of the image belongs to the foreground or the background; and, before the next image frame is received, multiplying the increment factor by a forgetting factor, i.e. T = T*R. During detection the histograms are compared directly, which is more efficient.

Description

An efficient video moving target detection method based on the Codebook principle

Technical Field

The invention belongs to the field of machine vision and particularly relates to an efficient video moving target detection method based on the Codebook principle.

Background Art

The Codebook moving target detection method can effectively overcome video background interference. Its main disadvantages are: (1) as the video picture changes, memory must be frequently allocated and released, and on unattended equipment memory reclamation affects system reliability and real-time performance; (2) when the video background drifts gradually because of factors such as lighting, the Codebook method gradually fails, the background information has to be relearned, and moving targets cannot be detected during that period; (3) the Codebook method is slow, which makes it hard to run on low-end hardware.
"Image processing method, device and computer-readable storage medium" (Publication No. 109427067A) discloses an image processing method comprising: establishing a Codebook model in RGB space based on the codebook algorithm; using the established Codebook model to detect whether each pixel of the image under test belongs to the foreground or the background, obtaining a detection result; using a belief propagation algorithm to sum the message values passed to each pixel from multiple neighborhood directions and normalizing the sum to obtain a probability value, where a message value characterizes the continuity between a pixel and its neighboring pixels; and using the probability value to correct the detection result. That invention can reduce the noise of the Codebook method and improve detection accuracy; neither the problem it solves nor the method it adopts is the same as in the present invention.
"Monitoring area intrusion method based on multi-layer Codebook" (Publication No. 107341816A) discloses a monitoring area intrusion method based on a multi-layer Codebook. The video image used for background modeling serves as a temporary background model. When the training time reaches a given value Tm, the eight-neighborhoods of the background pixels in the temporary background model are searched to form connected domains; when a connected domain satisfies the area threshold Sm and the access frequency Fm, all pixels of the connected domain are added to the permanent background model and deleted from the temporary background model. Each pixel of the image under test is then looked up in the permanent background model; if no corresponding pixel exists, the pixel is determined to be foreground. That invention can effectively prevent isolated noise from being added to the permanent background model and effectively handles false alarms caused by sudden light changes from lightning and train headlights. It can reduce the noise of the Codebook method and improve detection accuracy; neither the problem it solves nor the method it adopts is the same as in the present invention.
"An image processing method based on improved Codebook foreground detection" (Application No. 201610452894.7) discloses an image processing method based on improved Codebook foreground detection, characterized by converting the RGB color space to the YCbCr color space, improving the Codebook foreground detection algorithm, and applying the improved algorithm for foreground detection. The method separates foreground from background well while reducing the influence of illumination changes on detection, reducing memory consumption and improving performance. Its computational load, however, is far higher than that of the normal Codebook method, so its hardware requirements are excessive.
"Multi-level dictionary set-based no-reference image quality evaluation method" (Application No. 201610273831.5) discloses a no-reference quality evaluation method based on multi-level dictionary coding, which mainly addresses the mismatch between computer evaluation of noisy images and human visual perception. Its implementation steps are: 1. divide the image database; 2. extract the feature vector of a single experimental sample; 3. compute the feature-vector quality value of one degraded image in the training set; 4. compute the feature vectors of all training samples; 5. compute the quality values of the feature vectors of all degraded images in the training set; 6. build the first-level dictionary set from the feature vectors of the training-set reference images; 7. build the second-level dictionary set from the feature vectors of the training-set degraded images; 8. compute the quality value of each cluster center in the second-level dictionary set; 9. project the test sample onto the second-level dictionary set to compute its quality value; 10. judge sample quality from that quality value. The evaluation results of that invention agree with human perception and can be used for image screening, transmission and compression on the Internet. It can reduce the noise of the Codebook method and improve detection accuracy; neither the problem it solves nor the method it adopts is the same as in the present invention.
"A foreground detection method fusing superpixels and background models" (Publication No. 105825234A) discloses a foreground detection method that fuses superpixels with a Codebook background model. Superpixel segmentation groups the pixels of the video image into superpixel blocks, and a Codebook background model is built only for the clustering center of each block, so no separate model is needed for every pixel, which effectively saves the memory required by the background model. In the foreground detection stage only the clustering centers are examined, which greatly shortens detection time and meets the requirements of a real-time monitoring platform. Because it detects only clustering centers, however, that invention increases the possibility of missed targets and lowers the detection accuracy of the Codebook method.
Summary of the Invention

The present invention provides an efficient video moving target detection method based on the Codebook principle, which mainly solves the following problems: (1) how to use memory of a fixed size, avoiding frequent allocation and release and eliminating the system lag caused by memory management; (2) how to cope with background model failure caused by gradual lighting changes over time, so that the device can work continuously for long periods without relearning the background; (3) how to simplify the calculation of the Codebook method and increase its running speed.

The present invention is achieved through the following technical solutions.
The invention discloses an efficient video moving target detection method based on the Codebook principle, comprising:

Acquiring video frames in real time from a video signal source.

A video frame is composed of pixels, and each pixel is composed of several channel components; each channel of each pixel has a histogram of fixed size. For an image of resolution W x L with C channels per pixel, a statistical histogram H[W][L][C][D] is constructed and initialized to 0, where W is the image width, L is the image height, C is the number of channels per pixel, and D is the total number of channel brightness levels.

When the pixel histograms of a new image frame are updated, an increment factor serves as the histogram increment unit. Specifically, each time a frame is received, the brightness value of every channel of every pixel is accumulated into the corresponding histogram cell: H[x][y][c][d] = H[x][y][c][d] + T, where (x, y) is the coordinate of the pixel in the image, c is the channel number of the pixel, d is the brightness value of pixel (x, y) in channel c, T is the brightness increment factor, and R is the forgetting factor. T may be initialized to a small real value; R is determined by the desired forgetting speed: if the weight that the current image contributes to the histogram should drop to 1/m after n frames, then R^n = m, i.e. R = m^(1/n).

For a newly received image frame, each pixel is judged to be foreground or background as follows: let the brightness of pixel (x, y) in channel c be d; given a threshold P, if H[x][y][c][d] < P the pixel (x, y) is judged to be a foreground pixel, and if H[x][y][c][d] >= P for every channel c of pixel (x, y) the pixel is judged to be a background pixel.

Specifically, the decision threshold P of pixel (x, y) in channel c is set as P = max(H[x][y][c]) * 0.5, i.e. half of the maximum of the statistical histogram of pixel (x, y) in channel c.

Before the next image frame is received, the increment factor is multiplied by the forgetting factor, i.e. T = T * R. (An illustrative code sketch of one such processing cycle is given below.)
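The processing cycle just described (accumulate the frame into the histograms, classify each pixel, apply the forgetting factor) can be written as a short per-frame routine. The Python/NumPy sketch below is purely illustrative and not part of the patent text; the array shapes, the float32 cell type and the function names are assumptions made for the example.

```python
import numpy as np

def init_histograms(W, L, C, D):
    # Statistical histogram H[W][L][C][D], initialized to 0.
    return np.zeros((W, L, C, D), dtype=np.float32)

def process_frame(H, frame, T, R):
    """One processing cycle: accumulate, classify, forget.

    H     : histograms of shape (W, L, C, D)
    frame : integer image of shape (W, L, C) with brightness values in [0, D)
    T     : current brightness increment factor
    R     : forgetting factor
    Returns (foreground_mask, updated_T).
    """
    W, L, C, D = H.shape
    foreground = np.zeros((W, L), dtype=bool)
    for x in range(W):
        for y in range(L):
            is_background = True
            for c in range(C):
                d = int(frame[x, y, c])
                H[x, y, c, d] += T              # H[x][y][c][d] = H[x][y][c][d] + T
                P = H[x, y, c].max() * 0.5      # threshold: half the histogram maximum
                if H[x, y, c, d] < P:
                    is_background = False       # below threshold in this channel -> foreground
            foreground[x, y] = not is_background
    return foreground, T * R                    # T = T * R before the next frame
```

For example, for the VGA black-and-white configuration described below one would call init_histograms(640, 480, 1, 256) with T = 1.0 and R = 2**(1/1500), feeding each grayscale frame reshaped to (640, 480, 1).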
In the above method based on the Codebook principle, preferably, for a VGA-resolution black-and-white video signal the histogram accumulation is: H[x][y][d] = H[x][y][d] + T, T = T * R, with 0 <= x < 640, 0 <= y < 480, 0 <= d < 256; the initial value of T is 1.0 and R = 2^(1/1500) = 1.0004622.
In the above method based on the Codebook principle, preferably, for a color high-definition video signal the histogram accumulation is: H[x][y][c][d] = H[x][y][c][d] + T, T = T * R, with 0 <= x < 1920, 0 <= y < 1080, 0 <= c < 3, 0 <= d < 256; the initial value of T is 1.0 and R = 2^(1/1500) = 1.0004622.
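The preferred value of R follows directly from the relation R = m^(1/n) given above, with m = 2 and n = 1500: the current frame's relative weight halves after 1500 frames (roughly one minute of video at 25 fps, though the frame rate is not stated in the text). A one-line check, offered only as an illustrative calculation:

```python
n, m = 1500, 2
R = m ** (1 / n)          # R = m^(1/n)
print(round(R, 7))        # 1.0004622, the value used in the preferred embodiments
```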
Compared with the prior art, the beneficial effects of the present invention are:

(1) According to [2000], the codebook data structure of the Codebook method is replaced with a histogram of fixed memory size, avoiding frequent memory allocation and release operations.

(2) Updating the histogram with a forgetting factor automatically ages out historical data, avoiding the frequent background re-initialization required by the Codebook method.

(3) Updating a histogram is simpler than updating a codebook, and the detection step compares histograms directly, so the method is more efficient.

(4) Following the method of [3005], the present invention adopts a 16x3-dimensional histogram structure per pixel, achieving direct addressing at low cost and higher running efficiency (an illustrative sketch of one possible such layout follows).
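Paragraph [3005] is not reproduced on this page, so the exact meaning of the "16x3-dimensional histogram structure per pixel" is not spelled out here. One plausible reading is that each of the three color channels is quantized from 256 brightness levels into 16 bins, so that a bin index is obtained by direct addressing (a 4-bit shift) and per-pixel storage shrinks from 3x256 to 3x16 cells. The sketch below illustrates that reading only; the quantization step and all names in it are assumptions, not statements from the text shown here.

```python
import numpy as np

BINS = 16        # assumed quantization: 256 brightness levels -> 16 bins per channel
SHIFT = 4        # 256 / 16 = 2**4, so bin = brightness >> 4 (direct addressing, no search)

def init_quantized_histograms(W, L, C=3):
    # Fixed-size storage of W * L * C * 16 cells instead of W * L * C * 256.
    return np.zeros((W, L, C, BINS), dtype=np.float32)

def accumulate_quantized(H, frame, T):
    # frame: (W, L, C) uint8 image; each channel value maps to one of 16 bins.
    bins = frame >> SHIFT
    x, y, c = np.indices(frame.shape)
    H[x, y, c, bins] += T        # each pixel/channel hits exactly one bin per frame
```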
Description of the Drawings

Figure 1 is a schematic flow diagram of the present invention.

Detailed Description

The invention is described in detail below with reference to Figure 1.

(1) The device hardware is a PC running Windows 7. The PC is connected to a network camera through a network cable, and the camera's video stream uses the H.264 encoding format.
(2) Video frames are acquired in real time from the video signal source.

A video frame is composed of pixels, and each pixel is composed of several channel components; each channel of each pixel has a histogram of fixed size.

Specifically, a VGA-resolution black-and-white video image, such as that of a thermal imaging camera, has a resolution of 640x480 pixels; each pixel consists of a single brightness channel whose component value usually ranges from 0 to 255.

Specifically, a color video image, such as that of a high-definition camera, has a resolution of 1920x1080 pixels; each pixel consists of red, green and blue primary color channels, each of whose component values usually ranges from 0 to 255.
(3) For an image of resolution W x L with C channels per pixel, a statistical histogram H[W][L][C][D] is constructed and initialized to 0, where W is the image width, L is the image height, C is the number of channels per pixel, and D is the total number of channel brightness levels.

Specifically, for the VGA black-and-white video signal the histogram structure is H[640][480][1][256], which can be simplified to H[640][480][256]: image width 640 pixels, image height 480 pixels, 1 pixel channel, 256 channel brightness levels.

Specifically, for the color high-definition video signal the histogram structure is H[1920][1080][3][256]: image width 1920 pixels, image height 1080 pixels, 3 pixel channels, 256 channel brightness levels.
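Because the histogram dimensions are fixed by the resolution, the channel count and the number of brightness levels, the memory footprint is known in advance and never changes at run time. A back-of-the-envelope check, assuming 4-byte (float32) cells, which the text does not specify:

```python
vga_cells = 640 * 480 * 1 * 256       # H[640][480][256]      ->    78,643,200 cells
hd_cells  = 1920 * 1080 * 3 * 256     # H[1920][1080][3][256] -> 1,592,524,800 cells
print(vga_cells * 4 / 2**20)          # 300.0  -> about 300 MiB for the VGA model
print(hd_cells * 4 / 2**30)           # ~5.93  -> about 5.9 GiB for the full-HD color model
```

The footprint is large at full 256-level resolution, which is presumably what motivates the reduced 16x3-per-pixel structure mentioned in the beneficial effects; either way the allocation happens once at start-up, so there is no run-time allocation or release.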
Each time a frame is received, the brightness value of every channel of every pixel is accumulated into the corresponding histogram cell: H[x][y][c][d] = H[x][y][c][d] + T, T = T * R, where (x, y) is the coordinate of the pixel in the image, c is the channel number of the pixel, d is the brightness value of pixel (x, y) in channel c, T is the brightness accumulation factor, and R is the forgetting factor. T may be initialized to a small real value; R is determined by the desired forgetting speed: if the weight that the current image contributes to the histogram should drop to 1/m after n frames, then R^n = m, i.e. R = m^(1/n).

Specifically, for the VGA black-and-white video signal the histogram accumulation is: H[x][y][d] = H[x][y][d] + T, T = T * R, with 0 <= x < 640, 0 <= y < 480, 0 <= d < 256; the initial value of T is 1.0 and R = 2^(1/1500) = 1.0004622.

Specifically, for the color high-definition video signal the histogram accumulation is: H[x][y][c][d] = H[x][y][c][d] + T, T = T * R, with 0 <= x < 1920, 0 <= y < 1080, 0 <= c < 3, 0 <= d < 256; the initial value of T is 1.0 and R = 2^(1/1500) = 1.0004622.
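For completeness, the accumulation step for the two concrete configurations can also be written without explicit pixel loops. The NumPy sketch below is an illustrative vectorization under the same assumptions as earlier (float32 cells, width-first array ordering to match H[x][y]); it is not part of the patent text.

```python
import numpy as np

def accumulate_gray(H, frame, T, R):
    # H: (640, 480, 256) histograms; frame: (640, 480) uint8 grayscale image.
    x, y = np.indices(frame.shape)
    H[x, y, frame] += T                 # H[x][y][d] = H[x][y][d] + T
    return T * R                        # T = T * R

def accumulate_color(H, frame, T, R):
    # H: (1920, 1080, 3, 256) histograms; frame: (1920, 1080, 3) uint8 color image.
    x, y, c = np.indices(frame.shape)
    H[x, y, c, frame] += T              # H[x][y][c][d] = H[x][y][c][d] + T
    return T * R
```

Each pixel (and channel) contributes to exactly one histogram cell per frame, so the fancy-indexed in-place addition is safe; if several increments ever had to land in the same cell, np.add.at would be the safer choice.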
(4) For a newly received image frame, each pixel is judged to be foreground or background as follows: let the brightness of pixel (x, y) in channel c be d; given a threshold P, if H[x][y][c][d] < P the pixel (x, y) is judged to be a foreground pixel, and if H[x][y][c][d] >= P for every channel c of pixel (x, y) the pixel is judged to be a background pixel.

Specifically, the decision threshold P of pixel (x, y) in channel c is set as P = max(H[x][y][c]) * 0.5, i.e. half of the maximum of the statistical histogram of pixel (x, y) in channel c.
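The decision in step (4) also vectorizes naturally: the threshold P is half the maximum of each pixel/channel histogram, and a pixel is foreground as soon as any of its channels falls below that threshold. Again an illustrative sketch under the same assumptions as above, not patent text:

```python
import numpy as np

def classify(H, frame):
    # H: (W, L, C, D) histograms; frame: (W, L, C) integers in [0, D).
    # Returns a (W, L) boolean mask that is True for foreground pixels.
    x, y, c = np.indices(frame.shape)
    counts = H[x, y, c, frame]          # H[x][y][c][d] for the observed brightness d
    P = 0.5 * H.max(axis=3)             # per pixel/channel threshold, shape (W, L, C)
    return (counts < P).any(axis=2)     # foreground if below threshold in any channel
```

For the simplified grayscale model the same routine applies with C = 1, e.g. classify(H.reshape(640, 480, 1, 256), gray[:, :, None]).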

Claims (3)

  1. An efficient video moving target detection method based on the Codebook principle, characterized by comprising:
    [1001] acquiring video frames in real time from a video signal source;
    [2001] a video frame is composed of pixels, and each pixel is composed of several channel components; each channel of each pixel has a histogram of fixed size; for an image of resolution W x L with C channels per pixel, a statistical histogram H[W][L][C][D] is constructed and initialized to 0, where W is the image width, L is the image height, C is the number of channels per pixel, and D is the total number of channel brightness levels;
    [3001] when the pixel histograms of a new image frame are updated, an increment factor serves as the histogram increment unit; specifically, each time a frame is received, the brightness value of every channel of every pixel is accumulated into the corresponding histogram cell: H[x][y][c][d] = H[x][y][c][d] + T, where (x, y) is the coordinate of the pixel in the image, c is the channel number of the pixel, d is the brightness value of pixel (x, y) in channel c, T is the brightness increment factor, and R is the forgetting factor; T may be initialized to a small real value, and R is determined by the desired forgetting speed: if the weight that the current image contributes to the histogram should drop to 1/m after n frames, then R^n = m, i.e. R = m^(1/n);
    [4001] for a newly received image frame, each pixel is judged to be foreground or background as follows: the brightness of pixel (x, y) in channel c being d, given a threshold P, if H[x][y][c][d] < P the pixel (x, y) is judged to be a foreground pixel, and if H[x][y][c][d] >= P for every channel c of pixel (x, y) the pixel is judged to be a background pixel;
    the decision threshold P of pixel (x, y) in channel c is set as P = max(H[x][y][c]) * 0.5, i.e. half of the maximum of the statistical histogram of pixel (x, y) in channel c;
    before the next image frame is received, the increment factor is multiplied by the forgetting factor, i.e. T = T * R.
  2. The efficient video moving target detection method based on the Codebook principle according to claim 1, characterized in that, for a VGA-resolution black-and-white video signal, the histogram accumulation is: H[x][y][d] = H[x][y][d] + T, T = T * R; 0 <= x < 640, 0 <= y < 480, 0 <= d < 256; the initial value of T is 1.0 and R = 2^(1/1500) = 1.0004622.
  3. The efficient video moving target detection method based on the Codebook principle according to claim 1, characterized in that, for a color high-definition video signal, the histogram accumulation is: H[x][y][c][d] = H[x][y][c][d] + T, T = T * R; 0 <= x < 1920, 0 <= y < 1080, 0 <= c < 3, 0 <= d < 256; the initial value of T is 1.0 and R = 2^(1/1500) = 1.0004622.
PCT/CN2020/137988 2019-12-31 2020-12-21 Codebook principle-based efficient video moving object detection method WO2021136001A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911406901.X 2019-12-31
CN201911406901.XA CN111145219B (en) 2019-12-31 2019-12-31 Efficient video moving target detection method based on Codebook principle

Publications (1)

Publication Number Publication Date
WO2021136001A1 true WO2021136001A1 (en) 2021-07-08

Family

ID=70522427

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/137988 WO2021136001A1 (en) 2019-12-31 2020-12-21 Codebook principle-based efficient video moving object detection method

Country Status (2)

Country Link
CN (1) CN111145219B (en)
WO (1) WO2021136001A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111145219B (en) * 2019-12-31 2022-06-17 神思电子技术股份有限公司 Efficient video moving target detection method based on Codebook principle

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110200238A1 (en) * 2010-02-16 2011-08-18 Texas Instruments Incorporated Method and system for determining skinline in digital mammogram images
CN104182957A (en) * 2013-05-21 2014-12-03 北大方正集团有限公司 Traffic video detection information method and device
CN104820435A (en) * 2015-02-12 2015-08-05 武汉科技大学 Quadrotor moving target tracking system based on smart phone and method thereof
CN111145219A (en) * 2019-12-31 2020-05-12 神思电子技术股份有限公司 Efficient video moving target detection method based on Codebook principle

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101216942A (en) * 2008-01-14 2008-07-09 浙江大学 An increment type characteristic background modeling algorithm of self-adapting weight selection
EP2641401B1 (en) * 2010-11-15 2017-04-05 Huawei Technologies Co., Ltd. Method and system for video summarization
CN104067272A (en) * 2011-11-21 2014-09-24 诺基亚公司 Method for image processing and an apparatus

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110200238A1 (en) * 2010-02-16 2011-08-18 Texas Instruments Incorporated Method and system for determining skinline in digital mammogram images
CN104182957A (en) * 2013-05-21 2014-12-03 北大方正集团有限公司 Traffic video detection information method and device
CN104820435A (en) * 2015-02-12 2015-08-05 武汉科技大学 Quadrotor moving target tracking system based on smart phone and method thereof
CN111145219A (en) * 2019-12-31 2020-05-12 神思电子技术股份有限公司 Efficient video moving target detection method based on Codebook principle

Also Published As

Publication number Publication date
CN111145219B (en) 2022-06-17
CN111145219A (en) 2020-05-12

Similar Documents

Publication Publication Date Title
WO2018130016A1 (en) Parking detection method and device based on monitoring video
CN108734107B (en) Multi-target tracking method and system based on human face
CN107085714B (en) Forest fire detection method based on video
CN109635758B (en) Intelligent building site video-based safety belt wearing detection method for aerial work personnel
WO2017000465A1 (en) Method for real-time selection of key frames when mining wireless distributed video coding
WO2022027931A1 (en) Video image-based foreground detection method for vehicle in motion
CN110191320B (en) Video jitter and freeze detection method and device based on pixel time sequence motion analysis
CN103729858B (en) A kind of video monitoring system is left over the detection method of article
CN108564052A (en) Multi-cam dynamic human face recognition system based on MTCNN and method
US8553086B2 (en) Spatio-activity based mode matching
WO2006008944A1 (en) Image processor, image processing method, image processing program, and recording medium on which the program is recorded
KR20210006276A (en) Image processing method for flicker mitigation
CN112017445B (en) Pedestrian violation prediction and motion trail tracking system and method
CN112528861A (en) Foreign matter detection method and device applied to track bed in railway tunnel
WO2021136001A1 (en) Codebook principle-based efficient video moving object detection method
CN111460964A (en) Moving target detection method under low-illumination condition of radio and television transmission machine room
CN112887587B (en) Self-adaptive image data fast transmission method capable of carrying out wireless connection
JP3883250B2 (en) Surveillance image recording device
CN106339995A (en) Space-time multiple feature based vehicle shadow eliminating method
CN116342644A (en) Intelligent monitoring method and system suitable for coal yard
CN115830513A (en) Method, device and system for determining image scene change and storage medium
CN112532938B (en) Video monitoring system based on big data technology
Li et al. Image object detection algorithm based on improved Gaussian mixture model
RU2777883C1 (en) Method for highly efficient detection of a moving object on video, based on the principles of codebook
Chai et al. Fpga-based ROI encoding for HEVC video bitrate reduction

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20911022

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20911022

Country of ref document: EP

Kind code of ref document: A1