WO2015117464A1 - Video image processing apparatus and method - Google Patents

Video image processing apparatus and method

Info

Publication number
WO2015117464A1
Authority
WO
WIPO (PCT)
Prior art keywords
pixel
video image
background
video
current
Prior art date
Application number
PCT/CN2014/091796
Other languages
English (en)
French (fr)
Inventor
王溢
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2015117464A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/12 Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/167 Position within a video image, e.g. region of interest [ROI]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/186 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component

Definitions

  • The present invention relates to the field of communications technology, and in particular to a video image processing apparatus and method.
  • A problem often encountered when shooting video on a mobile phone is that, in video captured in a scene with very complex content, the content to be highlighted is often disturbed or occluded by other content. For example, a video of a person shot on a busy street is often disturbed by passing vehicles, and a video of scenery shot in front of a tourist attraction is often disturbed by people walking back and forth. Removing unwanted moving objects from such complex environments while keeping the desired background image is a very strong requirement. If an algorithm could let the phone automatically identify and obtain the required static background, it would meet this user need very well.
  • Embodiments of the present invention provide a video image processing apparatus and method.
  • Embodiments of the present invention provide a video image processing method, including the steps of:
  • if the pixel is marked as a background pixel, the pixel color value is not processed; if the pixel is marked as a non-background pixel, the pixel value is replaced with a value associated with the distribution model.
  • the step of converting the video image in the video image file into the sequence image in the RGB format includes:
  • the step of performing statistics on a segment of the sequence image to obtain a color model for each color channel of each pixel includes:
  • marking, by the distribution model, a pixel in the sequence image as a background pixel, or marking a pixel in the sequence image as a non-background pixel comprises:
  • By comparison with the confidence level, it is determined whether the color value of the current pixel conforms to the statistical regularity of the distribution model. If it does, the pixel is determined to be a background pixel; otherwise the pixel is determined to be a non-background pixel.
  • the method further includes the steps of:
  • An embodiment of the present invention further provides a video image processing apparatus, including:
  • a pre-processing module configured to convert a video image in a video image file into a sequence image in an RGB format
  • a pixel statistical modeling module configured to perform statistics on the sequence image to obtain a distribution model of color values of each color channel of each pixel in the sequence image;
  • a pixel value hypothesis checking module by which a pixel in the sequence image is marked as a background pixel, or a pixel in the sequence image is marked as a non-background pixel;
  • a background segmentation module configured to process the marked pixels: if the pixel is determined to be marked as a background pixel, the pixel color value is not processed; if the pixel is determined to be marked as a non-background pixel, the pixel value is replaced with a value associated with the distribution model.
  • the preprocessing module includes:
  • the video file reading module is configured to divide the video image file into a plurality of data blocks according to a certain size, and read the plurality of data blocks into the buffer;
  • An audio and video stream separation module configured to separate the video stream and the audio stream according to a compression standard by the video image file in each data block;
  • a video decoding module configured to decode the video stream according to a video compression format standard
  • the image format unified conversion module is configured to replace the decoded video image format with the RGB format.
  • the pixel statistical modeling module is specifically configured to:
  • the pixel value hypothesis verification module is specifically configured to:
  • By comparison with the confidence level, it is determined whether the color value of the current pixel conforms to the statistical regularity of the distribution model. If it does, the pixel is determined to be a background pixel; otherwise the pixel is determined to be a non-background pixel.
  • the video image processing apparatus further includes:
  • a display module configured to display an image of the output background pixel and the non-background pixel.
  • The video image processing method of the embodiment of the present invention processes the video image file so as to automatically identify and remove the moving foreground in the image sequence; the video output for display to the user is an image sequence of clean background with motion interference removed. This meets the user's need for a static background from which moving content has been removed.
  • FIG. 1 is a main flowchart of a video image processing method in an embodiment of the present invention.
  • FIG. 2 is a detailed flowchart of a video image processing method in an embodiment of the present invention.
  • FIG. 3 is a detailed flowchart of statistical modeling of pixels in a video image processing method according to an embodiment of the present invention.
  • FIG. 4 is a detailed flowchart showing a hypothesis test of pixel values in a video image processing method according to an embodiment of the present invention
  • Fig. 5 is a view showing the main block configuration of a video image processing apparatus in an embodiment of the present invention.
  • an embodiment of the present invention provides a video image processing method, including:
  • Step 1 converting the video image in the video image file into a sequence image in the RGB format
  • Step 2 performing statistics on the sequence image to obtain a distribution model of color values of each color channel of each pixel;
  • Step 3 by using the distribution model, marking a pixel in the sequence image as a background pixel, or marking a pixel in the sequence image as a non-background pixel;
  • Step 4 If the pixel is marked as a background pixel, the pixel color value is not changed; if the pixel is marked as a non-background pixel, the value of the distribution model is used instead of the pixel value.
  • The detailed flow of the video image processing method in the embodiment of the present invention is as follows: the video image file (including a video stream) is divided into a plurality of data blocks of a given size, and the data blocks are read into a buffer.
  • The video stream and the audio stream in each data block are separated according to the compression standard, and the pure audio stream is processed normally.
  • After the video stream is decoded, its format is converted to RGB; statistics are computed over a segment of the image sequence to obtain a distribution model of the color values of each color channel of each pixel; a statistical hypothesis test is then performed on the color value of each color channel of each pixel in the image sequence.
  • If the color value of the current pixel conforms to the statistical regularity of the distribution model, the pixel is determined to be a background pixel; otherwise it is determined to be a non-background pixel. Pixels marked as background keep their color values unchanged; pixels marked as non-background have their values replaced with a value derived from the distribution model. Both the background pixels and the non-background pixels are displayed and output.
  • The step of converting the video images in the video image file into a sequence of images in RGB format comprises: reading the video image file into the buffer in data blocks of a given size; separating the video stream and the audio stream in each data block according to the compression standard; decoding the video stream according to the video compression format standard; converting the decoded video image format to RGB; and a step of displaying and outputting the image marked as background pixels.
  • The implementation of these steps is essentially standardized. Different terminals differ only in buffer size and in decoding scheme (software or hardware decoding). Image format conversion is achieved by a matrix transformation based on the standard correspondence between formats. These points are not elaborated further here.
  • Computing statistics over a segment of the image sequence to obtain a distribution model of the color values of each color channel of each pixel specifically includes: obtaining the measurement probability of the current pixel at the current time in the image sequence; obtaining the self-information of the current color value of the current pixel from the adjacent pixels within a small neighborhood of the current pixel; and calculating the probability of occurrence of the current pixel color value.
  • Each sample is a three-dimensional vector x_i = (r_i, g_i, b_i), where r_i, g_i, b_i are the color values of the red, green and blue channels respectively.
  • The probability p(x_t) of the measurement x_t of the pixel at the current time t can be obtained from the following equation.
  • α_i is a weight coefficient
  • K_σ(x_t − x_i) is the distribution function of pixel color values
  • σ is the window radius
  • the distribution function of pixel color values can be chosen as a uniform distribution, a normal distribution, a triangular distribution, a binomial distribution, and so on.
  • The pixel value at a given point in the image sequence is affected by lighting, slight camera shake, interference from moving objects, and so on. The uniform, triangular and binomial distributions are clearly not well suited to describing the distribution of pixel color values, whereas the normal distribution describes it fairly well. Under the normal distribution, 68.268949% of the area lies within one standard deviation of the mean, 95.449974% within two standard deviations (2σ), 99.730020% within three standard deviations (3σ), and 99.993666% within four standard deviations (4σ). For a fixed pixel in the image sequence, the color value clearly fluctuates within some number of standard deviations; if the effects of lighting, shake and the like are ignored, the color value is a single fixed value, namely the mean of the normal distribution.
  • Through the above process, the probability of the current-time measurement of pixel x can be obtained.
  • Let y be any pixel in a small neighborhood of pixel x satisfying dis(x, y) ≤ δ, where dis(x, y) is the spatial distance between the two pixels x and y in the image.
  • δ is a constant.
  • Similarly, the probability p(y_t) of the current color value of pixel y can be obtained, as can the probability p(x_t|B_y) of the current color value of pixel x estimated from the samples of pixel y.
  • I_x represents the uncertainty of the random event that the current color value of pixel x is x_t.
  • The larger I_x is, the smaller the probability that the current color value of pixel x is x_t; the smaller that probability, the less the current color value fits the pixel color value distribution model and the more likely the pixel is non-background.
  • β is a coefficient
  • the pixels in the sequence image are marked as background pixels by the distribution model, or the pixels in the sequence image are marked as non-background pixels.
  • Marking the pixels in the image sequence by means of the distribution model specifically means: performing a statistical hypothesis test on the color value of each color channel of each pixel in the image sequence; if the color value of the current pixel conforms to the statistical regularity of the distribution model, the pixel is determined to be a background pixel, otherwise the pixel is determined to be a non-background pixel.
  • Marking, by the distribution model, a pixel in the sequence image as a background pixel, or marking a pixel in the sequence image as a non-background pixel comprises: calculating a modified self information of the current pixel; giving a confidence level of the hypothesis test; Compared with the confidence level, it is determined whether the color value of the current pixel conforms to the statistical rule of the distribution model. If the color value of the current pixel conforms to the statistical rule of the distribution model, the pixel is determined to be a background pixel, otherwise the pixel is determined to be a non-background pixel.
  • the step of marking the pixels in the sequence image by the distribution model is embodied as follows:
  • A hypothesis test in statistics can be used to test the hypothesis that a random variable follows a certain probability distribution: a test statistic is computed from sample data by a suitable statistical method and, according to a given probability criterion, one judges with small risk whether the estimated value differs significantly from the population value (or the estimated distribution from the actual distribution), and hence whether the null hypothesis should be accepted.
  • Applied to the embodiment of the present invention: once the color distribution model of the pixel values has been obtained, the color value of any pixel of the current frame in the image sequence can be tested for whether it conforms to the model with some high probability.
  • Only when the color value conforms to the model with high probability can the pixel be considered undisturbed by non-background content in the current frame, that is, a background pixel. The significance level preset for the test is taken as a relatively small value such as 0.05, meaning that the probability of wrongly rejecting the test hypothesis when it is true is 0.05.
  • In the embodiment of the present invention, this means the current pixel value conforms to the pixel color value distribution model with 95% probability, so the pixel is very probably a background pixel.
  • Accordingly, 95% can be set as the segmentation probability for distinguishing background pixels from foreground pixels.
  • The probability value can also change with the complexity of the actual scene. Specifically, the more complex the actual scene, the larger the variance of the normal distribution model and the smaller the probability value should be chosen.
  • In short, a probability threshold is set to test whether the current pixel fits the distribution model.
  • The marked pixels then undergo background segmentation: if the current pixel is marked as a background pixel, its color value is not processed; if the current pixel is marked as a non-background pixel, the pixel value is replaced with a value derived from the distribution model.
  • In the embodiment of the present invention the mean of the distribution model is used as the replacement; another value derived from the distribution model may also be used as the substitute pixel value.
  • The probability threshold of the hypothesis test module can be used to decide whether the current pixel conforms to the distribution model. According to the formula, the probability threshold can be mapped to a threshold I_th on I′_x.
  • If I′_x ≥ I_th, pixel x is considered non-background; otherwise pixel x is considered background. I_th is the segmentation threshold given by the user.
  • The embodiment of the present invention considers both the information of the pixel itself and the influence of neighboring pixel information on the central pixel, and uses the two jointly to separate the background region. Suppose a pixel x in the image is considered non-background when neighborhood information is ignored (I_x is relatively large), while the pixels in its neighborhood all indicate that it is background, that is, p(x_t) is much smaller than p(x_t|B_y). Then I_1, I_2, ..., I_m are all less than zero, so I′_x is smaller than I_x, which may make I′_x smaller than I_th so that the pixel is judged to be background.
  • Pixels marked as background keep their color values unchanged. A non-background pixel is motion interference in the current frame, so its value is replaced with a value derived from the distribution model in order to remove the interference.
  • In this way, background and non-background can be distinguished intelligently, frame by frame and pixel by pixel, and the video output for display to the user is an image sequence of clean background with motion interference removed.
  • an embodiment of the present invention provides a video image processing apparatus, including:
  • the pre-processing module 100 is configured to convert the video image in the video image file into a sequence image in an RGB format;
  • the pixel statistical modeling module 200 is configured to perform statistics on the sequence image to obtain a distribution model of color values of each color channel of each pixel;
  • the pixel value hypothesis checking module 300 is configured to mark a pixel in the sequence image as a background pixel by using a distribution model, or mark a pixel in the sequence image as a non-background pixel;
  • the background segmentation module 400 is configured to process the marked pixels: pixels marked as background pixels keep their color values unchanged; pixels marked as non-background pixels have their values replaced with a value associated with the distribution model.
  • the preprocessing module includes: a video file reading module configured to divide the video image file into a plurality of data blocks according to a certain size, and read the plurality of data blocks into the buffer;
  • An audio and video stream separation module configured to separate the video stream and the audio stream according to a compression standard by the video image file in each data block;
  • a video decoding module configured to decode the video stream according to a video compression format standard
  • the image format unified conversion module is configured to replace the decoded video image format with the RGB format.
  • the video file reading module, audio and video stream separation module and video decoding module have become standardized modules on many terminals with video functions.
  • The implementation is essentially standardized; different terminals differ only in buffer size and in decoding scheme (software or hardware decoding), so these points are not elaborated further here.
  • The unified image format conversion module handles image format conversion, which is achieved by a matrix transformation based on the standard correspondence between formats.
  • The pixel statistical modeling module is specifically configured to: obtain the measurement probability of the current pixel at the current time in the image sequence; obtain the self-information of the current color value of the current pixel from the adjacent pixels within a small neighborhood of the current pixel; and calculate the probability of occurrence of the current pixel color value.
  • The pixel value hypothesis test module is specifically configured to: calculate the modified self-information of the current pixel; specify the confidence level of the hypothesis test; and determine whether the color value of the current pixel conforms to the statistical regularity of the distribution model. If it does, the pixel is judged to be a background pixel; otherwise the pixel is judged to be a non-background pixel.
  • the video image processing apparatus also includes a display module configured to display an image that is output by the background pixels and the non-background pixels.
  • embodiments of the present invention can be provided as a method, system, or computer program product. Accordingly, the present invention can take the form of a hardware embodiment, a software embodiment, or a combination of software and hardware. Moreover, the invention can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage and optical storage, etc.) including computer usable program code.
  • These computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising an instruction apparatus, the instruction apparatus implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing device, such that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Discrete Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Image Analysis (AREA)

Abstract

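The present invention provides a video image processing apparatus and method. The method includes: converting the video images in a video image file into a sequence of images in RGB format; computing statistics over the image sequence to obtain a distribution model of the color values of each color channel of each pixel; using the distribution model to judge each pixel to be a background pixel or a non-background pixel; if the pixel is marked as a background pixel, leaving its color value unprocessed; and if the pixel is marked as a non-background pixel, replacing the pixel value with a value related to the distribution model.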

Description

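Video image processing apparatus and method
Technical Field
The present invention relates to the field of communications technology, and in particular to a video image processing apparatus and method.
Background
As smart terminals have become widespread, users' demands on smart-terminal applications have grown increasingly diverse, the pixel counts of mobile phone cameras keep rising, and built-in phone cameras show a strong tendency to replace traditional cameras.
As phone camera functions grow more powerful, more and more people use a mobile phone instead of a traditional still camera or camcorder when traveling. Phones are light and convenient, and as phone processor performance improves, photo and video post-processing can be integrated directly into the phone, so that users can conveniently use a phone to accomplish, or even surpass, what traditional cameras could do.
A problem often encountered when shooting video on a mobile phone is that, in video captured in a scene with very complex content, the content that should stand out is often disturbed or occluded by other content. For example, when filming a person on a busy street, passing vehicles frequently interfere; when filming scenery in front of a tourist attraction, people walking back and forth frequently interfere. Removing unwanted moving objects from such complex environments while keeping the desired background image is a very strong requirement. If an algorithm could let the phone automatically identify and obtain the static background that is needed, it would meet this user need very well.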
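Summary of the Invention
To overcome the technical problems in the prior art, embodiments of the present invention provide a video image processing apparatus and method.
An embodiment of the present invention provides a video image processing method, including the steps of:
converting the video images in a video image file into a sequence of images in RGB format;
computing statistics over the image sequence to obtain a distribution model of the color values of each color channel of each pixel in the image sequence;
marking, by means of the distribution model, each pixel in the image sequence as a background pixel or as a non-background pixel;
if the pixel is marked as a background pixel, leaving its color value unprocessed; if the pixel is marked as a non-background pixel, replacing the pixel value with a value related to the distribution model.
In one embodiment, in the video image processing method, the step of converting the video images in the video image file into a sequence of images in RGB format includes:
dividing the video image file into a plurality of data blocks of a given size, and reading the data blocks into a buffer;
separating the video stream and the audio stream in each data block according to the compression standard;
decoding the video stream according to the video compression format standard;
converting the decoded video image format into RGB format.
In one embodiment, in the video image processing method, the step of computing statistics over a segment of the image sequence to obtain a distribution model of the color values of each color channel of each pixel specifically includes:
obtaining the measurement probability of the current pixel at the current time in the image sequence;
obtaining the self-information of the current color value of the current pixel from the adjacent pixels within a small neighborhood of the current pixel;
calculating the probability of occurrence of the current pixel color value.
In one embodiment, in the video image processing method, the step of marking, by means of the distribution model, each pixel in the image sequence as a background pixel or as a non-background pixel includes:
calculating the modified self-information of the current pixel;
specifying the confidence level of the hypothesis test;
judging, by comparison with the confidence level, whether the color value of the current pixel conforms to the statistical regularity of the distribution model; if it does, the pixel is judged to be a background pixel, otherwise the pixel is judged to be a non-background pixel.
In one embodiment, in the video image processing method, after the step in which the color value of a pixel marked as background is left unprocessed and the value of a pixel marked as non-background is replaced with a value related to the distribution model, the method further includes the step of:
displaying and outputting the image of background pixels and non-background pixels.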
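An embodiment of the present invention further provides a video image processing apparatus, including:
a preprocessing module, configured to convert the video images in a video image file into a sequence of images in RGB format;
a pixel statistical modeling module, configured to compute statistics over the image sequence to obtain a distribution model of the color values of each color channel of each pixel in the image sequence;
a pixel value hypothesis test module, which marks, by means of the distribution model, each pixel in the image sequence as a background pixel or as a non-background pixel;
a background segmentation module, configured to process the marked pixels: if the pixel is determined to be marked as a background pixel, its color value is not processed; if the pixel is determined to be marked as a non-background pixel, the pixel value is replaced with a value related to the distribution model.
In one embodiment, in the video image processing apparatus, the preprocessing module includes:
a video file reading module, configured to divide the video image file into a plurality of data blocks of a given size and read the data blocks into a buffer;
an audio/video stream separation module, configured to separate the video stream and the audio stream in each data block according to the compression standard;
a video decoding module, configured to decode the video stream according to the video compression format standard;
a unified image format conversion module, configured to convert the decoded video image format into RGB format.
In one embodiment, in the video image processing apparatus, the pixel statistical modeling module is specifically configured to:
obtain the measurement probability of the current pixel at the current time in the image sequence;
obtain the self-information of the current color value of the current pixel from the adjacent pixels within a small neighborhood of the current pixel;
calculate the probability of occurrence of the current pixel color value.
In one embodiment, in the video image processing apparatus, the pixel value hypothesis test module is specifically configured to:
calculate the modified self-information of the current pixel;
specify the confidence level of the hypothesis test;
judge, by comparison with the confidence level, whether the color value of the current pixel conforms to the statistical regularity of the distribution model; if it does, the pixel is judged to be a background pixel, otherwise the pixel is judged to be a non-background pixel.
In one embodiment, the video image processing apparatus further includes:
a display module, configured to display and output the image of background pixels and non-background pixels.
The beneficial effects of the embodiments of the present invention are as follows: the video image processing method of the embodiment processes the video image file so as to automatically identify and remove the moving foreground in the image sequence, and the video output for display to the user is an image sequence of clean background with motion interference removed. This meets the user's need for a static background from which moving content has been removed.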
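Brief Description of the Drawings
FIG. 1 is the main flowchart of the video image processing method in an embodiment of the present invention;
FIG. 2 is a detailed flowchart of the video image processing method in an embodiment of the present invention;
FIG. 3 is a detailed flowchart of pixel statistical modeling in the video image processing method in an embodiment of the present invention;
FIG. 4 is a detailed flowchart of the pixel value hypothesis test in the video image processing method in an embodiment of the present invention;
FIG. 5 is a diagram of the main modules of the video image processing apparatus in an embodiment of the present invention.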
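Detailed Description
To make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in detail below with reference to the drawings and specific embodiments.
Referring to FIG. 1, an embodiment of the present invention provides a video image processing method, including:
Step 1: converting the video images in a video image file into a sequence of images in RGB format;
Step 2: computing statistics over the image sequence to obtain a distribution model of the color values of each color channel of each pixel;
Step 3: marking, by means of the distribution model, each pixel in the image sequence as a background pixel or as a non-background pixel;
Step 4: leaving the color value of a pixel marked as a background pixel unchanged, and replacing the value of a pixel marked as a non-background pixel with a value derived from the distribution model.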
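Steps 2 to 4 can be illustrated with a minimal sketch. It assumes a per-channel normal model and a simple "within k standard deviations" check as a stand-in for the hypothesis test; the patent's own test uses the modified self-information described later. The names `build_model`, `is_background`, `clean_frame`, `k` and `min_std` are hypothetical and not taken from the source.

```python
from statistics import mean, pstdev

def build_model(frames):
    """Step 2: per-pixel, per-channel (mean, std) over the history frames.

    Each frame is a flat list of (r, g, b) tuples of equal length.
    """
    model = []
    for p in range(len(frames[0])):
        per_channel = []
        for c in range(3):
            samples = [frame[p][c] for frame in frames]
            per_channel.append((mean(samples), pstdev(samples)))
        model.append(per_channel)
    return model

def is_background(pixel, per_channel, k=3.0, min_std=1.0):
    """Step 3 (assumed test): every channel lies within k standard deviations."""
    return all(abs(v - mu) <= k * max(sd, min_std)
               for v, (mu, sd) in zip(pixel, per_channel))

def clean_frame(frame, model, k=3.0):
    """Step 4: keep background pixels, replace the rest with the model mean."""
    cleaned = []
    for pixel, per_channel in zip(frame, model):
        if is_background(pixel, per_channel, k):
            cleaned.append(pixel)
        else:
            cleaned.append(tuple(round(mu) for mu, _ in per_channel))
    return cleaned

history = [
    [(10, 20, 30), (100, 100, 100)],
    [(10, 20, 30), (101, 99, 100)],
    [(10, 20, 30), (99, 101, 100)],
]
model = build_model(history)
current = [(200, 20, 30), (101, 100, 99)]  # first pixel is hit by a moving object
result = clean_frame(current, model)
```

In this toy sequence the first pixel of `current` deviates strongly from its history and is replaced by the model mean, while the second pixel is kept as it is.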
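Referring to FIG. 2, the detailed flow of the video image processing method in the embodiment of the present invention is as follows. The video image file (including a video stream) is divided into a plurality of data blocks of a given size, and the data blocks are read into a buffer. The video stream and the audio stream in each data block are separated according to the compression standard, and the pure audio stream is processed normally. After the video stream is decoded, its format is converted to RGB. Statistics are computed over a segment of the image sequence to obtain a distribution model of the color values of each color channel of each pixel. A statistical hypothesis test is performed on the color value of each color channel of each pixel in the image sequence: if the color value of the current pixel conforms to the statistical regularity of the distribution model, the pixel is judged to be a background pixel, otherwise the pixel is judged to be a non-background pixel. Pixels marked as background keep their color values unchanged; pixels marked as non-background have their values replaced with a value derived from the distribution model. Both the background pixels and the non-background pixels are displayed and output.
The step of converting the video images in the video image file into a sequence of images in RGB format includes: reading the video image file into the buffer in data blocks of a given size; separating the video stream and the audio stream in each data block according to the compression standard; decoding the video stream according to the video compression format standard; converting the decoded video image format to RGB; and a step of displaying and outputting the image marked as background pixels. The implementation of these steps is essentially standardized. Different terminals differ only in buffer size and in decoding scheme (software or hardware decoding). Image format conversion is achieved by a matrix transformation based on the standard correspondence between formats. These points are not elaborated further here.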
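The format conversion by matrix transformation can be sketched as follows. The source does not name the decoded format, so this example assumes full-range BT.601 YCbCr (the JFIF convention), which is only one common case; the function name `ycbcr_to_rgb` is hypothetical.

```python
def ycbcr_to_rgb(y, cb, cr):
    """Convert one full-range BT.601 YCbCr sample (JFIF convention) to 8-bit RGB."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    # Round to the nearest integer and clamp to the valid 8-bit range.
    return tuple(max(0, min(255, round(v))) for v in (r, g, b))
```

A real decoder may emit limited-range or BT.709 data instead, in which case the coefficients and offsets differ.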
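The step of computing statistics over a segment of the image sequence to obtain a distribution model of the color values of each color channel of each pixel specifically includes: obtaining the measurement probability of the current pixel at the current time in the image sequence; obtaining the self-information of the current color value of the current pixel from the adjacent pixels within a small neighborhood of the current pixel; and calculating the probability of occurrence of the current pixel color value.
Referring to FIG. 3, pixel statistical modeling is implemented as follows:
Let the sample sequence formed by the color values of the current pixel over the previous N frames be {x_1, x_2, ..., x_i, ..., x_N}, where x_i is the color value vector of the current pixel in frame i. Since images are uniformly in RGB format, x_i is a three-dimensional vector that can be written x_i = (r_i, g_i, b_i), where r_i, g_i, b_i are the color values of the red, green and blue channels respectively. Using parameter estimation methods from statistics, the probability p(x_t) of the measurement x_t of the pixel at the current time t can be obtained from the following equation.
$$p(x_t) = \sum_{i=1}^{N} \alpha_i \, K_\sigma(x_t - x_i) \qquad (2\text{-}1)$$
In this equation, α_i is a weight coefficient, K_σ(x_t − x_i) is the distribution function of pixel color values, and σ is the window radius. The distribution function of pixel color values can be chosen as a uniform distribution, a normal distribution, a triangular distribution, a binomial distribution, and so on.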
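The weighted kernel sum for p(x_t) can be sketched as below. This is an illustration under stated assumptions: the kernel is taken to be a product of one-dimensional Gaussians with the same σ on every channel, and the weights α_i default to the uniform value 1/N; neither choice is fixed by the source. The names `gaussian_kernel` and `measurement_probability` are hypothetical.

```python
import math

def gaussian_kernel(diff, sigma):
    """Product of 1-D Gaussian densities over the three color channels."""
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return math.prod(norm * math.exp(-0.5 * (d / sigma) ** 2) for d in diff)

def measurement_probability(x_t, samples, sigma, weights=None):
    """p(x_t) = sum_i alpha_i * K_sigma(x_t - x_i) over the N stored samples."""
    n = len(samples)
    if weights is None:
        weights = [1.0 / n] * n  # assumed: uniform weights that sum to 1
    return sum(
        a * gaussian_kernel([t - s for t, s in zip(x_t, x_i)], sigma)
        for a, x_i in zip(weights, samples)
    )

samples = [(10, 20, 30)] * 4
p_same = measurement_probability((10, 20, 30), samples, sigma=1.0)
p_far = measurement_probability((60, 20, 30), samples, sigma=1.0)
```

A color value equal to the stored history gets the peak density, while a value far from the history gets a density close to zero.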
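The pixel value at a given point in the image sequence is affected by lighting, slight camera shake, interference from moving objects, and so on. The uniform, triangular and binomial distributions are clearly not well suited to describing the distribution of pixel color values, whereas the normal distribution describes it fairly well. Under the normal distribution, 68.268949% of the area lies within one standard deviation of the mean, 95.449974% within two standard deviations (2σ), 99.730020% within three standard deviations (3σ), and 99.993666% within four standard deviations (4σ). For a fixed pixel in the image sequence, the color value clearly fluctuates within some number of standard deviations; if the effects of lighting, shake and the like are ignored, the color value is a single fixed value, namely the mean of the normal distribution.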
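The coverage percentages quoted above follow from the standard identity P(|X − μ| ≤ kσ) = erf(k/√2) for a normal variable. A short check, with the hypothetical helper name `coverage`:

```python
import math

def coverage(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3, 4):
    print(f"{k} sigma: {100 * coverage(k):.6f}%")
```

Running this should reproduce the four percentages listed in the paragraph to six decimal places.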
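Through the above process, the probability of the current-time measurement of pixel x can be obtained. Let y be any pixel in a small neighborhood of pixel x satisfying dis(x, y) ≤ δ, where dis(x, y) is the spatial distance between the two pixels x and y in the image and δ is a constant. In the same way we can obtain the probability p(y_t) of the current color value of pixel y, and the probability p(x_t|B_y) of the current color value of pixel x estimated from the samples of pixel y. Dividing p(x_t) by p(x_t|B_y) and then taking the logarithm gives I(x_t; y), which is called the information contribution of the neighboring pixel y to pixel x.
Since for pixel x there are many pixels y satisfying dis(x, y) ≤ δ, say m of them, an I(x_t; y) can be obtained for each of the m neighboring pixels of x, giving m values of I(x_t; y) for pixel x. These m values are abbreviated I_1, I_2, ..., I_m.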
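The neighbor contribution can be sketched directly from its verbal definition, the logarithm of p(x_t) divided by p(x_t|B_y). The base of the logarithm is not stated at this point in the source; base 2 is assumed here to match the self-information formula, and the function name `neighbor_contribution` is hypothetical.

```python
import math

def neighbor_contribution(p_x, p_x_given_neighbor):
    """I(x_t; y) = log2(p(x_t) / p(x_t | B_y)); base 2 is an assumption."""
    return math.log2(p_x / p_x_given_neighbor)

# The neighbor's samples explain the color better than the pixel's own history,
# so the contribution is negative.
supportive = neighbor_contribution(0.125, 0.5)
# The neighbor's samples explain the color exactly as well: zero contribution.
neutral = neighbor_contribution(0.25, 0.25)
```

A negative value means the neighborhood regards the current color as more plausible than the pixel's own model does.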
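From information theory, the self-information of the random event that the current color value of pixel x is x_t is:
$$I_x = -\log_2 p(x_t) \qquad (2\text{-}2)$$
Here I_x represents the uncertainty of the random event that the current color value of pixel x is x_t. According to the equation above, the larger I_x is, the smaller the probability of that event; the smaller the probability, the less the current color value fits the pixel color value distribution model and the more likely the pixel is non-background.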
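Equation (2-2) translates directly into code. The helper name `self_information` is hypothetical, and the probabilities passed in are illustrative values rather than data from the source.

```python
import math

def self_information(p):
    """I_x = -log2 p(x_t), in bits; p must be in (0, 1]."""
    if not 0.0 < p <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(p)

likely = self_information(0.5)      # a common color value carries little information
unlikely = self_information(2**-10) # a rare color value carries much more
```

Rarer color values yield larger I_x, which is why a large I_x points toward a non-background pixel.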
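The modified self-information of the random event that the current color value of pixel x is x_t is defined as:
[Equation image PCTCN2014091796-appb-000002 not reproduced: definition of the modified self-information I′_x]
In this equation, β is a coefficient.
At this point, the distribution model of pixel color values is complete.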
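Next, each pixel in the image sequence is marked, by means of the distribution model, as a background pixel or as a non-background pixel.
In the embodiment of the present invention, marking the pixels in the image sequence by means of the distribution model specifically means: performing a statistical hypothesis test on the color value of each color channel of each pixel in the image sequence; if the color value of the current pixel conforms to the statistical regularity of the distribution model, the pixel is judged to be a background pixel, otherwise the pixel is judged to be a non-background pixel.
The step of marking, by means of the distribution model, each pixel in the image sequence as a background pixel or as a non-background pixel includes: calculating the modified self-information of the current pixel; specifying the confidence level of the hypothesis test; and judging, by comparison with the confidence level, whether the color value of the current pixel conforms to the statistical regularity of the distribution model. If it does, the pixel is judged to be a background pixel; otherwise the pixel is judged to be a non-background pixel.
Referring to FIG. 4, the step of marking the pixels in the image sequence by means of the distribution model is implemented as follows:
A hypothesis test in statistics can be used to test the hypothesis that a random variable follows a certain probability distribution: a test statistic is computed from sample data by a suitable statistical method and, according to a given probability criterion, one judges with small risk whether the estimated value differs significantly from the population value (or the estimated distribution from the actual distribution), and hence whether the null hypothesis should be accepted. Applied to the embodiment of the present invention: once the color distribution model of the pixel values has been obtained, the color value of any pixel of the current frame in the image sequence can be tested for whether it conforms to the model with some high probability. Only when it conforms to the model with high probability can the pixel be considered undisturbed by non-background content in the current frame, that is, a background pixel. The significance level preset for the hypothesis test is taken as a relatively small value such as 0.05, meaning that the probability of wrongly rejecting the test hypothesis when it is true is 0.05. In the embodiment of the present invention, this means the current pixel value conforms to the pixel color value distribution model with 95% probability, from which it can be judged that the pixel is very probably a background pixel. Accordingly, 95% can be set as the segmentation probability for distinguishing background pixels from foreground pixels; of course, this probability value can also change with the complexity of the actual scene. Specifically, the more complex the actual scene, the larger the variance of the normal distribution model and the smaller the probability value should be chosen. In short, a probability threshold is set to test whether the current pixel conforms to the distribution model.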
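How a significance level such as 0.05 becomes an accept/reject rule can be sketched with a two-sided z-test against a normal model. This is an assumption made for illustration: the source does not specify the test statistic, and its own decision rule is later expressed through the modified self-information. The names `critical_z` and `fits_model` are hypothetical.

```python
from statistics import NormalDist

def critical_z(alpha=0.05):
    """Two-sided critical value of the standard normal for significance level alpha."""
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

def fits_model(value, mu, sigma, alpha=0.05):
    """True if value is not rejected by a two-sided z-test against N(mu, sigma^2)."""
    return abs(value - mu) <= critical_z(alpha) * sigma

# Channel model with mean 100 and standard deviation 5 (illustrative numbers).
near = fits_model(104, mu=100, sigma=5)   # inside the 95% acceptance region
far = fits_model(120, mu=100, sigma=5)    # four standard deviations away
```

Choosing a smaller acceptance probability shrinks the acceptance region, so fewer values are accepted as background for the same σ.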
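The marked pixels then undergo background segmentation: if the current pixel is marked as a background pixel, its color value is not processed; if the current pixel is marked as a non-background pixel, the pixel value is replaced with a value derived from the distribution model.
In the embodiment of the present invention, the mean of the distribution model is used as the replacement; another value derived from the distribution model may also be used as the substitute pixel value.
Background segmentation is implemented as follows:
The probability threshold of the hypothesis test module can be used to decide whether the current pixel conforms to the distribution model. According to the formula, the probability threshold can be mapped to a threshold I_th on I′_x.
If I′_x ≥ I_th, pixel x is considered non-background; otherwise pixel x is considered background. I_th is the segmentation threshold given by the user.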
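The thresholding rule and the mean replacement can be sketched together. The modified self-information values are taken as given inputs here, since their defining equation is not reproduced in this text; the function name `segment_and_replace` and all numbers are hypothetical.

```python
def segment_and_replace(frame, modified_info, means, i_th):
    """Label each pixel by comparing I'_x with I_th, then repair non-background pixels.

    frame, means: lists of (r, g, b) tuples; modified_info: list of I'_x values.
    Returns (labels, cleaned_frame), where labels[i] is True for background.
    """
    # I'_x >= I_th means non-background, so background is the strict "<" case.
    labels = [info < i_th for info in modified_info]
    cleaned = [pixel if is_bg else mean
               for pixel, is_bg, mean in zip(frame, labels, means)]
    return labels, cleaned

frame = [(10, 20, 30), (200, 10, 10), (50, 50, 50)]
modified_info = [2.0, 12.5, 8.0]
means = [(11, 20, 29), (50, 60, 70), (52, 51, 49)]
labels, cleaned = segment_and_replace(frame, modified_info, means, i_th=8.0)
```

The third pixel sits exactly on the threshold and is treated as non-background, matching the "greater than or equal" condition in the text.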
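The embodiment of the present invention considers both the information of the pixel itself and the influence of neighboring pixel information on the central pixel, and uses the two jointly to separate the background region. Suppose a pixel x in the image is considered non-background when neighborhood information is ignored (that is, I_x is relatively large), while the pixels in its neighborhood all indicate that it is background, that is, p(x_t) is much smaller than p(x_t|B_y). Then the resulting I_1, I_2, ..., I_m are all less than zero, so I′_x is smaller than I_x, which may make I′_x smaller than I_th so that the pixel is judged to be background. When the surroundings are all background pixels and the central pixel alone is non-background, the central pixel is very likely noise, so this treatment matches real scenes better and makes the judgment more accurate.
Pixels marked as background keep their color values unchanged. A non-background pixel is motion interference in the current frame, so its value is replaced with a value derived from the distribution model in order to remove the interference.
In this way, background and non-background can be distinguished intelligently, frame by frame and pixel by pixel, and the video output for display to the user is an image sequence of clean background with motion interference removed.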
参照图5所示,本发明实施例提供了一种视频图像处理装置,包括:
预处理模块100,配置为将视频图像文件中的视频图像转化为RGB格式的序列图像;
像素统计建模模块200,配置为对所述序列图像进行统计,得到每一个像素每个颜色通道的颜色值的分布模型;
像素值假设检验模块300,配置为通过分布模型标记所述序列图像中的像素为背景像素,或者标记所述序列图像中的像素为非背景像素;
背景分割模块400,配置为处理被标记的像素,若被标记为背景像素的像素,像素颜色值不做改变;被标记为非背景像素的像素,则利用与所述分布模型相关的值来替代该像素值。
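The preprocessing module comprises: a video file reading module, configured to divide the video image file into a plurality of data blocks of a given size and read the data blocks into a buffer;
an audio/video stream separation module, configured to separate the video stream and the audio stream of the video image file in each data block according to the compression standard;
a video decoding module, configured to decode the video stream according to the video compression format standard;
a unified image-format conversion module, configured to convert the decoded video image format into RGB format. The video file reading module, the audio/video stream separation module and the video decoding module are already standardized modules on many video-capable terminals, and their implementation is largely standardized as well; terminals differ only in details such as the buffer size and the decoding scheme (software or hardware decoding), so they are not described further here. The unified image-format conversion module converts between image formats, which can be done by a matrix transformation based on the standard correspondence between the formats.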
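As an illustration of such a matrix transformation, the sketch below converts one full-range 8-bit YCbCr sample to RGB using the BT.601 coefficients used by JFIF. The text does not say which decoded format the conversion starts from, so the choice of YCbCr is an assumption, as is the function name.

```python
def ycbcr_to_rgb(y, cb, cr):
    """Convert one full-range 8-bit YCbCr sample to 8-bit RGB
    (BT.601 coefficients as used by JFIF). Assumed input format."""
    def clamp(v):
        return max(0, min(255, int(round(v))))
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return clamp(r), clamp(g), clamp(b)

# Neutral chroma (Cb = Cr = 128) maps to a gray with R = G = B = Y.
gray = ycbcr_to_rgb(128, 128, 128)
```

Limited-range video (luma 16 to 235) or BT.709 content would need different offsets and coefficients, so a real implementation would select the matrix from the stream's signaled color space.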
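The pixel statistical modeling module is specifically configured to: obtain the measurement probability of the current pixel at the current time in the image sequence; obtain the self-information of the current color value of the current pixel from adjacent pixels within a small neighborhood of the current pixel; and compute the probability of occurrence of the current pixel's color value.
The pixel value hypothesis testing module is specifically configured to: compute the corrected self-information of the current pixel; set the confidence level of the hypothesis test; and determine whether the current pixel's color value conforms to the statistical regularity of the distribution model. If it does, the pixel is judged to be a background pixel; otherwise it is judged to be a non-background pixel.
The video image processing apparatus further comprises a display module, configured to display and output the image of the background pixels and non-background pixels.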
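Those skilled in the art will appreciate that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Moreover, the invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage and optical storage) containing computer-usable program code.
The invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems) and computer program products according to embodiments of the invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit its scope of protection.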

Claims (10)

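  1. A video image processing method, comprising:
    converting the video images in a video image file into an image sequence in RGB format;
    collecting statistics on the image sequence to obtain a distribution model of the color value of each color channel of each pixel in the image sequence;
    labeling, by means of the distribution model, the pixels in the image sequence as background pixels or as non-background pixels;
    if a pixel is labeled as a background pixel, leaving its color value unprocessed; if the pixel is labeled as a non-background pixel, replacing its value with a value related to the distribution model.
  2. The video image processing method according to claim 1, wherein converting the video images in the video image file into an image sequence in RGB format comprises:
    dividing the video image file by size into a plurality of data blocks, and reading the plurality of data blocks into a buffer;
    separating the video stream and the audio stream of the video image file in each data block according to the compression standard;
    decoding the video stream according to the video compression format standard;
    converting the decoded video image format into RGB format.
  3. The video image processing method according to claim 1, wherein collecting statistics on the image sequence to obtain a distribution model of the color value of each color channel of each pixel comprises:
    obtaining the measurement probability of the current pixel at the current time in the image sequence;
    obtaining the self-information of the current color value of the current pixel from adjacent pixels within a small neighborhood of the current pixel;
    computing the probability of occurrence of the current pixel's color value.
  4. The video image processing method according to claim 1, wherein labeling, by means of the distribution model, the pixels in the image sequence as background pixels or as non-background pixels comprises:
    computing the corrected self-information of the current pixel;
    setting the confidence level of the hypothesis test;
    determining, by comparison with the confidence level, whether the current pixel's color value conforms to the statistical regularity of the distribution model; if it does, judging the pixel to be a background pixel, and otherwise judging the pixel to be a non-background pixel.
  5. The video image processing method according to claim 1, wherein, after replacing the pixel value with a value related to the distribution model, the method further comprises:
    displaying and outputting the image of the background pixels and non-background pixels.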
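  6. A video image processing apparatus, comprising:
    a preprocessing module, configured to convert the video images in a video image file into an image sequence in RGB format;
    a pixel statistical modeling module, configured to collect statistics on the image sequence to obtain a distribution model of the color value of each color channel of each pixel in the image sequence;
    a pixel value hypothesis testing module, configured to label, by means of the distribution model, the pixels in the image sequence as background pixels or as non-background pixels;
    a background segmentation module, configured to process the labeled pixels: upon determining that the current pixel is labeled as a background pixel, leaving its color value unprocessed; upon determining that the current pixel is labeled as a non-background pixel, replacing its value with a value related to the distribution model.
  7. The video image processing apparatus according to claim 6, wherein the preprocessing module comprises:
    a video file reading module, configured to divide the video image file by size into a plurality of data blocks, and read the plurality of data blocks into a buffer;
    an audio/video stream separation module, configured to separate the video stream and the audio stream of the video image file in each data block according to the compression standard;
    a video decoding module, configured to decode the video stream according to the video compression format standard;
    a unified image-format conversion module, configured to convert the decoded video image format into RGB format.
  8. The video image processing apparatus according to claim 6, wherein the pixel statistical modeling module is configured to:
    obtain the measurement probability of the current pixel at the current time in the image sequence;
    obtain the self-information of the current color value of the current pixel from adjacent pixels within a small neighborhood of the current pixel;
    compute the probability of occurrence of the current pixel's color value.
  9. The video image processing apparatus according to claim 6, wherein the pixel value hypothesis testing module is configured to:
    compute the corrected self-information of the current pixel;
    set the confidence level of the hypothesis test;
    determine, by comparison with the confidence level, whether the current pixel's color value conforms to the statistical regularity of the distribution model; if it does, judge the pixel to be a background pixel, and otherwise judge the pixel to be a non-background pixel.
  10. The video image processing apparatus according to claim 6, further comprising:
    a display module, configured to display and output the image of the background pixels and non-background pixels.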
PCT/CN2014/091796 2014-08-20 2014-11-20 Video image processing apparatus and method WO2015117464A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410415141.X 2014-08-20
CN201410415141.XA CN105357575A (zh) 2014-08-20 2014-08-20 Video image processing apparatus and method

Publications (1)

Publication Number Publication Date
WO2015117464A1 true WO2015117464A1 (zh) 2015-08-13

Family

ID=53777273

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/091796 WO2015117464A1 (zh) 2014-08-20 2014-11-20 Video image processing apparatus and method

Country Status (2)

Country Link
CN (1) CN105357575A (zh)
WO (1) WO2015117464A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10979669B2 (en) * 2018-04-10 2021-04-13 Facebook, Inc. Automated cinematic decisions based on descriptive models
CN114286163B (zh) * 2021-12-24 2024-02-13 苏州亿歌网络科技有限公司 Method, apparatus, device and storage medium for recording sequence images

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
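CN101686338A (zh) * 2008-09-26 2010-03-31 Sony Corporation System and method for segmenting foreground and background in video
CN101916449A (zh) * 2010-08-21 2010-12-15 Shanghai Jiao Tong University Method for establishing a background model based on motion information in image processing
CN102663713A (zh) * 2012-04-17 2012-09-12 Zhejiang University Background subtraction method based on color-invariant parameters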
US20130182184A1 (en) * 2012-01-13 2013-07-18 Turgay Senlet Video background inpainting

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102831580B (zh) * 2012-07-17 2015-04-08 Xidian University Method for restoring images captured by a mobile phone based on motion detection
EP2936433B1 (en) * 2012-12-21 2018-09-19 Bracco Suisse SA Segmentation in diagnostic imaging applications based on statistical analysis over time
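CN103310425B (zh) * 2013-07-16 2016-03-09 Third Research Institute of the Ministry of Public Security Method for large-scale image restoration based on an image gradient prior model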

Also Published As

Publication number Publication date
CN105357575A (zh) 2016-02-24

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14881500

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14881500

Country of ref document: EP

Kind code of ref document: A1