CN107452010B

CN107452010B - Automatic cutout algorithm and device

Info

Publication number: CN107452010B
Application number: CN201710638979.9A
Authority: CN
Inventors: 王灿进; 孙涛; 王挺峰; 王锐; 陈飞; 田玉珍
Original assignee: Changchun Institute of Optics Fine Mechanics and Physics of CAS
Current assignee: Changchun Changguang Qiheng Sensing Technology Co ltd
Priority date: 2017-07-31
Filing date: 2017-07-31
Publication date: 2021-01-05
Anticipated expiration: 2037-07-31
Also published as: CN107452010A

Abstract

An automatic matting method and device relate to the field of digital image processing, comprising: acquiring an original image to be matted, calculating its matting visual saliency; using spatial domain filtering and threshold segmentation algorithms to separate a foreground area and a background area, and combining morphological Learn the operation to obtain a tripartite map; perform gradient calculation on each pixel in the unknown area, and obtain the foreground and background sample point sets of the current unknown area pixel by sampling according to the gradient direction and saliency size; calculate the opacity and confidence of each sample point , and take the sample pair with the highest confidence as the best sample pair for final matting. The local area of opacity is smoothed to obtain the final estimated opacity; finally, according to the final estimated opacity and the color of the best sample pair, a matting operation is performed in the original image to extract the foreground target. The invention also discloses an automatic drawing device. The embodiments of the present invention have the advantages of no user interaction, simple and convenient use, and high matting accuracy and success rate.

Description

A kind of automatic matting algorithm and device

技术领域technical field

本发明涉及数字图像处理领域，具体涉及一种自动抠图算法和装置。The invention relates to the field of digital image processing, in particular to an automatic image matting algorithm and device.

背景技术Background technique

现实生活中，把感兴趣的目标从一幅背景图像中抠取出来，作为独立素材或者与新的背景图像合成，以期得到完整、逼真的背景替换效果，这种技术已普遍应用于影像编辑、影视特效等领域，广泛渗透入人们的日常生活中。数字图像抠图技术以其良好的应用前景和商业价值，近年来成为计算机视觉研究领域中的热点。In real life, the target of interest is extracted from a background image and used as an independent material or synthesized with a new background image in order to obtain a complete and realistic background replacement effect. This technology has been widely used in image editing, Film and television special effects and other fields have widely penetrated into people's daily life. Digital image matting technology has become a hot spot in the field of computer vision research in recent years due to its good application prospects and commercial value.

数字抠图算法将自然图像中的每个像素建模为前景和背景颜色的线性模型，即：Digital matting algorithms model each pixel in a natural image as a linear model of foreground and background colors, namely:

I＝αF+(1-α)B (1)I=αF+(1-α)B (1)

其中，I表示实际图像中的颜色值，F表示前景颜色值，B表示背景颜色值，α称为前景不透明度，取值范围为[0,1]，其中前景区域的不透明度α＝1，背景区域的不透明度α＝0，而在未知区域即前景目标的边缘区域，α取(0,1)之间的值。所谓抠图，就是在已知实际图像I的情况下，求取前景F、背景B和不透明度α的过程。I、F、B均为三维向量，方程需要根据3个已知量求取其余7个未知量，因此是一个高度欠约束问题。Among them, I represents the color value in the actual image, F represents the foreground color value, B represents the background color value, α is called the foreground opacity, and the value range is [0, 1], where the opacity of the foreground area α=1, The opacity α of the background area is 0, and in the unknown area, that is, the edge area of the foreground object, α takes a value between (0, 1). The so-called matting is the process of obtaining the foreground F, the background B and the opacity α when the actual image I is known. I, F, and B are all three-dimensional vectors, and the equation needs to find the remaining seven unknowns based on the three known quantities, so it is a highly underconstrained problem.

目前已经广泛应用于影视和传媒制作公司的抠图技术是蓝屏抠图，其原理是：将背景限制为单一的蓝色，从而将方程中的未知量压缩为4个。蓝屏抠图操作简单，但对背景限制较大，并且当前景出现蓝色时，将无法完整抠出目标。At present, the matting technology that has been widely used in film and media production companies is blue screen matting. The blue screen cutout operation is simple, but it has great restrictions on the background, and when the foreground appears blue, it will not be able to completely cut out the target.

目前学者们主要研究的自然图像抠图算法可大致分为二类，即：At present, the natural image matting algorithms mainly studied by scholars can be roughly divided into two categories, namely:

(一)基于采样的算法。该方法假设图像局部连续，用未知区域附近的已知样本点对当前像素的前景和背景分量进行估计。例如发明CN105225245提出基于纹理分布假设和正则化策略的自然图像抠图方法，改进了贝叶斯抠图方法，但基于采样的方法缺陷在于获得的alpha图连通性较差，且往往需要图像先验知识和大量的用户标记；(A) Sampling-based algorithm. The method assumes that the image is locally continuous, and estimates the foreground and background components of the current pixel with known sample points near the unknown region. For example, the invention CN105225245 proposes a natural image matting method based on texture distribution assumption and regularization strategy, and improves the Bayesian matting method, but the defect of the sampling-based method is that the obtained alpha map has poor connectivity and often requires image priors knowledge and a lot of user markup;

(二)基于传播的算法。该方法需要用户首先进行标记(如点、线等)，标识出前景、背景，随后将未知区域视为场，场的边缘对应于已知区域，通过建立Laplace矩阵用以描述alpha值的关系，将抠图过程转化为Laplace矩阵的求解过程，缺陷在于计算量大，对于非连通区域效果不佳。(2) Algorithms based on propagation. This method requires the user to first mark (such as points, lines, etc.), identify the foreground and background, and then regard the unknown area as a field, and the edge of the field corresponds to the known area. The Laplace matrix is established to describe the relationship of the alpha value, Converting the matting process into the Laplace matrix solution process has the disadvantage of a large amount of calculation and is not effective for non-connected regions.

此外，还有将采样和传播相结合的算法，以期发挥二者的优势，例如鲁棒抠图算法等，但算法仍普遍存在用户交互复杂、图像的先验假设过多、计算量大的缺陷，从而限制了应用范围，增加了使用难度。In addition, there are algorithms that combine sampling and propagation, in order to take advantage of the two, such as robust matting algorithms, etc., but the algorithms still generally have the defects of complex user interaction, too many prior assumptions of images, and large amount of calculation. , thus limiting the scope of application and increasing the difficulty of use.

发明内容SUMMARY OF THE INVENTION

为了解决现有技术中存在的问题，本发明提供了一种自动抠图算法和装置，根据输入图像计算抠图视觉显著度，在不限制背景和图像先验知识的情况下，从自然场景图像中完成全自动抠图，无需用户交互，同时能保证较高的抠图精度和成功率。In order to solve the problems existing in the prior art, the present invention provides an automatic matting algorithm and device, which calculates the visual saliency of matting according to the input image, without limiting the background and image prior knowledge, from natural scene images Complete automatic matting without user interaction, while ensuring high matting accuracy and success rate.

本发明解决技术问题所采用的技术方案如下：The technical scheme adopted by the present invention to solve the technical problem is as follows:

一种自动抠图算法，该方法包括如下步骤：An automatic image matting algorithm, the method comprises the following steps:

步骤一：获取待抠图的原始图像，计算其抠图视觉显著度；Step 1: Obtain the original image to be matted, and calculate its matting visual saliency;

步骤二：根据抠图视觉显著度图，使用空间域滤波和阈值分割算法，分离出前景区域、背景区域，结合形态学运算得到三分图；Step 2: According to the cutout visual saliency map, use spatial domain filtering and threshold segmentation algorithms to separate the foreground area and the background area, and combine morphological operations to obtain a tripartite map;

步骤三：根据三分图，对未知区域的每个像素进行梯度计算，根据梯度方向和显著性大小采样得到当前未知区域像素的前景、背景样本点集；Step 3: Perform gradient calculation on each pixel of the unknown area according to the trisector map, and obtain the foreground and background sample point sets of the pixels in the current unknown area by sampling according to the gradient direction and saliency size;

步骤四：根据当前未知区域像素的前景、背景样本点集，计算每个样本点的不透明度和置信度，取置信度最高的样本对作为最终抠图用的最佳样本对。然后平滑不透明度的局部区域，得到最终估计的不透明度；Step 4: Calculate the opacity and confidence of each sample point according to the foreground and background sample point sets of the pixels in the current unknown area, and take the sample pair with the highest confidence as the best sample pair for final matting. Then smooth the local area of opacity to get the final estimated opacity;

步骤五：根据最终估计的不透明度和最佳样本对的颜色值，在原始图像中进行抠图操作，提取出前景目标。Step 5: According to the final estimated opacity and the color value of the best sample pair, perform a matting operation in the original image to extract the foreground target.

一种自动抠图装置，所述装置包括：An automatic map-out device, the device comprising:

图像获取模块，用于采集单幅图像的颜色值；The image acquisition module is used to collect the color value of a single image;

抠图视觉显著度计算模块，用于根据所述图像获取模块获取的图像颜色值，计算图像的抠图视觉显著度；A cutout visual saliency calculation module for calculating the cutout visual saliency of the image according to the image color value obtained by the image acquisition module;

三分图计算模块，用于根据所述抠图视觉显著度计算模块获取的抠图视觉显著度图，使用空间域滤波和阈值分割算法，分离出前景区域、背景区域，结合形态学运算，计算得到三分图；The tripartite map calculation module is used to separate the foreground area and the background area according to the cutout visual saliency map obtained by the cutout visual saliency calculation module. get a three-point map;

样本点集获取模块，根据所述三分图计算模块获取的三分图，对未知区域的每个像素进行梯度计算，根据梯度方向和显著性大小采样得到当前未知区域像素的前景、背景样本点集；The sample point set acquisition module performs gradient calculation on each pixel in the unknown area according to the trisectoral graph obtained by the tripartite graph calculation module, and obtains the foreground and background sample points of the pixel in the current unknown area by sampling according to the gradient direction and the saliency size. set;

不透明度计算模块，根据所述样本点集获取模块获取的前景、背景样本点集，计算每个样本点的不透明度和置信度，取置信度最高的样本对作为最终抠图用的最佳样本对。然后对不透明度的局部区域进行平滑，得到最终估计的不透明度；The opacity calculation module calculates the opacity and confidence of each sample point according to the foreground and background sample point sets obtained by the sample point set acquisition module, and takes the sample pair with the highest confidence as the best sample for final matting right. Then smooth the local area of opacity to get the final estimated opacity;

前景抠出模块，用于根据最终估计的不透明度和最佳样本对的颜色值，在原始图像中进行抠图操作，提取出前景目标。The foreground keying module is used to perform a keying operation in the original image according to the final estimated opacity and the color value of the best sample pair to extract the foreground target.

本发明的有益效果是：本发明提出的抠图视觉显著性计算方法模拟人眼的视觉注意机制，可自动提取前景目标，免去了复杂的用户交互操作，完成全自动抠图过程，操作简单方便；通过限制样本点对的数量，缩短了抠图时间；对显著性图和不透明度均进行了平滑，提高了抠图精度。The beneficial effects of the present invention are as follows: the method for calculating the visual saliency of the cutout proposed by the present invention simulates the visual attention mechanism of the human eye, can automatically extract the foreground target, avoids complex user interaction operations, completes the automatic cutout process, and is simple to operate Convenient; by limiting the number of sample point pairs, the matting time is shortened; both the saliency map and the opacity are smoothed, and the matting accuracy is improved.

附图说明Description of drawings

图1为本发明一种自动抠图算法的流程示意图Fig. 1 is the schematic flow chart of a kind of automatic matting algorithm of the present invention

图2为本发明计算区域显著度的流程示意图FIG. 2 is a schematic flowchart of calculating the regional saliency of the present invention

图3为本发明一种自动抠图装置的结构示意图Fig. 3 is a structural schematic diagram of an automatic map-out device of the present invention

具体实施方式Detailed ways

下面结合附图和实施例对本发明做进一步详细说明。The present invention will be described in further detail below with reference to the accompanying drawings and embodiments.

图1为本发明一种自动抠图算法实施例的流程示意图，本发明实施例提供了一种自动抠图方法，该方法可以由任意具有图像存储和显示功能的抠图装置来执行，该装置可以是各种终端设备，例如：个人电脑，手机，平板电脑等，也可以是数码相机、摄像机等，具体可以通过软件和/或硬件来实现。如图1所示，本实施例的方法包括：1 is a schematic flowchart of an embodiment of an automatic image matting algorithm of the present invention. An embodiment of the present invention provides an automatic image matting method. The method can be executed by any image matting device with image storage and display functions. The device It can be various terminal devices, such as personal computers, mobile phones, tablet computers, etc., or digital cameras, video cameras, etc., which can be specifically implemented by software and/or hardware. As shown in Figure 1, the method of this embodiment includes:

步骤一：获取待抠图的原始图像，计算原始图像的抠图视觉显著度。Step 1: Obtain the original image to be matted, and calculate the matting visual saliency of the original image.

考虑到通常需要提取的前景目标具有以下特征：前景目标具有完整的目标区域，且与周围背景对比度较明显；颜色分布较均匀；大部分区域亮度较大；与背景有较明显的边缘区分。因此本发明实施例提出的视觉抠图显著度计算方法考虑了前景目标的颜色、亮度和区域完整性等特性，假设获取图像为rgb格式，首先需根据r、g、b颜色通道计算灰度图I_gray，具体公式为：Considering that the foreground objects that usually need to be extracted have the following characteristics: the foreground objects have a complete target area, and the contrast with the surrounding background is relatively obvious; the color distribution is relatively uniform; most areas have high brightness; Therefore, the visual matting saliency calculation method proposed in the embodiment of the present invention takes into account the characteristics of the foreground target, such as color, brightness, and regional integrity. Assuming that the acquired image is in rgb format, the grayscale image needs to be calculated first according to the r, g, and b color channels. I _gray , the specific formula is:

I_gray＝(r+g+b)/3 (2)I _gray = (r+g+b)/3 (2)

原始图像也可以是YUV等任意格式，本发明实施例对摄像机的输出图像格式不作限定，相应彩色转灰度的公式也需作调整。The original image may also be in any format such as YUV, the embodiment of the present invention does not limit the output image format of the camera, and the corresponding formula for converting color to gray scale also needs to be adjusted.

接着对I_gray进行低通滤波和降采样，也即：将原始灰度图像I_gray作为金字塔的第0尺度层。第1尺度层为第0尺度层与低通滤波器进行卷积，然后在x和y方向上分别1/2采样得到，其余层以此类推，每一层的分辨率为上一层的一半。此处的低通滤波器可以是高斯滤波、拉普拉斯滤波和Gabor滤波器，本发明实施例对生成尺度金字塔的低通滤波器的形式不作具体限制。Then perform low-pass filtering and down-sampling on I _gray , that is, take the original grayscale image I _gray as the 0th scale layer of the pyramid. The first scale layer is convolved with the low-pass filter for the 0th scale layer, and then sampled by 1/2 in the x and y directions respectively, and the rest of the layers are analogous, and the resolution of each layer is half of the previous layer. . The low-pass filter here may be a Gaussian filter, a Laplacian filter, or a Gabor filter, and the embodiment of the present invention does not specifically limit the form of the low-pass filter for generating the scale pyramid.

与人眼视觉特性相对应，在图像亮度很低的地方，很难引起人眼注意，因此需要对尺度金字塔的亮度分量进行阈值抑制，即低于最大亮度值I_{gray_max}的5％的区域，亮度分量置为0，这样可以有效抑制暗弱背景干扰。而目标边缘上局部亮度较小的点，虽然也有被剔除的可能，但一方面这些部位在区域显著图中将得到保留，另一方面边缘区域将划分为未知区域，因此对最后的抠图完整性不会造成影响。Corresponding to the visual characteristics of the human eye, it is difficult to attract the attention of the human eye in the place where the image brightness is very low, so it is necessary to perform threshold suppression on the brightness component of the scale pyramid, that is, the area lower than 5% of the maximum brightness value I _{gray_max} , the brightness The component is set to 0, which can effectively suppress the dark background interference. Although the points with low local brightness on the edge of the target may also be eliminated, on the one hand, these parts will be retained in the regional saliency map, and on the other hand, the edge area will be divided into unknown areas, so the final matting is complete. Sex doesn't matter.

建立尺度金字塔后，接着计算亮度显著性图。具体计算方法为：以细尺度c上的图像为视觉的中央区域，粗尺度s上的图像为视觉的周边区域，令中央尺度c∈{2,3,4}，周边尺度s＝c+δ，δ∈{3,4}，可得到6种中央-周边组合分别为{2-5,2-6,3-6,3-7,4-7,4-8}。将s尺度上的图像插值到c的尺度，并根据公式I(c,s)＝|I(c)ΘI(s)|(3)，用c尺度上的图像减去插值后的图像，得到亮度差异图。其中I(σ)为图像金字塔，σ＝0,1,...,8表示不同的尺度，Θ表示中央-周边差算子。这些特征图表示图像中的某个位置与其局部邻域在亮度上的差异性，这种差异性越大，表示在局部范围内亮度显著性越高，越容易引起人眼的注意。依此计算6组亮度差异图之后，对这些差异图进行融合，摒弃冗余特征，生成最后的亮度显著性图。由于在不同尺度上，差异图的绝对值大小并不能反映显著性信息，因此不能将不同尺度的差异图进行简单相加。本发明实施例中，对差异图进行归一化，归一化函数N(·)具体步骤为：After building the scale pyramid, the luminance saliency map is then calculated. The specific calculation method is: take the image on the fine scale c as the central area of vision, and the image on the coarse scale s as the peripheral area of vision, let the central scale c∈{2,3,4}, the peripheral scale s=c+δ , δ∈{3,4}, six center-periphery combinations can be obtained as {2-5, 2-6, 3-6, 3-7, 4-7, 4-8}. Interpolate the image at scale s to the scale c, and subtract the interpolated image from the image at scale c according to the formula I(c,s)=|I(c)ΘI(s)|(3) to get Brightness difference map. where I(σ) is the image pyramid, σ=0,1,...,8 represents different scales, and Θ represents the center-periphery difference operator. These feature maps represent the difference in brightness between a certain position in the image and its local neighborhood. The greater the difference, the higher the brightness saliency in the local range, and the easier it is to attract the attention of the human eye. After calculating 6 sets of brightness difference maps, these difference maps are fused, redundant features are discarded, and the final brightness saliency map is generated. Since at different scales, the absolute value of the difference map cannot reflect the saliency information, so the difference maps at different scales cannot be simply added. In the embodiment of the present invention, the difference map is normalized, and the specific steps of the normalization function N(·) are:

(1)对所有6幅差异图归一化至区间[0,1]；(1) Normalize all 6 difference maps to the interval [0,1];

(2)分别计算每个差异图的局部方差；(2) Calculate the local variance of each difference map separately;

(3)融合权值取与局部方差成正相关，即局部方差越大，表示该差异性图包含的信息量越大，则该幅差异性图在加权组合时应该赋予较大的权值。(3) The fusion weight is positively correlated with the local variance, that is, the larger the local variance, the greater the amount of information contained in the disparity map, and the disparity map should be given a larger weight in the weighted combination.

然后，根据公式

(4)加权组合亮度差异图，得到亮度显著性图。其中

表示跨尺度相加因子。Then, according to the formula

(4) The brightness difference map is weighted and combined to obtain the brightness saliency map. in

Represents a cross-scale additive factor.

接着计算颜色显著性图。根据Ewald的“颜色对”模型，在人眼视觉感受野的中心，神经元若被R颜色激活则被G颜色抑制，若被B颜色激活则被Y颜色抑制，因此根据公式(21～24)将图像从rgb三个通道转化为RGBY四个通道：The color saliency map is then calculated. According to Ewald's "color pair" model, in the center of the visual receptive field of the human eye, if the neuron is activated by the R color, it is inhibited by the G color, and if it is activated by the B color, it is inhibited by the Y color. Therefore, according to formula (21~24) Convert the image from three channels of rgb to four channels of RGBY:

R＝r-(g+b)/2 (5)R=r-(g+b)/2 (5)

G＝g-(r+b)/2 (6)G=g-(r+b)/2 (6)

B＝b-(g+r)/2 (7)B=b-(g+r)/2 (7)

Y＝(r+g)/2-|r-g|/2-b (8)Y=(r+g)/2-|r-g|/2-b (8)

随后根据人眼视觉细胞的特性，计算红绿和蓝黄对抗色组RG和BY，也即：RG(c,s)＝|R(c)-G(c)|Θ|G(s)-R(s)(9)，

求取RG的具体过程为：分别在尺度c和尺度s上，逐像素对图像的R、G通道相减求绝对值，然后将s尺度上的计算结果插值到c尺度上，最后与c尺度上的原结果逐像素作差。BY的求取过程类似。这样多次相减，可分别得到6幅RG和BY上的颜色差异显著性图。Then according to the characteristics of human visual cells, the red-green and blue-yellow anti-color groups RG and BY are calculated, that is: RG(c,s)=|R(c)-G(c)|Θ|G(s)- R(s)(9),

The specific process of obtaining RG is as follows: on the scale c and scale s respectively, subtract the R and G channels of the image pixel by pixel to obtain the absolute value, then interpolate the calculation result on the s scale to the c scale, and finally compare the c scale with the c scale. The original result on the above is pixel-by-pixel difference. The process of obtaining BY is similar. In this way, 6 saliency maps of color difference on RG and BY can be obtained respectively.

然后，根据公式

(11)加权组合颜色差异图，得到颜色显著性图。Then, according to the formula

(11) Weighting and combining the color difference maps to obtain a color saliency map.

接着计算区域显著性图。对前景目标进行超像素分割，接着统计每个超像素的归一化颜色直方图，并根据颜色直方图对超像素进行聚类，将图像分割为几个区域{r₁,r₂,...,r_k}，每个区域的聚类中心为ce_i，i＝1,...,k。则区域r_i的区域显著性图VA_r可以计算如下：Then calculate the regional saliency map. The foreground target is divided into superpixels, and then the normalized color histogram of each superpixel is counted, and the superpixels are clustered according to the color histogram, and the image is divided into several regions {r ₁ , r ₂ , . . . .,r _k }, the cluster center of each region is ce _i , i=1,...,k. Then the regional _saliency map VA _r of the region ri can be calculated as follows:

其中，w(r_i)表示r_i区域面积的权重，计算方法为：Among them, _w (ri) represents the weight of the area of _ri , and the calculation method is:

其中，PN(r_i)表示区域r_i的像素个数。上式表示区域面积越大，权重越大，即区域r_i周围面积大的区域对其显著性的影响要大于面积小的区域。Among them, PN(r _i ) represents the number of pixels in the region _ri . The above formula indicates that the larger the area, the greater the weight, that is, the larger area around the region _ri has a greater influence on its significance than the smaller area.

Dr(r_j,r_i)表示的是区域r_j和区域r_i的聚类中心的距离，即：Dr(r _j , _ri ) represents the distance between the area r _j and the cluster center of the area _ri , namely:

其中，ce_i(m)、ce_j(n)表示的是颜色直方图ce_i、ce_j的第m、n个颜色分量，Dc(m,n)表示的是第m、n种颜色在LAB颜色空间的欧氏距离。Among them, ce _i (m) and ce _j (n) represent the m and n color components of the color histograms ce _i and ce _j , and Dc(m,n) represent the m and n colors in the LAB Euclidean distance in color space.

本发明实施例提出一种超像素分割和聚类相结合的区域显著性提取方法，具体实施步骤将在下文图2进行说明。The embodiment of the present invention proposes a regional saliency extraction method combining superpixel segmentation and clustering. The specific implementation steps will be described in FIG. 2 below.

步骤A)对输入图像作超像素分割。可选用的超像素分割方法包括归一化分割(NC)算法、图割(GS)算法、快速漂移(QS)算法、简单线性迭代聚类(SLIC)算法等。考虑到算法的快速性和实用性，本发明实施例选择SLIC算法，具体步骤为：Step A) Perform superpixel segmentation on the input image. The optional superpixel segmentation methods include normalized segmentation (NC) algorithm, graph cut (GS) algorithm, fast drift (QS) algorithm, simple linear iterative clustering (SLIC) algorithm and so on. Considering the rapidity and practicability of the algorithm, the embodiment of the present invention selects the SLIC algorithm, and the specific steps are:

1)假设图像的总像素数为N，预将其分割为K*K个超像素。首先将整幅图像均匀分割为K*K个小块，取每个小块的中心作为初始点。在每个初始点的3*3邻域计算像素梯度，梯度最小的点为超像素分割算法的初始中心O_i，i＝0,1,...K*K-1，赋予每个初始中心一个单独的标签；1) Assuming that the total number of pixels of the image is N, it is pre-segmented into K*K superpixels. First, the entire image is evenly divided into K*K small blocks, and the center of each small block is taken as the initial point. Calculate the pixel gradient in the 3*3 neighborhood of each initial point, the point with the smallest gradient is the initial center O _i of the superpixel segmentation algorithm, i=0,1,...K*K-1, assign each initial center a separate label;

2)将每个像素点表示为CIELAB颜色空间和XY坐标的五维向量{l,a,b,x,y}，计算每个像素点与其最接近的中心的距离，距离计算公式为：2) Express each pixel as a five-dimensional vector {l,a,b,x,y} of CIELAB color space and XY coordinates, and calculate the distance between each pixel and its closest center. The distance calculation formula is:

其中，d_lab为颜色差异值，d_xy为位置差异值，S为中心间距，m为平衡参数，Dis为像素之间的距离。对每个像素，赋予与之最接近的中心的标签；Among them, d _lab is the color difference value, d _xy is the position difference value, S is the center distance, m is the balance parameter, and Dis is the distance between pixels. For each pixel, assign the label of the closest center to it;

3)重新计算不同标签的像素的中心，更新O_i，并计算新旧中心的差值，若差值小于阈值则算法结束，否则返回步骤2)。3) Recalculate the centers of pixels with different labels, update O _i , and calculate the difference between the old and new centers, if the difference is less than the threshold, the algorithm ends, otherwise, return to step 2).

步骤B：统计每个超像素的归一化颜色直方图。即将Lab颜色空间的每个维度分为若干个bin，统计超像素中的像素颜色落在每个bin中的概率，最后将统计的直方图归一化。Step B: Count the normalized color histogram of each superpixel. That is, each dimension of the Lab color space is divided into several bins, the probability that the pixel color in the superpixel falls in each bin is counted, and finally the statistical histogram is normalized.

步骤C：对超像素进行聚类，将图像分为若干个连续区域。可选用的聚类方法包括基于划分、模型、层次、网格、密度聚类的任意一种，本实施例中使用基于密度的DBSCAN聚类算法。具体步骤为：从超像素中选取一个作为种子点，用给定的误差阈值Eps和MinPts搜索所有它的密度可达超像素并判断该点是否为核心点。如果是核心点，则与其密度可达点形成一个聚类区域；如果不是核心点也不是边界点，则重新选择其他点作为种子点并重复以上步骤；如果不是核心点而是边界点，则认为是噪声点并丢弃。重复以上步骤直至所有点均被检索，最后形成若干个划分区域。DBSCAN能有效应对噪声点，能划分任意形状的类簇，因此适用于本实施例。Step C: Cluster the superpixels to divide the image into several continuous regions. The optional clustering method includes any one of division, model, hierarchy, grid, and density-based clustering. In this embodiment, the density-based DBSCAN clustering algorithm is used. The specific steps are: select one of the superpixels as a seed point, use the given error thresholds Eps and MinPts to search for all its density-reachable superpixels and determine whether the point is a core point. If it is a core point, it forms a clustering area with its density reachable points; if it is not a core point or a boundary point, re-select other points as seed points and repeat the above steps; if it is not a core point but a boundary point, it is considered that is the noise point and discarded. Repeat the above steps until all points are retrieved, and finally form several divided areas. DBSCAN can effectively deal with noise points and can be divided into clusters of arbitrary shapes, so it is suitable for this embodiment.

步骤D：计算区域显著度。根据超像素聚类结果和公式(12)(13)(14)，计算每个区域的区域显著度。Step D: Calculate the regional saliency. According to the superpixel clustering results and formulas (12) (13) (14), the regional saliency of each region is calculated.

最后，根据颜色信息、区域信息和亮度信息的重要程度，使用公式VA＝α_gVA_g+α_cVA_c+α_rVA_r (18)和α_g+α_c+α_r＝1 (19)融合三类显著性图，得到最后的抠图视觉显著图。在本实施例中，取α_c＝0.5,α_r＝0.3,α_g＝0.2。Finally, according to the importance of color information, area information and luminance information, the formulas VA = α _g VA _g + α _c VA _c +α _r VA _r (18) and α _g + α _c +α _r =1 (19) The three types of saliency maps are fused to obtain the final matting visual saliency map. In this embodiment, α _c =0.5, α _r =0.3, and α _g =0.2.

步骤二：根据抠图视觉显著度图，使用空间域滤波和阈值分割算法，分离出前景区域、背景区域，结合形态学运算得到三分图。Step 2: According to the visual saliency map of the matting, use spatial domain filtering and threshold segmentation algorithms to separate the foreground area and the background area, and combine the morphological operations to obtain a tripartite map.

首先使用空间域滤波器对抠图视觉显著图进行平滑以去除噪点，随后选用阈值分割算法(如Otsu方法)计算得到阈值T_va，则在显著图中，显著性值大于T_va的像素对应为前景区域，小于T_va的像素对应为背景区域，从而得到粗分三分图I_tc。Firstly, the spatial domain filter is used to smooth the matting visual saliency map to remove noise, and then a threshold segmentation algorithm (such as the Otsu method) is used to calculate the threshold T _va , then in the saliency map, the pixels with a saliency value greater than T _va correspond to For the foreground area, the pixels smaller than T _va correspond to the background area, so as to obtain the coarse three-part map It _tc .

空间域滤波是为了保证局部区域不出现显著度的奇点，同时对显著度值进行平滑，可选择中值滤波、双边滤波或者高斯滤波器。为了保证算法的计算效率，本实施例中选择高斯滤波器作为抠图视觉显著度图的平滑滤波器，滤波器窗口为3*3。The spatial domain filtering is to ensure that the singularity of saliency does not appear in the local area, and at the same time to smooth the saliency value, you can choose median filter, bilateral filter or Gaussian filter. In order to ensure the computational efficiency of the algorithm, in this embodiment, a Gaussian filter is selected as the smoothing filter of the visual saliency map for matting, and the filter window is 3*3.

由于在抠图视觉显著度图中，综合考虑了亮度显著性、颜色显著性和区域显著性，保证前景区域的显著度远大于背景区域的显著度，因此对于阈值分割算法的性能并没有苛刻要求，阈值T_va可以在一个较宽的范围内取值而不会影响前景分割的结果。在本实施例中，我们选取Otsu阈值算法，具体步骤为：Since the brightness saliency, color saliency and regional saliency are comprehensively considered in the matting visual saliency map to ensure that the saliency of the foreground area is much greater than that of the background area, there is no strict requirement for the performance of the threshold segmentation algorithm. , the threshold T _va can be valued in a wide range without affecting the results of foreground segmentation. In this embodiment, we choose the Otsu threshold algorithm, and the specific steps are:

(1)假设抠图视觉显著度图VA的灰度级为0,1,...,L-1，图像总像素数为N，统计其灰度直方图，即：假设VA中灰度为i，i＝0,1,...,L-1的像素个数为N_i，则灰度直方图中对应灰度级i的值为N_i/N；(1) Assume that the gray level of the matting visual saliency map VA is 0, 1, ..., L-1, the total number of pixels in the image is N, and the gray level histogram is calculated, that is, the gray level in the VA is assumed to be i, i=0,1,..., the number of pixels of L-1 is N _i , then the value of the corresponding gray level i in the grayscale histogram is N _i /N;

(2)将阈值T_va从0到L-1遍历，将像素分为小于T_va和大于等于T_va两类，统计这两类之间的类间差：(2) Traverse the threshold T _va from 0 to L-1, divide the pixels into two categories: less than T _va and greater than or equal to T _va , and count the inter-class difference between the two categories:

g＝ω₀(μ₀-μ)²+ω₁(μ₁-μ)² (20)g=ω ₀ (μ ₀ -μ) ² +ω ₁ (μ ₁ -μ) ² (20)

其中，ω₀、ω₁分别是小于T_va和大于等于T_va的像素个数所占的比重，μ₀、μ₁分别是小于T_va和大于等于T_va的像素均值。Among them, ω ₀ and ω ₁ are the proportions of the number of pixels less than T _va and greater than or equal to T _va , respectively, and μ ₀ and μ ₁ are the average values of pixels less than T _va and greater than or equal to T _va , respectively.

(3)遍历之后，找到类间差g取最大值时对应的T_va作为最终的分割阈值。(3) After traversing, find the corresponding T _va when the inter-class difference g takes the maximum value as the final segmentation threshold.

形态学算子的形状、尺寸应该根据图像内容如图像分辨率、前景区域的尺寸、形状进行选取，本实施例中默认为圆形以保证各个方向的均匀性。进行形态学操作时，为避免阈值分割后出现小的孔洞和毛边，对I_tc作如下的形态学操作：首先作一次开运算，以连接局部不连续的区域，去除孔洞；随后作尺寸为r_e的腐蚀操作，得到三分图的前景区域F_g；作尺寸为r_d的膨胀操作，得到三分图的背景部分B_g，前景和背景之间的区域为未知区域，这样就得到待抠图像I的细分三分图I_t。The shape and size of the morphological operator should be selected according to the image content, such as image resolution, size and shape of the foreground area, and in this embodiment, the default is a circle to ensure uniformity in all directions. When performing morphological operations, in order to avoid small holes and burrs after threshold segmentation, the following morphological operations are performed on I _tc : first, an opening operation is performed to connect local discontinuous areas and remove holes; then the size is r. The corrosion operation of _e can obtain the foreground area F _g of the tripartite map; as an expansion operation of size r _d , the background part _Bg of the tripartite map can be obtained, and the area between the foreground and the background is the unknown area, so that the to-be-cut area is obtained. Subdivision tripartite map It of image _I.

假设前景区域的灰度值为1，背景区域的灰度值为0，使用形态学核算子与二值图进行卷积，对于白色的前景区域，腐蚀能缩小其边界，而对于黑色的背景区域，膨胀能缩小其边界，最后前景和背景之间即为未知区域。Assuming that the gray value of the foreground area is 1, the gray value of the background area is 0, and the morphological operator is used to convolve the binary image. For the white foreground area, corrosion can shrink its boundaries, while for the black background area , the expansion can shrink its boundary, and finally the unknown area is between the foreground and the background.

步骤三：根据三分图，对未知区域的每个像素进行梯度计算，根据梯度方向和显著性大小采样得到当前未知区域像素的前景、背景样本点集。Step 3: Perform gradient calculation on each pixel of the unknown area according to the trisector map, and obtain the foreground and background sample point sets of the pixels in the current unknown area by sampling according to the gradient direction and saliency size.

对未知区域的每个像素I(x_i,y_i)，计算其梯度值Gra_i，梯度的方向记为θ，θ的计算公式为For each pixel I(x _i , y _i ) in the unknown area, calculate its gradient value Gra _i , the direction of the gradient is denoted as θ, and the calculation formula of θ is

在本实施例中，在梯度方向的直线上搜索到参考的前景、背景样本点对，在该样本点对的周围，根据当前位置像素点的抠图视觉显著度值确定搜索区域的大小，显著性越大，搜索范围越小，表示在显著性大的像素周围，真实的前景、背景点更靠近该像素。接着，根据空间距离和视觉显著度分别搜索出5个前景和背景样本对，具体过程为：In this embodiment, a reference pair of foreground and background sample points is searched on a straight line in the gradient direction, and around the pair of sample points, the size of the search area is determined according to the visual saliency value of the pixel point at the current position. The greater the saliency, the smaller the search range, which means that around the pixel with high saliency, the real foreground and background points are closer to the pixel. Then, according to the spatial distance and visual saliency, 5 pairs of foreground and background samples are respectively searched. The specific process is as follows:

1)令搜索半径r_s的初始值为1，计数count＝0；1) Let the initial value of the search radius _rs be 1, and the count count=0;

2)在以参考的前景/背景样本点为中心，r_s为半径为的圆上，对所有像素点p计算是否满足条件：|VA(p)-VA(p₀)|<T_vap，其中p₀为搜索区域的中心即参考的前景或背景样本点，p为搜索圆上的点，若满足条件则count+1；2) On the circle with the reference foreground/background sample point as the center and _rs as the radius, calculate whether the condition is satisfied for all pixel points p: |VA(p)-VA(p ₀ )|<T _vap , where p ₀ is the center of the search area, that is, the reference foreground or background sample point, p is the point on the search circle, if the conditions are met, count+1;

3)判断是否count>5，若是则搜索停止；若否则r_s++并返回步骤2)。3) Determine whether count>5, if so, stop the search; if otherwise, _rs ++ and return to step 2).

这样的采样策略，一方面可生成较少的采样点对，降低后续抠图计算的复杂度，另一方面沿着梯度方向进行采样，能较大概率保证前景和背景点分别位于不同的纹理区域，同时根据邻域显著性值的采样又能保证采样点之间具有一定的空间和显著性相似度，因此该采样方法能较大概率的保证包含真实采样点对，从而提高抠图的准确性Such a sampling strategy, on the one hand, can generate fewer sampling point pairs, reducing the complexity of subsequent matting calculations, and on the other hand, sampling along the gradient direction can ensure that the foreground and background points are located in different texture areas with a high probability. At the same time, the sampling according to the saliency value of the neighborhood can ensure that there is a certain degree of spatial and saliency similarity between the sampling points, so the sampling method can ensure that the real sampling point pair is included with a high probability, thereby improving the accuracy of matting

从前景样本点集和背景样本点集中任选一点，根据成像线性模型，估计不透明度为Choose a point from the foreground sample point set and the background sample point set. According to the imaging linear model, the estimated opacity is

其中

和

分别表示第m个前景样本点和第n个背景样本点的颜色值。这样对于每个未知区域像素，一共获得25个不同的不透明度估计，随后需要从中选出置信度最高的不透明度用于抠出前景目标。in

and

Represent the color values of the mth foreground sample point and the nth background sample point, respectively. In this way, for each unknown region pixel, a total of 25 different opacity estimates are obtained, and then the opacity with the highest confidence needs to be selected for the foreground target.

最优的前景背景点对的要求为：1)对于式(1)的线性模型具有最小的误差；2)前景样本点和背景样本点具有较大的颜色差；3)前景或者背景样本点与当前像素的颜色值较接近；4)前景和背景样本点与当前像素的空间距离较小。The requirements for the optimal pair of foreground and background points are: 1) the linear model of formula (1) has the smallest error; 2) the foreground sample point and the background sample point have a large color difference; 3) the foreground or background sample point is different from The color value of the current pixel is close; 4) The spatial distance between the foreground and background sample points and the current pixel is small.

根据准则1)和准则2)，定义线性-色差相似度为According to Criterion 1) and Criterion 2), the linear-color difference similarity is defined as

根据准则3)，定义颜色相似度为According to criterion 3), the color similarity is defined as

根据准则4)，定义空间距离相似度为According to criterion 4), the spatial distance similarity is defined as

定义置信度函数为Define the confidence function as

其中，

分别是线性-色差相似度、颜色相似度、空间距离相似度，D_i是未知像素的前背景采样半径，由其抠图视觉显著度决定，视觉显著度越大则D_i越小，σ₁、σ₂和σ₃用于调整不同相似度之间的权值。选取置信度最高的α作为当前未知像素的不透明度的估计，而对应的前景、背景样本对则作为最终抠图使用的最佳前景、背景样本对。in,

are the linear-color difference similarity, color similarity, and spatial distance similarity, respectively, D _i is the front and background sampling radius _of the unknown pixel, which is determined by its matting visual saliency _. , σ ₂ and σ ₃ are used to adjust the weights between different similarities. The α with the highest confidence is selected as the estimation of the opacity of the current unknown pixel, and the corresponding pair of foreground and background samples is used as the best pair of foreground and background samples for final matting.

针对未知区域的每个像素，逐点如前估计不透明度。最终，有可能某些像素点的置信度过低，导致估计出的不透明度误差较大，最后抠图出现色差。因此需要对未知区域不透明度做局部平滑。平滑时需要考虑的因素有：颜色值差异、空间位置差异、显著度差异，即局部颜色差异越大、局部空间位置越远、显著度越大的像素，权值越小。为此，为平衡空间域、颜色值域、显著度值域的影响，本发明中的不透明度平滑方法如下：For each pixel of the unknown region, the opacity is estimated point-by-point as before. In the end, it is possible that the confidence of some pixels is too low, resulting in a large error in the estimated opacity, and finally chromatic aberration in the matting. Therefore, it is necessary to locally smooth the opacity of the unknown region. The factors that need to be considered when smoothing are: color value difference, spatial position difference, and saliency difference, that is, the larger the local color difference, the farther the local spatial position, and the larger the saliency pixel, the smaller the weight. For this reason, in order to balance the influence of the spatial domain, the color value domain, and the saliency value domain, the opacity smoothing method in the present invention is as follows:

其中，P_i、P_j表示i、j二点的坐标，I_i、I_j表示i、j二点的颜色，VA_i、VA_j表示i、j二点的抠图视觉显著度，σ_p、σ_c和σ_va用于调整三者之间的权重。这样计算出来的不透明度由于充分考虑了空间位置、颜色和显著度的影响，使得空间位置越近、颜色越相似、显著度越接近的像素之间不透明度越接近，与人眼的主观感受是一致的，从而有效排除了不透明度的奇异点，提高了抠图的精度。Among them, P _i , P _j represent the coordinates of the two points i and j, I _i , I _j represent the colors of the two points i and j, VA _i , VA _j represent the visual significance of the matting of the two points i and j, σ _p , σ _c and σ _va are used to adjust the weight between the three. The opacity calculated in this way fully considers the influence of spatial position, color and saliency, so that the closer the spatial position, the more similar the color, and the closer the saliency is, the closer the opacity of the pixels is, which is similar to the subjective feeling of the human eye. Consistent, thus effectively eliminating the singular point of opacity and improving the precision of matting.

根据最优的前景背景点对的4个准则，确定其度量函数如式(30)所示，其中综合考虑了线性模型符合度、前景背景颜色差、目标像素与前背景的颜色差、空间距离四个指标。通过设定σ₁、σ₂、σ₃的值，用于调整不同相似度的权值。随后，根据公式(31)，对不透明度进行平滑，得到最终抠图用的不透明度。在本实施例中，平滑操作综合考虑了颜色差异、空间位置差异和显著度差异，通过改变σ_p、σ_c和σ_va，可以调整这三者在加权系数中的比重。例如，如果注重显著度信息，即σ_va应该大于σ_p、σ_c。According to the four criteria of the optimal foreground and background point pairs, the metric function is determined as shown in formula (30), which comprehensively considers the linear model conformity, the color difference between the foreground and background, the color difference between the target pixel and the foreground and background, and the spatial distance. four indicators. By setting the values of σ ₁ , σ ₂ , and σ ₃ , it is used to adjust the weights of different degrees of similarity. Then, according to formula (31), the opacity is smoothed to obtain the opacity for final matting. In this embodiment, the smoothing operation comprehensively considers color difference, spatial position difference and saliency difference, and by changing σ _p , σ _c and σ _va , the proportions of these three in the weighting coefficient can be adjusted. For example, if the saliency information is emphasized, that is, σ _va should be greater than σ _p and σ _c .

具体的操作步骤为：新建一张与原图大小相同的图像作为背景，利用计算出的不透明度和前景像素值，根据公式(1)与新背景进行合成，得到最终的抠图结果。The specific operation steps are as follows: create a new image with the same size as the original image as the background, use the calculated opacity and foreground pixel values, and synthesize it with the new background according to formula (1) to obtain the final matting result.

本发明实例中针对自然图像进行抠图，对于前景目标、背景的具体内容不作限定，只需要前景与背景有肉眼可辨的明显区分边界即可。In the example of the present invention, the natural image is cut out, and the specific content of the foreground target and the background is not limited, as long as the foreground and the background have a clearly distinguishable boundary that can be discerned by the naked eye.

本发明实施例提供了一种自动抠图算法，通过获取待抠图的原始图像，计算原始图像的抠图视觉显著度；接着根据抠图视觉显著度图，使用空间域滤波和阈值分割算法，分离出前景区域、背景区域，结合形态学运算得到三分图；根据三分图，对未知区域的每个像素进行梯度计算，根据梯度方向和显著性大小采样得到当前未知区域像素的前景、背景样本点集；根据当前未知区域像素的前景、背景样本点集，计算每个样本点的不透明度和置信度，取置信度最高的样本对作为最终抠图用的最佳样本对。然后平滑不透明度的局部区域，得到最终估计的不透明度；根据最终估计的不透明度和最佳样本对的颜色值，在原始图像中进行抠图操作，提取出前景目标。本发明提出的抠图视觉显著性计算方法模拟人眼的视觉注意机制，可自动提取前景目标，免去了复杂的用户交互操作，完成全自动抠图过程，操作简单方便；通过限制样本点对的数量，缩短了抠图时间；对显著性图和不透明度均进行了平滑，提高了抠图精度。The embodiment of the present invention provides an automatic image matting algorithm. By acquiring the original image to be matted, the matting visual saliency of the original image is calculated; then according to the matting visual saliency map, spatial domain filtering and threshold segmentation algorithms are used, Separate the foreground area and the background area, and combine the morphological operations to obtain a tripartite map; according to the tripartite map, perform gradient calculation on each pixel of the unknown area, and sample the foreground and background of the current unknown area pixel according to the gradient direction and saliency size. Sample point set: Calculate the opacity and confidence of each sample point according to the foreground and background sample point sets of pixels in the current unknown area, and take the sample pair with the highest confidence as the best sample pair for final matting. Then smooth the local area of opacity to obtain the final estimated opacity; according to the final estimated opacity and the color value of the best sample pair, perform a matting operation in the original image to extract the foreground target. The method for calculating the visual saliency of map cutout proposed by the invention simulates the visual attention mechanism of the human eye, can automatically extract foreground objects, eliminates complex user interaction operations, completes the automatic map cutout process, and the operation is simple and convenient; The number of saliency maps and the opacity are smoothed, which shortens the matting time; the saliency map and opacity are smoothed to improve the matting accuracy.

图3是本发明实施例提供的一种自动抠图装置的结构示意图，该装置包括：3 is a schematic structural diagram of an automatic map-out device provided by an embodiment of the present invention, and the device includes:

抠图视觉显著度计算模块，用于根据所述图像获取模块获取的图像，计算其抠图视觉显著度；A cutout visual saliency calculation module for calculating the cutout visual saliency of the image obtained by the image acquisition module;

样本点集获取模块，根据所述三分图计算模块获取的三分图，对未知区域的每个像素进行梯度计算，根据梯度方向和显著性大小采样得到当前未知区域像素的前景、背景样本点集；The sample point set acquisition module performs gradient calculation on each pixel of the unknown area according to the tri-partite map obtained by the tri-partite map calculation module, and obtains the foreground and background sample points of the current unknown area pixel by sampling according to the gradient direction and the saliency size. set;

具体地，所述抠图视觉显著度计算模块包括：Specifically, the cutout visual saliency calculation module includes:

尺度金字塔生成单元，用于根据获取的待抠图像，进行平滑和降采样，生成尺度金字塔；The scale pyramid generation unit is used to perform smoothing and downsampling according to the acquired image to be matted to generate a scale pyramid;

亮度显著性计算单元，用于根据尺度金字塔生成单元获得的尺度金字塔，以细尺度上的图像为视觉的中央区域，粗尺度上的图像为视觉的周边区域，计算亮度显著性图；The luminance saliency calculation unit is used to generate the scale pyramid obtained by the unit according to the scale pyramid, taking the image on the fine scale as the central area of vision, and the image on the coarse scale as the peripheral area of vision, and calculates the luminance saliency map;

颜色显著性计算单元，用于根据尺度金字塔生成单元获得的尺度金字塔，以细尺度上的图像为视觉的中央区域，粗尺度上的图像为视觉的周边区域，计算颜色显著性图；The color saliency calculation unit is used to generate the scale pyramid obtained by the scale pyramid generation unit, taking the image on the fine scale as the central area of vision, and the image on the coarse scale as the peripheral area of vision, and calculates the color saliency map;

区域显著性计算单元，用于根据图像获取模块获取的待抠图像，对前景目标进行超像素分割，并根据颜色直方图对超像素进行聚类，计算每个聚类区域的颜色显著性；The regional saliency calculation unit is used to perform superpixel segmentation on the foreground target according to the to-be-mashed image obtained by the image acquisition module, and cluster the superpixels according to the color histogram, and calculate the color saliency of each clustered region;

显著性融合单元，用于根据亮度显著性计算单元获取的亮度显著性图、颜色显著性计算单元获取的颜色显著性图、区域显著性计算单元获取的区域显著性图，融合得到待抠图像的抠图视觉显著性图。The saliency fusion unit is used to fuse the brightness saliency map obtained by the brightness saliency calculation unit, the color saliency map obtained by the color saliency calculation unit, and the regional saliency map obtained by the regional saliency calculation unit to obtain the image to be keyed. Cutout visual saliency map.

所述三分图计算模块包括：The three-part graph calculation module includes:

空间域滤波单元，用于选取合适的空间域滤波方法，对抠图视觉显著度图进行平滑；The spatial domain filtering unit is used to select an appropriate spatial domain filtering method to smooth the visual saliency map of the matting;

阈值分割单元，用于根据空间域滤波单元获得的平滑抠图视觉显著度图，选用阈值分割算法分割获取前景区域和背景区域，得到粗略的三分图；The threshold segmentation unit is used to obtain a rough three-part map by using a threshold segmentation algorithm to segment and obtain the foreground area and the background area according to the smooth matting visual saliency map obtained by the spatial domain filtering unit;

形态学运算单元：用于根据阈值分割单元获取的粗略三分图，进行形态学操作以填充孔洞，得到前景、背景和未知区域，即精确的三分图。Morphological operation unit: It is used to perform morphological operations on the rough tripartite map obtained by the threshold segmentation unit to fill the holes, and obtain the foreground, background and unknown areas, that is, the precise tripartite map.

所述样本点集获取模块包括：The sample point set acquisition module includes:

梯度计算单元，用于根据所述待抠图像的灰度值，获取每个未知像素的梯度；a gradient calculation unit, configured to obtain the gradient of each unknown pixel according to the grayscale value of the image to be matted;

采样单元，用于根据梯度计算单元得到的梯度方向作直线，取直线与前景区域和背景区域的第一个交点对作为初始搜索点，在该搜索点的邻域，由近到远搜索与未知像素显著性值差异小于阈值的样本点。The sampling unit is used to draw a straight line according to the gradient direction obtained by the gradient calculation unit, and take the first intersection pair of the straight line and the foreground area and the background area as the initial search point. In the neighborhood of the search point, search from near to far and unknown Sample points whose pixel saliency value difference is less than the threshold.

所述不透明度计算模块包括：The opacity calculation module includes:

线性-色差相似度计算单元：用于根据样本点集获取模块得到的样本点集，逐对取出样本点，并计算其线性-颜色相似度；Linear-color difference similarity calculation unit: used to obtain the sample point set obtained by the module according to the sample point set, take out the sample points one by one, and calculate their linear-color similarity;

颜色相似度计算单元：用于根据样本点集获取模块得到的样本点集，逐对取出样本点，并计算其颜色相似度；Color similarity calculation unit: used to obtain the sample point set obtained by the module according to the sample point set, take out the sample points one by one, and calculate their color similarity;

空间距离相似度计算单元：用于根据样本点集获取模块得到的样本点集，逐对取出样本点，并计算其空间距离相似度；Spatial distance similarity calculation unit: used to obtain the sample point set obtained by the module according to the sample point set, take out the sample points one by one, and calculate their spatial distance similarity;

样本筛选单元：用于根据从线性-色差相似度计算单元、颜色相似度计算单元和空间距离相似度计算单元获取的相似度值，计算每一对样本点相对当前未知像素的置信度；选取置信度最高的不透明度作为当前位置像素的不透明度的估计。Sample screening unit: used to calculate the confidence of each pair of sample points relative to the current unknown pixel according to the similarity values obtained from the linear-color difference similarity calculation unit, the color similarity calculation unit and the spatial distance similarity calculation unit; select the confidence The highest degree of opacity is used as an estimate of the opacity of the current loxel.

平滑单元：用于根据样本筛选单元获取的不透明度，对其做局部平滑。Smoothing unit: It is used to locally smooth the opacity obtained by the sample screening unit.

本发明实施例提供了一种自动抠图装置，通过获取待抠图的原始图像，计算原始图像的抠图视觉显著度；接着根据抠图视觉显著度图，使用空间域滤波和阈值分割算法，分离出前景区域、背景区域，结合形态学运算得到三分图；根据三分图，对未知区域的每个像素进行梯度计算，根据梯度方向和显著性大小采样得到当前未知区域像素的前景、背景样本点集；根据当前未知区域像素的前景、背景样本点集，计算每个样本点的不透明度和置信度，取置信度最高的样本对作为最终抠图用的最佳样本对。然后平滑不透明度的局部区域，得到最终估计的不透明度；根据最终估计的不透明度和最佳样本对的颜色值，在原始图像中进行抠图操作，提取出前景目标。本发明提出的抠图视觉显著性计算方法模拟人眼的视觉注意机制，可自动提取前景目标，免去了复杂的用户交互操作，完成全自动抠图过程，操作简单方便；通过限制样本点对的数量，缩短了抠图时间；对显著性图和不透明度均进行了平滑，提高了抠图精度。The embodiment of the present invention provides an automatic matting device, which calculates the matting visual saliency of the original image by acquiring the original image to be matted; and then uses spatial domain filtering and threshold segmentation algorithms according to the matting visual saliency map, Separate the foreground area and the background area, and combine the morphological operations to obtain a tripartite map; according to the tripartite map, perform gradient calculation on each pixel of the unknown area, and sample the foreground and background of the current unknown area pixel according to the gradient direction and saliency size. Sample point set: Calculate the opacity and confidence of each sample point according to the foreground and background sample point sets of pixels in the current unknown area, and take the sample pair with the highest confidence as the best sample pair for final matting. Then smooth the local area of opacity to obtain the final estimated opacity; according to the final estimated opacity and the color value of the best sample pair, perform a matting operation in the original image to extract the foreground target. The visual saliency calculation method for map cutout proposed by the invention simulates the visual attention mechanism of human eyes, can automatically extract foreground targets, eliminates complex user interaction operations, completes the automatic map cutout process, and the operation is simple and convenient; The number of saliency maps and the opacity are smoothed to shorten the matting time; the saliency map and the opacity are smoothed to improve the matting accuracy.

本发明实施例还提供了一种自动抠图的计算机程序产品。The embodiment of the present invention also provides a computer program product for automatic image cutout.

Claims

1. an automatic matting algorithm, is characterized in that, this algorithm comprises the steps:

Step 1: Obtain the original image to be matted, and calculate its matting visual saliency;

Step 2: According to the cutout visual saliency map described in Step 1, use spatial domain filtering and threshold segmentation algorithms to separate the foreground area and the background area, and combine morphological operations to obtain a tripartite map;

Step 3: perform gradient calculation on each pixel of the unknown region of the three-part map described in Step 2, and obtain the foreground and background sample point sets of the pixels of the current unknown region by sampling according to the gradient direction and the saliency size;

Step 4: Calculate the opacity and confidence of each sample point according to the foreground and background sample point sets of the pixels in the current unknown area described in Step 3, and take the sample pair with the highest confidence as the best sample pair for final matting. , and then smooth the local area of opacity to obtain the final estimated opacity; the opacity calculation method includes:

Select a point from the set of foreground sample points and the set of background sample points. According to the imaging linear model, the estimated opacity is

in

and

represent the color values of the mth foreground sample point and the nth background sample point respectively; I _i is the color value of the unknown area pixel i; thus, for each unknown area pixel i, a total of 25 different opacity estimates are obtained, and then The opacity with the highest confidence needs to be selected for the foreground target;

The linear-color difference similarity is defined as

Define color similarity as

The spatial distance similarity is defined as

Define the confidence function as

in

are linear-color difference similarity, color similarity, and spatial distance similarity, respectively. Di is the front-background sampling radius of unknown pixel i, which is determined by its matting visual saliency. _The greater the visual saliency, the smaller Di is. σ ₂ and σ ₃ are used to adjust the weights between different degrees of similarity; α with the highest confidence is selected as the estimation of the opacity of the currently unknown pixel, and the corresponding pair of foreground and background samples are used as the foreground and background samples used in the final matting. pair of background samples; x _i is the position of unknown pixel i in the image,

is the position of the mth foreground sample point in the image,

is the position of the mth background sample point in the image;

Finally, smooth the opacity;

Among them, P _i , P _j represent the coordinates of the two points i and j, I _i , I _j represent the colors of the two points i and j, VA _i , VA _j represent the visual significance of the matting of the two points i and j, σ _p , σ _c and σ _va are used to adjust the weight between the three; ω _ij is the weighted weight of the two points i and j on the opacity;

Step 5: According to the final estimated opacity and the color value of the best sample pair described in Step 4, perform a matting operation in the original image to extract the foreground target.

2. a kind of automatic matting algorithm according to claim 1, is characterized in that, the concrete step of calculating the matting visual salience of original image is:

Step (1), calculate the grayscale image I _gray , and perform layer-by-layer smoothing and downsampling to I _gray to generate a scale pyramid of n layers;

Step (2), take the image on the fine scale c as the central area of vision, and the image on the coarse scale s as the peripheral area of vision, first calculate the brightness difference feature map:

I(c,s)=|I(c)ΘI(s)|

where I(c), I(s) are the image pyramids, c∈{2,3,4} is the central scale, s=c+δ, δ∈{3,4} is the surrounding scale, and Θ is the central-peripheral difference operator, and finally get 6 brightness difference feature maps; the brightness saliency map is the normalized weighted sum of the 6 brightness difference feature maps:

in,

represents the normalization function,

represents a cross-scale addition factor;

Step (3), calculate the image on the fine scale c as the central area of vision, and the image on the coarse scale s as the peripheral area of vision, first convert the image from the rgb channel to the RGBY quaternary color channel, and then calculate the red and green channels. Color difference map RG and blue-yellow channel color difference map BY:

RG(c,s)=|R(c)-G(c)|Θ|G(s)-R(s)|

BY(c,s)=|B(c)-Y(c)|Θ|Y(s)-B(s)|

where R(c), G(c), B(c), Y(c) are the RGBY components of the image on scale c, respectively, R(s), G(s), B(s), Y(s) are the RGBY components of the image on scale s, respectively;

The color saliency map is the normalized weighted sum of 12 color difference feature maps:

Step (4), perform superpixel segmentation on the foreground target, then count the normalized color histogram of each superpixel, and cluster the superpixels according to the color histogram, and divide the image into several regions {r1, r2,...,rk}, the cluster center of each region is ce _i , i=1,...,k; then the regional significance value VAr of the region ri can be calculated as follows:

Among them, w(r _i ) represents the weight value of the area area, and Dr(rj,ri) represents the distance between the cluster centers of the area rj and the area ri;

Step (5), synthesizing brightness saliency, color saliency and regional saliency:

VA=α _g VA _g +α _c VA _c +α _r VA _r

α _g +α _c +α _r =1

Among them, α _g , α _c , and α _r are weighting coefficients of different saliency.

3. a kind of automatic matting algorithm according to claim 1, is characterized in that, obtains the method for three-part graph, and concrete steps are:

First, use the spatial domain filter to smooth the cutout visual saliency map to remove noise, and then use the threshold segmentation algorithm to calculate the threshold value Tva, so as to obtain the coarse three-part map Itc; then perform the following morphological operations on Itc: first do a Open operation to connect local discontinuous areas and remove holes; then perform an etching operation of size re to obtain the foreground area Fg of the three-part map; perform an expansion operation of size rd to obtain the background part of the three-part map Bg, foreground The region between the background and the background is an unknown region, and the subdivision tripartite map It of the image I to be matted is obtained.

4. a kind of automatic matting algorithm according to claim 1, is characterized in that, described sample point set acquisition method, specifically comprises:

For each pixel I(xi,yi) in the unknown area, calculate its gradient value Grai, the direction of the gradient is recorded as θ, and the calculation formula of θ is

Draw a straight line along the θ direction to obtain the first intersection of the straight line with the foreground area and the background area, respectively, as the initial search center; in the neighborhood of the intersection, search for 5 points whose difference is less than the threshold Tvap from near to far. Finally, a total of 5*5=25 sample point pairs are generated.

5. an automatic drawing device, it is characterised in that the device comprises:

The image acquisition module is used to collect the color value of a single image;

A cutout visual saliency calculation module for calculating the cutout visual salience of an image according to the image color value obtained by the image acquisition module;

The tripartite map calculation module is used to separate the foreground area and the background area according to the cutout visual saliency map obtained by the cutout visual saliency calculation module. get a three-point map;

The sample point set acquisition module performs gradient calculation on each pixel in the unknown area according to the trisectoral graph obtained by the tripartite graph calculation module, and obtains the foreground and background sample points of the pixel in the current unknown area by sampling according to the gradient direction and the saliency size. set;

The opacity calculation module calculates the opacity and confidence of each sample point according to the foreground and background sample point sets obtained by the sample point set acquisition module, and takes the sample pair with the highest confidence as the best sample for final matting Yes; then smooth the local area of opacity to obtain the final estimated opacity; the opacity calculation module includes:

Linear-color difference similarity calculation unit: used to obtain the sample point set obtained by the module according to the sample point set, take out the sample points one by one, and calculate their linear-color similarity;

Color similarity calculation unit: used to obtain the sample point set obtained by the module according to the sample point set, take out the sample points one by one, and calculate their color similarity;

Spatial distance similarity calculation unit: used to obtain the sample point set obtained by the module according to the sample point set, take out the sample points one by one, and calculate their spatial distance similarity;

Sample screening unit: used to calculate the confidence of each pair of sample points relative to the current unknown pixel according to the similarity values obtained from the linear-color difference similarity calculation unit, the color similarity calculation unit and the spatial distance similarity calculation unit; select the confidence The highest degree of opacity is used as an estimate of the opacity of the current position pixel;

Smoothing unit: used for local smoothing based on the opacity obtained by the sample screening unit; the factors considered during smoothing include: difference in color value, difference in spatial position, and difference in significance;

The foreground keying module is used to perform a keying operation in the original image according to the final estimated opacity and the color value of the best sample pair to extract the foreground target.

6. a kind of automatic matting device according to claim 5, is characterized in that, described matting visual salience calculation module comprises:

The scale pyramid generation unit is used to perform smoothing and downsampling according to the acquired image to be matted to generate a scale pyramid;

The luminance saliency calculation unit is used to generate the scale pyramid obtained by the unit according to the scale pyramid, taking the image on the fine scale as the central area of vision, and the image on the coarse scale as the peripheral area of vision, and calculates the luminance saliency map;

The color saliency calculation unit is used to generate the scale pyramid obtained by the scale pyramid generation unit, taking the image on the fine scale as the central area of vision, and the image on the coarse scale as the peripheral area of vision, and calculates the color saliency map;

The regional saliency calculation unit is used to perform superpixel segmentation on the foreground target according to the to-be-mashed image obtained by the image acquisition module, and cluster the superpixels according to the color histogram, and calculate the color saliency of each clustered region;

The saliency fusion unit is used to fuse the brightness saliency map obtained by the brightness saliency calculation unit, the color saliency map obtained by the color saliency calculation unit, and the regional saliency map obtained by the regional saliency calculation unit to obtain the image to be keyed. Cutout visual saliency map.

7. a kind of automatic matting device according to claim 5, is characterized in that, described tripartite graph calculation module comprises:

The spatial domain filtering unit is used to select a suitable spatial domain filtering method to smooth the visual saliency map of the matting;

The threshold segmentation unit is used to obtain the foreground area and the background area by using a threshold segmentation algorithm to obtain a rough three-part map according to the smooth matting visual saliency map obtained by the spatial domain filtering unit;

Morphological operation unit: It is used to perform morphological operations to fill the holes according to the coarse three-part map obtained by the threshold segmentation unit, and obtain the foreground, background and unknown area, that is, an accurate three-part map.

8. a kind of automatic matting device according to claim 5, is characterized in that, described sample point set acquisition module comprises:

The gradient calculation unit is used to obtain the gradient of each unknown pixel according to the gray value of the image to be keyed;

The sampling unit is used to draw a straight line according to the gradient direction obtained by the gradient calculation module, and take the first intersection pair of the straight line and the foreground area and the background area as the initial search point. In the neighborhood of the search point, search and unknown Sample points whose pixel saliency value difference is less than the threshold.