CN106650737A - Image automatic cropping method

Info

Publication number: CN106650737A
Application number: CN201611041091.9A
Authority: CN
Inventors: 黄凯奇; 赫然; 考月英
Original assignee: Institute of Automation of Chinese Academy of Science
Current assignee: Institute of Automation of Chinese Academy of Science
Priority date: 2016-11-21
Filing date: 2016-11-21
Publication date: 2017-05-10
Anticipated expiration: 2036-11-21
Also published as: CN106650737B

Abstract

The invention relates to an automatic image cropping method. The method includes: extracting an aesthetic response map and a gradient energy map of an image to be cropped; densely extracting candidate cropped images from the image to be cropped; screening the candidate cropped images based on the aesthetic response map; and estimating composition scores of the screened candidate cropped images based on the aesthetic response map and the gradient energy map, and determining the candidate cropped image with the highest score as the cropped image. The scheme uses the aesthetic response map to identify the regions of the image that influence its aesthetics and to determine which parts to preserve, so that the cropped image retains as much of the original's high aesthetic quality as possible. The scheme also uses the gradient energy map to analyze the gradient distribution, and evaluates the composition score of a cropped image based on both maps. Embodiments of the invention compensate for the deficiencies of existing image composition representations and address the technical problem of how to improve the robustness and precision of automatic image cropping.

Description

Image automatic cropping method

Technical field

The invention relates to the technical fields of pattern recognition, machine learning and computer vision, and in particular to an automatic image cropping method.

Background

With the rapid development of computer technology and digital media technology, demands and expectations for computer vision, artificial intelligence, machine perception and related fields keep rising. Automatic image cropping, an important and common task in automatic image editing, has received increasing attention. Its goal is to remove redundant regions and emphasize regions of interest, thereby improving the overall composition and aesthetic quality of the image. An effective automatic cropping method not only frees people from tedious work, but can also offer professional image-editing suggestions to non-professionals.

Image cropping is a highly subjective task, and hand-crafted rules can hardly account for every influencing factor. Traditional automatic cropping methods usually use a saliency map to identify the main region or region of interest in the image, and then find the cropping region either by minimizing an energy function built from hand-crafted rules or by learning a classifier. These rules, however, are not comprehensive enough for so subjective a task, and the resulting accuracy rarely meets user needs.

In view of this, the present invention is proposed.

Summary of the invention

To solve the above problem in the prior art, namely the technical problem of how to improve the robustness and precision of automatic image cropping, an automatic image cropping method is provided.

To achieve the above object, the following technical solution is provided:

An automatic image cropping method, the method comprising:

extracting an aesthetic response map and a gradient energy map of an image to be cropped;

densely extracting candidate cropped images from the image to be cropped;

screening the candidate cropped images based on the aesthetic response map;

estimating composition scores of the screened candidate cropped images based on the aesthetic response map and the gradient energy map, and determining the candidate cropped image with the highest score as the cropped image.

Further, extracting the aesthetic response map and the gradient energy map of the image to be cropped specifically includes:

extracting the aesthetic response map of the image to be cropped using a deep convolutional neural network and a class response mapping method, according to the following formula:

M(x, y) = \sum_{k=1}^{K} w_k f_k(x, y)

where M(x, y) denotes the aesthetic response value at spatial position (x, y); K denotes the total number of channels of the feature map of the last convolutional layer of the deep convolutional neural network; k denotes the k-th channel; f_k(x, y) denotes the feature value of the k-th channel at spatial position (x, y); and w_k denotes the weight connecting the pooled feature map of the k-th channel to the high-aesthetic class;

smoothing the image to be cropped and calculating the gradient value of each pixel, thereby obtaining the gradient energy map.

Further, the deep convolutional neural network is trained as follows:

setting convolutional layers at the bottom of the deep convolutional neural network structure;

after the last convolutional layer of the deep convolutional neural network structure, pooling each feature map into a single point by global average pooling;

connecting a fully connected layer whose number of outputs equals the number of aesthetic quality classes, followed by a loss function.

Further, screening the candidate cropped images based on the aesthetic response map specifically includes:

calculating the aesthetic preservation score of each candidate cropped image by the following formula:

S_a(C) = \frac{\sum_{(i,j) \in C} A_{(i,j)}}{\sum_{(i,j) \in I} A_{(i,j)}}

where S_a(C) denotes the aesthetic preservation score of the candidate cropped image; C denotes the candidate cropped image; (i, j) denotes a pixel position; I denotes the original image; and A_{(i,j)} denotes the aesthetic response value at position (i, j);

sorting all candidate cropped images in descending order of aesthetic preservation score;

selecting the highest-scoring portion of the candidate cropped images.

Further, estimating the composition scores of the screened candidate cropped images based on the aesthetic response map and the gradient energy map, and determining the candidate cropped image with the highest score as the cropped image, specifically includes:

establishing a composition model based on the aesthetic response map and the gradient energy map;

estimating the composition scores of the screened candidate cropped images using the composition model, and determining the candidate cropped image with the highest score as the cropped image.

Further, the composition model is obtained as follows:

establishing a training image set based on the aesthetic response map and the gradient energy map;

annotating the training images with aesthetic quality classes;

training a deep convolutional neural network using the annotated training images;

for the annotated training images, extracting spatial pyramid features of the aesthetic response map and the gradient energy map using the trained deep convolutional neural network;

concatenating the extracted spatial pyramid features;

training a classifier on the concatenated features so that composition rules are learned automatically, thereby obtaining the composition model.

An embodiment of the present invention provides an automatic image cropping method. The method includes: extracting an aesthetic response map and a gradient energy map of an image to be cropped; densely extracting candidate cropped images from the image to be cropped; screening the candidate cropped images based on the aesthetic response map; and estimating composition scores of the screened candidate cropped images based on the aesthetic response map and the gradient energy map, and determining the candidate cropped image with the highest score as the cropped image. The scheme uses the aesthetic response map to identify the regions of the image that influence its aesthetics and to determine which parts to preserve, so that the cropped image retains as much of the original's high aesthetic quality as possible. The scheme also uses the gradient energy map to analyze the gradient distribution, and evaluates the composition score of a cropped image based on both maps. Embodiments of the invention compensate for the deficiencies of existing image composition representations and address the technical problem of how to improve the robustness and precision of automatic image cropping. Embodiments of the invention can be applied in many fields involving automatic image cropping, including image editing, photography and image retargeting.

Description of drawings

Fig. 1 is a schematic flow chart of an automatic image cropping method according to an embodiment of the present invention;

Fig. 2 is a schematic structural diagram of a deep convolutional neural network according to an embodiment of the present invention;

Fig. 3a is a schematic diagram of an image to be cropped according to an embodiment of the present invention;

Fig. 3b is a schematic diagram of a cropped image according to an embodiment of the present invention.

Detailed description

The technical problems solved by the embodiments of the present invention, the technical solutions adopted and the technical effects achieved are described clearly and completely below with reference to the accompanying drawings and specific embodiments. The described embodiments are only some, not all, of the embodiments of the present application. All other equivalent or obviously modified embodiments that a person of ordinary skill in the art obtains from the embodiments in the present application without creative effort fall within the protection scope of the present invention. Embodiments of the invention can be embodied in many different ways as defined and covered by the claims.

Deep learning has developed rapidly and achieved good results in many fields. Embodiments of the invention use deep learning to automatically learn the regions that matter for image cropping, so that rules are learned automatically and comprehensively and high-aesthetic regions are preserved as far as possible during cropping.

To this end, an embodiment of the present invention provides an automatic image cropping method. Fig. 1 exemplarily shows the flow of the method. As shown in Fig. 1, the method may include:

S100: Extract an aesthetic response map and a gradient energy map of the image to be cropped.

Specifically, this step may include:

S101: Extract the aesthetic response map of the image to be cropped using a deep convolutional neural network and a class response mapping method, according to the following formula:

M(x, y) = \sum_{k=1}^{K} w_k f_k(x, y)

where M(x, y) denotes the aesthetic response value at spatial position (x, y); K denotes the total number of channels of the feature map f of the last convolutional layer of the trained deep convolutional neural network; k denotes the k-th channel; f_k(x, y) denotes the feature value of the k-th channel at spatial position (x, y); and w_k denotes the weight connecting the pooled feature map of the k-th channel to the high-aesthetic class.
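The weighted channel sum that defines M(x, y) can be sketched in a few lines. This is a minimal illustration, not the patented implementation: it uses plain Python lists to stay dependency-free, the function name `aesthetic_response_map` and the `[channel][row][column]` layout are assumptions, and the toy numbers are invented. The source also does not say how the map is resized from feature-map resolution to image resolution, so that step is omitted.

```python
def aesthetic_response_map(features, weights):
    """Compute M(x, y) = sum_k w_k * f_k(x, y).

    features: K feature maps, each a list of H rows of W floats
              (output of the last convolutional layer; layout is an assumption).
    weights:  K floats, the weights from each pooled channel to the
              high-aesthetic class.
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per channel")
    height, width = len(features[0]), len(features[0][0])
    response = [[0.0] * width for _ in range(height)]
    # Accumulate each channel's contribution, scaled by its class weight.
    for w_k, f_k in zip(weights, features):
        for y in range(height):
            for x in range(width):
                response[y][x] += w_k * f_k[y][x]
    return response


# Toy input (invented values): K = 2 channels on a 2x2 grid.
features = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
weights = [0.5, -1.0]
M = aesthetic_response_map(features, weights)
print(M)  # [[0.5, -1.0], [-1.0, 1.0]]
```

Positions where highly weighted channels fire strongly receive a large response; a negative weight pulls the response down.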

When extracting the aesthetic response map, the deep convolutional neural network can be trained as needed. Training may proceed as follows:

Step 1: Set convolutional layers at the bottom of the deep convolutional neural network structure.

Step 2: After the last convolutional layer of the deep convolutional neural network structure, pool each feature map into a single point by global average pooling.

Step 3: Connect a fully connected layer whose number of outputs equals the number of aesthetic quality classes, followed by a loss function.

Fig. 2 exemplarily shows a deep convolutional neural network structure.

Steps 1 to 3 yield a deep convolutional neural network model trained for the aesthetic quality classification task. Applying the class response mapping method to this trained network with the above formula then gives the aesthetic response map M of the image to be cropped for the high-aesthetic class.
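The link between this network structure and the response map can be made explicit with a short derivation. It is a sketch under assumptions not stated in the source: the last feature map has H × W positions, the bias of the fully connected layer is ignored, and the symbols F_k (pooled value of channel k) and S (pre-loss score of the high-aesthetic class) are introduced here only for illustration.

```latex
F_k = \frac{1}{HW} \sum_{x,y} f_k(x, y)
\qquad
S = \sum_{k=1}^{K} w_k F_k
  = \frac{1}{HW} \sum_{x,y} \sum_{k=1}^{K} w_k f_k(x, y)
  = \frac{1}{HW} \sum_{x,y} M(x, y)
```

Under these assumptions the class score is the spatial mean of M, so M(x, y) can be read as the contribution of position (x, y) to the high-aesthetic decision.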

S102: Smooth the image to be cropped and calculate the gradient value of each pixel, thereby obtaining the gradient energy map.
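A gradient energy map of this kind might be computed as sketched below. The source names neither the smoothing filter nor the gradient operator, so the 3×3 mean filter, the unscaled central differences and the function names are all assumptions chosen for brevity; a Gaussian blur with a Sobel operator would fit the description equally well.

```python
import math


def box_smooth(image):
    """3x3 mean filter with edge replication (filter choice is an assumption)."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    # Clamp neighbour coordinates so border pixels are replicated.
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += image[yy][xx]
            out[y][x] = total / 9.0
    return out


def gradient_energy_map(image):
    """Smooth a grayscale image, then return the per-pixel gradient magnitude."""
    smooth = box_smooth(image)
    h, w = len(smooth), len(smooth[0])
    energy = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Unscaled central differences; at the border the neighbour index
            # is clamped, which gives a one-sided difference there.
            gx = smooth[y][min(x + 1, w - 1)] - smooth[y][max(x - 1, 0)]
            gy = smooth[min(y + 1, h - 1)][x] - smooth[max(y - 1, 0)][x]
            energy[y][x] = math.hypot(gx, gy)
    return energy
```

A flat image produces zero energy everywhere, while a vertical edge produces a band of high energy along the edge.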

S110: Densely extract candidate cropped images from the image to be cropped.

Here, sliding windows of every size smaller than the image can be used to densely extract candidate cropping windows from the image to be cropped; each candidate cropping window then yields a candidate cropped image.

S120: Screen the candidate cropped images based on the aesthetic response map.

Specifically, this step may include:

S121: Calculate the aesthetic preservation score of each candidate cropped image by the following formula:

S_a(C) = \frac{\sum_{(i,j) \in C} A_{(i,j)}}{\sum_{(i,j) \in I} A_{(i,j)}}

where S_a(C) denotes the aesthetic preservation score of the candidate cropped image; C denotes the candidate cropped image; (i, j) denotes a pixel position; I denotes the original image; and A_{(i,j)} denotes the aesthetic response value at position (i, j).

This step builds an aesthetic preservation model. Passing the candidate cropping windows through this model selects the windows with higher aesthetic preservation scores.

S122: Sort all candidate cropped images in descending order of aesthetic preservation score.

S123: Select the highest-scoring portion of the candidate cropped images.

For example, in practice the candidate cropped images of the top 10,000 candidate cropping windows may be retained.
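Steps S121 to S123 can be sketched as follows. The summed-area table is an implementation choice made here so that each window is scored in constant time; the source does not prescribe it. The `(top, left, height, width)` window layout and the function names are assumptions, and the sketch assumes the total response over the image is non-zero.

```python
def integral_image(A):
    """Summed-area table with one extra leading row and column of zeros."""
    h, w = len(A), len(A[0])
    sat = [[0.0] * (w + 1) for _ in range(h + 1)]
    for i in range(h):
        row_sum = 0.0
        for j in range(w):
            row_sum += A[i][j]
            sat[i + 1][j + 1] = sat[i][j + 1] + row_sum
    return sat


def preservation_score(sat, window):
    """S_a(C): response mass inside window C divided by the mass of image I.

    window = (top, left, height, width) in pixels (layout is an assumption).
    """
    top, left, height, width = window
    bottom, right = top + height, left + width
    inside = sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]
    return inside / sat[-1][-1]  # sat[-1][-1] is the sum over the whole image


def screen_candidates(A, windows, keep):
    """Sort windows by S_a in descending order and keep the first `keep`."""
    sat = integral_image(A)
    ranked = sorted(windows, key=lambda c: preservation_score(sat, c), reverse=True)
    return ranked[:keep]


# Toy response map (invented values) and three candidate windows.
A = [[1.0, 2.0], [3.0, 4.0]]
candidates = [(0, 0, 1, 1), (1, 0, 1, 2), (0, 0, 2, 2)]
print(screen_candidates(A, candidates, keep=2))  # [(0, 0, 2, 2), (1, 0, 1, 2)]
```

Because the score is a fraction of the whole-image response, larger windows tend to score higher; the screening therefore only narrows the field, and the later composition score makes the final choice.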

S130: Based on the aesthetic response map and the gradient energy map, estimate the composition scores of the screened candidate cropped images, and determine the candidate cropped image with the highest score as the cropped image.

Specifically, this step may be implemented through steps S131 to S133.

S131: Establish a composition model based on the aesthetic response map and the gradient energy map.

The composition model can be trained according to the actual situation. The training data may use well-composed images as positive samples and images with composition defects as negative samples.

The composition model can be trained as follows:

Step a: Build a training image set based on the aesthetic response map and the gradient energy map.

Step b: Annotate the training images with aesthetic quality classes.

Step c: Train a deep convolutional neural network using the annotated training images.

The training process of this step follows steps 1 to 3 above and is not repeated here.

Step d: For the annotated training images, use the trained deep convolutional neural network to extract spatial pyramid features of the aesthetic response map and the gradient energy map.

Step e: Concatenate the extracted spatial pyramid features.

Step f: Train a classifier on the concatenated features so that composition rules are learned automatically, obtaining the composition model.

The classifier may be, for example, a support vector machine.
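The feature construction in steps d and e can be illustrated with a small sketch. The source does not give the pyramid levels or the pooling operator, so the 1×1, 2×2 and 4×4 grids with average pooling, along with the function names, are assumptions. The classifier of step f is not shown; the resulting vectors would be passed to it with the positive and negative composition labels.

```python
def pyramid_features(grid, levels=(1, 2, 4)):
    """Average-pool a 2-D map over an n x n grid of cells for each n in `levels`."""
    h, w = len(grid), len(grid[0])
    features = []
    for n in levels:
        for cy in range(n):
            y0, y1 = cy * h // n, (cy + 1) * h // n
            for cx in range(n):
                x0, x1 = cx * w // n, (cx + 1) * w // n
                cell = [grid[y][x] for y in range(y0, y1) for x in range(x0, x1)]
                # Empty cells (map smaller than the grid) contribute 0.0.
                features.append(sum(cell) / len(cell) if cell else 0.0)
    return features


def composition_descriptor(response_crop, energy_crop, levels=(1, 2, 4)):
    """Concatenate the pyramid features of both maps (step e)."""
    return pyramid_features(response_crop, levels) + pyramid_features(energy_crop, levels)


# Toy 4x4 map (invented values) whose four quadrants are constant.
toy = [
    [1.0, 1.0, 3.0, 3.0],
    [1.0, 1.0, 3.0, 3.0],
    [5.0, 5.0, 7.0, 7.0],
    [5.0, 5.0, 7.0, 7.0],
]
print(pyramid_features(toy, levels=(1, 2)))  # [4.0, 1.0, 3.0, 5.0, 7.0]
```

With the assumed levels each map contributes 1 + 4 + 16 = 21 values, so the concatenated descriptor has 42 dimensions regardless of the crop size, which is what lets one classifier score windows of different shapes.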

S132: Use the composition model to estimate the composition scores of the screened candidate cropped images, and determine the candidate cropped image with the highest score as the cropped image.

Fig. 3a exemplarily shows an image to be cropped; Fig. 3b exemplarily shows the cropped image.

The invention is further illustrated below with a preferred embodiment.

Step A: Feed an image data set annotated with aesthetic quality classes into the deep convolutional neural network to train the aesthetic quality classification model.

Step B: Input an image data set annotated with composition classes into the trained deep convolutional neural network, extract the feature map of the last convolutional layer, and compute the aesthetic response map; compute the gradient energy map as well; then train the composition model with a support vector machine classifier.

Step C: Extract the aesthetic response map and the gradient energy map of the image under test.

The extraction method in this step follows that of the training stage.

Step D: Densely collect candidate cropping windows from the image under test.

For example, on a 1000×1000 image under test, windows are collected with a sliding window at an interval of 30 pixels.
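The dense collection of windows can be sketched as a generator. The source gives only the 30-pixel interval, so several details here are assumptions: window sizes are stepped by the same interval as positions, the smallest window equals one interval, and the window covering the whole image is skipped because the sizes are meant to be smaller than the image. The function name and the `(top, left, height, width)` tuple layout are also hypothetical.

```python
def dense_windows(image_h, image_w, stride=30, min_size=None):
    """Yield (top, left, height, width) for every window on a `stride`-pixel grid."""
    if min_size is None:
        min_size = stride
    for height in range(min_size, image_h + 1, stride):
        for width in range(min_size, image_w + 1, stride):
            if height == image_h and width == image_w:
                continue  # the full image is not a crop
            for top in range(0, image_h - height + 1, stride):
                for left in range(0, image_w - width + 1, stride):
                    yield (top, left, height, width)


# Small illustration: a 4x4 image sampled every 2 pixels.
print(len(list(dense_windows(4, 4, stride=2))))  # 8
```

Under these assumptions a 1000×1000 image with a 30-pixel interval yields 314,721 candidate windows, which shows why a cheap screening pass is useful before the composition model is applied.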

Step E: Screen the candidate cropping windows with the aesthetic preservation model.

This step uses the aesthetic preservation model to compute the aesthetic preservation score of each densely collected candidate cropping window and keeps the portion with the highest scores, for example 10,000 candidate cropping windows.

Step F: Evaluate the screened candidate cropping windows with the composition model.

This step uses the composition model trained in the training stage to evaluate the composition scores of the screened candidate cropping windows; the highest-scoring window is taken as the final cropping window, which yields the cropped image.

In summary, the method provided by the embodiments of the present invention uses the aesthetic response map and the gradient energy map to preserve aesthetic quality and image composition rules as far as possible, giving more robust and more precise automatic image cropping, which in turn demonstrates the effectiveness of the two maps for this task.

Although the above embodiments describe the method in the stated order, those skilled in the art will understand that, to achieve the effect of the embodiments, the steps may also be executed in a different order, such as in parallel or in reverse order; such simple variations fall within the protection scope of the present invention.

The above are only specific embodiments of the present invention, and the protection scope of the invention is not limited to them. Any transformation or replacement that a person familiar with the technology could readily conceive within the technical scope disclosed by the invention shall be covered by the invention. The protection scope of the invention shall therefore be determined by the claims.

Claims

1. An automatic image cropping method, characterized in that the method comprises:

extracting an aesthetic response map and a gradient energy map of an image to be cropped;

densely extracting candidate cropped images from the image to be cropped;

screening the candidate cropped images based on the aesthetic response map;

estimating composition scores of the screened candidate cropped images based on the aesthetic response map and the gradient energy map, and determining the candidate cropped image with the highest score as the cropped image.

2. The method according to claim 1, characterized in that extracting the aesthetic response map and the gradient energy map of the image to be cropped specifically includes:

extracting the aesthetic response map of the image to be cropped using a deep convolutional neural network and a class response mapping method, according to the following formula:

M(x, y) = \sum_{k=1}^{K} w_k f_k(x, y)

where M(x, y) denotes the aesthetic response value at spatial position (x, y); K denotes the total number of channels of the feature map of the last convolutional layer of the deep convolutional neural network; k denotes the k-th channel; f_k(x, y) denotes the feature value of the k-th channel at spatial position (x, y); and w_k denotes the weight connecting the pooled feature map of the k-th channel to the high-aesthetic class;

smoothing the image to be cropped and calculating the gradient value of each pixel, thereby obtaining the gradient energy map.

3. The method according to claim 2, characterized in that the deep convolutional neural network is trained as follows:

setting convolutional layers at the bottom of the deep convolutional neural network structure;

after the last convolutional layer of the deep convolutional neural network structure, pooling each feature map into a single point by global average pooling;

connecting a fully connected layer whose number of outputs equals the number of aesthetic quality classes, followed by a loss function.

4. The method according to claim 1, characterized in that screening the candidate cropped images based on the aesthetic response map specifically includes:

calculating the aesthetic preservation score of each candidate cropped image by the following formula:

S_a(C) = \frac{\sum_{(i,j) \in C} A_{(i,j)}}{\sum_{(i,j) \in I} A_{(i,j)}}

where S_a(C) denotes the aesthetic preservation score of the candidate cropped image; C denotes the candidate cropped image; (i, j) denotes a pixel position; I denotes the original image; and A_{(i,j)} denotes the aesthetic response value at position (i, j);

sorting all candidate cropped images in descending order of aesthetic preservation score;

selecting the highest-scoring portion of the candidate cropped images.

5. The method according to claim 1, characterized in that estimating the composition scores of the screened candidate cropped images based on the aesthetic response map and the gradient energy map, and determining the candidate cropped image with the highest score as the cropped image, specifically includes:

establishing a composition model based on the aesthetic response map and the gradient energy map;

estimating the composition scores of the screened candidate cropped images using the composition model, and determining the candidate cropped image with the highest score as the cropped image.

6. The method according to claim 5, characterized in that the composition model is obtained as follows:

establishing a training image set based on the aesthetic response map and the gradient energy map;

annotating the training images with aesthetic quality classes;

training a deep convolutional neural network using the annotated training images;

for the annotated training images, extracting spatial pyramid features of the aesthetic response map and the gradient energy map using the trained deep convolutional neural network;

concatenating the extracted spatial pyramid features;

training a classifier on the concatenated features so that composition rules are learned automatically, thereby obtaining the composition model.