CN109146892B - Image clipping method and device based on aesthetics - Google Patents


Publication number: CN109146892B
Authority: CN (China)
Prior art keywords: image, region, salient, aesthetic, determining
Legal status: Active
Application number: CN201810813038.9A
Other languages: Chinese (zh)
Other versions: CN109146892A
Inventors: 鲁鹏, 张昊, 彭响, 刘咏彬, 王小捷
Current Assignee: Beijing University of Posts and Telecommunications
Original Assignee: Beijing University of Posts and Telecommunications
Application filed by Beijing University of Posts and Telecommunications
Priority to CN201810813038.9A
Publication of CN109146892A
Application granted
Publication of CN109146892B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image

Abstract

The embodiments of the present application provide an aesthetics-based image cropping method and device, belonging to the field of computer technology. The method comprises the following steps: acquiring an image to be cropped; calculating a saliency map corresponding to the image to be cropped according to a saliency detection algorithm, wherein the saliency map is a grayscale saliency image corresponding to the image to be cropped; determining a salient bounding box in the saliency map through a salient region extraction algorithm; determining, in the image to be cropped, the salient region corresponding to the salient bounding box, wherein the salient region is the image region of the image to be cropped enclosed by the salient bounding box; determining an aesthetic region bounding box containing the salient region according to an aesthetic region identification algorithm and the salient region; and cropping the image to be cropped based on the aesthetic region bounding box to obtain a target image. With this method and device, the efficiency of determining the crop box can be improved.

Description

Image clipping method and device based on aesthetics
Technical Field
The present application relates to the field of computer technology, and in particular to an aesthetics-based image cropping method and device.
Background
An image carries aesthetic qualities in addition to semantic information. An image with high aesthetic quality expresses its semantic information better and is preferred by users. However, with the popularization of digital cameras and smartphones, most images on the network are taken by users without professional photographic knowledge, and their aesthetic quality is low. Obtaining images with high aesthetic quality from images on the network has therefore become a hot research topic.
Since composition is an important factor affecting the aesthetic quality of an image, people commonly change the composition by cropping the image, thereby improving its aesthetic quality. The processing flow of a commonly used image cropping method is as follows: 1. The electronic device obtains the salient bounding box of the image to be cropped and the coordinate information of that bounding box according to a salient-bounding-box acquisition algorithm. 2. Taking the salient bounding box as a reference, the electronic device generates, from the coordinate information of the salient bounding box and a preset coordinate interval threshold, a series of candidate crop boxes containing the salient bounding box, together with the candidate crop region corresponding to each candidate crop box; it then scores each candidate crop region with an aesthetic quality classification network, where each score is a probability value in the range 0 to 1, and determines the maximum probability value; the candidate crop box corresponding to the maximum probability value is taken as the crop box, and the image region corresponding to that crop box is the aesthetic region. 3. The electronic device crops the image to be cropped based on the crop box to obtain an image with high aesthetic quality. Here, the electronic device may be a server or a terminal, and the aesthetic region is the region of the image with high aesthetic quality.
However, this image cropping method needs to generate thousands of candidate crop boxes and evaluate the probability value of each candidate crop box one by one. Therefore, for a single image to be cropped, the time required to determine the crop box is long, and the efficiency of determining the crop box is low.
Disclosure of Invention
The embodiments of the present application aim to provide an aesthetics-based image cropping method and device so as to improve the efficiency of determining the crop box. The specific technical scheme is as follows:
In a first aspect, there is provided an aesthetics-based image cropping method, the method comprising:
acquiring an image to be cropped;
calculating a saliency map corresponding to the image to be cropped according to a saliency detection algorithm, wherein the saliency map is a grayscale saliency image corresponding to the image to be cropped;
determining a salient bounding box in the saliency map through a salient region extraction algorithm;
determining, in the image to be cropped, a salient region corresponding to the salient bounding box, wherein the salient region is the image region of the image to be cropped enclosed by the salient bounding box;
determining an aesthetic region bounding box containing the salient region according to an aesthetic region identification algorithm and the salient region;
and cropping the image to be cropped based on the aesthetic region bounding box to obtain a target image.
Optionally, the determining, according to the aesthetic region identification algorithm and the salient region, an aesthetic region bounding box containing the salient region includes:
acquiring first coordinate information corresponding to the salient region, wherein the first coordinate information comprises the coordinates, in a preset coordinate system of the image to be cropped, of the pixel points corresponding to two non-adjacent endpoints of the salient bounding box;
determining an offset proportion vector according to the salient region and the aesthetic region identification algorithm, wherein the offset proportion vector consists of the coordinate offsets in the up, down, left and right directions of the salient region, each expressed as a percentage of the corresponding side length of the aesthetic region bounding box;
and determining second coordinate information according to the offset proportion vector and the first coordinate information, and taking the bounding box formed by the second coordinate information as the aesthetic region bounding box.
Optionally, the method further includes:
acquiring a pre-stored first image sample set, wherein the first image sample set comprises a plurality of first image samples and a saliency map sample corresponding to each first image sample;
determining a first target parameter according to a preset first initial neural network, each first image sample and its corresponding saliency map sample, wherein the first target parameter is a parameter contained in the first initial neural network;
and determining the saliency detection algorithm according to the first target parameter.
Optionally, the method further includes:
acquiring a pre-stored second image sample set, wherein the second image sample set comprises a plurality of second image samples, and a salient region sample and an offset proportion vector sample corresponding to each second image sample;
and training a preset second initial neural network based on the second image sample set to obtain the aesthetic region identification algorithm.
Optionally, training the preset second initial neural network based on the second image sample set to obtain the aesthetic region identification algorithm includes:
obtaining, for each second image sample in the second image sample set, its corresponding salient region sample and offset proportion vector sample;
determining a second target parameter according to each second image sample's salient region sample and offset proportion vector sample and the preset second initial neural network, wherein the second target parameter is a parameter contained in the second initial neural network;
and determining the aesthetic region identification algorithm according to the second target parameter and the second initial neural network.
In a second aspect, there is provided an aesthetics-based image cropping device, the device comprising:
a first acquisition module, used for acquiring an image to be cropped;
a calculation module, used for calculating a saliency map corresponding to the image to be cropped according to a saliency detection algorithm, wherein the saliency map is a grayscale saliency image corresponding to the image to be cropped;
a first determining module, used for determining a salient bounding box in the saliency map through a salient region extraction algorithm;
a second determining module, used for determining, in the image to be cropped, the salient region corresponding to the salient bounding box, wherein the salient region is the image region of the image to be cropped enclosed by the salient bounding box;
a third determining module, used for determining an aesthetic region bounding box containing the salient region according to an aesthetic region identification algorithm and the salient region;
and a cropping module, used for cropping the image to be cropped based on the aesthetic region bounding box to obtain a target image.
Optionally, the third determining module includes:
an obtaining submodule, used for acquiring first coordinate information corresponding to the salient region, wherein the first coordinate information comprises the coordinates, in a preset coordinate system of the image to be cropped, of the pixel points corresponding to two non-adjacent endpoints of the salient bounding box;
a first determining submodule, used for determining an offset proportion vector according to the salient region and the aesthetic region identification algorithm, wherein the offset proportion vector consists of the coordinate offsets in the up, down, left and right directions of the salient region, each expressed as a percentage of the corresponding side length of the aesthetic region bounding box;
and a second determining submodule, used for determining second coordinate information according to the offset proportion vector and the first coordinate information, and taking the bounding box formed by the second coordinate information as the aesthetic region bounding box.
Optionally, the device further comprises:
a second acquisition module, used for acquiring a pre-stored second image sample set, wherein the second image sample set comprises a plurality of second image samples and, for each second image sample, a corresponding salient region sample and offset proportion vector sample;
and a fourth determining module, used for training a preset second initial neural network based on the second image sample set to obtain the aesthetic region identification algorithm.
In a third aspect, there is provided an electronic device comprising a processor and a machine-readable storage medium storing machine-executable instructions that, when executed by the processor, cause the processor to implement the method steps of the first aspect.
In a fourth aspect, there is provided a machine-readable storage medium storing machine-executable instructions that, when invoked and executed by a processor, cause the processor to implement the method steps of the first aspect.
According to the aesthetics-based image cropping method and device provided by the embodiments of the present application, a saliency map of the image to be cropped is first obtained according to the image to be cropped and a pre-stored saliency detection algorithm; a salient bounding box is then obtained according to the saliency map and a pre-stored salient region extraction algorithm, and the salient region is determined from the salient bounding box and the image to be cropped. Next, the aesthetic region is determined based on the salient region and a pre-stored aesthetic region identification algorithm, and the image to be cropped is cropped according to the aesthetic region to obtain an image with high aesthetic quality. Since, for a single image to be cropped, the aesthetic region corresponding to the salient region is determined directly by the aesthetic region identification algorithm, the efficiency of determining the crop box can be improved.
Of course, it is not necessary for any product or method of the present application to achieve all of the above-described advantages at the same time.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application; for those skilled in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a flowchart of an aesthetics-based image cropping method according to an embodiment of the present invention;
FIG. 2a is a schematic diagram of an image to be cropped according to an embodiment of the present invention;
FIG. 2b is a schematic diagram of the saliency map corresponding to the image to be cropped according to an embodiment of the present invention;
FIG. 2c is a schematic diagram of the salient region in the image to be cropped according to an embodiment of the present invention;
FIG. 2d is a schematic diagram of the aesthetic region bounding box in the image to be cropped according to an embodiment of the present invention;
FIG. 2e is a schematic diagram of the target image obtained from the image to be cropped according to an embodiment of the present invention;
FIG. 3 is a flowchart of an aesthetics-based image cropping method according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the coordinate system of the image to be cropped according to an embodiment of the present invention;
FIG. 5 is a flowchart of an aesthetics-based image cropping method according to an embodiment of the present invention;
FIG. 6 is a flowchart of an aesthetics-based image cropping method according to an embodiment of the present invention;
FIG. 7 is a flowchart of an aesthetics-based image cropping method according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of an aesthetics-based image cropping device according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present application will be described below clearly and completely with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only a part of the embodiments of the present application, not all of them. All other embodiments obtained by a person skilled in the art based on the embodiments herein without creative effort shall fall within the protection scope of the present application.
The embodiment of the invention provides an aesthetics-based image cropping method, which can be applied to electronic devices such as the server of an image search website, a smartphone or a personal computer. Based on this method, a server can crop images uploaded by users and obtain images with high aesthetic quality while preserving the semantic information of the original images. A smartphone, upon receiving a shooting instruction from the user, can acquire the captured current image, calculate the aesthetic region bounding box of the current image by this method, and display that bounding box in the current image, so that the user can obtain an image with high aesthetic quality by selecting a preset cropping function.
For ease of understanding, the related concepts in the embodiments of the invention are briefly described as follows. For an image, the salient object is the object that occupies a large area and conveys a large amount of information, i.e. the object a user mainly attends to when viewing the image. For example, if a user photographs a building and the image also contains the trees, cars and trash bins around it, the salient object in the image is the building. The salient region is the region containing the salient object; it is also the region of the image that most interests users and best expresses the content of the image. The salient region bounding box is the bounding box of the salient region.
The saliency map of an image represents the saliency probability of every pixel point in the image. In the saliency map, the saliency probability of a pixel point is a value between 0 and 1; the higher the saliency probability of a pixel point, the more salient it is, i.e. the more likely it is to attract the user's attention. The saliency map is thus the saliency image corresponding to an image. Note that the saliency map of an image has the same size as the image.
The aesthetic quality of an image represents how aesthetically pleasing the image is. The aesthetic region is the region of the image with high aesthetic quality; correspondingly, the aesthetic region bounding box is the bounding box of the aesthetic region. Typically, both the salient region bounding box and the aesthetic region bounding box are rectangular.
As shown in fig. 1, the specific processing flow of the method is as follows:
Step 101: acquire an image to be cropped.
In implementation, after receiving an image uploaded by a user, the electronic device may take that image as the image to be cropped and obtain its image data. The image data comprises the pixel points contained in the image and the position information of each pixel point, i.e. its relative position within the image. As shown in fig. 2a, the embodiment of the present invention provides a schematic diagram of an image to be cropped.
The electronic device may also store the image in a preset image library and, when a preset processing period is reached, take an image from the library as the image to be cropped according to a preset processing order. For example, the electronic device may process the images in order of upload time, or in order of file size from large to small; this embodiment imposes no limitation here.
Step 102: calculate the saliency map corresponding to the image to be cropped according to a saliency detection algorithm.
In the embodiment of the invention, the saliency map is the saliency image corresponding to the image to be cropped and may be a grayscale image: white represents the pixel points with the highest saliency probability, black represents those with the lowest, and the higher the saliency probability of a pixel point, the closer it is displayed to white. Consequently, the pixel points representing the salient object in the saliency map are displayed close to white.
In implementation, the electronic device may process the image data of the image to be cropped with a preset saliency detection algorithm to obtain the saliency probability of each pixel point, then map these probabilities into the numerical range of the image through a preset linear mapping algorithm to obtain the image value of each pixel point, and generate the saliency map from these values. As shown in fig. 2b, the embodiment of the present invention provides a schematic diagram of the saliency map corresponding to the image to be cropped, in which the salient object consists of pixel points displayed close to white.
The specific process by which the electronic device computes the saliency probability of each pixel point with the saliency detection algorithm, and the specific process by which it generates the saliency map with the preset linear mapping algorithm, belong to the prior art and are not repeated here.
It should be noted that the saliency detection algorithm can be any algorithm capable of converting an image into a saliency map, such as a U-Net (U-shaped network) fully convolutional network.
Step 103: determine the salient bounding box in the saliency map through a salient region extraction algorithm.
In implementation, the electronic device may store a preset salient region extraction threshold, i.e. the ratio of the sum of the saliency probabilities of all pixel points within the salient bounding box to the sum of the saliency probabilities of all pixel points in the image to be cropped, for example 90%.
According to a preset salient region extraction algorithm, the electronic device determines, based on the saliency probability of each pixel point in the saliency map, a bounding box satisfying the salient region extraction threshold, and takes this bounding box as the salient bounding box. The electronic device takes the position information of the pixel points corresponding to the salient bounding box as the position information of the salient bounding box in the saliency map, and stores it in a preset position information file.
It should be noted that the salient region extraction algorithm may be any algorithm capable of determining the bounding box corresponding to the saliency map based on the saliency map and the salient region extraction threshold, such as a heuristic cropping algorithm. The specific process by which the electronic device obtains the salient bounding box according to the salient region extraction algorithm belongs to the prior art and is not repeated here.
Step 104: determine, in the image to be cropped, the salient region corresponding to the salient bounding box.
The salient region is the image region of the image to be cropped enclosed by the salient bounding box.
In implementation, since the saliency map and the image to be cropped have the same size, the electronic device may determine the position information of the salient bounding box in the image to be cropped directly from its position information in the saliency map, and take the image region enclosed by the salient bounding box as the salient region. That is, the electronic device may use the position information of the salient bounding box in the saliency map, stored in the position information file, as the position information of the salient region in the image to be cropped.
As shown in fig. 2c, the embodiment of the present invention provides a schematic diagram of the salient region in the image to be cropped, where the white wire frame is the salient bounding box and the image region it encloses is the salient region.
Step 105: determine the aesthetic region bounding box containing the salient region according to the aesthetic region identification algorithm and the salient region.
In implementation, the electronic device may calculate the position information of the aesthetic region bounding box through a preset aesthetic region identification algorithm and the salient region, and then determine the aesthetic region bounding box from that position information. As shown in fig. 2d, the embodiment of the present invention provides a schematic diagram of the aesthetic region bounding box in the image to be cropped, where the small white wire frame is the salient bounding box and the large white wire frame is the aesthetic region bounding box.
It should be noted that the aesthetic region identification algorithm comprises a regression network, and the aesthetic region bounding box contains the salient region.
Step 106: crop the image to be cropped based on the aesthetic region bounding box to obtain the target image.
In implementation, according to the position information of the aesthetic region bounding box, the electronic device determines the image region enclosed by the aesthetic region bounding box as the region to be cropped out of the image. The electronic device then extracts the pixel points contained in that region and the position information of each pixel point as the image data of the region, and displays the target image according to this image data.
As shown in fig. 2e, the embodiment of the present invention provides a schematic diagram of the target image obtained by cropping the image to be cropped based on the aesthetic region bounding box.
Specifically, as shown in fig. 3, the specific processing flow for determining the aesthetic region bounding box containing the salient region according to the aesthetic region identification algorithm and the salient region is as follows:
Step 301: obtain the first coordinate information corresponding to the salient region.
The electronic device is preset with a coordinate system of the image to be cropped, which is a plane rectangular coordinate system xOy. In this coordinate system, the origin is an endpoint (corner) of the image to be cropped and the coordinate unit is the pixel point; the position information of a pixel point therefore includes its coordinate information. The electronic device takes the number of pixel points of the image to be cropped in the x-axis direction as its side length in the x-axis direction and, correspondingly, the number of pixel points in the y-axis direction as its side length in the y-axis direction.
In implementation, the electronic device acquires the position information of the salient bounding box in the saliency map from the preset position information file and uses it as the first coordinate information corresponding to the salient region. The first coordinate information comprises the coordinates, in the preset coordinate system of the image to be cropped, of the pixel points corresponding to two non-adjacent endpoints of the salient bounding box, i.e. the two endpoints of a diagonal of the salient bounding box. From the first coordinate information, the electronic device can determine the position of the salient region in the coordinate system of the image to be cropped.
For example, denote by Bs = {x1, x2, y1, y2} the first coordinate information corresponding to the salient region s, where (x1, y1) are the coordinates of one endpoint of the salient bounding box and (x2, y2) are the coordinates of the other, non-adjacent endpoint. When Bs = {40, 60, 100, 60}, the coordinates of the pixel points corresponding to the two non-adjacent endpoints of the salient bounding box are (40, 100) and (60, 60); since the salient bounding box is rectangular, the coordinates of its other two endpoints are (40, 60) and (60, 100).
Step 302: determine the offset proportion vector according to the salient region and the aesthetic region identification algorithm.
The offset proportion vector consists of the coordinate offsets in the four directions (up, down, left and right) of the salient region, each expressed as a percentage of the corresponding side length of the aesthetic region bounding box. In the embodiment of the invention, all percentages in the offset proportion vector are positive.
In implementation, the electronic device acquires the image data of the salient region from the image to be cropped according to the first coordinate information of the salient region. The electronic device then processes the image data of the salient region with the aesthetic region identification algorithm, obtaining an offset proportion vector consisting of four percentages, which respectively represent the coordinate offsets in the up, down, left and right directions of the salient region as percentages of the corresponding side lengths of the aesthetic region bounding box. The four directions up, down, left and right of the coordinate system of the image to be cropped correspond to the positive y-axis, negative y-axis, negative x-axis and positive x-axis directions, respectively.
For example, denote by ha the side length of the aesthetic region bounding box in the y-axis direction and by wa its side length in the x-axis direction, with ha = 300 and wa = 400. When the first percentage in the offset proportion vector is 0.1, the coordinate offset of the salient region in the upward direction of the coordinate system of the image to be cropped, i.e. the positive y-axis direction, is 0.1 times the side length ha of the aesthetic region bounding box, that is, the coordinate offset is 30. The remaining three directions are analogous and are not repeated here.
Step 303, determining second coordinate information according to the offset proportion vector and the first coordinate information, and using a bounding box formed by the second coordinate information as an aesthetic region bounding box.
In implementation, the electronic device determines the side lengths of the salient region according to the first coordinate information of the salient region, and then, for each direction, calculates the corresponding side length of the aesthetic region bounding box from the percentages in the offset proportion vector and the side lengths of the salient region. The electronic device then calculates the coordinates of the four endpoints according to the side lengths of the aesthetic region bounding box, the offset proportion vector and the first coordinate information, builds a bounding box from the coordinates of the four endpoints, and treats that bounding box as the aesthetic region bounding box.
It will be appreciated that, in embodiments of the present invention, the aesthetic region bounding box contains the salient bounding box. The second coordinate information comprises the coordinates, in the preset coordinate system of the image to be cropped, of the pixel points corresponding to two non-adjacent endpoints of the aesthetic region bounding box. Similarly, the second coordinate information of the aesthetic region bounding box may be used as the position information of the aesthetic region in the image to be cropped, and the side lengths of the aesthetic region bounding box equal the corresponding side lengths of the aesthetic region.
The embodiment of the invention provides a specific process for determining second coordinate information according to an offset proportion vector and first coordinate information, which comprises the following steps:
for example, as shown in fig. 4, in the coordinate system xOy of the image to be cropped, 401 denotes the image to be cropped, whose size is w × h; 403 denotes the salient bounding box corresponding to the salient region, with w_s denoting the side length of the salient region in the x-axis direction and h_s the side length of the salient region in the y-axis direction; 402 denotes the aesthetic region bounding box, with w_a denoting the side length of the aesthetic region bounding box in the x-axis direction and h_a the side length of the aesthetic region bounding box in the y-axis direction, and a denotes the aesthetic region corresponding to the aesthetic region bounding box.
The first coordinate information of the salient region is denoted (x_s1, y_s1) and (x_s2, y_s2), the coordinates of the two non-adjacent endpoints of the salient bounding box, where (x_s1, y_s1) is the upper-left endpoint and (x_s2, y_s2) is the lower-right endpoint.

[Δy_t, Δy_b, Δx_t, Δx_b] denotes the offset proportion vector, where Δy_t is the coordinate offset in the positive y-axis direction as a percentage of h_a; likewise, Δy_b is the coordinate offset in the negative y-axis direction as a percentage of h_a; Δx_t is the coordinate offset in the negative x-axis direction as a percentage of w_a; and Δx_b is the coordinate offset in the positive x-axis direction as a percentage of w_a.

The electronic device determines w_s and h_s according to the first coordinate information, specifically:

w_s = x_s2 − x_s1, h_s = y_s1 − y_s2.

Then, the electronic device calculates w_a and h_a of the aesthetic region according to the offset proportion vector and the determined w_s and h_s, specifically: w_a = w_s / (1 − Δx_t − Δx_b), h_a = h_s / (1 − Δy_t − Δy_b). Finally, the electronic device calculates the second coordinate information according to the offset proportion vector, the first coordinate information and the determined w_a and h_a, specifically:

x_a1 = x_s1 − Δx_t · w_a

y_a1 = y_s1 + Δy_t · h_a

x_a2 = x_s2 + Δx_b · w_a

y_a2 = y_s2 − Δy_b · h_a

where (x_a1, y_a1) and (x_a2, y_a2) are the coordinates of two non-adjacent endpoints of the aesthetic region bounding box.
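The derivation of the second coordinate information can be sketched as a single function. The direction conventions follow the coordinate system described above (up = positive y); the function name, argument names, and the assumption that the first endpoint is the upper-left corner are illustrative, not the patent's implementation.

```python
def aesthetic_bbox(x1, y1, x2, y2, dyt, dyb, dxt, dxb):
    """Expand a salient bounding box, given as upper-left endpoint (x1, y1)
    and lower-right endpoint (x2, y2) in a y-up coordinate system, into the
    aesthetic region bounding box using the offset proportion vector
    [dyt, dyb, dxt, dxb]."""
    ws, hs = x2 - x1, y1 - y2                # salient box side lengths
    wa = ws / (1 - dxt - dxb)                # aesthetic box side lengths
    ha = hs / (1 - dyt - dyb)
    # Shift each edge outward by its offset percentage of the aesthetic side.
    return (x1 - dxt * wa, y1 + dyt * ha, x2 + dxb * wa, y2 - dyb * ha)

# With the salient box (40, 100)-(60, 60) and an offset proportion vector
# [0.1, 0.1, 0.25, 0.25], the aesthetic box becomes (30, 105)-(70, 55).
box = aesthetic_bbox(40, 100, 60, 60, 0.1, 0.1, 0.25, 0.25)
```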
as shown in fig. 5, an embodiment of the present invention further provides a method for training a saliency detection algorithm, which specifically includes the following steps:
step 501, a first image sample set stored in advance is obtained.
The electronic device stores a first image sample set in advance, wherein the first image sample set comprises a plurality of first image samples and a saliency map sample corresponding to each first image sample. The first image sample set may comprise the SALICON (Saliency in Context) eye-movement dataset.
In an implementation, the electronic device may acquire a first set of image samples upon receiving a preset first training instruction. The first training instruction may include an identifier of the first image sample set, and the electronic device may obtain the first image sample set according to the identifier of the first image sample set.
Step 502, determining a first target parameter according to a preset first initial neural network, each first image sample, and a saliency map sample corresponding to each first image sample.
The first target parameter is a parameter included in the first initial neural network. The first initial neural network may be any fully convolutional neural network, such as the U-Net fully convolutional network or SegNet (a semantic image segmentation network).
In implementation, for the first image sample set, the electronic device inputs each first image sample and the saliency map sample corresponding to each first image sample into the preset first initial neural network, and takes the resulting parameters of the first initial neural network as the first target parameter.
Step 503, determining a significance detection algorithm according to the first target parameter.
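The patent trains a fully convolutional network such as U-Net on image / saliency-map pairs. As a dependency-light stand-in for that step, the sketch below fits a single per-pixel logistic model to one synthetic image/saliency-map pair with gradient descent; the data, model, and learning rate are all illustrative, and the real "first target parameter" would be the trained network's weights.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((8, 8))              # toy first image sample (grayscale)
target = (image > 0.5).astype(float)    # toy saliency map sample

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w, b = 0.0, 0.0                          # per-pixel model: sigmoid(w*x + b)
lr = 1.0
losses = []
for _ in range(200):
    pred = sigmoid(w * image + b)
    losses.append(float(np.mean((pred - target) ** 2)))
    grad = (pred - target) * pred * (1 - pred)   # chain rule through sigmoid
    w -= lr * float(np.mean(grad * image))       # gradient descent step
    b -= lr * float(np.mean(grad))
# After training, (w, b) play the role of the "first target parameter".
```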
As shown in fig. 6, an embodiment of the present invention further provides a training method for an aesthetic region recognition algorithm, which specifically includes the following steps:
step 601, a pre-stored second image sample set is obtained.
In implementation, a second image sample set is pre-stored in the electronic device, and the second image sample set includes a plurality of second image samples, and a salient region sample and an offset scale vector sample corresponding to each second image sample.
The second image samples may comprise high-quality image samples with a score greater than 6 in the AVA (Aesthetic Visual Analysis) dataset.
The embodiment of the invention provides a method for determining a second image sample set by electronic equipment, which comprises the following specific processing procedures:
for each second image sample, the electronic device may obtain a saliency map of the second image sample through a saliency detection algorithm, and obtain a saliency bounding box corresponding to the saliency map and coordinate information of the saliency bounding box through a saliency region extraction algorithm. The electronic device determines a salient region sample from the coordinate information of the salient bounding box in the second image sample. Thus, the electronic device obtains a salient region sample corresponding to each second image sample.
For each second image sample, the electronic device may use the coordinate information of the second image sample as the coordinate information of the aesthetic region bounding box corresponding to the second image sample, and then determine the offset scale vector sample of the second image sample according to the coordinate information of the aesthetic region bounding box and the coordinate information of the salient bounding box. Thus, the electronic device obtains offset scale vector samples corresponding to each second image sample.
The electronic device determines a second set of image samples by the second image samples, and the salient region sample and the offset scale vector sample corresponding to each second image sample.
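Under the setup described above (the full frame of a high-quality second image sample serving as its aesthetic region bounding box), an offset proportion vector sample could be derived as sketched below. The box layout (upper-left / lower-right endpoints in a y-up coordinate system) and all names are assumptions for illustration, not the patent's code.

```python
def offset_vector_sample(aesthetic, salient):
    """Each box is (x1, y1, x2, y2) with (x1, y1) the upper-left endpoint and
    (x2, y2) the lower-right endpoint in a y-up coordinate system. Returns
    [dyt, dyb, dxt, dxb]: each gap between the boxes as a fraction of the
    aesthetic box's corresponding side length."""
    ax1, ay1, ax2, ay2 = aesthetic
    sx1, sy1, sx2, sy2 = salient
    wa, ha = ax2 - ax1, ay1 - ay2
    return ((ay1 - sy1) / ha,   # gap above the salient region
            (sy2 - ay2) / ha,   # gap below
            (sx1 - ax1) / wa,   # gap to the left
            (ax2 - sx2) / wa)   # gap to the right

# For a second image sample whose frame is (30, 105)-(70, 55) and whose
# salient bounding box is (40, 100)-(60, 60):
vec = offset_vector_sample((30, 105, 70, 55), (40, 100, 60, 60))
```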
The electronic device may obtain a second pre-stored image sample set when receiving a second preset training instruction. The electronic device may also receive a second set of image samples input by the technician.
Step 602, training a preset second initial neural network based on the second image sample set to obtain an aesthetic region recognition algorithm.
In implementation, the electronic device takes the second image sample set as a training sample, trains a preset second initial neural network, and takes the trained neural network as the aesthetic region recognition algorithm. The second initial neural network comprises a plurality of regression networks with different network structures, where the network structure includes the arrangement of the fully connected layers in the regression network and the number of neurons.
Specifically, as shown in fig. 7, based on the second image sample set, training a preset second initial neural network, and obtaining an aesthetic region recognition algorithm, the specific processing procedure is as follows:
step 701, for a second image sample set, obtaining a significant region sample and an offset proportion vector sample corresponding to each second image sample.
In an implementation, the electronic device obtains each second image sample included in the second set of image samples, and a significant region sample and an offset scale vector sample corresponding to each second image sample.
Step 702, determining a second target parameter according to the salient region sample and the offset proportion vector sample of each second image sample and a preset second initial neural network.
And the second target parameter is a parameter contained in the second initial neural network.
In implementation, for each second image sample, the electronic device inputs the salient region sample of the second image sample into the second initial neural network to obtain an offset proportion vector. The electronic device takes the offset proportion vector of a second image sample together with the offset proportion vector sample corresponding to that second image sample as one group of test data, thereby obtaining test data for all second image samples. The electronic device then determines the neural network weights of the second initial neural network through an error back-propagation algorithm and the test data of each second image sample, and takes the obtained neural network weights as the second target parameter.
In the embodiment of the present invention, the specific process of calculating the weight of the neural network by the electronic device through the error back propagation algorithm and the test data corresponding to each second image sample is the prior art, and is not described herein again.
And step 703, determining an aesthetic region identification algorithm according to the second target parameter and the second initial neural network.
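The regression training of steps 702–703 can be sketched with a single linear layer fitted by gradient descent, the simplest instance of error back-propagation. The sizes and data here are synthetic, and the real second initial neural network is a deeper regression network; only the shape of the computation (salient-region features in, 4-component offset proportion vector out, weights as the "second target parameter") follows the text.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.random((32, 16))            # 32 flattened salient-region samples
true_W = rng.random((16, 4)) * 0.1  # hidden mapping to offset proportion vectors
Y = X @ true_W                      # offset proportion vector samples

W = np.zeros((16, 4))               # the "second target parameter" being learned
init_err = float(np.mean(Y ** 2))   # error of the untrained (all-zero) model
for _ in range(500):
    err = X @ W - Y                       # prediction error on the samples
    W -= 0.05 * (X.T @ err) / len(X)      # gradient step on the squared error
final_err = float(np.mean((X @ W - Y) ** 2))
```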
According to the aesthetics-based image cropping method and device provided by the embodiments of the present invention, a saliency map of the image to be cropped is first obtained according to the image to be cropped and a pre-stored saliency detection algorithm; a salient bounding box is then obtained according to the saliency map and a pre-stored salient region extraction algorithm, and the salient region is determined according to the salient bounding box and the image to be cropped. Next, the aesthetic region is determined based on the salient region and a pre-stored aesthetic region identification algorithm, and the image to be cropped is cropped according to the aesthetic region to obtain an image of high aesthetic quality. Because the aesthetic region corresponding to the salient region is determined by the aesthetic region identification algorithm for a single image to be cropped, the efficiency of determining the cropping frame can be improved.
An embodiment of the present invention further provides an image cropping device based on aesthetics, as shown in fig. 8, the device includes:
a first obtaining module 810, configured to obtain an image to be cut;
a calculating module 820, configured to calculate a saliency map corresponding to the image to be clipped according to a saliency detection algorithm, where the saliency map includes a saliency image corresponding to the image to be clipped, and the saliency image is a grayscale image;
a first determining module 830, configured to determine a salient bounding box in the salient map through a salient region extraction algorithm;
a second determining module 840, configured to determine, in the image to be cropped, a salient region corresponding to the salient bounding box, where the salient region is an image region included in the salient bounding box in the image to be cropped;
a third determining module 850 for determining an aesthetic region bounding box containing the salient region according to an aesthetic region identification algorithm and the salient region;
and the cropping module 860 is used for cropping the image to be cropped based on the aesthetic region boundary box to obtain a target image.
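Taken together, the modules above form a straight-line pipeline. The sketch below wires stub functions in the order the modules run; every name is illustrative, and the stubs stand in for the trained algorithms rather than implementing them.

```python
def crop(image, box):
    """Crop a 2D nested-list image to an integer (x1, y1, x2, y2) box,
    interpreting coordinates in row/column order with (x1, y1) top-left."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]

def crop_aesthetically(image, detect_saliency, extract_salient_box,
                       identify_aesthetic_box):
    saliency_map = detect_saliency(image)              # calculating module
    salient_box = extract_salient_box(saliency_map)    # first determining module
    salient_region = crop(image, salient_box)          # second determining module
    aesthetic_box = identify_aesthetic_box(salient_region, salient_box)
    return crop(image, aesthetic_box)                  # cropping module
```

With stub algorithms (identity saliency, fixed boxes) the pipeline runs end to end; in the device described here the three stubs would be the trained saliency detection, salient region extraction, and aesthetic region identification algorithms.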
Optionally, the third determining module includes:
the obtaining submodule is used for obtaining first coordinate information corresponding to the salient region, wherein the first coordinate information comprises the coordinates, in a preset coordinate system of the image to be cropped, of the pixel points corresponding to two non-adjacent endpoints of the salient bounding box;
the first determining submodule is used for determining an offset proportion vector according to the salient region and the aesthetic region identification algorithm, wherein the offset proportion vector is formed by the percentages that the coordinate offsets in the up, down, left and right directions of the salient region occupy of the corresponding side lengths of the aesthetic region bounding box;
and the second determining submodule is used for determining second coordinate information according to the offset proportion vector and the first coordinate information, and using a boundary frame formed by the second coordinate information as an aesthetic region boundary frame.
Optionally, the apparatus further comprises:
the second acquisition module is used for acquiring a prestored second image sample set, wherein the second image sample set comprises a plurality of second image samples, and a salient region sample and an offset proportion vector sample which correspond to each second image sample;
and the fourth determining module is used for training a preset second initial neural network based on the second image sample set to obtain the aesthetic region identification algorithm.
An embodiment of the present invention further provides an electronic device, as shown in fig. 9, which includes a processor 901, a communication interface 902, a memory 903, and a communication bus 904, where the processor 901, the communication interface 902, and the memory 903 complete mutual communication through the communication bus 904,
a memory 903 for storing computer programs;
a processor 901, configured to execute the program stored in the memory 903, causing the electronic device to perform the following steps:
there is provided an aesthetic-based image cropping method, the method comprising:
acquiring an image to be cut;
calculating a saliency map corresponding to the image to be cut according to a saliency detection algorithm, wherein the saliency map comprises a saliency image corresponding to the image to be cut, and the saliency image is a gray image;
determining a salient bounding box in the salient map through a salient region extraction algorithm;
determining a salient region corresponding to the salient bounding box in the image to be cropped, wherein the salient region is an image region contained in the salient bounding box in the image to be cropped;
determining an aesthetic region bounding box containing the salient region according to an aesthetic region identification algorithm and the salient region;
and cutting the image to be cut based on the aesthetic region boundary frame to obtain a target image.
Optionally, the determining, according to the aesthetic region identification algorithm and the salient region, an aesthetic region bounding box containing the salient region includes:
acquiring first coordinate information corresponding to the salient region, wherein the first coordinate information comprises coordinates of pixel points corresponding to two non-adjacent end points of the salient boundary frame in a preset image coordinate system to be cut;
determining an offset proportion vector according to the salient region and the aesthetic region identification algorithm, wherein the offset proportion vector is formed by the percentages that the coordinate offsets in the up, down, left and right directions of the salient region occupy of the corresponding side lengths of the aesthetic region bounding box;
and determining second coordinate information according to the offset proportion vector and the first coordinate information, and taking a boundary frame formed by the second coordinate information as an aesthetic region boundary frame.
Optionally, the method further includes:
acquiring a first pre-stored image sample set, wherein the first image sample set comprises a plurality of first image samples and a saliency map sample corresponding to each first image sample;
determining a first target parameter according to a preset first initial neural network, each first image sample and a saliency map sample corresponding to each first image sample, wherein the first target parameter is a parameter contained in the first initial neural network;
and determining the significance detection algorithm according to the first target parameter.
Optionally, the method further includes:
acquiring a pre-stored second image sample set, wherein the second image sample set comprises a plurality of second image samples, and a salient region sample and an offset proportion vector sample corresponding to each second image sample;
and training a preset second initial neural network based on the second image sample set to obtain the aesthetic region identification algorithm.
Optionally, the training a preset second initial neural network based on the second image sample set to obtain the aesthetic region identification algorithm includes:
for the second image sample set, obtaining the salient region sample and the offset proportion vector sample corresponding to each second image sample;
determining a second target parameter according to the significant region sample, the offset proportion vector sample and a preset second initial neural network of each second image sample, wherein the second target parameter is a parameter contained in the second initial neural network;
determining the aesthetic region identification algorithm according to the second target parameter and the second initial neural network.
The machine-readable storage medium may include a RAM (Random Access Memory) and may also include a NVM (Non-Volatile Memory), such as at least one disk Memory. Additionally, the machine-readable storage medium may be at least one memory device located remotely from the aforementioned processor.
The Processor may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the apparatus embodiment, since it is substantially similar to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The above description is only for the preferred embodiment of the present application, and is not intended to limit the scope of the present application. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application are included in the protection scope of the present application.

Claims (9)

1. A method for aesthetic based image cropping, the method comprising:
acquiring an image to be cut;
calculating a saliency map corresponding to the image to be cut according to a saliency detection algorithm, wherein the saliency map comprises a saliency image corresponding to the image to be cut, and the saliency image is a gray image;
determining a salient bounding box in the salient map through a salient region extraction algorithm;
determining a salient region corresponding to the salient bounding box in the image to be cropped, wherein the salient region is an image region contained in the salient bounding box in the image to be cropped;
determining an aesthetic region bounding box containing the salient region according to an aesthetic region identification algorithm and the salient region;
based on the aesthetic region boundary frame, cutting the image to be cut to obtain a target image;
wherein the aesthetic region identification algorithm is obtained by:
acquiring a pre-stored second image sample set, wherein the second image sample set comprises a plurality of second image samples, and a salient region sample and an offset proportion vector sample corresponding to each second image sample;
and training a preset second initial neural network based on the second image sample set to obtain the aesthetic region identification algorithm.
2. The method of claim 1, wherein determining an aesthetic region bounding box containing the prominent region based on an aesthetic region identification algorithm and the prominent region comprises:
acquiring first coordinate information corresponding to the salient region, wherein the first coordinate information comprises coordinates of pixel points corresponding to two non-adjacent end points of the salient boundary frame in a preset image coordinate system to be cut;
determining an offset proportion vector according to the salient region and the aesthetic region identification algorithm, wherein the offset proportion vector is formed by the percentages that the coordinate offsets in the up, down, left and right directions of the salient region occupy of the corresponding side lengths of the aesthetic region bounding box;
and determining second coordinate information according to the offset proportion vector and the first coordinate information, and taking a boundary frame formed by the second coordinate information as an aesthetic region boundary frame.
3. The method of claim 1, further comprising:
acquiring a first pre-stored image sample set, wherein the first image sample set comprises a plurality of first image samples and a saliency map sample corresponding to each first image sample;
determining a first target parameter according to a preset first initial neural network, each first image sample and a saliency map sample corresponding to each first image sample, wherein the first target parameter is a parameter contained in the first initial neural network;
and determining the significance detection algorithm according to the first target parameter.
4. The method according to claim 1, wherein the training a preset second initial neural network based on the second image sample set to obtain the aesthetic region identification algorithm comprises:
for the second image sample set, obtaining the salient region sample and the offset proportion vector sample corresponding to each second image sample;
determining a second target parameter according to the significant region sample, the offset proportion vector sample and a preset second initial neural network of each second image sample, wherein the second target parameter is a parameter contained in the second initial neural network;
determining the aesthetic region identification algorithm according to the second target parameter and the second initial neural network.
5. An image cropping device based on aesthetics, characterized in that it comprises:
the first acquisition module is used for acquiring an image to be cut;
the calculation module is used for calculating a saliency map corresponding to the image to be cut according to a saliency detection algorithm, wherein the saliency map comprises a saliency image corresponding to the image to be cut, and the saliency image is a gray image;
a first determination module for determining a salient bounding box in the salient map by a salient region extraction algorithm;
a second determining module, configured to determine, in the image to be cropped, a salient region corresponding to the salient bounding box, where the salient region is an image region included in the salient bounding box in the image to be cropped;
a third determination module, configured to determine an aesthetic region bounding box containing the salient region according to an aesthetic region identification algorithm and the salient region;
the cutting module is used for cutting the image to be cut based on the aesthetic region boundary frame to obtain a target image;
the third determining module is specifically configured to:
acquiring a pre-stored second image sample set, wherein the second image sample set comprises a plurality of second image samples, and a salient region sample and an offset proportion vector sample corresponding to each second image sample;
and training a preset second initial neural network based on the second image sample set to obtain the aesthetic region identification algorithm.
6. The apparatus of claim 5, wherein the third determining module comprises:
the obtaining submodule is used for obtaining first coordinate information corresponding to the salient region, wherein the first coordinate information comprises the coordinates, in a preset coordinate system of the image to be cropped, of the pixel points corresponding to two non-adjacent endpoints of the salient bounding box;
the first determining submodule is used for determining an offset proportion vector according to the salient region and the aesthetic region identification algorithm, wherein the offset proportion vector is formed by the percentages that the coordinate offsets in the up, down, left and right directions of the salient region occupy of the corresponding side lengths of the aesthetic region bounding box;
and the second determining submodule is used for determining second coordinate information according to the offset proportion vector and the first coordinate information, and using a boundary frame formed by the second coordinate information as an aesthetic region boundary frame.
7. The apparatus of claim 5, further comprising:
the second acquisition module is used for acquiring a prestored second image sample set, wherein the second image sample set comprises a plurality of second image samples, and a salient region sample and an offset proportion vector sample which correspond to each second image sample;
and the fourth determining module is used for training a preset second initial neural network based on the second image sample set to obtain the aesthetic region identification algorithm.
8. An electronic device comprising a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor, the processor being caused by the machine-executable instructions to: carrying out the method steps of any one of claims 1 to 4.
9. A machine-readable storage medium having stored thereon machine-executable instructions that, when invoked and executed by a processor, cause the processor to: carrying out the method steps of any one of claims 1 to 4.
CN201810813038.9A 2018-07-23 2018-07-23 Image clipping method and device based on aesthetics Active CN109146892B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810813038.9A CN109146892B (en) 2018-07-23 2018-07-23 Image clipping method and device based on aesthetics


Publications (2)

Publication Number Publication Date
CN109146892A CN109146892A (en) 2019-01-04
CN109146892B true CN109146892B (en) 2020-06-19

Family

ID=64801470


CN112017193A (en) * 2020-08-24 2020-12-01 杭州趣维科技有限公司 Image cropping device and method based on visual saliency and aesthetic score
CN112581446A (en) * 2020-12-15 2021-03-30 影石创新科技股份有限公司 Method, device and equipment for detecting salient object of image and storage medium
CN112700454A (en) * 2020-12-28 2021-04-23 北京达佳互联信息技术有限公司 Image cropping method and device, electronic equipment and storage medium
CN114911551A (en) * 2021-02-08 2022-08-16 花瓣云科技有限公司 Display method and electronic equipment
CN113205522B (en) * 2021-04-28 2022-05-13 华中科技大学 Intelligent image clipping method and system based on antithetical domain adaptation
CN113538460B (en) * 2021-07-12 2022-04-08 中国科学院地质与地球物理研究所 Shale CT image cutting method and system
CN113763291B (en) * 2021-09-03 2023-08-29 深圳信息职业技术学院 Performance evaluation method for maintaining boundary filtering algorithm, intelligent terminal and storage medium
CN116168207A (en) * 2021-11-24 2023-05-26 北京字节跳动网络技术有限公司 Image clipping method, model training method, device, electronic equipment and medium
CN114529715B (en) * 2022-04-22 2022-07-19 中科南京智能技术研究院 Image identification method and system based on edge extraction

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103544685A * 2013-10-22 2014-01-29 华南理工大学 Method and system for beautifying image composition based on subject adjustment
CN105100625A * 2015-08-27 2015-11-25 华南理工大学 Figure image auxiliary shooting method and system based on image aesthetics
CN105528757A * 2015-12-08 2016-04-27 华南理工大学 Content-based image aesthetic quality improvement method
CN107146198A * 2017-04-19 2017-09-08 中国电子科技集团公司电子科学研究院 Intelligent photo cropping method and device
CN107392244A * 2017-07-18 2017-11-24 厦门大学 Image aesthetic enhancement method based on deep neural network and cascaded regression

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Deep Cropping via Attention Box Prediction and Aesthetics Assessment; Wenguan Wang, Jianbing Shen; 2017 IEEE International Conference on Computer Vision; 2017-12-31; pp. 2205-2213 *

Similar Documents

Publication Publication Date Title
CN109146892B (en) Image clipping method and device based on aesthetics
CN111814620B (en) Face image quality evaluation model establishment method, optimization method, medium and device
CN108154086B (en) Image extraction method and device and electronic equipment
US9025889B2 (en) Method, apparatus and computer program product for providing pattern detection with unknown noise levels
CN103617432A (en) Method and device for recognizing scenes
CN109885796B (en) Network news matching detection method based on deep learning
CN112132208B (en) Image conversion model generation method and device, electronic equipment and storage medium
CN113704531A (en) Image processing method, image processing device, electronic equipment and computer readable storage medium
CN112989995B (en) Text detection method and device and electronic equipment
CN111062426A (en) Method, device, electronic equipment and medium for establishing training set
CN112836625A (en) Face living body detection method and device and electronic equipment
CN113222921A (en) Image processing method and system
CN113688839B (en) Video processing method and device, electronic equipment and computer readable storage medium
CN113157956B (en) Picture searching method, system, mobile terminal and storage medium
CN111553302A (en) Key frame selection method, device, equipment and computer readable storage medium
CN112614110B (en) Method and device for evaluating image quality and terminal equipment
CN113837257A (en) Target detection method and device
CN113902899A (en) Training method, target detection method, device, electronic device and storage medium
CN115187839B (en) Image-text semantic alignment model training method and device
CN114511862B (en) Form identification method and device and electronic equipment
CN115115552B (en) Image correction model training method, image correction device and computer equipment
CN116363641A (en) Image processing method and device and electronic equipment
CN116259064A (en) Table structure identification method, training method and training device for table structure identification model
CN114399497A (en) Text image quality detection method and device, computer equipment and storage medium
CN114255493A (en) Image detection method, face detection device, face detection equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant