CN112017193A - Image cropping device and method based on visual saliency and aesthetic score - Google Patents


Info

Publication number
CN112017193A
Authority
CN
China
Prior art keywords
image
frame
cropping
module
candidate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010858270.1A
Other languages
Chinese (zh)
Inventor
吕亚奇
熊永春
李云夕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Quwei Science & Technology Co ltd
Original Assignee
Hangzhou Quwei Science & Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Quwei Science & Technology Co ltd
Priority claimed from CN202010858270.1A
Publication of CN112017193A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V 10/267 Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V 10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30168 Image quality inspection

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)
  • Editing Of Facsimile Originals (AREA)

Abstract

An image cropping device and method based on visual saliency and aesthetic score. The device comprises an operation module, a saliency detection module, a cropping processing module, an aesthetic quality evaluation module and a display module; the saliency detection module and the aesthetic quality evaluation module are deep convolutional neural networks. By providing the cropping processing module, obtaining an initial cropping frame from the salient target frame and the cropping aspect ratio, and transforming the width, height, and center-point x and y coordinates of a single cropping frame in turn, the salient target frame does not need to be traversed and cropping is accelerated.

Description

Image cropping device and method based on visual saliency and aesthetic score
Technical Field
The invention relates to the field of image analysis, in particular to an image cropping device and method based on visual saliency and aesthetic scores.
Background
With the development of intelligent devices, the demands placed on them keep rising, gradually shifting from merely automatic processing to automatic and efficient processing. To handle large numbers of pictures quickly, a variety of picture-processing software has been designed that can crop, beautify and otherwise process pictures automatically. Existing picture cropping methods fall into three main categories:
The first category crops directly around the center point of the image. This method has poor applicability: when the target cropping area is not located at the center of the picture, the cropping result is unsatisfactory.
The second category automatically crops images based on recognized face information or traditional saliency algorithms. Such algorithms recognize complex scene images poorly, must traverse the salient region of the image before a cropping result can be output, which is slow, and fail outright when the image contains no salient target.
The third category is image cropping models obtained by deep learning. Limited by the number of training samples, existing models generalize poorly, and the aspect ratio of the cropping area cannot be specified arbitrarily, so they struggle to meet cropping requirements of arbitrary proportions.
All three methods have drawbacks, so an image cropping method that flexibly adapts to any cropping proportion, tolerates faults well and applies broadly is urgently needed.
Disclosure of Invention
The invention aims to remedy the defects of the prior art by providing an image cropping device and method based on visual saliency and aesthetic score that can crop large numbers of images efficiently and quickly and is convenient to use.
An image cropping device based on visual saliency and aesthetic score comprises an operation module, a saliency detection module, a cropping processing module, an aesthetic quality evaluation module and a display module. The operation module is electrically connected with the saliency detection module, the cropping processing module and the display module; it transmits the initial image information to the saliency detection module and the operation instruction to the cropping processing module over connecting lines. The saliency detection module identifies the salient region of the image. The cropping processing module frames crop regions on the image according to the salient region and the operation instruction, and the framed image is sent to the aesthetic quality evaluation module over a connecting line. After training, the aesthetic quality evaluation module can score the image inside each cropping frame; the image with the highest aesthetic quality score is cropped along its frame, and the resulting image is sent to the display module as the final cropped image. The display module is capable of displaying the final cropped image.
Further, the display module simultaneously displays the initial image transmitted by the operation module and the final cropped image transmitted by the aesthetic quality evaluation module; the operation module receives the initial image to be cropped and an operation instruction input by an operator, the operation instruction including the cropping aspect ratio.
Further, the significance detection module and the aesthetic quality evaluation module are deep convolutional neural networks.
An image cropping method based on visual saliency and aesthetic score comprises the following steps:
Step S1: the operation module receives the initial image and the cropping aspect ratio, sends the initial image to the saliency detection module, and sends the cropping aspect ratio to the cropping processing module;
Step S2: the saliency detection module performs salient-region detection on the initial image to obtain an initial image with a salient target frame, and sends it to the cropping processing module;
Step S3: the cropping processing module obtains an initial image with an initial cropping frame according to the salient target frame and the cropping aspect ratio, and generates an initial image with a group of candidate cropping frames (containing at least one candidate cropping frame) based on the initial cropping frame; each candidate cropping frame is combined with the initial image and cropped accordingly to obtain a group of candidate cropped images, which are sent to the aesthetic quality evaluation module;
Step S4: the aesthetic quality evaluation module evaluates the aesthetic quality score of each candidate cropped image and sends the candidate cropped image with the highest score to the display module as the final cropped image;
Step S5: the display module receives the final cropped image from the aesthetic quality evaluation module and displays it together with the initial image.
The saliency detection module and the aesthetic quality evaluation module must be trained first.
Further, the salient target frame in step S2 is denoted b_salient and is obtained from equation (1):
b_salient = S(I_input)    (1)
where I_input is the three-dimensional matrix representation of the initial image, and S is the operator obtained after the saliency detection module is trained.
Further, the step of generating candidate cropping frames in step S3 and cropping according to them comprises:
S31: determining the cropping aspect ratio r_w/h and the salient target frame b_salient;
S32: taking the center of the salient target frame as the origin and combining the cropping aspect ratio r_w/h, obtaining an initial cropping frame b_init whose range contains the salient target frame;
S33: generating a group of candidate cropping frames from the obtained initial cropping frame b_init;
S34: matching each candidate cropping frame with the initial image and cropping to obtain the candidate cropped images.
Further, in step S32, to obtain the initial cropping frame, first define h_salient, w_salient, x_salient, y_salient as the height, width, and center-point x and y coordinates of the salient target frame b_salient; then calculate the initial cropping frame b_init from b_salient and the cropping aspect ratio r_w/h, as in equation (2):

h_init = h_salient,  w_init = r_w/h · h_salient,  x_init = x_salient,  y_init = y_salient    (2)

where h_init, w_init, x_init, y_init are the height, width, and center-point x and y coordinates of the initial cropping frame b_init.
If w_init ≥ w_salient, output the initial cropping frame b_init; otherwise update its width, height and center-point data according to equation (3):

w_init = w_salient,  h_init = w_salient / r_w/h,  x_init = x_salient,  y_init = y_salient    (3)

and output the initial cropping frame b_init.
Further, the group of candidate cropping frames in S33 is generated as follows:
S331: transform the height h_init of the initial cropping frame within a set height-transformation-ratio range to obtain n1 cropping frames, the height transformation ratio of each frame being obtained from the ratio of the range to (n1 − 1);
S332: transform the width w_init of the cropping frames obtained in step S331 within a set width-transformation-ratio range to obtain n1 × n2 cropping frames, the width transformation ratio of each frame being obtained from the ratio of the range to (n2 − 1);
S333: transform the center-point x_init of the cropping frames obtained in step S332 within a set center-point-transformation-ratio range to obtain n1 × n2 × n3 cropping frames, the transformation ratio of each frame being obtained from the ratio of the range to (n3 − 1);
S334: transform the center-point y_init of the cropping frames obtained in step S333 within the set center-point-transformation-ratio range to obtain n1 × n2 × n3 × n4 cropping frames, the transformation ratio of each frame being obtained from the ratio of the range to (n4 − 1);
S335: randomly pick n cropping frames from the n1 × n2 × n3 × n4 frames obtained in step S334 as the candidate cropping frames.
Further, in step S4 each candidate cropped image is input to the aesthetic quality evaluation module to obtain an aesthetic quality score q_k, as in equation (4):

q_k = A(I_crop_k),  k ∈ {1, …, n}    (4)

where I_crop_k is the three-dimensional matrix representation of the k-th candidate cropped image, n is the number of candidate cropping frames, and A is the operator obtained after the aesthetic quality evaluation module is trained.
Further, in S33, alternatively each of the parameters h_init, w_init, x_init, y_init may be randomly transformed once, in turn, within a set ratio range to obtain one candidate cropping frame; repeating the operation n times yields n candidate cropping frames.
The beneficial effects of the invention are:
By providing the cropping processing module, obtaining the initial cropping frame from the salient target frame and the cropping aspect ratio, and independently transforming the width, height, and center-point x and y coordinates of the cropping frame in turn, the salient target frame need not be traversed and cropping is accelerated;
By providing and training the saliency detection module and the aesthetic quality evaluation module, the salient target frame is determined automatically and the candidate cropped images are scored and selected, giving the algorithm good robustness;
The display module can output the initial image and the final cropped image simultaneously for easy comparison, and can display the intermediate processing stages so the process can be checked and corrected;
The invention marks the images at different stages, making it easy to tell which stage the current image belongs to; by inspecting the cropping process, the stage at which an image went wrong can be found and the corresponding module corrected in time.
Drawings
FIG. 1 is a block flow diagram of a first embodiment of the present invention;
FIG. 2 is a schematic diagram of an initial image according to a first embodiment of the present invention;
FIG. 3 is a schematic diagram of an initial image with a frame of a salient object according to a first embodiment of the present invention;
FIG. 4 is a diagram illustrating an initial cropped image with an initial cropping frame according to a first embodiment of the present invention;
FIG. 5 is a diagram illustrating an initial image with a set of candidate cropping frames according to a first embodiment of the present invention;
FIG. 6 is a diagram illustrating an initial image with a candidate cropping frame according to a first embodiment of the present invention;
FIG. 7 is a final cropped image according to a first embodiment of the present invention.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict.
It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention, and the components related to the present invention are only shown in the drawings rather than drawn according to the number, shape and size of the components in actual implementation, and the type, quantity and proportion of the components in actual implementation may be changed freely, and the layout of the components may be more complicated.
The first embodiment is as follows:
an image cropping device based on visual saliency and aesthetic scores comprises an operation module, a saliency detection module, a cropping processing module, an aesthetic quality evaluation module and a display module.
The operation module receives the initial image to be cropped and an operation instruction input by an operator, the instruction including the cropping aspect ratio. The operation module is electrically connected with the saliency detection module, the cropping processing module and the display module; it transmits the received initial image information to the saliency detection module and the operation instruction to the cropping processing module over connecting lines. In this embodiment the saliency detection module is a deep convolutional neural network, electrically connected with the operation module, the cropping processing module and the display module. After training, the saliency detection module can identify the salient region of an image; ideally the salient region is the minimal upright (not tilted) box region containing the salient target. The cropping processing module frames crop regions on the image according to the salient region and the operation instruction, crops the framed image along each cropping frame to obtain candidate cropped images, and sends them to the aesthetic quality evaluation module over a connecting line. The cropping processing module is electrically connected with the saliency detection module, the aesthetic quality evaluation module and the display module. The aesthetic quality evaluation module is a deep convolutional neural network that, after training, can score the candidate cropped images; the one with the highest aesthetic quality score is sent to the display module as the final cropped image.
The aesthetic quality evaluation module is electrically connected with the cropping processing module and the display module. The display module can display the final cropped image; in this embodiment it also displays the initial image input through the operation module for comparison, and can display the intermediate processing stages, facilitating backtracking and inspection.
As shown in FIGS. 1 to 7, an image cropping method based on visual saliency and aesthetic score, carried out by the above image cropping device, comprises the following steps:
Step S1: the operation module receives the initial image and the cropping aspect ratio, sends the initial image to the saliency detection module, and sends the cropping aspect ratio to the cropping processing module;
Step S2: the saliency detection module performs salient-region detection on the initial image to obtain an initial image with a salient target frame, and sends it to the cropping processing module;
Step S3: the cropping processing module obtains an initial image with an initial cropping frame according to the salient target frame and the cropping aspect ratio, and generates an initial image with a group of candidate cropping frames (containing at least one candidate cropping frame) based on the initial cropping frame; each candidate cropping frame is combined with the initial image and cropped accordingly to obtain a group of candidate cropped images, which are sent to the aesthetic quality evaluation module;
Step S4: the aesthetic quality evaluation module evaluates the aesthetic quality score of each candidate cropped image and sends the candidate cropped image with the highest score to the display module as the final cropped image;
Step S5: the display module receives the final cropped image from the aesthetic quality evaluation module and displays it together with the initial image.
The saliency detection module and the aesthetic quality evaluation module must be trained first.
As shown in FIG. 3, the salient target frame in step S2 is denoted b_salient and is obtained from equation (1):
b_salient = S(I_input)    (1)
where I_input is the three-dimensional matrix representation of the initial image, and S is the operator obtained after the saliency detection module is trained. In this embodiment the saliency detection module is trained on a private data set comprising 20000 color images with marked salient target frames; the training is conventional deep-learning-based target detection training.
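The operator S in equation (1) is a trained deep network. As a rough illustration of what it produces, namely a minimal upright box around the salient target, such a box can be derived from a thresholded saliency map. The sketch below is an assumption for illustration (the threshold value and the map format are not part of the patent):

```python
import numpy as np

def salient_box(saliency_map, thresh=0.5):
    """Illustrative stand-in for the trained operator S: given a
    saliency map with values in [0, 1], return the minimal upright
    box (w, h, x_center, y_center) covering all salient pixels.
    The threshold is an assumed illustration parameter."""
    ys, xs = np.nonzero(saliency_map >= thresh)
    if ys.size == 0:
        return None  # no salient target: cropping would fail
    x0, x1 = xs.min(), xs.max()
    y0, y1 = ys.min(), ys.max()
    w, h = x1 - x0 + 1, y1 - y0 + 1
    return w, h, (x0 + x1) / 2.0, (y0 + y1) / 2.0
```

A deep saliency network replaces this thresholding in the patent; only the box format (width, height, center coordinates) carries over.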
As shown in FIGS. 4 and 5, the step of generating candidate cropping frames in step S3 and cropping according to them comprises:
S31: determining the cropping aspect ratio r_w/h and the salient target frame b_salient;
S32: taking the center of the salient target frame as the origin and combining the cropping aspect ratio r_w/h, obtaining an initial cropping frame b_init whose range contains the salient target frame;
S33: generating a group of candidate cropping frames from the obtained initial cropping frame b_init;
S34: matching each candidate cropping frame with the initial image and cropping to obtain the candidate cropped images.
The initial cropping frame obtained in S32 contains the salient target frame along the height or width direction. First define h_salient, w_salient, x_salient, y_salient as the height, width, and center-point x and y coordinates of the salient target frame b_salient; then calculate the initial cropping frame b_init from b_salient and the cropping aspect ratio r_w/h, as in equation (2):

h_init = h_salient,  w_init = r_w/h · h_salient,  x_init = x_salient,  y_init = y_salient    (2)

where h_init, w_init, x_init, y_init are the height, width, and center-point x and y coordinates of the initial cropping frame b_init.
If w_init ≥ w_salient, output the initial cropping frame b_init; otherwise update its width, height and center-point data according to equation (3):

w_init = w_salient,  h_init = w_salient / r_w/h,  x_init = x_salient,  y_init = y_salient    (3)

and output the initial cropping frame b_init.
It should be noted that in some other embodiments the initial cropping frame data can be obtained from equation (3) first; if h_init ≥ h_salient the frame is output, otherwise it is updated according to equation (2) and then output.
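The computation of the initial cropping frame via equations (2) and (3) can be sketched as follows. This is a minimal sketch under the reconstruction of the equations given in this document; the function and variable names are illustrative, not the patent's:

```python
def initial_crop_box(w_s, h_s, x_s, y_s, r):
    """Initial cropping frame from a salient box (w_s, h_s) centered
    at (x_s, y_s), for a cropping aspect ratio r = width/height.
    Follows the equation-(2)-then-(3) ordering; the reverse ordering
    mentioned in the text is symmetric."""
    # Equation (2): match the salient height, derive the width from r.
    h_init, w_init = h_s, r * h_s
    x_init, y_init = x_s, y_s
    # If that width cannot cover the salient box, update per
    # equation (3): match the salient width, derive the height.
    if w_init < w_s:
        w_init, h_init = w_s, w_s / r
    return w_init, h_init, x_init, y_init
```

Either branch yields a frame with aspect ratio r whose extent is at least that of the salient box in both directions.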
In S33, the group of candidate cropping frames is generated as follows:
S331: transform the height h_init of the initial cropping frame within a set height-transformation-ratio range to obtain n1 cropping frames, the height transformation ratio of each frame being obtained from the ratio of the range to (n1 − 1);
S332: transform the width w_init of the cropping frames obtained in step S331 within a set width-transformation-ratio range to obtain n1 × n2 cropping frames, the width transformation ratio of each frame being obtained from the ratio of the range to (n2 − 1);
S333: transform the center-point x_init of the cropping frames obtained in step S332 within a set center-point-transformation-ratio range to obtain n1 × n2 × n3 cropping frames, the transformation ratio of each frame being obtained from the ratio of the range to (n3 − 1);
S334: transform the center-point y_init of the cropping frames obtained in step S333 within the set center-point-transformation-ratio range to obtain n1 × n2 × n3 × n4 cropping frames, the transformation ratio of each frame being obtained from the ratio of the range to (n4 − 1);
S335: randomly pick n cropping frames from the n1 × n2 × n3 × n4 frames obtained in step S334 as the candidate cropping frames.
In this embodiment the height-, width- and center-point-transformation-ratio ranges are all [−20%, 20%], and n1, n2, n3 and n4 are all 5, meaning that step S334 yields 5 × 5 × 5 × 5 = 625 cropping frames, from which 20 are randomly selected as the candidate cropping frames of this embodiment. Taking the height transformation in S331 as an example, with n1 = 5 the transformation ratios are spaced by the range divided by (n1 − 1), i.e. 40% / 4 = 10%:

ratio_i = −20% + i × 10%,  i = 0, 1, 2, 3, 4

so the height transformation ratios in S331 are −20%, −10%, 0, 10% and 20%, and step S331 produces five cropping frames whose heights are 80%, 90%, 100%, 110% and 120% of the initial cropping-frame height.
When generating candidate cropping frames, each of the parameters h_init, w_init, x_init, y_init may instead be randomly transformed once, in turn, within a set ratio range to obtain one candidate cropping frame, and the operation repeated n times to obtain n candidate cropping frames. Say the transformation ratio ranges of h_init, w_init, x_init, y_init are all [−20%, 20%]: the height of the initial cropping frame is first randomly transformed once within [80%, 120%] of its value, then the width within [80%, 120%], then the center-point x coordinate within [80%, 120%], and finally the center-point y coordinate within [80%, 120%], yielding one candidate cropping frame; repeating the random transformation 20 times yields 20 random candidate cropping frames.
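The random variant just described can be sketched as follows. The seed and the proportional-shift interpretation of the center-point transform are assumptions for illustration:

```python
import random

def random_candidates(box, n=20, ratio=0.2, seed=0):
    """Sketch of the random variant: each of h, w, x, y is perturbed
    once by a uniform ratio in [-ratio, +ratio]; repeating n times
    yields n candidate crop frames.  `box` is (w, h, x_center, y_center)."""
    rng = random.Random(seed)
    w, h, x, y = box
    frames = []
    for _ in range(n):
        dh, dw, dx, dy = (rng.uniform(-ratio, ratio) for _ in range(4))
        frames.append((w * (1 + dw), h * (1 + dh), x + w * dx, y + h * dy))
    return frames
```

Unlike the grid of S331–S334, this samples the four-dimensional transform space directly, so only n frames are ever materialized.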
In step S34 a candidate cropping frame may extend beyond the boundary of the initial image; during cropping, the portion beyond the boundary is filled with white pixels.
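White-pixel filling for frames that extend beyond the image boundary (step S34) can be sketched with numpy. Rounding the frame to integer pixel coordinates is an assumption of this sketch:

```python
import numpy as np

def crop_with_white_fill(image, w, h, xc, yc):
    """Crop a (w, h) frame centered at (xc, yc) from `image`; any
    region of the frame outside the image is filled with white (255)."""
    H, W = image.shape[:2]
    w, h = int(round(w)), int(round(h))
    x0 = int(round(xc - w / 2.0))
    y0 = int(round(yc - h / 2.0))
    out = np.full((h, w) + image.shape[2:], 255, dtype=image.dtype)
    # Intersection between the crop frame and the image.
    sx0, sy0 = max(x0, 0), max(y0, 0)
    sx1, sy1 = min(x0 + w, W), min(y0 + h, H)
    if sx0 < sx1 and sy0 < sy1:
        out[sy0 - y0:sy1 - y0, sx0 - x0:sx1 - x0] = image[sy0:sy1, sx0:sx1]
    return out
```

This works unchanged for grayscale (H, W) and color (H, W, C) arrays, since the trailing shape is carried over.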
As shown in FIG. 6, in step S4 each candidate cropped image is input to the aesthetic quality evaluation module to obtain an aesthetic quality score q_k, as in equation (4):

q_k = A(I_crop_k),  k ∈ {1, …, n}    (4)

where I_crop_k is the three-dimensional matrix representation of the k-th candidate cropped image and n, which is 20 in this embodiment, is the number of candidate cropping frames; A is the operator obtained after the aesthetic quality evaluation module is trained. In this embodiment the aesthetic quality evaluation module is trained on a private data set comprising q manually scored images; the training is conventional deep-learning-based training.
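The selection in step S4, scoring each candidate with the trained operator A of equation (4) and keeping the maximum, can be sketched as below. Here `score_fn` is a hypothetical stand-in for the trained evaluator A, not the patent's network:

```python
def pick_final_crop(candidate_crops, score_fn):
    """Step S4: compute q_k = score_fn(crop_k) for each candidate and
    return the highest-scoring crop together with its score."""
    scores = [score_fn(crop) for crop in candidate_crops]
    best = max(range(len(scores)), key=scores.__getitem__)
    return candidate_crops[best], scores[best]
```

Any callable mapping a cropped image to a scalar score can be plugged in, which is what makes the evaluator swappable for retraining.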
While the images are processed in the saliency detection module, the cropping processing module and the aesthetic quality evaluation module, they are shown in the display module. In this embodiment the images at different stages are marked by combining colors with solid and dashed lines: the salient target frame is a solid red line, the initial cropping frame a solid green line, the candidate cropping frames light dashed boxes, and the candidate cropping frame with the highest aesthetic quality score a solid yellow line. In some other embodiments the images at different stages can be marked in other ways, such as text labels.
In implementation, the saliency detection module and the aesthetic quality evaluation module are trained first. After training, the initial image to be cropped is input to the operation module; candidate cropping frames are obtained through detection by the saliency detection module and cropping by the cropping processing module; finally the aesthetic quality evaluation module scores the candidates, and the image in the highest-scoring candidate cropping frame is output to the display module as the final cropped image, realizing automatic cropping. The way candidate cropping frames are generated in the cropping processing module enables fast, large-scale processing of images.
The above description is only one specific example of the present invention and should not be construed as limiting it in any way. In light of this disclosure, persons skilled in the relevant art may make various modifications and changes in form and detail without departing from the principles and structure of the invention, all of which fall within the scope of the appended claims.

Claims (10)

1. An image cropping device based on visual saliency and aesthetic score, characterized by comprising an operation module, a saliency detection module, a cropping processing module, an aesthetic quality evaluation module and a display module; the operation module is electrically connected with the saliency detection module, the cropping processing module and the display module, transmits the initial image information to the saliency detection module over a connecting line, and transmits an operation instruction to the cropping processing module over a connecting line; the saliency detection module identifies the salient region of the image, the salient region being the minimal box region containing the salient target; the cropping processing module frames crop regions on the image according to the salient region and the operation instruction, and the framed image is sent to the aesthetic quality evaluation module over a connecting line; after training, the aesthetic quality evaluation module can score the images inside the cropping frames; the image with the highest aesthetic quality score is cropped along its frame and the resulting image is sent to the display module as the final cropped image; the display module is capable of displaying the final cropped image.
2. The image cropping device based on visual saliency and aesthetic score as claimed in claim 1, characterized in that said display module simultaneously displays the initial image transmitted by the operation module and the final cropped image transmitted by the aesthetic quality evaluation module; the operation module receives an initial image to be cut and an operation instruction input by an operator, wherein the operation instruction comprises the cut aspect ratio.
3. The image cropping device based on visual saliency and aesthetic scores of claim 1, characterized in that said saliency detection module and aesthetic quality evaluation module are deep convolutional neural networks.
4. An image cropping method based on visual saliency and aesthetic scores, comprising the steps of:
step S1: the operation module receives the initial image and the cropped aspect ratio, sends the initial image to the significance detection module and sends the cropped aspect ratio to the cropping processing module;
step S2: the saliency detection module receives the initial image to perform saliency region detection to obtain an initial image with a saliency target frame, and sends the initial image with the saliency target frame to the cropping processing module;
step S3: the cropping processing module obtains an initial image with an initial cropping frame according to the significant target frame and the cropped aspect ratio, and generates an initial image with a group of candidate cropping frames based on the initial cropping frame; the group of candidate cropping frames at least comprises one candidate cropping frame; combining each candidate cutting frame with the initial image, and cutting according to the candidate cutting frames to obtain a group of candidate cutting images; sending the candidate cropped image to an aesthetic quality evaluation module;
step S4: the aesthetic quality evaluation module evaluates the aesthetic quality score of each candidate cropped image, and sends the candidate cropped image with the highest aesthetic quality score as a final cropped image to the display module;
step S5: the display module receives the final cut image sent by the aesthetic quality evaluation module and displays the final cut image and the initial image simultaneously;
wherein the saliency detection module and the aesthetic quality evaluation module must be trained first.
5. The image cropping method based on visual saliency and aesthetic score as claimed in claim 4, characterized in that the salient object frame in step S2, denoted b_salient, is derived from equation (1):
b_salient = S(I_input) (1)
where I_input denotes the three-dimensional matrix representation of the initial image, and S is the operator obtained after the saliency detection module is trained.
6. The image cropping method based on visual saliency and aesthetic score as claimed in claim 4, characterized in that in step S3 the candidate cropping frames are generated, and the cropping according to the candidate cropping frames is performed, by the following steps:
S31: determining the cropped aspect ratio r_wh = w/h and the salient object frame b_salient;
S32: taking the center of the salient object frame as the origin and combining the cropped aspect ratio r_wh, obtaining an initial cropping frame b_init whose image range contains the salient object frame;
S33: generating a group of candidate cropping frames from the obtained initial cropping frame b_init;
S34: matching each candidate cropping frame with the initial image and cropping to obtain the candidate cropped images.
7. The image cropping method based on visual saliency and aesthetic score as claimed in claim 6, characterized in that, to obtain the initial cropping frame in S32, first define h_salient, w_salient, x_salient, y_salient as the height, width, and x and y coordinates of the center point of the salient object frame b_salient, respectively; then calculate the initial cropping frame b_init from the salient object frame b_salient and the cropped aspect ratio r_wh, as shown in equation (2):
h_init = h_salient, w_init = r_wh · h_salient, x_init = x_salient, y_init = y_salient (2)
where h_init, w_init, x_init, y_init are the height, width, and x and y coordinates of the center point of the initial cropping frame b_init, respectively;
if w_init ≥ w_salient is satisfied, output the initial cropping frame b_init; otherwise, update the width, height and center-point data of the initial cropping frame b_init according to equation (3):
w_init = w_salient, h_init = w_salient / r_wh, x_init = x_salient, y_init = y_salient (3)
and output the initial cropping frame b_init.
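Equations (2) and (3) survive in the source only as image placeholders, so the computation below is a reconstruction from the surrounding text: the initial frame matches the salient height (or, if that frame would be too narrow, the salient width) and derives the other dimension from r_wh, keeping the same center.

```python
def initial_crop_frame(h_s, w_s, x_s, y_s, r_wh):
    """Compute the initial cropping frame b_init (equations (2)-(3)):
    the smallest frame with aspect ratio r_wh = width/height, centred on
    the salient object frame (h_s, w_s, x_s, y_s), that fully contains it."""
    # Equation (2): take the salient height and derive the width from r_wh.
    h_init, w_init = h_s, r_wh * h_s
    x_init, y_init = x_s, y_s
    # Equation (3): if that width cannot cover the salient frame,
    # take the salient width instead and derive the height from r_wh.
    if w_init < w_s:
        w_init, h_init = w_s, w_s / r_wh
    return h_init, w_init, x_init, y_init
```

For example, a salient frame 20 high and 40 wide with r_wh = 1 triggers the equation-(3) branch and yields a 40×40 frame at the same center.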
8. The image cropping method based on visual saliency and aesthetic score as claimed in claim 4, characterized in that in said step S33 a group of candidate cropping frames is generated by the following steps:
S331: transforming h_init of the initial cropping frame b_init within a set height-transformation ratio range to obtain n1 cropping frames, the height-transformation ratio of each cropping frame being obtained from the ratio of the height-transformation ratio range to (n1-1);
S332: transforming w_init of the cropping frames obtained in step S331 within a set width-transformation ratio range to obtain n1×n2 cropping frames, the width-transformation ratio of each cropping frame being obtained from the ratio of the width-transformation ratio range to (n2-1);
S333: transforming x_init of the cropping frames obtained in step S332 within a set center-point transformation ratio range to obtain n1×n2×n3 cropping frames, the center-point transformation ratio of each cropping frame being obtained from the ratio of the center-point transformation ratio range to (n3-1);
S334: transforming y_init of the cropping frames obtained in step S333 within the set center-point transformation ratio range to obtain n1×n2×n3×n4 cropping frames, the center-point transformation ratio of each cropping frame being obtained from the ratio of the center-point transformation ratio range to (n4-1);
S335: randomly selecting n cropping frames from the n1×n2×n3×n4 cropping frames obtained in step S334 as the candidate cropping frames.
9. The image cropping method based on visual saliency and aesthetic score as claimed in claim 8, characterized in that in step S4 each candidate cropped image is input into the aesthetic quality evaluation module to obtain its aesthetic quality score q_k, as shown in equation (4):
q_k = A(I_crop^k) (4)
where I_crop^k represents the three-dimensional form of the k-th candidate cropped image, k ∈ {1, …, n}, n represents the number of candidate cropping frames, and A is the operator obtained after the aesthetic quality evaluation module is trained.
10. The image cropping method based on visual saliency and aesthetic score as claimed in claim 4, characterized in that in S33, each of the parameters h_init, w_init, x_init and y_init of the cropping frame is randomly transformed once, in sequence, within a set proportion range to obtain one candidate cropping frame, and the operation is repeated n times to obtain n candidate cropping frames.
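The claim-10 variant replaces the exhaustive grid of claim 8 with one random transform per parameter per candidate. A sketch, with illustrative `scale` and `shift` ranges (the claim does not specify them):

```python
import random

def random_candidate_frames(h0, w0, x0, y0, n, scale=(0.9, 1.1), shift=(-0.05, 0.05)):
    """For each of n candidates, randomly transform h_init, w_init, x_init
    and y_init once each within the set proportion ranges (claim 10)."""
    frames = []
    for _ in range(n):
        h = h0 * random.uniform(*scale)          # scale the height
        w = w0 * random.uniform(*scale)          # scale the width
        x = x0 + w0 * random.uniform(*shift)     # shift the center x
        y = y0 + h0 * random.uniform(*shift)     # shift the center y
        frames.append((h, w, x, y))
    return frames
```

This trades the grid's coverage guarantees for constant memory and exactly n candidates per image.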
CN202010858270.1A 2020-08-24 2020-08-24 Image cropping device and method based on visual saliency and aesthetic score Pending CN112017193A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010858270.1A CN112017193A (en) 2020-08-24 2020-08-24 Image cropping device and method based on visual saliency and aesthetic score


Publications (1)

Publication Number Publication Date
CN112017193A true CN112017193A (en) 2020-12-01

Family

ID=73505712

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010858270.1A Pending CN112017193A (en) 2020-08-24 2020-08-24 Image cropping device and method based on visual saliency and aesthetic score

Country Status (1)

Country Link
CN (1) CN112017193A (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105956999A (en) * 2016-04-28 2016-09-21 努比亚技术有限公司 Thumbnail generating device and method
CN106681606A (en) * 2016-12-06 2017-05-17 宇龙计算机通信科技(深圳)有限公司 Picture processing method and terminal
CN107545576A (en) * 2017-07-31 2018-01-05 华南农业大学 Image edit method based on composition rule
CN109146892A (en) * 2018-07-23 2019-01-04 北京邮电大学 A kind of image cropping method and device based on aesthetics
CN110349082A (en) * 2019-06-28 2019-10-18 腾讯科技(深圳)有限公司 Method of cutting out and device, the storage medium and electronic device of image-region
CN110909724A (en) * 2019-10-08 2020-03-24 华北电力大学 Multi-target image thumbnail generation method


Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022227752A1 (en) * 2021-04-26 2022-11-03 荣耀终端有限公司 Photographing method and device
WO2023075936A1 (en) * 2021-10-29 2023-05-04 Microsoft Technology Licensing, Llc. Ai-based aesthetical image modification
US11961261B2 (en) 2021-10-29 2024-04-16 Microsoft Technology Licensing, Llc AI-based aesthetical image modification
CN115082673A (en) * 2022-06-14 2022-09-20 阿里巴巴(中国)有限公司 Image processing method, device, equipment and storage medium
CN116543004A (en) * 2023-07-05 2023-08-04 荣耀终端有限公司 Image cutting method, device, terminal equipment and computer readable storage medium
CN116543004B (en) * 2023-07-05 2024-04-19 荣耀终端有限公司 Image cutting method, device, terminal equipment and computer readable storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination