CN112308859A - Method and device for generating thumbnail, camera and storage medium - Google Patents

Method and device for generating thumbnail, camera and storage medium

Info

Publication number
CN112308859A
Authority
CN
China
Prior art keywords
image
target image
target
thumbnail
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010906455.5A
Other languages
Chinese (zh)
Inventor
庞芸萍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaomi Pinecone Electronic Co Ltd
Original Assignee
Beijing Xiaomi Pinecone Electronic Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiaomi Pinecone Electronic Co Ltd filed Critical Beijing Xiaomi Pinecone Electronic Co Ltd
Priority to CN202010906455.5A priority Critical patent/CN112308859A/en
Publication of CN112308859A publication Critical patent/CN112308859A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/10 File systems; File servers
    • G06F 16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F 16/168 Details of user interfaces specifically adapted to file systems, e.g. browsing and visualisation, 2d or 3d GUIs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F 16/54 Browsing; Visualisation therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20112 Image segmentation details
    • G06T 2207/20132 Image cropping

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Processing (AREA)

Abstract

The disclosure relates to a method and a device for generating a thumbnail, a camera and a storage medium, relates to the technical field of image processing, and addresses the problems in the related art that a thumbnail cannot highlight the salient region of a target image, cannot fully display the main content of the target image, and cannot guarantee a reasonable distribution of objects within the thumbnail. The method comprises the following steps: acquiring a target image; performing object detection on the target image, and cropping the target image according to the detected objects to obtain a plurality of cropped images; inputting each cropped image into a selection model to obtain an output score of the selection model for each cropped image, wherein the selection model is trained on an image sample set comprising a plurality of sample pairs, each sample pair consisting of an image sample and a score corresponding to that image sample; and determining a target cropped image according to the output scores and using the target cropped image as the thumbnail of the target image.

Description

Method and device for generating thumbnail, camera and storage medium
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to a method and an apparatus for generating a thumbnail, a camera, and a storage medium.
Background
At present, when a large number of pictures are previewed in applications such as an album or a file manager on a mobile terminal, thumbnails of the pictures are generally generated to provide previews for the user. For example, the overview page of a camera album typically displays a grid of square thumbnails, so the original image needs to be cropped to the required size.
In the related art, the central region of the picture is cropped blindly. As a result, the main object of the picture may be cut off, the salient region of the picture cannot be highlighted, and the thumbnail cannot adequately display the original content; even when the main object is not cut off, blindly cropping the central region cannot guarantee a reasonable distribution of objects within the picture.
Disclosure of Invention
To overcome the problems in the related art, the present disclosure provides a method, apparatus, camera, and storage medium for generating a thumbnail.
According to a first aspect of embodiments of the present disclosure, there is provided a method of generating a thumbnail, the method including:
acquiring a target image;
carrying out object detection on the target image, and cutting the target image according to the detected object to obtain a plurality of cut images;
inputting each of the cut images into a selection model to obtain an output score of the selection model for each of the cut images, wherein the selection model is obtained by training according to an image sample set, the image sample set comprises a plurality of sample pairs, and each sample pair comprises an image sample and a score corresponding to the image sample;
and determining a target cutting image according to the output score, and taking the target cutting image as a thumbnail of the target image.
Optionally, the performing object detection on the target image and obtaining a plurality of cut images by cutting from the target image according to the detected object includes:
carrying out object detection on the target image, and determining a main body area of the target image according to the detected main object;
determining a crop box size from the subject region and the target image;
and according to the size of the cropping frame, performing multiple cropping on the target image to obtain multiple cropped images.
Optionally, the performing object detection on the target image and determining a main region of the target image according to the detected main object includes:
and determining a smallest enclosing frame of the detected main object set, and taking an image area in the smallest enclosing frame as the main body area.
Optionally, the performing, according to the size of the cropping frame, multiple cropping on the target image to obtain multiple cropped images includes:
determining a plurality of clipping positions of the clipping frame according to the central position of the main body area and the size of the clipping frame, wherein the clipping frame is provided with a standard position point which is coincident with the central position point of the main body area at each clipping position;
and cutting the target image according to the size of the cutting frame at each cutting position to obtain a plurality of cutting images, wherein the size of the cutting frame enables the cutting frame to fully cover the main body area during each cutting.
Optionally, the performing object detection on the target image and determining a main region of the target image according to the detected main object includes:
when the target image is detected to contain a face, and the area of a face detection frame is larger than a preset threshold value, determining the face in the face detection frame as the main object, wherein the main area comprises an area where the main object is located;
and under the condition that the target image is detected to contain no human face, or the target image contains a human face but the area of the human face detection frame is smaller than the preset threshold value, performing target detection on the target image and determining that an object in the target detection frame belongs to the main object, wherein the main area comprises an area where the main object is located.
Optionally, the cropping frame is a square cropping frame, and determining the size of the cropping frame according to the size of the main body area and the size of the target image includes:
determining the side length L of the cutting frame by the following calculation formula;
L=min(w,max(2/3*w,2*c));
where w represents a minimum side length of the target image, and c represents a minimum side length of the subject region.
Optionally, the acquiring the target image includes:
acquiring an image preview request sent by a client;
after determining a target trimming image according to the output score and using the target trimming image as a thumbnail of the target image, the method further comprises:
sending a message to the client for responding to the image preview request, the message including the thumbnail.
According to a second aspect of the embodiments of the present disclosure, there is provided an apparatus for generating a thumbnail, the apparatus including:
an image acquisition module configured to acquire a target image;
the processing module is configured to perform object detection on the target image and cut the target image into a plurality of cut images according to the detected object;
the first execution module is configured to input each of the clipped images into a selection model to obtain an output score of the selection model for each of the clipped images, wherein the selection model is obtained by training according to an image sample set, the image sample set comprises a plurality of sample pairs, and each sample pair comprises an image sample and a score corresponding to the image sample;
and the second execution module is configured to determine a target cutting image according to the output score and take the target cutting image as a thumbnail of the target image.
According to a third aspect of the embodiments of the present disclosure, there is provided a camera including the apparatus for generating a thumbnail described above.
According to a fourth aspect of embodiments of the present disclosure, there is provided a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the steps of the method of generating thumbnails provided by the first aspect of the present disclosure.
According to a fifth aspect of embodiments of the present disclosure, there is provided an apparatus for generating a thumbnail, the apparatus including: a processor;
a memory for storing processor-executable instructions;
wherein the processor, when executing the instructions stored in the memory, may perform the method for generating a thumbnail provided by the first aspect of the disclosure.
The technical solution provided by the embodiments of the present disclosure can have the following beneficial effects: object detection is performed on the target image and the target image is cropped according to the detected objects to obtain a plurality of cropped images, which ensures the integrity of the main object in the image; the cropped image with the highest score is then taken as the target cropped image and used as the thumbnail of the target image, which ensures a reasonable distribution of objects in the image and allows the thumbnail to best present the content of the target image.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
FIG. 1 is a schematic diagram before cropping a center region of an image;
FIG. 2 is a schematic diagram of a cropped central region of an image;
FIG. 3 is a flow chart illustrating a method of generating a thumbnail according to an exemplary embodiment.
Fig. 4 is a flowchart illustrating step S120 according to an exemplary embodiment.
FIG. 5 is a diagram illustrating a crop box according to an exemplary embodiment.
FIG. 6 is a diagram illustrating a target crop area, according to an example embodiment.
FIG. 7 is a flow diagram illustrating training a selection model according to an example embodiment.
Fig. 8 is a block diagram illustrating an apparatus for generating a thumbnail according to an exemplary embodiment.
Fig. 9 is a block diagram illustrating an apparatus for generating a thumbnail according to an exemplary embodiment.
Detailed Description
At present, when a large number of pictures are previewed in applications such as an album or a file manager on a mobile terminal, thumbnails of the pictures are generally generated to provide previews for the user. The size of a picture is generally determined by the camera, and the most common aspect ratios are 4:3, 3:4, 9:16, 16:9, 1:1, 2:3, 3:2, and the like. However, in many application scenarios a photo is not displayed at its original size; for example, the overview page of a camera album usually shows a grid of square thumbnails, so the original image needs to be cropped to the required size.
In the related art, the middle area of the picture is cropped. Fig. 1 is a schematic diagram before the central region of an image is cropped, and Fig. 2 is a schematic diagram after the central region has been cropped. Blindly cropping the central region of the picture can cut off the main object of the picture and fails to highlight the salient region of the picture, so the thumbnail cannot fully display the content of the original image; even when the main object is not cut off, blindly cropping the central region cannot guarantee a reasonable distribution of objects in the picture.
To address these deficiencies of the related art, the present disclosure provides a method, apparatus, camera, and storage medium for generating a thumbnail. Exemplary embodiments will be described in detail herein, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure; rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
Fig. 3 is a flowchart illustrating a method of generating a thumbnail according to an exemplary embodiment. The method may be used in a device with an image-capturing function, such as a camera, a video camera, or a mobile phone, although the disclosure is not limited in this respect. As shown in Fig. 3, the method includes the following steps:
in step S110, a target image is acquired;
in step S120, performing object detection on the target image, and cropping from the target image according to the detected object to obtain a plurality of cropped images;
in step S130, inputting each of the clipped images into a selection model to obtain an output score of the selection model for each of the clipped images, where the selection model is obtained by training according to an image sample set, where the image sample set includes a plurality of sample pairs, and each sample pair includes an image sample and a score corresponding to the image sample;
in step S140, a target trimming image is determined according to the output score, and the target trimming image is used as a thumbnail of the target image.
Specifically, the score corresponding to an image sample may be used to measure how well the distribution of the main object in the image sample matches a standard distribution. The higher the score, the closer the distribution of the main object is to the standard distribution, i.e., the better the match; the standard distribution may be understood as the optimal distribution state for a thumbnail.
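By way of illustration only, the following minimal sketch shows how steps S130 and S140 can be combined: each candidate cropped image is scored by a scoring function standing in for the trained selection model, and the highest-scoring candidate is kept. The names pick_thumbnail and score_fn are hypothetical and are not part of this disclosure.

```python
from typing import Callable, Sequence

def pick_thumbnail(candidates: Sequence, score_fn: Callable[[object], float]):
    """Return the candidate crop with the highest score (steps S130 and S140)."""
    scores = [score_fn(crop) for crop in candidates]          # S130: score every crop
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best], scores[best]                     # S140: keep the best crop

# usage with placeholder "crops" and a dummy scoring function
if __name__ == "__main__":
    crops = ["crop_1", "crop_2", "crop_3"]
    dummy_scores = {"crop_1": 0.2, "crop_2": 0.9, "crop_3": 0.5}
    best_crop, best_score = pick_thumbnail(crops, dummy_scores.get)
    print(best_crop, best_score)   # -> crop_2 0.9
```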
Optionally, as shown in fig. 4, in step S120, object detection is performed on the target image, and a plurality of cut images are cut from the target image according to the detected object, where a specific process may be as follows:
step S1201: carrying out object detection on the target image, and determining a main body area of the target image according to the detected main object;
step S1202: determining a crop box size from the subject region and the target image;
step S1203: and according to the crop box size, cropping the target image multiple times to obtain a plurality of cropped images.
The object detection may include performing face detection on the target image; and detecting display targets in the target image according to the display requirements of the actual thumbnail, such as plants, animals, buildings and the like.
Optionally, the performing, according to the size of the cropping frame, multiple cropping on the target image to obtain multiple cropped images may include:
determining a plurality of clipping positions of the clipping frame according to the central position of the main body area and the size of the clipping frame, wherein the clipping frame is provided with a standard position point which is coincident with the central position point of the main body area at each clipping position;
and cutting the target image according to the size of the cutting frame at each cutting position to obtain a plurality of cutting images, wherein the size of the cutting frame enables the cutting frame to fully cover the main body area during each cutting.
Specifically, the standard position points of the crop box may each be aligned, in turn, with the center point of the subject region to give the cropping positions of the crop box.
Specifically, the crop box may be divided into a nine-square (3 x 3) grid, and the plurality of standard position points may further include position points uniformly distributed around the central cell of the grid, for example 4 or 8 of them, without being limited thereto. That is, the crop box may have a plurality of standard position points, for example 9 standard points, but is not limited thereto.
For example, as shown in Fig. 5, the position points are numbered sequentially: position point No. 9 lies at the center of the central cell of the nine-square grid, and the eight position points uniformly distributed around it, arranged in order around the central cell, are position points No. 1 through No. 8. A cropping position is obtained by aligning each position point of the crop box in turn with the center point of the subject region. For example, when position point No. 1 of the crop box coincides with the center point of the subject region, a cropping position is obtained, and the target image is cropped at that position to obtain the first cropped image; position point No. 2 is then aligned with the center point of the subject region to obtain another cropping position, at which the target image is cropped to obtain the second cropped image; and so on, until position point No. 9 is aligned with the center point of the subject region and the target image is cropped to obtain the ninth cropped image, giving 9 cropped images in total.
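By way of illustration only, a minimal sketch of these nine standard position points, expressed as (x, y) offsets inside a square crop box of side L, is given below. The placement of points No. 1 to No. 8 at the corners and edge midpoints of the central cell is an assumption consistent with the coordinates discussed with Fig. 6, and the helper name standard_points is hypothetical.

```python
from itertools import product

def standard_points(L: float):
    """Nine candidate (x, y) offsets inside a square crop box of side L."""
    fractions = (1 / 3, 1 / 2, 2 / 3)                  # grid lines bounding the central cell
    points = [(fx * L, fy * L) for fy, fx in product(fractions, fractions)]
    centre = points.pop(4)                             # (L/2, L/2), the centre of the central cell
    return points + [centre]                           # centre listed last, i.e. point No. 9

for number, point in enumerate(standard_points(300), start=1):
    print(number, point)                               # e.g. for a 300-pixel crop box
```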
Optionally, performing object detection on the target image, and determining a main region of the target image according to the detected main object may include:
and determining a smallest enclosing frame of the detected main object set, and taking an image area in the smallest enclosing frame as the main body area.
Alternatively, in an exemplary embodiment, the minimum bounding box may be a rectangular box or a square box, and is not limited thereto.
For example, when the target image includes one subject, the detection frame of the subject is set as the subject region S; when there are a plurality of subjects in the target image, the smallest bounding box of the plurality of subjects is calculated as a subject region S, and the center point position of the subject region is set as a subject center C (a, b).
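By way of illustration only, a minimal sketch of computing this smallest enclosing box and the subject center C(a, b) from a set of detection boxes is shown below; the (x1, y1, x2, y2) box format, the helper name min_bounding_box, and the numeric detections are assumptions.

```python
def min_bounding_box(boxes):
    """Smallest enclosing box of detection boxes given as (x1, y1, x2, y2)."""
    x1 = min(b[0] for b in boxes)
    y1 = min(b[1] for b in boxes)
    x2 = max(b[2] for b in boxes)
    y2 = max(b[3] for b in boxes)
    return x1, y1, x2, y2

detections = [(120, 80, 260, 300), (240, 150, 420, 380)]   # two subject detections
S = min_bounding_box(detections)                           # subject region S
C = ((S[0] + S[2]) / 2, (S[1] + S[3]) / 2)                 # subject center C(a, b)
print(S, C)
```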
Optionally, performing object detection on the target image, and determining a main region of the target image according to the detected main object may include:
when the target image is detected to contain a face, and the area of a face detection frame is larger than a preset threshold value, determining the face in the face detection frame as the main object, wherein the main area comprises an area where the main object is located;
and under the condition that the target image is detected to contain no human face, or the target image contains a human face but the area of the human face detection frame is smaller than the preset threshold value, performing target detection on the target image and determining that an object in the target detection frame belongs to the main object, wherein the main area comprises an area where the main object is located.
The preset threshold may be preset according to the display requirement of the thumbnail, which is not specifically limited in this embodiment.
For example, object detection is performed on the target image. Under the condition that the target image contains a face and the area of a face detection frame is greater than 5% of the area of the target image, all faces in the target image are detected, all the faces in the detection frames are determined to belong to the main object, and the region of the target image containing the faces is taken as the subject region;
and under the condition that the target image contains no face, or contains a face whose detection frame area is less than 5% of the area of the target image, target detection is performed on the target image for animals, plants and the like, all the animals, plants, etc. in the detection frames are determined to belong to the main object, and the region of the target image containing the animals or plants is taken as the subject region.
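By way of illustration only, a minimal sketch of this selection rule with the 5% example threshold follows. It assumes the face and object detection boxes are supplied by external detectors in (x1, y1, x2, y2) form, treats all faces as main objects as soon as one face exceeds the threshold (following the example above), and uses the hypothetical name pick_main_objects.

```python
def pick_main_objects(face_boxes, object_boxes, image_w, image_h, ratio=0.05):
    """Return the detection boxes treated as the main object(s)."""
    def area(box):                                     # box = (x1, y1, x2, y2)
        return max(0, box[2] - box[0]) * max(0, box[3] - box[1])
    image_area = image_w * image_h
    if any(area(b) > ratio * image_area for b in face_boxes):
        return face_boxes                              # faces dominate the picture
    return object_boxes                                # fall back to animals, plants, ...

faces = [(10, 10, 40, 40)]                             # a small face, below the 5% threshold
others = [(100, 100, 500, 400)]                        # e.g. an animal detection
print(pick_main_objects(faces, others, image_w=1000, image_h=800))
```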
Alternatively, in an exemplary embodiment that may be implemented, the crop box may be a square crop box.
Optionally, determining the crop box size according to the size of the subject region and the size of the target image may include:
determining the side length L of the cutting frame by the following calculation formula;
L=min(w,max(2/3*w,2*c));
where w represents a minimum side length of the target image, and c represents a minimum side length of the subject region.
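By way of illustration only, the side-length formula can be computed as follows; crop_side_length is a hypothetical helper name, and the numeric examples are made up.

```python
def crop_side_length(w: float, c: float) -> float:
    """Side length of the square crop box: L = min(w, max(2/3*w, 2*c))."""
    return min(w, max(2.0 / 3.0 * w, 2.0 * c))

print(crop_side_length(w=900, c=200))   # -> 600.0, 2/3 of the shortest image side wins
print(crop_side_length(w=900, c=500))   # -> 900.0, capped at the shortest image side
```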
Optionally, after the cropping frame is determined, a plurality of cropping positions of the cropping frame are determined according to the central position point of the main body area. FIG. 6 is a diagram illustrating a target crop area according to an exemplary embodiment, and the crop area is calculated by sequentially overlapping position points of a crop box, such as numbers 1-9, with the center point of the body area, as shown in FIG. 6.
In this embodiment, the coordinates of the upper-left corner of the cropping area are (x1, y1), the coordinates of the lower-right corner are (x2, y2), the center of the subject region is (a, b), L denotes the side length of the crop box, and w and h denote the width and height of the target image;
when the center point coincides with position point No. 1, the upper-left corner of the crop box is given by x1 = max(a - 1/3*L, 0), y1 = max(b - 1/3*L, 0), and the lower-right corner by x2 = min(w, a + 2/3*L), y2 = min(h, b + 2/3*L). As shown in Fig. 6, the crop box preserves the integrity of the main object, and in the cropped area the main object falls on one of the position points of the nine-square grid (the circular area), which also ensures a reasonable distribution of objects within the cropped area.
When the center point coincides with position point No. 9, the upper-left corner of the crop box is given by x3 = max(a - 1/2*L, 0), y3 = max(b - 1/2*L, 0), and the lower-right corner by x4 = min(w, a + 1/2*L), y4 = min(h, b + 1/2*L). Similarly, the upper-left and lower-right corner coordinates of the crop box can be obtained for the cases in which the center point coincides with position points No. 2 through No. 8, thereby determining the plurality of cropping positions of the crop box.
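By way of illustration only, the corner coordinates above can be computed uniformly by aligning a standard point at offset (px, py) inside the crop box with the subject center (a, b) and clamping the box to an image of width w and height h. The function name crop_rect and the parameterisation by (px, py) are assumptions.

```python
def crop_rect(a, b, px, py, L, w, h):
    """Crop rectangle obtained by aligning the box's point (px, py) with (a, b)."""
    x1 = max(a - px, 0)
    y1 = max(b - py, 0)
    x2 = min(w, a + (L - px))
    y2 = min(h, b + (L - py))
    return x1, y1, x2, y2

# point No. 1 at offset (L/3, L/3) reproduces x1 = max(a - 1/3*L, 0), x2 = min(w, a + 2/3*L)
print(crop_rect(a=500, b=400, px=100, py=100, L=300, w=1200, h=900))
# point No. 9 at offset (L/2, L/2) reproduces x3 = max(a - 1/2*L, 0), x4 = min(w, a + 1/2*L)
print(crop_rect(a=500, b=400, px=150, py=150, L=300, w=1200, h=900))
```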
Optionally, the selection model may be a convolutional neural network, and the training process of the selection model may include:
performing a preset number of training iterations on the convolutional neural network, wherein each iteration performs the following operations:
inputting an image sample into the convolutional neural network, and calculating the predicted score of the image sample output by the output layer of the convolutional neural network;
determining the loss value of the current iteration according to the predicted score of the image sample and the labelled score of the image sample;
and updating the network parameters of the convolutional neural network according to the loss value.
Fig. 7 is a flowchart illustrating a process of training a selection model according to an exemplary embodiment, where as shown in fig. 7, in step S130, the process of training the selection model may include:
step S13011: acquiring an image sample set;
step S13012: inputting the image sample into a convolutional neural network, and calculating a prediction score of the image sample output by an output layer of the convolutional neural network;
step S13013: determining the loss value of the current iteration according to the predicted score of the image sample and the labelled score of the image sample;
step S13014: updating network parameters of the convolutional neural network according to the loss values;
step S13015: and obtaining the trained selection model under the condition that the iteration times reach the preset iteration times.
The preset number of iterations may be set according to the number of images in the image training set, or according to the error between the actual matching degree of an image and the corresponding standard matching degree; the embodiments of the present disclosure do not specifically limit this.
In this embodiment, the convolutional neural network ResNet-34 is taken as an example. First, an image sample set is obtained, which comprises a plurality of sample pairs, each sample pair consisting of an image sample and the score corresponding to that image sample, and the parameters of the ResNet-34 initialization model are initialized;
the image samples are then input into the ResNet-34 initialization model to obtain the predicted score of each image sample from the output layer of the ResNet-34 initialization model;
for each image sample in the training set, the error between its labelled score and its predicted score is calculated, together with the error gradient;
finally, the errors and error gradients of all the image samples are back-propagated through the layers of the ResNet-34 initialization model, the weights of the model are updated according to the errors and error gradients, and the first iteration is completed;
the image samples are then input into the ResNet-34 model obtained after the first iteration for the next iteration;
and the trained selection model is obtained when the number of iterations reaches 200.
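By way of illustration only, a minimal PyTorch sketch of this training procedure is given below. It assumes the selection model is a ResNet-34 regressor with a single score output and that a data loader yields batches of (images, scores); the mean-squared-error loss, the SGD optimiser, and the learning rate are illustrative choices rather than requirements of this disclosure.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet34

model = resnet34(num_classes=1)                  # one output unit: the predicted score
criterion = nn.MSELoss()                         # error between predicted and labelled scores
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

def train(sample_loader, iterations=200):
    """Train the selection model for a preset number of iterations."""
    model.train()
    for step, (images, scores) in zip(range(iterations), sample_loader):
        predicted = model(images).squeeze(1)     # forward pass: predicted scores
        loss = criterion(predicted, scores)      # compare with the labelled scores
        optimizer.zero_grad()
        loss.backward()                          # back-propagate errors and gradients
        optimizer.step()                         # update the network weights
    return model
```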
Optionally, the acquiring the target image may include:
acquiring an image preview request sent by a client;
after determining a target trimming image according to the output score and using the target trimming image as a thumbnail of the target image, the method further comprises:
sending a message to the client for responding to the image preview request, the message including the thumbnail.
For example, after an image preview request sent by a client is obtained, a thumbnail of an image is obtained by the method for generating a thumbnail disclosed in this embodiment, then information including the thumbnail is sent to the client in response to the image preview request, and the client displays the thumbnail in the received information.
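By way of illustration only, a minimal, framework-free sketch of this request/response flow is shown below; handle_preview_request, image_store, and make_thumbnail are hypothetical names, and make_thumbnail stands in for the thumbnail generation method described above.

```python
def handle_preview_request(request: dict, image_store: dict, make_thumbnail) -> dict:
    """Build the reply to an image preview request, including the thumbnail."""
    image = image_store[request["image_id"]]
    thumbnail = make_thumbnail(image)                        # generated as described above
    return {"image_id": request["image_id"], "thumbnail": thumbnail}

# usage with dummy data and a stand-in thumbnail function
reply = handle_preview_request(
    {"image_id": "IMG_0001"},
    {"IMG_0001": "raw-image-bytes"},
    make_thumbnail=lambda img: f"thumb({img})",
)
print(reply)
```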
In the method for generating a thumbnail provided by this embodiment, a plurality of cropped images are obtained by performing object detection on the target image and cropping it according to the detected objects, which ensures the integrity of the main object in the cropped images; then, based on the output scores of the selection model, the cropped image with the highest score among all the cropped images is determined as the target cropped image, which ensures a reasonable distribution of objects in the thumbnail; and the target cropped image is used as the thumbnail of the target image, so that the thumbnail can best display the main object of the target image and highlight the salient region of the target image.
Fig. 8 is a block diagram illustrating an apparatus for generating a thumbnail, which may implement part or all of a device with an image capturing function in software, hardware, or a combination thereof according to an exemplary embodiment, and as shown in fig. 8, an apparatus 800 for generating a thumbnail includes:
an image acquisition module 801 configured to acquire a target image;
a processing module 802 configured to perform object detection on the target image and crop a plurality of cropped images from the target image according to the detected object;
a first executing module 803, configured to input each of the cropped images into a selection model, to obtain an output score of the selection model for each of the cropped images, where the selection model is obtained by training according to an image sample set, where the image sample set includes a plurality of sample pairs, and each sample pair includes an image sample and a score corresponding to the image sample;
a second executing module 804 configured to determine a target trimming image according to the output score, and to use the target trimming image as a thumbnail of the target image.
The device for generating a thumbnail obtains a plurality of cropped images by performing object detection on the target image and cropping it according to the detected objects, which ensures the integrity of the main object in the cropped images; then, based on the output scores of the selection model, it determines the cropped image with the highest score among all the cropped images as the target cropped image, which ensures a reasonable distribution of objects in the thumbnail; and it uses the target cropped image as the thumbnail of the target image, so that the thumbnail can best display the main object of the target image and highlight the salient region of the target image.
Optionally, the processing module 802 may perform object detection on the target image, and determine a main area of the target image according to the detected main object;
determining a crop box size from the subject region and the target image;
and according to the size of the cropping frame, performing multiple cropping on the target image to obtain multiple cropped images.
Optionally, the processing module 802 may further determine a smallest bounding box of the detected main object set, and use an image area in the smallest bounding box as the main area.
Optionally, the processing module 802 may specifically determine a plurality of clipping positions of the clipping frame according to the center position of the main body area and the size of the clipping frame, where the clipping frame has a standard position point coinciding with the center position point of the main body area at each of the clipping positions;
and cutting the target image according to the size of the cutting frame at each cutting position to obtain a plurality of cutting images, wherein the size of the cutting frame enables the cutting frame to fully cover the main body area during each cutting.
Optionally, the processing module 802 may specifically determine, when the target image is detected to include a face, and the area of the face detection frame is greater than a preset threshold, that the face in the face detection frame is the main object, where the main area includes an area where the main object is located;
and, under the condition that the target image is detected to contain no human face, or the target image contains a human face but the area of the human face detection frame is smaller than the preset threshold value, perform target detection on the target image and determine that an object in the target detection frame belongs to the main object, wherein the main area comprises an area where the main object is located.
Optionally, the first execution module 803 may obtain an image sample set; input the image samples into a convolutional neural network and calculate the predicted score of each image sample output by the output layer of the convolutional neural network; determine the loss value of the current iteration according to the predicted score of the image sample and the labelled score of the image sample; and update the network parameters of the convolutional neural network according to the loss value, obtaining the trained selection model when the number of iterations reaches the preset number of iterations.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
The camera comprises the above device for generating a thumbnail. Through this device, the camera obtains a plurality of cropped images by performing object detection on the target image and cropping it according to the detected objects, which ensures the integrity of the main object in the cropped images; then, based on the output scores of the selection model, it determines the cropped image with the highest score among all the cropped images as the target cropped image, which ensures a reasonable distribution of objects in the thumbnail; and it uses the target cropped image as the thumbnail of the target image, so that the thumbnail can best display the main object of the target image and highlight the salient region of the target image.
With regard to the apparatus for generating a thumbnail image included in the camera in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be described in detail here.
The present disclosure also provides a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the steps of the method of generating thumbnails provided by the present disclosure.
Specifically, the computer-readable storage medium may be a flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, a server, etc.
With regard to the computer-readable storage medium in the above-described embodiments, the method steps when the computer program stored thereon is executed will be described in detail in relation to the embodiments of the method, and will not be elaborated upon here.
The present disclosure also provides an apparatus for generating a thumbnail, which may be a computer, a platform device, or the like, the apparatus for generating a thumbnail including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor, when executing the instructions stored in the memory, may perform the steps of the method of generating thumbnails provided by the present disclosure.
The apparatus for generating a thumbnail obtains a plurality of cropped images by performing object detection on the target image and cropping it according to the detected objects, thereby ensuring the integrity of the main object in the cropped images; then, based on the output scores of the selection model, it determines the cropped image with the highest score among all the cropped images as the target cropped image, ensuring a reasonable distribution of objects in the thumbnail; and it uses the target cropped image as the thumbnail of the target image, so that the thumbnail can best display the main object of the target image and highlight the salient region of the target image.
Fig. 9 is a block diagram illustrating an apparatus 900 for generating a thumbnail according to an exemplary embodiment. As shown in fig. 9, the apparatus 900 for generating a thumbnail may include one or more of the following components: a processing component 902, a memory 904, a power component 906, a multimedia component 908, an audio component 910, an input/output (I/O) interface 912, a sensor component 914, and a communication component 916.
The processing component 902 generally controls overall operations of the apparatus 900, such as image capture operations, photographing operations, and the like. Processing component 902 may include one or more processors 920 to execute instructions to perform all or a portion of the steps of the above-described method of generating thumbnails. Further, processing component 902 can include one or more modules that facilitate interaction between processing component 902 and other components. For example, the processing component 902 can include a multimedia module to facilitate interaction between the multimedia component 908 and the processing component 902.
The memory 904 is configured to store various types of data to support operation at the apparatus 900. Examples of such data include instructions for any application or method operating on the device 900, characteristic information of the object (e.g., face information, characteristic information of a building, category information of an animal, etc.), preset thresholds of ratios, etc. The memory 904 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
Power component 906 provides power to the various components of device 900. The power components 906 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the device 900.
The multimedia component 908 comprises a screen providing an output interface between the device 900 and a user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, it may be implemented as a touch screen that presents the thumbnails. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel, for example to switch between the thumbnails shown on the touch screen. The touch sensors may not only sense the boundary of a touch or swipe action, but also detect the duration and pressure associated with the touch or swipe operation. The front camera and/or the rear camera may receive external multimedia data when the device 900 is in an operating mode, such as a shooting mode or a video mode. Each front or rear camera may be a fixed optical lens system or have focus and optical zoom capability.
I/O interface 912 provides an interface between processing component 902 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a start button, and a lock button.
The sensor component 914 includes one or more sensors for providing status assessments of various aspects of the apparatus 900. For example, the sensor component 914 can detect the open/closed state of the device 900, the relative positioning of its components, and so on.
The communication component 916 is configured to facilitate communications between the apparatus 900 and other devices in a wired or wireless manner. The apparatus 900 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 916 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 900 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described method of generating thumbnails.
According to the method, device, camera, and storage medium for generating a thumbnail described above, object detection is performed on the target image and a plurality of cropped images are obtained by cropping the target image according to the detected objects, so that a picture of the required size can be cropped adaptively and the integrity of the main object in the image is guaranteed; the image with the highest score among all the cropped images is then determined as the target cropped image and used as the thumbnail of the target image, so that the salient region of the target image is highlighted, the main object of the target image is displayed completely, a reasonable distribution of objects in the image is ensured, and the thumbnail presents the main object of the target image in the best possible way.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (11)

1. A method of generating a thumbnail, the method comprising:
acquiring a target image;
carrying out object detection on the target image, and cutting the target image according to the detected object to obtain a plurality of cut images;
inputting each of the cut images into a selection model to obtain an output score of the selection model for each of the cut images, wherein the selection model is obtained by training according to an image sample set, the image sample set comprises a plurality of sample pairs, and each sample pair comprises an image sample and a score corresponding to the image sample;
and determining a target cutting image according to the output score, and taking the target cutting image as a thumbnail of the target image.
2. The method according to claim 1, wherein the performing object detection on the target image and cropping a plurality of cropped images from the target image according to the detected object comprises:
carrying out object detection on the target image, and determining a main body area of the target image according to the detected main object;
determining a crop box size from the subject region and the target image;
and according to the size of the cropping frame, performing multiple cropping on the target image to obtain multiple cropped images.
3. The method of claim 2, wherein the performing object detection on the target image and determining the subject region of the target image according to the detected main object comprises:
and determining a smallest enclosing frame of the detected main object set, and taking an image area in the smallest enclosing frame as the main body area.
4. The method according to claim 2, wherein the cropping the target image a plurality of times according to the cropping frame size to obtain the plurality of cropped images comprises:
determining a plurality of clipping positions of the clipping frame according to the central position of the main body area and the size of the clipping frame, wherein the clipping frame is provided with a standard position point which is coincident with the central position point of the main body area at each clipping position;
and cutting the target image according to the size of the cutting frame at each cutting position to obtain a plurality of cutting images, wherein the size of the cutting frame enables the cutting frame to fully cover the main body area during each cutting.
5. The method according to claim 3, wherein the performing object detection on the target image and determining the subject region of the target image according to the detected main object comprises:
when the target image is detected to contain a face, and the area of a face detection frame is larger than a preset threshold value, determining the face in the face detection frame as the main object, wherein the main area comprises an area where the main object is located;
and under the condition that the target image is detected to contain no human face, or the target image contains a human face but the area of the human face detection frame is smaller than the preset threshold value, performing target detection on the target image and determining that an object in the target detection frame belongs to the main object, wherein the main area comprises an area where the main object is located.
6. The method of claim 4, wherein the cropping frame is a square cropping frame, and wherein determining the size of the cropping frame based on the size of the subject area and the size of the target image comprises:
determining the side length L of the cutting frame by the following calculation formula;
L=min(w,max(2/3*w,2*c));
where w represents a minimum side length of the target image, and c represents a minimum side length of the subject region.
7. The method of any of claims 1-4, wherein the acquiring the target image comprises:
acquiring an image preview request sent by a client;
after determining a target trimming image according to the output score and using the target trimming image as a thumbnail of the target image, the method further comprises:
sending a message to the client for responding to the image preview request, the message including the thumbnail.
8. An apparatus for generating a thumbnail, the apparatus comprising:
an image acquisition module configured to acquire a target image;
the processing module is configured to perform object detection on the target image and cut the target image into a plurality of cut images according to the detected object;
the first execution module is configured to input each of the clipped images into a selection model to obtain an output score of the selection model for each of the clipped images, wherein the selection model is obtained by training according to an image sample set, the image sample set comprises a plurality of sample pairs, and each sample pair comprises an image sample and a score corresponding to the image sample;
and the second execution module is configured to determine a target cutting image according to the output score and take the target cutting image as a thumbnail of the target image.
9. A camera characterized in that it comprises a device for generating thumbnails as claimed in claim 8.
10. A computer-readable storage medium having computer program instructions stored thereon, which when executed by a processor implement the method of generating a thumbnail as claimed in any one of claims 1-7.
11. An apparatus for generating a thumbnail, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor, when executing the instructions stored in the memory, may implement the method of generating a thumbnail as recited in any of claims 1-7.
CN202010906455.5A 2020-09-01 2020-09-01 Method and device for generating thumbnail, camera and storage medium Pending CN112308859A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010906455.5A CN112308859A (en) 2020-09-01 2020-09-01 Method and device for generating thumbnail, camera and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010906455.5A CN112308859A (en) 2020-09-01 2020-09-01 Method and device for generating thumbnail, camera and storage medium

Publications (1)

Publication Number Publication Date
CN112308859A true CN112308859A (en) 2021-02-02

Family

ID=74483891

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010906455.5A Pending CN112308859A (en) 2020-09-01 2020-09-01 Method and device for generating thumbnail, camera and storage medium

Country Status (1)

Country Link
CN (1) CN112308859A (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014146561A1 (en) * 2013-03-21 2014-09-25 腾讯科技(深圳)有限公司 Thumbnail generating method and system
CN109978858A (en) * 2019-03-27 2019-07-05 华南理工大学 A kind of double frame thumbnail image quality evaluating methods based on foreground detection
CN110136142A (en) * 2019-04-26 2019-08-16 微梦创科网络科技(中国)有限公司 A kind of image cropping method, apparatus, electronic equipment
CN110909724A (en) * 2019-10-08 2020-03-24 华北电力大学 Multi-target image thumbnail generation method
CN110795925A (en) * 2019-10-12 2020-02-14 腾讯科技(深圳)有限公司 Image-text typesetting method based on artificial intelligence, image-text typesetting device and electronic equipment

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115033154A (en) * 2021-02-23 2022-09-09 北京小米移动软件有限公司 Thumbnail generation method, thumbnail generation device and storage medium
CN112818161A (en) * 2021-02-24 2021-05-18 西安博达软件股份有限公司 Method for identifying original image by merging media asset library thumbnail based on deep learning
CN114972369A (en) * 2021-02-26 2022-08-30 北京小米移动软件有限公司 Image processing method, device and storage medium
CN115082673A (en) * 2022-06-14 2022-09-20 阿里巴巴(中国)有限公司 Image processing method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
US9953506B2 (en) Alarming method and device
CN106937045B (en) Display method of preview image, terminal equipment and computer storage medium
US9554030B2 (en) Mobile device image acquisition using objects of interest recognition
KR102349428B1 (en) Method for processing image and electronic device supporting the same
CN112308859A (en) Method and device for generating thumbnail, camera and storage medium
US9756261B2 (en) Method for synthesizing images and electronic device thereof
US9412017B1 (en) Methods systems and computer program products for motion initiated document capture
CN112135046B (en) Video shooting method, video shooting device and electronic equipment
US10430456B2 (en) Automatic grouping based handling of similar photos
US20190109981A1 (en) Guided image composition on mobile devices
CN108346171B (en) Image processing method, device, equipment and computer storage medium
CN112954210B (en) Photographing method and device, electronic equipment and medium
CN105874776A (en) Image processing apparatus and method
CN107360375B (en) Shooting method and mobile terminal
CN112437232A (en) Shooting method, shooting device, electronic equipment and readable storage medium
CN111669495B (en) Photographing method, photographing device and electronic equipment
CN112312016A (en) Shooting processing method and device, electronic equipment and readable storage medium
CN111080571A (en) Camera shielding state detection method and device, terminal and storage medium
WO2015196681A1 (en) Picture processing method and electronic device
CN114430460A (en) Shooting method and device and electronic equipment
CN112330728A (en) Image processing method, image processing device, electronic equipment and readable storage medium
CN112188108A (en) Photographing method, terminal, and computer-readable storage medium
WO2016188199A1 (en) Method and device for clipping pictures
CN115514887A (en) Control method and device for video acquisition, computer equipment and storage medium
CN112383708B (en) Shooting method and device, electronic equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination