CN117666897A - Image display method, device, electronic equipment and readable storage medium - Google Patents


Info

Publication number
CN117666897A
CN117666897A (application CN202311610690.8A)
Authority
CN
China
Prior art keywords
image
images
cutting
aesthetic score
outputting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311610690.8A
Other languages
Chinese (zh)
Inventor
程林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Vivo Mobile Communication Co Ltd
Original Assignee
Vivo Mobile Communication Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Vivo Mobile Communication Co Ltd filed Critical Vivo Mobile Communication Co Ltd
Priority to CN202311610690.8A priority Critical patent/CN117666897A/en
Publication of CN117666897A publication Critical patent/CN117666897A/en
Pending legal-status Critical Current

Landscapes

  • Image Processing (AREA)

Abstract

The application discloses an image display method, an image display apparatus, an electronic device, and a readable storage medium, belonging to the technical field of artificial intelligence. The method includes: receiving a first input setting a first scale and a first tag; in response to the first input, obtaining at least one first image associated with the first tag; outputting, by a first model, at least one second image cropped from the first image based on the first scale, together with an aesthetic score of the at least one second image; and displaying a third image, which is the second image with the highest aesthetic score among the at least one second image.

Description

Image display method, device, electronic equipment and readable storage medium
Technical Field
The application belongs to the technical field of artificial intelligence, and particularly relates to an image display method, an image display device, electronic equipment and a readable storage medium.
Background
At present, the album application is one of the most frequently used applications on an electronic device. It records everything the user has photographed, making it an application filled with memories and emotional warmth.
In the prior art, an important function for letting users revisit past moments is the album pick recommendation function. With this function, a widget is displayed on the desktop interface, and selected images from the album application are shown in the widget. On the one hand, as soon as the user enters the desktop interface, the images can evoke the user's memories and trigger emotional resonance; on the other hand, the album pick recommendation function also enriches the layout of the desktop interface, so that the interface is no longer used only for displaying application icons and becomes more attractive and personalized.
In an application scenario of the album pick recommendation function, the user needs to manually set the images displayed in the widget, including selecting and cropping the images, which makes the user's operation cumbersome.
Disclosure of Invention
The embodiments of the present application aim to provide an image display method that can solve the problem of cumbersome user operation when setting the image displayed in a widget on the desktop interface.
In a first aspect, an embodiment of the present application provides an image display method, including: receiving a first input setting a first scale and a first tag; in response to the first input, obtaining at least one first image associated with the first tag; outputting, by a first model, at least one second image cropped from the first image based on the first scale, together with an aesthetic score of the at least one second image; and displaying a third image, which is the second image with the highest aesthetic score among the at least one second image.
In a second aspect, an embodiment of the present application provides an image display apparatus, including: a receiving module for receiving a first input setting a first scale and a first tag; an acquisition module for acquiring, in response to the first input, at least one first image associated with the first tag; an output module for outputting, by a first model, at least one second image obtained by cropping the first image based on the first scale, together with an aesthetic score of the at least one second image; and a display module for displaying a third image, which is the second image with the highest aesthetic score among the at least one second image.
In a third aspect, embodiments of the present application provide an electronic device comprising a processor and a memory storing a program or instructions executable on the processor, which when executed by the processor, implement the steps of the method as described in the first aspect.
In a fourth aspect, embodiments of the present application provide a readable storage medium having stored thereon a program or instructions which when executed by a processor implement the steps of the method according to the first aspect.
In a fifth aspect, embodiments of the present application provide a chip, where the chip includes a processor and a communication interface, where the communication interface is coupled to the processor, and where the processor is configured to execute a program or instructions to implement a method according to the first aspect.
In a sixth aspect, embodiments of the present application provide a computer program product stored in a storage medium, the program product being executable by at least one processor to implement the method according to the first aspect.
In the embodiments of the present application, for an image display component such as a desktop pick image component, the user sets a first scale through a first input, the first scale defining the shape of the widget used for displaying an image, and sets a first tag through the first input, the first tag defining the candidate image set for display in the widget. In response to the first input, the system automatically obtains at least one first image associated with the first tag as the candidate image set and inputs it into the first model; the first model generates crop boxes with the first scale to intelligently crop the first images, so that cropping one first image yields at least one second image, and the first model also calculates an aesthetic score for each second image. Further, for each first image, the second image with the highest aesthetic score is selected as the third image and displayed in the widget, so that the content of the first image is expressed while the appearance remains attractive. Therefore, in the embodiments of the present application, the user neither needs to manually screen a large number of images nor manually crop the screened images; instead, with only a simple operation by the user, the electronic device intelligently screens the images and intelligently crops them to fit the widget, thereby simplifying user operation.
Drawings
FIG. 1 is a flowchart of an image display method according to an embodiment of the present application;
FIG. 2 is a first schematic diagram of the display of an electronic device according to an embodiment of the present application;
FIG. 3 is a second schematic diagram of the display of an electronic device according to an embodiment of the present application;
FIG. 4 is a third schematic diagram of the display of an electronic device according to an embodiment of the present application;
FIG. 5 is a fourth schematic diagram of the display of an electronic device according to an embodiment of the present application;
FIG. 6 is a first illustrative diagram of an image display method according to an embodiment of the present application;
FIG. 7 is a second illustrative diagram of an image display method according to an embodiment of the present application;
FIG. 8 is a third illustrative diagram of an image display method according to an embodiment of the present application;
FIG. 9 is a fourth illustrative diagram of an image display method according to an embodiment of the present application;
FIG. 10 is a fifth illustrative diagram of an image display method according to an embodiment of the present application;
FIG. 11 is a block diagram of an image display apparatus according to an embodiment of the present application;
FIG. 12 is a first schematic diagram of the hardware structure of an electronic device according to an embodiment of the present application;
FIG. 13 is a second schematic diagram of the hardware structure of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions of the embodiments of the present application will be clearly described below with reference to the accompanying drawings. It is apparent that the described embodiments are some, but not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application fall within the scope of protection of the present application.
The terms "first", "second", and the like in the description and claims are used to distinguish between similar objects and not necessarily to describe a particular sequence or chronological order. It should be understood that the terms so used are interchangeable under appropriate circumstances, so that the embodiments of the application can operate in sequences other than those illustrated or described herein. Objects identified by "first", "second", etc. are generally of one type, and the number of objects is not limited; for example, the first object may be one or more. Furthermore, in the description and claims, "and/or" means at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the associated objects.
The image display method provided by the embodiment of the application is described in detail below by means of specific embodiments and application scenes thereof with reference to the accompanying drawings.
As shown in fig. 1, which is a flowchart of an image display method according to an embodiment of the present application, the method is applied to an electronic device by way of example and includes:
step 110: a first input to a first scale and a first label is received.
In some embodiments of the present application, the first input is used to set the first scale and the first tag for a desktop pick image component, for example on a settings page of the desktop pick image component; the first input may also be referred to as a first operation. Illustratively, the first input includes, but is not limited to: a touch input applied by the user, through a finger, a stylus, or another touch device, to a control or other area on the screen; a voice instruction input by the user; a specific gesture input by the user; or other feasible inputs, which are not limited in the embodiments of the present application. The specific gesture in the embodiments of the present application may be any one of a single-tap gesture, a slide gesture, a drag gesture, a pressure-recognition gesture, a long-press gesture, an area-change gesture, a double-press gesture, and a double-tap gesture; the tap input in the embodiments of the present application may be a single tap, a double tap, or any number of taps, and may also be a long-press or short-press input.
Optionally, referring to FIG. 2, the desktop pick image component is displayed in the desktop interface in the form of a widget 201, with application icons, such as icon 202, displayed around the widget 201.
For example, the first input may be as follows. On the settings page of the desktop pick image component, shown in fig. 3, the user taps the "add component" option 301, so that the page shown in fig. 4 is displayed. On the page shown in fig. 4, a plurality of box symbols are uniformly distributed. The user taps one box symbol, for example box symbol 401, and drags it to another box symbol, for example box symbol 402. Box symbols 401 and 402 then form two opposite corners of an area, namely the user-defined display area of the widget of the desktop pick image component, and the ratio between the side length of this area in the vertical direction and its side length in the horizontal direction is the first scale.
For example, referring to fig. 5, the first scale determined based on the first input may be "1:2", "2:1", or "1:1".
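Deriving the first scale from the two corner box symbols can be sketched as below; this is a minimal illustration assuming (column, row) grid coordinates for the box symbols, with all names hypothetical:

```python
from math import gcd

def first_scale(corner_a, corner_b):
    # Two opposite-corner grid positions (col, row) chosen by the drag
    # gesture span a rectangular area; the first scale is the reduced
    # ratio of its vertical side length to its horizontal side length.
    w = abs(corner_b[0] - corner_a[0]) + 1
    h = abs(corner_b[1] - corner_a[1]) + 1
    g = gcd(h, w)
    return (h // g, w // g)
```

For instance, dragging from grid position (0, 0) to (3, 1) spans a 2-row by 4-column area, giving the scale 1:2.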
Optionally, the user may set a plurality of desktop pick image components. For example, referring to FIG. 5, the user has created three widgets of the desktop pick image component simultaneously.
Further, the first input may be as follows: referring to fig. 5, the user double-taps any one of the widgets, an edit box is displayed below that widget, and the user can enter text, an image, or the like in the edit box as the first tag.
For example, referring to FIG. 5, the first tag entered by the user for the first widget 501 is the word "family", the first tag entered for the second widget 502 is a face image, and the first tag entered for the third widget 503 is the word "baby".
Step 120: at least one first image associated with the first tag is acquired in response to the first input.
Optionally, the first image is a still image, such as a photograph, or a moving image, such as a video. In the case of a video, key frames of the video can be extracted for processing.
In some embodiments of the present application, tags are automatically added to all images in the album application according to the content captured in each image. A tag may be added as text, for example: sea, cat, dog, child, pet, landscape, a specific name, or a specific designation; a tag may also be added as the features of a face image. Images with identical or similar tags are then clustered based on these tags. Thus, in this step, when the user enters the first tag, the image tags in the album application are matched to find all images having the first tag, or having tags similar to the first tag, as the first images.
For example, if the first tag is the word "mom", then, based on the "mom" album manually created by the user, the text tag "mom" is automatically added to all images in that album, so that all images with the text tag "mom" can be matched in this step.
As another example, if the first tag is a face image, then all images containing the same facial features may be matched in this step.
Thus, in this step, when the similarity between an image's tag and the first tag is greater than a certain threshold, that image is also considered to be associated with the first tag and is included in the at least one first image.
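The association test described above can be sketched as a threshold on tag similarity; the similarity function and the threshold value here are assumptions for illustration:

```python
def images_for_tag(image_tags, first_tag, similarity, threshold=0.8):
    # image_tags maps an image identifier to its automatically added
    # tags; an image is associated with the first tag when any of its
    # tags is similar enough to the first tag.
    return [img for img, tags in sorted(image_tags.items())
            if any(similarity(t, first_tag) > threshold for t in tags)]
```

In practice the similarity would come from a text embedding or, for a face-image tag, a facial-feature distance; an exact-match similarity is used below only to exercise the filter.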
Step 130: outputting, by the first model, at least one second image that is cropped from the first image based on the first scale, and outputting an aesthetic score of the at least one second image.
Optionally, a crop box is generated based on the first scale, and any one of the first images is cropped using the crop box; since the crop box can be placed at different positions and generated in different sizes, at least one second image can be obtained through at least one cropping operation.
The ratio between the two side lengths of each crop box is the first scale, but different crop boxes have different sizes. For example, if the first scale is 1:2, one crop box may have the size 2:4 and another the size 3:6; the latter crop box is larger than the former.
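Enumerating crop-box sizes that all share the first scale might look like the sketch below; the 5% shrink step is an assumption, and the 70% floor anticipates the long-side constraint discussed later in the description:

```python
def crop_box_sizes(first_ratio, img_h, img_w, min_frac=0.7, step=0.05):
    # first_ratio is (height, width); start from the largest box that
    # fits the image and shrink by 5% per size, stopping once the long
    # side would fall below 70% of the matching image side.
    rh, rw = first_ratio
    s = min(img_h / rh, img_w / rw)          # largest box that fits
    long_is_h = rh >= rw
    limit = min_frac * (img_h if long_is_h else img_w)
    sizes = []
    while (rh if long_is_h else rw) * s >= limit:
        sizes.append((round(rh * s), round(rw * s)))
        s *= 1 - step
    return sizes
```

For simplicity this sketch compares the crop box's long side against the image side of the same orientation; the text below also covers the case where the crop box's long side corresponds to the image's short side.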
Further, the aesthetic score of each second image is calculated by the first model based on the cropped at least one second image.
Optionally, the first model provides an aesthetic scoring scheme combined with cropping. This scheme takes into account the influence of the way an image is cropped on the aesthetics of its composition, and avoids the complete separation of image cropping and aesthetic scoring found in the prior art. On the one hand, associating cropping with aesthetic scoring ensures that the widget displays the most suitable image; on the other hand, merging the two models that would otherwise implement the two schemes into a single model reduces the higher power consumption and running time caused by running multiple models.
For example, in the prior art, a first image may be input into an aesthetic scoring model, which outputs an aesthetic score for the first image, and simultaneously input into an intelligent cropping model, which outputs a cropped version of the first image. In the present application, the first image is input into an intelligent scoring-and-cropping model, i.e., the first model, which outputs the cropped image together with the aesthetic score of that cropped image.
In the first model, the aesthetic score is calculated based on the feature information of an image: the higher the aesthetic quality of the image, the higher its aesthetic score.
Step 140: displaying a third image, which is the second image with the highest aesthetic score among the at least one second image.
In this step, for any one first image, after at least one second image is obtained, the second image with the highest aesthetic score is selected as the third image, which can then be displayed in the widget of the currently customized desktop pick image component.
Optionally, when there are a plurality of first images, the third images cropped from the plurality of first images may be displayed in the widget as a carousel.
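Selecting the third image for each first image is a simple argmax over aesthetic scores; the data layout here is illustrative:

```python
def pick_third_images(scored_crops):
    # scored_crops maps a first-image identifier to the (crop, score)
    # pairs produced by the first model; the third image is the crop
    # with the highest aesthetic score.
    return {img: max(crops, key=lambda c: c[1])[0]
            for img, crops in scored_crops.items()}
```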
In further embodiments, the third image may also be displayed in a window of another image display component.
In the embodiments of the present application, for an image display component such as a desktop pick image component, the user sets a first scale through a first input, the first scale defining the shape of the widget used for displaying an image, and sets a first tag through the first input, the first tag defining the candidate image set for display in the widget. In response to the first input, the system automatically obtains at least one first image associated with the first tag as the candidate image set and inputs it into the first model; the first model generates crop boxes with the first scale to intelligently crop the first images, so that cropping one first image yields at least one second image, and the first model also calculates an aesthetic score for each second image. Further, for each first image, the second image with the highest aesthetic score is selected as the third image and displayed in the widget, so that the content of the first image is expressed while the appearance remains attractive. Therefore, in the embodiments of the present application, the user neither needs to manually screen a large number of images nor manually crop the screened images; instead, with only a simple operation by the user, the electronic device intelligently screens the images and intelligently crops them to fit the widget, thereby simplifying user operation.
In the flow of the image display method according to another embodiment of the present application, step 130 includes:
substep A1: at least two crop boxes are determined according to the first scale.
The at least two crop boxes have different sizes, and the long-side value of each crop box is greater than a preset value, the preset value being a first percentage of the corresponding side value of the first image.
In the present embodiment, the candidate image set composed of at least one first image is denoted as Image_list, the number of first images in the candidate image set is denoted as M, and the first scale is denoted as ratio.
For example, referring to fig. 6, taking a first scale of 1:1 as an example, the first model is split into two stages, namely a basic-network inference stage and a feature-score inference stage. For each first image, the basic-network inference stage is run only once, while the feature-score inference stage is run multiple times. This is described in detail below.
First, in the basic-network inference stage, a first image is taken as the input image. A resizing operation is performed on the input image: its long side is resized to 1024 while the aspect ratio is kept unchanged. The resized first image is then fed into the network to obtain the corresponding feature space, denoted Image_Feature; this inference needs to be performed only once, and the long side of Image_Feature is 128 (= 1024/8). Subsequent cropping and aesthetic scoring are both based on this Image_Feature. In fig. 6, W represents the width of Image_Feature and H represents its height.
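The feature-space geometry of the basic-network stage follows directly from the numbers above (long side resized to 1024, downsampling factor 8):

```python
def feature_size(img_h, img_w, long_side=1024, down=8):
    # The long side of the input image is resized to 1024 with the
    # aspect ratio preserved; the backbone then downsamples by 8,
    # so a 1024-pixel side maps to 128 feature values.
    scale = long_side / max(img_h, img_w)
    return (round(img_h * scale / down), round(img_w * scale / down))
```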
Then, in the feature-score inference stage, candidate boxes are generated and scores are inferred; the specific candidate-box generation scheme adopts a progressive decision principle. For example, with a first scale of 1:1, at least two crop boxes are generated: the first generated crop box may have the size 6:6 and the second the size 5:5; crop boxes of further sizes can be generated, and the generated crop boxes may become smaller and smaller. Alternatively, the generated crop boxes may become larger and larger.
The long-side value of a crop box needs to be greater than a first percentage, e.g., 70%, of the long-side or short-side value of the corresponding first image. This ensures that the crop box is large enough, so that the cropped image content shows as much of the original image content as possible, avoiding the situation where over-cropping prevents the user from accurately recognizing the original image content or telling which image it is.
For example, if the long side of the crop box corresponds to the long side of the first image, the long-side value of the crop box needs to be greater than 70% of the long-side value of the first image; if the long side of the crop box corresponds to the short side of the first image, the long-side value of the crop box needs to be greater than 70% of the short-side value of the first image.
Substep A2: for each crop box, the first image is cropped using that crop box, and a set of second images is output.
In this step, taking one crop box as an example: when the image is cropped, the crop box is placed at different positions in the first image, so that a plurality of images, i.e., second images, can be cropped out.
Cropping the first image with one crop box thus yields one set of second images.
Wherein at least two sets of second images are output by the first model.
Substep A3: an aesthetic score of the second image is calculated and output.
In this step, for each second image to be output, an aesthetic score is calculated from the image feature information of that second image.
In this embodiment, for one first image and the first scale, a plurality of crop boxes are generated by the first model; one crop box crops out one set of second images, so a plurality of sets of second images can be cropped out. An aesthetic score is then calculated for each second image to find the second image with the highest aesthetic score. It can be seen that this embodiment considers the influence of crop boxes of different sizes on the image's appearance and crops the image with a plurality of crop boxes, thereby ensuring that the finally displayed cropped image is the optimal cropping result.
In the flow of the image display method according to another embodiment of the present application, step A2 includes:
substep B1: and according to the first step, the first image is cut at least once by utilizing the cutting frame, and at least one first sub-image is output.
Substep B2: an aesthetic score of the at least one first sub-image is calculated.
Referring to fig. 7, a large stride, denoted S_large (for example S_large = 8), is used to slide a uniform window over Image_Feature: the crop box is applied once every 8 feature values, each application cropping out one first sub-image, so that H_l first sub-images can be obtained in total. For each first sub-image, its image feature information is obtained, and H_l aesthetic scores are produced through the feature-score inference stage.
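The uniform windowing of substep B1 can be sketched as generating crop positions at a fixed stride along one axis of Image_Feature; the one-dimensional simplification is for illustration only:

```python
def window_positions(feat_len, box_len, stride):
    # Start offsets of a crop box of length box_len slid over a
    # feature axis of length feat_len, one crop every `stride`
    # feature values (S_large = 8 in the first pass).
    return list(range(0, feat_len - box_len + 1, stride))
```

With a 128-long feature axis and a 64-long box at stride 8, this yields 9 candidate start positions per axis.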
Substep B3: determining, among the at least one first sub-image, the expanded regions based on the regions, in the first image, of the N1 first sub-images with the highest aesthetic scores, where N1 is a positive integer.
In this step, the H_l aesthetic scores are sorted, and the first sub-images corresponding to the N1 highest aesthetic scores are selected. The region of each such first sub-image in the first image is then determined and appropriately expanded, finally yielding the expanded regions. Here H_l > N1; for example, N1 = 3.
If the N1 expanded regions corresponding to the N1 first sub-images can be merged into one region, the window is slid within the merged region in the next step; if the N1 expanded regions are scattered, i.e., cannot be merged into one region, the window is slid within multiple regions in the next step.
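Expanding and merging the top-N1 regions before the finer pass might be sketched as follows; the margin size and the (x0, y0, x1, y1) box format are assumptions:

```python
def expand_region(box, margin, feat_h, feat_w):
    # Grow a top-scoring sub-image region (x0, y0, x1, y1) by `margin`
    # feature values on every side, clamped to the feature map.
    x0, y0, x1, y1 = box
    return (max(0, x0 - margin), max(0, y0 - margin),
            min(feat_w, x1 + margin), min(feat_h, y1 + margin))

def can_merge(a, b):
    # Two expanded regions can be combined into one search region
    # for the next round when they overlap or touch.
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]
```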
Substep B4: the expanded regions are cropped at least once using the crop box according to a second stride, and at least one second sub-image is output, where the first stride is greater than the second stride.
In this step, referring to fig. 7, a middle stride, denoted S_mid = 4, is used to slide a uniform window over Image_Feature: the crop box is applied once every 4 feature values, each application cropping out one second sub-image, so that H_m second sub-images can be obtained in total. This step differs from step B1: step B1 crops over the entire area of the first image, whereas this step crops over the expanded regions of the N1 first sub-images.
Substep B5: a set of second images is output based on the at least one second sub-image.
Optionally, in order to make the cropping finer, a small stride, denoted S_min (S_min = 2), is used next on Image_Feature: the crop box is applied once every 2 feature values, each application cropping out one third sub-image, so that H_s third sub-images can be obtained in total. This cropping differs from step B4: step B4 crops over the expanded regions of the N1 first sub-images, whereas this cropping is performed over the expanded regions of the N5 second sub-images. For example, N5 = 3.
Optionally, multiple rounds of cropping may be repeated according to the above steps, where each round crops within a smaller image-area range than the previous round and with a smaller stride, until the last round of cropping is completed; a set of second images is then obtained, from which the second image with the highest aesthetic score is selected.
For example, the H_s third sub-images are taken as the set of second images.
The resulting second image with the highest aesthetic score is represented as (i_pointx, i_pointy, i_score), where (i_pointx, i_pointy) represents the position coordinates of the crop box in the first image and i_score represents the corresponding aesthetic score.
In this embodiment, cropping multiple times with one crop box yields one set of second images, from which the one second image with the highest aesthetic score is found. On this basis, using a plurality of crop boxes yields a plurality of corresponding highest-scoring second images, and among these the single second image with the highest aesthetic score is finally selected as the third image.
Further, a third image corresponding to each first image in Image_list is obtained. According to the aesthetic scores of all the third images, sorted from high to low, the third images corresponding to the top n aesthetic scores are selected to form an atlas, and the images in the atlas are displayed in the widget in sequence. According to the position coordinates (i_pointx, i_pointy) corresponding to a third image, the best display area of the first image is found; this display area is based on the feature space and needs to be mapped back to the original image, i.e., (i_pointx × 8, i_pointy × 8).
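Ranking the third images and mapping their crop positions from feature space back to the original image (the ×8 factor above) can be sketched as:

```python
def build_atlas(third_images, n, down=8):
    # third_images is a list of (i_pointx, i_pointy, i_score) tuples,
    # one per first image; keep the n highest-scoring and map the
    # feature-space coordinates back to original-image pixels.
    top = sorted(third_images, key=lambda t: t[2], reverse=True)[:n]
    return [(x * down, y * down, s) for x, y, s in top]
```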
In this embodiment, for one first image and one crop box, the optimal image area that the crop box can crop is gradually locked in by moving from a large stride to a small stride and from a large range to a small range; the optimal image areas cropped by the plurality of crop boxes are then combined to find the best display area of the first image. Thus, this embodiment performs multi-dimensional analysis and calculation over different crop-box sizes, cropping ranges, and cropping positions to finally determine the optimal cropping scheme for the first image.
In the flow of the image display method according to another embodiment of the present application, before step 130, the method further includes:
step C1: the training image is input into the second model.
In the training image, at least N2 cutting frames and manual scores corresponding to the cutting frames are marked, the proportion of each cutting frame is different, and N2 is a positive integer.
In this embodiment, a method of training a second model is provided; the trained second model is the first model.
Prior to training, training images are prepared. First, each image is preprocessed: its long side is resized to 1024, and feature extraction is then performed to obtain Image_Feature, whose long side is 128. Further, the defined aspect ratios may comprise 11 conventional ratios, e.g., length:width = 0.5, 0.6, 0.7, 0.8, 0.9, 1 and width:length = 0.5, 0.6, 0.7, 0.8, 0.9, plus the original scale of the image. The optimal region for each ratio is then manually annotated in the image and given a manual score, and a manual score for the original full picture is also entered, so that each image has 12 manual scores: 11 for the ratio-specific regions and one for the full picture. The manual score is an aesthetic score given manually based on the image features of the different regions.
In the training process, N2 ratios, for example N2 = 5, are randomly selected from the 11 ratios, and the annotated region frame corresponding to each selected ratio is used as a cutting frame.
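As a sketch, the 11 ratios and the random selection of N2 of them could look like the following; the list encoding (wide:long values stored as reciprocals) and the function name are illustrative assumptions:

```python
import random

# The 11 annotated aspect ratios: five long:wide values, the square 1:1,
# and five wide:long values (encoded here as reciprocals for illustration);
# the full picture is the 12th annotated region and is not sampled.
RATIOS = ([0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
          + [round(1 / r, 4) for r in (0.5, 0.6, 0.7, 0.8, 0.9)])

def sample_crop_ratios(n2=5, rng=random):
    """Randomly select N2 of the 11 ratios; the annotated region box of
    each selected ratio becomes a cutting frame for this training step."""
    return rng.sample(RATIOS, n2)
```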
Step C2: for each cutting frame, generating N3 candidate frames with different sizes according to the corresponding proportion. Wherein N3 is a positive integer.
In this step, N3 candidate frames, for example N3 = 3, are randomly generated per ratio, so one image can generate N2 × N3 candidate frames, for example 5 × 3 = 15 candidate frames.
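A minimal sketch of the random candidate-frame generation; the jitter range and the (x, y, w, h) box representation are illustrative assumptions, not details from the application:

```python
import random

def generate_candidates(crop_box, img_w, img_h, n3=3, jitter=0.15, rng=random):
    """For one cutting frame (x, y, w, h), randomly generate N3 candidate
    boxes that keep the frame's aspect ratio but vary in size and position."""
    x, y, w, h = crop_box
    candidates = []
    for _ in range(n3):
        s = 1.0 + rng.uniform(-jitter, jitter)     # random size variation
        s = min(s, img_w / w, img_h / h)           # keep the box inside the image
        cw, ch = w * s, h * s                      # aspect ratio preserved
        cx = rng.uniform(0, img_w - cw)
        cy = rng.uniform(0, img_h - ch)
        candidates.append((cx, cy, cw, ch))
    return candidates
```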
For example, referring to fig. 8, for one cutting frame 801, shown by the solid line, 3 candidate frames 802 are randomly generated, shown by the broken lines.
Step C3: the coincidence ratio between each candidate frame and the cutting frame of the corresponding ratio is calculated.
In this step, for a candidate frame, the coincidence ratio between this candidate frame and the crop frame of the same scale is calculated.
Referring to fig. 9, a candidate frame 902 is randomly generated based on a cutting frame 901 of a given ratio, and the coincidence ratio between the two frames = the intersection area between the two frames / the union area between the two frames; for example, the coincidence ratio is 0.9.
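The coincidence-ratio computation is standard intersection-over-union and can be sketched as:

```python
def coincidence_ratio(box_a, box_b):
    """Coincidence ratio of two boxes (x, y, w, h):
    intersection area / union area (i.e., IoU)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    iw = max(0, min(ax + aw, bx + bw) - max(ax, bx))   # overlap width
    ih = max(0, min(ay + ah, by + bh) - max(ay, by))   # overlap height
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0
```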
Step C4: a first aesthetic score for each candidate frame is calculated based on the coincidence ratio and the manual score.
In this step, the first aesthetic score = coincidence ratio x manual score.
For example, one of the candidate frames is randomly generated based on a cutting frame of a given ratio; the coincidence ratio between the two frames is 0.9 and the manual score annotated for the cutting frame is 80 points, so the first aesthetic score of the candidate frame is 80 × 0.9 = 72 points.
Step C5: training the second model according to the first aesthetic score and the second aesthetic score of the same candidate frame output by the second model to obtain the first model.
Referring to fig. 10, a candidate frame 10001 is mapped onto the full-image feature, and the corresponding image features are obtained according to the position of the candidate frame 10001: since the full-image feature is scaled down to 1/8 of the original image, the candidate frame feature can be obtained on the full-image feature by scaling the position of the candidate frame 10001 in the original image down by the same factor. Each candidate frame feature obtained in this way is passed through global average pooling and a fully connected layer in the model's network structure to produce the candidate frame score, i.e., the second aesthetic score.
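A minimal sketch of this scoring path, assuming NumPy arrays, a C × H × W feature map at 1/8 resolution, and stand-in fully connected parameters; none of these names come from the application:

```python
import numpy as np

def candidate_second_score(feature_map, box, fc_weight, fc_bias, scale=8):
    """Map a candidate box (x, y, w, h) from original-image coordinates
    onto the 1/8-resolution full-image feature map, crop the candidate
    frame features, then apply global average pooling and a fully
    connected layer to obtain the second aesthetic score.
    fc_weight/fc_bias stand in for the trained head parameters."""
    x, y, w, h = (int(round(v / scale)) for v in box)
    roi = feature_map[:, y:y + max(1, h), x:x + max(1, w)]  # C x h' x w'
    pooled = roi.mean(axis=(1, 2))                          # global average pooling
    return float(pooled @ fc_weight + fc_bias)              # fully connected layer
```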
For example, for one training image, the 15 candidate frames plus the original full picture yield 16 first aesthetic scores and 16 second aesthetic scores.
Further, model training is completed by computing the loss function. The loss function calculation is shown in formula (1):

L = (1/N) Σ_{i=1}^{N} (S_ti − S_pi)²   (1)

wherein the loss function uses a basic L2 regression loss; N represents the number of score pairs in each calculation (taking the 16 sets of aesthetic scores above as an example, N = 16); S_ti represents the first aesthetic score of the i-th candidate frame divided by 100, giving a value between 0 and 1; and S_pi represents the second aesthetic score divided by 100, likewise between 0 and 1. Since 16 sets of aesthetic scores are randomly generated each time, the more a candidate frame overlaps with the cutting frame of the corresponding ratio, the closer its first aesthetic score is to the manual score of that cutting frame.
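Formula (1) can be sketched directly in code; the function name is illustrative:

```python
def l2_regression_loss(first_scores, second_scores):
    """Formula (1): L = (1/N) * sum_i (S_ti - S_pi)^2, where both the
    first and second aesthetic scores are divided by 100 so that they
    lie between 0 and 1."""
    assert len(first_scores) == len(second_scores)
    n = len(first_scores)
    return sum((st / 100 - sp / 100) ** 2
               for st, sp in zip(first_scores, second_scores)) / n
```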
By traversing and learning enough data and candidate frames, the network parameters can be optimized so that the model automatically calculates an aesthetic score for each candidate frame from the learned parameters and the candidate frame features; the higher the overall aesthetic quality of a candidate frame, the higher its aesthetic score. As the parameters are learned, the prediction of S_pi becomes more accurate. Different training images also score differently in themselves; for example, the manual score of the 1:1 cutting frame of image A is 98 points while that of image B is 70 points. Because the manual score of each cutting frame is closely tied to the image features inside the frame, the constraint of the loss function makes the network learn to assign a higher aesthetic score the more aesthetically pleasing the original-image features inside a frame are, finally enabling prediction of the optimal cropping region of an image.
Based on continuous iteration of the loss function, the aesthetic scoring model is optimized and the corresponding model weight parameters, denoted AS_weight, are obtained; AS_weight is the weight parameter of the first model.
In this embodiment, candidate frames are randomly generated in the training images to ensure that the cutting frames of different ratios and the different candidate frames can all be traversed and learned during model training, so the network learns that a candidate frame's aesthetic score is highest, and its display effect best, only when the candidate frame coincides with the cutting frame of the corresponding ratio. Based on this training of the second model, the obtained weight parameters of the first model support inference directed by the aesthetic quality of different image regions: the more aesthetically pleasing a region, the higher the aesthetic score the network predicts. When the method is applied, loading the trained weight parameters of the first model allows an optimal cropping scheme to be intelligently pushed to the user and ensures that the cropped image is attractive.
In the flow of the image display method according to another embodiment of the present application, step 140 includes:
substep D1: a second input is received to the first region of the screen.
In some embodiments of the present application, the second input is used to customize the display area of the third image on the screen; for example, while customizing the desktop pick image component, the user selects the display area of the component in the desktop interface. The second input may also be called a second operation. Illustratively, the second input includes, but is not limited to: touch input on a control or screen area via a finger, stylus or other touch device, a voice command input by the user, a specific gesture input by the user, or other feasible input, which the embodiments of the present application do not limit. The specific gesture in the embodiments of the present application may be any one of a single-click gesture, a sliding gesture, a dragging gesture, a pressure recognition gesture, a long-press gesture, an area change gesture, a double-press gesture and a double-click gesture; the click input in the embodiments of the present application may be a single-click input, a double-click input, any number of click inputs, a long-press input or a short-press input.
For example, referring to fig. 5, the second input may be: the user presses any widget and drags it to a position on the desktop interface, whereupon the system adaptively adjusts an area at that position of the desktop interface according to the form of the widget and uses that area as the display area of the widget, i.e., the first area.
Substep D2: in response to the second input, the third image with the highest aesthetic score among the at least one second image is displayed in the first region.
Optionally, a widget of the desktop pick image assembly is displayed in the first area and a third image is displayed in the widget.
In this embodiment, the user may customize the display position of the image display assembly, so that the image display is more flexible, the personalized setting of the user is satisfied, and interesting interaction experience is brought to the user.
In the flow of the image display method according to another embodiment of the present application, step 140 includes:
substep E1: and when the number of the first images is at least two, respectively acquiring third images obtained by cutting each first image.
Substep E2: the top N4 Zhang Disan image with the highest aesthetic score is displayed. Wherein N4 is a positive integer.
Alternatively, one third image may be obtained based on each first image; the third images are sorted in descending order of aesthetic score, the top N4 third images are taken, and in this order the N4 third images are displayed in carousel in the widget of the desktop pick image component.
Optionally, a display policy is formulated using information such as the shooting times of the N4 third images; for example, the N4 third images are displayed in order of shooting time from most recent to oldest.
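The two ordering rules above (keep the highest-scoring crops, then show the most recent first) can be sketched as follows; the tuple layout is an illustrative assumption:

```python
from datetime import datetime

def pick_carousel_images(third_images, n4):
    """third_images: list of (aesthetic_score, shooting_time, image_id)
    tuples. Keep the N4 third images with the highest aesthetic scores,
    then order them by shooting time from most recent to oldest."""
    top = sorted(third_images, key=lambda t: t[0], reverse=True)[:n4]
    return sorted(top, key=lambda t: t[1], reverse=True)
```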
Optionally, as the captured images are updated, the currently displayed third image is synchronously updated, recorded and maintained in the form of a database.
In this embodiment, a plurality of related images can be pushed to the user, each cropped to fit the form of its display area and aesthetically pleasing; for example, one desktop widget displays a set of images of a child in a centralized manner while another desktop widget displays another set of images in a centralized manner. Image selection and display thus become richer, more intelligent and more flexible, while the user's operation remains simple.
In some embodiments of the present application, the capability of an artificial intelligence generated content (AI Generated Content, AIGC) large model may further be combined to extend to customized components at different semantic levels, which is suitable for the presentation of various types of components and makes the presentation more accurate.
In summary, an object of the present application is to provide an intelligent display method for the desktop pick image component, which can push images in a desktop window more intelligently and accurately and strengthen the emotional connection between the user and album images. The application supports the user in selecting a particular person or category for an individual desktop pick image component to display precisely; for example, a given desktop pick image component may be dedicated to displaying any one of baby images, pet images and scenery images, and customized component names are supported, so that a specific area of the desktop interface is displayed in a targeted manner. Meanwhile, the application provides an aesthetic scoring scheme combined with the display ratio: each image receives an aesthetic score combined with the display ratio, the scores are then sorted, and the optimal effect of the final display at the specified ratio is thereby ensured.
According to the image display method provided by the embodiment of the application, the execution subject can be an image display device. In the embodiment of the present application, an image display device is described by taking an example in which the image display device performs an image display method.
Fig. 11 shows a block diagram of an image display apparatus according to an embodiment of the present application, the apparatus including:
a receiving module 10 for receiving a first input to a first scale and a first tag;
an acquisition module 20 for acquiring at least one first image associated with a first tag in response to a first input;
an output module 30 for outputting, by the first model, at least one second image obtained by cropping the first image based on the first scale, and outputting an aesthetic score of the at least one second image;
and a display module 40 for displaying a third image with highest aesthetic score in the at least one second image.
In an embodiment of the present application, for an image display component such as the desktop pick image component, the user sets a first scale through a first input, the first scale defining the form of the widget used to display an image, and sets a first tag through the first input, the first tag defining the set of candidate images for display in the widget. In response to the first input, the system automatically acquires at least one first image associated with the first tag as the candidate image set and inputs it into the first model; the first model generates crop boxes of the first scale to intelligently crop the first images, cropping each first image into at least one second image, and calculates an aesthetic score for each second image. Further, for each first image, the second image with the highest aesthetic score is selected as the third image and displayed in the widget, so that it both expresses the content of the first image and looks attractive. Thus, in the embodiments of the present application, the user need not manually screen a large number of images or manually crop the screened images; instead, with only simple user operations, the electronic device intelligently screens images and intelligently crops them to suit the form of the widget, thereby simplifying user operation.
Optionally, the output module 30 includes:
the determining unit is used for determining at least two cutting frames according to the first proportion; the sizes of the at least two cutting frames are different, the long side values of the at least two cutting frames are larger than a preset value, and the preset value is a first percentage of the corresponding side value of the first image;
the clipping unit is used for clipping the first image by using the clipping frames for each clipping frame and outputting a group of second images;
and a calculating unit for calculating and outputting the aesthetic score of the second image.
Optionally, the clipping unit includes:
the first clipping subunit is used for clipping the first image at least once by utilizing the clipping frame according to a first stride and outputting at least one first sub-image;
a computing subunit for computing an aesthetic score of the at least one first sub-image;
a determining subunit configured to determine, among the at least one first sub-image, the N1 first sub-images with the highest aesthetic scores, and a region expanded based on the regions of these first sub-images in the first image; wherein N1 is a positive integer;
the second clipping subunit is used for clipping the expanded region at least once by utilizing the clipping frame according to a second stride and outputting at least one second sub-image; wherein the first stride is greater than the second stride;
And the output subunit is used for outputting a group of second images according to at least one second sub-image.
Optionally, the apparatus further comprises:
the input module is used for inputting the training image into the second model; wherein, in the training image, at least N2 cutting frames and the manual scores corresponding to each cutting frame are marked, the proportion of each cutting frame is different, and N2 is a positive integer;
the generating module is used for generating N3 candidate frames with different sizes according to the corresponding proportion for each cutting frame; wherein N3 is a positive integer;
the first calculation module is used for calculating the coincidence ratio between each candidate frame and the cutting frame of the corresponding proportion;
a second calculation module for calculating a first aesthetic score for each candidate frame based on the coincidence ratio and the manual score;
and the training module is used for training the second model according to the first aesthetic score and the second aesthetic score of the same candidate frame output by the second model to obtain the first model.
Optionally, the display module 40 includes:
a receiving unit for receiving a second input to the first region of the screen;
and a first display unit for displaying a third image with highest aesthetic score among the at least one second image in the first area in response to the second input.
Optionally, the display module 40 includes:
the acquisition unit is used for respectively acquiring third images obtained by cutting each first image when the number of the first images is at least two;
a second display unit for displaying the first N4 third images with the highest aesthetic score; wherein N4 is a positive integer.
The device in the embodiments of the present application may be an electronic device, or may be a component in an electronic device, for example, an integrated circuit or a chip. The electronic device may be a terminal, or may be a device other than a terminal. By way of example, the electronic device may be a mobile phone, tablet computer, notebook computer, palm computer, vehicle-mounted electronic device, mobile internet device (Mobile Internet Device, MID), augmented reality (augmented reality, AR)/virtual reality (Virtual Reality, VR) device, robot, wearable device, ultra-mobile personal computer (Ultra-Mobile Personal Computer, UMPC), netbook or personal digital assistant (Personal Digital Assistant, PDA), etc., and may also be a server, network attached storage (Network Attached Storage, NAS), personal computer (Personal Computer, PC), television (TV), teller machine or self-service machine, etc.; the embodiments of the present application are not specifically limited.
The device of the embodiments of the present application may be a device with an operating system. The operating system may be an Android operating system, an iOS operating system, or another possible operating system, which is not specifically limited in the embodiments of the present application.
The device provided by the embodiment of the application can realize each process realized by the embodiment of the method and realize the same technical effect, and in order to avoid repetition, the description is omitted here.
Optionally, as shown in fig. 12, the embodiment of the present application further provides an electronic device 100, including a processor 101, a memory 102, and a program or an instruction stored in the memory 102 and capable of being executed on the processor 101, where the program or the instruction implements each step of any one of the above embodiments of the image display method when executed by the processor 101, and the steps achieve the same technical effects, and for avoiding repetition, a description is omitted herein.
The electronic device of the embodiment of the application includes the mobile electronic device and the non-mobile electronic device.
Fig. 13 is a schematic hardware structure of an electronic device implementing an embodiment of the present application.
The electronic device 1000 includes, but is not limited to: radio frequency unit 1001, network module 1002, audio output unit 1003, input unit 1004, sensor 1005, display unit 1006, user input unit 1007, interface unit 1008, memory 1009, processor 1010, camera 1011, and the like.
Those skilled in the art will appreciate that the electronic device 1000 may also include a power source (e.g., a battery) for powering the various components, which may be logically connected to the processor 1010 by a power management system to perform functions such as managing charge, discharge, and power consumption by the power management system. The electronic device structure shown in fig. 13 does not constitute a limitation of the electronic device, and the electronic device may include more or less components than shown, or may combine certain components, or may be arranged in different components, which are not described in detail herein.
Wherein the user input unit 1007 is configured to receive a first input of a first scale and a first label; a processor 1010 for acquiring at least one first image associated with the first tag in response to the first input; outputting, by the first model, at least one second image cropped from the first image based on the first scale, and outputting an aesthetic score of the at least one second image; a display unit 1006, configured to display a third image with highest aesthetic score in the at least one second image.
In an embodiment of the present application, for an image display component such as the desktop pick image component, the user sets a first scale through a first input, the first scale defining the form of the widget used to display an image, and sets a first tag through the first input, the first tag defining the set of candidate images for display in the widget. In response to the first input, the system automatically acquires at least one first image associated with the first tag as the candidate image set and inputs it into the first model; the first model generates crop boxes of the first scale to intelligently crop the first images, cropping each first image into at least one second image, and calculates an aesthetic score for each second image. Further, for each first image, the second image with the highest aesthetic score is selected as the third image and displayed in the widget, so that it both expresses the content of the first image and looks attractive. Thus, in the embodiments of the present application, the user need not manually screen a large number of images or manually crop the screened images; instead, with only simple user operations, the electronic device intelligently screens images and intelligently crops them to suit the form of the widget, thereby simplifying user operation.
Optionally, the processor 1010 is further configured to determine at least two crop frames according to the first ratio; the sizes of the at least two cutting frames are different, the long side values of the at least two cutting frames are larger than a preset value, and the preset value is a first percentage of the corresponding side value of the first image; for each cutting frame, cutting the first image by using the cutting frame, and outputting a group of second images; an aesthetic score of the second image is calculated and output.
Optionally, the processor 1010 is further configured to crop the first image at least once by using the cutting frame according to a first stride and output at least one first sub-image; calculate an aesthetic score of the at least one first sub-image; determine, among the at least one first sub-image, the N1 first sub-images with the highest aesthetic scores and a region expanded based on the regions of these first sub-images in the first image, wherein N1 is a positive integer; crop the expanded region at least once by using the cutting frame according to a second stride and output at least one second sub-image, wherein the first stride is greater than the second stride; and output a group of second images according to the at least one second sub-image.
Optionally, the processor 1010 is further configured to input a training image into the second model, wherein at least N2 cutting frames and the manual score corresponding to each cutting frame are annotated in the training image, the proportion of each cutting frame is different, and N2 is a positive integer; for each cutting frame, generate N3 candidate frames with different sizes according to the corresponding proportion, wherein N3 is a positive integer; calculate the coincidence ratio between each candidate frame and the cutting frame of the corresponding proportion; calculate a first aesthetic score for each candidate frame based on the coincidence ratio and the manual score; and train the second model according to the first aesthetic score and the second aesthetic score of the same candidate frame output by the second model, to obtain the first model.
Optionally, the user input unit 1007 is further configured to receive a second input to the first region of the screen; the display unit 1006 is further configured to display a third image with highest aesthetic score in the at least one second image in the first area in response to the second input.
Optionally, the processor 1010 is further configured to, when the number of the first images is at least two, respectively obtain a third image obtained by cropping each of the first images; a display unit 1006 for displaying the first N4 third images with the highest aesthetic score; wherein N4 is a positive integer.
In summary, an object of the present application is to provide an intelligent display method for the desktop pick image component, which can push images in a desktop window more intelligently and accurately and strengthen the emotional connection between the user and album images. The application supports the user in selecting a particular person or category for an individual desktop pick image component to display precisely; for example, a given desktop pick image component may be dedicated to displaying any one of baby images, pet images and scenery images, and customized component names are supported, so that a specific area of the desktop interface is displayed in a targeted manner. Meanwhile, the application provides an aesthetic scoring scheme combined with the display ratio: each image receives an aesthetic score combined with the display ratio, the scores are then sorted, and the optimal effect of the final display at the specified ratio is thereby ensured.
It should be understood that in the embodiments of the present application, the input unit 1004 may include a graphics processor (Graphics Processing Unit, GPU) 10041 and a microphone 10042, and the graphics processor 10041 processes image data of a still picture or a video obtained by an image capturing device (such as a camera) in a video capturing mode or an image capturing mode. The display unit 1006 may include a display panel 10061, and the display panel 10061 may be configured in the form of a liquid crystal display, an organic light emitting diode, or the like. The user input unit 1007 includes at least one of a touch panel 10071 and other input devices 10072. The touch panel 10071, also referred to as a touch screen, can include two portions, a touch detection device and a touch controller. Other input devices 10072 may include, but are not limited to, a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, and a joystick, which are not described in detail herein. The memory 1009 may be used to store software programs as well as various data, including, but not limited to, application programs and an operating system. The processor 1010 may integrate an application processor, which primarily handles the operating system, user interface, applications, etc., and a modem processor, which primarily handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 1010.
The memory 1009 may be used to store software programs as well as various data. The memory 1009 may mainly include a first storage area storing programs or instructions and a second storage area storing data, where the first storage area may store an operating system and the application programs or instructions required for at least one function (such as a sound playing function, an image playing function, etc.). Further, the memory 1009 may include volatile memory or nonvolatile memory, or both. The nonvolatile memory may be a read-only memory (Read-Only Memory, ROM), a programmable ROM (Programmable ROM, PROM), an erasable PROM (Erasable PROM, EPROM), an electrically erasable PROM (Electrically EPROM, EEPROM), or a flash memory. The volatile memory may be random access memory (Random Access Memory, RAM), static RAM (Static RAM, SRAM), dynamic RAM (Dynamic RAM, DRAM), synchronous DRAM (Synchronous DRAM, SDRAM), double data rate SDRAM (Double Data Rate SDRAM, DDR SDRAM), enhanced SDRAM (Enhanced SDRAM, ESDRAM), synchlink DRAM (Synchlink DRAM, SLDRAM), or direct rambus RAM (Direct Rambus RAM, DRRAM). The memory 1009 in the embodiments of the present application includes, but is not limited to, these and any other suitable types of memory.
The processor 1010 may include one or more processing units; optionally, the processor 1010 integrates an application processor that primarily processes operations involving an operating system, user interface, application programs, and the like, and a modem processor that primarily processes wireless communication signals, such as a baseband processor. It will be appreciated that the modem processor described above may not be integrated into the processor 1010.
The embodiment of the present application further provides a readable storage medium, where a program or an instruction is stored, where the program or the instruction realizes each process of the embodiment of the image display method when executed by a processor, and the same technical effects can be achieved, so that repetition is avoided, and no further description is given here.
The processor is the processor in the electronic device of the above embodiment. Readable storage media include computer readable storage media, such as read-only memory (ROM), random access memory (RAM), magnetic disks or optical disks.
The embodiment of the application further provides a chip, the chip includes a processor and a communication interface, the communication interface is coupled with the processor, the processor is used for running a program or instructions, each process of the embodiment of the image display method can be realized, the same technical effect can be achieved, and in order to avoid repetition, the description is omitted here.
It should be understood that the chip referred to in the embodiments of the present application may also be called a system-on-chip, a chip system, a system-on-a-chip, or the like.
The embodiments of the present application provide a computer program product stored in a storage medium, where the program product is executed by at least one processor to implement the respective processes of the embodiments of the image display method described above, and achieve the same technical effects, and are not repeated herein.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element. Furthermore, it should be noted that the scope of the methods and apparatus in the embodiments of the present application is not limited to performing the functions in the order shown or discussed, but may also include performing the functions in a substantially simultaneous manner or in an opposite order depending on the functions involved, e.g., the described methods may be performed in an order different from that described, and various steps may also be added, omitted, or combined. Additionally, features described with reference to certain examples may be combined in other examples.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solutions of the present application may be embodied essentially or in a part contributing to the prior art in the form of a computer software product stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk), comprising several instructions for causing a terminal (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the methods of the embodiments of the present application.
The embodiments of the present application have been described above with reference to the accompanying drawings, but the present application is not limited to the above embodiments, which are merely illustrative rather than restrictive. Inspired by the present application, those of ordinary skill in the art may devise many other forms without departing from the spirit of the present application and the scope of the claims, all of which fall within the protection of the present application.

Claims (14)

1. An image display method, the method comprising:
receiving a first input to a first scale and a first tag;
in response to the first input, acquiring at least one first image associated with the first tag;
outputting, by a first model, at least one second image obtained by cropping the first image based on the first scale, and outputting an aesthetic score of the at least one second image; and
displaying a third image having the highest aesthetic score among the at least one second image.
2. The method of claim 1, wherein the outputting at least one second image cropped from the first image based on the first scale and outputting an aesthetic score of the at least one second image comprises:
determining at least two cropping frames according to the first scale, wherein the at least two cropping frames have different sizes, the long-side value of each of the at least two cropping frames is greater than a preset value, and the preset value is a first percentage of the corresponding side value of the first image;
for each cropping frame, cropping the first image with the cropping frame and outputting a group of second images; and
calculating and outputting an aesthetic score of each second image.
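Claim 2's first step — generating several differently sized cropping frames of a fixed aspect ratio whose long side stays above a percentage of the corresponding image side — can be sketched as follows. The function name, the default threshold, and the choice of three evenly spaced sizes are illustrative assumptions, not part of the claimed method.

```python
def candidate_crop_boxes(img_w, img_h, ratio, first_percent=0.5, num_sizes=3):
    """Generate crop boxes (w, h) with aspect ratio `ratio` (w:h) whose
    long side exceeds `first_percent` of the corresponding image side."""
    boxes = []
    if ratio >= 1.0:  # landscape crop: the long side maps to the image width
        min_long = img_w * first_percent
        max_long = min(img_w, img_h * ratio)   # crop must still fit the image
    else:             # portrait crop: the long side maps to the image height
        min_long = img_h * first_percent
        max_long = min(img_h, img_w / ratio)
    for i in range(num_sizes):
        # Evenly spaced long-side values in (min_long, max_long].
        long_side = min_long + (max_long - min_long) * (i + 1) / num_sizes
        if ratio >= 1.0:
            w, h = long_side, long_side / ratio
        else:
            w, h = long_side * ratio, long_side
        boxes.append((round(w), round(h)))
    return boxes
```

For a 1000x800 image and a 3:2 scale, this yields three landscape boxes whose widths all exceed half the image width.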
3. The method of claim 2, wherein the cropping the first image with the cropping frame and outputting a group of second images comprises:
cropping the first image at least once with the cropping frame according to a first stride, and outputting at least one first sub-image;
calculating an aesthetic score of the at least one first sub-image;
determining, from the at least one first sub-image, a region obtained by expanding the regions occupied in the first image by the N1 first sub-images with the highest aesthetic scores, wherein N1 is a positive integer;
cropping the expanded region at least once with the cropping frame according to a second stride, and outputting at least one second sub-image, wherein the first stride is greater than the second stride; and
outputting a group of second images according to the at least one second sub-image.
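The coarse-to-fine search in claim 3 can be sketched as a two-pass sliding window: a coarse pass over the whole image with the first stride, then a fine pass with the second stride over an expanded neighbourhood of the top-N1 positions. The concrete strides, the expansion margin (one coarse stride), and the scoring callback are illustrative assumptions.

```python
def coarse_to_fine_crops(img_w, img_h, box_w, box_h,
                         score_fn, stride1=64, stride2=16, top_n1=3):
    """Find the best placement of a fixed-size crop box using a coarse
    stride first, then a finer stride near the best coarse positions
    (stride1 > stride2, matching the claim)."""
    def positions(x0, y0, x1, y1, stride):
        pts = []
        for y in range(y0, max(y0, y1 - box_h) + 1, stride):
            for x in range(x0, max(x0, x1 - box_w) + 1, stride):
                pts.append((x, y))
        return pts

    # Coarse pass over the whole image with the first stride.
    coarse = [(score_fn(x, y, box_w, box_h), x, y)
              for x, y in positions(0, 0, img_w, img_h, stride1)]
    coarse.sort(reverse=True)

    # Fine pass: expand each of the top-N1 regions by one coarse stride
    # and re-scan with the second stride.
    fine = []
    for _, x, y in coarse[:top_n1]:
        x0, y0 = max(0, x - stride1), max(0, y - stride1)
        x1 = min(img_w, x + box_w + stride1)
        y1 = min(img_h, y + box_h + stride1)
        for fx, fy in positions(x0, y0, x1, y1, stride2):
            fine.append((score_fn(fx, fy, box_w, box_h), fx, fy))
    return max(fine)  # best (score, x, y)
```

With a toy score function peaking at (300, 200), the fine pass localises the crop to within one fine stride of the peak while scoring far fewer positions than an exhaustive fine-stride scan.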
4. The method of claim 1, wherein before the outputting, by the first model, at least one second image cropped from the first image based on the first scale and outputting the aesthetic score of the at least one second image, the method further comprises:
inputting a training image into a second model, wherein the training image is annotated with at least N2 cropping frames and a manual score for each cropping frame, each cropping frame having a different scale, N2 being a positive integer;
for each cropping frame, generating N3 candidate frames of different sizes according to the corresponding scale, wherein N3 is a positive integer;
calculating an overlap ratio between each candidate frame and the cropping frame of the corresponding scale;
calculating a first aesthetic score of each candidate frame based on the overlap ratio and the manual score; and
training the second model according to the first aesthetic score and a second aesthetic score of the same candidate frame output by the second model, to obtain the first model.
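The label construction in claim 4 — deriving a training target for each candidate frame from its overlap with an annotated cropping frame and that frame's manual score — can be sketched as below. Claim 4 does not specify how the overlap ratio and the manual score are combined; the product used here is one plausible reading, and all names are illustrative.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0

def first_aesthetic_score(candidate, labeled_box, manual_score):
    """Weight the annotator's score for a labeled cropping frame by how
    much the generated candidate frame overlaps it (assumed combination)."""
    return iou(candidate, labeled_box) * manual_score
```

A candidate that coincides with the annotated frame inherits the full manual score; one that overlaps it only partially receives a proportionally smaller target, and the second model is then regressed toward these targets.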
5. The method of claim 1, wherein the displaying a third image having the highest aesthetic score among the at least one second image comprises:
receiving a second input to a first region of a screen; and
in response to the second input, displaying, in the first region, the third image having the highest aesthetic score among the at least one second image.
6. The method of claim 1, wherein the displaying a third image having the highest aesthetic score among the at least one second image comprises:
when the number of first images is at least two, separately acquiring a third image cropped from each first image; and
displaying the N4 third images with the highest aesthetic scores, wherein N4 is a positive integer.
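Claim 6's selection step — one best crop per first image, then display only the N4 highest-scoring — reduces to a sort over (score, image) pairs; a minimal sketch with illustrative names:

```python
def top_crops(scored_crops, n4):
    """Given (aesthetic_score, crop_id) pairs — the single best crop per
    first image — return the N4 highest-scoring crops for display."""
    return sorted(scored_crops, key=lambda p: p[0], reverse=True)[:n4]
```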
7. An image display apparatus, the apparatus comprising:
a receiving module, configured to receive a first input to a first scale and a first tag;
an acquisition module, configured to acquire, in response to the first input, at least one first image associated with the first tag;
an output module, configured to output, by a first model, at least one second image obtained by cropping the first image based on the first scale, and to output an aesthetic score of the at least one second image; and
a display module, configured to display a third image having the highest aesthetic score among the at least one second image.
8. The apparatus of claim 7, wherein the output module comprises:
a determining unit, configured to determine at least two cropping frames according to the first scale, wherein the at least two cropping frames have different sizes, the long-side value of each of the at least two cropping frames is greater than a preset value, and the preset value is a first percentage of the corresponding side value of the first image;
a cropping unit, configured to crop, for each cropping frame, the first image with the cropping frame and to output a group of second images; and
a calculating unit, configured to calculate and output an aesthetic score of each second image.
9. The apparatus of claim 8, wherein the cropping unit comprises:
a first cropping subunit, configured to crop the first image at least once with the cropping frame according to a first stride and to output at least one first sub-image;
a calculating subunit, configured to calculate an aesthetic score of the at least one first sub-image;
a determining subunit, configured to determine, from the at least one first sub-image, a region obtained by expanding the regions occupied in the first image by the N1 first sub-images with the highest aesthetic scores, wherein N1 is a positive integer;
a second cropping subunit, configured to crop the expanded region at least once with the cropping frame according to a second stride and to output at least one second sub-image, wherein the first stride is greater than the second stride; and
an output subunit, configured to output a group of second images according to the at least one second sub-image.
10. The apparatus of claim 7, wherein the apparatus further comprises:
an input module, configured to input a training image into a second model, wherein the training image is annotated with at least N2 cropping frames and a manual score for each cropping frame, each cropping frame having a different scale, N2 being a positive integer;
a generating module, configured to generate, for each cropping frame, N3 candidate frames of different sizes according to the corresponding scale, wherein N3 is a positive integer;
a first calculating module, configured to calculate an overlap ratio between each candidate frame and the cropping frame of the corresponding scale;
a second calculating module, configured to calculate a first aesthetic score of each candidate frame based on the overlap ratio and the manual score; and
a training module, configured to train the second model according to the first aesthetic score and a second aesthetic score of the same candidate frame output by the second model, to obtain the first model.
11. The apparatus of claim 7, wherein the display module comprises:
a receiving unit, configured to receive a second input to a first region of a screen; and
a first display unit, configured to display, in the first region in response to the second input, the third image having the highest aesthetic score among the at least one second image.
12. The apparatus of claim 7, wherein the display module comprises:
an acquisition unit, configured to separately acquire, when the number of first images is at least two, a third image cropped from each first image; and
a second display unit, configured to display the N4 third images with the highest aesthetic scores, wherein N4 is a positive integer.
13. An electronic device, comprising a processor and a memory storing a program or instructions executable on the processor, wherein the program or instructions, when executed by the processor, implement the steps of the image display method according to any one of claims 1 to 6.
14. A readable storage medium, storing a program or instructions which, when executed by a processor, implement the steps of the image display method according to any one of claims 1 to 6.
CN202311610690.8A 2023-11-28 2023-11-28 Image display method, device, electronic equipment and readable storage medium Pending CN117666897A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311610690.8A CN117666897A (en) 2023-11-28 2023-11-28 Image display method, device, electronic equipment and readable storage medium


Publications (1)

Publication Number Publication Date
CN117666897A 2024-03-08

Family

ID=90069248

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311610690.8A Pending CN117666897A (en) 2023-11-28 2023-11-28 Image display method, device, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN117666897A (en)

Similar Documents

Publication Publication Date Title
US20230145417A1 (en) Creating an intuitive visual plan for achieving financial goals
US20200301959A1 (en) Interface elements for directed display of content data items
RU2720071C2 (en) Alternative semantics for scaling operations in a scalable scene
US9360992B2 (en) Three dimensional conditional formatting
JP6938680B2 (en) Efficient image enhancement with related content
KR20150046313A (en) Generating augmented reality exemplars
US11922695B2 (en) Hierarchical segmentation based software tool usage in a video
WO2023093851A1 (en) Image cropping method and apparatus, and electronic device
CN110678861A (en) Image selection suggestions
CN108197105B (en) Natural language processing method, device, storage medium and electronic equipment
CN116186326A (en) Video recommendation method, model training method, electronic device and storage medium
CN108614872A (en) Course content methods of exhibiting and device
US11995894B2 (en) Interacting with hierarchical clusters of video segments using a metadata panel
US20230384910A1 (en) Using Attributes for Font Recommendations
CN117666897A (en) Image display method, device, electronic equipment and readable storage medium
CN110942056A (en) Clothing key point positioning method and device, electronic equipment and medium
CN114529558A (en) Image processing method and device, electronic equipment and readable storage medium
US20220075513A1 (en) Interacting with hierarchical clusters of video segments using a video timeline
CN112764617A (en) File selection method and device, electronic equipment and readable storage medium
CN110598073B (en) Acquisition technology of entity webpage links based on topological relation diagram
CN113311972B (en) Input method and input device
US11631434B2 (en) Selecting and performing operations on hierarchical clusters of video segments
CN115618059A (en) Image browsing method and device
WO2023107526A9 (en) Sweep algorithm for output of graphical objects
CN117909524A (en) Visual search determination for text-to-image replacement

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination