WO2024046162A1 - Image recommendation method and electronic device - Google Patents

Image recommendation method and electronic device

Info

Publication number
WO2024046162A1
Authority
WO
WIPO (PCT)
Prior art keywords
picture
picture sequence
pictures
sequence
electronic device
Prior art date
Application number
PCT/CN2023/114053
Other languages
French (fr)
Chinese (zh)
Inventor
汪涛
许梦雯
宋凯凯
宋超领
周剑辉
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Publication of WO2024046162A1

Classifications

    • G PHYSICS
      • G06 COMPUTING; CALCULATING OR COUNTING
        • G06F ELECTRIC DIGITAL DATA PROCESSING
          • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
            • G06F 16/50 of still image data
              • G06F 16/53 Querying
                • G06F 16/535 Filtering based on additional data, e.g. user or group profiles
                • G06F 16/538 Presentation of query results
              • G06F 16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
                • G06F 16/587 using geographical or spatial information, e.g. location
        • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
          • G06N 3/00 Computing arrangements based on biological models
            • G06N 3/02 Neural networks
              • G06N 3/08 Learning methods
        • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
          • G06V 10/00 Arrangements for image or video recognition or understanding
            • G06V 10/70 using pattern recognition or machine learning
              • G06V 10/82 using neural networks
    • H ELECTRICITY
      • H04 ELECTRIC COMMUNICATION TECHNIQUE
        • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
          • H04N 5/00 Details of television systems
            • H04N 5/76 Television signal recording
              • H04N 5/91 Television signal processing therefor
                • H04N 5/93 Regeneration of the television signal or of selected parts thereof

Definitions

  • the present application relates to the field of computer technology, and in particular, to a picture recommendation method and electronic device.
  • This application discloses a picture recommendation method and electronic device, which can recommend pictures that meet the user's needs for users, allowing users to obtain satisfactory pictures more conveniently and quickly.
  • embodiments of the present application provide a picture recommendation method, which is applied to an electronic device.
  • the method includes: displaying an image collection interface; in response to a first operation on an image collection button of the image collection interface, collecting a first picture sequence through an image collection device; and generating a second picture sequence based on the first picture sequence.
  • a third picture sequence is determined from the first picture sequence and the second picture sequence; the third picture sequence includes N pictures whose first-dimension scores rank in the top N positions and M pictures whose second-dimension scores rank in the top M positions, where N and M are positive integers; the third picture sequence is recommended.
  • the second picture sequence includes pictures whose timestamps are different from the timestamps of the pictures in the first picture sequence, specifically: the timestamp of any picture in the second picture sequence is different from the timestamps of all pictures in the first picture sequence.
  • the second picture sequence includes pictures whose observation angles are different from the observation angles of the pictures in the first picture sequence, specifically: the observation angle of any picture in the second picture sequence is different from the observation angle of the picture in the first picture sequence whose timestamp is the same as the timestamp of that picture.
  • the third picture sequence recommended by the electronic device is selected from the first picture sequence and the second picture sequence, and the second picture sequence is generated in the time and/or space dimension based on the collected first picture sequence, thereby adding differentiated, high-quality candidate pictures within the limited collection time and greatly increasing the probability that users obtain the required pictures.
  • the third picture sequence recommended by the electronic device includes N pictures that are better in the first dimension and M pictures that are better in the second dimension, thereby meeting the different needs of different users and allowing users to obtain satisfactory pictures more conveniently and quickly.
  • the steps of displaying the image acquisition interface and, in response to the first operation on the image acquisition button of the image acquisition interface, acquiring the first picture sequence through the image acquisition device may be replaced by: in response to an operation of selecting the first picture sequence, obtaining the first picture sequence from the gallery of the electronic device.
  • the first picture sequence can also be obtained from the gallery, meeting different user needs in different scenarios and broadening the application scenarios.
  • the steps of determining, from the first picture sequence and the second picture sequence, a third picture sequence that includes N pictures whose first-dimension scores rank in the top N positions and M pictures whose second-dimension scores rank in the top M positions (N and M being positive integers), and recommending the third picture sequence, can be replaced by: determining, from the first picture sequence and the second picture sequence, P pictures whose third-dimension scores rank in the top P positions, where P is a positive integer; saving the P pictures; and deleting the first picture sequence and the second picture sequence.
  • the electronic device can determine, from the first picture sequence and the second picture sequence, the P pictures that are better in a third dimension manually set by the user, or in a third dimension learned from the user's preference, save these P pictures and delete the other pictures, allowing users to quickly and easily obtain the pictures they need without manual selection, which greatly improves the user experience.
  • the first dimension or the second dimension is any one of the following: a comprehensive dimension, the position of the photographed subject in the picture, the motion stretch of the photographed subject in the picture, the expression of the photographed subject in the picture, and the quality of the picture.
  • the recommending of the third picture sequence includes: displaying a first interface, where the first interface displays first information, second information, the N pictures and the M pictures; the first information indicates the first dimension and is associated with the N pictures, and the second information indicates the second dimension and is associated with the M pictures.
  • the user can obtain the N pictures associated with the first dimension based on the first information, and obtain the M pictures associated with the second dimension based on the second information.
  • the display method is simple and clear, making it convenient for the user to obtain the required pictures in different dimensions and improving the user experience.
  • the recommending of the third picture sequence includes: displaying a second interface, where the second interface displays K pictures, K being a positive integer greater than or equal to N.
  • the K pictures include the N pictures and (K-N) pictures other than the N pictures.
  • the (K-N) pictures belong to the first picture sequence and/or the second picture sequence.
  • the K pictures include a first picture and a second picture.
  • the score of the first picture in the first dimension is greater than the score of the second picture in the first dimension.
  • the first picture is displayed before the second picture in the second interface.
  • the electronic device can preferentially display the picture with a higher score in the first dimension, avoiding the situation where a picture with a higher score is displayed later and the user has to spend more time to obtain it, which further improves the efficiency with which the user obtains the required pictures and improves the user experience.
  • the (K-N) pictures do not belong to the third picture sequence.
  • the K pictures are the pictures whose first-dimension scores rank in the top K positions in the first picture sequence and the second picture sequence.
  • the electronic device can also display pictures other than the recommended third picture sequence (that is, when K is greater than N), that is, provide more candidate pictures for the user to choose from, so as to avoid the situation where none of the pictures in the third picture sequence meets the user's needs and the user cannot obtain the required pictures, further ensuring the user experience.
  • the method further includes: receiving a second operation for selecting at least one picture, where the at least one picture belongs to the first picture sequence and/or the second picture sequence; saving the at least one picture; and deleting pictures other than the at least one picture in the first picture sequence and the second picture sequence.
  • the electronic device can save at least one picture selected by the user and delete other pictures to prevent other pictures that the user does not need from occupying the storage space of the device and reduce the storage pressure of the device.
  • the third picture sequence is obtained according to a first strategy; the method further includes: receiving a second operation for selecting at least one picture, the at least one picture belonging to the first picture sequence and/or the second picture sequence; and updating the first strategy according to the third picture sequence and the at least one picture.
  • the electronic device can update the first strategy used to determine the recommended third picture sequence according to the at least one picture selected by the user, that is, learn the first strategy according to the user's habits and personalize it, so that subsequent recommended pictures determined based on the first strategy better match the current user's needs and improve the user experience; a rough sketch of such an update follows.
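  • The following is a minimal, hedged sketch (Python, with assumed dimension names, learning rate and weighting scheme that are not part of this application) of one way such a first-strategy update could look: dimensions in which the user-selected pictures scored well are weighted more heavily in subsequent recommendations.

```python
# Illustrative sketch only: a per-dimension weight update based on the user's selection.
def update_strategy(weights: dict[str, float],
                    selected_scores: list[dict[str, float]],
                    learning_rate: float = 0.1) -> dict[str, float]:
    """weights: dimension -> weight used by the recommendation strategy.
    selected_scores: per-dimension scores of the pictures the user chose to keep."""
    for dim in weights:
        avg = sum(scores[dim] for scores in selected_scores) / len(selected_scores)
        # move each dimension's weight toward the average score of the kept pictures
        weights[dim] = (1 - learning_rate) * weights[dim] + learning_rate * avg
    total = sum(weights.values())
    return {dim: w / total for dim, w in weights.items()}  # renormalize to sum to 1
```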
  • generating a second picture sequence based on the first picture sequence includes: generating a fourth picture sequence based on the first picture sequence, where the timestamps of the pictures in the fourth picture sequence are different from the timestamps of the pictures in the first picture sequence; and generating a fifth picture sequence based on the first picture sequence and the fourth picture sequence, where the observation angles of the pictures in the fifth picture sequence are different from the observation angles of the pictures in the first picture sequence and the fourth picture sequence; the second picture sequence includes the fourth picture sequence and the fifth picture sequence.
  • the timestamps of the pictures in the fourth picture sequence being different from the timestamps of the pictures in the first picture sequence specifically includes: the timestamp of any picture in the fourth picture sequence is different from the timestamps of all pictures in the first picture sequence.
  • the observation angles of the pictures in the fifth picture sequence being different from the observation angles of the pictures in the first picture sequence and the fourth picture sequence specifically includes: the observation angle of any picture in the fifth picture sequence is different from the observation angle of the picture in the first picture sequence and the fourth picture sequence whose timestamp is the same as the timestamp of that picture.
  • the electronic device can first generate, in the time dimension, a fourth picture sequence whose timestamps differ from those of the collected first picture sequence, and then generate, in the spatial dimension, a fifth picture sequence whose observation angles differ from those of the first picture sequence and the fourth picture sequence.
  • compared with only generating pictures with viewing angles different from those of the first picture sequence, this further expands the high-quality, differentiated candidate pictures, and the probability that the user obtains the required pictures is further improved.
  • generating a fifth picture sequence based on the first picture sequence and the fourth picture sequence includes: training a spatial perception model based on the first picture sequence and the fourth picture sequence; obtaining a first spatial parameter that is different from the spatial parameters of the pictures in the first picture sequence and the second picture sequence; and using the first spatial parameter as the input of the spatial perception model to obtain an output, where the output is the fifth picture sequence.
  • the spatial perception model is obtained through multiple rounds of iterative training.
  • the spatial parameters include the spatial coordinates of the picture and the posture of the picture collection device used to collect the picture.
  • the spatial perception model is iteratively trained based on the currently collected first picture sequence and the fourth picture sequence generated based on the first picture sequence; therefore, the spatial perception model can fully learn the current shooting scene.
  • the accuracy of the fifth picture sequence obtained by the spatial perception model is therefore higher, that is, the accuracy of the candidate pictures is higher, which further increases the probability that the user obtains the desired picture.
  • generating a second picture sequence based on the first picture sequence includes: training a space-time perception model based on the first picture sequence; obtaining a second spatial parameter and a first time parameter, where the second spatial parameter includes a spatial parameter different from the spatial parameters of the pictures in the first picture sequence and the first time parameter includes a time parameter different from the time parameters of the pictures in the first picture sequence; and using the second spatial parameter and the first time parameter as inputs of the space-time perception model to obtain an output, where the output is the second picture sequence.
  • the space-time perception model is obtained through multiple rounds of iterative training.
  • the time parameter includes the timestamp of the picture, or a time embedding obtained based on the timestamp of the picture.
  • the space-time perception model is iteratively trained based on the currently collected first picture sequence; therefore, the space-time perception model can fully learn the current shooting scene.
  • the accuracy of the second picture sequence obtained through the space-time perception model is higher, that is, the accuracy of the candidate pictures is higher, which further increases the probability that the user obtains the desired picture. A sketch of querying such a model is shown below.
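  • As a minimal sketch (the class and parameter names are assumptions, not this application's actual model), the space-time perception model can be thought of as a function fit on the first picture sequence that maps a spatial parameter and a time parameter to a picture, and then queried with parameters that differ from those of the captured pictures:

```python
# Illustrative placeholder for the space-time perception model described above.
import numpy as np

class SpaceTimePerceptionModel:
    """Stand-in for the iteratively trained space-time perception model."""
    def fit(self, spatial_params: np.ndarray, time_params: np.ndarray,
            pictures: np.ndarray) -> None:
        pass  # multiple rounds of iterative training on the first picture sequence

    def predict(self, spatial_param: np.ndarray, time_param: float) -> np.ndarray:
        # A trained model would render a picture for the requested space-time input.
        return np.zeros((256, 256, 3), dtype=np.uint8)

model = SpaceTimePerceptionModel()
# Spatial params: e.g. spatial coordinates and collection-device posture of each picture;
# time params: e.g. each picture's timestamp (or a time embedding derived from it).
model.fit(spatial_params=np.zeros((4, 6)),
          time_params=np.array([0.0, 0.1, 0.2, 0.3]),
          pictures=np.zeros((4, 256, 256, 3)))
# Query with a second spatial parameter and a first time parameter that differ from the
# captured pictures' parameters to obtain a picture of the second picture sequence.
new_picture = model.predict(spatial_param=np.full(6, 0.5), time_param=0.15)
```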
  • embodiments of the present application provide an electronic device, including a transceiver, a processor, and a memory; the memory is used to store a computer program, and the processor calls the computer program to execute the picture recommendation method provided by the first aspect of the embodiments of the present application and any implementation of the first aspect.
  • embodiments of the present application provide a computer storage medium.
  • the computer storage medium stores a computer program.
  • when the computer program is executed by a processor, it is used to perform the picture recommendation method provided by the first aspect of the embodiments of the present application and any implementation of the first aspect.
  • embodiments of the present application provide a computer program product.
  • when the computer program product is run on an electronic device, the electronic device is caused to execute the picture recommendation method provided by the first aspect of the embodiments of the present application and any implementation of the first aspect.
  • embodiments of the present application provide an electronic device, which includes a unit or apparatus for executing the method described in any embodiment of the present application.
  • the above-mentioned electronic device is, for example, a chip.
  • Figure 1 is a schematic diagram of the hardware structure of an electronic device provided by this application.
  • FIG. 2 is a schematic diagram of the software architecture of an electronic device provided by this application.
  • Figure 3 is a schematic flow chart of an image recommendation method provided by this application.
  • FIG. 4 is a schematic diagram of an image generation process provided by this application.
  • FIG. 5 is a schematic diagram of another image generation process provided by this application.
  • Figure 6 is a schematic diagram of the skeletal position points of a human body provided by this application.
  • Figure 7 is a schematic diagram of the acquisition process of a personalized data set provided by this application.
  • FIGS 8-15 are schematic diagrams of the software architecture of yet another electronic device provided by this application.
  • FIGS 16-19 are schematic diagrams of some user interface embodiments provided by this application.
  • the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly specifying the quantity of indicated technical features. Therefore, the features defined as "first" and "second" may explicitly or implicitly include one or more of the features. In the description of the embodiments of this application, unless otherwise specified, "plurality" means two or more.
  • the electronic device can take multiple pictures in succession, and select pictures with better quality/clarity from these multiple pictures to recommend to the user.
  • the user can select the desired picture from these multiple pictures based on the recommended pictures, but the following technical problems still prevent users from obtaining satisfactory pictures conveniently and quickly.
  • Technical problem 1: the electronic device performs continuous shooting in chronological order, that is, it images within a limited shooting time; there may be situations where none of the captured pictures meets the user's needs.
  • This application provides a picture recommendation method, which is applied to electronic devices.
  • This method allows users to obtain satisfactory pictures conveniently and quickly, and improves user experience.
  • the electronic device can generate more pictures in the temporal and/or spatial dimensions based on the captured pictures for user selection, that is, add differentiated, high-quality candidate pictures within a limited shooting time, to solve technical problem 1 above.
  • the electronic device can also recommend pictures to the user from multiple dimensions such as a comprehensive dimension, the photographed subject's position (referred to as the subject position), the photographed subject's motion stretch, the photographed subject's facial expression, and the image quality, effectively optimizing the picture recommendation strategy to solve technical problem 2 above.
  • the electronic device can also update the picture recommendation strategy based on the pictures selected by the user (which can be understood as on-device self-learning), to achieve personalization and continuous updating of the picture recommendation strategy and solve technical problem 3 above. This allows users to obtain satisfactory pictures conveniently and quickly, improving the user experience.
  • the electronic device may be a mobile phone, a tablet computer, a handheld computer, a desktop computer, a laptop computer, an ultra-mobile personal computer (UMPC), a netbook, a cellular phone, a personal digital assistant (PDA),
  • a smart home device such as a smart TV or a smart camera,
  • a wearable device such as a smart bracelet, a smart watch, or smart glasses,
  • or an extended reality (XR) device such as an augmented reality (AR), virtual reality (VR), or mixed reality (MR) device, a vehicle-mounted device, or a smart city device.
  • FIG. 1 exemplarily shows a schematic diagram of the hardware structure of an electronic device 100 .
  • the electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2 , mobile communication module 150, wireless communication module 160, audio module 170, speaker 170A, receiver 170B, microphone 170C, headphone interface 170D, sensor module 180, button 190, motor 191, indicator 192, camera 193, display screen 194, and Subscriber identification module (SIM) card interface 195, etc.
  • the sensor module 180 may include a pressure sensor 180A, a gyro sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, and ambient light. Sensor 180L, bone conduction sensor 180M, etc.
  • the structure illustrated in the embodiment of the present invention does not constitute a specific limitation on the electronic device 100 .
  • the electronic device 100 may include more or fewer components than shown in the figures, or some components may be combined, some components may be separated, or some components may be arranged differently.
  • the components illustrated may be implemented in hardware, software, or a combination of software and hardware.
  • the processor 110 may include one or more processing units.
  • the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc.
  • the controller can generate operation control signals based on the instruction operation code and timing signals to complete the control of fetching and executing instructions.
  • the processor 110 may also be provided with a memory for storing instructions and data.
  • the memory in processor 110 is a cache memory. This memory may hold instructions or data that the processor 110 has just used or uses cyclically. If the processor 110 needs to use the instructions or data again, it can call them directly from this memory, which avoids repeated access, reduces the waiting time of the processor 110 and thus improves system efficiency.
  • processor 110 may include one or more interfaces.
  • Interfaces may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface, etc.
  • the charging management module 140 is used to receive charging input from the charger.
  • the power management module 141 is used to connect the battery 142, the charging management module 140 and the processor 110.
  • the power management module 141 receives input from the battery 142 and/or the charging management module 140, and supplies power to the processor 110, the internal memory 121, the display screen 194, the camera 193, the wireless communication module 160, and the like.
  • the wireless communication function of the electronic device 100 can be implemented through the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor and the baseband processor.
  • Antenna 1 and Antenna 2 are used to transmit and receive electromagnetic wave signals.
  • Each antenna in electronic device 100 may be used to cover a single communication frequency band or multiple communication frequency bands. Different antennas can also be multiplexed to improve antenna utilization. For example, antenna 1 can be multiplexed as a diversity antenna for a wireless local area network.
  • the mobile communication module 150 can provide wireless communication solutions including 2G/3G/4G/5G/6G applied on the electronic device 100 .
  • the mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (LNA), etc.
  • the mobile communication module 150 can receive electromagnetic waves through the antenna 1, perform filtering, amplification and other processing on the received electromagnetic waves, and transmit them to the modem processor for demodulation.
  • the mobile communication module 150 can also amplify the signal modulated by the modem processor and convert it into electromagnetic waves through the antenna 1 for radiation.
  • at least part of the functional modules of the mobile communication module 150 may be disposed in the processor 110 .
  • at least part of the functional modules of the mobile communication module 150 and at least part of the modules of the processor 110 may be provided in the same device.
  • a modem processor may include a modulator and a demodulator.
  • the modulator is used to modulate the low-frequency baseband signal to be sent into a medium-high frequency signal.
  • the demodulator is used to demodulate the received electromagnetic wave signal into a low-frequency baseband signal.
  • the demodulator then transmits the demodulated low-frequency baseband signal to the baseband processor for processing.
  • the application processor outputs sound signals through audio devices (not limited to speaker 170A, receiver 170B, etc.), or displays images or videos through display screen 194.
  • the modem processor may be a stand-alone device.
  • the modem processor may be independent of the processor 110 and may be provided in the same device as the mobile communication module 150 or other functional modules.
  • the wireless communication module 160 can provide wireless communication solutions applied on the electronic device 100, including wireless local area network (WLAN) (such as a wireless fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR) technology, etc.
  • the wireless communication module 160 may be one or more devices integrating at least one communication processing module.
  • the wireless communication module 160 receives electromagnetic waves via the antenna 2 , frequency modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110 .
  • the wireless communication module 160 can also receive the signal to be sent from the processor 110, frequency modulate it, amplify it, and convert it into electromagnetic waves through the antenna 2 for radiation.
  • the antenna 1 of the electronic device 100 is coupled to the mobile communication module 150, and the antenna 2 is coupled to the wireless communication module 160, so that the electronic device 100 can communicate with the network and other devices through wireless communication technology.
  • the above-mentioned wireless communication technologies may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technology, etc.
  • the GNSS may include a global positioning system (GPS), a global navigation satellite system (GLONASS), a BeiDou navigation satellite system (BDS), a quasi-zenith satellite system (QZSS) and/or a satellite based augmentation system (SBAS).
  • the electronic device 100 implements display functions through a GPU, a display screen 194, an application processor, and the like.
  • the GPU is an image processing microprocessor and is connected to the display screen 194 and the application processor. GPUs are used to perform mathematical and geometric calculations for graphics rendering.
  • Processor 110 may include one or more GPUs that execute program instructions to generate or alter display information.
  • the display screen 194 is used to display images, videos, etc.
  • Display 194 includes a display panel.
  • the display panel can use a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a MiniLED, a MicroLED, a Micro-OLED, a quantum dot light-emitting diode (QLED), etc.
  • the electronic device 100 may include 1 or N display screens 194, where N is a positive integer greater than 1.
  • the electronic device 100 can implement the shooting function through an ISP, a camera 193, a video codec, a GPU, a display screen 194, an application processor, and the like.
  • the ISP is used to process the data fed back by the camera 193. For example, when taking a photo, the shutter is opened and light is transmitted through the lens to the photosensitive element of the camera, where the optical signal is converted into an electrical signal; the photosensitive element passes the electrical signal to the ISP for processing, converting it into an image visible to the naked eye. The ISP can also perform algorithm optimization on image noise, brightness, color, etc., and can optimize parameters such as the exposure and color temperature of the shooting scene. In one implementation, the ISP may be provided in the camera 193.
  • Camera 193 is used to capture still images or video.
  • the object passes through the lens to produce an optical image that is projected onto the photosensitive element.
  • the photosensitive element can be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor.
  • the photosensitive element converts the optical signal into an electrical signal, and then passes the electrical signal to the ISP to convert it into a digital image signal.
  • ISP outputs digital image signals to DSP for processing.
  • DSP converts digital image signals into standard RGB, YUV and other format image signals.
  • the electronic device 100 may include 1 or N cameras 193, where N is a positive integer greater than 1.
  • the external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device 100.
  • Internal memory 121 may be used to store computer executable program code, which includes instructions.
  • the processor 110 executes various functional applications and data processing of the electronic device 100 by executing instructions stored in the internal memory 121 and/or instructions stored in a memory provided in the processor.
  • the electronic device 100 can implement audio functions through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the headphone interface 170D, and the application processor. Such as music playback, recording, etc.
  • the audio module 170 is used to convert digital audio information into analog audio signal output, and is also used to convert analog audio input into digital audio signals. Audio module 170 may also be used to encode and decode audio signals.
  • Speaker 170A, also called a "loudspeaker", is used to convert audio electrical signals into sound signals.
  • Receiver 170B, also called an "earpiece", is used to convert audio electrical signals into sound signals.
  • Microphone 170C, also called a "mic", is used to convert sound signals into electrical signals.
  • the headphone interface 170D is used to connect wired headphones.
  • the pressure sensor 180A is used to sense pressure signals and can convert the pressure signals into electrical signals.
  • the pressure sensor 180A may be disposed on the display screen 194 .
  • there are many types of pressure sensors 180A, such as resistive pressure sensors, inductive pressure sensors, capacitive pressure sensors, etc.
  • a capacitive pressure sensor may include at least two parallel plates of conductive material.
  • the electronic device 100 determines the intensity of the pressure based on the change in capacitance.
  • the electronic device 100 detects the intensity of the touch operation according to the pressure sensor 180A.
  • the electronic device 100 may also calculate the touched position based on the detection signal of the pressure sensor 180A.
  • the gyro sensor 180B may be used to determine the motion posture of the electronic device 100, for example, the angular velocity of the electronic device 100 about three axes (i.e., the x, y, and z axes).
  • Air pressure sensor 180C is used to measure air pressure.
  • Magnetic sensor 180D includes a Hall sensor.
  • the electronic device 100 may utilize the magnetic sensor 180D to detect opening and closing of the flip holster.
  • the acceleration sensor 180E can detect the acceleration of the electronic device 100 in various directions (generally three axes).
  • Distance sensor 180F for measuring distance.
  • Proximity light sensor 180G may include, for example, a light emitting diode (LED) and a light detector, such as a photodiode.
  • the light emitting diode may be an infrared light emitting diode.
  • the electronic device 100 emits infrared light outwardly through the light emitting diode.
  • Electronic device 100 uses photodiodes to detect infrared reflected light from nearby objects. When sufficient reflected light is detected, it can be determined that there is an object near the electronic device 100 . When insufficient reflected light is detected, the electronic device 100 may determine that there is no object near the electronic device 100 .
  • the ambient light sensor 180L is used to sense ambient light brightness.
  • Fingerprint sensor 180H is used to collect fingerprints.
  • the electronic device 100 can use the collected fingerprint characteristics to achieve fingerprint unlocking, access to application locks, fingerprint photography, fingerprint answering of incoming calls, etc.
  • Temperature sensor 180J is used to detect temperature.
  • Touch sensor 180K also known as "touch device”.
  • the touch sensor 180K can be disposed on the display screen 194.
  • the touch sensor 180K and the display screen 194 form a touch screen, which is also called a "touch screen”.
  • the touch sensor 180K is used to detect a touch operation on or near the touch sensor 180K.
  • the touch sensor can pass the detected touch operation to the application processor to determine the touch event type.
  • Visual output related to the touch operation may be provided through display screen 194 .
  • the touch sensor 180K may also be disposed on the surface of the electronic device 100 in a position different from that of the display screen 194 .
  • Bone conduction sensor 180M can acquire vibration signals.
  • the buttons 190 include a power button, a volume button, etc.
  • the motor 191 can generate vibration prompts.
  • the indicator 192 may be an indicator light, which may be used to indicate charging status, power changes, or may be used to indicate messages, missed calls, notifications, etc.
  • the SIM card interface 195 is used to connect a SIM card.
  • the software system of the electronic device 100 may adopt a layered architecture, an event-driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture.
  • the layered architecture software system can be the Android system, the Harmony operating system (operating system, OS), or other software systems.
  • the embodiment of this application takes the Android system with a layered architecture as an example to illustrate the software structure of the electronic device 100 .
  • FIG. 2 exemplarily shows a schematic diagram of the software architecture of the electronic device 100 .
  • the layered architecture divides the software into several layers, and each layer has clear roles and division of labor.
  • the layers communicate through software interfaces.
  • the Android system is divided into four layers, from top to bottom: application layer, application framework layer, Android runtime and system libraries, and kernel layer.
  • the application layer can include a series of application packages.
  • the application package can include camera, gallery, music, calendar, short message, call, navigation, Bluetooth, browser and other applications.
  • the application package in this application can also be replaced by other forms of software such as applets.
  • the application framework layer provides an application programming interface (API) and programming framework for applications in the application layer.
  • the application framework layer includes some predefined functions.
  • the application framework layer can include a window manager, content provider, view system, phone manager, resource manager, notification manager, etc.
  • a window manager is used to manage window programs.
  • the window manager can obtain the display size, determine whether there is a status bar, lock the screen, capture the screen, etc.
  • Content providers are used to store and retrieve data and make this data accessible to applications.
  • Said data can include videos, images, audio, calls made and received, browsing history and bookmarks, phone books, etc.
  • the view system includes visual controls, such as controls that display text, controls that display pictures, etc.
  • a view system can be used to build applications.
  • the display interface can be composed of one or more views.
  • a display interface including a text message notification icon may include a view for displaying text and a view for displaying pictures.
  • the phone manager is used to provide communication functions of the electronic device 100, for example, management of call status (including connected, hung up, etc.).
  • the resource manager provides various resources to applications, such as localized strings, icons, pictures, layout files, video files, etc.
  • the notification manager allows applications to display notification information in the status bar, which can be used to convey notification-type messages and can automatically disappear after a short stay without user interaction.
  • the notification manager is used to notify download completion, message reminders, etc.
  • the notification manager can also present notifications that appear in the status bar at the top of the system in the form of charts or scroll-bar text (such as notifications of applications running in the background), or notifications that appear on the screen in the form of dialog windows. For example, text information is prompted in the status bar, a beep sounds, the electronic device vibrates, or the indicator light flashes.
  • Android Runtime includes core libraries and virtual machines. Android runtime is responsible for the scheduling and management of the Android system.
  • the core library contains two parts: one is the functional functions that need to be called by the Java language, and the other is the core library of Android.
  • the application layer and application framework layer run in virtual machines.
  • the virtual machine executes the java files of the application layer and application framework layer into binary files.
  • the virtual machine is used to perform object life cycle management, stack management, thread management, security and exception management, and garbage collection and other functions.
  • System libraries can include multiple functional modules. For example: surface manager (surface manager), media libraries (Media Libraries), 3D graphics processing libraries (for example: OpenGL ES), 2D graphics engines (for example: SGL), etc.
  • the surface manager is used to manage the display subsystem and provides the fusion of 2D and 3D layers for multiple applications.
  • the media library supports playback and recording of a variety of commonly used audio and video formats, as well as static image files, etc.
  • the media library can support a variety of audio and video encoding formats, such as: MPEG4, H.264, MP3, AAC, AMR, JPG, PNG, etc.
  • the 3D graphics processing library is used to implement 3D graphics drawing, image rendering, composition, and layer processing.
  • 2D Graphics Engine is a drawing engine for 2D drawing.
  • the kernel layer is the layer between hardware and software.
  • the kernel layer contains at least display driver, camera driver, audio driver, and sensor driver.
  • the following exemplifies the workflow of the software and hardware of the electronic device 100 in conjunction with a photographing capture scenario.
  • when the touch sensor 180K receives a touch operation, the corresponding hardware interrupt is sent to the kernel layer.
  • the kernel layer processes touch operations into raw Input events (including touch coordinates, timestamp of touch operations and other information).
  • Raw input events are stored at the kernel level.
  • the application framework layer obtains the original input event from the kernel layer and identifies the control corresponding to the input event. Taking the touch operation as a touch click operation and the control corresponding to the click operation as a camera application icon control as an example, the camera application calls the interface of the application framework layer to start the camera application, and then starts the camera driver by calling the kernel layer.
  • Camera 193 captures still images or video.
  • Figure 3 is a schematic flowchart of an image recommendation method provided by an embodiment of the present application. This method can be applied to the electronic device 100 shown in FIG. 1 and FIG. 2. The method may include but is not limited to the following steps:
  • S101 The electronic device obtains the first picture sequence.
  • a picture sequence in this application refers to at least one picture.
  • the electronic device can capture the first sequence of pictures through a camera.
  • the electronic device may obtain a first sequence of pictures taken by the connected device.
  • the electronic device may obtain the first picture sequence from a memory of the electronic device, for example, obtain the first picture sequence in a gallery of the electronic device.
  • the electronic device can obtain the first picture sequence stored by the network device.
  • the electronic device can, in response to a user operation for selecting the first picture sequence in a cloud album, send a request message to the application server of the cloud album, and receive the first picture sequence sent by the application server.
  • the first picture sequence can also be obtained through at least two of the above embodiments.
  • for example, the electronic device obtains some pictures in the first picture sequence through a camera, and obtains the remaining pictures in the first picture sequence from a connected device that took them. This application does not limit the specific method of obtaining the first picture sequence.
  • S102 The electronic device generates a second picture sequence based on the first picture sequence.
  • the electronic device may generate a second picture sequence in a temporal dimension and/or a spatial dimension based on the first picture sequence.
  • the electronic device can generate at least one picture in the time dimension based on the first picture sequence, and the second picture sequence includes the at least one picture.
  • the electronic device can first obtain the shooting time of each picture in the first picture sequence (referred to as the timestamp for short); assuming the minimum and maximum timestamps are timestamp 1 and timestamp 2 respectively, the electronic device can then generate, based on these timestamps and the first picture sequence, at least one picture with a finer-grained timestamp, where the timestamp of each of the at least one picture is greater than timestamp 1, less than timestamp 2, and different from the timestamp of any picture in the first picture sequence; the implementation is, for example, similar to video frame interpolation.
  • An example of the above process can be seen in Figure 4 below.
  • the first picture sequence may include four pictures: picture 1, picture 2, picture 3 and picture 4.
  • the timestamps of these four pictures, from smallest to largest, are t1, t2, t3 and t4.
  • the electronic device can generate three pictures in the time dimension based on the first picture sequence: picture 5 with timestamp t5 between t1 and t2, picture 6 with timestamp t6 between t2 and t3, and picture 7 with timestamp t7 between t3 and t4.
  • for any two adjacent timestamps, the electronic device can generate multiple pictures with timestamps between the two timestamps, for example, generate multiple pictures with timestamps between t1 and t2; alternatively, the electronic device may generate no picture between the two timestamps, for example, picture 5 with timestamp t5 between t1 and t2 is not generated. This application does not limit the specific generation method.
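  • A minimal sketch (assuming a learned frame-interpolation model, here replaced by a naive linear blend) of generating pictures in the time dimension as in the Figure 4 example: for each pair of adjacent captured pictures, a picture with an intermediate timestamp is synthesized.

```python
import numpy as np

def interpolate_frame(img_a: np.ndarray, img_b: np.ndarray, alpha: float) -> np.ndarray:
    """Placeholder interpolator: a learned frame-interpolation network would go here."""
    return ((1.0 - alpha) * img_a + alpha * img_b).astype(img_a.dtype)

def generate_time_dimension(pictures: list[tuple[float, np.ndarray]]) -> list[tuple[float, np.ndarray]]:
    """pictures: (timestamp, image) pairs sorted by timestamp, e.g. [(t1, p1), ..., (t4, p4)].
    Returns pictures whose timestamps lie strictly between adjacent captured timestamps."""
    generated = []
    for (t_a, img_a), (t_b, img_b) in zip(pictures, pictures[1:]):
        t_new = (t_a + t_b) / 2.0            # e.g. t5 between t1 and t2
        alpha = (t_new - t_a) / (t_b - t_a)  # relative position of the new timestamp
        generated.append((t_new, interpolate_frame(img_a, img_b, alpha)))
    return generated
```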
  • the electronic device can generate at least one picture in the spatial dimension based on the first picture sequence, for example, perform new-perspective synthesis based on a neural radiance field (NeRF) to obtain the at least one picture.
  • the second picture sequence includes the at least one picture.
  • taking any picture in the first picture sequence as a reference picture, the electronic device can generate one or more pictures with different viewing angles, where the timestamp of the one or more pictures is the timestamp of the reference picture.
  • the observation perspective of any one of the one or more pictures is different from the observation perspective of the reference picture; if multiple pictures are generated, these multiple pictures correspond to different observation perspectives.
  • this process can be understood as synthesizing new perspectives from different observation angles for a fixed timestamp, thereby obtaining at least one picture with a new observation perspective.
  • the electronic device may use some or all of the pictures in the first picture sequence as reference pictures to generate at least one picture with more viewing angles.
  • An example of the above process can be seen in Figure 5 below.
  • Figure 5 uses picture 1 in Figure 4 as a reference picture as an example.
  • the subject in picture 1 is human body 1.
  • the human body 1 can be abstracted as a cube, and the human body 1/the cube can be observed from different perspectives, for example but not limited to: observing the front of the human body 1 from the front perspective, observing the back of the human body 1 from the rear perspective, observing the left side of the human body 1 from the left perspective, observing the right side of the human body 1 from the right perspective, etc.
  • picture 1, as the reference picture, is a picture of the human body 1 observed from the front perspective at timestamp t1.
  • based on picture 1, the electronic device can generate picture 8, a picture of the human body 1 observed from the rear perspective at the same timestamp.
  • the electronic device can generate pictures with more or fewer viewing angles.
  • for example, the electronic device can also generate a picture observed from a top-down perspective. This application does not limit the specific way of generating the pictures obtained by observing the human body 1 from different perspectives.
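  • A minimal sketch (assumed function names; the renderer is a placeholder for a NeRF-style model trained on the scene) of the new-perspective synthesis in the Figure 5 example: the reference picture's camera pose is rotated around the subject to obtain new observation angles, and each new pose is rendered at the reference picture's timestamp.

```python
import numpy as np

def rotate_pose_around_subject(pose: np.ndarray, angle_deg: float) -> np.ndarray:
    """Rotate a 4x4 camera pose about a vertical axis through the subject (at the origin)."""
    a = np.deg2rad(angle_deg)
    rotation = np.array([[np.cos(a),  0.0, np.sin(a), 0.0],
                         [0.0,        1.0, 0.0,       0.0],
                         [-np.sin(a), 0.0, np.cos(a), 0.0],
                         [0.0,        0.0, 0.0,       1.0]])
    return rotation @ pose

def render_view(pose: np.ndarray, timestamp: float) -> np.ndarray:
    """Placeholder: a trained neural radiance field would render the picture here."""
    return np.zeros((256, 256, 3), dtype=np.uint8)

reference_pose = np.eye(4)   # pose of picture 1 (front perspective) at timestamp t1
t1 = 0.0
# e.g. left, rear and right perspectives of human body 1 at the same timestamp
new_views = {angle: render_view(rotate_pose_around_subject(reference_pose, angle), t1)
             for angle in (90.0, 180.0, 270.0)}
```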
  • S103 The electronic device uses the aesthetic evaluation model to score each picture in the first picture sequence and the second picture sequence, and obtains the third picture sequence with higher scores.
  • the electronic device can use each picture in the first picture sequence and the second picture sequence as the input of the aesthetic evaluation model to obtain a corresponding output.
  • the output can include scores of the picture in multiple dimensions.
  • the multiple dimensions include, for example but are not limited to, the following: comprehensive (the corresponding score is called the comprehensive score), subject position (the corresponding score is called the subject position score), the photographed subject's motion stretch (the corresponding score is called the motion stretch score), the photographed subject's expression (the corresponding score is called the expression score), and image quality (the corresponding score is called the image quality score).
  • the aesthetic evaluation model can score pictures from multiple dimensions. Next, any picture in the first picture sequence and the second picture sequence, referred to as the first picture, is taken as an example to illustrate the scoring method of the aesthetic evaluation model.
  • the aesthetic evaluation model may determine the subject position score based on the rate of change of the speed of the subject in the first picture (i.e., the acceleration), where the acceleration may be determined based on the speed of the subject in the first picture and in adjacent pictures; the adjacent pictures belong to the first picture sequence and the second picture sequence and are, for example but not limited to, pictures whose timestamps differ from the timestamp of the first picture by no more than a preset threshold. For example, when the change trend of the acceleration of the subject in the first picture is decreasing and the value is 0, the aesthetic evaluation model can consider that the subject in the first picture is at the highest point of its motion, so the subject position score of the first picture can be set to the maximum value.
  • the above-mentioned changing trend may include: the acceleration in the upward direction gradually decreases from a positive number (gradually approaching 0).
  • the above-mentioned changing trend may be obtained by comparison with the acceleration of the subject in a previous picture, where the previous picture belongs to the first picture sequence and the second picture sequence and its timestamp is smaller than the timestamp of the first picture.
  • the aesthetic evaluation model can also determine the subject position score based on the size of the area occupied by the subject in the first picture. For example, the larger the occupied area, the greater the subject position score.
  • the aesthetic evaluation model can also determine the subject position score based on the position priority of the subject in the first picture; for example, when the subject is in the middle of the picture, the position priority is the highest, so the subject position score can be set to the maximum value. This application does not limit this. A rough sketch combining these cues is given below.
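  • As a rough sketch (the weights and the exact combination are illustrative assumptions, not defined by this application), the subject position score can combine the three cues described above: whether the subject is at the apex of its motion, the area it occupies, and its position priority.

```python
def subject_position_score(at_motion_apex: bool,
                           subject_area: float, picture_area: float,
                           subject_center: tuple[float, float],
                           picture_size: tuple[int, int]) -> float:
    """Illustrative combination of apex, occupied-area and position-priority cues."""
    area_term = subject_area / picture_area  # larger subject -> higher score
    cx, cy = subject_center
    width, height = picture_size
    # position priority: 1.0 at the picture center, falling toward 0.0 at the edges
    center_term = 1.0 - (abs(cx - width / 2) / (width / 2)
                         + abs(cy - height / 2) / (height / 2)) / 2.0
    apex_term = 1.0 if at_motion_apex else 0.0
    return 0.4 * apex_term + 0.3 * area_term + 0.3 * center_term  # assumed weights
```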
  • the aesthetic evaluation model may determine the motion stretch score based on the distances between the skeletal position points of the subject in the first picture. For example, the larger the distances, the more stretched the aesthetic evaluation model considers the subject's motion in the first picture to be, and the greater the motion stretch score; the smaller the distances (for example, when the limbs are not spread out as the human body jumps), the smaller the motion stretch score.
  • an example of the skeletal position points of the subject can be seen in Figure 6.
  • the subject shown in Figure 6 is a human body.
  • the human body can include multiple skeletal position points, such as a head point, a neck point, left/right shoulder points, left/right elbow points, left/right hand points, left/right hip points, left/right knee points, left/right foot points, etc.
  • the above distance includes, for example, the distance between any two skeletal position points as shown in Figure 6 .
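  • As an illustration of the movement stretch score described above, the following sketch averages the pairwise distances between skeletal position points and normalizes by the image diagonal; the keypoint format and the normalization are assumptions made for illustration, not part of this application.

```python
from itertools import combinations
import math

def movement_stretch_score(keypoints, image_width, image_height):
    """keypoints: dict mapping point names (e.g. 'left_hand') to (x, y) pixel positions."""
    diagonal = math.hypot(image_width, image_height)
    distances = [math.hypot(ax - bx, ay - by)
                 for (ax, ay), (bx, by) in combinations(keypoints.values(), 2)]
    if not distances:
        return 0.0
    # Larger distances between skeletal position points -> more stretched pose -> higher score.
    return min(sum(distances) / len(distances) / diagonal, 1.0)
```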
  • the aesthetic evaluation model can determine the expression score based on the facial expression of the subject in the first picture. For example, when the facial expression is a better expression such as smiling, laughing, or eyes wide open, the obtained expression score is larger; when the facial expression is a poor expression such as closed eyes, the obtained expression score is smaller.
  • the above-mentioned better or worse expressions can be preset or determined in response to user operations, and can also be obtained by learning user preferences.
  • the aesthetic evaluation model can determine the image quality score based on the image quality of the first picture. For example, when the image quality of the first picture is higher, the obtained image quality score is larger; when the image quality of the first picture is lower (for example, when the subject suffers from motion blur), the obtained image quality score is smaller.
  • image quality can include, but is not limited to, dynamic range, saturation, contrast, sharpness, etc.
  • the comprehensive score can be a score obtained by combining multiple indicators such as image quality, subject position, movement stretch, expression, and whether there are moving objects (for example, whether there are passers-by).
  • the electronic device can separately sort and filter the output of the aesthetic evaluation model, that is, the scores of the pictures in the first picture sequence and the second picture sequence in multiple dimensions, and obtain a third picture sequence with higher scores.
  • the third picture sequence may include a plurality of picture sequences with higher scores in the above-mentioned multiple dimensions.
  • take the case where the scores in multiple dimensions are the comprehensive score, subject position score, movement stretch score, expression score and image quality score of the above examples. The third picture sequence may include: picture sequence 1 whose comprehensive scores rank in the top N1, picture sequence 2 whose subject position scores rank in the top N2, picture sequence 3 whose movement stretch scores rank in the top N3, picture sequence 4 whose expression scores rank in the top N4, and picture sequence 5 whose image quality scores rank in the top N5, where Ni is a positive integer and i is a positive integer less than 6.
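  • The sorting and filtering described above can be illustrated with the following sketch, which keeps the top-Ni pictures for each dimension to form the per-dimension picture sequences of the third picture sequence. The data layout is an assumption made for illustration only.

```python
def build_third_picture_sequence(scored_pictures, top_n):
    """
    scored_pictures: list of dicts, e.g. {'picture': ..., 'comprehensive': 0.8,
                     'subject_position': 0.7, 'movement_stretch': 0.6,
                     'expression': 0.9, 'image_quality': 0.75}
    top_n: dict mapping each dimension name to N_i, e.g. {'comprehensive': 5, ...}
    """
    third_sequence = {}
    for dimension, n in top_n.items():
        # Sort by the score in this dimension, from high to low, and keep the top N_i.
        ranked = sorted(scored_pictures, key=lambda item: item[dimension], reverse=True)
        third_sequence[dimension] = ranked[:n]   # picture sequence for this dimension
    return third_sequence
```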
  • S104 The electronic device displays the third picture sequence.
  • when displaying at least one picture in the first picture sequence and/or the second picture sequence, the electronic device may preferentially display/highlight the third picture sequence therein; for example, the pictures in the third picture sequence appear before other pictures.
  • when the electronic device displays the third picture sequence, in response to a user operation (such as a user operation of returning to the picture list interface of the gallery application), the electronic device may display at least one picture in the first picture sequence and/or the second picture sequence.
  • when the electronic device displays the third picture sequence, it may preferentially display/highlight pictures with higher scores. For example, when the electronic device displays picture sequence 1, whose comprehensive scores rank in the top N1 of the third picture sequence, the pictures can be displayed from front to back in order of comprehensive score from high to low; that is, the picture with the highest comprehensive score is displayed first, the picture with the second highest comprehensive score is displayed second, and so on.
  • the pictures with higher scores may be displayed in order of score from high to low, or in chronological order.
  • S105 The electronic device receives a user operation for selecting at least one picture (which may be called a fourth picture sequence).
  • S105 is an optional step.
  • in one embodiment, when the electronic device displays the third picture sequence, it may receive a user operation for selecting the fourth picture sequence from the third picture sequence. In another embodiment, when the electronic device displays the third picture sequence, it also displays other pictures in the first picture sequence and/or the second picture sequence, and the electronic device may receive a user operation for selecting the fourth picture sequence from the third picture sequence and/or the other pictures.
  • the fourth picture sequence may include pictures in the first picture sequence. In one implementation, the fourth picture sequence may include pictures in the second picture sequence. In one implementation, the fourth picture sequence may include pictures in the third picture sequence.
  • the electronic device may set the priority of the pictures in the fourth picture sequence according to the received user operation. For example, the priority of the picture selected by the user first is higher than the priority of the picture selected by the user later.
  • the electronic device can determine the dimension corresponding to the picture in the fourth picture sequence according to the received user operation, and the dimension can be any one of the multiple dimensions described in S103. For example, when the electronic device displays picture sequence 1 with a high comprehensive score, it receives a user operation for selecting picture A. Therefore, the dimension corresponding to picture A is comprehensive.
  • S106 The electronic device updates the aesthetic evaluation model based on the third picture sequence and the at least one picture selected by the user (i.e., the fourth picture sequence).
  • the electronic device can set the score of the picture in the fourth picture sequence based on the score of the picture in the third picture sequence.
  • the fourth picture sequence and the corresponding score can be called a personalized data set.
  • the personalized data set can be used to update the aesthetic evaluation model.
  • take the case where the fourth picture sequence includes M pictures (M is a positive integer) whose corresponding dimension is the comprehensive dimension as an example for explanation. If these M pictures are not picture sequence 1 (the pictures with higher comprehensive scores in the third picture sequence), the electronic device can set the comprehensive scores corresponding to these M pictures to the comprehensive scores of the top M pictures in picture sequence 1, and these M pictures and the corresponding comprehensive scores may belong to the personalized data set. Here, "the M pictures are not picture sequence 1" may include: at least one of the M pictures does not belong to picture sequence 1, and/or the priority order of at least one of the M pictures is different from the rank of that picture's comprehensive score in picture sequence 1.
  • for any picture among the above M pictures (which can be called the second picture), assume that the priority of the second picture ranks Kth among these M pictures (K is a positive integer less than or equal to M). If the second picture does not belong to picture sequence 1, or if the second picture belongs to picture sequence 1 but the picture whose comprehensive score ranks Kth in picture sequence 1 is not the second picture, the electronic device can set the comprehensive score corresponding to the second picture to the comprehensive score of the picture (which may be called the third picture) whose comprehensive score ranks Kth in picture sequence 1.
  • when the electronic device sets the comprehensive scores corresponding to the M pictures, for any one of the M pictures, if the picture belongs to picture sequence 1 and the priority order of the picture is consistent with the rank of the picture's comprehensive score in picture sequence 1, the electronic device does not need to set the comprehensive score corresponding to the picture.
  • an example of how the electronic device sets the scores of the pictures in the fourth picture sequence can be seen in Figure 7 below and is not described in detail here.
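  • For reference, the calibration rule described above (the user-selected picture whose priority ranks Kth inherits the Kth-ranked comprehensive score from picture sequence 1, and pictures whose priority already matches the ranking are skipped) can be sketched as follows; the data structures are illustrative assumptions, not the implementation of this application.

```python
def build_personalized_dataset(selected_pictures, picture_sequence_1):
    """
    selected_pictures: the M user-selected pictures, ordered by selection priority.
    picture_sequence_1: list of (picture, comprehensive_score), sorted from high to low.
    Returns (picture, calibrated_score) pairs forming the personalized data set.
    """
    dataset = []
    for k, picture in enumerate(selected_pictures[:len(picture_sequence_1)]):
        ref_picture, ref_score = picture_sequence_1[k]   # Kth-ranked picture and its score
        if picture is ref_picture:
            # Priority already matches the model's ranking: no calibration needed.
            continue
        dataset.append((picture, ref_score))
    return dataset
```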
  • the updated aesthetic evaluation model can be used to score pictures subsequently. For example, after S101-S106 shown in Figure 3 are executed, the updated aesthetic evaluation model can be obtained, and then the electronic device can execute S101-S104 again (the picture sequences at this time may be different from the previous picture sequences). At this time, the aesthetic evaluation model used in S103 may be the above-mentioned updated aesthetic evaluation model.
  • S102 and/or S103 may also be executed by a network device connected to the electronic device.
  • the electronic device may send the first picture sequence to the network device,
  • the network device may perform S102 and S103, and then send the third picture sequence to the electronic device for display.
  • in this application, the electronic device can generate the second picture sequence in the time and/or space dimensions, select a third picture sequence with higher quality from the first picture sequence and the second picture sequence based on multiple dimensions such as the comprehensive dimension, subject position, movement stretch, expression, and image quality, and recommend it to the user, thereby optimizing the picture recommendation strategy so that the user can obtain satisfactory pictures conveniently and quickly.
  • the first picture sequence and the second picture sequence can also be displayed as candidate pictures, thereby increasing the probability that the user obtains the desired image.
  • the electronic device can update the picture recommendation strategy based on the pictures selected by the user, recommend different pictures for different users, and further increase the probability that the user obtains the desired image.
  • Figure 7 exemplarily shows a schematic diagram of the acquisition process of a personalized data set.
  • the third picture sequence includes picture sequence 1 with the top 2 comprehensive scores and picture sequence 2 with the subject position score in the top 3.
  • the comprehensive score 1 of picture 11 is higher than the comprehensive score 2 of picture 12.
  • the order from high to low according to the subject position score is: picture 21 (corresponding to subject position score 1), picture 22 (corresponding to subject position score 2), picture 23 (corresponding to subject position score 3).
  • the fourth picture sequence includes picture 11, picture 22 and picture 24, where the corresponding dimension of picture 11 is comprehensive, the corresponding dimension of picture 22 and picture 24 is subject position, and picture 22 has a higher priority than picture 24.
  • since picture 11, which corresponds to the comprehensive dimension in the fourth picture sequence, belongs to picture sequence 1, and the priority order of picture 11 and the rank of picture 11's comprehensive score in picture sequence 1 are both first, it can be understood that the comprehensive score obtained by the aesthetic evaluation model meets the user's needs. Therefore, the personalized data set does not need to include picture 11 and the corresponding comprehensive score 1.
  • the electronic device can set the subject position score corresponding to picture 22 in the fourth picture sequence to subject position score 1 of picture 21, whose subject position score ranks first in picture sequence 2.
  • the personalized data set can include picture 22 and the corresponding subject position score 1.
  • similarly, the electronic device can set the subject position score corresponding to picture 24 in the fourth picture sequence to subject position score 2 of picture 22, whose subject position score ranks second in picture sequence 2.
  • the personalized data set may include the picture 24 and the corresponding subject position score 2.
  • FIG. 8 exemplarily shows a schematic diagram of the software architecture of yet another electronic device 100 .
  • the electronic device 100 may include a picture generation module 200 , a picture recommendation module 300 , a user selection module 400 , a storage module 500 , a personalized learning module 600 and a picture library 700 , where the picture library 700 may include a picture sequence 701 , the picture sequence 701 is, for example, a plurality of pictures continuously taken by the electronic device 100 in response to user operations.
  • the picture generation module 200 may receive the picture sequence 701 (as input), generate a picture sequence 702 with new timestamps and/or new observation perspectives in the temporal and/or spatial dimensions according to the picture sequence 701, and output the picture sequence 702 to the picture library 700.
  • the picture generation module 200 may be used to perform S102 in FIG. 3 .
  • the picture recommendation module 300 can receive the picture library 700 (as input), use the aesthetic evaluation model to score each picture in the picture library 700 from multiple dimensions such as the comprehensive dimension, subject position, movement stretch, expression, and image quality, and output a picture sequence 703 with higher scores. In one implementation, the picture recommendation module 300 may be used to perform S103 in FIG. 3.
  • the electronic device 100 may display a sequence of pictures 703 to recommend to the user for selection.
  • while the picture sequence 703 and, optionally, other pictures in the picture library 700 (as input) are displayed, the user selection module 400 can select the picture sequence 704 (as output) from the displayed pictures according to user operations. In one implementation, the user selection module 400 may be used to perform S105 in FIG. 3.
  • the storage module 500 can store the picture sequence 704 output by the user selection module 400. In one implementation, the storage module 500 can also delete pictures other than the picture sequence 704 in the picture library 700.
  • the personalized learning module 600 can receive the picture sequence 703 and the picture sequence 704 (as input), compare the picture sequence 703 and the picture sequence 704 to obtain a personalized data set, and train the pre-update/historical aesthetic evaluation model based on the personalized data set (for example, periodic training) to obtain an updated aesthetic evaluation model (as output).
  • the aesthetic evaluation model before updating may be sent by the picture recommendation module 300 to the personalized learning module 600 as input.
  • the updated aesthetic evaluation model can be sent to the picture recommendation module 300 for use.
  • the personalized learning module 600 may be used to perform S106 in FIG. 3 .
  • the picture generation module 200 in the electronic device 100 shown in FIG. 8 is introduced as an example.
  • FIG. 9 exemplarily shows a schematic diagram of the software architecture of yet another electronic device 100 .
  • the picture generation module 200 of the electronic device 100 may include a time perception module 201 and a space perception module 202 .
  • the time perception module 201 can receive the picture sequence 701 in the picture library 700 as input, and generate a picture sequence 705 with new timestamps in the time dimension according to the picture sequence 701 (as output).
  • the time perception module 201 is implemented based on video frame insertion, for example.
  • the spatial perception module 202 can receive the picture sequence 701 in the picture library 700 and the picture sequence 705 output by the time perception module 201 as input, and generate a picture sequence 706 of a new observation perspective in the spatial dimension according to the picture sequence 701 and the picture sequence 705 (as output).
  • the spatial perception module 202 is implemented based on NeRF, for example.
  • the picture sequence 705 and the picture sequence 706 can be output to the picture library 700 to form the picture sequence 702.
  • the picture sequence 702 can be the union of the picture sequence 705 and the picture sequence 706.
  • the spatial perception module 202 shown in Figure 9 may include a model training module 202A, a parameter extraction module 202B, a spatial perception model 202C and a new parameter generation module 202D.
  • the process of the spatial perception module 202 generating the picture sequence 706 based on the picture sequence 701 and the picture sequence 705 may include two steps: online training and picture generation, as detailed below.
  • the parameter extraction module 202B can receive the picture sequence 701 and the picture sequence 705 as input, and output the spatial parameters of each picture in the picture sequence 701 and the picture sequence 705.
  • the spatial parameters include, for example but are not limited to: the coordinates of the scene shown in the picture (which may be referred to as spatial coordinates for short, for example expressed as (x, y, z) in a spatial rectangular coordinate system/world coordinate system), and the pose of the camera of the electronic device 100 (which may be referred to as the camera pose for short, and can also be understood as the observation direction, for example expressed as (θ, φ), where θ and φ are the azimuth angle and polar angle in a spherical coordinate system, respectively).
  • the spatial perception model 202C can receive the spatial parameters output by the parameter extraction module 202B as input, and output a picture sequence 707 corresponding to these spatial parameters respectively, wherein, for any picture in the picture sequence 701 and the picture sequence 705 (which can be called picture B), the spatial perception model 202C can receive spatial parameter 1 of picture B as input, and output picture C in the picture sequence 707.
  • picture C can be understood as the picture corresponding to spatial parameter 1 "simulated" by the spatial perception model 202C.
  • the model training module 202A can receive the picture sequence 701, the picture sequence 705, and the picture sequence 707 output by the spatial perception model 202C as input, compare each picture in the picture sequence 701 and the picture sequence 705 with the corresponding picture in the picture sequence 707 based on a loss function, and train the spatial perception model 202C according to the comparison results to obtain an updated spatial perception model 202C (for example, specifically obtain the weights of the model), where, for any picture in the picture sequence 701 and the picture sequence 705 (picture B), the corresponding picture in the picture sequence 707 is the output obtained by using spatial parameter 1 of picture B as the input of the spatial perception model 202C before the update, that is, picture C.
  • the above process can be called a training process.
  • Multiple training processes can be performed to obtain multiple updated spatial perception models 202C.
  • the weight of the spatial perception model 202C before the first update is W0
  • the weight of the spatial perception model 202C after the first update is W1.
  • the parameter extraction module 202B, the model training module 202A and the spatial perception model 202C with weight W1 can perform the training process again (at this time, the output of the spatial perception model 202C may not be the picture sequence 707) to perform the second update and obtain the weight W2 of the spatial perception model 202C after the second update. Through multiple rounds of iterations, the weight Wn of the spatial perception model 202C after multiple updates is obtained, where n is the number of updates.
  • the spatial perception model 202C updated multiple times is used to perform the following steps of image generation.
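  • A minimal sketch of the online training loop described above is given below, assuming a PyTorch-style framework (which this application does not prescribe) and a simple photometric (mean squared error) loss; each pass over the pictures corresponds to one weight update Wi -> Wi+1.

```python
import torch

def online_training(model, pictures, spatial_params, num_updates, lr=5e-4):
    """pictures: target tensors from picture sequences 701 and 705;
       spatial_params: their extracted spatial parameters (coordinates + camera pose)."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(num_updates):              # each pass updates the weights once (W_i -> W_{i+1})
        for target, params in zip(pictures, spatial_params):
            rendered = model(params)          # the "simulated" picture for these parameters
            loss = torch.nn.functional.mse_loss(rendered, target)   # photometric loss
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model                              # weights Wn after n updates
```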
  • the new parameter generation module 202D can receive the spatial parameters of each picture in the picture sequence 701 and the picture sequence 705 output by the parameter extraction module 202B as input, and output different spatial parameters.
  • for example, the new parameter generation module 202D can receive spatial parameter 1 of picture B in the picture sequence 701 and the picture sequence 705 output by the parameter extraction module 202B as input, and output a spatial parameter 2 that is different from spatial parameter 1.
  • for example, spatial parameter 1 includes spatial coordinate 1 and camera pose 1, and spatial parameter 2 includes spatial coordinate 1 and camera pose 2.
  • the spatial perception model 202C after multiple updates can receive the spatial parameters output by the new parameter generation module 202D as input, and output the picture sequence 706 corresponding to these spatial parameters respectively.
  • for example, the spatial perception model 202C after multiple updates can output the picture corresponding to spatial parameter 2.
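  • The picture generation step can be sketched as follows: the spatial coordinates are kept, the camera pose is perturbed to obtain spatial parameter 2, and the trained model is queried for the new observation perspective. The pose offset value and the callable model interface are assumptions made for illustration.

```python
def generate_new_view(trained_model, spatial_param_1, pose_offset=(0.1, 0.0)):
    """spatial_param_1: (spatial_coordinates, (theta, phi)) of an existing picture."""
    coords, (theta, phi) = spatial_param_1                  # spatial coordinate 1, camera pose 1
    camera_pose_2 = (theta + pose_offset[0], phi + pose_offset[1])
    spatial_param_2 = (coords, camera_pose_2)               # same coordinates, new observation direction
    return trained_model(spatial_param_2)                   # picture for the new observation perspective
```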
  • in one implementation, the input of the spatial perception module 202 is only the picture sequence 701 in the picture library 700, which can be understood as: the time perception module 201 and the spatial perception module 202 are two independent modules in the picture generation module 200.
  • the spatial perception model 202C shown in Figure 10 may include two independent multilayer perceptrons (MLPs) (MLP1 and MLP2) and a volume rendering module.
  • FIG. 11 takes the case where the spatial parameters input to the spatial perception model 202C include the camera pose and spatial coordinates as an example for illustration.
  • MLP1 can receive spatial coordinates in spatial parameters as input for feature extraction, and output intermediate features and spatial density.
  • MLP2 can receive the camera pose in the spatial parameters and the intermediate features output by MLP1 as input for feature extraction and output color information (color).
  • the volume rendering module can receive the spatial density output by MLP1 and the color information output by MLP2 as input, perform volume rendering, and output pictures corresponding to the above spatial parameters.
  • learnable parameters are set in MLP1 and/or MLP2, and the above online training can specifically train MLP1 and/or MLP2.
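  • The two-MLP structure plus volume rendering described above resembles a NeRF-style renderer. The following sketch, with assumed layer sizes, unit sample spacing and a PyTorch implementation, illustrates the data flow for a single camera ray; it is not the exact network of this application.

```python
import torch
import torch.nn as nn

class SpatialPerceptionSketch(nn.Module):
    """Assumed sizes; coords are sample points along one camera ray, view_dir is (azimuth, polar)."""
    def __init__(self, hidden=256):
        super().__init__()
        self.mlp1 = nn.Sequential(nn.Linear(3, hidden), nn.ReLU(),
                                  nn.Linear(hidden, hidden + 1))          # intermediate features + density
        self.mlp2 = nn.Sequential(nn.Linear(hidden + 2, hidden // 2), nn.ReLU(),
                                  nn.Linear(hidden // 2, 3))              # RGB color

    def forward(self, coords, view_dir):
        out = self.mlp1(coords)                                           # MLP1: spatial coordinates in
        features, density = out[:, :-1], torch.relu(out[:, -1])
        color = torch.sigmoid(self.mlp2(torch.cat(                        # MLP2: camera pose + features in
            [features, view_dir.expand(coords.shape[0], 2)], dim=-1)))
        # Simplified volume rendering: alpha-composite the samples along the ray.
        alpha = 1.0 - torch.exp(-density)                                 # assumes unit spacing between samples
        transmittance = torch.cumprod(
            torch.cat([torch.ones(1), 1.0 - alpha + 1e-10])[:-1], dim=0)
        weights = alpha * transmittance
        return (weights.unsqueeze(-1) * color).sum(dim=0)                 # rendered pixel color
```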
  • the time perception module 201 and the space perception module 202 can be integrated together. For details, see the architecture of the electronic device 100 shown in FIG. 12 .
  • the picture generation module 200 of the electronic device 100 may include a model training module 203 , a parameter extraction module 204 , a spatiotemporal perception model 205 and a new parameter generation module 206 .
  • the picture generation module 200 can receive the picture sequence 701 in the picture library 700 as input, generate a picture sequence 702 with new timestamps and new observation perspectives in the time dimension and the spatial dimension according to the picture sequence 701, and output it to the picture library 700.
  • This process can include two steps: online training and image generation, as shown below.
  • the parameter extraction module 204 can receive the picture sequence 701 as input and output the spatial parameters and temporal parameters of each picture in the picture sequence 701.
  • the spatial parameters include, but are not limited to, spatial coordinates and camera posture.
  • the time parameters include, but are not limited to, a timestamp and/or a time embedding. The time embedding of any picture can be determined based on the timestamp of the picture; for example, a Fourier transform is performed on the timestamp to obtain a high-dimensional vector (for example, a 128-dimensional vector), and this high-dimensional vector is the determined time embedding.
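  • A common way to realize such a time embedding is a sinusoidal (Fourier-feature) encoding of the timestamp, sketched below; the 128-dimension size matches the example above, while the frequency schedule is an assumption made for illustration.

```python
import math

def time_embedding(timestamp, dim=128):
    """Map a scalar timestamp to a dim-dimensional sine/cosine feature vector."""
    assert dim % 2 == 0
    embedding = []
    for i in range(dim // 2):
        freq = 1.0 / (10000.0 ** (2.0 * i / dim))   # assumed frequency schedule
        embedding.append(math.sin(freq * timestamp))
        embedding.append(math.cos(freq * timestamp))
    return embedding                                # the time embedding (length dim)
```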
  • the spatio-temporal perception model 205 can receive the spatial parameters and temporal parameters output by the parameter extraction module 204 as input, and output a picture sequence 708 corresponding to these spatial parameters and temporal parameters respectively, wherein, for any picture in the picture sequence 701 (which can be called picture D), the spatio-temporal perception model 205 can receive spatial parameter 3 and temporal parameter 1 of picture D as input, and output picture E in the picture sequence 708.
  • picture E can be understood as the picture corresponding to spatial parameter 3 and temporal parameter 1 "simulated" by the spatio-temporal perception model 205.
  • the model training module 203 can receive the picture sequence 701 and the picture sequence 708 output by the spatio-temporal perception model 205 as input, compare each picture in the picture sequence 701 with the corresponding picture in the picture sequence 708 based on a loss function, and train the spatio-temporal perception model 205 according to the comparison results to obtain an updated spatio-temporal perception model 205 (for example, specifically obtain the weights of the model).
  • the above process can be called a training process. Multiple training processes can be performed to obtain multiple updated spatio-temporal perception models 205.
  • the specific examples are similar to the online training examples described in Figure 10 and will not be described again.
  • the spatio-temporal perception model 205 updated multiple times is used to perform the following steps of image generation.
  • the new parameter generation module 206 can receive the spatial parameters and time parameters of each picture in the picture sequence 701 output by the parameter extraction module 204 as input, and output different spatial parameters and different time parameters. For example, the new parameter generation module 206 may receive spatial parameter 3 and temporal parameter 1 of picture D in the picture sequence 701 output by the parameter extraction module 204 as input, and output a different spatial parameter 4 and a different temporal parameter 2.
  • the spatio-temporal perception model 205 after multiple updates can receive the spatial parameters and temporal parameters output by the parameter extraction module 204 and the new parameter generation module 206 as input, and output the picture sequence 702 corresponding to these spatial parameters and temporal parameters respectively. For example, the spatio-temporal perception model 205 after multiple updates can receive spatial parameter 3 and temporal parameter 1 of picture D in the picture sequence 701 output by the parameter extraction module 204, and temporal parameter 2 and spatial parameter 4 output by the new parameter generation module 206 as input, and correspondingly output: picture F corresponding to spatial parameter 3 and temporal parameter 2, picture G corresponding to temporal parameter 1 and spatial parameter 4, and picture H corresponding to spatial parameter 4 and temporal parameter 2.
  • the spatio-temporal perception model 205 shown in FIG. 12 may include two independent MLPs (MLP3 and MLP4) and a volume rendering module. For details, please refer to the architecture of the electronic device 100 shown in FIG. 13.
  • FIG. 13 takes the case where the spatial parameters input to the spatio-temporal perception model 205 include the camera pose and spatial coordinates, and the time parameters include the time embedding, as an example for illustration.
  • MLP3 can receive spatial coordinates in spatial parameters as input for feature extraction, and output intermediate features and spatial density.
  • MLP4 can receive the time embedding in the time parameters, the camera pose in the spatial parameters, and the intermediate features output by MLP3 as input for feature extraction, and output color information.
  • the volume rendering module can receive the spatial density output by MLP3 and the color information output by MLP4 as input, perform volume rendering, and output pictures corresponding to the above spatial parameters and time parameters.
  • learnable parameters are set in MLP3 and/or MLP4, and the above online training can specifically train MLP3 and/or MLP4.
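  • Compared with MLP2 in the spatial-only model, the color branch MLP4 additionally consumes the time embedding, so the rendered color can vary with both the observation direction and time. A sketch with assumed input dimensions follows; it is illustrative only.

```python
import torch
import torch.nn as nn

class MLP4Sketch(nn.Module):
    """Assumed sizes: features from MLP3, camera pose (azimuth, polar), time embedding."""
    def __init__(self, feature_dim=256, time_dim=128, hidden=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feature_dim + 2 + time_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 3))    # RGB color

    def forward(self, features, camera_pose, time_emb):
        # Color now depends on the observation direction and on time.
        return torch.sigmoid(self.net(torch.cat([features, camera_pose, time_emb], dim=-1)))
```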
  • the picture generation module 200 may include only the time perception module 201 or only the spatial perception module 202, where, when the picture generation module 200 only includes the spatial perception module 202, the input of the spatial perception module 202 is the picture sequence 701 in the picture library 700.
  • This application can generate a new picture sequence 702 in the temporal and/or spatial dimensions based on the captured picture sequence 701.
  • the picture sequence 702 and the picture sequence 701 can be used together as candidate pictures for recommending pictures to the user and for the user to select, which adds differentiated, high-quality candidate pictures within the limited shooting time, increases the probability that the user obtains the required pictures, and improves the user experience.
  • the picture recommendation module 300 in the electronic device 100 shown in FIG. 8 is introduced as an example.
  • FIG. 14 exemplarily shows a schematic diagram of the software architecture of yet another electronic device 100 .
  • the picture recommendation module 300 of the electronic device 100 may include an aesthetic evaluation model 301 and a filtering module 302 .
  • the aesthetic evaluation model 301 may be an updated aesthetic evaluation model output by the personalized learning module 600 of the electronic device 100 .
  • the aesthetic evaluation model 301 can receive each picture in the picture library 700 as input, and output the score of each picture in the picture library 700 in multiple dimensions; the comprehensive score, subject position score, movement stretch score, expression score and image quality score are taken as an example for illustration.
  • the screening module 302 can receive the output of the aesthetic evaluation model 301 as input, sort and filter the scores in the multiple dimensions, and obtain the picture sequence 703, that is, a plurality of picture sequences with higher scores in the multiple dimensions: picture sequence 1 whose comprehensive scores rank in the top N1, picture sequence 2 whose subject position scores rank in the top N2, picture sequence 3 whose movement stretch scores rank in the top N3, picture sequence 4 whose expression scores rank in the top N4, and picture sequence 5 whose image quality scores rank in the top N5, where Ni is a positive integer and i is a positive integer less than 6.
  • this application can conduct a comprehensive aesthetic evaluation and screening of candidate pictures from multiple dimensions such as the comprehensive dimension, subject position, movement stretch, expression and image quality, and recommend to users pictures with higher scores in multiple dimensions, which can meet the different preferences of different users, make picture recommendation more accurate, increase the probability that users obtain the pictures they need, and improve the user experience.
  • the personalized learning module 600 in the electronic device 100 shown in FIG. 8 is introduced as an example.
  • FIG. 15 exemplarily shows a schematic diagram of the software architecture of yet another electronic device 100 .
  • the picture recommendation module 300 of the electronic device 100 may include an aesthetic evaluation model 301 and a filtering module 302 .
  • the user selection module 400 of the electronic device 100 can receive the picture sequence 703 output by the filtering module 302 (including multiple picture sequences with higher scores in multiple dimensions) as input, and select the picture sequence 704 (as output) from the picture sequence 703 according to the received user operation.
  • the personalized learning module 600 of the electronic device 100 may include a picture calibration module 601 , a personalized data set 602 and a model training module 603 .
  • the picture calibration module 601 can receive the picture sequence 703 and the corresponding scores output by the screening module 302, and the picture sequence 704 output by the user selection module 400, as input, and set the scores corresponding to the picture sequence 704 according to the picture sequence 703 and the corresponding scores. The picture sequence 704 output by the picture calibration module 601 and the corresponding scores can constitute the personalized data set 602.
  • for the specific implementation of the picture calibration module 601, see S106 of Figure 3 and the description of Figure 7.
  • the model training module 603 may receive the personalized data set 602 and the pre-update aesthetic evaluation model 301 as input, use the personalized data set 602 to train the pre-update aesthetic evaluation model 301, and obtain an updated aesthetic evaluation model 301. The updated aesthetic evaluation model 301 may be output to the picture recommendation module 300.
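  • A minimal sketch of such training is shown below, assuming a PyTorch-style regression fine-tuning on the (picture, calibrated score) pairs of the personalized data set 602; the framework, loss function and hyperparameters are illustrative assumptions, not prescribed by this application.

```python
import torch

def finetune_aesthetic_model(model, personalized_dataset, epochs=3, lr=1e-4):
    """personalized_dataset: iterable of (picture_tensor, calibrated_score) pairs."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for picture, target_score in personalized_dataset:
            predicted = model(picture.unsqueeze(0)).squeeze()     # model's score for the picture
            loss = torch.nn.functional.mse_loss(
                predicted, torch.tensor(float(target_score)))
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model                                                  # updated aesthetic evaluation model 301
```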
  • this application can train the aesthetic evaluation model 301 in the picture recommendation module 300 based on the default recommended pictures and the user-selected pictures, that is, perform on-device self-learning so that the scoring strategy of the aesthetic evaluation model 301 matches the user's habits as much as possible, thereby achieving personalized picture recommendation, further improving the probability that users obtain the pictures they need, and improving the user experience.
  • for different users, the recommended third picture sequence can be different, that is, picture recommendation tailored to each user ("thousands of people, thousands of faces") is realized.
  • Figure 16 exemplarily shows a schematic diagram of the user interface of a picture recommendation process.
  • the electronic device 100 may display the user interface 1000 of the camera application.
  • the user interface 1000 may include a viewfinder frame 1010, a shooting control 1020, and a thumbnail image 1030.
  • the viewfinder frame 1010 is used to display images collected by the electronic device 100 through a camera in real time
  • the shooting control 1020 is used to trigger the shooting of pictures through the camera
  • the thumbnail image 1030 is used to display the latest picture taken by the electronic device 100 through the camera.
  • the electronic device 100 can continuously capture multiple pictures in response to an operation on the shooting control 1020 (such as a touch operation, for example a click operation or a long-press operation), that is, implement S101 shown in Figure 3.
  • the electronic device 100 may display any one of the plurality of pictures in response to an operation (such as a touch operation, such as a click operation) on the thumbnail 1030.
  • the user interface 2000 may include a picture 2010, which is one of the plurality of pictures mentioned above, and a control 2020.
  • in response to an operation (such as a touch operation, for example a click operation) on the control 2020, the electronic device 100 can, based on the above-mentioned multiple continuously shot pictures, recommend pictures to the user from multiple dimensions such as comprehensive recommendation, subject position, movement stretch, expression, and image quality, and display the recommended pictures and other pictures, that is, implement S102-S104 shown in Figure 3.
  • the recommended pictures are the third picture sequence, and the other pictures include at least one picture in the first picture sequence and the second picture sequence other than the third picture sequence. For details, see the user interface 3000 shown in (C) of Figure 16.
  • the user interface 3000 may include a return control 3010 , prompt information 3020 , and a save control 3030 .
  • the return control 3010 is used to return to the previous level interface.
  • the save control 3030 is used to save the picture selected by the user.
  • the prompt information 3020 is used to indicate the number of candidate pictures and the number of pictures selected by the user. For example, it may include the characters "Select Photo 0/30" to indicate that the number of candidate pictures is 30 and the number of pictures selected by the user is 0.
  • the above-mentioned candidate pictures include the first picture sequence and the second picture sequence. For example, the number of the multiple continuously shot pictures (i.e., the first picture sequence) of the electronic device 100 is 10.
  • in one implementation, the folder used to store pictures in the electronic device 100 may include information about the first picture sequence and the second picture sequence. For example, the storage location of the first picture sequence and the storage location of the second picture sequence may be different; the second picture sequence is, for example, stored in a newly created temporary cache area. The attributes of the first picture sequence and the second picture sequence may also be different, for example but not limited to different generation times, different carried tags, etc.
  • in another implementation, before receiving the above-mentioned operation on the control 2020, the folder used to store pictures in the electronic device 100 may only include the first picture sequence; after receiving the operation on the control 2020, the folder may also include the second picture sequence.
  • the user interface 3000 also includes a recommendation dimension 3040, a picture list 3050, and a display box 3060.
  • the recommendation dimension 3040 may include multiple dimensions such as comprehensive recommendation 3040A, subject position 3040B, movement stretch 3040C, expression 3040D, and image quality 3040E.
  • the electronic device 100 may, in response to an operation on any one of the dimensions (such as a touch operation, for example a click operation), set that dimension to the selected state.
  • currently, the comprehensive recommendation 3040A is in the selected state.
  • the picture list 3050 is used to display recommended pictures and other pictures under the selected dimension in the recommended dimension 3040 (currently the comprehensive recommendation 3040A).
  • the recommended pictures include: picture 3051 displaying the recommendation mark 3051A and picture 3052 displaying the recommendation mark 3052A; the other pictures include: picture 3053 and picture 3054.
  • the pictures in the picture list 3050 can be displayed from front to back according to the score in the corresponding dimension (currently the comprehensive score) from high to low; that is, in order of comprehensive score from high to low, the pictures in the picture list 3050 are picture 3051, picture 3052, picture 3053 and picture 3054.
  • the electronic device 100 may display other pictures in the picture list 3050 in response to an operation on the picture list 3050 (such as a touch operation, such as a sliding operation from right to left).
  • the picture list 3050 also includes a control 3055.
  • the display box 3060 is used to display the picture pointed to by the control 3055. For example, the control 3055 currently points to picture 3051; therefore, the display box 3060 is used to display the enlarged picture 3051.
  • when the electronic device 100 displays the recommended pictures and other pictures, in response to an operation (such as a touch operation, for example a click operation) on any picture, the picture can be set to the selected state, that is, S105 shown in FIG. 3 is implemented.
  • the electronic device 100 can respond to the operation on the picture 3054 in the user interface 3000 shown in (C) of FIG. 16 , and set the picture 3054 to the selected state.
  • for details, see the user interface 4000 shown in FIG. 17.
  • the user interface 4000 is similar to the user interface 3000.
  • the difference is that the picture 3054 in the picture list 3050 displays information 4010.
  • the information 4010 includes the character "1", indicating that the picture 3054 is the first picture selected by the user and/or the user-selected picture with the highest priority.
  • the control 3055 currently points to the picture 3054, and accordingly, the display box 3060 is used to display the enlarged picture 3054. Since the current user has selected a picture, the prompt information 3020 may include the characters "Select photo 1/30".
  • the electronic device 100 may display the recommended pictures and other pictures under another dimension of the recommendation dimension 3040 in response to an operation (such as a click operation) on that dimension. For example, after the implementation shown in FIG. 17, the electronic device 100 may, in response to an operation on the subject position 3040B in the recommendation dimension 3040 included in the user interface 4000 shown in FIG. 17, display the recommended pictures and other pictures based on the subject position 3040B. For details, see the user interface 5000 shown in FIG. 18.
  • the user interface 5000 is similar to the user interface 3000. The difference is that the subject position 3040B in the recommended dimension 3040 is selected. Therefore, the user interface 5000 includes a picture list 5010.
  • the picture list 5010 is used to display the recommended pictures under the subject position 3040B (i.e., picture 5011 and picture 5012), and other pictures (i.e., picture 5013 and picture 5014).
  • the pictures in the picture list 5010 are picture 5011, picture 5012, picture 5013 and picture 5014 in descending order according to the subject position score.
  • the electronic device 100 can set the picture 5014 to a selected state in response to an operation on the picture 5014 (such as a touch operation, such as a click operation).
  • the picture 5014 in the user interface 5000 displays information 5020
  • the information 5020 includes the character "2”, indicating that the picture 5014 is the second picture selected by the user and/or the picture ranked second in priority selected by the user.
  • the control 3055 currently points to the picture 5014, and accordingly, the display box 3060 is used to display the enlarged picture 5014. Since the current user has selected two pictures, the prompt information 3020 may include the characters "Select photo 2/30".
  • the electronic device 100 may save the user-selected pictures 3054 and 5014 in response to an operation on the save control 3030 in the user interface 5000 shown in FIG. 18, and delete the other pictures among the candidate pictures.
  • the electronic device 100 may implement S106 shown in FIG. 3 based on the pictures 3054 and 5014 selected by the user.
  • in other embodiments, the electronic device 100 may also directly display the user interface 3000 shown in (C) of FIG. 16 in response to the operation on the shooting control 1020 in the user interface 1000 shown in (A) of FIG. 16.
  • alternatively, after receiving the operation on the shooting control 1020 in the user interface 1000 shown in (A) of FIG. 16, the electronic device 100 may display the user interface 3000 shown in (C) of FIG. 16 in response to the operation on the thumbnail 1030 in the user interface 1000.
  • the user can also select multiple pictures in the gallery, and the electronic device 100 can perform picture recommendation based on these multiple pictures, that is, implement the method shown in Figure 3.
  • the first picture sequence is these multiple pictures.
  • the user interface 6000 may be a user interface of a gallery application.
  • the user interface 6000 may include prompt information 6010, a picture list 6020, and a function list 6030.
  • the picture list 6020 may include multiple pictures, such as, but not limited to, picture 6021, picture 6022, picture 6023, picture 6024, picture 6025, and picture 6026.
  • Picture 6021 is used as an example for illustration.
  • a selection control 6021A is also displayed on the picture 6021.
  • the selection control 6021A is used to select the picture 6021 or deselect the picture 6021.
  • the selection control 6021A in the user interface 6000 indicates that the picture 6021 has been selected. Similarly, picture 6022, picture 6023 and picture 6025 are all selected.
  • the prompt information 6010 is used to indicate the number of pictures that have been selected. For example, if 4 pictures are currently selected, the prompt information 6010 includes the characters "4 items have been selected.”
  • the function list 6030 may include controls for multiple functions, such as but not limited to controls for sharing functions, controls for deleting functions, controls for selecting all functions, controls for recommended functions 6031 and controls for more functions.
  • the electronic device 100 may respond to an operation on the control 6031 (such as a touch operation, for example a click operation), and use the pictures 6021, 6022, 6023 and 6025 selected by the user as the first picture sequence to implement the method shown in FIG. 3.
  • the user interface for displaying the third picture sequence may refer to the user interface 3000 shown in (C) of Figure 16 .
  • the picture 3051 and the picture 3052 in the picture list 3050 shown in the user interface 3000 are the above-mentioned pictures 6021 and 6025, but the pictures 3053 and 3054 in the picture list 3050 do not belong to the first picture sequence, that is, they belong to the second picture sequence.
  • the electronic device 100 may receive a dimension 1 input by the user on a setting interface and determine that the user prefers dimension 1. Then, after the electronic device 100 continuously takes multiple pictures, it can recommend pictures to the user under dimension 1 based on these pictures. In some examples, the electronic device 100 can automatically save the pictures with higher scores in dimension 1 and delete the other pictures.
  • for example, dimension 1 is the comprehensive recommendation 3040A in the user interface 3000 shown in (C) of FIG. 16, and the electronic device 100 continuously shoots multiple pictures in response to the operation on the shooting control 1020 in the user interface 1000 shown in (A) of FIG. 16.
  • the methods provided by the embodiments of this application can be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • when implemented using software, the methods may be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions.
  • when the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are generated in whole or in part.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, a network device, a user equipment, or other programmable device.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center through wired (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless (such as infrared, radio, or microwave) means.
  • the computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media.
  • the available media can be magnetic media (for example, floppy disks, hard disks, or tapes), optical media (for example, digital video discs (DVDs)), or semiconductor media (for example, solid state disks (SSDs)), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Library & Information Science (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Signal Processing (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The present application provides an image recommendation method and an electronic device. The method comprises: an electronic device displaying an image acquisition interface; in response to an operation for an image acquisition button of the image acquisition interface, acquiring a first image sequence by means of an image acquisition device; generating a second image sequence on the basis of the first image sequence, wherein the second image sequence comprises images having timestamps different from those of images in the first image sequence, and/or images having observation viewing angles different from those of images in the first image sequence; determining a third image sequence from the first image sequence and the second image sequence, wherein the third image sequence comprises N images, on a first dimension, which have scores ranked at the first N positions, and M images, on a second dimension, which have scores ranked at the first M positions, and N and M are positive integers; and recommending the third image sequence. An artificial intelligence (AI) technology can be used to recommend to a user the "best moment" image meeting requirements of the user, so that the user can obtain a satisfactory image more conveniently and more quickly.

Description

An image recommendation method and electronic device

This application claims priority to the Chinese patent application filed with the China Patent Office on August 30, 2022, with the application number 202211049479.9 and the application title "A picture recommendation method and electronic device", the entire content of which is incorporated into this application by reference.

Technical field

The present application relates to the field of computer technology, and in particular, to a picture recommendation method and an electronic device.

Background

It is difficult for users to capture satisfactory pictures in some non-static scenes. For example, when a user uses a mobile phone to photograph a moving human body, it is easy to capture pictures that do not meet the user's requirements, for example, pictures in which the limbs are not stretched out in a jumping pose, the human body is motion-blurred, the eyes are closed, or passers-by intrude into the frame. Although mobile phones can currently shoot multiple pictures continuously for the user to choose from, these pictures are often large in number and have small differences, so the user's selection is time-consuming and labor-intensive, and the user cannot obtain satisfactory pictures conveniently and quickly.

Summary

This application discloses a picture recommendation method and an electronic device, which can recommend pictures that meet the user's needs, allowing the user to obtain satisfactory pictures more conveniently and quickly.
第一方面,本申请实施例提供一种图片推荐方法,应用于电子设备,该方法包括:显示图像采集界面;响应针对所述图像采集界面的图像采集按钮的第一操作,通过图像采集装置采集获得第一图片序列;基于所述第一图片序列生成第二图片序列,所述第二图片序列包括时间戳和所述第一图片序列中的图片的时间戳不同的图片,和/或,观察视角和所述第一图片序列中的图片的观察视角不同的图片;从所述第一图片序列和所述第二图片序列中确定出第三图片序列,所述第三图片序列包括第一维度的得分排列在前N位的N张图片,以及第二维度的得分排列在前M位的M张图片,N和M为正整数;推荐所述第三图片序列。In a first aspect, embodiments of the present application provide a picture recommendation method, which is applied to electronic equipment. The method includes: displaying an image collection interface; responding to a first operation of an image collection button on the image collection interface, collecting data through an image collection device. Obtaining a first sequence of pictures; generating a second sequence of pictures based on the first sequence of pictures, the second sequence of pictures including pictures with timestamps different from those of pictures in the first sequence of pictures, and/or observing Pictures with different viewing angles from the pictures in the first picture sequence; a third picture sequence is determined from the first picture sequence and the second picture sequence, and the third picture sequence includes a first dimension N pictures whose scores are ranked in the top N positions, and M pictures whose scores in the second dimension are ranked in the top M positions, N and M are positive integers; the third picture sequence is recommended.
在一种可能的实现方式中,所述第二图片序列包括时间戳和所述第一图片序列中的图片的时间戳不同的图片,具体包括:所述第二图片序列中的任意一张图片的时间戳和所述第一图片序列中的全部图片的时间戳不同。In a possible implementation, the second picture sequence includes pictures with different timestamps from the timestamps of pictures in the first picture sequence, specifically including: any picture in the second picture sequence The timestamp of is different from the timestamps of all pictures in the first picture sequence.
在一种可能的实现方式中,所述第二图片序列包括观察视角和所述第一图片序列中的图片的观察视角不同的图片,具体包括:所述第二图片序列中的任意一张图片的观察视角和所述第一图片序列中时间戳和该图片的时间戳相同的图片的观察视角不同。In a possible implementation, the second picture sequence includes pictures with different viewing angles from the viewing angles of the pictures in the first picture sequence, specifically including: any picture in the second picture sequence The observation angle is different from the observation angle of the picture in the first picture sequence whose timestamp is the same as the timestamp of the picture.
在上述方法中,电子设备推荐的第三图片序列是从第一图片序列和第二图片序列中筛选出来的,第二图片序列是基于采集的第一图片序列在时间和/或空间维度生成的,从而在有限的采集时间内增加了差异化、高质量的候选图片,用户获取到所需图片的概率大大提升。并且,电子设备推荐的第三图片序列包括在第一维度上较优的N张图片和在第二维度上较优的M张图片,从而可以满足不同用户的不同需求,让用户更加方便快捷地获取到满意的图片。In the above method, the third picture sequence recommended by the electronic device is selected from the first picture sequence and the second picture sequence, and the second picture sequence is generated in the time and/or space dimension based on the collected first picture sequence. , thereby adding differentiated, high-quality candidate pictures within the limited collection time, and the probability of users obtaining the required pictures is greatly increased. Moreover, the third picture sequence recommended by the electronic device includes N pictures that are better in the first dimension and M pictures that are better in the second dimension, thereby meeting the different needs of different users and allowing users to more conveniently and quickly Get satisfactory pictures.
在一种可能的实现方式中,所述显示图像采集界面,响应针对所述图像采集界面的图像采集按钮的第一操作,通过图像采集装置采集获得第一图片序列,可以替换为:响应用于选择所述第一图片序列的操作,从所述电子设备的图库中获取所述第一图片序列。In a possible implementation, the displayed image acquisition interface, in response to the first operation of the image acquisition button of the image acquisition interface, and the first image sequence is acquired through the image acquisition device, may be replaced by: responding to The operation of selecting the first picture sequence is to obtain the first picture sequence from the gallery of the electronic device.
在上述方法中,第一图片序列也可以是从图库中获取的,满足不同场景下的不同用户需求,拓宽应用场景。In the above method, the first picture sequence can also be obtained from the gallery to meet different user needs in different scenarios and broaden application scenarios.
In a possible implementation, the steps of determining from the first picture sequence and the second picture sequence a third picture sequence that includes the N pictures ranked in the top N positions by the score of the first dimension and the M pictures ranked in the top M positions by the score of the second dimension, N and M being positive integers, and recommending the third picture sequence, may be replaced with: determining, from the first picture sequence and the second picture sequence, the P pictures ranked in the top P positions by the score of a third dimension, P being a positive integer; saving the P pictures; and deleting the pictures in the first picture sequence and the second picture sequence other than the P pictures, where the third dimension is determined by the electronic device in response to an operation on a settings interface, or the third dimension is a dimension of user preference learned by the electronic device.
In the above method, the electronic device can determine, from the first picture sequence and the second picture sequence, the P pictures that are better in the third dimension manually set by the user, or in the third dimension of learned user preference, then save these P pictures and delete the other pictures. The user can thus obtain the desired pictures quickly and conveniently without manual selection, which greatly improves the user experience.
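Under the same assumptions as the previous sketch, the keep-the-top-P-and-delete-the-rest behaviour described here could look roughly as follows; the save and delete callbacks are hypothetical placeholders rather than APIs of this application.

```python
from typing import Callable, List

def keep_top_p(candidates: List[dict],
               score_dim3: Callable[[dict], float],
               p: int,
               save: Callable[[dict], None],
               delete: Callable[[dict], None]) -> List[dict]:
    """Save the P pictures that score highest in the third dimension (user-set or
    learned preference) and delete every other candidate picture."""
    ranked = sorted(candidates, key=score_dim3, reverse=True)
    kept, dropped = ranked[:p], ranked[p:]
    for pic in kept:
        save(pic)
    for pic in dropped:
        delete(pic)
    return kept
```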
In a possible implementation, the first dimension or the second dimension is any one of the following: a composite dimension, the position of the photographed subject in the picture, the stretch of the photographed subject's movement in the picture, the facial expression of the photographed subject in the picture, or the image quality of the picture.
In a possible implementation, recommending the third picture sequence includes: displaying a first interface, where the first interface displays first information, second information, the N pictures, and the M pictures, the first information indicates the first dimension and is associated with the N pictures, and the second information indicates the second dimension and is associated with the M pictures.
In the above method, the user can obtain the N pictures associated with the first dimension from the first information, and the M pictures associated with the second dimension from the second information. This display manner is simple and clear, makes it convenient for the user to obtain pictures in the desired dimension, and improves the user experience.
In a possible implementation, recommending the third picture sequence includes: displaying a second interface, where the second interface displays K pictures, K being a positive integer greater than or equal to N. The K pictures include the N pictures and (K-N) pictures other than the N pictures, and the (K-N) pictures belong to the first picture sequence and/or the second picture sequence. The K pictures include a first picture and a second picture, the score of the first picture in the first dimension is greater than the score of the second picture in the first dimension, and the first picture is displayed before the second picture in the second interface.
In the above method, the electronic device can preferentially display pictures with higher scores in the first dimension, avoiding the situation where a higher-scoring picture is displayed near the end and the user needs more time to find it. This further improves the efficiency with which the user obtains the desired picture and improves the user experience.
In a possible implementation, the (K-N) pictures do not belong to the third picture sequence; for example, the K pictures are the pictures in the first picture sequence and the second picture sequence whose scores in the first dimension rank in the top K positions.
In the above method, the electronic device can also display pictures other than the recommended third picture sequence (i.e., the case where K is greater than N), thereby providing more candidate pictures for the user to choose from. This avoids the situation where none of the pictures in the third picture sequence meet the user's needs and the user cannot obtain the desired picture, further ensuring the user experience.
In a possible implementation, the method further includes: receiving a second operation for selecting at least one picture, where the at least one picture belongs to the first picture sequence and/or the second picture sequence; saving the at least one picture; and deleting the pictures in the first picture sequence and the second picture sequence other than the at least one picture.
在上述方法中,电子设备可以保存用户选择的至少一张图片,以及删除其他图片,避免用户不需要的其他图片占用设备的存储空间,减少设备的存储压力。In the above method, the electronic device can save at least one picture selected by the user and delete other pictures to prevent other pictures that the user does not need from occupying the storage space of the device and reduce the storage pressure of the device.
In a possible implementation, the third picture sequence is obtained according to a first strategy; the method further includes: receiving a second operation for selecting at least one picture, where the at least one picture belongs to the first picture sequence and/or the second picture sequence; and updating the first strategy according to the third picture sequence and the at least one picture.
In the above method, the electronic device can update the first strategy used to determine the recommended third picture sequence according to the at least one picture selected by the user, i.e., it learns the first strategy from the user's habits and personalizes it, so that subsequent recommended pictures determined according to the first strategy better match the current user's needs and improve the user experience.
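As one possible reading of this on-device self-learning step, the sketch below nudges the per-dimension weights of a hypothetical weighted-sum recommendation strategy toward the dimensions in which the pictures the user actually kept score well; the weight representation and learning rate are assumptions, not the strategy defined by this application.

```python
from typing import Dict, Hashable, List

def update_strategy(weights: Dict[str, float],
                    scores: Dict[Hashable, Dict[str, float]],
                    selected: List[Hashable],
                    recommended: List[Hashable],
                    lr: float = 0.1) -> Dict[str, float]:
    """Shift dimension weights toward dimensions where user-kept pictures outscore
    the pictures the strategy had recommended, then renormalise."""
    for dim in weights:
        kept_avg = sum(scores[p][dim] for p in selected) / max(len(selected), 1)
        shown_avg = sum(scores[p][dim] for p in recommended) / max(len(recommended), 1)
        weights[dim] = max(0.0, weights[dim] + lr * (kept_avg - shown_avg))
    total = sum(weights.values()) or 1.0
    return {dim: w / total for dim, w in weights.items()}
```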
In a possible implementation, generating the second picture sequence based on the first picture sequence includes: generating a fourth picture sequence based on the first picture sequence, where the timestamps of the pictures in the fourth picture sequence differ from the timestamps of the pictures in the first picture sequence; and generating a fifth picture sequence based on the first picture sequence and the fourth picture sequence, where the viewing angles of the pictures in the fifth picture sequence differ from the viewing angles of the pictures in the first picture sequence and the fourth picture sequence. The second picture sequence includes the fourth picture sequence and the fifth picture sequence.
In a possible implementation, that the timestamps of the pictures in the fourth picture sequence differ from the timestamps of the pictures in the first picture sequence specifically means that the timestamp of any picture in the fourth picture sequence differs from the timestamps of all pictures in the first picture sequence.
In a possible implementation, that the viewing angles of the pictures in the fifth picture sequence differ from the viewing angles of the pictures in the first picture sequence and the fourth picture sequence specifically means that the viewing angle of any picture in the fifth picture sequence differs from the viewing angle of the picture in the first picture sequence or the fourth picture sequence that has the same timestamp as that picture.
In the above method, the electronic device can first generate, in the time dimension, a fourth picture sequence whose timestamps differ from those of the captured first picture sequence, and then generate, in the space dimension, a fifth picture sequence whose viewing angles differ from those of the first and fourth picture sequences. Compared with generating only pictures whose viewing angles differ from the first picture sequence, this further expands the set of high-quality, differentiated candidate pictures and further increases the probability that the user obtains the desired picture.
In a possible implementation, generating the fifth picture sequence based on the first picture sequence and the fourth picture sequence includes: training a spatial perception model based on the first picture sequence and the fourth picture sequence; obtaining a first spatial parameter that differs from the spatial parameters of the pictures in the first picture sequence and the second picture sequence; and using the first spatial parameter as the input of the spatial perception model to obtain an output, where the output is the fifth picture sequence.
在一种可能的实现方式中,所述空间感知模型是经过多轮迭代训练得到的。In a possible implementation, the spatial perception model is obtained through multiple rounds of iterative training.
In a possible implementation, the spatial parameters include the spatial coordinates of a picture and the pose of the image capture apparatus used to capture the picture.
在上述方法中,空间感知模型是基于当前采集的第一图片序列和根据第一图片序列生成的第四图片序列迭代训练得到的,因此,空间感知模型可以充分学习到当前拍摄场景的情况,通过空间感知模型获取到的第五图片序列的准确度更高,即候选图片的准确度更高,进一步提升用户获取到所需图片的概率。In the above method, the spatial perception model is iteratively trained based on the currently collected first picture sequence and the fourth picture sequence generated based on the first picture sequence. Therefore, the spatial perception model can fully learn the situation of the current shooting scene. The accuracy of the fifth picture sequence obtained by the spatial perception model is higher, that is, the accuracy of the candidate pictures is higher, which further increases the probability that the user can obtain the desired picture.
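A minimal sketch of what training and querying such a spatial perception model might look like is given below, assuming a NeRF-style network that renders a picture from a camera pose; the network, the loss, and the optimiser are placeholders for illustration rather than the model defined by this application.

```python
import torch

def train_spatial_model(model: torch.nn.Module,
                        pictures: list,          # tensors from the first + fourth sequences
                        poses: list,             # matching spatial parameters (pose/coords)
                        rounds: int = 5, lr: float = 1e-3) -> torch.nn.Module:
    """Fit the view-synthesis model to the captured and time-interpolated pictures
    over multiple rounds of iterative training."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(rounds):
        for img, pose in zip(pictures, poses):
            pred = model(pose)                   # render the picture for this pose
            loss = torch.nn.functional.mse_loss(pred, img)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model

def render_fifth_sequence(model: torch.nn.Module, new_poses: list) -> list:
    """Query the trained model at spatial parameters not present in the inputs."""
    with torch.no_grad():
        return [model(pose) for pose in new_poses]
```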
在一种可能的实现方式中,所述基于所述第一图片序列生成第二图片序列,包括:基于所述第一图片序列训练得到时空感知模型;获取第二空间参数和第一时间参数,所述第二空间参数包括和所述第一图片序列中的图片的空间参数不同的空间参数,所述第一时间参数包括和所述第一图片序列中的图片的时间参数不同的时间参数;将所述第二空间参数和所述第一时间参数作为所述时空感知模型的输入获取输出,所述输出为所述第二图片序列。In a possible implementation, generating a second picture sequence based on the first picture sequence includes: training to obtain a space-time perception model based on the first picture sequence; obtaining second spatial parameters and first temporal parameters, The second spatial parameters include spatial parameters that are different from the spatial parameters of the pictures in the first picture sequence, and the first temporal parameters include time parameters that are different from the temporal parameters of the pictures in the first picture sequence; The second spatial parameter and the first temporal parameter are used as inputs of the spatio-temporal perception model to obtain an output, and the output is the second picture sequence.
在一种可能的实现方式中,所述时空感知模型是经过多轮迭代训练得到的。In a possible implementation, the space-time perception model is obtained through multiple rounds of iterative training.
In a possible implementation, the temporal parameter includes the timestamp of a picture, or a time embedding derived from the timestamp of the picture.
In the above method, the spatio-temporal perception model is iteratively trained based on the currently captured first picture sequence, so it can fully learn the current shooting scene. The second picture sequence obtained through the spatio-temporal perception model is therefore more accurate, i.e., the candidate pictures are more accurate, which further increases the probability that the user obtains the desired picture.
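Analogously, the spatio-temporal variant could be queried with both a spatial and a temporal parameter, as in the sketch below; the model interface (one picture per pose/timestamp pair) is an assumption for illustration only.

```python
import torch

def render_second_sequence(spacetime_model: torch.nn.Module,
                           poses: list, timestamps: list) -> list:
    """Query a model trained on the first picture sequence at spatial and temporal
    parameters that differ from those of the captured pictures."""
    with torch.no_grad():
        return [spacetime_model(pose, t) for pose in poses for t in timestamps]
```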
In a second aspect, an embodiment of this application provides an electronic device, including a transceiver, a processor, and a memory. The memory is configured to store a computer program, and the processor invokes the computer program to perform the picture recommendation method provided in the first aspect of the embodiments of this application and any implementation of the first aspect.
In a third aspect, an embodiment of this application provides a computer storage medium. The computer storage medium stores a computer program, and when the computer program is executed by a processor, it is used to perform the picture recommendation method provided in the first aspect of the embodiments of this application and any implementation of the first aspect.
In a fourth aspect, an embodiment of this application provides a computer program product. When the computer program product runs on an electronic device, the electronic device is caused to perform the picture recommendation method provided in the first aspect of the embodiments of this application and any implementation of the first aspect.
In a fifth aspect, an embodiment of this application provides an electronic device, and the electronic device includes means for performing the method, or includes the apparatus, described in any embodiment of this application. The electronic device is, for example, a chip.
附图说明Description of drawings
以下对本申请用到的附图进行介绍。The drawings used in this application are introduced below.
图1是本申请提供的一种电子设备的硬件结构示意图;Figure 1 is a schematic diagram of the hardware structure of an electronic device provided by this application;
图2是本申请提供的一种电子设备的软件架构示意图;Figure 2 is a schematic diagram of the software architecture of an electronic device provided by this application;
图3是本申请提供的一种图片推荐方法的流程示意图;Figure 3 is a schematic flow chart of an image recommendation method provided by this application;
图4是本申请提供的一种图片生成过程的示意图;Figure 4 is a schematic diagram of an image generation process provided by this application;
图5是本申请提供的又一种图片生成过程的示意图;Figure 5 is a schematic diagram of another image generation process provided by this application;
图6是本申请提供的一种人体的骨骼位置点的示意图;Figure 6 is a schematic diagram of the skeletal position points of a human body provided by this application;
图7是本申请提供的一种个性化数据集的获取过程的示意图;Figure 7 is a schematic diagram of the acquisition process of a personalized data set provided by this application;
图8-图15是本申请提供的又一种电子设备的软件架构示意图;Figures 8-15 are schematic diagrams of the software architecture of yet another electronic device provided by this application;
图16-图19是本申请提供的一些用户界面实施例的示意图。Figures 16-19 are schematic diagrams of some user interface embodiments provided by this application.
Detailed Description of Embodiments
The technical solutions in the embodiments of this application are described below with reference to the accompanying drawings. In the description of the embodiments of this application, unless otherwise stated, "/" means "or"; for example, A/B may mean A or B. "And/or" in this text merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A exists alone, both A and B exist, or B exists alone. In addition, in the description of the embodiments of this application, "multiple" means two or more than two.
以下,术语“第一”、“第二”仅用于描述目的,而不能理解为暗示或暗示相对重要性或者隐含指明所指示的技术特征的数量。由此,限定有“第一”、“第二”的特征可以明示或者隐含地包括一个或者更多个该特征,在本申请实施例的描述中,除非另有说明,“多个”的含义是两个或两个以上。Hereinafter, the terms “first” and “second” are used for descriptive purposes only and shall not be understood as implying or implying relative importance or implicitly specifying the quantity of indicated technical features. Therefore, the features defined as “first” and “second” may explicitly or implicitly include one or more of the features. In the description of the embodiments of this application, unless otherwise specified, “plurality” The meaning is two or more.
When a user snaps pictures in a non-static scene, for example when capturing a moving object, the electronic device can shoot multiple pictures in a burst, select the pictures with better image quality/sharpness from them, and recommend these to the user; based on the recommendation, the user can select the desired picture from the multiple pictures. However, the following technical problems still prevent the user from obtaining a satisfactory picture conveniently and quickly.
技术问题一:电子设备是按照时间顺序执行连拍操作的,即在有限的拍摄时间内成像,可能存在拍摄的全部图像均不符合用户需求的情况;Technical problem 1: Electronic equipment performs continuous shooting operations in chronological order, that is, imaging within a limited shooting time. There may be situations where all the images taken do not meet the user's needs;
技术问题二:电子设备仅根据画质为用户推荐图片,即图片推荐策略简单,可能存在推荐的图片质量较低,不符合用户需求的情况;Technical problem two: Electronic devices only recommend pictures to users based on picture quality, that is, the picture recommendation strategy is simple, and there may be cases where the recommended pictures are of low quality and do not meet user needs;
技术问题三:所有用户的图片推荐策略相同,没有考虑到不同用户所需的图片可能不同,导致推荐的图片质量较低,不符合用户需求的情况。Technical problem three: The picture recommendation strategy for all users is the same, and it does not take into account that different users may need different pictures, resulting in low-quality recommended pictures that do not meet user needs.
This application provides a picture recommendation method applied to an electronic device, which allows the user to obtain a satisfactory picture conveniently and quickly and improves the user experience. In one implementation, the electronic device can generate more pictures in the time and/or space dimension based on the captured pictures for the user to select from, i.e., it adds differentiated, high-quality candidate pictures within the limited shooting time, which solves technical problem 1 above. In one implementation, the electronic device can also recommend pictures to the user from multiple dimensions such as a composite dimension, the position of the photographed subject (subject position for short), the stretch of the subject's movement, the subject's facial expression, and image quality, effectively optimizing the picture recommendation strategy and solving technical problem 2 above. In one implementation, the electronic device can further update the picture recommendation strategy based on the pictures selected by the user (which can be understood as on-device self-learning), achieving personalization and continuous updating of the picture recommendation strategy and solving technical problem 3 above. The user can thus obtain a satisfactory picture conveniently and quickly, improving the user experience.
In this application, the electronic device may be a mobile phone, a tablet computer, a handheld computer, a desktop computer, a laptop computer, an ultra-mobile personal computer (UMPC), a netbook, a cellular phone, a personal digital assistant (PDA), a smart home device such as a smart TV or smart camera, a wearable device such as a smart band, smart watch, or smart glasses, an extended reality (XR) device such as an augmented reality (AR), virtual reality (VR), or mixed reality (MR) device, a vehicle-mounted device, or a smart city device. The embodiments of this application place no particular restriction on the specific type of the electronic device.
接下来介绍本申请实施例提供的示例性的电子设备100。Next, an exemplary electronic device 100 provided by an embodiment of the present application is introduced.
图1示例性示出了一种电子设备100的硬件结构示意图。FIG. 1 exemplarily shows a schematic diagram of the hardware structure of an electronic device 100 .
电子设备100可以包括处理器110,外部存储器接口120,内部存储器121,通用串行总线(universal serial bus,USB)接口130,充电管理模块140,电源管理模块141,电池142,天线1,天线2,移动通信模块150,无线通信模块160,音频模块170,扬声器170A,受话器170B,麦克风170C,耳机接口170D,传感器模块180,按键190,马达191,指示器192,摄像头193,显示屏194,以及用户标识模块(subscriber identification module,SIM)卡接口195等。其中传感器模块180可以包括压力传感器180A,陀螺仪传感器180B,气压传感器180C,磁传感器180D,加速度传感器180E,距离传感器180F,接近光传感器180G,指纹传感器180H,温度传感器180J,触摸传感器180K,环境光传感器180L,骨传导传感器180M等。The electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2 , mobile communication module 150, wireless communication module 160, audio module 170, speaker 170A, receiver 170B, microphone 170C, headphone interface 170D, sensor module 180, button 190, motor 191, indicator 192, camera 193, display screen 194, and Subscriber identification module (SIM) card interface 195, etc. The sensor module 180 may include a pressure sensor 180A, a gyro sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, and ambient light. Sensor 180L, bone conduction sensor 180M, etc.
可以理解的是,本发明实施例示意的结构并不构成对电子设备100的具体限定。在本申请另一些实施例中,电子设备100可以包括比图示更多或更少的部件,或者组合某些部件,或者拆分某些部件,或者不同的部件布置。图示的部件可以以硬件,软件或软件和硬件的组合实现。It can be understood that the structure illustrated in the embodiment of the present invention does not constitute a specific limitation on the electronic device 100 . In other embodiments of the present application, the electronic device 100 may include more or fewer components than shown in the figures, or some components may be combined, some components may be separated, or some components may be arranged differently. The components illustrated may be implemented in hardware, software, or a combination of software and hardware.
处理器110可以包括一个或多个处理单元,例如:处理器110可以包括应用处理器(application processor,AP),调制解调处理器,图形处理器(graphics processing unit,GPU),图像信号处理器(image signal processor,ISP),控制器,视频编解码器,数字信号处理器(digital signal processor,DSP),基带处理器,和/或神经网络处理器(neural-network processing unit,NPU)等。其中,不同的处理单元可以是独立的器件,也可以集成在一个或多个处理器中。The processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (application processor, AP), a modem processor, a graphics processing unit (GPU), and an image signal processor. (image signal processor, ISP), controller, video codec, digital signal processor (digital signal processor, DSP), baseband processor, and/or neural network processor (neural-network processing unit, NPU), etc. Among them, different processing units can be independent devices or integrated in one or more processors.
控制器可以根据指令操作码和时序信号,产生操作控制信号,完成取指令和执行指令的控制。The controller can generate operation control signals based on the instruction operation code and timing signals to complete the control of fetching and executing instructions.
处理器110中还可以设置存储器,用于存储指令和数据。在一些实施例中,处理器110中的存储器为高速缓冲存储器。该存储器可以保存处理器110刚用过或循环使用的指令或数据。如果处理器110需要再次使用该指令或数据,可从所述存储器中直接调用。避免了重复存取,减少了处理器110的等待时间,因而提高了系统的效率。The processor 110 may also be provided with a memory for storing instructions and data. In some embodiments, the memory in processor 110 is cache memory. This memory may hold instructions or data that have been recently used or recycled by processor 110 . If the processor 110 needs to use the instructions or data again, it can be called directly from the memory. Repeated access is avoided and the waiting time of the processor 110 is reduced, thus improving the efficiency of the system.
在一些实施例中,处理器110可以包括一个或多个接口。接口可以包括集成电路(inter-integrated circuit,I2C)接口,集成电路内置音频(inter-integrated circuit sound,I2S)接口,脉冲编码调制(pulse code modulation,PCM)接口,通用异步收发传输器(universal asynchronous receiver/transmitter,UART)接口,移动产业处理器接口(mobile industry processor interface,MIPI),通用输入输出(general-purpose input/output,GPIO)接口,用户标识模块(subscriber identity module,SIM)接口,和/或通用串行总线(universal serial bus,USB)接口等。 In some embodiments, processor 110 may include one or more interfaces. Interfaces may include integrated circuit (inter-integrated circuit, I2C) interface, integrated circuit built-in audio (inter-integrated circuit sound, I2S) interface, pulse code modulation (PCM) interface, universal asynchronous receiver and transmitter (universal asynchronous receiver/transmitter (UART) interface, mobile industry processor interface (MIPI), general-purpose input/output (GPIO) interface, subscriber identity module (SIM) interface, and /or universal serial bus (USB) interface, etc.
充电管理模块140用于从充电器接收充电输入。The charging management module 140 is used to receive charging input from the charger.
电源管理模块141用于连接电池142,充电管理模块140与处理器110。电源管理模块141接收电池142和/或充电管理模块140的输入,为处理器110,内部存储器121,显示屏194,摄像头193,和无线通信模块160等供电。The power management module 141 is used to connect the battery 142, the charging management module 140 and the processor 110. The power management module 141 receives input from the battery 142 and/or the charging management module 140, and supplies power to the processor 110, the internal memory 121, the display screen 194, the camera 193, the wireless communication module 160, and the like.
电子设备100的无线通信功能可以通过天线1,天线2,移动通信模块150,无线通信模块160,调制解调处理器以及基带处理器等实现。The wireless communication function of the electronic device 100 can be implemented through the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor and the baseband processor.
天线1和天线2用于发射和接收电磁波信号。电子设备100中的每个天线可用于覆盖单个或多个通信频带。不同的天线还可以复用,以提高天线的利用率。例如:可以将天线1复用为无线局域网的分集天线。Antenna 1 and Antenna 2 are used to transmit and receive electromagnetic wave signals. Each antenna in electronic device 100 may be used to cover a single or multiple communication frequency bands. Different antennas can also be reused to improve antenna utilization. For example: Antenna 1 can be reused as a diversity antenna for a wireless LAN.
移动通信模块150可以提供应用在电子设备100上的包括2G/3G/4G/5G/6G等无线通信的解决方案。移动通信模块150可以包括至少一个滤波器,开关,功率放大器,低噪声放大器(low noise amplifier,LNA)等。移动通信模块150可以由天线1接收电磁波,并对接收的电磁波进行滤波,放大等处理,传送至调制解调处理器进行解调。移动通信模块150还可以对经调制解调处理器调制后的信号放大,经天线1转为电磁波辐射出去。在一种实施方式中,移动通信模块150的至少部分功能模块可以被设置于处理器110中。在一种实施方式中,移动通信模块150的至少部分功能模块可以与处理器110的至少部分模块设置在同一个器件中。The mobile communication module 150 can provide wireless communication solutions including 2G/3G/4G/5G/6G applied on the electronic device 100 . The mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (LNA), etc. The mobile communication module 150 can receive electromagnetic waves through the antenna 1, perform filtering, amplification and other processing on the received electromagnetic waves, and transmit them to the modem processor for demodulation. The mobile communication module 150 can also amplify the signal modulated by the modem processor and convert it into electromagnetic waves through the antenna 1 for radiation. In one implementation, at least part of the functional modules of the mobile communication module 150 may be disposed in the processor 110 . In one implementation, at least part of the functional modules of the mobile communication module 150 and at least part of the modules of the processor 110 may be provided in the same device.
调制解调处理器可以包括调制器和解调器。其中,调制器用于将待发送的低频基带信号调制成中高频信号。解调器用于将接收的电磁波信号解调为低频基带信号。随后解调器将解调得到的低频基带信号传送至基带处理器处理。低频基带信号经基带处理器处理后,被传递给应用处理器。应用处理器通过音频设备(不限于扬声器170A,受话器170B等)输出声音信号,或通过显示屏194显示图像或视频。在一种实施方式中,调制解调处理器可以是独立的器件。在另一种实施方式中,调制解调处理器可以独立于处理器110,与移动通信模块150或其他功能模块设置在同一个器件中。A modem processor may include a modulator and a demodulator. Among them, the modulator is used to modulate the low-frequency baseband signal to be sent into a medium-high frequency signal. The demodulator is used to demodulate the received electromagnetic wave signal into a low-frequency baseband signal. The demodulator then transmits the demodulated low-frequency baseband signal to the baseband processor for processing. After the low-frequency baseband signal is processed by the baseband processor, it is passed to the application processor. The application processor outputs sound signals through audio devices (not limited to speaker 170A, receiver 170B, etc.), or displays images or videos through display screen 194. In one implementation, the modem processor may be a stand-alone device. In another implementation, the modem processor may be independent of the processor 110 and may be provided in the same device as the mobile communication module 150 or other functional modules.
无线通信模块160可以提供应用在电子设备100上的包括无线局域网(wireless local area networks,WLAN)(如无线保真(wireless fidelity,Wi-Fi)网络),蓝牙(bluetooth,BT),全球导航卫星系统(global navigation satellite system,GNSS),调频(frequency modulation,FM),近距离无线通信技术(near field communication,NFC),红外技术(infrared,IR)等无线通信的解决方案。无线通信模块160可以是集成至少一个通信处理模块的一个或多个器件。无线通信模块160经由天线2接收电磁波,将电磁波信号调频以及滤波处理,将处理后的信号发送到处理器110。无线通信模块160还可以从处理器110接收待发送的信号,对其进行调频,放大,经天线2转为电磁波辐射出去。The wireless communication module 160 can provide applications on the electronic device 100 including wireless local area networks (WLAN) (such as wireless fidelity (Wi-Fi) network), Bluetooth (bluetooth, BT), and global navigation satellites. System (global navigation satellite system, GNSS), frequency modulation (frequency modulation, FM), near field communication technology (near field communication, NFC), infrared technology (infrared, IR) and other wireless communication solutions. The wireless communication module 160 may be one or more devices integrating at least one communication processing module. The wireless communication module 160 receives electromagnetic waves via the antenna 2 , frequency modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110 . The wireless communication module 160 can also receive the signal to be sent from the processor 110, frequency modulate it, amplify it, and convert it into electromagnetic waves through the antenna 2 for radiation.
在一种实施方式中,电子设备100的天线1和移动通信模块150耦合,天线2和无线通信模块160耦合,使得电子设备100可以通过无线通信技术与网络以及其他设备通信。上述无线通信技术可以包括全球移动通讯系统(global system for mobile communications,GSM),通用分组无线服务(general packet radio service,GPRS),码分多址接入(code division multiple access,CDMA),宽带码分多址(wideband code division multiple access,WCDMA),时分码分多址(time-division code division multiple access,TD-SCDMA),长期演进(long term evolution,LTE),BT,GNSS,WLAN,NFC,FM,和/或IR技术等。所述GNSS可以包括全球卫星定位系统(global positioning system,GPS),全球导航卫星系统(global navigation satellite system,GLONASS),北斗卫星导航系统(beidou navigation satellite system,BDS),准天顶卫星系统(quasi-zenith satellite system,QZSS)和/或星基增强系统(satellite based augmentation systems,SBAS)。In one implementation, the antenna 1 of the electronic device 100 is coupled to the mobile communication module 150, and the antenna 2 is coupled to the wireless communication module 160, so that the electronic device 100 can communicate with the network and other devices through wireless communication technology. The above-mentioned wireless communication technologies may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), broadband code Wideband code division multiple access (WCDMA), time-division code division multiple access (TD-SCDMA), long term evolution (long term evolution, LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technology, etc. The GNSS may include global positioning system (GPS), global navigation satellite system (GLONASS), Beidou navigation satellite system (BDS), quasi-zenith satellite system (quasi) -zenith satellite system (QZSS) and/or satellite based augmentation systems (SBAS).
电子设备100通过GPU,显示屏194,以及应用处理器等实现显示功能。GPU为图像处理的微处理器,连接显示屏194和应用处理器。GPU用于执行数学和几何计算,用于图形渲染。处理器110可包括一个或多个GPU,其执行程序指令以生成或改变显示信息。The electronic device 100 implements display functions through a GPU, a display screen 194, an application processor, and the like. The GPU is an image processing microprocessor and is connected to the display screen 194 and the application processor. GPUs are used to perform mathematical and geometric calculations for graphics rendering. Processor 110 may include one or more GPUs that execute program instructions to generate or alter display information.
显示屏194用于显示图像,视频等。显示屏194包括显示面板。显示面板可以采用液晶显示屏(liquid crystal display,LCD),有机发光二极管(organic light-emitting diode,OLED),有源矩阵有机发光二极体或主动矩阵有机发光二极体(active-matrix organic light emitting diode的,AMOLED),柔性发光二极管(flex light-emitting diode,FLED),Miniled,MicroLed,Micro-oLed,量子点发光二极管(quantum dot light emitting diodes,QLED)等。在一种实施方式中,电子设备100可以包括1个或N个显示屏194,N为大于1的正整数。The display screen 194 is used to display images, videos, etc. Display 194 includes a display panel. The display panel can use a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active matrix organic light emitting diode or an active matrix organic light emitting diode (active-matrix organic light emitting diode). emitting diode (AMOLED), flexible light-emitting diode (FLED), Miniled, MicroLed, Micro-oLed, quantum dot light emitting diode (QLED), etc. In one implementation, the electronic device 100 may include 1 or N display screens 194, where N is a positive integer greater than 1.
电子设备100可以通过ISP,摄像头193,视频编解码器,GPU,显示屏194以及应用处理器等实现拍摄功能。The electronic device 100 can implement the shooting function through an ISP, a camera 193, a video codec, a GPU, a display screen 194, an application processor, and the like.
The ISP is used to process data fed back by the camera 193. For example, when a photo is taken, the shutter opens and light is transmitted through the lens to the camera's photosensitive element, which converts the optical signal into an electrical signal and passes it to the ISP for processing, turning it into an image visible to the naked eye. The ISP can also perform algorithm optimization on the noise, brightness, and color of the image, and can optimize parameters such as the exposure and color temperature of the shooting scene. In one implementation, the ISP may be provided in the camera 193.
摄像头193用于捕获静态图像或视频。物体通过镜头生成光学图像投射到感光元件。感光元件可以是电荷耦合器件(charge coupled device,CCD)或互补金属氧化物半导体(complementary metal-oxide-semiconductor,CMOS)光电晶体管。感光元件把光信号转换成电信号,之后将电信号传递给ISP转换成数字图像信号。ISP将数字图像信号输出到DSP加工处理。DSP将数字图像信号转换成标准的RGB,YUV等格式的图像信号。在一种实施方式中,电子设备100可以包括1个或N个摄像头193,N为大于1的正整数。Camera 193 is used to capture still images or video. The object passes through the lens to produce an optical image that is projected onto the photosensitive element. The photosensitive element can be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The photosensitive element converts the optical signal into an electrical signal, and then passes the electrical signal to the ISP to convert it into a digital image signal. ISP outputs digital image signals to DSP for processing. DSP converts digital image signals into standard RGB, YUV and other format image signals. In one implementation, the electronic device 100 may include 1 or N cameras 193, where N is a positive integer greater than 1.
外部存储器接口120可以用于连接外部存储卡,例如Micro SD卡,实现扩展电子设备100的存储能力。The external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device 100.
内部存储器121可以用于存储计算机可执行程序代码,所述可执行程序代码包括指令。处理器110通过运行存储在内部存储器121的指令,和/或存储在设置于处理器中的存储器的指令,执行电子设备100的各种功能应用以及数据处理。Internal memory 121 may be used to store computer executable program code, which includes instructions. The processor 110 executes various functional applications and data processing of the electronic device 100 by executing instructions stored in the internal memory 121 and/or instructions stored in a memory provided in the processor.
电子设备100可以通过音频模块170,扬声器170A,受话器170B,麦克风170C,耳机接口170D,以及应用处理器等实现音频功能。例如音乐播放,录音等。The electronic device 100 can implement audio functions through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the headphone interface 170D, and the application processor. Such as music playback, recording, etc.
音频模块170用于将数字音频信息转换成模拟音频信号输出,也用于将模拟音频输入转换为数字音频信号。音频模块170还可以用于对音频信号编码和解码。The audio module 170 is used to convert digital audio information into analog audio signal output, and is also used to convert analog audio input into digital audio signals. Audio module 170 may also be used to encode and decode audio signals.
扬声器170A,也称“喇叭”,用于将音频电信号转换为声音信号。Speaker 170A, also called "speaker", is used to convert audio electrical signals into sound signals.
受话器170B,也称“听筒”,用于将音频电信号转换成声音信号。Receiver 170B, also called "earpiece", is used to convert audio electrical signals into sound signals.
麦克风170C,也称“话筒”,“传声器”,用于将声音信号转换为电信号。Microphone 170C, also called "microphone" or "microphone", is used to convert sound signals into electrical signals.
耳机接口170D用于连接有线耳机。The headphone interface 170D is used to connect wired headphones.
压力传感器180A用于感受压力信号,可以将压力信号转换成电信号。在一种实施方式中,压力传感器180A可以设置于显示屏194。压力传感器180A的种类很多,如电阻式压力传感器,电感式压力传感器,电容式压力传感器等。电容式压力传感器可以是包括至少两个具有导电材料的平行板。当有力作用于压力传感器180A,电极之间的电容改变。电子设备100根据电容的变化确定压力的强度。当有触摸操作作用于显示屏194,电子设备100根据压力传感器180A检测所述触摸操作强度。电子设备100也可以根据压力传感器180A的检测信号计算触摸的位置。The pressure sensor 180A is used to sense pressure signals and can convert the pressure signals into electrical signals. In one implementation, the pressure sensor 180A may be disposed on the display screen 194 . There are many types of pressure sensors 180A, such as resistive pressure sensors, inductive pressure sensors, capacitive pressure sensors, etc. A capacitive pressure sensor may include at least two parallel plates of conductive material. When a force is applied to pressure sensor 180A, the capacitance between the electrodes changes. The electronic device 100 determines the intensity of the pressure based on the change in capacitance. When a touch operation is performed on the display screen 194, the electronic device 100 detects the intensity of the touch operation according to the pressure sensor 180A. The electronic device 100 may also calculate the touched position based on the detection signal of the pressure sensor 180A.
陀螺仪传感器180B可以用于确定电子设备100的运动姿态。在一种实施方式中,可以通过陀螺仪传感器180B确定电子设备100围绕三个轴(即,x,y和z轴)的角速度。The gyro sensor 180B may be used to determine the motion posture of the electronic device 100 . In one implementation, the angular velocity of electronic device 100 about three axes (ie, x, y, and z axes) may be determined by gyro sensor 180B.
气压传感器180C用于测量气压。Air pressure sensor 180C is used to measure air pressure.
磁传感器180D包括霍尔传感器。电子设备100可以利用磁传感器180D检测翻盖皮套的开合。Magnetic sensor 180D includes a Hall sensor. The electronic device 100 may utilize the magnetic sensor 180D to detect opening and closing of the flip holster.
加速度传感器180E可检测电子设备100在各个方向上(一般为三轴)加速度的大小。The acceleration sensor 180E can detect the acceleration of the electronic device 100 in various directions (generally three axes).
距离传感器180F,用于测量距离。Distance sensor 180F for measuring distance.
接近光传感器180G可以包括例如发光二极管(LED)和光检测器,例如光电二极管。发光二极管可以是红外发光二极管。电子设备100通过发光二极管向外发射红外光。电子设备100使用光电二极管检测来自附近物体的红外反射光。当检测到充分的反射光时,可以确定电子设备100附近有物体。当检测到不充分的反射光时,电子设备100可以确定电子设备100附近没有物体。Proximity light sensor 180G may include, for example, a light emitting diode (LED) and a light detector, such as a photodiode. The light emitting diode may be an infrared light emitting diode. The electronic device 100 emits infrared light outwardly through the light emitting diode. Electronic device 100 uses photodiodes to detect infrared reflected light from nearby objects. When sufficient reflected light is detected, it can be determined that there is an object near the electronic device 100 . When insufficient reflected light is detected, the electronic device 100 may determine that there is no object near the electronic device 100 .
环境光传感器180L用于感知环境光亮度。The ambient light sensor 180L is used to sense ambient light brightness.
指纹传感器180H用于采集指纹。电子设备100可以利用采集的指纹特性实现指纹解锁,访问应用锁,指纹拍照,指纹接听来电等。Fingerprint sensor 180H is used to collect fingerprints. The electronic device 100 can use the collected fingerprint characteristics to achieve fingerprint unlocking, access to application locks, fingerprint photography, fingerprint answering of incoming calls, etc.
温度传感器180J用于检测温度。Temperature sensor 180J is used to detect temperature.
触摸传感器180K,也称“触控器件”。触摸传感器180K可以设置于显示屏194,由触摸传感器180K与显示屏194组成触摸屏,也称“触控屏”。触摸传感器180K用于检测作用于其上或附近的触摸操作。触摸传感器可以将检测到的触摸操作传递给应用处理器,以确定触摸事件类型。可以通过显示屏194提供与触摸操作相关的视觉输出。在另一种实施方式中,触摸传感器180K也可以设置于电子设备100的表面,与显示屏194所处的位置不同。Touch sensor 180K, also known as "touch device". The touch sensor 180K can be disposed on the display screen 194. The touch sensor 180K and the display screen 194 form a touch screen, which is also called a "touch screen". The touch sensor 180K is used to detect a touch operation on or near the touch sensor 180K. The touch sensor can pass the detected touch operation to the application processor to determine the touch event type. Visual output related to the touch operation may be provided through display screen 194 . In another implementation, the touch sensor 180K may also be disposed on the surface of the electronic device 100 in a position different from that of the display screen 194 .
骨传导传感器180M可以获取振动信号。Bone conduction sensor 180M can acquire vibration signals.
按键190包括开机键,音量键等。 The buttons 190 include a power button, a volume button, etc.
马达191可以产生振动提示。The motor 191 can generate vibration prompts.
指示器192可以是指示灯,可以用于指示充电状态,电量变化,也可以用于指示消息,未接来电,通知等。The indicator 192 may be an indicator light, which may be used to indicate charging status, power changes, or may be used to indicate messages, missed calls, notifications, etc.
SIM卡接口195用于连接SIM卡。The SIM card interface 195 is used to connect a SIM card.
电子设备100的软件系统可以采用分层架构,事件驱动架构,微核架构,微服务架构,或云架构。例如,分层架构的软件系统可以是安卓(Android)系统,也可以是鸿蒙(harmony)操作系统(operating system,OS),或其它软件系统。本申请实施例以分层架构的Android系统为例,示例性说明电子设备100的软件结构。The software system of the electronic device 100 may adopt a layered architecture, an event-driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture. For example, the layered architecture software system can be the Android system, the Harmony operating system (operating system, OS), or other software systems. The embodiment of this application takes the Android system with a layered architecture as an example to illustrate the software structure of the electronic device 100 .
图2示例性示出一种电子设备100的软件架构示意图。FIG. 2 exemplarily shows a schematic diagram of the software architecture of the electronic device 100 .
分层架构将软件分成若干个层,每一层都有清晰的角色和分工。层与层之间通过软件接口通信。在一些实施例中,将Android系统分为四层,从上至下分别为应用程序层,应用程序框架层,安卓运行时(Android runtime)和系统库,以及内核层。The layered architecture divides the software into several layers, and each layer has clear roles and division of labor. The layers communicate through software interfaces. In some embodiments, the Android system is divided into four layers, from top to bottom: application layer, application framework layer, Android runtime and system libraries, and kernel layer.
应用程序层可以包括一系列应用程序包。The application layer can include a series of application packages.
如图2所示,应用程序包可以包括相机,图库,音乐,日历,短信息,通话,导航,蓝牙,浏览器等应用程序。本申请中的应用程序包也可以替换为小程序等其他形式的软件。As shown in Figure 2, the application package can include camera, gallery, music, calendar, short message, call, navigation, Bluetooth, browser and other applications. The application package in this application can also be replaced by other forms of software such as applets.
应用程序框架层为应用程序层的应用程序提供应用编程接口(application programming interface,API)和编程框架。应用程序框架层包括一些预先定义的函数。The application framework layer provides an application programming interface (API) and programming framework for applications in the application layer. The application framework layer includes some predefined functions.
如图2所示,应用程序框架层可以包括窗口管理器,内容提供器,视图系统,电话管理器,资源管理器,通知管理器等。As shown in Figure 2, the application framework layer can include a window manager, content provider, view system, phone manager, resource manager, notification manager, etc.
窗口管理器用于管理窗口程序。窗口管理器可以获取显示屏大小,判断是否有状态栏,锁定屏幕,截取屏幕等。A window manager is used to manage window programs. The window manager can obtain the display size, determine whether there is a status bar, lock the screen, capture the screen, etc.
内容提供器用来存放和获取数据,并使这些数据可以被应用程序访问。所述数据可以包括视频,图像,音频,拨打和接听的电话,浏览历史和书签,电话簿等。Content providers are used to store and retrieve data and make this data accessible to applications. Said data can include videos, images, audio, calls made and received, browsing history and bookmarks, phone books, etc.
视图系统包括可视控件,例如显示文字的控件,显示图片的控件等。视图系统可用于构建应用程序。显示界面可以由一个或多个视图组成的。例如,包括短信通知图标的显示界面,可以包括显示文字的视图以及显示图片的视图。The view system includes visual controls, such as controls that display text, controls that display pictures, etc. A view system can be used to build applications. The display interface can be composed of one or more views. For example, a display interface including a text message notification icon may include a view for displaying text and a view for displaying pictures.
电话管理器用于提供电子设备100的通信功能。例如通话状态的管理(包括接通,挂断等)。The phone manager is used to provide communication functions of the electronic device 100 . For example, call status management (including connected, hung up, etc.).
资源管理器为应用程序提供各种资源,比如本地化字符串,图标,图片,布局文件,视频文件等等。The resource manager provides various resources to applications, such as localized strings, icons, pictures, layout files, video files, etc.
通知管理器使应用程序可以在状态栏中显示通知信息,可以用于传达告知类型的消息,可以短暂停留后自动消失,无需用户交互。比如通知管理器被用于告知下载完成,消息提醒等。通知管理器还可以是以图表或者滚动条文本形式出现在系统顶部状态栏的通知,例如后台运行的应用程序的通知,还可以是以对话窗口形式出现在屏幕上的通知。例如在状态栏提示文本信息,发出提示音,电子设备振动,指示灯闪烁等。The notification manager allows applications to display notification information in the status bar, which can be used to convey notification-type messages and can automatically disappear after a short stay without user interaction. For example, the notification manager is used to notify download completion, message reminders, etc. The notification manager can also be notifications that appear in the status bar at the top of the system in the form of charts or scroll bar text, such as notifications for applications running in the background, or notifications that appear on the screen in the form of conversation windows. For example, text information is prompted in the status bar, a beep sounds, the electronic device vibrates, the indicator light flashes, etc.
Android Runtime包括核心库和虚拟机。Android runtime负责安卓系统的调度和管理。Android Runtime includes core libraries and virtual machines. Android runtime is responsible for the scheduling and management of the Android system.
核心库包含两部分:一部分是java语言需要调用的功能函数,另一部分是安卓的核心库。The core library contains two parts: one is the functional functions that need to be called by the Java language, and the other is the core library of Android.
应用程序层和应用程序框架层运行在虚拟机中。虚拟机将应用程序层和应用程序框架层的java文件执行为二进制文件。虚拟机用于执行对象生命周期的管理,堆栈管理,线程管理,安全和异常的管理,以及垃圾回收等功能。The application layer and application framework layer run in virtual machines. The virtual machine executes the java files of the application layer and application framework layer into binary files. The virtual machine is used to perform object life cycle management, stack management, thread management, security and exception management, and garbage collection and other functions.
系统库可以包括多个功能模块。例如:表面管理器(surface manager),媒体库(Media Libraries),三维图形处理库(例如:OpenGL ES),2D图形引擎(例如:SGL)等。System libraries can include multiple functional modules. For example: surface manager (surface manager), media libraries (Media Libraries), 3D graphics processing libraries (for example: OpenGL ES), 2D graphics engines (for example: SGL), etc.
表面管理器用于对显示子系统进行管理,并且为多个应用程序提供了2D和3D图层的融合。The surface manager is used to manage the display subsystem and provides the fusion of 2D and 3D layers for multiple applications.
媒体库支持多种常用的音频,视频格式回放和录制,以及静态图像文件等。媒体库可以支持多种音视频编码格式,例如:MPEG4,H.264,MP3,AAC,AMR,JPG,PNG等。The media library supports playback and recording of a variety of commonly used audio and video formats, as well as static image files, etc. The media library can support a variety of audio and video encoding formats, such as: MPEG4, H.264, MP3, AAC, AMR, JPG, PNG, etc.
三维图形处理库用于实现三维图形绘图,图像渲染,合成,和图层处理等。The 3D graphics processing library is used to implement 3D graphics drawing, image rendering, composition, and layer processing.
2D图形引擎是2D绘图的绘图引擎。2D Graphics Engine is a drawing engine for 2D drawing.
内核层是硬件和软件之间的层。内核层至少包含显示驱动,摄像头驱动,音频驱动,传感器驱动。The kernel layer is the layer between hardware and software. The kernel layer contains at least display driver, camera driver, audio driver, and sensor driver.
下面结合捕获拍照场景,示例性说明电子设备100软件以及硬件的工作流程。The following exemplifies the workflow of the software and hardware of the electronic device 100 in conjunction with capturing the photographing scene.
When the touch sensor 180K receives a touch operation, a corresponding hardware interrupt is sent to the kernel layer. The kernel layer processes the touch operation into a raw input event (including information such as the touch coordinates and the timestamp of the touch operation). The raw input event is stored at the kernel layer. The application framework layer obtains the raw input event from the kernel layer and identifies the control corresponding to the input event. Taking the touch operation being a tap operation and the control corresponding to the tap being the camera application icon as an example, the camera application calls an interface of the application framework layer to start the camera application, then starts the camera driver by calling the kernel layer, and captures a still image or video through the camera 193.
接下来介绍本申请实施例提供的图片推荐方法。Next, the image recommendation method provided by the embodiment of this application is introduced.
请参见图3,图3是本申请实施例提供的一种图片推荐方法的流程示意图。该方法可以应用于图1所示的电子设备100。该方法可以应用于图2所示的电子设备100。该方法可以包括但不限于如下步骤:Please refer to Figure 3. Figure 3 is a schematic flowchart of an image recommendation method provided by an embodiment of the present application. This method can be applied to the electronic device 100 shown in FIG. 1 . This method can be applied to the electronic device 100 shown in FIG. 2 . The method may include but is not limited to the following steps:
S101:电子设备获取第一图片序列。S101: The electronic device obtains the first picture sequence.
本申请中的图片序列是指至少一张图片。A picture sequence in this application refers to at least one picture.
In one implementation, the electronic device can capture the first picture sequence through a camera. In another implementation, the electronic device can obtain a first picture sequence shot by a connected device. In another implementation, the electronic device can obtain the first picture sequence from the memory of the electronic device, for example from the gallery of the electronic device. In another implementation, the electronic device can obtain a first picture sequence stored by a network device; for example, when the user uses the cloud album application of the electronic device, the electronic device can, in response to a user operation for selecting the first picture sequence in the cloud album, send a request message to the application server of the cloud album and receive the first picture sequence sent by the application server. Without being limited thereto, the first picture sequence can also be obtained through at least two of the above implementations; for example, the electronic device captures some pictures of the first picture sequence through its camera and obtains the remaining pictures of the first picture sequence, shot by a connected device. This application does not limit the specific manner of obtaining the first picture sequence.
S102:电子设备基于第一图片序列生成第二图片序列。S102: The electronic device generates a second picture sequence based on the first picture sequence.
在一种实施方式中,电子设备可以基于第一图片序列在时间维度和/或空间维度上生成第二图片序列。In one implementation, the electronic device may generate a second picture sequence in a temporal dimension and/or a spatial dimension based on the first picture sequence.
在一种实施方式中,电子设备可以基于第一图片序列在时间维度上生成至少一张图片,第二图片序列包括这至少一张图片。在一些示例中,电子设备可以先获取第一图片序列中每张图片的拍摄时间(可简称为时间戳),假设其中最小的时间戳和最大的时间戳分别为时间戳1和时间戳2,然后电子设备可以基于这些时间戳和第一图片序列生成时间戳更细的至少一张图片,其中,这至少一张图片中每张图片的时间戳大于时间戳1且小于时间戳2,并且和第一图片序列中任意一张图片的时间戳不同,实现方式例如类似视频插帧。上述过程的示例可参见下图4。In one implementation, the electronic device can generate at least one picture in the time dimension based on the first picture sequence, and the second picture sequence includes the at least one picture. In some examples, the electronic device can first obtain the shooting time of each picture in the first picture sequence (which can be referred to as a timestamp for short), assuming that the minimum timestamp and the maximum timestamp are timestamp 1 and timestamp 2 respectively, The electronic device can then generate at least one picture with a more detailed timestamp based on these timestamps and the first picture sequence, wherein the timestamp of each picture in the at least one picture is greater than timestamp 1 and less than timestamp 2, and The timestamp of any picture in the first picture sequence is different, and the implementation method is similar to video frame insertion, for example. An example of the above process can be seen in Figure 4 below.
As shown in Figure 4, the first picture sequence may include four pictures: picture 1, picture 2, picture 3 and picture 4, whose timestamps in ascending order are t1, t2, t3 and t4. The electronic device may generate three pictures in the time dimension based on the first picture sequence: picture 5 with timestamp t5 between t1 and t2, picture 6 with timestamp t6 between t2 and t3, and picture 7 with timestamp t7 between t3 and t4. The situation is not limited to that shown in Figure 4. In other examples, for any two adjacent timestamps of the first picture sequence, the electronic device may generate multiple pictures whose timestamps lie between the two, for example multiple pictures with timestamps between t1 and t2; in still other examples, for any two adjacent timestamps, the electronic device may generate no picture between them, for example picture 5 with timestamp t5 between t1 and t2 is not generated. This application does not limit the specific generation method.
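For illustration only, the following Python sketch shows how intermediate timestamps could be chosen between adjacent pictures; the interpolate callable stands in for a frame-interpolation model and is a hypothetical name, not part of this application.

```python
import numpy as np

def densify_in_time(pictures, interpolate, ratio=0.5):
    """Generate pictures whose timestamps fall strictly between adjacent
    timestamps of the input sequence (time-dimension part of S102).

    pictures    -- list of (timestamp, image) tuples, image e.g. an np.ndarray
    interpolate -- callable (img_a, img_b, alpha) -> new image; stands in for a
                   video-frame-interpolation model (hypothetical placeholder)
    ratio       -- relative position of the new timestamp between neighbours
    """
    pictures = sorted(pictures, key=lambda p: p[0])
    generated = []
    for (t_a, img_a), (t_b, img_b) in zip(pictures, pictures[1:]):
        t_new = t_a + ratio * (t_b - t_a)           # strictly between t_a and t_b
        img_new = interpolate(img_a, img_b, ratio)  # e.g. flow-based interpolation
        generated.append((t_new, img_new))
    return generated                                # part of the "second picture sequence"

# Toy usage with naive linear blending as the stand-in interpolator:
blend = lambda a, b, alpha: ((1 - alpha) * a + alpha * b).astype(a.dtype)
seq1 = [(t, np.random.randint(0, 255, (4, 4, 3), dtype=np.uint8)) for t in (1.0, 2.0, 3.0, 4.0)]
print([t for t, _ in densify_in_time(seq1, blend)])  # [1.5, 2.5, 3.5]
```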
In one implementation, the electronic device may generate at least one picture in the spatial dimension based on the first picture sequence, for example by performing novel-view synthesis based on a neural radiance field (NeRF) to obtain the at least one picture; the second picture sequence includes the at least one picture. In some examples, for any picture in the first picture sequence (which may be called a reference picture), the electronic device may generate one or more pictures with different viewing angles, where the timestamp of the one or more pictures is the timestamp of the reference picture, the viewing angle of any one of the one or more pictures differs from that of the reference picture, and, if multiple pictures are generated, they correspond to different viewing angles. This process can be understood as performing novel-view synthesis from different viewing angles for a fixed timestamp, thereby obtaining at least one picture with a new viewing angle. The electronic device may use some or all of the pictures in the first picture sequence as reference pictures to generate at least one picture covering more viewing angles. An example of the above process can be seen in Figure 5 below; Figure 5 takes picture 1 in Figure 4 as the reference picture, and the subject in picture 1 is human body 1.
As shown in Figure 5, human body 1 can be abstracted as a cube, and human body 1/the cube can be observed from different viewing angles, for example but not limited to: observing the front of human body 1 from the front view, the back of human body 1 from the rear view, the left side of human body 1 from the left view, the right side of human body 1 from the right view, and so on. Picture 1, used as the reference picture, was taken of human body 1 from the front view at timestamp t1. For timestamp t1 of picture 1, the electronic device may generate: picture 8 obtained by observing human body 1 from the rear view, picture 9 obtained by observing human body 1 from the left view, and picture 10 obtained by observing human body 1 from the right view. The situation is not limited to that shown in Figure 5; in other examples, the electronic device may generate pictures with more or fewer viewing angles, for example, for timestamp t1 of picture 1, the electronic device may also generate a picture obtained by observing human body 1 from a top-down view. This application does not limit the specific generation method.
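For illustration, the sketch below enumerates new viewing angles around a subject for one fixed timestamp; the render callable stands in for a trained novel-view-synthesis model (for example a NeRF), and the angle values are assumptions chosen to mirror the rear/left/right views of Figure 5.

```python
import math

def new_view_pictures(render, timestamp, azimuths_deg=(180, 90, 270), polar_deg=90):
    """Enumerate new viewing angles for one reference timestamp (space-dimension part of S102).

    render       -- callable (timestamp, azimuth_rad, polar_rad) -> image; hypothetical
                    stand-in for a trained novel-view-synthesis model such as a NeRF
    azimuths_deg -- e.g. 180/90/270 degrees for the rear, left and right views
    """
    pictures = []
    for az in azimuths_deg:
        img = render(timestamp, math.radians(az), math.radians(polar_deg))
        pictures.append((timestamp, az, img))   # same timestamp, new viewing angle
    return pictures
```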
S103: The electronic device uses an aesthetic evaluation model to score each picture in the first picture sequence and the second picture sequence, and obtains a third picture sequence with higher scores.
在一种实施方式中,电子设备可以将第一图片序列和第二图片序列中的每张图片作为美学评估模型的输入,得到对应的输出,该输出可以包括该图片在多个维度的得分,该多个维度例如但不限于包括以下多项:综合(对应的得分称为综合得分)、主体位置(对应的得分称为主体位置得分)、拍摄主体的动作舒展度(对应的得分称为动作舒展度得分)、拍摄主体的表情(对应的得分称为表情得分)、画质(对应的得分称为画质得分)等,可以理解为是美学评估模型可以从多个维度对图片进行评分。接下来以第一图片序列和第二图片序列中的任意一张图片:第一图片为例示例性说明美学评估模型的评分方式。In one implementation, the electronic device can use each picture in the first picture sequence and the second picture sequence as the input of the aesthetic evaluation model to obtain a corresponding output. The output can include scores of the picture in multiple dimensions, The multiple dimensions include, for example, but are not limited to, the following: comprehensive (the corresponding score is called the comprehensive score), subject position (the corresponding score is called the subject position score), and the shooting subject's movement and stretch (the corresponding score is called the action Stretch score), the subject's expression (the corresponding score is called the expression score), the image quality (the corresponding score is called the image quality score), etc. It can be understood that the aesthetic evaluation model can score pictures from multiple dimensions. Next, take any picture in the first picture sequence and the second picture sequence: the first picture as an example to illustrate the scoring method of the aesthetic evaluation model.
In some examples, the aesthetic evaluation model may determine the subject position score based on the rate of change of the speed of the subject in the first picture (i.e. the acceleration), where the acceleration may be determined from the speed of the subject in the first picture and in adjacent pictures; the adjacent pictures belong to the first picture sequence and the second picture sequence and are, for example but not limited to, pictures whose timestamps differ from the timestamp of the first picture by no more than a preset threshold in absolute value. For example, when the trend of the acceleration of the subject in the first picture is decreasing and its value is 0, the aesthetic evaluation model may consider that the subject in the first picture is at the highest point of its motion and may therefore set the subject position score of the first picture to the maximum value; the decreasing trend may include the upward acceleration gradually decreasing from a positive value (gradually approaching 0), and the trend may be obtained by comparison with the acceleration of the subject in previous pictures, where the previous pictures belong to the first picture sequence and the second picture sequence and have timestamps smaller than that of the first picture. Not limited to the above examples, in other examples the aesthetic evaluation model may determine the subject position score based on the size of the area occupied by the subject in the first picture, for example, the larger the occupied area, the higher the subject position score; in still other examples, the aesthetic evaluation model may determine the subject position score based on the position priority of the subject in the first picture, for example, when the subject is located in the middle, the position priority is highest and the subject position score may be set to the maximum value. This application does not limit this.
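For illustration only, a toy reading of the acceleration heuristic above is sketched below; the finite-difference estimation, the threshold and the mapping from frame differences to picture indices are assumptions, not the scoring rule of this application.

```python
def subject_position_scores(heights, timestamps, eps=1e-6):
    """Toy subject-position scoring: mark a picture as the assumed highest point of
    the motion when the estimated acceleration trend is decreasing and near zero.

    heights    -- vertical coordinate of the subject in each time-ordered picture
    timestamps -- corresponding timestamps
    Returns one score in [0, 1] per picture (0.5 = uninformative default).
    """
    speed = [(heights[i + 1] - heights[i]) / (timestamps[i + 1] - timestamps[i])
             for i in range(len(heights) - 1)]
    accel = [(speed[i + 1] - speed[i]) / (timestamps[i + 1] - timestamps[i])
             for i in range(len(speed) - 1)]
    scores = [0.5] * len(heights)
    for i in range(1, len(accel)):
        decreasing = accel[i] < accel[i - 1]     # trend is getting smaller
        near_zero = abs(accel[i]) < eps
        if decreasing and near_zero:
            scores[i + 1] = 1.0                  # picture roughly centred on this difference
    return scores
```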
In some examples, the aesthetic evaluation model may determine the motion stretch score based on the distances between skeletal keypoints of the subject in the first picture. For example, the larger the distance, the more stretched the aesthetic evaluation model considers the subject's motion in the first picture to be and the higher the resulting motion stretch score; the smaller the distance (for example, the limbs are not spread out when the human body jumps), the lower the resulting motion stretch score. An example of the skeletal keypoints of the subject can be seen in Figure 6, where the subject is a human body. As shown in Figure 6, the human body may include a head point, a neck point, left/right shoulder points, left/right elbow points, left/right hand points, left/right hip points, left/right knee points, left/right foot points and other skeletal keypoints; the above distance includes, for example, the distance between any two of the skeletal keypoints shown in Figure 6.
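For illustration, a minimal sketch of a distance-based stretch score is given below; the mean-pairwise-distance formulation and the normalisation by a reference length are assumptions made for the example.

```python
import itertools
import math

def motion_stretch_score(keypoints, ref_scale):
    """Toy motion-stretch score from skeletal keypoints.

    keypoints -- dict name -> (x, y) pixel coordinates, e.g. {'head': (10, 5), 'l_hand': (40, 60)}
    ref_scale -- normalisation length, e.g. the image diagonal in pixels
    Returns the mean pairwise keypoint distance, clipped to roughly [0, 1]:
    spread-out poses give larger values than tucked-in poses.
    """
    pts = list(keypoints.values())
    dists = [math.dist(a, b) for a, b in itertools.combinations(pts, 2)]
    return min(1.0, (sum(dists) / len(dists)) / ref_scale)
```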
In some examples, the aesthetic evaluation model may determine the expression score based on the facial expression of the subject in the first picture. For example, when the facial expression is a better expression such as smiling, laughing or eyes wide open, the resulting expression score is higher; when the facial expression is a poorer expression such as closed eyes, the resulting expression score is lower. The better or poorer expressions may be preset, determined in response to user operations, or learned from user preferences.
In some examples, the aesthetic evaluation model may determine the image quality score based on the image quality of the first picture. For example, when the image quality of the first picture is higher, the resulting image quality score is higher; when the image quality of the first picture is lower (for example, the subject exhibits motion blur), the resulting image quality score is lower. Image quality may include, but is not limited to, dynamic range, saturation, contrast, sharpness, and so on.
在一些示例中,综合得分可以是结合画质、主体位置、动作舒展度、表情、是否有移动物体(例如是否有路人乱入)等多个指标得到的评分。In some examples, the comprehensive score can be a score obtained by combining multiple indicators such as image quality, subject position, ease of movement, expression, whether there are moving objects (for example, whether there are passers-by).
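For illustration only, a minimal Python sketch of such a combination is given below; the equal default weights and the dimension names are assumptions, not the actual scoring rule of this application.

```python
def comprehensive_score(scores, weights=None):
    """Toy comprehensive score as a weighted combination of per-dimension scores.

    scores -- dict such as {'quality': 0.9, 'position': 0.7, 'stretch': 0.8,
              'expression': 0.6, 'no_intruder': 1.0}, each value in [0, 1]
    """
    weights = weights or {k: 1.0 for k in scores}        # equal weights by default
    total = sum(weights[k] * scores[k] for k in scores)
    return total / sum(weights[k] for k in scores)
```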
In one implementation, the electronic device may sort and filter, per dimension, the output of the aesthetic evaluation model, namely the scores of the pictures in the first picture sequence and the second picture sequence in the multiple dimensions, and obtain a third picture sequence with higher scores; the third picture sequence may include multiple picture sequences that score higher in the above multiple dimensions respectively. In some examples, the scores in the multiple dimensions are the comprehensive score, subject position score, motion stretch score, expression score and image quality score of the above examples, so the third picture sequence may include: picture sequence 1 whose comprehensive scores rank in the top N1, picture sequence 2 whose subject position scores rank in the top N2, picture sequence 3 whose motion stretch scores rank in the top N3, picture sequence 4 whose expression scores rank in the top N4, and picture sequence 5 whose image quality scores rank in the top N5, where Ni is a positive integer and i is a positive integer less than 6.
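For illustration, a short sketch of this per-dimension top-N filtering is given below; the data layout is an assumption made for the example.

```python
def recommend_top_n(scored_pictures, top_n):
    """Toy per-dimension filtering for S103.

    scored_pictures -- list of (picture_id, {dimension: score}) tuples
    top_n           -- dict dimension -> N, e.g. {'comprehensive': 2, 'position': 3}
    Returns dict dimension -> list of the N highest-scoring picture ids.
    """
    result = {}
    for dim, n in top_n.items():
        ranked = sorted(scored_pictures, key=lambda p: p[1][dim], reverse=True)
        result[dim] = [pid for pid, _ in ranked[:n]]
    return result
```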
S104:电子设备显示第三图片序列。S104: The electronic device displays the third picture sequence.
在一种实施方式中,电子设备可以在显示第一图片序列和/或第二图片序列中的至少一张图片时,优先显示/突出显示其中的第三图片序列,例如,第三图片序列中的图片显示在其他图片之前。In one implementation, when displaying at least one picture in the first picture sequence and/or the second picture sequence, the electronic device may display/highlight the third picture sequence therein, for example, in the third picture sequence pictures appear before other pictures.
In another implementation, when the electronic device displays the third picture sequence, it displays at least one picture in the first picture sequence and/or the second picture sequence in response to a user operation (for example, a user operation of returning to the picture list interface of the gallery application).
In one implementation, when the electronic device displays the third picture sequence, it may preferentially display/highlight pictures with higher scores. For example, when the electronic device displays picture sequence 1, whose comprehensive scores rank in the top N1 of the third picture sequence, the pictures may be displayed from front to back in descending order of comprehensive score, that is, the picture with the highest comprehensive score is displayed first, the picture with the second highest comprehensive score is displayed second, and so on.
在一种实施方式中,电子设备显示第三图片序列时,可以显示评分位于前几位的图片,可以按照评分从高到低显示,也可以按照时间顺序显示。 In one implementation, when the electronic device displays the third picture sequence, the pictures with the highest scores may be displayed, from high to low, or in chronological order.
S105:电子设备接收用于选择至少一张图片(可以称为第四图片序列)的用户操作。S105: The electronic device receives a user operation for selecting at least one picture (which may be called a fourth picture sequence).
在一种实施方式中,S105是可选的步骤。In one implementation, S105 is an optional step.
In one implementation, when the electronic device displays the third picture sequence, it may receive a user operation for selecting a fourth picture sequence from the third picture sequence. In another implementation, when the electronic device displays the third picture sequence, it also displays other pictures in the first picture sequence and/or the second picture sequence, and the electronic device may receive a user operation for selecting a fourth picture sequence from the third picture sequence and/or the other pictures.
在一种实施方式中,上述第四图片序列可以包括第一图片序列中的图片。在一种实施方式中,上述第四图片序列可以包括第二图片序列中的图片。在一种实施方式中,上述第四图片序列可以包括第三图片序列中的图片。In one implementation, the fourth picture sequence may include pictures in the first picture sequence. In one implementation, the fourth picture sequence may include pictures in the second picture sequence. In one implementation, the fourth picture sequence may include pictures in the third picture sequence.
在一种实施方式中,电子设备可以根据接收到的用户操作设置第四图片序列中的图片的优先级,例如,用户先选择的图片的优先级高于用户后选择的图片的优先级。In one implementation, the electronic device may set the priority of the pictures in the fourth picture sequence according to the received user operation. For example, the priority of the picture selected by the user first is higher than the priority of the picture selected by the user later.
在一种实施方式中,电子设备可以根据接收到的用户操作,确定第四图片序列中的图片对应的维度,该维度可以为S103所述的多个维度中的任意一个维度。例如,电子设备显示综合得分较高的图片序列1时,接收到用于选择图片A的用户操作,因此,图片A对应的维度为综合。In one implementation, the electronic device can determine the dimension corresponding to the picture in the fourth picture sequence according to the received user operation, and the dimension can be any one of the multiple dimensions described in S103. For example, when the electronic device displays picture sequence 1 with a high comprehensive score, it receives a user operation for selecting picture A. Therefore, the dimension corresponding to picture A is comprehensive.
S106:电子设备基于第三图片序列和用户选择的至少一张图片(即第四图片序列)更新美学评估模型。S106: The electronic device updates the aesthetic evaluation model based on the third picture sequence and at least one picture selected by the user (ie, the fourth picture sequence).
In one implementation, the electronic device may set the scores of the pictures in the fourth picture sequence based on the scores of the pictures in the third picture sequence. The fourth picture sequence and the corresponding scores may be called a personalized data set, and the personalized data set may be used to update the aesthetic evaluation model.
In one implementation, for M pictures in the fourth picture sequence that correspond to the same dimension (M is a positive integer), taking the comprehensive dimension as an example for explanation: if these M pictures are not picture sequence 1 (the sequence with higher comprehensive scores in the third picture sequence), the electronic device may set the comprehensive scores corresponding to these M pictures to the comprehensive scores of the pictures whose comprehensive scores rank in the top M of picture sequence 1, and these M pictures and the corresponding comprehensive scores may belong to the personalized data set. Here, "these M pictures are not picture sequence 1" may include: any one of the M pictures does not belong to picture sequence 1, and/or the priority order of any one of the M pictures differs from the order of that picture's comprehensive score in picture sequence 1. In some examples, for any one of the above M pictures (which may be called the second picture), assume the priority of the second picture ranks Kth among the M pictures (K is a positive integer less than or equal to M). If the second picture does not belong to picture sequence 1, or if the second picture belongs to picture sequence 1 but the picture whose comprehensive score ranks Kth in picture sequence 1 is not the second picture, the electronic device may set the comprehensive score corresponding to the second picture to the comprehensive score of a third picture, namely the picture whose comprehensive score ranks Kth in picture sequence 1. Not limited to the above implementation, in another implementation, when the electronic device sets the comprehensive scores corresponding to these M pictures, for any one of the M pictures, if the picture belongs to picture sequence 1 and its priority order is consistent with the order of its comprehensive score in picture sequence 1, the electronic device may not set the comprehensive score corresponding to the picture. An example of the electronic device setting the scores of the pictures in the fourth picture sequence can be seen in Figure 7 below and will not be detailed here.
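For illustration only, the sketch below re-labels user-selected pictures for one dimension in the way described above; the data layout and function name are assumptions made for the example.

```python
def label_user_selection(selected, recommended, rec_scores):
    """Toy construction of personalized-data-set entries for one scoring dimension.

    selected    -- user-selected picture ids for this dimension, in priority order
    recommended -- recommended picture ids for the same dimension, ranked by score
    rec_scores  -- score of each recommended picture, in the same order
    Returns (picture_id, assigned_score) pairs; a picture whose priority already
    matches its recommended rank is skipped (no re-labelling needed).
    """
    dataset = []
    for k, pid in enumerate(selected):               # k = priority rank (0-based)
        if k < len(recommended) and recommended[k] == pid:
            continue                                 # rank agrees with the model: keep as-is
        if k < len(rec_scores):
            dataset.append((pid, rec_scores[k]))     # inherit the score of the k-th ranked picture
    return dataset

# Example mirroring Figure 7 (subject-position dimension): recommended pictures
# 21, 22, 23 with scores 1.0, 2.0, 3.0; the user selects picture 22, then picture 24.
print(label_user_selection([22, 24], [21, 22, 23], [1.0, 2.0, 3.0]))
# -> [(22, 1.0), (24, 2.0)]
```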
In one implementation, the updated aesthetic evaluation model may be used to subsequently score pictures. For example, after S101-S106 shown in Figure 3 are performed, the updated aesthetic evaluation model can be obtained, and the electronic device may then perform S101-S104 again (the picture sequences at this time may differ from the previous ones); at this time, the aesthetic evaluation model used in S103 may be the above updated aesthetic evaluation model.
不限于图3所示的实施方式,在另一种实施方式中,S102和/或S103也可以是和电子设备连接的网络设备执行的,例如,电子设备可以向网络设备发送第一图片序列,网络设备可以执行S102和S103,然后向电子设备发送第三图片序列以用于显示。Not limited to the implementation shown in Figure 3, in another implementation, S102 and/or S103 may also be executed by a network device connected to the electronic device. For example, the electronic device may send the first picture sequence to the network device, The network device may perform S102 and S103, and then send the third picture sequence to the electronic device for display.
In the method shown in Figure 3, the electronic device can generate the second picture sequence in the time and/or space dimensions and, based on multiple dimensions such as the comprehensive dimension, subject position, motion stretch, facial expression and image quality, select the higher-quality third picture sequence from the first picture sequence and the second picture sequence to recommend to the user, optimizing the picture recommendation strategy so that the user can conveniently and quickly obtain satisfactory pictures. While recommending the third picture sequence, the first picture sequence and the second picture sequence can also be displayed as candidate pictures, increasing the probability that the user obtains the desired image. Moreover, the electronic device can update the picture recommendation strategy based on the pictures selected by the user and recommend different pictures to different users, further increasing the probability that the user obtains the desired image.
图7示例性示出一种个性化数据集的获取过程的示意图。Figure 7 exemplarily shows a schematic diagram of the acquisition process of a personalized data set.
As shown in Figure 7, the third picture sequence includes picture sequence 1, whose comprehensive scores rank in the top 2, and picture sequence 2, whose subject position scores rank in the top 3. In picture sequence 1, comprehensive score 1 of picture 11 is higher than comprehensive score 2 of picture 12. In picture sequence 2, the pictures ranked from high to low by subject position score are: picture 21 (corresponding to subject position score 1), picture 22 (corresponding to subject position score 2) and picture 23 (corresponding to subject position score 3). The fourth picture sequence includes picture 11, picture 22 and picture 24, where the dimension corresponding to picture 11 is the comprehensive dimension, the dimension corresponding to picture 22 and picture 24 is subject position, and picture 22 has a higher priority than picture 24.
Since picture 11, which corresponds to the comprehensive dimension in the fourth picture sequence, belongs to picture sequence 1, and the priority order of picture 11 and the order of the comprehensive score of picture 11 in picture sequence 1 are both first, it can be understood that the comprehensive score given by the aesthetic evaluation model meets the user's needs; therefore, the personalized data set may not include picture 11 and the corresponding comprehensive score 1.
Since picture 22, which corresponds to the subject position dimension in the fourth picture sequence, belongs to picture sequence 2, but the priority order of picture 22 (first place) differs from the order of the score of picture 22 in picture sequence 2 (second place), the electronic device may set the subject position score corresponding to picture 22 in the fourth picture sequence to subject position score 1 of picture 21, whose score ranks first in picture sequence 2; correspondingly, the personalized data set may include picture 22 and the corresponding subject position score 1.
Since picture 24, which corresponds to the subject position dimension in the fourth picture sequence, does not belong to picture sequence 2, and the priority order of picture 24 is second, the electronic device may set the subject position score corresponding to picture 24 in the fourth picture sequence to subject position score 2 of picture 22, whose score ranks second in picture sequence 2; correspondingly, the personalized data set may include picture 24 and the corresponding subject position score 2.
请参见图8,图8示例性示出又一种电子设备100的软件架构示意图。Please refer to FIG. 8 , which exemplarily shows a schematic diagram of the software architecture of yet another electronic device 100 .
如图8所示,电子设备100可以包括图片生成模块200、图片推荐模块300、用户选择模块400、存储模块500、个性化学习模块600和图片库700,其中,图片库700可以包括图片序列701,图片序列701例如为电子设备100响应于用户操作连续拍摄得到的多张图片。As shown in FIG. 8 , the electronic device 100 may include a picture generation module 200 , a picture recommendation module 300 , a user selection module 400 , a storage module 500 , a personalized learning module 600 and a picture library 700 , where the picture library 700 may include a picture sequence 701 , the picture sequence 701 is, for example, a plurality of pictures continuously taken by the electronic device 100 in response to user operations.
图片生成模块200可以接收图片序列701(作为输入),根据图片序列701在时间和/或空间维度上生成新的时间戳和/或新的观察视角的图片序列702,图片序列702可以输出到图片库700。在一种实施方式中,图片生成模块200可以用于执行图3中的S102。The picture generation module 200 may receive a picture sequence 701 (as input), generate a picture sequence 702 with a new timestamp and/or a new observation perspective in the temporal and/or spatial dimensions according to the picture sequence 701, and the picture sequence 702 may be output to the picture Library 700. In one implementation, the picture generation module 200 may be used to perform S102 in FIG. 3 .
The picture recommendation module 300 may receive the picture library 700 (as input), use the aesthetic evaluation model to score each picture in the picture library 700 in multiple dimensions such as the comprehensive dimension, subject position, motion stretch, expression and image quality, and output a picture sequence 703 with higher scores. In one implementation, the picture recommendation module 300 may be used to perform S103 in Figure 3. The electronic device 100 may display the picture sequence 703 to recommend it to the user for selection.
用户选择模块400可以在显示图片序列703,可选地以及图片库700中的其他图片(作为输入)时,根据用户操作从显示的图片中选择出图片序列704(作为输出)。在一种实施方式中,用户选择模块400可以用于执行图3中的S105。The user selection module 400 can select the picture sequence 704 (as an output) from the displayed pictures according to the user operation while displaying the picture sequence 703, optionally and other pictures in the picture library 700 (as an input). In one implementation, the user selection module 400 may be used to perform S105 in FIG. 3 .
存储模块500可以存储用户选择模块400输出的图片序列704,在一种实施方式中,存储模块500还可以删除图片库700中除图片序列704以外的图片。The storage module 500 can store the picture sequence 704 output by the user selection module 400. In one implementation, the storage module 500 can also delete pictures other than the picture sequence 704 in the picture library 700.
The personalized learning module 600 may receive the picture sequence 703 and the picture sequence 704 (as input), compare the picture sequence 703 with the picture sequence 704 to obtain a personalized data set, and train the pre-update/historical aesthetic evaluation model based on the personalized data set (for example, periodic training) to obtain an updated aesthetic evaluation model (as output). In one implementation, the pre-update aesthetic evaluation model may be sent by the picture recommendation module 300 to the personalized learning module 600 as input. The updated aesthetic evaluation model may be sent to the picture recommendation module for use. In one implementation, the personalized learning module 600 may be used to perform S106 in Figure 3.
接下来示例性介绍图8所示的电子设备100中的图片生成模块200。Next, the picture generation module 200 in the electronic device 100 shown in FIG. 8 is introduced as an example.
请参见图9,图9示例性示出又一种电子设备100的软件架构示意图。Please refer to FIG. 9 , which exemplarily shows a schematic diagram of the software architecture of yet another electronic device 100 .
如图9所示,电子设备100的图片生成模块200可以包括时间感知模块201和空间感知模块202。其中,时间感知模块201可以接收图片库700中的图片序列701作为输入,根据图片序列701在时间维度上生成新的时间戳的图片序列705(作为输出)。时间感知模块201例如是基于视频插帧实现的。空间感知模块202可以接收图片库700中的图片序列701和时间感知模块201输出的图片序列705作为输入,根据图片序列701和图片序列705在空间维度上生成新的观察视角的图片序列706(作为输出)。空间感知模块202例如是基于NeRF实现的。图片序列705和图片序列706可以输出到图片库700中构成图片序列702,图片序列702可以为图片序列705和图片序列706的并集。As shown in FIG. 9 , the picture generation module 200 of the electronic device 100 may include a time perception module 201 and a space perception module 202 . Among them, the time perception module 201 can receive the picture sequence 701 in the picture library 700 as input, and generate a new timestamp picture sequence 705 in the time dimension according to the picture sequence 701 (as output). The time perception module 201 is implemented based on video frame insertion, for example. The spatial perception module 202 can receive the picture sequence 701 in the picture library 700 and the picture sequence 705 output by the time perception module 201 as input, and generate a picture sequence 706 of a new observation perspective in the spatial dimension according to the picture sequence 701 and the picture sequence 705 (as output). The spatial perception module 202 is implemented based on NeRF, for example. The picture sequence 705 and the picture sequence 706 can be output to the picture library 700 to form the picture sequence 702. The picture sequence 702 can be the union of the picture sequence 705 and the picture sequence 706.
在一种实施方式中,图9所示的空间感知模块202可以包括模型训练模块202A、参数提取模块202B、空间感知模型202C和新参数生成模块202D,具体可参见图10所示的电子设备100的架构。In one implementation, the spatial perception module 202 shown in Figure 9 may include a model training module 202A, a parameter extraction module 202B, a spatial perception model 202C and a new parameter generation module 202D. For details, see the electronic device 100 shown in Figure 10 architecture.
如图10所示,空间感知模块202根据图片序列701和图片序列705生成图片序列706的过程,可以包括在线训练和图片生成两个步骤,具体如下所示。As shown in Figure 10, the process of the spatial perception module 202 generating the picture sequence 706 based on the picture sequence 701 and the picture sequence 705 may include two steps: online training and picture generation, as detailed below.
Online training: First, the parameter extraction module 202B may receive the picture sequence 701 and the picture sequence 705 as input and output the spatial parameters of each picture in the picture sequence 701 and the picture sequence 705. The spatial parameters include, for example but not limited to, the coordinates of the scene shown in the picture (which may be referred to as spatial coordinates for short, for example expressed as (x, y, z) in a spatial rectangular coordinate system/world coordinate system) and the pose of the camera of the electronic device 100 (which may be referred to as the camera pose for short and may also be understood as the observation direction, for example expressed in terms of an azimuth angle θ and a polar angle φ in a spherical coordinate system). Then, the spatial perception model 202C may receive the spatial parameters output by the parameter extraction module 202B as input and output a picture sequence 707 corresponding to these spatial parameters respectively, where, for any picture in the picture sequence 701 and the picture sequence 705 (which may be called picture B), the spatial perception model 202C may receive spatial parameter 1 of picture B as input and output picture C in the picture sequence 707; picture C can be understood as the picture "simulated" by the spatial perception model 202C for spatial parameter 1. Finally, the model training module 202A may receive the picture sequence 701, the picture sequence 705 and the picture sequence 707 output by the spatial perception model 202C as input, compare each picture in the picture sequence 701 and the picture sequence 705 with the corresponding picture in the picture sequence 707 based on a loss function, and train the spatial perception model 202C according to the comparison result to obtain an updated spatial perception model 202C (for example, specifically obtain the weights of the model), where, for any picture in the picture sequence 701 and the picture sequence 705 (picture B), the corresponding picture in the picture sequence 707 is the output obtained by using spatial parameter 1 of picture B as input to the pre-update spatial perception model 202C, namely picture C. The above process may be called one training pass. Multiple training passes may be performed to obtain a spatial perception model 202C updated multiple times. For example, the weights of the spatial perception model 202C before the first update are W0 and the weights after the first update are W1; the parameter extraction module 202B, the model training module 202A and the spatial perception model 202C with weights W1 may perform the training process again (at this time the output of the spatial perception model 202C may not be the picture sequence 707) to perform the second update and obtain the weights W2 of the spatial perception model 202C after the second update; after multiple rounds of iteration, the weights Wn of the spatial perception model 202C after n updates are obtained, where n is the number of updates. The spatial perception model 202C updated multiple times is used to perform the picture generation step described below.
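For illustration only, a minimal PyTorch-style sketch of such an online-training loop is given below; the callables, the mean-squared-error loss and the optimizer choice are assumptions made for the example and mirror the roles of modules 202A-202C only loosely.

```python
import torch

def online_train(render_model, pictures, extract_params, num_passes=3, lr=1e-3):
    """Simplified online training: render each known picture from its extracted
    spatial parameters and fit the model so the rendering matches the picture.

    render_model   -- torch.nn.Module: (coords, pose) -> predicted image tensor
    pictures       -- list of image tensors (picture sequences 701 and 705)
    extract_params -- callable image -> (coords, pose); stands in for module 202B
    """
    optimizer = torch.optim.Adam(render_model.parameters(), lr=lr)
    loss_fn = torch.nn.MSELoss()
    for _ in range(num_passes):                    # weights W0 -> W1 -> ... -> Wn
        for img in pictures:
            coords, pose = extract_params(img)     # spatial parameter of picture B
            pred = render_model(coords, pose)      # "picture C" simulated by 202C
            loss = loss_fn(pred, img)              # compare with the captured picture
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return render_model                            # model after multiple updates
```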
Picture generation: The new parameter generation module 202D may receive, as input, the spatial parameters of each picture in the picture sequence 701 and the picture sequence 705 output by the parameter extraction module 202B, and output different spatial parameters. For example, the new parameter generation module 202D may receive spatial parameter 1 of picture B (output by the parameter extraction module 202B) as input and output a spatial parameter 2 that differs from spatial parameter 1; assuming spatial parameter 1 includes spatial coordinate 1 and camera pose 1, spatial parameter 2 includes spatial coordinate 1 and camera pose 2. The spatial perception model 202C updated multiple times may receive the spatial parameters output by the new parameter generation module 202D as input and output the picture sequence 706 corresponding to these spatial parameters respectively; for example, the spatial perception model 202C updated multiple times may output the picture corresponding to spatial parameter 2.
Not limited to the above implementation, in another implementation, the input of the spatial perception module 202 may also be only the picture sequence 701 in the picture library 700; this can be understood as the time perception module 201 and the spatial perception module 202 being two independent modules in the picture generation module 200.
In one implementation, the spatial perception model 202C shown in Figure 10 may include two independent multilayer perceptrons (MLP1 and MLP2) and a volume rendering module; for details, see the architecture of the electronic device 100 shown in Figure 11. Figure 11 is described taking as an example that the spatial parameters input to the spatial perception model 202C include the camera pose and the spatial coordinates.
如图11所示,MLP1可以接收空间参数中的空间坐标作为输入进行特征提取,输出中间特征和空间密度(density)。MLP2可以接收空间参数中的相机姿势和MLP1输出的中间特征作为输入进行特征提取,输出颜色信息(color)。立体渲染模块可以接收MLP1输出的空间密度和MLP2输出的颜色信息作为输入,进行立体渲染,输出和上述空间参数对应的图片。在一些示例中,MLP1和/或MLP2中设置有可学习的参数,上述在线训练可以具体训练MLP1和/或MLP2。As shown in Figure 11, MLP1 can receive spatial coordinates in spatial parameters as input for feature extraction, and output intermediate features and spatial density. MLP2 can receive the camera pose in the spatial parameters and the intermediate features output by MLP1 as input for feature extraction and output color information (color). The stereoscopic rendering module can receive the spatial density output by MLP1 and the color information output by MLP2 as input, perform stereoscopic rendering, and output pictures corresponding to the above spatial parameters. In some examples, learnable parameters are set in MLP1 and/or MLP2, and the above online training can specifically train MLP1 and/or MLP2.
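For illustration only, a compact PyTorch-style sketch of this two-MLP split with a very simplified volume-rendering step is shown below; the layer widths, the two-angle pose encoding and the alpha-compositing details are assumptions made for the example, not the exact design of this application.

```python
import torch
import torch.nn as nn

class TwoBranchRenderer(nn.Module):
    """MLP1: spatial coordinate -> (intermediate feature, density);
    MLP2: (camera pose, intermediate feature) -> colour. Sizes are illustrative."""
    def __init__(self, feat_dim=64):
        super().__init__()
        self.mlp1 = nn.Sequential(nn.Linear(3, 128), nn.ReLU(), nn.Linear(128, feat_dim + 1))
        self.mlp2 = nn.Sequential(nn.Linear(feat_dim + 2, 128), nn.ReLU(), nn.Linear(128, 3))

    def forward(self, coords, pose):
        # coords: (N, 3) sample points along one ray; pose: (N, 2) viewing angles (theta, phi)
        h = self.mlp1(coords)
        feat, density = h[:, :-1], torch.relu(h[:, -1])                     # density is non-negative
        color = torch.sigmoid(self.mlp2(torch.cat([pose, feat], dim=-1)))  # RGB in [0, 1]
        return density, color

def render_ray(density, color, delta=0.1):
    """Very simplified volume rendering (alpha compositing) of N samples on one ray."""
    alpha = 1.0 - torch.exp(-density * delta)                               # per-sample opacity
    trans = torch.cumprod(torch.cat([torch.ones(1), 1.0 - alpha + 1e-10])[:-1], dim=0)
    weights = alpha * trans                                                 # contribution of each sample
    return (weights.unsqueeze(-1) * color).sum(dim=0)                       # final RGB of the ray
```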
不限于上述实施方式中的图片生成模块200,在另一种实施方式中,时间感知模块201和空间感知模块202可以集成在一起,具体可参见图12所示的电子设备100的架构。It is not limited to the picture generation module 200 in the above embodiment. In another embodiment, the time perception module 201 and the space perception module 202 can be integrated together. For details, see the architecture of the electronic device 100 shown in FIG. 12 .
如图12所示,电子设备100的图片生成模块200可以包括模型训练模块203、参数提取模块204、时空感知模型205和新参数生成模块206。的图片生成模块200可以接收图片库700中的图片序列701作为输入,根据图片序列701在时间维度和空间维度上生成新的时间戳、新的观察视角的图片序列702,输出到图片库700中。该过程可以包括在线训练和图片生成两个步骤,具体如下所示。As shown in FIG. 12 , the picture generation module 200 of the electronic device 100 may include a model training module 203 , a parameter extraction module 204 , a spatiotemporal perception model 205 and a new parameter generation module 206 . The picture generation module 200 can receive the picture sequence 701 in the picture library 700 as input, generate a new timestamp and a new observation perspective picture sequence 702 in the time dimension and the spatial dimension according to the picture sequence 701, and output it to the picture library 700 . This process can include two steps: online training and image generation, as shown below.
Online training: First, the parameter extraction module 204 may receive the picture sequence 701 as input and output the spatial parameters and time parameters of each picture in the picture sequence 701. The spatial parameters include, for example but not limited to, spatial coordinates and the camera pose. The time parameters include, for example but not limited to, a timestamp and/or a time embedding; the time embedding of any picture may be determined from the timestamp of that picture, for example, a Fourier transform is applied to the timestamp to obtain a high-dimensional vector (for example, a 128-dimensional vector), and this high-dimensional vector is the determined time embedding. Then, the spatio-temporal perception model 205 may receive the spatial parameters and time parameters output by the parameter extraction module 204 and output a picture sequence 708 corresponding to these spatial and time parameters respectively, where, for any picture in the picture sequence 701 (which may be called picture D), the spatio-temporal perception model 205 may receive spatial parameter 3 and time parameter 1 of picture D as input and output picture E in the picture sequence 708; picture E can be understood as the picture "simulated" by the spatio-temporal perception model 205 for spatial parameter 3 and time parameter 1. Finally, the model training module 203 may receive the picture sequence 701 and the picture sequence 708 output by the spatio-temporal perception model 205 as input, compare each picture in the picture sequence 701 with the corresponding picture in the picture sequence 708 based on a loss function, and train the spatio-temporal perception model 205 according to the comparison result to obtain an updated spatio-temporal perception model 205 (for example, specifically obtain the weights of the model). The above process may be called one training pass. Multiple training passes may be performed to obtain a spatio-temporal perception model 205 updated multiple times; the specific example is similar to the online-training example described for Figure 10 and is not repeated here. The spatio-temporal perception model 205 updated multiple times is used to perform the picture generation step described below.
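For illustration, a minimal Fourier-feature time embedding is sketched below; the 128-dimensional size mirrors the example above, while the geometric frequency schedule is an assumption made for the sketch.

```python
import numpy as np

def time_embedding(timestamp, dim=128, base=2.0):
    """Toy Fourier time embedding: map a scalar timestamp to a high-dimensional vector."""
    n_freqs = dim // 2
    freqs = base ** np.arange(n_freqs)                         # geometrically spaced frequencies
    angles = timestamp * freqs
    return np.concatenate([np.sin(angles), np.cos(angles)])    # shape (dim,)

print(time_embedding(1.5).shape)   # (128,)
```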
图片生成:新参数生成模块206可以接收参数提取模块204输出的图片序列701中的每张图片的空间参数和时间参数作为输入,输出不同的空间参数和不同的时间参数,例如,新参数生成模块206可以接收参数提取模块204输出的图片序列701中的图片D的空间参数3和时间参数1作为输入,输出不同的空间参数4和不同的时间参数2。多次更新后的时空感知模型205可以接收参数提取模块204和新参数生成模块206输出的空间参数和时间参数作为输入,输出与这些空间参数和时间参数分别对应的图片序列702,例如,多次更新后的时空感知模型205可以接收参数提取模块204输出的图片序列701中的图片D的空间参数3和时间参数1,以及新参数生成模块206输出的时间参数2和空间参数4作为输入,对应可以输出:与空间参数3和时间参数2对应的图片F、与时间参数1和空间参数4对应的图片G、与空间参数4和时间参数2对应的图片H。Picture generation: the new parameter generation module 206 can receive the spatial parameters and time parameters of each picture in the picture sequence 701 output by the parameter extraction module 204 as input, and output different spatial parameters and different time parameters, for example, the new parameter generation module 206 may receive the spatial parameter 3 and the temporal parameter 1 of the picture D in the picture sequence 701 output by the parameter extraction module 204 as input, and output different spatial parameters 4 and different temporal parameters 2. The spatio-temporal perception model 205 after multiple updates can receive the spatial parameters and temporal parameters output by the parameter extraction module 204 and the new parameter generation module 206 as input, and output a picture sequence 702 corresponding to these spatial parameters and temporal parameters respectively, for example, multiple times The updated spatio-temporal perception model 205 can receive the spatial parameter 3 and the temporal parameter 1 of the picture D in the picture sequence 701 output by the parameter extraction module 204, and the temporal parameter 2 and spatial parameter 4 output by the new parameter generation module 206 as input, corresponding to It is possible to output: the picture F corresponding to the spatial parameter 3 and the temporal parameter 2, the picture G corresponding to the temporal parameter 1 and the spatial parameter 4, and the picture H corresponding to the spatial parameter 4 and the temporal parameter 2.
在一种实施方式,图12所示的时空感知模型205可以包括两个独立的MLP(MLP3和MLP4),以及立体渲染模块,具体可参见图13所示的电子设备100的架构。图13以输入时空感知模型205的空间参数包括相机姿势和空间坐标,时间参数包括时间嵌套为例进行说明。In one implementation, the spatio-temporal perception model 205 shown in FIG. 12 may include two independent MLPs (MLP3 and MLP4), and a stereoscopic rendering module. For details, please refer to the architecture of the electronic device 100 shown in FIG. 13 . FIG. 13 takes the spatial parameters input to the spatio-temporal perception model 205 as an example including camera posture and spatial coordinates, and the time parameters including time nesting as an example.
As shown in Figure 13, MLP3 may receive the spatial coordinates in the spatial parameters as input for feature extraction and output intermediate features and a spatial density. MLP4 may receive the time embedding (time parameter), the camera pose in the spatial parameters and the intermediate features output by MLP3 as input for feature extraction and output color information. The volume rendering module may receive the spatial density output by MLP3 and the color information output by MLP4 as input, perform volume rendering, and output a picture corresponding to the above spatial parameters and time parameters. In some examples, learnable parameters are set in MLP3 and/or MLP4, and the above online training may specifically train MLP3 and/or MLP4.
不限于上述实施方式中的图片生成模块200,在另一种实施方式中,图片生成模块200可以包括时间感知模块201或者空间感知模块202,其中,当图片生成模块200仅包括空间感知模块202时,空间感知模块202的输入是图片库700中的图片序列701。It is not limited to the picture generation module 200 in the above embodiment. In another embodiment, the picture generation module 200 may include a time perception module 201 or a space perception module 202, where when the picture generation module 200 only includes the space perception module 202 , the input of the spatial perception module 202 is the picture sequence 701 in the picture library 700.
Based on the captured picture sequence 701, this application can generate a new picture sequence 702 in the time and/or space dimensions; the picture sequence 702 and the picture sequence 701 can together serve as candidate pictures for recommending pictures to the user and for the user's selection, adding differentiated, high-quality candidate pictures within the limited shooting time, increasing the probability that the user can obtain the desired picture and improving the user experience.
接下来示例性介绍图8所示的电子设备100中的图片推荐模块300。Next, the picture recommendation module 300 in the electronic device 100 shown in FIG. 8 is introduced as an example.
请参见图14,图14示例性示出又一种电子设备100的软件架构示意图。Please refer to FIG. 14 , which exemplarily shows a schematic diagram of the software architecture of yet another electronic device 100 .
As shown in Figure 14, the picture recommendation module 300 of the electronic device 100 may include an aesthetic evaluation model 301 and a filtering module 302. The aesthetic evaluation model 301 may be the updated aesthetic evaluation model output by the personalized learning module 600 of the electronic device 100. The aesthetic evaluation model 301 may receive each picture in the picture library 700 as input and output the score of each picture in the picture library 700 in multiple dimensions, taking the comprehensive score, subject position score, motion stretch score, expression score and image quality score as an example. The filtering module 302 may receive the output of the aesthetic evaluation model 301 as input, sort and filter the scores in the multiple dimensions respectively, and obtain the picture sequence 703, namely multiple picture sequences with higher scores in the multiple dimensions respectively: picture sequence 1 whose comprehensive scores rank in the top N1, picture sequence 2 whose subject position scores rank in the top N2, picture sequence 3 whose motion stretch scores rank in the top N3, picture sequence 4 whose expression scores rank in the top N4, and picture sequence 5 whose image quality scores rank in the top N5, where Ni is a positive integer and i is a positive integer less than 6.
This application can perform all-round aesthetic evaluation and filtering of candidate pictures in multiple dimensions such as the comprehensive dimension, subject position, motion stretch, expression and image quality, and recommend to the user the pictures that score higher in each of the multiple dimensions. This can satisfy the different preferences of different users, make picture recommendation more accurate, increase the probability that the user can obtain the desired picture and improve the user experience.
接下来示例性介绍图8所示的电子设备100中的个性化学习模块600。Next, the personalized learning module 600 in the electronic device 100 shown in FIG. 8 is introduced as an example.
请参见图15,图15示例性示出又一种电子设备100的软件架构示意图。Please refer to FIG. 15 , which exemplarily shows a schematic diagram of the software architecture of yet another electronic device 100 .
如图15所示,电子设备100的图片推荐模块300可以包括美学评估模型301和筛选模块302,具体可参见图14的说明。电子设备100的用户选择模块400可以接收筛选模块302输出的图片序列703(包括在多个维度的得分较高的多个图片序列)作为输入,根据接收到的用户操作从图片序列703中选择出图片序列704(作为输出)。As shown in FIG. 15 , the picture recommendation module 300 of the electronic device 100 may include an aesthetic evaluation model 301 and a filtering module 302 . For details, please refer to the description of FIG. 14 . The user selection module 400 of the electronic device 100 can receive the picture sequence 703 output by the filtering module 302 (including multiple picture sequences with higher scores in multiple dimensions) as input, and select from the picture sequence 703 according to the received user operation. Sequence of pictures 704 (as output).
As shown in Figure 15, the personalized learning module 600 of the electronic device 100 may include a picture calibration module 601, a personalized data set 602 and a model training module 603. The picture calibration module 601 may receive, as input, the picture sequence 703 and the corresponding scores output by the filtering module 302, as well as the picture sequence 704 output by the user selection module 400, and set the scores corresponding to the picture sequence 704 according to the picture sequence 703 and the corresponding scores; the picture sequence 704 output by the picture calibration module 601 and the corresponding scores may constitute the personalized data set 602. For an implementation example of the picture calibration module 601, see S106 in Figure 3 and the description of Figure 7. The model training module 603 may receive the personalized data set 602 and the pre-update aesthetic evaluation model 301 as input, use the personalized data set 602 to train the pre-update aesthetic evaluation model 301 and obtain an updated aesthetic evaluation model 301, and the updated aesthetic evaluation model 301 may be output to the picture recommendation module 300.
This application can train the aesthetic evaluation model 301 in the picture recommendation module 300 based on the pictures recommended by default and the pictures selected by the user, that is, perform on-device self-learning, so that the scoring strategy of the aesthetic evaluation model 301 matches the user's habits as closely as possible, realizing personalized picture recommendation, further increasing the probability that the user can obtain the desired picture and improving the user experience.
在一些示例中,不同用户使用的不同电子设备进行各自的端侧自学习后,这两个电子设备获取到同一场景下的第一图片序列(例如拍摄得到同一场景下的多张图片)后,推荐的第三图片序列可以不同,即实现了“千人千面”的图片推荐。In some examples, after different electronic devices used by different users perform their own end-side self-learning, and after the two electronic devices obtain the first sequence of pictures in the same scene (for example, multiple pictures in the same scene are captured), The recommended third picture sequence can be different, that is, the picture recommendation of "thousands of people and thousands of faces" is realized.
下面介绍本申请实施例涉及的应用场景以及该场景下的用户界面实施例。The application scenarios involved in the embodiments of this application and the user interface embodiments in this scenario are introduced below.
图16示例性示出一种图片推荐过程的用户界面示意图。Figure 16 exemplarily shows a schematic diagram of the user interface of a picture recommendation process.
如图16的(A)所示,电子设备100可以显示相机应用的用户界面1000。用户界面1000可以包括取景框1010、拍摄控件1020和缩略图1030,其中,取景框1010用于显示电子设备100通过摄像头实时采集到的图像,拍摄控件1020用于触发通过摄像头拍摄图片,缩略图1030用于显示电子设备100通过摄像头最近一次拍摄到的图片。在一种实施方式中,电子设备100可以响应于针对拍摄控件1020的操作(例如触摸操作,该触摸操作例如为单击操作或者长按操作等),连续拍摄多张图片,即实现图3所示的S101,这多张图片即为第一图片序列。然后,电子设备100可以响应于针对缩略图1030的操作(例如触摸操作,该触摸操作例如为单击操作),显示这多张图片中的任意一张图片,具体可参见图16的(B)所示的用户界面2000,用户界面2000可以包括上述多张图片中的图片2010和控件2020。As shown in (A) of FIG. 16 , the electronic device 100 may display the user interface 1000 of the camera application. The user interface 1000 may include a viewfinder frame 1010, a shooting control 1020, and a thumbnail image 1030. The viewfinder frame 1010 is used to display images collected by the electronic device 100 through a camera in real time, the shooting control 1020 is used to trigger the shooting of pictures through the camera, and the thumbnail image 1030 Used to display the latest picture taken by the electronic device 100 through the camera. In one implementation, the electronic device 100 can continuously capture multiple pictures in response to an operation on the shooting control 1020 (such as a touch operation, such as a click operation or a long press operation, etc.), that is, to achieve what is shown in FIG. 3 As shown in S101, these multiple pictures are the first picture sequence. Then, the electronic device 100 may display any one of the plurality of pictures in response to an operation (such as a touch operation, such as a click operation) on the thumbnail 1030. For details, see (B) of FIG. 16 As shown in the user interface 2000, the user interface 2000 may include a picture 2010 and a control 2020 among the plurality of pictures mentioned above.
In one implementation, in response to an operation on the control 2020 (for example, a touch operation such as a tap operation), the electronic device 100 may recommend pictures to the user from multiple dimensions such as comprehensive recommendation, subject position, movement stretch, expression and image quality based on the above continuously captured pictures, and display the recommended pictures and other pictures, that is, implement S102-S104 shown in FIG. 3. The recommended pictures constitute the third picture sequence, and the other pictures include at least one picture in the first picture sequence and the second picture sequence other than the third picture sequence. For details, refer to the user interface 3000 shown in (C) of FIG. 16.
As shown in (C) of FIG. 16, the user interface 3000 may include a return control 3010, prompt information 3020 and a save control 3030. The return control 3010 is used to return to the previous interface. The save control 3030 is used to save the pictures selected by the user. The prompt information 3020 is used to indicate the number of candidate pictures and the number of pictures selected by the user; for example, it may include the characters "Select photos 0/30", indicating that the number of candidate pictures is 30 and the number of pictures selected by the user is 0. In some examples, the candidate pictures include the first picture sequence and the second picture sequence; for example, the number of pictures continuously captured by the electronic device 100 (that is, the first picture sequence) is 10, and the number of pictures in the second picture sequence generated from these pictures in the temporal and/or spatial dimension is 20, so the number of candidate pictures is 30. In some examples, the folder used to store pictures in the electronic device 100 may include information about the first picture sequence and the second picture sequence; for example, the storage locations of the first picture sequence and the second picture sequence are different (the second picture sequence is, for example, stored in a newly created temporary cache area), and the attributes of the first picture sequence and the second picture sequence may be different, including, but not limited to, different generation times and different carried tags. In some examples, after the electronic device 100 captures the first picture sequence and before the above operation on the control 2020 is received, the folder used to store pictures in the electronic device 100 may include only the first picture sequence; after the operation on the control 2020 is received, the folder may further include the second picture sequence.
The user interface 3000 further includes a recommendation dimension area 3040, a picture list 3050 and a display box 3060. The recommendation dimension area 3040 may include multiple dimensions such as comprehensive recommendation 3040A, subject position 3040B, movement stretch 3040C, expression 3040D and image quality 3040E. The electronic device 100 may set any one of these dimensions to a selected state in response to an operation on that dimension (for example, a touch operation such as a tap operation); for example, the comprehensive recommendation 3040A is currently selected. The picture list 3050 is used to display the recommended pictures and other pictures under the dimension currently selected in the recommendation dimension area 3040 (currently the comprehensive recommendation 3040A). The recommended pictures include the picture 3051 displaying a recommendation mark 3051A and the picture 3052 displaying a recommendation mark 3052A, and the other pictures include the picture 3053 and the picture 3054. The pictures in the picture list 3050 may be displayed from front to back in descending order of their scores in the corresponding dimension (currently the comprehensive score); that is, in descending order of comprehensive score, the pictures in the picture list 3050 are the picture 3051, the picture 3052, the picture 3053 and the picture 3054. In some examples, the electronic device 100 may display other pictures in the picture list 3050 in response to an operation on the picture list 3050 (for example, a touch operation such as a right-to-left sliding operation). The picture list 3050 further includes a control 3055, and the display box 3060 is used to display the picture that the control 3055 points to; for example, the control 3055 currently points to the picture 3051, so the display box 3060 displays an enlarged picture 3051.
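As an illustration of the ordering behavior just described, the following hypothetical sketch stores one score per dimension for each candidate, sorts the list in descending order for the currently selected dimension, and flags the top-ranked entries as recommended. The data structure, function names and example scores are assumptions for illustration only.

    # Hypothetical ordering of the picture list for a selected dimension.
    from dataclasses import dataclass, field
    from typing import Dict, List, Tuple

    @dataclass
    class Candidate:
        picture_id: str
        scores: Dict[str, float] = field(default_factory=dict)  # per-dimension scores

    def rank_for_dimension(candidates: List[Candidate], dimension: str,
                           top_n: int) -> List[Tuple[Candidate, bool]]:
        # Sort by the selected dimension's score, highest first, and mark the
        # first top_n entries as recommended (recommendation mark in the UI).
        ordered = sorted(candidates, key=lambda c: c.scores.get(dimension, 0.0),
                         reverse=True)
        return [(c, i < top_n) for i, c in enumerate(ordered)]

    # Usage example with made-up scores:
    cands = [Candidate("3051", {"comprehensive": 0.92}),
             Candidate("3052", {"comprehensive": 0.88}),
             Candidate("3053", {"comprehensive": 0.71}),
             Candidate("3054", {"comprehensive": 0.65})]
    for cand, recommended in rank_for_dimension(cands, "comprehensive", top_n=2):
        print(cand.picture_id, "recommended" if recommended else "other")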
In one implementation, when the electronic device 100 displays the recommended pictures and other pictures, it may set any picture to a selected state in response to an operation on that picture (for example, a touch operation such as a tap operation), that is, implement S105 shown in FIG. 3. For example, the electronic device 100 may set the picture 3054 to the selected state in response to an operation on the picture 3054 in the user interface 3000 shown in (C) of FIG. 16; for details, refer to the user interface 4000 shown in FIG. 17.
As shown in FIG. 17, the user interface 4000 is similar to the user interface 3000, except that information 4010 is displayed on the picture 3054 in the picture list 3050. The information 4010 includes the character "1", indicating that the picture 3054 is the first picture selected by the user and/or the picture ranked first in the user's selection priority. In addition, the control 3055 currently points to the picture 3054, and accordingly the display box 3060 displays an enlarged picture 3054. Since the user has now selected one picture, the prompt information 3020 may include the characters "Select photos 1/30".
In one implementation, the electronic device 100 may display the recommended pictures and other pictures under another dimension of the recommendation dimension area 3040 in response to an operation (for example, a tap operation) on that dimension. For example, following the implementation shown in FIG. 17, the electronic device 100 may display the pictures recommended based on the subject position 3040B and other pictures in response to an operation on the subject position 3040B in the recommendation dimension area 3040 included in the user interface 4000 shown in FIG. 17; for details, refer to the user interface 5000 shown in FIG. 18.
As shown in FIG. 18, the user interface 5000 is similar to the user interface 3000, except that the subject position 3040B in the recommendation dimension area 3040 is in the selected state. Therefore, the user interface 5000 includes a picture list 5010, which displays the recommended pictures under the subject position 3040B (namely the picture 5011 and the picture 5012) and other pictures (namely the picture 5013 and the picture 5014). In descending order of subject position score, the pictures in the picture list 5010 are the picture 5011, the picture 5012, the picture 5013 and the picture 5014. The electronic device 100 may set the picture 5014 to a selected state in response to an operation on the picture 5014 (for example, a touch operation such as a tap operation); therefore, information 5020 is displayed on the picture 5014 in the user interface 5000. The information 5020 includes the character "2", indicating that the picture 5014 is the second picture selected by the user and/or the picture ranked second in the user's selection priority. In addition, the control 3055 currently points to the picture 5014, and accordingly the display box 3060 displays an enlarged picture 5014. Since the user has now selected two pictures, the prompt information 3020 may include the characters "Select photos 2/30".
In one implementation, in the embodiment shown in FIG. 18, the electronic device 100 may, in response to an operation on the save control 3030 in the user interface 5000 shown in FIG. 18, save the picture 3054 and the picture 5014 selected by the user and delete the other candidate pictures. In some examples, the electronic device 100 may implement S106 shown in FIG. 3 based on the picture 3054 and the picture 5014 selected by the user.
Not limited to the above implementations, in other implementations the electronic device 100 may display the user interface 3000 shown in (C) of FIG. 16 directly in response to an operation on the shooting control 1020 in the user interface 1000 shown in FIG. 16. In still other implementations, after receiving an operation on the shooting control 1020 in the user interface 1000 shown in FIG. 16, the electronic device 100 may display the user interface 3000 shown in (C) of FIG. 16 in response to an operation on the thumbnail 1030 in the user interface 1000.
In another implementation, the user may also select multiple pictures in the gallery, and the electronic device 100 may perform picture recommendation based on these pictures, that is, implement the method shown in FIG. 3, with the first picture sequence being these pictures. For a specific example, refer to the user interface 6000 shown in FIG. 19. The user interface 6000 may be a user interface of a gallery application. The user interface 6000 may include prompt information 6010, a picture list 6020 and a function list 6030. The picture list 6020 may include multiple pictures, such as, but not limited to, a picture 6021, a picture 6022, a picture 6023, a picture 6024, a picture 6025 and a picture 6026. Taking the picture 6021 as an example, a selection control 6021A is also displayed on the picture 6021; the selection control 6021A is used to select or deselect the picture 6021, and the selection control 6021A in the user interface 6000 indicates that the picture 6021 has been selected. Similarly, the picture 6022, the picture 6023 and the picture 6025 are all selected. The prompt information 6010 is used to indicate the number of selected pictures; for example, if 4 pictures are currently selected, the prompt information 6010 includes the characters "4 items selected". The function list 6030 may include controls for multiple functions, such as, but not limited to, a control for a sharing function, a control for a deleting function, a control for a select-all function, a control 6031 for a recommendation function and a control for more functions. In response to an operation on the control 6031 (for example, a touch operation such as a tap operation), the electronic device 100 may use the picture 6021, the picture 6022, the picture 6023 and the picture 6025 selected by the user as the first picture sequence to implement the method shown in FIG. 3; for the user interface displaying the third picture sequence, refer to the user interface 3000 shown in (C) of FIG. 16. In some examples, the picture 3051 and the picture 3052 in the picture list 3050 shown in the user interface 3000 are the above picture 6021 and picture 6025, whereas the picture 3053 and the picture 3054 in the picture list 3050 do not belong to the first picture sequence, that is, they belong to the second picture sequence.
Not limited to the above implementations, in another implementation the electronic device 100 may receive a dimension 1 entered by the user on a settings interface and determine that the user prefers dimension 1. Then, after continuously capturing multiple pictures, the electronic device 100 may recommend pictures to the user from dimension 1 based on these pictures. In some examples, the electronic device 100 may automatically save the pictures with higher scores in dimension 1 and delete the other pictures. For example, if dimension 1 is the comprehensive recommendation 3040A in the user interface 3000 shown in (C) of FIG. 16, then in response to an operation on the shooting control 1020 in the user interface 1000 shown in (A) of FIG. 16, the electronic device 100 continuously captures multiple pictures and automatically saves the pictures with higher comprehensive scores among them, that is, the picture 3051 and the picture 3052 in the user interface 3000 shown in (C) of FIG. 16, while deleting other pictures such as the picture 3053 and the picture 3054. Not limited to this, in another implementation the user does not need to manually input dimension 1; the electronic device 100 may learn the dimension that the user prefers by itself. For example, if most of the pictures in the gallery are those in which the expression of the photographed subject is good, the electronic device 100 may determine the user's preferred dimension 1 based on the pictures in the gallery.
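A possible way to realize the self-learned preference mentioned above is sketched below: the device scores every gallery picture along each candidate dimension and treats the dimension with the highest average score as the user's preferred dimension 1. The dimension names and the score_fn interface are assumptions; in practice the scores would come from the aesthetic evaluation model or equivalent per-dimension scorers.

    # Hypothetical inference of the user's preferred dimension from the gallery.
    from statistics import mean
    from typing import Callable, List

    DIMENSIONS = ["comprehensive", "subject_position", "movement_stretch",
                  "expression", "image_quality"]

    def infer_preferred_dimension(gallery_pictures: List[object],
                                  score_fn: Callable[[object, str], float]) -> str:
        # score_fn(picture, dimension) -> score in [0, 1]; assumed to be backed
        # by per-dimension scoring of each stored picture.
        averages = {dim: mean(score_fn(p, dim) for p in gallery_pictures)
                    for dim in DIMENSIONS}
        # The dimension on which kept pictures score best on average is taken
        # as the user's preferred dimension 1.
        return max(averages, key=averages.get)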
The methods provided in the embodiments of this application may be implemented in whole or in part by software, hardware, firmware or any combination thereof. When software is used for implementation, the methods may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of this application are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, a network device, user equipment or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from a website, computer, server or data center to another website, computer, server or data center in a wired manner (for example, over a coaxial cable, an optical fiber or a digital subscriber line (DSL)) or a wireless manner (for example, infrared, radio or microwave). The computer-readable storage medium may be any usable medium accessible to a computer, or a data storage device such as a server or a data center integrating one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk or a magnetic tape), an optical medium (for example, a digital video disc (DVD)) or a semiconductor medium (for example, a solid state disk (SSD)). The above embodiments are merely intended to describe the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that modifications may still be made to the technical solutions recorded in the foregoing embodiments, or equivalent replacements may be made to some of the technical features therein; such modifications or replacements do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of this application.

Claims (11)

  1. A picture recommendation method, characterized in that the method is applied to an electronic device and comprises:
    displaying an image collection interface;
    in response to a first operation on an image collection button of the image collection interface, collecting a first picture sequence through an image collection apparatus;
    generating a second picture sequence based on the first picture sequence, the second picture sequence comprising pictures whose timestamps are different from the timestamps of the pictures in the first picture sequence, and/or pictures whose viewing angles are different from the viewing angles of the pictures in the first picture sequence;
    determining a third picture sequence from the first picture sequence and the second picture sequence, the third picture sequence comprising N pictures whose scores in a first dimension rank in the top N and M pictures whose scores in a second dimension rank in the top M, N and M being positive integers; and
    recommending the third picture sequence.
  2. The method according to claim 1, characterized in that the first dimension or the second dimension is any one of the following: a comprehensive dimension, a position of a photographed subject in a picture, a movement stretch of the photographed subject in a picture, an expression of the photographed subject in a picture, or an image quality of a picture.
  3. The method according to claim 1 or 2, characterized in that the recommending the third picture sequence comprises:
    displaying a first interface, wherein the first interface displays first information, second information, the N pictures and the M pictures, the first information indicates the first dimension and is associated with the N pictures, and the second information indicates the second dimension and is associated with the M pictures.
  4. The method according to any one of claims 1-3, characterized in that the recommending the third picture sequence comprises:
    displaying a second interface, wherein the second interface displays K pictures, K is a positive integer greater than or equal to N, the K pictures comprise the N pictures and (K-N) pictures other than the N pictures, the (K-N) pictures belong to the first picture sequence and/or the second picture sequence, the K pictures comprise a first picture and a second picture, a score of the first picture in the first dimension is greater than a score of the second picture in the first dimension, and the first picture is displayed before the second picture in the second interface.
  5. The method according to any one of claims 1-4, characterized in that the method further comprises:
    receiving a second operation for selecting at least one picture, the at least one picture belonging to the first picture sequence and/or the second picture sequence; and
    saving the at least one picture, and deleting pictures other than the at least one picture in the first picture sequence and the second picture sequence.
  6. The method according to any one of claims 1-5, characterized in that the third picture sequence is obtained according to a first policy, and the method further comprises:
    receiving a second operation for selecting at least one picture, the at least one picture belonging to the first picture sequence and/or the second picture sequence; and
    updating the first policy according to the third picture sequence and the at least one picture.
  7. The method according to any one of claims 1-6, characterized in that the generating a second picture sequence based on the first picture sequence comprises:
    generating a fourth picture sequence based on the first picture sequence, wherein timestamps of pictures in the fourth picture sequence are different from the timestamps of the pictures in the first picture sequence; and
    generating a fifth picture sequence based on the first picture sequence and the fourth picture sequence, wherein viewing angles of pictures in the fifth picture sequence are different from the viewing angles of the pictures in the first picture sequence and the fourth picture sequence, and the second picture sequence comprises the fourth picture sequence and the fifth picture sequence.
  8. The method according to claim 7, characterized in that the generating a fifth picture sequence based on the first picture sequence and the fourth picture sequence comprises:
    training a spatial perception model based on the first picture sequence and the fourth picture sequence;
    obtaining a first spatial parameter, wherein the first spatial parameter is different from spatial parameters of the pictures in the first picture sequence and the second picture sequence; and
    using the first spatial parameter as an input of the spatial perception model to obtain an output, wherein the output is the fifth picture sequence.
  9. The method according to any one of claims 1-8, characterized in that the generating a second picture sequence based on the first picture sequence comprises:
    training a spatio-temporal perception model based on the first picture sequence;
    obtaining a second spatial parameter and a first temporal parameter, wherein the second spatial parameter comprises a spatial parameter different from the spatial parameters of the pictures in the first picture sequence, and the first temporal parameter comprises a temporal parameter different from the temporal parameters of the pictures in the first picture sequence; and
    using the second spatial parameter and the first temporal parameter as inputs of the spatio-temporal perception model to obtain an output, wherein the output is the second picture sequence.
  10. An electronic device, characterized by comprising a transceiver, a processor and a memory, wherein the memory is configured to store a computer program, and the processor invokes the computer program to perform the method according to any one of claims 1-9.
  11. A computer storage medium, characterized in that the computer storage medium stores a computer program, and when the computer program is executed by a processor, the method according to any one of claims 1-9 is implemented.
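For illustration of the generation steps claimed above (claims 7 and 9: synthesizing pictures at timestamps or viewing angles not present in the captured sequence), the following toy sketch produces a frame for a new timestamp by blending the two captured frames nearest in time. It is only an assumed stand-in for the interface of a spatial or spatio-temporal perception model; the claimed method would instead use a trained model as recited, and all names here are illustrative.

    # Toy stand-in for "known frames and timestamps in, novel-timestamp frame out".
    from typing import List
    import numpy as np

    def synthesize_at_time(frames: List[np.ndarray], times: List[float],
                           t_new: float) -> np.ndarray:
        # Order frames by timestamp, then linearly blend the two neighbors of t_new.
        order = np.argsort(times)
        ts = np.array(times)[order]
        fs = [frames[i] for i in order]
        if t_new <= ts[0]:
            return fs[0].copy()
        if t_new >= ts[-1]:
            return fs[-1].copy()
        j = int(np.searchsorted(ts, t_new))
        w = (t_new - ts[j - 1]) / (ts[j] - ts[j - 1])
        blended = (1.0 - w) * fs[j - 1].astype(np.float32) + w * fs[j].astype(np.float32)
        return blended.astype(fs[0].dtype)

    # Usage: two captured frames at t=0.0 and t=1.0, one synthesized frame at t=0.5.
    a = np.zeros((4, 4, 3), dtype=np.uint8)
    b = np.full((4, 4, 3), 200, dtype=np.uint8)
    mid_frame = synthesize_at_time([a, b], [0.0, 1.0], 0.5)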
PCT/CN2023/114053 2022-08-30 2023-08-21 Image recommendation method and electronic device WO2024046162A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211049479.9 2022-08-30
CN202211049479.9A CN117688195A (en) 2022-08-30 2022-08-30 Picture recommendation method and electronic equipment

Publications (1)

Publication Number Publication Date
WO2024046162A1 true WO2024046162A1 (en) 2024-03-07

Family

ID=90100406

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/114053 WO2024046162A1 (en) 2022-08-30 2023-08-21 Image recommendation method and electronic device

Country Status (2)

Country Link
CN (1) CN117688195A (en)
WO (1) WO2024046162A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109495686A (en) * 2018-12-11 2019-03-19 上海掌门科技有限公司 Image pickup method and equipment
CN112425156A (en) * 2019-01-31 2021-02-26 华为技术有限公司 Method for selecting images based on continuous shooting and electronic equipment
WO2021103919A1 (en) * 2019-11-28 2021-06-03 荣耀终端有限公司 Composition recommendation method and electronic device
US20210182610A1 (en) * 2019-12-16 2021-06-17 Canon Kabushiki Kaisha Image capturing apparatus, generating apparatus, control method, and storage medium
CN113239220A (en) * 2021-05-26 2021-08-10 Oppo广东移动通信有限公司 Image recommendation method and device, terminal and readable storage medium

Also Published As

Publication number Publication date
CN117688195A (en) 2024-03-12

Similar Documents

Publication Publication Date Title
CN109495688B (en) Photographing preview method of electronic equipment, graphical user interface and electronic equipment
CN112887583B (en) Shooting method and electronic equipment
WO2021179773A1 (en) Image processing method and device
CN111327814A (en) Image processing method and electronic equipment
WO2021169394A1 (en) Depth-based human body image beautification method and electronic device
CN112262563B (en) Image processing method and electronic device
WO2021129198A1 (en) Method for photography in long-focal-length scenario, and terminal
US20210358523A1 (en) Image processing method and image processing apparatus
WO2021052111A1 (en) Image processing method and electronic device
WO2021013132A1 (en) Input method and electronic device
WO2022017261A1 (en) Image synthesis method and electronic device
WO2021078001A1 (en) Image enhancement method and apparatus
US20220343648A1 (en) Image selection method and electronic device
CN113170037B (en) Method for shooting long exposure image and electronic equipment
WO2023284715A1 (en) Object reconstruction method and related device
WO2022007707A1 (en) Home device control method, terminal device, and computer-readable storage medium
US20230056332A1 (en) Image Processing Method and Related Apparatus
CN113099146A (en) Video generation method and device and related equipment
CN115967851A (en) Quick photographing method, electronic device and computer readable storage medium
CN115115679A (en) Image registration method and related equipment
WO2022012418A1 (en) Photographing method and electronic device
WO2022057384A1 (en) Photographing method and device
CN113542574A (en) Shooting preview method under zooming, terminal, storage medium and electronic equipment
CN114283195B (en) Method for generating dynamic image, electronic device and readable storage medium
WO2021204103A1 (en) Picture preview method, electronic device, and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23859196

Country of ref document: EP

Kind code of ref document: A1