WO2024046162A1 - Image recommendation method and electronic device - Google Patents

Image recommendation method and electronic device

Info

Publication number
WO2024046162A1
Authority
WO
WIPO (PCT)
Prior art keywords
picture
picture sequence
pictures
sequence
electronic device
Prior art date
Application number
PCT/CN2023/114053
Other languages
French (fr)
Chinese (zh)
Inventor
汪涛
许梦雯
宋凯凯
宋超领
周剑辉
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Publication of WO2024046162A1

Classifications

    • G PHYSICS
      • G06 COMPUTING; CALCULATING OR COUNTING
        • G06F ELECTRIC DIGITAL DATA PROCESSING
          • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
            • G06F 16/50 of still image data
              • G06F 16/53 Querying
                • G06F 16/535 Filtering based on additional data, e.g. user or group profiles
                • G06F 16/538 Presentation of query results
              • G06F 16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
                • G06F 16/587 using geographical or spatial information, e.g. location
        • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
          • G06N 3/00 Computing arrangements based on biological models
            • G06N 3/02 Neural networks
              • G06N 3/08 Learning methods
        • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
          • G06V 10/00 Arrangements for image or video recognition or understanding
            • G06V 10/70 using pattern recognition or machine learning
              • G06V 10/82 using neural networks
    • H ELECTRICITY
      • H04 ELECTRIC COMMUNICATION TECHNIQUE
        • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
          • H04N 5/00 Details of television systems
            • H04N 5/76 Television signal recording
              • H04N 5/91 Television signal processing therefor
                • H04N 5/93 Regeneration of the television signal or of selected parts thereof

Definitions

  • the present application relates to the field of computer technology, and in particular, to a picture recommendation method and electronic device.
  • This application discloses a picture recommendation method and electronic device, which can recommend pictures that meet the user's needs for users, allowing users to obtain satisfactory pictures more conveniently and quickly.
  • embodiments of the present application provide a picture recommendation method, which is applied to an electronic device.
  • the method includes: displaying an image collection interface; in response to a first operation on an image collection button of the image collection interface, collecting a first picture sequence through an image collection device; and generating a second picture sequence based on the first picture sequence.
  • a third picture sequence is determined from the first picture sequence and the second picture sequence; the third picture sequence includes N pictures whose first-dimension scores rank in the top N positions and M pictures whose second-dimension scores rank in the top M positions, where N and M are positive integers; the third picture sequence is recommended.
  • the second picture sequence includes pictures whose timestamps are different from the timestamps of the pictures in the first picture sequence, specifically: the timestamp of any picture in the second picture sequence is different from the timestamps of all pictures in the first picture sequence.
  • the second picture sequence includes pictures whose observation angles are different from the observation angles of the pictures in the first picture sequence, specifically: the observation angle of any picture in the second picture sequence is different from the observation angle of the picture in the first picture sequence whose timestamp is the same as the timestamp of that picture.
  • the third picture sequence recommended by the electronic device is selected from the first picture sequence and the second picture sequence, and the second picture sequence is generated in the time and/or space dimension based on the collected first picture sequence, thereby adding differentiated, high-quality candidate pictures within the limited collection time and greatly increasing the probability that users obtain the required pictures.
  • the third picture sequence recommended by the electronic device includes N pictures that are better in the first dimension and M pictures that are better in the second dimension, thereby meeting the different needs of different users and allowing users to obtain satisfactory pictures more conveniently and quickly.
  • the steps of displaying the image acquisition interface and, in response to the first operation on the image acquisition button of the image acquisition interface, acquiring the first picture sequence through the image acquisition device may be replaced by: in response to an operation of selecting the first picture sequence, obtaining the first picture sequence from the gallery of the electronic device.
  • the first picture sequence can also be obtained from the gallery, meeting different user needs in different scenarios and broadening the application scenarios.
  • the steps of determining, from the first picture sequence and the second picture sequence, a third picture sequence that includes N pictures whose first-dimension scores rank in the top N positions and M pictures whose second-dimension scores rank in the top M positions (N and M being positive integers), and recommending the third picture sequence, can be replaced by: determining, from the first picture sequence and the second picture sequence, P pictures whose third-dimension scores rank in the top P positions, where P is a positive integer; saving the P pictures; and deleting the first picture sequence and the second picture sequence.
  • the electronic device can determine, from the first picture sequence and the second picture sequence, the P pictures that are better in a third dimension manually set by the user, or in a third dimension learned from the user's preference, save these P pictures and delete the other pictures, allowing users to quickly and easily obtain the pictures they need without manual selection, which greatly improves the user experience.
  • the first dimension or the second dimension is any one of the following: a comprehensive dimension, the position of the photographed subject in the picture, the motion stretch of the photographed subject in the picture, the expression of the photographed subject in the picture, and the quality of the picture.
  • the recommending of the third picture sequence includes: displaying a first interface, where the first interface displays first information, second information, the N pictures and the M pictures; the first information indicates the first dimension and is associated with the N pictures, and the second information indicates the second dimension and is associated with the M pictures.
  • the user can obtain the N pictures associated with the first dimension based on the first information, and obtain the M pictures associated with the second dimension based on the second information.
  • the display method is simple and clear, making it convenient for the user to obtain the required pictures in different dimensions and improving the user experience.
  • the recommending of the third picture sequence includes: displaying a second interface, where the second interface displays K pictures, K being a positive integer greater than or equal to N.
  • the K pictures include the N pictures and (K-N) pictures other than the N pictures.
  • the (K-N) pictures belong to the first picture sequence and/or the second picture sequence.
  • the K pictures include a first picture and a second picture.
  • the score of the first picture in the first dimension is greater than the score of the second picture in the first dimension.
  • the first picture is displayed before the second picture in the second interface.
  • the electronic device can preferentially display the picture with a higher score in the first dimension, avoiding the situation where a picture with a higher score is displayed later and the user has to spend more time to obtain it, which further improves the efficiency with which the user obtains the required pictures and improves the user experience.
  • the (K-N) pictures do not belong to the third picture sequence.
  • the K pictures are the pictures whose first-dimension scores rank in the top K positions in the first picture sequence and the second picture sequence.
  • the electronic device can also display pictures other than the recommended third picture sequence (that is, when K is greater than N), that is, provide more candidate pictures for the user to choose from, so as to avoid the situation where none of the pictures in the third picture sequence meets the user's needs and the user cannot obtain the required pictures, further ensuring the user experience.
  • the method further includes: receiving a second operation for selecting at least one picture, where the at least one picture belongs to the first picture sequence and/or the second picture sequence; saving the at least one picture; and deleting pictures other than the at least one picture in the first picture sequence and the second picture sequence.
  • the electronic device can save at least one picture selected by the user and delete other pictures to prevent other pictures that the user does not need from occupying the storage space of the device and reduce the storage pressure of the device.
  • the third picture sequence is obtained according to a first strategy; the method further includes: receiving a second operation for selecting at least one picture, the at least one picture belonging to the first picture sequence and/or the second picture sequence; and updating the first strategy according to the third picture sequence and the at least one picture.
  • the electronic device can update the first strategy used to determine the recommended third picture sequence according to the at least one picture selected by the user, that is, learn the first strategy according to the user's habits and personalize it, so that subsequent recommended pictures determined based on the first strategy better match the current user's needs and improve the user experience; a rough sketch of such an update follows.
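  • The following is a minimal, hedged sketch (Python, with assumed dimension names, learning rate and weighting scheme that are not part of this application) of one way such a first-strategy update could look: dimensions in which the user-selected pictures scored well are weighted more heavily in subsequent recommendations.

```python
# Illustrative sketch only: a per-dimension weight update based on the user's selection.
def update_strategy(weights: dict[str, float],
                    selected_scores: list[dict[str, float]],
                    learning_rate: float = 0.1) -> dict[str, float]:
    """weights: dimension -> weight used by the recommendation strategy.
    selected_scores: per-dimension scores of the pictures the user chose to keep."""
    for dim in weights:
        avg = sum(scores[dim] for scores in selected_scores) / len(selected_scores)
        # move each dimension's weight toward the average score of the kept pictures
        weights[dim] = (1 - learning_rate) * weights[dim] + learning_rate * avg
    total = sum(weights.values())
    return {dim: w / total for dim, w in weights.items()}  # renormalize to sum to 1
```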
  • generating a second picture sequence based on the first picture sequence includes: generating a fourth picture sequence based on the first picture sequence, where the timestamps of the pictures in the fourth picture sequence are different from the timestamps of the pictures in the first picture sequence; and generating a fifth picture sequence based on the first picture sequence and the fourth picture sequence, where the observation angles of the pictures in the fifth picture sequence are different from the observation angles of the pictures in the first picture sequence and the fourth picture sequence; the second picture sequence includes the fourth picture sequence and the fifth picture sequence.
  • the timestamps of the pictures in the fourth picture sequence being different from the timestamps of the pictures in the first picture sequence specifically includes: the timestamp of any picture in the fourth picture sequence is different from the timestamps of all pictures in the first picture sequence.
  • the observation angles of the pictures in the fifth picture sequence being different from the observation angles of the pictures in the first picture sequence and the fourth picture sequence specifically includes: the observation angle of any picture in the fifth picture sequence is different from the observation angle of the picture in the first picture sequence and the fourth picture sequence whose timestamp is the same as the timestamp of that picture.
  • the electronic device can first generate, in the time dimension, a fourth picture sequence whose timestamps differ from those of the collected first picture sequence, and then generate, in the spatial dimension, a fifth picture sequence whose observation angles differ from those of the first picture sequence and the fourth picture sequence.
  • compared with only generating pictures with viewing angles different from those of the first picture sequence, this further expands the high-quality, differentiated candidate pictures, and the probability that the user obtains the required pictures is further improved.
  • generating a fifth picture sequence based on the first picture sequence and the fourth picture sequence includes: training a spatial perception model based on the first picture sequence and the fourth picture sequence; obtaining a first spatial parameter that is different from the spatial parameters of the pictures in the first picture sequence and the second picture sequence; and using the first spatial parameter as the input of the spatial perception model to obtain an output, where the output is the fifth picture sequence.
  • the spatial perception model is obtained through multiple rounds of iterative training.
  • the spatial parameters include the spatial coordinates of the picture and the posture of the picture collection device used to collect the picture.
  • the spatial perception model is iteratively trained based on the currently collected first picture sequence and the fourth picture sequence generated based on the first picture sequence; therefore, the spatial perception model can fully learn the current shooting scene.
  • the accuracy of the fifth picture sequence obtained by the spatial perception model is therefore higher, that is, the accuracy of the candidate pictures is higher, which further increases the probability that the user obtains the desired picture.
  • generating a second picture sequence based on the first picture sequence includes: training a space-time perception model based on the first picture sequence; obtaining a second spatial parameter and a first time parameter, where the second spatial parameter includes a spatial parameter different from the spatial parameters of the pictures in the first picture sequence and the first time parameter includes a time parameter different from the time parameters of the pictures in the first picture sequence; and using the second spatial parameter and the first time parameter as inputs of the space-time perception model to obtain an output, where the output is the second picture sequence.
  • the space-time perception model is obtained through multiple rounds of iterative training.
  • the time parameter includes the timestamp of the picture, or a time embedding obtained based on the timestamp of the picture.
  • the space-time perception model is iteratively trained based on the currently collected first picture sequence; therefore, the space-time perception model can fully learn the current shooting scene.
  • the accuracy of the second picture sequence obtained through the space-time perception model is higher, that is, the accuracy of the candidate pictures is higher, which further increases the probability that the user obtains the desired picture. A sketch of querying such a model is shown below.
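  • As a minimal sketch (the class and parameter names are assumptions, not this application's actual model), the space-time perception model can be thought of as a function fit on the first picture sequence that maps a spatial parameter and a time parameter to a picture, and then queried with parameters that differ from those of the captured pictures:

```python
# Illustrative placeholder for the space-time perception model described above.
import numpy as np

class SpaceTimePerceptionModel:
    """Stand-in for the iteratively trained space-time perception model."""
    def fit(self, spatial_params: np.ndarray, time_params: np.ndarray,
            pictures: np.ndarray) -> None:
        pass  # multiple rounds of iterative training on the first picture sequence

    def predict(self, spatial_param: np.ndarray, time_param: float) -> np.ndarray:
        # A trained model would render a picture for the requested space-time input.
        return np.zeros((256, 256, 3), dtype=np.uint8)

model = SpaceTimePerceptionModel()
# Spatial params: e.g. spatial coordinates and collection-device posture of each picture;
# time params: e.g. each picture's timestamp (or a time embedding derived from it).
model.fit(spatial_params=np.zeros((4, 6)),
          time_params=np.array([0.0, 0.1, 0.2, 0.3]),
          pictures=np.zeros((4, 256, 256, 3)))
# Query with a second spatial parameter and a first time parameter that differ from the
# captured pictures' parameters to obtain a picture of the second picture sequence.
new_picture = model.predict(spatial_param=np.full(6, 0.5), time_param=0.15)
```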
  • embodiments of the present application provide an electronic device, including a transceiver, a processor, and a memory; the memory is used to store a computer program, and the processor calls the computer program to execute the picture recommendation method provided by the first aspect of the embodiments of the present application and any implementation of the first aspect.
  • embodiments of the present application provide a computer storage medium.
  • the computer storage medium stores a computer program.
  • when the computer program is executed by a processor, it is used to perform the picture recommendation method provided by the first aspect of the embodiments of the present application and any implementation of the first aspect.
  • embodiments of the present application provide a computer program product.
  • when the computer program product is run on an electronic device, the electronic device is caused to execute the picture recommendation method provided by the first aspect of the embodiments of the present application and any implementation of the first aspect.
  • embodiments of the present application provide an electronic device, which includes a unit or apparatus for executing the method described in any embodiment of the present application.
  • the above-mentioned electronic device is, for example, a chip.
  • Figure 1 is a schematic diagram of the hardware structure of an electronic device provided by this application.
  • FIG. 2 is a schematic diagram of the software architecture of an electronic device provided by this application.
  • Figure 3 is a schematic flow chart of an image recommendation method provided by this application.
  • FIG. 4 is a schematic diagram of an image generation process provided by this application.
  • FIG. 5 is a schematic diagram of another image generation process provided by this application.
  • Figure 6 is a schematic diagram of the skeletal position points of a human body provided by this application.
  • Figure 7 is a schematic diagram of the acquisition process of a personalized data set provided by this application.
  • FIGS 8-15 are schematic diagrams of the software architecture of yet another electronic device provided by this application.
  • FIGS 16-19 are schematic diagrams of some user interface embodiments provided by this application.
  • the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly specifying the quantity of indicated technical features. Therefore, the features defined as "first" and "second" may explicitly or implicitly include one or more of the features. In the description of the embodiments of this application, unless otherwise specified, "plurality" means two or more.
  • the electronic device can take multiple pictures in succession, and select pictures with better quality/clarity from these multiple pictures to recommend to the user.
  • the user can select the desired picture from these multiple pictures based on the recommended pictures, but the following technical problems still prevent users from obtaining satisfactory pictures conveniently and quickly.
  • Technical problem 1: the electronic device performs continuous shooting in chronological order, that is, it images within a limited shooting time; there may be situations where none of the captured pictures meets the user's needs.
  • This application provides a picture recommendation method, which is applied to electronic devices.
  • This method allows users to obtain satisfactory pictures conveniently and quickly, and improves user experience.
  • the electronic device can generate more pictures in the temporal and/or spatial dimensions based on the captured pictures for user selection, that is, add differentiated, high-quality candidate pictures within a limited shooting time, to solve technical problem 1 above.
  • the electronic device can also recommend pictures to the user from multiple dimensions such as a comprehensive dimension, the photographed subject's position (referred to as the subject position), the photographed subject's motion stretch, the photographed subject's facial expression, and the image quality, effectively optimizing the picture recommendation strategy to solve technical problem 2 above.
  • the electronic device can also update the picture recommendation strategy based on the pictures selected by the user (which can be understood as on-device self-learning), to achieve personalization and continuous updating of the picture recommendation strategy and solve technical problem 3 above. This allows users to obtain satisfactory pictures conveniently and quickly, improving the user experience.
  • the electronic device may be a mobile phone, a tablet computer, a handheld computer, a desktop computer, a laptop computer, an ultra-mobile personal computer (UMPC), a netbook, a cellular phone, a personal digital assistant (PDA),
  • a smart home device such as a smart TV or a smart camera,
  • a wearable device such as a smart bracelet, a smart watch, or smart glasses,
  • or an extended reality (XR) device such as an augmented reality (AR), virtual reality (VR), or mixed reality (MR) device, a vehicle-mounted device, or a smart city device.
  • FIG. 1 exemplarily shows a schematic diagram of the hardware structure of an electronic device 100 .
  • the electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2 , mobile communication module 150, wireless communication module 160, audio module 170, speaker 170A, receiver 170B, microphone 170C, headphone interface 170D, sensor module 180, button 190, motor 191, indicator 192, camera 193, display screen 194, and Subscriber identification module (SIM) card interface 195, etc.
  • the sensor module 180 may include a pressure sensor 180A, a gyro sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, and ambient light. Sensor 180L, bone conduction sensor 180M, etc.
  • the structure illustrated in the embodiment of the present invention does not constitute a specific limitation on the electronic device 100 .
  • the electronic device 100 may include more or fewer components than shown in the figures, or some components may be combined, some components may be separated, or some components may be arranged differently.
  • the components illustrated may be implemented in hardware, software, or a combination of software and hardware.
  • the processor 110 may include one or more processing units.
  • the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc.
  • the controller can generate operation control signals based on the instruction operation code and timing signals to complete the control of fetching and executing instructions.
  • the processor 110 may also be provided with a memory for storing instructions and data.
  • the memory in processor 110 is a cache memory. This memory may hold instructions or data that the processor 110 has just used or uses cyclically. If the processor 110 needs to use the instructions or data again, it can call them directly from this memory, which avoids repeated access, reduces the waiting time of the processor 110 and thus improves system efficiency.
  • processor 110 may include one or more interfaces.
  • Interfaces may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface, etc.
  • the charging management module 140 is used to receive charging input from the charger.
  • the power management module 141 is used to connect the battery 142, the charging management module 140 and the processor 110.
  • the power management module 141 receives input from the battery 142 and/or the charging management module 140, and supplies power to the processor 110, the internal memory 121, the display screen 194, the camera 193, the wireless communication module 160, and the like.
  • the wireless communication function of the electronic device 100 can be implemented through the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor and the baseband processor.
  • Antenna 1 and Antenna 2 are used to transmit and receive electromagnetic wave signals.
  • Each antenna in electronic device 100 may be used to cover a single communication frequency band or multiple communication frequency bands. Different antennas can also be multiplexed to improve antenna utilization. For example, antenna 1 can be multiplexed as a diversity antenna for a wireless local area network.
  • the mobile communication module 150 can provide wireless communication solutions including 2G/3G/4G/5G/6G applied on the electronic device 100 .
  • the mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (LNA), etc.
  • the mobile communication module 150 can receive electromagnetic waves through the antenna 1, perform filtering, amplification and other processing on the received electromagnetic waves, and transmit them to the modem processor for demodulation.
  • the mobile communication module 150 can also amplify the signal modulated by the modem processor and convert it into electromagnetic waves through the antenna 1 for radiation.
  • at least part of the functional modules of the mobile communication module 150 may be disposed in the processor 110 .
  • at least part of the functional modules of the mobile communication module 150 and at least part of the modules of the processor 110 may be provided in the same device.
  • a modem processor may include a modulator and a demodulator.
  • the modulator is used to modulate the low-frequency baseband signal to be sent into a medium-high frequency signal.
  • the demodulator is used to demodulate the received electromagnetic wave signal into a low-frequency baseband signal.
  • the demodulator then transmits the demodulated low-frequency baseband signal to the baseband processor for processing.
  • the application processor outputs sound signals through audio devices (not limited to speaker 170A, receiver 170B, etc.), or displays images or videos through display screen 194.
  • the modem processor may be a stand-alone device.
  • the modem processor may be independent of the processor 110 and may be provided in the same device as the mobile communication module 150 or other functional modules.
  • the wireless communication module 160 can provide wireless communication solutions applied on the electronic device 100, including wireless local area network (WLAN) (such as a wireless fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR) technology, etc.
  • the wireless communication module 160 may be one or more devices integrating at least one communication processing module.
  • the wireless communication module 160 receives electromagnetic waves via the antenna 2 , frequency modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110 .
  • the wireless communication module 160 can also receive the signal to be sent from the processor 110, frequency modulate it, amplify it, and convert it into electromagnetic waves through the antenna 2 for radiation.
  • the antenna 1 of the electronic device 100 is coupled to the mobile communication module 150, and the antenna 2 is coupled to the wireless communication module 160, so that the electronic device 100 can communicate with the network and other devices through wireless communication technology.
  • the above-mentioned wireless communication technologies may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technology, etc.
  • the GNSS may include a global positioning system (GPS), a global navigation satellite system (GLONASS), a BeiDou navigation satellite system (BDS), a quasi-zenith satellite system (QZSS) and/or a satellite based augmentation system (SBAS).
  • the electronic device 100 implements display functions through a GPU, a display screen 194, an application processor, and the like.
  • the GPU is an image processing microprocessor and is connected to the display screen 194 and the application processor. GPUs are used to perform mathematical and geometric calculations for graphics rendering.
  • Processor 110 may include one or more GPUs that execute program instructions to generate or alter display information.
  • the display screen 194 is used to display images, videos, etc.
  • Display 194 includes a display panel.
  • the display panel can use a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a MiniLED, a MicroLED, a Micro-OLED, a quantum dot light-emitting diode (QLED), etc.
  • the electronic device 100 may include 1 or N display screens 194, where N is a positive integer greater than 1.
  • the electronic device 100 can implement the shooting function through an ISP, a camera 193, a video codec, a GPU, a display screen 194, an application processor, and the like.
  • the ISP is used to process the data fed back by the camera 193. For example, when taking a photo, the shutter is opened and light is transmitted through the lens to the photosensitive element of the camera, where the optical signal is converted into an electrical signal; the photosensitive element passes the electrical signal to the ISP for processing, converting it into an image visible to the naked eye. The ISP can also perform algorithm optimization on image noise, brightness, color, etc., and can optimize parameters such as the exposure and color temperature of the shooting scene. In one implementation, the ISP may be provided in the camera 193.
  • Camera 193 is used to capture still images or video.
  • the object passes through the lens to produce an optical image that is projected onto the photosensitive element.
  • the photosensitive element can be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor.
  • the photosensitive element converts the optical signal into an electrical signal, and then passes the electrical signal to the ISP to convert it into a digital image signal.
  • ISP outputs digital image signals to DSP for processing.
  • DSP converts digital image signals into standard RGB, YUV and other format image signals.
  • the electronic device 100 may include 1 or N cameras 193, where N is a positive integer greater than 1.
  • the external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device 100.
  • Internal memory 121 may be used to store computer executable program code, which includes instructions.
  • the processor 110 executes various functional applications and data processing of the electronic device 100 by executing instructions stored in the internal memory 121 and/or instructions stored in a memory provided in the processor.
  • the electronic device 100 can implement audio functions through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the headphone interface 170D, and the application processor. Such as music playback, recording, etc.
  • the audio module 170 is used to convert digital audio information into analog audio signal output, and is also used to convert analog audio input into digital audio signals. Audio module 170 may also be used to encode and decode audio signals.
  • Speaker 170A, also called a "loudspeaker", is used to convert audio electrical signals into sound signals.
  • Receiver 170B, also called an "earpiece", is used to convert audio electrical signals into sound signals.
  • Microphone 170C, also called a "mic", is used to convert sound signals into electrical signals.
  • the headphone interface 170D is used to connect wired headphones.
  • the pressure sensor 180A is used to sense pressure signals and can convert the pressure signals into electrical signals.
  • the pressure sensor 180A may be disposed on the display screen 194 .
  • there are many types of pressure sensors 180A, such as resistive pressure sensors, inductive pressure sensors, capacitive pressure sensors, etc.
  • a capacitive pressure sensor may include at least two parallel plates of conductive material.
  • the electronic device 100 determines the intensity of the pressure based on the change in capacitance.
  • the electronic device 100 detects the intensity of the touch operation according to the pressure sensor 180A.
  • the electronic device 100 may also calculate the touched position based on the detection signal of the pressure sensor 180A.
  • the gyro sensor 180B may be used to determine the motion posture of the electronic device 100, for example, the angular velocity of the electronic device 100 about three axes (i.e., the x, y, and z axes).
  • Air pressure sensor 180C is used to measure air pressure.
  • Magnetic sensor 180D includes a Hall sensor.
  • the electronic device 100 may utilize the magnetic sensor 180D to detect opening and closing of the flip holster.
  • the acceleration sensor 180E can detect the acceleration of the electronic device 100 in various directions (generally three axes).
  • Distance sensor 180F for measuring distance.
  • Proximity light sensor 180G may include, for example, a light emitting diode (LED) and a light detector, such as a photodiode.
  • the light emitting diode may be an infrared light emitting diode.
  • the electronic device 100 emits infrared light outwardly through the light emitting diode.
  • Electronic device 100 uses photodiodes to detect infrared reflected light from nearby objects. When sufficient reflected light is detected, it can be determined that there is an object near the electronic device 100 . When insufficient reflected light is detected, the electronic device 100 may determine that there is no object near the electronic device 100 .
  • the ambient light sensor 180L is used to sense ambient light brightness.
  • Fingerprint sensor 180H is used to collect fingerprints.
  • the electronic device 100 can use the collected fingerprint characteristics to achieve fingerprint unlocking, access to application locks, fingerprint photography, fingerprint answering of incoming calls, etc.
  • Temperature sensor 180J is used to detect temperature.
  • Touch sensor 180K also known as "touch device”.
  • the touch sensor 180K can be disposed on the display screen 194.
  • the touch sensor 180K and the display screen 194 form a touch screen, which is also called a "touch screen”.
  • the touch sensor 180K is used to detect a touch operation on or near the touch sensor 180K.
  • the touch sensor can pass the detected touch operation to the application processor to determine the touch event type.
  • Visual output related to the touch operation may be provided through display screen 194 .
  • the touch sensor 180K may also be disposed on the surface of the electronic device 100 in a position different from that of the display screen 194 .
  • Bone conduction sensor 180M can acquire vibration signals.
  • the buttons 190 include a power button, a volume button, etc.
  • the motor 191 can generate vibration prompts.
  • the indicator 192 may be an indicator light, which may be used to indicate charging status, power changes, or may be used to indicate messages, missed calls, notifications, etc.
  • the SIM card interface 195 is used to connect a SIM card.
  • the software system of the electronic device 100 may adopt a layered architecture, an event-driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture.
  • the layered architecture software system can be the Android system, the Harmony operating system (operating system, OS), or other software systems.
  • the embodiment of this application takes the Android system with a layered architecture as an example to illustrate the software structure of the electronic device 100 .
  • FIG. 2 exemplarily shows a schematic diagram of the software architecture of the electronic device 100 .
  • the layered architecture divides the software into several layers, and each layer has clear roles and division of labor.
  • the layers communicate through software interfaces.
  • the Android system is divided into four layers, from top to bottom: application layer, application framework layer, Android runtime and system libraries, and kernel layer.
  • the application layer can include a series of application packages.
  • the application package can include camera, gallery, music, calendar, short message, call, navigation, Bluetooth, browser and other applications.
  • the application package in this application can also be replaced by other forms of software such as applets.
  • the application framework layer provides an application programming interface (API) and programming framework for applications in the application layer.
  • the application framework layer includes some predefined functions.
  • the application framework layer can include a window manager, content provider, view system, phone manager, resource manager, notification manager, etc.
  • a window manager is used to manage window programs.
  • the window manager can obtain the display size, determine whether there is a status bar, lock the screen, capture the screen, etc.
  • Content providers are used to store and retrieve data and make this data accessible to applications.
  • Said data can include videos, images, audio, calls made and received, browsing history and bookmarks, phone books, etc.
  • the view system includes visual controls, such as controls that display text, controls that display pictures, etc.
  • a view system can be used to build applications.
  • the display interface can be composed of one or more views.
  • a display interface including a text message notification icon may include a view for displaying text and a view for displaying pictures.
  • the phone manager is used to provide communication functions of the electronic device 100, for example, management of call status (including connected, hung up, etc.).
  • the resource manager provides various resources to applications, such as localized strings, icons, pictures, layout files, video files, etc.
  • the notification manager allows applications to display notification information in the status bar, which can be used to convey notification-type messages and can automatically disappear after a short stay without user interaction.
  • the notification manager is used to notify download completion, message reminders, etc.
  • the notification manager can also present notifications that appear in the status bar at the top of the system in the form of charts or scroll-bar text (such as notifications of applications running in the background), or notifications that appear on the screen in the form of dialog windows. For example, text information is prompted in the status bar, a beep sounds, the electronic device vibrates, or the indicator light flashes.
  • Android Runtime includes core libraries and virtual machines. Android runtime is responsible for the scheduling and management of the Android system.
  • the core library contains two parts: one is the functional functions that need to be called by the Java language, and the other is the core library of Android.
  • the application layer and application framework layer run in virtual machines.
  • the virtual machine executes the java files of the application layer and application framework layer into binary files.
  • the virtual machine is used to perform object life cycle management, stack management, thread management, security and exception management, and garbage collection and other functions.
  • System libraries can include multiple functional modules. For example: surface manager (surface manager), media libraries (Media Libraries), 3D graphics processing libraries (for example: OpenGL ES), 2D graphics engines (for example: SGL), etc.
  • the surface manager is used to manage the display subsystem and provides the fusion of 2D and 3D layers for multiple applications.
  • the media library supports playback and recording of a variety of commonly used audio and video formats, as well as static image files, etc.
  • the media library can support a variety of audio and video encoding formats, such as: MPEG4, H.264, MP3, AAC, AMR, JPG, PNG, etc.
  • the 3D graphics processing library is used to implement 3D graphics drawing, image rendering, composition, and layer processing.
  • 2D Graphics Engine is a drawing engine for 2D drawing.
  • the kernel layer is the layer between hardware and software.
  • the kernel layer contains at least display driver, camera driver, audio driver, and sensor driver.
  • the following exemplifies the workflow of the software and hardware of the electronic device 100 in conjunction with a photographing capture scenario.
  • when the touch sensor 180K receives a touch operation, the corresponding hardware interrupt is sent to the kernel layer.
  • the kernel layer processes touch operations into raw Input events (including touch coordinates, timestamp of touch operations and other information).
  • Raw input events are stored at the kernel level.
  • the application framework layer obtains the original input event from the kernel layer and identifies the control corresponding to the input event. Taking the touch operation as a touch click operation and the control corresponding to the click operation as a camera application icon control as an example, the camera application calls the interface of the application framework layer to start the camera application, and then starts the camera driver by calling the kernel layer.
  • Camera 193 captures still images or video.
  • Figure 3 is a schematic flowchart of an image recommendation method provided by an embodiment of the present application. This method can be applied to the electronic device 100 shown in FIG. 1 and FIG. 2. The method may include but is not limited to the following steps:
  • S101 The electronic device obtains the first picture sequence.
  • a picture sequence in this application refers to at least one picture.
  • the electronic device can capture the first sequence of pictures through a camera.
  • the electronic device may obtain a first sequence of pictures taken by the connected device.
  • the electronic device may obtain the first picture sequence from a memory of the electronic device, for example, obtain the first picture sequence in a gallery of the electronic device.
  • the electronic device can obtain the first picture sequence stored by the network device.
  • the electronic device can, in response to a user operation for selecting the first picture sequence in a cloud album, send a request message to the application server of the cloud album, and receive the first picture sequence sent by the application server.
  • the first picture sequence can also be obtained through at least two of the above embodiments.
  • for example, the electronic device obtains some pictures in the first picture sequence through a camera, and obtains the remaining pictures in the first picture sequence from a connected device that took them. This application does not limit the specific method of obtaining the first picture sequence.
  • S102 The electronic device generates a second picture sequence based on the first picture sequence.
  • the electronic device may generate a second picture sequence in a temporal dimension and/or a spatial dimension based on the first picture sequence.
  • the electronic device can generate at least one picture in the time dimension based on the first picture sequence, and the second picture sequence includes the at least one picture.
  • the electronic device can first obtain the shooting time of each picture in the first picture sequence (referred to as the timestamp for short); assuming the minimum and maximum timestamps are timestamp 1 and timestamp 2 respectively, the electronic device can then generate, based on these timestamps and the first picture sequence, at least one picture with a finer-grained timestamp, where the timestamp of each of the at least one picture is greater than timestamp 1, less than timestamp 2, and different from the timestamp of any picture in the first picture sequence; the implementation is, for example, similar to video frame interpolation.
  • An example of the above process can be seen in Figure 4 below.
  • the first picture sequence may include four pictures: picture 1, picture 2, picture 3 and picture 4.
  • the timestamps of these four pictures, from smallest to largest, are t1, t2, t3 and t4.
  • the electronic device can generate three pictures in the time dimension based on the first picture sequence: picture 5 with timestamp t5 between t1 and t2, picture 6 with timestamp t6 between t2 and t3, and picture 7 with timestamp t7 between t3 and t4.
  • for any two adjacent timestamps, the electronic device can generate multiple pictures with timestamps between the two timestamps, for example, generate multiple pictures with timestamps between t1 and t2; alternatively, the electronic device may generate no picture between the two timestamps, for example, picture 5 with timestamp t5 between t1 and t2 is not generated. This application does not limit the specific generation method.
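  • A minimal sketch (assuming a learned frame-interpolation model, here replaced by a naive linear blend) of generating pictures in the time dimension as in the Figure 4 example: for each pair of adjacent captured pictures, a picture with an intermediate timestamp is synthesized.

```python
import numpy as np

def interpolate_frame(img_a: np.ndarray, img_b: np.ndarray, alpha: float) -> np.ndarray:
    """Placeholder interpolator: a learned frame-interpolation network would go here."""
    return ((1.0 - alpha) * img_a + alpha * img_b).astype(img_a.dtype)

def generate_time_dimension(pictures: list[tuple[float, np.ndarray]]) -> list[tuple[float, np.ndarray]]:
    """pictures: (timestamp, image) pairs sorted by timestamp, e.g. [(t1, p1), ..., (t4, p4)].
    Returns pictures whose timestamps lie strictly between adjacent captured timestamps."""
    generated = []
    for (t_a, img_a), (t_b, img_b) in zip(pictures, pictures[1:]):
        t_new = (t_a + t_b) / 2.0            # e.g. t5 between t1 and t2
        alpha = (t_new - t_a) / (t_b - t_a)  # relative position of the new timestamp
        generated.append((t_new, interpolate_frame(img_a, img_b, alpha)))
    return generated
```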
  • the electronic device can generate at least one picture in the spatial dimension based on the first picture sequence, for example, perform new-perspective synthesis based on a neural radiance field (NeRF) to obtain the at least one picture.
  • the second picture sequence includes the at least one picture.
  • taking any picture in the first picture sequence as a reference picture, the electronic device can generate one or more pictures with different viewing angles, where the timestamp of the one or more pictures is the timestamp of the reference picture.
  • the observation perspective of any one of the one or more pictures is different from the observation perspective of the reference picture; if multiple pictures are generated, these multiple pictures correspond to different observation perspectives.
  • this process can be understood as synthesizing new perspectives from different observation angles for a fixed timestamp, thereby obtaining at least one picture with a new observation perspective.
  • the electronic device may use some or all of the pictures in the first picture sequence as reference pictures to generate at least one picture with more viewing angles.
  • An example of the above process can be seen in Figure 5 below.
  • Figure 5 uses picture 1 in Figure 4 as a reference picture as an example.
  • the subject in picture 1 is human body 1.
  • the human body 1 can be abstracted as a cube, and the human body 1/the cube can be observed from different perspectives, for example but not limited to: observing the front of the human body 1 from the front perspective, observing the back of the human body 1 from the rear perspective, observing the left side of the human body 1 from the left perspective, observing the right side of the human body 1 from the right perspective, etc.
  • picture 1, as the reference picture, is a picture of the human body 1 observed from the front perspective at timestamp t1.
  • based on picture 1, the electronic device can generate picture 8, a picture of the human body 1 observed from the rear perspective at the same timestamp.
  • the electronic device can generate pictures with more or fewer viewing angles.
  • for example, the electronic device can also generate a picture observed from a top-down perspective. This application does not limit the specific way of generating the pictures obtained by observing the human body 1 from different perspectives.
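  • A minimal sketch (assumed function names; the renderer is a placeholder for a NeRF-style model trained on the scene) of the new-perspective synthesis in the Figure 5 example: the reference picture's camera pose is rotated around the subject to obtain new observation angles, and each new pose is rendered at the reference picture's timestamp.

```python
import numpy as np

def rotate_pose_around_subject(pose: np.ndarray, angle_deg: float) -> np.ndarray:
    """Rotate a 4x4 camera pose about a vertical axis through the subject (at the origin)."""
    a = np.deg2rad(angle_deg)
    rotation = np.array([[np.cos(a),  0.0, np.sin(a), 0.0],
                         [0.0,        1.0, 0.0,       0.0],
                         [-np.sin(a), 0.0, np.cos(a), 0.0],
                         [0.0,        0.0, 0.0,       1.0]])
    return rotation @ pose

def render_view(pose: np.ndarray, timestamp: float) -> np.ndarray:
    """Placeholder: a trained neural radiance field would render the picture here."""
    return np.zeros((256, 256, 3), dtype=np.uint8)

reference_pose = np.eye(4)   # pose of picture 1 (front perspective) at timestamp t1
t1 = 0.0
# e.g. left, rear and right perspectives of human body 1 at the same timestamp
new_views = {angle: render_view(rotate_pose_around_subject(reference_pose, angle), t1)
             for angle in (90.0, 180.0, 270.0)}
```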
  • S103 The electronic device uses the aesthetic evaluation model to score each picture in the first picture sequence and the second picture sequence, and obtains the third picture sequence with higher scores.
  • the electronic device can use each picture in the first picture sequence and the second picture sequence as the input of the aesthetic evaluation model to obtain a corresponding output.
  • the output can include scores of the picture in multiple dimensions.
  • the multiple dimensions include, for example but are not limited to, the following: comprehensive (the corresponding score is called the comprehensive score), subject position (the corresponding score is called the subject position score), the photographed subject's motion stretch (the corresponding score is called the motion stretch score), the photographed subject's expression (the corresponding score is called the expression score), and image quality (the corresponding score is called the image quality score).
  • the aesthetic evaluation model can score pictures from multiple dimensions. Next, any picture in the first picture sequence and the second picture sequence, referred to as the first picture, is taken as an example to illustrate the scoring method of the aesthetic evaluation model.
  • the aesthetic evaluation model may determine the subject position score based on the rate of change of the speed of the subject in the first picture (i.e., the acceleration), where the acceleration may be determined based on the speed of the subject in the first picture and in adjacent pictures; the adjacent pictures belong to the first picture sequence and the second picture sequence and are, for example but not limited to, pictures whose timestamps differ from the timestamp of the first picture by no more than a preset threshold. For example, when the change trend of the acceleration of the subject in the first picture is decreasing and the value is 0, the aesthetic evaluation model can consider that the subject in the first picture is at the highest point of its motion, so the subject position score of the first picture can be set to the maximum value.
  • the above-mentioned changing trend may include: the acceleration in the upward direction gradually decreases from a positive number (gradually approaching 0).
  • the above-mentioned changing trend may be obtained by comparison with the acceleration of the subject in a previous picture, where the previous picture belongs to the first picture sequence and the second picture sequence and its timestamp is smaller than the timestamp of the first picture.
  • the aesthetic evaluation model can also determine the subject position score based on the size of the area occupied by the subject in the first picture. For example, the larger the occupied area, the greater the subject position score.
  • the aesthetic evaluation model can also determine the subject position score based on the position priority of the subject in the first picture; for example, when the subject is in the middle of the picture, the position priority is the highest, so the subject position score can be set to the maximum value. This application does not limit this. A rough sketch combining these cues is given below.
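  • As a rough sketch (the weights and the exact combination are illustrative assumptions, not defined by this application), the subject position score can combine the three cues described above: whether the subject is at the apex of its motion, the area it occupies, and its position priority.

```python
def subject_position_score(at_motion_apex: bool,
                           subject_area: float, picture_area: float,
                           subject_center: tuple[float, float],
                           picture_size: tuple[int, int]) -> float:
    """Illustrative combination of apex, occupied-area and position-priority cues."""
    area_term = subject_area / picture_area  # larger subject -> higher score
    cx, cy = subject_center
    width, height = picture_size
    # position priority: 1.0 at the picture center, falling toward 0.0 at the edges
    center_term = 1.0 - (abs(cx - width / 2) / (width / 2)
                         + abs(cy - height / 2) / (height / 2)) / 2.0
    apex_term = 1.0 if at_motion_apex else 0.0
    return 0.4 * apex_term + 0.3 * area_term + 0.3 * center_term  # assumed weights
```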
  • the aesthetic evaluation model may determine the motion stretch score based on the distances between the skeletal position points of the subject in the first picture. For example, the larger the distances, the more stretched the aesthetic evaluation model considers the subject's motion in the first picture to be, and the greater the motion stretch score; the smaller the distances (for example, when the limbs are not spread out as the human body jumps), the smaller the motion stretch score.
  • an example of the skeletal position points of the subject can be seen in Figure 6.
  • the subject shown in Figure 6 is a human body.
  • the human body can include multiple skeletal position points, such as a head point, a neck point, left/right shoulder points, left/right elbow points, left/right hand points, left/right hip points, left/right knee points, left/right foot points, etc.
  • the above distance includes, for example, the distance between any two skeletal position points as shown in Figure 6 .
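  • As an illustration of the movement stretch score described above, the following sketch averages the pairwise distances between skeletal position points and normalizes by the image diagonal; the keypoint format and the normalization are assumptions made for illustration, not part of this application.

```python
from itertools import combinations
import math

def movement_stretch_score(keypoints, image_width, image_height):
    """keypoints: dict mapping point names (e.g. 'left_hand') to (x, y) pixel positions."""
    diagonal = math.hypot(image_width, image_height)
    distances = [math.hypot(ax - bx, ay - by)
                 for (ax, ay), (bx, by) in combinations(keypoints.values(), 2)]
    if not distances:
        return 0.0
    # Larger distances between skeletal position points -> more stretched pose -> higher score.
    return min(sum(distances) / len(distances) / diagonal, 1.0)
```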
  • the aesthetic evaluation model can determine the expression score based on the facial expression of the subject in the first picture. For example, when the facial expression is a better expression such as smiling, laughing, or eyes wide open, the obtained expression score is larger; when the facial expression is a poor expression such as closed eyes, the obtained expression score is smaller.
  • the above-mentioned better or worse expressions can be preset or determined in response to user operations, and can also be obtained by learning user preferences.
  • the aesthetic evaluation model can determine the image quality score based on the image quality of the first picture. For example, when the image quality of the first picture is higher, the obtained image quality score is larger; when the image quality of the first picture is lower (for example, when the subject suffers from motion blur), the obtained image quality score is smaller.
  • image quality can include, but is not limited to, dynamic range, saturation, contrast, sharpness, etc.
  • the comprehensive score can be a score obtained by combining multiple indicators such as image quality, subject position, movement stretch, expression, and whether there are moving objects (for example, whether there are passers-by).
  • the electronic device can separately sort and filter the output of the aesthetic evaluation model, that is, the scores of the pictures in the first picture sequence and the second picture sequence in multiple dimensions, and obtain a third picture sequence with higher scores.
  • the third picture sequence may include a plurality of picture sequences with higher scores in the above-mentioned multiple dimensions.
  • take the case where the scores in multiple dimensions are the comprehensive score, subject position score, movement stretch score, expression score and image quality score of the above examples. The third picture sequence may include: picture sequence 1 whose comprehensive scores rank in the top N1, picture sequence 2 whose subject position scores rank in the top N2, picture sequence 3 whose movement stretch scores rank in the top N3, picture sequence 4 whose expression scores rank in the top N4, and picture sequence 5 whose image quality scores rank in the top N5, where Ni is a positive integer and i is a positive integer less than 6.
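  • The sorting and filtering described above can be illustrated with the following sketch, which keeps the top-Ni pictures for each dimension to form the per-dimension picture sequences of the third picture sequence. The data layout is an assumption made for illustration only.

```python
def build_third_picture_sequence(scored_pictures, top_n):
    """
    scored_pictures: list of dicts, e.g. {'picture': ..., 'comprehensive': 0.8,
                     'subject_position': 0.7, 'movement_stretch': 0.6,
                     'expression': 0.9, 'image_quality': 0.75}
    top_n: dict mapping each dimension name to N_i, e.g. {'comprehensive': 5, ...}
    """
    third_sequence = {}
    for dimension, n in top_n.items():
        # Sort by the score in this dimension, from high to low, and keep the top N_i.
        ranked = sorted(scored_pictures, key=lambda item: item[dimension], reverse=True)
        third_sequence[dimension] = ranked[:n]   # picture sequence for this dimension
    return third_sequence
```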
  • S104 The electronic device displays the third picture sequence.
  • when displaying at least one picture in the first picture sequence and/or the second picture sequence, the electronic device may preferentially display/highlight the third picture sequence therein; for example, the pictures in the third picture sequence appear before other pictures.
  • when the electronic device displays the third picture sequence, in response to a user operation (such as a user operation of returning to the picture list interface of the gallery application), the electronic device may display at least one picture in the first picture sequence and/or the second picture sequence.
  • when the electronic device displays the third picture sequence, it may preferentially display/highlight pictures with higher scores. For example, when the electronic device displays picture sequence 1, whose comprehensive scores rank in the top N1 of the third picture sequence, the pictures can be displayed from front to back in order of comprehensive score from high to low; that is, the picture with the highest comprehensive score is displayed first, the picture with the second highest comprehensive score is displayed second, and so on.
  • the pictures with higher scores may be displayed in order of score from high to low, or in chronological order.
  • S105 The electronic device receives a user operation for selecting at least one picture (which may be called a fourth picture sequence).
  • S105 is an optional step.
  • in one embodiment, when the electronic device displays the third picture sequence, it may receive a user operation for selecting the fourth picture sequence from the third picture sequence. In another embodiment, when the electronic device displays the third picture sequence, it also displays other pictures in the first picture sequence and/or the second picture sequence, and the electronic device may receive a user operation for selecting the fourth picture sequence from the third picture sequence and/or the other pictures.
  • the fourth picture sequence may include pictures in the first picture sequence. In one implementation, the fourth picture sequence may include pictures in the second picture sequence. In one implementation, the fourth picture sequence may include pictures in the third picture sequence.
  • the electronic device may set the priority of the pictures in the fourth picture sequence according to the received user operation. For example, the priority of the picture selected by the user first is higher than the priority of the picture selected by the user later.
  • the electronic device can determine the dimension corresponding to the picture in the fourth picture sequence according to the received user operation, and the dimension can be any one of the multiple dimensions described in S103. For example, when the electronic device displays picture sequence 1 with a high comprehensive score, it receives a user operation for selecting picture A. Therefore, the dimension corresponding to picture A is comprehensive.
  • S106 The electronic device updates the aesthetic evaluation model based on the third picture sequence and the at least one picture selected by the user (i.e., the fourth picture sequence).
  • the electronic device can set the score of the picture in the fourth picture sequence based on the score of the picture in the third picture sequence.
  • the fourth picture sequence and the corresponding score can be called a personalized data set.
  • the personalized data set can be used to update the aesthetic evaluation model.
  • take the case where the fourth picture sequence includes M pictures (M is a positive integer) whose corresponding dimension is the comprehensive dimension as an example for explanation. If these M pictures are not picture sequence 1 (the pictures with higher comprehensive scores in the third picture sequence), the electronic device can set the comprehensive scores corresponding to these M pictures to the comprehensive scores of the top M pictures in picture sequence 1, and these M pictures and the corresponding comprehensive scores may belong to the personalized data set. Here, "the M pictures are not picture sequence 1" may include: at least one of the M pictures does not belong to picture sequence 1, and/or the priority order of at least one of the M pictures is different from the rank of that picture's comprehensive score in picture sequence 1.
  • for any picture among the above M pictures (which can be called the second picture), assume that the priority of the second picture ranks Kth among these M pictures (K is a positive integer less than or equal to M). If the second picture does not belong to picture sequence 1, or if the second picture belongs to picture sequence 1 but the picture whose comprehensive score ranks Kth in picture sequence 1 is not the second picture, the electronic device can set the comprehensive score corresponding to the second picture to the comprehensive score of the picture (which may be called the third picture) whose comprehensive score ranks Kth in picture sequence 1.
  • when the electronic device sets the comprehensive scores corresponding to the M pictures, for any one of the M pictures, if the picture belongs to picture sequence 1 and the priority order of the picture is consistent with the rank of the picture's comprehensive score in picture sequence 1, the electronic device does not need to set the comprehensive score corresponding to the picture.
  • an example of how the electronic device sets the scores of the pictures in the fourth picture sequence can be seen in Figure 7 below and is not described in detail here.
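  • For reference, the calibration rule described above (the user-selected picture whose priority ranks Kth inherits the Kth-ranked comprehensive score from picture sequence 1, and pictures whose priority already matches the ranking are skipped) can be sketched as follows; the data structures are illustrative assumptions, not the implementation of this application.

```python
def build_personalized_dataset(selected_pictures, picture_sequence_1):
    """
    selected_pictures: the M user-selected pictures, ordered by selection priority.
    picture_sequence_1: list of (picture, comprehensive_score), sorted from high to low.
    Returns (picture, calibrated_score) pairs forming the personalized data set.
    """
    dataset = []
    for k, picture in enumerate(selected_pictures[:len(picture_sequence_1)]):
        ref_picture, ref_score = picture_sequence_1[k]   # Kth-ranked picture and its score
        if picture is ref_picture:
            # Priority already matches the model's ranking: no calibration needed.
            continue
        dataset.append((picture, ref_score))
    return dataset
```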
  • the updated aesthetic evaluation model can be used to score pictures subsequently. For example, after S101-S106 shown in Figure 3 are executed, the updated aesthetic evaluation model can be obtained, and then the electronic device can execute S101-S104 again (the picture sequences at this time may be different from the previous picture sequences). At this time, the aesthetic evaluation model used in S103 may be the above-mentioned updated aesthetic evaluation model.
  • S102 and/or S103 may also be executed by a network device connected to the electronic device.
  • the electronic device may send the first picture sequence to the network device,
  • the network device may perform S102 and S103, and then send the third picture sequence to the electronic device for display.
  • in this application, the electronic device can generate the second picture sequence in the time and/or space dimensions, select a third picture sequence with higher quality from the first picture sequence and the second picture sequence based on multiple dimensions such as the comprehensive dimension, subject position, movement stretch, expression, and image quality, and recommend it to the user, thereby optimizing the picture recommendation strategy so that the user can obtain satisfactory pictures conveniently and quickly.
  • the first picture sequence and the second picture sequence can also be displayed as candidate pictures, thereby increasing the probability that the user obtains the desired image.
  • the electronic device can update the picture recommendation strategy based on the pictures selected by the user, recommend different pictures for different users, and further increase the probability that the user obtains the desired image.
  • Figure 7 exemplarily shows a schematic diagram of the acquisition process of a personalized data set.
  • the third picture sequence includes picture sequence 1 with the top 2 comprehensive scores and picture sequence 2 with the subject position score in the top 3.
  • the comprehensive score 1 of picture 11 is higher than the comprehensive score 2 of picture 12.
  • the order from high to low according to the subject position score is: picture 21 (corresponding to subject position score 1), picture 22 (corresponding to subject position score 2), picture 23 (corresponding to subject position score 3).
  • the fourth picture sequence includes picture 11, picture 22 and picture 24, where the corresponding dimension of picture 11 is comprehensive, the corresponding dimension of picture 22 and picture 24 is subject position, and picture 22 has a higher priority than picture 24.
  • since picture 11, which corresponds to the comprehensive dimension in the fourth picture sequence, belongs to picture sequence 1, and the priority order of picture 11 and the rank of picture 11's comprehensive score in picture sequence 1 are both first, it can be understood that the comprehensive score obtained by the aesthetic evaluation model meets the user's needs. Therefore, the personalized data set does not need to include picture 11 and the corresponding comprehensive score 1.
  • the electronic device can set the subject position score corresponding to picture 22 in the fourth picture sequence to subject position score 1 of picture 21, whose subject position score ranks first in picture sequence 2.
  • the personalized data set can include picture 22 and the corresponding subject position score 1.
  • similarly, the electronic device can set the subject position score corresponding to picture 24 in the fourth picture sequence to subject position score 2 of picture 22, whose subject position score ranks second in picture sequence 2.
  • the personalized data set may include the picture 24 and the corresponding subject position score 2.
  • FIG. 8 exemplarily shows a schematic diagram of the software architecture of yet another electronic device 100 .
  • the electronic device 100 may include a picture generation module 200 , a picture recommendation module 300 , a user selection module 400 , a storage module 500 , a personalized learning module 600 and a picture library 700 , where the picture library 700 may include a picture sequence 701 , the picture sequence 701 is, for example, a plurality of pictures continuously taken by the electronic device 100 in response to user operations.
  • the picture generation module 200 may receive the picture sequence 701 (as input), generate a picture sequence 702 with new timestamps and/or new observation perspectives in the temporal and/or spatial dimensions according to the picture sequence 701, and output the picture sequence 702 to the picture library 700.
  • the picture generation module 200 may be used to perform S102 in FIG. 3 .
  • the picture recommendation module 300 can receive the picture library 700 (as input), use the aesthetic evaluation model to score each picture in the picture library 700 from multiple dimensions such as the comprehensive dimension, subject position, movement stretch, expression, and image quality, and output a picture sequence 703 with higher scores. In one implementation, the picture recommendation module 300 may be used to perform S103 in FIG. 3.
  • the electronic device 100 may display a sequence of pictures 703 to recommend to the user for selection.
  • while the picture sequence 703 and, optionally, other pictures in the picture library 700 (as input) are displayed, the user selection module 400 can select the picture sequence 704 (as output) from the displayed pictures according to user operations. In one implementation, the user selection module 400 may be used to perform S105 in FIG. 3.
  • the storage module 500 can store the picture sequence 704 output by the user selection module 400. In one implementation, the storage module 500 can also delete pictures other than the picture sequence 704 in the picture library 700.
  • the personalized learning module 600 can receive the picture sequence 703 and the picture sequence 704 (as input), compare the picture sequence 703 and the picture sequence 704 to obtain a personalized data set, and train the pre-update/historical aesthetic evaluation model based on the personalized data set (for example, periodic training) to obtain an updated aesthetic evaluation model (as output).
  • the aesthetic evaluation model before updating may be sent by the picture recommendation module 300 to the personalized learning module 600 as input.
  • the updated aesthetic evaluation model can be sent to the picture recommendation module 300 for use.
  • the personalized learning module 600 may be used to perform S106 in FIG. 3 .
  • the picture generation module 200 in the electronic device 100 shown in FIG. 8 is introduced as an example.
  • FIG. 9 exemplarily shows a schematic diagram of the software architecture of yet another electronic device 100 .
  • the picture generation module 200 of the electronic device 100 may include a time perception module 201 and a space perception module 202 .
  • the time perception module 201 can receive the picture sequence 701 in the picture library 700 as input, and generate a picture sequence 705 with new timestamps in the time dimension according to the picture sequence 701 (as output).
  • the time perception module 201 is implemented based on video frame insertion, for example.
  • the spatial perception module 202 can receive the picture sequence 701 in the picture library 700 and the picture sequence 705 output by the time perception module 201 as input, and generate a picture sequence 706 of a new observation perspective in the spatial dimension according to the picture sequence 701 and the picture sequence 705 (as output).
  • the spatial perception module 202 is implemented based on NeRF, for example.
  • the picture sequence 705 and the picture sequence 706 can be output to the picture library 700 to form the picture sequence 702.
  • the picture sequence 702 can be the union of the picture sequence 705 and the picture sequence 706.
  • the spatial perception module 202 shown in Figure 9 may include a model training module 202A, a parameter extraction module 202B, a spatial perception model 202C and a new parameter generation module 202D.
  • the process of the spatial perception module 202 generating the picture sequence 706 based on the picture sequence 701 and the picture sequence 705 may include two steps: online training and picture generation, as detailed below.
  • the parameter extraction module 202B can receive the picture sequence 701 and the picture sequence 705 as input, and output the spatial parameters of each picture in the picture sequence 701 and the picture sequence 705.
  • the spatial parameters include, for example but are not limited to: the coordinates of the scene shown in the picture (which may be referred to as spatial coordinates for short, for example expressed as (x, y, z) in a spatial rectangular coordinate system/world coordinate system), and the pose of the camera of the electronic device 100 (which may be referred to as the camera pose for short, and can also be understood as the observation direction, for example expressed as (θ, φ), where θ and φ are the azimuth angle and polar angle in a spherical coordinate system, respectively).
  • the spatial perception model 202C can receive the spatial parameters output by the parameter extraction module 202B as input, and output a picture sequence 707 corresponding to these spatial parameters respectively, wherein, for any picture in the picture sequence 701 and the picture sequence 705 (which can be called picture B), the spatial perception model 202C can receive spatial parameter 1 of picture B as input, and output picture C in the picture sequence 707.
  • picture C can be understood as the picture corresponding to spatial parameter 1 "simulated" by the spatial perception model 202C.
  • the model training module 202A can receive the picture sequence 701, the picture sequence 705, and the picture sequence 707 output by the spatial perception model 202C as input, compare each picture in the picture sequence 701 and the picture sequence 705 with the corresponding picture in the picture sequence 707 based on a loss function, and train the spatial perception model 202C according to the comparison results to obtain an updated spatial perception model 202C (for example, specifically obtain the weights of the model), where, for any picture in the picture sequence 701 and the picture sequence 705 (picture B), the corresponding picture in the picture sequence 707 is the output obtained by using spatial parameter 1 of picture B as the input of the spatial perception model 202C before the update, that is, picture C.
  • the above process can be called a training process.
  • Multiple training processes can be performed to obtain multiple updated spatial perception models 202C.
  • the weight of the spatial perception model 202C before the first update is W0
  • the weight of the spatial perception model 202C after the first update is W1.
  • the parameter extraction module 202B, the model training module 202A and the spatial perception model 202C with weight W1 can perform the training process again (at this time, the output of the spatial perception model 202C may not be the picture sequence 707) to perform the second update and obtain the weight W2 of the spatial perception model 202C after the second update. Through multiple rounds of iterations, the weight Wn of the spatial perception model 202C after multiple updates is obtained, where n is the number of updates.
  • the spatial perception model 202C updated multiple times is used to perform the following steps of image generation.
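  • A minimal sketch of the online training loop described above is given below, assuming a PyTorch-style framework (which this application does not prescribe) and a simple photometric (mean squared error) loss; each pass over the pictures corresponds to one weight update Wi -> Wi+1.

```python
import torch

def online_training(model, pictures, spatial_params, num_updates, lr=5e-4):
    """pictures: target tensors from picture sequences 701 and 705;
       spatial_params: their extracted spatial parameters (coordinates + camera pose)."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(num_updates):              # each pass updates the weights once (W_i -> W_{i+1})
        for target, params in zip(pictures, spatial_params):
            rendered = model(params)          # the "simulated" picture for these parameters
            loss = torch.nn.functional.mse_loss(rendered, target)   # photometric loss
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model                              # weights Wn after n updates
```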
  • the new parameter generation module 202D can receive the spatial parameters of each picture in the picture sequence 701 and the picture sequence 705 output by the parameter extraction module 202B as input, and output different spatial parameters.
  • for example, the new parameter generation module 202D can receive spatial parameter 1 of picture B in the picture sequence 701 and the picture sequence 705 output by the parameter extraction module 202B as input, and output a spatial parameter 2 that is different from spatial parameter 1.
  • for example, spatial parameter 1 includes spatial coordinate 1 and camera pose 1, and spatial parameter 2 includes spatial coordinate 1 and camera pose 2.
  • the spatial perception model 202C after multiple updates can receive the spatial parameters output by the new parameter generation module 202D as input, and output the picture sequence 706 corresponding to these spatial parameters respectively.
  • for example, the spatial perception model 202C after multiple updates can output the picture corresponding to spatial parameter 2.
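  • The picture generation step can be sketched as follows: the spatial coordinates are kept, the camera pose is perturbed to obtain spatial parameter 2, and the trained model is queried for the new observation perspective. The pose offset value and the callable model interface are assumptions made for illustration.

```python
def generate_new_view(trained_model, spatial_param_1, pose_offset=(0.1, 0.0)):
    """spatial_param_1: (spatial_coordinates, (theta, phi)) of an existing picture."""
    coords, (theta, phi) = spatial_param_1                  # spatial coordinate 1, camera pose 1
    camera_pose_2 = (theta + pose_offset[0], phi + pose_offset[1])
    spatial_param_2 = (coords, camera_pose_2)               # same coordinates, new observation direction
    return trained_model(spatial_param_2)                   # picture for the new observation perspective
```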
  • in one implementation, the input of the spatial perception module 202 is only the picture sequence 701 in the picture library 700, which can be understood as: the time perception module 201 and the spatial perception module 202 are two independent modules in the picture generation module 200.
  • the spatial perception model 202C shown in Figure 10 may include two independent multilayer perceptrons (MLPs) (MLP1 and MLP2) and a volume rendering module.
  • FIG. 11 takes the case where the spatial parameters input to the spatial perception model 202C include the camera pose and spatial coordinates as an example for illustration.
  • MLP1 can receive spatial coordinates in spatial parameters as input for feature extraction, and output intermediate features and spatial density.
  • MLP2 can receive the camera pose in the spatial parameters and the intermediate features output by MLP1 as input for feature extraction and output color information (color).
  • the volume rendering module can receive the spatial density output by MLP1 and the color information output by MLP2 as input, perform volume rendering, and output pictures corresponding to the above spatial parameters.
  • learnable parameters are set in MLP1 and/or MLP2, and the above online training can specifically train MLP1 and/or MLP2.
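  • The two-MLP structure plus volume rendering described above resembles a NeRF-style renderer. The following sketch, with assumed layer sizes, unit sample spacing and a PyTorch implementation, illustrates the data flow for a single camera ray; it is not the exact network of this application.

```python
import torch
import torch.nn as nn

class SpatialPerceptionSketch(nn.Module):
    """Assumed sizes; coords are sample points along one camera ray, view_dir is (azimuth, polar)."""
    def __init__(self, hidden=256):
        super().__init__()
        self.mlp1 = nn.Sequential(nn.Linear(3, hidden), nn.ReLU(),
                                  nn.Linear(hidden, hidden + 1))          # intermediate features + density
        self.mlp2 = nn.Sequential(nn.Linear(hidden + 2, hidden // 2), nn.ReLU(),
                                  nn.Linear(hidden // 2, 3))              # RGB color

    def forward(self, coords, view_dir):
        out = self.mlp1(coords)                                           # MLP1: spatial coordinates in
        features, density = out[:, :-1], torch.relu(out[:, -1])
        color = torch.sigmoid(self.mlp2(torch.cat(                        # MLP2: camera pose + features in
            [features, view_dir.expand(coords.shape[0], 2)], dim=-1)))
        # Simplified volume rendering: alpha-composite the samples along the ray.
        alpha = 1.0 - torch.exp(-density)                                 # assumes unit spacing between samples
        transmittance = torch.cumprod(
            torch.cat([torch.ones(1), 1.0 - alpha + 1e-10])[:-1], dim=0)
        weights = alpha * transmittance
        return (weights.unsqueeze(-1) * color).sum(dim=0)                 # rendered pixel color
```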
  • the time perception module 201 and the space perception module 202 can be integrated together. For details, see the architecture of the electronic device 100 shown in FIG. 12 .
  • the picture generation module 200 of the electronic device 100 may include a model training module 203 , a parameter extraction module 204 , a spatiotemporal perception model 205 and a new parameter generation module 206 .
  • the picture generation module 200 can receive the picture sequence 701 in the picture library 700 as input, generate a picture sequence 702 with new timestamps and new observation perspectives in the time dimension and the spatial dimension according to the picture sequence 701, and output it to the picture library 700.
  • This process can include two steps: online training and image generation, as shown below.
  • the parameter extraction module 204 can receive the picture sequence 701 as input and output the spatial parameters and temporal parameters of each picture in the picture sequence 701.
  • the spatial parameters include, but are not limited to, spatial coordinates and camera posture.
  • the time parameters include, but are not limited to, a timestamp and/or a time embedding. The time embedding of any picture can be determined based on the timestamp of the picture; for example, a Fourier transform is performed on the timestamp to obtain a high-dimensional vector (for example, a 128-dimensional vector), and this high-dimensional vector is the determined time embedding.
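  • A common way to realize such a time embedding is a sinusoidal (Fourier-feature) encoding of the timestamp, sketched below; the 128-dimension size matches the example above, while the frequency schedule is an assumption made for illustration.

```python
import math

def time_embedding(timestamp, dim=128):
    """Map a scalar timestamp to a dim-dimensional sine/cosine feature vector."""
    assert dim % 2 == 0
    embedding = []
    for i in range(dim // 2):
        freq = 1.0 / (10000.0 ** (2.0 * i / dim))   # assumed frequency schedule
        embedding.append(math.sin(freq * timestamp))
        embedding.append(math.cos(freq * timestamp))
    return embedding                                # the time embedding (length dim)
```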
  • the spatio-temporal perception model 205 can receive the spatial parameters and temporal parameters output by the parameter extraction module 204 as input, and output a picture sequence 708 corresponding to these spatial parameters and temporal parameters respectively, wherein, for any picture in the picture sequence 701 (which can be called picture D), the spatio-temporal perception model 205 can receive spatial parameter 3 and temporal parameter 1 of picture D as input, and output picture E in the picture sequence 708.
  • picture E can be understood as the picture corresponding to spatial parameter 3 and temporal parameter 1 "simulated" by the spatio-temporal perception model 205.
  • the model training module 203 can receive the picture sequence 701 and the picture sequence 708 output by the spatio-temporal perception model 205 as input, compare each picture in the picture sequence 701 with the corresponding picture in the picture sequence 708 based on a loss function, and train the spatio-temporal perception model 205 according to the comparison results to obtain an updated spatio-temporal perception model 205 (for example, specifically obtain the weights of the model).
  • the above process can be called a training process. Multiple training processes can be performed to obtain multiple updated spatio-temporal perception models 205.
  • the specific examples are similar to the online training examples described in Figure 10 and will not be described again.
  • the spatio-temporal perception model 205 updated multiple times is used to perform the following steps of image generation.
  • the new parameter generation module 206 can receive the spatial parameters and time parameters of each picture in the picture sequence 701 output by the parameter extraction module 204 as input, and output different spatial parameters and different time parameters. For example, the new parameter generation module 206 may receive spatial parameter 3 and temporal parameter 1 of picture D in the picture sequence 701 output by the parameter extraction module 204 as input, and output a different spatial parameter 4 and a different temporal parameter 2.
  • the spatio-temporal perception model 205 after multiple updates can receive the spatial parameters and temporal parameters output by the parameter extraction module 204 and the new parameter generation module 206 as input, and output the picture sequence 702 corresponding to these spatial parameters and temporal parameters respectively. For example, the spatio-temporal perception model 205 after multiple updates can receive spatial parameter 3 and temporal parameter 1 of picture D in the picture sequence 701 output by the parameter extraction module 204, and temporal parameter 2 and spatial parameter 4 output by the new parameter generation module 206 as input, and correspondingly output: picture F corresponding to spatial parameter 3 and temporal parameter 2, picture G corresponding to temporal parameter 1 and spatial parameter 4, and picture H corresponding to spatial parameter 4 and temporal parameter 2.
  • the spatio-temporal perception model 205 shown in FIG. 12 may include two independent MLPs (MLP3 and MLP4) and a volume rendering module. For details, please refer to the architecture of the electronic device 100 shown in FIG. 13.
  • FIG. 13 takes the case where the spatial parameters input to the spatio-temporal perception model 205 include the camera pose and spatial coordinates, and the time parameters include the time embedding, as an example for illustration.
  • MLP3 can receive spatial coordinates in spatial parameters as input for feature extraction, and output intermediate features and spatial density.
  • MLP4 can receive the time embedding in the time parameters, the camera pose in the spatial parameters, and the intermediate features output by MLP3 as input for feature extraction, and output color information.
  • the volume rendering module can receive the spatial density output by MLP3 and the color information output by MLP4 as input, perform volume rendering, and output pictures corresponding to the above spatial parameters and time parameters.
  • learnable parameters are set in MLP3 and/or MLP4, and the above online training can specifically train MLP3 and/or MLP4.
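  • Compared with MLP2 in the spatial-only model, the color branch MLP4 additionally consumes the time embedding, so the rendered color can vary with both the observation direction and time. A sketch with assumed input dimensions follows; it is illustrative only.

```python
import torch
import torch.nn as nn

class MLP4Sketch(nn.Module):
    """Assumed sizes: features from MLP3, camera pose (azimuth, polar), time embedding."""
    def __init__(self, feature_dim=256, time_dim=128, hidden=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feature_dim + 2 + time_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 3))    # RGB color

    def forward(self, features, camera_pose, time_emb):
        # Color now depends on the observation direction and on time.
        return torch.sigmoid(self.net(torch.cat([features, camera_pose, time_emb], dim=-1)))
```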
  • the picture generation module 200 may include only the time perception module 201 or only the spatial perception module 202, where, when the picture generation module 200 only includes the spatial perception module 202, the input of the spatial perception module 202 is the picture sequence 701 in the picture library 700.
  • This application can generate a new picture sequence 702 in the temporal and/or spatial dimensions based on the captured picture sequence 701.
  • the picture sequence 702 and the picture sequence 701 can be used together as candidate pictures for recommending pictures to the user and for the user to select, which adds differentiated, high-quality candidate pictures within the limited shooting time, increases the probability that the user obtains the required pictures, and improves the user experience.
  • the picture recommendation module 300 in the electronic device 100 shown in FIG. 8 is introduced as an example.
  • FIG. 14 exemplarily shows a schematic diagram of the software architecture of yet another electronic device 100 .
  • the picture recommendation module 300 of the electronic device 100 may include an aesthetic evaluation model 301 and a filtering module 302 .
  • the aesthetic evaluation model 301 may be an updated aesthetic evaluation model output by the personalized learning module 600 of the electronic device 100 .
  • the aesthetic evaluation model 301 can receive each picture in the picture library 700 as input, and output the score of each picture in the picture library 700 in multiple dimensions; the comprehensive score, subject position score, movement stretch score, expression score and image quality score are taken as an example for illustration.
  • the screening module 302 can receive the output of the aesthetic evaluation model 301 as input, sort and filter the scores in the multiple dimensions, and obtain the picture sequence 703, that is, a plurality of picture sequences with higher scores in the multiple dimensions: picture sequence 1 whose comprehensive scores rank in the top N1, picture sequence 2 whose subject position scores rank in the top N2, picture sequence 3 whose movement stretch scores rank in the top N3, picture sequence 4 whose expression scores rank in the top N4, and picture sequence 5 whose image quality scores rank in the top N5, where Ni is a positive integer and i is a positive integer less than 6.
  • this application can conduct a comprehensive aesthetic evaluation and screening of candidate pictures from multiple dimensions such as the comprehensive dimension, subject position, movement stretch, expression and image quality, and recommend to users pictures with higher scores in multiple dimensions, which can meet the different preferences of different users, make picture recommendation more accurate, increase the probability that users obtain the pictures they need, and improve the user experience.
  • the personalized learning module 600 in the electronic device 100 shown in FIG. 8 is introduced as an example.
  • FIG. 15 exemplarily shows a schematic diagram of the software architecture of yet another electronic device 100 .
  • the picture recommendation module 300 of the electronic device 100 may include an aesthetic evaluation model 301 and a filtering module 302 .
  • the user selection module 400 of the electronic device 100 can receive the picture sequence 703 output by the filtering module 302 (including multiple picture sequences with higher scores in multiple dimensions) as input, and select the picture sequence 704 (as output) from the picture sequence 703 according to the received user operation.
  • the personalized learning module 600 of the electronic device 100 may include a picture calibration module 601 , a personalized data set 602 and a model training module 603 .
  • the picture calibration module 601 can receive the picture sequence 703 and the corresponding scores output by the screening module 302, and the picture sequence 704 output by the user selection module 400, as input, and set the scores corresponding to the picture sequence 704 according to the picture sequence 703 and the corresponding scores. The picture sequence 704 output by the picture calibration module 601 and the corresponding scores can constitute the personalized data set 602.
  • for the specific implementation of the picture calibration module 601, see S106 of Figure 3 and the description of Figure 7.
  • the model training module 603 may receive the personalized data set 602 and the pre-update aesthetic evaluation model 301 as input, use the personalized data set 602 to train the pre-update aesthetic evaluation model 301, and obtain an updated aesthetic evaluation model 301. The updated aesthetic evaluation model 301 may be output to the picture recommendation module 300.
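  • A minimal sketch of such training is shown below, assuming a PyTorch-style regression fine-tuning on the (picture, calibrated score) pairs of the personalized data set 602; the framework, loss function and hyperparameters are illustrative assumptions, not prescribed by this application.

```python
import torch

def finetune_aesthetic_model(model, personalized_dataset, epochs=3, lr=1e-4):
    """personalized_dataset: iterable of (picture_tensor, calibrated_score) pairs."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for picture, target_score in personalized_dataset:
            predicted = model(picture.unsqueeze(0)).squeeze()     # model's score for the picture
            loss = torch.nn.functional.mse_loss(
                predicted, torch.tensor(float(target_score)))
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model                                                  # updated aesthetic evaluation model 301
```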
  • this application can train the aesthetic evaluation model 301 in the picture recommendation module 300 based on the default recommended pictures and the user-selected pictures, that is, perform on-device self-learning so that the scoring strategy of the aesthetic evaluation model 301 matches the user's habits as much as possible, thereby achieving personalized picture recommendation, further improving the probability that users obtain the pictures they need, and improving the user experience.
  • for different users, the recommended third picture sequence can be different, that is, picture recommendation tailored to each user ("thousands of people, thousands of faces") is realized.
  • Figure 16 exemplarily shows a schematic diagram of the user interface of a picture recommendation process.
  • the electronic device 100 may display the user interface 1000 of the camera application.
  • the user interface 1000 may include a viewfinder frame 1010, a shooting control 1020, and a thumbnail image 1030.
  • the viewfinder frame 1010 is used to display images collected by the electronic device 100 through a camera in real time
  • the shooting control 1020 is used to trigger the shooting of pictures through the camera
  • the thumbnail image 1030 is used to display the latest picture taken by the electronic device 100 through the camera.
  • the electronic device 100 can continuously capture multiple pictures in response to an operation on the shooting control 1020 (such as a touch operation, for example a click operation or a long-press operation), that is, implement S101 shown in Figure 3.
  • the electronic device 100 may display any one of the plurality of pictures in response to an operation (such as a touch operation, such as a click operation) on the thumbnail 1030.
  • the user interface 2000 may include a picture 2010, which is one of the plurality of pictures mentioned above, and a control 2020.
  • in response to an operation (such as a touch operation, for example a click operation) on the control 2020, the electronic device 100 can, based on the above-mentioned multiple continuously shot pictures, recommend pictures to the user from multiple dimensions such as comprehensive recommendation, subject position, movement stretch, expression, and image quality, and display the recommended pictures and other pictures, that is, implement S102-S104 shown in Figure 3.
  • the recommended pictures are the third picture sequence, and the other pictures include at least one picture in the first picture sequence and the second picture sequence other than the third picture sequence. For details, see the user interface 3000 shown in (C) of Figure 16.
  • the user interface 3000 may include a return control 3010 , prompt information 3020 , and a save control 3030 .
  • the return control 3010 is used to return to the previous level interface.
  • the save control 3030 is used to save the picture selected by the user.
  • the prompt information 3020 is used to indicate the number of candidate pictures and the number of pictures selected by the user. For example, it may include the characters "Select Photo 0/30" to indicate that the number of candidate pictures is 30 and the number of pictures selected by the user is 0.
  • the above-mentioned candidate pictures include the first picture sequence and the second picture sequence. For example, the number of the multiple continuously shot pictures (i.e., the first picture sequence) of the electronic device 100 is 10.
  • in one implementation, the folder used to store pictures in the electronic device 100 may include information about the first picture sequence and the second picture sequence. For example, the storage location of the first picture sequence and the storage location of the second picture sequence may be different; the second picture sequence is, for example, stored in a newly created temporary cache area. The attributes of the first picture sequence and the second picture sequence may also be different, for example but not limited to different generation times, different carried tags, etc.
  • in another implementation, before receiving the above-mentioned operation on the control 2020, the folder used to store pictures in the electronic device 100 may only include the first picture sequence; after receiving the operation on the control 2020, the folder may also include the second picture sequence.
  • the user interface 3000 also includes a recommendation dimension 3040, a picture list 3050, and a display box 3060.
  • the recommendation dimension 3040 may include multiple dimensions such as comprehensive recommendation 3040A, subject position 3040B, movement stretch 3040C, expression 3040D, and image quality 3040E.
  • the electronic device 100 may, in response to an operation on any one of the dimensions (such as a touch operation, for example a click operation), set that dimension to the selected state.
  • currently, the comprehensive recommendation 3040A is in the selected state.
  • the picture list 3050 is used to display recommended pictures and other pictures under the selected dimension in the recommended dimension 3040 (currently the comprehensive recommendation 3040A).
  • the recommended pictures include: picture 3051 displaying the recommendation mark 3051A and picture 3052 displaying the recommendation mark 3052A; the other pictures include: picture 3053 and picture 3054.
  • the pictures in the picture list 3050 can be displayed from front to back according to the score in the corresponding dimension (currently the comprehensive score) from high to low; that is, in order of comprehensive score from high to low, the pictures in the picture list 3050 are picture 3051, picture 3052, picture 3053 and picture 3054.
  • the electronic device 100 may display other pictures in the picture list 3050 in response to an operation on the picture list 3050 (such as a touch operation, such as a sliding operation from right to left).
  • the picture list 3050 also includes a control 3055.
  • the display box 3060 is used to display the picture pointed to by the control 3055. For example, the control 3055 currently points to picture 3051; therefore, the display box 3060 is used to display the enlarged picture 3051.
  • when the electronic device 100 displays the recommended pictures and other pictures, in response to an operation (such as a touch operation, for example a click operation) on any picture, the picture can be set to the selected state, that is, S105 shown in FIG. 3 is implemented.
  • the electronic device 100 can respond to the operation on the picture 3054 in the user interface 3000 shown in (C) of FIG. 16 , and set the picture 3054 to the selected state.
  • for details, see the user interface 4000 shown in FIG. 17.
  • the user interface 4000 is similar to the user interface 3000.
  • the difference is that the picture 3054 in the picture list 3050 displays information 4010.
  • the information 4010 includes the character "1", indicating that the picture 3054 is the first picture selected by the user and/or the user-selected picture with the highest priority.
  • the control 3055 currently points to the picture 3054, and accordingly, the display box 3060 is used to display the enlarged picture 3054. Since the current user has selected a picture, the prompt information 3020 may include the characters "Select photo 1/30".
  • the electronic device 100 may display the recommended pictures and other pictures under another dimension of the recommendation dimension 3040 in response to an operation (such as a click operation) on that dimension. For example, after the implementation shown in FIG. 17, the electronic device 100 may, in response to an operation on the subject position 3040B in the recommendation dimension 3040 included in the user interface 4000 shown in FIG. 17, display the recommended pictures and other pictures based on the subject position 3040B. For details, see the user interface 5000 shown in FIG. 18.
  • the user interface 5000 is similar to the user interface 3000. The difference is that the subject position 3040B in the recommended dimension 3040 is selected. Therefore, the user interface 5000 includes a picture list 5010.
  • the picture list 5010 is used to display the recommended pictures under the subject position 3040B (i.e., picture 5011 and picture 5012), and other pictures (i.e., picture 5013 and picture 5014).
  • the pictures in the picture list 5010 are picture 5011, picture 5012, picture 5013 and picture 5014 in descending order according to the subject position score.
  • the electronic device 100 can set the picture 5014 to a selected state in response to an operation on the picture 5014 (such as a touch operation, such as a click operation).
  • the picture 5014 in the user interface 5000 displays information 5020
  • the information 5020 includes the character "2”, indicating that the picture 5014 is the second picture selected by the user and/or the picture ranked second in priority selected by the user.
  • the control 3055 currently points to the picture 5014, and accordingly, the display box 3060 is used to display the enlarged picture 5014. Since the current user has selected two pictures, the prompt information 3020 may include the characters "Select photo 2/30".
  • the electronic device 100 may save the user-selected pictures 3054 and 5014 in response to an operation on the save control 3030 in the user interface 5000 shown in FIG. 18, and delete the other pictures among the candidate pictures.
  • the electronic device 100 may implement S106 shown in FIG. 3 based on the pictures 3054 and 5014 selected by the user.
  • in other embodiments, the electronic device 100 may also directly display the user interface 3000 shown in (C) of FIG. 16 in response to the operation on the shooting control 1020 in the user interface 1000 shown in (A) of FIG. 16.
  • alternatively, after receiving the operation on the shooting control 1020 in the user interface 1000 shown in (A) of FIG. 16, the electronic device 100 may display the user interface 3000 shown in (C) of FIG. 16 in response to the operation on the thumbnail 1030 in the user interface 1000.
  • the user can also select multiple pictures in the gallery, and the electronic device 100 can perform picture recommendation based on these multiple pictures, that is, implement the method shown in Figure 3.
  • the first picture sequence is these multiple pictures.
  • the user interface 6000 may be a user interface of a gallery application.
  • the user interface 6000 may include prompt information 6010, a picture list 6020, and a function list 6030.
  • the picture list 6020 may include multiple pictures, such as, but not limited to, picture 6021, picture 6022, picture 6023, picture 6024, picture 6025, and picture 6026.
  • Picture 6021 is used as an example for illustration.
  • a selection control 6021A is also displayed on the picture 6021.
  • the selection control 6021A is used to select the picture 6021 or deselect the picture 6021.
  • the selection control 6021A in the user interface 6000 indicates that the picture 6021 has been selected. Similarly, picture 6022, picture 6023 and picture 6025 are all selected.
  • the prompt information 6010 is used to indicate the number of pictures that have been selected. For example, if 4 pictures are currently selected, the prompt information 6010 includes the characters "4 items have been selected.”
  • the function list 6030 may include controls for multiple functions, such as but not limited to controls for sharing functions, controls for deleting functions, controls for selecting all functions, controls for recommended functions 6031 and controls for more functions.
  • the electronic device 100 may respond to an operation on the control 6031 (such as a touch operation, for example a click operation), and use the pictures 6021, 6022, 6023 and 6025 selected by the user as the first picture sequence to implement the method shown in FIG. 3.
  • the user interface for displaying the third picture sequence may refer to the user interface 3000 shown in (C) of Figure 16 .
  • the picture 3051 and the picture 3052 in the picture list 3050 shown in the user interface 3000 are the above-mentioned pictures 6021 and 6025, but the pictures 3053 and 3054 in the picture list 3050 do not belong to the first picture sequence, that is, they belong to the second picture sequence.
  • the electronic device 100 may receive a dimension 1 input by the user on a setting interface and determine that the user prefers dimension 1. Then, after the electronic device 100 continuously takes multiple pictures, it can recommend pictures to the user under dimension 1 based on these pictures. In some examples, the electronic device 100 can automatically save the pictures with higher scores in dimension 1 and delete the other pictures.
  • for example, dimension 1 is the comprehensive recommendation 3040A in the user interface 3000 shown in (C) of FIG. 16, and the electronic device 100 continuously shoots multiple pictures in response to the operation on the shooting control 1020 in the user interface 1000 shown in (A) of FIG. 16.
  • the methods provided by the embodiments of this application can be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • when implemented using software, the methods may be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions.
  • when the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are generated in whole or in part.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, a network device, a user equipment, or other programmable device.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center through wired (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless (such as infrared, radio, or microwave) means.
  • the computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media.
  • the available media can be magnetic media (for example, floppy disks, hard disks, or tapes), optical media (for example, digital video discs (DVDs)), or semiconductor media (for example, solid state disks (SSDs)), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Library & Information Science (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Signal Processing (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The present application provides an image recommendation method and an electronic device. The method comprises: an electronic device displaying an image acquisition interface; in response to an operation for an image acquisition button of the image acquisition interface, acquiring a first image sequence by means of an image acquisition device; generating a second image sequence on the basis of the first image sequence, wherein the second image sequence comprises images having timestamps different from those of images in the first image sequence, and/or images having observation viewing angles different from those of images in the first image sequence; determining a third image sequence from the first image sequence and the second image sequence, wherein the third image sequence comprises N images, on a first dimension, which have scores ranked at the first N positions, and M images, on a second dimension, which have scores ranked at the first M positions, and N and M are positive integers; and recommending the third image sequence. An artificial intelligence (AI) technology can be used to recommend to a user the "best moment" image meeting requirements of the user, so that the user can obtain a satisfactory image more conveniently and more quickly.

Description

An image recommendation method and electronic device

This application claims priority to the Chinese patent application filed with the China Patent Office on August 30, 2022, with the application number 202211049479.9 and the application title "A picture recommendation method and electronic device", the entire content of which is incorporated into this application by reference.

Technical field

The present application relates to the field of computer technology, and in particular, to a picture recommendation method and an electronic device.

Background

It is difficult for users to capture satisfactory pictures in some non-static scenes. For example, when a user uses a mobile phone to photograph a moving human body, it is easy to capture pictures that do not meet the user's requirements, for example, pictures in which the limbs are not stretched out in a jumping pose, the human body is motion-blurred, the eyes are closed, or passers-by intrude into the frame. Although mobile phones can currently shoot multiple pictures continuously for the user to choose from, these pictures are often large in number and have small differences, so the user's selection is time-consuming and labor-intensive, and the user cannot obtain satisfactory pictures conveniently and quickly.

Summary

This application discloses a picture recommendation method and an electronic device, which can recommend pictures that meet the user's needs, allowing the user to obtain satisfactory pictures more conveniently and quickly.
第一方面,本申请实施例提供一种图片推荐方法,应用于电子设备,该方法包括:显示图像采集界面;响应针对所述图像采集界面的图像采集按钮的第一操作,通过图像采集装置采集获得第一图片序列;基于所述第一图片序列生成第二图片序列,所述第二图片序列包括时间戳和所述第一图片序列中的图片的时间戳不同的图片,和/或,观察视角和所述第一图片序列中的图片的观察视角不同的图片;从所述第一图片序列和所述第二图片序列中确定出第三图片序列,所述第三图片序列包括第一维度的得分排列在前N位的N张图片,以及第二维度的得分排列在前M位的M张图片,N和M为正整数;推荐所述第三图片序列。In a first aspect, embodiments of the present application provide a picture recommendation method, which is applied to electronic equipment. The method includes: displaying an image collection interface; responding to a first operation of an image collection button on the image collection interface, collecting data through an image collection device. Obtaining a first sequence of pictures; generating a second sequence of pictures based on the first sequence of pictures, the second sequence of pictures including pictures with timestamps different from those of pictures in the first sequence of pictures, and/or observing Pictures with different viewing angles from the pictures in the first picture sequence; a third picture sequence is determined from the first picture sequence and the second picture sequence, and the third picture sequence includes a first dimension N pictures whose scores are ranked in the top N positions, and M pictures whose scores in the second dimension are ranked in the top M positions, N and M are positive integers; the third picture sequence is recommended.
在一种可能的实现方式中,所述第二图片序列包括时间戳和所述第一图片序列中的图片的时间戳不同的图片,具体包括:所述第二图片序列中的任意一张图片的时间戳和所述第一图片序列中的全部图片的时间戳不同。In a possible implementation, the second picture sequence includes pictures with different timestamps from the timestamps of pictures in the first picture sequence, specifically including: any picture in the second picture sequence The timestamp of is different from the timestamps of all pictures in the first picture sequence.
在一种可能的实现方式中,所述第二图片序列包括观察视角和所述第一图片序列中的图片的观察视角不同的图片,具体包括:所述第二图片序列中的任意一张图片的观察视角和所述第一图片序列中时间戳和该图片的时间戳相同的图片的观察视角不同。In a possible implementation, the second picture sequence includes pictures with different viewing angles from the viewing angles of the pictures in the first picture sequence, specifically including: any picture in the second picture sequence The observation angle is different from the observation angle of the picture in the first picture sequence whose timestamp is the same as the timestamp of the picture.
在上述方法中,电子设备推荐的第三图片序列是从第一图片序列和第二图片序列中筛选出来的,第二图片序列是基于采集的第一图片序列在时间和/或空间维度生成的,从而在有限的采集时间内增加了差异化、高质量的候选图片,用户获取到所需图片的概率大大提升。并且,电子设备推荐的第三图片序列包括在第一维度上较优的N张图片和在第二维度上较优的M张图片,从而可以满足不同用户的不同需求,让用户更加方便快捷地获取到满意的图片。In the above method, the third picture sequence recommended by the electronic device is selected from the first picture sequence and the second picture sequence, and the second picture sequence is generated in the time and/or space dimension based on the collected first picture sequence. , thereby adding differentiated, high-quality candidate pictures within the limited collection time, and the probability of users obtaining the required pictures is greatly increased. Moreover, the third picture sequence recommended by the electronic device includes N pictures that are better in the first dimension and M pictures that are better in the second dimension, thereby meeting the different needs of different users and allowing users to more conveniently and quickly Get satisfactory pictures.
在一种可能的实现方式中,所述显示图像采集界面,响应针对所述图像采集界面的图像采集按钮的第一操作,通过图像采集装置采集获得第一图片序列,可以替换为:响应用于选择所述第一图片序列的操作,从所述电子设备的图库中获取所述第一图片序列。In a possible implementation, the displayed image acquisition interface, in response to the first operation of the image acquisition button of the image acquisition interface, and the first image sequence is acquired through the image acquisition device, may be replaced by: responding to The operation of selecting the first picture sequence is to obtain the first picture sequence from the gallery of the electronic device.
在上述方法中,第一图片序列也可以是从图库中获取的,满足不同场景下的不同用户需求,拓宽应用场景。In the above method, the first picture sequence can also be obtained from the gallery to meet different user needs in different scenarios and broaden application scenarios.
In a possible implementation, the steps of determining from the first picture sequence and the second picture sequence a third picture sequence that includes the N pictures ranked in the top N positions by the score of the first dimension and the M pictures ranked in the top M positions by the score of the second dimension, N and M being positive integers, and recommending the third picture sequence, may be replaced with: determining, from the first picture sequence and the second picture sequence, the P pictures ranked in the top P positions by the score of a third dimension, P being a positive integer; saving the P pictures; and deleting the pictures in the first picture sequence and the second picture sequence other than the P pictures, where the third dimension is determined by the electronic device in response to an operation on a settings interface, or the third dimension is a dimension of user preference learned by the electronic device.
In the above method, the electronic device can determine, from the first picture sequence and the second picture sequence, the P pictures that are better in the third dimension manually set by the user, or in the third dimension of learned user preference, then save these P pictures and delete the other pictures. The user can thus obtain the desired pictures quickly and conveniently without manual selection, which greatly improves the user experience.
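Under the same assumptions as the previous sketch, the keep-the-top-P-and-delete-the-rest behaviour described here could look roughly as follows; the save and delete callbacks are hypothetical placeholders rather than APIs of this application.

```python
from typing import Callable, List

def keep_top_p(candidates: List[dict],
               score_dim3: Callable[[dict], float],
               p: int,
               save: Callable[[dict], None],
               delete: Callable[[dict], None]) -> List[dict]:
    """Save the P pictures that score highest in the third dimension (user-set or
    learned preference) and delete every other candidate picture."""
    ranked = sorted(candidates, key=score_dim3, reverse=True)
    kept, dropped = ranked[:p], ranked[p:]
    for pic in kept:
        save(pic)
    for pic in dropped:
        delete(pic)
    return kept
```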
In a possible implementation, the first dimension or the second dimension is any one of the following: a composite dimension, the position of the photographed subject in the picture, the stretch of the photographed subject's movement in the picture, the facial expression of the photographed subject in the picture, or the image quality of the picture.
In a possible implementation, recommending the third picture sequence includes: displaying a first interface, where the first interface displays first information, second information, the N pictures, and the M pictures, the first information indicates the first dimension and is associated with the N pictures, and the second information indicates the second dimension and is associated with the M pictures.
In the above method, the user can obtain the N pictures associated with the first dimension from the first information, and the M pictures associated with the second dimension from the second information. This display manner is simple and clear, makes it convenient for the user to obtain pictures in the desired dimension, and improves the user experience.
In a possible implementation, recommending the third picture sequence includes: displaying a second interface, where the second interface displays K pictures, K being a positive integer greater than or equal to N. The K pictures include the N pictures and (K-N) pictures other than the N pictures, and the (K-N) pictures belong to the first picture sequence and/or the second picture sequence. The K pictures include a first picture and a second picture, the score of the first picture in the first dimension is greater than the score of the second picture in the first dimension, and the first picture is displayed before the second picture in the second interface.
In the above method, the electronic device can preferentially display pictures with higher scores in the first dimension, avoiding the situation where a higher-scoring picture is displayed near the end and the user needs more time to find it. This further improves the efficiency with which the user obtains the desired picture and improves the user experience.
In a possible implementation, the (K-N) pictures do not belong to the third picture sequence; for example, the K pictures are the pictures in the first picture sequence and the second picture sequence whose scores in the first dimension rank in the top K positions.
In the above method, the electronic device can also display pictures other than the recommended third picture sequence (i.e., the case where K is greater than N), thereby providing more candidate pictures for the user to choose from. This avoids the situation where none of the pictures in the third picture sequence meet the user's needs and the user cannot obtain the desired picture, further ensuring the user experience.
In a possible implementation, the method further includes: receiving a second operation for selecting at least one picture, where the at least one picture belongs to the first picture sequence and/or the second picture sequence; saving the at least one picture; and deleting the pictures in the first picture sequence and the second picture sequence other than the at least one picture.
在上述方法中,电子设备可以保存用户选择的至少一张图片,以及删除其他图片,避免用户不需要的其他图片占用设备的存储空间,减少设备的存储压力。In the above method, the electronic device can save at least one picture selected by the user and delete other pictures to prevent other pictures that the user does not need from occupying the storage space of the device and reduce the storage pressure of the device.
In a possible implementation, the third picture sequence is obtained according to a first strategy; the method further includes: receiving a second operation for selecting at least one picture, where the at least one picture belongs to the first picture sequence and/or the second picture sequence; and updating the first strategy according to the third picture sequence and the at least one picture.
In the above method, the electronic device can update the first strategy used to determine the recommended third picture sequence according to the at least one picture selected by the user, i.e., it learns the first strategy from the user's habits and personalizes it, so that subsequent recommended pictures determined according to the first strategy better match the current user's needs and improve the user experience.
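As one possible reading of this on-device self-learning step, the sketch below nudges the per-dimension weights of a hypothetical weighted-sum recommendation strategy toward the dimensions in which the pictures the user actually kept score well; the weight representation and learning rate are assumptions, not the strategy defined by this application.

```python
from typing import Dict, Hashable, List

def update_strategy(weights: Dict[str, float],
                    scores: Dict[Hashable, Dict[str, float]],
                    selected: List[Hashable],
                    recommended: List[Hashable],
                    lr: float = 0.1) -> Dict[str, float]:
    """Shift dimension weights toward dimensions where user-kept pictures outscore
    the pictures the strategy had recommended, then renormalise."""
    for dim in weights:
        kept_avg = sum(scores[p][dim] for p in selected) / max(len(selected), 1)
        shown_avg = sum(scores[p][dim] for p in recommended) / max(len(recommended), 1)
        weights[dim] = max(0.0, weights[dim] + lr * (kept_avg - shown_avg))
    total = sum(weights.values()) or 1.0
    return {dim: w / total for dim, w in weights.items()}
```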
In a possible implementation, generating the second picture sequence based on the first picture sequence includes: generating a fourth picture sequence based on the first picture sequence, where the timestamps of the pictures in the fourth picture sequence differ from the timestamps of the pictures in the first picture sequence; and generating a fifth picture sequence based on the first picture sequence and the fourth picture sequence, where the viewing angles of the pictures in the fifth picture sequence differ from the viewing angles of the pictures in the first picture sequence and the fourth picture sequence. The second picture sequence includes the fourth picture sequence and the fifth picture sequence.
In a possible implementation, that the timestamps of the pictures in the fourth picture sequence differ from the timestamps of the pictures in the first picture sequence specifically means that the timestamp of any picture in the fourth picture sequence differs from the timestamps of all pictures in the first picture sequence.
In a possible implementation, that the viewing angles of the pictures in the fifth picture sequence differ from the viewing angles of the pictures in the first picture sequence and the fourth picture sequence specifically means that the viewing angle of any picture in the fifth picture sequence differs from the viewing angle of the picture in the first picture sequence or the fourth picture sequence that has the same timestamp as that picture.
In the above method, the electronic device can first generate, in the time dimension, a fourth picture sequence whose timestamps differ from those of the captured first picture sequence, and then generate, in the space dimension, a fifth picture sequence whose viewing angles differ from those of the first and fourth picture sequences. Compared with generating only pictures whose viewing angles differ from the first picture sequence, this further expands the set of high-quality, differentiated candidate pictures and further increases the probability that the user obtains the desired picture.
In a possible implementation, generating the fifth picture sequence based on the first picture sequence and the fourth picture sequence includes: training a spatial perception model based on the first picture sequence and the fourth picture sequence; obtaining a first spatial parameter that differs from the spatial parameters of the pictures in the first picture sequence and the second picture sequence; and using the first spatial parameter as the input of the spatial perception model to obtain an output, where the output is the fifth picture sequence.
在一种可能的实现方式中,所述空间感知模型是经过多轮迭代训练得到的。In a possible implementation, the spatial perception model is obtained through multiple rounds of iterative training.
In a possible implementation, the spatial parameters include the spatial coordinates of a picture and the pose of the image capture apparatus used to capture the picture.
在上述方法中,空间感知模型是基于当前采集的第一图片序列和根据第一图片序列生成的第四图片序列迭代训练得到的,因此,空间感知模型可以充分学习到当前拍摄场景的情况,通过空间感知模型获取到的第五图片序列的准确度更高,即候选图片的准确度更高,进一步提升用户获取到所需图片的概率。In the above method, the spatial perception model is iteratively trained based on the currently collected first picture sequence and the fourth picture sequence generated based on the first picture sequence. Therefore, the spatial perception model can fully learn the situation of the current shooting scene. The accuracy of the fifth picture sequence obtained by the spatial perception model is higher, that is, the accuracy of the candidate pictures is higher, which further increases the probability that the user can obtain the desired picture.
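A minimal sketch of what training and querying such a spatial perception model might look like is given below, assuming a NeRF-style network that renders a picture from a camera pose; the network, the loss, and the optimiser are placeholders for illustration rather than the model defined by this application.

```python
import torch

def train_spatial_model(model: torch.nn.Module,
                        pictures: list,          # tensors from the first + fourth sequences
                        poses: list,             # matching spatial parameters (pose/coords)
                        rounds: int = 5, lr: float = 1e-3) -> torch.nn.Module:
    """Fit the view-synthesis model to the captured and time-interpolated pictures
    over multiple rounds of iterative training."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(rounds):
        for img, pose in zip(pictures, poses):
            pred = model(pose)                   # render the picture for this pose
            loss = torch.nn.functional.mse_loss(pred, img)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model

def render_fifth_sequence(model: torch.nn.Module, new_poses: list) -> list:
    """Query the trained model at spatial parameters not present in the inputs."""
    with torch.no_grad():
        return [model(pose) for pose in new_poses]
```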
在一种可能的实现方式中,所述基于所述第一图片序列生成第二图片序列,包括:基于所述第一图片序列训练得到时空感知模型;获取第二空间参数和第一时间参数,所述第二空间参数包括和所述第一图片序列中的图片的空间参数不同的空间参数,所述第一时间参数包括和所述第一图片序列中的图片的时间参数不同的时间参数;将所述第二空间参数和所述第一时间参数作为所述时空感知模型的输入获取输出,所述输出为所述第二图片序列。In a possible implementation, generating a second picture sequence based on the first picture sequence includes: training to obtain a space-time perception model based on the first picture sequence; obtaining second spatial parameters and first temporal parameters, The second spatial parameters include spatial parameters that are different from the spatial parameters of the pictures in the first picture sequence, and the first temporal parameters include time parameters that are different from the temporal parameters of the pictures in the first picture sequence; The second spatial parameter and the first temporal parameter are used as inputs of the spatio-temporal perception model to obtain an output, and the output is the second picture sequence.
在一种可能的实现方式中,所述时空感知模型是经过多轮迭代训练得到的。In a possible implementation, the space-time perception model is obtained through multiple rounds of iterative training.
In a possible implementation, the temporal parameter includes the timestamp of a picture, or a time embedding derived from the timestamp of the picture.
In the above method, the spatio-temporal perception model is iteratively trained based on the currently captured first picture sequence, so it can fully learn the current shooting scene. The second picture sequence obtained through the spatio-temporal perception model is therefore more accurate, i.e., the candidate pictures are more accurate, which further increases the probability that the user obtains the desired picture.
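Analogously, the spatio-temporal variant could be queried with both a spatial and a temporal parameter, as in the sketch below; the model interface (one picture per pose/timestamp pair) is an assumption for illustration only.

```python
import torch

def render_second_sequence(spacetime_model: torch.nn.Module,
                           poses: list, timestamps: list) -> list:
    """Query a model trained on the first picture sequence at spatial and temporal
    parameters that differ from those of the captured pictures."""
    with torch.no_grad():
        return [spacetime_model(pose, t) for pose in poses for t in timestamps]
```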
In a second aspect, an embodiment of this application provides an electronic device, including a transceiver, a processor, and a memory. The memory is configured to store a computer program, and the processor invokes the computer program to perform the picture recommendation method provided in the first aspect of the embodiments of this application and any implementation of the first aspect.
In a third aspect, an embodiment of this application provides a computer storage medium. The computer storage medium stores a computer program, and when the computer program is executed by a processor, it is used to perform the picture recommendation method provided in the first aspect of the embodiments of this application and any implementation of the first aspect.
In a fourth aspect, an embodiment of this application provides a computer program product. When the computer program product runs on an electronic device, the electronic device is caused to perform the picture recommendation method provided in the first aspect of the embodiments of this application and any implementation of the first aspect.
In a fifth aspect, an embodiment of this application provides an electronic device, and the electronic device includes means for performing the method, or includes the apparatus, described in any embodiment of this application. The electronic device is, for example, a chip.
附图说明Description of drawings
以下对本申请用到的附图进行介绍。The drawings used in this application are introduced below.
图1是本申请提供的一种电子设备的硬件结构示意图;Figure 1 is a schematic diagram of the hardware structure of an electronic device provided by this application;
图2是本申请提供的一种电子设备的软件架构示意图;Figure 2 is a schematic diagram of the software architecture of an electronic device provided by this application;
图3是本申请提供的一种图片推荐方法的流程示意图;Figure 3 is a schematic flow chart of an image recommendation method provided by this application;
图4是本申请提供的一种图片生成过程的示意图;Figure 4 is a schematic diagram of an image generation process provided by this application;
图5是本申请提供的又一种图片生成过程的示意图;Figure 5 is a schematic diagram of another image generation process provided by this application;
图6是本申请提供的一种人体的骨骼位置点的示意图;Figure 6 is a schematic diagram of the skeletal position points of a human body provided by this application;
图7是本申请提供的一种个性化数据集的获取过程的示意图;Figure 7 is a schematic diagram of the acquisition process of a personalized data set provided by this application;
图8-图15是本申请提供的又一种电子设备的软件架构示意图;Figures 8-15 are schematic diagrams of the software architecture of yet another electronic device provided by this application;
图16-图19是本申请提供的一些用户界面实施例的示意图。Figures 16-19 are schematic diagrams of some user interface embodiments provided by this application.
Detailed Description of Embodiments
The technical solutions in the embodiments of this application are described below with reference to the accompanying drawings. In the description of the embodiments of this application, unless otherwise stated, "/" means "or"; for example, A/B may mean A or B. "And/or" in this text merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A exists alone, both A and B exist, or B exists alone. In addition, in the description of the embodiments of this application, "multiple" means two or more than two.
以下,术语“第一”、“第二”仅用于描述目的,而不能理解为暗示或暗示相对重要性或者隐含指明所指示的技术特征的数量。由此,限定有“第一”、“第二”的特征可以明示或者隐含地包括一个或者更多个该特征,在本申请实施例的描述中,除非另有说明,“多个”的含义是两个或两个以上。Hereinafter, the terms “first” and “second” are used for descriptive purposes only and shall not be understood as implying or implying relative importance or implicitly specifying the quantity of indicated technical features. Therefore, the features defined as “first” and “second” may explicitly or implicitly include one or more of the features. In the description of the embodiments of this application, unless otherwise specified, “plurality” The meaning is two or more.
When a user snaps pictures in a non-static scene, for example when capturing a moving object, the electronic device can shoot multiple pictures in a burst, select the pictures with better image quality/sharpness from them, and recommend these to the user; based on the recommendation, the user can select the desired picture from the multiple pictures. However, the following technical problems still prevent the user from obtaining a satisfactory picture conveniently and quickly.
技术问题一:电子设备是按照时间顺序执行连拍操作的,即在有限的拍摄时间内成像,可能存在拍摄的全部图像均不符合用户需求的情况;Technical problem 1: Electronic equipment performs continuous shooting operations in chronological order, that is, imaging within a limited shooting time. There may be situations where all the images taken do not meet the user's needs;
技术问题二:电子设备仅根据画质为用户推荐图片,即图片推荐策略简单,可能存在推荐的图片质量较低,不符合用户需求的情况;Technical problem two: Electronic devices only recommend pictures to users based on picture quality, that is, the picture recommendation strategy is simple, and there may be cases where the recommended pictures are of low quality and do not meet user needs;
技术问题三:所有用户的图片推荐策略相同,没有考虑到不同用户所需的图片可能不同,导致推荐的图片质量较低,不符合用户需求的情况。Technical problem three: The picture recommendation strategy for all users is the same, and it does not take into account that different users may need different pictures, resulting in low-quality recommended pictures that do not meet user needs.
This application provides a picture recommendation method applied to an electronic device, which allows the user to obtain a satisfactory picture conveniently and quickly and improves the user experience. In one implementation, the electronic device can generate more pictures in the time and/or space dimension based on the captured pictures for the user to select from, i.e., it adds differentiated, high-quality candidate pictures within the limited shooting time, which solves technical problem 1 above. In one implementation, the electronic device can also recommend pictures to the user from multiple dimensions such as a composite dimension, the position of the photographed subject (subject position for short), the stretch of the subject's movement, the subject's facial expression, and image quality, effectively optimizing the picture recommendation strategy and solving technical problem 2 above. In one implementation, the electronic device can further update the picture recommendation strategy based on the pictures selected by the user (which can be understood as on-device self-learning), achieving personalization and continuous updating of the picture recommendation strategy and solving technical problem 3 above. The user can thus obtain a satisfactory picture conveniently and quickly, improving the user experience.
In this application, the electronic device may be a mobile phone, a tablet computer, a handheld computer, a desktop computer, a laptop computer, an ultra-mobile personal computer (UMPC), a netbook, a cellular phone, a personal digital assistant (PDA), a smart home device such as a smart TV or smart camera, a wearable device such as a smart band, smart watch, or smart glasses, an extended reality (XR) device such as an augmented reality (AR), virtual reality (VR), or mixed reality (MR) device, a vehicle-mounted device, or a smart city device. The embodiments of this application place no particular restriction on the specific type of the electronic device.
接下来介绍本申请实施例提供的示例性的电子设备100。Next, an exemplary electronic device 100 provided by an embodiment of the present application is introduced.
图1示例性示出了一种电子设备100的硬件结构示意图。FIG. 1 exemplarily shows a schematic diagram of the hardware structure of an electronic device 100 .
电子设备100可以包括处理器110,外部存储器接口120,内部存储器121,通用串行总线(universal serial bus,USB)接口130,充电管理模块140,电源管理模块141,电池142,天线1,天线2,移动通信模块150,无线通信模块160,音频模块170,扬声器170A,受话器170B,麦克风170C,耳机接口170D,传感器模块180,按键190,马达191,指示器192,摄像头193,显示屏194,以及用户标识模块(subscriber identification module,SIM)卡接口195等。其中传感器模块180可以包括压力传感器180A,陀螺仪传感器180B,气压传感器180C,磁传感器180D,加速度传感器180E,距离传感器180F,接近光传感器180G,指纹传感器180H,温度传感器180J,触摸传感器180K,环境光传感器180L,骨传导传感器180M等。The electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2 , mobile communication module 150, wireless communication module 160, audio module 170, speaker 170A, receiver 170B, microphone 170C, headphone interface 170D, sensor module 180, button 190, motor 191, indicator 192, camera 193, display screen 194, and Subscriber identification module (SIM) card interface 195, etc. The sensor module 180 may include a pressure sensor 180A, a gyro sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, and ambient light. Sensor 180L, bone conduction sensor 180M, etc.
可以理解的是,本发明实施例示意的结构并不构成对电子设备100的具体限定。在本申请另一些实施例中,电子设备100可以包括比图示更多或更少的部件,或者组合某些部件,或者拆分某些部件,或者不同的部件布置。图示的部件可以以硬件,软件或软件和硬件的组合实现。It can be understood that the structure illustrated in the embodiment of the present invention does not constitute a specific limitation on the electronic device 100 . In other embodiments of the present application, the electronic device 100 may include more or fewer components than shown in the figures, or some components may be combined, some components may be separated, or some components may be arranged differently. The components illustrated may be implemented in hardware, software, or a combination of software and hardware.
处理器110可以包括一个或多个处理单元,例如:处理器110可以包括应用处理器(application processor,AP),调制解调处理器,图形处理器(graphics processing unit,GPU),图像信号处理器(image signal processor,ISP),控制器,视频编解码器,数字信号处理器(digital signal processor,DSP),基带处理器,和/或神经网络处理器(neural-network processing unit,NPU)等。其中,不同的处理单元可以是独立的器件,也可以集成在一个或多个处理器中。The processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (application processor, AP), a modem processor, a graphics processing unit (GPU), and an image signal processor. (image signal processor, ISP), controller, video codec, digital signal processor (digital signal processor, DSP), baseband processor, and/or neural network processor (neural-network processing unit, NPU), etc. Among them, different processing units can be independent devices or integrated in one or more processors.
控制器可以根据指令操作码和时序信号,产生操作控制信号,完成取指令和执行指令的控制。The controller can generate operation control signals based on the instruction operation code and timing signals to complete the control of fetching and executing instructions.
处理器110中还可以设置存储器,用于存储指令和数据。在一些实施例中,处理器110中的存储器为高速缓冲存储器。该存储器可以保存处理器110刚用过或循环使用的指令或数据。如果处理器110需要再次使用该指令或数据,可从所述存储器中直接调用。避免了重复存取,减少了处理器110的等待时间,因而提高了系统的效率。The processor 110 may also be provided with a memory for storing instructions and data. In some embodiments, the memory in processor 110 is cache memory. This memory may hold instructions or data that have been recently used or recycled by processor 110 . If the processor 110 needs to use the instructions or data again, it can be called directly from the memory. Repeated access is avoided and the waiting time of the processor 110 is reduced, thus improving the efficiency of the system.
在一些实施例中,处理器110可以包括一个或多个接口。接口可以包括集成电路(inter-integrated circuit,I2C)接口,集成电路内置音频(inter-integrated circuit sound,I2S)接口,脉冲编码调制(pulse code modulation,PCM)接口,通用异步收发传输器(universal asynchronous receiver/transmitter,UART)接口,移动产业处理器接口(mobile industry processor interface,MIPI),通用输入输出(general-purpose input/output,GPIO)接口,用户标识模块(subscriber identity module,SIM)接口,和/或通用串行总线(universal serial bus,USB)接口等。 In some embodiments, processor 110 may include one or more interfaces. Interfaces may include integrated circuit (inter-integrated circuit, I2C) interface, integrated circuit built-in audio (inter-integrated circuit sound, I2S) interface, pulse code modulation (PCM) interface, universal asynchronous receiver and transmitter (universal asynchronous receiver/transmitter (UART) interface, mobile industry processor interface (MIPI), general-purpose input/output (GPIO) interface, subscriber identity module (SIM) interface, and /or universal serial bus (USB) interface, etc.
充电管理模块140用于从充电器接收充电输入。The charging management module 140 is used to receive charging input from the charger.
电源管理模块141用于连接电池142,充电管理模块140与处理器110。电源管理模块141接收电池142和/或充电管理模块140的输入,为处理器110,内部存储器121,显示屏194,摄像头193,和无线通信模块160等供电。The power management module 141 is used to connect the battery 142, the charging management module 140 and the processor 110. The power management module 141 receives input from the battery 142 and/or the charging management module 140, and supplies power to the processor 110, the internal memory 121, the display screen 194, the camera 193, the wireless communication module 160, and the like.
电子设备100的无线通信功能可以通过天线1,天线2,移动通信模块150,无线通信模块160,调制解调处理器以及基带处理器等实现。The wireless communication function of the electronic device 100 can be implemented through the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor and the baseband processor.
天线1和天线2用于发射和接收电磁波信号。电子设备100中的每个天线可用于覆盖单个或多个通信频带。不同的天线还可以复用,以提高天线的利用率。例如:可以将天线1复用为无线局域网的分集天线。Antenna 1 and Antenna 2 are used to transmit and receive electromagnetic wave signals. Each antenna in electronic device 100 may be used to cover a single or multiple communication frequency bands. Different antennas can also be reused to improve antenna utilization. For example: Antenna 1 can be reused as a diversity antenna for a wireless LAN.
移动通信模块150可以提供应用在电子设备100上的包括2G/3G/4G/5G/6G等无线通信的解决方案。移动通信模块150可以包括至少一个滤波器,开关,功率放大器,低噪声放大器(low noise amplifier,LNA)等。移动通信模块150可以由天线1接收电磁波,并对接收的电磁波进行滤波,放大等处理,传送至调制解调处理器进行解调。移动通信模块150还可以对经调制解调处理器调制后的信号放大,经天线1转为电磁波辐射出去。在一种实施方式中,移动通信模块150的至少部分功能模块可以被设置于处理器110中。在一种实施方式中,移动通信模块150的至少部分功能模块可以与处理器110的至少部分模块设置在同一个器件中。The mobile communication module 150 can provide wireless communication solutions including 2G/3G/4G/5G/6G applied on the electronic device 100 . The mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (LNA), etc. The mobile communication module 150 can receive electromagnetic waves through the antenna 1, perform filtering, amplification and other processing on the received electromagnetic waves, and transmit them to the modem processor for demodulation. The mobile communication module 150 can also amplify the signal modulated by the modem processor and convert it into electromagnetic waves through the antenna 1 for radiation. In one implementation, at least part of the functional modules of the mobile communication module 150 may be disposed in the processor 110 . In one implementation, at least part of the functional modules of the mobile communication module 150 and at least part of the modules of the processor 110 may be provided in the same device.
调制解调处理器可以包括调制器和解调器。其中,调制器用于将待发送的低频基带信号调制成中高频信号。解调器用于将接收的电磁波信号解调为低频基带信号。随后解调器将解调得到的低频基带信号传送至基带处理器处理。低频基带信号经基带处理器处理后,被传递给应用处理器。应用处理器通过音频设备(不限于扬声器170A,受话器170B等)输出声音信号,或通过显示屏194显示图像或视频。在一种实施方式中,调制解调处理器可以是独立的器件。在另一种实施方式中,调制解调处理器可以独立于处理器110,与移动通信模块150或其他功能模块设置在同一个器件中。A modem processor may include a modulator and a demodulator. Among them, the modulator is used to modulate the low-frequency baseband signal to be sent into a medium-high frequency signal. The demodulator is used to demodulate the received electromagnetic wave signal into a low-frequency baseband signal. The demodulator then transmits the demodulated low-frequency baseband signal to the baseband processor for processing. After the low-frequency baseband signal is processed by the baseband processor, it is passed to the application processor. The application processor outputs sound signals through audio devices (not limited to speaker 170A, receiver 170B, etc.), or displays images or videos through display screen 194. In one implementation, the modem processor may be a stand-alone device. In another implementation, the modem processor may be independent of the processor 110 and may be provided in the same device as the mobile communication module 150 or other functional modules.
无线通信模块160可以提供应用在电子设备100上的包括无线局域网(wireless local area networks,WLAN)(如无线保真(wireless fidelity,Wi-Fi)网络),蓝牙(bluetooth,BT),全球导航卫星系统(global navigation satellite system,GNSS),调频(frequency modulation,FM),近距离无线通信技术(near field communication,NFC),红外技术(infrared,IR)等无线通信的解决方案。无线通信模块160可以是集成至少一个通信处理模块的一个或多个器件。无线通信模块160经由天线2接收电磁波,将电磁波信号调频以及滤波处理,将处理后的信号发送到处理器110。无线通信模块160还可以从处理器110接收待发送的信号,对其进行调频,放大,经天线2转为电磁波辐射出去。The wireless communication module 160 can provide applications on the electronic device 100 including wireless local area networks (WLAN) (such as wireless fidelity (Wi-Fi) network), Bluetooth (bluetooth, BT), and global navigation satellites. System (global navigation satellite system, GNSS), frequency modulation (frequency modulation, FM), near field communication technology (near field communication, NFC), infrared technology (infrared, IR) and other wireless communication solutions. The wireless communication module 160 may be one or more devices integrating at least one communication processing module. The wireless communication module 160 receives electromagnetic waves via the antenna 2 , frequency modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110 . The wireless communication module 160 can also receive the signal to be sent from the processor 110, frequency modulate it, amplify it, and convert it into electromagnetic waves through the antenna 2 for radiation.
在一种实施方式中,电子设备100的天线1和移动通信模块150耦合,天线2和无线通信模块160耦合,使得电子设备100可以通过无线通信技术与网络以及其他设备通信。上述无线通信技术可以包括全球移动通讯系统(global system for mobile communications,GSM),通用分组无线服务(general packet radio service,GPRS),码分多址接入(code division multiple access,CDMA),宽带码分多址(wideband code division multiple access,WCDMA),时分码分多址(time-division code division multiple access,TD-SCDMA),长期演进(long term evolution,LTE),BT,GNSS,WLAN,NFC,FM,和/或IR技术等。所述GNSS可以包括全球卫星定位系统(global positioning system,GPS),全球导航卫星系统(global navigation satellite system,GLONASS),北斗卫星导航系统(beidou navigation satellite system,BDS),准天顶卫星系统(quasi-zenith satellite system,QZSS)和/或星基增强系统(satellite based augmentation systems,SBAS)。In one implementation, the antenna 1 of the electronic device 100 is coupled to the mobile communication module 150, and the antenna 2 is coupled to the wireless communication module 160, so that the electronic device 100 can communicate with the network and other devices through wireless communication technology. The above-mentioned wireless communication technologies may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), broadband code Wideband code division multiple access (WCDMA), time-division code division multiple access (TD-SCDMA), long term evolution (long term evolution, LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technology, etc. The GNSS may include global positioning system (GPS), global navigation satellite system (GLONASS), Beidou navigation satellite system (BDS), quasi-zenith satellite system (quasi) -zenith satellite system (QZSS) and/or satellite based augmentation systems (SBAS).
电子设备100通过GPU,显示屏194,以及应用处理器等实现显示功能。GPU为图像处理的微处理器,连接显示屏194和应用处理器。GPU用于执行数学和几何计算,用于图形渲染。处理器110可包括一个或多个GPU,其执行程序指令以生成或改变显示信息。The electronic device 100 implements display functions through a GPU, a display screen 194, an application processor, and the like. The GPU is an image processing microprocessor and is connected to the display screen 194 and the application processor. GPUs are used to perform mathematical and geometric calculations for graphics rendering. Processor 110 may include one or more GPUs that execute program instructions to generate or alter display information.
显示屏194用于显示图像,视频等。显示屏194包括显示面板。显示面板可以采用液晶显示屏(liquid crystal display,LCD),有机发光二极管(organic light-emitting diode,OLED),有源矩阵有机发光二极体或主动矩阵有机发光二极体(active-matrix organic light emitting diode的,AMOLED),柔性发光二极管(flex light-emitting diode,FLED),Miniled,MicroLed,Micro-oLed,量子点发光二极管(quantum dot light emitting diodes,QLED)等。在一种实施方式中,电子设备100可以包括1个或N个显示屏194,N为大于1的正整数。The display screen 194 is used to display images, videos, etc. Display 194 includes a display panel. The display panel can use a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active matrix organic light emitting diode or an active matrix organic light emitting diode (active-matrix organic light emitting diode). emitting diode (AMOLED), flexible light-emitting diode (FLED), Miniled, MicroLed, Micro-oLed, quantum dot light emitting diode (QLED), etc. In one implementation, the electronic device 100 may include 1 or N display screens 194, where N is a positive integer greater than 1.
电子设备100可以通过ISP,摄像头193,视频编解码器,GPU,显示屏194以及应用处理器等实现拍摄功能。The electronic device 100 can implement the shooting function through an ISP, a camera 193, a video codec, a GPU, a display screen 194, an application processor, and the like.
The ISP is used to process data fed back by the camera 193. For example, when a photo is taken, the shutter opens and light is transmitted through the lens to the camera's photosensitive element, which converts the optical signal into an electrical signal and passes it to the ISP for processing, turning it into an image visible to the naked eye. The ISP can also perform algorithm optimization on the noise, brightness, and color of the image, and can optimize parameters such as the exposure and color temperature of the shooting scene. In one implementation, the ISP may be provided in the camera 193.
摄像头193用于捕获静态图像或视频。物体通过镜头生成光学图像投射到感光元件。感光元件可以是电荷耦合器件(charge coupled device,CCD)或互补金属氧化物半导体(complementary metal-oxide-semiconductor,CMOS)光电晶体管。感光元件把光信号转换成电信号,之后将电信号传递给ISP转换成数字图像信号。ISP将数字图像信号输出到DSP加工处理。DSP将数字图像信号转换成标准的RGB,YUV等格式的图像信号。在一种实施方式中,电子设备100可以包括1个或N个摄像头193,N为大于1的正整数。Camera 193 is used to capture still images or video. The object passes through the lens to produce an optical image that is projected onto the photosensitive element. The photosensitive element can be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The photosensitive element converts the optical signal into an electrical signal, and then passes the electrical signal to the ISP to convert it into a digital image signal. ISP outputs digital image signals to DSP for processing. DSP converts digital image signals into standard RGB, YUV and other format image signals. In one implementation, the electronic device 100 may include 1 or N cameras 193, where N is a positive integer greater than 1.
外部存储器接口120可以用于连接外部存储卡,例如Micro SD卡,实现扩展电子设备100的存储能力。The external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device 100.
内部存储器121可以用于存储计算机可执行程序代码,所述可执行程序代码包括指令。处理器110通过运行存储在内部存储器121的指令,和/或存储在设置于处理器中的存储器的指令,执行电子设备100的各种功能应用以及数据处理。Internal memory 121 may be used to store computer executable program code, which includes instructions. The processor 110 executes various functional applications and data processing of the electronic device 100 by executing instructions stored in the internal memory 121 and/or instructions stored in a memory provided in the processor.
电子设备100可以通过音频模块170,扬声器170A,受话器170B,麦克风170C,耳机接口170D,以及应用处理器等实现音频功能。例如音乐播放,录音等。The electronic device 100 can implement audio functions through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the headphone interface 170D, and the application processor. Such as music playback, recording, etc.
音频模块170用于将数字音频信息转换成模拟音频信号输出,也用于将模拟音频输入转换为数字音频信号。音频模块170还可以用于对音频信号编码和解码。The audio module 170 is used to convert digital audio information into analog audio signal output, and is also used to convert analog audio input into digital audio signals. Audio module 170 may also be used to encode and decode audio signals.
扬声器170A,也称“喇叭”,用于将音频电信号转换为声音信号。Speaker 170A, also called "speaker", is used to convert audio electrical signals into sound signals.
受话器170B,也称“听筒”,用于将音频电信号转换成声音信号。Receiver 170B, also called "earpiece", is used to convert audio electrical signals into sound signals.
麦克风170C,也称“话筒”,“传声器”,用于将声音信号转换为电信号。Microphone 170C, also called "microphone" or "microphone", is used to convert sound signals into electrical signals.
耳机接口170D用于连接有线耳机。The headphone interface 170D is used to connect wired headphones.
压力传感器180A用于感受压力信号,可以将压力信号转换成电信号。在一种实施方式中,压力传感器180A可以设置于显示屏194。压力传感器180A的种类很多,如电阻式压力传感器,电感式压力传感器,电容式压力传感器等。电容式压力传感器可以是包括至少两个具有导电材料的平行板。当有力作用于压力传感器180A,电极之间的电容改变。电子设备100根据电容的变化确定压力的强度。当有触摸操作作用于显示屏194,电子设备100根据压力传感器180A检测所述触摸操作强度。电子设备100也可以根据压力传感器180A的检测信号计算触摸的位置。The pressure sensor 180A is used to sense pressure signals and can convert the pressure signals into electrical signals. In one implementation, the pressure sensor 180A may be disposed on the display screen 194 . There are many types of pressure sensors 180A, such as resistive pressure sensors, inductive pressure sensors, capacitive pressure sensors, etc. A capacitive pressure sensor may include at least two parallel plates of conductive material. When a force is applied to pressure sensor 180A, the capacitance between the electrodes changes. The electronic device 100 determines the intensity of the pressure based on the change in capacitance. When a touch operation is performed on the display screen 194, the electronic device 100 detects the intensity of the touch operation according to the pressure sensor 180A. The electronic device 100 may also calculate the touched position based on the detection signal of the pressure sensor 180A.
陀螺仪传感器180B可以用于确定电子设备100的运动姿态。在一种实施方式中,可以通过陀螺仪传感器180B确定电子设备100围绕三个轴(即,x,y和z轴)的角速度。The gyro sensor 180B may be used to determine the motion posture of the electronic device 100 . In one implementation, the angular velocity of electronic device 100 about three axes (ie, x, y, and z axes) may be determined by gyro sensor 180B.
气压传感器180C用于测量气压。Air pressure sensor 180C is used to measure air pressure.
磁传感器180D包括霍尔传感器。电子设备100可以利用磁传感器180D检测翻盖皮套的开合。Magnetic sensor 180D includes a Hall sensor. The electronic device 100 may utilize the magnetic sensor 180D to detect opening and closing of the flip holster.
加速度传感器180E可检测电子设备100在各个方向上(一般为三轴)加速度的大小。The acceleration sensor 180E can detect the acceleration of the electronic device 100 in various directions (generally three axes).
距离传感器180F,用于测量距离。Distance sensor 180F for measuring distance.
接近光传感器180G可以包括例如发光二极管(LED)和光检测器,例如光电二极管。发光二极管可以是红外发光二极管。电子设备100通过发光二极管向外发射红外光。电子设备100使用光电二极管检测来自附近物体的红外反射光。当检测到充分的反射光时,可以确定电子设备100附近有物体。当检测到不充分的反射光时,电子设备100可以确定电子设备100附近没有物体。Proximity light sensor 180G may include, for example, a light emitting diode (LED) and a light detector, such as a photodiode. The light emitting diode may be an infrared light emitting diode. The electronic device 100 emits infrared light outwardly through the light emitting diode. Electronic device 100 uses photodiodes to detect infrared reflected light from nearby objects. When sufficient reflected light is detected, it can be determined that there is an object near the electronic device 100 . When insufficient reflected light is detected, the electronic device 100 may determine that there is no object near the electronic device 100 .
环境光传感器180L用于感知环境光亮度。The ambient light sensor 180L is used to sense ambient light brightness.
指纹传感器180H用于采集指纹。电子设备100可以利用采集的指纹特性实现指纹解锁,访问应用锁,指纹拍照,指纹接听来电等。Fingerprint sensor 180H is used to collect fingerprints. The electronic device 100 can use the collected fingerprint characteristics to achieve fingerprint unlocking, access to application locks, fingerprint photography, fingerprint answering of incoming calls, etc.
温度传感器180J用于检测温度。Temperature sensor 180J is used to detect temperature.
触摸传感器180K,也称“触控器件”。触摸传感器180K可以设置于显示屏194,由触摸传感器180K与显示屏194组成触摸屏,也称“触控屏”。触摸传感器180K用于检测作用于其上或附近的触摸操作。触摸传感器可以将检测到的触摸操作传递给应用处理器,以确定触摸事件类型。可以通过显示屏194提供与触摸操作相关的视觉输出。在另一种实施方式中,触摸传感器180K也可以设置于电子设备100的表面,与显示屏194所处的位置不同。Touch sensor 180K, also known as "touch device". The touch sensor 180K can be disposed on the display screen 194. The touch sensor 180K and the display screen 194 form a touch screen, which is also called a "touch screen". The touch sensor 180K is used to detect a touch operation on or near the touch sensor 180K. The touch sensor can pass the detected touch operation to the application processor to determine the touch event type. Visual output related to the touch operation may be provided through display screen 194 . In another implementation, the touch sensor 180K may also be disposed on the surface of the electronic device 100 in a position different from that of the display screen 194 .
骨传导传感器180M可以获取振动信号。Bone conduction sensor 180M can acquire vibration signals.
按键190包括开机键,音量键等。 The buttons 190 include a power button, a volume button, etc.
马达191可以产生振动提示。The motor 191 can generate vibration prompts.
指示器192可以是指示灯,可以用于指示充电状态,电量变化,也可以用于指示消息,未接来电,通知等。The indicator 192 may be an indicator light, which may be used to indicate charging status, power changes, or may be used to indicate messages, missed calls, notifications, etc.
SIM卡接口195用于连接SIM卡。The SIM card interface 195 is used to connect a SIM card.
电子设备100的软件系统可以采用分层架构,事件驱动架构,微核架构,微服务架构,或云架构。例如,分层架构的软件系统可以是安卓(Android)系统,也可以是鸿蒙(harmony)操作系统(operating system,OS),或其它软件系统。本申请实施例以分层架构的Android系统为例,示例性说明电子设备100的软件结构。The software system of the electronic device 100 may adopt a layered architecture, an event-driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture. For example, the layered architecture software system can be the Android system, the Harmony operating system (operating system, OS), or other software systems. The embodiment of this application takes the Android system with a layered architecture as an example to illustrate the software structure of the electronic device 100 .
图2示例性示出一种电子设备100的软件架构示意图。FIG. 2 exemplarily shows a schematic diagram of the software architecture of the electronic device 100 .
分层架构将软件分成若干个层,每一层都有清晰的角色和分工。层与层之间通过软件接口通信。在一些实施例中,将Android系统分为四层,从上至下分别为应用程序层,应用程序框架层,安卓运行时(Android runtime)和系统库,以及内核层。The layered architecture divides the software into several layers, and each layer has clear roles and division of labor. The layers communicate through software interfaces. In some embodiments, the Android system is divided into four layers, from top to bottom: application layer, application framework layer, Android runtime and system libraries, and kernel layer.
应用程序层可以包括一系列应用程序包。The application layer can include a series of application packages.
如图2所示,应用程序包可以包括相机,图库,音乐,日历,短信息,通话,导航,蓝牙,浏览器等应用程序。本申请中的应用程序包也可以替换为小程序等其他形式的软件。As shown in Figure 2, the application package can include camera, gallery, music, calendar, short message, call, navigation, Bluetooth, browser and other applications. The application package in this application can also be replaced by other forms of software such as applets.
应用程序框架层为应用程序层的应用程序提供应用编程接口(application programming interface,API)和编程框架。应用程序框架层包括一些预先定义的函数。The application framework layer provides an application programming interface (API) and programming framework for applications in the application layer. The application framework layer includes some predefined functions.
如图2所示,应用程序框架层可以包括窗口管理器,内容提供器,视图系统,电话管理器,资源管理器,通知管理器等。As shown in Figure 2, the application framework layer can include a window manager, content provider, view system, phone manager, resource manager, notification manager, etc.
窗口管理器用于管理窗口程序。窗口管理器可以获取显示屏大小,判断是否有状态栏,锁定屏幕,截取屏幕等。A window manager is used to manage window programs. The window manager can obtain the display size, determine whether there is a status bar, lock the screen, capture the screen, etc.
内容提供器用来存放和获取数据,并使这些数据可以被应用程序访问。所述数据可以包括视频,图像,音频,拨打和接听的电话,浏览历史和书签,电话簿等。Content providers are used to store and retrieve data and make this data accessible to applications. Said data can include videos, images, audio, calls made and received, browsing history and bookmarks, phone books, etc.
视图系统包括可视控件,例如显示文字的控件,显示图片的控件等。视图系统可用于构建应用程序。显示界面可以由一个或多个视图组成的。例如,包括短信通知图标的显示界面,可以包括显示文字的视图以及显示图片的视图。The view system includes visual controls, such as controls that display text, controls that display pictures, etc. A view system can be used to build applications. The display interface can be composed of one or more views. For example, a display interface including a text message notification icon may include a view for displaying text and a view for displaying pictures.
电话管理器用于提供电子设备100的通信功能。例如通话状态的管理(包括接通,挂断等)。The phone manager is used to provide communication functions of the electronic device 100 . For example, call status management (including connected, hung up, etc.).
资源管理器为应用程序提供各种资源,比如本地化字符串,图标,图片,布局文件,视频文件等等。The resource manager provides various resources to applications, such as localized strings, icons, pictures, layout files, video files, etc.
通知管理器使应用程序可以在状态栏中显示通知信息,可以用于传达告知类型的消息,可以短暂停留后自动消失,无需用户交互。比如通知管理器被用于告知下载完成,消息提醒等。通知管理器还可以是以图表或者滚动条文本形式出现在系统顶部状态栏的通知,例如后台运行的应用程序的通知,还可以是以对话窗口形式出现在屏幕上的通知。例如在状态栏提示文本信息,发出提示音,电子设备振动,指示灯闪烁等。The notification manager allows applications to display notification information in the status bar, which can be used to convey notification-type messages and can automatically disappear after a short stay without user interaction. For example, the notification manager is used to notify download completion, message reminders, etc. The notification manager can also be notifications that appear in the status bar at the top of the system in the form of charts or scroll bar text, such as notifications for applications running in the background, or notifications that appear on the screen in the form of conversation windows. For example, text information is prompted in the status bar, a beep sounds, the electronic device vibrates, the indicator light flashes, etc.
Android Runtime包括核心库和虚拟机。Android runtime负责安卓系统的调度和管理。Android Runtime includes core libraries and virtual machines. Android runtime is responsible for the scheduling and management of the Android system.
核心库包含两部分:一部分是java语言需要调用的功能函数,另一部分是安卓的核心库。The core library contains two parts: one is the functional functions that need to be called by the Java language, and the other is the core library of Android.
应用程序层和应用程序框架层运行在虚拟机中。虚拟机将应用程序层和应用程序框架层的java文件执行为二进制文件。虚拟机用于执行对象生命周期的管理,堆栈管理,线程管理,安全和异常的管理,以及垃圾回收等功能。The application layer and application framework layer run in virtual machines. The virtual machine executes the java files of the application layer and application framework layer into binary files. The virtual machine is used to perform object life cycle management, stack management, thread management, security and exception management, and garbage collection and other functions.
系统库可以包括多个功能模块。例如:表面管理器(surface manager),媒体库(Media Libraries),三维图形处理库(例如:OpenGL ES),2D图形引擎(例如:SGL)等。System libraries can include multiple functional modules. For example: surface manager (surface manager), media libraries (Media Libraries), 3D graphics processing libraries (for example: OpenGL ES), 2D graphics engines (for example: SGL), etc.
表面管理器用于对显示子系统进行管理,并且为多个应用程序提供了2D和3D图层的融合。The surface manager is used to manage the display subsystem and provides the fusion of 2D and 3D layers for multiple applications.
媒体库支持多种常用的音频,视频格式回放和录制,以及静态图像文件等。媒体库可以支持多种音视频编码格式,例如:MPEG4,H.264,MP3,AAC,AMR,JPG,PNG等。The media library supports playback and recording of a variety of commonly used audio and video formats, as well as static image files, etc. The media library can support a variety of audio and video encoding formats, such as: MPEG4, H.264, MP3, AAC, AMR, JPG, PNG, etc.
三维图形处理库用于实现三维图形绘图,图像渲染,合成,和图层处理等。The 3D graphics processing library is used to implement 3D graphics drawing, image rendering, composition, and layer processing.
2D图形引擎是2D绘图的绘图引擎。2D Graphics Engine is a drawing engine for 2D drawing.
内核层是硬件和软件之间的层。内核层至少包含显示驱动,摄像头驱动,音频驱动,传感器驱动。The kernel layer is the layer between hardware and software. The kernel layer contains at least display driver, camera driver, audio driver, and sensor driver.
下面结合捕获拍照场景,示例性说明电子设备100软件以及硬件的工作流程。The following exemplifies the workflow of the software and hardware of the electronic device 100 in conjunction with capturing the photographing scene.
When the touch sensor 180K receives a touch operation, a corresponding hardware interrupt is sent to the kernel layer. The kernel layer processes the touch operation into a raw input event (including information such as the touch coordinates and the timestamp of the touch operation). The raw input event is stored at the kernel layer. The application framework layer obtains the raw input event from the kernel layer and identifies the control corresponding to the input event. Taking the touch operation being a tap operation and the control corresponding to the tap being the camera application icon as an example, the camera application calls an interface of the application framework layer to start the camera application, then starts the camera driver by calling the kernel layer, and captures a still image or video through the camera 193.
接下来介绍本申请实施例提供的图片推荐方法。Next, the image recommendation method provided by the embodiment of this application is introduced.
请参见图3,图3是本申请实施例提供的一种图片推荐方法的流程示意图。该方法可以应用于图1所示的电子设备100。该方法可以应用于图2所示的电子设备100。该方法可以包括但不限于如下步骤:Please refer to Figure 3. Figure 3 is a schematic flowchart of an image recommendation method provided by an embodiment of the present application. This method can be applied to the electronic device 100 shown in FIG. 1 . This method can be applied to the electronic device 100 shown in FIG. 2 . The method may include but is not limited to the following steps:
S101:电子设备获取第一图片序列。S101: The electronic device obtains the first picture sequence.
本申请中的图片序列是指至少一张图片。A picture sequence in this application refers to at least one picture.
In one implementation, the electronic device can capture the first picture sequence through a camera. In another implementation, the electronic device can obtain a first picture sequence shot by a connected device. In another implementation, the electronic device can obtain the first picture sequence from the memory of the electronic device, for example from the gallery of the electronic device. In another implementation, the electronic device can obtain a first picture sequence stored by a network device; for example, when the user uses the cloud album application of the electronic device, the electronic device can, in response to a user operation for selecting the first picture sequence in the cloud album, send a request message to the application server of the cloud album and receive the first picture sequence sent by the application server. Without being limited thereto, the first picture sequence can also be obtained through at least two of the above implementations; for example, the electronic device captures some pictures of the first picture sequence through its camera and obtains the remaining pictures of the first picture sequence, shot by a connected device. This application does not limit the specific manner of obtaining the first picture sequence.
S102:电子设备基于第一图片序列生成第二图片序列。S102: The electronic device generates a second picture sequence based on the first picture sequence.
在一种实施方式中,电子设备可以基于第一图片序列在时间维度和/或空间维度上生成第二图片序列。In one implementation, the electronic device may generate a second picture sequence in a temporal dimension and/or a spatial dimension based on the first picture sequence.
在一种实施方式中,电子设备可以基于第一图片序列在时间维度上生成至少一张图片,第二图片序列包括这至少一张图片。在一些示例中,电子设备可以先获取第一图片序列中每张图片的拍摄时间(可简称为时间戳),假设其中最小的时间戳和最大的时间戳分别为时间戳1和时间戳2,然后电子设备可以基于这些时间戳和第一图片序列生成时间戳更细的至少一张图片,其中,这至少一张图片中每张图片的时间戳大于时间戳1且小于时间戳2,并且和第一图片序列中任意一张图片的时间戳不同,实现方式例如类似视频插帧。上述过程的示例可参见下图4。In one implementation, the electronic device can generate at least one picture in the time dimension based on the first picture sequence, and the second picture sequence includes the at least one picture. In some examples, the electronic device can first obtain the shooting time of each picture in the first picture sequence (which can be referred to as a timestamp for short), assuming that the minimum timestamp and the maximum timestamp are timestamp 1 and timestamp 2 respectively, The electronic device can then generate at least one picture with a more detailed timestamp based on these timestamps and the first picture sequence, wherein the timestamp of each picture in the at least one picture is greater than timestamp 1 and less than timestamp 2, and The timestamp of any picture in the first picture sequence is different, and the implementation method is similar to video frame insertion, for example. An example of the above process can be seen in Figure 4 below.
As shown in Figure 4, the first picture sequence may include four pictures: picture 1, picture 2, picture 3 and picture 4, whose timestamps in ascending order are t1, t2, t3 and t4. The electronic device may generate three pictures in the time dimension based on the first picture sequence: picture 5 with timestamp t5 between t1 and t2, picture 6 with timestamp t6 between t2 and t3, and picture 7 with timestamp t7 between t3 and t4. The situation is not limited to that shown in Figure 4. In other examples, for any two adjacent timestamps of the first picture sequence, the electronic device may generate multiple pictures whose timestamps lie between the two, for example multiple pictures with timestamps between t1 and t2; in still other examples, for any two adjacent timestamps, the electronic device may generate no picture between them, for example picture 5 with timestamp t5 between t1 and t2 is not generated. This application does not limit the specific generation method.
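For illustration only, the following Python sketch shows how intermediate timestamps could be chosen between adjacent pictures; the interpolate callable stands in for a frame-interpolation model and is a hypothetical name, not part of this application.

```python
import numpy as np

def densify_in_time(pictures, interpolate, ratio=0.5):
    """Generate pictures whose timestamps fall strictly between adjacent
    timestamps of the input sequence (time-dimension part of S102).

    pictures    -- list of (timestamp, image) tuples, image e.g. an np.ndarray
    interpolate -- callable (img_a, img_b, alpha) -> new image; stands in for a
                   video-frame-interpolation model (hypothetical placeholder)
    ratio       -- relative position of the new timestamp between neighbours
    """
    pictures = sorted(pictures, key=lambda p: p[0])
    generated = []
    for (t_a, img_a), (t_b, img_b) in zip(pictures, pictures[1:]):
        t_new = t_a + ratio * (t_b - t_a)           # strictly between t_a and t_b
        img_new = interpolate(img_a, img_b, ratio)  # e.g. flow-based interpolation
        generated.append((t_new, img_new))
    return generated                                # part of the "second picture sequence"

# Toy usage with naive linear blending as the stand-in interpolator:
blend = lambda a, b, alpha: ((1 - alpha) * a + alpha * b).astype(a.dtype)
seq1 = [(t, np.random.randint(0, 255, (4, 4, 3), dtype=np.uint8)) for t in (1.0, 2.0, 3.0, 4.0)]
print([t for t, _ in densify_in_time(seq1, blend)])  # [1.5, 2.5, 3.5]
```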
In one implementation, the electronic device may generate at least one picture in the spatial dimension based on the first picture sequence, for example by performing novel-view synthesis based on a neural radiance field (NeRF) to obtain the at least one picture; the second picture sequence includes the at least one picture. In some examples, for any picture in the first picture sequence (which may be called a reference picture), the electronic device may generate one or more pictures with different viewing angles, where the timestamp of the one or more pictures is the timestamp of the reference picture, the viewing angle of any one of the one or more pictures differs from that of the reference picture, and, if multiple pictures are generated, they correspond to different viewing angles. This process can be understood as performing novel-view synthesis from different viewing angles for a fixed timestamp, thereby obtaining at least one picture with a new viewing angle. The electronic device may use some or all of the pictures in the first picture sequence as reference pictures to generate at least one picture covering more viewing angles. An example of the above process can be seen in Figure 5 below; Figure 5 takes picture 1 in Figure 4 as the reference picture, and the subject in picture 1 is human body 1.
As shown in Figure 5, human body 1 can be abstracted as a cube, and human body 1/the cube can be observed from different viewing angles, for example but not limited to: observing the front of human body 1 from the front view, the back of human body 1 from the rear view, the left side of human body 1 from the left view, the right side of human body 1 from the right view, and so on. Picture 1, used as the reference picture, was taken of human body 1 from the front view at timestamp t1. For timestamp t1 of picture 1, the electronic device may generate: picture 8 obtained by observing human body 1 from the rear view, picture 9 obtained by observing human body 1 from the left view, and picture 10 obtained by observing human body 1 from the right view. The situation is not limited to that shown in Figure 5; in other examples, the electronic device may generate pictures with more or fewer viewing angles, for example, for timestamp t1 of picture 1, the electronic device may also generate a picture obtained by observing human body 1 from a top-down view. This application does not limit the specific generation method.
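For illustration, the sketch below enumerates new viewing angles around a subject for one fixed timestamp; the render callable stands in for a trained novel-view-synthesis model (for example a NeRF), and the angle values are assumptions chosen to mirror the rear/left/right views of Figure 5.

```python
import math

def new_view_pictures(render, timestamp, azimuths_deg=(180, 90, 270), polar_deg=90):
    """Enumerate new viewing angles for one reference timestamp (space-dimension part of S102).

    render       -- callable (timestamp, azimuth_rad, polar_rad) -> image; hypothetical
                    stand-in for a trained novel-view-synthesis model such as a NeRF
    azimuths_deg -- e.g. 180/90/270 degrees for the rear, left and right views
    """
    pictures = []
    for az in azimuths_deg:
        img = render(timestamp, math.radians(az), math.radians(polar_deg))
        pictures.append((timestamp, az, img))   # same timestamp, new viewing angle
    return pictures
```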
S103: The electronic device uses an aesthetic evaluation model to score each picture in the first picture sequence and the second picture sequence, and obtains a third picture sequence with higher scores.
在一种实施方式中,电子设备可以将第一图片序列和第二图片序列中的每张图片作为美学评估模型的输入,得到对应的输出,该输出可以包括该图片在多个维度的得分,该多个维度例如但不限于包括以下多项:综合(对应的得分称为综合得分)、主体位置(对应的得分称为主体位置得分)、拍摄主体的动作舒展度(对应的得分称为动作舒展度得分)、拍摄主体的表情(对应的得分称为表情得分)、画质(对应的得分称为画质得分)等,可以理解为是美学评估模型可以从多个维度对图片进行评分。接下来以第一图片序列和第二图片序列中的任意一张图片:第一图片为例示例性说明美学评估模型的评分方式。In one implementation, the electronic device can use each picture in the first picture sequence and the second picture sequence as the input of the aesthetic evaluation model to obtain a corresponding output. The output can include scores of the picture in multiple dimensions, The multiple dimensions include, for example, but are not limited to, the following: comprehensive (the corresponding score is called the comprehensive score), subject position (the corresponding score is called the subject position score), and the shooting subject's movement and stretch (the corresponding score is called the action Stretch score), the subject's expression (the corresponding score is called the expression score), the image quality (the corresponding score is called the image quality score), etc. It can be understood that the aesthetic evaluation model can score pictures from multiple dimensions. Next, take any picture in the first picture sequence and the second picture sequence: the first picture as an example to illustrate the scoring method of the aesthetic evaluation model.
In some examples, the aesthetic evaluation model may determine the subject position score based on the rate of change of the speed of the subject in the first picture (i.e. the acceleration), where the acceleration may be determined from the speed of the subject in the first picture and in adjacent pictures; the adjacent pictures belong to the first picture sequence and the second picture sequence and are, for example but not limited to, pictures whose timestamps differ from the timestamp of the first picture by no more than a preset threshold in absolute value. For example, when the trend of the acceleration of the subject in the first picture is decreasing and its value is 0, the aesthetic evaluation model may consider that the subject in the first picture is at the highest point of its motion and may therefore set the subject position score of the first picture to the maximum value; the decreasing trend may include the upward acceleration gradually decreasing from a positive value (gradually approaching 0), and the trend may be obtained by comparison with the acceleration of the subject in previous pictures, where the previous pictures belong to the first picture sequence and the second picture sequence and have timestamps smaller than that of the first picture. Not limited to the above examples, in other examples the aesthetic evaluation model may determine the subject position score based on the size of the area occupied by the subject in the first picture, for example, the larger the occupied area, the higher the subject position score; in still other examples, the aesthetic evaluation model may determine the subject position score based on the position priority of the subject in the first picture, for example, when the subject is located in the middle, the position priority is highest and the subject position score may be set to the maximum value. This application does not limit this.
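For illustration only, a toy reading of the acceleration heuristic above is sketched below; the finite-difference estimation, the threshold and the mapping from frame differences to picture indices are assumptions, not the scoring rule of this application.

```python
def subject_position_scores(heights, timestamps, eps=1e-6):
    """Toy subject-position scoring: mark a picture as the assumed highest point of
    the motion when the estimated acceleration trend is decreasing and near zero.

    heights    -- vertical coordinate of the subject in each time-ordered picture
    timestamps -- corresponding timestamps
    Returns one score in [0, 1] per picture (0.5 = uninformative default).
    """
    speed = [(heights[i + 1] - heights[i]) / (timestamps[i + 1] - timestamps[i])
             for i in range(len(heights) - 1)]
    accel = [(speed[i + 1] - speed[i]) / (timestamps[i + 1] - timestamps[i])
             for i in range(len(speed) - 1)]
    scores = [0.5] * len(heights)
    for i in range(1, len(accel)):
        decreasing = accel[i] < accel[i - 1]     # trend is getting smaller
        near_zero = abs(accel[i]) < eps
        if decreasing and near_zero:
            scores[i + 1] = 1.0                  # picture roughly centred on this difference
    return scores
```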
In some examples, the aesthetic evaluation model may determine the motion stretch score based on the distances between skeletal keypoints of the subject in the first picture. For example, the larger the distance, the more stretched the aesthetic evaluation model considers the subject's motion in the first picture to be and the higher the resulting motion stretch score; the smaller the distance (for example, the limbs are not spread out when the human body jumps), the lower the resulting motion stretch score. An example of the skeletal keypoints of the subject can be seen in Figure 6, where the subject is a human body. As shown in Figure 6, the human body may include a head point, a neck point, left/right shoulder points, left/right elbow points, left/right hand points, left/right hip points, left/right knee points, left/right foot points and other skeletal keypoints; the above distance includes, for example, the distance between any two of the skeletal keypoints shown in Figure 6.
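For illustration, a minimal sketch of a distance-based stretch score is given below; the mean-pairwise-distance formulation and the normalisation by a reference length are assumptions made for the example.

```python
import itertools
import math

def motion_stretch_score(keypoints, ref_scale):
    """Toy motion-stretch score from skeletal keypoints.

    keypoints -- dict name -> (x, y) pixel coordinates, e.g. {'head': (10, 5), 'l_hand': (40, 60)}
    ref_scale -- normalisation length, e.g. the image diagonal in pixels
    Returns the mean pairwise keypoint distance, clipped to roughly [0, 1]:
    spread-out poses give larger values than tucked-in poses.
    """
    pts = list(keypoints.values())
    dists = [math.dist(a, b) for a, b in itertools.combinations(pts, 2)]
    return min(1.0, (sum(dists) / len(dists)) / ref_scale)
```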
In some examples, the aesthetic evaluation model may determine the expression score based on the facial expression of the subject in the first picture. For example, when the facial expression is a better expression such as smiling, laughing or eyes wide open, the resulting expression score is higher; when the facial expression is a poorer expression such as closed eyes, the resulting expression score is lower. The better or poorer expressions may be preset, determined in response to user operations, or learned from user preferences.
In some examples, the aesthetic evaluation model may determine the image quality score based on the image quality of the first picture. For example, when the image quality of the first picture is higher, the resulting image quality score is higher; when the image quality of the first picture is lower (for example, the subject exhibits motion blur), the resulting image quality score is lower. Image quality may include, but is not limited to, dynamic range, saturation, contrast, sharpness, and so on.
在一些示例中,综合得分可以是结合画质、主体位置、动作舒展度、表情、是否有移动物体(例如是否有路人乱入)等多个指标得到的评分。In some examples, the comprehensive score can be a score obtained by combining multiple indicators such as image quality, subject position, ease of movement, expression, whether there are moving objects (for example, whether there are passers-by).
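For illustration only, a minimal Python sketch of such a combination is given below; the equal default weights and the dimension names are assumptions, not the actual scoring rule of this application.

```python
def comprehensive_score(scores, weights=None):
    """Toy comprehensive score as a weighted combination of per-dimension scores.

    scores -- dict such as {'quality': 0.9, 'position': 0.7, 'stretch': 0.8,
              'expression': 0.6, 'no_intruder': 1.0}, each value in [0, 1]
    """
    weights = weights or {k: 1.0 for k in scores}        # equal weights by default
    total = sum(weights[k] * scores[k] for k in scores)
    return total / sum(weights[k] for k in scores)
```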
In one implementation, the electronic device may sort and filter, per dimension, the output of the aesthetic evaluation model, namely the scores of the pictures in the first picture sequence and the second picture sequence in the multiple dimensions, and obtain a third picture sequence with higher scores; the third picture sequence may include multiple picture sequences that score higher in the above multiple dimensions respectively. In some examples, the scores in the multiple dimensions are the comprehensive score, subject position score, motion stretch score, expression score and image quality score of the above examples, so the third picture sequence may include: picture sequence 1 whose comprehensive scores rank in the top N1, picture sequence 2 whose subject position scores rank in the top N2, picture sequence 3 whose motion stretch scores rank in the top N3, picture sequence 4 whose expression scores rank in the top N4, and picture sequence 5 whose image quality scores rank in the top N5, where Ni is a positive integer and i is a positive integer less than 6.
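For illustration, a short sketch of this per-dimension top-N filtering is given below; the data layout is an assumption made for the example.

```python
def recommend_top_n(scored_pictures, top_n):
    """Toy per-dimension filtering for S103.

    scored_pictures -- list of (picture_id, {dimension: score}) tuples
    top_n           -- dict dimension -> N, e.g. {'comprehensive': 2, 'position': 3}
    Returns dict dimension -> list of the N highest-scoring picture ids.
    """
    result = {}
    for dim, n in top_n.items():
        ranked = sorted(scored_pictures, key=lambda p: p[1][dim], reverse=True)
        result[dim] = [pid for pid, _ in ranked[:n]]
    return result
```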
S104:电子设备显示第三图片序列。S104: The electronic device displays the third picture sequence.
在一种实施方式中,电子设备可以在显示第一图片序列和/或第二图片序列中的至少一张图片时,优先显示/突出显示其中的第三图片序列,例如,第三图片序列中的图片显示在其他图片之前。In one implementation, when displaying at least one picture in the first picture sequence and/or the second picture sequence, the electronic device may display/highlight the third picture sequence therein, for example, in the third picture sequence pictures appear before other pictures.
In another implementation, when the electronic device displays the third picture sequence, it displays at least one picture in the first picture sequence and/or the second picture sequence in response to a user operation (for example, a user operation of returning to the picture list interface of the gallery application).
In one implementation, when the electronic device displays the third picture sequence, it may preferentially display/highlight pictures with higher scores. For example, when the electronic device displays picture sequence 1, whose comprehensive scores rank in the top N1 of the third picture sequence, the pictures may be displayed from front to back in descending order of comprehensive score, that is, the picture with the highest comprehensive score is displayed first, the picture with the second highest comprehensive score is displayed second, and so on.
在一种实施方式中,电子设备显示第三图片序列时,可以显示评分位于前几位的图片,可以按照评分从高到低显示,也可以按照时间顺序显示。 In one implementation, when the electronic device displays the third picture sequence, the pictures with the highest scores may be displayed, from high to low, or in chronological order.
S105:电子设备接收用于选择至少一张图片(可以称为第四图片序列)的用户操作。S105: The electronic device receives a user operation for selecting at least one picture (which may be called a fourth picture sequence).
在一种实施方式中,S105是可选的步骤。In one implementation, S105 is an optional step.
In one implementation, when the electronic device displays the third picture sequence, it may receive a user operation for selecting a fourth picture sequence from the third picture sequence. In another implementation, when the electronic device displays the third picture sequence, it also displays other pictures in the first picture sequence and/or the second picture sequence, and the electronic device may receive a user operation for selecting a fourth picture sequence from the third picture sequence and/or the other pictures.
在一种实施方式中,上述第四图片序列可以包括第一图片序列中的图片。在一种实施方式中,上述第四图片序列可以包括第二图片序列中的图片。在一种实施方式中,上述第四图片序列可以包括第三图片序列中的图片。In one implementation, the fourth picture sequence may include pictures in the first picture sequence. In one implementation, the fourth picture sequence may include pictures in the second picture sequence. In one implementation, the fourth picture sequence may include pictures in the third picture sequence.
在一种实施方式中,电子设备可以根据接收到的用户操作设置第四图片序列中的图片的优先级,例如,用户先选择的图片的优先级高于用户后选择的图片的优先级。In one implementation, the electronic device may set the priority of the pictures in the fourth picture sequence according to the received user operation. For example, the priority of the picture selected by the user first is higher than the priority of the picture selected by the user later.
在一种实施方式中,电子设备可以根据接收到的用户操作,确定第四图片序列中的图片对应的维度,该维度可以为S103所述的多个维度中的任意一个维度。例如,电子设备显示综合得分较高的图片序列1时,接收到用于选择图片A的用户操作,因此,图片A对应的维度为综合。In one implementation, the electronic device can determine the dimension corresponding to the picture in the fourth picture sequence according to the received user operation, and the dimension can be any one of the multiple dimensions described in S103. For example, when the electronic device displays picture sequence 1 with a high comprehensive score, it receives a user operation for selecting picture A. Therefore, the dimension corresponding to picture A is comprehensive.
S106:电子设备基于第三图片序列和用户选择的至少一张图片(即第四图片序列)更新美学评估模型。S106: The electronic device updates the aesthetic evaluation model based on the third picture sequence and at least one picture selected by the user (ie, the fourth picture sequence).
In one implementation, the electronic device may set the scores of the pictures in the fourth picture sequence based on the scores of the pictures in the third picture sequence. The fourth picture sequence and the corresponding scores may be called a personalized data set, and the personalized data set may be used to update the aesthetic evaluation model.
In one implementation, for M pictures in the fourth picture sequence that correspond to the same dimension (M is a positive integer), taking the comprehensive dimension as an example for explanation: if these M pictures are not picture sequence 1 (the sequence with higher comprehensive scores in the third picture sequence), the electronic device may set the comprehensive scores corresponding to these M pictures to the comprehensive scores of the pictures whose comprehensive scores rank in the top M of picture sequence 1, and these M pictures and the corresponding comprehensive scores may belong to the personalized data set. Here, "these M pictures are not picture sequence 1" may include: any one of the M pictures does not belong to picture sequence 1, and/or the priority order of any one of the M pictures differs from the order of that picture's comprehensive score in picture sequence 1. In some examples, for any one of the above M pictures (which may be called the second picture), assume the priority of the second picture ranks Kth among the M pictures (K is a positive integer less than or equal to M). If the second picture does not belong to picture sequence 1, or if the second picture belongs to picture sequence 1 but the picture whose comprehensive score ranks Kth in picture sequence 1 is not the second picture, the electronic device may set the comprehensive score corresponding to the second picture to the comprehensive score of a third picture, namely the picture whose comprehensive score ranks Kth in picture sequence 1. Not limited to the above implementation, in another implementation, when the electronic device sets the comprehensive scores corresponding to these M pictures, for any one of the M pictures, if the picture belongs to picture sequence 1 and its priority order is consistent with the order of its comprehensive score in picture sequence 1, the electronic device may not set the comprehensive score corresponding to the picture. An example of the electronic device setting the scores of the pictures in the fourth picture sequence can be seen in Figure 7 below and will not be detailed here.
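For illustration only, the sketch below re-labels user-selected pictures for one dimension in the way described above; the data layout and function name are assumptions made for the example.

```python
def label_user_selection(selected, recommended, rec_scores):
    """Toy construction of personalized-data-set entries for one scoring dimension.

    selected    -- user-selected picture ids for this dimension, in priority order
    recommended -- recommended picture ids for the same dimension, ranked by score
    rec_scores  -- score of each recommended picture, in the same order
    Returns (picture_id, assigned_score) pairs; a picture whose priority already
    matches its recommended rank is skipped (no re-labelling needed).
    """
    dataset = []
    for k, pid in enumerate(selected):               # k = priority rank (0-based)
        if k < len(recommended) and recommended[k] == pid:
            continue                                 # rank agrees with the model: keep as-is
        if k < len(rec_scores):
            dataset.append((pid, rec_scores[k]))     # inherit the score of the k-th ranked picture
    return dataset

# Example mirroring Figure 7 (subject-position dimension): recommended pictures
# 21, 22, 23 with scores 1.0, 2.0, 3.0; the user selects picture 22, then picture 24.
print(label_user_selection([22, 24], [21, 22, 23], [1.0, 2.0, 3.0]))
# -> [(22, 1.0), (24, 2.0)]
```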
In one implementation, the updated aesthetic evaluation model may be used to subsequently score pictures. For example, after S101-S106 shown in Figure 3 are performed, the updated aesthetic evaluation model can be obtained, and the electronic device may then perform S101-S104 again (the picture sequences at this time may differ from the previous ones); at this time, the aesthetic evaluation model used in S103 may be the above updated aesthetic evaluation model.
不限于图3所示的实施方式,在另一种实施方式中,S102和/或S103也可以是和电子设备连接的网络设备执行的,例如,电子设备可以向网络设备发送第一图片序列,网络设备可以执行S102和S103,然后向电子设备发送第三图片序列以用于显示。Not limited to the implementation shown in Figure 3, in another implementation, S102 and/or S103 may also be executed by a network device connected to the electronic device. For example, the electronic device may send the first picture sequence to the network device, The network device may perform S102 and S103, and then send the third picture sequence to the electronic device for display.
In the method shown in Figure 3, the electronic device can generate the second picture sequence in the time and/or space dimensions and, based on multiple dimensions such as the comprehensive dimension, subject position, motion stretch, facial expression and image quality, select the higher-quality third picture sequence from the first picture sequence and the second picture sequence to recommend to the user, optimizing the picture recommendation strategy so that the user can conveniently and quickly obtain satisfactory pictures. While recommending the third picture sequence, the first picture sequence and the second picture sequence can also be displayed as candidate pictures, increasing the probability that the user obtains the desired image. Moreover, the electronic device can update the picture recommendation strategy based on the pictures selected by the user and recommend different pictures to different users, further increasing the probability that the user obtains the desired image.
图7示例性示出一种个性化数据集的获取过程的示意图。Figure 7 exemplarily shows a schematic diagram of the acquisition process of a personalized data set.
As shown in Figure 7, the third picture sequence includes picture sequence 1, whose comprehensive scores rank in the top 2, and picture sequence 2, whose subject position scores rank in the top 3. In picture sequence 1, comprehensive score 1 of picture 11 is higher than comprehensive score 2 of picture 12. In picture sequence 2, the pictures ranked from high to low by subject position score are: picture 21 (corresponding to subject position score 1), picture 22 (corresponding to subject position score 2) and picture 23 (corresponding to subject position score 3). The fourth picture sequence includes picture 11, picture 22 and picture 24, where the dimension corresponding to picture 11 is the comprehensive dimension, the dimension corresponding to picture 22 and picture 24 is subject position, and picture 22 has a higher priority than picture 24.
Since picture 11, which corresponds to the comprehensive dimension in the fourth picture sequence, belongs to picture sequence 1, and the priority order of picture 11 and the order of the comprehensive score of picture 11 in picture sequence 1 are both first, it can be understood that the comprehensive score given by the aesthetic evaluation model meets the user's needs; therefore, the personalized data set may not include picture 11 and the corresponding comprehensive score 1.
Since picture 22, which corresponds to the subject position dimension in the fourth picture sequence, belongs to picture sequence 2, but the priority order of picture 22 (first place) differs from the order of the score of picture 22 in picture sequence 2 (second place), the electronic device may set the subject position score corresponding to picture 22 in the fourth picture sequence to subject position score 1 of picture 21, whose score ranks first in picture sequence 2; correspondingly, the personalized data set may include picture 22 and the corresponding subject position score 1.
Since picture 24, which corresponds to the subject position dimension in the fourth picture sequence, does not belong to picture sequence 2, and the priority order of picture 24 is second, the electronic device may set the subject position score corresponding to picture 24 in the fourth picture sequence to subject position score 2 of picture 22, whose score ranks second in picture sequence 2; correspondingly, the personalized data set may include picture 24 and the corresponding subject position score 2.
请参见图8,图8示例性示出又一种电子设备100的软件架构示意图。Please refer to FIG. 8 , which exemplarily shows a schematic diagram of the software architecture of yet another electronic device 100 .
如图8所示,电子设备100可以包括图片生成模块200、图片推荐模块300、用户选择模块400、存储模块500、个性化学习模块600和图片库700,其中,图片库700可以包括图片序列701,图片序列701例如为电子设备100响应于用户操作连续拍摄得到的多张图片。As shown in FIG. 8 , the electronic device 100 may include a picture generation module 200 , a picture recommendation module 300 , a user selection module 400 , a storage module 500 , a personalized learning module 600 and a picture library 700 , where the picture library 700 may include a picture sequence 701 , the picture sequence 701 is, for example, a plurality of pictures continuously taken by the electronic device 100 in response to user operations.
图片生成模块200可以接收图片序列701(作为输入),根据图片序列701在时间和/或空间维度上生成新的时间戳和/或新的观察视角的图片序列702,图片序列702可以输出到图片库700。在一种实施方式中,图片生成模块200可以用于执行图3中的S102。The picture generation module 200 may receive a picture sequence 701 (as input), generate a picture sequence 702 with a new timestamp and/or a new observation perspective in the temporal and/or spatial dimensions according to the picture sequence 701, and the picture sequence 702 may be output to the picture Library 700. In one implementation, the picture generation module 200 may be used to perform S102 in FIG. 3 .
The picture recommendation module 300 may receive the picture library 700 (as input), use the aesthetic evaluation model to score each picture in the picture library 700 in multiple dimensions such as the comprehensive dimension, subject position, motion stretch, expression and image quality, and output a picture sequence 703 with higher scores. In one implementation, the picture recommendation module 300 may be used to perform S103 in Figure 3. The electronic device 100 may display the picture sequence 703 to recommend it to the user for selection.
用户选择模块400可以在显示图片序列703,可选地以及图片库700中的其他图片(作为输入)时,根据用户操作从显示的图片中选择出图片序列704(作为输出)。在一种实施方式中,用户选择模块400可以用于执行图3中的S105。The user selection module 400 can select the picture sequence 704 (as an output) from the displayed pictures according to the user operation while displaying the picture sequence 703, optionally and other pictures in the picture library 700 (as an input). In one implementation, the user selection module 400 may be used to perform S105 in FIG. 3 .
存储模块500可以存储用户选择模块400输出的图片序列704,在一种实施方式中,存储模块500还可以删除图片库700中除图片序列704以外的图片。The storage module 500 can store the picture sequence 704 output by the user selection module 400. In one implementation, the storage module 500 can also delete pictures other than the picture sequence 704 in the picture library 700.
The personalized learning module 600 may receive the picture sequence 703 and the picture sequence 704 (as input), compare the picture sequence 703 with the picture sequence 704 to obtain a personalized data set, and train the pre-update/historical aesthetic evaluation model based on the personalized data set (for example, periodic training) to obtain an updated aesthetic evaluation model (as output). In one implementation, the pre-update aesthetic evaluation model may be sent by the picture recommendation module 300 to the personalized learning module 600 as input. The updated aesthetic evaluation model may be sent to the picture recommendation module for use. In one implementation, the personalized learning module 600 may be used to perform S106 in Figure 3.
接下来示例性介绍图8所示的电子设备100中的图片生成模块200。Next, the picture generation module 200 in the electronic device 100 shown in FIG. 8 is introduced as an example.
请参见图9,图9示例性示出又一种电子设备100的软件架构示意图。Please refer to FIG. 9 , which exemplarily shows a schematic diagram of the software architecture of yet another electronic device 100 .
如图9所示,电子设备100的图片生成模块200可以包括时间感知模块201和空间感知模块202。其中,时间感知模块201可以接收图片库700中的图片序列701作为输入,根据图片序列701在时间维度上生成新的时间戳的图片序列705(作为输出)。时间感知模块201例如是基于视频插帧实现的。空间感知模块202可以接收图片库700中的图片序列701和时间感知模块201输出的图片序列705作为输入,根据图片序列701和图片序列705在空间维度上生成新的观察视角的图片序列706(作为输出)。空间感知模块202例如是基于NeRF实现的。图片序列705和图片序列706可以输出到图片库700中构成图片序列702,图片序列702可以为图片序列705和图片序列706的并集。As shown in FIG. 9 , the picture generation module 200 of the electronic device 100 may include a time perception module 201 and a space perception module 202 . Among them, the time perception module 201 can receive the picture sequence 701 in the picture library 700 as input, and generate a new timestamp picture sequence 705 in the time dimension according to the picture sequence 701 (as output). The time perception module 201 is implemented based on video frame insertion, for example. The spatial perception module 202 can receive the picture sequence 701 in the picture library 700 and the picture sequence 705 output by the time perception module 201 as input, and generate a picture sequence 706 of a new observation perspective in the spatial dimension according to the picture sequence 701 and the picture sequence 705 (as output). The spatial perception module 202 is implemented based on NeRF, for example. The picture sequence 705 and the picture sequence 706 can be output to the picture library 700 to form the picture sequence 702. The picture sequence 702 can be the union of the picture sequence 705 and the picture sequence 706.
在一种实施方式中,图9所示的空间感知模块202可以包括模型训练模块202A、参数提取模块202B、空间感知模型202C和新参数生成模块202D,具体可参见图10所示的电子设备100的架构。In one implementation, the spatial perception module 202 shown in Figure 9 may include a model training module 202A, a parameter extraction module 202B, a spatial perception model 202C and a new parameter generation module 202D. For details, see the electronic device 100 shown in Figure 10 architecture.
如图10所示,空间感知模块202根据图片序列701和图片序列705生成图片序列706的过程,可以包括在线训练和图片生成两个步骤,具体如下所示。As shown in Figure 10, the process of the spatial perception module 202 generating the picture sequence 706 based on the picture sequence 701 and the picture sequence 705 may include two steps: online training and picture generation, as detailed below.
Online training: First, the parameter extraction module 202B may receive the picture sequence 701 and the picture sequence 705 as input and output the spatial parameters of each picture in the picture sequence 701 and the picture sequence 705. The spatial parameters include, for example but not limited to, the coordinates of the scene shown in the picture (which may be referred to as spatial coordinates for short, for example expressed as (x, y, z) in a spatial rectangular coordinate system/world coordinate system) and the pose of the camera of the electronic device 100 (which may be referred to as the camera pose for short and may also be understood as the observation direction, for example expressed in terms of an azimuth angle θ and a polar angle φ in a spherical coordinate system). Then, the spatial perception model 202C may receive the spatial parameters output by the parameter extraction module 202B as input and output a picture sequence 707 corresponding to these spatial parameters respectively, where, for any picture in the picture sequence 701 and the picture sequence 705 (which may be called picture B), the spatial perception model 202C may receive spatial parameter 1 of picture B as input and output picture C in the picture sequence 707; picture C can be understood as the picture "simulated" by the spatial perception model 202C for spatial parameter 1. Finally, the model training module 202A may receive the picture sequence 701, the picture sequence 705 and the picture sequence 707 output by the spatial perception model 202C as input, compare each picture in the picture sequence 701 and the picture sequence 705 with the corresponding picture in the picture sequence 707 based on a loss function, and train the spatial perception model 202C according to the comparison result to obtain an updated spatial perception model 202C (for example, specifically obtain the weights of the model), where, for any picture in the picture sequence 701 and the picture sequence 705 (picture B), the corresponding picture in the picture sequence 707 is the output obtained by using spatial parameter 1 of picture B as input to the pre-update spatial perception model 202C, namely picture C. The above process may be called one training pass. Multiple training passes may be performed to obtain a spatial perception model 202C updated multiple times. For example, the weights of the spatial perception model 202C before the first update are W0 and the weights after the first update are W1; the parameter extraction module 202B, the model training module 202A and the spatial perception model 202C with weights W1 may perform the training process again (at this time the output of the spatial perception model 202C may not be the picture sequence 707) to perform the second update and obtain the weights W2 of the spatial perception model 202C after the second update; after multiple rounds of iteration, the weights Wn of the spatial perception model 202C after n updates are obtained, where n is the number of updates. The spatial perception model 202C updated multiple times is used to perform the picture generation step described below.
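For illustration only, a minimal PyTorch-style sketch of such an online-training loop is given below; the callables, the mean-squared-error loss and the optimizer choice are assumptions made for the example and mirror the roles of modules 202A-202C only loosely.

```python
import torch

def online_train(render_model, pictures, extract_params, num_passes=3, lr=1e-3):
    """Simplified online training: render each known picture from its extracted
    spatial parameters and fit the model so the rendering matches the picture.

    render_model   -- torch.nn.Module: (coords, pose) -> predicted image tensor
    pictures       -- list of image tensors (picture sequences 701 and 705)
    extract_params -- callable image -> (coords, pose); stands in for module 202B
    """
    optimizer = torch.optim.Adam(render_model.parameters(), lr=lr)
    loss_fn = torch.nn.MSELoss()
    for _ in range(num_passes):                    # weights W0 -> W1 -> ... -> Wn
        for img in pictures:
            coords, pose = extract_params(img)     # spatial parameter of picture B
            pred = render_model(coords, pose)      # "picture C" simulated by 202C
            loss = loss_fn(pred, img)              # compare with the captured picture
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return render_model                            # model after multiple updates
```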
Picture generation: The new parameter generation module 202D may receive, as input, the spatial parameters of each picture in the picture sequence 701 and the picture sequence 705 output by the parameter extraction module 202B, and output different spatial parameters. For example, the new parameter generation module 202D may receive spatial parameter 1 of picture B (output by the parameter extraction module 202B) as input and output a spatial parameter 2 that differs from spatial parameter 1; assuming spatial parameter 1 includes spatial coordinate 1 and camera pose 1, spatial parameter 2 includes spatial coordinate 1 and camera pose 2. The spatial perception model 202C updated multiple times may receive the spatial parameters output by the new parameter generation module 202D as input and output the picture sequence 706 corresponding to these spatial parameters respectively; for example, the spatial perception model 202C updated multiple times may output the picture corresponding to spatial parameter 2.
Not limited to the above implementation, in another implementation, the input of the spatial perception module 202 may also be only the picture sequence 701 in the picture library 700; this can be understood as the time perception module 201 and the spatial perception module 202 being two independent modules in the picture generation module 200.
In one implementation, the spatial perception model 202C shown in Figure 10 may include two independent multilayer perceptrons (MLP1 and MLP2) and a volume rendering module; for details, see the architecture of the electronic device 100 shown in Figure 11. Figure 11 is described taking as an example that the spatial parameters input to the spatial perception model 202C include the camera pose and the spatial coordinates.
如图11所示,MLP1可以接收空间参数中的空间坐标作为输入进行特征提取,输出中间特征和空间密度(density)。MLP2可以接收空间参数中的相机姿势和MLP1输出的中间特征作为输入进行特征提取,输出颜色信息(color)。立体渲染模块可以接收MLP1输出的空间密度和MLP2输出的颜色信息作为输入,进行立体渲染,输出和上述空间参数对应的图片。在一些示例中,MLP1和/或MLP2中设置有可学习的参数,上述在线训练可以具体训练MLP1和/或MLP2。As shown in Figure 11, MLP1 can receive spatial coordinates in spatial parameters as input for feature extraction, and output intermediate features and spatial density. MLP2 can receive the camera pose in the spatial parameters and the intermediate features output by MLP1 as input for feature extraction and output color information (color). The stereoscopic rendering module can receive the spatial density output by MLP1 and the color information output by MLP2 as input, perform stereoscopic rendering, and output pictures corresponding to the above spatial parameters. In some examples, learnable parameters are set in MLP1 and/or MLP2, and the above online training can specifically train MLP1 and/or MLP2.
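For illustration only, a compact PyTorch-style sketch of this two-MLP split with a very simplified volume-rendering step is shown below; the layer widths, the two-angle pose encoding and the alpha-compositing details are assumptions made for the example, not the exact design of this application.

```python
import torch
import torch.nn as nn

class TwoBranchRenderer(nn.Module):
    """MLP1: spatial coordinate -> (intermediate feature, density);
    MLP2: (camera pose, intermediate feature) -> colour. Sizes are illustrative."""
    def __init__(self, feat_dim=64):
        super().__init__()
        self.mlp1 = nn.Sequential(nn.Linear(3, 128), nn.ReLU(), nn.Linear(128, feat_dim + 1))
        self.mlp2 = nn.Sequential(nn.Linear(feat_dim + 2, 128), nn.ReLU(), nn.Linear(128, 3))

    def forward(self, coords, pose):
        # coords: (N, 3) sample points along one ray; pose: (N, 2) viewing angles (theta, phi)
        h = self.mlp1(coords)
        feat, density = h[:, :-1], torch.relu(h[:, -1])                     # density is non-negative
        color = torch.sigmoid(self.mlp2(torch.cat([pose, feat], dim=-1)))  # RGB in [0, 1]
        return density, color

def render_ray(density, color, delta=0.1):
    """Very simplified volume rendering (alpha compositing) of N samples on one ray."""
    alpha = 1.0 - torch.exp(-density * delta)                               # per-sample opacity
    trans = torch.cumprod(torch.cat([torch.ones(1), 1.0 - alpha + 1e-10])[:-1], dim=0)
    weights = alpha * trans                                                 # contribution of each sample
    return (weights.unsqueeze(-1) * color).sum(dim=0)                       # final RGB of the ray
```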
不限于上述实施方式中的图片生成模块200,在另一种实施方式中,时间感知模块201和空间感知模块202可以集成在一起,具体可参见图12所示的电子设备100的架构。It is not limited to the picture generation module 200 in the above embodiment. In another embodiment, the time perception module 201 and the space perception module 202 can be integrated together. For details, see the architecture of the electronic device 100 shown in FIG. 12 .
如图12所示,电子设备100的图片生成模块200可以包括模型训练模块203、参数提取模块204、时空感知模型205和新参数生成模块206。的图片生成模块200可以接收图片库700中的图片序列701作为输入,根据图片序列701在时间维度和空间维度上生成新的时间戳、新的观察视角的图片序列702,输出到图片库700中。该过程可以包括在线训练和图片生成两个步骤,具体如下所示。As shown in FIG. 12 , the picture generation module 200 of the electronic device 100 may include a model training module 203 , a parameter extraction module 204 , a spatiotemporal perception model 205 and a new parameter generation module 206 . The picture generation module 200 can receive the picture sequence 701 in the picture library 700 as input, generate a new timestamp and a new observation perspective picture sequence 702 in the time dimension and the spatial dimension according to the picture sequence 701, and output it to the picture library 700 . This process can include two steps: online training and image generation, as shown below.
Online training: First, the parameter extraction module 204 may receive the picture sequence 701 as input and output the spatial parameters and time parameters of each picture in the picture sequence 701. The spatial parameters include, for example but not limited to, spatial coordinates and the camera pose. The time parameters include, for example but not limited to, a timestamp and/or a time embedding; the time embedding of any picture may be determined from the timestamp of that picture, for example, a Fourier transform is applied to the timestamp to obtain a high-dimensional vector (for example, a 128-dimensional vector), and this high-dimensional vector is the determined time embedding. Then, the spatio-temporal perception model 205 may receive the spatial parameters and time parameters output by the parameter extraction module 204 and output a picture sequence 708 corresponding to these spatial and time parameters respectively, where, for any picture in the picture sequence 701 (which may be called picture D), the spatio-temporal perception model 205 may receive spatial parameter 3 and time parameter 1 of picture D as input and output picture E in the picture sequence 708; picture E can be understood as the picture "simulated" by the spatio-temporal perception model 205 for spatial parameter 3 and time parameter 1. Finally, the model training module 203 may receive the picture sequence 701 and the picture sequence 708 output by the spatio-temporal perception model 205 as input, compare each picture in the picture sequence 701 with the corresponding picture in the picture sequence 708 based on a loss function, and train the spatio-temporal perception model 205 according to the comparison result to obtain an updated spatio-temporal perception model 205 (for example, specifically obtain the weights of the model). The above process may be called one training pass. Multiple training passes may be performed to obtain a spatio-temporal perception model 205 updated multiple times; the specific example is similar to the online-training example described for Figure 10 and is not repeated here. The spatio-temporal perception model 205 updated multiple times is used to perform the picture generation step described below.
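For illustration, a minimal Fourier-feature time embedding is sketched below; the 128-dimensional size mirrors the example above, while the geometric frequency schedule is an assumption made for the sketch.

```python
import numpy as np

def time_embedding(timestamp, dim=128, base=2.0):
    """Toy Fourier time embedding: map a scalar timestamp to a high-dimensional vector."""
    n_freqs = dim // 2
    freqs = base ** np.arange(n_freqs)                         # geometrically spaced frequencies
    angles = timestamp * freqs
    return np.concatenate([np.sin(angles), np.cos(angles)])    # shape (dim,)

print(time_embedding(1.5).shape)   # (128,)
```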
图片生成:新参数生成模块206可以接收参数提取模块204输出的图片序列701中的每张图片的空间参数和时间参数作为输入,输出不同的空间参数和不同的时间参数,例如,新参数生成模块206可以接收参数提取模块204输出的图片序列701中的图片D的空间参数3和时间参数1作为输入,输出不同的空间参数4和不同的时间参数2。多次更新后的时空感知模型205可以接收参数提取模块204和新参数生成模块206输出的空间参数和时间参数作为输入,输出与这些空间参数和时间参数分别对应的图片序列702,例如,多次更新后的时空感知模型205可以接收参数提取模块204输出的图片序列701中的图片D的空间参数3和时间参数1,以及新参数生成模块206输出的时间参数2和空间参数4作为输入,对应可以输出:与空间参数3和时间参数2对应的图片F、与时间参数1和空间参数4对应的图片G、与空间参数4和时间参数2对应的图片H。Picture generation: the new parameter generation module 206 can receive the spatial parameters and time parameters of each picture in the picture sequence 701 output by the parameter extraction module 204 as input, and output different spatial parameters and different time parameters, for example, the new parameter generation module 206 may receive the spatial parameter 3 and the temporal parameter 1 of the picture D in the picture sequence 701 output by the parameter extraction module 204 as input, and output different spatial parameters 4 and different temporal parameters 2. The spatio-temporal perception model 205 after multiple updates can receive the spatial parameters and temporal parameters output by the parameter extraction module 204 and the new parameter generation module 206 as input, and output a picture sequence 702 corresponding to these spatial parameters and temporal parameters respectively, for example, multiple times The updated spatio-temporal perception model 205 can receive the spatial parameter 3 and the temporal parameter 1 of the picture D in the picture sequence 701 output by the parameter extraction module 204, and the temporal parameter 2 and spatial parameter 4 output by the new parameter generation module 206 as input, corresponding to It is possible to output: the picture F corresponding to the spatial parameter 3 and the temporal parameter 2, the picture G corresponding to the temporal parameter 1 and the spatial parameter 4, and the picture H corresponding to the spatial parameter 4 and the temporal parameter 2.
在一种实施方式,图12所示的时空感知模型205可以包括两个独立的MLP(MLP3和MLP4),以及立体渲染模块,具体可参见图13所示的电子设备100的架构。图13以输入时空感知模型205的空间参数包括相机姿势和空间坐标,时间参数包括时间嵌套为例进行说明。In one implementation, the spatio-temporal perception model 205 shown in FIG. 12 may include two independent MLPs (MLP3 and MLP4), and a stereoscopic rendering module. For details, please refer to the architecture of the electronic device 100 shown in FIG. 13 . FIG. 13 takes the spatial parameters input to the spatio-temporal perception model 205 as an example including camera posture and spatial coordinates, and the time parameters including time nesting as an example.
As shown in Figure 13, MLP3 may receive the spatial coordinates in the spatial parameters as input for feature extraction and output intermediate features and a spatial density. MLP4 may receive the time embedding (time parameter), the camera pose in the spatial parameters and the intermediate features output by MLP3 as input for feature extraction and output color information. The volume rendering module may receive the spatial density output by MLP3 and the color information output by MLP4 as input, perform volume rendering, and output a picture corresponding to the above spatial parameters and time parameters. In some examples, learnable parameters are set in MLP3 and/or MLP4, and the above online training may specifically train MLP3 and/or MLP4.
不限于上述实施方式中的图片生成模块200,在另一种实施方式中,图片生成模块200可以包括时间感知模块201或者空间感知模块202,其中,当图片生成模块200仅包括空间感知模块202时,空间感知模块202的输入是图片库700中的图片序列701。It is not limited to the picture generation module 200 in the above embodiment. In another embodiment, the picture generation module 200 may include a time perception module 201 or a space perception module 202, where when the picture generation module 200 only includes the space perception module 202 , the input of the spatial perception module 202 is the picture sequence 701 in the picture library 700.
Based on the captured picture sequence 701, this application can generate a new picture sequence 702 in the time and/or space dimensions; the picture sequence 702 and the picture sequence 701 can together serve as candidate pictures for recommending pictures to the user and for the user's selection, adding differentiated, high-quality candidate pictures within the limited shooting time, increasing the probability that the user can obtain the desired picture and improving the user experience.
接下来示例性介绍图8所示的电子设备100中的图片推荐模块300。Next, the picture recommendation module 300 in the electronic device 100 shown in FIG. 8 is introduced as an example.
请参见图14,图14示例性示出又一种电子设备100的软件架构示意图。Please refer to FIG. 14 , which exemplarily shows a schematic diagram of the software architecture of yet another electronic device 100 .
As shown in Figure 14, the picture recommendation module 300 of the electronic device 100 may include an aesthetic evaluation model 301 and a filtering module 302. The aesthetic evaluation model 301 may be the updated aesthetic evaluation model output by the personalized learning module 600 of the electronic device 100. The aesthetic evaluation model 301 may receive each picture in the picture library 700 as input and output the score of each picture in the picture library 700 in multiple dimensions, taking the comprehensive score, subject position score, motion stretch score, expression score and image quality score as an example. The filtering module 302 may receive the output of the aesthetic evaluation model 301 as input, sort and filter the scores in the multiple dimensions respectively, and obtain the picture sequence 703, namely multiple picture sequences with higher scores in the multiple dimensions respectively: picture sequence 1 whose comprehensive scores rank in the top N1, picture sequence 2 whose subject position scores rank in the top N2, picture sequence 3 whose motion stretch scores rank in the top N3, picture sequence 4 whose expression scores rank in the top N4, and picture sequence 5 whose image quality scores rank in the top N5, where Ni is a positive integer and i is a positive integer less than 6.
This application can perform all-round aesthetic evaluation and filtering of candidate pictures in multiple dimensions such as the comprehensive dimension, subject position, motion stretch, expression and image quality, and recommend to the user the pictures that score higher in each of the multiple dimensions. This can satisfy the different preferences of different users, make picture recommendation more accurate, increase the probability that the user can obtain the desired picture and improve the user experience.
接下来示例性介绍图8所示的电子设备100中的个性化学习模块600。Next, the personalized learning module 600 in the electronic device 100 shown in FIG. 8 is introduced as an example.
请参见图15,图15示例性示出又一种电子设备100的软件架构示意图。Please refer to FIG. 15 , which exemplarily shows a schematic diagram of the software architecture of yet another electronic device 100 .
如图15所示,电子设备100的图片推荐模块300可以包括美学评估模型301和筛选模块302,具体可参见图14的说明。电子设备100的用户选择模块400可以接收筛选模块302输出的图片序列703(包括在多个维度的得分较高的多个图片序列)作为输入,根据接收到的用户操作从图片序列703中选择出图片序列704(作为输出)。As shown in FIG. 15 , the picture recommendation module 300 of the electronic device 100 may include an aesthetic evaluation model 301 and a filtering module 302 . For details, please refer to the description of FIG. 14 . The user selection module 400 of the electronic device 100 can receive the picture sequence 703 output by the filtering module 302 (including multiple picture sequences with higher scores in multiple dimensions) as input, and select from the picture sequence 703 according to the received user operation. Sequence of pictures 704 (as output).
As shown in Figure 15, the personalized learning module 600 of the electronic device 100 may include a picture calibration module 601, a personalized data set 602 and a model training module 603. The picture calibration module 601 may receive, as input, the picture sequence 703 and the corresponding scores output by the filtering module 302, as well as the picture sequence 704 output by the user selection module 400, and set the scores corresponding to the picture sequence 704 according to the picture sequence 703 and the corresponding scores; the picture sequence 704 output by the picture calibration module 601 and the corresponding scores may constitute the personalized data set 602. For an implementation example of the picture calibration module 601, see S106 in Figure 3 and the description of Figure 7. The model training module 603 may receive the personalized data set 602 and the pre-update aesthetic evaluation model 301 as input, use the personalized data set 602 to train the pre-update aesthetic evaluation model 301 and obtain an updated aesthetic evaluation model 301, and the updated aesthetic evaluation model 301 may be output to the picture recommendation module 300.
This application can train the aesthetic evaluation model 301 in the picture recommendation module 300 based on the pictures recommended by default and the pictures selected by the user, that is, perform on-device self-learning, so that the scoring strategy of the aesthetic evaluation model 301 matches the user's habits as closely as possible, realizing personalized picture recommendation, further increasing the probability that the user can obtain the desired picture and improving the user experience.
在一些示例中,不同用户使用的不同电子设备进行各自的端侧自学习后,这两个电子设备获取到同一场景下的第一图片序列(例如拍摄得到同一场景下的多张图片)后,推荐的第三图片序列可以不同,即实现了“千人千面”的图片推荐。In some examples, after different electronic devices used by different users perform their own end-side self-learning, and after the two electronic devices obtain the first sequence of pictures in the same scene (for example, multiple pictures in the same scene are captured), The recommended third picture sequence can be different, that is, the picture recommendation of "thousands of people and thousands of faces" is realized.
下面介绍本申请实施例涉及的应用场景以及该场景下的用户界面实施例。The application scenarios involved in the embodiments of this application and the user interface embodiments in this scenario are introduced below.
图16示例性示出一种图片推荐过程的用户界面示意图。Figure 16 exemplarily shows a schematic diagram of the user interface of a picture recommendation process.
如图16的(A)所示,电子设备100可以显示相机应用的用户界面1000。用户界面1000可以包括取景框1010、拍摄控件1020和缩略图1030,其中,取景框1010用于显示电子设备100通过摄像头实时采集到的图像,拍摄控件1020用于触发通过摄像头拍摄图片,缩略图1030用于显示电子设备100通过摄像头最近一次拍摄到的图片。在一种实施方式中,电子设备100可以响应于针对拍摄控件1020的操作(例如触摸操作,该触摸操作例如为单击操作或者长按操作等),连续拍摄多张图片,即实现图3所示的S101,这多张图片即为第一图片序列。然后,电子设备100可以响应于针对缩略图1030的操作(例如触摸操作,该触摸操作例如为单击操作),显示这多张图片中的任意一张图片,具体可参见图16的(B)所示的用户界面2000,用户界面2000可以包括上述多张图片中的图片2010和控件2020。As shown in (A) of FIG. 16 , the electronic device 100 may display the user interface 1000 of the camera application. The user interface 1000 may include a viewfinder frame 1010, a shooting control 1020, and a thumbnail image 1030. The viewfinder frame 1010 is used to display images collected by the electronic device 100 through a camera in real time, the shooting control 1020 is used to trigger the shooting of pictures through the camera, and the thumbnail image 1030 Used to display the latest picture taken by the electronic device 100 through the camera. In one implementation, the electronic device 100 can continuously capture multiple pictures in response to an operation on the shooting control 1020 (such as a touch operation, such as a click operation or a long press operation, etc.), that is, to achieve what is shown in FIG. 3 As shown in S101, these multiple pictures are the first picture sequence. Then, the electronic device 100 may display any one of the plurality of pictures in response to an operation (such as a touch operation, such as a click operation) on the thumbnail 1030. For details, see (B) of FIG. 16 As shown in the user interface 2000, the user interface 2000 may include a picture 2010 and a control 2020 among the plurality of pictures mentioned above.
In one implementation, in response to an operation on the control 2020 (for example, a touch operation such as a tap operation), the electronic device 100 may recommend pictures to the user from multiple dimensions such as comprehensive recommendation, subject position, movement stretch, expression and image quality based on the above continuously captured pictures, and display the recommended pictures and other pictures, that is, implement S102-S104 shown in FIG. 3. The recommended pictures constitute the third picture sequence, and the other pictures include at least one picture in the first picture sequence and the second picture sequence other than the third picture sequence. For details, refer to the user interface 3000 shown in (C) of FIG. 16.
As shown in (C) of FIG. 16, the user interface 3000 may include a return control 3010, prompt information 3020 and a save control 3030. The return control 3010 is used to return to the previous interface. The save control 3030 is used to save the pictures selected by the user. The prompt information 3020 is used to indicate the number of candidate pictures and the number of pictures selected by the user; for example, it may include the characters "Select photos 0/30", indicating that the number of candidate pictures is 30 and the number of pictures selected by the user is 0. In some examples, the candidate pictures include the first picture sequence and the second picture sequence; for example, the number of pictures continuously captured by the electronic device 100 (that is, the first picture sequence) is 10, and the number of pictures in the second picture sequence generated from these pictures in the temporal and/or spatial dimension is 20, so the number of candidate pictures is 30. In some examples, the folder used to store pictures in the electronic device 100 may include information about the first picture sequence and the second picture sequence; for example, the storage locations of the first picture sequence and the second picture sequence are different (the second picture sequence is, for example, stored in a newly created temporary cache area), and the attributes of the first picture sequence and the second picture sequence may be different, including, but not limited to, different generation times and different carried tags. In some examples, after the electronic device 100 captures the first picture sequence and before the above operation on the control 2020 is received, the folder used to store pictures in the electronic device 100 may include only the first picture sequence; after the operation on the control 2020 is received, the folder may further include the second picture sequence.
The user interface 3000 further includes a recommendation dimension area 3040, a picture list 3050 and a display box 3060. The recommendation dimension area 3040 may include multiple dimensions such as comprehensive recommendation 3040A, subject position 3040B, movement stretch 3040C, expression 3040D and image quality 3040E. The electronic device 100 may set any one of these dimensions to a selected state in response to an operation on that dimension (for example, a touch operation such as a tap operation); for example, the comprehensive recommendation 3040A is currently selected. The picture list 3050 is used to display the recommended pictures and other pictures under the dimension currently selected in the recommendation dimension area 3040 (currently the comprehensive recommendation 3040A). The recommended pictures include the picture 3051 displaying a recommendation mark 3051A and the picture 3052 displaying a recommendation mark 3052A, and the other pictures include the picture 3053 and the picture 3054. The pictures in the picture list 3050 may be displayed from front to back in descending order of their scores in the corresponding dimension (currently the comprehensive score); that is, in descending order of comprehensive score, the pictures in the picture list 3050 are the picture 3051, the picture 3052, the picture 3053 and the picture 3054. In some examples, the electronic device 100 may display other pictures in the picture list 3050 in response to an operation on the picture list 3050 (for example, a touch operation such as a right-to-left sliding operation). The picture list 3050 further includes a control 3055, and the display box 3060 is used to display the picture that the control 3055 points to; for example, the control 3055 currently points to the picture 3051, so the display box 3060 displays an enlarged picture 3051.
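As an illustration of the ordering behavior just described, the following hypothetical sketch stores one score per dimension for each candidate, sorts the list in descending order for the currently selected dimension, and flags the top-ranked entries as recommended. The data structure, function names and example scores are assumptions for illustration only.

    # Hypothetical ordering of the picture list for a selected dimension.
    from dataclasses import dataclass, field
    from typing import Dict, List, Tuple

    @dataclass
    class Candidate:
        picture_id: str
        scores: Dict[str, float] = field(default_factory=dict)  # per-dimension scores

    def rank_for_dimension(candidates: List[Candidate], dimension: str,
                           top_n: int) -> List[Tuple[Candidate, bool]]:
        # Sort by the selected dimension's score, highest first, and mark the
        # first top_n entries as recommended (recommendation mark in the UI).
        ordered = sorted(candidates, key=lambda c: c.scores.get(dimension, 0.0),
                         reverse=True)
        return [(c, i < top_n) for i, c in enumerate(ordered)]

    # Usage example with made-up scores:
    cands = [Candidate("3051", {"comprehensive": 0.92}),
             Candidate("3052", {"comprehensive": 0.88}),
             Candidate("3053", {"comprehensive": 0.71}),
             Candidate("3054", {"comprehensive": 0.65})]
    for cand, recommended in rank_for_dimension(cands, "comprehensive", top_n=2):
        print(cand.picture_id, "recommended" if recommended else "other")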
In one implementation, when the electronic device 100 displays the recommended pictures and other pictures, it may set any picture to a selected state in response to an operation on that picture (for example, a touch operation such as a tap operation), that is, implement S105 shown in FIG. 3. For example, the electronic device 100 may set the picture 3054 to the selected state in response to an operation on the picture 3054 in the user interface 3000 shown in (C) of FIG. 16; for details, refer to the user interface 4000 shown in FIG. 17.
As shown in FIG. 17, the user interface 4000 is similar to the user interface 3000, except that information 4010 is displayed on the picture 3054 in the picture list 3050. The information 4010 includes the character "1", indicating that the picture 3054 is the first picture selected by the user and/or the picture ranked first in the user's selection priority. In addition, the control 3055 currently points to the picture 3054, and accordingly the display box 3060 displays an enlarged picture 3054. Since the user has now selected one picture, the prompt information 3020 may include the characters "Select photos 1/30".
In one implementation, the electronic device 100 may display the recommended pictures and other pictures under another dimension of the recommendation dimension area 3040 in response to an operation (for example, a tap operation) on that dimension. For example, following the implementation shown in FIG. 17, the electronic device 100 may display the pictures recommended based on the subject position 3040B and other pictures in response to an operation on the subject position 3040B in the recommendation dimension area 3040 included in the user interface 4000 shown in FIG. 17; for details, refer to the user interface 5000 shown in FIG. 18.
As shown in FIG. 18, the user interface 5000 is similar to the user interface 3000, except that the subject position 3040B in the recommendation dimension area 3040 is in the selected state. Therefore, the user interface 5000 includes a picture list 5010, which displays the recommended pictures under the subject position 3040B (namely the picture 5011 and the picture 5012) and other pictures (namely the picture 5013 and the picture 5014). In descending order of subject position score, the pictures in the picture list 5010 are the picture 5011, the picture 5012, the picture 5013 and the picture 5014. The electronic device 100 may set the picture 5014 to a selected state in response to an operation on the picture 5014 (for example, a touch operation such as a tap operation); therefore, information 5020 is displayed on the picture 5014 in the user interface 5000. The information 5020 includes the character "2", indicating that the picture 5014 is the second picture selected by the user and/or the picture ranked second in the user's selection priority. In addition, the control 3055 currently points to the picture 5014, and accordingly the display box 3060 displays an enlarged picture 5014. Since the user has now selected two pictures, the prompt information 3020 may include the characters "Select photos 2/30".
In one implementation, in the embodiment shown in FIG. 18, the electronic device 100 may, in response to an operation on the save control 3030 in the user interface 5000 shown in FIG. 18, save the picture 3054 and the picture 5014 selected by the user and delete the other candidate pictures. In some examples, the electronic device 100 may implement S106 shown in FIG. 3 based on the picture 3054 and the picture 5014 selected by the user.
Not limited to the above implementations, in other implementations the electronic device 100 may display the user interface 3000 shown in (C) of FIG. 16 directly in response to an operation on the shooting control 1020 in the user interface 1000 shown in FIG. 16. In still other implementations, after receiving an operation on the shooting control 1020 in the user interface 1000 shown in FIG. 16, the electronic device 100 may display the user interface 3000 shown in (C) of FIG. 16 in response to an operation on the thumbnail 1030 in the user interface 1000.
In another implementation, the user may also select multiple pictures in the gallery, and the electronic device 100 may perform picture recommendation based on these pictures, that is, implement the method shown in FIG. 3, with the first picture sequence being these pictures. For a specific example, refer to the user interface 6000 shown in FIG. 19. The user interface 6000 may be a user interface of a gallery application. The user interface 6000 may include prompt information 6010, a picture list 6020 and a function list 6030. The picture list 6020 may include multiple pictures, such as, but not limited to, a picture 6021, a picture 6022, a picture 6023, a picture 6024, a picture 6025 and a picture 6026. Taking the picture 6021 as an example, a selection control 6021A is also displayed on the picture 6021; the selection control 6021A is used to select or deselect the picture 6021, and the selection control 6021A in the user interface 6000 indicates that the picture 6021 has been selected. Similarly, the picture 6022, the picture 6023 and the picture 6025 are all selected. The prompt information 6010 is used to indicate the number of selected pictures; for example, if 4 pictures are currently selected, the prompt information 6010 includes the characters "4 items selected". The function list 6030 may include controls for multiple functions, such as, but not limited to, a control for a sharing function, a control for a deleting function, a control for a select-all function, a control 6031 for a recommendation function and a control for more functions. In response to an operation on the control 6031 (for example, a touch operation such as a tap operation), the electronic device 100 may use the picture 6021, the picture 6022, the picture 6023 and the picture 6025 selected by the user as the first picture sequence to implement the method shown in FIG. 3; for the user interface displaying the third picture sequence, refer to the user interface 3000 shown in (C) of FIG. 16. In some examples, the picture 3051 and the picture 3052 in the picture list 3050 shown in the user interface 3000 are the above picture 6021 and picture 6025, whereas the picture 3053 and the picture 3054 in the picture list 3050 do not belong to the first picture sequence, that is, they belong to the second picture sequence.
Not limited to the above implementations, in another implementation the electronic device 100 may receive a dimension 1 entered by the user on a settings interface and determine that the user prefers dimension 1. Then, after continuously capturing multiple pictures, the electronic device 100 may recommend pictures to the user from dimension 1 based on these pictures. In some examples, the electronic device 100 may automatically save the pictures with higher scores in dimension 1 and delete the other pictures. For example, if dimension 1 is the comprehensive recommendation 3040A in the user interface 3000 shown in (C) of FIG. 16, then in response to an operation on the shooting control 1020 in the user interface 1000 shown in (A) of FIG. 16, the electronic device 100 continuously captures multiple pictures and automatically saves the pictures with higher comprehensive scores among them, that is, the picture 3051 and the picture 3052 in the user interface 3000 shown in (C) of FIG. 16, while deleting other pictures such as the picture 3053 and the picture 3054. Not limited to this, in another implementation the user does not need to manually input dimension 1; the electronic device 100 may learn the dimension that the user prefers by itself. For example, if most of the pictures in the gallery are those in which the expression of the photographed subject is good, the electronic device 100 may determine the user's preferred dimension 1 based on the pictures in the gallery.
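A possible way to realize the self-learned preference mentioned above is sketched below: the device scores every gallery picture along each candidate dimension and treats the dimension with the highest average score as the user's preferred dimension 1. The dimension names and the score_fn interface are assumptions; in practice the scores would come from the aesthetic evaluation model or equivalent per-dimension scorers.

    # Hypothetical inference of the user's preferred dimension from the gallery.
    from statistics import mean
    from typing import Callable, List

    DIMENSIONS = ["comprehensive", "subject_position", "movement_stretch",
                  "expression", "image_quality"]

    def infer_preferred_dimension(gallery_pictures: List[object],
                                  score_fn: Callable[[object, str], float]) -> str:
        # score_fn(picture, dimension) -> score in [0, 1]; assumed to be backed
        # by per-dimension scoring of each stored picture.
        averages = {dim: mean(score_fn(p, dim) for p in gallery_pictures)
                    for dim in DIMENSIONS}
        # The dimension on which kept pictures score best on average is taken
        # as the user's preferred dimension 1.
        return max(averages, key=averages.get)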
The methods provided in the embodiments of this application may be implemented in whole or in part by software, hardware, firmware or any combination thereof. When software is used for implementation, the methods may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of this application are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, a network device, user equipment or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from a website, computer, server or data center to another website, computer, server or data center in a wired manner (for example, over a coaxial cable, an optical fiber or a digital subscriber line (DSL)) or a wireless manner (for example, infrared, radio or microwave). The computer-readable storage medium may be any usable medium accessible to a computer, or a data storage device such as a server or a data center integrating one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk or a magnetic tape), an optical medium (for example, a digital video disc (DVD)) or a semiconductor medium (for example, a solid state disk (SSD)). The above embodiments are merely intended to describe the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that modifications may still be made to the technical solutions recorded in the foregoing embodiments, or equivalent replacements may be made to some of the technical features therein; such modifications or replacements do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of this application.

Claims (11)

  1. A picture recommendation method, characterized in that the method is applied to an electronic device and comprises:
    displaying an image collection interface;
    in response to a first operation on an image collection button of the image collection interface, collecting a first picture sequence through an image collection apparatus;
    generating a second picture sequence based on the first picture sequence, the second picture sequence comprising pictures whose timestamps are different from the timestamps of the pictures in the first picture sequence, and/or pictures whose viewing angles are different from the viewing angles of the pictures in the first picture sequence;
    determining a third picture sequence from the first picture sequence and the second picture sequence, the third picture sequence comprising N pictures whose scores in a first dimension rank in the top N and M pictures whose scores in a second dimension rank in the top M, N and M being positive integers; and
    recommending the third picture sequence.
  2. The method according to claim 1, characterized in that the first dimension or the second dimension is any one of the following: a comprehensive dimension, a position of a photographed subject in a picture, a movement stretch of the photographed subject in a picture, an expression of the photographed subject in a picture, or an image quality of a picture.
  3. The method according to claim 1 or 2, characterized in that the recommending the third picture sequence comprises:
    displaying a first interface, wherein the first interface displays first information, second information, the N pictures and the M pictures, the first information indicates the first dimension and is associated with the N pictures, and the second information indicates the second dimension and is associated with the M pictures.
  4. The method according to any one of claims 1-3, characterized in that the recommending the third picture sequence comprises:
    displaying a second interface, wherein the second interface displays K pictures, K is a positive integer greater than or equal to N, the K pictures comprise the N pictures and (K-N) pictures other than the N pictures, the (K-N) pictures belong to the first picture sequence and/or the second picture sequence, the K pictures comprise a first picture and a second picture, a score of the first picture in the first dimension is greater than a score of the second picture in the first dimension, and the first picture is displayed before the second picture in the second interface.
  5. The method according to any one of claims 1-4, characterized in that the method further comprises:
    receiving a second operation for selecting at least one picture, the at least one picture belonging to the first picture sequence and/or the second picture sequence; and
    saving the at least one picture, and deleting pictures other than the at least one picture in the first picture sequence and the second picture sequence.
  6. The method according to any one of claims 1-5, characterized in that the third picture sequence is obtained according to a first policy, and the method further comprises:
    receiving a second operation for selecting at least one picture, the at least one picture belonging to the first picture sequence and/or the second picture sequence; and
    updating the first policy according to the third picture sequence and the at least one picture.
  7. The method according to any one of claims 1-6, characterized in that the generating a second picture sequence based on the first picture sequence comprises:
    generating a fourth picture sequence based on the first picture sequence, wherein timestamps of pictures in the fourth picture sequence are different from the timestamps of the pictures in the first picture sequence; and
    generating a fifth picture sequence based on the first picture sequence and the fourth picture sequence, wherein viewing angles of pictures in the fifth picture sequence are different from the viewing angles of the pictures in the first picture sequence and the fourth picture sequence, and the second picture sequence comprises the fourth picture sequence and the fifth picture sequence.
  8. The method according to claim 7, characterized in that the generating a fifth picture sequence based on the first picture sequence and the fourth picture sequence comprises:
    training a spatial perception model based on the first picture sequence and the fourth picture sequence;
    obtaining a first spatial parameter, wherein the first spatial parameter is different from spatial parameters of the pictures in the first picture sequence and the second picture sequence; and
    using the first spatial parameter as an input of the spatial perception model to obtain an output, wherein the output is the fifth picture sequence.
  9. The method according to any one of claims 1-8, characterized in that the generating a second picture sequence based on the first picture sequence comprises:
    training a spatio-temporal perception model based on the first picture sequence;
    obtaining a second spatial parameter and a first temporal parameter, wherein the second spatial parameter comprises a spatial parameter different from the spatial parameters of the pictures in the first picture sequence, and the first temporal parameter comprises a temporal parameter different from the temporal parameters of the pictures in the first picture sequence; and
    using the second spatial parameter and the first temporal parameter as inputs of the spatio-temporal perception model to obtain an output, wherein the output is the second picture sequence.
  10. An electronic device, characterized by comprising a transceiver, a processor and a memory, wherein the memory is configured to store a computer program, and the processor invokes the computer program to perform the method according to any one of claims 1-9.
  11. A computer storage medium, characterized in that the computer storage medium stores a computer program, and when the computer program is executed by a processor, the method according to any one of claims 1-9 is implemented.
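For illustration of the generation steps claimed above (claims 7 and 9: synthesizing pictures at timestamps or viewing angles not present in the captured sequence), the following toy sketch produces a frame for a new timestamp by blending the two captured frames nearest in time. It is only an assumed stand-in for the interface of a spatial or spatio-temporal perception model; the claimed method would instead use a trained model as recited, and all names here are illustrative.

    # Toy stand-in for "known frames and timestamps in, novel-timestamp frame out".
    from typing import List
    import numpy as np

    def synthesize_at_time(frames: List[np.ndarray], times: List[float],
                           t_new: float) -> np.ndarray:
        # Order frames by timestamp, then linearly blend the two neighbors of t_new.
        order = np.argsort(times)
        ts = np.array(times)[order]
        fs = [frames[i] for i in order]
        if t_new <= ts[0]:
            return fs[0].copy()
        if t_new >= ts[-1]:
            return fs[-1].copy()
        j = int(np.searchsorted(ts, t_new))
        w = (t_new - ts[j - 1]) / (ts[j] - ts[j - 1])
        blended = (1.0 - w) * fs[j - 1].astype(np.float32) + w * fs[j].astype(np.float32)
        return blended.astype(fs[0].dtype)

    # Usage: two captured frames at t=0.0 and t=1.0, one synthesized frame at t=0.5.
    a = np.zeros((4, 4, 3), dtype=np.uint8)
    b = np.full((4, 4, 3), 200, dtype=np.uint8)
    mid_frame = synthesize_at_time([a, b], [0.0, 1.0], 0.5)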
PCT/CN2023/114053 2022-08-30 2023-08-21 Image recommendation method and electronic device WO2024046162A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211049479.9 2022-08-30
CN202211049479.9A CN117688195A (en) 2022-08-30 2022-08-30 Picture recommendation method and electronic equipment

Publications (1)

Publication Number Publication Date
WO2024046162A1 true WO2024046162A1 (en) 2024-03-07

Family

ID=90100406

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/114053 WO2024046162A1 (en) 2022-08-30 2023-08-21 Image recommendation method and electronic device

Country Status (2)

Country Link
CN (1) CN117688195A (en)
WO (1) WO2024046162A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109495686A (en) * 2018-12-11 2019-03-19 上海掌门科技有限公司 Image pickup method and equipment
CN112425156A (en) * 2019-01-31 2021-02-26 华为技术有限公司 Method for selecting images based on continuous shooting and electronic equipment
WO2021103919A1 (en) * 2019-11-28 2021-06-03 荣耀终端有限公司 Composition recommendation method and electronic device
US20210182610A1 (en) * 2019-12-16 2021-06-17 Canon Kabushiki Kaisha Image capturing apparatus, generating apparatus, control method, and storage medium
CN113239220A (en) * 2021-05-26 2021-08-10 Oppo广东移动通信有限公司 Image recommendation method and device, terminal and readable storage medium

Also Published As

Publication number Publication date
CN117688195A (en) 2024-03-12

Similar Documents

Publication Publication Date Title
CN109495688B (en) Photographing preview method of electronic equipment, graphical user interface and electronic equipment
CN112887583B (en) Shooting method and electronic equipment
WO2021179773A1 (en) Image processing method and device
CN111327814A (en) Image processing method and electronic equipment
WO2021169394A1 (en) Depth-based human body image beautification method and electronic device
CN112262563B (en) Image processing method and electronic device
WO2021129198A1 (en) Method for photography in long-focal-length scenario, and terminal
US20210358523A1 (en) Image processing method and image processing apparatus
WO2021052111A1 (en) Image processing method and electronic device
WO2021013132A1 (en) Input method and electronic device
WO2022017261A1 (en) Image synthesis method and electronic device
WO2021078001A1 (en) Image enhancement method and apparatus
US20220343648A1 (en) Image selection method and electronic device
CN113170037B (en) Method for shooting long exposure image and electronic equipment
WO2023284715A1 (en) Object reconstruction method and related device
WO2022007707A1 (en) Home device control method, terminal device, and computer-readable storage medium
US20230056332A1 (en) Image Processing Method and Related Apparatus
CN113099146A (en) Video generation method and device and related equipment
CN115967851A (en) Quick photographing method, electronic device and computer readable storage medium
CN115115679A (en) Image registration method and related equipment
WO2022012418A1 (en) Photographing method and electronic device
WO2022057384A1 (en) Photographing method and device
CN113542574A (en) Shooting preview method under zooming, terminal, storage medium and electronic equipment
CN114283195B (en) Method for generating dynamic image, electronic device and readable storage medium
WO2021204103A1 (en) Picture preview method, electronic device, and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23859196

Country of ref document: EP

Kind code of ref document: A1