WO2021169394A1

WO2021169394A1 - Depth-based human body image beautification method and electronic device

Info

Publication number: WO2021169394A1
Application number: PCT/CN2020/126954
Authority: WO
Inventors: 刘梦莹; 钟顺才; 朱聪超
Original assignee: 荣耀终端有限公司
Priority date: 2020-02-25
Filing date: 2020-11-06
Publication date: 2021-09-02
Also published as: CN113382154A

Abstract

A depth-based human body image beautification method, applied to an electronic device having a display screen and a camera. The method comprises: detecting a first operation of a user; displaying a user interface on the display screen, a preview box of the user interface comprising a first human body image of a person to be photographed, the first human body image comprising a depth image and a color image; determining multiple human body key points in the color image using a preset key point detection model, and determining the position information of the multiple human body key points according to the depth image and the parameters of the camera; determining the body proportion parameter of said person according to the position information of the multiple human body key points; detecting a second operation of the user; and displaying a second human body image of said person in the preview box, the body proportion parameter of said person in the second human body image being adaptively adjusted. The method provided by embodiments of the present application can adaptively perform body shaping on a human body image, thereby bringing a new experience to the user.

Description

Depth-based human body image beautification method and electronic equipment

This application claims priority to Chinese patent application No. 202010117261.7, filed with the Chinese Patent Office on February 25, 2020 and entitled "Depth-based human image beautification method and electronic equipment", the entire content of which is incorporated herein by reference.

Technical field

This application relates to the technical field of electronic equipment, and specifically relates to a depth-based method for beautifying human body images and electronic equipment.

Background

At present, existing human body beautification methods are mainly computer-based, such as the Adobe Photoshop software. The user needs to import a portrait photo into the software, manually mark the various parts of the human body, and then manually adjust how fat or thin each part appears. In this process, the body parts in the portrait photo cannot be detected automatically and accurately and must be marked by hand, which results in inaccurate shaping of those parts; manual adjustment must also be repeated until the result is satisfactory.

The application software of some mobile terminals (such as mobile phones) also implements a body beautification function. An image is collected through the camera, the human body is then detected and the various body parts are estimated, and each body part is beautified and shaped according to the body shaping parameters set by the user. In this process, the user needs to select each body part to be beautified, which can easily unbalance the overall proportions of the human body, for example by lengthening the head, and this spoils the result. Moreover, some software has difficulty achieving the expected effect during adjustment because of preset limitations of the function itself.

Summary of the application

The embodiments of the present application provide a depth-based human body image beautification method and an electronic device. Using key point detection technology, the method can adaptively shape the human body image without repeated manual adjustment, avoid unbalancing the overall proportions of the human body, and bring a new experience to users.

In the first aspect, the present application provides a depth-based method for beautifying human body images, which is applied to an electronic device with a display screen and a camera, and the method includes:

Detecting a first operation used by the user to turn on the camera;

In response to the first operation, displaying a user interface on the display screen, where the user interface includes a preview frame, the preview frame includes a first human body image of the person being photographed, and the first human body image includes a depth image and a color image;

Determining a plurality of human body key points in the color image by using a preset key point detection model, and determining position information of the plurality of human body key points according to the depth image and the parameters of the camera;

Determining the body proportion parameter of the photographed person according to the position information of the plurality of human body key points;

Detecting a second operation used by the user to indicate a body shape template; and

In response to the second operation, displaying a second human body image of the photographed person in the preview frame, where the body proportion parameter of the photographed person in the second human body image has been adaptively adjusted based on the body proportion parameter of the body shape template.
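The key point positioning and body proportion steps of the first aspect can be illustrated with a minimal sketch. It assumes that "the parameters of the camera" are pinhole intrinsics (`fx`, `fy`, `cx`, `cy`), that each key point has a pixel coordinate in the color image and a depth value read from the aligned depth image, and that the body proportion parameter is a leg-to-torso length ratio. The key point names, the intrinsics values, and the ratio itself are hypothetical and are not taken from this application.

```python
import math
from typing import Dict, Tuple

Point3D = Tuple[float, float, float]


def back_project(u: float, v: float, depth: float,
                 fx: float, fy: float, cx: float, cy: float) -> Point3D:
    """Map pixel (u, v) with depth Z to camera-space (X, Y, Z) (pinhole model)."""
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)


def leg_to_torso_ratio(points: Dict[str, Point3D]) -> float:
    """Hypothetical body proportion parameter: leg length / torso length."""
    torso = math.dist(points["neck"], points["hip"])
    leg = (math.dist(points["hip"], points["knee"])
           + math.dist(points["knee"], points["ankle"]))
    return leg / torso


# Assumed intrinsics and key points given as (u, v, depth in metres).
fx = fy = 500.0
cx, cy = 320.0, 240.0
keypoints = {
    "neck": (320, 140, 2.0),
    "hip": (320, 240, 2.0),
    "knee": (320, 340, 2.0),
    "ankle": (320, 440, 2.0),
}
points_3d = {name: back_project(u, v, z, fx, fy, cx, cy)
             for name, (u, v, z) in keypoints.items()}
ratio = leg_to_torso_ratio(points_3d)
print(f"leg/torso ratio: {ratio:.2f}")
```

Measuring bone lengths in 3D rather than in pixels keeps the ratio stable when the person stands at an angle to the camera, which is the usual motivation for combining depth with 2D key points.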

In the second aspect, this application also provides an electronic device, including:

A display screen; a camera; one or more processors; a memory; a plurality of application programs; and one or more computer programs, wherein the one or more computer programs are stored in the memory, the one or more computer programs include instructions, and when the instructions are executed by the device, the device is caused to perform the following steps:

Detecting a first operation used by the user to turn on the camera;

Determining a plurality of human body key points in the color image by using a preset key point detection algorithm, and determining position information of the plurality of human body key points by using the depth image and the parameters of the camera;

In a third aspect, the present application also provides a computer device, including a memory, a processor, and a computer program stored in the memory and capable of being run on the processor. When the processor executes the computer program, the computer device implements the depth-based human body image beautification method.

In a fourth aspect, the present application also provides a computer program product containing instructions that, when the computer program product runs on an electronic device, causes the electronic device to execute the above-mentioned depth-based human body image beautification method.

In a fifth aspect, the present application also provides a computer-readable storage medium, including instructions, which when run on an electronic device, cause the electronic device to execute the above-mentioned depth-based beautification method for human body images.

Description of the drawings

In order to more clearly describe the technical solutions in the embodiments of the present application, the following will briefly introduce the drawings that need to be used in the description of the embodiments. Obviously, the drawings in the following description are only some embodiments of the present application. For those of ordinary skill in the art, other drawings can be obtained from these drawings without creative labor.

FIG. 1A is a schematic diagram of the hardware structure of an electronic device provided by an embodiment of the application;

FIG. 1B is a schematic diagram of the software structure of an electronic device provided by an embodiment of the application;

FIG. 2A is a front view of an electronic device provided by an embodiment of the application;

FIG. 2B is a rear view of the electronic device provided by the embodiment of the application;

FIG. 3A is a schematic diagram of a graphical user interface of an electronic device provided by an embodiment of this application;

FIG. 3B is a schematic diagram of another graphical user interface of an electronic device provided by an embodiment of the application;

FIG. 3C is a schematic diagram of another graphical user interface of an electronic device provided by an embodiment of the application;

FIG. 3D is a schematic diagram of another graphical user interface of an electronic device provided by an embodiment of the application;

FIG. 4 is a schematic diagram of a graphical user interface provided by the prior art;

FIG. 5 is a schematic flowchart of a depth-based human body image beautification method provided by an embodiment of the application;

FIG. 6A is a schematic diagram of pixels in a 2D coordinate space of a color image provided by an embodiment of the application;

FIG. 6B is a schematic diagram of pixels in a 2D coordinate space of a depth image provided by an embodiment of the application;

FIG. 6C is a schematic diagram of pixels in a 3D coordinate space of a color image provided by an embodiment of the application;

FIG. 7 is a schematic diagram of human bone points provided by an embodiment of the application;

FIG. 8 is a schematic diagram of calculating the length of the bone between the bone points according to the depth value and 2D coordinates of the bone points;

FIG. 9A is a schematic diagram of another graphical user interface of an electronic device provided by an embodiment of the application;

FIG. 9B is a schematic diagram of another graphical user interface of an electronic device provided by an embodiment of the application.

Detailed description

In order to better understand the technical solutions of the present application, the following describes the embodiments of the present application in detail with reference to the accompanying drawings.

It should be clear that the described embodiments are only a part of the embodiments of the present application, rather than all of the embodiments. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of this application.

In this application, "at least one" refers to one or more, and "multiple" refers to two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships are possible. For example, A and/or B can mean: A exists alone, A and B exist at the same time, or B exists alone, where A and B can be singular or plural. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship. "At least one of the following items" or a similar expression refers to any combination of these items, including any combination of a single item or a plurality of items. For example, at least one of a, b, or c can mean: a, b, c, ab, ac, bc, or abc, where a, b, and c can be single or multiple.
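As a small illustration of the "at least one of a, b, or c" wording, the following sketch enumerates every non-empty combination of the items. The function name is hypothetical and serves only to make the definition concrete.

```python
from itertools import combinations
from typing import List, Sequence


def at_least_one_of(items: Sequence[str]) -> List[str]:
    """Return every non-empty combination of items, smallest combinations first."""
    return ["".join(combo)
            for size in range(1, len(items) + 1)
            for combo in combinations(items, size)]


print(at_least_one_of(["a", "b", "c"]))
# ['a', 'b', 'c', 'ab', 'ac', 'bc', 'abc']
```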

For ease of understanding, some concepts related to the embodiments of the present application are explained below for reference.

The depth-based human image beautification method provided in the embodiments of the present application can be applied to an electronic device or a separate application program, which can realize the depth-based human image beautification method in the present application automatically after taking a picture. Specifically, the depth-based human body image beautification method provided by the present application can implement real-time adaptive body beautification and body shaping functions for users through key point detection technology and image processing technology, and bring users a brand-new experience.

The depth-based human body image beautification method provided in the embodiments of this application can be applied to electronic devices with camera functions such as mobile phones, tablet computers, and wearable devices. The embodiments of this application do not impose any restrictions on the specific types of electronic devices.

In the following embodiments of this application, the "camera" application of electronic devices such as smart phones can provide a "beauty" function. The "beauty" function can be used to adjust the body image of the person being photographed during photo preview or video preview, so that the body shape represented by the adjusted body image is beautified compared to the actual body shape of the person being photographed. Body beautification can include beautifying the proportions of the body (such as lengthening the legs or widening the shoulders). The adjustment of the body image involved in the "beauty" function can include: determining the target position to which each key point needs to be adjusted, and then using a common image scaling algorithm such as bicubic, bilinear, or nearest-neighbor interpolation to scale the body image between the key points, so that the key points are located at their corresponding target positions after the body image is scaled, thereby beautifying the body proportions.
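The idea of scaling the image between key points so that each key point lands on its target position can be sketched in one dimension. The example below is a simplification that only remaps image rows and uses linear interpolation between neighbouring rows (the vertical half of bilinear interpolation). The function name, the row positions, and the assumption that key points are given as row indices are all hypothetical.

```python
import numpy as np


def remap_rows(image: np.ndarray, src_rows, dst_rows) -> np.ndarray:
    """Piecewise-linearly rescale rows so that src_rows[i] lands on dst_rows[i].

    src_rows and dst_rows must be increasing and should include the first
    and last row so the whole image is covered.
    """
    height = image.shape[0]
    out_rows = np.arange(height, dtype=float)
    # For every output row, find the (fractional) source row that maps to it.
    src_pos = np.interp(out_rows, dst_rows, src_rows)
    lower = np.floor(src_pos).astype(int)
    upper = np.minimum(lower + 1, height - 1)
    # Interpolation weight, shaped to broadcast over width (and channels).
    weight = (src_pos - lower).reshape((-1,) + (1,) * (image.ndim - 1))
    return (1.0 - weight) * image[lower] + weight * image[upper]


# Toy image whose pixel value equals its row index.
img = np.arange(9, dtype=float)[:, None] * np.ones((1, 2))
# Move the key point at row 4 (say, the hip) up to row 2: rows below it stretch.
out = remap_rows(img, src_rows=[0, 4, 8], dst_rows=[0, 2, 8])
print(out[:, 0])
```

After the call, the content that was at row 4 sits at row 2, so the segment below the key point occupies six rows instead of four, which corresponds to lengthening the legs in this toy setting.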

The adjustment of the human body image involved in the "beauty" function can also include: using a common image scaling algorithm such as bicubic, bilinear, or nearest-neighbor interpolation to scale the overall body image of the person being photographed, so as to adjust how fat or thin the body appears or to shape it. For example, image processing for leg slimming may include compressing the image of the leg using an image scaling algorithm, so that the leg shown in the compressed leg image is slimmer than the actual leg of the person being photographed. For another example, image processing for waist shaping may include using an image scaling algorithm to compress the middle part of the waist image and stretch the upper and lower ends of the waist image. The waist shown in the processed waist image is more curved than the actual waist of the person being photographed, and can be an S-shaped waist (thin in the middle). In the following embodiments of the present application, this processing performed on a body image may be referred to as body beauty processing.
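The waist shaping example can be sketched as a row-dependent horizontal scaling. The sketch below assumes a single-channel image, a known waist band (`top` to `bottom`) and body centre column, and a cosine scale profile that stretches the ends of the band and compresses its middle. The profile, the `strength` parameter, and the function name are hypothetical choices; each row is resampled with linear interpolation.

```python
import numpy as np


def shape_waist(image: np.ndarray, top: int, bottom: int,
                center_col: float, strength: float = 0.2) -> np.ndarray:
    """Scale each row of the waist band horizontally about center_col.

    Scale is 1 + strength at the ends of the band (stretch) and
    1 - strength in the middle (compress).
    """
    out = image.astype(float).copy()
    cols = np.arange(image.shape[1], dtype=float)
    for row in range(top, bottom + 1):
        t = (row - top) / max(bottom - top, 1)       # 0 at the top, 1 at the bottom
        scale = 1.0 + strength * np.cos(2.0 * np.pi * t)
        # Inverse mapping: which source column ends up at each output column.
        src_cols = center_col + (cols - center_col) / scale
        out[row] = np.interp(src_cols, cols, image[row].astype(float))
    return out


# Toy silhouette: a 5-pixel-wide "body" centred on column 5.
body = np.zeros((9, 11))
body[:, 3:8] = 1.0
shaped = shape_waist(body, top=2, bottom=6, center_col=5.0)
print(shaped.sum(axis=1))  # silhouette width per row
```

A production implementation would blend the scale smoothly into the rows outside the band and handle color channels; the sketch omits both to keep the mapping visible.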

In the following embodiments of the present application, the "beauty" function can be integrated into the "portrait" photographing function and the video recording function included in the "camera" application. The "beauty" function can also be an independent camera function in the "camera" application. The "portrait" camera function is a camera function intended for cases where the subject is a person; it highlights the person and enhances the beauty of the person in the captured picture. When the electronic device turns on the "portrait" camera function, the electronic device can use a larger aperture to keep the depth of field shallow so as to highlight the person, and can improve the color rendering to optimize the person's skin tone. When it is detected that the ambient light intensity is lower than a certain threshold, the electronic device can also turn on the flash for illumination compensation.

"Camera" is an image capture application on smart phones, tablet computers and other electronic devices. This application does not restrict the name of the application. The "portrait" camera function and video function may be the camera function included in the "camera" application. In addition, the "camera" application can also include a variety of other camera functions. The camera parameters such as aperture size, shutter speed, and sensitivity for different camera functions can be different, and different camera effects can be presented. The camera function can also be called the camera mode, for example, the "portrait" camera function can also be called the "portrait" camera mode.

It is understandable that "beauty" and "portrait" are just some words used in this embodiment, and their meanings have been recorded in this embodiment, and their names do not constitute any limitation on this embodiment. The "beauty" mentioned in the embodiments of this application may also be referred to by other names such as "slimming and shaping" in other embodiments.

First, an exemplary electronic device 100 provided in the following embodiments of the present application is introduced. FIG. 1A shows a schematic structural diagram of the electronic device 100. The electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone jack 170D, a sensor module 180, buttons 190, a motor 191, an indicator 192, a 3D camera module 193, a display screen 194, a subscriber identification module (SIM) card interface 195, etc. The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, etc.

It can be understood that the structure illustrated in the embodiment of the present application does not constitute a specific limitation on the electronic device 100. In other embodiments of the present application, the electronic device 100 may include more or fewer components than those shown in the figure, or combine certain components, or split certain components, or arrange different components. The illustrated components can be implemented in hardware, software, or a combination of software and hardware.

The processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc. The different processing units may be independent devices or may be integrated in one or more processors.

The controller can generate operation control signals according to the instruction operation code and timing signals to complete the control of fetching instructions and executing instructions.

A memory may also be provided in the processor 110 to store instructions and data. In some embodiments, the memory in the processor 110 is a cache memory. The memory can store instructions or data that the processor 110 has just used or uses cyclically. If the processor 110 needs to use the instructions or data again, it can call them directly from the memory. This avoids repeated accesses, reduces the waiting time of the processor 110, and improves the efficiency of the system.

In some embodiments, the processor 110 may include one or more interfaces. The interfaces can include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface, etc.

The I2C interface is a bidirectional synchronous serial bus, including a serial data line (SDA) and a serial clock line (SCL). In some embodiments, the processor 110 may include multiple sets of I2C buses. The processor 110 may be coupled to the touch sensor 180K, charger, flash, 3D camera module 193, etc., respectively through different I2C bus interfaces. For example, the processor 110 may couple the touch sensor 180K through an I2C interface, so that the processor 110 and the touch sensor 180K communicate through the I2C bus interface to implement the touch function of the electronic device 100.

The I2S interface can be used for audio communication. In some embodiments, the processor 110 may include multiple sets of I2S buses. The processor 110 may be coupled with the audio module 170 through an I2S bus to implement communication between the processor 110 and the audio module 170. In some embodiments, the audio module 170 may transmit audio signals to the wireless communication module 160 through an I2S interface, so as to realize the function of answering calls through a Bluetooth headset.

The PCM interface can also be used for audio communication to sample, quantize and encode analog signals. In some embodiments, the audio module 170 and the wireless communication module 160 may be coupled through a PCM bus interface. In some embodiments, the audio module 170 may also transmit audio signals to the wireless communication module 160 through the PCM interface, so as to realize the function of answering calls through the Bluetooth headset. Both the I2S interface and the PCM interface can be used for audio communication.
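The three PCM steps named above (sampling, quantizing, encoding) can be illustrated with a minimal sketch. It assumes 16-bit signed samples stored little-endian, which is one common PCM layout and not something this application specifies; the function name and the test tone are hypothetical.

```python
import math
import struct
from typing import Callable


def pcm_encode(signal: Callable[[float], float],
               sample_rate: int, num_samples: int) -> bytes:
    """Sample signal(t) in [-1, 1], quantize to 16 bits, pack as little-endian."""
    max_code = 2 ** 15 - 1
    codes = []
    for i in range(num_samples):
        value = signal(i / sample_rate)            # sampling
        value = max(-1.0, min(1.0, value))         # clip to the valid range
        codes.append(round(value * max_code))      # quantizing
    return struct.pack(f"<{num_samples}h", *codes) # encoding


# A 1 kHz tone sampled at 8 kHz for one period (8 samples).
tone = lambda t: math.sin(2 * math.pi * 1000 * t)
data = pcm_encode(tone, sample_rate=8000, num_samples=8)
print(len(data), struct.unpack("<8h", data))
```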

The UART interface is a universal serial data bus used for asynchronous communication. The bus can be a two-way communication bus. It converts the data to be transmitted between serial communication and parallel communication. In some embodiments, the UART interface is generally used to connect the processor 110 and the wireless communication module 160. For example, the processor 110 communicates with the Bluetooth module in the wireless communication module 160 through the UART interface to realize the Bluetooth function. In some embodiments, the audio module 170 may transmit audio signals to the wireless communication module 160 through a UART interface, so as to realize the function of playing music through a Bluetooth headset.
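The conversion between parallel and serial data can be illustrated with a sketch of UART framing. It assumes the widely used 8N1 format (one start bit, eight data bits sent least significant bit first, no parity, one stop bit); this application does not state which frame format the electronic device 100 uses, and the function names are hypothetical.

```python
from typing import List


def uart_frame(byte: int) -> List[int]:
    """Parallel-to-serial: one byte becomes start bit, 8 data bits (LSB first), stop bit."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in 0..255")
    data_bits = [(byte >> i) & 1 for i in range(8)]
    return [0] + data_bits + [1]


def uart_unframe(bits: List[int]) -> int:
    """Serial-to-parallel: check start and stop bits, then rebuild the byte."""
    if len(bits) != 10 or bits[0] != 0 or bits[9] != 1:
        raise ValueError("framing error")
    return sum(bit << i for i, bit in enumerate(bits[1:9]))


frame = uart_frame(0x41)  # ASCII 'A'
print(frame, hex(uart_unframe(frame)))
```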

The MIPI interface can be used to connect the processor 110 with the display screen 194, the 3D camera module 193 and other peripheral devices. The MIPI interface includes a camera serial interface (camera serial interface, CSI), a display serial interface (display serial interface, DSI), and so on. In some embodiments, the processor 110 and the 3D camera module 193 communicate through a CSI interface to implement the shooting function of the electronic device 100. The processor 110 and the display screen 194 communicate through a DSI interface to realize the display function of the electronic device 100.

The GPIO interface can be configured through software. The GPIO interface can be configured as a control signal or as a data signal. In some embodiments, the GPIO interface can be used to connect the processor 110 with the 3D camera module 193, the display screen 194, the wireless communication module 160, the audio module 170, the sensor module 180, and so on. The GPIO interface can also be configured as an I2C interface, I2S interface, UART interface, MIPI interface, etc.

The USB interface 130 is an interface that complies with the USB standard specification, and specifically may be a Mini USB interface, a Micro USB interface, a USB Type C interface, and so on. The USB interface 130 can be used to connect a charger to charge the electronic device 100, and can also be used to transfer data between the electronic device 100 and peripheral devices. It can also be used to connect earphones and play audio through earphones. This interface can also be used to connect other electronic devices, such as AR devices.

It can be understood that the interface connection relationship between the modules illustrated in the embodiment of the present application is merely a schematic description, and does not constitute a structural limitation of the electronic device 100. In other embodiments of the present application, the electronic device 100 may also adopt interface connection modes different from those in the foregoing embodiments, or a combination of multiple interface connection modes.

The charging management module 140 is used to receive charging input from the charger. Among them, the charger can be a wireless charger or a wired charger. In some wired charging embodiments, the charging management module 140 may receive the charging input of the wired charger through the USB interface 130. In some embodiments of wireless charging, the charging management module 140 may receive the wireless charging input through the wireless charging coil of the electronic device 100. While the charging management module 140 charges the battery 142, it can also supply power to the electronic device through the power management module 141.

The power management module 141 is used to connect the battery 142, the charging management module 140 and the processor 110. The power management module 141 receives input from the battery 142 and/or the charge management module 140, and supplies power to the processor 110, the internal memory 121, the display screen 194, the 3D camera module 193, and the wireless communication module 160. The power management module 141 can also be used to monitor parameters such as battery capacity, battery cycle times, and battery health status (leakage, impedance). In some other embodiments, the power management module 141 may also be provided in the processor 110. In other embodiments, the power management module 141 and the charging management module 140 may also be provided in the same device.

The wireless communication function of the electronic device 100 can be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor, and the baseband processor.

The antenna 1 and the antenna 2 are used to transmit and receive electromagnetic wave signals. Each antenna in the electronic device 100 can be used to cover a single or multiple communication frequency bands. Different antennas can also be reused to improve antenna utilization. For example: Antenna 1 can be multiplexed as a diversity antenna of a wireless local area network. In other embodiments, the antenna can be used in combination with a tuning switch.

The mobile communication module 150 can provide wireless communication solutions, including 2G/3G/4G/5G, applied to the electronic device 100. The mobile communication module 150 may include at least one filter, a switch, a power amplifier, a low noise amplifier (LNA), and the like. The mobile communication module 150 can receive electromagnetic waves via the antenna 1, filter and amplify the received electromagnetic waves, and transmit them to the modem processor for demodulation. The mobile communication module 150 can also amplify the signal modulated by the modem processor and convert it into electromagnetic waves for radiation via the antenna 1. In some embodiments, at least part of the functional modules of the mobile communication module 150 may be provided in the processor 110. In some embodiments, at least part of the functional modules of the mobile communication module 150 and at least part of the modules of the processor 110 may be provided in the same device.

The modem processor may include a modulator and a demodulator. The modulator is used to modulate the low-frequency baseband signal to be sent into a medium- or high-frequency signal. The demodulator is used to demodulate the received electromagnetic wave signal into a low-frequency baseband signal. The demodulator then transmits the demodulated low-frequency baseband signal to the baseband processor for processing. After the low-frequency baseband signal is processed by the baseband processor, it is passed to the application processor. The application processor outputs a sound signal through an audio device (including but not limited to the speaker 170A and the receiver 170B), or displays an image or video through the display screen 194. In some embodiments, the modem processor may be an independent device. In other embodiments, the modem processor may be independent of the processor 110 and be provided in the same device as the mobile communication module 150 or other functional modules.

The wireless communication module 160 can provide wireless communication solutions applied to the electronic device 100, including wireless local area network (WLAN) (such as a wireless fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR) technology, and the like. The wireless communication module 160 may be one or more devices integrating at least one communication processing module. The wireless communication module 160 receives electromagnetic waves via the antenna 2, frequency modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110. The wireless communication module 160 may also receive a signal to be sent from the processor 110, perform frequency modulation on it, amplify it, and convert it into electromagnetic waves to radiate through the antenna 2.

In some embodiments, the antenna 1 of the electronic device 100 is coupled with the mobile communication module 150, and the antenna 2 is coupled with the wireless communication module 160, so that the electronic device 100 can communicate with the network and other devices through wireless communication technology. The wireless communication technology may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division synchronous code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technology, etc. The GNSS may include the global positioning system (GPS), the global navigation satellite system (GLONASS), the Beidou navigation satellite system (BDS), the quasi-zenith satellite system (QZSS), and/or satellite-based augmentation systems (SBAS).

The electronic device 100 implements a display function through a GPU, a display screen 194, an application processor, and the like. The GPU is an image processing microprocessor, which is connected to the display screen 194 and the application processor. The GPU is used to perform mathematical and geometric calculations and is used for graphics rendering. The processor 110 may include one or more GPUs that execute program instructions to generate or change display information.

The display screen 194 is used to display images, videos, and the like. The display screen 194 includes a display panel. The display panel can use a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), Mini-LED, Micro-LED, Micro-OLED, quantum dot light-emitting diodes (QLED), etc. In some embodiments, the electronic device 100 may include one or N display screens 194, where N is a positive integer greater than one.

The electronic device 100 can implement a shooting function through an ISP, the 3D camera module 193, a video codec, a GPU, a display screen 194, and an application processor.

The 3D camera module 193 can be used to collect color image data and depth data of the subject. The ISP can be used to process the color image data collected by the 3D camera module 193. For example, when taking a picture, the shutter is opened, light is transmitted to the photosensitive element of the camera through the lens, the light signal is converted into an electrical signal, and the photosensitive element of the camera transmits the electrical signal to the ISP for processing, where it is converted into an image visible to the naked eye. The ISP can also optimize the noise, brightness, and skin color of the image. The ISP can also optimize parameters of the shooting scene such as exposure and color temperature. In some embodiments, the ISP may be provided in the 3D camera module 193.

In some embodiments, the 3D camera module 193 may be composed of a color camera module and a 3D sensing module.

In some embodiments, the photosensitive element of the camera of the color camera module may be a charge coupled device (CCD) or a complementary metal-oxide semiconductor (CMOS) phototransistor. The photosensitive element converts the optical signal into an electrical signal and transfers the electrical signal to the ISP, which converts it into a digital image signal. The ISP outputs the digital image signal to the DSP for processing. The DSP converts the digital image signal into an image signal in a standard format such as RGB or YUV.

In some embodiments, the 3D sensing module may be a time-of-flight (TOF) 3D sensing module or a structured light 3D sensing module. Structured light 3D sensing is an active depth sensing technology. The basic components of a structured light 3D sensing module may include an infrared (IR) emitter, an IR camera module, and so on. The structured light 3D sensing module first projects a specific pattern of light spots onto the object to be photographed, then receives the light pattern as coded by the surface of the object, compares it with the originally projected pattern, and uses the triangulation principle to calculate the three-dimensional coordinates of the object. The three-dimensional coordinates include the distance between the electronic device 100 and the object to be photographed. TOF 3D sensing is also an active depth sensing technology. The basic components of a TOF 3D sensing module may include an infrared (IR) emitter, an IR camera module, and so on. The TOF 3D sensing module calculates the distance (that is, the depth) between the module and the object to be photographed from the round-trip time of the infrared light, so as to obtain a 3D depth map.
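The TOF principle described above amounts to halving the path travelled by the infrared light. The following is a minimal sketch in Python, assuming the module reports a per-pixel round-trip time in seconds; the function name `tof_depth_m` is hypothetical and is not defined by this application.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # speed of light in vacuum


def tof_depth_m(round_trip_time_s: float) -> float:
    """Return the depth in metres for one pixel from the infrared round-trip time."""
    if round_trip_time_s < 0:
        raise ValueError("round-trip time must be non-negative")
    # The light travels to the object and back, so the depth is half the path length.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_time_s / 2.0
```

Applying the function to every pixel of the sensor would yield the 3D depth map mentioned above.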

The structured light 3D sensing module can also be used in fields such as face recognition, somatosensory game consoles, and industrial machine vision detection. TOF 3D sensing modules can also be applied to game consoles, augmented reality (AR)/virtual reality (VR) and other fields.

In other embodiments, the camera 193 may also be composed of two or more cameras. The two or more cameras may include a color camera, and the color camera may be used to collect color image data of the photographed object. The two or more cameras can use stereo vision technology to collect depth data of the photographed object. Stereo vision technology is based on the principle of human eye parallax: under natural light, two or more cameras capture images of the same object from different angles, and triangulation and other calculations are then performed to obtain the distance between the electronic device 100 and the photographed object, that is, the depth information.
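For two cameras, the triangulation step can be illustrated with the textbook rectified-stereo relation, depth = focal length × baseline / disparity. The sketch below assumes two parallel, rectified cameras and a disparity already measured in pixels; the function and parameter names are hypothetical, and this application does not specify which stereo algorithm is used.

```python
def stereo_depth_m(focal_length_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth of a point seen by two rectified, parallel cameras.

    focal_length_px: focal length expressed in pixels (assumed equal for both cameras)
    baseline_m:      distance between the two camera centres, in metres
    disparity_px:    horizontal shift of the point between the two images, in pixels
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    # Similar triangles: a nearer object produces a larger disparity.
    return focal_length_px * baseline_m / disparity_px
```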

In some embodiments, the electronic device 100 may include one or N cameras 193, where N is a positive integer greater than one. Specifically, the electronic device 100 may include a front camera 193 and a rear camera 193. The front camera 193 can usually be used to collect color image data and depth data of the photographer facing the display screen 194, and the rear 3D camera module 193 can be used to collect color image data and depth data of the objects (such as people and landscapes) that the photographer faces.

In some embodiments, the CPU, GPU, or NPU in the processor 110 may process the color image data and depth data collected by the 3D camera module 193. In some embodiments, the NPU can use a neural network algorithm based on key point recognition technology, such as a convolutional neural network (CNN) algorithm, to recognize the color image data collected by the 3D camera module 193 (specifically, the color camera module) and thereby determine the key points of the person being photographed. The CPU or GPU can also run neural network algorithms to determine the key points of the person being photographed based on the color image data. In some embodiments, the CPU, GPU, or NPU can also be used to determine the figure of the person being photographed (such as body proportions and the fatness or thinness of body parts between key points), further determine body beautification parameters for that person, and finally process the captured image of that person according to the body beautification parameters, so that the figure of the person in the captured image is beautified. Subsequent embodiments describe in detail how to perform body beautification processing on the image of the photographed person based on the color image data and depth data collected by the 3D camera module 193, which is not repeated here.
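The abstract states that the positions of the key points are determined from the depth image and the camera parameters. One common way to do this is pinhole back-projection, sketched below. The intrinsic parameters `fx`, `fy`, `cx`, `cy` and both function names are assumptions made for illustration; the method actually used by this application is the one described in the subsequent embodiments.

```python
import math
from typing import Tuple

Point3D = Tuple[float, float, float]


def back_project(u: float, v: float, depth_m: float,
                 fx: float, fy: float, cx: float, cy: float) -> Point3D:
    """Map a key point at pixel (u, v) with known depth to camera coordinates (metres)."""
    x = (u - cx) * depth_m / fx  # horizontal offset from the optical axis
    y = (v - cy) * depth_m / fy  # vertical offset from the optical axis
    return (x, y, depth_m)


def segment_length_m(a: Point3D, b: Point3D) -> float:
    """Euclidean distance between two key points, e.g. hip to ankle."""
    return math.dist(a, b)
```

Ratios of such segment lengths (for example, leg length to body height) are one way a body proportion parameter could be expressed.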

Digital signal processors are used to process digital signals. In addition to digital image signals, they can also process other digital signals. For example, when the electronic device 100 selects the frequency point, the digital signal processor is used to perform Fourier transform on the energy of the frequency point.

Video codecs are used to compress or decompress digital video. The electronic device 100 may support one or more video codecs. In this way, the electronic device 100 can play or record videos in multiple encoding formats, such as: moving picture experts group (MPEG) 1, MPEG2, MPEG3, MPEG4, and so on.

NPU is a neural-network (NN) computing processor. By drawing on the structure of biological neural networks, for example, the transfer mode between human brain neurons, it can quickly process input information, and it can also continuously self-learn. Through the NPU, applications such as intelligent cognition of the electronic device 100 can be realized, such as image recognition, face recognition, voice recognition, text understanding, and so on.

The external memory interface 120 may be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device 100. The external memory card communicates with the processor 110 through the external memory interface 120 to realize the data storage function. For example, save music, video and other files in an external memory card.

The internal memory 121 may be used to store computer executable program code, where the executable program code includes instructions. The internal memory 121 may include a storage program area and a storage data area. Among them, the storage program area can store an operating system, an application program (such as a sound playback function, an image playback function, etc.) required by at least one function, and the like. The data storage area can store data (such as audio data, phone book, etc.) created during the use of the electronic device 100. In addition, the internal memory 121 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, a universal flash storage (UFS), and the like. The processor 110 executes various functional applications and data processing of the electronic device 100 by running instructions stored in the internal memory 121 and/or instructions stored in a memory provided in the processor.

The electronic device 100 can implement audio functions through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the earphone interface 170D, and the application processor. For example, music playback, recording, etc.

The audio module 170 is used to convert digital audio information into an analog audio signal for output, and is also used to convert an analog audio input into a digital audio signal. The audio module 170 can also be used to encode and decode audio signals. In some embodiments, the audio module 170 may be provided in the processor 110, or part of the functional modules of the audio module 170 may be provided in the processor 110.

The speaker 170A, also called a "loudspeaker", is used to convert audio electrical signals into sound signals. Through the speaker 170A, the user of the electronic device 100 can listen to music or to a hands-free call.

The receiver 170B, also called an "earpiece", is used to convert audio electrical signals into sound signals. When the electronic device 100 answers a call or plays a voice message, the user can hear the voice by bringing the receiver 170B close to the ear.

The microphone 170C, also called a "mic" or "mouthpiece", is used to convert sound signals into electrical signals. When making a call or sending a voice message, the user can speak with the mouth close to the microphone 170C to input the sound signal into the microphone 170C. The electronic device 100 may be provided with at least one microphone 170C. In other embodiments, the electronic device 100 may be provided with two microphones 170C, which can implement noise reduction in addition to collecting sound signals. In other embodiments, the electronic device 100 may also be provided with three, four, or more microphones 170C to collect sound signals, reduce noise, identify sound sources, and implement directional recording.

The earphone interface 170D is used to connect wired earphones. The earphone interface 170D may be the USB interface 130, a 3.5 mm open mobile terminal platform (OMTP) standard interface, or a cellular telecommunications industry association of the USA (CTIA) standard interface.

The pressure sensor 180A is used to sense the pressure signal and can convert the pressure signal into an electrical signal. In some embodiments, the pressure sensor 180A may be provided on the display screen 194. There are many types of pressure sensors 180A, such as resistive pressure sensors, inductive pressure sensors, capacitive pressure sensors and so on. The capacitive pressure sensor may include at least two parallel plates with conductive materials. When a force is applied to the pressure sensor 180A, the capacitance between the electrodes changes. The electronic device 100 determines the intensity of the pressure according to the change in capacitance. When a touch operation acts on the display screen 194, the electronic device 100 detects the intensity of the touch operation according to the pressure sensor 180A. The electronic device 100 may also calculate the touched position according to the detection signal of the pressure sensor 180A. In some embodiments, touch operations that act on the same touch position but have different touch operation strengths may correspond to different operation instructions. For example: when a touch operation whose intensity of the touch operation is less than the first pressure threshold is applied to the short message application icon, an instruction to view the short message is executed. When a touch operation with a touch operation intensity greater than or equal to the first pressure threshold acts on the short message application icon, an instruction to create a new short message is executed.
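The short message example above can be sketched as a simple threshold comparison. The function name and the returned instruction strings are hypothetical, and the numeric value of the first pressure threshold is not given in this application, so it is passed in as a parameter.

```python
def short_message_icon_instruction(touch_intensity: float,
                                   first_pressure_threshold: float) -> str:
    """Choose the instruction for a touch on the short message application icon."""
    if touch_intensity < first_pressure_threshold:
        # Light press: view the short message.
        return "view_short_message"
    # Press at or above the threshold: create a new short message.
    return "create_short_message"
```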

The gyro sensor 180B may be used to determine the movement posture of the electronic device 100. In some embodiments, the angular velocity of the electronic device 100 around three axes (ie, x, y, and z axes) can be determined by the gyro sensor 180B. The gyro sensor 180B can be used for image stabilization. Exemplarily, when the shutter is pressed, the gyro sensor 180B detects the shake angle of the electronic device 100, calculates the distance that the lens module needs to compensate according to the angle, and allows the lens to counteract the shake of the electronic device 100 through reverse movement to achieve anti-shake. The gyro sensor 180B can also be used for navigation and somatosensory game scenes.
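The anti-shake calculation can be illustrated under a simple pinhole assumption: a rotation by a small angle shifts the image on the sensor by roughly the focal length times the tangent of that angle, and the lens is moved by the same amount in the opposite direction. This model, the function name, and the parameters are assumptions for illustration only; this application does not state how the compensation distance is computed.

```python
import math


def lens_compensation_mm(shake_angle_rad: float, focal_length_mm: float) -> float:
    """Lens shift (mm) that counteracts a shake of the given angle.

    The sign is negative because the lens moves opposite to the image shift.
    """
    image_shift_mm = focal_length_mm * math.tan(shake_angle_rad)
    return -image_shift_mm
```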

The air pressure sensor 180C is used to measure air pressure. In some embodiments, the electronic device 100 uses the air pressure value measured by the air pressure sensor 180C to calculate the altitude to assist positioning and navigation.
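This application does not say which formula converts air pressure to altitude. A widely used approximation is the international barometric formula for the standard atmosphere, sketched below; the function name and the default sea-level pressure of 1013.25 hPa are assumptions.

```python
def altitude_m(pressure_hpa: float, sea_level_pressure_hpa: float = 1013.25) -> float:
    """Approximate altitude in metres from air pressure (standard-atmosphere model)."""
    if pressure_hpa <= 0 or sea_level_pressure_hpa <= 0:
        raise ValueError("pressures must be positive")
    # International barometric formula; valid for the lower atmosphere.
    return 44330.0 * (1.0 - (pressure_hpa / sea_level_pressure_hpa) ** (1.0 / 5.255))
```

A lower measured pressure yields a higher altitude, which is what allows the value to assist positioning and navigation.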

The magnetic sensor 180D includes a Hall sensor. The electronic device 100 may use the magnetic sensor 180D to detect the opening and closing of a flip leather case. In some embodiments, when the electronic device 100 is a flip phone, the electronic device 100 can detect the opening and closing of the flip cover according to the magnetic sensor 180D. Features such as automatic unlocking when the cover is flipped open can then be set according to the detected opening and closing state of the leather case or of the flip cover.

The acceleration sensor 180E can detect the magnitude of the acceleration of the electronic device 100 in various directions (generally three axes). When the electronic device 100 is stationary, the magnitude and direction of gravity can be detected. It can also be used to identify the posture of electronic devices, and apply to applications such as horizontal and vertical screen switching, pedometers, and so on.
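Horizontal and vertical screen switching can be sketched by comparing the gravity components along the two screen axes when the device is stationary. The axis convention (x along the short edge, y along the long edge of the screen) and the function name are assumptions, not details given in this application.

```python
def screen_orientation(ax: float, ay: float) -> str:
    """Classify the screen orientation from the gravity components (m/s^2).

    ax: acceleration along the short edge of the screen (assumed convention)
    ay: acceleration along the long edge of the screen (assumed convention)
    """
    # Gravity mostly along the long edge means the device is held upright.
    return "portrait" if abs(ay) >= abs(ax) else "landscape"
```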

The distance sensor 180F is used to measure distance. The electronic device 100 can measure distance by infrared or laser. In some embodiments, when shooting a scene, the electronic device 100 may use the distance sensor 180F to measure distance to achieve fast focusing.

The proximity light sensor 180G may include, for example, a light emitting diode (LED) and a light detector such as a photodiode. The light emitting diode may be an infrared light emitting diode. The electronic device 100 emits infrared light to the outside through the light emitting diode and uses the photodiode to detect infrared light reflected from nearby objects. When sufficient reflected light is detected, the electronic device 100 can determine that there is an object nearby. When insufficient reflected light is detected, the electronic device 100 can determine that there is no object nearby. The electronic device 100 can use the proximity light sensor 180G to detect that the user is holding the electronic device 100 close to the ear during a call, so as to automatically turn off the screen to save power. The proximity light sensor 180G can also be used in leather case mode and pocket mode to automatically unlock and lock the screen.
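The proximity decision can be sketched as a threshold on the detected reflected light. What counts as "sufficient" is not quantified in this application, so the threshold is a parameter here, and both function names are hypothetical.

```python
def object_is_near(reflected_light: float, threshold: float) -> bool:
    """Return True when enough infrared light is reflected back to the photodiode."""
    return reflected_light >= threshold


def screen_should_turn_off(in_call: bool, reflected_light: float, threshold: float) -> bool:
    """Turn the screen off only while the user is on a call with the device at the ear."""
    return in_call and object_is_near(reflected_light, threshold)
```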

The ambient light sensor 180L is used to sense the brightness of the ambient light. The electronic device 100 can adaptively adjust the brightness of the display screen 194 according to the perceived brightness of the ambient light. The ambient light sensor 180L can also be used to automatically adjust the white balance when taking pictures. The ambient light sensor 180L can also cooperate with the proximity light sensor 180G to detect whether the electronic device 100 is in the pocket to prevent accidental touch.

The fingerprint sensor 180H is used to collect fingerprints. The electronic device 100 can use the collected fingerprint characteristics to implement fingerprint unlocking, application lock access, fingerprint-triggered photographing, fingerprint-based call answering, and so on.

The temperature sensor 180J is used to detect temperature. In some embodiments, the electronic device 100 uses the temperature detected by the temperature sensor 180J to execute a temperature processing strategy. For example, when the temperature reported by the temperature sensor 180J exceeds a threshold value, the electronic device 100 reduces the performance of the processor located near the temperature sensor 180J, so as to reduce power consumption and implement thermal protection. In other embodiments, when the temperature is lower than another threshold, the electronic device 100 heats the battery 142 to avoid abnormal shutdown of the electronic device 100 due to low temperature. In some other embodiments, when the temperature is lower than another threshold, the electronic device 100 boosts the output voltage of the battery 142 to avoid abnormal shutdown caused by low temperature.
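The temperature processing strategy can be sketched as a set of threshold checks. The three thresholds are left unspecified in this application and belong to different embodiments, so they appear here as separate parameters; the function name and the action strings are hypothetical.

```python
from typing import List


def temperature_strategy(temperature_c: float, high_c: float,
                         low_heat_c: float, low_boost_c: float) -> List[str]:
    """Return the actions taken for a reported temperature.

    high_c:      above this, processor performance is reduced (thermal protection)
    low_heat_c:  below this, the battery is heated
    low_boost_c: below this, the battery output voltage is boosted
    """
    actions: List[str] = []
    if temperature_c > high_c:
        actions.append("reduce_processor_performance")
    if temperature_c < low_heat_c:
        actions.append("heat_battery")
    if temperature_c < low_boost_c:
        actions.append("boost_battery_output_voltage")
    return actions
```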

The touch sensor 180K is also called a "touch panel". The touch sensor 180K may be disposed on the display screen 194, and the touch sensor 180K and the display screen 194 together form what is called a "touch screen". The touch sensor 180K is used to detect touch operations acting on or near it. The touch sensor can pass the detected touch operation to the application processor to determine the type of touch event. Visual output related to the touch operation can be provided through the display screen 194. In other embodiments, the touch sensor 180K may also be disposed on the surface of the electronic device 100 at a position different from that of the display screen 194.

The bone conduction sensor 180M can acquire vibration signals. In some embodiments, the bone conduction sensor 180M can obtain the vibration signal of the bone that vibrates when a person speaks. The bone conduction sensor 180M can also contact the human pulse and receive the blood pressure pulse signal. In some embodiments, the bone conduction sensor 180M may also be provided in an earphone to form a bone conduction earphone. The audio module 170 can parse the voice signal from the vibration signal of the vibrating bone obtained by the bone conduction sensor 180M to implement the voice function. The application processor can parse heart rate information from the blood pressure pulse signal obtained by the bone conduction sensor 180M to implement the heart rate detection function.

The button 190 includes a power-on button, a volume button, and so on. The button 190 may be a mechanical button. It can also be a touch button. The electronic device 100 may receive key input, and generate key signal input related to user settings and function control of the electronic device 100.

The motor 191 can generate vibration prompts. The motor 191 can be used for incoming call vibration notification and for touch vibration feedback. For example, touch operations applied to different applications (such as photographing and audio playback) can correspond to different vibration feedback effects. Touch operations acting on different areas of the display screen 194 can also correspond to different vibration feedback effects of the motor 191. Different application scenarios (for example, time reminders, receiving information, alarm clocks, and games) can also correspond to different vibration feedback effects. The touch vibration feedback effect can also be customized.

The indicator 192 may be an indicator light, which may be used to indicate the charging status, power change, or to indicate messages, missed calls, notifications, and so on.

The SIM card interface 195 is used to connect to a SIM card. The SIM card can be inserted into or pulled out of the SIM card interface 195 to bring it into contact with or separate it from the electronic device 100. The electronic device 100 may support one or N SIM card interfaces, where N is a positive integer greater than one. The SIM card interface 195 can support Nano SIM cards, Micro SIM cards, SIM cards, and so on. Multiple cards can be inserted into the same SIM card interface 195 at the same time, and the types of these cards can be the same or different. The SIM card interface 195 can also be compatible with different types of SIM cards and with external memory cards. The electronic device 100 interacts with the network through the SIM card to implement functions such as calls and data communication. In some embodiments, the electronic device 100 adopts an eSIM, that is, an embedded SIM card. The eSIM card can be embedded in the electronic device 100 and cannot be separated from it.

The software system of the electronic device 100 may adopt a layered architecture, an event-driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture. The embodiment of the present application takes an Android system with a layered architecture as an example to illustrate the software structure of the electronic device 100 by way of example.

FIG. 1B is a software structure block diagram of an electronic device 100 according to an embodiment of the present application.

The layered architecture divides the software into several layers, and each layer has a clear role and division of labor. The layers communicate with each other through software interfaces. In some embodiments, the Android system is divided into four layers, which are, from top to bottom, the application layer, the application framework layer, the Android runtime and system library, and the kernel layer.

The application layer can include a series of application packages.

As shown in Figure 1B, the application package may include applications such as camera, gallery, calendar, call, map, navigation, WLAN, Bluetooth, music, video, short message, etc.

The application framework layer provides an application programming interface (application programming interface, API) and a programming framework for applications in the application layer. The application framework layer includes some predefined functions.

As shown in Figure 1B, the application framework layer may include a window manager, a content provider, a view system, a phone manager, a resource manager, a notification manager, and so on.

The window manager is used to manage window programs. The window manager can obtain the size of the display screen, determine whether there is a status bar, lock the screen, take a screenshot, etc.

The content provider is used to store and retrieve data and make these data accessible to applications. The data may include videos, images, audios, phone calls made and received, browsing history and bookmarks, phone book, etc.

The view system includes visual controls, such as controls that display text, controls that display pictures, and so on. The view system can be used to build applications. The display interface can be composed of one or more views. For example, a display interface that includes a short message notification icon may include a view that displays text and a view that displays pictures.

The phone manager is used to provide the communication function of the electronic device 100. For example, the management of the call status (including connecting, hanging up, etc.).

The resource manager provides various resources for the application, such as localized strings, icons, pictures, layout files, video files, and so on.

The notification manager enables an application to display notification information in the status bar. It can be used to convey notification-type messages, which can disappear automatically after a short stay without user interaction. For example, the notification manager is used to notify the user that a download is complete, to give message reminders, and so on. A notification may also appear in the status bar at the top of the system in the form of a chart or scroll-bar text, such as a notification of an application running in the background, or appear on the screen in the form of a dialog window. For example, text information is prompted in the status bar, a prompt sound is played, the electronic device vibrates, or an indicator light flashes.

Android Runtime includes core libraries and virtual machines. Android runtime is responsible for the scheduling and management of the Android system.

The core library consists of two parts: one part is the functions that the Java language needs to call, and the other part is the core library of Android.

The application layer and application framework layer run in a virtual machine. The virtual machine executes the java files of the application layer and the application framework layer as binary files. The virtual machine is used to perform functions such as object life cycle management, stack management, thread management, security and exception management, and garbage collection.

The system library can include multiple functional modules. For example: surface manager (surface manager), media library (Media Libraries), three-dimensional graphics processing library (for example: OpenGL ES), 2D graphics engine (for example: SGL), etc.

The surface manager is used to manage the display subsystem and provides a combination of 2D and 3D layers for multiple applications.

The media library supports playback and recording of a variety of commonly used audio and video formats, as well as still image files. The media library can support multiple audio and video encoding formats, such as: MPEG4, H.264, MP3, AAC, AMR, JPG, PNG, etc.

The 3D graphics processing library is used to implement 3D graphics drawing, image rendering, synthesis, and layer processing.

The 2D graphics engine is a drawing engine for 2D drawing.

The kernel layer is the layer between hardware and software. The kernel layer contains at least display driver, camera driver, audio driver, and sensor driver.

In the following, the workflow of the software and hardware of the electronic device 100 will be exemplified in conjunction with capturing a photo scene.

When the touch sensor 180K receives a touch operation, the corresponding hardware interrupt is sent to the kernel layer. The kernel layer processes touch operations into original input events (including touch coordinates, time stamps of touch operations, etc.). The original input events are stored in the kernel layer. The application framework layer obtains the original input event from the kernel layer and identifies the control corresponding to the input event. Taking the touch operation as a touch click operation, and the control corresponding to the click operation is the control of the camera application icon as an example, the camera application calls the interface of the application framework layer to start the camera application, and then starts the camera driver by calling the kernel layer. The 3D camera module 193 captures still images or videos.
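The workflow above can be sketched as three steps: the kernel layer wraps the touch into an original input event, the application framework layer identifies the control under the touch, and a click on the camera icon starts the camera application and the camera driver. All names below (`RawInputEvent`, `kernel_make_event`, `framework_find_control`, `handle_touch`, and the step strings) are hypothetical stand-ins and are not Android APIs.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Rect = Tuple[int, int, int, int]  # left, top, right, bottom in screen pixels


@dataclass(frozen=True)
class RawInputEvent:
    x: int
    y: int
    timestamp_ms: int


def kernel_make_event(x: int, y: int, timestamp_ms: int) -> RawInputEvent:
    """Kernel layer: turn a touch interrupt into an original input event."""
    return RawInputEvent(x, y, timestamp_ms)


def framework_find_control(event: RawInputEvent, controls: Dict[str, Rect]) -> Optional[str]:
    """Framework layer: identify the control whose bounds contain the touch."""
    for name, (left, top, right, bottom) in controls.items():
        if left <= event.x < right and top <= event.y < bottom:
            return name
    return None


def handle_touch(event: RawInputEvent, controls: Dict[str, Rect]) -> List[str]:
    """Return the steps triggered by the touch, in order."""
    if framework_find_control(event, controls) == "camera_icon":
        return ["start_camera_application", "start_camera_driver", "capture_image"]
    return []
```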

The following describes an exemplary user interface for application menus on the electronic device 100.

FIG. 2A shows an exemplary user interface 21 for an application menu on the electronic device 100. As shown in FIG. 2A, the electronic device 100 may be configured with a 3D camera module 193. In some embodiments, 193-1 may be a color camera, and 193-2 may be a structured light 3D camera module. In other embodiments, 193-1 may be a color camera, and 193-2 may be a TOF 3D camera module. In still other embodiments, 193-1 and 193-2 may be two color cameras. As shown in FIG. 2A, the 3D camera module 193 may be disposed at the top of the electronic device 100, such as the "notch" position of the electronic device 100 (that is, the area AA shown in FIG. 2A). It can be understood that, in addition to the 3D camera module 193, the area AA may also include an illuminator 197 (not shown in FIG. 1A), a speaker 170A, a proximity light sensor 180G, an ambient light sensor 180L, and the like. In some embodiments, as shown in FIG. 2B, a 3D camera module 193 and an illuminator 197 may also be configured on the back of the electronic device 100.

As shown in FIG. 2A, the user interface 21 may include: a status bar 201, a tray 223 with icons of commonly used applications, a calendar indicator 203, a weather indicator 205, a navigation bar 225, and other application icons. Specifically:

The status bar 201 may include: one or more signal strength indicators 201-1 of a mobile communication signal (also called a cellular signal), an indicator 201-2 of an operator of the mobile communication signal, a time indicator 201-3, a battery status indicator 201-4, and so on.

The calendar indicator 203 can be used to indicate the current time, such as date, day of the week, hour and minute information, and so on.

The weather indicator 205 can be used to indicate the type of weather, such as cloudy to clear, light rain, etc., and can also be used to indicate information such as temperature.

The tray 223 with icons of commonly used application programs can display: a phone icon 223-1, a short message icon 223-2, a contact icon 221-4, and so on.

The navigation bar 225 may include system navigation keys such as a return button 225-1, a main interface (Home screen) button 225-3, and a task history button 225-5. When it is detected that the user clicks the return button 225-1, the electronic device 100 may display the page previous to the current page. When it is detected that the user clicks the main interface button 225-3, the electronic device 100 may display the main interface. When it is detected that the user clicks the task history button 225-5, the electronic device 100 may display the tasks recently opened by the user. The navigation keys may also be named differently, which is not limited in this application. Not limited to virtual keys, each navigation key in the navigation bar 225 can also be implemented as a physical key.

Other application icons may include, for example: a WeChat™ icon 211, a QQ™ icon 212, a Twitter™ icon 213, a Facebook™ icon 214, a mailbox icon 215, a cloud sharing icon 216, a memo icon 217, a settings icon 218, a gallery icon 219, and a camera icon 220. The user interface 21 may also include a page indicator 221. The icons of other applications may be distributed on multiple pages, and the page indicator 221 may be used to indicate which page of applications the user is currently browsing. The user can swipe left or right in the area of the other application icons to browse application icons on other pages.

In some embodiments, the user interface 21 exemplarily shown in FIG. 2A may be a main interface (Home screen).

In some other embodiments, the electronic device 100 may also include a home screen key. The home screen key can be a physical key or a virtual key (such as key 225-3). The home screen key can be used to receive an instruction from the user and return the currently displayed UI to the main interface, so that the user can conveniently view the home screen at any time. The instruction can be an operation in which the user presses the home screen key once, an operation in which the user presses the home screen key twice within a short period of time, or an operation in which the user presses and holds the home screen key for a predetermined period of time. In some other embodiments of the present application, the home screen key can also be integrated with a fingerprint recognizer, so that fingerprints are collected and recognized when the home screen key is pressed.

The following describes an application scenario involved in this application: an image shooting scenario.

As shown in FIG. 3A, the electronic device can detect a touch operation acting on the camera icon 220 (such as a click operation on the icon 220), and in response to this operation, it can display the user interface 31 exemplarily shown in FIG. 3B. The user interface 31 may be a user interface of a "camera" application, which the user can use for shooting, such as taking pictures and recording videos. "Camera" is an image capture application on electronic devices such as smart phones and tablet computers. This application does not restrict the name of the application. In other words, the user can click the icon 220 to open the user interface 31 of the "camera". Not limited to this, the user can also open the user interface 31 from other applications; for example, the user clicks the shooting control in "WeChat™" to open the user interface 31. "WeChat™" is a social application that allows users to share photos they have taken with others.

FIG. 3B exemplarily shows a user interface 31 of the "camera" application on an electronic device such as a smart phone.

As shown in FIG. 3B, the user interface 31 may include: an area 301, a shooting mode list 302, a control 303, a control 304, and a control 305. Among them:

The area 301 may be referred to as a preview frame 301. The preview frame 301 can be used to display the color images collected by the 3D camera module 193 in real time. The electronic device can refresh the displayed content in it in real time, so that the user can preview the color image currently collected by the camera 193. Here, the 3D camera module 193 may be a rear camera or a front camera.

One or more shooting mode options may be displayed in the shooting mode list 302. The one or more shooting mode options may include: night scene mode option 302A, portrait mode option 302B, camera mode option 302C, video mode option 302D, and more shooting mode options 302E. The one or more shooting mode options can be expressed as text information on the interface. For example, the night scene mode option 302A, portrait mode option 302B, camera mode option 302C, video mode option 302D, and more shooting mode options 302E can respectively correspond to the text "Night scene", "Portrait", "Photograph", "Video", and "More". Not limited to this, the one or more shooting mode options may also be represented as icons or other forms of interactive elements (IE) on the interface. In some embodiments, the electronic device 100 may select the camera mode option 302C by default, and the display state of the camera mode option 302C (e.g., the camera mode option 302C is highlighted) may indicate that the camera mode option 302C has been selected.

The electronic device 100 can detect a user operation acting on a shooting mode option. The user operation can be used to select a shooting mode, and in response to the operation, the electronic device 100 can start the shooting mode selected by the user. In particular, when the user operation acts on the more shooting mode options 302E, the electronic device 100 may further display other shooting mode options, such as a large aperture shooting mode option and a slow motion shooting mode option, which can show the user richer camera functions. Not limited to what is shown in FIG. 3B, the more shooting mode options 302E may not be displayed in the shooting mode list 302, and the user can browse other shooting mode options by sliding left or right in the shooting mode list 302.

The control 303 can be used to monitor user operations that trigger shooting (photographing or video recording). The electronic device can detect a user operation that acts on the control 303 (such as a click operation on the control 303), and in response to the operation, the electronic device 100 can save the image in the preview box 301. The saved image can be a picture or a video. In addition, the electronic device 100 may also display a thumbnail of the saved image in the control 304. In other words, the user can click the control 303 to trigger the shooting. Among them, the control 303 may be a button or other forms of control. In this application, the control 303 may be referred to as a shooting control.

The control 304 can be used to monitor the user operation that triggers the camera switch. The electronic device 100 can detect a user operation acting on the control 304 (such as a click operation on the control 304), and in response to the operation, the electronic device 100 can switch the camera (such as switching from the rear camera to the front camera, or from the front camera to the rear camera).

The control 305 can be used to monitor the user operation that triggers the opening of the "gallery". The electronic device 100 can detect a user operation acting on the control 305 (such as a click operation on the control 305), and in response to the operation, the electronic device 100 can display a user interface of the "gallery", which can display the pictures saved by the electronic device 100. Here, the "gallery" is a picture management application on electronic devices such as smart phones and tablet computers; it can also be referred to as an "album", and the name of the application is not limited in this embodiment. The "gallery" can support various user operations on pictures stored on the electronic device, such as browsing, editing, deleting, and selecting.

It can be seen that the user interface 31 can show the user a variety of camera functions (modes) provided by the "camera", and the user can choose to turn on the corresponding shooting mode by clicking the shooting mode option.

Based on the above-mentioned image shooting scene, some embodiments of a user interface (UI) implemented on the electronic device 100 are introduced below.

FIG. 3C exemplarily shows the user interface 32 provided by the "portrait" photographing function of the "camera" application.

In the shooting mode list 302, the electronic device 100 can detect a user operation acting on the portrait mode option 302B (such as a click operation on the portrait mode option 302B), and in response to the user operation, the electronic device 100 can turn on the "portrait" photographing function and display the user interface exemplarily shown in FIG. 3C. What it means for the electronic device 100 to turn on the "portrait" photographing function has been explained in the foregoing content, and the details are not repeated here. In this application, the portrait mode option may be referred to as the first shooting mode option.

As shown in FIG. 3C, the user interface 32 includes a preview box 301, a shooting mode list 302, a control 303, a control 304, a control 305, a control 306, and a control 307. Among them: for the preview box 301, the shooting mode list 302, the control 303, the control 304, and the control 305, refer to the related description of the user interface 31, which will not be repeated here. The control 306 can be used to monitor the user operation of opening the light effect template options, and the control 307 can be used to monitor the user operation of opening the character beautification options.

When a user operation acting on the control 306 (such as a click operation on the control 306) is detected, the electronic device 100 may display a variety of light effect template options in the user interface 31. Different light effect templates can represent (or correspond to) different light effect parameters, such as light source position, layer fusion parameters, texture pattern projection position, projection direction, etc. Users can choose different light effect templates to make the photos obtained by shooting show different effects. This application does not limit the interface expression form of the multiple light effect template options in the user interface 31.

When a user operation acting on the control 307 (such as a click operation on the control 307) is detected, the electronic device 100 may display the user interface 33 exemplarily shown in FIG. 3D. FIG. 3D exemplarily shows the user interface provided by the character beautification function. The user interface exemplarily shown in FIG. 3D is introduced in detail in the following content and is not described here.

In some embodiments, in response to a user operation on the portrait mode option 302B, the electronic device 100 may also update the display state of the portrait mode option, and the updated display state may indicate that the portrait mode has been selected.

For example, the updated display state may be that the text information "Portrait" corresponding to the portrait mode option 302B is highlighted. Not limited to this, the updated display state can also take other interface forms: for example, the font of the text information "Portrait" becomes larger, the text information "Portrait" is framed, the text information "Portrait" is underlined, or the color of the option 302B is deepened.

In some embodiments, after the electronic device 100 turns on the "portrait" photographing function, if the electronic device 100 does not detect a person in the color image collected by the 3D camera module 193, it may output prompt information 308 in the preview box 301. The prompt information 308 may be the text "No person detected", which may be used to inform the user that the electronic device 100 has not detected a person.

It can be seen from Figure 3C that the character beautification function can be integrated into the "portrait" camera function. Not limited to this, the character beautification function may also be a camera function in the “camera” application. At this time, the photographing mode list 302 in the user interface 31 may display a character beautification mode option. In response to a user operation acting on the character beautification mode option, the electronic device 100 may display the user interface provided by the character beautification function exemplarily shown in FIG. 3D.

FIG. 3D exemplarily shows the user interface 33 provided by the character beautification function of the "camera" application. As shown in FIG. 3D, the user interface 33 includes a preview box 301, a shooting mode list 302, a control 303, a control 304, a control 305, as well as a skin beautification option 309 and a body beautification option 310. Among them: the preview box 301, the shooting mode list 302, the control 303, the control 304, and the control 305 can refer to the related description in the user interface 31, which will not be repeated here.

The skin beautification option 309 and the body beautification option 310 may be represented as icons on the interface, as shown in FIG. 3D. Not limited to icons, the skin beautification option 309 and the body beautification option 310 can also be expressed as text (such as the text "beauty skin", "beauty body") or other forms of interactive elements (IE) on the interface.

The electronic device 100 can detect a user operation acting on the body beautification option 310 (such as a click operation on the body beautification option 310); the user operation is used to select the body beautification option 310.

In some embodiments, after the electronic device 100 turns on the "Beauty" function, if the electronic device 100 does not detect a person in the color image collected by the 3D camera module 193, the prompt information 308 can be output in the preview box 301. The prompt information 308 can be the text "No person detected", which can be used to inform the user that the electronic device 100 has not detected a person. Specifically, the electronic device 100 may use key point recognition technology to analyze whether the color image collected by the 3D camera module 193 contains key points of the human body. If it does, it is determined that a person is detected; otherwise, it is determined that no person is detected. The specific implementation of determining the key points of the human body based on key point recognition technology is described in detail in the subsequent content and is not expanded here.
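The detection rule above (key points present means a person is detected, otherwise the prompt is shown) can be sketched as follows. This is only an illustrative sketch: the function name `person_prompt` and the `min_keypoints` threshold are hypothetical and are not specified in this application.

```python
from typing import Dict, Optional, Tuple

# Key point index -> (x, y) pixel coordinates in the color image.
Keypoints = Dict[int, Tuple[float, float]]

NO_PERSON_PROMPT = "No person detected"  # text of prompt information 308


def person_prompt(keypoints: Keypoints, min_keypoints: int = 1) -> Optional[str]:
    """Return the prompt text when too few key points were found, else None.

    `min_keypoints` is a hypothetical threshold; the text only states that a
    person is detected when human body key points are present.
    """
    if len(keypoints) >= min_keypoints:
        return None  # a person was detected, no prompt needed
    return NO_PERSON_PROMPT
```

For example, `person_prompt({})` yields the prompt text, while any non-empty key point result suppresses it.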

For ease of understanding, the following embodiments of the present application will take the electronic device (smartphone) having the structure shown in FIG. 1A and FIG. Give specific explanations.

As shown in FIG. 4, the application software of some mobile terminals (such as mobile phones) also implements a body beautification function: an image is collected through the camera, the human body is detected, the body parts are estimated, and body shaping is performed on each body part (such as area A shown in FIG. 4). In this beautification process, the user needs to select each body part to be beautified, which can easily unbalance the overall proportions of the human body (for example, by lengthening the head) and affect the aesthetic result. Moreover, because of preset limitations of the function itself, some software struggles to achieve the expected effect during adjustment.

Further, existing portrait photos lack 3D depth information: a 2D color image is acquired through a monocular camera, so the 3D depth information is lost. The body proportion data before and after beautification therefore cannot be known, and the adjustment is blind.

The embodiments of the present application provide a depth-based method for beautifying human body images. The method can be implemented in an electronic device (such as a mobile phone, a tablet computer, etc.) having a depth camera and an RGB camera. FIG. 5 is a schematic flowchart of a depth-based human body image beautification method provided by an embodiment of the present application. As shown in FIG. 5, the method may include the following steps:

Step S01, detecting the first operation used by the user to turn on the camera;

Step S02, in response to the first operation, display a user interface on the display screen, the user interface including a preview frame, the preview frame including a first human body image of the person being photographed, and the first human body image including a depth image and a color image;

Step S03, using a preset key point detection model to determine multiple key points of the human body in the color image, and using the depth image data and camera parameters to determine position information of the multiple key points of the human body;

Step S04: Determine the body proportion parameter of the photographed person according to the position information of the multiple key points of the human body;

Step S05: a second operation of the user is detected, the second operation being an operation in which the user indicates a body shape template;

Step S06, in response to the second operation, display a second human body image of the photographed person in the preview frame, wherein the body proportion parameter of the photographed person in the second human body image is adaptively adjusted according to the body proportion parameter of the body shape template.

In this solution, the key point detection model is used to identify multiple key points of the human body, the body proportion parameter of the photographed person is determined according to the position information of those key points, and the body proportion parameter of the photographed person is then adaptively adjusted according to the body proportion parameter of the body shape template set by the user, yielding the beautified human body image. In use, the user only needs to select the corresponding body shape template to perform body beautification on the captured human body image. No manual, repeated adjustment is needed, an imbalance in the overall proportions of the human body is avoided, and the user gets a new experience.

The specific technical solutions of the depth-based human body image beautification method provided in this embodiment will be described in detail below.

Step S01, detecting the first operation used by the user to turn on the camera;

Step S02, in response to the first operation, display a user interface on the display screen, the user interface including a preview frame, the preview frame including a first human body image of the person being photographed, and the first human body image including a depth image and a color image.

In an example, the user's shooting behavior may include a first operation of the user to turn on the camera; in response to the first operation, a user interface is displayed on the display screen.

Fig. 3A shows a graphical user interface (GUI) of the mobile phone, and the GUI is the desktop of the mobile phone. When the electronic device detects that the user clicks the icon 220 of the camera application (application, APP) on the desktop, it can start the camera application and display another GUI as shown in FIG. 3B, which may be referred to as the user interface 31. The user interface 31 may include a preview box 301. In the preview state, the preview image can be displayed in the preview frame 301 in real time.

After the electronic device starts the camera, a first human body image may be displayed in the preview frame 301, and the first human body image is a color image. The user interface may also include a control 303 for indicating the photographing mode, and other photographing controls.

Specifically, the electronic device can turn on the 3D camera module, and collect a color image and a depth image through the 3D camera module, and the depth image includes the depth information of the person being photographed. The color image includes the image of the person being photographed (that is, the foreground image) and the background image.

A color image may include multiple pixels, each of which has two-dimensional coordinates and a color value. The color value can be an RGB value or a YUV value. The depth image may also include multiple pixels, each of which has two-dimensional coordinates and a depth value. For a given position on the body of the person being photographed, the color value of the corresponding pixel in the color image represents the color at that position (such as the color of clothing or of bare skin), and the depth value of the corresponding pixel in the depth image represents the vertical distance between that position and the electronic device (specifically, the 3D camera module). For example, as shown in FIGS. 6A-6B, for position A (the left hip point) on the body of the person being photographed, the two-dimensional coordinates of the pixel corresponding to position A in the color image are (x1, y1) and the RGB value of that pixel is (255, 255, 255); the two-dimensional coordinates of the pixel corresponding to position A in the depth image are (x1, y1) and the depth value of that pixel is 350 cm. This means that the color at position A is white, and the vertical distance between position A and the electronic device is 350 cm.

Step S03: Determine a plurality of human body key points in the color image by using a preset key point detection model, and determine the position information of the plurality of human body key points according to the depth image and the parameters of the camera.

Specifically, the electronic device may use the color image of the photographed person and the key point detection model to identify the human body key points of the photographed person. Recognizing the key points of the human body refers to determining the 2D coordinates of the key points.

Among them, the input of the key point detection model may be a color image of the human body, and the output may be the 2D coordinates of the key point of the human body. In this way, the electronic device can specifically take the color image of the captured person as input, and obtain the 2D coordinates of each key point in the color image of the captured person through the recognition of the key point detection model.

As shown in FIG. 7, the key points of the human body include head key point 1, right ear key point 2, left ear key point 3, neck key point 4, right shoulder key point 5, left shoulder key point 6, right chest key point 7, left chest key point 8, right waist key point 9, left waist key point 10, right hip key point 11, left hip key point 12, right knee key point 13, left knee key point 14, right foot key point 15, left foot key point 16, crotch key point 17, right elbow key point 18, right wrist key point 19, left elbow key point 20, and left wrist key point 21.

In an embodiment, the key point detection model may be, for example, an hourglass network model. Specifically, the key point detection model is composed of four densely connected hourglass networks; the key point detection model is trained using the preset training set, and a minimum mean square error loss function is used in the training process to make the hourglass network converge, yielding a trained key point detection model.

Understandably, the hourglass network can effectively detect the key points of the target object. The hourglass network includes an input layer, a convolutional layer, a pooling layer, an up-sampling layer, a down-sampling layer, and so on. When four hourglass networks are connected together, the output of the previous hourglass network is the input of the adjacent hourglass network. In order to ensure the normal update of the underlying parameters, each hourglass network adopts a relay supervision strategy to supervise and train the loss of the network.
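As a rough illustration of relay supervision with a mean square error loss, the sketch below applies the loss to the output of every stacked hourglass and sums the results. It assumes, as is common for hourglass networks but not stated in this application, that each stage outputs a 2D heatmap for a key point and that all stages share the same target; the names `mse` and `relay_supervision_loss` are hypothetical.

```python
from typing import Sequence

Heatmap = Sequence[Sequence[float]]


def mse(pred: Heatmap, target: Heatmap) -> float:
    """Mean squared error between two equally sized 2D maps."""
    total = 0.0
    count = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            total += (p - t) ** 2
            count += 1
    return total / count


def relay_supervision_loss(stage_outputs: Sequence[Heatmap], target: Heatmap) -> float:
    """Sum the MSE of every stacked hourglass output against the same target.

    Supervising each stage, not only the last one, gives the earlier
    hourglass networks a direct training signal.
    """
    return sum(mse(output, target) for output in stage_outputs)
```

With four stages, `stage_outputs` would hold four heatmaps, one per hourglass network.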

The convolutional layer can include many convolution operators. A convolution operator is also called a kernel; in image processing it acts as a filter that extracts specific information from the input image matrix. A convolution operator is essentially a weight matrix, which is usually predefined. During the convolution operation, the weight matrix is usually moved across the input image in the horizontal direction one pixel at a time (or two pixels at a time, depending on the value of the stride), thereby extracting specific features from the image.

Because the number of training parameters often needs to be reduced, a pooling layer often needs to be introduced periodically after a convolutional layer: one convolutional layer may be followed by one pooling layer, or multiple convolutional layers may be followed by one or more pooling layers. In image processing, the sole purpose of the pooling layer is to reduce the spatial size of the image. The pooling layer may include an average pooling operator and/or a maximum pooling operator for sampling the input image to obtain an image of smaller size. The average pooling operator computes the average of the pixel values within a specific range of the image as the result of average pooling. The maximum pooling operator takes the pixel with the largest value within a specific range as the result of maximum pooling. In addition, just as the size of the weight matrix used in the convolutional layer should be related to the image size, the operators in the pooling layer should also be related to the image size. The image output by the pooling layer can be smaller than the image input to the pooling layer, and each pixel in the output image represents the average value or the maximum value of the corresponding sub-region of the input image.
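The sliding weight matrix and the two pooling operators can be illustrated with a small pure-Python sketch. It is a simplified single-channel version without padding, and the function names `conv2d` and `pool2d` are illustrative rather than part of this application.

```python
from typing import List

Image = List[List[float]]


def conv2d(image: Image, kernel: Image, stride: int = 1) -> Image:
    """Slide the weight matrix over the image without padding.

    As in most CNN frameworks, the kernel is not flipped (cross-correlation).
    """
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for top in range(0, len(image) - kh + 1, stride):
        row = []
        for left in range(0, len(image[0]) - kw + 1, stride):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += image[top + i][left + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


def pool2d(image: Image, size: int = 2, mode: str = "max") -> Image:
    """Non-overlapping pooling: each output pixel summarises a size x size block."""
    out = []
    for top in range(0, len(image) - size + 1, size):
        row = []
        for left in range(0, len(image[0]) - size + 1, size):
            block = [image[top + i][left + j] for i in range(size) for j in range(size)]
            # "max" keeps the largest value, any other mode averages the block.
            row.append(max(block) if mode == "max" else sum(block) / len(block))
        out.append(row)
    return out
```

On a 4x4 input, `pool2d` with `size=2` returns a 2x2 image, which matches the description of the pooling layer reducing the spatial size.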

Optionally, before determining multiple human key points in the color image by using a preset key point detection model, the method further includes:

Construct a key point detection model, where the key point detection model is composed of four densely connected hourglass networks;

Use the preset training set to train the key point detection model. In the training process, the minimum mean square error loss function is used to make the hourglass network converge, and the trained key point detection model is obtained.

The training set includes a plurality of human body image samples. Before training, the human body image samples in the training set need to be preprocessed: for example, the human body image samples are cropped to a standard size, the environmental interference area is removed, and each key point is manually marked on the cropped human body image samples.

The preprocessed training samples are then input into the fourth-order hourglass network, which includes an upper branch and a lower branch. The human body image sample is down-sampled four times. Before each down-sampling, the upper branch processes the human body image at its current size, while the lower branch down-samples the image and subsequently up-samples it again. In this embodiment, intermediate features can be extracted at the original size and at 1/2, 1/4, and 1/8 of the original size. After the features at each size are extracted, the image is restored by up-sampling to the size it had before down-sampling and added to the data of that size, and a residual network is then used for feature extraction; between two down-samplings, three primary modules are used to extract features; between two additions, one primary module is used to extract features.

In the fourth-order hourglass network, each hourglass network down-samples through a pooling layer and up-samples by nearest-neighbor interpolation, so that key point features can be extracted top-down and bottom-up at each size. Skip connections are used between the hourglasses, so that the key point position information at each resolution is preserved.
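The down-sample, up-sample, and add structure of a single hourglass can be sketched recursively as below. This is a structural sketch only: the residual and primary feature-extraction modules are replaced by an identity placeholder, the input height and width are assumed to be divisible by 2 raised to `order`, and all function names are hypothetical.

```python
from typing import List

Image = List[List[float]]


def max_pool_2x2(image: Image) -> Image:
    """Down-sample by taking the maximum of each 2x2 block."""
    return [
        [max(image[r][c], image[r][c + 1], image[r + 1][c], image[r + 1][c + 1])
         for c in range(0, len(image[0]) - 1, 2)]
        for r in range(0, len(image) - 1, 2)
    ]


def upsample_nearest_2x(image: Image) -> Image:
    """Up-sample by nearest-neighbor interpolation (repeat every pixel 2x2)."""
    out = []
    for row in image:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def add_maps(a: Image, b: Image) -> Image:
    """Element-wise addition of two equally sized maps."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def hourglass(image: Image, order: int) -> Image:
    """One hourglass of the given order; the output keeps the input size."""
    upper = image  # placeholder for feature extraction at the current size
    if order == 0:
        return upper
    lower = max_pool_2x2(image)          # lower branch: down-sample
    lower = hourglass(lower, order - 1)  # process at the smaller size
    lower = upsample_nearest_2x(lower)   # restore the previous size
    return add_maps(upper, lower)        # merge with the upper branch
```

Because every level adds the up-sampled lower branch back onto the upper branch, information from each resolution is carried into the output.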

The bone recognition accuracy of the trained key point detection model meets the preset requirements.

Further, the position information of the multiple key points of the human body is determined according to the depth image and camera parameters.

Specifically, according to the coordinates of the key points of the human body recognized in the color image, the depth value of the key points under the same coordinates is determined from the depth image;

According to the depth value of the key points of the human body and the coordinates of the key points of the human body, the position information of the key points of the human body is obtained, that is, the 3D coordinates. For example, the 3D coordinates are (x, y, z), x represents the abscissa of the pixel, y represents the ordinate of the pixel, and z represents the depth value of the pixel.
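The lookup described above can be sketched in a few lines, assuming the depth image is stored as rows of depth values and is pixel-aligned with the color image. The function name `lift_keypoints` is illustrative.

```python
from typing import Dict, List, Tuple


def lift_keypoints(
    keypoints_2d: Dict[int, Tuple[int, int]],
    depth_image: List[List[float]],
) -> Dict[int, Tuple[int, int, float]]:
    """Attach the depth value at the same pixel to every 2D key point.

    The depth image is indexed as depth_image[y][x]: y is the ordinate (row)
    and x is the abscissa (column).
    """
    lifted = {}
    for index, (x, y) in keypoints_2d.items():
        z = depth_image[y][x]
        lifted[index] = (x, y, z)
    return lifted
```

For instance, a left hip key point 12 detected at pixel (1, 1) whose depth pixel holds 350.0 becomes (1, 1, 350.0).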

It can be seen that by combining the color image and the depth image of the photographed person, the two-dimensional coordinates of each photographed part of the photographed person, the depth value relative to the 3D camera module, and the color value can be determined. Among them, two-dimensional coordinates and depth values can represent 3D coordinates.

For example, the color image and the depth image shown in FIG. 6A and FIG. 6B respectively can be combined into a distribution of color values in the 3D coordinate space, as shown in FIG. 6C, where the z-axis represents the depth value. The 3D coordinates of position A are (x1, y1, z1) with z1 = 350 cm, and the RGB value at these 3D coordinates is (255, 255, 255); the 3D coordinates of position B are (x2, y2, z2) with z2 = 345 cm, and the RGB value at these 3D coordinates is (0, 0, 0).

Here, a photographed part refers to a part of the person that is captured by the 3D camera module. For example, when the photographed person stands facing the 3D camera module, the photographed parts may include body parts facing the 3D camera module, such as the face and stomach, while the buttocks and back are not photographed parts.
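A minimal sketch of this combination, assuming the two images have the same resolution and are pixel-aligned, maps every (x, y, z) triple to its color value. The function name `combine_rgbd` and the tiny example images are illustrative.

```python
from typing import Dict, List, Tuple

RGB = Tuple[int, int, int]


def combine_rgbd(
    color: List[List[RGB]],
    depth: List[List[float]],
) -> Dict[Tuple[int, int, float], RGB]:
    """Map each pixel's (x, y, depth) coordinates to its color value."""
    points = {}
    for y, (color_row, depth_row) in enumerate(zip(color, depth)):
        for x, (rgb, z) in enumerate(zip(color_row, depth_row)):
            points[(x, y, z)] = rgb
    return points


# Two neighboring pixels standing in for positions A and B (depths in cm).
example = combine_rgbd([[(255, 255, 255), (0, 0, 0)]], [[350.0, 345.0]])
```

In this toy example the pixel coordinates (0, 0) and (1, 0) are placeholders for (x1, y1) and (x2, y2), whose actual values are not given in the text.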

Step S04: Determine the body proportion parameter of the photographed person according to the position information of the multiple key points of the human body. The body proportion parameter includes one or more of the head-to-body ratio, upper-to-lower body ratio, lower body ratio, head-to-shoulder ratio, head-to-waist ratio, head-to-hip ratio, and shoulder-to-body ratio.

Specifically, the electronic device may determine the length of the bones between key points according to the depth values and the 2D coordinates of the key points. For example, as shown in FIG. 8, the vertical distances from the left hip point P1 and the left knee point P2 of the photographed person to the electronic device are D1 and D2, respectively.

The head-to-body ratio is $X_1 = 2D_{1-4} / (D_{4-15} + D_{4-16})$, where $D_{n-m}$ represents the length from key point n to key point m calculated using the 3D depth information, 1 represents the head key point, 4 represents the neck key point, 15 represents the right foot key point, and 16 represents the left foot key point.

For example, as shown in FIG. 8, the vertical distances from the left hip key point 12 and the left knee key point 14 of the photographed person to the electronic device are D1 and D2, respectively. In the color image of the photographed person, the distance L between the left hip key point 12 and the left knee key point 14 can be calculated from the 2D coordinates of key point 12 and the 2D coordinates of key point 14. Therefore, the length between the left hip key point 12 and the left knee key point 14 can be calculated.
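One possible way to turn the 2D coordinates and depth values of two key points into a physical length is sketched below. It is an assumption, not the formula of this application: each pixel is back-projected with a pinhole camera model whose intrinsics (`fx`, `fy`, `cx`, `cy`) stand in for the "camera parameters", and the depth is treated as the distance along the optical axis. All names are hypothetical.

```python
import math
from typing import Tuple

Intrinsics = Tuple[float, float, float, float]  # (fx, fy, cx, cy), assumed known


def back_project(x: float, y: float, depth: float,
                 fx: float, fy: float, cx: float, cy: float) -> Tuple[float, float, float]:
    """Pixel (x, y) with depth along the optical axis -> 3D point in the camera frame."""
    return ((x - cx) * depth / fx, (y - cy) * depth / fy, depth)


def bone_length(p: Tuple[float, float, float],
                q: Tuple[float, float, float],
                intrinsics: Intrinsics) -> float:
    """Euclidean distance between two key points given as (x, y, depth).

    The result is in the same unit as the depth values (e.g. cm).
    """
    a = back_project(*p, *intrinsics)
    b = back_project(*q, *intrinsics)
    return math.dist(a, b)
```

Back-projecting first keeps the result in physical units, whereas the pixel distance L on its own depends on how far the person stands from the camera.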

Similarly, the upper-to-lower body ratio is $X_2 = (D_{4-12} + D_{4-11}) / (D_{11-15} + D_{12-16})$, where 4 represents the neck key point, 11 represents the right hip key point, 12 represents the left hip key point, 15 represents the right foot key point, and 16 represents the left foot key point.

The lower body ratio is $X_3 = (D_{11-13} + D_{12-14}) / (D_{13-15} + D_{14-16})$, where 11 represents the right hip key point, 12 represents the left hip key point, 13 represents the right knee key point, 14 represents the left knee key point, 15 represents the right foot key point, and 16 represents the left foot key point.

The head-to-shoulder ratio is $X_4 = D_{2-3} / D_{5-6}$, where 2 represents the right ear key point, 3 represents the left ear key point, 5 represents the right shoulder key point, and 6 represents the left shoulder key point.

The head-to-waist ratio is $X_5 = D_{2-3} / D_{9-10}$, where 2 represents the right ear key point, 3 represents the left ear key point, 9 represents the right waist key point, and 10 represents the left waist key point.

The head-to-hip ratio is $X_6 = D_{2-3} / D_{11-12}$, where 2 represents the right ear key point, 3 represents the left ear key point, 11 represents the right hip key point, and 12 represents the left hip key point.

The shoulder-to-body ratio is $X_7 = 2D_{5-6} / (D_{5-15} + D_{6-16})$, where 5 represents the right shoulder key point, 6 represents the left shoulder key point, 15 represents the right foot key point, and 16 represents the left foot key point.
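The seven ratios can be computed directly from the 3D key points. The sketch below assumes the key points are already expressed as 3D points in a common metric unit and are indexed as in FIG. 7; the function name and dictionary keys are illustrative.

```python
import math
from typing import Dict, Tuple

Point3D = Tuple[float, float, float]


def body_proportions(kp: Dict[int, Point3D]) -> Dict[str, float]:
    """Compute the ratios X1..X7 from key points indexed as in FIG. 7."""

    def d(n: int, m: int) -> float:
        # Length from key point n to key point m.
        return math.dist(kp[n], kp[m])

    return {
        "head_to_body": 2 * d(1, 4) / (d(4, 15) + d(4, 16)),                    # X1
        "upper_to_lower_body": (d(4, 12) + d(4, 11)) / (d(11, 15) + d(12, 16)),  # X2
        "lower_body": (d(11, 13) + d(12, 14)) / (d(13, 15) + d(14, 16)),         # X3
        "head_to_shoulder": d(2, 3) / d(5, 6),                                   # X4
        "head_to_waist": d(2, 3) / d(9, 10),                                     # X5
        "head_to_hip": d(2, 3) / d(11, 12),                                      # X6
        "shoulder_to_body": 2 * d(5, 6) / (d(5, 15) + d(6, 16)),                 # X7
    }
```

Only key points 1 to 6 and 9 to 16 are needed; the chest, crotch, elbow, and wrist key points do not appear in these formulas.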

Specifically, after the figure proportion parameter of the photographed person is determined, the figure proportion parameter of the body shape template is further determined, and the figure proportion parameter of the photographed person is compared with the figure proportion parameter of the body shape template one by one.
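The one-by-one comparison could, for example, produce a per-parameter scale factor. The sketch below is only an assumption about how such a comparison might be expressed; this application does not specify the form of the comparison result, and the function name is hypothetical.

```python
from typing import Dict


def proportion_adjustments(person: Dict[str, float],
                           template: Dict[str, float]) -> Dict[str, float]:
    """Compare parameters one by one and return template / person per parameter.

    A factor above 1 means the person's ratio would need to grow to match the
    template, and a factor below 1 means it would need to shrink. Parameters
    missing from the template are skipped.
    """
    return {
        name: template[name] / value
        for name, value in person.items()
        if name in template
    }
```

For example, a person with a head-to-shoulder ratio of 0.5 compared against a template value of 0.4 yields a factor of 0.8 for that parameter.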

Step S05: The second operation of the user indicating the body shape template is detected.

In one embodiment, a second operation in which the user uses a captured human body image as the body shape template is detected; or

a second operation in which the user selects a body shape template from the preset body shape template library is detected; or

a second operation in which the user selects a human body image from the preset gallery as the body shape template is detected.

Specifically, the user can randomly select any photo containing a human body image in the gallery as a body shape template, or select a default body shape template in the body shape template library, or the user can take another person's image and import it as a body shape template.

As shown in FIG. 9A, in one embodiment, when the user selects a default body shape template in the body shape template library, the body proportion parameter of the default body shape template is already stored in the "Camera" application, so that the body proportion parameter of the body shape template can be quickly compared with the body proportion parameter of the person being photographed.

It should be noted that, as shown in FIG. 9B, there are many body shape templates in the body shape template library, such as the body shape template of a certain star or a body shape template reflecting popular aesthetics. After the user selects a body shape template, the "Camera" application can adaptively adjust the color image according to the selected body shape template, so that the body proportion parameter of the photographed person approaches the body proportion parameter of the body shape template.

Specifically, making a body shape template includes:

collecting a color image and a depth image of the body shape template with the camera;

determining multiple human body key points in the color image by using the preset key point detection model, and determining position information of the multiple human body key points by using the depth image and the parameters of the camera;

determining the body proportion parameters of the body shape template according to the position information of the multiple human body key points of the body shape template; and

saving the color image of the body shape template together with its body proportion parameters in the preset body shape template library.

Understandably, because the body shape template and its body proportion parameters are saved in the body shape template library, the user can retrieve them promptly.

When the user captures an image of another person and imports it as a body shape template, a color image and a depth image of the body shape template also need to be captured. The color image is recognized to obtain multiple human body key points of the body shape template, the depth image and the camera parameters are then used to determine the position information of those key points, and the body proportion parameters of the body shape template are determined from that position information.
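The text does not specify how the depth image and the camera parameters are combined to obtain key point positions. A minimal sketch follows, assuming a pinhole camera model with intrinsic parameters `fx`, `fy`, `cx`, `cy` and a depth image aligned pixel-for-pixel with the color image. The function name, parameter names, and data layout are illustrative assumptions, not taken from the application.

```python
from typing import Dict, Sequence, Tuple

Pixel = Tuple[int, int]              # (u, v): column and row in the color image
Point3D = Tuple[float, float, float]


def backproject_keypoints(
    keypoints_2d: Dict[int, Pixel],
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> Dict[int, Point3D]:
    """Lift 2D key points to 3D camera coordinates (assumed pinhole model).

    depth[v][u] is assumed to hold the depth value of pixel (u, v),
    aligned with the color image in which the key points were detected.
    """
    points_3d: Dict[int, Point3D] = {}
    for kp_id, (u, v) in keypoints_2d.items():
        z = float(depth[v][u])       # depth at the same pixel coordinates
        x = (u - cx) * z / fx        # horizontal offset scaled by depth
        y = (v - cy) * z / fy        # vertical offset scaled by depth
        points_3d[kp_id] = (x, y, z)
    return points_3d


# Illustrative values only: a 4x4 depth map and one key point (index 12).
depth_map = [[1.0] * 4 for _ in range(4)]
depth_map[2][3] = 2.0
positions = backproject_keypoints({12: (3, 2)}, depth_map,
                                  fx=2.0, fy=2.0, cx=1.0, cy=1.0)
```

Under these assumptions, the resulting 3D positions could then be used in place of 2D pixel coordinates when measuring body segment lengths.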

In another embodiment, when the user selects a photo containing a human body image from the gallery as the body shape template, the human body key points of the body shape template are likewise first detected by the key point detection model, and the body proportion parameters are then calculated from the 2D coordinates of the detected human body key points. It should be noted that when the user selects a body shape template from outside the body shape template library, the pose of the person in that template should be similar to the pose of the person being photographed.

Specifically, as shown in FIG. 7, the 2D coordinates of the left hip key point 12 of the human body in the body shape template are $(x_{12}, y_{12})$, and the 2D coordinates of the left knee key point 14 are $(x_{14}, y_{14})$. Therefore, the length between the left hip key point 12 and the left knee key point 14 can be calculated as the distance between the two points:

$$L_{12,14} = \sqrt{(x_{12} - x_{14})^2 + (y_{12} - y_{14})^2}$$

The body proportion parameters of the body shape template are also calculated, including the head-to-body ratio, upper-to-bottom ratio, lower-body ratio, head-to-shoulder ratio, head-to-waist ratio, head-to-hip ratio, and shoulder-to-body ratio.
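The length between two key points, such as the left hip key point 12 and the left knee key point 14, can be sketched as a 2D Euclidean distance. The helper name `segment_length` and the sample coordinates are hypothetical. How each body proportion parameter is built from such lengths is not restated in this section, so no ratio definition is assumed here.

```python
import math
from typing import Dict, Tuple

Point2D = Tuple[float, float]


def segment_length(keypoints: Dict[int, Point2D], a: int, b: int) -> float:
    """Euclidean distance between key points a and b in image coordinates."""
    return math.dist(keypoints[a], keypoints[b])


# Illustrative coordinates only; indices 12 and 14 follow FIG. 7
# (left hip and left knee).
template_keypoints = {12: (100.0, 200.0), 14: (103.0, 204.0)}
thigh_length = segment_length(template_keypoints, 12, 14)
```

A body proportion parameter would then be a quotient of two such lengths, which makes it independent of the image scale as long as both lengths come from the same image.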

Finally, the figure proportion parameter of the body shape template is compared with the figure proportion parameter of the photographed person, and the figure proportion parameter to be adjusted is obtained.

Specifically, step S06 includes:

comparing the body proportion parameters of the photographed person with those of the body shape template one by one, and determining each body proportion parameter whose difference exceeds a preset range as a body proportion parameter to be adjusted;

adjusting the body proportion parameter to be adjusted according to the body proportion parameter of the body shape template;

determining the adjusted position information required by the corresponding key point according to the adjusted body proportion parameter of the photographed person; and

adjusting the key point according to that adjusted position information, so that the body proportion parameter of the photographed person is consistent with that of the body shape template.

The body proportion parameter to be adjusted is adjusted according to the body proportion parameter of the body shape template. For example, suppose the preset range of the parameter difference is ±5% and the head-to-shoulder ratio of the body shape template is X4 = 2/3. When the head-to-shoulder ratio of the photographed person is X4' = 2.3/3, X4' - X4 = 2.3/3 - 2/3 = 0.3/3 = 10% > 5%, which means that the head-to-shoulder ratio of the photographed person needs to be adjusted, that is, from 2.3/3 to 2/3.
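The comparison in this example can be sketched as follows, assuming the parameter difference is the absolute difference between the two ratios (as in the arithmetic above, where 0.3/3 is read as 10%) and that a parameter outside the preset range is set to the template value. The function name, dictionary keys, and the second parameter's values are hypothetical.

```python
from typing import Dict


def select_adjustments(
    subject: Dict[str, float],
    template: Dict[str, float],
    tolerance: float = 0.05,
) -> Dict[str, float]:
    """Return target values for parameters whose difference exceeds the range.

    The difference is taken as |subject - template|, an assumption that
    matches the worked example (2.3/3 - 2/3 = 0.1, read as 10%).
    """
    targets: Dict[str, float] = {}
    for name, template_value in template.items():
        if name not in subject:
            continue  # parameter not measured for the photographed person
        if abs(subject[name] - template_value) > tolerance:
            targets[name] = template_value  # adjust toward the template
    return targets


# Head-to-shoulder values from the example; head_to_body is made up
# to show a parameter that stays within the preset range.
subject_params = {"head_to_shoulder": 2.3 / 3, "head_to_body": 0.14}
template_params = {"head_to_shoulder": 2 / 3, "head_to_body": 0.13}
to_adjust = select_adjustments(subject_params, template_params)
```

Only `head_to_shoulder` is selected here, with a target of 2/3; `head_to_body` differs by 0.01 and is left unchanged.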

It should be noted that the posture of the body shape template and the posture of the photographed person may differ. The posture of the photographed person can be determined based on the color image and the depth information of the photographed person. In this case, the electronic device can transform the posture of the body shape template into the posture of the photographed person through a similarity transformation. Specifically, the electronic device can compare the displacements of the bone points of the two postures in two-dimensional space and the relative included angles between the limbs connected by the bone points of the photographed person.

Then, the electronic device can rotate or translate the bone points of the body shape template and the limbs connected by those bone points, so that the posture of the transformed body shape template is consistent with the posture of the photographed person. After adjusting the posture of the body shape template, the electronic device adjusts the body proportion parameter to be adjusted according to the body proportion parameter of the body shape template.
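One possible reading of the rotation step, for a single limb in 2D, is sketched below: the template limb keeps its own length but is rotated about its parent bone point so that it points in the same direction as the corresponding limb of the photographed person. The function name and the sample coordinates are hypothetical, and the application does not state that the transformation is computed this way.

```python
import math
from typing import Tuple

Point2D = Tuple[float, float]


def align_limb(
    template_parent: Point2D,
    template_child: Point2D,
    subject_parent: Point2D,
    subject_child: Point2D,
) -> Point2D:
    """Rotate a template limb about its parent bone point.

    Returns the new position of the template's child bone point so that
    the limb direction matches the subject's limb while the template's
    limb length is preserved.
    """
    length = math.dist(template_parent, template_child)
    angle = math.atan2(
        subject_child[1] - subject_parent[1],
        subject_child[0] - subject_parent[0],
    )
    return (
        template_parent[0] + length * math.cos(angle),
        template_parent[1] + length * math.sin(angle),
    )


# Illustrative only: a horizontal template limb of length 2 is rotated to
# match a subject limb that points along the positive y axis.
new_child = align_limb((0.0, 0.0), (2.0, 0.0), (5.0, 5.0), (5.0, 8.0))
```

Applying this limb by limb from the torso outward, and moving each child limb along with its rotated parent, would bring the template posture into line with the subject posture while leaving the template's segment lengths, and hence its proportions, unchanged.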

This embodiment also provides a computer storage medium in which computer instructions are stored. When the computer instructions run on an electronic device, the electronic device is caused to execute the above-mentioned related method steps to implement the depth-based human body image beautification method in the above-mentioned embodiment.

This embodiment also provides a computer program product. When the computer program product runs on a computer, the computer executes the above-mentioned related steps to realize the depth-based human image beautification method in the above-mentioned embodiment.

In addition, the embodiments of the present application also provide a device. The device may specifically be a chip, a component, or a module. The device may include a processor and a memory connected to each other. The memory is used to store computer-executable instructions. When the device is running, the processor can execute the computer-executable instructions stored in the memory, so that the chip executes the depth-based human body image beautification method in the foregoing method embodiments.

The electronic device, computer storage medium, computer program product, and chip provided in this embodiment are all used to execute the corresponding method provided above. Therefore, for the beneficial effects that can be achieved, refer to the beneficial effects of the corresponding method provided above; they are not repeated here.

Through the description of the above embodiments, those skilled in the art can understand that, for convenience and conciseness of description, only the division of the above-mentioned functional modules is used as an example. In practical applications, the above functions may be allocated to different functional modules as required, that is, the internal structure of the device is divided into different functional modules to complete all or part of the functions described above.

In the several embodiments provided in this application, it should be understood that the disclosed device and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division of modules or units is only a logical function division, and there may be other division methods in actual implementation. For example, multiple units or components may be combined or integrated into another device, or some features may be ignored or not implemented. In addition, the displayed or discussed mutual coupling, direct coupling, or communication connection may be indirect coupling or communication connection through some interfaces, devices, or units, and may be in electrical, mechanical, or other forms.

The units described as separate parts may or may not be physically separate, and the parts displayed as units may be one physical unit or multiple physical units, that is, they may be located in one place, or they may be distributed to multiple different places. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.

In addition, the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.

If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solutions, can be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions to make a device (which may be a single-chip microcomputer, a chip, etc.) or a processor execute all or part of the steps of the methods of the various embodiments of the present application. The aforementioned storage media include a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk, and other media that can store program code.

The above content is only the specific implementation of this application, but the protection scope of this application is not limited to it. Any changes or substitutions that a person skilled in the art can easily think of within the technical scope disclosed in this application shall be covered by the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims

A depth-based human body image beautification method, characterized in that it is applied to an electronic device with a display screen and a camera, and the method comprises:

detecting a first operation used by the user to turn on the camera;

in response to the first operation, displaying a user interface on the display screen, wherein the user interface includes a preview frame, the preview frame includes a first human body image of the person being photographed, and the first human body image includes a depth image and a color image;

determining a plurality of human body key points in the color image by using a preset key point detection model, and determining position information of the plurality of human body key points according to the depth image and the parameters of the camera;

determining the body proportion parameter of the photographed person according to the position information of the plurality of human body key points;

detecting a second operation used by the user to indicate a body shape template; and

in response to the second operation, displaying a second human body image of the photographed person in the preview frame, wherein the body proportion parameter of the photographed person in the second human body image has been adaptively adjusted according to the body proportion parameter of the body shape template.
The method according to claim 1, wherein detecting the second operation used by the user to indicate the body shape template comprises:

detecting a second operation in which the user uses a captured human body image as the body shape template; or

detecting a second operation in which the user selects a body shape template from a preset body shape template library; or

detecting a second operation in which the user selects a human body image from a preset gallery as the body shape template.
The method according to claim 1 or 2, wherein the body proportion parameters include one or more of a head-to-body ratio, an upper-to-bottom ratio, a lower-body ratio, a head-to-shoulder ratio, a head-to-waist ratio, a head-to-hip ratio, and a shoulder-to-body ratio; and the adaptive adjustment of the body proportion parameter of the photographed person in the second human body image according to the body proportion parameter of the body shape template comprises:

comparing the body proportion parameters of the photographed person with those of the body shape template one by one, and determining each body proportion parameter whose difference exceeds a preset range as a body proportion parameter to be adjusted;

adjusting the body proportion parameter to be adjusted according to the body proportion parameter of the body shape template;

determining the adjusted position information required by the corresponding key point according to the adjusted body proportion parameter of the photographed person; and

adjusting the key point according to that adjusted position information, so that the body proportion parameter of the photographed person is consistent with that of the body shape template.
The method according to claim 1 or 2, wherein the user interface further comprises a shooting control, and the method further comprises:

in response to a detected user operation acting on the shooting control, saving the second human body image displayed in the preview frame.
The method according to claim 1, wherein, before determining the plurality of human body key points in the color image by using the preset key point detection algorithm and determining the position information of the plurality of human body key points according to the depth image and the parameters of the camera, the method further comprises:

constructing a key point detection model, wherein the key point detection model is composed of four densely connected hourglass networks; and

training the key point detection model with a preset training set, wherein a minimum mean square error loss function is used in the training process to make the hourglass networks converge, to obtain a trained key point detection model.
The method according to claim 1, wherein determining the position information of the plurality of human body key points by using the depth image and the parameters of the camera comprises:

determining, from the depth image, the depth value of each human body key point at the same coordinates, according to the coordinates of the human body key points recognized in the color image; and

obtaining the position information of each human body key point according to the depth value and the coordinates of that human body key point.
The method according to claim 1, wherein the method further comprises:

collecting a color image and a depth image of the body shape template with the camera;

determining multiple human body key points in the color image by using the preset key point detection model, and determining position information of the multiple human body key points by using the depth image and the parameters of the camera;

determining the body proportion parameter of the body shape template according to the position information of the multiple human body key points of the body shape template; and

saving the color image of the body shape template together with the body proportion parameter of the body shape template in a preset body shape template library.
An electronic device, characterized in that it comprises:

a display screen; a camera; one or more processors; a memory; a plurality of application programs; and one or more computer programs, wherein the one or more computer programs are stored in the memory and include instructions that, when executed by the device, cause the device to perform the following steps:

detecting a first operation used by the user to turn on the camera;

in response to the first operation, displaying a user interface on the display screen, wherein the user interface includes a preview frame, the preview frame includes a first human body image of the person being photographed, and the first human body image includes a depth image and a color image;

determining a plurality of human body key points in the color image by using a preset key point detection algorithm, and determining position information of the plurality of human body key points according to the depth image and the parameters of the camera;

determining the body proportion parameter of the photographed person according to the position information of the plurality of human body key points;

detecting a second operation used by the user to indicate a body shape template; and

in response to the second operation, displaying a second human body image of the photographed person in the preview frame, wherein the body proportion parameter of the photographed person in the second human body image has been adaptively adjusted according to the body proportion parameter of the body shape template.
The electronic device according to claim 8, wherein when the instructions are executed by the device, the device specifically executes the following steps:

detecting a second operation in which the user uses a captured human body image as the body shape template; or

detecting a second operation in which the user selects a body shape template from a preset body shape template library; or

detecting a second operation in which the user selects a human body image from a preset gallery as the body shape template.
The electronic device according to claim 8 or 9, wherein the body proportion parameters include one or more of a head-to-body ratio, an upper-to-bottom ratio, a lower-body ratio, a head-to-shoulder ratio, a head-to-waist ratio, a head-to-hip ratio, and a shoulder-to-body ratio; and when the instructions are executed by the device, the device specifically executes the following steps:

comparing the body proportion parameters of the photographed person with those of the body shape template one by one, and determining each body proportion parameter whose difference exceeds a preset range as a body proportion parameter to be adjusted;

adjusting the body proportion parameter to be adjusted according to the body proportion parameter of the body shape template;

determining the adjusted position information required by the corresponding key point according to the adjusted body proportion parameter of the photographed person; and

adjusting the key point according to that adjusted position information, so that the body proportion parameter of the photographed person is consistent with that of the body shape template.
The electronic device according to claim 8 or 9, wherein the user interface further comprises a shooting control, and when the instructions are executed by the device, the device specifically executes the following step:

in response to a detected user operation acting on the shooting control, saving the second human body image displayed in the preview frame.
The electronic device according to claim 8, wherein when the instructions are executed by the device, the device specifically executes the following steps:

constructing a key point detection model, wherein the key point detection model is composed of four densely connected hourglass networks; and

training the key point detection model with a preset training set, wherein a minimum mean square error loss function is used in the training process to make the hourglass networks converge, to obtain a trained key point detection model.
The electronic device according to claim 8, wherein when the instructions are executed by the device, the device specifically executes the following steps:

determining, from the depth image, the depth value of each human body key point at the same coordinates, according to the coordinates of the human body key points recognized in the color image; and

obtaining the position information of each human body key point according to the depth value and the coordinates of that human body key point.
A computer device, comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor executes the computer program to make the computer device implement the depth-based human body image beautification method according to any one of claims 1 to 7.
A computer program product containing instructions, characterized in that, when the computer program product runs on an electronic device, the electronic device is caused to perform the depth-based human body image beautification method according to any one of claims 1 to 7.
A computer-readable storage medium, comprising instructions, characterized in that, when the instructions run on an electronic device, the electronic device is caused to perform the depth-based human body image beautification method according to any one of claims 1 to 7.