WO2021078001A1 - Image enhancement method and apparatus - Google Patents

Image enhancement method and apparatus

Info

Publication number
WO2021078001A1
WO2021078001A1 · PCT/CN2020/118833 · CN2020118833W
Authority
WO
WIPO (PCT)
Prior art keywords
image
target object
guide
interface
enhanced
Prior art date
Application number
PCT/CN2020/118833
Other languages
English (en)
French (fr)
Inventor
邵纬航
王银廷
乔蕾
李默
张一帆
黄一宁
Original Assignee
Huawei Technologies Co., Ltd.
Application filed by Huawei Technologies Co., Ltd.
Publication of WO2021078001A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 - Image enhancement or restoration
    • G06T 5/50 - Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • G06T 5/60
    • G06T 7/00 - Image analysis
    • G06T 7/10 - Segmentation; Edge detection
    • G06T 7/11 - Region-based segmentation
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning, using neural networks
    • G06V 40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/168 - Feature extraction; Face representation
    • G06V 40/171 - Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 - Special algorithmic details
    • G06T 2207/20081 - Training; Learning
    • G06T 2207/20084 - Artificial neural networks [ANN]
    • G06T 2207/30 - Subject of image; Context of image processing
    • G06T 2207/30196 - Human being; Person
    • G06T 2207/30201 - Face

Definitions

  • This application relates to the field of electronic technology, and in particular to an image enhancement method and device.
  • the images taken by users are often of poor quality due to external factors (such as low brightness, etc.).
  • the embodiments of the present application provide an image enhancement method and device.
  • the image to be enhanced (the first image) is enhanced through a neural network with reference to the guidance image. Since the information in the guidance image is used as a reference, compared with the traditional face enhancement technology that processes the image to be enhanced directly, the enhanced image is less distorted and the enhancement effect is better.
  • an embodiment of the present application provides an image enhancement method, and the method includes:
  • acquiring a first image, the first image including a target object; acquiring a guide image according to the first image, the guide image including the target object, where the sharpness of the target object in the guide image is greater than the sharpness of the target object in the first image; and
  • enhancing the target object in the first image through a neural network according to the target object in the guide image to obtain a target image, the target image including the enhanced target object, where the sharpness of the enhanced target object is greater than the sharpness of the target object in the first image.
  • the present application provides an image enhancement method, including: acquiring a first image, the first image including a target object; acquiring a guide image according to the first image, the guide image including the target object, where the sharpness of the target object in the guide image is greater than the sharpness of the target object in the first image; and enhancing the target object in the first image through a neural network according to the target object in the guide image to obtain a target image, where the target image includes an enhanced target object, and the sharpness of the enhanced target object is greater than the sharpness of the target object in the first image.
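  • As a rough, non-authoritative sketch of how such guided enhancement could be wired up (PyTorch; the network name GuidedEnhancementNet, the layer sizes, and the residual connection are illustrative assumptions, not the implementation of this application), a crop of the target object from the first image and the sharper crop of the same object from the guide image can be concatenated and fed to a small convolutional network that predicts the enhanced crop:

```python
import torch
import torch.nn as nn

class GuidedEnhancementNet(nn.Module):
    """Hypothetical guided-enhancement network: the first-image crop and the
    guide-image crop are concatenated channel-wise and a residual correction
    for the first-image crop is predicted."""
    def __init__(self, channels: int = 32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(6, channels, kernel_size=3, padding=1),  # 3 ch. first + 3 ch. guide
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 3, kernel_size=3, padding=1),  # RGB correction
        )

    def forward(self, first_crop: torch.Tensor, guide_crop: torch.Tensor) -> torch.Tensor:
        x = torch.cat([first_crop, guide_crop], dim=1)
        # residual prediction keeps the pose and content of the first image
        return first_crop + self.body(x)

if __name__ == "__main__":
    net = GuidedEnhancementNet()
    first = torch.rand(1, 3, 128, 128)   # blurry target-object crop from the first image
    guide = torch.rand(1, 3, 128, 128)   # sharper crop of the same object from the guide image
    enhanced = net(first, guide)
    print(enhanced.shape)                # torch.Size([1, 3, 128, 128])
```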
  • the target object includes at least one of the following objects: the face, eyes, ears, nose, eyebrows, or mouth of the same person.
  • the degree of difference between the posture of the target object in the guide image and the posture of the target object in the first image is within a preset range.
  • the acquiring a guide image according to the first image includes:
  • determining the guide image from the at least one second image according to the degree of difference between the posture of the target object in the first image and the posture of the target object in each second image of the at least one second image.
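  • One way to read "degree of difference between postures" is a distance between face landmark sets after removing translation and scale; the sketch below assumes landmarks have already been extracted by some detector, and the threshold value is a made-up placeholder for the preset range:

```python
import numpy as np

def pose_difference(landmarks_a: np.ndarray, landmarks_b: np.ndarray) -> float:
    """Distance between two landmark sets of shape (N, 2) after removing
    translation and scale; used here as the 'degree of difference between
    postures'."""
    def normalize(pts: np.ndarray) -> np.ndarray:
        pts = pts - pts.mean(axis=0)
        return pts / (np.linalg.norm(pts) + 1e-8)
    return float(np.linalg.norm(normalize(landmarks_a) - normalize(landmarks_b)))

def select_guide(first_landmarks, candidates, preset_range=0.25):
    """candidates: list of (image_id, landmarks). Returns the id of the second
    image whose posture is closest to that of the first image and within the
    preset range, or None if no candidate qualifies."""
    best_id, best_diff = None, preset_range
    for image_id, landmarks in candidates:
        diff = pose_difference(first_landmarks, landmarks)
        if diff <= best_diff:
            best_id, best_diff = image_id, diff
    return best_id
```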
  • before the determining the guide image from the at least one second image according to the degree of difference between the posture of the target object in the first image and the posture of the target object in each second image of the at least one second image, the method further includes: displaying a first image selection interface, the first image selection interface including at least one image;
  • receiving a first image selection instruction, where the first image selection instruction indicates that the at least one second image is selected from the at least one image included in the first image selection interface.
  • the acquiring a guide image according to the first image includes:
  • determining at least one third image according to the posture of the target object in the first image, where each third image in the at least one third image includes the target object, and the degree of difference between the posture of the target object included in each third image and the posture of the target object in the first image is within a preset range;
  • displaying a second image selection interface, the second image selection interface including the at least one third image; and receiving a second image selection instruction, where the second image selection instruction indicates that the guide image is selected from the at least one third image included in the second image selection interface.
  • the degree of difference between the posture of the target object in the guide image and the posture of the target object in the first image is within a preset range, including:
  • the degree of difference between the contour shape of the target object in the guide image and the contour shape of the target object in the first image is within a preset range.
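  • A contour-shape difference can, for example, be measured as one minus the intersection-over-union of binary masks covering the target object in the two images; this particular measure and the 0.3 threshold in the comment are assumptions for illustration only:

```python
import numpy as np

def contour_shape_difference(mask_first: np.ndarray, mask_guide: np.ndarray) -> float:
    """Difference between the contour shapes of the target object in two images,
    computed as 1 - IoU of their binary masks (0 = identical, 1 = disjoint)."""
    a = mask_first.astype(bool)
    b = mask_guide.astype(bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0
    intersection = np.logical_and(a, b).sum()
    return 1.0 - intersection / union

# Example check against a hypothetical preset range of 0.3:
# is_suitable_guide = contour_shape_difference(mask_first, mask_guide) <= 0.3
```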
  • the sharpness of the target object in the guide image is greater than the sharpness of the target object in the first image.
  • the target image includes an enhanced target object, and the guide image feature of the enhanced target object is closer to that of the target object in the guide image than the guide image feature of the target object in the first image is, where the guide image feature includes at least one of the following image features:
  • the target image includes an enhanced target object, and the sharpness of the enhanced target object is greater than the sharpness of the target object in the first image.
  • the target image includes an enhanced target object, and the degree of difference between the posture of the enhanced target object and the posture of the target object in the first image is within a preset range.
  • the acquiring the first image includes:
  • a third image selection instruction is received, where the third image selection instruction indicates that the first image is selected from a plurality of images included in the album interface.
  • the obtaining a guide image includes:
  • this application provides an image enhancement device, which is applied to an electronic device or a server, and the image enhancement device includes:
  • an acquisition module, configured to acquire a first image, the first image including a target object, and to acquire a guide image according to the first image, the guide image including the target object, where the sharpness of the target object in the guide image is greater than the sharpness of the target object in the first image;
  • a processing module, configured to enhance the target object in the first image through a neural network according to the target object in the guide image to obtain a target image, the target image including the enhanced target object, where the sharpness of the enhanced target object is greater than the sharpness of the target object in the first image.
  • the degree of difference between the posture of the target object in the guide image and the posture of the target object in the first image is within a preset range.
  • the acquisition module is specifically used for:
  • determining the guide image from the at least one second image according to the degree of difference between the posture of the target object in the first image and the posture of the target object in each second image of the at least one second image.
  • the image enhancement device further includes:
  • a display module configured to display a first image selection interface, the first image selection interface including at least one image
  • the receiving module is configured to receive a first image selection instruction, where the first image selection instruction indicates that the at least one second image is selected from at least one image included in the first image selection interface.
  • the processing module is specifically used for:
  • determining at least one third image according to the posture of the target object in the first image, where each third image in the at least one third image includes the target object, and the degree of difference between the posture of the target object included in each third image and the posture of the target object in the first image is within a preset range;
  • the display module is further configured to display a second image selection interface, the second image selection interface including the at least one third image;
  • the receiving module is further configured to receive a second image selection instruction, where the second image selection instruction indicates that the guide image is selected from at least one third image included in the second image selection interface.
  • the target image includes an enhanced target object, and the guide image feature of the enhanced target object is closer to that of the target object in the guide image than the guide image feature of the target object in the first image is, where the guide image feature includes at least one of the following image features:
  • the target image includes an enhanced target object, and the degree of difference between the posture of the enhanced target object and the posture of the target object in the first image is within a preset range.
  • the display module is further used for:
  • the acquisition module is specifically configured to receive a user's shooting operation, and in response to the shooting operation, acquire the first image
  • the display module is further configured to display an album interface, the album interface including a plurality of images;
  • the acquisition module is specifically configured to receive a third image selection instruction, where the third image selection instruction indicates that the first image is selected from a plurality of images included in the album interface.
  • the acquisition module is specifically used for:
  • this application provides an image enhancement method, including:
  • acquiring a first image, the first image including a target object; acquiring a guide image according to the first image, the guide image including the target object, where the sharpness of the target object in the guide image is greater than the sharpness of the target object in the first image; and
  • enhancing the target object in the first image through a neural network according to the target object in the guide image to obtain a target image, the target image including the enhanced target object, where the sharpness of the enhanced target object is greater than the sharpness of the target object in the first image.
  • the target object includes at least one of the following objects: the face, eyes, ears, nose, eyebrows, or mouth of the same person.
  • the degree of difference between the posture of the target object in the guide image and the posture of the target object in the first image is within a preset range.
  • the target image includes an enhanced target object, and the guide image feature of the enhanced target object is closer to that of the target object in the guide image than the guide image feature of the target object in the first image is, where the guide image feature includes at least one of the following image features:
  • the target image includes an enhanced target object, and the degree of difference between the posture of the enhanced target object and the posture of the target object in the first image is within a preset range.
  • this application provides a server, including:
  • a receiving module, configured to receive a first image sent by an electronic device, the first image including a target object, and to obtain a guide image according to the first image, the guide image including the target object, where the sharpness of the target object in the guide image is greater than the sharpness of the target object in the first image;
  • a processing module, configured to enhance the target object in the first image through a neural network according to the target object in the guide image to obtain a target image, the target image including the enhanced target object, where the sharpness of the enhanced target object is greater than the sharpness of the target object in the first image.
  • a sending module, configured to send the target image to the electronic device.
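  • The server-side flow can be pictured as the sketch below; the function names (find_guide_image, enhance_with_guide) and the in-memory request object are illustrative stand-ins for the receiving, processing, and sending modules rather than the application's actual protocol:

```python
from dataclasses import dataclass

@dataclass
class EnhanceRequest:
    first_image: bytes            # encoded first image uploaded by the electronic device

def find_guide_image(first_image: bytes) -> bytes:
    # e.g. pose-based selection among higher-quality images stored on the server
    return first_image            # placeholder only

def enhance_with_guide(first_image: bytes, guide_image: bytes) -> bytes:
    # run the neural-network enhancement (see the earlier sketch)
    return first_image            # placeholder only

def handle_request(request: EnhanceRequest) -> bytes:
    """Receiving module -> guide acquisition -> processing module -> sending module."""
    guide = find_guide_image(request.first_image)
    target_image = enhance_with_guide(request.first_image, guide)
    return target_image           # sent back to the electronic device
```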
  • the target object includes at least one of the following objects: the face, eyes, ears, nose, eyebrows, or mouth of the same person.
  • the degree of difference between the posture of the target object in the guide image and the posture of the target object in the first image is within a preset range.
  • the target image includes an enhanced target object, and the guide image feature of the enhanced target object is closer to that of the target object in the guide image than the guide image feature of the target object in the first image is, where the guide image feature includes at least one of the following image features:
  • the target image includes an enhanced target object, and the degree of difference between the posture of the enhanced target object and the posture of the target object in the first image is within a preset range.
  • an embodiment of the present application provides an image enhancement method, and the method includes:
  • acquiring a first image, the first image including a target object; acquiring a guide image according to the first image, the guide image including the target object, where the sharpness of the target object in the guide image is greater than the sharpness of the target object in the first image; and
  • enhancing the target object in the first image according to the target object in the guide image to obtain a target image, where the target image includes an enhanced target object, and the sharpness of the enhanced target object is greater than the sharpness of the target object in the first image.
  • the target object is the moon.
  • this application provides an electronic device, including: one or more processors; one or more memories; multiple application programs; and one or more programs, wherein the one or more programs are stored In the memory, when the one or more programs are executed by the processor, the electronic device is caused to execute the steps described in any one of the foregoing first aspect and possible implementation manners of the first aspect.
  • the present application provides a server, including: one or more processors; one or more memories; and one or more programs, wherein the one or more programs are stored in the memory, and
  • when the one or more programs are executed by the processor, the server is caused to execute the steps described in any one of the foregoing first aspect, the third aspect, the possible implementation manners of the first aspect, and the possible implementation manners of the third aspect.
  • the present application provides a device included in an electronic device, and the device has a function of implementing any one of the electronic device behaviors in the first aspect described above.
  • the function can be realized by hardware, or by hardware executing corresponding software.
  • the hardware or software includes one or more modules or units corresponding to the above-mentioned functions. For example, display module, acquisition module, processing module, etc.
  • the present application provides an electronic device, including: a touch display screen, wherein the touch display screen includes a touch-sensitive surface and a display; a camera; one or more processors; a memory; a plurality of application programs; and one or more computer programs.
  • one or more computer programs are stored in the memory, and the one or more computer programs include instructions.
  • the electronic device is caused to execute the image enhancement method in any one of the possible implementations of the first aspect.
  • this application provides a computer storage medium, including computer instructions, which, when run on an electronic device or a server, cause the electronic device or the server to execute any one of the possible image enhancement methods in any of the foregoing aspects.
  • this application provides a computer program product, which, when run on an electronic device or a server, causes the electronic device or the server to execute any one of the possible image enhancement methods in any of the foregoing aspects.
  • the present application provides an image enhancement method, including: acquiring a first image, the first image including a target object; acquiring a guide image according to the first image, the guide image including the target object, where the sharpness of the target object in the guide image is greater than the sharpness of the target object in the first image; and enhancing the target object in the first image through a neural network according to the target object in the guide image to obtain a target image, where the target image includes an enhanced target object, and the sharpness of the enhanced target object is greater than the sharpness of the target object in the first image.
  • FIG. 1 is a schematic diagram of an application scenario architecture according to an embodiment of the application
  • Figure 2 is a schematic diagram of the structure of an electronic device
  • Fig. 3a is a software structure block diagram of an electronic device according to an embodiment of the present application.
  • FIG. 3b is a schematic diagram of an embodiment of an image enhancement method provided by an embodiment of the application.
  • Figure 4(a) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 4(b) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 4(c) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 4(d) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 5(a) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 5(b) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 5(c) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Fig. 6(a) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 6(b) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Fig. 6(c) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 7(a) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 7(b) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 7(c) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 10 (a) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • FIG. 10(b) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 10(c) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 10(d) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 10(e) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 10(f) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 11 (a) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 11(b) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 11(c) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 12 (a) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 12(b) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 12(c) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 12(d) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 12(e) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 13 (a) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 13(b) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 13(c) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 14 (a) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 14(b) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 14(c) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 15 (a) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 15(b) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • FIG. 15(c) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • FIG. 15(d) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 16 (a) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • Figure 16(b) is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • FIG. 17 is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • FIG. 18 is a schematic diagram of an example of an image enhancement processing interface provided by an embodiment of the present application.
  • FIG. 19 is a schematic diagram of an image provided by an embodiment of this application.
  • Figure 20 (a) is a schematic diagram of a first image
  • Figure 20(b) is a schematic diagram of a guide image
  • Figure 21 (a) is a schematic diagram of a guide image
  • Figure 21(b) is a schematic diagram of a guide image
  • Figure 21(c) is a schematic diagram of face region recognition
  • Figure 22 (a) is a schematic diagram of a target object
  • Figure 22(b) is a schematic diagram of a target object
  • Figure 23 (a) is a schematic diagram of a target object
  • Figure 23(b) is a schematic diagram of a target object after registration
  • Figure 23(c) is a schematic diagram of a comparison between a target object and a registered target object
  • Figure 23(d) is a schematic diagram of image enhancement
  • Figure 23(e) is a schematic diagram of image enhancement
  • Figure 23(f) is a schematic diagram of image enhancement
  • Figure 23(g) is a schematic diagram of image enhancement
  • FIG. 24 is a schematic diagram of an embodiment of an image enhancement method provided by an embodiment of this application.
  • FIG. 25a is a system architecture diagram of an image enhancement system provided by an embodiment of this application.
  • FIG. 25b is a schematic diagram of a convolution kernel performing a convolution operation on an image according to an embodiment of the application.
  • FIG. 25c is a schematic diagram of a neural network provided by an embodiment of this application.
  • FIG. 26 is a schematic structural diagram of an electronic device provided by an embodiment of this application.
  • FIG. 27 is a schematic structural diagram of a server provided by an embodiment of this application.
  • FIG. 28 is a schematic structural diagram of an electronic device provided by an embodiment of this application.
  • FIG. 29 is a schematic diagram of a structure of a server provided by an embodiment of the present application.
  • FIG. 30 is a schematic diagram of a structure of a chip provided by an embodiment of the application.
  • the embodiments of the present application provide an image enhancement method, electronic device, and server.
  • the image to be enhanced (the first image) is enhanced by the neural network with reference to the guidance image. Since the information in the guidance image is used as a reference, compared with the traditional face enhancement technology that processes the image to be enhanced directly, the enhanced image is less distorted and the enhancement effect is better.
  • FIG. 1 is a schematic diagram of an application scenario architecture according to an embodiment of the application.
  • the image enhancement method provided by the embodiment of the present application may be implemented based on the electronic device 101, and the image enhancement method provided by the embodiment of the present application may also be implemented based on the interaction between the electronic device 101 and the server 102.
  • the image enhancement method provided by the embodiments of the application can be applied to mobile phones, tablet computers, wearable devices, vehicle-mounted devices, augmented reality (AR)/virtual reality (VR) devices, notebook computers, ultra-mobile personal computers (UMPC), netbooks, personal digital assistants (PDA), and other electronic devices.
  • FIG. 2 shows a schematic structural diagram of the electronic device 200.
  • the electronic device 200 may include a processor 210, an external memory interface 220, an internal memory 221, a universal serial bus (USB) interface 230, a charging management module 240, a power management module 241, a battery 242, an antenna 1, and an antenna 2.
  • a mobile communication module 250, a wireless communication module 260, an audio module 270, a speaker 270A, a receiver 270B, a microphone 270C, an earphone jack 270D, a sensor module 280, buttons 290, a motor 291, an indicator 292, a camera 293, a display screen 294, a subscriber identification module (SIM) card interface 295, and so on.
  • the sensor module 280 can include a pressure sensor 280A, a gyroscope sensor 280B, an air pressure sensor 280C, a magnetic sensor 280D, an acceleration sensor 280E, a distance sensor 280F, a proximity light sensor 280G, a fingerprint sensor 280H, a temperature sensor 280J, a touch sensor 280K, an ambient light sensor 280L, a bone conduction sensor 280M, etc.
  • the structure illustrated in the embodiment of the present application does not constitute a specific limitation on the electronic device 200.
  • the electronic device 200 may include more or fewer components than shown, or combine certain components, or split certain components, or arrange different components.
  • the illustrated components can be implemented in hardware, software, or a combination of software and hardware.
  • the processor 210 may include one or more processing units.
  • the processor 210 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc.
  • the different processing units may be independent devices or integrated in one or more processors.
  • the controller may be the nerve center and command center of the electronic device 200.
  • the controller can generate operation control signals according to the instruction operation code and timing signals to complete the control of fetching and executing instructions.
  • a memory may also be provided in the processor 210 for storing instructions and data.
  • the memory in the processor 210 is a cache memory.
  • the memory can store instructions or data that have just been used or recycled by the processor 210. If the processor 210 needs to use the instruction or data again, it can be directly called from the memory. Repeated access is avoided, the waiting time of the processor 210 is reduced, and the efficiency of the system is improved.
  • the processor 210 may include one or more interfaces.
  • the interface may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous transceiver (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface, etc.
  • the I2C interface is a bidirectional synchronous serial bus, which includes a serial data line (SDA) and a serial clock line (SCL).
  • the processor 210 may include multiple sets of I2C buses.
  • the processor 210 may be coupled to the touch sensor 280K, the charger, the flash, the camera 293, etc., through different I2C bus interfaces.
  • the processor 210 may couple the touch sensor 280K through an I2C interface, so that the processor 210 and the touch sensor 280K communicate through the I2C bus interface to implement the touch function of the electronic device 200.
  • the I2S interface can be used for audio communication.
  • the processor 210 may include multiple sets of I2S buses.
  • the processor 210 may be coupled with the audio module 270 through an I2S bus to implement communication between the processor 210 and the audio module 270.
  • the PCM interface can also be used for audio communication to sample, quantize and encode analog signals.
  • the audio module 270 and the wireless communication module 260 may be coupled through a PCM bus interface.
  • the audio module 270 may also transmit audio signals to the wireless communication module 260 through the PCM interface, so as to realize the function of answering calls through the Bluetooth headset. Both the I2S interface and the PCM interface can be used for audio communication.
  • the UART interface is a universal serial data bus used for asynchronous communication.
  • the bus can be a two-way communication bus. It converts the data to be transmitted between serial communication and parallel communication.
  • the UART interface is generally used to connect the processor 210 and the wireless communication module 260.
  • the processor 210 communicates with the Bluetooth module in the wireless communication module 260 through the UART interface to realize the Bluetooth function.
  • the audio module 270 may transmit audio signals to the wireless communication module 260 through the UART interface, so as to realize the function of playing music through the Bluetooth headset.
  • the MIPI interface can be used to connect the processor 210 with the display screen 294, the camera 293 and other peripheral devices.
  • the MIPI interface includes a camera serial interface (camera serial interface, CSI), a display serial interface (display serial interface, DSI), and so on.
  • the processor 210 and the camera 293 communicate through a CSI interface to implement the shooting function of the electronic device 200.
  • the processor 210 and the display screen 294 communicate through a DSI interface to realize the display function of the electronic device 200.
  • the GPIO interface can be configured through software.
  • the GPIO interface can be configured as a control signal or as a data signal.
  • the GPIO interface can be used to connect the processor 210 with the camera 293, the display screen 294, the wireless communication module 260, the audio module 270, the sensor module 280, and so on.
  • the GPIO interface can also be configured as an I2C interface, I2S interface, UART interface, MIPI interface, etc.
  • the USB interface 230 is an interface that complies with the USB standard specifications, and specifically may be a Mini USB interface, a Micro USB interface, a USB Type C interface, and so on.
  • the USB interface 230 can be used to connect a charger to charge the electronic device 200, and can also be used to transfer data between the electronic device 200 and peripheral devices. It can also be used to connect earphones and play audio through earphones. This interface can also be used to connect to other electronic devices, such as AR devices.
  • the interface connection relationship between the modules illustrated in the embodiment of the present application is merely a schematic description, and does not constitute a structural limitation of the electronic device 200.
  • the electronic device 200 may also adopt different interface connection modes in the foregoing embodiments, or a combination of multiple interface connection modes.
  • the charging management module 240 is used to receive charging input from the charger.
  • the charger can be a wireless charger or a wired charger.
  • the charging management module 240 may receive the charging input of the wired charger through the USB interface 230.
  • the charging management module 240 may receive the wireless charging input through the wireless charging coil of the electronic device 200. While the charging management module 240 charges the battery 242, it can also supply power to the electronic device through the power management module 241.
  • the power management module 241 is used to connect the battery 242, the charging management module 240 and the processor 210.
  • the power management module 241 receives input from the battery 242 and/or the charging management module 240, and supplies power to the processor 210, the internal memory 221, the external memory, the display screen 294, the camera 293, and the wireless communication module 260.
  • the power management module 241 can also be used to monitor parameters such as battery capacity, battery cycle times, and battery health status (leakage, impedance).
  • the power management module 241 may also be provided in the processor 210.
  • the power management module 241 and the charging management module 240 may also be provided in the same device.
  • the wireless communication function of the electronic device 200 can be implemented by the antenna 1, the antenna 2, the mobile communication module 250, the wireless communication module 260, the modem processor, and the baseband processor.
  • the antenna 1 and the antenna 2 are used to transmit and receive electromagnetic wave signals.
  • Each antenna in the electronic device 200 can be used to cover a single or multiple communication frequency bands. Different antennas can also be reused to improve antenna utilization.
  • Antenna 1 can be multiplexed as a diversity antenna of a wireless local area network.
  • the antenna can be used in combination with a tuning switch.
  • the mobile communication module 250 may provide a wireless communication solution including 2G/3G/4G/5G and the like applied to the electronic device 200.
  • the mobile communication module 250 may include at least one filter, a switch, a power amplifier, a low noise amplifier (LNA), and the like.
  • the mobile communication module 250 can receive electromagnetic waves by the antenna 1, and perform processing such as filtering, amplifying and transmitting the received electromagnetic waves to the modem processor for demodulation.
  • the mobile communication module 250 can also amplify the signal modulated by the modem processor, and convert it into electromagnetic wave radiation via the antenna 1.
  • at least part of the functional modules of the mobile communication module 250 may be provided in the processor 210.
  • at least part of the functional modules of the mobile communication module 250 and at least part of the modules of the processor 210 may be provided in the same device.
  • the modem processor may include a modulator and a demodulator.
  • the modulator is used to modulate the low frequency baseband signal to be sent into a medium and high frequency signal.
  • the demodulator is used to demodulate the received electromagnetic wave signal into a low-frequency baseband signal. Then the demodulator transmits the demodulated low-frequency baseband signal to the baseband processor for processing. After the low-frequency baseband signal is processed by the baseband processor, it is passed to the application processor.
  • the application processor outputs a sound signal through an audio device (not limited to a speaker 270A, a receiver 270B, etc.), or displays an image or video through the display screen 294.
  • the modem processor may be an independent device.
  • the modem processor may be independent of the processor 210 and be provided in the same device as the mobile communication module 250 or other functional modules.
  • the wireless communication module 260 can provide wireless communication solutions applied to the electronic device 200, including wireless local area network (WLAN) (such as wireless fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR), and other wireless communication solutions.
  • the wireless communication module 260 may be one or more devices integrating at least one communication processing module.
  • the wireless communication module 260 receives electromagnetic waves via the antenna 2, frequency modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 210.
  • the wireless communication module 260 may also receive a signal to be sent from the processor 210, perform frequency modulation, amplify, and convert it into electromagnetic waves to radiate through the antenna 2.
  • the antenna 1 of the electronic device 200 is coupled with the mobile communication module 250, and the antenna 2 is coupled with the wireless communication module 260, so that the electronic device 200 can communicate with the network and other devices through wireless communication technology.
  • the wireless communication technology may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technology, etc.
  • the GNSS may include the global positioning system (GPS), the global navigation satellite system (GLONASS), the Beidou navigation satellite system (BDS), the quasi-zenith satellite system (QZSS), and/or satellite-based augmentation systems (SBAS).
  • the electronic device 200 implements a display function through a GPU, a display screen 294, and an application processor.
  • the GPU is an image processing microprocessor, which is connected to the display screen 294 and the application processor.
  • the GPU is used to perform mathematical and geometric calculations for graphics rendering.
  • the processor 210 may include one or more GPUs that execute program instructions to generate or change display information.
  • the display screen 294 is used to display images, videos, and the like.
  • the display screen 294 includes a display panel.
  • the display panel can adopt a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, a quantum dot light-emitting diode (QLED), etc.
  • the electronic device 200 may include one or N display screens 294, and N is a positive integer greater than one.
  • the electronic device 200 can implement a shooting function through an ISP, a camera 293, a video codec, a GPU, a display screen 294, and an application processor.
  • the ISP is used to process the data fed back by the camera 293. For example, when taking a picture, the shutter is opened, the light is transmitted to the photosensitive element of the camera through the lens, the light signal is converted into an electrical signal, and the photosensitive element of the camera transmits the electrical signal to the ISP for processing and is converted into an image visible to the naked eye.
  • ISP can also optimize the image noise, brightness, and skin color. ISP can also optimize the exposure, color temperature and other parameters of the shooting scene.
  • the ISP may be provided in the camera 293.
  • the camera 293 is used to capture still images or videos.
  • the object generates an optical image through the lens and is projected to the photosensitive element.
  • the photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor.
  • the photosensitive element converts the optical signal into an electrical signal, and then transfers the electrical signal to the ISP to convert it into a digital image signal.
  • ISP outputs digital image signals to DSP for processing.
  • DSP converts digital image signals into standard RGB, YUV and other formats of image signals.
  • the electronic device 200 may include 1 or N cameras 293, and N is a positive integer greater than 1.
  • the camera can collect images and display the collected images in the preview interface.
  • the photosensitive element converts the collected optical signal into an electrical signal, and then transfers the electrical signal to the ISP to convert it into a digital image signal.
  • the ISP outputs the digital image signal to the DSP for related image processing.
  • Digital signal processors are used to process digital signals. In addition to digital image signals, they can also process other digital signals. For example, when the electronic device 200 selects a frequency point, the digital signal processor is used to perform Fourier transform on the energy of the frequency point.
  • Video codecs are used to compress or decompress digital video.
  • the electronic device 200 may support one or more video codecs. In this way, the electronic device 200 can play or record videos in multiple encoding formats, such as: moving picture experts group (MPEG) 1, MPEG2, MPEG3, MPEG4, and so on.
  • NPU is a neural-network (NN) computing processor.
  • applications such as intelligent cognition of the electronic device 200 can be realized, such as image recognition, face recognition, voice recognition, text understanding, and so on.
  • the external memory interface 220 may be used to connect an external memory card, such as a Micro SD card, so as to expand the storage capacity of the electronic device 200.
  • the external memory card communicates with the processor 210 through the external memory interface 220 to realize the data storage function. For example, save music, video and other files in an external memory card.
  • the internal memory 221 may be used to store computer executable program code, where the executable program code includes instructions.
  • the processor 210 executes various functional applications and data processing of the electronic device 200 by running instructions stored in the internal memory 221.
  • the internal memory 221 may include a storage program area and a storage data area.
  • the storage program area can store an operating system, at least one application program (such as a sound playback function, an image playback function, etc.) required by at least one function.
  • the data storage area can store data (such as audio data, phone book, etc.) created during the use of the electronic device 200.
  • the internal memory 221 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, a universal flash storage (UFS), and the like.
  • the electronic device 200 can implement audio functions through an audio module 270, a speaker 270A, a receiver 270B, a microphone 270C, a headphone interface 270D, and an application processor. For example, music playback, recording, etc.
  • the audio module 270 is used for converting digital audio information into an analog audio signal for output, and also for converting an analog audio input into a digital audio signal.
  • the audio module 270 can also be used to encode and decode audio signals.
  • the audio module 270 may be provided in the processor 210, or part of the functional modules of the audio module 270 may be provided in the processor 210.
  • the speaker 270A, also called a "loudspeaker", is used to convert audio electrical signals into sound signals.
  • the electronic device 200 can listen to music through the speaker 270A, or listen to a hands-free call.
  • the receiver 270B, also called an "earpiece", is used to convert audio electrical signals into sound signals.
  • the electronic device 200 answers a call or voice message, it can receive the voice by bringing the receiver 270B close to the human ear.
  • the microphone 270C, also called a "mic", is used to convert sound signals into electrical signals.
  • the user can make a sound by approaching the microphone 270C through the human mouth, and input the sound signal into the microphone 270C.
  • the electronic device 200 may be provided with at least one microphone 270C.
  • the electronic device 200 may be provided with two microphones 270C, which can implement noise reduction functions in addition to collecting sound signals.
  • the electronic device 200 may also be provided with three, four or more microphones 270C to collect sound signals, reduce noise, identify sound sources, and realize directional recording functions.
  • the earphone interface 270D is used to connect wired earphones.
  • the earphone interface 270D may be a USB interface 230, a 3.5 mm open mobile terminal platform (OMTP) standard interface, or a cellular telecommunications industry association of the USA (CTIA) standard interface.
  • the pressure sensor 280A is used to sense the pressure signal and can convert the pressure signal into an electrical signal.
  • the pressure sensor 280A may be provided on the display screen 294.
  • the capacitive pressure sensor may include at least two parallel plates with conductive materials. When a force is applied to the pressure sensor 280A, the capacitance between the electrodes changes.
  • the electronic device 200 determines the intensity of the pressure according to the change in capacitance. When a touch operation acts on the display screen 294, the electronic device 200 detects the intensity of the touch operation according to the pressure sensor 280A.
  • the electronic device 200 may also calculate the touched position based on the detection signal of the pressure sensor 280A.
  • touch operations that act on the same touch position but have different touch operation strengths may correspond to different operation instructions. For example, when a touch operation whose intensity of the touch operation is less than the first pressure threshold is applied to the short message application icon, an instruction to view the short message is executed. When a touch operation with a touch operation intensity greater than or equal to the first pressure threshold acts on the short message application icon, an instruction to create a new short message is executed.
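  • Purely for illustration, the threshold-based dispatch described above might look like the following; the threshold value and instruction names are hypothetical:

```python
FIRST_PRESSURE_THRESHOLD = 0.5   # normalized touch pressure; hypothetical value

def dispatch_touch_on_message_icon(pressure: float) -> str:
    """Map the detected touch pressure to an operation instruction."""
    if pressure < FIRST_PRESSURE_THRESHOLD:
        return "view_short_message"        # lighter press: view the message
    return "create_new_short_message"      # firmer press: create a new message
```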
  • the gyro sensor 280B may be used to determine the movement posture of the electronic device 200.
  • in some embodiments, the angular velocity of the electronic device 200 around three axes (i.e., the x, y, and z axes) can be determined by the gyro sensor 280B.
  • the gyro sensor 280B can be used for image stabilization.
  • the gyroscope sensor 280B detects the shake angle of the electronic device 200, calculates the distance that the lens module needs to compensate according to the angle, and allows the lens to counteract the shake of the electronic device 200 through reverse movement to achieve anti-shake.
  • the gyroscope sensor 280B can also be used for navigation and somatosensory game scenes.
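  • For the image-stabilization step, the compensation distance is commonly approximated from the shake angle and the lens focal length; the f·tan(θ) relation below is a standard small-angle approximation, not a formula given in this application:

```python
import math

def ois_compensation(shake_angle_rad: float, focal_length_mm: float) -> float:
    """Approximate image shift on the sensor caused by a rotational shake of
    shake_angle_rad; the lens module moves by roughly this amount in the
    opposite direction (small-angle approximation, illustrative only)."""
    return focal_length_mm * math.tan(shake_angle_rad)

# e.g. a 0.2 degree shake with a 5 mm lens:
# ois_compensation(math.radians(0.2), 5.0) is about 0.017 mm
```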
  • the air pressure sensor 280C is used to measure air pressure. In some embodiments, the electronic device 200 calculates the altitude based on the air pressure value measured by the air pressure sensor 280C to assist positioning and navigation.
  • the magnetic sensor 280D includes a Hall sensor.
  • the electronic device 200 may use the magnetic sensor 280D to detect the opening and closing of the flip holster.
  • further, the electronic device 200 can detect the opening and closing of the flip cover according to the magnetic sensor 280D, and set features such as automatic unlocking of the flip cover according to the detected opening and closing state of the holster or flip cover.
  • the acceleration sensor 280E can detect the magnitude of the acceleration of the electronic device 200 in various directions (generally three axes). When the electronic device 200 is stationary, the magnitude and direction of gravity can be detected. It can also be used to identify the posture of the electronic device, and be applied to applications such as horizontal and vertical screen switching, pedometers, and so on.
  • the distance sensor 280F is used to measure distance. The electronic device 200 can measure distance by infrared or laser. In some embodiments, when shooting a scene, the electronic device 200 may use the distance sensor 280F to measure the distance so as to achieve fast auto-focusing.
  • the proximity light sensor 280G may include, for example, a light emitting diode (LED) and a light detector, such as a photodiode.
  • the light emitting diode may be an infrared light emitting diode.
  • the electronic device 200 emits infrared light to the outside through the light emitting diode.
  • the electronic device 200 uses a photodiode to detect infrared reflected light from nearby objects. When sufficient reflected light is detected, it can be determined that there is an object near the electronic device 200. When insufficient reflected light is detected, the electronic device 200 may determine that there is no object near the electronic device 200.
  • the electronic device 200 can use the proximity light sensor 280G to detect that the user holds the electronic device 200 close to the ear to talk, so as to automatically turn off the screen to save power.
  • the proximity light sensor 280G can also be used in leather case mode and pocket mode to automatically unlock and lock the screen.
  • the ambient light sensor 280L is used to sense the brightness of the ambient light.
  • the electronic device 200 can adaptively adjust the brightness of the display screen 294 according to the perceived brightness of the ambient light.
  • the ambient light sensor 280L can also be used to automatically adjust the white balance when taking pictures.
  • the ambient light sensor 280L can also cooperate with the proximity light sensor 280G to detect whether the electronic device 200 is in the pocket to prevent accidental touch.
  • the fingerprint sensor 280H is used to collect fingerprints.
  • the electronic device 200 can use the collected fingerprint characteristics to realize fingerprint unlocking, access application locks, fingerprint photographs, fingerprint answering calls, and so on.
  • the temperature sensor 280J is used to detect temperature.
  • the electronic device 200 uses the temperature detected by the temperature sensor 280J to execute a temperature processing strategy. For example, when the temperature reported by the temperature sensor 280J exceeds a threshold value, the electronic device 200 reduces the performance of a processor located near the temperature sensor 280J, so as to reduce power consumption and implement thermal protection.
  • when the temperature is lower than another threshold, the electronic device 200 heats the battery 242 to avoid abnormal shutdown of the electronic device 200 due to low temperature.
  • in some other embodiments, the electronic device 200 boosts the output voltage of the battery 242 to avoid abnormal shutdown caused by low temperature.
  • the touch sensor 280K is also called a "touch panel".
  • the touch sensor 280K may be disposed on the display screen 294, and the touch screen is composed of the touch sensor 280K and the display screen 294, which is also called a “touch screen”.
  • the touch sensor 280K is used to detect touch operations acting on or near it.
  • the touch sensor can pass the detected touch operation to the application processor to determine the type of touch event.
  • the visual output related to the touch operation can be provided through the display screen 294.
  • the touch sensor 280K may also be disposed on the surface of the electronic device 200, which is different from the position of the display screen 294.
  • the bone conduction sensor 280M can acquire vibration signals.
  • the bone conduction sensor 280M can acquire the vibration signal of the vibrating bone mass of the human voice.
  • the bone conduction sensor 280M can also contact the human pulse and receive the blood pressure pulse signal.
  • the bone conduction sensor 280M may also be provided in the earphone, combined with the bone conduction earphone.
  • the audio module 270 can parse out the voice signal based on the vibration signal of the vibrating bone block of the voice obtained by the bone conduction sensor 280M to realize the voice function.
  • the application processor can analyze the heart rate information based on the blood pressure beating signal obtained by the bone conduction sensor 280M, and realize the heart rate detection function.
  • the button 290 includes a power-on button, a volume button, and so on.
  • the button 290 may be a mechanical button. It can also be a touch button.
  • the electronic device 200 may receive key input, and generate key signal input related to user settings and function control of the electronic device 200.
  • the motor 291 can generate vibration prompts.
  • the motor 291 can be used for incoming call vibration notification, and can also be used for touch vibration feedback.
  • touch operations that act on different applications can correspond to different vibration feedback effects.
  • touch operations acting on different areas of the display screen 294 can also correspond to different vibration feedback effects of the motor 291.
  • different application scenarios (for example: time reminding, receiving information, alarm clock, games, etc.) can also correspond to different vibration feedback effects.
  • the touch vibration feedback effect can also support customization.
  • the indicator 292 can be an indicator light, which can be used to indicate the charging status, power change, and can also be used to indicate messages, missed calls, notifications, and so on.
  • the SIM card interface 295 is used to connect to the SIM card.
  • the SIM card can be inserted into the SIM card interface 295 or pulled out from the SIM card interface 295 to achieve contact and separation with the electronic device 200.
  • the electronic device 200 may support 1 or N SIM card interfaces, and N is a positive integer greater than 1.
  • the SIM card interface 295 may support Nano SIM cards, Micro SIM cards, SIM cards, etc.
  • the same SIM card interface 295 can insert multiple cards at the same time. The types of the multiple cards can be the same or different.
  • the SIM card interface 295 can also be compatible with different types of SIM cards.
  • the SIM card interface 295 may also be compatible with external memory cards.
  • the electronic device 200 interacts with the network through the SIM card to realize functions such as call and data communication.
  • the electronic device 200 adopts an eSIM, that is, an embedded SIM card.
  • the eSIM card can be embedded in the electronic device 200 and cannot be separated from the electronic device 200.
  • the software system of the electronic device 200 may adopt a layered architecture, an event-driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture.
  • the embodiment of the present application takes an Android system with a layered architecture as an example to illustrate the software structure of the electronic device 200 by way of example.
  • Fig. 3a is a software structure block diagram of an electronic device 200 according to an embodiment of the present application.
  • the layered architecture divides the software into several layers, and each layer has a clear role and division of labor. The layers communicate with each other through software interfaces.
  • the Android system is divided into four layers, from top to bottom, the application layer, the application framework layer, the Android runtime and system library, and the kernel layer.
  • the application layer can include a series of application packages.
  • the application package may include applications such as camera, gallery, calendar, call, map, navigation, WLAN, Bluetooth, music, video, short message, etc.
  • the application framework layer provides an application programming interface (application programming interface, API) and a programming framework for applications in the application layer.
  • the application framework layer includes some predefined functions.
  • the application framework layer can include a window manager, a content provider, a view system, a phone manager, a resource manager, and a notification manager.
  • the window manager is used to manage window programs.
  • the window manager can obtain the size of the display, determine whether there is a status bar, lock the screen, take a screenshot, etc.
  • the content provider is used to store and retrieve data and make these data accessible to applications.
  • the data may include videos, images, audios, phone calls made and received, browsing history and bookmarks, phone book, etc.
  • the view system includes visual controls, such as controls that display text, controls that display pictures, and so on.
  • the view system can be used to build applications.
  • the display interface can be composed of one or more views.
  • a display interface that includes a short message notification icon may include a view that displays text and a view that displays pictures.
  • the phone manager is used to provide the communication function of the electronic device 200. For example, the management of the call status (including connecting, hanging up, etc.).
  • the resource manager provides various resources for the application, such as localized strings, icons, pictures, layout files, video files, and so on.
  • the notification manager enables the application to display notification information in the status bar, which can be used to convey notification-type messages, and it can disappear automatically after a short stay without user interaction.
  • the notification manager is used to notify download completion, message reminders, and so on.
  • the notification manager can also present notifications that appear in the status bar at the top of the system in the form of a graph or scroll-bar text (such as a notification of an application running in the background), or notifications that appear on the screen in the form of a dialog window. For example, a text message is prompted in the status bar, a prompt sound is played, the electronic device vibrates, and the indicator light flashes.
  • Android runtime includes core libraries and virtual machines. Android runtime is responsible for the scheduling and management of the Android system.
  • the core library consists of two parts: one part is the functions that the Java language needs to call, and the other part is the core library of Android.
  • the application layer and the application framework layer run in a virtual machine.
  • the virtual machine executes the java files of the application layer and the application framework layer as binary files.
  • the virtual machine is used to perform functions such as object life cycle management, stack management, thread management, security and exception management, and garbage collection.
  • the system library can include multiple functional modules. For example: surface manager (surface manager), media library (media libraries), 3D graphics processing library (for example: OpenGL ES), 2D graphics engine (for example: SGL), etc.
  • the surface manager is used to manage the display subsystem and provides a combination of 2D and 3D layers for multiple applications.
  • the media library supports playback and recording of a variety of commonly used audio and video formats, as well as still image files.
  • the media library can support a variety of audio and video encoding formats, such as: MPEG4, H.264, MP3, AAC, AMR, JPG, PNG, etc.
  • the 3D graphics processing library is used to realize 3D graphics drawing, image rendering, synthesis, and layer processing.
  • the 2D graphics engine is a drawing engine for 2D drawing.
  • the kernel layer is the layer between hardware and software.
  • the kernel layer contains at least display driver, camera driver, audio driver, and sensor driver.
  • Fig. 3b is a schematic diagram of an embodiment of an image enhancement method provided by an embodiment of the application.
  • an image enhancement method provided by an embodiment of the application includes:
  • An electronic device acquires a first image, where the first image includes a target object.
  • the electronic device may determine the first image that needs image enhancement based on the user's selection.
  • the first image may include a target object obtained by photographing a human face, where the target object may be a human face.
  • the first image may be a face image obtained by a user through a camera device (such as a camera) of an electronic device to photograph a face in real time.
  • the user selects stored face images from the local gallery or cloud album of the electronic device.
  • the cloud album here may refer to a web album located on a cloud computing platform.
  • the electronic device can make an enhancement judgment on the images stored in the local album and, based on the judgment result, prompt the user about the images that can be enhanced; the user can then select the first image from among the images the electronic device prompts as enhanceable.
  • the electronic device can set an enhancement function in the shooting interface; in this case the user's selection is not required, and the electronic device can automatically take the image captured by the user as the first image.
  • the user obtains the first image to be enhanced through the photographing device of the electronic device.
  • the electronic device may display the photographing interface of the camera, receive the photographing operation of the user, and obtain the first image in response to the photographing operation.
  • the electronic device can display the shooting interface of the camera. After the camera is pointed at the face, the user can click the shooting control in the shooting interface. Accordingly, the electronic device can receive the user's shooting operation, perform shooting in response to the shooting operation, and acquire the first image, where the first image includes a target object corresponding to a human face or a partial area of the human face.
  • FIG. 4(a) is a schematic diagram of an image enhancement processing graphical user interface (GUI) provided by an embodiment of the present application; FIG. 4(a) illustrates what the screen display system of the mobile phone displays when the mobile phone is unlocked.
  • the current output interface content 401 is displayed, and the interface content 401 is the main interface of the mobile phone.
  • the interface content 401 displays a variety of third-party applications (applications, App), such as Alipay, task card store, Weibo, photo album, WeChat, card package, settings, and camera. It should be understood that the interface content 401 may also include other more application programs, which are not limited in this application.
  • the shooting interface 403 may include a viewing frame, an album icon 404, a shooting control 405, a camera rotation control 406, and the like.
  • the viewfinder frame is used to obtain the image of the shooting preview and display the preview image in real time, such as a preview image of a human face in Figure 4(b).
  • the album icon 404 is used to quickly enter the album.
  • the shooting control 405 is used to take photos or videos.
  • when the mobile phone detects that the user clicks the shooting control 405, the mobile phone performs the photographing operation and saves the captured photo; or, when the mobile phone is in video recording mode, after the user clicks the shooting control 405, the mobile phone performs the recording operation and saves the recorded video.
  • the camera rotation control 406 can be used to control the switching of the front camera and the rear camera.
  • the shooting interface 403 also includes functional controls for setting shooting modes, such as portrait mode, photo mode, video mode, professional mode, and more modes in Figure 4(b). It should be understood that after the user clicks the icon 402, in response to the click operation, the mobile phone opens the camera application by default in the camera mode, which is not limited in this application.
  • the user can click the photographing control 405 to take a photograph.
  • the mobile phone performs a photographing operation and obtains the first image obtained by the photograph.
  • low image quality of the first image can be understood as meaning that the image quality of the human face area in the first image is low, or that the image quality of a part of the human face (for example, a certain facial feature) in the first image is low, which is not limited here. It should be noted that low image quality can also be judged based on the user's own visual perception.
  • low image quality may include at least one of the following image features: poor brightness, poor color tone, and low detail definition, for example: the brightness or tone of the face is poor, the detail definition of the face is low, the brightness or tone of one or more facial features is poor, or the detail definition of one or more facial features is low.
  • the mobile phone can display the taken photo in the photo display area 409.
  • the display interface of the mobile phone can also display an "enhanced" control and a "save” control.
  • the user can click the "save” control.
  • the mobile phone can receive the save instruction and, in response to the save instruction, save the taken photo to the album corresponding to the album icon 404.
  • the user can click on the "enhanced” control.
  • the mobile phone can receive an enhancement instruction.
  • the mobile phone can determine that the user wants to enhance the photo displayed on the current display interface.
  • the image that needs to be enhanced is referred to as the first image.
  • the mobile phone enhances the first image obtained by shooting.
  • the user can directly select the first image that needs to be enhanced from the album.
  • the electronic device may display an album interface of the camera, where the album interface includes a plurality of images, and receive a third image selection instruction, where the third image selection instruction indicates selecting the first image from the plurality of images included in the album interface.
  • Figure 5(a) and Figure 5(b) show the image display interface of the album, which can include images previously taken by the user and images downloaded from the network side, etc.
  • the user can select one of the images by clicking or long-pressing it; in response, the mobile phone can display the interface shown in Figure 5(c), which, in addition to the conventional image preview and "delete" controls, can also include an "enhanced" control.
  • the user can click the above-mentioned "enhanced" control; in response, the mobile phone can enhance the image, for example, it can display the enhanced area selection interface shown in Figure 4(d).
  • the control settings and display content in the foregoing embodiment are merely illustrative, and this application is not limited thereto.
  • the mobile phone can make an enhancement judgment on the images stored in the local album and, based on the judgment result, prompt the user about the images that can be enhanced.
  • the mobile phone can use the dynamic range of the brightness of the photo, the color tone, the skin texture, and whether a high-definition guide image with a similar face pose exists as the judgment basis.
  • for example, comparing the first image in Figure 6(a) with the second image: the brightness of the first image is poor, and the pose of the face in the second image is similar to that in the first image; therefore, it can be determined that the first image is an image that can be enhanced.
  • FIG. 6(a) shows the image display interface of the album; the interface includes not only images previously taken by the user and images downloaded from the network side, but may also include an "enhanceable image" control. The user can click the "enhanceable image" control, and in response to the user's operation, the mobile phone can display the enhanceable-image display interface shown in Figure 6(b). The user can click the image to be enhanced in this display interface, and in response to the user's operation, the mobile phone can display the interface shown in Figure 6(c), which can also include an "enhanced" control. The user can click the above-mentioned "enhanced" control; in response to the user's click on the "enhanced" control, the mobile phone can enhance the image, for example, it can display the enhanced area selection interface shown in Figure 4(d).
  • the control settings and display content in the foregoing embodiment are merely illustrative, and this application is not limited thereto.
  • the mobile phone can set enhanced functions in the shooting interface.
  • the shooting interface 403 includes functional controls for setting the shooting mode, such as the portrait mode, photo mode, video mode, enhanced mode, and more modes shown in Figure 7(a).
  • the mobile phone enters the enhanced mode.
  • the user can click on the camera control 405.
  • the mobile phone displays the captured image on the display interface shown in Figure 7(c).
  • the display interface can also include a "save" control and an "enhanced" control. If the user clicks the "save" control, the mobile phone can respond to this operation by directly saving the image to the local album without enhancing it. If the user clicks the "enhanced" control, the mobile phone can perform enhancement processing on the image in response to the operation, for example, acquire a guide image and perform enhancement processing on the first image obtained by shooting based on the guide image.
  • the mobile phone may not enter the enhanced mode based on the user's operation, but determine whether to enter the enhanced mode based on the image quality analysis of the preview image on the shooting interface.
  • when the phone recognizes that the sharpness of the photographed face is too low, it can automatically enter the enhanced mode.
  • the mobile phone can also determine whether to enter the enhanced mode based on the length of time the face appears on the preview interface, which can reduce the rate of misjudgment and reduce the impact on the user's operation of the mobile phone. For example, if the mobile phone recognizes that the sharpness of the face on the preview interface is too low, but the face appears for only 1 second and in the next second there is no face on the preview interface, the mobile phone may not enter the enhanced mode (see the sketch below).
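For illustration only, the following Python sketch shows one way such gating could be realized: a low-definition face must persist on the preview for a minimum dwell time before the enhanced mode is entered. The Laplacian-variance sharpness measure, the threshold values, and the class/function names are assumptions of this sketch, not part of the embodiment.

```python
import time
import cv2
import numpy as np

SHARPNESS_THRESHOLD = 60.0    # assumed tuning value, not specified in the embodiment
MIN_FACE_DWELL_SECONDS = 2.0  # assumed: face must persist this long before switching modes

def face_sharpness(face_roi_bgr: np.ndarray) -> float:
    """Estimate the detail definition of a face crop via Laplacian variance (one common proxy)."""
    gray = cv2.cvtColor(face_roi_bgr, cv2.COLOR_BGR2GRAY)
    return float(cv2.Laplacian(gray, cv2.CV_64F).var())

class EnhancedModeGate:
    """Enter the enhanced mode only if a low-definition face stays in the preview long enough."""

    def __init__(self):
        self._first_seen = None

    def update(self, face_roi_bgr) -> bool:
        if face_roi_bgr is None:            # no face on the preview interface
            self._first_seen = None
            return False
        if face_sharpness(face_roi_bgr) >= SHARPNESS_THRESHOLD:
            self._first_seen = None         # face is sharp enough, no enhancement needed
            return False
        now = time.monotonic()
        if self._first_seen is None:
            self._first_seen = now
        # trigger only once the low-definition face has persisted, to reduce misjudgment
        return (now - self._first_seen) >= MIN_FACE_DWELL_SECONDS
```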
  • after the mobile phone enters the enhanced mode, it can analyze the image in the image preview area and obtain a guide image that can be used for that image. For example, it can search the local album, a local enhanced image library, or a cloud enhanced image gallery for an image that can serve as a guide image for the image in the image preview area (the facial posture and expression are similar, and the brightness and color are better, etc.). If such a guide image is obtained, then after the user shoots and the mobile phone obtains the first image, the first image is automatically enhanced based on the guide image.
  • the preview interface of the mobile phone may include a reminder box, which may be used to remind the user that the current shooting has entered the enhanced mode; the reminder box may include the text content of the enhanced mode and a close control (such as the "exit" control shown in Figure 8).
  • the mobile phone can also exit the enhanced mode when shooting.
  • for example, a face with a poor dynamic range of brightness or too low definition may appear in the preview interface of the mobile phone for a certain period of time, and the mobile phone may enter the enhanced mode after recognizing the poor dynamic range of brightness or the face with too low definition.
  • however, the user may not want to take the face picture in the enhanced mode, or may want to exit the enhanced mode and return to the normal mode after taking the face picture; in either case, the user can click the close control in the reminder box, so that the shooting preview interface switches from the interface in Figure 8 to the display interface of the normal mode.
  • there may be other methods for turning off the enhanced mode which are not limited in this application.
  • when the mobile phone recognizes that the dynamic range of brightness or the definition of the photographed face is too low, it may display a guide that allows the user to choose whether to enter the enhanced mode.
  • the preview interface of the mobile phone may include a reminder box that can be used to prompt the user to choose whether to enter the enhanced mode; the reminder box may include the text content of the enhanced mode, a confirm control, and a hide control.
  • the mobile phone can display a guide for the user to choose to enter the enhanced mode; as shown in Figure 9, the user can click the "Enter" control to make the phone enter the enhanced mode.
  • the user can select the target object to be enhanced in the first image.
  • the mobile phone can display target object selection controls.
  • the target object selection controls can include an "all" control, a "facial features" control, and a "custom area" control, where the "all" control can provide the function of enhancing the whole facial area of the currently taken photo, the "facial features" control can provide the function of enhancing the facial features in the currently taken photo, and the "custom area" control can provide the function of enhancing a customized area in the currently taken photo.
  • the above target object selection control is only an example; in actual applications, the target object selection control may also be of other types, which is not limited in this application.
  • the electronic device acquires a guide image according to the first image, the guide image includes the target object, and the definition of the target object in the guide image is greater than the definition of the target object in the first image.
  • the user can select a guide image for enhancing the first image from a local album or a cloud album, or the electronic device can select a guide image that can be used for enhancing the first image, which will be described separately in the following.
  • the mobile phone can receive an instruction to enhance the face area of the captured photo and, in response to the instruction, display the guide image selection interface; the image used to guide the enhancement of the first image is referred to as the guide image below.
  • the mobile phone may display a guide image selection interface, and the guide image selection interface may include a "select from local album" control and a "smart selection" control.
  • the user can click "select from local album", correspondingly, the mobile phone can receive an instruction to select a guide image from the local album.
  • the mobile phone can open the local album and display the guide image selection interface shown in Figure 10(c). The interface in Figure 10(c) may include an album display area 501 and a to-be-enhanced image display area 502, where the album display area 501 may display previews of photos saved in the local album, and the to-be-enhanced image display area 502 can display a preview of the photo to be enhanced.
  • the setting of the above controls allows the user to visually compare the image to be enhanced with candidate guide images, and to select a guide image whose posture is closer to the image to be enhanced and whose detail definition is higher.
  • the terms “high” and “low” do not refer to specific thresholds, but refer to relationships relative to each other. Therefore, the "high resolution” image does not need a resolution greater than a certain value, but has a higher resolution than the related "low resolution” image.
  • the user can select an image from the album display area 501 as the guide image.
  • the mobile phone can receive the user's picture selection instruction, obtain the image selected by the user, and determine the image selected by the user as the guide image of the image taken by the user in Figure 4(b).
  • the mobile phone may determine, based on the similarity of the face posture in the first image and the selected image, whether the posture and expression of the face in the two images are similar. If they are similar, it can be determined that the image selected by the user can be used as the guide image of the first image; if they are not close, it can be determined that the image selected by the user cannot be used as the guide image of the first image.
  • the mobile phone can display the target image after enhancing the first image based on the image enhancement method.
  • the mobile phone can prompt the user to select the guide image again. Optionally, as shown in Figure 10(e), the mobile phone can display the prompt "The posture difference is too large, please select again" on the interface and return to the guide image selection interface shown in Figure 10(c), so that the user can re-select a guide image whose posture is closer to the first image.
  • after judging the similarity of the face posture, the mobile phone can determine that the selected image can be used as the guide image of the first image, and the first image is enhanced based on the image enhancement method. As shown in Figure 10(f), after the mobile phone enhances the first image based on the image enhancement method, the target image can be displayed.
  • after the mobile phone obtains the first image and the guide image, it can send the first image and the guide image to the server; the server enhances the first image based on the image enhancement method and sends the target image back to the mobile phone, and the mobile phone can then display the target image.
  • the electronic device automatically selects the guide image that can be used as the guide image of the first image.
  • based on a face pose matching strategy and other image processing strategies, the electronic device can select, from a local photo album or a cloud photo album, an image with better brightness and tone, higher detail definition, and high face pose similarity as the guide image to enhance the first image.
  • when the user clicks the "smart selection" control, correspondingly, the mobile phone can receive the user's click on the "smart selection" control and, based on the face pose matching strategy and other image processing strategies, select an image with better brightness and tone, higher detail definition, and high facial posture similarity from a local album or cloud album as the guide image to enhance the first image.
  • the dynamic range of brightness of the target object may refer to the number of gray levels between the brightest pixel and the darkest pixel among the pixels included in the target object, as illustrated by the sketch below.
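A minimal Python sketch of this definition is given below; the function name and the use of a boolean mask to mark the target object's pixels are assumptions of this sketch.

```python
import numpy as np

def brightness_dynamic_range(image_gray: np.ndarray, target_mask: np.ndarray) -> int:
    """Number of gray levels between the brightest and darkest pixels of the target object.

    image_gray  : HxW uint8 grayscale image
    target_mask : HxW boolean mask marking the pixels that belong to the target object
    """
    pixels = image_gray[target_mask]
    if pixels.size == 0:
        return 0
    return int(pixels.max()) - int(pixels.min())
```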
  • the guide image can also be selected by the server without requiring the user to select it.
  • the mobile phone can receive the user’s click on the "smart selection” control and send the first image to the server.
  • the server can use the facial pose matching strategy and other image processing strategies to choose, from a local album or a cloud album, an image with better brightness and hue, higher detail definition, and high facial posture similarity as the guide image to enhance the first image.
  • the mobile phone may display the target image after enhancing the first image based on the image enhancement method.
  • other controls can also be displayed on the display interface, such as the "save” control and the “cancel” control shown in Figure 10(f).
  • the mobile phone can respond to the user's operation by saving the displayed target image to a local album or other storage location, such as to the cloud.
  • optionally, the mobile phone can save both the first image and the enhanced first image to a local photo album or another storage location, such as the cloud, in response to the user's operation of clicking the "save" control, which is not limited here.
  • in response to the user's operation of clicking the "cancel" control, the mobile phone can return to the camera's shooting interface shown in Figure 4(b); or the mobile phone can return to the interface in Figure 10(b), prompting the user to select the guide image again; or the mobile phone can return to the interface shown in FIG. 10(c), prompting the user to select the guide image again.
  • the control types of the interface and the display content of the display interface of the mobile phone are only illustrative, and this application is not limited thereto.
  • the user may only enhance a local area in the first image, for example, only enhance one or more facial features, or other areas in the first image, which is not limited here.
  • the user can click the "facial features" control shown therein.
  • the mobile phone can receive an instruction to enhance the facial features of the captured photo and, in response, display the facial features area selection interface.
  • the mobile phone may display a facial features area selection interface, and the facial features area selection interface may include a selection control for each facial feature, for example, the "left eye" control, "right eye" control, "lips" control, "nose" control, "left ear" control, "right ear" control, "left eyebrow" control, and "right eyebrow" control shown in FIG. 11(b).
  • the user can click on the control corresponding to the facial features that he wants to enhance.
  • the mobile phone can receive the instruction of the user clicking the control corresponding to the facial feature that the user wants to enhance, and in response to the instruction, identify the corresponding facial feature area in the first image based on a face recognition strategy.
  • for example, the mobile phone can receive the instruction of the user clicking the "left eye" control and, in response to the instruction, recognize the left eye area of the face in the first image based on the face recognition strategy; the phone can then circle the left eye area with a prompt box.
  • the control settings and display content of the above-mentioned facial features area selection interface are merely illustrative, and this application is not limited thereto.
  • the facial features area selection interface can also include an "OK" control and a "Return” control.
  • the user can click the "left eye" control and the "lips" control; correspondingly, the mobile phone can receive the instruction of the user clicking the "left eye" control and the "lips" control, and in response to the instruction, recognize the left eye area and the lip area of the face in the first image based on the face recognition strategy.
  • the user can click the "OK” control.
  • the mobile phone can receive an instruction from the user to click the "OK” control.
  • the mobile phone can display a guide image selection interface.
  • for the guide image selection interface, refer to Figure 11(b) in the above embodiment and its corresponding description, which will not be repeated here.
  • the control types of the interface and the display content of the display interface of the mobile phone are only illustrative, and this application is not limited thereto.
  • the user can click the "custom area" control shown therein, which allows the user to select the area to be enhanced in the first image; correspondingly, the mobile phone can receive the instruction of the user clicking the "custom area" control and, in response to the instruction, display an enhanced area selection interface as shown in Figure 12(b).
  • FIG. 12(b) shows a schematic diagram of the enhanced area selection interface, in which the user can manually circle the area to be enhanced. As shown in FIG. 12(c), after the user has circled an area, the mobile phone can display an "OK" control and a "Continue to select" control.
  • the user can click on the "OK” control, and in response to the user's click on the "OK” control, the mobile phone can display a guide image selection interface.
  • for the guide image selection interface, refer to Figure 15(b) in the above embodiment and its corresponding description, which is not repeated here.
  • the user can click on the "Continue Selection” control.
  • the mobile phone can display the enhanced area selection interface again, and the user can continue to delineate enhanced areas in it; as shown in Figure 12(d), the user can manually circle another enhanced area, and after the delineation is completed, click the "OK" control in the interface shown in Figure 12(e) to enter the guide image selection interface.
  • the user can click the "custom area" control shown therein, which allows the user to select the area to be enhanced in the first image; correspondingly, the mobile phone can receive the user's click and display an enhanced area selection interface. Different from the above-mentioned Figure 12(b) to Figure 12(e), as shown in Figure 13(a), the mobile phone can display a guide frame of a preset size on the display interface.
  • for example, a rectangular frame of a preset size can be displayed in the center of the interface; the user can drag the rectangular frame to the position of the area to be enhanced (as shown in Figure 13(a)) and change the size of the enhanced area by changing the size of the rectangular frame (as shown in Figure 13(b)).
  • the mobile phone can determine the enhancement area based on the user's operation on the guide frame; as shown in Figure 13(c), after the user has completed the delineation, the user can click the "OK" control to enter the guide image selection interface, or click the "Continue Selection" control to continue delineating enhancement areas.
  • the electronic device may construct an album dedicated to saving the guide image.
  • when performing image guidance, the guide image selection interface of the mobile phone can also display a "select from the guide image gallery" control. Specifically, the user can click the "select from the guide image gallery" control, and in response to the user's operation, the mobile phone can display the interface of the guide image gallery. As shown in Figure 14(b), the images in the guide image gallery can be classified according to preset rules, for example, into characters, sceneries, animals, etc.; further, within the character category, images can also be classified according to different people, which is not limited in this application. As shown in Figure 14(b), the guide image gallery display interface can include a "character" control and a "scene" control. When the user clicks the "character" control, the mobile phone can display the character selection interface shown in Figure 14(c), where the interface may include a selection control corresponding to each person's name; the user can click the corresponding control to instruct the mobile phone to display the album built from images of the corresponding person, and the user can then select a guide image from the album displayed on the mobile phone.
  • the mobile phone can install a guide image gallery application.
  • the user can click the icon corresponding to the "Guide Image Gallery" application; correspondingly, the mobile phone can receive the user's click on the icon and display the guide image gallery interface shown in Figure 15(b). As shown in Figure 15(b), the guide image gallery display interface can include a "character" control and a "scene" control.
  • when the user clicks the "character" control, the mobile phone can display the character selection interface shown in Figure 15(c), where the interface may include a selection control corresponding to each person's name, and the user can click the corresponding control to instruct the mobile phone to display the photo album constructed from images of the corresponding person.
  • the user can click on the "Zhang San” control.
  • the mobile phone can obtain the user's instruction to click on the "Zhang San” control and display the album as shown in Figure 15(d).
  • the album display interface may also include controls for modifying the album, such as the "+" control shown in Figure 15(d).
  • the user can click the "+" control to add images to the album.
  • the mobile phone can display the local album in response to the operation, and guide the user to select the image that he wants to add to the album.
  • users can also delete images that have been added to the album.
  • the control settings and display content in the above-mentioned album display interface are merely illustrative, and this application is not limited thereto.
  • the user can directly add the displayed image to the guide image gallery from the third-party application.
  • Figure 16(a) shows a schematic diagram of a chat interface.
  • Zhang San sends an image.
  • after the phone receives the image, it can be displayed on the chat interface; as shown in Figure 16(b), the user can long-press the image, and in response to the operation, the mobile phone can display a guide for operating on the image.
  • the guide can include a "Save to Album" control, a "Save to Guide Image Gallery" control, and a "Copy" control; the user can click the above "Save to Guide Image Gallery" control, and the mobile phone can respond to this operation by saving the image to the guide image gallery (as shown in Figure 17), or by displaying the interface shown in Figure 15(b) to guide the user to save the image to the corresponding album.
  • the control settings and display content in the foregoing embodiment are merely illustrative, and this application is not limited thereto.
  • the electronic device enhances the target object in the first image through a neural network according to the target object in the guide image to obtain a target image, where the target image includes the enhanced target object, and the sharpness of the enhanced target object is greater than the sharpness of the target object in the first image.
  • if the image quality of the first image is low (for example, the brightness or tone of the face is poor, the detail definition of the face is low, the brightness or tone of one or more facial features is poor, or the detail definition of one or more facial features is low), then the first image can be enhanced. For example, when the user clicks the "enhanced" control shown in FIG. 4(c), the electronic device is equivalent to receiving an enhancement instruction.
  • the user needs to enhance an image stored in an album in the electronic device.
  • the user wants to send a selfie to other users, but after opening the image finds that the image quality is very low (for example: poor face brightness or tone, low face detail definition, poor brightness or tone of one or more facial features, or low detail definition of one or more facial features). The user can open the album and enhance the selfie (the first image) to be sent, for example by clicking the "enhanced" control shown in Figure 12(c), whereupon the electronic device is equivalent to receiving an enhancement instruction.
  • the electronic device may enhance the target object in the first image based on the target object in the guide image through the neural network.
  • the target object in the guide image may also be the same facial feature of a different person. For example, the first image is obtained by photographing Zhang San's frontal face and includes Zhang San's eyes (the target object), while the corresponding guide image is obtained by photographing Li Si's frontal face and includes Li Si's eyes. If the posture information of Zhang San's eyes and Li Si's eyes is very similar, then Li Si's eyes in the guide image can also be used to enhance the target object (Zhang San's eyes).
  • the principle of image enhancement is that, under the premise of improving the image quality of the first image, the target image should not be excessively distorted compared with the first image. Therefore, when the first image needs to be enhanced and the guide image is used as its guide, the posture difference between the target object in the guide image and the target object in the first image cannot be too large; that is, the degree of difference between the posture information of the two is within a preset range.
  • the electronic device may display an album interface in response to the enhancement instruction to guide the user to select a guide image; for example, as shown in FIG. 15(c), the user can select a guide image in the guide image selection interface shown in FIG. 15(c), and in response to the user's image selection operation, the electronic device can obtain the guide image corresponding to the image selection operation.
  • the first image includes a human face or part of a human face (target object).
  • the electronic device will determine whether the guide image contains a target object that is close to the target object in the first image.
  • the electronic device may determine whether there is a target object in the guide image that is similar in pose to the face in the first image based on a face key point (landmark) detection method.
  • the key points of the face can also be called the feature points of the face, and usually include the points that constitute the facial features (eyebrows, eyes, nose, mouth, and ears) and the outline of the face.
  • the method of detecting a face image and marking one or more key points in the face image may be called a face key point detection method or a face alignment detection method.
  • the feature area in the face image can be determined.
  • the feature area here can include, but is not limited to: the eyebrow area, eye area, nose area, mouth area, ear area, and so on.
  • the electronic device may realize the difference degree judgment of the target object and the posture information of the target object based on a face key point detection model.
  • the face key point detection model can be called to perform face detection on the first image and the guide image, respectively, to determine multiple key points in the first image and the guide image
  • the key points here can include but are not limited to: key points of the mouth, key points of the eyebrows, key points of the eyes, key points of the nose, key points of the ears, and key points of the face contour, etc.
  • key point labeling information can include, but is not limited to: location labeling information (such as marking the location of the key point), shape labeling information (such as marking as a dot shape), feature information, etc., where the feature information is used to indicate the category of the key point; for example, if the feature information is the feature information of the eyes, it indicates that the key point is a key point of the eye, and if the feature information is the feature information of another facial feature, the key point belongs to that facial feature.
  • the multiple key points determined in the first image and the guide image may be shown as gray dots in FIG. 18.
  • the similarity between the posture information of the target object in the first image and that of the target object in the guide image can be determined based on the annotation information of the key points, such as the position annotation information (for example, the pixel coordinates of the key points).
  • the first image and the guide image may be cropped first, so that the position and posture of the target object in the first image are close to the position and posture of the target object in the guide image.
  • the bounding range of the cropping process may be below the eyebrows (including the eyebrows), above the chin, and the left and right sides are bounded by the edge of the face contour (which may include the ears).
  • the cropped images can be scaled so that the size of the target object in the first image and the size of the target object in the guide image are the same.
  • the cropped image may be rotated so that the postures of the target object in the first image and the target object in the guide image are the same.
  • rotation processing refers to rotating the target object clockwise or counterclockwise by a certain rotation angle with the center point of the target object as the origin. A minimal sketch of such crop/scale/rotate alignment is given after this paragraph.
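For illustration only, the Python sketch below uses matched face key points to crop the face region and to estimate a similarity transform (rotation plus uniform scale) that brings the guide image's target object into roughly the same pose and size as the target object in the first image. The use of OpenCV's estimateAffinePartial2D, the margin value, and the function names are assumptions of this sketch, not part of the embodiment.

```python
import cv2
import numpy as np

def crop_face_region(img: np.ndarray, kpts: np.ndarray, margin: float = 0.1) -> np.ndarray:
    """Crop roughly from the eyebrows down to the chin, bounded left/right by the face
    contour key points, with a small margin reserved around the delineated area."""
    x0, y0 = kpts.min(axis=0)
    x1, y1 = kpts.max(axis=0)
    dx, dy = (x1 - x0) * margin, (y1 - y0) * margin
    h, w = img.shape[:2]
    x0, y0 = max(int(x0 - dx), 0), max(int(y0 - dy), 0)
    x1, y1 = min(int(x1 + dx), w), min(int(y1 + dy), h)
    return img[y0:y1, x0:x1]

def align_guide_to_first(guide_img: np.ndarray,
                         guide_kpts: np.ndarray,   # Nx2 key points detected on the guide image
                         first_kpts: np.ndarray,   # Nx2 corresponding key points on the first image
                         out_size: tuple) -> np.ndarray:
    """Rotate and scale the guide image so that the pose and size of its target object
    roughly match the target object in the first image; out_size is (width, height)."""
    M, _ = cv2.estimateAffinePartial2D(guide_kpts.astype(np.float32),
                                       first_kpts.astype(np.float32))
    if M is None:
        return guide_img  # estimation failed; leave the guide image unchanged
    return cv2.warpAffine(guide_img, M, out_size)
```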
  • a certain area may be appropriately reserved around the circled area.
  • as shown in FIG. 19, the area 1903 is the area of the target object delineated by the electronic device, and the area 1902 is a certain area appropriately reserved around the delineated area; the cropped image is therefore equivalent to area 1902 plus area 1903 in FIG. 19.
  • FIG. 20(a) shows a schematic diagram of a first image and FIG. 20(b) shows a guide image; the target objects are the human faces in the first image and the guide image, respectively.
  • the postures of the human faces in Figure 20(a) and Figure 20(b) differ too much, so the electronic device can perform image processing on the guide image in Figure 20(b).
  • the target object in the guide image can be rotated first, so that the posture of the rotated target object is basically the same as that of the target object in the first image; Figure 21(b) is an illustration of the guide image after rotation. The size of the target object in the guide image can then be scaled, so that the scaled target object is basically the same size as the target object in the first image; FIG. 21(c) shows a schematic diagram of the scaled guide image.
  • the electronic device may obtain, based on the annotation information, the key points within the face range of the first image and the guide image and the pixel coordinates corresponding to each key point. The electronic device can then calculate the sum of squares of the differences of the pixel coordinates of corresponding key points in the face range of the first image and the guide image; if the sum of squares exceeds a preset threshold, the difference between the posture information of the target object in the first image and the target object in the guide image is considered too large, and the electronic device may prompt the user to select the guide image again.
  • similarly, the electronic device may obtain, based on the annotation information, the key points within the left eye range of the first image and the guide image and the pixel coordinates corresponding to each key point in the left eye range. The electronic device can separately calculate the sum of squares of the differences of the pixel coordinates of corresponding key points in the left eye range of the first image and the guide image; if the sum of squares exceeds a preset threshold, the difference between the posture information of the target object in the first image and the target object in the guide image is considered too large, and the electronic device may prompt the user to select the guide image again. A minimal sketch of this comparison is given below.
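A minimal Python sketch of the sum-of-squares comparison described above, assuming the key points of the two images have already been cropped/scaled/rotated into alignment and placed in one-to-one correspondence. The threshold value and function names are assumptions of this sketch.

```python
import numpy as np

POSE_DIFF_THRESHOLD = 1.0e4   # assumed preset threshold; the embodiment does not fix a value

def pose_difference(first_kpts: np.ndarray, guide_kpts: np.ndarray) -> float:
    """Sum of squares of the differences of the pixel coordinates of corresponding key points
    (e.g. all key points within the face range, or only those of a single facial feature)."""
    diff = first_kpts.astype(np.float64) - guide_kpts.astype(np.float64)
    return float((diff ** 2).sum())

def guide_image_usable(first_kpts: np.ndarray, guide_kpts: np.ndarray) -> bool:
    """If the sum of squares exceeds the preset threshold, the pose difference is considered
    too large and the user should be prompted to select the guide image again."""
    return pose_difference(first_kpts, guide_kpts) <= POSE_DIFF_THRESHOLD
```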
  • the feature area of the target object in the first image, and the feature area of the target object in the guide image, can be determined according to the label information of each of the multiple key points.
  • the labeling information may include: feature information, location labeling information, and so on. Therefore, in an embodiment, the characteristic area may be determined according to the characteristic information of each key point.
  • the category of each key point can be determined according to the feature information of each key point; the area formed by key points of the same category is regarded as a feature area, and that category is regarded as the category of the feature area.
  • for example, the key points whose feature information is the feature information of the nose are selected; the categories of these key points are all nose key points, and the area constituted by these key points is regarded as the nose area.
  • the characteristic area may be determined according to the position labeling information of each key point.
  • the labeling position of each key point can be determined according to the position labeling information, and the key points at adjacent positions can be connected. If the resulting shape is similar to the shape of any of the facial features (eyebrows, eyes, nose, mouth, ears), the area formed by the key points at these adjacent positions is determined as a feature area, and the type of the feature area is determined according to the shape. For example, if the shape obtained by connecting key points at adjacent positions is similar to the shape of the nose, the area formed by these key points can be determined as the nose area. A sketch of grouping labeled key points into feature areas is given below.
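For illustration only, the following Python sketch groups key points into feature areas by the category carried in their feature information, returning a bounding box per category; the data layout and function name are assumptions of this sketch.

```python
from collections import defaultdict
import numpy as np

def feature_regions(keypoints):
    """Group labeled key points into feature areas.

    keypoints : iterable of (x, y, category) tuples, where category comes from the
                feature information in the key-point annotation, e.g. "nose", "left_eye".
    Returns a dict mapping each category to the bounding box (x0, y0, x1, y1) of the
    area formed by the key points of that category.
    """
    by_cat = defaultdict(list)
    for x, y, cat in keypoints:
        by_cat[cat].append((x, y))

    regions = {}
    for cat, pts in by_cat.items():
        pts = np.asarray(pts, dtype=np.float32)
        x0, y0 = pts.min(axis=0)
        x1, y1 = pts.max(axis=0)
        regions[cat] = (float(x0), float(y0), float(x1), float(y1))
    return regions
```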
  • the electronic device can determine the degree of difference between the posture information of the two target objects based on a comparison between the shape of the feature region corresponding to the target object in the first image and the shape of the corresponding feature region in the guide image.
  • for example, the electronic device can determine the cheek area, left and right eye areas, nose area, lip area, left and right ear areas, and left and right eyebrow areas in the first image, determine the same areas in the guide image, and compare the shapes of the corresponding feature areas in the first image and the guide image.
  • the comparison results of each of the above regions can be combined to determine the degree of difference between the posture information of the two target objects. For example, when the comparison result of any region shows too large a difference, it is determined that the difference between the posture information of the two target objects is too large. Alternatively, when there is a region whose comparison result shows little difference, the posture information can be considered close for that region; in this case, only the facial features with little difference may subsequently be enhanced.
  • the face alignment algorithms can include, but are not limited to: machine learning regression algorithms, such as the supervised descent method (SDM) and the local binary features (LBF) algorithm; or convolutional neural network (CNN) algorithms, such as the facial landmark detection by deep multi-task learning (TCDCN) algorithm and the 3D dense face alignment (3DDFA) algorithm, etc.
  • if the electronic device determines that the difference between the posture information of the two target objects is too large, it can prompt the user to select the guide image again. As shown in Figure 15(d), suppose the user selects the second image as the guide image of the first image.
  • the electronic device can determine that the posture information of the target object (face) of the first image and the target object (face) of the guide image differ too much, and the interface shown in Fig. 10(d) prompts the user that the posture difference is too large and guides the user to select the guide image again.
  • the electronic device may calculate, for each candidate image, the proximity between its target object and the target object in the first image in terms of posture information, detail definition, etc., and present the result for the user's reference (displayed in the interface or played to the user by voice).
  • the user may select multiple guide images as the guide images of the first image.
  • the electronic device can provide a special library for users to store guide images.
  • the user can take face photos or download face photos from the Internet, some of which are of higher quality (good brightness and high detail definition), and store them in the gallery where guide images are stored. For example, the user takes photos of his or her own face as the target object and stores the high-quality ones (good brightness, high detail definition) in the gallery storing guide images for later use.
  • Guide images are collected and accumulated by users themselves, and guide image photo libraries can be created in different categories, which can be updated and deleted from time to time.
  • the guide image storage area can be a local storage medium of the electronic device, or it can be located on the cloud. For details, refer to FIG. 14(a) and the description of the related embodiments, which will not be repeated here.
  • the selection of the guide image may be automatically completed by the electronic device.
  • the electronic device can select the image whose target object is closest to the posture information of the target object in the first image as the guide image, or consider other criteria, such as the dynamic range (DR) of brightness, detail definition information, and so on. If the electronic device detects multiple candidate guide images whose target objects are close in posture information to the target object in the first image, the guide image may be selected based on the above criteria, or selected randomly, or presented to the user through an interface for the user to choose. In this embodiment, the detail definition of the target object in the guide image is greater than the detail definition of the target object in the first image.
  • the electronic device may use a neural network to enhance the target object in the first image based on the target object in the guide image.
  • the electronic device may first perform pixel registration between the target object in the first image and the target object in the guide image, and determine the second pixel point corresponding to each of the M first pixel points, where each second pixel point is a pixel point included in the target object in the guide image.
  • the electronic device can first divide the first image and the guide image into grids, register the coordinate points of the grid in the first image with the coordinate points of the grid in the guide image, and then use an interpolation algorithm to calculate the correspondence between the pixel points of the target object in the first image and the pixel points of the target object in the guide image.
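The grid-based registration step could, for instance, be sketched with simple block matching on the grid points followed by interpolation of the grid displacements to every pixel. The function names, the grid size and the search radius below are illustrative assumptions, not the patent's implementation.

```python
import numpy as np
import cv2

def grid_displacements(first_gray, guide_gray, n=8, search=16):
    """Estimate a displacement for each grid point of the first image by
    matching a patch around the grid point inside the guide image."""
    h, w = first_gray.shape
    ys = np.linspace(h // (2 * n), h - h // (2 * n), n).astype(int)
    xs = np.linspace(w // (2 * n), w - w // (2 * n), n).astype(int)
    half = min(h, w) // (2 * n)
    disp = np.zeros((n, n, 2), np.float32)
    for i, y in enumerate(ys):
        for j, x in enumerate(xs):
            patch = first_gray[y - half:y + half, x - half:x + half]
            y0, y1 = max(0, y - half - search), min(h, y + half + search)
            x0, x1 = max(0, x - half - search), min(w, x + half + search)
            window = guide_gray[y0:y1, x0:x1]
            res = cv2.matchTemplate(window, patch, cv2.TM_CCOEFF_NORMED)
            _, _, _, loc = cv2.minMaxLoc(res)
            # displacement from the first-image grid point to the matched
            # position in the guide image
            disp[i, j] = (loc[0] + x0 + half - x, loc[1] + y0 + half - y)
    return xs, ys, disp

def dense_displacement(disp, shape):
    """Interpolate grid-point displacements to every pixel (bilinear resize)."""
    return cv2.resize(disp, (shape[1], shape[0]), interpolation=cv2.INTER_LINEAR)
```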
  • the target object in the first image may include M first pixel points
  • the electronic device may perform pixel registration between the two target objects based on a neural network or another registration algorithm to determine, for each of the M first pixel points, a corresponding second pixel point
  • each second pixel point is a pixel point included in the target object in the guide image.
  • for example, the target object in the first image includes a first pixel point A1, and the pixel information around A1 is mathematically analyzed to extract features
  • the pixel information of the target object in the guide image is analyzed in the same way to extract features, and a second pixel point A2 on that target object can be found (as shown in Figure 22(b))
  • the features extracted from the image information surrounding A2 are the ones that best match the features extracted from the image information surrounding the first pixel point A1, so it can be determined that the first pixel point A1 corresponds to the second pixel point A2
  • in this way, the second pixel point included in the target object in the guide image corresponding to each of the M first pixel points can be determined.
  • the electronic device may perform fusion processing on each second pixel point and the corresponding first pixel point in the first image to obtain the target image.
  • the pixel displacement between each second pixel point and the corresponding first pixel point may be determined, and each second pixel point is translated based on the pixel displacement to obtain a registered target object
  • the registered target object may also include N third pixel points, each of which is generated by interpolation from the pixel values of adjacent first pixel points, where N is a positive integer; the registered target object is then fused with the target object in the first image to obtain the target image.
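A hedged sketch of this translation-and-interpolation step: given a dense displacement field (for example the one interpolated from grid points above), the guide pixels can be warped onto the first image's pixel grid with `cv2.remap`, which fills the missing positions by bilinear interpolation of neighbouring values. The variable names are illustrative.

```python
import numpy as np
import cv2

def warp_guide_to_first(guide_img, dense_disp):
    """Translate each guide ("second") pixel by its displacement so that the
    result is aligned with the first image; holes are filled by interpolation."""
    h, w = guide_img.shape[:2]
    grid_x, grid_y = np.meshgrid(np.arange(w, dtype=np.float32),
                                 np.arange(h, dtype=np.float32))
    # For every target position in the first image, sample the guide image at
    # the position shifted by the estimated displacement (inverse mapping).
    map_x = grid_x + dense_disp[..., 0]
    map_y = grid_y + dense_disp[..., 1]
    registered = cv2.remap(guide_img, map_x, map_y,
                           interpolation=cv2.INTER_LINEAR,
                           borderMode=cv2.BORDER_REFLECT)
    return registered
```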
  • the electronic device may obtain the high-frequency information of each second pixel point and the low-frequency information of the corresponding first pixel point, and fuse the low-frequency information with the high-frequency information.
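A minimal sketch of one way such a fusion could be realised, assuming a simple Gaussian blur as the low-pass filter; the patent leaves the exact decomposition open, so the filter and its parameter are assumptions.

```python
import numpy as np
import cv2

def fuse_low_high(first_region, registered_guide_region, sigma=3.0):
    """Keep the low-frequency (illumination/colour) content of the first image
    and add the high-frequency (detail) content of the registered guide."""
    first = first_region.astype(np.float32)
    guide = registered_guide_region.astype(np.float32)
    low_first = cv2.GaussianBlur(first, (0, 0), sigma)   # low-frequency of first image
    low_guide = cv2.GaussianBlur(guide, (0, 0), sigma)
    high_guide = guide - low_guide                       # high-frequency detail of guide
    fused = np.clip(low_first + high_guide, 0, 255)
    return fused.astype(np.uint8)
```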
  • Figure 23(a) shows a schematic diagram of a target object
  • Figure 23(b) shows a schematic diagram of a registered target object.
  • the registered target object and the target object in the first image also have non-overlapping areas (B1 and B2).
  • if the registered target object is directly fused with the target object in the first image, artifacts will appear; that is, when the information of the registered target object is "pasted"/fused onto the target object in the first image, some of it is "pasted"/fused to the wrong position.
  • in this application, pixel fusion processing may be performed only on the area of the registered target object that overlaps the target object in the first image, while for the area of the target object in the first image that does not overlap the registered target object, super-resolution enhancement processing may be performed. That is: the target object in the first image includes a first area, the registered target object includes a second area, the first area overlaps the second area, and the pixel points of the first area and the second area are fused.
  • the target object in the first image further includes a third area that does not overlap the registered target object, and super-resolution enhancement processing is performed on the third area.
  • the pixel fusion method (used for detail enhancement) in this embodiment of the application can be implemented based on an AI network, for example, through training, such that:
  • Encoder 1 is only responsible for encoding the low-frequency information of the picture, and automatically filters out the high-frequency information.
  • Encoder 2 can encode the high-frequency and low-frequency information of the picture, and its corresponding decoder 2 can restore the high- and low-frequency encoding information output by the encoder 2 to the original input image.
  • the way encoder 1 encodes low-frequency information is similar to that of encoder 2
  • so that the low-frequency encoding output by encoder 1 for the image to be enhanced is similar to the low-frequency encoding information output by encoder 2 for the registered guide image.
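A minimal PyTorch-style sketch of such a training scheme, assuming encoder 1 keeps only a low-frequency code, encoder 2 with decoder 2 reconstructs the full image, and the two low-frequency codes are trained to agree. The layer sizes, channel split and loss weights are assumptions for illustration only, not the patent's design.

```python
import torch.nn as nn
import torch.nn.functional as F

def conv_block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True))

encoder1 = nn.Sequential(conv_block(3, 32), conv_block(32, 32))   # low-frequency only
encoder2 = nn.Sequential(conv_block(3, 32), conv_block(32, 64))   # low + high frequency
decoder2 = nn.Sequential(conv_block(64, 32), nn.Conv2d(32, 3, 3, padding=1))

def training_step(first_img, registered_guide, optimizer):
    """first_img: image to be enhanced; registered_guide: registered guide image."""
    code_low = encoder1(first_img)            # low-frequency code of the first image
    code_full = encoder2(registered_guide)    # high+low frequency code of the guide
    recon = decoder2(code_full)               # decoder 2 restores the guide image
    loss_recon = F.l1_loss(recon, registered_guide)
    # encourage encoder 1's code to match the low-frequency part of encoder 2's code
    loss_low = F.l1_loss(code_low, code_full[:, :32])
    loss = loss_recon + 0.1 * loss_low
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```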
  • the edge area of the target object may be smoothed in the first image.
  • the target image includes an enhanced target object
  • the dynamic range (DR) of the brightness of the enhanced target object may be equal to the DR of the brightness of the target object in the guide image
  • or the difference between the DR of the brightness of the enhanced target object and the DR of the brightness of the target object in the guide image is smaller than the difference between the DR of the brightness of the target object in the first image and the DR of the brightness of the target object in the guide image.
  • the target image includes an enhanced target object
  • the difference between the hue of the enhanced target object and the hue of the target object in the guide image is smaller than the difference between the hue of the target object in the first image and the hue of the target object in the guide image.
  • the target image includes an enhanced target object, and the detail definition of the enhanced target object is greater than that of the target object in the first image.
  • the target object in the first image can be directly replaced with the target object in the guide image, that is, the enhanced target object can be directly the target object in the guide image.
  • the application is not limited.
  • the pixel fusion module shown in FIG. 23(d) can be integrated into the decoder, and the codec can be implemented based on a traditional algorithm or based on an AI network.
  • the name of the module in FIG. 23(d) is only an illustration, and does not constitute a limitation to the present application.
  • the pixel fusion module can also be understood as a code fusion module, which is not limited here.
  • the above describes an image enhancement method provided by an embodiment of the present application by taking a human face or a partial region of a human face as the target object. Next, taking the target object as the moon as an example, another image enhancement method is introduced.
  • the electronic device may obtain a first image including the target object as the moon, as shown in FIG. 23(e), which shows a schematic diagram of a first image, and the first image includes the moon.
  • after acquiring the first image, the electronic device can detect that the first image includes the moon. Specifically, the electronic device can detect whether the first image includes the moon based on a trained AI network, which is not limited in this application.
  • the electronic device can obtain the guide image including the target object as the moon.
  • the difference from the above-mentioned enhancement of the face area is that the rotation period of the moon is equal to the period of rotation around the earth, so that it always faces the earth with the same face. Therefore, the posture information (texture feature) of the moon in the first image and the guide image is basically the same when there is no occlusion, so the electronic device does not need to judge whether the posture information is similar, as shown in Figure 23(f) As shown in Fig. 23(f), a schematic diagram of a guide image is shown, and the guide image includes the moon.
  • the electronic device can automatically select the guide image. If the influence of lunar libration is not ignored, the part of the moon's surface that people can see from the ground changes constantly. In this case, the electronic device can infer the actual lunar surface visible that night by synchronizing the date, time and place, and select the guide image from the guide image gallery/album, etc., where the posture information of the moon included in the guide image is close to the posture information of the lunar surface actually visible that night.
  • the electronic device may refer to the scene of the moon in the first image to select the guide image.
  • a guide image including a blood wolf moon may be selected as the guide image of the first image.
  • the electronic device may enhance the moon included in the first image by guiding the moon included in the image to obtain the target image.
  • the electronic device may obtain the area A of the moon in the first image and the area B of the moon in the guide image, and register area A with area B, so that the registered area A and area B basically overlap.
  • the first image is called picture A, the guide image is called picture R, the area of the moon in the first image is called picture a, and the area of the moon in the guide image is called picture r.
  • picture a may be translated first, so that after translation the center (or centroid) of the moon in picture a coincides with the center (or centroid) of the moon in picture r, obtaining picture b.
  • plane coordinates can then be established with the center of picture b as the origin, where the angle between the x-axis (or y-axis) of the coordinates and the horizontal line of the picture is theta; picture b is stretched along the x-axis and y-axis directions of the coordinates, and by selecting an appropriate theta and appropriate zoom factors, the moon area in picture b and the moon area B in picture r can be accurately registered, obtaining picture c.
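A rough sketch of the translation-plus-anisotropic-scaling registration described above: the centroids are aligned first, then a coarse search is performed over the axis angle theta and the scale factors along the rotated axes. The thresholding, the search grid and the OpenCV calls are illustrative assumptions, not the patent's implementation.

```python
import numpy as np
import cv2

def register_moon(pic_a, pic_r):
    """Register the moon region of picture a onto picture r by aligning the
    centroids and searching a rotation angle theta and per-axis scale factors."""
    # binary masks of the moon regions (simple threshold as an assumption)
    mask_a = (cv2.cvtColor(pic_a, cv2.COLOR_BGR2GRAY) > 40).astype(np.uint8)
    mask_r = (cv2.cvtColor(pic_r, cv2.COLOR_BGR2GRAY) > 40).astype(np.uint8)
    ma, mr = cv2.moments(mask_a), cv2.moments(mask_r)
    ca = np.array([ma["m10"] / ma["m00"], ma["m01"] / ma["m00"]])
    cr = np.array([mr["m10"] / mr["m00"], mr["m01"] / mr["m00"]])

    best = (None, -1.0)
    h, w = mask_r.shape
    for theta in np.deg2rad(np.arange(0, 180, 5)):           # axis angle
        for sx in np.linspace(0.8, 1.2, 9):                   # scale along x-axis
            for sy in np.linspace(0.8, 1.2, 9):               # scale along y-axis
                c, s = np.cos(theta), np.sin(theta)
                R = np.array([[c, -s], [s, c]])
                A = R @ np.diag([sx, sy]) @ R.T                # stretch along rotated axes
                t = cr - A @ ca                                # translation aligning centres
                W = np.hstack([A, t[:, None]]).astype(np.float32)
                warped = cv2.warpAffine(mask_a, W, (w, h))
                inter = np.logical_and(warped, mask_r).sum()
                union = np.logical_or(warped, mask_r).sum()
                iou = inter / max(union, 1)
                if iou > best[1]:
                    best = (W, iou)
    return best  # (2x3 affine matrix W, overlap score)
```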
  • when the moon phase in picture A is not a full moon, or the moon is partially occluded, the method applies as long as the moon area in picture c, although an incomplete circle, still retains an arc of a circular outline, so that the moon area can be restored to a full circle according to that outline.
  • if the restoration result basically coincides with the moon area in picture r (also a full circle), the registration is considered successful.
  • the affine transformation matrix W that directly transforms picture A to picture D, and the inverse matrix W⁻¹ of the affine transformation matrix W, can be calculated; W is applied to picture a to obtain picture d, and W⁻¹ is applied to picture r to obtain picture p. Picture d is compared with picture r, and picture p with picture a. If the moon areas differ greatly, the registration has failed, the subsequent guided enhancement is stopped, and the system reports an error (prompting the user that the enhancement failed).
  • the comparison criterion can be that the following conditions are met:
  • Condition 1: the area of the moon region in picture d that lies outside the contour line of the moon region in picture r is less than a certain threshold;
  • Condition 2: the minimum distance between the contour line of the moon region in picture d and the contour line of the moon region in picture r is less than a certain threshold;
  • Condition 3: the area of the moon region in picture p that lies outside the contour line of the moon region in picture a is less than a certain threshold;
  • Condition 4: the minimum distance between the contour line of the moon region in picture p and the contour line of the moon region in picture a is less than a certain threshold;
  • Condition 5: the area of the intersection of the moon regions of picture d and picture r, divided by the area of the moon region of picture r, is greater than a certain threshold;
  • Condition 6: the area of the intersection of the moon regions of picture p and picture a, divided by the area of the moon region of picture a, is greater than a certain threshold.
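The area-based checks among the conditions above (Conditions 1, 3, 5 and 6) could be sketched as follows on binary moon masks; the contour-distance conditions are omitted here, and the threshold values are assumptions.

```python
import numpy as np

def registration_ok(mask_d, mask_r, mask_p, mask_a,
                    max_excess_ratio=0.05, min_overlap_ratio=0.9):
    """Area-based consistency check of the registration.

    mask_d: moon area of picture d, mask_r: of picture r,
    mask_p: moon area of picture p, mask_a: of picture a.
    """
    def excess(x, ref):       # area of x lying outside ref, relative to ref
        return np.logical_and(x, np.logical_not(ref)).sum() / max(ref.sum(), 1)

    def overlap(x, ref):      # intersection area relative to ref (Conditions 5 and 6)
        return np.logical_and(x, ref).sum() / max(ref.sum(), 1)

    return (excess(mask_d, mask_r) < max_excess_ratio and
            excess(mask_p, mask_a) < max_excess_ratio and
            overlap(mask_d, mask_r) > min_overlap_ratio and
            overlap(mask_p, mask_a) > min_overlap_ratio)
```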
  • W⁻¹ can then be applied to picture R to obtain picture P, post-processing is performed, and finally the result is embedded (fused) back into the original photo.
  • pictures P, p, A and a are scaled back to the size of the originally cropped picture A before scaling.
  • picture M = picture p1/255.0;
  • lM = e_protect/10.0 + the sum of the pixel values of picture M, where the numerical protection value e_protect can be taken as 1.0/255.0;
  • picture T = (picture A1/255.0 + l0*(1.0 − picture M) − l0)*Amp/(l0max − l0min + e_protect), where Amp is an adjustable parameter that controls the intensity with which some of the details of the original moon in picture L are inherited; if it is set to 0, they are not inherited;
  • Lmax = the maximum pixel value of picture IMG;
  • picture IMGs = (UINT8)(picture IMG) if Lmax is less than or equal to 255, otherwise (UINT8)(255.0*picture IMG/Lmax), where UINT8 refers to converting the pixel values of the picture to the corresponding data type.
  • picture p1 is blurred, that is, a certain number of up-sampling and blurring operations are performed to obtain picture p1v. Therefore, the output result of the post-processing that retains some detailed features of the original moon in picture L is:
  • when the picture is processed in yuv format, the above result is used only for the y channel; if the picture is in rgb format, it is used for all three channels r, g and b.
  • the color information of the uv channels of the original moon in picture L is UVL
  • the color information of the uv channels of the moon in the guide image is UVR.
  • UVR becomes UVP after being transformed by the matrix W⁻¹
  • the median of UVP (or UVR) is uvp (including one value each for the u channel and the v channel).
  • the UVP information values at the edge of picture P1 are expanded outward, filling from the moon area of picture P1 out to the moon area of picture L.
  • the color information of the uv channels of the moon in picture L at this time is UVf; UVf is fused with the color information of the uv channels outside the moon area of picture L, so that after enhancement it finally replaces the uv channels of picture A and is embedded back
  • the resulting uv channel color information is:
  • post-processing can also be performed, where the post-processing can include, for example, deblurring, background noise reduction, etc., to make the enhancement effect better, as shown in Figure 23(g), which shows a schematic diagram of a target image.
  • An embodiment of the present application provides an image enhancement method, including: acquiring a first image, the first image includes a target object; acquiring a guide image, the guide image includes the target object, and the target object in the guide image The definition of is greater than the definition of the target object in the first image; according to the target object in the guide image, the target object in the first image is enhanced by a neural network to obtain a target image, the target image Including an enhanced target object, the sharpness of the enhanced target object is greater than the sharpness of the target object in the first image.
  • the image to be enhanced (the first image) is enhanced with the guide image through the neural network. Since the information in the guide image is used for reference, compared with traditional face enhancement technology that directly processes the image to be enhanced, no distortion occurs and the enhancement effect is better.
  • FIG. 24 is a schematic diagram of an embodiment of an image enhancement method provided by an embodiment of this application.
  • the image enhancement method provided in this embodiment includes:
  • the server receives a first image sent by an electronic device, where the first image includes a target object.
  • the server acquires a guide image according to the first image, the guide image includes the target object, and the definition of the target object in the guide image is greater than the definition of the target object in the first image.
  • the server enhances the target object in the first image through a neural network according to the target object in the guide image to obtain a target image, where the target image includes an enhanced target object, and the definition of the enhanced target object is greater than the definition of the target object in the first image.
  • the server sends the target image to the electronic device.
  • FIG. 25a is a system architecture diagram of an image enhancement system provided by an embodiment of the application.
  • the image enhancement system 2500 includes an execution device 2510, a training device 2520, a database 2530, a client device 2540 and a data storage system 2550; the execution device 2510 includes a calculation module 2511.
  • the client device 2540 may be the electronic device in the foregoing embodiment, and the execution device may be the electronic device or the server in the foregoing embodiment.
  • the database 2530 stores an image set
  • the training device 2520 generates a target model/rule 2501 for processing the first image and the guide image, and uses the image set in the database to iteratively train the target model/rule 2501 to obtain a mature Target model/rule 2501.
  • the target model/rule 2501 is a convolutional neural network as an example for description.
  • the convolutional neural network obtained by the training device 2520 can be applied to different systems or devices, such as mobile phones, tablets, laptops, VR devices, server data processing systems, and so on.
  • the execution device 2510 can call data, codes, etc. in the data storage system 2550, and can also store data, instructions, etc. in the data storage system 2550.
  • the data storage system 2550 may be placed in the execution device 2510, or the data storage system 2550 may be an external memory relative to the execution device 2510.
  • the calculation module 2511 can perform convolution operations on the first image and the guide image acquired by the client device 2540 through the convolutional neural network; after the first feature plane and the second feature plane are extracted, the two feature planes can be spliced, and based on a convolution operation performed on the first feature plane and the second feature plane, the second pixel point corresponding to each of the M first pixel points is determined.
  • the execution device 2510 and the client device 2540 may be separate and independent devices.
  • the execution device 2510 is equipped with an I/O interface 2512 for data interaction with the client device 2540.
  • the "user" can The first image and the guide image are input to the I/O interface 212 through the client device 2540, and the execution device 210 returns the target image to the client device 2540 through the I/O interface 2512 and provides it to the user.
  • FIG. 25a is only a schematic diagram of the architecture of two image enhancement systems provided by an embodiment of the present invention, and the positional relationship between the devices, devices, modules, etc. shown in the figure does not constitute any limitation.
  • the execution device 2510 may be configured in the client device 2540.
  • the execution device 2510 may be a module used for array image processing in the main processor (Host CPU); the execution device 2510 can also be a graphics processing unit (GPU) or a neural network processor (NPU) in a mobile phone or tablet, where the GPU or NPU is mounted on the main processor as a coprocessor and the main processor assigns tasks.
  • the convolutional neural network is a deep learning architecture.
  • the deep learning architecture refers to learning at multiple levels of abstraction through machine learning algorithms.
  • CNN is a feed-forward artificial neural network. Each neuron in the feed-forward artificial neural network responds to overlapping regions in the input image.
  • the convolutional neural network can logically include an input layer, convolutional layers and neural network layers; however, because the functions of the input layer and the output layer are mainly to facilitate importing and exporting data, as convolutional neural networks develop in practical applications, the concepts of the input layer and the output layer are gradually diluted, and their functions are realized through convolutional layers.
  • high-dimensional convolutional neural networks can also include other types of layers. The details are not limited here.
  • the output of the convolutional layer can be used as the input of the subsequent pooling layer, or as the input of another convolutional layer to continue the convolution operation.
  • the convolution layer may include many convolution kernels, and the convolution kernels may also be called filters or convolution operators, which are used to extract specific information from the input array matrix (that is, the digitized array image).
  • a convolution kernel is essentially a weight matrix, which is usually predefined; the size of each weight matrix is related to the size of each angle image in an array image.
  • during the convolution operation, the weight matrix is usually slid over the input image one pixel at a time (or two pixels at a time, and so on, depending on the value of the stride) in the horizontal direction to complete the extraction of the specific information from the image.
  • the weight values in these weight matrices need to be obtained through a lot of training in practical applications.
  • Each weight matrix formed by the weight values obtained through training can extract information from the input angle image, thereby helping the high-dimensional convolutional neural network to perform correct prediction.
  • the depth dimension of the weight matrix and the depth dimension of the input array image are the same.
  • the weight matrix will extend to the entire depth of the input image. Therefore, convolution with a single weight matrix produces a convolution output with a single depth dimension, but in most cases a single weight matrix is not used; instead, multiple weight matrices are used to extract different features from the image.
  • for example, one weight matrix is used to extract edge information of the image
  • another weight matrix is used to extract a specific color of the image
  • yet another weight matrix is used to blur unwanted noise in the image ...
  • the multiple weight matrices have the same dimensions, the feature planes extracted by these weight matrices also have the same dimensions, and the extracted feature planes with the same dimensions are then combined to form the output of the convolution operation.
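To make the weight-matrix description concrete, here is a minimal NumPy sketch of sliding several kernels ("weight matrices") over a single-channel image and stacking the resulting feature planes; the example kernels, and the choice of valid padding and stride 1, are assumptions for illustration.

```python
import numpy as np

def conv2d_single(image, kernel):
    """Slide one weight matrix over a single-channel image (stride 1, no padding)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1), dtype=np.float32)
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out

# several weight matrices extract different kinds of information
kernels = {
    "edges": np.array([[-1, -1, -1], [0, 0, 0], [1, 1, 1]], dtype=np.float32),
    "blur":  np.ones((3, 3), dtype=np.float32) / 9.0,
}

def feature_planes(image):
    """Stack the feature planes produced by all weight matrices."""
    return np.stack([conv2d_single(image, k) for k in kernels.values()], axis=-1)
```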
  • FIG. 25b is a schematic diagram of the convolution check provided in an embodiment of the application performing a convolution operation on an image.
  • a 6×6 image and a 2×2 convolution are used as an example.
  • s refers to the horizontal coordinate of the image in the angular dimension
  • t refers to the vertical coordinate of the image in the angular dimension
  • x refers to the horizontal direction in an image
  • y refers to the coordinates in the vertical direction in an image
  • m refers to the angle of multiple convolution modules
  • n refers to the vertical coordinate of multiple convolution modules in the angular dimension
  • p refers to the horizontal coordinate in one convolution module
  • q refers to the vertical coordinate in a convolution module.
  • a convolution kernel can be determined from multiple convolution modules.
  • after the processing of the convolutional layers/pooling layers, the high-dimensional convolutional neural network is not yet able to output the required output information, because, as mentioned earlier, the convolutional layers/pooling layers only extract features and reduce the parameters brought by the input image. In order to generate the final output information (the required class information or other related information), the convolutional neural network needs to use the neural network layer to generate one output or a group of outputs of the required number of classes. Therefore, the neural network layer can include multiple hidden layers, and the parameters contained in the hidden layers can be pre-trained according to the relevant training data of the specific task type; for example, the task type can include image recognition, image classification, image super-resolution reconstruction, and so on.
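As a hedged illustration of the structure just described (convolutional layers for feature extraction followed by hidden layers that produce the class outputs), a minimal PyTorch model might look like the following; the layer sizes, input resolution and number of classes are arbitrary assumptions.

```python
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        # convolutional layers: feature extraction and parameter reduction (pooling)
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # "neural network layer": hidden layers that generate the class outputs
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 8 * 8, 64), nn.ReLU(),
            nn.Linear(64, num_classes),
        )

    def forward(self, x):          # expects 3x32x32 inputs in this sketch
        return self.classifier(self.features(x))
```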
  • a convolution operation may be performed on the target object in the first image and the target object in the guide image based on the above-mentioned neural network.
  • the cropped guide image (including the target object ) And the cropped first image (including the target object) are scaled to the same specific size and input into the network.
  • the scaled image size becomes (D+d)*(D+d), where D is the side length of the central area, and d is the side length of the margin area.
  • the central D×D area can be divided evenly into N×N blocks, with the center of each block as a basic grid point, and the width d of the margin area is the maximum allowable pixel displacement value set for the registration.
  • for convenience of network design, d can optionally be set to an integer multiple of D/N (denoted M times D/N); in this way, the cropped first image and the guide image are evenly divided into (2M+N)×(2M+N) blocks.
  • the cropped first image (including the margin area) and the guide image (including the margin area) are convolved based on the convolutional layer sets CNNgG and CNNgL, and the features Gcf and Lcf are extracted respectively; these features can be contour features.
  • Gcf and Lcf are spliced, a convolutional layer set CNNg2 is designed, and the spliced Gcf and Lcf are convolved to obtain GLcf.
  • the convolutional layer sets CNNgs and CNNgc are designed to process GLcf respectively, and output feature GLcfs and feature GLcfc.
  • the side length ratio of feature GLcfs to feature GLcfc is (2M+N):(2M+N−1).
  • the features GLcfs and GLcfc are further processed to obtain a feature GLcfsf with a size of (2M+N)×(2M+N)×2 and a feature GLcfcf with a size of (2M+N−1)×(2M+N−1)×2.
  • the central N×N×2 part of GLcfsf and the central (N−1)×(N−1)×2 part of GLcfcf are taken, that is, the displacements of the "N×N" basic grid points and of the "(N−1)×(N−1)" grid points are output.
  • the meaning of the grid point displacement is the displacement that the coordinate point at the grid point position should have when the guide image is registered onto the image to be enhanced.
  • the displacement of each pixel point can be interpolated from the displacement of the grid point coordinates.
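A sketch of the registration network shape described above: two convolutional branches extract features from the cropped first image and the guide image, the features are spliced and convolved again, and the output is a displacement (dx, dy) for each basic grid point, which is then interpolated to every pixel. The channel counts, layer depths and the use of pooling plus bilinear upsampling are assumptions, not the patent's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GridRegistrationNet(nn.Module):
    def __init__(self, n_grid=8):
        super().__init__()
        self.n_grid = n_grid
        self.branch_first = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                                          nn.Conv2d(16, 16, 3, padding=1), nn.ReLU())
        self.branch_guide = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                                          nn.Conv2d(16, 16, 3, padding=1), nn.ReLU())
        self.merge = nn.Sequential(nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
                                   nn.Conv2d(32, 2, 3, padding=1))   # (dx, dy) per position

    def forward(self, first_img, guide_img):
        f = self.branch_first(first_img)          # features of the image to be enhanced
        g = self.branch_guide(guide_img)          # features of the guide image
        x = torch.cat([f, g], dim=1)              # splice the two feature planes
        dense = self.merge(x)                     # displacement estimate at full resolution
        # displacement of the N x N basic grid points
        grid_disp = F.adaptive_avg_pool2d(dense, (self.n_grid, self.n_grid))
        # interpolate grid-point displacements back to a per-pixel displacement
        per_pixel = F.interpolate(grid_disp, size=first_img.shape[-2:],
                                  mode='bilinear', align_corners=False)
        return grid_disp, per_pixel
```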
  • the above-mentioned grid points may be the geometric centers of the receptive fields of the convolution kernels corresponding to the convolution operations in the guide image, or pixel positions not far from those geometric centers (the interval between a grid point and the geometric center of the receptive field is less than a preset value), which is not limited here.
  • the receptive field may be the area range of the pixel points on the feature map (feature map) output by each layer of the convolutional neural network mapped on the input picture.
  • the calculation range of the receptive field can also extend the first image infinitely outward to ensure that when the boundary of the first image is reached, the range of the receptive field is not cut off by the boundary of the first image.
  • the receptive field may include the edge-padding area of the feature layer in the convolution operation.
  • FIG. 26 is a schematic structural diagram of the electronic device provided by an embodiment of the present application.
  • the electronic device includes:
  • the acquiring module 2601 is configured to acquire a first image, where the first image includes a target object, and to acquire a guide image according to the first image, where the guide image includes the target object, and the definition of the target object in the guide image is greater than the definition of the target object in the first image;
  • the processing module 2602 is configured to enhance the target object in the first image through a neural network according to the target object in the guide image to obtain a target image.
  • the target image includes the enhanced target object, and the definition of the enhanced target object is greater than the definition of the target object in the first image.
  • the degree of difference between the posture of the target object in the guide image and the posture of the target object in the first image is within a preset range.
  • the obtaining module 2601 is specifically used for:
  • the guide image is determined from the at least one second image according to the degree of difference between the posture of the target object in the first image and the posture of each second image in the at least one second image.
  • the electronic device further includes:
  • the display module 2603 is configured to display a first image selection interface, where the first image selection interface includes at least one image;
  • the receiving module 2604 is configured to receive a first image selection instruction, where the first image selection instruction indicates that the at least one second image is selected from at least one image included in the first image selection interface.
  • processing module is specifically used for:
  • At least one third image is determined according to the posture of the target object in the first image, each third image in the at least one third image includes the target object, and the posture of the target object included in each third image is the same as The degree of difference between the postures of the target objects in the first image is within a preset range;
  • the display module is further configured to display a second image selection interface, the second image selection interface including the at least one third image;
  • the receiving module is further configured to receive a second image selection instruction, where the second image selection instruction indicates that the guide image is selected from at least one third image included in the second image selection interface.
  • the target image includes an enhanced target object
  • the guide image features of the enhanced target object are closer to those of the target object in the guide image than the guide image features of the target object in the first image
  • the guide image features include at least one of the following image features: the dynamic range of brightness, hue, contrast, saturation, texture information and contour information.
  • the target image includes an enhanced target object, and the degree of difference between the posture of the enhanced target object and the posture of the target object in the first image is within a preset range.
  • the display module 2603 is also used for displaying a shooting interface of the camera;
  • the acquisition module 2601 is specifically configured to receive a user's shooting operation, and in response to the shooting operation, acquire the first image;
  • or, the display module 2603 is also used for displaying an album interface of the camera, the album interface including a plurality of images;
  • the acquisition module 2601 is specifically configured to receive a third image selection instruction, where the third image selection instruction indicates that the first image is selected from a plurality of images included in the album interface.
  • the obtaining module 2601 is specifically used for receiving the guide image sent by the server.
  • the processing module 2602 is specifically configured to obtain high-frequency information of each second pixel; to obtain low-frequency information of each first pixel, the second pixel is a pixel in the guide image Point, the first pixel point is a pixel point of the first image; the low-frequency information and the corresponding high-frequency information are fused.
  • the processing module 2602 is further configured to perform smoothing processing on the edge area of the target object in the first image after fusing each second pixel with the corresponding first pixel. .
  • the processing module 2602 is further configured to determine the pixel displacement between each second pixel and the corresponding first pixel; and translate each second pixel based on the pixel displacement to obtain the registration After the target audience.
  • processing module 2602 is specifically configured to merge the registered target object with the target object.
  • the target object includes a first area
  • the registered target object includes a second area
  • the first area overlaps the second area
  • the processing module 2602 is specifically configured to fuse the pixel points of the first area and the second area.
  • the target object further includes a third area that is offset from the registered target object, and the processing module 2602 is further configured to perform super-resolution enhancement processing on the third area .
  • the registered target object further includes N third pixels, each of the third pixels is generated by interpolation according to the pixel value of the adjacent first pixel, where N is Positive integer.
  • the processing module 2602 is specifically configured to perform a convolution operation on the first image to obtain a first feature plane; perform a convolution operation on the guide image to obtain a second feature plane; and perform a convolution operation on the first feature plane and the second feature plane to determine the second pixel point corresponding to each of the M first pixel points, where the interval between the coordinate position of each grid point and the geometric center of the receptive field of the convolution kernel corresponding to one convolution operation is smaller than a preset value.
  • FIG. 27 is a schematic structural diagram of a server provided by an embodiment of the present application.
  • the server includes:
  • the receiving module 2701 is configured to receive a first image sent by an electronic device, the first image includes a target object; to obtain a guide image, the guide image includes the target object, and the definition of the target object in the guide image is greater than The sharpness of the target object in the first image;
  • the processing module 2702 is configured to enhance the target object in the first image through a neural network according to the target object in the guide image to obtain a target image.
  • the target image includes the enhanced target object, and the definition of the enhanced target object is greater than the definition of the target object in the first image;
  • the sending module 2703 is configured to send the target image to the electronic device.
  • the degree of difference between the posture of the target object in the guide image and the posture of the target object in the first image is within a preset range.
  • the receiving module 2701 is specifically configured to:
  • the guide image is determined from the at least one second image according to the degree of difference between the posture of the target object in the first image and the posture of each second image in the at least one second image.
  • the target image includes an enhanced target object
  • the guide image features of the enhanced target object are closer to those of the target object in the guide image than the guide image features of the target object in the first image
  • the guide image features include at least one of the following image features: the dynamic range of brightness, hue, contrast, saturation, texture information and contour information.
  • the target image includes an enhanced target object, and the degree of difference between the posture of the enhanced target object and the posture of the target object in the first image is within a preset range.
  • FIG. 28 is a schematic structural diagram of an electronic device provided by an embodiment of this application.
  • the electronic device 2800 may specifically be a virtual reality (VR) device, a mobile phone, a tablet, a laptop, a smart wearable device, etc., which is not limited here.
  • the electronic device 2800 includes: a receiver 2801, a transmitter 2802, a processor 2803, and a memory 2804 (the number of processors 2803 in the electronic device 2800 may be one or more, and one processor is taken as an example in FIG. 28) , Where the processor 2803 may include an application processor 28031 and a communication processor 28032.
  • the receiver 2801, the transmitter 2802, the processor 2803, and the memory 2804 may be connected by a bus or other means.
  • the memory 2804 may include a read-only memory and a random access memory, and provides instructions and data to the processor 2803. A part of the memory 2804 may also include a non-volatile random access memory (NVRAM).
  • the memory 2804 stores a processor and operating instructions, executable modules or data structures, or a subset of them, or an extended set of them.
  • the operating instructions may include various operating instructions for implementing various operations.
  • the processor 2803 controls the operation of the electronic device.
  • the various components of the electronic device are coupled together through a bus system, where the bus system may include a power bus, a control bus, and a status signal bus in addition to a data bus.
  • various buses are referred to as bus systems in the figure.
  • the methods disclosed in the foregoing embodiments of the present application may be applied to the processor 2803 or implemented by the processor 2803.
  • the processor 2803 may be an integrated circuit chip with signal processing capabilities. In the implementation process, each step of the above method can be completed by an integrated logic circuit of hardware in the processor 2803 or instructions in the form of software.
  • the aforementioned processor 2803 may be a general-purpose processor, a digital signal processor (DSP), a microprocessor or a microcontroller, and may further include an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the processor 2803 can implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present application.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application can be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a mature storage medium in the field, such as random access memory, flash memory, read-only memory, programmable read-only memory, or electrically erasable programmable memory, registers.
  • the storage medium is located in the memory 2804, and the processor 2803 reads the information in the memory 2804, and completes the steps of the foregoing method in combination with its hardware.
  • the receiver 2801 can be used to receive input digital or character information, and generate signal input related to the related settings and function control of the electronic device.
  • the transmitter 2802 can be used to output digital or character information through the first interface; the transmitter 2802 can also be used to send instructions to the disk group through the first interface to modify the data in the disk group; the transmitter 2802 can also include display devices such as a display .
  • the processor 2803 is configured to execute processing-related steps in the image enhancement method in the foregoing embodiment.
  • FIG. 29 is a schematic structural diagram of the server provided by the embodiment of the present application.
  • the server may differ considerably depending on configuration or performance, and may include one or more central processing units (CPU) 2922 (for example, one or more processors), memory 2932, and one or more storage media 2930 (for example, one or more mass storage devices) for storing application programs 2942 or data 2944.
  • the memory 2932 and the storage medium 2930 may be short-term storage or persistent storage.
  • the program stored in the storage medium 2930 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the training device.
  • the central processing unit 2922 may be configured to communicate with the storage medium 2930, and execute a series of instruction operations in the storage medium 2930 on the server 2900.
  • the server 2900 may also include one or more power supplies 2926, one or more wired or wireless network interfaces 2950, one or more input and output interfaces 2958, and/or one or more operating systems 2941, such as Windows ServerTM, Mac OS XTM, UnixTM, LinuxTM, FreeBSDTM, etc.
  • the central processing unit 2922 is configured to execute the image enhancement method described in the foregoing embodiment.
  • the embodiment of the present application also provides a computer program product which, when run on a computer, causes the computer to execute the steps of the image enhancement method.
  • An embodiment of the present application also provides a computer-readable storage medium that stores a program for signal processing; when the program runs on a computer, the computer is caused to execute the steps of the image enhancement method described in the foregoing embodiments.
  • the execution device and the training device provided in the embodiments of the present application may specifically be a chip.
  • the chip includes a processing unit and a communication unit.
  • the processing unit may be a processor, for example, and the communication unit may be an input/output interface, a pin, a circuit, etc.
  • the processing unit can execute the computer-executable instructions stored in the storage unit, so that the chip in the execution device executes the image enhancement method described in the foregoing embodiment, or causes the chip in the training device to execute the image enhancement method described in the foregoing embodiment.
  • the storage unit is a storage unit in the chip, such as a register, a cache, etc.
  • the storage unit may also be a storage unit located outside the chip in the wireless access device, such as Read-only memory (ROM) or other types of static storage devices that can store static information and instructions, random access memory (RAM), etc.
  • FIG. 30 is a schematic structural diagram of a chip provided by an embodiment of the application.
  • the chip may be implemented as a neural network processor NPU 300, which is mounted on the main CPU (Host CPU) as a coprocessor, and the Host CPU assigns tasks.
  • the core part of the NPU is the arithmetic circuit 3003.
  • the arithmetic circuit 3003 is controlled by the controller 3004 to extract matrix data from the memory and perform multiplication operations.
  • the arithmetic circuit 3003 includes multiple processing units (Process Engine, PE). In some implementations, the arithmetic circuit 3003 is a two-dimensional systolic array. The arithmetic circuit 3003 may also be a one-dimensional systolic array or other electronic circuits capable of performing mathematical operations such as multiplication and addition. In some implementations, the arithmetic circuit 3003 is a general-purpose matrix processor.
  • the arithmetic circuit fetches the corresponding data of matrix B from the weight memory 3002 and caches it on each PE in the arithmetic circuit.
  • the arithmetic circuit fetches matrix A data and matrix B from the input memory 3001 to perform matrix operations, and the partial result or final result of the obtained matrix is stored in an accumulator 3008.
  • the unified memory 3006 is used to store input data and output data.
  • the weight data is transferred directly to the weight memory 3002 through the direct memory access controller (DMAC) 3005.
  • the input data is also transferred to the unified memory 3006 through the DMAC.
  • the BIU is the Bus Interface Unit, that is, the bus interface unit 3010, which is used for the interaction between the AXI bus and the DMAC and the instruction fetch buffer (IFB) 3009.
  • the bus interface unit 3010 (Bus Interface Unit, BIU for short) is used for the instruction fetch memory 3009 to obtain instructions from an external memory, and is also used for the storage unit access controller 3005 to obtain the original data of the input matrix A or the weight matrix B from the external memory.
  • the DMAC is mainly used to transfer the input data in the external memory DDR to the unified memory 3006 or to transfer the weight data to the weight memory 3002 or to transfer the input data to the input memory 3001.
  • the vector calculation unit 3007 includes multiple arithmetic processing units, and further processes the output of the arithmetic circuit if necessary, such as vector multiplication, vector addition, exponential operation, logarithmic operation, size comparison and so on.
  • the vector calculation unit 3007 can store the processed output vector to the unified memory 3006.
  • the vector calculation unit 3007 can apply a linear function and/or a non-linear function to the output of the arithmetic circuit 3003, for example to linearly interpolate the feature planes extracted by the convolutional layers, or to apply it to a vector of accumulated values to generate activation values.
  • the vector calculation unit 3007 generates normalized values, pixel-level summed values, or both.
  • the processed output vector can be used as an activation input to the arithmetic circuit 3003, for example for use in a subsequent layer in a neural network.
  • the instruction fetch buffer 3009 connected to the controller 3004 is used to store instructions used by the controller 3004;
  • the unified memory 3006, the input memory 3001, the weight memory 3002, and the instruction fetch memory 3009 are all On-Chip memories.
  • the external memory is private to the NPU hardware architecture.
  • the processor mentioned in any of the above may be a general-purpose central processing unit, a microprocessor, an ASIC, or one or more integrated circuits used to control the execution of the program of the above image enhancement method.
  • the device embodiments described above are only illustrative; the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units.
  • the physical unit can be located in one place or distributed across multiple network units. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the connection relationship between the modules indicates that they have a communication connection between them, which can be specifically implemented as one or more communication buses or signal lines.
  • this application can be implemented by software plus the necessary general-purpose hardware; of course, it can also be implemented by dedicated hardware, including dedicated integrated circuits, dedicated CPUs, dedicated memory, dedicated components and so on. In general, all functions completed by a computer program can easily be implemented with corresponding hardware, and the specific hardware structures used to achieve the same function can also be diverse, such as analog circuits, digital circuits or dedicated circuits. However, for this application, a software program implementation is the better implementation in most cases. Based on this understanding, the technical solution of this application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product.
  • the computer software product is stored in a readable storage medium, such as a floppy disk, U disk, mobile hard disk, ROM, RAM, magnetic disk or optical disc of a computer, and includes several instructions to make a computer device (which can be a personal computer, training device, network device, etc.) execute the methods described in the various embodiments of this application.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • the computer instructions may be transmitted from a website, computer, training device, or data center to another website, computer, training device, or data center through wired (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless (such as infrared, radio, or microwave) means.
  • the computer-readable storage medium may be any available medium that can be stored by a computer or a data storage device such as a training device or a data center integrated with one or more available media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, and a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).

Abstract

An embodiment of the present application provides an image enhancement method, including: acquiring a first image, where the first image includes a target object; acquiring a guide image according to the first image, where the guide image includes the target object and the definition of the target object in the guide image is greater than the definition of the target object in the first image; and enhancing the target object in the first image through a neural network according to the target object in the guide image to obtain a target image, where the target image includes an enhanced target object and the definition of the enhanced target object is greater than the definition of the target object in the first image. In this application, the enhanced target image is not distorted and the enhancement effect is good.

Description

An image enhancement method and apparatus
This application claims priority to Chinese patent application No. 201911026078.X, entitled "An image enhancement method and apparatus", filed with the China National Intellectual Property Administration on October 25, 2019, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of electronic technology, and in particular to an image enhancement method and apparatus.
Background
Images taken by users often have poor quality due to external factors (for example, low brightness).
In the prior art, images are often enhanced based on super-resolution processing. However, in complex scenes (for example, when enhancing an image that includes a human face), because the image itself contains many details, the enhanced image is very prone to distortion, which reduces the enhancement effect.
Summary of the Invention
The embodiments of the present application provide an image enhancement method and apparatus, in which a guide image is used to enhance the image to be enhanced (the first image) through a neural network. Since the information in the guide image is used for reference, compared with traditional face enhancement technology that directly processes the image to be enhanced, no distortion occurs and the enhancement effect is better.
In a first aspect, an embodiment of the present application provides an image enhancement method, the method including:
acquiring a first image, where the first image includes a target object;
acquiring a guide image according to the first image, where the guide image includes the target object, and the definition of the target object in the guide image is greater than the definition of the target object in the first image;
enhancing the target object in the first image through a neural network according to the target object in the guide image to obtain a target image, where the target image includes an enhanced target object, and the definition of the enhanced target object is greater than the definition of the target object in the first image.
This application provides an image enhancement method, including: acquiring a first image, where the first image includes a target object; acquiring a guide image according to the first image, where the guide image includes the target object, and the definition of the target object in the guide image is greater than the definition of the target object in the first image; and enhancing the target object in the first image through a neural network according to the target object in the guide image to obtain a target image, where the target image includes an enhanced target object, and the definition of the enhanced target object is greater than the definition of the target object in the first image. In this way, the guide image is used to enhance the image to be enhanced (the first image) through the neural network; since the information in the guide image is used for reference, compared with traditional face enhancement technology that directly processes the image to be enhanced, no distortion occurs and the enhancement effect is better.
In an optional design of the first aspect, the target object includes at least one of the following objects: the face, eyes, ears, nose, eyebrows or mouth of the same person.
In an optional design of the first aspect, the degree of difference between the posture of the target object in the guide image and the posture of the target object in the first image is within a preset range.
In an optional design of the first aspect, the acquiring a guide image according to the first image includes:
determining the guide image from at least one second image according to the degree of difference between the posture of the target object in the first image and the posture of each second image in the at least one second image.
In an optional design of the first aspect, before the determining according to the degree of difference between the posture of the target object in the first image and the posture of each second image in the at least one second image, the method further includes:
displaying a first image selection interface, where the first image selection interface includes at least one image;
receiving a first image selection instruction, where the first image selection instruction indicates that the at least one second image is selected from the at least one image included in the first image selection interface.
In an optional design of the first aspect, the acquiring a guide image according to the first image includes:
determining at least one third image according to the posture of the target object in the first image, where each third image in the at least one third image includes the target object, and the degree of difference between the posture of the target object included in each third image and the posture of the target object in the first image is within a preset range;
displaying a second image selection interface, where the second image selection interface includes the at least one third image;
receiving a second image selection instruction, where the second image selection instruction indicates that the guide image is selected from the at least one third image included in the second image selection interface.
In an optional design of the first aspect, the degree of difference between the posture of the target object in the guide image and the posture of the target object in the first image being within a preset range includes:
the degree of difference between the contour shape of the target object in the guide image and the contour shape of the target object in the first image is within a preset range.
In an optional design of the first aspect, the definition of the target object in the guide image is greater than the definition of the target object in the first image.
In an optional design of the first aspect, the target image includes an enhanced target object, and the guide image features of the enhanced target object are closer to those of the target object in the guide image than the guide image features of the target object in the first image, where the guide image features include at least one of the following image features:
the dynamic range of brightness, hue, contrast, saturation, texture information and contour information.
In an optional design of the first aspect, the target image includes an enhanced target object, and the definition of the enhanced target object is greater than the definition of the target object in the first image.
In an optional design of the first aspect, the target image includes an enhanced target object, and the degree of difference between the posture of the enhanced target object and the posture of the target object in the first image is within a preset range.
In an optional design of the first aspect, the acquiring a first image includes:
displaying a shooting interface of a camera;
receiving a shooting operation of a user, and acquiring the first image in response to the shooting operation; or
displaying an album interface of the camera, where the album interface includes a plurality of images;
receiving a third image selection instruction, where the third image selection instruction indicates that the first image is selected from the plurality of images included in the album interface.
In an optional design of the first aspect, the acquiring a guide image includes:
receiving the guide image sent by a server.
In a second aspect, this application provides an image enhancement apparatus, applied to an electronic device or a server, the image enhancement apparatus including:
an acquiring module, configured to acquire a first image, where the first image includes a target object, and to acquire a guide image according to the first image, where the guide image includes the target object, and the definition of the target object in the guide image is greater than the definition of the target object in the first image;
a processing module, configured to enhance the target object in the first image through a neural network according to the target object in the guide image to obtain a target image, where the target image includes an enhanced target object, and the definition of the enhanced target object is greater than the definition of the target object in the first image.
In an optional design of the second aspect, the degree of difference between the posture of the target object in the guide image and the posture of the target object in the first image is within a preset range.
In an optional design of the second aspect, the acquiring module is specifically configured to:
determine the guide image from at least one second image according to the degree of difference between the posture of the target object in the first image and the posture of each second image in the at least one second image.
In an optional design of the second aspect, the apparatus further includes:
a display module, configured to display a first image selection interface, where the first image selection interface includes at least one image;
a receiving module, configured to receive a first image selection instruction, where the first image selection instruction indicates that the at least one second image is selected from the at least one image included in the first image selection interface.
In an optional design of the second aspect, the processing module is specifically configured to:
determine at least one third image according to the posture of the target object in the first image, where each third image in the at least one third image includes the target object, and the degree of difference between the posture of the target object included in each third image and the posture of the target object in the first image is within a preset range;
the display module is further configured to display a second image selection interface, where the second image selection interface includes the at least one third image;
the receiving module is further configured to receive a second image selection instruction, where the second image selection instruction indicates that the guide image is selected from the at least one third image included in the second image selection interface.
In an optional design of the second aspect, the target image includes an enhanced target object, and the guide image features of the enhanced target object are closer to those of the target object in the guide image than the guide image features of the target object in the first image, where the guide image features include at least one of the following image features:
the dynamic range of brightness, hue, contrast, saturation, texture information and contour information.
In an optional design of the second aspect, the target image includes an enhanced target object, and the degree of difference between the posture of the enhanced target object and the posture of the target object in the first image is within a preset range.
In an optional design of the second aspect, the display module is further configured to:
display a shooting interface of the camera;
the acquiring module is specifically configured to receive a shooting operation of a user, and acquire the first image in response to the shooting operation;
or, the display module is further configured to:
display an album interface of the camera, where the album interface includes a plurality of images;
the acquiring module is specifically configured to receive a third image selection instruction, where the third image selection instruction indicates that the first image is selected from the plurality of images included in the album interface.
In an optional design of the second aspect, the acquiring module is specifically configured to:
receive the guide image sent by the server.
第三方面,本申请提供了一种图像增强方法,包括:
接收电子设备发送的第一图像,所述第一图像包括目标对象;
根据所述第一图像获取引导图像,所述引导图像包括所述目标对象,所述引导图像中的目标对象的清晰度大于所述第一图像中的目标对象的清晰度;
根据所述引导图像中的目标对象对所述第一图像中的目标对象通过神经网络进行增强,得到目标图像,所述目标图像包括增强后的目标对象,所述增强后的目标对象的清晰度大于所述第一图像中的目标对象的清晰度。
向所述电子设备发送所述目标图像。
在第三方面的一种可选设计中,所述目标对象至少包括如下对象的一种:同一个人的人脸、眼、耳、鼻、眉或口。
在第三方面的一种可选设计中,所述引导图像中的目标对象的姿态与所述第一图像中的目标对象的姿态的差异度在预设范围内。
在第三方面的一种可选设计中,所述目标图像包括增强后的目标对象,所述增强后的目标对象的引导图像特征比所述第一图像中的目标对象接近于所述引导图像中的目标对象,其中,所述引导图像特征至少包括如下一种图像特征:
亮度的动态范围、色调、对比度、饱和度、纹理信息和轮廓信息。
在第三方面的一种可选设计中,所述目标图像包括增强后的目标对象,所述增强后的目标对象的的姿态与所述第一图像中的目标对象的姿态的差异度在预设范围内。
第四方面,本申请提供了一种服务器,包括:
接收模块,用于接收电子设备发送的第一图像,所述第一图像包括目标对象;根据所述第一图像获取引导图像,所述引导图像包括所述目标对象,所述引导图像中的目标对象的清晰度大于所述第一图像中的目标对象的清晰度;
处理模块,用于根据所述引导图像中的目标对象对所述第一图像中的目标对象通过神经网络进行增强,得到目标图像,所述目标图像包括增强后的目标对象,所述增强后的目标对象的清晰度大于所述第一图像中的目标对象的清晰度。
发送模块,用于向所述电子设备发送所述目标图像。
在第四方面的一种可选设计中,所述目标对象至少包括如下对象的一种:同一个人的人脸、眼、耳、鼻、眉或口。
在第四方面的一种可选设计中,所述引导图像中的目标对象的姿态与所述第一图像中的目标对象的姿态的差异度在预设范围内。
在第四方面的一种可选设计中,所述目标图像包括增强后的目标对象,所述增强后的目标对象的引导图像特征比所述第一图像中的目标对象接近于所述引导图像中的目标对象,其中,所述引导图像特征至少包括如下一种图像特征:
亮度的动态范围、色调、对比度、饱和度、纹理信息和轮廓信息。
在第四方面的一种可选设计中，所述目标图像包括增强后的目标对象，所述增强后的目标对象的姿态与所述第一图像中的目标对象的姿态的差异度在预设范围内。
第五方面,本申请实施例提供了一种图像增强方法,所述方法包括:
获取第一图像,所述第一图像包括目标对象;
根据所述第一图像获取引导图像,所述引导图像包括所述目标对象,所述引导图像中的目标对象的清晰度大于所述第一图像中的目标对象的清晰度;
根据所述引导图像中的目标对象对所述第一图像中的目标对象进行增强,得到目标图像,所述目标图像包括增强后的目标对象,所述增强后的目标对象的清晰度大于所述第一图像中的目标对象的清晰度。
在第五方面的一种可选设计中,所述目标对象为月亮。
第六方面,本申请提供了一种电子设备,包括:一个或多个处理器;一个或多个存储器;多个应用程序;以及一个或多个程序,其中所述一个或多个程序被存储在所述存储器中,当所述一个或者多个程序被所述处理器执行时,使得所述电子设备执行上述第一方面及第一方面的可能实现方式中任一项所述的步骤。
第七方面,本申请提供了一种服务器,包括:一个或多个处理器;一个或多个存储器;以及一个或多个程序,其中所述一个或多个程序被存储在所述存储器中,当所述一个或者多个程序被所述处理器执行时,使得所述服务器执行上述第一方面、第三方面、第一方面的可能实现方式及第三方面的可能实现方式中任一项所述的步骤。
第八方面，本申请提供了一种装置，该装置包含在电子设备中，该装置具有实现上述第一方面中任一项电子设备行为的功能。功能可以通过硬件实现，也可以通过硬件执行相应的软件实现。硬件或软件包括一个或多个与上述功能相对应的模块或单元。例如，显示模块、获取模块、处理模块等。
第九方面，本申请提供了一种电子设备，包括：触摸显示屏，其中，触摸显示屏包括触敏表面和显示器；摄像头；一个或多个处理器；存储器；多个应用程序；以及一个或多个计算机程序。其中，一个或多个计算机程序被存储在存储器中，一个或多个计算机程序包括指令。当指令被电子设备执行时，使得电子设备执行上述第一方面中任一项可能的实现中的图像增强方法。
第十方面,本申请提供了一种计算机存储介质,包括计算机指令,当计算机指令在电子设备或服务器上运行时,使得电子设备执行上述任一方面任一项可能的图像增强方法。
第十一方面,本申请提供了一种计算机程序产品,当计算机程序产品在电子设备或服务器上运行时,使得电子设备执行上述任一方面任一项可能的图像增强方法。
本申请提供了一种图像增强方法,包括:获取第一图像,所述第一图像包括目标对象;根据所述第一图像获取引导图像,所述引导图像包括所述目标对象,所述引导图像中的目标对象的清晰度大于所述第一图像中的目标对象的清晰度;根据所述引导图像中的目标对象对所述第一图像中的目标对象通过神经网络进行增强,得到目标图像,所述目标图像包括增强后的目标对象,所述增强后的目标对象的清晰度大于所述第一图像中的目标对象的清晰度。通过上述方式,通过神经网络将引导图像对待增强图(第一图像)进行增强,由 于借鉴了引导图像中的信息,相比传统人脸增强技术中直接对待增强图像进行处理,不会出现失真的情况,增强效果更好。
附图说明
图1为本申请实施例的一种应用场景架构示意图;
图2为一种电子设备的结构示意图;
图3a是本申请实施例的电子设备的软件结构框图;
图3b为本申请实施例提供的一种图像增强方法的实施例示意图;
图4(a)是本申请实施例提供的一例图像增强处理界面的示意图;
图4(b)是本申请实施例提供的一例图像增强处理界面的示意图;
图4(c)是本申请实施例提供的一例图像增强处理界面的示意图;
图4(d)是本申请实施例提供的一例图像增强处理界面的示意图;
图5(a)是本申请实施例提供的一例图像增强处理界面的示意图;
图5(b)是本申请实施例提供的一例图像增强处理界面的示意图;
图5(c)是本申请实施例提供的一例图像增强处理界面的示意图;
图6(a)是本申请实施例提供的一例图像增强处理界面的示意图;
图6(b)是本申请实施例提供的一例图像增强处理界面的示意图;
图6(c)是本申请实施例提供的一例图像增强处理界面的示意图;
图7(a)是本申请实施例提供的一例图像增强处理界面的示意图;
图7(b)是本申请实施例提供的一例图像增强处理界面的示意图;
图7(c)是本申请实施例提供的一例图像增强处理界面的示意图;
图8是本申请实施例提供的一例图像增强处理界面的示意图;
图9是本申请实施例提供的一例图像增强处理界面的示意图;
图10(a)是本申请实施例提供的一例图像增强处理界面的示意图;
图10(b)是本申请实施例提供的一例图像增强处理界面的示意图;
图10(c)是本申请实施例提供的一例图像增强处理界面的示意图;
图10(d)是本申请实施例提供的一例图像增强处理界面的示意图;
图10(e)是本申请实施例提供的一例图像增强处理界面的示意图;
图10(f)是本申请实施例提供的一例图像增强处理界面的示意图;
图11(a)是本申请实施例提供的一例图像增强处理界面的示意图;
图11(b)是本申请实施例提供的一例图像增强处理界面的示意图;
图11(c)是本申请实施例提供的一例图像增强处理界面的示意图;
图12(a)是本申请实施例提供的一例图像增强处理界面的示意图;
图12(b)是本申请实施例提供的一例图像增强处理界面的示意图;
图12(c)是本申请实施例提供的一例图像增强处理界面的示意图;
图12(d)是本申请实施例提供的一例图像增强处理界面的示意图;
图12(e)是本申请实施例提供的一例图像增强处理界面的示意图;
图13(a)是本申请实施例提供的一例图像增强处理界面的示意图;
图13(b)是本申请实施例提供的一例图像增强处理界面的示意图;
图13(c)是本申请实施例提供的一例图像增强处理界面的示意图;
图14(a)是本申请实施例提供的一例图像增强处理界面的示意图;
图14(b)是本申请实施例提供的一例图像增强处理界面的示意图;
图14(c)是本申请实施例提供的一例图像增强处理界面的示意图;
图15(a)是本申请实施例提供的一例图像增强处理界面的示意图;
图15(b)是本申请实施例提供的一例图像增强处理界面的示意图;
图15(c)是本申请实施例提供的一例图像增强处理界面的示意图;
图15(d)是本申请实施例提供的一例图像增强处理界面的示意图;
图16(a)是本申请实施例提供的一例图像增强处理界面的示意图;
图16(b)是本申请实施例提供的一例图像增强处理界面的示意图;
图17是本申请实施例提供的一例图像增强处理界面的示意图;
图18是本申请实施例提供的一例图像增强处理界面的示意图;
图19为本申请实施例提供的一种图像的示意图;
图20(a)为一种第一图像的示意图;
图20(b)为一种引导图像的示意图;
图21(a)为一种引导图像的示意图;
图21(b)为一种引导图像的示意图;
图21(c)为一种人脸区域识别的示意图;
图22(a)为一种目标对象的示意图;
图22(b)为一种目标对象的示意图;
图23(a)为一种目标对象的示意图;
图23(b)为一种配准后的目标对象的示意图;
图23(c)为一种目标对象和配准后的目标对象的比对示意图;
图23(d)为一种图像增强的示意图;
图23(e)为一种图像增强的示意图；
图23(f)为一种图像增强的示意图;
图23(g)为一种图像增强的示意图;
图24为本申请实施例提供的一种图像增强方法的实施例示意图;
图25a为本申请实施例提供的图像增强系统的一种系统架构图;
图25b为本申请实施例提供的卷积核对图像执行卷积操作的一种示意图;
图25c为本申请实施例提供的一种神经网络的一种示意图;
图26为本申请实施例提供的电子设备的一种结构示意图;
图27为本申请实施例提供的服务器的一种结构示意图;
图28为本申请实施例提供的电子设备的一种结构示意图;
图29是本申请实施例提供的服务器的一种结构示意图;
图30为本申请实施例提供的芯片的一种结构示意图。
具体实施方式
本申请实施例提供了一种图像增强方法、电子设备及服务器，通过神经网络利用引导图像对待增强图（第一图像）进行增强，由于借鉴了引导图像中的信息，相比传统人脸增强技术中直接对待增强图像进行处理，不会出现失真的情况，增强效果更好。
下面结合附图,对本申请的实施例进行描述。本领域普通技术人员可知,随着技术的发展和新场景的出现,本申请实施例提供的技术方案对于类似的技术问题,同样适用。
本申请的说明书和权利要求书及上述附图中的术语“第一”、“第二”等是用于区别类似的对象,而不必用于描述特定的顺序或先后次序。应该理解这样使用的术语在适当情况下可以互换,这仅仅是描述本申请的实施例中对相同属性的对象在描述时所采用的区分方式。此外,术语“包括”和“具有”以及他们的任何变形,意图在于覆盖不排他的包含,以便包含一系列单元的过程、方法、系统、产品或设备不必限于那些单元,而是可包括没有清楚地列出的或对于这些过程、方法、产品或设备固有的其它单元。
参照图1,图1为本申请实施例的一种应用场景架构示意图。如图1中示出的那样,本申请实施例提供的图像增强方法可以基于电子设备101来实现,且本申请实施例提供的图像增强方法也可以基于电子设备101和服务器102的交互来实现。
本申请实施例提供的图像增强方法可以应用于手机、平板电脑、可穿戴设备、车载设备、增强现实(augmented reality,AR)/虚拟现实(virtual reality,VR)设备、笔记本电脑、超级移动个人计算机(ultra-mobile personal computer,UMPC)、上网本、个人数字助理(personal digital assistant,PDA)等电子设备上,本申请实施例对电子设备的具体类型不作任何限制。
示例性的,图2示出了电子设备200的结构示意图。电子设备200可以包括处理器210,外部存储器接口220,内部存储器221,通用串行总线(universal serial bus,USB)接口230,充电管理模块240,电源管理模块241,电池242,天线1,天线2,移动通信模块250,无线通信模块260,音频模块270,扬声器270A,受话器270B,麦克风270C,耳机接口270D,传感器模块280,按键290,马达291,指示器292,摄像头293,显示屏294,以及用户标识模块(subscriber identification module,SIM)卡接口295等。其中传感器模块280可以包括压力传感器280A,陀螺仪传感器280B,气压传感器280C,磁传感器280D,加速度传感器280E,距离传感器280F,接近光传感器280G,指纹传感器280H,温度传感器280J,触摸传感器180K,环境光传感器180L,骨传导传感器180M等。
可以理解的是,本申请实施例示意的结构并不构成对电子设备200的具体限定。在本申请另一些实施例中,电子设备200可以包括比图示更多或更少的部件,或者组合某些部件,或者拆分某些部件,或者不同的部件布置。图示的部件可以以硬件,软件或软件和硬件的组合实现。
处理器210可以包括一个或多个处理单元,例如:处理器210可以包括应用处理器(application processor,AP),调制解调处理器,图形处理器(graphics processing unit, GPU),图像信号处理器(image signal processor,ISP),控制器,存储器,视频编解码器,数字信号处理器(digital signal processor,DSP),基带处理器,和/或神经网络处理器(neural-network processing unit,NPU)等。其中,不同的处理单元可以是独立的器件,也可以集成在一个或多个处理器中。
其中,控制器可以是电子设备200的神经中枢和指挥中心。控制器可以根据指令操作码和时序信号,产生操作控制信号,完成取指令和执行指令的控制。
处理器210中还可以设置存储器,用于存储指令和数据。在一些实施例中,处理器210中的存储器为高速缓冲存储器。该存储器可以保存处理器210刚用过或循环使用的指令或数据。如果处理器210需要再次使用该指令或数据,可从所述存储器中直接调用。避免了重复存取,减少了处理器210的等待时间,因而提高了系统的效率。
在一些实施例中,处理器210可以包括一个或多个接口。接口可以包括集成电路(inter-integrated circuit,I2C)接口,集成电路内置音频(inter-integrated circuit sound,I2S)接口,脉冲编码调制(pulse code modulation,PCM)接口,通用异步收发传输器(universal asynchronous receiver/transmitter,UART)接口,移动产业处理器接口(mobile industry processor interface,MIPI),通用输入输出(general-purpose input/output,GPIO)接口,用户标识模块(subscriber identity module,SIM)接口,和/或通用串行总线(universal serial bus,USB)接口等。
I2C接口是一种双向同步串行总线，包括一根串行数据线(serial data line,SDA)和一根串行时钟线(serial clock line,SCL)。在一些实施例中，处理器210可以包含多组I2C总线。处理器210可以通过不同的I2C总线接口分别耦合触摸传感器280K、充电器、闪光灯、摄像头293等。例如：处理器210可以通过I2C接口耦合触摸传感器280K，使处理器210与触摸传感器280K通过I2C总线接口通信，实现电子设备200的触摸功能。
I2S接口可以用于音频通信。在一些实施例中,处理器210可以包含多组I2S总线。处理器210可以通过I2S总线与音频模块270耦合,实现处理器210与音频模块270之间的通信。
PCM接口也可以用于音频通信,将模拟信号抽样,量化和编码。在一些实施例中,音频模块270与无线通信模块260可以通过PCM总线接口耦合。在一些实施例中,音频模块270也可以通过PCM接口向无线通信模块260传递音频信号,实现通过蓝牙耳机接听电话的功能。所述I2S接口和所述PCM接口都可以用于音频通信。
UART接口是一种通用串行数据总线,用于异步通信。该总线可以为双向通信总线。它将要传输的数据在串行通信与并行通信之间转换。在一些实施例中,UART接口通常被用于连接处理器210与无线通信模块260。例如:处理器210通过UART接口与无线通信模块260中的蓝牙模块通信,实现蓝牙功能。在一些实施例中,音频模块270可以通过UART接口向无线通信模块260传递音频信号,实现通过蓝牙耳机播放音乐的功能。
MIPI接口可以被用于连接处理器210与显示屏294,摄像头293等外围器件。MIPI接口包括摄像头串行接口(camera serial interface,CSI),显示屏串行接口(display serial interface,DSI)等。在一些实施例中,处理器210和摄像头293通过CSI接口通信,实现 电子设备200的拍摄功能。处理器210和显示屏294通过DSI接口通信,实现电子设备200的显示功能。
GPIO接口可以通过软件配置。GPIO接口可以被配置为控制信号,也可被配置为数据信号。在一些实施例中,GPIO接口可以用于连接处理器210与摄像头293,显示屏294,无线通信模块260,音频模块270,传感器模块280等。GPIO接口还可以被配置为I2C接口,I2S接口,UART接口,MIPI接口等。
USB接口230是符合USB标准规范的接口,具体可以是Mini USB接口,Micro USB接口,USB Type C接口等。USB接口230可以用于连接充电器为电子设备200充电,也可以用于电子设备200与外围设备之间传输数据。也可以用于连接耳机,通过耳机播放音频。该接口还可以用于连接其他电子设备,例如AR设备等。
可以理解的是,本申请实施例示意的各模块间的接口连接关系,只是示意性说明,并不构成对电子设备200的结构限定。在本申请另一些实施例中,电子设备200也可以采用上述实施例中不同的接口连接方式,或多种接口连接方式的组合。
充电管理模块240用于从充电器接收充电输入。其中，充电器可以是无线充电器，也可以是有线充电器。在一些有线充电的实施例中，充电管理模块240可以通过USB接口230接收有线充电器的充电输入。在一些无线充电的实施例中，充电管理模块240可以通过电子设备200的无线充电线圈接收无线充电输入。充电管理模块240为电池242充电的同时，还可以通过电源管理模块241为电子设备供电。
电源管理模块241用于连接电池242,充电管理模块240与处理器210。电源管理模块241接收电池242和/或充电管理模块240的输入,为处理器210,内部存储器221,外部存储器,显示屏294,摄像头293,和无线通信模块260等供电。电源管理模块241还可以用于监测电池容量,电池循环次数,电池健康状态(漏电,阻抗)等参数。在其他一些实施例中,电源管理模块241也可以设置于处理器210中。在另一些实施例中,电源管理模块241和充电管理模块240也可以设置于同一个器件中。
电子设备200的无线通信功能可以通过天线1,天线2,移动通信模块250,无线通信模块260,调制解调处理器以及基带处理器等实现。
天线1和天线2用于发射和接收电磁波信号。电子设备200中的每个天线可用于覆盖单个或多个通信频带。不同的天线还可以复用,以提高天线的利用率。例如:可以将天线1复用为无线局域网的分集天线。在另外一些实施例中,天线可以和调谐开关结合使用。
移动通信模块250可以提供应用在电子设备200上的包括2G/3G/4G/5G等无线通信的解决方案。移动通信模块250可以包括至少一个滤波器,开关,功率放大器,低噪声放大器(low noise amplifier,LNA)等。移动通信模块250可以由天线1接收电磁波,并对接收的电磁波进行滤波,放大等处理,传送至调制解调处理器进行解调。移动通信模块250还可以对经调制解调处理器调制后的信号放大,经天线1转为电磁波辐射出去。在一些实施例中,移动通信模块250的至少部分功能模块可以被设置于处理器210中。在一些实施例中,移动通信模块250的至少部分功能模块可以与处理器210的至少部分模块被设置在同一个器件中。
调制解调处理器可以包括调制器和解调器。其中,调制器用于将待发送的低频基带信号调制成中高频信号。解调器用于将接收的电磁波信号解调为低频基带信号。随后解调器将解调得到的低频基带信号传送至基带处理器处理。低频基带信号经基带处理器处理后,被传递给应用处理器。应用处理器通过音频设备(不限于扬声器270A,受话器270B等)输出声音信号,或通过显示屏294显示图像或视频。在一些实施例中,调制解调处理器可以是独立的器件。在另一些实施例中,调制解调处理器可以独立于处理器210,与移动通信模块250或其他功能模块设置在同一个器件中。
无线通信模块260可以提供应用在电子设备200上的包括无线局域网(wireless local area networks,WLAN)(如无线保真(wireless fidelity,Wi-Fi)网络),蓝牙(bluetooth,BT),全球导航卫星系统(global navigation satellite system,GNSS),调频(frequency modulation,FM),近距离无线通信技术(near field communication,NFC),红外技术(infrared,IR)等无线通信的解决方案。无线通信模块260可以是集成至少一个通信处理模块的一个或多个器件。无线通信模块260经由天线2接收电磁波,将电磁波信号调频以及滤波处理,将处理后的信号发送到处理器210。无线通信模块260还可以从处理器210接收待发送的信号,对其进行调频,放大,经天线2转为电磁波辐射出去。
在一些实施例中,电子设备200的天线1和移动通信模块250耦合,天线2和无线通信模块260耦合,使得电子设备200可以通过无线通信技术与网络以及其他设备通信。所述无线通信技术可以包括全球移动通讯系统(global system for mobile communications,GSM),通用分组无线服务(general packet radio service,GPRS),码分多址接入(code division multiple access,CDMA),宽带码分多址(wideband code division multiple access,WCDMA),时分码分多址(time-division code division multiple access,TD-SCDMA),长期演进(long term evolution,LTE),BT,GNSS,WLAN,NFC,FM,和/或IR技术等。所述GNSS可以包括全球卫星定位系统(global positioning system,GPS),全球导航卫星系统(global navigation satellite system,GLONASS),北斗卫星导航系统(beidou navigation satellite system,BDS),准天顶卫星系统(quasi-zenith satellite system,QZSS)和/或星基增强系统(satellite based augmentation systems,SBAS)。
电子设备200通过GPU,显示屏294,以及应用处理器等实现显示功能。GPU为图像处理的微处理器,连接显示屏294和应用处理器。GPU用于执行数学和几何计算,用于图形渲染。处理器210可包括一个或多个GPU,其执行程序指令以生成或改变显示信息。
显示屏294用于显示图像,视频等。显示屏294包括显示面板。显示面板可以采用液晶显示屏(liquid crystal display,LCD),有机发光二极管(organic light-emitting diode,OLED),有源矩阵有机发光二极体或主动矩阵有机发光二极体(active-matrix organic light emitting diode的,AMOLED),柔性发光二极管(flex light-emitting diode,FLED),Miniled,MicroLed,Micro-oLed,量子点发光二极管(quantum dot light emitting diodes,QLED)等。在一些实施例中,电子设备200可以包括1个或N个显示屏294,N为大于1的正整数。
电子设备200可以通过ISP,摄像头293,视频编解码器,GPU,显示屏294以及应用 处理器等实现拍摄功能。
ISP用于处理摄像头293反馈的数据。例如,拍照时,打开快门,光线通过镜头被传递到摄像头感光元件上,光信号转换为电信号,摄像头感光元件将所述电信号传递给ISP处理,转化为肉眼可见的图像。ISP还可以对图像的噪点,亮度,肤色进行算法优化。ISP还可以对拍摄场景的曝光,色温等参数优化。在一些实施例中,ISP可以设置在摄像头293中。
摄像头293用于捕获静态图像或视频。物体通过镜头生成光学图像投射到感光元件。感光元件可以是电荷耦合器件(charge coupled device,CCD)或互补金属氧化物半导体(complementary metal-oxide-semiconductor,CMOS)光电晶体管。感光元件把光信号转换成电信号,之后将电信号传递给ISP转换成数字图像信号。ISP将数字图像信号输出到DSP加工处理。DSP将数字图像信号转换成标准的RGB,YUV等格式的图像信号。在一些实施例中,电子设备200可以包括1个或N个摄像头293,N为大于1的正整数。
例如,在本申请提供的图像处理方法中,摄像头可以采集图像,并将采集的图像显示在预览界面中。感光元件把采集到的光信号转换成电信号,之后将电信号传递给ISP转换成数字图像信号。ISP将数字图像信号输出到DSP,做相关的图像加工处理。
数字信号处理器用于处理数字信号,除了可以处理数字图像信号,还可以处理其他数字信号。例如,当电子设备200在频点选择时,数字信号处理器用于对频点能量进行傅里叶变换等。
视频编解码器用于对数字视频压缩或解压缩。电子设备200可以支持一种或多种视频编解码器。这样,电子设备200可以播放或录制多种编码格式的视频,例如:动态图像专家组(moving picture experts group,MPEG)1,MPEG2,MPEG3,MPEG4等。
NPU为神经网络(neural-network,NN)计算处理器,通过借鉴生物神经网络结构,例如借鉴人脑神经元之间传递模式,对输入信息快速处理,还可以不断的自学习。通过NPU可以实现电子设备200的智能认知等应用,例如:图像识别,人脸识别,语音识别,文本理解等。
外部存储器接口220可以用于连接外部存储卡,例如Micro SD卡,实现扩展电子设备200的存储能力。外部存储卡通过外部存储器接口220与处理器210通信,实现数据存储功能。例如将音乐,视频等文件保存在外部存储卡中。
内部存储器221可以用于存储计算机可执行程序代码,所述可执行程序代码包括指令。处理器210通过运行存储在内部存储器221的指令,从而执行电子设备200的各种功能应用以及数据处理。内部存储器221可以包括存储程序区和存储数据区。其中,存储程序区可存储操作系统,至少一个功能所需的应用程序(比如声音播放功能,图像播放功能等)等。存储数据区可存储电子设备200使用过程中所创建的数据(比如音频数据,电话本等)等。此外,内部存储器221可以包括高速随机存取存储器,还可以包括非易失性存储器,例如至少一个磁盘存储器件,闪存器件,通用闪存存储器(universal flash storage,UFS)等。
电子设备200可以通过音频模块270,扬声器270A,受话器270B,麦克风270C,耳机接口270D,以及应用处理器等实现音频功能。例如音乐播放,录音等。
音频模块270用于将数字音频信息转换成模拟音频信号输出,也用于将模拟音频输入转换为数字音频信号。音频模块270还可以用于对音频信号编码和解码。在一些实施例中,音频模块270可以设置于处理器210中,或将音频模块270的部分功能模块设置于处理器210中。
扬声器270A,也称“喇叭”,用于将音频电信号转换为声音信号。电子设备200可以通过扬声器270A收听音乐,或收听免提通话。
受话器270B,也称“听筒”,用于将音频电信号转换成声音信号。当电子设备200接听电话或语音信息时,可以通过将受话器270B靠近人耳接听语音。
麦克风270C,也称“话筒”,“传声器”,用于将声音信号转换为电信号。当拨打电话或发送语音信息时,用户可以通过人嘴靠近麦克风270C发声,将声音信号输入到麦克风270C。电子设备200可以设置至少一个麦克风270C。在另一些实施例中,电子设备200可以设置两个麦克风270C,除了采集声音信号,还可以实现降噪功能。在另一些实施例中,电子设备200还可以设置三个,四个或更多麦克风270C,实现采集声音信号,降噪,还可以识别声音来源,实现定向录音功能等。
耳机接口270D用于连接有线耳机。耳机接口270D可以是USB接口230,也可以是3.5mm的开放移动电子设备平台(open mobile terminal platform,OMTP)标准接口,美国蜂窝电信工业协会(cellular telecommunications industry association of the USA,CTIA)标准接口。
压力传感器280A用于感受压力信号,可以将压力信号转换成电信号。在一些实施例中,压力传感器280A可以设置于显示屏294。压力传感器280A的种类很多,如电阻式压力传感器,电感式压力传感器,电容式压力传感器等。电容式压力传感器可以是包括至少两个具有导电材料的平行板。当有力作用于压力传感器280A,电极之间的电容改变。电子设备200根据电容的变化确定压力的强度。当有触摸操作作用于显示屏294,电子设备200根据压力传感器280A检测所述触摸操作强度。电子设备200也可以根据压力传感器280A的检测信号计算触摸的位置。在一些实施例中,作用于相同触摸位置,但不同触摸操作强度的触摸操作,可以对应不同的操作指令。例如:当有触摸操作强度小于第一压力阈值的触摸操作作用于短消息应用图标时,执行查看短消息的指令。当有触摸操作强度大于或等于第一压力阈值的触摸操作作用于短消息应用图标时,执行新建短消息的指令。
陀螺仪传感器280B可以用于确定电子设备200的运动姿态。在一些实施例中,可以通过陀螺仪传感器280B确定电子设备200围绕三个轴(即,x,y和z轴)的角速度。陀螺仪传感器280B可以用于拍摄防抖。示例性的,当按下快门,陀螺仪传感器280B检测电子设备200抖动的角度,根据角度计算出镜头模组需要补偿的距离,让镜头通过反向运动抵消电子设备200的抖动,实现防抖。陀螺仪传感器280B还可以用于导航,体感游戏场景。
气压传感器280C用于测量气压。在一些实施例中,电子设备200通过气压传感器280C测得的气压值计算海拔高度,辅助定位和导航。
磁传感器280D包括霍尔传感器。电子设备200可以利用磁传感器280D检测翻盖皮套的开合。在一些实施例中,当电子设备200是翻盖机时,电子设备200可以根据磁传感器 280D检测翻盖的开合。进而根据检测到的皮套的开合状态或翻盖的开合状态,设置翻盖自动解锁等特性。
加速度传感器280E可检测电子设备200在各个方向上(一般为三轴)加速度的大小。当电子设备200静止时可检测出重力的大小及方向。还可以用于识别电子设备姿态,应用于横竖屏切换,计步器等应用。
距离传感器280F,用于测量距离。电子设备200可以通过红外或激光测量距离。在一些实施例中,拍摄场景,电子设备200可以利用距离传感器280F测距以实现快速对焦。
例如,在本申请提供的图像处理方法中,在摄像头拍摄图像的过程中,自动对焦过程就可以根据距离传感器280F测距,从而实现快速的自动对焦。
接近光传感器280G可以包括例如发光二极管(LED)和光检测器,例如光电二极管。发光二极管可以是红外发光二极管。电子设备200通过发光二极管向外发射红外光。电子设备200使用光电二极管检测来自附近物体的红外反射光。当检测到充分的反射光时,可以确定电子设备200附近有物体。当检测到不充分的反射光时,电子设备200可以确定电子设备200附近没有物体。电子设备200可以利用接近光传感器280G检测用户手持电子设备200贴近耳朵通话,以便自动熄灭屏幕达到省电的目的。接近光传感器280G也可用于皮套模式,口袋模式自动解锁与锁屏。
环境光传感器280L用于感知环境光亮度。电子设备200可以根据感知的环境光亮度自适应调节显示屏294亮度。环境光传感器280L也可用于拍照时自动调节白平衡。环境光传感器280L还可以与接近光传感器280G配合,检测电子设备200是否在口袋里,以防误触。
指纹传感器280H用于采集指纹。电子设备200可以利用采集的指纹特性实现指纹解锁,访问应用锁,指纹拍照,指纹接听来电等。
温度传感器280J用于检测温度。在一些实施例中,电子设备200利用温度传感器280J检测的温度,执行温度处理策略。例如,当温度传感器280J上报的温度超过阈值,电子设备200执行降低位于温度传感器280J附近的处理器的性能,以便降低功耗实施热保护。在另一些实施例中,当温度低于另一阈值时,电子设备200对电池242加热,以避免低温导致电子设备200异常关机。在其他一些实施例中,当温度低于又一阈值时,电子设备200对电池242的输出电压执行升压,以避免低温导致的异常关机。
触摸传感器280K,也称“触控面板”。触摸传感器280K可以设置于显示屏294,由触摸传感器280K与显示屏294组成触摸屏,也称“触控屏”。触摸传感器280K用于检测作用于其上或附近的触摸操作。触摸传感器可以将检测到的触摸操作传递给应用处理器,以确定触摸事件类型。可以通过显示屏294提供与触摸操作相关的视觉输出。在另一些实施例中,触摸传感器280K也可以设置于电子设备200的表面,与显示屏294所处的位置不同。
骨传导传感器280M可以获取振动信号。在一些实施例中,骨传导传感器280M可以获取人体声部振动骨块的振动信号。骨传导传感器280M也可以接触人体脉搏,接收血压跳动信号。在一些实施例中,骨传导传感器280M也可以设置于耳机中,结合成骨传导耳机。音频模块270可以基于所述骨传导传感器280M获取的声部振动骨块的振动信号,解析出语音信号,实现语音功能。应用处理器可以基于所述骨传导传感器280M获取的血压跳动信号解 析心率信息,实现心率检测功能。
按键290包括开机键,音量键等。按键290可以是机械按键。也可以是触摸式按键。电子设备200可以接收按键输入,产生与电子设备200的用户设置以及功能控制有关的键信号输入。
马达291可以产生振动提示。马达291可以用于来电振动提示,也可以用于触摸振动反馈。例如,作用于不同应用(例如拍照,音频播放等)的触摸操作,可以对应不同的振动反馈效果。作用于显示屏294不同区域的触摸操作,马达291也可对应不同的振动反馈效果。不同的应用场景(例如:时间提醒,接收信息,闹钟,游戏等)也可以对应不同的振动反馈效果。触摸振动反馈效果还可以支持自定义。
指示器292可以是指示灯,可以用于指示充电状态,电量变化,也可以用于指示消息,未接来电,通知等。
SIM卡接口295用于连接SIM卡。SIM卡可以通过插入SIM卡接口295,或从SIM卡接口295拔出,实现和电子设备200的接触和分离。电子设备200可以支持1个或N个SIM卡接口,N为大于1的正整数。SIM卡接口295可以支持Nano SIM卡,Micro SIM卡,SIM卡等。同一个SIM卡接口295可以同时插入多张卡。所述多张卡的类型可以相同,也可以不同。SIM卡接口295也可以兼容不同类型的SIM卡。SIM卡接口295也可以兼容外部存储卡。电子设备200通过SIM卡和网络交互,实现通话以及数据通信等功能。在一些实施例中,电子设备200采用eSIM,即:嵌入式SIM卡。eSIM卡可以嵌在电子设备200中,不能和电子设备200分离。
电子设备200的软件系统可以采用分层架构,事件驱动架构,微核架构,微服务架构,或云架构。本申请实施例以分层架构的Android系统为例,示例性说明电子设备200的软件结构。
图3a是本申请实施例的电子设备200的软件结构框图。分层架构将软件分成若干个层,每一层都有清晰的角色和分工。层与层之间通过软件接口通信。在一些实施例中,将Android系统分为四层,从上至下分别为应用程序层,应用程序框架层,安卓运行时(Android runtime)和系统库,以及内核层。应用程序层可以包括一系列应用程序包。
如图3a所示,应用程序包可以包括相机,图库,日历,通话,地图,导航,WLAN,蓝牙,音乐,视频,短信息等应用程序。
应用程序框架层为应用程序层的应用程序提供应用编程接口(application programming interface,API)和编程框架。应用程序框架层包括一些预先定义的函数。
如图3a所示,应用程序框架层可以包括窗口管理器,内容提供器,视图系统,电话管理器,资源管理器,通知管理器等。
窗口管理器用于管理窗口程序。窗口管理器可以获取显示屏大小,判断是否有状态栏,锁定屏幕,截取屏幕等。
内容提供器用来存放和获取数据,并使这些数据可以被应用程序访问。所述数据可以包括视频,图像,音频,拨打和接听的电话,浏览历史和书签,电话簿等。
视图系统包括可视控件,例如显示文字的控件,显示图片的控件等。视图系统可用于 构建应用程序。显示界面可以由一个或多个视图组成的。例如,包括短信通知图标的显示界面,可以包括显示文字的视图以及显示图片的视图。
电话管理器用于提供电子设备200的通信功能。例如通话状态的管理(包括接通,挂断等)。
资源管理器为应用程序提供各种资源,比如本地化字符串,图标,图片,布局文件,视频文件等等。
通知管理器使应用程序可以在状态栏中显示通知信息,可以用于传达告知类型的消息,可以短暂停留后自动消失,无需用户交互。比如通知管理器被用于告知下载完成,消息提醒等。通知管理器还可以是以图表或者滚动条文本形式出现在系统顶部状态栏的通知,例如后台运行的应用程序的通知,还可以是以对话窗口形式出现在屏幕上的通知。例如在状态栏提示文本信息,发出提示音,电子设备振动,指示灯闪烁等。
Android runtime包括核心库和虚拟机。Android runtime负责安卓系统的调度和管理。
核心库包含两部分:一部分是java语言需要调用的功能函数,另一部分是安卓的核心库。
应用程序层和应用程序框架层运行在虚拟机中。虚拟机将应用程序层和应用程序框架层的java文件执行为二进制文件。虚拟机用于执行对象生命周期的管理,堆栈管理,线程管理,安全和异常的管理,以及垃圾回收等功能。
系统库可以包括多个功能模块。例如:表面管理器(surface manager),媒体库(media libraries),三维图形处理库(例如:OpenGL ES),2D图形引擎(例如:SGL)等。
表面管理器用于对显示子系统进行管理,并且为多个应用程序提供了2D和3D图层的融合。
媒体库支持多种常用的音频,视频格式回放和录制,以及静态图像文件等。媒体库可以支持多种音视频编码格式,例如:MPEG4,H.264,MP3,AAC,AMR,JPG,PNG等。
三维图形处理库用于实现三维图形绘图,图像渲染,合成,和图层处理等。2D图形引擎是2D绘图的绘图引擎。内核层是硬件和软件之间的层。内核层至少包含显示驱动,摄像头驱动,音频驱动,传感器驱动。
为了便于理解，本申请以下实施例将以具有图2和图3a所示结构的电子设备为例，结合附图和应用场景，对本申请实施例提供的一种图像增强方法进行具体阐述。
参照图3b,图3b为本申请实施例提供的一种图像增强方法的实施例示意图,如图3b中示出的那样,本申请实施例提供的一种图像增强方法包括:
301、电子设备获取第一图像,所述第一图像包括目标对象。
本申请实施例中,电子设备可以基于用户的选择来确定需要进行图像增强的第一图像。
本申请实施例中,第一图像可以包括拍摄人脸得到的目标对象,其中,目标对象可以是人脸。
接下来介绍电子设备如何获取到第一图像。
可选的,在一种实施例中,第一图像可以是用户通过电子设备的摄像装置(例如摄像头) 对人脸进行实时拍摄所获取到的人脸图像。
可选的,在一种实施例中,用户从电子设备的本地图库或者云相册中选择的已存储的人脸图像,此处的云相册可以指位于云计算平台的网络相册。
可选的,在一种实施例中,电子设备可以对本地相册中存储的图像进行可增强判断,并基于判断结果,将提示用户对可以增强的图像进行增强,进而用户可以在电子设备提示的可以增强的图像中选择第一图像。
可选的,在另一种场景中,电子设备可以在拍摄界面中设置增强功能,相应的,不需要用户的选择,用户在拍摄得到图像之后,电子设备可以自动将用户拍摄的图像作为第一图像。
接下来分别进行说明:
首先介绍用户通过电子设备的摄像装置拍摄得到要增强的第一图像。
本实施例中,电子设备可以显示相机的拍摄界面,接收用户的拍摄操作,响应于所述拍摄操作,获取所述第一图像。
具体的,电子设备可以显示相机的拍摄界面,在摄像头对准人脸后,用户可以点击拍摄界面中的拍摄控件,相应的,电子设备可以接收用户的拍摄操作,响应于所述拍摄操作,进行拍摄,并获取所述第一图像,其中,第一图像包括与人脸或人脸的局部区域对应的目标对象。
具体的,图4(a)是本申请实施例提供的一例图像增强处理界面(graphical user interface,GUI)的示意图,图4(a)图示出了手机的解锁模式下,手机的屏幕显示系统显示了当前输出的界面内容401,该界面内容401为手机的主界面。该界面内容401显示了多款第三方应用程序(application,App),例如支付宝、任务卡商店、微博、相册、微信、卡包、设置、相机。应理解,界面内容401还可以包括其他更多的应用程序,本申请对此不作限定。
当手机检测到用户点击主界面401上的相机应用的图标402的操作后,可以启动相机应用,显示如图4(b)图所示的界面,该界面可以称为相机的拍摄界面403。该拍摄界面403上可以包括取景框、相册图标404、拍摄控件405和摄像头旋转控件406等。
其中,取景框用于获取拍摄预览的图像,实时显示预览图像,如图4(b)图中一个人脸的预览图像。相册图标404用于快捷进入相册,当手机检测到用户点击相册的图标404后,可以在触摸屏上展示已经拍摄的照片或者视频,或者显示从网络下载保存的照片或者视频等。拍摄控件405用于拍摄照片或者录像,当手机检测到用户点击拍摄控件405后,手机执行拍照操作,并将拍摄的照片保存下来;或者,当手机处于录像模式时,用户点击拍摄控件405后,手机执行录像操作,并将录制的视频保存下来。摄像头旋转控件406可以用于控制前置摄像头和后置摄像头的切换。
此外,该拍摄界面403上还包括用于设置拍摄模式的功能控件,例如4(b)图中人像模式、拍照模式、录像模式、专业模式和更多模式。应理解,当用户点击图标402后,响应于该点击操作,手机打开相机应用后默认在拍照模式下,本申请对此不做限定。
如图4(b)图所示,在普通拍照模式下,用户可以点击拍摄控件405进行拍摄。响应于 用户点击拍摄控件405的操作,手机执行拍照操作,并获取到拍照得到的第一图像。
在一种实施例中，用户在基于电子设备的相机拍摄后，拍摄得到的第一图像的图像质量可能较低。需要说明的是，第一图像的图像质量较低可以理解为第一图像中人脸区域的图像质量较低，或者，第一图像中人脸的部分区域（例如某个五官）的图像质量较低，此处并不限定。需要说明的是，图像质量较低可以基于用户的视觉来判断，示例性的，图像质量较低可以包括如下图像特征中的至少一种：亮度较差、色调较差、细节清晰度低，例如：人脸亮度或色调较差、人脸细节清晰度低、某一个或多个五官的亮度或色调较差或某一个或多个五官的细节清晰度低。
此时,用户可以对拍摄得到的第一图像进行增强,例如点击如图4(c)中示出的“增强”控件,如图4(c)图所示,手机在执行拍摄操作之后可以在照片显示区域409显示拍摄的照片,此外,手机的显示界面中还可以显示“增强”控件和“保存”控件,用户可以点击该“保存”控件,相应的,手机可以接收到保存指令,响应于保存指令,将拍摄的照片保存下来,保存在相册图标404中。此外,用户可以点击该“增强”控件,相应的,手机可以接收到增强指令,响应于增强指令,手机可以确定用户要对当前显示界面显示的照片进行增强,需要说明的是,为了方便描述,以下将需要进行增强处理的图像称为第一图像。
以上介绍了手机对拍摄得到的第一图像进行增强的实施例,可选的,在另一种场景中,用户可以直接从相册中选择需要增强的第一图像。
本实施例中,电子设备可以显示相机的相册界面,所述相册界面包括多个图像,接收第三图像选择指令,所述第三图像选择指令表示从所述相册界面包括的多个图像中选择所述第一图像。
具体的,如图5(a)示出的那样,图5(a)示出了相册的图像显示界面,其中可以包括用户之前拍摄的图像以及从网络侧下载的图像等,如图5(b)示出的那样,用户可以选择其中的一张图像,例如可以对选择的图像进行点击操作,或者是长按操作,手机响应于上述操作,可以显示如图5(c)中示出的界面,其中,除了常规可以显示出的图像预览、“删除”控件等,还可以包括“增强”控件,用户可以点击上述“增强”控件,响应于用户点击“增强”控件的操作,手机可以对该图像进行增强,例如可以显示如图4(d)的增强区域选择界面等。
需要说明的是,上述实施例中的控件设置以及显示内容仅为一种示意,本申请并不限定。
可选的,在另一种场景中,手机可以对本地相册中存储的图像进行可增强判断,并基于判断结果,提示用户对可以增强的图像进行增强。
具体的,手机可以基于照片的亮度的动态范围、色调、皮肤纹理度以及是否有人脸姿态相似的高清引导图像作为判断依据,例如,图6(a)中的第一张图像相比于第二张图像的亮度较差,且其中的人脸姿态和第一张图像的姿态相似,因此,可以确定第一张图像为可增强图像。示例性的,如图6(a)中示出的那样,图6(a)示出了相册的图像显示界面,其中,该界面除了包括用户之前拍摄的图像以及从网络侧下载的图像,还可以包括“可增 强图像”控件,如图6(b)示出的那样,用户可以点击该“可增强图像”控件,响应于用户的操作,手机可以显示如图6(b)示出的可增强图像的显示界面,如图6(b)示出的那样,用户可以点击可增强图像的显示界面中想要进行增强的图像,响应于用户的操作,手机可以显示如图6(c)示出的图像预览界面,其中,除了常规可以显示出的图像预览、“删除”控件等,还可以包括“增强”控件,用户可以点击上述“增强”控件,响应于用户点击“增强”控件的操作,手机可以对该图像进行增强,例如可以显示如图4(d)的增强区域选择界面等。
需要说明的是,上述实施例中的控件设置以及显示内容仅为一种示意,本申请并不限定。
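作为对上述“可增强判断”思路的一种示意，下面给出一段基于OpenCV的Python代码草图：通过人脸区域的亮度动态范围和拉普拉斯清晰度对照片做粗筛。其中的阈值、输入参数face_box以及函数名均为说明用的假设，并非对本申请实施例的限定。
```python
import cv2

def is_enhanceable(image_bgr, face_box, dr_thresh=120, sharpness_thresh=60.0):
    """粗略判断人脸区域是否适合被增强：动态范围偏小或清晰度偏低时返回True。
    face_box为(x, y, w, h)，假设已由任意人脸检测器给出。"""
    x, y, w, h = face_box
    gray = cv2.cvtColor(image_bgr[y:y + h, x:x + w], cv2.COLOR_BGR2GRAY)

    # 亮度的动态范围：最亮与最暗像素之间的灰度等级数量
    dynamic_range = int(gray.max()) - int(gray.min()) + 1

    # 细节清晰度：拉普拉斯响应的方差，数值越小通常越模糊
    sharpness = cv2.Laplacian(gray, cv2.CV_64F).var()

    return dynamic_range < dr_thresh or sharpness < sharpness_thresh
```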
可选的,在另一种场景中,手机可以在拍摄界面中设置增强功能。
示例性的,如图7(a)中示出的那样,该拍摄界面403上包括用于设置拍摄模式的功能控件,例如图7(a)图中人像模式、拍照模式、录像模式、增强模式和更多模式。应理解,当用户将增强模式的图标滑动至当前的模式,响应于该点击操作,手机进入增强模式。如图7(b)中示出的那样,用户可以点击拍照控件405,手机响应于该操作,将拍摄获取的图像显示在图7(c)示出的显示界面中,此外,该显示界面还可以包括“保存”控件和“增强”控件,若用户点击“保存”控件,则手机可以响应于该操作,不对该图像进行增强处理,而是直接保存至本地相册中,若用户点击“增强”控件,则手机可以响应于该操作,对该图像进行增强处理,例如可以获取引导图像,并基于引导图像对上述拍摄获得的第一图像进行增强处理。
可选地,在另一种实施例中,手机可以不基于用户的操作而进入到增强模式,而是基于对拍摄界面上预览图像的图像质量分析来确定是否进入到增强模式。
如图8中示出的那样,当手机识别到拍摄的人脸的清晰度过低时,可以自动进入到增强模式。此外,手机还可以结合人脸在预览界面上出现的时长,来判断是否进入增强模式,可以降低误判率,降低用户对手机的移动等操作产生的影响。例如,手机识别出在预览界面上拍摄的人脸的清晰度过低,但是人脸出现的时间仅为1秒,下一秒内在预览界面上没有人脸,手机可以不进入增强模式。
需要说明的是,上述界面中控件的设置方式和显示内容仅为一种示意,这里并不限定。
手机拍摄进入增强模式之后，可以对图像预览区域的图像进行分析，并获取可以作为图像预览区域的图像的引导图像，例如，可以在本地相册、本地增强图图库或者云端增强图图库中查找是否有可以作为图像预览区域的图像的引导图像（人脸的姿态表情相近、亮度和色调更优等），若获取到，则可以在用户拍摄、手机获取到第一图像后，自动基于引导图像对第一图像进行增强。
手机拍摄进入增强模式之后,如图8中示出的那样,手机的预览界面上可以包括提醒框,该提醒框可以用于提示用户当前的拍摄进入增强模式,该提醒框可以包括增强模式的文字内容和关闭控件(例如图8中示出的“退出”控件)。
其中,当用户点击关闭控件,响应于用户的点击操作,手机拍摄可以退出增强模式。例如,用户可能因为其他操作,手机的预览界面一定时间内都出现亮度的动态范围或者清 晰度过低的人脸,手机识别到亮度的动态范围或者清晰度过低的人脸而进入增强模式,此时用户可能并不希望进入增强模式拍摄人脸图片;或者用户拍完人脸图片而想退出增强模式进入到普通模式时,用户可以通过点击提醒框中的关闭控件,从而拍摄预览界面可以由图8切换到普通模式的显示界面。此外,还可以有其他关闭增强模式的方法,本申请对此不作限定。
可选地，在另一种实施例中，当手机识别到拍摄的人脸的亮度的动态范围或者清晰度过低时，可以显示让用户选择是否进入增强模式的引导。
如图9中示出的那样，手机的预览界面上可以包括提醒框，该提醒框可以用于提示用户选择是否进入增强模式，该提醒框可以包括增强模式的文字内容、确定控件和隐藏控件。具体的，当手机识别到拍摄的人脸的亮度的动态范围或者清晰度过低时，可以显示让用户选择是否进入增强模式的引导，如图9中示出的那样，用户可以点击“进入”控件，以使得手机进入增强模式。
需要说明的是,上述界面中控件的设置方式和显示内容仅为一种示意,这里并不限定。
本申请实施例中,用户可以选择第一图像中要增强的目标对象。
具体的,手机可以显示目标对象选择控件,具体的,如图4(d)中示出的那样,目标对象选择控件可以包括“全部”控件、“五官”控件和“自定义区域”控件,其中,“全部”控件可以提供对当前拍摄的照片的人脸区域进行增强的功能,“五官”控件可以提供对当前拍摄的照片中的五官进行增强的功能,“自定义区域”控件可以提供对当前拍摄的照片中的自定义区域进行增强的功能。
需要说明的是,上述目标对象选择控件仅为一种示意,实际应用中,目标对象选择控件还可以是其他类型,本申请并不限定。
302、电子设备根据所述第一图像获取引导图像,所述引导图像包括所述目标对象,所述引导图像中的目标对象的清晰度大于所述第一图像中的目标对象的清晰度。
本申请实施例中,用户可以从本地相册或云相册中选择用于增强第一图像的引导图像,或者由电子设备选择可以用于增强第一图像的引导图像,接下来分别进行说明。
首先介绍用户如何从本地相册或云相册中选择引导图像。
如图10(a)所示，用户可以点击图4(d)中示出的“全部”控件，相应的，手机可以接收到对拍摄的照片的人脸区域进行增强的指令，响应于该指令，手机可以显示引导图像选择界面。为了方便描述，以下将用于增强第一图像的图像称为引导图像。
可选地,如图10(b)中示出的那样,在一种实施例中,手机可以显示引导图像的选择界面,该引导图像的选择界面可以包括“从本地相册选择”控件以及“智能选择”控件。其中,用户可以点击“从本地相册选择”,相应的,手机可以接收到从本地相册选择引导图像的指令,响应于该指令,手机可以打开本地的相册,并在显示界面显示如图10(c)示出的引导图像的选择界面,可选地,图10(c)中可以包括相册显示区域501和待增强图显示区域502,其中,相册显示区域501可以显示本地相册中保存的照片的预览图,待增强图显示区域502可以显示出要增强的照片的预览图,上述控件的设置方式可以让用户在视觉上基于待增强图和引导图像进行比对,选择姿态更接近待增强图、细节清晰度更高的 引导图像。
应当理解,如本文所用,术语“高”和“低”(例如,术语“高质量”和“高分辨率”)并非指具体的阈值,而是指相对于彼此的关系。因此,“高分辨率”图像不需要大于特定数值的分辨率,但具有比相关“低分辨率”图像更高的分辨率。
需要说明的是,上述引导图像选择界面的设置仅为一种示意,本申请并不限定。
可选地,如图10(c)中示出的那样,用户可以从相册显示区域501中选择一个图像作为引导图像,相应的,手机可以接收到用户的图片选择指令,并获取用户选择的图像,手机可以确定上述用户选择的引导图像作为图4(b)中用户拍摄的图像的引导图像。
可选地,在一种实施例中,在获取到引导图像后,手机可以基于第一图像和引导图像中的人脸的姿态相似程度判断第一图像和引导图像中人脸的姿态表情是否接近,若第一图像和引导图像中人脸的姿态表情接近,则可以确定上述用户选择的引导图像可以作为第一图像的引导图像,若第一图像和引导图像中人脸的姿态表情不接近,则可以确定上述用户选择的引导图像不可以作为第一图像的引导图像。
需要说明的是,关于如何判断第一图像和引导图像中人脸的姿态表情的接近程度将在后文中描述,这里不再赘述。
示例性的,如图10(c)中示出的那样,若用户选择相册显示区域501中的第一张图像,由于相册显示区域501中的第一张图像和第一图像的姿态表情很接近,因此手机在基于图像中人脸的姿态接近程度的判断后,可以确定该引导图像可以作为第一图像的引导图像,并基于图像增强方法,对第一图像进行增强,如图10(f)中示出的那样,手机在基于图像增强方法,对第一图像进行增强后,可以显示目标图像。
示例性的，如图10(d)中示出的那样，若用户选择相册显示区域501中的第二张图像，由于相册显示区域501中的第二张图像和第一图像的姿态表情不接近（人脸的朝向差异很大），因此手机在基于图像中人脸的姿态接近程度的判断后，可以确定该引导图像不可以作为第一图像的引导图像，此时，手机可以提示用户重新进行引导图像的选择，可选地，如图10(e)中示出的那样，手机可以在界面上显示“姿态差异过大，请重新选择”的提示，并返回至图10(c)中示出的引导图像选择界面，使得用户可以重新选择姿态更接近于第一图像的引导图像。例如，若用户重新选择相册显示区域501中的第一张图像，由于相册显示区域501中的第一张图像和第一图像的姿态表情很接近，因此手机在基于图像中人脸的姿态接近程度的判断后，可以确定该引导图像可以作为第一图像的引导图像，并基于图像增强方法，对第一图像进行增强，如图10(f)中示出的那样，手机在基于图像增强方法，对第一图像进行增强后，可以显示目标图像。
可选的,在另一种实施例中,手机当获取到第一图像和引导图像后,可以将第一图像和引导图像发送到服务器,服务器基于图像增强方法,对第一图像进行增强,并将目标图像发送到手机,进一步的,手机可以显示目标图像。
关于手机或服务器如何基于图像增强方法,对第一图像进行增强,将在后文中进行描绘,这里不再赘述。
接下来介绍电子设备如何自动选择第一图像的引导图像。
本申请实施例中,电子设备可以基于人脸的姿态匹配策略和其他图像处理策略,从本地的相册或云端的相册中选择一张亮度和色调更优、细节清晰度较高,且人脸姿态相似度高的图像作为引导图像,来增强第一图像。
示例性的,如图10(b)中示出的那样,当用户点击“智能选择”控件时,相应的,手机可以接收到用户点击“智能选择”控件,并基于人脸的姿态匹配策略和其他图像处理策略,从本地的相册或云端的相册中选择一张亮度和色调更优、细节清晰度较高,且人脸姿态相似度高的图像作为引导图像,来增强第一图像。
其中,亮度的动态范围可以指目标对象或目标对象中包括的像素中,最亮的像素点和最暗的像素点之间的灰度等级数量。
可选地,在另一种实施例中,引导图像可以由服务器进行选择,而不需要引导用户选择,具体的,如图10(b)中示出的那样,当用户点击“智能选择”控件时,相应的,手机可以接收到用户点击“智能选择”控件,并将第一图像发送到服务器,服务器可以基于人脸的姿态匹配策略和其他图像处理策略,从本地的相册或云端的相册中选择一张亮度和色调更优,细节清晰度较高,且人脸姿态相似度高的图像作为引导图像,来增强第一图像。
可选地,本申请实施例中,手机在基于图像增强方法,对第一图像进行增强后,可以显示目标图像。此外,还可以在显示界面显示其他控件,例如图10(f)中示出的“保存”控件和“取消”控件。
具体的,用户若点击上述“保存”控件,手机可以响应于用户的操作,显示的目标图像保存至本地的相册,或者其他存储位置,例如存到云端等。
可选地,手机可以响应于用户点击“保存”控件的操作将第一图像以及增强的后的第一图像都保存至本地的相册,或者其他存储位置,例如存到云端等,这里并不限定。
用户若点击上述“取消”控件,则表示用户可能对目标图像的增强效果,并不满意,可选地,手机可以响应于上述用户点击“取消”的控件的操作,返回至相机的拍摄界面,例如图4(b)中示出的界面,或者,手机可以响应于上述用户点击“取消”的控件的操作,返回至图10(b)中的界面,提示用户重新进行引导图像的选择,或者,手机可以响应于上述用户点击“取消”的控件的操作,返回至图10(c)中显示的界面,提示用户重新进行引导图像的选择。
需要说明的是,上述界面的控件类型和手机显示界面的显示内容仅为一种示意,本申请并不限定。
可选地,在一种实施例中,用户可以只增强第一图像中的局部区域,例如只增强第一图像中的一个或多个五官,或者其他区域,这里并不限定。
具体的,如图11(a)所示,用户可以点击其中示出的“五官”控件,相应的,手机可以接收到对拍摄的照片的五官区域进行增强的指令,响应于该指令,手机可以显示五官区域选择界面。
可选地,如图11(b)中示出的那样,在一种实施例中,手机可以显示五官区域选择界面,该五官区域选择界面可以包括各个五官的选择引导控件,例如图11(b)中示出的“左眼”控件、“右眼”控件、“嘴唇”控件、“鼻子”控件、“左耳”控件、“右耳”控件、 “左眉”控件、以及“右眉”控件。其中,用户可以点击想增强的五官对应的控件,相应的,手机可以接收到用户点击想增强的五官对应的控件的指令,响应于该指令,基于人脸识别策略,识别第一图像中与用户的选择相对应的五官区域。示例性的,图11(b)中用户点击了“左眼”控件,相应的,手机可以接收到用户点击“左眼”控件的指令,响应于该指令,基于人脸识别策略,识别第一图像中人脸的左眼区域,可选的,手机可以通过提示框将左眼区域圈出来。
需要说明的是,上述五官区域选择界面的控件设置和显示内容仅为一种示意,本申请并不限定。
需要说明的是,关于如何基于人脸识别策略,识别第一图像中与用户的选择相对应的五官区域将在后文描述,这里不再赘述。
可选地,五官区域选择界面还可以包括“确定”控件和“返回”控件,如图11(c)中示出的那样,用户可以点击“左眼”控件和“嘴唇”控件,相应的,手机可以接收到用户点击“左眼”控件和“嘴唇”控件的指令,响应于该指令,基于人脸识别策略,识别第一图像中人脸的左眼区域和嘴唇区域。用户可以点击“确定”控件,相应的,手机可以接收到用户点击“确定”控件的指令,响应于该指令,手机可以显示引导图像的选择界面,关于引导图像的选择界面可以参照上述实施例中图11(b)及其对应的描述,这里不再赘述。
需要说明的是,上述界面的控件类型和手机显示界面的显示内容仅为一种示意,本申请并不限定。
可选地,在一种实施例中,如图12(a)所示,用户可以点击其中示出的“自定义区域”控件,该控件可以指示用户自己在第一图像中选择增强区域,相应的,手机可以接收到用户点击“自定义区域”控件的指令,响应于该指令,如图12(b)中示出的那样,手机可以显示增强区域选择界面。
可选地,图12(b)中示出了一种增强区域选择界面的示意,用户可以在该增强区域选择界面中手动圈出增强区域,如图12(c)中示出的那样,用户在完成增强区域的圈定后,手机可以显示“确定”控件以及“继续选择”控件。其中,用户可以点击该“确定”控件,响应于用户点击“确定”控件的指令,手机可以显示引导图像选择界面,关于引导图像的选择界面可以参照上述实施例中图15(b)及其对应的描述,这里不再赘述。
可选的,用户可以点击该“继续选择”控件,响应于用户点击“继续选择”控件的指令,手机可以显示增强区域选择界面,用户可以继续在该增强区域选择界面中圈定增强区域,如图12(d)中示出的那样,用户可以在该增强区域选择界面中手动圈出增强区域,并在圈定完成后,在图12(e)示出的界面中,点击“确定”控件,以进入引导图像的选择界面。
可选的,在另一种实施例中,用户可以点击其中示出的“自定义区域”控件,该控件可以指示用户自己在第一图像中选择增强区域,相应的,手机可以接收到用户点击“自定义区域”控件的指令,响应于该指令,手机可以显示增强区域选择界面,和上述图12(b)至图12(e)中不同的是,如图13(a)中示出的那样,手机可以在显示界面显示一个预设 大小的引导框,例如,可以在界面的中心显示一个预设大小的矩形框,用户可以拖动该矩形框平移至增强区域的位置(如图13(a)中示出的那样),并通过改变矩形框的大小来改变增强区域的大小(如图13(b)中示出的那样),相应的,手机可以基于用户对于引导框的操作确定增强区域,如图13(c)中示出的那样,用户在圈定完成后,点击“确定”控件,以进入引导图像的选择界面,或者可以点击“继续选择”控件,以进入引导图像的选择界面。
可选的,在一种实施例中,电子设备可以构建专门用于保存引导图像的相册。
参照图14(a),如图14(a)中示出的那样,在进行图像引导时,手机的引导图像显示界面还可以显示“从引导图像图库中选择”控件,具体的,用户可以点击该“从引导图像图库中选择”控件,响应于用户的操作,手机可以显示引导图像图库的界面,如图14(b)中示出的那样,引导图像图库中的图像可以基于预设的规则进行分类,例如按照人物、景物、动物等进行分类,进一步的,在人物的分类中,还可以按照人的不同进行分类,本申请并不限定。如图14(b)中示出的那样,引导图像图库显示界面可以包括“人物”控件和“景物”控件,用户在点击“人物”控件中,可以显示如图14(c)中示出的人物选择界面,其中,该界面可以包括人名对应的选择控件,用户可以点击对应的控件来指示手机显示对应的人的图像构建的相册,进而,用户可以在手机显示的相册中选择引导图像。
可选的,手机可以安装引导图像图库的应用程序,如图15(a)中示出的那样,用户可以点击“引导图像图库”应用程序对应的图标,相应的,手机可以接收到用户点击“引导图像图库”应用程序的指令,响应于该指令,手机可以显示引导图像图库的显示界面,可选的,在一种实施例中,手机可以显示如图15(b)中示出的引导图像图库界面,如图15(b)中示出的那样,引导图像图库显示界面可以包括“人物”控件和“景物”控件,用户在点击“人物”控件中,可以显示如图15(c)中示出的人物选择界面,其中,该界面可以包括人名对应的选择控件,用户可以点击对应的控件来指示手机显示对应的人的图像构建的相册,如图15(c)中示出的那样,用户可以点击“张三”控件,相应的,手机可以获取到用户点击“张三”控件的指令,并显示如图15(d)中示出的相册。
可选的,在一种实施例中,相册显示界面中还可以包括对相册进行修改的控件,例如图15(d)中示出的“+”控件,具体的,用户可以点击该“+”控件,来增加本相册中的图像,例如,用户点击“+”控件后,手机可以响应于该操作,显示本地相册,并引导用户选择想添加进该相册中的图像。此外,用户还可以删除已经添加进该相册中的图像。
需要说明的是,上述相册显示界面中的控件设置以及显示内容仅为一种示意,本申请并不限定。
可选的,在一种实施例中,用户可以直接从第三方应用程序中将显示的图像添加进引导图像图库中。
如图16(a)中示出的那样,图16(a)示出了一种聊天界面的示意图,张三发送了一张图像,手机接收到该图像之后,可以在聊天界面上显示,如图16(b)中示出的那样,用户可以长按该图像,手机响应于该操作,可以显示对于该图像进行操作的引导,如图17中示出的那样,引导中可以包括“保存到相册”控件、“保存到引导图像图库”控件以及“复 制”控件,用户可以点击上述“保存到引导图像图库”控件,手机可以响应于该操作,将该图像保存至引导图像图库中(如图17中示出的那样),或者显示如图15(b)中的显示界面,来引导用户将图像保存至对应分类的相册中。
需要说明的是,上述实施例中的控件设置以及显示内容仅为一种示意,本申请并不限定。
303、电子设备根据所述引导图像中的目标对象对所述第一图像中的目标对象通过神经网络进行增强,得到目标图像,所述目标图像包括增强后的目标对象,所述增强后的目标对象的清晰度大于所述第一图像中的目标对象的清晰度。
本申请实施例中,由于第一图像的图像质量较低(例如:人脸亮度或色调较差、人脸细节清晰度低、某一个或多个五官的亮度或色调较差或某一个或多个五官的细节清晰度低),则可以对拍摄得到的第一图像进行增强,例如点击如图4(c)中示出的“增强”控件,电子设备相当于接收到了增强指令。
可选地,在一种实施例中,用户需要对电子设备中存储在相册中的某一张图像进行增强,例如,在一种场景中,用户想向其他用户发送一张自拍,然而打开图像后发现其图像质量很低(例如:人脸亮度或色调较差、人脸细节清晰度低、某一个或多个五官的亮度或色调较差或某一个或多个五官的细节清晰度低),则用户可以打开相册,在相册中对要发送的自拍(第一图像)进行增强,例如点击如图12(c)中示出的“增强”控件,进而,电子设备相当于接收到了增强指令。
本申请实施例中,电子设备可以通过神经网络基于引导图像中的目标对象对第一图像中的目标对象进行增强。
需要说明的是,目标对象也可以理解为不同的人的相同五官,例如,第一图像为拍摄张三的正脸得到的,则第一图像中包括张三的眼睛(目标对象),相应的,引导图像为拍摄李四的正脸得到的,则引导图像中包括李四的眼睛,如果张三和李四的眼睛的姿态信息很相似,则引导图像中李四的眼睛也可以作为增强目标对象(张三的眼睛)的目标对象。
本申请实施例中，对图像进行增强的原则在于：在使得第一图像的图像质量提高的前提下，不会使得目标图像相比于第一图像过于失真。因此，在需要增强第一图像中的目标对象的情况中，作为第一图像的引导图像，其中的目标对象与第一图像中的目标对象的姿态差异不能过大，即所述引导图像中的目标对象的姿态信息和所述第一图像中的目标对象的姿态信息的差异度在预设范围内。
可选地,在一种实施例中,电子设备在接收到增强指令后,响应于增强指令,可以显示相册界面,来引导用户进行引导图像的选择,例如,如图15(c)中示出的那样,用户可以在图15(c)中示出的引导图像选择界面中选择一个引导图像,响应于用户的图像选择操作,电子设备可以获取到与图像选择操作相对应的引导图像。
可选地,在一种实施例中,若用户需要对第一图像中的全部人脸或部分人脸进行增强(关于用户如何选定需要增强的区域(目标对象)可以参照图4(d)及其对应的实施例描述,这里不再赘述),此时,第一图像包括人脸或部分人脸(目标对象),相应的,电子设备在获取到引导图像之后,会判断引导图像中是否有与第一图像中的目标对象相近的目标 对象。
接下来说明,如何判断第一图像中的目标对象和引导图像中的目标对象的姿态的差异度。
在一种实施例中,电子设备可以基于人脸关键点landmark检测方法来确定引导图像中是否有与第一图像中的人脸姿态相近的目标对象。其中,人脸关键点也可称为人脸特征点,通常包含了构成人脸五官(眉毛、眼睛、鼻子、嘴部以及耳朵)以及人脸轮廓的点。对人脸图像进行检测,标注人脸图像中的一个或多个关键点的方法,可以称为人脸关键点检测方法或者人脸对齐检测方法。通过对人脸图像进行人脸对齐检测,可以确定出人脸图像中的特征区域,此处的特征区域可以包括但不限于:眉毛区域、眼睛区域、鼻子区域、嘴部区域、耳朵区域,等等。
本申请实施例中,电子设备可以基于一种人脸关键点检测模型来实现目标对象和目标对象的姿态信息的差异度判断。具体的,在获取到第一图像和引导图像之后,可以调用该人脸关键点检测模型对第一图像和引导图像分别进行人脸检测,以确定第一图像和引导图像中的多个关键点以及各个关键点的标注信息,此处的关键点可以包括但不限于:嘴部关键点、眉毛关键点、眼睛关键点、鼻子关键点、耳朵关键点以及人脸轮廓关键点,等等;关键点的标注信息可以包括但不限于:位置标注信息(如标注出关键点所在的位置),形状标注信息(如标注为圆点形状)、特征信息,等等,其中特征信息用于表示该关键点的类别,如特征信息为眼睛的特征信息,则表明该关键点为眼睛的关键点,又如特征信息为鼻子的特征信息,则表明该关键点为鼻子的关键点等等。在第一图像和引导图像中确定出的多个关键点可以如图18中的灰色圆点所示。在确定了多个关键点之后,可以基于关键点的标注信息以及位置标注信息(例如可以是关键点的像素坐标),确定目标对象和目标对象的姿态信息的相似度。
需要说明的是,电子设备在基于目标对象和目标对象的特征点的像素坐标来比较姿态信息差异度之前,可以先对第一图像和引导图像进行裁剪处理,以使得目标对象在第一图像中的位置和姿态接近于目标对象在引导图像中的位置和姿态。可选的,以目标对象和目标对象为人脸为例,裁剪处理的圈定范围可以是眉毛以下(包含眉毛),下巴以上,左右以脸轮廓边缘为界(可以包含耳朵)。
可选的,若目标对象和目标对象的大小不同,则可以对裁剪后的图像进行缩放处理,以使得目标对象和目标对象的大小相同。
可选的，若第一图像中的目标对象和引导图像中的目标对象的朝向不同，则可以对裁剪后的图像进行旋转处理，以使得二者的朝向基本相同。旋转处理是指以第一图像或引导图像中的目标对象的中心点为原点，将该目标对象以某一旋转角度进行顺时针或者逆时针的旋转处理。
可选的,为了后续的像素配准,可以在圈定区域周围适当留有一定的区域。如图19中示出的那样,其中区域1903为电子设备圈定出的目标对象和目标对象的区域,区域1902为电子设备圈定区域周围适当留有一定的区域,相当于裁剪后的图像为图19中的区域1902和区域1903。
示例性的,如图20(a)和图20(b)中示出的那样,图20(a)示出了一种第一图像的示意图,图20(b)示出了一种引导图像的示意图,其中,目标对象和目标对象分别为第一图像和引导图像中的人脸,然而,图20(a)和图20(b)中的人脸的姿态差异过大,电子设备可以对图20(b)中的引导图像进行图像处理。
如图21(a)中示出的那样,可以先对目标对象进行旋转处理,以使得旋转后的目标对象和目标对象的位姿基本一致,如图21(b)示出的那样,图21(b)为旋转后的引导图像的示意。接下来可以对目标对象的大小进行缩放,以使得缩放后的目标对象的大小和目标对象的大小基本一致如图21(c)中示出的那样,图21(c)示出了缩放后后的引导图像的示意。
需要说明的是，上述处理仅为一种示意，也可以对图20(a)中的第一图像进行处理，本申请并不限定。
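下面给出一段与上文“先旋转、再缩放”的引导图像预处理相对应的Python（OpenCV）示意代码；其中的旋转角与缩放系数假设已根据两幅图像中人脸关键点的相对位置估计得到，具体估计方式不属于该代码的范围。
```python
import cv2

def align_guide(guide_img, angle_deg, scale, out_size):
    """将引导图像绕其中心旋转angle_deg度并按scale缩放，
    使引导图像中目标对象的位姿、大小与第一图像中的目标对象基本一致。
    out_size为(宽, 高)，通常取裁剪后的第一图像尺寸。"""
    h, w = guide_img.shape[:2]
    # getRotationMatrix2D可同时完成以图像中心为原点的旋转与各向同性缩放
    m = cv2.getRotationMatrix2D((w / 2.0, h / 2.0), angle_deg, scale)
    rotated = cv2.warpAffine(guide_img, m, (w, h))
    # 最后统一到与裁剪后的第一图像相同的尺寸，便于后续逐像素配准
    return cv2.resize(rotated, out_size, interpolation=cv2.INTER_LINEAR)
```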
示例性的,若目标对象为全部人脸,则电子设备可以基于标注信息获取到第一图像和引导图像中人脸范围内的关键点,以及人脸范围内的每个关键点对应的像素坐标,电子设备可以分别计算第一图像和引导图像中人脸范围内的每个关键点对应的像素坐标的差的平方和,如果上述计算得到的平方和超过预设的阈值,则认为第一图像中的目标对象和引导图像的目标对象的姿态信息的差异度过大。可选地,电子设备在确定第一图像中的目标对象和引导图像的目标对象的姿态信息的差异度过大(不在预设范围)后,可以提示用户重新选择引导图像,具体的,可以参照图10(d)及其对应的实施例中的描述,这里不再赘述。
示例性的,若目标对象为左眼,则电子设备可以基于标注信息获取到第一图像和引导图像中左眼范围内的关键点,以及左眼范围内的每个关键点对应的像素坐标,电子设备可以分别计算第一图像和引导图像中左眼范围内的每个关键点对应的像素坐标的差的平方和,如果上述计算得到的平方和超过预设的阈值,则认为第一图像中的目标对象和引导图像的目标对象的姿态信息的差异度过大。可选地,电子设备在确定第一图像中的目标对象和引导图像的目标对象的姿态信息的差异度过大(不在预设范围)后,可以提示用户重新选择引导图像,具体的,可以参照图10(d)及其对应的实施例中的描述,这里不再赘述。
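上述“对人脸范围（或某一五官范围）内关键点的像素坐标求差的平方和并与预设阈值比较”的姿态差异度判断，可以用如下Python代码草图表示；关键点检测假设已由任意人脸关键点检测模型完成，阈值取值亦为说明性假设。
```python
import numpy as np

def pose_difference(landmarks_a, landmarks_b, region_indices):
    """landmarks_a/landmarks_b: 形如(K, 2)的关键点像素坐标，两幅图中按相同顺序排列。
    region_indices: 参与比较的关键点下标（如整张人脸，或仅左眼范围内的关键点）。
    返回对应关键点坐标差的平方和。"""
    a = np.asarray(landmarks_a, dtype=np.float64)[region_indices]
    b = np.asarray(landmarks_b, dtype=np.float64)[region_indices]
    return float(np.sum((a - b) ** 2))

def pose_is_close(landmarks_a, landmarks_b, region_indices, thresh=1.0e4):
    """平方和超过预设阈值时认为姿态差异度过大，不宜作为引导图像。"""
    return pose_difference(landmarks_a, landmarks_b, region_indices) <= thresh
```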
可选地,在另一种实施例中,在得到多个关键点之后,可以根据多个关键点中的各个关键点的标注信息确定目标对象在第一图像中的特征区域,以及目标对象在引导图像中的特征区域。由前述可知,标注信息可以包括:特征信息、位置标注信息等。因此,在一个实施例中,可以根据各个关键点的特征信息确定特征区域。具体的,可以根据各关键点的特征信息确定各目标关键点的类别,将同一类别的目标关键点所构成的区域作为一个特征区域,并将类别作为该特征区域的类别。例如,选取特征信息全为鼻子的特征信息的关键点,这些关键点的类别都是鼻子关键点;将这些目标关键点所构成的区域作为鼻子区域。
在另一个实施例中,可以根据各个关键点的位置标注信息确定特征区域。具体的,可以先根据位置标注信息确定各个关键点的标注位置,将相邻位置的关键点连接起来,若连接所得到的形状与人脸的五官(眉毛、眼睛、鼻子、嘴部、耳朵)中的任意一种形状相似,则将这些相邻位置的关键点所构成的区域确定为特征区域,并根据该形状确定特征区域的类别。例如,若将相邻位置的目标关键点连接起来所得到的形状与鼻子的形状相似,则可以将这些相邻位置的关键点所构成的区域确定为鼻子区域。相应的,电子设备可以基于目标对象对应的特征区域形状和目标对象对应的特征区域形状的比对,确定目标对象和目标对象的姿态信息的差异度。
例如,若目标对象为人脸,由于人脸是由脸颊、左右眼、鼻子、嘴唇、左右耳、左右眉组成,因此,电子设备可以在第一图像中确定脸颊区域、左右眼区域、鼻子区域、嘴唇区域、左右耳区域和左右眉区域,且在引导图像中确定脸颊区域、左右眼区域、鼻子区域、嘴唇区域、左右耳区域和左右眉区域,并分别将第一图像和引导图像中的特征区域进行形状的比对:
将第一图像中的左眼区域与引导图像中的左眼区域进行形状比对;
将第一图像中的右眼区域与引导图像中的右眼区域进行形状比对;
将第一图像中的脸颊区域与引导图像中的脸颊区域进行形状比对;
将第一图像中的鼻子区域与引导图像中的鼻子区域进行形状比对;
将第一图像中的嘴唇区域与引导图像中的嘴唇区域进行形状比对;
将第一图像中的左耳区域与引导图像中的左耳区域进行形状比对;
将第一图像中的右耳区域与引导图像中的右耳区域进行形状比对;
将第一图像中的左眉区域与引导图像中的左眉区域进行形状比对;
将第一图像中的右眉区域与引导图像中的右眉区域进行形状比对;
并基于比对结果确定目标对象和目标对象的姿态信息的差异度,具体的,可以综合上述每种区域的比对结果来确定目标对象和目标对象的姿态信息的差异度,例如,当有一种区域的比对结果为差异度过大,则确定目标对象和目标对象的姿态信息的差异度过大,也可以是,当有一种区域的比对结果为差异度不大,则确定目标对象和目标对象的姿态信息的差异度不大,此时,后续可以只对差异度不大的五官区域进行增强。
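作为上述“按特征区域逐一进行形状比对”的一种可能实现，下面的Python（OpenCV）代码用关键点围成的轮廓计算两幅图中同名区域的形状差异；这里采用的cv2.matchShapes仅是多种形状比对方式中的一个示例，阈值为假设值。
```python
import cv2
import numpy as np

def region_shape_diff(region_pts_a, region_pts_b):
    """region_pts_a/region_pts_b: 同一类特征区域（如鼻子区域）在两幅图中的关键点坐标列表。
    返回基于Hu矩的形状差异值，数值越小表示形状越接近。"""
    ca = np.asarray(region_pts_a, dtype=np.float32).reshape(-1, 1, 2)
    cb = np.asarray(region_pts_b, dtype=np.float32).reshape(-1, 1, 2)
    return cv2.matchShapes(ca, cb, cv2.CONTOURS_MATCH_I1, 0.0)

def regions_all_close(regions_a, regions_b, thresh=0.3):
    """regions_a/regions_b: {区域名: 关键点列表}，逐区域比对并综合判断差异度。"""
    return all(region_shape_diff(regions_a[k], regions_b[k]) < thresh
               for k in regions_a if k in regions_b)
```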
需要说明的是,上述人脸关键点检测模型可以采用人脸对齐算法和样本数据集进行分级拟合训练得到的,此处的人脸对齐算法可以包括但不限于:机器学习回归算法,例如监督下降算法(supervised descent method,SDM)、局部二值特征(local binary features,LBF)算法;或者卷积神经网络(convolutional neural network,CNN)算法,例如基于深度多任务学习的人脸标志点检测(facial landmark detection by deep multi-task learning,TCDCN)算法、密集人脸对齐(3D dense face alignment,3DDFA)算法等等。基于这些算法,可以设计得到一个原始模型,然后基于原始模型和样本数据集进行训练后,最终可以得到人脸关键点检测模型。
需要说明的是,若电子设备判断出目标对象和目标对象的姿态信息的差异度过大时,可以提示用户重新进行引导图像的选择,如图15(d)中示出的那样,若用户选择第二张 图像作为第一图像的引导图像,电子设备可以确定出第一图像的目标对象(人脸)与引导图像的目标对象(人脸)的姿态信息差异度过大,因此,可以在如图10(d)示出的界面中提示用户姿态差过大,并引导用户重新进行引导图像的选择。
可选地,在一种实施例中,在用户进行引导图像的选择界面上,电子设备可以对于每张图像,计算其中的目标对象与目标对象在姿态信息、细节清晰度等等方面的接近程度,并提示给用户参考(在界面中显示或通过语音播放给用户)。
可选地,在一种实施例中,用户可以选择多个引导图像作为第一图像的引导图像。
需要说明的是,电子设备可以提供一个专门用户存储引导图像的图库,用户拍摄到人脸照片,或者从网络上下载人脸照片,将其中一部分质量较高(亮度优、细节清晰度高)的存放到存储引导图像的图库中。例如,用户拍摄到目标对象为自己的人脸照片,将其中一部分质量较高(亮度优、细节清晰度高)的存放到存储引导图像的图库中,之后用户再拍摄对象是自己的人脸照片,就可以用这个图库里的引导图像对其做引导增强。引导图像由用户自己收集积累,可以分门别类创建引导图像相库,支持时时更新和删除。引导图像存储区可以是电子设备本地的存储介质,也可以存到云上。具体可以参照图14(a)及其相关实施例的描述,这里不再赘述。
可选地，在一种实施例中，引导图像的选择可以由电子设备自动完成。可选地，电子设备可以选择与第一图像中的目标对象的姿态信息最接近的图像作为引导图像，或者综合其他标准考虑，比如亮度的动态范围DR，细节清晰度信息等等。若电子设备检测到多个包括与第一图像中的目标对象的姿态信息接近的目标对象的引导图像，则可以基于上述标准，或者随机筛选，或者通过界面呈现给用户供用户选择，来获取引导图像。本实施例中，所述引导图像中的目标对象的细节清晰度大于所述第一图像中的目标对象的细节清晰度。
本实施例中，电子设备可以通过神经网络，利用引导图像中的目标对象对第一图像中的目标对象进行增强。
具体的,电子设备可以首先将第一图像中的目标对象和引导图像中的目标对象进行像素配准,确定所述M个第一像素点中每个第一像素点对应的第二像素点,所述第二像素点为所述目标对象包括的像素点。
需要说明的是,电子设备可以首先对第一图像和第二图像划分网格,并将第一图像中网格的坐标点和第二图像中网格的坐标点进行配准,之后通过插值算法,计算出第一图像中的目标对象的像素点和引导图像中的目标对象的像素点的对应关系。
本申请实施例中,引导图像中的目标对象可以包括M个第一像素点,电子设备可以基于神经网络或者其他配准算法将所述目标对象和所述目标对象进行像素配准,以确定M个第一像素点中每个第一像素点对应的第二像素点,第二像素点为所述目标对象包括的像素点。
可选的,在一种实施例中,如图22(a)示出的那样,目标对象包括第一像素点A1,对第一像素点A1周围的像素点信息进行数学分析提取特征,相应的,对第一图像中的目标对象的像素点信息也进行数学分析提取特征,可以查找到目标对象上的一个第二像素点A2(如图22(b)示出的那样),由其周围的图像信息所提取的特征和该第一像素点A1周围 的图像信息所提取的特征最为匹配/相近,因此,可以确定第一像素点A1对应于第二像素点A2。
同理可以确定出M个第一像素点中每个第一像素点对应的引导图像中目标对象包括的第二像素点。
关于如何确定所述M个第一像素点中每个第一像素点对应的第二像素点,将在图25a至图25c以及对应的实施例中描述,这里不再赘述。
本实施例中,电子设备可以在所述第一图像中,将每个第二像素点与对应的第一像素点进行融合处理,得到目标图像。
本申请实施例中,可以基于上述得到的第二像素点和第一像素点的对应关系,在第一图像中,将第二像素点和对应的第一像素点进行像素融合,从而得到目标图像。
可选的,在一种实施例中,在确定了第二像素点和第一像素点的对应关系之后,可以确定每个第二像素点与对应的第一像素点之间的像素位移,并基于所述像素位移对每个第二像素进行平移,得到配准后的目标对象,配准后的目标对象还包括N个第三像素点,每个所述第三像素点为根据相邻的第一像素点的像素值通过插值生成的,所述N为正整数,将所述配准后的目标对象与所述目标对象进行融合,从而得到目标图像。
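上述“确定像素位移、对第二像素进行平移并通过插值生成第三像素点”的过程，可以用OpenCV的重映射（remap）来示意；下面代码中的稠密位移场flow假设已由配准算法（例如后文图25a至图25c所述的网络）给出，属于输入假设。
```python
import cv2
import numpy as np

def warp_guide_by_flow(guide_obj, flow):
    """guide_obj: 引导图像中裁剪出的目标对象(H, W, 3)。
    flow: (H, W, 2)的位移场，flow[y, x]表示第一图像坐标(x, y)处应从引导图像
    的(x + dx, y + dy)位置取值。cv2.remap的双线性插值即对应文中插值生成的像素。"""
    h, w = flow.shape[:2]
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x + flow[..., 0]).astype(np.float32)
    map_y = (grid_y + flow[..., 1]).astype(np.float32)
    return cv2.remap(guide_obj, map_x, map_y, interpolation=cv2.INTER_LINEAR)
```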
可选的,本申请实施例中,在进行第一图像中的目标对象和引导图像中的目标对象的融合时,电子设备可以获取所述第二像素点的高频信息;获取所述第一像素点的低频信息;将所述低频信息和所述高频信息进行融合处理。
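上述“取第一像素点的低频信息、第二像素点的高频信息并进行融合”的思路，可以用一个非常简化的高斯频带分解来示意（实际实施例中也可由图23(d)所示的编解码网络完成）；其中的高斯标准差等参数均为假设值。
```python
import cv2
import numpy as np

def fuse_low_high(first_obj, registered_guide_obj, sigma=3.0):
    """first_obj与registered_guide_obj假设已配准且尺寸一致，均为uint8彩色图。
    用高斯低通近似低频，用原图减低频近似高频，再把两者相加得到融合结果。"""
    a = first_obj.astype(np.float32)
    g = registered_guide_obj.astype(np.float32)
    low_first = cv2.GaussianBlur(a, (0, 0), sigma)        # 第一图像目标对象的低频信息
    high_guide = g - cv2.GaussianBlur(g, (0, 0), sigma)   # 引导图像目标对象的高频细节
    return np.clip(low_first + high_guide, 0, 255).astype(np.uint8)
```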
参照图23(a)和图23(b)示出的那样,图23(a)示出了一种目标对象的示意,图23(b)示出一种配准后的目标对象的示意,由图23(c)可知,配准后的目标对象和第一图像中的目标对象也有不重合的区域(B1和B2),此时,若将第一图像中的目标对象直接和配准后的目标对象进行融合,则会出现伪影,即配准后的目标对象的信息往第一图像中的目标对象上“贴”/融合时,“贴”/融合错了位置,因此,本申请中,可以只将配准后的目标对象中与第一图像中的目标对象重合的区域进行像素融合处理,而针对于配准后的目标对象中与第一图像中的目标对象不重合的区域,可以对该区域进行超分辨增强处理。即:所述第一图像中的目标对象包括第一区域,所述配准后的目标对象包括第二区域,所述第一区域与所述第二区域重合,将所述第一区域与所述第二区域的像素点进行融合处理。所述第一图像中的目标对象还包括第三区域,所述第三区域与所述配准后的目标对象错开,对所述第三区域进行超分辨增强处理。
如图23(d)所示,本申请实施例中的像素融合方法(用于细节增强)可以基于AI网络实现,例如可以通过训练,使得:
a.编码器1只负责编码图片的低频信息,对于高频信息自动滤去。
b.编码器2可编码出图片的高频和低频信息,其对应的解码器2可以将编码器2输出的高低频编码信息恢复为原输入图,其中编码低频信息在方式上要和编码器1相似。例如:
i.如果让配准后引导图像过编码器1,则输出结果和配准后的引导图像过编码器2输出的低频编码信息相似。
ii.将增强后的图像过编码器1,则输出结果和配准后的引导图像过编码器2输出的低 频编码信息相似。
本申请实施例中,将每个第二像素点与对应的第一像素点进行融合处理之后,可以在所述第一图像中对所述目标对象的边缘区域进行平滑处理。通过上述方式,使得目标图像中的目标对象的轮廓区域不会出现失真的情况,提高了图像增强的效果。
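对融合后目标对象的边缘区域进行平滑的一种常见做法，是用羽化后的掩膜在边界附近做加权过渡；下面的Python（OpenCV）代码仅为示意，目标对象掩膜obj_mask与羽化强度均为假设输入。
```python
import cv2
import numpy as np

def smooth_object_edge(original_img, fused_img, obj_mask, feather_sigma=15.0):
    """original_img: 原第一图像；fused_img: 已将融合后的目标对象贴回后的图像；
    obj_mask: 目标对象区域的0/255单通道掩膜。对掩膜做高斯羽化后按权重混合，
    使目标对象轮廓附近平滑过渡、不出现明显接缝。"""
    alpha = cv2.GaussianBlur(obj_mask.astype(np.float32) / 255.0, (0, 0), feather_sigma)
    alpha = alpha[..., None]  # 扩展为(H, W, 1)，便于与三通道图像相乘
    out = fused_img.astype(np.float32) * alpha + original_img.astype(np.float32) * (1.0 - alpha)
    return np.clip(out, 0, 255).astype(np.uint8)
```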
可选的,本申请实施例中,所述目标图像包括增强后的目标对象,所述增强后的目标对象的亮度的动态范围DR与所述引导图像中的所述目标对象的亮度的DR的差值小于所述第一图像中的目标对象的亮度的DR与所述目标对象的亮度的DR的差值。
可选的,本申请实施例中,所述目标图像包括增强后的目标对象,所述增强后的目标对象的色调与所述引导图像中的目标对象的色调的差值小于所述第一图像中的目标对象的色调与所述目标对象的色调的差值。
可选的，本申请实施例中，目标图像包括增强后的目标对象，增强后的目标对象的细节清晰度大于所述第一图像中的目标对象的细节清晰度。
可选的,在另一种实施例中,可以直接将直接将第一图像中的目标对象替换为引导图像中的目标对象,即增强后的目标对象可以直接为引导图像中的目标对象,本申请并不限定。
需要说明的是,图23(d)中示出的像素融合模块可以集成到解码器中,编解码器可以是基于传统算法实现,也可以基于AI网络实现。
需要说明的是,图23(d)中的模块的名称仅为一种示意,并不构成对本申请的限定,例如,像素融合模块也可以理解为编码融合模块,此处并不限定。
以上以人脸或人脸中的局部区域为目标对象,介绍了本申请实施例提供的一种图像增强方法,接下来以目标对象为月亮为例,介绍另一种图像增强方法。
电子设备可以获取包括目标对象为月亮的第一图像,如图23(e)中示出的那样,图23(e)中示出了一个第一图像的示意,该第一图像中包括月亮。
关于电子设备如何获取第一图像,可以参照上述实施例中的描述,这里并不限定。
本申请实施例中,电子设备在获取到第一图像后,可以检测到第一图像中包括月亮,具体的,电子设备可以基于训练好的AI网络来检测第一图像是否包括月亮,本申请并不限定。
电子设备可以获取包括目标对象为月亮的引导图像,和上述对人脸区域进行增强有所不同的是,由于月球的自转周期和绕地球转动的周期相等,使得它总以同一面对着地球,因此第一图像和引导图像中的月亮在没有遮挡的情况下,姿态信息(纹理特征)是基本一致的,因此电子设备可以不用进行姿态信息是否相似的判断,如图23(f)中示出的那样,图23(f)中示出了一个引导图像的示意,该引导图像中包括月亮。
需要说明的是,有些场景不适合做引导增强的,也可以通过场景判断进行排除,比如,若第一图像中的月亮被云雾、建筑、或者其他景和物遮挡严重,则可以提示用户第一图像不适合做引导增强。
需要说明的是,电子设备可以自动选择引导图像。若不忽略天平动导致的影响,人们在地上能看见的月亮中的一部分表面是不断变化的,此时,电子设备可以通过日期时间地 点的同步,推知当晚实际能见的月面,而从引导图像图库/相册等等中挑出引导图像,其中,引导图像包括的月亮的姿态信息与当晚实际能见的月亮表面的姿态信息接近。
需要说明的是,电子设备可以参考第一图像中月亮所处的场景来选择引导图像。示例性的,若第一图像中月亮所处的环境为夜空下的欧洲古代城堡,则可以选择包括血狼月的引导图像作为第一图像的引导图像。
电子设备可以通过引导图像中包括的月亮对第一图像中包括的月亮进行增强,得到目标图像。
可选的,在一种实施例中,电子设备可以获取第一图像中月亮的区域A,以及引导图像中月亮的区域B,并将月亮的区域A向月亮的区域B进行配准,使得配准后的区域A和区域B基本上完全重合,为了方便叙述,以下将第一图像称为图A,引导图像称为图R,第一图像中月亮的区域A称为图a,引导图像中月亮的区域B称为图r。
可选的,在一种实施例中,可以首先将图a进行平移,以使得平移后图a中月亮的中心(或圆心)与图r中的月亮的中心(或圆心)重合,得到图b,进一步的,可以以图b中心为原点建立平面坐标,记坐标的x轴(或y轴)与图片水平线的夹角为theta,将图b往该坐标的x轴以及y轴方向做伸缩,并选择合适的theta以及缩放系数,可以让图b中的月亮的区域A和图r中的月亮的区域B精确配准,得到图c。
需要说明的是,如果图A中的月相不是圆月,或者是月亮上有一定的遮挡的场景,则只要满足:图c中的月亮区域是残缺的正圆形,即留有正圆形的轮廓,按该轮廓正圆轨迹延伸可将此月亮区域复原为正圆形,复原结果和图r中的月亮区域(也是正圆形)基本上完全重合,即可以认为进行了成功的配准。
本申请实施例中,参照由图a到图c的变换,同样作用于图A,得到图C,将图C绕图中心旋转直到月亮纹理位置和图R中的重合,得图D。旋转角记为q。
本申请实施例中，可以计算出图A直接变换到图D的仿射变换矩阵W，以及仿射变换矩阵W的逆矩阵W -1，并将W作用于图a得图d，将W -1作用于图r得图p。比较图d和图r、图p和图a。如果月亮区域差异大，说明配准失败，停止后续的引导增强，系统报错（提示用户增强失败）。
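上文中由平移（对齐圆心）、沿与水平线夹角为theta的坐标轴做伸缩、再绕图中心旋转q角组成的配准变换，可以合成为一个2×3的仿射变换矩阵W；下面给出一段基于NumPy/OpenCV的Python示意代码，仅演示矩阵的合成与应用，theta、两个方向的缩放系数以及q均假设已通过搜索或优化得到，属于说明性假设。
```python
import cv2
import numpy as np

def _h(mat2x3):
    """把2x3仿射矩阵补成3x3齐次矩阵，便于级联相乘。"""
    return np.vstack([mat2x3, [0.0, 0.0, 1.0]])

def compose_moon_affine(center_a, center_r, theta_deg, sx, sy, q_deg, img_center):
    """center_a/center_r: 第一图像与引导图像中月亮圆心坐标(x, y)；
    theta_deg: 伸缩坐标轴与图像水平线的夹角；sx, sy: 沿该坐标轴两个方向的缩放系数；
    q_deg: 纹理对齐所需的旋转角；img_center: 旋转中心。返回2x3仿射矩阵W。"""
    # 1) 平移：使两幅图中月亮的圆心重合
    t = np.array([[1.0, 0.0, center_r[0] - center_a[0]],
                  [0.0, 1.0, center_r[1] - center_a[1]]])
    # 2) 绕圆心、沿夹角为theta的坐标轴做各向异性伸缩：R(theta)·diag(sx, sy)·R(-theta)
    th = np.deg2rad(theta_deg)
    r = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
    s = r @ np.diag([sx, sy]) @ r.T
    c = np.array([[center_r[0]], [center_r[1]]], dtype=np.float64)
    stretch = np.hstack([s, c - s @ c])
    # 3) 绕图像中心旋转q角，使月亮纹理方向与引导图像一致
    rot_q = cv2.getRotationMatrix2D((float(img_center[0]), float(img_center[1])), q_deg, 1.0)
    return (_h(rot_q) @ _h(stretch) @ _h(t))[:2]

# 用法示意：W作用于图A得到图D，W的逆作用于引导图像R得到图P
# w = compose_moon_affine(...); w_inv = cv2.invertAffineTransform(w)
# img_d = cv2.warpAffine(img_a, w, (img_a.shape[1], img_a.shape[0]))
# img_p = cv2.warpAffine(img_r, w_inv, (img_a.shape[1], img_a.shape[0]))
```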
可选的,比较的标准可以是满足如下条件:
条件1:图d的月亮区域超出图r的月亮区域轮廓线外的面积小于某个阈值;
条件2:图d的月亮区域轮廓线与图r的月亮区域轮廓线之间的最小距离小于某个阈值;
条件3:图p的月亮区域超出图a的月亮区域轮廓线外的面积小于某个阈值;
条件4:图p的月亮区域轮廓线与图a的月亮区域轮廓线之间的最小距离小于某个阈值。
需要说明的是,如果设定仅处理圆月或准圆月(包括轻微遮挡)场景,还可以加上如下条件:
条件5:图d和图r的月亮区域交集面积除以图r的月亮区域面积,该值应大于某个阈值;
条件6:图p和图a的月亮区域交集面积除以图a的月亮区域面积,该值大于某个阈值。
本申请实施例中,可以将W -1作用于图R,得到图P。再进行后处理。最后将结果嵌(融合)回原照片。将图P、图p、图A、图a缩放到原始裁剪图A缩放前的大小。
首先,为了图P与图A完美融合,需要获取图P和图A中月亮的交集:
图p1=图p∩图a;图P1=图P∩图p1;图A1=图A∩图p1;
为了保留图L(或者说图A)中原始月亮的一些细节特征,也就是将图A1的一些细节特征赋给图P1,可以进行如下运算:
图M=图p1/255.0;
lM=e_protect/10.0+图M像素值总和,其中数值保护值e_protect可以取1.0/255.0;
l0max=图A1的最大像素值/255.0;
l0=图A1的像素值总和/255.0/lM;
图tmp=图A1+(255-图p1);
l0min=图tmp的最小像素值/255.0;
图T=(图A1/255.0+l0*(1.0-图M)-l0)*Amp/(l0max-l0min+e_protect),其中Amp为可调参数,控制继承图L中原始月亮的一些细节特征的强度,若取0则是不继承。
图IMG=(图P1*l0/l1+图T).*图M,其中,.*表示元素点点对应相乘;
Lmax=图IMG的最大像素值;
图IMGs=(UINT8)(图IMG),如果Lmax小于等于255,否则等于(UINT8)(255.0*图IMG/Lmax),其中,UINT8是指对图中像素值进行相应的数据类型转换。
将图p1进行虚化处理,即进行一定次数的上下采样和模糊blur等操作,得到图p1v,于是,后处理保留图L中原始月亮的一些细节特征的输出结果为:
图IMGs.*图p1v/255.0+图A.*(1.0-图p1v/255.0)。将其代替图A嵌回图L,完成月亮增强。
如果图片按yuv格式进行处理,则上述结果只用于y通道。如果是rgb格式,则r、g、b三通道都用。
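作为对上述y通道融合公式的一种示意性整理，下面给出一段NumPy/OpenCV的Python代码草图；其中把文中的“∩”理解为掩膜取交集、把未在文中给出定义的l1假设为图P1上与l0同型的亮度统计量、把虚化处理简化为一次高斯模糊，这些均为说明性假设，并非对本申请实施例的限定。
```python
import cv2
import numpy as np

def moon_detail_fusion(img_a, img_p, mask_a, mask_p, amp=1.0):
    """img_a: 图A（原图y通道，uint8）；img_p: 图P（引导图经W的逆变换后的y通道）；
    mask_a/mask_p: 图a、图p对应的月亮区域掩膜（0/255）。返回融合后的y通道。"""
    e_protect = 1.0 / 255.0
    a = img_a.astype(np.float64)
    p = img_p.astype(np.float64)

    p1 = np.minimum(mask_a, mask_p).astype(np.float64)   # 图p1 = 图p ∩ 图a
    m = p1 / 255.0                                       # 图M
    a1 = np.where(p1 > 0, a, 0.0)                        # 图A1
    p1_img = np.where(p1 > 0, p, 0.0)                    # 图P1

    lm = e_protect / 10.0 + m.sum()
    l0max = a1.max() / 255.0
    l0 = a1.sum() / 255.0 / lm
    l1 = p1_img.sum() / 255.0 / lm                       # 假设：与l0对应的图P1统计量
    l0min = (a1 + (255.0 - p1)).min() / 255.0            # 图tmp的最小像素值/255.0

    t = (a1 / 255.0 + l0 * (1.0 - m) - l0) * amp / (l0max - l0min + e_protect)
    img = (p1_img * l0 / l1 + t) * m
    lmax = img.max()
    imgs = img if lmax <= 255.0 else 255.0 * img / lmax  # 对应转换为UINT8前的归一

    p1v = cv2.GaussianBlur(p1, (0, 0), 5.0)              # 图p1v：掩膜虚化（此处简化为高斯模糊）
    out = imgs * p1v / 255.0 + a * (1.0 - p1v / 255.0)
    return np.clip(out, 0, 255).astype(np.uint8)
```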
一般来说,若要偏重于维护美化后的真实感,美化只用引导图像中月亮的(高清)纹理细节,而不用其颜色。所以一般是在yuv格式下,对y通道进行月亮超级美化处理,而保留uv颜色通道的不变,即结果会继承图L中原始月亮的颜色。但如果要用引导图像中月亮的颜色信息,则可以按照以下步骤进行:
记图L中原始月亮uv通道颜色信息为UVL,引导图像中月亮uv通道颜色信息为UVR,UVR经过矩阵W -1变换后成为UVP,记UVP(或者说是UVR)的中位数为uvp(包含u通道和v通道各一个)。要给图L的月亮区域用引导图像中月亮的颜色上色可以是:对图P1的月亮区,用对应的UVP;而对图L的月亮区到图P1的月亮区,用uvp填充,或者用图P1的边缘的UVP信息值向外等值扩张填满图L的月亮区到图P1的月亮区。上色完后,记此时图L的月亮uv通道颜色信息为UVf,UVf要和图L月亮区外的uv通道颜色信息进行融合,于是增强后,最终代替图A的uv嵌回图L的uv通道颜色信息为:
UVf.*图av/255.0+UVA.*(1.0–图av/255.0),其中图av是图a的虚化处理结果。
需要说明的是,还可以进行后处理,其中,后处理还可以包括比如去模糊deblur,背景降噪等等,使增强效果更优,如图23(g)中示出的那样,图23(g)中示出了一个目标图像的示意。
本申请实施例提供了一种图像增强方法,包括:获取第一图像,所述第一图像包括目标对象;获取引导图像,所述引导图像包括所述目标对象,所述引导图像中的目标对象的清晰度大于所述第一图像中的目标对象的清晰度;根据所述引导图像中的目标对象对所述第一图像中的目标对象通过神经网络进行增强,得到目标图像,所述目标图像包括增强后的目标对象,所述增强后的目标对象的清晰度大于所述第一图像中的目标对象的清晰度。通过上述方式,通过神经网络将引导图像对待增强图(第一图像)进行增强,由于借鉴了引导图像中的信息,相比传统人脸增强技术中直接对待增强图像进行处理,不会出现失真的情况,增强效果更好。
本申请还提供了一种图像增强方法,参照图24,图24为本申请实施例提供的一种图像增强方法的实施例示意图,如图24中示出的那样,本实施例提供的图像增强方法包括:
2401、服务器接收电子设备发送的第一图像,所述第一图像包括目标对象。
2402、服务器根据所述第一图像获取引导图像,所述引导图像包括所述目标对象,所述引导图像中的目标对象的清晰度大于所述第一图像中的目标对象的清晰度。
2403、服务器根据所述引导图像中的目标对象对所述第一图像中的目标对象通过神经网络进行增强,得到目标图像,所述目标图像包括增强后的目标对象,所述增强后的目标对象的清晰度大于所述第一图像中的目标对象的清晰度。
2404、服务器向所述电子设备发送所述目标图像。
关于服务器如何执行上述图像增强方法可以参照上述实施例中电子设备执行步骤的描述,相似之处不再赘述。
请先参阅图25a,图25a为本申请实施例提供的图像增强系统的一种系统架构图,在图25a中,图像增强系统2500包括执行设备2510、训练设备2520、数据库2530、客户设备2540和数据存储系统2550,执行设备2510中包括计算模块2511。客户设备2540可以是上述实施例中的电子设备,执行设备可以是上述实施例中的电子设备或服务器。
其中,数据库2530中存储有图像集合,训练设备2520生成用于处理第一图像和引导图像的目标模型/规则2501,并利用数据库中的图像集合对目标模型/规则2501进行迭代训练,得到成熟的目标模型/规则2501。本申请实施例中以目标模型/规则2501为卷积神经网络为例进行说明。
训练设备2520得到的卷积神经网络可以应用不同的系统或设备中,例如手机、平板、笔记本电脑、VR设备、服务器的数据处理系统等等。其中,执行设备2510可以调用数据存储系统2550中的数据、代码等,也可以将数据、指令等存入数据存储系统2550中。数据存储系统2550可以置于执行设备2510中,也可以为数据存储系统2550相对执行 设备2510是外部存储器。
计算模块2511可以通过卷积神经网络对客户设备2540获取的第一图像和引导图像进行卷积操作,在提取到第一特征平面和第二特征平面后,可以将第一特征平面和第二特征平面进行拼接。基于对所述第一特征平面和所述第二特征平面执行卷积操作,确定所述M个第一像素点中每个第一像素点对应的第二像素点。
本申请的一些实施例中，请参阅图25a，执行设备2510和客户设备2540可以为分别独立的设备，执行设备2510配置有I/O接口2512，与客户设备2540进行数据交互，“用户”可以通过客户设备2540向I/O接口2512输入第一图像和引导图像，执行设备2510通过I/O接口2512将目标图像返回给客户设备2540，提供给用户。
值得注意的,图25a仅是本发明实施例提供的两种图像增强系统的架构示意图,图中所示设备、器件、模块等之间的位置关系不构成任何限制。例如,在本申请的另一些实施例中,执行设备2510可以配置于客户设备2540中,作为示例,例如当客户设备为手机或平板时,执行设备2510可以为手机或平板的主处理器(Host CPU)中用于进行阵列图像处理的模块,执行设备2510也可以为手机或平板中的图形处理器(graphics processing unit,GPU)或者神经网络处理器(NPU),GPU或NPU作为协处理器挂载到主处理器上,由主处理器分配任务。
接下来介绍本申请实施例所采用的卷积神经网络,卷积神经网络是一种深度学习(deep learning)架构,深度学习架构是指通过机器学习的算法,在不同的抽象层级上进行多个层次的学习。作为一种深度学习架构,CNN是一种前馈(feed-forward)人工神经网络,该前馈人工神经网络中的各个神经元对输入其中的图像中的重叠区域作出响应。其中,卷积神经网络在逻辑上可以包括输入层,卷积层以及神经网络层,但由于输入层和输出层的作用主要是为了方便数据的导入和导出,随着卷积神经网络的不断发展,在实际应用中,输入层和输出层的概念逐渐被淡化,而是通过卷积层来实现输入层和输出层的功能,当然,高维卷积神经网络中还可以包括其他类型的层,具体此处不做限定。
卷积层:
卷积层的输出可以作为随后的池化层的输入,也可以作为另一个卷积层的输入以继续进行卷积操作。卷积层可以包括很多个卷积核,卷积核也可以称为滤波器(filter)或者卷积算子,用于从输入的阵列矩阵(也即数字化的阵列图像)中提取特定信息。一个卷积核本质上可以是一个权重矩阵,这个权重矩阵通常被预先定义,每个权重矩阵的大小应该与一个阵列图像中每个角度图像的大小相关,在对阵列图像进行卷积操作的过程中,权重矩阵通常在阵列图像的每个角度图像上沿着水平方向一个像素接着一个像素(或两个像素接着两个像素……这取决于步长stride的取值)的进行处理,从而完成从图像中提取特定特征的工作。这些权重矩阵中的权重值在实际应用中需要经过大量的训练得到,通过训练得到的权重值形成的各个权重矩阵可以从输入的角度图像中提取信息,从而帮助高维卷积神经网络进行正确的预测。
需要注意的是,权重矩阵的纵深维度(depth dimension)和输入的阵列图像的纵深维度是相同的,在进行卷积运算的过程中,权重矩阵会延伸到输入图像的整个深度。因此, 和单一纵深维度的权重矩阵进行卷积会产生单一纵深维度的卷积化输出,但是大多数情况下不使用单一纵深维度权重矩阵,而是采用不同纵深维度的权重矩阵提取图像中不同的特征,例如一个纵深维度的权重矩阵用来提取图像边缘信息,另一个纵深维度的权重矩阵用来提取图像的特定颜色,又一个纵深维度的权重矩阵用来对图像中不需要的噪点进行模糊化……该多个权重矩阵维度相同,经过该多个维度相同的权重矩阵提取后的特征平面维度也相同,再将提取到的多个维度相同的特征图合并形成卷积运算的输出。
具体的,作为一个示例,请参阅图25b,图25b为本申请实施例提供的卷积核对图像执行卷积操作的一种示意图,图25b中以6×6的图像、2×2的卷积模块为例进行说明,其中,s指的是图像在角度维度的水平方向上的坐标,t指的是图像在角度维度的竖直方向上的坐标,x指的是在一个图像中的水平方向上的坐标,y指的是在一个图像中的竖直方向上的坐标,通过(x,y,s,t)可以确定图像上的一个像素点,m指的是多个卷积模块在角度维度上的水平方向上的坐标,n指的是多个卷积模块在角度维度上的竖直方向上的坐标,p指的是在一个卷积模块中的水平方向上的坐标,q指的是在一个卷积模块中的竖直方向上的坐标,通过(m,n,p,q)可以从多个卷积模块中确定一个卷积核。
神经网络层:
在经过卷积层/池化层的处理后,高维卷积神经网络还不足以输出所需要的输出信息。因为如前所述,卷积层/池化层只会提取特征,并减少输入图像带来的参数。然而为了生成最终的输出信息(所需要的类信息或别的相关信息),卷积神经网络需要利用神经网络层来生成一个或者一组所需要的类的数量的输出。因此,在神经网络层中可以包括多层隐含层,该多层隐含层中所包含的参数可以根据具体的任务类型的相关训练数据进行预先训练得到,例如该任务类型可以包括图像识别,图像分类,图像超分辨率重建等等。
可选的,在一种实施例中,可以基于上述神经网络对第一图像中的目标对象和引导图像中的目标对象进行卷积操作,具体的,可以将裁剪后的引导图像(包括目标对象)和裁剪后的第一图像(包括目标对象)缩放到同样的特定尺寸,输入网络。对于裁剪后的第一图像而言,缩放后的图像大小变为(D+d)*(D+d),其中,D为中心区域边长,d为留边区域的边长。
本申请实施例中，可以将中心D×D区域平均划分为N×N块，以每块中心作为基础网格点，留边区域宽度d为配准中设定的像素最大允许位移值，为网络设计方便，可选的，可令d是D/N的M倍（整数倍）。这样，裁剪后的第一图像和引导图像就平均划分为了(2M+N)×(2M+N)块。
如图25c中示出的那样,将裁剪后的第一图像(包括留边区域)和引导图像(包括留边区域)基于卷积层集CNNgG和CNNgL进行卷积操作,分别提取特征Gcf和特征Lcf,该特征可以是轮廓线特征。
将Gcf和Lcf进行拼接,设计卷积层集CNNg2,对拼接后的Gcf和Lcf进行卷积处理得到GLcf。
网络倒数第二段,设计卷积层集CNNgs和CNNgc分别处理GLcf,输出特征GLcfs和特征GLcfc,可选的,特征GLcfs和特征GLcfc的边长比为(2M+N):(2M+N-1)。
基于网络末段卷积层集CNNgf,处理特征GLcfs和特征GLcfc,得到尺寸为(2M+N)×(2M+N)×2的特征GLcfsf和尺寸为(2M+N-1)×(2M+N-1)×2的特征GLcfcf。取GLcfsf中心的N×N×2和GLcfcf中心的(N-1)×(N-1)×2,即输出的“N×N”基础网格及“(N-1)×(N-1)”内嵌网格的网格点对应的坐标点位移,其中,×2可以指位移有x和y两方向,格点位移的含义是:引导图像向待增强图上配准,格点位置的坐标点所应该有的位移。
本申请实施例中,可以由格点坐标的位移,插值出各像素点的位移。
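下面给出一段与图25c思路对应、但经过大量简化的PyTorch代码草图：两个卷积分支分别提取裁剪后的第一图像与引导图像的特征，拼接后由卷积头回归N×N基础网格点在x、y两个方向的位移，再通过双线性插值得到逐像素位移场；网络层数、通道数以及“(N-1)×(N-1)内嵌网格”等细节在此被省略或简化，均属于说明性假设。
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GridRegNet(nn.Module):
    """简化的配准网络：输出形状为(B, 2, N, N)的基础网格点位移。"""

    def __init__(self, n_grid=16, ch=32):
        super().__init__()
        def branch():
            return nn.Sequential(
                nn.Conv2d(3, ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.ReLU(inplace=True))
        self.feat_guide = branch()   # 对应CNNgG：提取引导图像特征Gcf
        self.feat_first = branch()   # 对应CNNgL：提取第一图像特征Lcf
        self.head = nn.Sequential(   # 对应CNNg2及后续卷积层集的简化
            nn.Conv2d(2 * ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, 2, 3, padding=1))
        self.n_grid = n_grid

    def forward(self, first_img, guide_img):
        f = torch.cat([self.feat_guide(guide_img), self.feat_first(first_img)], dim=1)
        disp = self.head(f)
        # 把位移图规整到N×N的基础网格尺寸
        return F.adaptive_avg_pool2d(disp, self.n_grid)

def dense_flow_from_grid(grid_disp, out_h, out_w):
    """由网格点位移插值出逐像素位移场，返回(B, 2, out_h, out_w)。"""
    return F.interpolate(grid_disp, size=(out_h, out_w), mode='bilinear', align_corners=True)
```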
本申请实施例中,上述网格点可以为每次卷积操作对应的卷积核在引导图像中的感受野范围的几何中心,或者为距离几何中心不远的像素位置(网格点和感受野范围的几何中心之间的间隔小于预设值),这里并不限定。其中,在卷积神经网络中,感受野(receptive field)可以是卷积神经网络每一层输出的特征图(feature map)上的像素点在输入图片上映射的区域范围。
需要说明的是,感受野的计算范围也可以将第一图像往外无限延拓,确保在到达第一图像的边界时,感受野范围不被第一图像的边界截断。
其中,本申请中可以是最后一层卷积操作中,卷积核在引导图像中的感受野范围,需要说明的是,本申请中,感受野可以包括在卷积操作中特征层边缘补0的区域。
本申请实施例还提供一种电子设备,请参阅图26,图26为本申请实施例提供的电子设备的一种结构示意图,电子设备包括:
获取模块2601,用于获取第一图像,所述第一图像包括目标对象;根据所述第一图像获取引导图像,所述引导图像包括所述目标对象,所述引导图像中的目标对象的清晰度大于所述第一图像中的目标对象的清晰度;
处理模块2602,用于根据所述引导图像中的目标对象对所述第一图像中的目标对象通过神经网络进行增强,得到目标图像,所述目标图像包括增强后的目标对象,所述增强后的目标对象的清晰度大于所述第一图像中的目标对象的清晰度。
可选的,所述引导图像中的目标对象的姿态与所述第一图像中的目标对象的姿态的差异度在预设范围内。
可选的,所述获取模块2601,具体用于:
根据所述第一图像中的目标对象的姿态与至少一个第二图像中的每个第二图像的姿态的差异度,从所述至少一个第二图像中确定所述引导图像。
可选的，所述电子设备还包括：
显示模块2603,用于显示第一图像选择界面,所述第一图像选择界面包括至少一个图像;
接收模块2604,用于接收第一图像选择指令,所述第一图像选择指令表示从所述第一图像选择界面包括的至少一个图像中选择所述至少一个第二图像。
可选的,所述处理模块,具体用于:
根据所述第一图像中的目标对象的姿态确定至少一个第三图像,所述至少一个第三图像中的每个第三图像包括目标对象,且每个第三图像包括的目标对象的姿态与所述第一图 像中的目标对象的姿态之间的差异度在预设范围内;
所述显示模块,还用于显示第二图像选择界面,所述第二图像选择界面包括所述至少一个第三图像;
所述接收模块,还用于接收第二图像选择指令,所述第二图像选择指令表示从所述第二图像选择界面包括的至少一个第三图像中选择所述引导图像。
可选的,所述目标图像包括增强后的目标对象,所述增强后的目标对象的引导图像特征比所述第一图像中的目标对象接近于所述引导图像中的目标对象,其中,所述引导图像特征至少包括如下一种图像特征:
亮度的动态范围、色调、对比度、饱和度、纹理信息和轮廓信息。
可选的，所述目标图像包括增强后的目标对象，所述增强后的目标对象的姿态与所述第一图像中的目标对象的姿态的差异度在预设范围内。
可选的,所述显示模块2603,还用于:
显示相机的拍摄界面;
所述获取模块2601,具体用于接收用户的拍摄操作,响应于所述拍摄操作,获取所述第一图像;
或,所述显示模块2603,还用于:
显示相机的相册界面,所述相册界面包括多个图像;
所述获取模块2601,具体用于接收第三图像选择指令,所述第三图像选择指令表示从所述相册界面包括的多个图像中选择所述第一图像。
可选的,所述获取模块2601,具体用于:
接收服务器发送的所述引导图像。
可选的,所述处理模块2602,具体用于获取每个第二像素点的高频信息;获取每个第一像素点的低频信息,所述第二像素点为所述引导图像中的像素点,所述第一像素点为所述第一图像的像素点;将所述低频信息和对应的高频信息进行融合处理。
可选的,所述处理模块2602,还用于将每个第二像素点与对应的第一像素点进行融合处理之后,在所述第一图像中对所述目标对象的边缘区域进行平滑处理。
可选的,所述处理模块2602,还用于确定每个第二像素点与对应的第一像素点之间的像素位移;基于所述像素位移对每个第二像素进行平移,得到配准后的目标对象。
可选的,所述处理模块2602,具体用于将所述配准后的目标对象与所述目标对象进行融合。
可选的,所述目标对象包括第一区域,所述配准后的目标对象包括第二区域,所述第一区域与所述第二区域重合,所述处理模块2602,具体用于将所述第一区域与所述第二区域的像素点进行融合处理。
可选的,所述目标对象还包括第三区域,所述第三区域与所述配准后的目标对象错开,所述处理模块2602,还用于对所述第三区域进行超分辨增强处理。
可选的,所述配准后的目标对象还包括N个第三像素点,每个所述第三像素点为根据相邻的第一像素点的像素值通过插值生成的,所述N为正整数。
可选的,所述处理模块2602,具体用于对所述第一图像执行卷积操作,得到第一特征平面;对所述引导图像执行卷积操作,得到第二特征平面;基于对所述第一特征平面和所述第二特征平面执行卷积操作,确定所述M个第一像素点中每个第一像素点对应的第二像素点,其中,每个网格点的坐标位置与一次卷积操作对应的卷积核的几何中心的间隔小于预设值。
本申请实施例还提供一种服务器,请参阅图27,图27为本申请实施例提供的服务器的一种结构示意图,服务器包括:
接收模块2701,用于接收电子设备发送的第一图像,所述第一图像包括目标对象;获取引导图像,所述引导图像包括所述目标对象,所述引导图像中的目标对象的清晰度大于所述第一图像中的目标对象的清晰度;
处理模块2702,用于根据所述引导图像中的目标对象对所述第一图像中的目标对象通过神经网络进行增强,得到目标图像,所述目标图像包括增强后的目标对象,所述增强后的目标对象的清晰度大于所述第一图像中的目标对象的清晰度;
发送模块2703,用于向所述电子设备发送所述目标图像。
可选的,所述引导图像中的目标对象的姿态与所述第一图像中的目标对象的姿态的差异度在预设范围内。
可选的,所述接收模块2701,具体用于:
根据所述第一图像中的目标对象的姿态与至少一个第二图像中的每个第二图像的姿态的差异度,从所述至少一个第二图像中确定所述引导图像。
可选的,所述目标图像包括增强后的目标对象,所述增强后的目标对象的引导图像特征比所述第一图像中的目标对象接近于所述引导图像中的目标对象,其中,所述引导图像特征至少包括如下一种图像特征:
亮度的动态范围、色调、对比度、饱和度、纹理信息和轮廓信息。
可选的，所述目标图像包括增强后的目标对象，所述增强后的目标对象的姿态与所述第一图像中的目标对象的姿态的差异度在预设范围内。
接下来介绍本申请实施例提供的一种电子设备,请参阅图28,图28为本申请实施例提供的电子设备的一种结构示意图,电子设备2800具体可以表现为虚拟现实VR设备、手机、平板、笔记本电脑、智能穿戴设备等,此处不做限定。具体的,电子设备2800包括:接收器2801、发射器2802、处理器2803和存储器2804(其中电子设备2800中的处理器2803的数量可以一个或多个,图28中以一个处理器为例),其中,处理器2803可以包括应用处理器28031和通信处理器28032。在本申请的一些实施例中,接收器2801、发射器2802、处理器2803和存储器2804可通过总线或其它方式连接。
存储器2804可以包括只读存储器和随机存取存储器,并向处理器2803提供指令和数据。存储器2804的一部分还可以包括非易失性随机存取存储器(non-volatile random access memory,NVRAM)。存储器2804存储有处理器和操作指令、可执行模块或者数据结 构,或者它们的子集,或者它们的扩展集,其中,操作指令可包括各种操作指令,用于实现各种操作。
处理器2803控制电子设备的操作。具体的应用中,电子设备的各个组件通过总线系统耦合在一起,其中总线系统除包括数据总线之外,还可以包括电源总线、控制总线和状态信号总线等。但是为了清楚说明起见,在图中将各种总线都称为总线系统。
上述本申请实施例揭示的方法可以应用于处理器2803中,或者由处理器2803实现。处理器2803可以是一种集成电路芯片,具有信号的处理能力。在实现过程中,上述方法的各步骤可以通过处理器2803中的硬件的集成逻辑电路或者软件形式的指令完成。上述的处理器2803可以是通用处理器、数字信号处理器(digital signal processing,DSP)、微处理器或微控制器,还可进一步包括专用集成电路(application specific integrated circuit,ASIC)、现场可编程门阵列(field-programmable gate array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。该处理器2803可以实现或者执行本申请实施例中的公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。结合本申请实施例所公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器2804,处理器2803读取存储器2804中的信息,结合其硬件完成上述方法的步骤。
接收器2801可用于接收输入的数字或字符信息,以及产生与电子设备的相关设置以及功能控制有关的信号输入。发射器2802可用于通过第一接口输出数字或字符信息;发射器2802还可用于通过第一接口向磁盘组发送指令,以修改磁盘组中的数据;发射器2802还可以包括显示屏等显示设备。
本申请实施例中,在一种情况下,处理器2803,用于执行上述实施例中的图像增强方法中与处理相关的步骤。
本申请实施例还提供了一种服务器,请参阅图29,图29是本申请实施例提供的服务器的一种结构示意图,服务器可因配置或性能不同而产生比较大的差异,可以包括一个或一个以上中央处理器(central processing units,CPU)2922(例如,一个或一个以上处理器)和存储器2932,一个或一个以上存储应用程序2942或数据2944的存储介质2930(例如一个或一个以上海量存储设备)。其中,存储器2932和存储介质2930可以是短暂存储或持久存储。存储在存储介质2930的程序可以包括一个或一个以上模块(图示没标出),每个模块可以包括对训练设备中的一系列指令操作。更进一步地,中央处理器2922可以设置为与存储介质2930通信,在服务器2900上执行存储介质2930中的一系列指令操作。
服务器2900还可以包括一个或一个以上电源2926,一个或一个以上有线或无线网络接口2950,一个或一个以上输入输出接口2958,和/或,一个或一个以上操作系统2941,例如Windows ServerTM,Mac OS XTM,UnixTM,LinuxTM,FreeBSDTM等等。
本申请实施例中,中央处理器2922,用于执行上述实施例中描述的图像增强方法。
本申请实施例中还提供一种包括计算机程序产品,当其在计算机上运行时,使得计算机执行图像增强方法的步骤。
本申请实施例中还提供一种计算机可读存储介质,该计算机可读存储介质中存储有用于进行信号处理的程序,当其在计算机上运行时,使得计算机执行如前述实施例描述的方法中的图像增强方法的步骤。
本申请实施例提供的执行设备和训练设备具体可以为芯片,芯片包括:处理单元和通信单元,所述处理单元例如可以是处理器,所述通信单元例如可以是输入/输出接口、管脚或电路等。该处理单元可执行存储单元存储的计算机执行指令,以使执行设备内的芯片执行上述实施例描述的图像增强方法,或者,以使训练设备内的芯片执行上述实施例描述的图像增强方法。可选地,所述存储单元为所述芯片内的存储单元,如寄存器、缓存等,所述存储单元还可以是所述无线接入设备端内的位于所述芯片外部的存储单元,如只读存储器(read-only memory,ROM)或可存储静态信息和指令的其他类型的静态存储设备,随机存取存储器(random access memory,RAM)等。
具体的,请参阅图30,图30为本申请实施例提供的芯片的一种结构示意图,所述芯片可以表现为神经网络处理器NPU 300,NPU 300作为协处理器挂载到主CPU(Host CPU)上,由Host CPU分配任务。NPU的核心部分为运算电路3003,通过控制器3004控制运算电路3003提取存储器中的矩阵数据并进行乘法运算。
在一些实现中,运算电路3003内部包括多个处理单元(Process Engine,PE)。在一些实现中,运算电路3003是二维脉动阵列。运算电路3003还可以是一维脉动阵列或者能够执行例如乘法和加法这样的数学运算的其它电子线路。在一些实现中,运算电路3003是通用的矩阵处理器。
举例来说,假设有输入矩阵A,权重矩阵B,输出矩阵C。运算电路从权重存储器3002中取矩阵B相应的数据,并缓存在运算电路中每一个PE上。运算电路从输入存储器3001中取矩阵A数据与矩阵B进行矩阵运算,得到的矩阵的部分结果或最终结果,保存在累加器(accumulator)3008中。
统一存储器3006用于存放输入数据以及输出数据。权重数据直接通过存储单元访问控制器(direct memory access controller,DMAC)3005,DMAC被搬运到权重存储器3002中。输入数据也通过DMAC被搬运到统一存储器3006中。
BIU为Bus Interface Unit，即总线接口单元3010，用于AXI总线与DMAC和取指存储器(Instruction Fetch Buffer,IFB)3009的交互。
总线接口单元3010(Bus Interface Unit,简称BIU),用于取指存储器3009从外部存储器获取指令,还用于存储单元访问控制器3005从外部存储器获取输入矩阵A或者权重矩阵B的原数据。
DMAC主要用于将外部存储器DDR中的输入数据搬运到统一存储器3006或将权重数据搬运到权重存储器3002中或将输入数据搬运到输入存储器3001中。
向量计算单元3007包括多个运算处理单元,在需要的情况下,对运算电路的输出做 进一步处理,如向量乘,向量加,指数运算,对数运算,大小比较等等。主要用于神经网络中非卷积/全连接层网络计算,如Batch Normalization(批归一化),像素级求和,对特征平面进行上采样等。
在一些实现中,向量计算单元3007能将经处理的输出的向量存储到统一存储器3006。例如,向量计算单元3007可以将线性函数和/或非线性函数应用到运算电路3003的输出,例如对卷积层提取的特征平面进行线性插值,再例如累加值的向量,用以生成激活值。在一些实现中,向量计算单元3007生成归一化的值、像素级求和的值,或二者均有。在一些实现中,处理过的输出的向量能够用作到运算电路3003的激活输入,例如用于在神经网络中的后续层中的使用。
控制器3004连接的取指存储器(instruction fetch buffer)3009,用于存储控制器3004使用的指令;
统一存储器3006,输入存储器3001,权重存储器3002以及取指存储器3009均为On-Chip存储器。外部存储器私有于该NPU硬件架构。
其中,上述任一处提到的处理器,可以是一个通用中央处理器,微处理器,ASIC,或一个或多个用于控制上述图像增强方法的程序执行的集成电路。
另外需说明的是,以上所描述的装置实施例仅仅是示意性的,其中所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部模块来实现本实施例方案的目的。另外,本申请提供的装置实施例附图中,模块之间的连接关系表示它们之间具有通信连接,具体可以实现为一条或多条通信总线或信号线。
通过以上的实施方式的描述,所属领域的技术人员可以清楚地了解到本申请可借助软件加必需的通用硬件的方式来实现,当然也可以通过专用硬件包括专用集成电路、专用CPU、专用存储器、专用元器件等来实现。一般情况下,凡由计算机程序完成的功能都可以很容易地用相应的硬件来实现,而且,用来实现同一功能的具体硬件结构也可以是多种多样的,例如模拟电路、数字电路或专用电路等。但是,对本申请而言更多情况下软件程序实现是更佳的实施方式。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在可读取的存储介质中,如计算机的软盘、U盘、移动硬盘、ROM、RAM、磁碟或者光盘等,包括若干指令用以使得一台计算机设备(可以是个人计算机,训练设备,或者网络设备等)执行本申请各个实施例所述的方法。
在上述实施例中,可以全部或部分地通过软件、硬件、固件或者其任意组合来实现。当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。
所述计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行所述计算机程序指令时,全部或部分地产生按照本申请实施例所述的流程或功能。所述计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。所述计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一计算机可读存储介质传 输,例如,所述计算机指令可以从一个网站站点、计算机、训练设备或数据中心通过有线(例如同轴电缆、光纤、数字用户线(DSL))或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、训练设备或数据中心进行传输。所述计算机可读存储介质可以是计算机能够存储的任何可用介质或者是包含一个或多个可用介质集成的训练设备、数据中心等数据存储设备。所述可用介质可以是磁性介质,(例如,软盘、硬盘、磁带)、光介质(例如,DVD)、或者半导体介质(例如固态硬盘(Solid State Disk,SSD))等。

Claims (22)

  1. 一种图像增强方法,其特征在于,所述方法包括:
    获取第一图像,所述第一图像包括目标对象;
    根据所述第一图像获取引导图像,所述引导图像包括所述目标对象,所述引导图像中的目标对象的清晰度大于所述第一图像中的目标对象的清晰度;
    根据所述引导图像中的目标对象对所述第一图像中的目标对象通过神经网络进行增强,得到目标图像,所述目标图像包括增强后的目标对象,所述增强后的目标对象的清晰度大于所述第一图像中的目标对象的清晰度。
  2. 根据权利要求1所述的方法,其特征在于,所述目标对象至少包括如下对象的一种:同一个人的人脸、眼、耳、鼻、眉或口。
  3. 根据权利要求1或2所述的方法,其特征在于,所述引导图像中的目标对象的姿态与所述第一图像中的目标对象的姿态的差异度在预设范围内。
  4. 根据权利要求1至3任一所述的方法,其特征在于,所述根据所述第一图像获取引导图像,包括:
    根据所述第一图像中的目标对象的姿态与至少一个第二图像中的每个第二图像的姿态的差异度,从所述至少一个第二图像中确定所述引导图像。
  5. 根据权利要求4所述的方法,其特征在于,所述根据所述第一图像中的目标对象的姿态与至少一个第二图像中的每个第二图像的姿态的差异度之前,所述方法还包括:
    显示第一图像选择界面,所述第一图像选择界面包括至少一个图像;
    接收第一图像选择指令,所述第一图像选择指令表示从所述第一图像选择界面包括的至少一个图像中选择所述至少一个第二图像。
  6. 根据权利要求1至3任一所述的方法,其特征在于,所述根据所述第一图像获取引导图像,包括:
    根据所述第一图像中的目标对象的姿态确定至少一个第三图像,所述至少一个第三图像中的每个第三图像包括目标对象,且每个第三图像包括的目标对象的姿态与所述第一图像中的目标对象的姿态之间的差异度在预设范围内;
    显示第二图像选择界面,所述第二图像选择界面包括所述至少一个第三图像;
    接收第二图像选择指令,所述第二图像选择指令表示从所述第二图像选择界面包括的至少一个第三图像中选择所述引导图像。
  7. 根据权利要求1至6任一所述的方法,其特征在于,所述目标图像包括增强后的目标对象,所述增强后的目标对象的引导图像特征比所述第一图像中的目标对象接近于所述引导图像中的目标对象,其中,所述引导图像特征至少包括如下一种图像特征:
    亮度的动态范围、色调、对比度、饱和度、纹理信息和轮廓信息。
  8. 根据权利要求1至7任一所述的方法，其特征在于，所述目标图像包括增强后的目标对象，所述增强后的目标对象的姿态与所述第一图像中的目标对象的姿态的差异度在预设范围内。
  9. 根据权利要求1至8任一所述的方法,其特征在于,所述获取第一图像,包括:
    显示相机的拍摄界面;
    接收用户的拍摄操作,响应于所述拍摄操作,获取所述第一图像;或,显示相机的相册界面,所述相册界面包括多个图像;
    接收第三图像选择指令,所述第三图像选择指令表示从所述相册界面包括的多个图像中选择所述第一图像。
  10. 根据权利要求1至9任一所述的方法,其特征在于,所述根据所述第一图像获取引导图像,包括:
    接收服务器发送的引导图像,所述引导图像为所述服务器根据所述第一图像获取的。
  11. 一种图像增强装置,其特征在于,应用于电子设备或服务器,所述图像增强装置包括:
    获取模块,用于获取第一图像,所述第一图像包括目标对象;根据所述第一图像获取引导图像,所述引导图像包括所述目标对象,所述引导图像中的目标对象的清晰度大于所述第一图像中的目标对象的清晰度;
    处理模块,用于根据所述引导图像中的目标对象对所述第一图像中的目标对象通过神经网络进行增强,得到目标图像,所述目标图像包括增强后的目标对象,所述增强后的目标对象的清晰度大于所述第一图像中的目标对象的清晰度。
  12. 根据权利要求11所述的图像增强装置,其特征在于,所述目标对象至少包括如下对象的一种:同一个人的人脸、眼、耳、鼻、眉或口。
  13. 根据权利要求11或12所述的图像增强装置,其特征在于,所述引导图像中的目标对象的姿态与所述第一图像中的目标对象的姿态的差异度在预设范围内。
  14. 根据权利要求11至13任一所述的图像增强装置,其特征在于,所述获取模块,具体用于:
    根据所述第一图像中的目标对象的姿态与至少一个第二图像中的每个第二图像的姿态的差异度,从所述至少一个第二图像中确定所述引导图像。
  15. 根据权利要求14所述的图像增强装置，其特征在于，所述图像增强装置还包括：
    显示模块,用于显示第一图像选择界面,所述第一图像选择界面包括至少一个图像;
    接收模块,用于接收第一图像选择指令,所述第一图像选择指令表示从所述第一图像选择界面包括的至少一个图像中选择所述至少一个第二图像。
  16. 根据权利要求11至13任一所述的图像增强装置,其特征在于,所述处理模块,具体用于:
    根据所述第一图像中的目标对象的姿态确定至少一个第三图像,所述至少一个第三图像中的每个第三图像包括目标对象,且每个第三图像包括的目标对象的姿态与所述第一图像中的目标对象的姿态之间的差异度在预设范围内;
    所述显示模块,还用于显示第二图像选择界面,所述第二图像选择界面包括所述至少一个第三图像;
    所述接收模块,还用于接收第二图像选择指令,所述第二图像选择指令表示从所述第二图像选择界面包括的至少一个第三图像中选择所述引导图像。
  17. 根据权利要求11至16任一所述的图像增强装置,其特征在于,所述目标图像包括增强后的目标对象,所述增强后的目标对象的引导图像特征比所述第一图像中的目标对象接近于所述引导图像中的目标对象,其中,所述引导图像特征至少包括如下一种图像特征:
    亮度的动态范围、色调、对比度、饱和度、纹理信息和轮廓信息。
  18. 根据权利要求11至17任一所述的图像增强装置，其特征在于，所述目标图像包括增强后的目标对象，所述增强后的目标对象的姿态与所述第一图像中的目标对象的姿态的差异度在预设范围内。
  19. 根据权利要求11至18任一所述的图像增强装置,其特征在于,所述显示模块,还用于:
    显示相机的拍摄界面;
    所述获取模块,具体用于接收用户的拍摄操作,响应于所述拍摄操作,获取所述第一图像;
    或,所述显示模块,还用于:
    显示相机的相册界面,所述相册界面包括多个图像;
    所述获取模块,具体用于接收第三图像选择指令,所述第三图像选择指令表示从所述相册界面包括的多个图像中选择所述第一图像。
  20. 一种电子设备,其特征在于,包括:
    一个或多个处理器;
    一个或多个存储器;
    多个应用程序;
    以及一个或多个程序，其中所述一个或多个程序被存储在所述存储器中，当所述一个或者多个程序被所述处理器执行时，使得所述电子设备执行权利要求1至10中任一项所述的步骤。
  21. 一种服务器,其特征在于,包括:
    一个或多个处理器;
    一个或多个存储器;
    多个应用程序;
    以及一个或多个程序，其中所述一个或多个程序被存储在所述存储器中，当所述一个或者多个程序被所述处理器执行时，使得所述服务器执行权利要求1至10中任一项所述的步骤。
  22. 一种计算机存储介质,其特征在于,包括计算机指令,当所述计算机指令在电子设备上运行时,使得所述电子设备执行如权利要求1至10中任一项所述的图像增强方法。
PCT/CN2020/118833 2019-10-25 2020-09-29 一种图像增强方法及装置 WO2021078001A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911026078.X 2019-10-25
CN201911026078.XA CN112712470A (zh) 2019-10-25 2019-10-25 一种图像增强方法及装置

Publications (1)

Publication Number Publication Date
WO2021078001A1 true WO2021078001A1 (zh) 2021-04-29

Family

ID=75541157

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/118833 WO2021078001A1 (zh) 2019-10-25 2020-09-29 一种图像增强方法及装置

Country Status (2)

Country Link
CN (1) CN112712470A (zh)
WO (1) WO2021078001A1 (zh)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113301251B (zh) * 2021-05-20 2023-10-20 努比亚技术有限公司 辅助拍摄方法、移动终端及计算机可读存储介质
CN113923372B (zh) * 2021-06-25 2022-09-13 荣耀终端有限公司 曝光调整方法及相关设备
US20230097869A1 (en) * 2021-09-28 2023-03-30 Samsung Electronics Co., Ltd. Method and apparatus for enhancing texture details of images
CN114399622A (zh) * 2022-03-23 2022-04-26 荣耀终端有限公司 图像处理方法和相关装置
CN114827567A (zh) * 2022-03-23 2022-07-29 阿里巴巴(中国)有限公司 视频质量分析方法、设备和可读介质
CN114926351B (zh) * 2022-04-12 2023-06-23 荣耀终端有限公司 图像处理方法、电子设备以及计算机存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106250825A (zh) * 2016-07-22 2016-12-21 厚普(北京)生物信息技术有限公司 一种在医保应用中场景自适应的人脸识别系统
CN106920224A (zh) * 2017-03-06 2017-07-04 长沙全度影像科技有限公司 一种评估拼接图像清晰度的方法
US20180365532A1 (en) * 2017-06-20 2018-12-20 Nvidia Corporation Semi-supervised learning for landmark localization
JP2019023798A (ja) * 2017-07-24 2019-02-14 日本放送協会 超解像装置およびプログラム
CN109671023A (zh) * 2019-01-24 2019-04-23 江苏大学 一种人脸图像超分辨率二次重建方法

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106056562B (zh) * 2016-05-19 2019-05-28 京东方科技集团股份有限公司 一种人脸图像处理方法、装置及电子设备
JP6840957B2 (ja) * 2016-09-01 2021-03-10 株式会社リコー 画像類似度算出装置、画像処理装置、画像処理方法、及び記録媒体
CN107527332B (zh) * 2017-10-12 2020-07-31 长春理工大学 基于改进Retinex的低照度图像色彩保持增强方法
CN109544482A (zh) * 2018-11-29 2019-03-29 厦门美图之家科技有限公司 一种卷积神经网络模型生成方法及图像增强方法
CN110084775B (zh) * 2019-05-09 2021-11-26 深圳市商汤科技有限公司 图像处理方法及装置、电子设备和存储介质

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106250825A (zh) * 2016-07-22 2016-12-21 厚普(北京)生物信息技术有限公司 一种在医保应用中场景自适应的人脸识别系统
CN106920224A (zh) * 2017-03-06 2017-07-04 长沙全度影像科技有限公司 一种评估拼接图像清晰度的方法
US20180365532A1 (en) * 2017-06-20 2018-12-20 Nvidia Corporation Semi-supervised learning for landmark localization
JP2019023798A (ja) * 2017-07-24 2019-02-14 日本放送協会 超解像装置およびプログラム
CN109671023A (zh) * 2019-01-24 2019-04-23 江苏大学 一种人脸图像超分辨率二次重建方法

Also Published As

Publication number Publication date
CN112712470A (zh) 2021-04-27

Similar Documents

Publication Publication Date Title
WO2021136050A1 (zh) 一种图像拍摄方法及相关装置
WO2020168956A1 (zh) 一种拍摄月亮的方法和电子设备
WO2021078001A1 (zh) 一种图像增强方法及装置
WO2020077511A1 (zh) 一种拍摄场景下的图像显示方法及电子设备
WO2021104485A1 (zh) 一种拍摄方法及电子设备
WO2021013132A1 (zh) 输入方法及电子设备
WO2021052111A1 (zh) 图像处理方法及电子装置
WO2022017261A1 (zh) 图像合成方法和电子设备
CN113170037B (zh) 一种拍摄长曝光图像的方法和电子设备
EP4361954A1 (en) Object reconstruction method and related device
WO2022179604A1 (zh) 一种分割图置信度确定方法及装置
CN114140365B (zh) 基于事件帧的特征点匹配方法及电子设备
WO2022012418A1 (zh) 拍照方法及电子设备
CN115272138B (zh) 图像处理方法及其相关设备
WO2024021742A1 (zh) 一种注视点估计方法及相关设备
CN110138999B (zh) 一种用于移动终端的证件扫描方法及装置
CN113538227A (zh) 一种基于语义分割的图像处理方法及相关设备
WO2021180046A1 (zh) 图像留色方法及设备
CN113452969B (zh) 图像处理方法和装置
US20230162529A1 (en) Eye bag detection method and apparatus
EP4325877A1 (en) Photographing method and related device
US20230014272A1 (en) Image processing method and apparatus
WO2023011348A1 (zh) 检测方法及电子设备
WO2022062985A1 (zh) 视频特效添加方法、装置及终端设备
CN115760931A (zh) 图像处理方法及电子设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20878416

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20878416

Country of ref document: EP

Kind code of ref document: A1