CN112712470A - Image enhancement method and device - Google Patents

Image enhancement method and device

Info

Publication number
CN112712470A
Authority
CN
China
Prior art keywords
image
target object
interface
guide
electronic device
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911026078.XA
Other languages
Chinese (zh)
Inventor
邵纬航
王银廷
乔蕾
李默
张一帆
黄一宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN201911026078.XA
Priority to PCT/CN2020/118833 (published as WO2021078001A1)
Publication of CN112712470A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 Image enhancement or restoration
    • G06T 5/50 Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • G06T 5/60
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/11 Region-based segmentation
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/168 Feature extraction; Face representation
    • G06V 40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30196 Human being; Person
    • G06T 2207/30201 Face

Abstract

The embodiment of the application provides an image enhancement method, which comprises the following steps: acquiring a first image, wherein the first image comprises a target object; acquiring a guide image according to the first image, wherein the guide image comprises the target object, and the definition of the target object in the guide image is greater than that of the target object in the first image; and enhancing the target object in the first image through a neural network according to the target object in the guide image to obtain a target image, wherein the target image comprises the enhanced target object, and the definition of the enhanced target object is greater than that of the target object in the first image. With this method, the enhanced target image does not become distorted, and the enhancement effect is good.

Description

Image enhancement method and device
Technical Field
The present application relates to the field of electronic technologies, and in particular, to an image enhancement method and apparatus.
Background
Images taken by users often have poor quality due to external factors (for example, low brightness).
In the prior art, images are often enhanced based on super-resolution processing. However, in a complex scene (for example, when an image including a human face is enhanced), because the image contains rich details, the enhanced image is prone to distortion, and the enhancement effect is reduced.
Disclosure of Invention
The embodiments of the application provide an image enhancement method and device, in which a guide image is used to enhance an image to be enhanced (a first image) through a neural network, with the information in the guide image serving as a reference. Compared with conventional face enhancement technologies, in which the image to be enhanced is processed directly, distortion does not occur and the enhancement effect is better.
In a first aspect, an embodiment of the present application provides an image enhancement method, where the method includes:
acquiring a first image, wherein the first image comprises a target object;
acquiring a guide image according to the first image, wherein the guide image comprises the target object, and the definition of the target object in the guide image is greater than that of the target object in the first image;
and enhancing the target object in the first image through a neural network according to the target object in the guide image to obtain a target image, wherein the target image comprises the enhanced target object, and the definition of the enhanced target object is greater than that of the target object in the first image.
The application provides an image enhancement method, which comprises the following steps: acquiring a first image, wherein the first image comprises a target object; acquiring a guide image according to the first image, wherein the guide image comprises the target object, and the definition of the target object in the guide image is greater than that of the target object in the first image; and enhancing the target object in the first image through a neural network according to the target object in the guide image to obtain a target image, wherein the target image comprises the enhanced target object, and the definition of the enhanced target object is greater than that of the target object in the first image. In this manner, the guide image is used to enhance the image to be enhanced (the first image) through the neural network, with the information in the guide image serving as a reference; compared with conventional face enhancement technologies, in which the image to be enhanced is processed directly, distortion does not occur and the enhancement effect is better.
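By way of example and not limitation, the overall flow described above can be sketched in Python as follows. All names in the sketch (for example, GuidedEnhancementNet) are hypothetical placeholders, the toy network is not the architecture of this application, and it is assumed that the target object has already been detected and cropped from both images.

    # Minimal illustrative sketch (PyTorch); not the network of this application.
    import torch
    import torch.nn as nn

    class GuidedEnhancementNet(nn.Module):
        """Toy stand-in for the enhancement network: it receives the degraded
        target-object crop and the guide crop, and outputs an enhanced crop."""
        def __init__(self, channels: int = 3):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv2d(2 * channels, 32, 3, padding=1), nn.ReLU(),
                nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
                nn.Conv2d(32, channels, 3, padding=1),
            )

        def forward(self, first_crop: torch.Tensor, guide_crop: torch.Tensor) -> torch.Tensor:
            # Concatenating the two crops lets the network borrow texture and
            # contour information from the higher-definition guide crop.
            x = torch.cat([first_crop, guide_crop], dim=1)
            return torch.clamp(first_crop + self.body(x), 0.0, 1.0)

    # Usage with stand-in data of shape (N, C, H, W):
    net = GuidedEnhancementNet()
    first_crop = torch.rand(1, 3, 128, 128)   # target object in the first image (low definition)
    guide_crop = torch.rand(1, 3, 128, 128)   # same target object in the guide image (high definition)
    with torch.no_grad():
        enhanced_crop = net(first_crop, guide_crop)  # enhanced target object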
In an alternative design of the first aspect, the target object includes at least one of: the face, eyes, ears, nose, eyebrows, or mouth of the same person.
In an optional design of the first aspect, a degree of difference between a pose of the target object in the guide image and a pose of the target object in the first image is within a preset range.
In an alternative design of the first aspect, the acquiring a guide image according to the first image comprises:
determining the guide image from the at least one second image according to a degree of difference between a pose of the target object in the first image and a pose of the target object in each of the at least one second image.
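For illustration only, selecting the guide image according to the degree of pose difference may be sketched as follows; the pose_difference metric is a hypothetical placeholder (it could, for example, compare facial landmark positions) and is not prescribed by this application.

    # Illustrative sketch of guide-image selection by smallest pose difference.
    from typing import Callable, Optional, Sequence, Tuple

    def select_guide_image(first_pose,
                           candidates: Sequence[Tuple[object, object]],
                           pose_difference: Callable[[object, object], float],
                           preset_range: float) -> Optional[object]:
        """Return the second image whose target-object pose differs least from the
        pose in the first image, provided the difference is within the preset range."""
        best_image, best_diff = None, float("inf")
        for image, pose in candidates:  # candidates: (second image, pose of its target object)
            diff = pose_difference(first_pose, pose)
            if diff < best_diff:
                best_image, best_diff = image, diff
        return best_image if best_diff <= preset_range else None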
In an optional design of the first aspect, before the determining the guide image according to the degree of difference between the pose of the target object in the first image and the pose of the target object in each of the at least one second image, the method further comprises:
displaying a first image selection interface, the first image selection interface including at least one image;
receiving a first image selection instruction, wherein the first image selection instruction indicates that the at least one second image is selected from the at least one image included in the first image selection interface.
In an alternative design of the first aspect, the acquiring a guide image according to the first image comprises:
determining at least one third image according to the posture of the target object in the first image, wherein each third image in the at least one third image comprises the target object, and the difference degree between the posture of the target object in each third image and the posture of the target object in the first image is within a preset range;
displaying a second image selection interface, the second image selection interface including the at least one third image;
receiving a second image selection instruction, wherein the second image selection instruction indicates that the guide image is selected from the at least one third image included in the second image selection interface.
In an optional design of the first aspect, the degree of difference between the pose of the target object in the guide image and the pose of the target object in the first image being within a preset range comprises:
the degree of difference between the contour shape of the target object in the guide image and the contour shape of the target object in the first image being within a preset range.
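As a purely illustrative example of how the degree of difference between contour shapes could be quantified (the application does not define a specific metric), the contours may be represented as corresponding landmark points and compared after removing translation and scale:

    # Illustrative contour-shape difference; the landmark representation and the
    # mean point distance used here are assumptions, not part of this application.
    import numpy as np

    def contour_difference(contour_a: np.ndarray, contour_b: np.ndarray) -> float:
        """Difference degree between two contours given as (N, 2) arrays of
        corresponding landmark points, after removing translation and scale."""
        def normalize(c: np.ndarray) -> np.ndarray:
            c = c - c.mean(axis=0)            # remove translation
            scale = np.linalg.norm(c) or 1.0
            return c / scale                  # remove scale
        a, b = normalize(contour_a), normalize(contour_b)
        return float(np.mean(np.linalg.norm(a - b, axis=1)))

    # The guide image is acceptable when the difference degree is within the preset range:
    # contour_difference(guide_contour, first_contour) <= preset_range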
In an alternative design of the first aspect, a definition of the target object in the guide image is greater than a definition of the target object in the first image.
In an optional design of the first aspect, the target image includes an enhanced target object, and a guide image feature of the enhanced target object is closer to that of the target object in the guide image than the corresponding feature of the target object in the first image is, where the guide image feature includes at least one of the following image features:
dynamic range of luminance, hue, contrast, saturation, texture information, and contour information.
In an optional design of the first aspect, the target image includes an enhanced target object having a definition greater than that of the target object in the first image.
In an optional design of the first aspect, the target image includes an enhanced target object, and a difference between a posture of the enhanced target object and a posture of the target object in the first image is within a preset range.
In an alternative design of the first aspect, the acquiring the first image includes:
displaying a shooting interface of a camera;
receiving shooting operation of a user, and acquiring the first image in response to the shooting operation; or
Or displaying an album interface of the camera, wherein the album interface comprises a plurality of images;
receiving a third image selection instruction, wherein the third image selection instruction indicates that the first image is selected from the plurality of images included in the album interface.
In an alternative design of the first aspect, the acquiring the guide image includes:
receiving the guide image sent by the server.
In a second aspect, the present application provides an image enhancement apparatus applied to an electronic device or a server, the image enhancement apparatus including:
an acquisition module for acquiring a first image, the first image including a target object; acquiring a guide image according to the first image, wherein the guide image comprises the target object, and the definition of the target object in the guide image is greater than that of the target object in the first image;
and the processing module is used for enhancing the target object in the first image through a neural network according to the target object in the guide image to obtain a target image, wherein the target image comprises the enhanced target object, and the definition of the enhanced target object is greater than that of the target object in the first image.
In an optional design of the second aspect, a degree of difference between the pose of the target object in the guide image and the pose of the target object in the first image is within a preset range.
In an optional design of the second aspect, the obtaining module is specifically configured to:
determining the guide image from the at least one second image according to a degree of difference between a pose of the target object in the first image and a pose of the target object in each of the at least one second image.
In an alternative design of the second aspect, the image enhancement apparatus further includes:
the display module is used for displaying a first image selection interface, and the first image selection interface comprises at least one image;
a receiving module, configured to receive a first image selection instruction, where the first image selection instruction indicates that the at least one second image is selected from at least one image included in the first image selection interface.
In an optional design of the second aspect, the processing module is specifically configured to:
determining at least one third image according to the posture of the target object in the first image, wherein each third image in the at least one third image comprises the target object, and the difference degree between the posture of the target object in each third image and the posture of the target object in the first image is within a preset range;
the display module is further configured to display a second image selection interface, where the second image selection interface includes the at least one third image;
the receiving module is further configured to receive a second image selection instruction, where the second image selection instruction indicates that the guide image is selected from at least one third image included in the second image selection interface.
In an alternative design of the second aspect, the target image includes an enhanced target object, and a guide image feature of the enhanced target object is closer to that of the target object in the guide image than the corresponding feature of the target object in the first image is, where the guide image feature includes at least one of the following image features:
dynamic range of luminance, hue, contrast, saturation, texture information, and contour information.
In an optional design of the second aspect, the target image includes an enhanced target object, and a difference between a posture of the enhanced target object and a posture of the target object in the first image is within a preset range.
In an alternative design of the second aspect, the display module is further configured to:
displaying a shooting interface of a camera;
the acquisition module is specifically used for receiving shooting operation of a user and acquiring the first image in response to the shooting operation;
or, the display module is further configured to:
displaying an album interface of a camera, the album interface including a plurality of images;
the obtaining module is specifically configured to receive a third image selection instruction, where the third image selection instruction indicates that the first image is selected from a plurality of images included in the album interface.
In an optional design of the second aspect, the obtaining module is specifically configured to:
receiving the guide image sent by the server.
In a third aspect, the present application provides an image enhancement method, including:
receiving a first image sent by an electronic device, wherein the first image comprises a target object;
acquiring a guide image according to the first image, wherein the guide image comprises the target object, and the definition of the target object in the guide image is greater than that of the target object in the first image;
and enhancing the target object in the first image through a neural network according to the target object in the guide image to obtain a target image, wherein the target image comprises the enhanced target object, and the definition of the enhanced target object is greater than that of the target object in the first image; and
sending the target image to the electronic device.
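By way of example and not limitation, the server-side interaction described in the third aspect may be sketched as a simple HTTP service. The use of Flask, and the functions find_guide_image and enhance_with_guide, are assumptions made only for illustration; the application does not prescribe any particular transport or interface.

    # Illustrative server sketch; placeholder implementations stand in for the
    # guide-image retrieval and neural-network enhancement of this application.
    import io
    from flask import Flask, request, send_file
    from PIL import Image

    app = Flask(__name__)

    def find_guide_image(first_image: Image.Image) -> Image.Image:
        return first_image   # placeholder: look up a higher-definition guide image

    def enhance_with_guide(first_image: Image.Image, guide_image: Image.Image) -> Image.Image:
        return first_image   # placeholder: enhance the target object via a neural network

    @app.route("/enhance", methods=["POST"])
    def enhance_endpoint():
        # Receive the first image sent by the electronic device.
        first_image = Image.open(request.files["image"].stream).convert("RGB")
        # Acquire a guide image containing the same target object with higher definition.
        guide_image = find_guide_image(first_image)
        # Enhance the target object in the first image according to the guide image.
        target_image = enhance_with_guide(first_image, guide_image)
        # Send the target image back to the electronic device.
        buf = io.BytesIO()
        target_image.save(buf, format="JPEG")
        buf.seek(0)
        return send_file(buf, mimetype="image/jpeg")

    if __name__ == "__main__":
        app.run()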
In an alternative design of the third aspect, the target object includes at least one of: the face, eyes, ears, nose, eyebrows, or mouth of the same person.
In an optional design of the third aspect, a degree of difference between a pose of the target object in the guide image and a pose of the target object in the first image is within a preset range.
In an optional design of the third aspect, the target image includes an enhanced target object, and a guide image feature of the enhanced target object is closer to that of the target object in the guide image than the corresponding feature of the target object in the first image is, where the guide image feature includes at least one of the following image features:
dynamic range of luminance, hue, contrast, saturation, texture information, and contour information.
In an optional design of the third aspect, the target image includes an enhanced target object, and a difference between a posture of the enhanced target object and a posture of the target object in the first image is within a preset range.
In a fourth aspect, the present application provides a server, comprising:
the receiving module is used for receiving a first image sent by an electronic device, wherein the first image comprises a target object; and acquiring a guide image according to the first image, wherein the guide image comprises the target object, and the definition of the target object in the guide image is greater than that of the target object in the first image;
and the processing module is used for enhancing the target object in the first image through a neural network according to the target object in the guide image to obtain a target image, wherein the target image comprises the enhanced target object, and the definition of the enhanced target object is greater than that of the target object in the first image.
And the sending module is used for sending the target image to the electronic device.
In an optional design of the fourth aspect, the target object includes at least one of: the face, eyes, ears, nose, eyebrows, or mouth of the same person.
In an optional design of the fourth aspect, a degree of difference between the pose of the target object in the guide image and the pose of the target object in the first image is within a preset range.
In an optional design of the fourth aspect, the target image includes an enhanced target object, and a guide image feature of the enhanced target object is closer to that of the target object in the guide image than the corresponding feature of the target object in the first image is, where the guide image feature includes at least one of the following image features:
dynamic range of luminance, hue, contrast, saturation, texture information, and contour information.
In an optional design of the fourth aspect, the target image includes an enhanced target object, and a difference between a posture of the enhanced target object and a posture of the target object in the first image is within a preset range.
In a fifth aspect, an embodiment of the present application provides an image enhancement method, where the method includes:
acquiring a first image, wherein the first image comprises a target object;
acquiring a guide image according to the first image, wherein the guide image comprises the target object, and the definition of the target object in the guide image is greater than that of the target object in the first image;
and enhancing the target object in the first image according to the target object in the guide image to obtain a target image, wherein the target image comprises the enhanced target object, and the definition of the enhanced target object is greater than that of the target object in the first image.
In an alternative design of the fifth aspect, the target object is the moon.
In a sixth aspect, the present application provides an electronic device, comprising: one or more processors; one or more memories; a plurality of application programs; and one or more programs, wherein the one or more programs are stored in the memory and, when executed by the processor, cause the electronic device to perform the steps of the first aspect or any possible implementation of the first aspect.
In a seventh aspect, the present application provides a server, comprising: one or more processors; one or more memories; and one or more programs, wherein the one or more programs are stored in the memory and, when executed by the processor, cause the server to perform the steps of any one of the first aspect, the third aspect, or the possible implementations of the first aspect or the third aspect.
In an eighth aspect, the present application provides an apparatus, included in an electronic device, having the functionality of implementing the behavior of the electronic device in any one of the foregoing first aspect. The functions may be implemented by hardware, or may be implemented by hardware executing corresponding software. The hardware or software includes one or more modules or units corresponding to the above-described functions, for example, a display module, an acquisition module, and a processing module.
In a ninth aspect, the present application provides an electronic device, comprising: a touch display screen, wherein the touch display screen comprises a touch sensitive surface and a display; a camera; one or more processors; a memory; a plurality of application programs; and one or more computer programs. Wherein the one or more computer programs are stored in the memory, the one or more computer programs comprising instructions. The instructions, when executed by the electronic device, cause the electronic device to perform the image enhancement method of any possible implementation of the first aspect described above.
In a tenth aspect, the present application provides a computer storage medium comprising computer instructions that, when run on an electronic device or a server, cause the electronic device or the server to perform the image enhancement method in any possible implementation of the above aspects.
In an eleventh aspect, the present application provides a computer program product that, when run on an electronic device or a server, causes the electronic device or the server to perform the image enhancement method in any possible implementation of the above aspects.
The application provides an image enhancement method, which comprises the following steps: acquiring a first image, wherein the first image comprises a target object; acquiring a guide image according to the first image, wherein the guide image comprises the target object, and the definition of the target object in the guide image is greater than that of the target object in the first image; and enhancing the target object in the first image through a neural network according to the target object in the guide image to obtain a target image, wherein the target image comprises the enhanced target object, and the definition of the enhanced target object is greater than that of the target object in the first image. In this manner, the guide image is used to enhance the image to be enhanced (the first image) through the neural network, with the information in the guide image serving as a reference; compared with conventional face enhancement technologies, in which the image to be enhanced is processed directly, distortion does not occur and the enhancement effect is better.
Drawings
Fig. 1 is a schematic diagram of an application scenario architecture according to an embodiment of the present application;
FIG. 2 is a schematic structural diagram of an electronic device;
FIG. 3a is a block diagram of a software architecture of an electronic device according to an embodiment of the present application;
fig. 3b is a schematic diagram of an embodiment of an image enhancement method provided in an embodiment of the present application;
FIG. 4(a) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 4(b) is a schematic diagram of an example of an image enhancement processing interface provided in the embodiment of the present application;
FIG. 4(c) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 4(d) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 5(a) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 5(b) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 5(c) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 6(a) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 6(b) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 6(c) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 7(a) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 7(b) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 7(c) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 8 is a diagram illustrating an example of an image enhancement interface according to an embodiment of the present disclosure;
FIG. 9 is a diagram illustrating an example of an image enhancement interface according to an embodiment of the present disclosure;
FIG. 10(a) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 10(b) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 10(c) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 10(d) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 10(e) is a schematic diagram of an example of an image enhancement processing interface provided in the embodiments of the present application;
FIG. 10(f) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 11(a) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 11(b) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 11(c) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 12(a) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 12(b) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 12(c) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 12(d) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 12(e) is a schematic diagram of an example of an image enhancement processing interface provided in the embodiments of the present application;
FIG. 13(a) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 13(b) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 13(c) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 14(a) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 14(b) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 14(c) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 15(a) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 15(b) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 15(c) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 15(d) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 16(a) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 16(b) is a schematic diagram of an example of an image enhancement processing interface provided in an embodiment of the present application;
FIG. 17 is a diagram illustrating an example of an image enhancement interface according to an embodiment of the present disclosure;
FIG. 18 is a schematic diagram of an example image enhancement processing interface provided in an embodiment of the present application;
FIG. 19 is a schematic view of an image provided by an embodiment of the present application;
FIG. 20(a) is a schematic view of a first image;
FIG. 20(b) is a schematic illustration of a guide image;
FIG. 21(a) is a schematic view of a guide image;
FIG. 21(b) is a schematic view of a guide image;
FIG. 21(c) is a schematic diagram of face region recognition;
FIG. 22(a) is a schematic illustration of a target object;
FIG. 22(b) is a schematic illustration of a target object;
FIG. 23(a) is a schematic illustration of a target object;
FIG. 23(b) is a schematic illustration of a target object after registration;
FIG. 23(c) is a schematic diagram showing an alignment of a target object and a registered target object;
FIG. 23(d) is a schematic illustration of image enhancement;
FIG. 23(e) is a schematic illustration of an image enhancement;
FIG. 23(f) is a schematic illustration of image enhancement;
FIG. 23(g) is a schematic illustration of image enhancement;
fig. 24 is a schematic diagram of an embodiment of an image enhancement method provided in an embodiment of the present application;
FIG. 25a is a system architecture diagram of an image enhancement system according to an embodiment of the present application;
FIG. 25b is a diagram illustrating a convolution kernel performing a convolution operation on an image according to an embodiment of the present application;
FIG. 25c is a schematic diagram of a neural network provided by an embodiment of the present application;
fig. 26 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
fig. 27 is a schematic structural diagram of a server according to an embodiment of the present application;
fig. 28 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
fig. 29 is a schematic structural diagram of a server provided in an embodiment of the present application;
fig. 30 is a schematic structural diagram of a chip according to an embodiment of the present disclosure.
Detailed Description
The embodiments of the application provide an image enhancement method, an electronic device, and a server. In the method, a guide image is used to enhance an image to be enhanced (a first image) through a neural network. Compared with conventional face enhancement technologies, in which the image to be enhanced is processed directly, distortion is avoided and the enhancement effect is better.
Embodiments of the present application are described below with reference to the accompanying drawings. As can be known to those skilled in the art, with the development of technology and the emergence of new scenarios, the technical solution provided in the embodiments of the present application is also applicable to similar technical problems.
The terms "first," "second," and the like in the description and claims of the present application and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances and are merely descriptive of the various ways in which objects of the same nature may be described in connection with the embodiments of the application. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of elements is not necessarily limited to those elements, but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Referring to fig. 1, fig. 1 is a schematic diagram of an application scenario architecture according to an embodiment of the present application. As shown in fig. 1, the image enhancement method provided by the embodiment of the present application may be implemented based on an electronic device 101, and the image enhancement method provided by the embodiment of the present application may also be implemented based on an interaction between the electronic device 101 and a server 102.
The image enhancement method provided by the embodiment of the application can be applied to electronic devices such as a mobile phone, a tablet personal computer, a wearable device, a vehicle-mounted device, an Augmented Reality (AR)/Virtual Reality (VR) device, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a Personal Digital Assistant (PDA), and the like, and the embodiment of the application does not limit the specific types of the electronic devices at all.
For example, fig. 2 shows a schematic structural diagram of the electronic device 200. The electronic device 200 may include a processor 210, an external memory interface 220, an internal memory 221, a Universal Serial Bus (USB) interface 230, a charging management module 240, a power management module 241, a battery 242, an antenna 1, an antenna 2, a mobile communication module 250, a wireless communication module 260, an audio module 270, a speaker 270A, a receiver 270B, a microphone 270C, an earphone interface 270D, a sensor module 280, a key 290, a motor 291, an indicator 292, a camera 293, a display screen 294, a Subscriber Identification Module (SIM) card interface 295, and the like. The sensor module 280 may include a pressure sensor 280A, a gyroscope sensor 280B, an air pressure sensor 280C, a magnetic sensor 280D, an acceleration sensor 280E, a distance sensor 280F, a proximity light sensor 280G, a fingerprint sensor 280H, a temperature sensor 280J, a touch sensor 280K, an ambient light sensor 280L, a bone conduction sensor 280M, and the like.
It is to be understood that the illustrated structure of the embodiment of the present application does not specifically limit the electronic device 200. In other embodiments of the present application, the electronic device 200 may include more or fewer components than shown, or combine certain components, or split certain components, or a different arrangement of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
Processor 210 may include one or more processing units, such as: the processor 210 may include an Application Processor (AP), a modem processor, a Graphics Processing Unit (GPU), an Image Signal Processor (ISP), a controller, a memory, a video codec, a Digital Signal Processor (DSP), a baseband processor, and/or a neural-Network Processing Unit (NPU), etc. The different processing units may be separate devices or may be integrated into one or more processors.
The controller may be, among other things, a neural center and a command center of the electronic device 200. The controller can generate an operation control signal according to the instruction operation code and the timing signal to complete the control of instruction fetching and instruction execution.
A memory may also be provided in processor 210 for storing instructions and data. In some embodiments, the memory in the processor 210 is a cache memory. The memory may hold instructions or data that have just been used or recycled by processor 210. If the processor 210 needs to use the instruction or data again, it can be called directly from the memory. Avoiding repeated accesses reduces the latency of the processor 210, thereby increasing the efficiency of the system.
In some embodiments, processor 210 may include one or more interfaces. The interface may include an integrated circuit (I2C) interface, an integrated circuit built-in audio (I2S) interface, a Pulse Code Modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a Mobile Industry Processor Interface (MIPI), a general-purpose input/output (GPIO) interface, a Subscriber Identity Module (SIM) interface, and/or a Universal Serial Bus (USB) interface, etc.
The I2C interface is a bi-directional synchronous serial bus that includes a serial data line (SDA) and a serial clock line (SCL). In some embodiments, the processor 210 may include multiple sets of I2C buses. The processor 210 may be coupled to the touch sensor 280K, the charger, the flash, the camera 293, etc. through different I2C bus interfaces, respectively. For example: the processor 210 may be coupled to the touch sensor 280K via an I2C interface, such that the processor 210 and the touch sensor 280K communicate via an I2C bus interface to implement the touch function of the electronic device 200.
The I2S interface may be used for audio communication. In some embodiments, processor 210 may include multiple sets of I2S buses. Processor 210 may be coupled to audio module 270 via an I2S bus to enable communication between processor 210 and audio module 270.
The PCM interface may also be used for audio communication, sampling, quantizing and encoding analog signals. In some embodiments, audio module 270 and wireless communication module 260 may be coupled by a PCM bus interface. In some embodiments, the audio module 270 may also transmit audio signals to the wireless communication module 260 through the PCM interface, so as to implement a function of answering a call through a bluetooth headset. Both the I2S interface and the PCM interface may be used for audio communication.
The UART interface is a universal serial data bus used for asynchronous communications. The bus may be a bidirectional communication bus. It converts the data to be transmitted between serial communication and parallel communication. In some embodiments, a UART interface is generally used to connect the processor 210 with the wireless communication module 260. For example: the processor 210 communicates with the bluetooth module in the wireless communication module 260 through the UART interface to implement the bluetooth function. In some embodiments, the audio module 270 may transmit the audio signal to the wireless communication module 260 through the UART interface, so as to realize the function of playing music through the bluetooth headset.
The MIPI interface may be used to connect the processor 210 with peripheral devices such as the display screen 294, the camera 293, and the like. The MIPI interface includes a Camera Serial Interface (CSI), a Display Serial Interface (DSI), and the like. In some embodiments, processor 210 and camera 293 communicate over a CSI interface to implement the capture functionality of electronic device 200. The processor 210 and the display screen 294 communicate through the DSI interface to implement a display function of the electronic device 200.
The GPIO interface may be configured by software. The GPIO interface may be configured as a control signal and may also be configured as a data signal. In some embodiments, a GPIO interface may be used to connect processor 210 with camera 293, display 294, wireless communication module 260, audio module 270, sensor module 280, and the like. The GPIO interface may also be configured as an I2C interface, an I2S interface, a UART interface, a MIPI interface, and the like.
The USB interface 230 is an interface conforming to the USB standard specification, and may specifically be a Mini USB interface, a Micro USB interface, a USB Type-C interface, or the like. The USB interface 230 may be used to connect a charger to charge the electronic device 200, and may also be used to transmit data between the electronic device 200 and a peripheral device. It may also be used to connect a headset to play audio through the headset. The interface may also be used to connect other electronic devices, such as AR devices.
It should be understood that the interfacing relationship between the modules illustrated in the embodiments of the present application is only an exemplary illustration, and does not constitute a structural limitation for the electronic device 200. In other embodiments of the present application, the electronic device 200 may also adopt different interface connection manners or a combination of multiple interface connection manners in the above embodiments.
The charging management module 240 is configured to receive charging input from a charger. The charger may be a wireless charger or a wired charger. In some wired charging embodiments, the charging management module 240 may receive charging input from a wired charger via the USB interface 230. In some wireless charging embodiments, the charging management module 240 may receive a wireless charging input through a wireless charging coil of the electronic device 200. The charging management module 240 may also supply power to the electronic device through the power management module 241 while charging the battery 242.
The power management module 241 is used to connect the battery 242, the charging management module 240 and the processor 210. The power management module 241 receives input from the battery 242 and/or the charging management module 240, and provides power to the processor 210, the internal memory 221, the external memory, the display 294, the camera 293, and the wireless communication module 260. The power management module 241 may also be used to monitor parameters such as battery capacity, battery cycle number, battery state of health (leakage, impedance), etc. In some other embodiments, the power management module 241 may also be disposed in the processor 210. In other embodiments, the power management module 241 and the charging management module 240 may be disposed in the same device.
The wireless communication function of the electronic device 200 may be implemented by the antenna 1, the antenna 2, the mobile communication module 250, the wireless communication module 260, the modem processor, the baseband processor, and the like.
The antennas 1 and 2 are used for transmitting and receiving electromagnetic wave signals. Each antenna in the electronic device 200 may be used to cover a single or multiple communication bands. Different antennas can also be multiplexed to improve the utilization of the antennas. For example: the antenna 1 may be multiplexed as a diversity antenna of a wireless local area network. In other embodiments, the antenna may be used in conjunction with a tuning switch.
The mobile communication module 250 may provide a solution including 2G/3G/4G/5G wireless communication applied on the electronic device 200. The mobile communication module 250 may include at least one filter, a switch, a power amplifier, a Low Noise Amplifier (LNA), and the like. The mobile communication module 250 may receive the electromagnetic wave from the antenna 1, filter, amplify, etc. the received electromagnetic wave, and transmit the electromagnetic wave to the modem processor for demodulation. The mobile communication module 250 may also amplify the signal modulated by the modem processor, and convert the signal into electromagnetic wave through the antenna 1 to radiate the electromagnetic wave. In some embodiments, at least some of the functional modules of the mobile communication module 250 may be disposed in the processor 210. In some embodiments, at least some of the functional modules of the mobile communication module 250 may be disposed in the same device as at least some of the modules of the processor 210.
The modem processor may include a modulator and a demodulator. The modulator is used for modulating a low-frequency baseband signal to be transmitted into a medium-high frequency signal. The demodulator is used for demodulating the received electromagnetic wave signal into a low-frequency baseband signal. The demodulator then passes the demodulated low frequency baseband signal to a baseband processor for processing. The low frequency baseband signal is processed by the baseband processor and then transferred to the application processor. The application processor outputs sound signals through an audio device (not limited to the speaker 270A, the receiver 270B, etc.) or displays images or videos through the display screen 294. In some embodiments, the modem processor may be a stand-alone device. In other embodiments, the modem processor may be separate from the processor 210, and may be disposed in the same device as the mobile communication module 250 or other functional modules.
The wireless communication module 260 may provide a solution for wireless communication applied to the electronic device 200, including Wireless Local Area Networks (WLANs) (e.g., wireless fidelity (Wi-Fi) networks), bluetooth (bluetooth, BT), Global Navigation Satellite System (GNSS), Frequency Modulation (FM), Near Field Communication (NFC), Infrared (IR), and the like. The wireless communication module 260 may be one or more devices integrating at least one communication processing module. The wireless communication module 260 receives electromagnetic waves via the antenna 2, performs frequency modulation and filtering on the electromagnetic wave signal, and transmits the processed signal to the processor 210. The wireless communication module 260 may also receive a signal to be transmitted from the processor 210, frequency-modulate and amplify the signal, and convert the signal into electromagnetic waves via the antenna 2 to radiate the electromagnetic waves.
In some embodiments, antenna 1 of electronic device 200 is coupled to mobile communication module 250 and antenna 2 is coupled to wireless communication module 260, such that electronic device 200 may communicate with networks and other devices via wireless communication techniques. The wireless communication technology may include global system for mobile communications (GSM), General Packet Radio Service (GPRS), code division multiple access (code division multiple access, CDMA), Wideband Code Division Multiple Access (WCDMA), time-division code division multiple access (time-division code division multiple access, TD-SCDMA), Long Term Evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technologies, etc. The GNSS may include a Global Positioning System (GPS), a global navigation satellite system (GLONASS), a beidou navigation satellite system (BDS), a quasi-zenith satellite system (QZSS), and/or a Satellite Based Augmentation System (SBAS).
The electronic device 200 implements display functions via the GPU, the display screen 294, and the application processor. The GPU is a microprocessor for image processing, and is connected to the display screen 294 and an application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. Processor 210 may include one or more GPUs that execute program instructions to generate or alter display information.
The display screen 294 is used to display images, video, and the like. The display screen 294 includes a display panel. The display panel may adopt a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, a quantum dot light-emitting diode (QLED), or the like. In some embodiments, the electronic device 200 may include 1 or N display screens 294, N being a positive integer greater than 1.
The electronic device 200 may implement a shooting function through the ISP, the camera 293, the video codec, the GPU, the display screen 294, and the application processor.
The ISP is used to process the data fed back by the camera 293. For example, when a photo is taken, the shutter is opened, light is transmitted to the camera photosensitive element through the lens, the optical signal is converted into an electrical signal, and the camera photosensitive element transmits the electrical signal to the ISP for processing and converting into an image visible to naked eyes. The ISP can also carry out algorithm optimization on the noise, brightness and skin color of the image. The ISP can also optimize parameters such as exposure, color temperature and the like of a shooting scene. In some embodiments, the ISP may be provided in camera 293.
The camera 293 is used to capture still images or video. The object generates an optical image through the lens and projects the optical image to the photosensitive element. The photosensitive element may be a Charge Coupled Device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The light sensing element converts the optical signal into an electrical signal, which is then passed to the ISP to be converted into a digital image signal. And the ISP outputs the digital image signal to the DSP for processing. The DSP converts the digital image signal into image signal in standard RGB, YUV and other formats. In some embodiments, electronic device 200 may include 1 or N cameras 293, N being a positive integer greater than 1.
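As a simple illustration of the format conversion mentioned above, the following shows one common definition (BT.601, full range) of an RGB-to-YUV conversion; the specific coefficients and formats used by a particular DSP are an assumption here and may differ.

    # Illustration only: BT.601 full-range RGB-to-YUV conversion.
    import numpy as np

    def rgb_to_yuv(rgb: np.ndarray) -> np.ndarray:
        """Convert an (H, W, 3) RGB image with values in [0, 255] to YUV."""
        r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
        y = 0.299 * r + 0.587 * g + 0.114 * b
        u = -0.169 * r - 0.331 * g + 0.500 * b + 128.0
        v = 0.500 * r - 0.419 * g - 0.081 * b + 128.0
        return np.stack([y, u, v], axis=-1)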
For example, in the image processing method provided by the present application, a camera may capture an image and display the captured image in a preview interface. The photosensitive element converts the collected optical signal into an electrical signal, and then transmits the electrical signal to the ISP to be converted into a digital image signal. And the ISP outputs the digital image signal to the DSP for relevant image processing.
The digital signal processor is used for processing digital signals, and can process digital image signals and other digital signals. For example, when the electronic device 200 selects a frequency point, the digital signal processor is used to perform fourier transform or the like on the frequency point energy.
Video codecs are used to compress or decompress digital video. The electronic device 200 may support one or more video codecs. In this way, the electronic device 200 may play or record video in a variety of encoding formats, such as: moving Picture Experts Group (MPEG) 1, MPEG2, MPEG3, MPEG4, and the like.
The NPU is a neural-network (NN) computing processor, which processes input information quickly by referring to a biological neural network structure, for example, by referring to a transfer mode between neurons of a human brain, and can also learn by itself continuously. The NPU can implement applications such as intelligent recognition of the electronic device 200, for example: image recognition, face recognition, speech recognition, text understanding, and the like.
The external memory interface 220 may be used to connect an external memory card, such as a Micro SD card, to extend the memory capability of the electronic device 200. The external memory card communicates with the processor 210 through the external memory interface 220 to implement a data storage function. For example, files such as music, video, etc. are saved in an external memory card.
Internal memory 221 may be used to store computer-executable program code, including instructions. The processor 210 executes various functional applications and data processing of the electronic device 200 by executing instructions stored in the internal memory 221. The internal memory 221 may include a program storage area and a data storage area. The program storage area may store an operating system, an application program required by at least one function (such as a sound playing function or an image playing function), and the like. The data storage area may store data (e.g., audio data, a phone book, etc.) created during use of the electronic device 200, and the like. In addition, the internal memory 221 may include a high-speed random access memory, and may further include a nonvolatile memory, such as at least one magnetic disk storage device, a flash memory device, or a universal flash storage (UFS).
Electronic device 200 may implement audio functions via audio module 270, speaker 270A, receiver 270B, microphone 270C, headset interface 270D, and an application processor, among other things. Such as music playing, recording, etc.
Audio module 270 is used to convert digital audio information into an analog audio signal output and also to convert an analog audio input into a digital audio signal. Audio module 270 may also be used to encode and decode audio signals. In some embodiments, the audio module 270 may be disposed in the processor 210, or some functional modules of the audio module 270 may be disposed in the processor 210.
The speaker 270A, also called a "horn", is used to convert an audio electrical signal into an acoustic signal. The electronic apparatus 200 can listen to music through the speaker 270A or listen to a hands-free call.
The receiver 270B, also called "earpiece", is used to convert the electrical audio signal into an acoustic signal. When the electronic apparatus 200 receives a call or voice information, it is possible to receive voice by placing the receiver 270B close to the human ear.
The microphone 270C, also referred to as a "mic," is used to convert acoustic signals into electrical signals. When making a call or sending voice information, the user can input a voice signal to the microphone 270C by moving the mouth close to the microphone 270C and speaking. The electronic device 200 may be provided with at least one microphone 270C. In other embodiments, the electronic device 200 may be provided with two microphones 270C to achieve a noise reduction function in addition to collecting sound signals. In other embodiments, the electronic device 200 may further include three, four or more microphones 270C to collect sound signals, reduce noise, identify sound sources, perform directional recording, and the like.
The headset interface 270D is used to connect a wired headset. The headset interface 270D may be the USB interface 230, or may be a 3.5 mm open mobile terminal platform (OMTP) standard interface or a Cellular Telecommunications Industry Association (CTIA) standard interface.
The pressure sensor 280A is used to sense a pressure signal and can convert the pressure signal into an electrical signal. In some embodiments, the pressure sensor 280A may be disposed on the display screen 294. There are many types of pressure sensors 280A, such as a resistive pressure sensor, an inductive pressure sensor, and a capacitive pressure sensor. A capacitive pressure sensor may include at least two parallel plates made of an electrically conductive material. When a force acts on the pressure sensor 280A, the capacitance between the electrodes changes, and the electronic device 200 determines the intensity of the pressure from the change in capacitance. When a touch operation acts on the display screen 294, the electronic device 200 detects the intensity of the touch operation by using the pressure sensor 280A, and may also calculate the touched position from the detection signal of the pressure sensor 280A. In some embodiments, touch operations that act on the same touch position but have different intensities may correspond to different operation instructions. For example, when a touch operation whose intensity is less than a first pressure threshold acts on the short message application icon, an instruction for viewing the short message is executed; when a touch operation whose intensity is greater than or equal to the first pressure threshold acts on the short message application icon, an instruction for creating a new short message is executed.
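For illustration only, the threshold-based dispatch described above can be sketched as follows; the threshold value, its units, and the instruction names are hypothetical and not taken from the embodiment.

```python
# Minimal sketch of pressure-dependent dispatch on an application icon.
# FIRST_PRESSURE_THRESHOLD and the handler names are hypothetical.
FIRST_PRESSURE_THRESHOLD = 0.5  # normalized pressure units (assumed)

def handle_touch_on_message_icon(pressure: float) -> str:
    """Return the instruction triggered by a touch of the given intensity."""
    if pressure < FIRST_PRESSURE_THRESHOLD:
        return "view_message"       # light touch: view the short message
    return "create_new_message"     # deep press: create a new short message
```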
The gyro sensor 280B may be used to determine the motion posture of the electronic device 200. In some embodiments, the angular velocity of the electronic device 200 about three axes (i.e., the x, y, and z axes) may be determined by the gyro sensor 280B. The gyro sensor 280B may be used for image stabilization during photographing. For example, when the shutter is pressed, the gyro sensor 280B detects the shake angle of the electronic device 200, calculates the distance that the lens module needs to compensate for according to the shake angle, and allows the lens to counteract the shake of the electronic device 200 through a reverse movement, thereby achieving image stabilization. The gyro sensor 280B may also be used in navigation and motion-sensing gaming scenarios.
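As a rough illustration of the compensation computation mentioned above, a simplified pinhole model is often assumed, in which an angular shake of theta displaces the image by approximately f * tan(theta); the sketch below relies on that assumption and is not the embodiment's actual stabilization algorithm.

```python
import math

def ois_compensation_mm(shake_angle_deg: float, focal_length_mm: float) -> float:
    """Approximate lens-shift distance needed to cancel a small shake angle.

    Simplified pinhole model (assumed): an angular shake theta displaces the
    image by roughly f * tan(theta), so the lens module moves that distance in
    the opposite direction. Real OIS controllers are considerably more elaborate.
    """
    return focal_length_mm * math.tan(math.radians(shake_angle_deg))

# Example: a 0.3 degree shake with a 6 mm lens needs about 0.031 mm of travel.
print(round(ois_compensation_mm(0.3, 6.0), 3))
```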
The air pressure sensor 280C is used to measure air pressure. In some embodiments, the electronic device 200 calculates an altitude from the barometric pressure value measured by the air pressure sensor 280C, to assist in positioning and navigation.
The magnetic sensor 280D includes a Hall sensor. The electronic device 200 may detect the opening and closing of a flip holster by using the magnetic sensor 280D. In some embodiments, when the electronic device 200 is a flip phone, the electronic device 200 may detect the opening and closing of the flip cover according to the magnetic sensor 280D. Features such as automatic unlocking upon flipping open can then be configured according to the detected open or closed state of the holster or of the flip cover.
The acceleration sensor 280E may detect the magnitude of acceleration of the electronic device 200 in various directions (typically along three axes), and may detect the magnitude and direction of gravity when the electronic device 200 is stationary. It may also be used to recognize the posture of the electronic device, for applications such as landscape/portrait switching and pedometers.
A distance sensor 280F is used to measure distance. The electronic device 200 may measure distance by infrared or laser. In some embodiments, in a shooting scenario, the electronic device 200 may use the distance sensor 280F to measure distance in order to focus quickly.
For example, in the image processing method provided by the present application, in the process of taking an image by a camera, an automatic focusing process can be performed according to the distance measured by the distance sensor 280F, so as to achieve fast automatic focusing.
The proximity light sensor 280G may include, for example, a light emitting diode (LED) and a light detector such as a photodiode. The light emitting diode may be an infrared light emitting diode. The electronic device 200 emits infrared light outward through the light emitting diode and uses the photodiode to detect infrared light reflected from a nearby object. When sufficient reflected light is detected, it can be determined that there is an object near the electronic device 200; when insufficient reflected light is detected, the electronic device 200 may determine that there is no object nearby. The electronic device 200 can use the proximity light sensor 280G to detect that the user is holding the electronic device 200 close to the ear during a call, so as to automatically turn off the screen to save power. The proximity light sensor 280G may also be used in holster mode and pocket mode to automatically lock and unlock the screen.
The ambient light sensor 280L is used to sense the ambient light level. The electronic device 200 may adaptively adjust the brightness of the display screen 294 based on the perceived ambient light level. The ambient light sensor 280L may also be used to automatically adjust the white balance when taking a picture. The ambient light sensor 280L may also cooperate with the proximity light sensor 280G to detect whether the electronic device 200 is in a pocket to prevent inadvertent contact.
The fingerprint sensor 280H is used to collect a fingerprint. The electronic device 200 may utilize the collected fingerprint characteristics to implement fingerprint unlocking, access to an application lock, fingerprint photographing, fingerprint incoming call answering, and the like.
The temperature sensor 280J is used to detect temperature. In some embodiments, the electronic device 200 executes a temperature processing strategy by using the temperature detected by the temperature sensor 280J. For example, when the temperature reported by the temperature sensor 280J exceeds a threshold, the electronic device 200 reduces the performance of a processor located near the temperature sensor 280J, so as to reduce power consumption and implement thermal protection. In other embodiments, when the temperature is below another threshold, the electronic device 200 heats the battery 242 to avoid an abnormal shutdown caused by the low temperature. In other embodiments, when the temperature is below a further threshold, the electronic device 200 boosts the output voltage of the battery 242 to avoid an abnormal shutdown caused by the low temperature.
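The multi-threshold temperature processing strategy described above can be sketched as follows; the specific threshold values and action names are assumptions for illustration only, since the embodiment merely states that several thresholds exist.

```python
# Hypothetical thresholds; the embodiment only states that thresholds exist.
THROTTLE_TEMP_C = 45.0        # above this, reduce performance of the nearby processor
HEAT_BATTERY_TEMP_C = 0.0     # below this, heat the battery
BOOST_VOLTAGE_TEMP_C = -10.0  # below this, boost the battery output voltage

def thermal_policy(temp_c: float) -> list[str]:
    """Return the protective actions for a reported temperature."""
    actions = []
    if temp_c > THROTTLE_TEMP_C:
        actions.append("reduce_processor_performance")
    if temp_c < HEAT_BATTERY_TEMP_C:
        actions.append("heat_battery")
    if temp_c < BOOST_VOLTAGE_TEMP_C:
        actions.append("boost_battery_output_voltage")
    return actions
```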
The touch sensor 280K is also referred to as a "touch panel". The touch sensor 280K may be disposed on the display screen 294, and the touch sensor 280K and the display screen 294 form a touch screen, which is also called a "touch screen". The touch sensor 280K is used to detect a touch operation applied thereto or thereabout. The touch sensor can communicate the detected touch operation to the application processor to determine the touch event type. Visual output related to touch operations may be provided through the display screen 294. In other embodiments, the touch sensor 280K can be disposed on a surface of the electronic device 200 at a different location than the display screen 294.
The bone conduction sensor 280M may acquire a vibration signal. In some embodiments, bone conduction sensor 280M may obtain a vibration signal of a vibrating bone mass of a human vocal part. The bone conduction sensor 280M may also contact the human body pulse to receive the blood pressure pulsation signal. In some embodiments, bone conduction sensor 280M may also be disposed in a headset, integrated into a bone conduction headset. The audio module 270 may analyze a voice signal based on the vibration signal of the bone mass vibrated by the sound part acquired by the bone conduction sensor 280M, so as to implement a voice function. The application processor can analyze heart rate information based on the blood pressure pulsation signal acquired by the bone conduction sensor 280M, so as to realize a heart rate detection function.
The keys 290 include a power key, a volume key, and the like. The keys 290 may be mechanical keys or touch keys. The electronic device 200 may receive a key input and generate a key signal input related to user settings and function control of the electronic device 200.
The motor 291 may generate a vibration cue. The motor 291 can be used for both incoming call vibration prompting and touch vibration feedback. For example, touch operations applied to different applications (e.g., photographing, audio playing, etc.) may correspond to different vibration feedback effects. The motor 291 may also respond to different vibration feedback effects for touch operations on different areas of the display 294. Different application scenes (such as time reminding, receiving information, alarm clock, game and the like) can also correspond to different vibration feedback effects. The touch vibration feedback effect may also support customization.
Indicator 292 may be an indicator light that may be used to indicate a state of charge, a change in charge, or may be used to indicate a message, missed call, notification, etc.
The SIM card interface 295 is used to connect a SIM card. The SIM card can be attached to and detached from the electronic device 200 by being inserted into the SIM card interface 295 or being pulled out from the SIM card interface 295. The electronic device 200 may support 1 or N SIM card interfaces, N being a positive integer greater than 1. The SIM card interface 295 may support a Nano SIM card, a Micro SIM card, a SIM card, and the like. Multiple cards can be inserted into the same SIM card interface 295 at the same time; the types of the cards may be the same or different. The SIM card interface 295 may also be compatible with different types of SIM cards, and may also be compatible with an external memory card. The electronic device 200 interacts with the network through the SIM card to implement functions such as calling and data communication. In some embodiments, the electronic device 200 employs an eSIM, namely an embedded SIM card. The eSIM card can be embedded in the electronic device 200 and cannot be separated from the electronic device 200.
The software system of the electronic device 200 may employ a layered architecture, an event-driven architecture, a micro-core architecture, a micro-service architecture, or a cloud architecture. The embodiment of the present application takes an Android system with a layered architecture as an example, and exemplarily illustrates a software structure of the electronic device 200.
Fig. 3a is a block diagram of a software structure of an electronic device 200 according to an embodiment of the present application. The layered architecture divides the software into several layers, each layer having a clear role and division of labor. The layers communicate with each other through a software interface. In some embodiments, the Android system is divided into four layers, an application layer, an application framework layer, an Android runtime (Android runtime) and system library, and a kernel layer from top to bottom. The application layer may include a series of application packages.
As shown in fig. 3a, the application package may include applications such as camera, gallery, calendar, phone call, map, navigation, WLAN, bluetooth, music, video, short message, etc.
The application framework layer provides an Application Programming Interface (API) and a programming framework for the application program of the application layer. The application framework layer includes a number of predefined functions.
As shown in FIG. 3a, the application framework layers may include a window manager, content provider, view system, phone manager, resource manager, notification manager, and the like.
The window manager is used for managing window programs. The window manager can obtain the size of the display screen, judge whether a status bar exists, lock the screen, intercept the screen and the like.
The content provider is used to store and retrieve data and make it accessible to applications. The data may include video, images, audio, calls made and received, browsing history and bookmarks, phone books, etc.
The view system includes visual controls such as controls to display text, controls to display pictures, and the like. The view system may be used to build applications. The display interface may be composed of one or more views. For example, the display interface including the short message notification icon may include a view for displaying text and a view for displaying pictures.
The phone manager is used to provide communication functions of the electronic device 200. Such as management of call status (including on, off, etc.).
The resource manager provides various resources for the application, such as localized strings, icons, pictures, layout files, video files, and the like.
The notification manager enables an application to display notification information in the status bar, and can be used to convey notification-type messages that disappear automatically after a short stay without requiring user interaction. For example, the notification manager is used to notify of download completion, message reminders, and the like. The notification manager may also present notifications that appear in the top status bar of the system in the form of a chart or scroll-bar text, such as a notification of an application running in the background, or notifications that appear on the screen in the form of a dialog window. For example, text information is prompted in the status bar, a prompt tone is sounded, the electronic device vibrates, or an indicator light flashes.
The Android runtime comprises a core library and a virtual machine. The Android runtime is responsible for scheduling and managing an Android system.
The core library comprises two parts: one part is the functions that the Java language needs to invoke, and the other part is the core library of Android.
The application layer and the application framework layer run in a virtual machine. The virtual machine executes java files of the application layer and the application framework layer as binary files. The virtual machine is used for performing the functions of object life cycle management, stack management, thread management, safety and exception management, garbage collection and the like.
The system library may include a plurality of functional modules. For example: surface managers (surface managers), media libraries (media libraries), three-dimensional graphics processing libraries (e.g., OpenGL ES), 2D graphics engines (e.g., SGL), and the like.
The surface manager is used to manage the display subsystem and provide fusion of 2D and 3D layers for multiple applications.
The media library supports a variety of commonly used audio, video format playback and recording, and still image files, among others. The media library may support a variety of audio-video encoding formats, such as MPEG4, h.264, MP3, AAC, AMR, JPG, PNG, and the like.
The three-dimensional graphics processing library is used to implement three-dimensional graphics drawing, image rendering, composition, layer processing, and the like. The 2D graphics engine is a drawing engine for 2D drawing. The kernel layer is a layer between hardware and software. The kernel layer includes at least a display driver, a camera driver, an audio driver, and a sensor driver.
For convenience of understanding, in the following embodiments of the present application, an electronic device having a structure shown in fig. 2 and fig. 3 is taken as an example, and an image enhancement method provided by the embodiments of the present application is specifically described in conjunction with the drawings and application scenarios.
Referring to fig. 3b, fig. 3b is a schematic diagram of an embodiment of an image enhancement method provided in an embodiment of the present application, and as shown in fig. 3b, the image enhancement method provided in the embodiment of the present application includes:
301. an electronic device acquires a first image, the first image including a target object.
In the embodiment of the application, the electronic device may determine the first image needing image enhancement based on the selection of the user.
In this embodiment of the application, the first image may include a target object obtained by shooting a human face, where the target object may be a human face.
How the electronic device acquires the first image is described next.
Alternatively, in an embodiment, the first image may be a human face image obtained by the user shooting a human face in real time through a camera of the electronic device.
Optionally, in an embodiment, the first image may be a stored face image selected by the user from a local gallery of the electronic device or from a cloud album, where a cloud album refers to a network album located on a cloud computing platform.
Optionally, in an embodiment, the electronic device may perform an enhancement judgment on the images stored in the local album, and based on a judgment result, prompt the user to enhance the images that can be enhanced, so that the user may select the first image from the images that can be enhanced and are prompted by the electronic device.
Optionally, in another scenario, the electronic device may set an enhancement function in the shooting interface, and accordingly, after the user obtains the image by shooting, the electronic device may automatically use the image shot by the user as the first image without selection of the user.
The following description is made separately:
first, a user obtains a first image to be enhanced by shooting through a camera of an electronic device.
In this embodiment, the electronic device may display a shooting interface of the camera, receive a shooting operation of a user, and acquire the first image in response to the shooting operation.
Specifically, the electronic device may display a shooting interface of the camera, and after the camera is directed at the face, the user may click a shooting control in the shooting interface, and accordingly, the electronic device may receive a shooting operation of the user, perform shooting in response to the shooting operation, and acquire the first image, where the first image includes a target object corresponding to the face or a local area of the face.
Specifically, fig. 4(a) is a schematic diagram of an example of a graphical user interface (GUI) provided in the embodiment of the present application. Fig. 4(a) illustrates that, in an unlocked state of a mobile phone, the screen display system of the mobile phone displays currently output interface content 401, and the interface content 401 is the main interface of the mobile phone. The interface content 401 shows a variety of third-party applications (apps), such as Alipay, a task card store, Weibo, Album, WeChat, Card Package, Settings, and Camera. It should be understood that the interface content 401 may also include other and more applications, which is not limited in this application.
When the mobile phone detects that the user clicks the icon 402 of the camera application on the main interface 401, the camera application may be started, and an interface as shown in fig. 4(b) is displayed, which may be referred to as a shooting interface 403 of the camera. The shooting interface 403 may include a view finder, an album icon 404, a shooting control 405, a camera rotation control 406, and the like.
The viewfinder is used for acquiring an image for shooting preview, and displaying the preview image in real time, such as a preview image of a human face in the image in fig. 4 (b). The album icon 404 is used for entering an album quickly, and after the mobile phone detects that the user clicks the album icon 404, the shot photos or videos can be displayed on the touch screen, or the photos or videos downloaded and stored from the network and the like can be displayed. The shooting control 405 is used for shooting a photo or recording a video, and when the mobile phone detects that the user clicks the shooting control 405, the mobile phone executes a shooting operation and stores the shot photo; or, when the mobile phone is in the video recording mode, after the user clicks the shooting control 405, the mobile phone executes the video recording operation, and stores the recorded video. The camera rotation control 406 may be used to control the switching of the front camera and the rear camera.
In addition, the shooting interface 403 also includes function controls for setting the shooting mode, such as the portrait mode, photo mode, video mode, professional mode, and more modes shown in fig. 4(b). It should be understood that, when the user clicks the icon 402, in response to the click operation, the mobile phone defaults to the photo mode after opening the camera application, which is not limited in this application.
As shown in fig. 4(b), in the normal photographing mode, the user can click the photographing control 405 to perform photographing. In response to the operation of clicking the shooting control 405 by the user, the mobile phone executes the shooting operation and acquires a first image obtained by shooting.
In an embodiment, the image quality of the first image obtained after the user takes a picture with the camera of the electronic device may be low. It should be noted that the low image quality of the first image may be understood as low image quality of the face region in the first image, or low image quality of a partial region of the face (for example, one of the five sense organs) in the first image, which is not limited herein. It should also be noted that the image quality may be judged based on the user's vision and may relate to at least one of the following image characteristics: poor brightness, poor hue, or low detail clarity. For example: the brightness or hue of the face is poor, the detail clarity of the face is low, the brightness or hue of one or more of the five sense organs is poor, or the detail clarity of one or more of the five sense organs is low.
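A rough illustration of how such image-quality characteristics might be measured is given below; it uses the mean gray level as a brightness proxy and Laplacian variance as a detail-clarity proxy (hue is omitted), and the thresholds are illustrative assumptions rather than values from the embodiment.

```python
import cv2

def face_region_quality(image_bgr, face_box):
    """Rough quality indicators for a face region: mean brightness and a
    Laplacian-variance sharpness proxy. Both are common heuristics, assumed here."""
    x, y, w, h = face_box
    face = image_bgr[y:y + h, x:x + w]
    gray = cv2.cvtColor(face, cv2.COLOR_BGR2GRAY)
    brightness = float(gray.mean())
    sharpness = float(cv2.Laplacian(gray, cv2.CV_64F).var())
    return brightness, sharpness

def looks_low_quality(brightness, sharpness,
                      min_brightness=60.0, min_sharpness=80.0):
    # Assumed thresholds; a real device would tune or learn these values.
    return brightness < min_brightness or sharpness < min_sharpness
```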
In this case, the user may enhance the first image obtained by shooting, for example, by clicking the "enhance" control shown in fig. 4(c). After performing the shooting operation, the mobile phone may display the shot photo in the photo display area 409, and an "enhance" control and a "save" control may also be displayed in the display interface of the mobile phone. The user may click the "save" control; accordingly, the mobile phone receives a save instruction and, in response to it, saves the shot photo into the album icon 404. Alternatively, the user may click the "enhance" control; accordingly, the mobile phone receives an enhancement instruction and, in response to it, determines that the user wants to enhance the picture displayed on the current display interface. For ease of description, the image that needs to be enhanced is referred to as the first image hereinafter.
The foregoing describes an embodiment in which the mobile phone enhances the first image obtained by shooting. Optionally, in another scenario, the user may directly select the first image to be enhanced from the album.
In this embodiment, the electronic device may display an album interface of the camera, where the album interface includes a plurality of images, and receive a third image selection instruction, where the third image selection instruction indicates that the first image is selected from the plurality of images included in the album interface.
Specifically, fig. 5(a) shows an image display interface of the album, which may include images previously taken by the user as well as images downloaded from the network side. As shown in fig. 5(b), the user may select one of the images, for example by performing a click operation or a long-press operation on the selected image. In response to this operation, the mobile phone may display an interface as shown in fig. 5(c), which may include, in addition to the conventionally displayed image preview, "delete" control, and the like, an "enhance" control. The user may click the "enhance" control, and in response, the mobile phone may enhance the image, for example by displaying an enhanced area selection interface as shown in fig. 4(d).
It should be noted that the control setting and the display content in the foregoing embodiments are only an illustration, and the application is not limited thereto.
Optionally, in another scenario, the mobile phone may perform enhancement judgment on the image stored in the local album, and prompt the user to enhance the image that can be enhanced based on the judgment result.
Specifically, the mobile phone may use the dynamic range of brightness, the hue, and the skin texture of the photo, as well as whether there is a high-definition guide image with a similar face pose, as the basis for the judgment. For example, the brightness of the first image in fig. 6(a) is worse than that of the second image, and the face pose of the second image is similar to that of the first image; it may therefore be determined that the first image is an image that can be enhanced. Illustratively, fig. 6(a) shows an image display interface of an album, which may include, in addition to images previously taken by the user and images downloaded from the network side, an "enhanced image" control that the user may click. In response to the user's operation, the mobile phone may display a display interface of the enhanced images as shown in fig. 6(b). The user may click the image to be enhanced in this interface, and in response, the mobile phone may display an image preview interface as shown in fig. 6(c), which may include, in addition to the conventionally displayed image preview, "delete" control, and the like, an "enhance" control. The user may click the "enhance" control, and in response, the mobile phone may enhance the image, for example by displaying an enhanced region selection interface as shown in fig. 4(d).
It should be noted that the control setting and the display content in the foregoing embodiments are only an illustration, and the application is not limited thereto.
Optionally, in another scenario, the mobile phone may set an enhanced function in the shooting interface.
Illustratively, as shown in fig. 7(a), the shooting interface 403 includes function controls for setting the shooting mode, such as the portrait mode, photo mode, video mode, enhanced mode, and more modes in fig. 7(a). It should be appreciated that when the user slides the enhanced mode icon to the current mode, the mobile phone enters the enhanced mode in response to this operation. As shown in fig. 7(b), the user may click the shooting control 405, and in response the mobile phone displays the image obtained by shooting in the display interface shown in fig. 7(c). The display interface may further include a "save" control and an "enhance" control. If the user clicks the "save" control, the mobile phone may, in response, directly save the image into the local album without performing enhancement processing on it. If the user clicks the "enhance" control, the mobile phone may, in response, perform enhancement processing on the image, for example by obtaining a guide image and enhancing the first image obtained by shooting based on the guide image.
Alternatively, in another embodiment, the mobile phone may not enter the enhanced mode based on the user's operation, but determine whether to enter the enhanced mode based on an image quality analysis of the preview image on the photographing interface.
As shown in fig. 8, when the mobile phone recognizes that the sharpness of the photographed face is too low, the enhanced mode may be entered automatically. In addition, the mobile phone may decide whether to enter the enhanced mode in combination with the length of time the face has appeared on the preview interface, which reduces the misjudgment rate and reduces interference with the user's other operations on the mobile phone. For example, if the mobile phone recognizes that the sharpness of the face on the preview interface is too low, but the face has appeared for only 1 second, the mobile phone may not enter the enhanced mode, because the face may no longer be on the preview interface in the next second.
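The dwell-time check described above can be sketched as follows; the 2-second value and the class and parameter names are illustrative assumptions.

```python
import time

class EnhancedModeGate:
    """Enter the enhanced mode only if a low-quality face has been present on
    the preview for at least `min_dwell_s` seconds (illustrative logic only)."""

    def __init__(self, min_dwell_s: float = 2.0):
        self.min_dwell_s = min_dwell_s
        self.low_quality_since = None

    def update(self, face_present: bool, face_is_low_quality: bool) -> bool:
        now = time.monotonic()
        if face_present and face_is_low_quality:
            if self.low_quality_since is None:
                self.low_quality_since = now
            return now - self.low_quality_since >= self.min_dwell_s
        self.low_quality_since = None  # face gone or quality fine: reset the timer
        return False
```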
It should be noted that the setting manner and the display content of the control in the interface are only an illustration, and are not limited herein.
After the mobile phone enters the enhanced mode, it may analyze the image in the image preview area and obtain an image that can serve as a guide image for it. For example, the mobile phone may search a local album, a local enhancement gallery, or a cloud enhancement gallery for an image that can serve as a guide image for the image in the preview area (the pose and expression of the face are close, the brightness and hue are better, etc.). If such a guide image is obtained, the user may then shoot the first image, and after the mobile phone obtains the first image, it automatically enhances the first image based on the guide image.
After the camera of the mobile phone enters the enhanced mode, as shown in fig. 8, a reminder box may be included on the preview interface of the mobile phone. The reminder box may be used to prompt the user that the current shooting has entered the enhanced mode, and may include text describing the enhanced mode and a close control (e.g., the "exit" control shown in fig. 8).
When the user clicks the close control, the camera of the mobile phone exits the enhanced mode in response to the user's click operation. For example, because of the user's other operations, the preview may for a certain time show a face whose dynamic range of brightness is poor or whose definition is too low; the mobile phone then recognizes this and enters the enhanced mode, even though the user may not wish to take a face picture in the enhanced mode. Alternatively, when the user has finished shooting the face picture and wants to exit the enhanced mode and return to the normal mode, the user can click the close control in the reminder box, so that the shooting preview interface switches from fig. 8 to the display interface of the normal mode. In addition, there may be other methods of turning off the enhanced mode, which is not limited in this application.
Alternatively, in another embodiment, when the mobile phone recognizes that the dynamic range of brightness or the sharpness of the photographed face is too low, guidance may be displayed that allows the user to choose whether to enter the enhanced mode.
As shown in fig. 9, the preview interface of the mobile phone may include a reminder box, which may be used to prompt the user to choose whether to enter the enhanced mode, and the reminder box may include enhanced mode text content, a determination control, and a hide control. Specifically, when the mobile phone recognizes that the dynamic range of brightness or the sharpness of the photographed face is too low, guidance for the user to choose to enter the enhanced mode may be displayed; as shown in fig. 9, the user may click the "entry" control, so that the mobile phone enters the enhanced mode.
It should be noted that the setting manner and the display content of the control in the interface are only an illustration, and are not limited herein.
In the embodiment of the application, the user can select the target object to be enhanced in the first image.
Specifically, the mobile phone may display a target object selection control, and specifically, as shown in fig. 4(d), the target object selection control may include an "all" control, a "five sense organs" control, and a "custom region" control, where the "all" control may provide a function of enhancing a face region of the currently-taken photo, the "five sense organs" control may provide a function of enhancing five sense organs in the currently-taken photo, and the "custom region" control may provide a function of enhancing a custom region in the currently-taken photo.
It should be noted that the target object selection control is only an illustration, and in practical applications, the target object selection control may also be of another type, which is not limited in the present application.
302. The electronic equipment acquires a guide image according to the first image, the guide image comprises the target object, and the definition of the target object in the guide image is larger than that of the target object in the first image.
In the embodiment of the present application, the user may select, from a local album or a cloud album, a guide image for enhancing the first image, or the electronic device may select a guide image that can be used to enhance the first image. Both cases are described below.
First, how a user selects a guide image from a local album or a cloud album is described.
As shown in fig. 10(a), the user may click the "all" control shown in fig. 4(d), and accordingly the mobile phone may receive an instruction to enhance the face region of the taken photograph. In response to the instruction, the mobile phone may display a guide image selection interface. For ease of description, an image used as a guide for enhancing the first image is hereinafter referred to as a guide image.
Optionally, as shown in fig. 10(b), in one embodiment, the mobile phone may display a guide image selection interface that may include a "select from local album" control and a "smart select" control. The user may click "select from local album", and accordingly the mobile phone may receive an instruction to select a guide image from the local album. In response to the instruction, the mobile phone may open the local album and display a guide image selection interface as shown in fig. 10(c). Optionally, fig. 10(c) may include an album display area 501 and a to-be-enhanced image display area 502, where the album display area 501 may display previews of photographs stored in the local album, and the to-be-enhanced image display area 502 may display a preview of the photograph to be enhanced. This arrangement of controls allows the user to visually compare the to-be-enhanced image with candidate guide images and to select a guide image whose pose is closer to that of the to-be-enhanced image and whose details are clearer.
It should be understood that, as used herein, the terms "high" and "low" (e.g., the terms "high quality" and "high resolution") do not refer to a specific threshold, but rather to a relationship relative to one another. Thus, a "high resolution" image need not be greater than a certain number of resolutions, but has a higher resolution than the associated "low resolution" image.
The guide image selection interface is provided by way of illustration only, and the present application is not limited thereto.
Alternatively, as shown in fig. 10(c), the user may select an image from the album display area 501 as a guide image, and accordingly, the cellular phone may receive a picture selection instruction from the user and acquire the image selected by the user, and the cellular phone may determine the guide image selected by the user as the guide image of the image taken by the user in fig. 4 (b).
Optionally, in an embodiment, after the guide image is obtained, the mobile phone may judge, based on the degree of pose similarity of the faces in the first image and the guide image, whether the pose and expression of the faces in the two images are close. If they are close, it may be determined that the image selected by the user can be used as the guide image of the first image; if they are not close, it may be determined that the image selected by the user cannot be used as the guide image of the first image.
It should be noted that how to determine the proximity of the facial gestures in the first image and the guide image will be described later, and will not be described herein again.
For example, as shown in fig. 10(c), if the user selects the first image in the album display area 501, then because the pose and expression in that image are very close to those of the first image to be enhanced, the mobile phone may, after judging the proximity of the face poses in the images, determine that it can be used as the guide image of the first image and enhance the first image based on the image enhancement method. As shown in fig. 10(f), the mobile phone may display the target image obtained after enhancing the first image based on the image enhancement method.
For example, as shown in fig. 10(d), if the user selects the second image in the album display area 501, then because the pose and expression in that image are not close to those of the first image to be enhanced (the orientations of the faces differ greatly), the mobile phone may, after judging the proximity of the face poses in the images, determine that it cannot be used as the guide image of the first image. In this case, the mobile phone may prompt the user to select a guide image again; optionally, as shown in fig. 10(e), the mobile phone may display the prompt "the pose difference is too large, please select again" on the interface, and return to the guide image selection interface shown in fig. 10(c), so that the user can reselect a guide image whose pose is closer to that of the first image. For example, if the user reselects the first image in the album display area 501, then because its pose and expression are very close to those of the first image to be enhanced, the mobile phone may, after judging the proximity of the face poses, determine that it can be used as the guide image of the first image and enhance the first image based on the image enhancement method. As shown in fig. 10(f), after the mobile phone enhances the first image based on the image enhancement method, the target image may be displayed.
Optionally, in another embodiment, after the mobile phone acquires the first image and the guide image, the first image and the guide image may be sent to the server, and the server enhances the first image based on an image enhancement method and sends the target image to the mobile phone, and further, the mobile phone may display the target image.
How to enhance the first image based on the image enhancement method by the mobile phone or the server will be described later, and details are not repeated here.
Next, how the electronic device automatically selects a guide image that can be a guide image of the first image will be described.
In the embodiment of the application, the electronic device can select, from a local album or a cloud album, an image with better brightness and hue, higher detail clarity, and high face pose similarity based on a face pose matching policy and other image processing policies, and use it as the guide image to enhance the first image.
Illustratively, as shown in fig. 10(b), when the user clicks the "smart select" control, the mobile phone may receive the operation of the user clicking the "smart select" control and, based on the face pose matching policy and other image processing policies, select from a local album or a cloud album an image with better brightness and hue, higher detail clarity, and high face pose similarity as the guide image to enhance the first image.
The dynamic range of brightness may refer to the number of gray levels between the brightest pixel and the darkest pixel among the pixels included in the target object.
Alternatively, in another embodiment, the guide image may be selected by the server rather than by guiding the user to select it. Specifically, as shown in fig. 10(b), when the user clicks the "smart select" control, the mobile phone may accordingly receive the operation of the user clicking the "smart select" control and send the first image to the server. The server may then, based on the face pose matching policy and other image processing policies, select from a local album or a cloud album an image with better brightness and hue, higher detail clarity, and high face pose similarity as the guide image to enhance the first image.
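A minimal sketch of how candidate guide images might be ranked by pose similarity, dynamic range of brightness (as defined above), and detail clarity is given below; the weights, the threshold, and the pose_similarity callback are illustrative assumptions, not the embodiment's actual selection policy.

```python
import cv2
import numpy as np

def dynamic_range(gray_face: np.ndarray) -> int:
    # Number of gray levels between the darkest and brightest pixel of the region.
    return int(gray_face.max()) - int(gray_face.min())

def sharpness(gray_face: np.ndarray) -> float:
    # Laplacian variance as a simple detail-clarity proxy (assumed heuristic).
    return float(cv2.Laplacian(gray_face, cv2.CV_64F).var())

def rank_guide_candidates(first_face_gray, candidates, pose_similarity,
                          min_pose_sim=0.8):
    """Order candidate guide faces; `pose_similarity` is assumed to return a
    score in [0, 1] (e.g., derived from face key points). Weights are illustrative."""
    scored = []
    for cand in candidates:
        sim = pose_similarity(first_face_gray, cand)
        if sim < min_pose_sim:            # pose too different: cannot guide
            continue
        score = (sim
                 + 0.002 * dynamic_range(cand)
                 + 0.001 * sharpness(cand))
        scored.append((score, cand))
    return [c for _, c in sorted(scored, key=lambda t: t[0], reverse=True)]
```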
Optionally, in this embodiment of the application, the mobile phone may display the target image after enhancing the first image based on the image enhancement method. In addition, other controls, such as the "save" control and the "cancel" control shown in fig. 10(f), may also be displayed on the display interface.
Specifically, if the user clicks the "save" control, the mobile phone may respond to the operation of the user, and save the displayed target image to a local album or other storage locations, for example, to a cloud.
Optionally, the mobile phone may save the first image and the enhanced first image to a local album or other storage location, for example, a cloud, in response to the user clicking the "save" control, which is not limited herein.
Alternatively, the mobile phone may return to the interface in fig. 10(b) in response to the operation of the user clicking the "cancel" control, and prompt the user to select the guidance image again, or the mobile phone may return to the interface displayed in fig. 10(c) in response to the operation of the user clicking the "cancel" control, and prompt the user to select the guidance image again.
It should be noted that the control type of the interface and the display content of the mobile phone display interface are only examples, and the application is not limited thereto.
Optionally, in an embodiment, the user may enhance only a local region in the first image, such as only one or more of the five sense organs or other regions in the first image, without limitation herein.
Specifically, as shown in fig. 11(a), the user may click a "five sense organs" control shown therein, and accordingly, the mobile phone may receive an instruction to enhance the five sense organs region of the taken photograph, and in response to the instruction, the mobile phone may display a five sense organs region selection interface.
Optionally, as shown in fig. 11(b), in one embodiment, the mobile phone may display a five sense organs region selection interface that may include a selection guidance control for each of the five sense organs, such as the "left eye" control, the "right eye" control, the "lips" control, the "nose" control, the "left ear" control, the "right ear" control, the "left eyebrow" control, and the "right eyebrow" control shown in fig. 11(b). The mobile phone may receive an instruction in which the user clicks the control corresponding to the facial feature to be enhanced, and, in response to the instruction, identify the corresponding facial-feature region in the first image based on a face recognition policy. For example, in fig. 11(b) the user clicks the "left eye" control; accordingly, the mobile phone may receive an instruction that the user clicked the "left eye" control and, in response, identify the left-eye region of the face in the first image based on the face recognition policy. Optionally, the mobile phone may outline the left-eye region with a prompt box.
It should be noted that the control setting and display content of the above-mentioned five sense organ region selection interface are only an illustration, and the application is not limited thereto.
It should be noted that, regarding how to identify the region of the five sense organs corresponding to the selection of the user in the first image based on the face recognition policy, the description will be omitted here.
Optionally, the five sense organ region selection interface may further include a "determine" control and a "return" control, as shown in fig. 11(c), the user may click the "left eye" control and the "lips" control, and accordingly, the mobile phone may receive an instruction for the user to click the "left eye" control and the "lips" control, and in response to the instruction, identify the left eye region and the lips region of the face in the first image based on the face recognition policy. The user may click the "ok" control, and accordingly, the mobile phone may receive an instruction of the user to click the "ok" control, and in response to the instruction, the mobile phone may display a selection interface of the guide image, and the selection interface of the guide image may refer to fig. 11(b) and its corresponding description in the above embodiment, which is not described herein in detail.
It should be noted that the control type of the interface and the display content of the mobile phone display interface are only examples, and the application is not limited thereto.
Alternatively, in one embodiment, as shown in fig. 12(a), the user may click the "custom region" control shown therein, which allows the user to select, by themselves, the region of the first image to be enhanced. Accordingly, the mobile phone may receive an instruction that the user clicked the "custom region" control, and in response to the instruction, as shown in fig. 12(b), the mobile phone may display an enhanced region selection interface.
Alternatively, an illustration of an enhanced region selection interface is shown in fig. 12(b), in which the user may manually circle the enhanced region, as shown in fig. 12(c), after the user completes the circle of the enhanced region, the mobile phone may display a "determine" control and a "continue to select" control. The user may click the "confirm" control, and in response to an instruction for the user to click the "confirm" control, the mobile phone may display a guidance image selection interface, and for the selection interface of the guidance image, reference may be made to fig. 15(b) and the description corresponding thereto in the above embodiment, which is not described herein again.
Optionally, the user may click the "continue selection" control, in response to an instruction from the user to click the "continue selection" control, the mobile phone may display an enhanced region selection interface, and the user may continue to circle the enhanced region in the enhanced region selection interface, as shown in fig. 12(d), the user may manually circle the enhanced region in the enhanced region selection interface, and after the circle is completed, click the "confirm" control in the interface shown in fig. 12(e), so as to enter the selection interface for guiding the image.
Alternatively, in another embodiment, the user may click the "custom region" control shown therein, which allows the user to select the region of the first image to be enhanced. Accordingly, the mobile phone may receive an instruction that the user clicked the "custom region" control and, in response, display an enhanced region selection interface. Unlike in fig. 12(b) to 12(e) described above, as shown in fig. 13(a), the mobile phone may display a guide frame of a preset size on the display interface, for example a rectangular frame of a preset size in the center of the interface. The user may drag the rectangular frame to translate it to the position of the region to be enhanced (as shown in fig. 13(a)) and change the size of the enhanced region by changing the size of the rectangular frame (as shown in fig. 13(b)). Accordingly, the mobile phone may determine the enhanced region based on the user's operations on the guide frame. As shown in fig. 13(c), after completing the delineation, the user clicks the "determine" control to enter the guide image selection interface, or may click the "continue selection" control to continue selecting regions before entering the guide image selection interface.
Alternatively, in one embodiment, the electronic device may construct a photo album dedicated to storing guide images.
As shown in fig. 14(a), when guided enhancement is performed, the guide image display interface of the mobile phone may further display a "select from guide image gallery" control. Specifically, the user may click the "select from guide image gallery" control, and in response to the user's operation, the mobile phone may display an interface of the guide image gallery. As shown in fig. 14(b), the images in the guide image gallery may be classified based on preset rules, for example into people, scenes, animals, and the like, and further, within the people category, by different persons, which is not limited in the present application. As shown in fig. 14(b), the guide image gallery display interface may include a "person" control and a "scene" control. When the user clicks the "person" control, a person selection interface as shown in fig. 14(c) may be displayed, where the interface may include selection controls corresponding to person names; the user may click the corresponding control to instruct the mobile phone to display the photo album constructed from the images of the corresponding person, and may then select a guide image from the album displayed by the mobile phone.
Optionally, the mobile phone may install an application program for the guide image gallery. As shown in fig. 15(a), the user may click the icon corresponding to the guide image gallery application; accordingly, the mobile phone may receive the instruction of the user clicking the guide image gallery application and, in response, display a display interface of the guide image gallery. Optionally, in an embodiment, the mobile phone may display an interface of the guide image gallery as shown in fig. 15(b); the display interface may include a "person" control and a "scene" control. When the user clicks the "person" control, a person selection interface as shown in fig. 15(c) may be displayed, where the interface may include selection controls corresponding to person names, and the user may click the corresponding control to instruct the mobile phone to display the album constructed from the images of the corresponding person. As shown in fig. 15(c), the user may click the "Zhang San" control; accordingly, the mobile phone may obtain the instruction of the user clicking the "Zhang San" control and display the album as shown in fig. 15(d).
Optionally, in an embodiment, the album display interface may further include a control for modifying the album, for example, a "+" control shown in fig. 15(d), specifically, the user may click the "+" control to add the image in the album, for example, after the user clicks the "+" control, the mobile phone may respond to the operation to display the local album and guide the user to select the image to be added into the album. In addition, the user can delete images that have been added to the album.
It should be noted that the setting and the display content of the control in the album display interface are only an illustration, and the application is not limited thereto.
Alternatively, in one embodiment, the user may add the displayed image directly from the third party application to the library of guide images.
Fig. 16(a) shows a schematic diagram of a chat interface in which three pictures have been sent; after the mobile phone receives the pictures, they can be displayed on the chat interface. As shown in fig. 16(b), the user can long-press an image, and in response to the operation the mobile phone may display a guide for operating on the image. As shown in fig. 17, the guide may include a "save to album" control, a "save to guide image gallery" control, and a "copy" control. The user may click the "save to guide image gallery" control, and the mobile phone may, in response, save the image to the guide image gallery (as shown in fig. 17), or display an interface as in fig. 15(b) to guide the user to save the image into the album of the corresponding category.
It should be noted that the control setting and the display content in the foregoing embodiments are only an illustration, and the application is not limited thereto.
303. And the electronic equipment enhances the target object in the first image through a neural network according to the target object in the guide image to obtain a target image, wherein the target image comprises the enhanced target object, and the definition of the enhanced target object is greater than that of the target object in the first image.
In the embodiment of the application, since the image quality of the first image is low (for example, the brightness or hue of the face is poor, the detail clarity of the face is low, the brightness or hue of one or more of the five sense organs is poor, or the detail clarity of one or more of the five sense organs is low), the first image obtained by shooting may be enhanced; for example, the user clicks the "enhance" control shown in fig. 4(c), which corresponds to the electronic device receiving an enhancement instruction.
Alternatively, in one embodiment, the user may need to enhance an image stored in the album of the electronic device. For example, in a scenario where the user wants to send a self-portrait to other users and, after opening the image, finds that its image quality is low (for example, the brightness or hue of the face is poor, the detail clarity of the face is low, the brightness or hue of one or more of the five sense organs is poor, or the detail clarity of one or more of the five sense organs is low), the user may open the album and enhance the self-portrait (the first image) to be sent, for example by clicking the "enhance" control shown in fig. 12(c), which corresponds to the electronic device receiving an enhancement instruction.
In an embodiment of the application, the electronic device may enhance the target object in the first image based on the target object in the guide image through the neural network.
It should be noted that the target object may also be understood as the same facial feature of different people. For example, if the first image is obtained by photographing the right face of Zhang San, the first image includes the eyes of Zhang San (the target object); correspondingly, if the guide image is obtained by photographing the right face of Li Si, the guide image includes the eyes of Li Si. If the pose information of the eyes of Zhang San and the eyes of Li Si is very similar, the eyes of Li Si in the guide image may also be used as the object for enhancing the target object (the eyes of Zhang San).
In the embodiment of the present application, the principle of enhancing the image is to improve the image quality of the first image without making the target image more distorted than the first image. Therefore, when the target object in the first image needs to be enhanced, the pose difference between the target object in the guide image and the target object in the first image cannot be too large; that is, the difference between the pose information of the two is within a preset range.
Alternatively, in an embodiment, after receiving the enhancement instruction, the electronic device may display an album interface in response to the enhancement instruction to guide the user to select the guide image, for example, as shown in fig. 15(c), the user may select one guide image in the guide image selection interface shown in fig. 15(c), and in response to the image selection operation by the user, the electronic device may acquire the guide image corresponding to the image selection operation.
Optionally, in an embodiment, if the user needs to enhance all or part of the face in the first image (how the user selects the region (target object) that needs to be enhanced may be described with reference to fig. 4(d) and the corresponding embodiment, which is not described herein again), at this time, the first image includes the face or part of the face (target object), and accordingly, after the electronic device acquires the guide image, it may be determined whether there is a target object in the guide image that is close to the target object in the first image.
Next, how to determine the degree of difference in the postures of the target object in the first image and the target object in the guide image will be described.
In one embodiment, the electronic device may determine whether there is a target object in the guide image that is similar to the face pose in the first image based on a face keypoint landmark detection method. The face key points may also be referred to as face feature points, and generally include points constituting facial features (eyebrows, eyes, nose, mouth, and ears) and a face contour. The method for detecting a face image and labeling one or more key points in the face image may be referred to as a face key point detection method or a face alignment detection method. By performing face alignment detection on the face image, a feature region in the face image can be determined, where the feature region may include but is not limited to: eyebrow regions, eye regions, nose regions, mouth regions, ear regions, and the like.
In the embodiment of the application, the electronic device can judge the degree of difference between the pose information of the target object in the first image and that of the target object in the guide image based on a face key point detection model. Specifically, after the first image and the guide image are acquired, the face key point detection model may be called to perform face detection on the first image and the guide image respectively, so as to determine a plurality of key points in the first image and the guide image and the annotation information of each key point. The key points may include, but are not limited to: mouth key points, eyebrow key points, eye key points, nose key points, ear key points, face contour key points, and the like; the annotation information of a key point may include, but is not limited to: position annotation information (for example, the annotated position of the key point), shape annotation information (for example, the annotated shape of a circular dot), feature information, and the like, where the feature information is used to represent the category of the key point; if it is the feature information of an eye, the key point is an eye key point, and if it is the feature information of a nose, the key point is a nose key point, and so on. The plurality of key points identified in the first image and the guide image may be as indicated by the gray dots in fig. 18. After the plurality of key points is determined, the similarity of the pose information of the target object in the first image and of the target object in the guide image may be determined based on the annotation information of the key points and the position annotation information (which may be, for example, the pixel coordinates of the key points).
It should be noted that, before comparing the degree of difference of the pose information based on the pixel coordinates of the feature points of the target object in the first image and of the target object in the guide image, the electronic device may crop the first image and the guide image so that the position and pose of the target object in the first image are close to the position and pose of the target object in the guide image. Optionally, taking human faces as an example, the bounding range of the cropping may be below the eyebrows (including the eyebrows), above the chin, and bounded on the left and right by the edge of the face contour (including the ears).
Optionally, if the target object in the first image and the target object in the guide image differ in size, the cropped image may be scaled so that the two have the same size.
Optionally, if the orientations of the target object in the first image and the target object in the guide image differ, the cropped image may be rotated so that the two are consistent. The rotation processing rotates the target object clockwise or counterclockwise by a certain rotation angle, with the center point of the target object as the origin.
Optionally, a certain margin may be left around the delineated region for subsequent pixel registration. As shown in fig. 19, region 1903 is the delineated region of the target object, and region 1902 is the margin appropriately left around the delineated region; the cropped image corresponds to region 1902 plus region 1903 in fig. 19.
Illustratively, as shown in fig. 20(a) and 20(b), fig. 20(a) shows a schematic diagram of a first image, and fig. 20(b) shows a schematic diagram of a guide image, in which the target object and the target object are human faces in the first image and the guide image, respectively, however, the pose difference of the human faces in fig. 20(a) and 20(b) is too large, and the electronic device may perform image processing on the guide image in fig. 20 (b).
As shown in fig. 21(a), the target object in the guide image may first be rotated so that the pose of the rotated target object substantially coincides with the pose of the target object in the first image; fig. 21(b) is a schematic diagram of the rotated guide image. Next, the target object may be scaled so that its size substantially coincides with the size of the target object in the first image, as shown in fig. 21(c), which shows a schematic diagram of the scaled guide image.
The above is only an example; alternatively, the first image in fig. 20(a) may be processed instead, and the present application is not limited thereto.
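For illustration only, the following Python sketch shows one possible way to perform the rotation and scaling alignment described above, assuming that the eye-center coordinates have already been obtained from a face key point detector; the function names and parameters are merely illustrative and do not limit the application.

```python
import cv2
import numpy as np

def align_guide_to_first(guide_bgr, guide_eyes, first_eyes):
    """Rotate and scale the cropped guide face so that its pose and size roughly
    match the cropped first image. guide_eyes / first_eyes are (left_eye, right_eye)
    pixel coordinates, assumed to come from a face key point detector."""
    gl, gr = np.float32(guide_eyes[0]), np.float32(guide_eyes[1])
    fl, fr = np.float32(first_eyes[0]), np.float32(first_eyes[1])

    # Rotation angle: difference between the two inter-ocular lines.
    ang_g = np.degrees(np.arctan2(gr[1] - gl[1], gr[0] - gl[0]))
    ang_f = np.degrees(np.arctan2(fr[1] - fl[1], fr[0] - fl[0]))
    angle = ang_g - ang_f

    # Scale: make the inter-ocular distances match.
    scale = np.linalg.norm(fr - fl) / (np.linalg.norm(gr - gl) + 1e-6)

    center = tuple(((gl + gr) / 2.0).tolist())   # rotate about the eye midpoint
    M = cv2.getRotationMatrix2D(center, angle, scale)
    h, w = guide_bgr.shape[:2]
    return cv2.warpAffine(guide_bgr, M, (w, h))
```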
For example, if the target object is the whole face, the electronic device may obtain, based on the annotation information, the key points within the face range in the first image and in the guide image and the pixel coordinates corresponding to each of these key points, and may calculate the sum of the squares of the differences between the pixel coordinates of the corresponding key points in the first image and in the guide image. If the calculated sum of squares exceeds a preset threshold, it is determined that the difference between the pose information of the target object in the first image and the pose information of the target object in the guide image is too large. Optionally, after determining that the degree of difference is too large (not within the preset range), the electronic device may prompt the user to reselect the guide image, for which reference may be made to fig. 10(d) and the description of the corresponding embodiment, which is not repeated here.
For another example, if the target object is the left eye, the electronic device may obtain, based on the annotation information, the key points within the left-eye range in the first image and in the guide image and the pixel coordinates corresponding to each of these key points, and may calculate the sum of the squares of the differences between the pixel coordinates of the corresponding key points in the first image and in the guide image. If the calculated sum of squares exceeds a preset threshold, it is considered that the difference between the pose information of the target object in the first image and the pose information of the target object in the guide image is too large. Optionally, after determining that the degree of difference is too large (not within the preset range), the electronic device may prompt the user to reselect the guide image, for which reference may be made to fig. 10(d) and the description of the corresponding embodiment, which is not repeated here.
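For illustration only, the sum-of-squares comparison described in the two examples above may be sketched as follows in Python; the ordering of corresponding key points and the threshold value are assumptions.

```python
import numpy as np

def pose_difference_too_large(first_pts, guide_pts, threshold):
    """first_pts / guide_pts: (K, 2) arrays of pixel coordinates of corresponding
    key points (e.g. all face key points, or only the left-eye key points) in the
    first image and the guide image, taken after the cropping/scaling/rotation
    pre-alignment. Returns True when the pose difference is not within the
    preset range."""
    first_pts = np.asarray(first_pts, dtype=np.float64)
    guide_pts = np.asarray(guide_pts, dtype=np.float64)
    ssd = np.sum((first_pts - guide_pts) ** 2)   # sum of squared coordinate differences
    return ssd > threshold
```

A stricter threshold would typically be used when only a single facial feature (such as the left eye) is compared than when the whole face is compared.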
Optionally, in another embodiment, after obtaining the plurality of key points, a feature region of the target object in the first image and a feature region of the target object in the guide image may be determined according to the annotation information of each key point in the plurality of key points. As can be seen from the foregoing, the labeling information may include: characteristic information, position marking information, and the like. Therefore, in one embodiment, the feature region may be determined from feature information of the respective keypoints. Specifically, the category of each target keypoint may be determined based on the feature information of each keypoint, an area formed by target keypoints of the same category may be used as one feature area, and the category may be used as the category of the feature area. For example, selecting key points of which the feature information is all nose feature information, wherein the categories of the key points are nose key points; the region formed by these target key points is referred to as a nose region.
In another embodiment, the feature area may be determined according to the position labeling information of each key point. Specifically, the labeling position of each key point may be determined according to the position labeling information, the key points at adjacent positions may be connected, and if the connected shape is similar to any one of the five sense organs (eyebrow, eye, nose, mouth, ear) of the human face, the region formed by the key points at the adjacent positions may be determined as the feature region, and the category of the feature region may be determined according to the shape. For example, if the shape obtained by connecting the target key points of adjacent positions is similar to the shape of the nose, the region formed by the key points of the adjacent positions may be determined as the nasal region. Accordingly, the electronic device may determine the degree of difference between the posture information of the target object and the posture information of the target object based on the comparison between the shape of the feature region corresponding to the target object and the shape of the feature region corresponding to the target object.
For example, if the target object is a human face, since the human face is composed of a cheek, left and right eyes, a nose, lips, left and right ears, and left and right eyebrows, the electronic device may determine the cheek region, the left and right eye regions, the nose region, the lip region, the left and right ear regions, and the left and right eyebrow regions in the first image, and determine the cheek region, the left and right eye regions, the nose region, the lip region, the left and right ear regions, and the left and right eyebrow regions in the guide image, and compare the shapes of the feature regions in the first image and the guide image, respectively:
comparing the shape of the left eye region in the first image with the shape of the left eye region in the guide image;
comparing the shape of the right eye region in the first image with the shape of the right eye region in the guide image;
comparing the shape of the cheek region in the first image with the shape of the cheek region in the guide image;
comparing the shape of the nose region in the first image with the shape of the nose region in the guide image;
comparing the shape of the lip region in the first image with the shape of the lip region in the guide image;
comparing the shape of the left ear region in the first image with the shape of the left ear region in the guide image;
comparing the shape of the right ear region in the first image with the shape of the right ear region in the guide image;
comparing the shape of the left eyebrow region in the first image with the shape of the left eyebrow region in the guide image;
comparing the shape of the right eyebrow area in the first image with the shape of the right eyebrow area in the guide image;
Specifically, the degree of difference between the pose information of the target object in the first image and the pose information of the target object in the guide image may be determined by combining the comparison results of the individual regions. For example, when the shape differences are too large, it is determined that the difference between the pose information of the two is too large; or, when the differences are not large, it is determined that the difference between the pose information of the two is not large. In the latter case, only the facial-feature regions with small differences may subsequently be enhanced.
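For illustration only, the region-by-region comparison described above may be sketched as follows; the region names and the centroid-relative shape descriptor are assumptions made for the example.

```python
import numpy as np

REGIONS = ["left_eye", "right_eye", "cheek", "nose", "lips",
           "left_ear", "right_ear", "left_eyebrow", "right_eyebrow"]

def region_shape(points):
    """Describe a region's shape by its key points expressed relative to the
    region centroid (a simple, illustrative shape descriptor)."""
    pts = np.asarray(points, dtype=np.float64)
    return pts - pts.mean(axis=0)

def compare_regions(first_kpts, guide_kpts, per_region_threshold):
    """first_kpts / guide_kpts: dict mapping a region name to its (K, 2) key point
    coordinates. Returns the regions whose shapes are close enough to be enhanced
    with the guide image."""
    similar = []
    for name in REGIONS:
        if name not in first_kpts or name not in guide_kpts:
            continue
        diff = np.sum((region_shape(first_kpts[name]) -
                       region_shape(guide_kpts[name])) ** 2)
        if diff <= per_region_threshold:
            similar.append(name)
    return similar
```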
It should be noted that the face keypoint detection model may be obtained by performing a hierarchical fitting training using a face alignment algorithm and a sample data set, where the face alignment algorithm may include, but is not limited to: machine learning regression algorithms, such as the Supervised Descent Method (SDM), Local Binary Features (LBF) algorithms; or a Convolutional Neural Network (CNN) algorithm, such as a face landmark detection (TCDCN) algorithm based on deep multitask learning, a dense face alignment (3D dense face alignment, 3DDFA) algorithm, and so on. Based on the algorithms, an original model can be designed and obtained, and then a face key point detection model can be finally obtained after training is carried out based on the original model and a sample data set.
It should be noted that, if the electronic device determines that the difference between the pose information of the target object in the first image and that of the target object in the guide image is too large, it may prompt the user to select the guide image again. For example, as shown in fig. 15(d), if the user selects the second image as the guide image of the first image and the electronic device determines that the difference between the pose of the target object (face) in the first image and the pose of the target object (face) in the guide image is too large, the electronic device may prompt the user to reselect the guide image in the interface shown in fig. 10(d).
Alternatively, in one embodiment, on the guide image selection interface, the electronic device may calculate, for each candidate image, how close its target object is to the target object in the first image in terms of pose information, detail definition and the like, and present this to the user for reference (displayed in the interface or played to the user by voice).
Alternatively, in one embodiment, the user may select a plurality of guide images as the guide image for the first image.
It should be noted that the electronic device may provide a dedicated gallery for storing guide images. The user takes facial pictures or downloads them from a network, and stores the high-quality ones (with good brightness and high detail definition) into the guide image gallery. For example, the user takes facial pictures of himself or herself and stores the high-quality ones (excellent brightness and high detail definition) in the guide image gallery; when the user later takes another facial picture, the guide images in the gallery can be used for guided enhancement. The guide images are collected and accumulated by the user; a guide image photo library can be created by category, and updating and deleting are supported at any time. The guide image storage area may be a storage medium local to the electronic device, or may be on the cloud. Reference may be made to fig. 14(a) and the description of the related embodiments, which are not repeated herein.
Alternatively, in one embodiment, the selection of the guide image may be done automatically by the electronic device. For example, the electronic device may select, as the guide image, the image whose target object pose is closest to that of the target object in the first image, or may combine other criteria, such as the dynamic range (DR) of brightness, detail definition information, and the like. If the electronic device detects a plurality of candidate guide images whose target objects have pose information close to that of the target object in the first image, it may pick the guide image based on the above criteria, or filter randomly, or present the candidates to the user through an interface for selection. In this embodiment, the detail definition of the target object in the guide image is greater than the detail definition of the target object in the first image.
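For illustration only, one possible scoring rule for automatic guide image selection is sketched below; the weights and the use of the Laplacian variance as a detail-definition measure are assumptions and do not limit the application.

```python
import cv2
import numpy as np

def guide_score(cand_gray, pose_ssd, w=(1.0, 0.01, 0.01)):
    """Lower pose difference and higher sharpness / brightness dynamic range
    make a candidate a better guide image (illustrative weighting)."""
    sharpness = cv2.Laplacian(cand_gray, cv2.CV_64F).var()        # detail definition proxy
    dyn_range = float(cand_gray.max()) - float(cand_gray.min())   # brightness DR proxy
    return -w[0] * pose_ssd + w[1] * sharpness + w[2] * dyn_range

def pick_guide(candidates):
    """candidates: list of (gray_image, pose_ssd) pairs whose pose difference is
    already within the preset range; returns the best-scoring image."""
    return max(candidates, key=lambda c: guide_score(c[0], c[1]))[0]
```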
In this embodiment, the electronic device may use the neural network to enhance the target object in the first image based on the target object in the guide image.
Specifically, the electronic device may first perform pixel registration on the target object in the first image and the target object in the guide image, and determine a second pixel point corresponding to each first pixel point in the M first pixel points, where the second pixel point is a pixel point included in the target object.
It should be noted that the electronic device may first divide the first image and the guide image into grids, register the grid coordinate points of the first image with those of the guide image, and then calculate, through an interpolation algorithm, the correspondence between the pixel points of the target object in the first image and the pixel points of the target object in the guide image.
In this embodiment of the application, a target object in a guide image may include M first pixel points, and an electronic device may perform pixel registration on the target object and the target object based on a neural network or other registration algorithms to determine a second pixel point corresponding to each first pixel point of the M first pixel points, where the second pixel point is a pixel point included in the target object.
Optionally, in an embodiment, as shown in fig. 22(a), the target object includes a first pixel point a1. The pixel information around the first pixel point a1 is mathematically analyzed to extract features, and correspondingly the pixel information of the target object in the first image is also mathematically analyzed to extract features, so that a second pixel point a2 on the target object can be found (as shown in fig. 22(b)) whose surrounding image information yields features that best match/are most similar to the features extracted from the image information around the first pixel point a1; therefore, it can be determined that the first pixel point a1 corresponds to the second pixel point a2.
Similarly, a second pixel point included in the target object in the guidance image corresponding to each first pixel point in the M first pixel points can be determined.
How to determine the second pixel point corresponding to each first pixel point in the M first pixel points will be described in fig. 25a to 25c and the corresponding embodiments, and details are not repeated here.
In this embodiment, the electronic device may perform fusion processing on each second pixel point and the corresponding first pixel point in the first image to obtain a target image.
In this embodiment of the application, based on the obtained correspondence between the second pixel points and the first pixel points, in the first image, the second pixel points and the corresponding first pixel points are subjected to pixel fusion, so that a target image is obtained.
Optionally, in an embodiment, after the correspondence between the second pixel points and the first pixel points is determined, the pixel displacement between each second pixel point and the corresponding first pixel point may be determined, and each second pixel point may be translated based on the pixel displacement to obtain a registered target object. The registered target object may further include N third pixel points, where each third pixel point is generated by interpolation from the pixel values of adjacent first pixel points and N is a positive integer. The registered target object is then fused with the target object in the first image to obtain the target image.
Optionally, in this embodiment of the application, when fusing the target object in the first image with the target object in the guide image, the electronic device may obtain the high-frequency information of the second pixel points and the low-frequency information of the first pixel points, and perform fusion processing on the low-frequency information and the high-frequency information.
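For illustration only, a minimal sketch of such a low-/high-frequency fusion is given below, assuming that the guide target object has already been registered to the first image and that a Gaussian blur acts as the low-pass filter; the filter choice and parameters are assumptions and do not limit the application.

```python
import cv2
import numpy as np

def frequency_fusion(first_patch, registered_guide_patch, sigma=2.0, alpha=1.0):
    """Keep the low-frequency information of the first image's target object and
    add the high-frequency (detail) information of the registered guide target
    object. sigma and alpha are illustrative parameters."""
    first = first_patch.astype(np.float32)
    guide = registered_guide_patch.astype(np.float32)

    low_first = cv2.GaussianBlur(first, (0, 0), sigma)    # low-frequency part of the first image
    low_guide = cv2.GaussianBlur(guide, (0, 0), sigma)
    high_guide = guide - low_guide                        # high-frequency part of the guide image

    fused = low_first + alpha * high_guide
    return np.clip(fused, 0, 255).astype(np.uint8)
```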
As shown in fig. 23(a) and 23(b), fig. 23(a) shows a schematic diagram of the target object, and fig. 23(b) shows a schematic diagram of the registered target object. As can be seen from fig. 23(c), the registered target object and the target object in the first image still have non-overlapping regions (B1 and B2). In this case, if the target object in the first image is directly fused with the registered target object, an artifact occurs; that is, when the information of the registered target object is "pasted"/fused onto the target object in the first image, it is pasted/fused in a misplaced position. Therefore, in the present application, pixel fusion processing may be performed only on the region where the registered target object overlaps the target object in the first image, while super-resolution enhancement processing may be performed on the region of the target object in the first image that does not overlap the registered target object. Namely: the target object in the first image includes a first region, the registered target object includes a second region, the first region and the second region overlap, and the pixel points of the first region and the second region are subjected to fusion processing; the target object in the first image further includes a third region, the third region is staggered with the registered target object, and super-resolution enhancement processing is performed on the third region.
As shown in fig. 23(d), the pixel fusion method in the embodiment of the present application may be implemented based on an AI network, for example, by training, so that:
a. the encoder 1 is only responsible for encoding the low-frequency information of the picture, and automatically filters out the high-frequency information.
b. The encoder 2 can encode both the high-frequency and the low-frequency information of the picture, and the corresponding decoder 2 can restore the high- and low-frequency coded information output by the encoder 2 to the original input picture, where the low-frequency information is encoded in a manner similar to that of the encoder 1. For example:
i. if the registered guide image is passed through the encoder 1, the output result is similar to the low frequency encoding information output by the registered guide image passing through the encoder 2.
ii. if the enhanced image is passed through the encoder 1, the output result is similar to the low-frequency coding information output when the registered guide image passes through the encoder 2.
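For illustration only, the constraints a–b and i–ii above may be expressed as training losses, for example as in the following PyTorch sketch; the layer shapes, the 32/32 channel split and the choice of L1 losses are assumptions and do not limit the application.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True))

enc1 = conv_block(3, 32)                                   # encoder 1: low-frequency codes only
enc2 = conv_block(3, 64)                                   # encoder 2: low- and high-frequency codes
dec2 = nn.Sequential(conv_block(64, 32), nn.Conv2d(32, 3, 3, padding=1))   # decoder 2

def training_losses(guide_reg, guide_reg_lowpass, enhanced):
    """guide_reg: registered guide image; guide_reg_lowpass: a low-pass filtered copy
    of it (one possible supervision that forces encoder 1 to drop high frequencies);
    enhanced: the enhanced image. The 32/32 channel split is assumed."""
    code1_g = enc1(guide_reg)
    code2_g = enc2(guide_reg)
    recon_g = dec2(code2_g)

    loss_b = F.l1_loss(recon_g, guide_reg)                 # b: encoder 2 + decoder 2 reconstruct the input
    loss_a = F.l1_loss(dec2(torch.cat([code1_g, torch.zeros_like(code1_g)], 1)),
                       guide_reg_lowpass)                  # a: encoder 1 keeps only low frequencies
    loss_i = F.l1_loss(code1_g, code2_g[:, :32])           # i: enc1(guide) ~ low-frequency part of enc2(guide)
    loss_ii = F.l1_loss(enc1(enhanced), code2_g[:, :32])   # ii: enc1(enhanced) ~ same low-frequency code
    return loss_a + loss_b + loss_i + loss_ii
```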
In this embodiment of the application, after fusion processing is performed on each second pixel point and the corresponding first pixel point, smoothing processing may be performed on the edge area of the target object in the first image. By the method, the contour region of the target object in the target image is not distorted, and the image enhancement effect is improved.
Optionally, in this embodiment of the application, the target image includes the enhanced target object, and the difference between the dynamic range (DR) of brightness of the enhanced target object and the DR of brightness of the target object in the guide image is smaller than the difference between the DR of brightness of the target object in the first image and the DR of brightness of the target object in the guide image.
Optionally, in this embodiment of the application, the target image includes the enhanced target object, and the difference between the color tone of the enhanced target object and the color tone of the target object in the guide image is smaller than the difference between the color tone of the target object in the first image and the color tone of the target object in the guide image.
Optionally, in this embodiment of the application, the target image includes an enhanced target object, and the detail definition of the enhanced target object is greater than the detail definition of the target object in the first image.
Optionally, in another embodiment, the target object in the first image may be directly replaced by the target object in the guide image, that is, the enhanced target object may be directly the target object in the guide image, which is not limited in this application.
It should be noted that the pixel fusion module shown in fig. 23(d) may be integrated into a decoder, and the codec may be implemented based on a conventional algorithm or an AI network.
Note that the names of the blocks in fig. 23(d) are merely illustrative and do not limit the present application, and for example, the pixel fusion block may be understood as a coding fusion block, and is not limited thereto.
In the above, the face or the local area in the face is taken as the target object, and an image enhancement method provided by the embodiment of the present application is introduced, and then another image enhancement method is introduced by taking the target object as the moon as an example.
The electronic device may acquire a first image including the moon as the target object, as shown in fig. 23(e), which shows a schematic of one first image including the moon in fig. 23 (e).
As to how the electronic device acquires the first image, reference may be made to the description in the above embodiments, which is not limited herein.
In this embodiment of the application, after the electronic device acquires the first image, it may be detected that the first image includes a moon, specifically, the electronic device may detect whether the first image includes a moon based on a trained AI network, and the application is not limited.
The electronic device may acquire a guide image including the moon as the target object. Unlike the enhancement of the face region described above, since the rotation period of the moon equals its period of revolution around the earth, the same side of the moon always faces the earth; therefore, when the moon is not occluded in the first image and in the guide image, the pose information (texture features) is substantially consistent, and the electronic device may skip the determination of whether the pose information is similar. As shown in fig. 23(f), fig. 23(f) shows a schematic diagram of a guide image including the moon.
It should be noted that some scenes are not suitable for guidance enhancement, and may be excluded by scene determination, for example, if the moon in the first image is seriously blocked by cloud, buildings, or other scenes and objects, the user may be prompted that the first image is not suitable for guidance enhancement.
It should be noted that the electronic device may automatically select the guide image. If the slight change over time of the lunar surface visible from the ground (for example, due to libration) is not to be ignored, the electronic device can deduce the lunar surface actually visible on that night by synchronizing the date, time and place, and pick out, from the guide image gallery/album and the like, a guide image in which the pose information of the moon is close to the pose information of the lunar surface actually visible on that night.
It should be noted that the electronic device may select the guidance image with reference to a scene in which the moon is located in the first image. For example, if the environment in which the moon is located in the first image is the european ancient castle in the night sky, a guide image including the wolf moon may be selected as the guide image of the first image.
The electronic device may enhance the moon included in the first image by guiding the moon included in the image, resulting in a target image.
Alternatively, in an embodiment, the electronic device may acquire a region A of the moon in the first image and a region B of the moon in the guide image, and register region A to region B, so that the registered region A and region B substantially completely coincide with each other.
Optionally, in an embodiment, graph a may first be translated so that, after the translation, the center (or circle center) of the moon in graph a coincides with the center (or circle center) of the moon in graph r, so as to obtain graph b. Further, a plane coordinate system may be established with the center of graph b as the origin, with an angle theta between the x axis (or y axis) of the coordinate system and the horizontal line of the graph; graph b is stretched or contracted along the x and y directions of this coordinate system, and an appropriate theta and scaling coefficients are selected, so that the moon region A in graph b can be accurately registered with the moon region B in graph r, so as to obtain graph c.
It should be noted that, if the lunar phase in graph a is not a full moon, or the moon is partially occluded, it is only necessary that the moon region in graph c is an incomplete perfect circle, that is, an arc of a perfect circle remains; the moon region can then be restored to a perfect circle by extending the circular trajectory of this contour, and if the restoration result substantially coincides with the moon region in graph r (also a perfect circle), the registration can be considered successful.
In the embodiment of the application, the transformation from graph a to graph c is also applied to graph A to obtain graph C, and graph C is rotated around its center until the position of the moon texture coincides with the position in graph R, obtaining graph D. The rotation angle is denoted as q.
In the embodiment of the present application, the affine transformation matrix W that directly transforms graph a to graph d, and the inverse matrix W⁻¹ of the affine transformation matrix W, can be calculated. Applying W to graph a yields graph d, and applying W⁻¹ to graph r yields graph p. Graph d is compared with graph r, and graph p with graph a. If the moon regions differ greatly, the registration fails, the subsequent guidance enhancement is stopped, and the system reports an error (the user is prompted that the enhancement failed).
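For illustration only, the composition of the translation, scaling and rotation into the affine matrix W, and the generation of graphs d and p for the cross-check, may be sketched as follows; the parameters are assumed to have been estimated as described above.

```python
import cv2
import numpy as np

def build_W(t, theta_deg, sx, sy, q_deg, center):
    """Compose translation t (center alignment), anisotropic scaling (sx, sy) along
    axes rotated by theta, and rotation q about `center` into a single 2x3 affine
    matrix W for the a -> d transformation. All parameters are assumed to have been
    estimated beforehand."""
    th, q = np.radians(theta_deg), np.radians(q_deg)
    R = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
    Q = np.array([[np.cos(q), -np.sin(q)], [np.sin(q), np.cos(q)]])
    A = Q @ R @ np.diag([sx, sy]) @ R.T                    # linear part of W
    c = np.asarray(center, dtype=np.float64)
    b = A @ (np.asarray(t, dtype=np.float64) - c) + c      # offset part of W
    return np.hstack([A, b.reshape(2, 1)])

def cross_check_warps(graph_a, graph_r, W):
    """graph d = W applied to graph a; graph p = W^-1 applied to graph r."""
    h, w = graph_a.shape[:2]
    W_inv = cv2.invertAffineTransform(W)
    graph_d = cv2.warpAffine(graph_a, W, (w, h))
    graph_p = cv2.warpAffine(graph_r, W_inv, (w, h))
    return graph_d, graph_p, W_inv
```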
Alternatively, the criterion for comparison may be that the following condition is satisfied:
condition 1: the area of the moon region of FIG. d outside the contour of the moon region of FIG. r is less than a threshold;
condition 2: the minimum distance between the moon region contour of plot d and the moon region contour of plot r is less than a certain threshold;
condition 3: the area of the moon region of graph p beyond the contour line of the moon region of graph a is less than a certain threshold;
condition 4: the minimum distance between the moon region contour of graph p and the moon region contour of graph a is less than a certain threshold.
It should be noted that, if it is set to only process a full or quasi-full (including slight occlusion) scene, the following conditions may be added:
condition 5: the intersection area of the moon regions of graph d and graph r divided by the moon region area of graph r should be greater than some threshold;
condition 6: the intersection area of the moon regions of graph p and graph a divided by the moon region area of graph a is greater than some threshold.
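For illustration only, conditions 1–6 may be checked on binary moon-region masks as sketched below; how the masks are obtained (for example by thresholding) and the threshold values are assumptions.

```python
import numpy as np

def min_contour_distance(mask1, mask2):
    """Minimum distance between the two region contours, approximated here with the
    boundary pixels of the boolean masks (illustrative only)."""
    def boundary(m):
        er = np.zeros_like(m)
        er[1:-1, 1:-1] = (m[1:-1, 1:-1] & m[:-2, 1:-1] & m[2:, 1:-1]
                          & m[1:-1, :-2] & m[1:-1, 2:])
        return np.argwhere(m & ~er)
    p1, p2 = boundary(mask1), boundary(mask2)
    d = np.linalg.norm(p1[:, None, :] - p2[None, :, :], axis=-1)
    return d.min()

def registration_ok(mask_d, mask_r, mask_p, mask_a,
                    area_thr, dist_thr, overlap_thr, full_moon_only=False):
    """mask_*: boolean moon-region masks of graphs d, r, p and a."""
    cond1 = (mask_d & ~mask_r).sum() < area_thr                  # condition 1: d outside r
    cond2 = min_contour_distance(mask_d, mask_r) < dist_thr      # condition 2
    cond3 = (mask_p & ~mask_a).sum() < area_thr                  # condition 3: p outside a
    cond4 = min_contour_distance(mask_p, mask_a) < dist_thr      # condition 4
    ok = cond1 and cond2 and cond3 and cond4
    if full_moon_only:                                           # conditions 5 and 6
        ok = ok and (mask_d & mask_r).sum() / max(mask_r.sum(), 1) > overlap_thr \
                and (mask_p & mask_a).sum() / max(mask_a.sum(), 1) > overlap_thr
    return ok
```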
In the embodiment of the present application, W⁻¹ may be applied to graph R to obtain graph P. Post-processing is then carried out, and finally the result is embedded (fused) back into the original photo. Graph P, graph p, graph A and graph a are scaled back to the size of the original cropped graph A before scaling.
First, for graph P to fuse perfectly with graph a, the intersection of the moon in graph P and graph a needs to be obtained:
graph p1 = graph p ∩ graph a; graph P1 = graph P ∩ graph p1; graph a1 = graph a ∩ graph p1; where "∩" denotes taking the intersection (AND) of the corresponding moon regions.
to preserve some of the detail features of the moon in FIG. L (or in FIG. A), i.e., to assign some of the detail features of FIG. A1 to FIG. P1, the following operations may be performed:
graph M = graph p1/255.0;
lM = e_protect/10.0 + the sum of the pixel values of graph M, where the protection value e_protect may take the value 1.0/255.0;
l0max = the maximum pixel value of graph a1/255.0;
l0 = the sum of the pixel values of graph a1/255.0/lM;
graph tmp = graph a1 + (255 - graph p1);
l0min = the minimum pixel value of graph tmp/255.0;
graph T = (graph a1/255.0 + l0*(1.0 - graph M) - l0)*Amp/(l0max - l0min + e_protect), where Amp is a tunable parameter that controls how strongly some detail features of the original moon in graph L are inherited; if it is 0, they are not inherited.
graph IMG = (graph P1*l0/l1 + graph T) .* graph M, where ".*" denotes element-wise multiplication;
Lmax = the maximum pixel value of graph IMG;
graph IMG = (UINT8)(graph IMG) if Lmax is smaller than or equal to 255, otherwise graph IMG = (UINT8)(255.0*graph IMG/Lmax), where UINT8 refers to converting the pixel values in the graph to the corresponding data type.
Graph p1 is blurred, that is, operations such as down/up-sampling and blurring are performed a certain number of times, to obtain graph p1v, so that the post-processing output that retains some detail features of the original moon in graph L is:
graph IMG .* graph p1v/255.0 + graph a .* (1.0 - graph p1v/255.0). This result is embedded back into graph L in place of graph A to complete the moon brightness enhancement.
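For illustration only, the above formulas can be transcribed into numpy as follows. Because the published text is garbled in places, the bracketing of graph T and the definition of l1 (computed here from graph P1 by analogy with l0) are assumptions; the sketch is meant to make the notation concrete, not to be a definitive implementation.

```python
import cv2
import numpy as np

def moon_y_fusion(a, P, p1, Amp=0.5, e_protect=1.0 / 255.0):
    """a : cropped y channel of the original photo (graph a), uint8
       P : registered guide moon (graph P), uint8, same size as a
       p1: intersection mask of the moon regions (graph p1), values 0 or 255
       NOTE: l1 is not defined in the published text; here it is computed from
       graph P1 by analogy with l0 -- an assumption made only for illustration."""
    a = a.astype(np.float64); P = P.astype(np.float64); p1 = p1.astype(np.float64)
    M = p1 / 255.0                                   # graph M
    P1 = P * M                                       # graph P1 (P restricted to the intersection)
    a1 = a * M                                       # graph a1 (a restricted to the intersection)

    lM = e_protect / 10.0 + M.sum()
    l0max = a1.max() / 255.0
    l0 = a1.sum() / 255.0 / lM
    tmp = a1 + (255.0 - p1)                          # graph tmp
    l0min = tmp.min() / 255.0
    T = (a1 / 255.0 + l0 * (1.0 - M) - l0) * Amp / (l0max - l0min + e_protect)   # graph T

    l1 = P1.sum() / 255.0 / lM                       # ASSUMED definition of l1 (not in the text)
    IMG = (P1 * l0 / l1 + T) * M                     # graph IMG
    Lmax = IMG.max()
    if Lmax > 255.0:
        IMG = 255.0 * IMG / Lmax

    p1v = cv2.GaussianBlur(p1, (0, 0), 5.0)          # graph p1v: blurred mask
    out = IMG * (p1v / 255.0) + a * (1.0 - p1v / 255.0)
    return np.clip(out, 0.0, 255.0).astype(np.uint8)
```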
If the picture is processed in yuv format, the above result is only for the y-channel. In the rgb format, all three channels r, g, b are used.
Generally, to keep the result looking realistic after beautification, only the (high-definition) texture details of the moon in the guide image are used, not its color. Therefore, in the yuv format, the moon beautification processing is usually performed only on the y channel, while the uv color channels are kept unchanged, i.e. the result inherits the original moon color in graph L. However, if the color information of the moon in the guide image is to be used, the following steps can be followed:
The original uv-channel color information of the moon in graph L is recorded as UVL, and the uv-channel color information of the moon in the guide image as UVR, where UVR is transformed by the matrix W⁻¹ into UVP, and the median value of UVP (or UVR) is recorded as uvp (one value each for the u and v channels). The moon region of graph L to be colored with the color of the moon in the guide image may be handled as follows: for the moon region of graph P1, the corresponding UVP is applied; the part of the moon region of graph L beyond the moon region of graph P1 is filled with uvp, or is filled by expanding the UVP values at the edges of graph P1 outward equally. After coloring, the uv-channel color information of the moon in graph L is recorded as UVf, and UVf is to be fused with the uv-channel color information outside the bright moon area of graph L, so that after enhancement, the uv-channel information finally embedded back into graph L in place of the uv of graph a is:
UVf .* graph av/255.0 + UVA .* (1.0 - graph av/255.0), where graph av is the result of blurring graph a.
It should be noted that post-processing may also be performed, where the post-processing may further include, for example, deblurring, background noise reduction, and the like, so as to make the enhancement effect better. As shown in fig. 23(g), fig. 23(g) shows a schematic diagram of a target image.
The embodiment of the application provides an image enhancement method, which comprises the following steps: acquiring a first image, wherein the first image comprises a target object; acquiring a guide image, wherein the guide image comprises the target object, and the definition of the target object in the guide image is greater than that of the target object in the first image; and enhancing the target object in the first image through a neural network according to the target object in the guide image to obtain a target image, wherein the target image comprises the enhanced target object, and the definition of the enhanced target object is greater than that of the target object in the first image. Through the mode, the guide image is used for enhancing the image to be enhanced (the first image) through the neural network, and the information in the guide image is used for reference, so that compared with the traditional face enhancement technology in which the image to be enhanced is directly processed, the situation of distortion can not occur, and the enhancement effect is better.
The present application further provides an image enhancement method, referring to fig. 24, fig. 24 is a schematic diagram of an embodiment of the image enhancement method provided in the present application, and as shown in fig. 24, the image enhancement method provided in the present embodiment includes:
2401. the server receives a first image sent by the electronic equipment, wherein the first image comprises a target object.
2402. The server acquires a guide image according to the first image, the guide image comprises the target object, and the definition of the target object in the guide image is larger than that of the target object in the first image.
2403. And the server enhances the target object in the first image through a neural network according to the target object in the guide image to obtain a target image, wherein the target image comprises the enhanced target object, and the definition of the enhanced target object is greater than that of the target object in the first image.
2404. And the server sends the target image to the electronic equipment.
For how the server performs the image enhancement method, reference may be made to the description of the steps performed by the electronic device in the foregoing embodiment, and similar parts are not repeated.
Referring to fig. 25a, fig. 25a is a system architecture diagram of an image enhancement system according to an embodiment of the present application, in fig. 25a, the image enhancement system 2500 includes an execution device 2510, a training device 2520, a database 2530, a client device 2540 and a data storage system 2550, and the execution device 2510 includes a calculation module 2511. The client device 2540 may be an electronic device in the above-described embodiment, and the execution device may be an electronic device or a server in the above-described embodiment.
The database 2530 stores an image set, the training device 2520 generates a target model/rule 2501 for processing the first image and the guide image, and performs iterative training on the target model/rule 2501 by using the image set in the database to obtain a mature target model/rule 2501. In the embodiment of the present application, the target model/rule 2501 is exemplified as a convolution neural network.
The convolutional neural network obtained by the training device 2520 can be applied to different systems or devices, such as a mobile phone, a tablet, a notebook computer, a VR device, a data processing system of a server, and so on. The execution device 2510 may call data, code, etc. in the data storage system 2550, or may store data, instructions, etc. in the data storage system 2550. The data storage system 2550 may reside within the execution device 2510, or the data storage system 2550 may be external memory to the execution device 2510.
The calculation module 2511 may perform a convolution operation on the first image and the guide image acquired by the client device 2540 through a convolutional neural network, and after the first feature plane and the second feature plane are extracted, the first feature plane and the second feature plane may be merged. And determining a second pixel point corresponding to each first pixel point in the M first pixel points based on performing convolution operation on the first characteristic plane and the second characteristic plane.
In some embodiments of the present application, referring to fig. 25a, the execution device 2510 and the client device 2540 may be separate devices. The execution device 2510 may be configured with an I/O interface 2512 for data interaction with the client device 2540: the "user" may input the first image and the guide image to the I/O interface 2512 through the client device 2540, and the execution device 2510 may return the target image to the client device 2540 through the I/O interface 2512 and provide it to the user.
It should be noted that fig. 25a is only an architecture schematic diagram of two image enhancement systems provided by the embodiment of the present invention, and the position relationship between the devices, modules, and the like shown in the diagram does not constitute any limitation. For example, in other embodiments of the present application, the execution device 2510 may be configured in the client device 2540, for example, when the client device is a mobile phone or a tablet, the execution device 2510 may be a module in a Host processor (Host CPU) of the mobile phone or the tablet for performing array image processing, and the execution device 2510 may also be a Graphics Processing Unit (GPU) or a neural Network Processor (NPU) in the mobile phone or the tablet, where the GPU or NPU is mounted on the Host processor as a coprocessor and tasks are allocated by the Host processor.
Next, a convolutional neural network adopted in the embodiment of the present application is described, where the convolutional neural network is a deep learning (deep learning) architecture, and the deep learning architecture refers to performing multiple levels of learning at different abstraction levels through a machine learning algorithm. As a deep learning architecture, CNN is a feed-forward artificial neural network in which individual neurons respond to overlapping regions in an image input thereto. The convolutional neural network may logically include an input layer, a convolutional layer, and a neural network layer, but because the input layer and the output layer are mainly used to facilitate data import and export, with continuous development of the convolutional neural network, in practical applications, concepts of the input layer and the output layer are gradually faded, and functions of the input layer and the output layer are realized through the convolutional layer, of course, other types of layers may also be included in the high-dimensional convolutional neural network, and the specific details are not limited herein.
Convolutional layer:
the output of a convolutional layer may be used as input to a subsequent pooling layer, or may be used as input to another convolutional layer to continue the convolution operation. The convolutional layer may include a number of convolution kernels, which may also be referred to as filters (or convolution operators), for extracting specific information from the input array matrix (i.e., the digitized array image). A convolution kernel can be essentially a weight matrix, which is usually predefined, and the size of each weight matrix should be related to the size of each angle image in an array image, and during the convolution operation on the array image, the weight matrix is usually processed on each angle image of the array image one pixel after another in the horizontal direction (or two pixels after two pixels … …, which depends on the value of the step size stride), so as to complete the task of extracting a specific feature from the image. The weight values in the weight matrixes need to be obtained through a large amount of training in practical application, and each weight matrix formed by the trained weight values can extract information from an input angle image, so that the high-dimensional convolutional neural network can be helped to carry out correct prediction.
It should be noted that the depth dimension of the weight matrix is the same as the depth dimension of the input array image, and the weight matrix extends to the entire depth of the input image during the convolution operation. Therefore, convolution with a single weight matrix will produce a convolution output of a single depth dimension, but in most cases a single weight matrix is not used; instead, different features in an image are extracted by using multiple weight matrices. For example, one weight matrix is used to extract image edge information, another weight matrix is used to extract a specific color of the image, and yet another weight matrix is used to blur out unwanted noise points in the image. The dimensions of the multiple weight matrices are the same, so the dimensions of the feature planes extracted by these weight matrices are also the same, and the extracted feature maps of the same dimensions are then combined to form the output of the convolution operation.
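For illustration only, the basic sliding-window operation of a single weight matrix (convolution kernel) with a given stride can be sketched as follows; this is generic background code and is independent of the specific network of the embodiments.

```python
import numpy as np

def conv2d_single(image, kernel, stride=1):
    """Slide `kernel` over `image` one pixel (or `stride` pixels) at a time and
    accumulate the weighted sums -- the basic operation of a convolutional layer
    for one weight matrix, without padding."""
    H, W = image.shape
    kh, kw = kernel.shape
    out_h = (H - kh) // stride + 1
    out_w = (W - kw) // stride + 1
    out = np.zeros((out_h, out_w), dtype=np.float64)
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = np.sum(patch * kernel)
    return out
```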
Specifically, as an example, please refer to fig. 25b, where fig. 25b is a schematic diagram of performing a convolution operation on an image by a convolution kernel according to an embodiment of the present application, and fig. 25b is an example of a 6 × 6 image and a2 × 2 convolution module, where s refers to a coordinate of the image in a horizontal direction of an angular dimension, t refers to a coordinate of the image in a vertical direction of the angular dimension, x refers to a coordinate of the image in the horizontal direction of the image, y refers to a coordinate of the image in the vertical direction of the image, a pixel point on the image can be determined by (x, y, s, t), m refers to a coordinate of a plurality of convolution modules in the horizontal direction of the angular dimension, n refers to a coordinate of a plurality of convolution modules in the vertical direction of the angular dimension, and p refers to a coordinate of the convolution module in the horizontal direction, q refers to the coordinate in the vertical direction in one convolution module, from which one convolution kernel can be determined by (m, n, p, q).
A neural network layer:
after processing by convolutional/pooling layers, the high-dimensional convolutional neural network is not enough to output the required output information. Since, as mentioned above, the convolutional/pooling layers only extract features and reduce the parameters introduced by the input image. However, to generate the final output information (class information or other relevant information as needed), the convolutional neural network requires the use of the neural network layer to generate one or a set of outputs of the number of classes needed. Therefore, a plurality of hidden layers may be included in the neural network layer, and parameters included in the plurality of hidden layers may be obtained by pre-training according to the relevant training data of a specific task type, for example, the task type may include image recognition, image classification, image super-resolution reconstruction, and the like.
Alternatively, in an embodiment, a convolution operation may be performed on the target object in the first image and the target object in the guide image based on the neural network. Specifically, the cropped guide image (including the target object) and the cropped first image (including the target object) may be scaled to the same specific size and input to the network. For the cropped first image, the scaled image size becomes (D + d) × (D + d), where D is the side length of the central region and d is the side length contributed by the margin (edge-remaining) region.
In the embodiment of the present application, the central D × D region may be evenly divided into N × N blocks, with the center of each block taken as a basic grid point, and the width d of the margin region is the maximum pixel displacement allowed in registration, which is convenient for the network design; optionally, d may be X times (an integer multiple) of D/N. Thus, the cropped first image and guide image are evenly divided into (2M + N) × (2M + N) blocks.
As shown in fig. 25c, the cropped first image (including the margin region) and the guide image (including the margin region) are convolved based on the convolutional layer sets CNNgG and CNNgL to extract a feature Gcf and a feature Lcf, which may be a contour line feature, respectively.
Gcf and Lcf are concatenated (spliced), a convolutional layer set CNNg2 is designed, and the concatenated Gcf and Lcf are subjected to convolution processing to obtain GLcf.
In the second-to-last stage of the network, convolutional layer sets CNNgs and CNNgc are designed to process GLcf respectively and output a feature GLcfs and a feature GLcfc; optionally, the side-length ratio of the feature GLcfs to the feature GLcfc is (2M + N) : (2M + N - 1).
The feature GLcfs and the feature GLcfc are processed by the convolutional layer set CNNgf at the end of the network to obtain a feature GLcfsf of size (2M + N) × (2M + N) × 2 and a feature GLcfcf of size (2M + N - 1) × (2M + N - 1) × 2. The central N × N × 2 part of GLcfsf and the central (N - 1) × (N - 1) × 2 part of GLcfcf are taken as the coordinate-point displacements corresponding to the grid points of the N × N basic grid and of the (N - 1) × (N - 1) embedded grid, respectively, where "× 2" means that the displacement has two components, x and y. The meaning of the grid-point displacement is: when the guide image is registered onto the image to be enhanced, the displacement of all the coordinate points at the grid-point positions.
In the embodiment of the application, the displacement of each pixel point can be interpolated according to the displacement of the grid point coordinates.
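For illustration only, interpolating the per-pixel displacement from the grid-point displacements and using it to register the guide image may be sketched as follows; the use of bilinear interpolation and cv2.remap is an implementation assumption and does not limit the application.

```python
import cv2
import numpy as np

def dense_flow_from_grid(grid_dx, grid_dy, out_h, out_w):
    """grid_dx, grid_dy: N x N arrays of x / y displacements at the grid points.
    Bilinear interpolation yields a displacement for every pixel."""
    dx = cv2.resize(grid_dx.astype(np.float32), (out_w, out_h), interpolation=cv2.INTER_LINEAR)
    dy = cv2.resize(grid_dy.astype(np.float32), (out_w, out_h), interpolation=cv2.INTER_LINEAR)
    return dx, dy

def warp_guide(guide_img, dx, dy):
    """Register the guide image onto the image to be enhanced by sampling it at the
    displaced coordinates."""
    h, w = guide_img.shape[:2]
    xs, ys = np.meshgrid(np.arange(w, dtype=np.float32), np.arange(h, dtype=np.float32))
    map_x = xs + dx
    map_y = ys + dy
    return cv2.remap(guide_img, map_x, map_y, interpolation=cv2.INTER_LINEAR)
```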
In this embodiment, the grid points may be the geometric center of the receptive field range of the convolution kernel in the guide image corresponding to each convolution operation, or the pixel positions that are not far away from the geometric center (the interval between the grid points and the geometric center of the receptive field range is smaller than a preset value), which is not limited herein. In the convolutional neural network, the receptive field (receptive field) may be an area range in which pixels on a feature map (feature map) output by each layer of the convolutional neural network are mapped on an input picture.
It should be noted that the calculation range of the receptive field may also extend the first image outwards indefinitely, so as to ensure that the receptive field range is not truncated by the boundary of the first image when the boundary of the first image is reached.
In this application, the range described may be the receptive field range of the convolution kernel in the guide image in the last convolution operation, and this range may include the region where the edge of the feature layer is padded with zeros in the convolution operation.
An embodiment of the present application further provides an electronic device, please refer to fig. 26, where fig. 26 is a schematic structural diagram of the electronic device according to the embodiment of the present application, and the electronic device includes:
an obtaining module 2601, configured to obtain a first image, where the first image includes a target object; acquiring a guide image according to the first image, wherein the guide image comprises the target object, and the definition of the target object in the guide image is greater than that of the target object in the first image;
a processing module 2602, configured to enhance, according to a target object in the guide image, the target object in the first image through a neural network to obtain a target image, where the target image includes the enhanced target object, and a definition of the enhanced target object is greater than a definition of the target object in the first image.
Optionally, a difference between the posture of the target object in the guide image and the posture of the target object in the first image is within a preset range.
Optionally, the obtaining module 2601 is specifically configured to:
determining the guide image from the at least one second image according to a degree of difference between a pose of the target object in the first image and a pose of each of the at least one second image.
Optionally, the electronic device further includes:
a display module 2603 for displaying a first image selection interface, the first image selection interface comprising at least one image;
a receiving module 2604, configured to receive a first image selection instruction, where the first image selection instruction indicates that the at least one second image is selected from at least one image included in the first image selection interface.
Optionally, the processing module is specifically configured to:
determining at least one third image according to the posture of the target object in the first image, wherein each third image in the at least one third image comprises the target object, and the difference degree between the posture of the target object in each third image and the posture of the target object in the first image is within a preset range;
the display module is further configured to display a second image selection interface, where the second image selection interface includes the at least one third image;
the receiving module is further configured to receive a second image selection instruction, where the second image selection instruction indicates that the guide image is selected from at least one third image included in the second image selection interface.
Optionally, the target image includes an enhanced target object, and a guide image feature of the enhanced target object is closer to the target object in the guide image than the target object in the first image, where the guide image feature includes at least one of the following image features:
dynamic range of luminance, hue, contrast, saturation, texture information, and contour information.
Optionally, the target image includes an enhanced target object, and a difference between a posture of the enhanced target object and a posture of the target object in the first image is within a preset range.
Optionally, the display module 2603 is further configured to:
displaying a shooting interface of a camera;
the obtaining module 2601 is specifically configured to receive a shooting operation of a user, and obtain the first image in response to the shooting operation;
or, the display module 2603 is further configured to:
displaying an album interface of a camera, the album interface including a plurality of images;
the obtaining module 2601 is specifically configured to receive a third image selection instruction, where the third image selection instruction indicates that the first image is selected from a plurality of images included in the album interface.
Optionally, the obtaining module 2601 is specifically configured to:
receive the guide image sent by the server.
Optionally, the processing module 2602 is specifically configured to: obtain high-frequency information of each second pixel point; obtain low-frequency information of each first pixel point, where the second pixel points are pixel points in the guide image and the first pixel points are pixel points in the first image; and perform fusion processing on the low-frequency information and the corresponding high-frequency information.
Optionally, the processing module 2602 is further configured to perform fusion processing on each second pixel point and the corresponding first pixel point, and then perform smoothing processing on the edge area of the target object in the first image.
Optionally, the processing module 2602 is further configured to determine a pixel displacement between each second pixel point and the corresponding first pixel point, and translate each second pixel point based on the pixel displacement to obtain a registered target object.
Optionally, the processing module 2602 is specifically configured to fuse the registered target object with the target object.
Optionally, the target object includes a first region, the registered target object includes a second region, the first region overlaps the second region, and the processing module 2602 is specifically configured to perform fusion processing on the pixel points of the first region and the second region.
Optionally, the target object further includes a third region, the third region does not overlap the registered target object, and the processing module 2602 is further configured to perform super-resolution enhancement processing on the third region.
Optionally, the registered target object further includes N third pixel points, each third pixel point is generated by interpolation according to a pixel value of an adjacent first pixel point, and N is a positive integer.
Optionally, the processing module 2602 is specifically configured to: perform a convolution operation on the first image to obtain a first feature plane; perform a convolution operation on the guide image to obtain a second feature plane; and determine, based on a convolution operation performed on the first feature plane and the second feature plane, the second pixel point corresponding to each of the M first pixel points, where the distance between the coordinate position of each grid point and the geometric center of the convolution kernel corresponding to the convolution operation is smaller than a preset value.
An embodiment of the present application further provides a server, please refer to fig. 27, where fig. 27 is a schematic structural diagram of the server provided in the embodiment of the present application, and the server includes:
a receiving module 2701, configured to receive a first image sent by an electronic device, where the first image includes a target object, and obtain a guide image, where the guide image includes the target object, and the definition of the target object in the guide image is greater than that of the target object in the first image;
the processing module 2702 is configured to enhance, according to a target object in the guide image, the target object in the first image through a neural network to obtain a target image, where the target image includes the enhanced target object, and a definition of the enhanced target object is greater than a definition of the target object in the first image;
a sending module 2703, configured to send the target image to the electronic device.
Optionally, a difference between the posture of the target object in the guide image and the posture of the target object in the first image is within a preset range.
Optionally, the receiving module 2701 is specifically configured to:
determine the guide image from the at least one second image according to a degree of difference between the posture of the target object in the first image and the posture of the target object in each of the at least one second image.
Optionally, the target image includes an enhanced target object, and a guide image feature of the enhanced target object is closer to that of the target object in the guide image than to that of the target object in the first image, where the guide image feature includes at least one of the following image features:
dynamic range of luminance, hue, contrast, saturation, texture information, and contour information.
Optionally, the target image includes an enhanced target object, and a difference between a posture of the enhanced target object and a posture of the target object in the first image is within a preset range.
Referring to fig. 28, fig. 28 is a schematic structural diagram of an electronic device according to an embodiment of the present application. The electronic device 2800 may be embodied as a virtual reality (VR) device, a mobile phone, a tablet computer, a notebook computer, an intelligent wearable device, or the like, which is not limited herein. Specifically, the electronic device 2800 includes: a receiver 2801, a transmitter 2802, a processor 2803, and a memory 2804 (the number of processors 2803 in the electronic device 2800 may be one or more; one processor is taken as an example in fig. 28), where the processor 2803 may include an application processor 28031 and a communication processor 28032. In some embodiments of the present application, the receiver 2801, the transmitter 2802, the processor 2803, and the memory 2804 may be connected by a bus or in another manner.
The memory 2804 may include a read-only memory and a random access memory, and provides instructions and data to the processor 2803. A portion of the memory 2804 may further include a non-volatile random access memory (NVRAM). The memory 2804 stores operating instructions executable by the processor, executable modules or data structures, or a subset or an extended set thereof, where the operating instructions may include various operating instructions for performing various operations.
The processor 2803 controls the operation of the electronic device. In a particular application, the various components of the electronic device are coupled together by a bus system that may include a power bus, a control bus, a status signal bus, etc., in addition to a data bus. For clarity of illustration, the various buses are referred to in the figures as a bus system.
The methods disclosed in the embodiments of the present application may be applied to the processor 2803 or implemented by the processor 2803. The processor 2803 may be an integrated circuit chip with signal processing capability. During implementation, the steps of the foregoing method may be completed by an integrated logic circuit of hardware in the processor 2803 or by instructions in the form of software. The processor 2803 may be a general-purpose processor, a digital signal processor (DSP), a microprocessor, or a microcontroller, and may further include an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor 2803 may implement or perform the methods, steps, and logical block diagrams disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the methods disclosed with reference to the embodiments of the present application may be directly performed by a hardware decoding processor, or performed by a combination of hardware and software modules in a decoding processor. The software modules may be located in a mature storage medium in the art, such as a RAM, a flash memory, a ROM, a PROM, an EPROM, an EEPROM, or a register. The storage medium is located in the memory 2804, and the processor 2803 reads the information in the memory 2804 and completes the steps of the foregoing method in combination with its hardware.
Receiver 2801 can be used to receive input numeric or character information and to generate signal inputs related to settings and function control of the electronic device. The transmitter 2802 may be used to output numeric or character information through a first interface; transmitter 2802 is also operable to send a command to the disk group via the first interface to modify data in the disk group; the transmitter 2802 may also include a display device such as a display screen.
In this embodiment, the processor 2803 is configured to perform processing-related steps in the image enhancement method in the foregoing embodiments.
Referring to fig. 29, fig. 29 is a schematic structural diagram of a server according to an embodiment of the present application. The server 2900 may vary considerably depending on configuration or performance, and may include one or more central processing units (CPUs) 2922 (for example, one or more processors), a memory 2932, and one or more storage media 2930 (for example, one or more mass storage devices) storing an application 2942 or data 2944. The memory 2932 and the storage medium 2930 may be transient or persistent storage. The program stored in the storage medium 2930 may include one or more modules (not shown), and each module may include a series of instruction operations on the server. Further, the central processing unit 2922 may be configured to communicate with the storage medium 2930 and perform, on the server 2900, the series of instruction operations in the storage medium 2930.
The server 2900 may further include one or more power supplies 2926, one or more wired or wireless network interfaces 2950, one or more input/output interfaces 2958, and/or one or more operating systems 2941, such as Windows Server™, Mac OS X™, Unix™, Linux™, and FreeBSD™.
In the embodiment of the present application, the central processing unit 2922 is configured to execute the image enhancement method described in the above embodiment.
There is also provided in an embodiment of the present application a computer program product comprising instructions which, when run on a computer, cause the computer to perform the steps of the image enhancement method.
An embodiment of the present application also provides a computer-readable storage medium, in which a program for signal processing is stored, which, when run on a computer, causes the computer to perform the steps of the image enhancement method in the method as described in the foregoing embodiment.
The execution device and the training device provided by the embodiment of the application may specifically be chips, and the chips include: a processing unit, which may be, for example, a processor, and a communication unit, which may be, for example, an input/output interface, a pin or a circuit, etc. The processing unit may execute the computer-executable instructions stored by the storage unit to cause the chip in the execution device to execute the image enhancement method described in the above embodiments, or to cause the chip in the training device to execute the image enhancement method described in the above embodiments. Optionally, the storage unit is a storage unit in the chip, such as a register, a cache, and the like, and the storage unit may also be a storage unit located outside the chip in the wireless access device, such as a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a Random Access Memory (RAM), and the like.
Specifically, referring to fig. 30, fig. 30 is a schematic structural diagram of a chip according to an embodiment of the present application. The chip may be implemented as a neural network processor (NPU) 300. The NPU 300 is mounted on a host CPU as a coprocessor, and the host CPU allocates tasks to it. The core part of the NPU is the arithmetic circuit 3003, and the controller 3004 controls the arithmetic circuit 3003 to extract matrix data from the memory and perform multiplication operations.
In some implementations, the arithmetic circuit 3003 internally includes a plurality of processing units (PEs). In some implementations, the arithmetic circuit 3003 is a two-dimensional systolic array. The arithmetic circuit 3003 may alternatively be a one-dimensional systolic array or another electronic circuit capable of performing mathematical operations such as multiplication and addition. In some implementations, the arithmetic circuit 3003 is a general-purpose matrix processor.
For example, assume that there is an input matrix A, a weight matrix B, and an output matrix C. The arithmetic circuit fetches the data corresponding to the matrix B from the weight memory 3002 and buffers the data on each PE in the arithmetic circuit. The arithmetic circuit fetches the data of the matrix A from the input memory 3001, performs a matrix operation on it with the matrix B, and stores a partial result or the final result of the obtained matrix in the accumulator (accumulator) 3008.
The unified memory 3006 is configured to store input data and output data. The weight data is directly transferred to the weight memory 3002 through a direct memory access controller (DMAC) 3005. The input data is also transferred to the unified memory 3006 through the DMAC.
The bus interface unit (BIU) 3010 is used for interaction among the AXI bus, the DMAC 3005, and the instruction fetch buffer (IFB) 3009. The bus interface unit 3010 is configured to enable the instruction fetch buffer 3009 to obtain instructions from an external memory, and is further configured to enable the storage unit access controller 3005 to obtain the original data of the input matrix A or the weight matrix B from the external memory.
The DMAC is mainly used to transfer input data in the external memory DDR to the unified memory 3006, to transfer weight data to the weight memory 3002, or to transfer input data to the input memory 3001.
The vector calculation unit 3007 includes a plurality of operation processing units, and further processes the output of the arithmetic circuit when necessary, for example, performs vector multiplication, vector addition, exponential operation, logarithmic operation, or magnitude comparison. The vector calculation unit 3007 is mainly used for non-convolution/non-fully-connected layer computation in the neural network, such as batch normalization, pixel-level summation, and up-sampling of a feature plane.
In some implementations, the vector calculation unit 3007 can store the processed output vector to the unified memory 3006. For example, the vector calculation unit 3007 may apply a linear function and/or a non-linear function to the output of the arithmetic circuit 3003, for example, perform linear interpolation on the feature planes extracted by the convolutional layers, or accumulate a vector of values to generate activation values. In some implementations, the vector calculation unit 3007 generates normalized values, pixel-level summed values, or both. In some implementations, the processed output vector can be used as an activation input to the arithmetic circuit 3003, for example, for use in a subsequent layer of the neural network.
The instruction fetch buffer (IFB) 3009 is connected to the controller 3004 and is configured to store instructions used by the controller 3004.
the unified memory 3006, the input memory 3001, the weight memory 3002, and the instruction fetch memory 3009 are all On-Chip memories. The external memory is private to the NPU hardware architecture.
The processor mentioned in any of the above may be a general-purpose central processing unit, a microprocessor, an ASIC, or one or more integrated circuits for controlling the execution of the programs of the image enhancement method.
It should be noted that the above-described embodiments of the apparatus are merely schematic, where the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. In addition, in the drawings of the embodiments of the apparatus provided by the present application, the connection relationship between the modules indicates that there is a communication connection between them, and may be specifically implemented as one or more communication buses or signal lines.
Through the above description of the embodiments, those skilled in the art can clearly understand that the present application may be implemented by software plus necessary general-purpose hardware, and certainly may also be implemented by dedicated hardware including an application-specific integrated circuit, a dedicated CPU, a dedicated memory, dedicated components, and the like. Generally, any function performed by a computer program can be easily implemented by corresponding hardware, and the specific hardware structures used to implement the same function may be various, for example, an analog circuit, a digital circuit, or a dedicated circuit. However, for the present application, a software program implementation is a better implementation in most cases. Based on such an understanding, the technical solutions of the present application may essentially be embodied in the form of a software product. The computer software product is stored in a readable storage medium, such as a floppy disk, a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disc of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a training device, or a network device) to perform the methods described in the embodiments of the present application.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, all or some of the processes or functions according to the embodiments of the present application are generated. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from one website, computer, training device, or data center to another website, computer, training device, or data center in a wired manner (for example, through a coaxial cable, an optical fiber, or a digital subscriber line (DSL)) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device, such as a training device or a data center, integrating one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)), or the like.

Claims (22)

1. A method of image enhancement, the method comprising:
acquiring a first image, wherein the first image comprises a target object;
acquiring a guide image according to the first image, wherein the guide image comprises the target object, and the definition of the target object in the guide image is greater than that of the target object in the first image;
and enhancing the target object in the first image through a neural network according to the target object in the guide image to obtain a target image, wherein the target image comprises the enhanced target object, and the definition of the enhanced target object is greater than that of the target object in the first image.
2. The method of claim 1, wherein the target object comprises at least one of: the face, eyes, ears, nose, eyebrows, or mouth of the same person.
3. The method according to claim 1 or 2, wherein a degree of difference between the posture of the target object in the guide image and the posture of the target object in the first image is within a preset range.
4. The method of any of claims 1 to 3, wherein said acquiring a guide image from said first image comprises:
determining the guide image from the at least one second image according to a degree of difference between a pose of the target object in the first image and a pose of each of the at least one second image.
5. The method of claim 4, wherein the method further comprises, prior to the determining the degree of difference between the pose of the target object in the first image and the pose of each of the at least one second image:
displaying a first image selection interface, the first image selection interface including at least one image;
receiving a first image selection instruction, wherein the first image selection instruction represents that at least one second image is selected from at least one image included in the first image selection interface.
6. The method of any of claims 1 to 3, wherein said acquiring a guide image from said first image comprises:
determining at least one third image according to the posture of the target object in the first image, wherein each third image in the at least one third image comprises the target object, and the difference degree between the posture of the target object in each third image and the posture of the target object in the first image is within a preset range;
displaying a second image selection interface, the second image selection interface including the at least one third image;
receiving a second image selection instruction, wherein the second image selection instruction represents that the guide image is selected from at least one third image included in the second image selection interface.
7. The method according to any of claims 1 to 6, wherein the target image comprises an enhanced target object having a guide image feature closer to the target object in the guide image than the target object in the first image, wherein the guide image feature comprises at least one of the following image features:
dynamic range of luminance, hue, contrast, saturation, texture information, and contour information.
8. The method according to any one of claims 1 to 7, wherein the target image comprises an enhanced target object, and a difference degree between a posture of the enhanced target object and a posture of the target object in the first image is within a preset range.
9. The method of any of claims 1 to 8, wherein said acquiring a first image comprises:
displaying a shooting interface of a camera;
receiving shooting operation of a user, and acquiring the first image in response to the shooting operation; or displaying an album interface of the camera, wherein the album interface comprises a plurality of images;
receiving a third image selection instruction, wherein the third image selection instruction represents selection of the first image from a plurality of images included in the album interface.
10. The method of any of claims 1 to 9, wherein said acquiring a guide image from said first image comprises:
and receiving a guide image sent by a server, wherein the guide image is obtained by the server according to the first image.
11. An image enhancement apparatus applied to an electronic device or a server, the image enhancement apparatus comprising:
an acquisition module for acquiring a first image, the first image including a target object; acquiring a guide image according to the first image, wherein the guide image comprises the target object, and the definition of the target object in the guide image is greater than that of the target object in the first image;
and the processing module is used for enhancing the target object in the first image through a neural network according to the target object in the guide image to obtain a target image, wherein the target image comprises the enhanced target object, and the definition of the enhanced target object is greater than that of the target object in the first image.
12. The image enhancement apparatus according to claim 11, wherein the target object includes at least one of: the face, eyes, ears, nose, eyebrows, or mouth of the same person.
13. The image enhancement apparatus according to claim 11 or 12, wherein a degree of difference between the posture of the target object in the guide image and the posture of the target object in the first image is within a preset range.
14. The image enhancement device according to any one of claims 11 to 13, wherein the obtaining module is specifically configured to:
determining the guide image from the at least one second image according to a degree of difference between a pose of the target object in the first image and a pose of each of the at least one second image.
15. The image enhancement device of claim 14, wherein the image enhancement device further comprises:
the display module is used for displaying a first image selection interface, and the first image selection interface comprises at least one image;
a receiving module, configured to receive a first image selection instruction, where the first image selection instruction indicates that the at least one second image is selected from at least one image included in the first image selection interface.
16. The image enhancement device according to any one of claims 11 to 13, wherein the processing module is specifically configured to:
determining at least one third image according to the posture of the target object in the first image, wherein each third image in the at least one third image comprises the target object, and the difference degree between the posture of the target object in each third image and the posture of the target object in the first image is within a preset range;
the display module is further configured to display a second image selection interface, where the second image selection interface includes the at least one third image;
the receiving module is further configured to receive a second image selection instruction, where the second image selection instruction indicates that the guide image is selected from at least one third image included in the second image selection interface.
17. The image enhancement device according to any one of claims 11 to 16, wherein the target image comprises an enhanced target object, and a guide image feature of the enhanced target object is closer to the target object in the guide image than the target object in the first image, wherein the guide image feature comprises at least one of the following image features:
dynamic range of luminance, hue, contrast, saturation, texture information, and contour information.
18. The image enhancement device according to any one of claims 11 to 17, wherein the target image includes an enhanced target object whose posture differs from the posture of the target object in the first image within a preset range.
19. The image enhancement device of any one of claims 11 to 18, wherein the display module is further configured to:
displaying a shooting interface of a camera;
the acquisition module is specifically used for receiving shooting operation of a user and acquiring the first image in response to the shooting operation;
or, the display module is further configured to:
displaying an album interface of a camera, the album interface including a plurality of images;
the obtaining module is specifically configured to receive a third image selection instruction, where the third image selection instruction indicates that the first image is selected from a plurality of images included in the album interface.
20. An electronic device, comprising:
one or more processors;
one or more memories;
a plurality of application programs;
and one or more programs, wherein the one or more programs are stored in the memory, and when the one or more programs are executed by the processor, the electronic device is caused to perform the method of any one of claims 1 to 10.
21. A server, comprising:
one or more processors;
one or more memories;
a plurality of application programs;
and one or more programs, wherein the one or more programs are stored in the memory, and when the one or more programs are executed by the processor, the server is caused to perform the method of any one of claims 1 to 10.
22. A computer storage medium comprising computer instructions which, when run on an electronic device, cause the electronic device to perform the image enhancement method of any one of claims 1 to 10.
CN201911026078.XA 2019-10-25 2019-10-25 Image enhancement method and device Pending CN112712470A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201911026078.XA CN112712470A (en) 2019-10-25 2019-10-25 Image enhancement method and device
PCT/CN2020/118833 WO2021078001A1 (en) 2019-10-25 2020-09-29 Image enhancement method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911026078.XA CN112712470A (en) 2019-10-25 2019-10-25 Image enhancement method and device

Publications (1)

Publication Number Publication Date
CN112712470A true CN112712470A (en) 2021-04-27

Family

ID=75541157

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911026078.XA Pending CN112712470A (en) 2019-10-25 2019-10-25 Image enhancement method and device

Country Status (2)

Country Link
CN (1) CN112712470A (en)
WO (1) WO2021078001A1 (en)

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106250825A (en) * 2016-07-22 2016-12-21 厚普(北京)生物信息技术有限公司 A kind of at the medical insurance adaptive face identification system of applications fields scape
CN106920224B (en) * 2017-03-06 2019-11-05 长沙全度影像科技有限公司 A method of assessment stitching image clarity
US10783393B2 (en) * 2017-06-20 2020-09-22 Nvidia Corporation Semi-supervised learning for landmark localization
JP6832252B2 (en) * 2017-07-24 2021-02-24 日本放送協会 Super-resolution device and program
CN109671023B (en) * 2019-01-24 2023-07-21 江苏大学 Face image super-resolution secondary reconstruction method

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106056562A (en) * 2016-05-19 2016-10-26 京东方科技集团股份有限公司 Face image processing method and device and electronic device
US20180061083A1 (en) * 2016-09-01 2018-03-01 Toshihiro Suzuki System and method for calculating image similarity and recording medium
CN107527332A (en) * 2017-10-12 2017-12-29 长春理工大学 Enhancement Method is kept based on the low-light (level) image color for improving Retinex
CN109544482A (en) * 2018-11-29 2019-03-29 厦门美图之家科技有限公司 A kind of convolutional neural networks model generating method and image enchancing method
CN110084775A (en) * 2019-05-09 2019-08-02 深圳市商汤科技有限公司 Image processing method and device, electronic equipment and storage medium

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113301251A (en) * 2021-05-20 2021-08-24 努比亚技术有限公司 Auxiliary shooting method, mobile terminal and computer-readable storage medium
CN113301251B (en) * 2021-05-20 2023-10-20 努比亚技术有限公司 Auxiliary shooting method, mobile terminal and computer readable storage medium
CN113923372A (en) * 2021-06-25 2022-01-11 荣耀终端有限公司 Exposure adjusting method and related equipment
US20230097869A1 (en) * 2021-09-28 2023-03-30 Samsung Electronics Co., Ltd. Method and apparatus for enhancing texture details of images
CN114399622A (en) * 2022-03-23 2022-04-26 荣耀终端有限公司 Image processing method and related device
CN114827567A (en) * 2022-03-23 2022-07-29 阿里巴巴(中国)有限公司 Video quality analysis method, apparatus and readable medium
CN114926351A (en) * 2022-04-12 2022-08-19 荣耀终端有限公司 Image processing method, electronic device, and computer storage medium

Also Published As

Publication number Publication date
WO2021078001A1 (en) 2021-04-29

Similar Documents

Publication Publication Date Title
WO2020168956A1 (en) Method for photographing the moon and electronic device
CN113132620B (en) Image shooting method and related device
WO2020077511A1 (en) Method for displaying image in photographic scene and electronic device
WO2021078001A1 (en) Image enhancement method and apparatus
CN112887583B (en) Shooting method and electronic equipment
CN113538273B (en) Image processing method and image processing apparatus
WO2020029306A1 (en) Image capture method and electronic device
WO2020102978A1 (en) Image processing method and electronic device
WO2022017261A1 (en) Image synthesis method and electronic device
CN113170037B (en) Method for shooting long exposure image and electronic equipment
CN110138999B (en) Certificate scanning method and device for mobile terminal
CN114979457B (en) Image processing method and related device
CN113452969B (en) Image processing method and device
CN116206100A (en) Image processing method based on semantic information and electronic equipment
WO2021180046A1 (en) Image color retention method and device
CN112150499A (en) Image processing method and related device
CN114697543B (en) Image reconstruction method, related device and system
CN113536834A (en) Pouch detection method and device
CN113821130A (en) Method and related device for determining screenshot area
CN114445522A (en) Brush effect graph generation method, image editing method, device and storage medium
CN113495733A (en) Theme pack installation method and device, electronic equipment and computer readable storage medium
CN113518172A (en) Image processing method and device
CN115150542B (en) Video anti-shake method and related equipment
CN116193275B (en) Video processing method and related equipment
CN115686182B (en) Processing method of augmented reality video and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination