WO2022021025A1 - Image enhancement method and apparatus - Google Patents

Image enhancement method and apparatus

Info

Publication number
WO2022021025A1
WO2022021025A1 (PCT/CN2020/104969)
Authority
WO
WIPO (PCT)
Prior art keywords: image, feature map, pixel, processed, neural network
Prior art date
Application number
PCT/CN2020/104969
Other languages
English (en)
French (fr)
Inventor
杨鑫
尹宝才
许可
于乐天
李蒙
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Dalian University of Technology (大连理工大学)
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. and Dalian University of Technology
Priority to PCT/CN2020/104969
Priority to CN202080101508.4A (CN115769247A)
Publication of WO2022021025A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 Image enhancement or restoration
    • G06T 5/60 Image enhancement or restoration using machine learning, e.g. neural networks
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G06T 2207/20172 Image enhancement details
    • G06T 2207/20192 Edge enhancement; Edge preservation

Definitions

  • the present application relates to the technical field of image processing, and in particular, to an image enhancement method and apparatus.
  • an image captured under low-light conditions can be adjusted to an image acceptable to human eyes, i.e., the brightness and/or contrast of the image can be enhanced.
  • however, enhancing the brightness and/or contrast of the image and denoising the image cannot be achieved at the same time. Therefore, how to simultaneously enhance the brightness and/or contrast of an image captured under low-light conditions and denoise that image is an urgent technical problem to be solved.
  • the present application provides an image enhancement method and device, through which the enhancement of image brightness and/or contrast and image denoising can be simultaneously achieved.
  • the present application provides an image enhancement method, which is applied to an image enhancement apparatus.
  • the method includes: acquiring a low-frequency feature map of an image to be processed through a first neural network, and determining a first image according to the low-frequency feature map.
  • the first image includes basic information of the reconstructed image to be processed, and the basic information includes contour information of the image to be processed.
  • a high-frequency feature map of the image to be processed is acquired through a second neural network, and a second image is determined according to the high-frequency feature map.
  • the second image includes detail information of the reconstructed image to be processed, where the detail information includes at least one of edges or textures of the image to be processed.
  • the first image and the second image are fused to obtain an enhanced image of the to-be-processed image.
  • the image enhancement method provided by the present application reconstructs a denoised basic image by acquiring the low-frequency feature map of the image to be processed (the denoising arises because noise is mostly high-frequency information, so reconstructing the basic image from the low-frequency features is equivalent to filtering out most of the high-frequency information), reconstructs the detail image from the acquired high-frequency feature map of the image to be processed, and then fuses the basic image and the detail image to obtain the enhanced image of the image to be processed.
  • through this method, the brightness and/or contrast of the image can be enhanced, and the noise in the image to be processed can also be effectively filtered out.
  • the above-mentioned "obtaining the low-frequency feature map of the image to be processed through the first neural network" includes: using the first neural network to obtain the first feature map of the image to be processed, and then using the first neural network to multiply the pixel value of the first pixel in the first feature map by the pixel value of the second pixel corresponding to the first pixel in the image to be processed, so as to obtain the low-frequency feature map of the image to be processed.
  • the above-mentioned "obtaining the high-frequency feature map of the image to be processed through the second neural network” includes: using the second neural network to invert the pixel value of each pixel in the aforementioned first feature map to obtain the second feature map.
  • then, the second neural network is used to multiply the pixel value of the third pixel in the second feature map by the pixel value of the fourth pixel corresponding to the third pixel in the aforementioned first image, or to multiply the pixel value of the third pixel by the pixel value of the fifth pixel corresponding to the third pixel in the image to be processed, so as to obtain the high-frequency feature map of the image to be processed.
  • the second neural network and the first neural network have the same network structure.
  • the method of the present application shares the parameters used for acquiring the low-frequency feature map of the image to be processed; that is, by inverting the first feature map used to determine the low-frequency feature map of the image to be processed, the image enhancement device can obtain the second feature map used to determine the high-frequency feature map of the image to be processed.
  • in this way, the image enhancement device can avoid a large number of convolution operations (that is, the convolution operations for acquiring the second feature map from the image to be processed or the first image) during the process of enhancing the image to be processed, thereby saving computing power.
  • the above-mentioned "inversion" means: subtracting the pixel value of each pixel in the first feature map from 1.
  • in this way, the second feature map used to determine the high-frequency feature map of the image to be processed can be obtained simply and quickly from the first feature map used to determine the low-frequency feature map, thereby avoiding a large number of convolution operations (i.e., the convolution operations for obtaining the second feature map from the image to be processed or the first image) and saving computing power.
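  • as an illustration of this parameter sharing, the following Python (NumPy) sketch treats the first feature map as an attention map with values in [0, 1]; the function name and the assumption that both arrays are already aligned are illustrative, not part of the application.

```python
import numpy as np

def split_frequency_maps(image: np.ndarray, first_feature_map: np.ndarray):
    """Hedged sketch: one shared map yields both frequency branches.

    `first_feature_map` plays the role of the first feature map described
    above (values assumed in [0, 1]); both arrays share the same shape.
    """
    low_freq = first_feature_map * image           # pixel-wise product -> low-frequency map
    second_feature_map = 1.0 - first_feature_map   # "inversion": subtract each pixel from 1
    high_freq = second_feature_map * image         # reused map, no extra convolutions
    return low_freq, high_freq
```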
  • the above-mentioned "determining the first image according to the low-frequency feature map" includes: first, reconstructing the basic information of the image to be processed according to the low-frequency feature map by using the third neural network, so as to obtain a third image. Next, the color and/or contrast of the third image is enhanced by a constant γ, resulting in a fourth image. Then, the pixel value of the sixth pixel in the third image is multiplied by the pixel value of the seventh pixel corresponding to the sixth pixel in the fourth image to obtain the first image.
  • the above-mentioned "constant γ" is a preset constant, or it is obtained through the third neural network.
  • the above-mentioned "determining the second image according to the high-frequency feature map” includes: using the fourth neural network to reconstruct the detailed information of the image to be processed according to the high-frequency feature map to obtain the second image .
  • the network structure of the fourth neural network is the same as that of the above-mentioned third neural network.
  • the feature map used to obtain the high-frequency feature map in the fourth neural network is obtained by inverting the pixel value of each pixel in the feature map used to obtain the low-frequency feature map in the third neural network.
  • the fourth reconstruction network can share the parameters of the third reconstruction network; that is, the feature map used in the fourth neural network to obtain the high-frequency feature map can be obtained by inverting the pixel values in the feature map used in the third neural network to obtain the low-frequency feature map. In this way, a large number of convolution operations (that is, the convolution operations for obtaining the feature map used to obtain the high-frequency feature map in the fourth neural network) can be avoided, thereby saving computing power.
  • the above-mentioned "fusing the first image and the second image to obtain an enhanced image of the image to be processed" includes: adding the pixel value of the eighth pixel in the first image to the pixel value of the ninth pixel corresponding to the eighth pixel in the second image, so as to obtain the enhanced image of the image to be processed.
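  • the following hedged NumPy sketch strings together the γ enhancement and the pixel-wise fusion described above; the gamma value, the clipping range, and the function signature are illustrative assumptions.

```python
import numpy as np

def fuse_enhanced_image(third_image: np.ndarray,
                        second_image: np.ndarray,
                        gamma: float = 1.5) -> np.ndarray:
    """Sketch of determining the first image and fusing it with the second."""
    fourth_image = np.clip(gamma * third_image, 0.0, 1.0)  # enhanced third image
    first_image = third_image * fourth_image               # pixel-wise product
    return first_image + second_image                      # pixel-wise addition
```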
  • the present application provides an image enhancement apparatus.
  • the image enhancement apparatus is configured to perform any one of the methods provided in the first aspect above.
  • the image enhancement apparatus may be divided into functional modules according to any of the methods provided in the first aspect.
  • each functional module may be obtained by division corresponding to one function, or two or more functions may be integrated into one processing module.
  • the present application may divide the image enhancement device into an acquisition unit, a determination unit, a fusion unit, and the like according to functions.
  • the image enhancement device includes: a memory and one or more processors, the memory and the processor being coupled.
  • the memory is used for storing computer instructions.
  • the processor is used for invoking the computer instructions to execute any one of the methods provided by the first aspect and any possible design manners thereof.
  • the present application provides a computer-readable storage medium, such as a non-transitory computer-readable storage medium.
  • a computer program (or instructions) is stored thereon, and when the computer program (or instructions) runs on the image enhancement device, the image enhancement device is made to perform any one of the methods provided by the possible implementations of the first aspect above.
  • the present application provides a computer program product that, when running on an image enhancement device, causes any one of the methods provided by any one of the possible implementations of the first aspect to be performed.
  • the present application provides a chip system, including: a processor, where the processor is configured to call, from a memory, a computer program stored in that memory and run it, so as to execute any one of the methods provided in the implementation manners of the first aspect.
  • any device, computer storage medium, computer program product or chip system provided above can be applied to the corresponding method provided above; therefore, for the beneficial effects that can be achieved, reference may be made to the beneficial effects of the corresponding method, which are not repeated here.
  • FIG. 1 is a first schematic diagram of a hardware structure of an image enhancement apparatus according to an embodiment of the present application.
  • FIG. 2 is a second schematic diagram of the hardware structure of an image enhancement apparatus according to an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of an image enhancement method provided by an embodiment of the present application.
  • FIG. 4 is a network structure diagram of an ACE network provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of corresponding pixels in different feature maps provided by an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a first reconstruction network provided by an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of a first CDT module provided by an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of an image enhancement model provided by an embodiment of the present application.
  • FIG. 9 is a schematic flowchart of a method for training an image enhancement model provided by an embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of an image enhancement apparatus provided by an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of a chip system according to an embodiment of the present application.
  • FIG. 12 is a schematic structural diagram of a computer program product provided by an embodiment of the present application.
  • words such as "exemplary" or "for example" are used to represent examples, illustrations, or explanations. Any embodiment or design described as "exemplary" or "for example" in the embodiments of the present application should not be construed as preferred or advantageous over other embodiments or designs. Rather, the use of words such as "exemplary" or "for example" is intended to present the related concepts in a specific manner.
  • "first" and "second" are used for description purposes only, and cannot be understood as indicating or implying relative importance or implying the number of indicated technical features.
  • a feature defined as “first”, “second” may expressly or implicitly include one or more of that feature.
  • plural means two or more.
  • the term "at least one" means one or more, and the term "plurality" in this application means two or more.
  • for example, a plurality of second messages means two or more second messages.
  • system and “network” are often used interchangeably herein.
  • the size of the sequence numbers of the processes does not imply an execution order; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
  • determining B according to A does not mean that B is only determined according to A, and B may also be determined according to A and/or other information.
  • the term “if” may be interpreted to mean “when” or “upon” or “in response to determining” or “in response to detecting.”
  • the phrases "if it is determined..." or "if a [stated condition or event] is detected" can be interpreted to mean "when determining...", "in response to determining...", "upon detecting the [stated condition or event]", or "in response to detecting the [stated condition or event]".
  • references throughout the specification to "one embodiment," "an embodiment," and "one possible implementation" mean that a particular feature, structure, or characteristic related to the embodiment or implementation is included in at least one embodiment of the present application.
  • appearances of "in one embodiment," "in an embodiment," or "one possible implementation" in various places throughout this specification are not necessarily referring to the same embodiment.
  • the particular features, structures or characteristics may be combined in any suitable manner in one or more embodiments.
  • the embodiment of the present application provides an image enhancement method.
  • the method reconstructs basic information from the low-frequency feature map of the image to be processed, reconstructs detail information from the high-frequency feature map of the image to be processed, and then fuses the reconstructed images to obtain the enhanced image of the image to be processed.
  • this method can not only enhance the brightness and/or contrast of the image to be processed, but also filter out the noise of the image.
  • the image to be processed may be an image captured under low light conditions, or any image that needs to be enhanced, which is not limited in this embodiment of the present application.
  • the above-mentioned basic information includes contour information of the image to be processed, and the like.
  • the above-mentioned detailed information includes at least one of edges or textures of the image to be processed.
  • An embodiment of the present application provides an image enhancement apparatus, and the image enhancement apparatus is configured to execute the above-mentioned image enhancement method.
  • the image enhancement device may be a terminal.
  • the image enhancement device may be a server.
  • the above-mentioned terminal may be a portable device such as a mobile phone, a tablet computer, or a wearable electronic device, a computing device such as a personal computer (PC), a personal digital assistant (PDA), or a netbook, or any other terminal device capable of implementing the embodiments of this application, which is not limited in this application.
  • the above-mentioned image enhancement method can be implemented by an application program installed on the terminal, such as a client application program for processing images.
  • the above application may be an embedded application installed in the device (ie, a system application of the device), or may be a downloadable application.
  • an embedded application is an application provided as part of the implementation of a device (such as a mobile phone).
  • a downloadable application is an application that can provide its own internet protocol multimedia subsystem (IMS) connection; it can be pre-installed in the device, or it can be a third-party application downloaded by the user and installed on the device.
  • FIG. 1 shows a hardware structure of the mobile phone 10 .
  • the mobile phone 10 may include a processor 110 , an external memory interface 120 , an internal memory 121 , a touch screen 130 , an antenna 140 and the like.
  • the touch screen 130 includes a display screen 131 and a touch panel 132 .
  • the processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, an image signal processor (ISP), a graphics processing unit (GPU), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc. Different processing units may be independent devices, or may be integrated in one or more processors.
  • the controller may be the nerve center and command center of the mobile phone 10 .
  • the controller can generate an operation control signal according to the instruction operation code and timing signal, and complete the control of fetching and executing instructions.
  • a memory may also be provided in the processor 110 for storing instructions and data.
  • the memory in processor 110 is cache memory. This memory may hold instructions or data that have just been used or recycled by the processor 110 . If the processor 110 needs to use the instruction or data again, it can be called directly from the memory. Repeated accesses are avoided and the latency of the processor 110 is reduced, thereby increasing the efficiency of the system.
  • the processor 110 may include one or more interfaces.
  • the interface may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface, etc.
  • the mobile phone 10 may adopt different interface connection manners described above, or a combination of multiple interface connection manners.
  • the mobile phone 10 can realize the shooting function through the ISP, the camera, the video codec, the GPU, the touch screen 130, the AP, and the like.
  • the ISP is used to process the data fed back by the camera. The camera is used to acquire still images or video. The object is projected through the lens, generating an optical image on the photosensitive element.
  • DSP is used to process digital signals, in addition to processing digital image signals, it can also process other digital signals.
  • Video codecs are used to compress or decompress digital video.
  • an ISP, a camera, a video codec, a GPU, a touch screen 130, an AP, etc. can be used to capture the above-mentioned images to be processed.
  • the display screen 131 of the touch screen 130 is used for displaying images, videos and the like.
  • the display screen 131 may be used to display the above-mentioned to-be-processed image and the enhanced to-be-processed image.
  • the touch pad 132 of the touch screen 130 may be used to input user instructions and the like.
  • the mobile phone 10 implements the display function through the GPU, the display screen 131, the AP, and the like.
  • the GPU is a microprocessor for image processing, and is connected to the display screen 131 and the application processor AP.
  • the GPU is used to perform mathematical and geometric calculations for graphics rendering.
  • Processor 110 may include one or more GPUs that execute program instructions to generate or alter display information.
  • the NPU is a neural-network (NN) computing processor, which can rapidly process input information and continuously self-learn by drawing on the structure of biological neural networks, such as the transfer mode between neurons in the human brain.
  • Applications such as intelligent processing of the mobile phone 10, such as image enhancement, can be implemented through the NPU.
  • the external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the mobile phone 10 .
  • the external memory card communicates with the processor 110 through the external memory interface 120 to realize the data storage function. For example, save files such as images on an external memory card.
  • Internal memory 121 may be used to store computer executable program code, which includes instructions.
  • the processor 110 executes various functional applications and data processing of the mobile phone 10 by executing the instructions stored in the internal memory 121 .
  • the antenna 140 transmits and receives electromagnetic wave signals.
  • Each antenna in handset 10 may be used to cover a single or multiple communication frequency bands. Different antennas can also be reused to improve antenna utilization.
  • the antenna 1 can be multiplexed as a diversity antenna of the wireless local area network. In other embodiments, the antenna may be used in conjunction with a tuning switch.
  • the structures illustrated in the embodiments of the present application do not constitute a specific limitation on the mobile phone 10 .
  • the mobile phone 10 may include more or fewer components than shown, or combine some components, or split some components, or use a different arrangement of components.
  • the illustrated components may be implemented in hardware, software, or a combination of software and hardware.
  • the embodiment of the present application further provides another image enhancement device, which is used for training a neural network, thereby obtaining an image enhancement model capable of implementing the above-mentioned image enhancement method.
  • the image enhancement apparatus may be a server, or any other computing device with computing power for training a neural network.
  • FIG. 2 shows a schematic diagram of a hardware structure of a server provided by an embodiment of the present application.
  • the server 20 may include a processor 21 , a memory 22 , a communication interface 23 and a bus 24 .
  • the processor 21 , the memory 22 and the communication interface 23 may be connected through a bus 24 .
  • the processor 21 is the control center of the server 20, and may be a general-purpose central processing unit (central processing unit, CPU), or other general-purpose processors. Wherein, the general-purpose processor may be a microprocessor or any conventional processor or the like.
  • processor 21 may include one or more CPUs, such as CPU 0 and CPU 1 shown in FIG. 2 .
  • the memory 22 may be a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a random access memory (RAM) or another type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto.
  • the memory 22 may exist independently of the processor 21 .
  • the memory 22 may be connected to the processor 21 through a bus 24 for storing data, instructions or program codes.
  • the processor 21 calls and executes the instructions or program codes stored in the memory 22, an image enhancement model that implements the image enhancement method provided by the embodiments of the present application can be trained.
  • the memory 22 may also be integrated with the processor 21 .
  • the communication interface 23 is used for connecting the server 20 with other devices (such as terminals) through a communication network, and the communication network may be an Ethernet, a radio access network (RAN), a wireless local area network (WLAN), or the like.
  • the communication interface 23 may include a receiving unit for receiving data, and a transmitting unit for transmitting data.
  • the bus 24 may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like.
  • the bus can be divided into an address bus, a data bus, a control bus, and so on. For ease of presentation, only one thick line is used in FIG. 2, but this does not mean that there is only one bus or one type of bus.
  • the structure shown in FIG. 2 does not constitute a limitation on the server 20.
  • the server 20 may include more or fewer components than those shown in the figure, or combine some components, or use a different arrangement of components.
  • FIG. 3 shows a schematic flowchart of an image enhancement method provided by an embodiment of the present application.
  • the method is applied to an image enhancement apparatus, and the image enhancement apparatus may be a terminal or a server as shown in FIG. 2 , which is not limited.
  • the method may include the following steps:
  • An image enhancement apparatus acquires an image to be processed.
  • the image to be processed may be an image captured under low light conditions, or an image captured under other shooting conditions and needs to be enhanced, which is not limited in this embodiment of the present application.
  • the image enhancement apparatus may acquire the to-be-processed image from a local picture library.
  • the pictures in the picture library include pre-shot and saved pictures, pictures downloaded from the network, pictures transmitted by Bluetooth, pictures sent by social software, and video screenshots in videos, etc., which are not limited.
  • the user can select and load an image to be processed from a local picture library through a touch operation on a touch screen of the image enhancement device (eg, the touch screen 130 shown in FIG. 1 ), or through a voice interaction module of the image enhancement device.
  • the image enhancement device can acquire the image to be processed.
  • the image enhancement device may also take pictures in real time, and use the pictures obtained in real time as the images to be processed.
  • the image enhancement apparatus may also acquire the image to be processed in any other manner, which is not specifically limited in this embodiment of the present application.
  • the image enhancement device obtains the low-frequency feature map of the image to be processed through the first neural network.
  • the first neural network may be an attention to context encoding (ACE) network.
  • the image enhancement device may obtain the low-frequency feature map of the image to be processed through the ACE network.
  • the process can include the following steps:
  • Step 1: The image enhancement apparatus uses the ACE network 40 to obtain the first feature map 42 of the image to be processed 41.
  • the first feature map 42 is used to determine the first low-frequency feature map 43 of the image to be processed.
  • the ACE network 40 can use two convolution kernels to perform a convolution operation with the image to be processed 41 to obtain the feature map 1 and the feature map 2 .
  • the above two convolution kernels may be convolution kernels of two different receptive fields (or called fields of view (FOV)), such as convolution kernel 1 and convolution kernel 2 shown in FIG. 4 .
  • the above-mentioned convolution kernel 1 may be a convolution kernel with a size of 3 ⁇ 3 and a dilation rate of 2, that is, the receptive field of the convolution kernel 1 is 5.
  • the above-mentioned convolution kernel 2 may be a 1 ⁇ 1 convolution kernel, that is, the receptive field of the convolution kernel 2 is 1.
  • the ACE network 40 can use the convolution kernel 1 to perform a dilated convolution operation with the image to be processed 41 to obtain the feature map 1 .
  • the ACE network 40 can use the convolution kernel 2 to perform an ordinary convolution operation with the image to be processed 41 to obtain the feature map 2 .
  • the above-mentioned convolution kernel 1 may be a convolution kernel with a size of 5 ⁇ 5, that is, the receptive field of the convolution kernel 1 is 5.
  • the above-mentioned convolution kernel 2 may be a 1 ⁇ 1 convolution kernel, that is, the receptive field of the convolution kernel 2 is 1.
  • the ACE network 40 can use the convolution kernel 1 to perform an ordinary convolution operation with the image to be processed 41 to obtain the feature map 1 .
  • the ACE network 40 can use the convolution kernel 2 to perform an ordinary convolution operation with the image to be processed 41 to obtain the feature map 2 .
  • the ACE network 40 can pad the image to be processed 41 before using the convolution kernel 1 and the convolution kernel 2 to perform the convolution operations with the image to be processed, respectively.
  • the ACE network 40 takes the difference between the pixel values of the corresponding pixels in the feature map 1 and the feature map 2, thereby obtaining the contrast-aware feature map C_a used for determining the high-frequency feature map of the image to be processed.
  • the corresponding pixels in the feature map 1 and the feature map 2 refer to the pixels located at the same position in the feature map 1 and the feature map 2 .
  • the difference between the pixel values of the corresponding pixels in the feature map 1 and the feature map 2 refers to the difference between the pixel values of the pixels located at the same position in the feature map 1 and the feature map 2.
  • for example, the ACE network 40 can take the difference between the pixel value of pixel 1 in the feature map 1 and the pixel value of the pixel 2 corresponding to pixel 1 in the feature map 2, so as to obtain the feature map C_a used for determining the high-frequency feature map of the image to be processed.
  • pixel 1 and pixel 2 are pixels located at the same position in the feature map 1 and the feature map 2.
  • FIG. 5 exemplarily shows pixels located in the same position in the feature map 51 and the feature map 52 .
  • the pixel Z511 in the feature map 51 and the pixel Z521 in the feature map 52 are pixels located at the same position.
  • pixel Z515 in feature map 51 and pixel Z525 in feature map 52 are pixels located at the same position.
  • pixel Z519 in feature map 51 and pixel Z529 in feature map 52 are pixels located at the same position, and so on.
  • the ACE network 40 inverts the pixel value of each pixel in the contrast-aware feature map C_a, thereby obtaining the first feature map 42 used to determine the first low-frequency feature map 43 of the image to be processed.
  • for example, the ACE network 40 can subtract each pixel value in C_a from 1; that is, the first feature map 42 = 1 - C_a.
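  • the following PyTorch sketch illustrates this two-receptive-field attention step; the single-channel setting, the learned weights, and the sigmoid used to keep C_a in [0, 1] are assumptions not stated in the text.

```python
import torch
import torch.nn as nn

class ContrastAwareAttention(nn.Module):
    """Sketch of the ACE attention step (single-channel input assumed)."""
    def __init__(self, channels: int = 1):
        super().__init__()
        # convolution kernel 1: 3x3 with dilation 2 -> receptive field 5
        self.kernel1 = nn.Conv2d(channels, channels, 3, padding=2, dilation=2)
        # convolution kernel 2: 1x1 -> receptive field 1
        self.kernel2 = nn.Conv2d(channels, channels, 1)

    def forward(self, x: torch.Tensor):
        # sigmoid is an assumption so that C_a stays in [0, 1]
        c_a = torch.sigmoid(self.kernel1(x) - self.kernel2(x))  # feature map C_a
        first_feature_map = 1.0 - c_a        # inversion -> first feature map 42
        low_freq_43 = first_feature_map * x  # first low-frequency feature map 43
        return low_freq_43, c_a
```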
  • Step 2: The image enhancement apparatus uses the first neural network (i.e., the ACE network 40) to multiply the pixel values of the first feature map 42 by the pixel values of the corresponding pixels of the image to be processed 41, so as to obtain the first low-frequency feature map 43 of the image to be processed.
  • multiplying the first feature map 42 by the pixel values of the corresponding pixels of the image to be processed 41 refers to multiplying the pixel values of the pixels located at the same position in the first feature map 42 and the image to be processed 41.
  • the image enhancement apparatus may use the ACE network 40 to multiply the pixel value of the first pixel in the first feature map 42 with the pixel value of the second pixel corresponding to the first pixel in the image to be processed 41, so as to obtain The first low frequency feature map 43 .
  • the first pixel and the second pixel are pixels located at the same position in the first feature map 42 and the image to be processed 41 .
  • Step 3: Based on the first low-frequency feature map 43, the image enhancement apparatus uses the ACE network 40 to obtain a second low-frequency feature map 47 with contextual information.
  • the context information is the context information of the pixels in the second low-frequency feature map 47 .
  • the ACE network 40 may first perform pooling on the first low-frequency feature map 43 to obtain a pooled low-frequency feature map 44 .
  • the ACE network 40 may pool the first low-frequency feature map 43 by adopting a max pooling method. Alternatively, the ACE network 40 may use an average pooling method to pool the first low-frequency feature map 43 .
  • the embodiments of the present application do not specifically limit the specific pooling manner.
  • the ACE network 40 obtains a low-frequency feature map 46 with contextual information through the non-local neural sub-network 45 .
  • the context information refers to the relationship information between each pixel in the pooled low-frequency feature map 44 and all other pixels in that feature map.
  • the non-local neural sub-network 45 can transpose the low-frequency feature map 44 to obtain the transposed low-frequency feature map 440. Then, the non-local neural sub-network 45 performs a convolution operation on the low-frequency feature map 44 and the transposed low-frequency feature map 440, thereby obtaining the low-frequency feature map M. Then, the non-local neural sub-network 45 performs a convolution operation on the low-frequency feature map M and the transposed low-frequency feature map 440 to obtain the low-frequency feature map 46 with contextual information.
  • the process in which the ACE network 40 obtains the low-frequency feature map 46 with contextual information through the non-local neural sub-network 45 can be understood as further learning the features of the pooled low-frequency feature map 44.
  • the ACE network 40 de-pools the low-frequency feature map 46 to obtain a second low-frequency feature map 47 .
  • the way in which the ACE network 40 performs de-pooling on the low-frequency feature map 46 is inverse to the way in which the ACE network 40 performs pooling on the first low-frequency feature map 43 .
  • the ACE network 40 performs de-pooling on the low-frequency feature map 46, which may specifically be: according to the size of the first low-frequency feature map 43, the ACE network 40 predicts the adjacent pixels of each pixel in the low-frequency feature map 46, so that A second low frequency feature map 47 is obtained.
  • Step 4 (optional): The image enhancement apparatus uses the ACE network 40 to fuse the first low-frequency feature map 43 and the second low-frequency feature map 47 to obtain the target low-frequency feature map 48.
  • the ACE network 40 can add the pixel values of the corresponding pixels of the first low-frequency feature map 43 and the second low-frequency feature map 47 to obtain the target low-frequency feature map, which is the low-frequency feature map of the image to be processed described in this embodiment of the present application.
  • adding the pixel values of the corresponding pixels in the first low-frequency feature map 43 and the second low-frequency feature map 47 refers to adding the pixel values of the pixels located at the same position in the two feature maps.
  • for example, the ACE network 40 may add the pixel value of pixel 1 in the first low-frequency feature map 43 and the pixel value of pixel 2 corresponding to pixel 1 in the second low-frequency feature map 47, so as to obtain the target low-frequency feature map.
  • pixel 1 and pixel 2 are pixels located at the same position in the first low-frequency feature map 43 and the second low-frequency feature map 47.
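  • the following PyTorch sketch approximates steps 3 and 4; since the exact non-local operator is not fully specified above, a standard non-local (self-attention) formulation and max pooling/unpooling are assumed here.

```python
import torch
import torch.nn.functional as F

def context_branch(low_freq_43: torch.Tensor) -> torch.Tensor:
    """Loose sketch of steps 3-4: pool, model pixel relations, unpool, fuse."""
    _, _, h, w = low_freq_43.shape
    pooled, idx = F.max_pool2d(low_freq_43, 2, return_indices=True)     # map 44
    flat = pooled.flatten(2)                                            # N x C x HW/4
    affinity = torch.softmax(flat.transpose(1, 2) @ flat, -1)           # pixel-to-pixel relations
    context = (flat @ affinity).view_as(pooled)                         # map 46 with context
    unpooled = F.max_unpool2d(context, idx, 2, output_size=(h, w))      # map 47
    return low_freq_43 + unpooled               # fusion -> target low-frequency map 48
```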
  • the first low-frequency feature map 43 may also be directly used as the low-frequency feature map of the image to be processed described in the embodiment of the present application, which is not specifically limited. It should be understood that, in this case, the image enhancement apparatus does not need to perform the above-mentioned steps 3 and 4.
  • the image enhancement apparatus determines the first image based on the above-mentioned low-frequency feature map.
  • the first image includes basic information of the reconstructed image to be processed, and the basic information includes outline information of the image to be processed, and the like.
  • the image enhancement device may determine the first image through the following steps:
  • the image enhancement apparatus uses the first reconstruction network (corresponding to the third neural network in the embodiment of the present application) to reconstruct the basic information of the image to be processed, and obtains the basic image (corresponding to the third image in the embodiment of the present application).
  • the first reconstruction network may be a U-shaped network (Unet) combined with a first cross-domain transformation (CDT) module.
  • the Unet network includes at least one layer of downsampling convolutional network (i.e., a network with a stride of 2) and at least one layer of deconvolutional network.
  • the first CDT module is applied in combination with the at least one layer of deconvolution network, and each layer of deconvolution network is connected to a first CDT module.
  • FIG. 6 exemplarily shows the structure of the first reconstruction network 60 .
  • the first reconstruction network 60 includes four layers of convolutional networks, which are respectively a convolutional network 601 , a convolutional network 602 , a convolutional network 603 and a convolutional network 604 .
  • the convolutional network 601 is a first-layer convolutional network, and its input is the low-frequency feature map obtained in the above step S102, such as the low-frequency feature map 48 shown in FIG. 4 .
  • the convolutional network 604 is a fourth-layer convolutional network whose output is the feature map 605 .
  • each time the feature map passes through a layer of the convolutional network, the size of the feature map output by that layer may be 1/2 of the size of the feature map input to that layer.
  • the first reconstruction network 60 further includes four layers of deconvolution networks, which are a deconvolution network 6011 , a deconvolution network 6021 , a deconvolution network 6031 and a deconvolution network 6041 respectively.
  • the deconvolution network and the convolution network correspond one by one, for example, the deconvolution network 6011 corresponds to the convolution network 601, that is, the size of the output feature map of the deconvolution network 6011 is the same as the size of the input feature map of the convolution network 601. same.
  • the input of the deconvolution network 6041 is the feature map 605 output by the convolution network 604 .
  • each time the feature map passes through a layer of the deconvolution network, the size of the feature map output by that layer may be 2 times the size of the feature map input to that layer.
  • the first reconstruction network 60 further includes four first CDT modules, which are CDT module 6042 , CDT module 6032 , CDT module 6022 and CDT module 6012 respectively.
  • the CDT module 6042 is connected to the deconvolution network 6041, and the input feature map of the CDT module 6042 is the output feature map of the deconvolution network 6041.
  • the CDT module 6042 and the deconvolution network 6041 are in the same network layer.
  • the CDT module 6042 is also connected to the deconvolution network 6031 , and the output feature map of the CDT module 6042 is the input feature map of the deconvolution network 6031 .
  • the connection of the CDT module 6032, the CDT module 6022, and the CDT module 6012 to the deconvolution network is similar to the connection of the CDT module 6042 to the deconvolution network, and details are not described here.
  • the feature map 62 output by the CDT module 6012 is the basic image reconstructed by the first reconstruction network 60 .
  • the image enhancement apparatus can reconstruct the basic information of the to-be-processed image by using the first reconstruction network as shown in FIG. 6 according to the low-frequency feature map of the to-be-processed image, thereby obtaining the basic image.
  • the basic image reconstructed by using the first reconstruction network has higher brightness and/or contrast than the image to be processed; that is, the brightness and/or contrast of the basic image is the same as (or similar to) the brightness and/or contrast of an image taken under natural daylight during the day.
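  • a minimal PyTorch skeleton of this reconstruction network is sketched below, assuming an illustrative channel width and omitting Unet skip connections; `CDTModule` refers to the sketch that follows the FIG. 7 discussion below.

```python
import torch.nn as nn

class FirstReconstructionNet(nn.Module):
    """Skeleton of the first reconstruction network 60 in FIG. 6 (sketch only)."""
    def __init__(self, ch: int = 32):
        super().__init__()
        # four stride-2 convolutional layers (601-604): each halves the size
        self.downs = nn.ModuleList(
            nn.Conv2d(ch, ch, 3, stride=2, padding=1) for _ in range(4))
        # four deconvolution layers (6041-6011): each doubles the size
        self.ups = nn.ModuleList(
            nn.ConvTranspose2d(ch, ch, 2, stride=2) for _ in range(4))
        # one CDT module per deconvolution layer (6042-6012)
        self.cdts = nn.ModuleList(CDTModule(ch) for _ in range(4))

    def forward(self, x):
        for down in self.downs:
            x = down(x)        # feature map 605 after the fourth layer
        for up, cdt in zip(self.ups, self.cdts):
            x = cdt(up(x))     # each deconv output feeds its CDT module
        return x               # reconstructed basic image (feature map 62)
```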
  • FIG. 7 shows a schematic structural diagram of the first CDT module 70 .
  • the first CDT module 70 may be any of the CDT modules in FIG. 6 .
  • the input to the first CDT module 70 is a feature map 71 .
  • the feature map 71 may be the feature map output by the deconvolution network 6041 in FIG. 6 .
  • the feature map 71 may be the feature map output by the deconvolution network 6011 in FIG. 6 . This is not specifically limited.
  • the feature map 71 can be convolved with the convolution kernel 1 and the convolution kernel 2 respectively, and the difference of the operation results is inverted to obtain the feature map 72 (referred to as the third feature map in this embodiment of the application); the feature map 72 is used to represent the contrast-aware features of the feature map 71.
  • the feature map 72 is multiplied by the corresponding pixels of the feature map 71 to obtain a feature map 73 , which is a low-frequency feature map of the feature map 71 .
  • for the process of obtaining the feature map 72 according to the feature map 71 and further obtaining the feature map 73, reference may be made to the process of obtaining the first feature map 42 according to the image to be processed 41 and further obtaining the first low-frequency feature map 43 in FIG. 4, which is not repeated here.
  • the first CDT module 70 takes the feature map 73 and the feature map 71 as a 2-channel feature map 74 as a whole, and determines the global feature v of the 2-channel feature map 74 .
  • the global feature v can be the average of all pixels in the 2-channel feature map 74 .
  • the first CDT module 70 performs a dot product between the global feature v and the 2-channel feature map 74 to obtain the 2-channel feature map 75, which is the feature map output by the CDT module 70. It can be understood that the global feature in the CDT module can globally adjust the 2-channel feature map 74, so that the output 2-channel feature map 75 is closer to the real image.
  • the output 2-channel feature map 75 is the feature map 62 shown in FIG. 6 .
  • the CDT module includes the process of obtaining the low-frequency feature map of the input feature map. This process can filter out most of the high-frequency information in the deconvolved feature map, so as to further filter out the noise of the image to be processed.
  • the Unet network itself also has the function of filtering out noise.
  • therefore, the CDT module can be combined with the deconvolution network of the Unet network to filter out the high-frequency information introduced by the deconvolution, as shown in the sketch below.
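  • the following PyTorch sketch mirrors the CDT description above; the sigmoid in the contrast-aware map and the final 1x1 convolution that folds the 2-channel stack back to the input width are assumptions added so the module composes with the Unet skeleton above.

```python
import torch
import torch.nn as nn

class CDTModule(nn.Module):
    """Sketch of the first CDT module 70 in FIG. 7 (channel counts illustrative)."""
    def __init__(self, channels: int):
        super().__init__()
        self.kernel1 = nn.Conv2d(channels, channels, 3, padding=2, dilation=2)
        self.kernel2 = nn.Conv2d(channels, channels, 1)
        self.merge = nn.Conv2d(2 * channels, channels, 1)  # assumed fold-back

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # feature map 72: inverted contrast-aware map (sigmoid is an assumption)
        attn = 1.0 - torch.sigmoid(self.kernel1(x) - self.kernel2(x))
        low = attn * x                              # feature map 73 (low-frequency part)
        stacked = torch.cat([low, x], dim=1)        # 2-channel feature map 74
        v = stacked.mean(dim=(2, 3), keepdim=True)  # global feature v (mean of all pixels)
        return self.merge(v * stacked)              # globally adjusted map 75, folded back
```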
  • the image enhancement device enhances the brightness and/or contrast of the above-mentioned basic image by using a constant ⁇ to obtain a basic image with enhanced brightness and/or contrast (corresponding to the fourth image in the embodiment of the present application).
  • the constant ⁇ may be a preset constant of the image enhancement apparatus, or may be a constant output by the above-mentioned first reconstruction network when outputting the basic image, which is not limited in this embodiment of the present application.
  • the image enhancement apparatus may directly enhance the brightness and/or contrast of the above-mentioned basic image, that is, the image enhancement apparatus may directly use the constant ⁇ to perform dot product with the above-mentioned basic image, so as to obtain the fourth image.
  • the image enhancement apparatus first uses a preset neural network to learn the above-mentioned basic image to obtain a fourth feature map.
  • the fourth feature map represents color and/or contrast information of the base image.
  • the embodiments of the present application do not specifically limit the structure and parameters of the preset neural network; it is only required that the size of the fourth feature map output by the preset neural network be the same as the size of the image to be processed.
  • the image enhancement apparatus performs sigmoid processing on the pixel values in the fourth feature map to obtain a fifth feature map.
  • the pixel value of each pixel in the fourth feature map is usually any integer between 0 and 255.
  • that is, the image enhancement apparatus normalizes the pixel values in the fourth feature map, usually by proportionally transforming the pixel values from the range 0-255 into values between 0 and 1. That is to say, the pixel values in the fifth feature map are values between 0 and 1.
  • for example, if the value of pixel 1 in the fourth feature map is 100, after normalization the value of this pixel is 0.392 (i.e., 100/255).
  • if the value of pixel 2 in the fourth feature map is 200, after normalization the value of this pixel is 0.784 (i.e., 200/255).
  • the image enhancement device may perform brightness and/or contrast enhancement on the fifth feature map, that is, the image enhancement device may use a constant ⁇ to perform dot product with the fifth feature map, thereby obtaining a fourth image.
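  • a minimal NumPy sketch of this optional enhancement path, assuming the fourth feature map holds values in 0-255 and an illustrative γ:

```python
import numpy as np

def enhance_base_image(third_image: np.ndarray,
                       fourth_feature_map: np.ndarray,
                       gamma: float = 1.5) -> np.ndarray:
    """Hedged sketch: normalization, gamma dot product, pixel-wise product."""
    fifth = fourth_feature_map / 255.0   # normalize 0-255 -> 0-1 (e.g. 100 -> 0.392)
    fourth_image = gamma * fifth         # dot product with the constant gamma
    return third_image * fourth_image    # first image (pixel-wise product)
```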
  • the image enhancement device multiplies the pixel values of the corresponding pixels of the brightness- and/or contrast-enhanced basic image (that is, the fourth image) and the basic image (that is, the third image) to obtain the first image.
  • here, multiplying the pixel values of the corresponding pixels of the fourth image and the third image refers to multiplying the pixel values of the pixels located at the same position in the fourth image and the third image.
  • the image enhancement apparatus may multiply the pixel value of the sixth pixel in the third image and the pixel value of the seventh pixel corresponding to the sixth pixel in the fourth image, thereby obtaining the first image.
  • the sixth pixel and the seventh pixel are pixels located at the same position in the third image and the fourth image.
  • the image enhancement apparatus may also directly use the basic image (ie, the third image) obtained in step S1031 as the first image.
  • steps S1032 and S1033 need not be performed in this embodiment of the present application.
  • the image enhancement device obtains the high-frequency feature map of the image to be processed through the second neural network.
  • the second neural network may be an ACE network, the second neural network and the first neural network have the same network structure, and the second neural network may share the network parameters of the first neural network.
  • the second neural network can invert the pixel value of each pixel in the first feature map obtained by the first neural network, that is, to obtain a second feature map for determining the high-frequency feature map of the image to be processed. In this way, the second neural network can determine the second feature map without a large number of convolution operations, thereby saving computing power.
  • for the description of the second neural network inverting the pixel value of each pixel in the first feature map, reference may be made to the description of the ACE network 40 inverting the contrast-aware feature map C_a in S102 above, which is not repeated here.
  • the second neural network may multiply the second feature map and the above-mentioned pixel value of the corresponding pixel of the first image, thereby obtaining the first high-frequency feature map.
  • multiplying the second feature map with the pixel value of the corresponding pixel in the first image refers to multiplying the second feature map with the pixel value of the pixel located at the same position in the first image.
  • the image enhancement apparatus may multiply the pixel value of the third pixel in the second feature map with the pixel value of the fourth pixel corresponding to the third pixel in the first image, thereby obtaining the first high-frequency feature map.
  • the third pixel and the fourth pixel are the pixels located in the same position in the second feature map and the first image.
  • the second neural network may multiply the second feature map and the pixel value of the corresponding pixel of the image to be processed, so as to obtain the first high-frequency feature map.
  • multiplying the second feature map by the pixel value of the corresponding pixel of the image to be processed refers to multiplying the second feature map by the pixel value of the pixel located at the same position in the image to be processed.
  • the description of the second feature map and the pixel located at the same position in the image to be processed may refer to the description of the pixel located at the same position in the feature map 1 and the feature map 2 in the above S102, and will not be repeated here.
  • the image enhancement device may multiply the pixel value of the third pixel in the second feature map with the pixel value of the fifth pixel corresponding to the third pixel in the image to be processed, thereby obtaining the first high-frequency feature map.
  • the third pixel and the fifth pixel are the pixels located in the same position in the second feature map and the image to be processed.
  • the second neural network determines the second high-frequency feature map according to the first high-frequency feature map, and determines the target high-frequency feature map according to the first high-frequency feature map and the second high-frequency feature map; for this process, reference may be made to the description of obtaining the target low-frequency feature map 48 in steps 3 and 4 of S102 above, which is not repeated here.
  • the image enhancement device determines the second image based on the above-mentioned high-frequency feature map.
  • the second image includes detailed information of the reconstructed image to be processed, and the detailed information includes at least one of edges or textures of the image to be processed, which is not specifically limited.
  • the image enhancement apparatus may use the second reconstruction network (corresponding to the fourth neural network in the embodiment of the present application) to reconstruct the detailed information of the image to be processed, so as to obtain the reconstructed detail image of the image to be processed, that is, the above-mentioned second image.
  • the second reconstruction network may be a Unet network combined with the first CDT module.
  • the network structure of the second reconstruction network is the same as that of the first reconstruction network, that is, the network structures of the fourth neural network and the third neural network are the same.
  • the deconvolution network in the second reconstruction network is applied in combination with the second CDT module.
  • the second CDT module may share the network parameters of the first CDT module.
  • the second CDT module can invert the third feature map obtained by the first CDT module, that is, obtain a sixth feature map for determining the high-frequency feature map of the input feature map of the first CDT module.
  • the second CDT module can determine the sixth feature map without a large number of convolution operations, thereby saving the calculation amount of the system.
  • the network layer where the second CDT module is located is the same as the network layer where the first CDT module is located.
  • the third feature map of the first CDT module at the first network layer may be shared by the second CDT module at the first network layer in the second reconstruction network.
  • in addition, for how the second CDT module determines the global feature v and uses the global feature v to globally adjust the image reconstructed by deconvolution, reference may be made to the description above, which will not be repeated here.
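  • to make the data flow of a CDT-style block concrete, the following is a minimal PyTorch sketch under our own assumptions (the sigmoid bounding, the layer widths, and the 3×3-dilated/1×1 kernel pair are illustrative choices, not the patented implementation):

```python
import torch
import torch.nn as nn

class CDTBlockSketch(nn.Module):
    """Illustrative CDT-style block: compute a contrast-aware map from two
    convolutions with different receptive fields, invert it, weight the
    input with it, concatenate the result with the input, and rescale the
    concatenation by its global mean (the "global feature v")."""

    def __init__(self, channels: int):
        super().__init__()
        # receptive field 5 (3x3 kernel, dilation 2) vs. receptive field 1
        self.conv_wide = nn.Conv2d(channels, channels, 3, padding=2, dilation=2)
        self.conv_point = nn.Conv2d(channels, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # sigmoid keeps the difference in [0, 1]; an assumption on our part
        contrast = torch.sigmoid(self.conv_wide(x) - self.conv_point(x))
        low_freq_attention = 1.0 - contrast       # the inversion in the text
        low_freq = low_freq_attention * x         # low-frequency features
        both = torch.cat([low_freq, x], dim=1)    # 2-channel-group feature map
        v = both.mean()                           # global feature v
        return v * both                           # global adjustment

out = CDTBlockSketch(channels=8)(torch.rand(1, 8, 32, 32))
print(out.shape)  # torch.Size([1, 16, 32, 32])
```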
  • the image enhancement device may further perform a convolution operation on the detail image output by the second reconstruction network, so as to obtain a detail image after further convolution processing.
  • the image enhancement device takes the detail image after the further convolution operation as the second image.
  • the above-mentioned further convolution can be implemented through a neural network; the embodiments of the present application do not specifically limit the network structure or the network parameters of this neural network, as long as the size of the feature map output by the neural network is the same as the size of the image to be processed.
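  • since the only stated constraint is that the output feature map match the size of the image to be processed, a single padded convolution is one minimal (assumed) realization of this further convolution operation:

```python
import torch
import torch.nn as nn

# a padded 3x3 convolution preserves spatial size, satisfying the
# constraint above; channel count and kernel size are our assumptions
refine = nn.Conv2d(in_channels=3, out_channels=3, kernel_size=3, padding=1)
detail = torch.rand(1, 3, 384, 512)  # a detail image of assumed size
assert refine(detail).shape == detail.shape
```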
  • the image enhancement device fuses the first image and the second image to obtain an enhanced image of the to-be-processed image.
  • the image enhancement device may add the pixel values of the corresponding pixels of the first image and the second image to obtain an enhanced image of the to-be-processed image.
  • the sum of pixel values of corresponding pixels of the first image and the second image refers to the sum of pixel values of pixels located at the same position in the first image and the second image.
  • the image enhancement device may add the pixel value of the eighth pixel in the first image to the pixel value of the ninth pixel corresponding to the eighth pixel in the second image, to obtain the enhanced image of the image to be processed.
  • the eighth pixel and the ninth pixel are pixels located at the same position in the first image and the second image.
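  • a minimal sketch of this fusion step, assuming the images are same-shaped arrays with values in [0, 1] (the clipping is our addition, for display purposes only):

```python
import numpy as np

def fuse(first_image: np.ndarray, second_image: np.ndarray) -> np.ndarray:
    """S106 sketch: add the pixel values of pixels at the same positions
    (the eighth and ninth pixels in the text) in the base and detail
    reconstructions."""
    assert first_image.shape == second_image.shape
    return np.clip(first_image + second_image, 0.0, 1.0)
```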
  • in this way, the image enhancement method denoises and reconstructs the basic image by acquiring the low-frequency feature map of the image to be processed, reconstructs the detail image by using the acquired high-frequency feature map of the image to be processed, and then fuses the basic image and the detail image to obtain the enhanced image of the image to be processed.
  • through this method, the brightness and/or contrast of the image can be enhanced, and the noise in the image to be processed can also be effectively filtered out.
  • in addition, in the image enhancement method provided by the embodiments of the present application, the second neural network may share the first feature map of the first neural network, and the second CDT module may share the third feature map of the first CDT module.
  • in this way, a large number of convolution operations are avoided in the process of enhancing the image to be processed, which saves computing power.
  • in practical applications, the image enhancement method provided by the embodiments of the present application can be implemented by an image enhancement device directly executing the above steps S101-S106, or by presetting, in the image enhancement device, a pre-trained image enhancement model capable of implementing the above method.
  • FIG. 8 shows a schematic structural diagram of an image enhancement model.
  • the image enhancement model 80 includes a first neural network module 81 , a first reconstruction module 82 , a second neural network module 83 , a second reconstruction module 84 and a fusion module 85 .
  • the first neural network module 81 and the first reconstruction module 82 can serve as the first-stage neural network modules of the image enhancement model 80
  • the second neural network module 83 and the second reconstruction module 84 can serve as the second-stage neural network modules of the image enhancement model 80
  • the first neural network module 81 may include the above-mentioned first neural network, and is used to realize the function of acquiring the low-frequency feature map of the image to be processed in the above step S102.
  • the first reconstruction module 82 may include a first reconstruction network sub-module 821 .
  • the first reconstruction network sub-module 821 may include the first reconstruction network described above, and is configured to implement the function of reconstructing the basic image (i.e., the third image) of the image to be processed in step S1031 above.
  • the first reconstruction module 82 may further include an enhancement sub-module 822 .
  • the enhancement sub-module 822 may be used to implement the function of obtaining a fourth image after performing brightness and/or contrast enhancement on the base image in step S1032 above.
  • the enhancement sub-module 822 may include the preset neural network described in step S1032 above, and the preset neural network is used to obtain the fourth feature map described above, and the fourth feature map can be used to obtain the fourth image.
  • the first reconstruction module 82 is further configured to implement the function, in step S1033 above, of multiplying the pixel values of corresponding pixels of the brightness- and/or contrast-enhanced base image and the base image, to obtain the first image.
  • the second neural network module 83 may include the second neural network described above, and is configured to implement the function of acquiring the high-frequency feature map of the image to be processed in the above step S104.
  • the second reconstruction module 84 may include the second reconstruction network described above, and is configured to implement the function of determining the second image based on the high-frequency feature map in step S105 above.
  • the fusion module 85 can be used to implement the function, in step S106 above, of fusing the first image output by the first reconstruction module 82 and the second image output by the second reconstruction module 84, to obtain the enhanced image of the image to be processed.
  • for the functions implemented by the first neural network, the second neural network, the first reconstruction network, the second reconstruction network, and the modules in the image enhancement model 80, as well as their beneficial effects, reference may be made to the descriptions of S101-S106 above; they will not be repeated here.
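  • purely as an orientation aid, the wiring of modules 81-85 in FIG. 8 can be sketched as follows; every submodule here is a stand-in (nn.Identity) for the networks described above, so this shows structure only, not the actual model:

```python
import torch
import torch.nn as nn

class ImageEnhancementModelSketch(nn.Module):
    """Hypothetical wiring of the two-stage model of FIG. 8."""

    def __init__(self):
        super().__init__()
        self.first_nn = nn.Identity()               # module 81: low-freq features
        self.first_reconstruction = nn.Identity()   # module 82: first image
        self.second_nn = nn.Identity()              # module 83: high-freq features
        self.second_reconstruction = nn.Identity()  # module 84: second image

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        base = self.first_reconstruction(self.first_nn(x))      # stage 1
        detail = self.second_reconstruction(self.second_nn(x))  # stage 2
        return base + detail                                    # module 85: fusion

print(ImageEnhancementModelSketch()(torch.rand(1, 3, 384, 512)).shape)
```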
  • the above image enhancement model can be pre-trained by an image enhancement device (e.g., the server shown in FIG. 2), or pre-trained by any other device capable of training a neural network model.
  • FIG. 9 shows a schematic flowchart of a method for training an image enhancement model by an image enhancement device, and the method may include the following steps:
  • the image enhancement apparatus acquires at least one training sample.
  • any training sample in the at least one training sample includes a training image pair, and the training image pair includes a training image and a training target image.
  • the training image can be used as the image to be enhanced, and the training target image can be used as the enhanced target image of the training image.
  • the training image and the training target image may be images in standard RGB format.
  • the sizes of the training image and the training target image in a training image pair are usually the same.
  • the embodiments of the present application do not specifically limit the size of the training image and the training target image in a training image pair.
  • if the size of the training image and the training target image is small (for example, 512×384), the computing power required by the image enhancement apparatus to train the image enhancement model can be reduced.
  • the image content in the training image is the same as or similar to the image content in the training target image.
  • the training image pair A includes a training image A and a training target image A
  • the training image A may be an image of scene A captured under low-light conditions, and the training target image A may be an image of scene A captured under natural daylight conditions.
  • the training image A and the training target image A are images of the scene A captured at the same or similar shooting angles.
  • to increase the number of training samples, the images in the existing training samples (including the training images and the training target images) can be randomly flipped horizontally or vertically.
  • for example, after the images in training sample A are flipped horizontally, a training sample B can be added, as sketched below.
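  • a sketch of such a paired dataset with identical random flips applied to both images of a pair; the file layout, the 512×384 size, and the PIL/torchvision helpers are our assumptions:

```python
import random
from PIL import Image
from torch.utils.data import Dataset
import torchvision.transforms.functional as TF

class PairedLowLightDataset(Dataset):
    """Each sample is (training image, training target image) of the same
    scene and size; the same random horizontal/vertical flip is applied
    to both, implementing the augmentation described above."""

    def __init__(self, pairs):
        self.pairs = pairs  # list of (low_light_path, target_path) tuples

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, idx):
        low_path, target_path = self.pairs[idx]
        low = Image.open(low_path).convert("RGB").resize((512, 384))
        target = Image.open(target_path).convert("RGB").resize((512, 384))
        if random.random() < 0.5:  # identical flip for both images of a pair
            low, target = TF.hflip(low), TF.hflip(target)
        if random.random() < 0.5:
            low, target = TF.vflip(low), TF.vflip(target)
        return TF.to_tensor(low), TF.to_tensor(target)
```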
  • the image enhancement device obtains an image enhancement model by training according to the at least one training sample.
  • the image enhancement device may perform iterative training on the neural network according to the at least one training sample described above, so as to obtain the above-mentioned image enhancement model.
  • specifically, if the at least one training sample includes m training samples (m being an integer greater than or equal to 1), the process in which the image enhancement device iteratively trains the neural network with the m training samples to obtain the image enhancement model may include the following steps:
  • Step 1: the image enhancement device inputs the training image 1 in training sample 1, among the m training samples, into the initial image enhancement model.
  • the structure of the initial image enhancement model is shown in FIG. 8 and includes multiple neural networks, which will not be repeated here.
  • the initial image enhancement model can output the target image 1 through the methods described in S101-S106 above.
  • the image enhancement device may calculate the loss function 1 according to the target image 1 and the training target image 1 in training sample 1. It can be seen that the training image 1 and the training target image 1 belong to the same training image pair.
  • the image enhancement device may also calculate the loss function 2 according to the target image 1 and the third image obtained in the process of enhancing the training image 1 by the initial image enhancement model.
  • the description of the third image obtained in the process of enhancing the training image 1 by the initial image enhancement model may refer to the description of obtaining the third image above, which will not be repeated here.
  • the image enhancement device feeds back the loss function 1 and the loss function 2 to the initial image enhancement model respectively to adjust the parameters of the neural network in the initial image enhancement model, thereby obtaining the image enhancement model 2 with adjusted neural network parameters.
  • alternatively, the image enhancement device fuses the loss function 1 and the loss function 2 into a loss function 0, and then feeds the loss function 0 back to the initial image enhancement model to adjust the parameters of the neural networks in the initial image enhancement model, thereby obtaining the image enhancement model 2 with adjusted neural network parameters. A sketch of one such training step follows.
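  • one training step might then look like the following sketch; the L1 losses, the simple sum used as loss function 0, and a model that returns both the final output and the intermediate third image are all assumptions (the patent fixes none of them):

```python
import torch.nn.functional as F

def training_step(model, optimizer, train_img, target_img):
    """Step 1 sketch: loss 1 compares the model output with the training
    target image; loss 2 compares the output with the third (base) image
    produced inside the model; their sum plays the role of loss 0."""
    optimizer.zero_grad()
    output, third_image = model(train_img)  # assumed model interface
    loss_1 = F.l1_loss(output, target_img)
    loss_2 = F.l1_loss(output, third_image)
    loss_0 = loss_1 + loss_2                # fused loss function 0
    loss_0.backward()                       # feed back to adjust parameters
    optimizer.step()
    return loss_0.item()
```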
  • Step 2: the image enhancement device inputs the training image 2 in training sample 2 into the image enhancement model 2 and, with reference to step 1 above, obtains the image enhancement model 3.
  • in this way, through multiple rounds of iterative training, when the number of iterations reaches a preset threshold, or the loss function calculated by the image enhancement device is less than or equal to a preset threshold, the current image enhancement model is output as the target image enhancement model.
  • that is, the image enhancement device has trained the image enhancement model described in FIG. 8.
  • after the image enhancement device has trained the target image enhancement model, the model can be released as a dedicated image enhancement App, so that after a user installs/updates the image enhancement App, images can be enhanced through the App. Alternatively, after the image enhancement device has trained the target image enhancement model, the model can be applied to an image processing App as the image enhancement function module of that App. In this way, when the user installs/updates the image processing App that includes the image enhancement model, the image enhancement model in the App can be used to enhance images.
  • as an example, if the mobile phone 10 shown in FIG. 1 has an image processing App including the image enhancement model installed, the user can open the image processing App by tapping the "image processing" App icon on the touch screen of the mobile phone 10. Then, the user can load the image to be processed on the display interface of the image processing App and, in the picture editing state, tap the "image enhancement" button in the toolbar to enhance the image to be processed. At this point, the enhanced image to be processed can be displayed on the display interface of the image processing App.
  • in summary, the image enhancement method provided by this application can obtain an image enhancement model through pre-training, use the acquired low-frequency feature map of the image to be processed to denoise and reconstruct the base image, and use the acquired high-frequency feature map of the image to be processed to reconstruct the detail image.
  • the base image and the detail image are then fused to obtain the enhanced image of the image to be processed.
  • the noise in the to-be-processed image can be effectively filtered out while the brightness and/or contrast of the image can be enhanced.
  • in addition, in the above image enhancement model, the second neural network can share the first feature map of the first neural network, and the second CDT module can share the third feature map of the first CDT module; when the model enhances an image, this avoids a large number of convolution operations and thus saves computing power.
  • the image enhancement apparatus may be divided into functional modules according to the above method examples.
  • each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module.
  • the above-mentioned integrated modules can be implemented in the form of hardware, and can also be implemented in the form of software function modules. It should be noted that, the division of modules in the embodiments of the present application is schematic, and is only a logical function division, and there may be other division manners in actual implementation.
  • FIG. 10 shows a schematic structural diagram of an image enhancement apparatus 100 provided by an embodiment of the present application.
  • the image enhancement apparatus 100 can be used to perform the above-mentioned image enhancement method, for example, to perform the method shown in FIG. 3 .
  • the image enhancement apparatus 100 may include an acquisition unit 101 , a determination unit 102 and a fusion unit 103 .
  • the obtaining unit 101 is configured to obtain the low-frequency feature map of the image to be processed through the first neural network.
  • the determining unit 102 is configured to determine the first image according to the low-frequency characteristic map acquired by the acquiring unit 101 .
  • the first image includes basic information of the reconstructed image to be processed, and the basic information includes contour information of the image to be processed.
  • the obtaining unit 101 is further configured to obtain the high-frequency feature map of the image to be processed through the second neural network.
  • the determining unit 102 is further configured to determine the second image according to the high-frequency feature map acquired by the acquiring unit 101 .
  • the second image includes detail information of the reconstructed image to be processed, where the detail information includes at least one of edges or textures of the image to be processed.
  • the fusion unit 103 is configured to fuse the first image and the second image to obtain an enhanced image of the to-be-processed image.
  • the acquiring unit 101 may be configured to execute S102 and S104
  • the determining unit 102 may be configured to execute S103 and S105
  • the fusion unit 103 may be configured to execute S106 .
  • the obtaining unit 101 is specifically configured to: use the first neural network to obtain the first feature map of the image to be processed; and use the first neural network to multiply the pixel value of the first pixel in the first feature map by the pixel value of the second pixel corresponding to the first pixel in the image to be processed, to obtain the low-frequency feature map of the image to be processed.
  • the obtaining unit 101 is further specifically configured to: use the second neural network to invert the pixel value of each pixel in the first feature map to obtain a second feature map; and use the second neural network to multiply the pixel value of the third pixel in the second feature map by the pixel value of the fourth pixel corresponding to the third pixel in the first image, or multiply the pixel value of the third pixel by the pixel value of the fifth pixel corresponding to the third pixel in the image to be processed, to obtain the high-frequency feature map of the image to be processed.
  • the second neural network and the first neural network have the same network structure.
  • the obtaining unit 101 may be configured to perform S102 and S104.
  • the above-mentioned "inversion" means taking the difference between 1 and the pixel value of each pixel in the first feature map.
  • the determining unit 102 is specifically configured to: use the third neural network to reconstruct the basic information of the image to be processed according to the low-frequency feature map obtained by the obtaining unit 101, to obtain a third image; enhance the color and/or contrast of the third image by a constant α, to obtain a fourth image; and multiply the pixel value of the sixth pixel in the third image by the pixel value of the seventh pixel corresponding to the sixth pixel in the fourth image, to obtain the first image.
  • the determining unit 102 may be configured to perform S1031-S1033.
  • the above-mentioned constant ⁇ is a preset constant, or, the above-mentioned constant ⁇ is obtained through a third neural network.
  • the determining unit 102 is further specifically configured to use the fourth neural network to reconstruct the detail information of the image to be processed according to the high-frequency feature map obtained by the obtaining unit 101, to obtain the second image; wherein the fourth neural network and the above-mentioned third neural network have the same network structure.
  • the feature map used to obtain the high-frequency feature map in the fourth neural network is obtained by inverting the pixel value of each pixel in the feature map used to obtain the low-frequency feature map in the third neural network.
  • the determining unit 102 may be configured to perform S105.
  • the fusion unit 103 is specifically configured to add the pixel value of the eighth pixel in the first image to the pixel value of the ninth pixel corresponding to the eighth pixel in the second image, to obtain the enhanced image of the image to be processed.
  • the fusion unit 103 may be configured to perform S106.
  • the acquisition unit 101, the determination unit 102, and the fusion unit 103 in the image enhancement apparatus 100 may be implemented by the processor 110 in FIG. 1 executing the program code in the internal memory 121 in FIG. 1.
  • an embodiment of the present application further provides a chip system 110; as shown in FIG. 11, the chip system 110 includes at least one processor and at least one interface circuit.
  • as an example, when the chip system 110 includes one processor and one interface circuit, the processor may be the processor 111 shown in the solid-line box in FIG. 11 (or the processor 111 shown in the dashed-line box), and the interface circuit may be the interface circuit 112 shown in the solid-line box in FIG. 11 (or the interface circuit 112 shown in the dashed-line box).
  • when the chip system 110 includes two processors and two interface circuits, the two processors include the processor 111 shown in the solid-line box and the processor 111 shown in the dashed-line box in FIG. 11, and the two interface circuits include the interface circuit 112 shown in the solid-line box and the interface circuit 112 shown in the dashed-line box in FIG. 11. This is not limited here.
  • the processor 111 and the interface circuit 112 may be interconnected by wires.
  • the interface circuit 112 may be used to receive signals (eg, to acquire images to be processed, etc.).
  • the interface circuit 112 may be used to send signals to other devices (eg, the processor 111).
  • the interface circuit 112 may read the instructions stored in the memory and send the instructions to the processor 111 .
  • when the instructions are executed by the processor 111, the image enhancement apparatus can be caused to perform each step in the above-mentioned embodiments.
  • the chip system 110 may also include other discrete devices, which are not specifically limited in this embodiment of the present application.
  • Another embodiment of the present application further provides a computer-readable storage medium in which instructions are stored; when the instructions are run on an image enhancement apparatus, the image enhancement apparatus performs each step that is performed by the image enhancement apparatus in the method flows shown in the foregoing method embodiments.
  • the disclosed methods may be implemented as computer program instructions encoded in a machine-readable format on a computer-readable storage medium or on other non-transitory media or articles of manufacture.
  • FIG. 12 schematically shows a conceptual partial view of a computer program product provided by an embodiment of the present application, where the computer program product includes a computer program for executing a computer process on a computing device.
  • in one embodiment, the computer program product is provided using the signal bearing medium 120.
  • the signal bearing medium 120 may include one or more program instructions, which, when executed by one or more processors, may provide the functions or portions of the functions described above with respect to FIG. 3 or FIG. 9 .
  • thus, for example, one or more features of S101-S106 in FIG. 3 may be undertaken by one or more instructions associated with the signal bearing medium 120.
  • in addition, the program instructions in FIG. 12 also describe example instructions.
  • in some examples, the signal bearing medium 120 may include a computer-readable medium 121, such as, but not limited to, a hard disk drive, a compact disc (CD), a digital video disc (DVD), a digital tape, memory, read-only memory (ROM), or random access memory (RAM), etc.
  • in some implementations, the signal bearing medium 120 may include a computer-recordable medium 122, such as, but not limited to, memory, read/write (R/W) CDs, R/W DVDs, and the like.
  • in some implementations, the signal bearing medium 120 may include a communication medium 123, such as, but not limited to, digital and/or analog communication media (e.g., fiber optic cables, waveguides, wired communication links, wireless communication links, etc.).
  • the signal bearing medium 120 may be conveyed by a wireless form of the communication medium 123 (e.g., a wireless communication medium conforming to the IEEE 802.11 standard or another transmission protocol).
  • the one or more program instructions may be, for example, computer-executable instructions or logic-implemented instructions.
  • in some examples, an image enhancement apparatus such as that described with respect to FIG. 3 or FIG. 9 may be configured to provide various operations, functions, or actions in response to program instructions conveyed by one or more of the computer-readable medium 121, the computer-recordable medium 122, and/or the communication medium 123.
  • the above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • when implemented using a software program, they may be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions.
  • when the computer instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present application are generated in whole or in part.
  • the computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
  • Computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means.
  • A computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media.
  • The available media may be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., DVDs), or semiconductor media (e.g., solid state drives (SSDs)), and the like.

Abstract

An image enhancement method and apparatus, relating to the technical field of image processing; the method can simultaneously enhance the brightness and/or contrast of an image and denoise the image. The method comprises: obtaining a low-frequency feature map of an image to be processed through a first neural network, and determining a first image according to the low-frequency feature map, wherein the first image comprises reconstructed basic information of the image to be processed, and the basic information comprises contour information of the image to be processed; then obtaining a high-frequency feature map of the image to be processed through a second neural network, and determining a second image according to the high-frequency feature map, wherein the second image comprises reconstructed detail information of the image to be processed, and the detail information comprises at least one of edges or textures of the image to be processed; and then fusing the first image and the second image to obtain an enhanced image of the image to be processed.


Claims (16)

  1. An image enhancement method, wherein the method comprises:
    obtaining a low-frequency feature map of an image to be processed through a first neural network;
    determining a first image according to the low-frequency feature map, wherein the first image comprises reconstructed basic information of the image to be processed, and the basic information comprises contour information of the image to be processed;
    obtaining a high-frequency feature map of the image to be processed through a second neural network;
    determining a second image according to the high-frequency feature map, wherein the second image comprises reconstructed detail information of the image to be processed, and the detail information comprises at least one of edges or textures of the image to be processed; and
    fusing the first image and the second image to obtain an enhanced image.
  2. The method according to claim 1, wherein
    the obtaining a low-frequency feature map of an image to be processed through a first neural network comprises:
    using the first neural network to obtain a first feature map of the image to be processed; and using the first neural network to multiply a pixel value of a first pixel in the first feature map by a pixel value of a second pixel, corresponding to the first pixel, in the image to be processed, to obtain the low-frequency feature map of the image to be processed;
    the obtaining a high-frequency feature map of the image to be processed through a second neural network comprises:
    using the second neural network to invert a pixel value of each pixel in the first feature map to obtain a second feature map; and using the second neural network to multiply a pixel value of a third pixel in the second feature map by a pixel value of a fourth pixel, corresponding to the third pixel, in the first image, or using the second neural network to multiply the pixel value of the third pixel by a pixel value of a fifth pixel, corresponding to the third pixel, in the image to be processed, to obtain the high-frequency feature map of the image to be processed;
    wherein the second neural network and the first neural network have the same network structure.
  3. The method according to claim 2, wherein the inversion means taking a difference between 1 and the pixel value of each pixel in the first feature map.
  4. The method according to any one of claims 1-3, wherein the determining a first image according to the low-frequency feature map comprises:
    using a third neural network to reconstruct the basic information of the image to be processed according to the low-frequency feature map, to obtain a third image;
    enhancing at least one of a color or a contrast of the third image by a constant α, to obtain a fourth image; and
    multiplying a pixel value of a sixth pixel in the third image by a pixel value of a seventh pixel, corresponding to the sixth pixel, in the fourth image, to obtain the first image.
  5. The method according to claim 4, wherein the constant α is a preset constant, or the constant α is obtained through the third neural network.
  6. The method according to claim 4 or 5, wherein the determining a second image according to the high-frequency feature map comprises:
    using a fourth neural network to reconstruct the detail information of the image to be processed according to the high-frequency feature map, to obtain the second image;
    wherein the fourth neural network and the third neural network have the same network structure; and a feature map used to obtain the high-frequency feature map in the fourth neural network is obtained by inverting a pixel value of each pixel in a feature map used to obtain the low-frequency feature map in the third neural network.
  7. The method according to any one of claims 1-6, wherein the fusing the first image and the second image to obtain an enhanced image comprises:
    adding a pixel value of an eighth pixel in the first image to a pixel value of a ninth pixel, corresponding to the eighth pixel, in the second image, to obtain the enhanced image.
  8. An image enhancement apparatus, wherein the apparatus comprises:
    an obtaining unit, configured to obtain a low-frequency feature map of an image to be processed through a first neural network;
    a determining unit, configured to determine a first image according to the low-frequency feature map, wherein the first image comprises reconstructed basic information of the image to be processed, and the basic information comprises contour information of the image to be processed;
    the obtaining unit being further configured to obtain a high-frequency feature map of the image to be processed through a second neural network;
    the determining unit being further configured to determine a second image according to the high-frequency feature map, wherein the second image comprises reconstructed detail information of the image to be processed, and the detail information comprises at least one of edges or textures of the image to be processed; and
    a fusion unit, configured to fuse the first image and the second image to obtain an enhanced image.
  9. The image enhancement apparatus according to claim 8, wherein the obtaining unit is specifically configured to:
    use the first neural network to obtain a first feature map of the image to be processed; and use the first neural network to multiply a pixel value of a first pixel in the first feature map by a pixel value of a second pixel, corresponding to the first pixel, in the image to be processed, to obtain the low-frequency feature map of the image to be processed; and
    use the second neural network to invert a pixel value of each pixel in the first feature map to obtain a second feature map; and use the second neural network to multiply a pixel value of a third pixel in the second feature map by a pixel value of a fourth pixel, corresponding to the third pixel, in the first image, or use the second neural network to multiply the pixel value of the third pixel by a pixel value of a fifth pixel, corresponding to the third pixel, in the image to be processed, to obtain the high-frequency feature map of the image to be processed;
    wherein the second neural network and the first neural network have the same network structure.
  10. The image enhancement apparatus according to claim 9, wherein the inversion means taking a difference between 1 and the pixel value of each pixel in the first feature map.
  11. The image enhancement apparatus according to any one of claims 8-10, wherein
    the determining unit is specifically configured to: use a third neural network to reconstruct the basic information of the image to be processed according to the low-frequency feature map, to obtain a third image; enhance the color and/or contrast of the third image by a constant α, to obtain a fourth image; and multiply a pixel value of a sixth pixel in the third image by a pixel value of a seventh pixel, corresponding to the sixth pixel, in the fourth image, to obtain the first image.
  12. The image enhancement apparatus according to claim 11, wherein the constant α is a preset constant, or the constant α is obtained through the third neural network.
  13. The image enhancement apparatus according to claim 11 or 12, wherein
    the determining unit is further specifically configured to: use a fourth neural network to reconstruct the detail information of the image to be processed according to the high-frequency feature map, to obtain the second image;
    wherein the fourth neural network and the third neural network have the same network structure; and a feature map used to obtain the high-frequency feature map in the fourth neural network is obtained by inverting a pixel value of each pixel in a feature map used to obtain the low-frequency feature map in the third neural network.
  14. The image enhancement apparatus according to any one of claims 8-13, wherein
    the fusion unit is specifically configured to add a pixel value of an eighth pixel in the first image to a pixel value of a ninth pixel, corresponding to the eighth pixel, in the second image, to obtain the enhanced image.
  15. An image enhancement apparatus, wherein the apparatus comprises a memory and one or more processors, the memory being configured to store computer instructions, and the processor being configured to invoke the computer instructions to perform the image enhancement method according to any one of claims 1-7.
  16. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and when the computer program is run on a computer, the computer is caused to perform the image enhancement method according to any one of claims 1-7.