CN116091337B - Image enhancement method and device based on event signal neural coding - Google Patents


Info

Publication number
CN116091337B
Authority
CN
China
Prior art keywords
image
event
neural
resolution
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211515957.0A
Other languages
Chinese (zh)
Other versions
CN116091337A (en)
Inventor
施柏鑫 (Boxin Shi)
滕明桂 (Minggui Teng)
周矗 (Chu Zhou)
楼涵月 (Hanyue Lou)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Original Assignee
Peking University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University
Priority to CN202211515957.0A
Publication of CN116091337A
Application granted
Publication of CN116091337B
Status: Active


Classifications

    • G06T5/73 Deblurring; Sharpening (under G06T5/00 Image enhancement or restoration)
    • G06T3/4046 Scaling of whole images or parts thereof, e.g. expanding or contracting, using neural networks
    • G06T3/4053 Scaling of whole images or parts thereof based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G06T5/92 Dynamic range modification of images or parts thereof based on global image properties
    • G06T9/002 Image coding using neural networks
    • G06T2207/10016 Video; Image sequence (indexing scheme: image acquisition modality)
    • Y02T10/40 Engine management systems (Y02T: climate change mitigation technologies related to transportation)

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses an image enhancement method and device based on neural coding of event signals. Combining the characteristics of a physical imaging model with a data-driven approach, it provides a robust and effective way of encoding the motion and temporal information in an event signal stream, namely the neural event frame, and constructs the transformation between low-quality and high-quality images through neural event frames, effectively overcoming problems such as noise interference in the event signal and realizing deblurring and super-resolution of the image signal. The invention further designs a unified, event-guided high-speed video generation framework that recovers high-speed video from low-speed video.

Description

Image enhancement method and device based on event signal neural coding
Technical Field
The invention relates to the technical field of computer vision, and in particular to an image enhancement method and device based on neural coding of event signals.
Background
With the continuous improvement of artificial intelligence and computing power, deep learning has advanced in every field of computer vision. Current deep learning methods outperform traditional vision methods in tasks such as object classification, tracking, and detection, and have also increased the robustness and practicality of computer vision algorithms. In practical scenarios, however, environmental interference with detection algorithms, such as high dynamic range and high-speed motion, seriously affects image acquisition and thus limits the performance of downstream computer vision tasks. How to improve the quality of image data as much as possible at the input end has therefore become one of the hot research problems.
Image enhancement means enhancing a low-quality image signal to obtain an image signal with characteristics such as high dynamic range and high temporal resolution. As a basic task of computational photography, image enhancement is crucial to the further improvement of future computer vision tasks and is an indispensable link in them.
After years of development, traditional digital cameras have made some progress in the dynamic range and temporal resolution of images; most digital cameras achieve a dynamic range of 40-60 dB and a spatio-temporal resolution of 1080P@120Hz or 4K@60Hz. However, this performance cannot meet the requirements of fields such as autonomous driving, drone control, and industrial intelligence; especially in extreme scenarios such as autonomous driving, where algorithm reliability is put to the test, the shortcomings of the traditional digital camera become even more obvious. The reason is that the imaging model of the traditional camera, which samples frame by frame at fixed times, limits further improvement of the camera's dynamic range and temporal resolution.
Traditional cameras obtain image pixel values by quantifying the number of photons collected within a fixed exposure time; neuromorphic cameras are a new type of camera that departs from this model. Neuromorphic cameras imitate the retinal imaging of the human eye and mainly fall into two types: one follows the dynamic vision imaging model and generates signals only when the scene changes; the other is based on the foveal vision sampling imaging model and records static scenes as well as dynamic ones through a spike-firing mechanism. Compared with traditional cameras, both types of neuromorphic camera greatly improve temporal resolution, which makes them a good aid for image enhancement.
In recent years, the dynamic vision sensor (DVS), the first type of neuromorphic camera, has monitored changes in scene irradiance: if the change exceeds a set threshold, an event signal (recording trigger position, timestamp, and polarity) is generated, so this camera is also called an event camera. Event cameras thus exhibit good characteristics such as high temporal resolution, low latency, and high dynamic range compared with conventional cameras, and have been widely used in computer vision tasks. But the event signal only records changes in irradiance and lacks texture information for static regions, which makes it difficult to restore a grayscale image directly from the event signal. Although recent event cameras (such as DAVIS) can simultaneously sample grayscale images asynchronously, the resulting grayscale images are severely limited by the low resolution of the sensor (typically 346×260 pixels) and by motion blur. Meanwhile, the event camera abandons frame-by-frame imaging and outputs a discrete event signal stream, making it difficult to fit into current deep learning image enhancement frameworks. Finding a suitable event signal coding scheme that retains the high-speed and high-dynamic information while remaining compatible with deep learning image enhancement frameworks has therefore become an important research direction.
Currently, event signal encoding is implemented mainly in two ways: 1) artificially defined coding schemes and 2) data-driven coding schemes.
Approach 1), as in Event Enhanced High-Quality Image Recovery (European Conference on Computer Vision (ECCV) 2020), widely uses two event signal encodings: the voxel grid and the event frame (event stack). The voxel grid processes the event signal stream by bilinear interpolation, accumulating the event signals with linear weights and encoding them into a three-dimensional matrix; the event frame is encoded by directly accumulating the event signals within a fixed time interval or over a fixed number of events. Although encoding into a three-dimensional matrix retains the timing information of the event signal to some extent, neither of these encodings makes full use of the event signal information, and as temporal accuracy increases (the number of channels grows) they become highly sensitive to noise. When the noise in the event signal is strong, the performance of approach 1) degrades significantly, limiting the enhancement of image quality.
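As a concrete illustration (a sketch, not taken from any cited work), the two hand-crafted encodings above can be written as follows, assuming the event stream is given as NumPy arrays of pixel coordinates x, y, timestamps t, and polarities p in {-1, +1}:

```python
import numpy as np

def event_frame(x, y, t, p, H, W, n_bins):
    """Event frame: directly accumulate polarities into fixed time-interval slices."""
    frames = np.zeros((n_bins, H, W), dtype=np.float32)
    b = np.clip(((t - t.min()) / (t.ptp() + 1e-9) * n_bins).astype(int), 0, n_bins - 1)
    np.add.at(frames, (b, y, x), p)
    return frames

def voxel_grid(x, y, t, p, H, W, n_bins):
    """Voxel grid: each event is split between its two neighbouring temporal
    bins with linear (bilinear-in-time) weights and accumulated into a 3D matrix."""
    grid = np.zeros((n_bins, H, W), dtype=np.float32)
    tn = (t - t.min()) / (t.ptp() + 1e-9) * (n_bins - 1)  # normalized bin coordinate
    b0 = np.floor(tn).astype(int)
    b1 = np.clip(b0 + 1, 0, n_bins - 1)
    w1 = tn - b0
    np.add.at(grid, (b0, y, x), p * (1.0 - w1))
    np.add.at(grid, (b1, y, x), p * w1)
    return grid
```

As the channel count n_bins grows, each bin holds fewer events, which is why both encodings become increasingly sensitive to noise.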
Approach 2), as in A Differentiable Recurrent Surface for Asynchronous Event-Based Data (European Conference on Computer Vision (ECCV) 2020), mainly has two representations: the event spike tensor (EST) and the matrix neural representation (Matrix-LSTM). These weight each event signal with fully connected networks or long short-term memory networks and then obtain the event signal encoding by weighted combination. Such methods achieve good results in high-level vision tasks (classification, detection), but they do not consider the physical models of event signals and image enhancement and cannot perform data processing specific to the image enhancement task, so the encoded event signal offers only limited help for image enhancement. Moreover, this kind of encoding discards intermediate information; migrating it directly makes it hard to extract the spatio-temporal information in the event signals effectively to assist image enhancement, so high-speed video cannot be recovered under the existing image enhancement framework.
Disclosure of Invention
Aiming at the defect of the prior art that the imaging model of the event signal is not considered, the invention provides an image enhancement method based on neural coding of event signals.
In order to achieve the above object, the present invention provides the following technical solutions:
In one aspect, the present invention provides an image enhancement method based on neural coding of event signals, comprising the following steps:
S1, event signal neural coding: accumulate the input discrete event signal stream over a period of time, extract features through a convolution layer, and perform forward and backward bidirectional encoding through a bidirectional long short-term memory (LSTM) neural network to obtain neural event frames at different moments;
S2, blurred image deblurring: use the high temporal-resolution information of the event signal retained in the neural event frames and a neural network to deblur the image and obtain a sharp image;
S3, low-resolution image super-resolution: use the high temporal-resolution information of the event signal retained in the neural event frames and a neural network to increase the resolution of the image by converting temporal information into spatial resolution;
S4, high-speed video generation: combine the deblurred and super-resolved images to generate the restored and reconstructed high-speed video.
Further, in step S1, a dense convolution module is used to extract features, and a long short-term memory neural network then encodes the timing of the signals to extract temporal information.
Further, in step S2, the image signal features and the neural event frames are fused by a U-Net neural network, and the network learns to output the residual between the blurred and sharp images, thereby recovering the sharp image from the blurred one.
Further, in step S3, the image signal features and the neural event frames are fused progressively by multi-layer RRDB modules, and the super-resolved image is finally produced by pixel rearrangement.
Further, the loss functions used by the neural networks of steps S2 and S3 each consist of two parts, a mean square error and a perceptual error:

l = α·l_2(I_o, I_gt) + β·l_prec(I_o, I_gt)

where I_o denotes the output image, I_gt denotes the target image, the two parameters α and β are set to 100 and 0.5 respectively, l_2(·) denotes the mean square error, and l_prec(·) is the perceptual error, defined as follows:

l_prec(I_o, I_gt) = l_2(φ_h(I_o), φ_h(I_gt))

where φ_h(·) denotes the layer-h feature map of a VGG19 network pre-trained on ImageNet.
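A minimal PyTorch sketch of this loss follows; the cut point into vgg19.features and the handling of grayscale inputs are assumptions:

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg19

class PerceptualLoss(torch.nn.Module):
    """l_prec: mean square error between VGG19 feature maps of the two images."""
    def __init__(self, layer_idx=16):  # assumed cut point into vgg19.features
        super().__init__()
        feats = vgg19(pretrained=True).features[:layer_idx].eval()
        for prm in feats.parameters():
            prm.requires_grad_(False)
        self.feats = feats

    def forward(self, out_img, gt_img):
        # grayscale inputs are repeated to the 3 channels VGG expects (assumption)
        if out_img.shape[1] == 1:
            out_img, gt_img = out_img.repeat(1, 3, 1, 1), gt_img.repeat(1, 3, 1, 1)
        return F.mse_loss(self.feats(out_img), self.feats(gt_img))

def total_loss(out_img, gt_img, perc, alpha=100.0, beta=0.5):
    """l = alpha * l_2(I_o, I_gt) + beta * l_prec(I_o, I_gt)"""
    return alpha * F.mse_loss(out_img, gt_img) + beta * perc(out_img, gt_img)
```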
Further, the whole network adopts a stage-by-stage training strategy: first, the neural event frame encoder and the deblurring network are trained together; after these two networks have become relatively stable, training continues jointly with the super-resolution network while fine-tuning the parameters of the neural event frame encoder and the deblurring network. The learning rates of the two stages are set to 1×10⁻³ and 1×10⁻⁴, respectively; both stages use the ADAM optimizer.
Further, the whole network uses only grayscale images during training. During testing, an image is first converted from the RGB color space to the YUV color space and the Y channel is separated; deblurring and super-resolution are performed on the Y channel, the UV color channels are directly interpolated to the corresponding resolution, and the final color image is obtained by recombining them with the Y channel.
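A sketch of this test-time colour handling, assuming a hypothetical `deblur_superres` callable (the trained Y-channel pipeline) and OpenCV for colour conversion and interpolation:

```python
import cv2
import numpy as np

def enhance_color(rgb, deblur_superres):
    """Run the grayscale-trained networks on Y only; interpolate U and V."""
    yuv = cv2.cvtColor(rgb, cv2.COLOR_RGB2YUV)
    y, u, v = cv2.split(yuv)
    y_hr = np.clip(deblur_superres(y), 0, 255).astype(np.uint8)  # enhanced Y channel
    h, w = y_hr.shape[:2]
    u_hr = cv2.resize(u, (w, h), interpolation=cv2.INTER_LINEAR)
    v_hr = cv2.resize(v, (w, h), interpolation=cv2.INTER_LINEAR)
    return cv2.cvtColor(cv2.merge([y_hr, u_hr, v_hr]), cv2.COLOR_YUV2RGB)
```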
In another aspect, the invention also provides an image enhancement device based on neural coding of event signals, comprising the following modules to implement any of the methods above:
event signal neural coding module: the method comprises the steps of accumulating input discrete event signal streams in a period of time, extracting features through a convolution layer, and carrying out forward and reverse bidirectional coding through a bidirectional long-short-time memory neural network to obtain neural event frames at different moments;
a blurred image deblurring module: the method comprises the steps of performing deblurring operation on an image by utilizing high-time resolution information in an event signal reserved by a neural event frame and utilizing a neural network to obtain a clear image;
a low resolution image super resolution module: the method comprises the steps of utilizing high-time resolution information in event signals reserved by a neural event frame, and utilizing a neural network to improve the resolution of images in a time space changing mode;
a high-speed video generation module: the method is used for combining the deblurred image and the super-resolution image to generate the high-speed video which is restored and reconstructed.
In yet another aspect, the present invention further provides an apparatus including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory communicate with each other through the communication bus; wherein:
the memory is used for storing a computer program;
the processor is configured to implement any one of the methods described above when executing the program stored on the memory.
Compared with the prior art, the invention has the beneficial effects that:
the image enhancement method and the device based on the event signal nerve coding mode combine the characteristics of a physical imaging model and a data driving mode, provide a robust and good coding mode for coding motion and time sequence information in an event signal stream, namely a nerve event frame, construct a conversion relation between a low-quality image and a high-quality image through the nerve event frame, effectively overcome the problems of noise interference and the like in the event signal, and realize deblurring and super-resolution of the image signal. Meanwhile, the invention designs a unified high-speed video generation framework under the guidance of the event signals, and realizes the recovery from the low-speed video to the high-speed video.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings required for the embodiments are briefly introduced below. The drawings in the following description are only some embodiments of the present invention; those of ordinary skill in the art may obtain other drawings from them.
Fig. 1 is a flowchart of an image enhancement method based on an event signal neural coding manner according to an embodiment of the present invention.
Fig. 2 is a diagram of an image deblurring application effect according to an embodiment of the present invention.
Fig. 3 is a diagram of an image super-resolution application effect provided by an embodiment of the present invention.
Fig. 4 is a diagram of an application effect of high-speed video generation according to an embodiment of the present invention.
Detailed Description
For a better understanding of the present technical solution, the method of the present invention is described in detail below with reference to the accompanying drawings.
The image enhancement method based on neural coding of event signals, as shown in fig. 1, comprises the following steps:
s1, event signal nerve coding: accumulating the input discrete event signal streams in a period of time, extracting features through a convolution layer, and performing forward and reverse bidirectional coding through a bidirectional long-short-time memory neural network to obtain neural event frames at different moments;
s2, deblurring a blurred image: the method comprises the steps of performing deblurring operation on an image by utilizing high-time resolution information in an event signal reserved by a neural event frame and utilizing a neural network to obtain a clear image;
s3, super-resolution of the low-resolution image: the method comprises the steps of utilizing high-time resolution information in event signals reserved by a neural event frame, and utilizing a neural network to improve the resolution of images in a time space changing mode;
s4, high-speed video generation: combining the deblurred image and the super-resolution image to generate a high-speed video which is restored and reconstructed.
Each step is implemented by a correspondingly designed module or architecture in the neural network:
(1) Event signal neural coding: for a single event it is difficult to distinguish signal from noise, and for the image enhancement task the information in a single event signal need not be processed individually; the invention therefore designs a data-driven event signal coding scheme, the neural event frame. As shown in fig. 1, the input discrete event signal stream is accumulated over a period of time, features are extracted through a designed convolution layer, and bidirectional encoding is performed through a bidirectional long short-term memory neural network to obtain the neural event frame, expressed as:

E_i = N({e_i})

where e_i denotes a single event signal, E_i denotes the resulting neural event frame, and N(·) denotes the neural encoding process. Bidirectional coding extracts forward and backward information in a single encoding pass, overcomes the problem of unequal positive and negative thresholds, and effectively cancels zero-mean random white noise during the bidirectional pass, giving a more robust event signal encoding and providing a better input for the subsequent fusion of the event signal with the image signal.
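A minimal PyTorch sketch of such an encoder follows; the 2-channel (positive/negative) slice accumulation, the channel sizes, and the per-pixel nn.LSTM are assumptions rather than the patent's exact architecture:

```python
import torch
import torch.nn as nn

class NeuralEventEncoder(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        self.conv = nn.Sequential(                 # per-slice feature extraction
            nn.Conv2d(2, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True))
        self.lstm = nn.LSTM(ch, ch, bidirectional=True, batch_first=True)

    def forward(self, slices):
        # slices: (B, T, 2, H, W) event counts accumulated over T time slices
        B, T, C, H, W = slices.shape
        f = self.conv(slices.reshape(B * T, C, H, W))          # (B*T, ch, H, W)
        ch = f.shape[1]
        seq = f.reshape(B, T, ch, H * W).permute(0, 3, 1, 2)   # per-pixel sequences
        out, _ = self.lstm(seq.reshape(B * H * W, T, ch))      # forward+backward pass
        out = out.reshape(B, H * W, T, 2 * ch).permute(0, 2, 3, 1)
        return out.reshape(B, T, 2 * ch, H, W)                 # neural event frames E_i
```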
(2) Blurred image deblurring: once the neural event frame is obtained, the high temporal-resolution information of the event signal that it retains can be used to deblur the image. The invention fuses the image signal features and the neural event frames through a U-Net neural network, and the network learns to output the residual between the blurred and sharp images, thereby recovering the sharp image from the blurred one, expressed as:

Î = D(B, E)

where B denotes the blurred image, Î denotes the finally recovered sharp image, and D(·) denotes the process of fusing the neural event frame E with the image signal and deblurring with the network D-Net.
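The residual formulation can be sketched as follows, with `unet` standing for a hypothetical U-Net module that takes the concatenated image and neural event frame:

```python
import torch

def deblur(blurred, event_frame, unet):
    """blurred: (N, 1, H, W) grayscale image; event_frame: (N, C, H, W)."""
    residual = unet(torch.cat([blurred, event_frame], dim=1))
    return blurred + residual   # sharp estimate = blurred input + learned residual
```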
(3) Low-resolution image super-resolution: once the neural event frame is obtained, the high temporal-resolution information retained from the event signal can likewise be exploited, so that the resolution of the image is improved by converting temporal information into spatial resolution. The invention fuses the image signal features and the neural event frames progressively through multi-layer RRDB modules and finally produces the super-resolved image by pixel rearrangement, expressed as:

Î_H = S(I_L, E)

where I_L denotes the input low-resolution image, Î_H denotes the finally recovered high-resolution image, and S(·) denotes the process of fusing the neural event frame E with the image signal and performing super-resolution with the network S-Net.
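A sketch of such a super-resolution head follows; for brevity the RRDB modules are reduced to plain residual convolution blocks, and the block and channel counts are assumptions:

```python
import torch
import torch.nn as nn

class SRHead(nn.Module):
    def __init__(self, in_ch, ch=64, n_blocks=4, scale=4):
        super().__init__()
        self.head = nn.Conv2d(in_ch, ch, 3, padding=1)
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
                          nn.Conv2d(ch, ch, 3, padding=1))
            for _ in range(n_blocks))
        self.up = nn.Sequential(                     # pixel rearrangement upsampling
            nn.Conv2d(ch, ch * scale * scale, 3, padding=1),
            nn.PixelShuffle(scale),
            nn.Conv2d(ch, 1, 3, padding=1))

    def forward(self, img, event_frame):
        x = self.head(torch.cat([img, event_frame], dim=1))
        for blk in self.blocks:
            x = x + blk(x)                           # gradual residual fusion
        return self.up(x)
```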
(4) High-speed video generation: through neural event frames, the invention establishes the relationships between sharp and blurred images and between low- and high-resolution images. Moreover, a single bidirectional neural encoding of the event signals yields the corresponding neural event frames at different moments, so that sharp, high-resolution images can be recovered in parallel. As shown in fig. 1, the method combines image deblurring and image super-resolution to reconstruct a high-resolution, high-speed sharp video from low-quality video.
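Putting the pieces together, the generation loop can be sketched as below; `encoder`, `d_net`, and `s_net` are hypothetical callables following the interfaces assumed above:

```python
import torch

@torch.no_grad()
def generate_high_speed_video(blurred, event_slices, encoder, d_net, s_net):
    """One bidirectional encoding yields neural event frames for every latent
    time step, so sharp high-resolution frames are recovered independently."""
    frames_E = encoder(event_slices)                # (B, T, C, H, W)
    video = []
    for i in range(frames_E.shape[1]):
        sharp = d_net(blurred, frames_E[:, i])      # latent sharp frame at step i
        video.append(s_net(sharp, frames_E[:, i]))  # upscaled sharp frame
    return torch.stack(video, dim=1)                # high-speed, high-resolution video
```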
The invention trains the neural network on synthetic data; the specific training process is as follows:
(1) Synthesizing training data
a) The REDS dataset is downloaded from the network as the basis; 23280 images are selected from it, with an image resolution of 720×1280 pixels.
b) The image blurring and low-resolution image generation process is simulated. First, the video frames are spatially downsampled to obtain low-resolution images of 180×320 pixels; the video is then temporally interpolated to a frame rate of 960 frames/second; finally, the pixel values of 17 adjacent frames are averaged to obtain a low-frame-rate blurred video at 60 frames/second, i.e., the low-quality input image signal (see the sketch after this list).
c) The simulated event signals are generated with the V2E simulator, whose input is the low-resolution sharp video at 960 frames/second.
d) Before training, data augmentation is applied: the original images are resized and randomly cropped to 64×64 pixels and matched with the 256×256-pixel sharp images at the corresponding positions, and the amount of training data is increased by rotation, mirror flipping, and the like.
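A minimal sketch of the frame averaging in step b); the stride of 16 (so that 17-frame windows align with a 60 fps grid at 960 fps) is an assumption:

```python
import numpy as np

def synthesize_blur(frames_960fps, win=17, stride=16):
    """frames_960fps: (N, H, W) low-resolution sharp frames at 960 fps."""
    blurred = [frames_960fps[i:i + win].mean(axis=0)        # simulate exposure
               for i in range(0, len(frames_960fps) - win + 1, stride)]
    return np.stack(blurred)                                # ~60 fps blurred video
```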
(2) Training of neural networks
a) Loss function: it consists of two parts, a mean square error and a perceptual error, and is defined as follows:

l = α·l_2(I_o, I_gt) + β·l_prec(I_o, I_gt)

where I_o denotes the output image and I_gt the target image; the two parameters α and β are set to 100 and 0.5, respectively; l_2(·) denotes the mean square error and l_prec(·) the perceptual error. The perceptual error is defined as:

l_prec(I_o, I_gt) = l_2(φ_h(I_o), φ_h(I_gt))

where φ_h(·) denotes the layer-h feature map of a VGG19 network pre-trained on ImageNet; here the feature maps output by the two convolution layers VGG3,3 and VGG5,5 are used, and the perceptual difference between the two images is measured by computing the mean square error between them. The same loss function is used for training both the deblurring network D-Net and the super-resolution network S-Net.
b) The whole network adopts a stage-by-stage training strategy. First, the neural event frame encoder and the deblurring network are trained simultaneously. After these two networks have become relatively stable, training continues jointly with the super-resolution network in the second stage; in this stage the parameters of the neural event frame encoder and the deblurring network are not frozen but are fine-tuned. Each of the two stages is trained for 100 rounds; the learning rate is held constant for the first 50 rounds and then decays linearly to 0 over the last 50 rounds. The learning rates of the two stages are set to 1×10⁻³ and 1×10⁻⁴, respectively, and both stages use the ADAM optimizer (see the sketch after this list).
c) During training, this embodiment randomly splits the simulated data into training and validation sets at a ratio of 9:1; the batch size is set to 8, and BatchNorm layers are used to help the network converge. Training uses PyTorch 1.7 and an NVIDIA 3090 GPU.
d) Only grayscale images are used during training. During testing, images are first converted from the RGB color space to the YUV color space and the Y channel is separated; deblurring and super-resolution are performed on the Y channel, the UV color channels are directly interpolated to the corresponding resolution, and the final color image is obtained by recombining them with the Y channel.
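A sketch of the staged schedule in item b), with an illustrative wiring of optimizer and linear-decay scheduler (the training-loop details are assumptions):

```python
import torch

def run_stage(params, train_one_epoch, base_lr, epochs=100):
    """Constant learning rate for the first half, then linear decay to zero."""
    opt = torch.optim.Adam(params, lr=base_lr)
    sched = torch.optim.lr_scheduler.LambdaLR(
        opt, lambda e: 1.0 if e < epochs // 2 else max(0.0, 2.0 * (1 - e / epochs)))
    for _ in range(epochs):
        train_one_epoch(opt)    # one pass over the training set
        sched.step()

# Stage 1: encoder + D-Net at lr 1e-3; Stage 2: add S-Net, fine-tune all at 1e-4.
```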
Corresponding to the method provided by the embodiment of the invention, the invention provides an image enhancement device based on neural coding of event signals, comprising four modules (event signal neural coding, blurred image deblurring, low-resolution image super-resolution, and high-speed video generation) to implement the method flow provided by the embodiment of the invention.
When the method or the device is applied, the following steps are adopted:
a) Environment setup: build a virtual environment on a Linux system with Anaconda and install the dependency packages, including Python 3.8, PyTorch 1.7, CUDA 11.3, cuDNN, etc.
b) Data generation: the training data are simulation data synthesized from the REDS dataset, with event signals generated by V2E simulation. The test data are captured with a DAVIS346 camera, yielding grayscale images at 260×346-pixel resolution together with event signals.
c) Model training: the model is built according to fig. 1 and trained on an NVIDIA GeForce RTX 3090 graphics card; it converges after 200 rounds of training.
d) Model testing: the grayscale image signal captured in the real world is blurred and of low resolution. The event signals are first encoded into neural event frames, and the image is then deblurred and super-resolved with the networks D-Net and S-Net. The results are shown in figs. 2 and 3: the restoration achieves good reconstruction of image edges, and the quality of the restored images is clearly higher than that of previous methods. Meanwhile, high-speed video is generated based on the image enhancement framework; as shown in fig. 4, the reconstructed high-speed video frames preserve better image detail and frame-to-frame continuity, with restoration quality higher than that of previous methods.
Compared with prior art 1, the invention provides the neural event frame coding scheme, which effectively overcomes problems such as threshold variation and noise interference in event cameras, making image enhancement more stable.
Compared with prior art 2, the method builds on the imaging model of the event camera, avoids problems such as the overfitting of purely data-driven methods, and extracts the motion and spatio-temporal information in the event signal specifically for the image enhancement task, thereby improving the image enhancement effect.
In summary, the invention combines the imaging model of the event camera and proposes a deep-learning image enhancement framework based on neural event frames: it neurally encodes the event signal with a data-driven method, retains the effective information in the event signal through bidirectional encoding tailored to the image enhancement task, effectively realizes high-speed, high-resolution sharp image restoration, and improves the quality of image enhancement.
Corresponding to the method provided by the embodiment of the invention, the embodiment of the invention also provides electronic equipment, which comprises: the device comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory are communicated with each other through the communication bus;
a memory for storing a computer program;
and the processor is used for realizing the method flow provided by the embodiment of the invention when executing the program stored in the memory.
The communication bus mentioned for the above device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is drawn in the figures, but this does not mean there is only one bus or one type of bus.
The communication interface is used for communication between the electronic device and other devices.
The Memory may include random access Memory (Random Access Memory, RAM) or may include Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the aforementioned processor.
The processor may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), etc.; but also digital signal processors (Digital Signal Processing, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components.
In yet another embodiment of the present invention, there is also provided a computer readable storage medium having stored therein a computer program which, when executed by a processor, implements the steps of any of the methods provided by the embodiments of the present invention described above.
In yet another embodiment of the present invention, a computer program product comprising instructions which, when run on a computer, cause the computer to perform the steps of any of the methods provided by the embodiments of the present invention described above is also provided.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be realized in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When loaded and executed on a computer, the procedures or functions according to the embodiments of the invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a server or data center integrating one or more available media. The available media may be magnetic media (e.g., floppy disk, hard disk, magnetic tape), optical media (e.g., DVD), or semiconductor media (e.g., solid state disk (SSD)), etc.
It is noted that relational terms such as first and second are used solely to distinguish one entity or action from another and do not necessarily require or imply any actual such relationship or order between such entities or actions. Moreover, the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to it. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
In this specification, each embodiment is described in a related manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments. In particular, for the apparatus embodiments, the electronic device embodiments, the computer-readable storage medium embodiments, and the computer program product embodiments, the description is relatively simple, as relevant to the description of the method embodiments in part, since they are substantially similar to the method embodiments.
The foregoing description is only of the preferred embodiments of the present invention and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention are included in the protection scope of the present invention.

Claims (7)

1. An image enhancement method based on neural coding of event signals, characterized by comprising the following steps:
s1, event signal nerve coding: accumulating the input discrete event signal streams in a period of time, extracting features through a convolution layer, and performing forward and reverse bidirectional coding through a bidirectional long-short-time memory neural network to obtain neural event frames at different moments;
s2, deblurring a blurred image: the method comprises the steps of performing deblurring operation on an image by utilizing high-time resolution information in an event signal reserved by a neural event frame and utilizing a neural network to obtain a clear image; the specific process is as follows: the method comprises the steps that through a U-Net neural network, image signal characteristics and neural event frames are fused, residual errors between a blurred image and a clear image are output through network learning, and clear image recovery from the blurred image is achieved, and the method is expressed as follows:
wherein B represents a blurred image and wherein,representing a final recovered clear image, D (-) representing a process of fusing the neural event frame and the image signal to be deblurred by using a network D-Net;
s3, super-resolution of the low-resolution image: the method comprises the steps of utilizing high-time resolution information in event signals reserved by a neural event frame, and utilizing a neural network to improve the resolution of images in a time space changing mode; the specific process is as follows: gradually fusing the image signal characteristics and the neural event frames through the multi-layer RRDB module, and finally realizing the image super-resolution in a pixel rearrangement mode, wherein the image super-resolution is expressed as follows:
wherein the method comprises the steps ofLow resolution image representing input, +.>Representing the finally recovered high-resolution image, wherein S (-) represents the process of integrating the neural event frame and the image signal by utilizing the network S-Net image superdivision;
s4, high-speed video generation: combining the deblurred image and the super-resolution image to generate a high-speed video which is restored and reconstructed.
2. The image enhancement method based on neural coding of event signals according to claim 1, characterized in that in step S1 a dense convolution module is used to extract features, and a long short-term memory neural network then encodes the timing of the signals to extract temporal information.
3. The image enhancement method based on neural coding of event signals according to claim 1, characterized in that the loss functions used by the neural networks of steps S2 and S3 each consist of two parts, a mean square error and a perceptual error:

l = α·l_2(I_o, I_gt) + β·l_prec(I_o, I_gt)

where I_o denotes the output image, I_gt denotes the target image, the two parameters α and β are set to 100 and 0.5 respectively, l_2(·) denotes the mean square error, and l_prec(·) is the perceptual error, defined as follows:

l_prec(I_o, I_gt) = l_2(φ_h(I_o), φ_h(I_gt))

where φ_h(·) denotes the layer-h feature map of a VGG19 network pre-trained on ImageNet.
4. The image enhancement method based on neural coding of event signals according to claim 1, characterized in that the whole network adopts a stage-by-stage training strategy: the neural event frame encoder and the deblurring network are trained simultaneously; after the two networks are relatively stable, training continues jointly with the super-resolution network while fine-tuning the parameters of the neural event frame encoder and the deblurring network; the learning rates of the two stages are set to 1×10⁻³ and 1×10⁻⁴, respectively; both stages use the ADAM optimizer.
5. The image enhancement method based on neural coding of event signals according to claim 1, characterized in that the whole network uses only grayscale images during training; during testing, an image is first converted from the RGB color space to the YUV color space and the Y channel is separated, deblurring and super-resolution are performed on the Y channel, the UV color channels are directly interpolated to the corresponding resolution, and the final color image is obtained by recombining them with the Y channel.
6. An image enhancement device based on neural coding of event signals, characterized by comprising the following modules to implement the method of any one of claims 1-5:
event signal neural coding module: the method comprises the steps of accumulating input discrete event signal streams in a period of time, extracting features through a convolution layer, and carrying out forward and reverse bidirectional coding through a bidirectional long-short-time memory neural network to obtain neural event frames at different moments;
a blurred image deblurring module: the method comprises the steps of performing deblurring operation on an image by utilizing high-time resolution information in an event signal reserved by a neural event frame and utilizing a neural network to obtain a clear image;
a low resolution image super resolution module: the method comprises the steps of utilizing high-time resolution information in event signals reserved by a neural event frame, and utilizing a neural network to improve the resolution of images in a time space changing mode;
a high-speed video generation module: the method is used for combining the deblurred image and the super-resolution image to generate the high-speed video which is restored and reconstructed.
7. An apparatus comprising a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with each other via the communication bus; characterized in that:
the memory is used for storing a computer program;
the processor being adapted to implement the method of any of claims 1-5 when executing a program stored on the memory.
CN202211515957.0A 2022-11-29 2022-11-29 Image enhancement method and device based on event signal neural coding Active CN116091337B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211515957.0A CN116091337B (en) Image enhancement method and device based on event signal neural coding

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211515957.0A CN116091337B (en) Image enhancement method and device based on event signal neural coding

Publications (2)

Publication Number Publication Date
CN116091337A CN116091337A (en) 2023-05-09
CN116091337B true CN116091337B (en) 2024-02-02

Family

ID=86201491

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211515957.0A Active CN116091337B (en) Image enhancement method and device based on event signal neural coding

Country Status (1)

Country Link
CN (1) CN116091337B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116523754A (en) * 2023-05-10 2023-08-01 广州民航职业技术学院 Method and system for enhancing quality of automatically-identified image of aircraft skin damage

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111667442A (en) * 2020-05-21 2020-09-15 武汉大学 High-quality high-frame-rate image reconstruction method based on event camera
CN113240605A (en) * 2021-05-21 2021-08-10 南开大学 Image enhancement method for forward and backward bidirectional learning based on symmetric neural network
CN113837938A (en) * 2021-07-28 2021-12-24 北京大学 Super-resolution method for reconstructing potential image based on dynamic vision sensor
CN114463218A (en) * 2022-02-10 2022-05-10 中国科学技术大学 Event data driven video deblurring method
CN115082341A (en) * 2022-06-24 2022-09-20 西安理工大学 Low-light image enhancement method based on event camera

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11037278B2 (en) * 2019-01-23 2021-06-15 Inception Institute of Artificial Intelligence, Ltd. Systems and methods for transforming raw sensor data captured in low-light conditions to well-exposed images using neural network architectures


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Yufeng Zheng et al.; Bio-inspired color image enhancement model; The International Society for Optical Engineering; pp. 1-12 *
Cai Wencheng; Research on low-illumination image enhancement methods based on generative adversarial networks (in Chinese); China Master's Theses Full-text Database (Electronic Journal); Vol. 2020, No. 08; full text *

Also Published As

Publication number Publication date
CN116091337A (en) 2023-05-09

Similar Documents

Publication Publication Date Title
Baldwin et al. Time-ordered recent event (TORE) volumes for event cameras
Jinno et al. Multiple exposure fusion for high dynamic range image acquisition
US11741581B2 (en) Training method for image processing model, image processing method, network device, and storage medium
CN113837938B (en) Super-resolution method for reconstructing potential image based on dynamic vision sensor
CN111669514B (en) High dynamic range imaging method and apparatus
CN113076685A (en) Training method of image reconstruction model, image reconstruction method and device thereof
CN112529776B (en) Training method of image processing model, image processing method and device
CN113067979A (en) Imaging method, device, equipment and storage medium based on bionic pulse camera
CN111079764A (en) Low-illumination license plate image recognition method and device based on deep learning
Yan et al. High dynamic range imaging via gradient-aware context aggregation network
Yuan et al. Single image dehazing via NIN-DehazeNet
CN116091337B (en) Image enhancement method and device based on event signal nerve coding mode
Yang et al. Learning event guided high dynamic range video reconstruction
Rasheed et al. LSR: Lightening super-resolution deep network for low-light image enhancement
CN112750092A (en) Training data acquisition method, image quality enhancement model and method and electronic equipment
CN112651911A (en) High dynamic range imaging generation method based on polarization image
Jiang et al. Event-based low-illumination image enhancement
Tang et al. Structure-embedded ghosting artifact suppression network for high dynamic range image reconstruction
Shaw et al. Hdr reconstruction from bracketed exposures and events
Zhang et al. Iterative multi‐scale residual network for deblurring
CN117078574A (en) Image rain removing method and device
CN115358962B (en) End-to-end visual odometer method and device
Liu et al. Sensing Diversity and Sparsity Models for Event Generation and Video Reconstruction from Events
Cui et al. Multi-stream attentive generative adversarial network for dynamic scene deblurring
US20230394632A1 (en) Method and image processing device for improving signal-to-noise ratio of image frame sequences

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant