WO2023211742A1 - Image processing method, electronic system and a non-transitory computer-readable medium - Google Patents


Info

Publication number
WO2023211742A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
image processing
processing method
light source
ground truth
Prior art date
Application number
PCT/US2023/019169
Other languages
French (fr)
Inventor
Jiang Li
Yangbo XIE
Ling OUYANG
Original Assignee
Innopeak Technology, Inc.
Priority date
Filing date
Publication date
Application filed by Innopeak Technology, Inc. filed Critical Innopeak Technology, Inc.
Publication of WO2023211742A1 publication Critical patent/WO2023211742A1/en


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/0895Weakly supervised learning, e.g. semi-supervised or self-supervised learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/0455Auto-encoder networks; Encoder-decoder networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/60Image enhancement or restoration using machine learning, e.g. neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/70Denoising; Smoothing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/10Image acquisition
    • G06V10/12Details of acquisition arrangements; Constructional details thereof
    • G06V10/14Optical characteristics of the device performing the acquisition or on the illumination arrangements
    • G06V10/145Illumination specially adapted for pattern recognition, e.g. using gratings
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/772Determining representative reference patterns, e.g. averaging or distorting patterns; Generating dictionaries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/776Validation; Performance evaluation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/98Detection or correction of errors, e.g. by rescanning the pattern or by human intervention; Evaluation of the quality of the acquired patterns
    • G06V10/993Evaluation of the quality of the acquired pattern
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Definitions

  • the image processing method includes the following steps: obtaining an initial image; degrading the initial image based on a predefined light source of a camera module to generate an input image; merging an image of the predefined light source into a light region of the initial image to generate a ground truth image; and processing the input image using an image processing network module to generate an output image, including training the image processing network module in accordance with a comparison of the output image and the ground truth image.
  • the camera module is an under-display camera.
  • the image processing method further includes the following steps: determining whether a comparison of image qualities of the output image and the ground truth image satisfies a predefined criterion; and in accordance with a determination that the comparison of image qualities of the output image and the ground truth image satisfies the predefined criterion, completing training of the image processing network module and associating the image processing network module with a plurality of weights and a plurality of biases.
  • the initial image includes a first initial image
  • the input image includes a first input image.
  • the ground truth image includes a first ground truth image.
  • the image processing method further includes the following steps: in accordance with a determination that the comparison of image qualities of the output image and the ground truth image does not satisfy the predefined criterion: obtaining a second initial image; generating a second input image and a second ground truth image; and training the image processing network module based on the second input image and the second ground truth image.
  • the initial image is selected from an image database including a plurality of low-noise images having image qualities higher than a threshold level.
  • the image database is configured to provide the plurality of low-noise images for training neural networks configured to process images.
  • the step of degrading the initial image further includes the following step: applying a haze effect, a diffraction effect, and one or more types of noises on the initial image.
  • the one or more types of noises includes a shot noise and a read noise.
  • the image processing method further includes the following step: determining the shot noise and the read noise based on camera statistics.
  • the step of degrading the initial image further includes the following step: adding the shot noise and the read noise into the initial image.
  • the ground truth image does not have the haze effect or the one or more types of noises, and the light source in the ground truth image is convolved with the predefined light source of the camera module.
  • the step of obtaining the initial image includes the following steps: obtaining a digital image having a JPEG image format; and converting the digital image from the JPEG image format to a RAW image format to generate the initial image having the RAW image format.
  • the predefined light source includes an array of point light sources.
  • Each point light source includes an optical fiber light source masked by a pinhole having a radius smaller than 100 times the light wavelength, and is configured to provide incident light having a range of incident angles.
  • the image processing method further includes the following step: representing each point light source with a point spread function (PSF) to generate the input image and the ground truth image.
  • each PSF has a high dynamic range that is greater than a threshold ratio in brightness, and is measured with a series of exposure times for each of a plurality of incident angles; a weighted average is then applied in the corresponding input image or the corresponding ground truth image.
  • the initial image includes a plurality of saturated pixels and a plurality of unsaturated pixels distinct from the plurality of saturated pixels.
  • the step of degrading the initial image includes varying the predefined light source for the plurality of unsaturated pixels and the plurality of saturated pixels, and the step of merging the image of the predefined light source into the light region of the initial image includes applying an augmented light source to the plurality of saturated pixels.
  • the augmented light source is represented by a plurality of PSFs having different shapes and intensity distributions.
  • the step of varying the predefined light source further includes the following steps: representing the predefined light source with a plurality of spatially averaged PSFs for degrading the plurality of unsaturated pixels; and representing the predefined light source with a plurality of PSFs having different shapes and intensity distributions for degrading the plurality of saturated pixels.
  • the electronic system of the invention includes one or more processors; and a memory, coupled to the one or more processors, and configured to store a plurality of instructions which, when executed by the one or more processors, cause the one or more processors to perform the above image processing method.
  • FIG.2 is a flow chart of an image processing method according to an embodiment of the invention.
  • FIG. 3 is a flow chart of an image processing method according to another embodiment of the invention.
  • FIG. 4A is a schematic diagram of the input image according to an embodiment of the invention.
  • FIG. 4B is a schematic diagram of a ground truth image according to an embodiment of the invention.
  • FIG. 5A is a schematic diagram of measuring a point spread function according to an embodiment of the invention.
  • FIG.5B is a schematic diagram of a set of reconstructed point spread functions according to an embodiment of the invention.
  • FIG. 6 is a flow chart of data simulation pipeline according to an embodiment of the invention.
  • FIG. 1 is a schematic diagram of an electronic system according to an embodiment of the invention.
  • the electronic system 100 includes a processor 110 and a memory 120, and the memory 120 may store an image processing network module 121 and relevant instructions.
  • the processor 110 is (electronically) coupled to the memory 120, and may execute the image processing network module 121 and relevant instructions to implement an image processing method of the invention.
  • the electronic system 100 may be one or more personal computers (PCs), one or more server computers, one or more workstation computers, or may be composed of multiple computing devices, but the invention is not limited thereto.
  • the electronic system 100 may include more processors for executing the image processing network module 121 and relevant instructions to implement the image processing method of the invention.
  • the processor 110 may include, for example, a central processing unit (CPU), a graphics processing unit (GPU), or another programmable general-purpose or special-purpose microprocessor, digital signal processor (DSP), application specific integrated circuit (ASIC), programmable logic device (PLD), other similar processing circuits or a combination of these devices.
  • the memory 120 may be a non-transitory computer-readable recording medium, such as a read-only memory (ROM), an erasable programmable read-only memory (EPROM), an electrically-erasable programmable read-only memory (EEPROM) or a non-volatile memory (NVM), but the present invention is not limited thereto.
  • the image processing network module 121 may also be stored in the non-transitory computer-readable recording medium of one apparatus and executed by the processor of another apparatus.
  • the image processing network module 121 includes a neural network model, and the neural network model may be implemented by a convolutional neural network with a U-Net structure, but the invention is not limited thereto.
  • the above neural network model may also be implemented by a context aggregation network (CAN) or other convolutional neural networks.
  • FIG.2 is a flow chart of an image processing method according to an embodiment of the invention. Referring to FIG. 1 and FIG. 2, the electronic system 100 may execute the following steps S201 to S204 to implement the image processing method. In step S201, the processor 110 may obtain an initial image.
  • the initial image may be selected from an image database including a plurality of low-noise images having image qualities higher than a threshold level, and the image database may be configured to provide the plurality of low-noise images for training the image processing network module 121.
  • the image database may be set in an external storage device or the memory 120.
  • the processor 110 may degrade the initial image based on a predefined light source of a camera module to generate an input image.
  • the processor 110 may use the initial image to simulate the image captured by the camera module under the influence of strong point light sources, other noises and equipment mechanisms.
  • the camera module may be an under-display camera, but the invention is not limited thereto.
  • the processor 110 may merge an image of the predefined light source into a light region of the initial image to generate a ground truth image.
  • the processor 110 may use the same initial image to simulate the ground truth image to represent the desired result of processing by the image processing network module 121.
  • the processor 110 may process the input image using an image processing network module 121 to generate an output image, including training the image processing network module 121 in accordance with a comparison of the output image and the ground truth image.
  • the processor 110 may generate the training data (the input image and the ground truth image) according to the initial image provided by the image database.
  • the processor 110 may input the input image into the image processing network module 121, so that the image processing network module 121 may output the output image.
  • the processor 110 may compare the output image and the ground truth image to determine whether a comparison of image qualities of the output image and the ground truth image satisfies a predefined criterion. Moreover, in accordance with a determination that the comparison of image qualities of the output image and the ground truth image satisfies the predefined criterion, the processor 110 may complete training of the image processing network module and associate the image processing network module with a plurality of weights and a plurality of biases.
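The patent leaves the predefined quality criterion unspecified. As an illustrative sketch only, the comparison could be a PSNR threshold; the `psnr` and `training_complete` helpers and the 35 dB value below are assumptions, not taken from the source.

```python
import numpy as np

def psnr(output: np.ndarray, ground_truth: np.ndarray, peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio between two images with values in [0, peak]."""
    mse = np.mean((output.astype(np.float64) - ground_truth.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)

def training_complete(output: np.ndarray, ground_truth: np.ndarray,
                      threshold_db: float = 35.0) -> bool:
    """Hypothetical predefined criterion: stop once PSNR exceeds the threshold."""
    return psnr(output, ground_truth) >= threshold_db
```

In practice any full-reference quality metric (SSIM, a perceptual loss) could play the same role; the check simply gates whether another training pair is generated.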
  • FIG. 3 is a flow chart of an image processing method according to another embodiment of the invention.
  • the embodiment of FIG. 3 may be used to illustrate a specific implementation detail of the image processing method of the embodiment of FIG. 2.
  • the electronic system 100 may execute the following steps S301 to S310 to implement the image processing method.
  • the processor 110 may perform a data simulation pipeline (i.e. execute steps S301 to S307) to implement, for example, the under-display camera image synthesis, and perform a neural network processing pipeline (i.e. execute steps S308 to S310).
  • In step S301, the processor 110 may obtain the initial image.
  • the initial image may be randomly sampled from the image database.
  • the processor 110 may perform haze simulation on the initial image.
  • In step S304, the processor 110 may perform bright light source diffraction simulation on the initial image.
  • In step S305, the processor 110 may perform noise simulation on the initial image.
  • In step S306, the processor 110 may generate the input image (i.e. as a degraded image). The processor 110 may degrade the initial image based on the predefined light source of the camera module, and may apply the haze effect, the diffraction effect, and one or more types of noises on the initial image.
  • FIG.4A is a schematic diagram of the input image according to an embodiment of the invention.
  • the processor 110 may generate the input image 410.
  • the processor 110 may perform light source simulation in the initial image.
  • the processor 110 may generate the ground truth image (i.e. as a target sample).
  • the processor 110 may generate the ground truth image based on the predefined light source of the camera module.
  • FIG.4B is a schematic diagram of the ground truth image according to an embodiment of the invention. As shown in FIG. 4B, the processor 110 may generate the ground truth image 420.
  • the ground truth image 420 does not have the haze effect or the one or more types of noises, and the light source in the ground truth image 420 is convolved with the predefined light source of the camera module.
  • the processor 110 may input the input image into the image processing network module 121, so that the image processing network module 121 may generate the output image.
  • the processor 110 may compare the image qualities of the output image and the ground truth image.
  • the processor 110 may determine whether a comparison of image qualities satisfies a predefined criterion.
  • the processor 110 completes training of the image processing network module 121 and associates the image processing network module 121 with the plurality of weights and the plurality of biases. If the comparison of image qualities does not satisfy the predefined criterion, the processor 110 may generate the next training data, and continuously train the image processing network module 121.
  • the initial image may include a first initial image
  • the input image may include a first input image
  • the ground truth image may include a first ground truth image.
  • the processor 110 may further obtain a second initial image, generate a second input image and a second ground truth image, and train the image processing network module 121 based on the second input image and the second ground truth image. Therefore, by analogy, the processor 110 may repeatedly execute steps S301 to S307 to generate a large amount of training data, and repeatedly execute steps S308 to S310 to effectively train the image processing network module 121 by using the large amount of training data.
  • FIG. 5A is a schematic diagram of measuring a point spread function according to an embodiment of the invention.
  • the predefined light source may include an array of point light sources, and each point light source may include an optical fiber light source masked by a pinhole.
  • the predefined light source may be used to approximate the point sources, each of which is an optical fiber source masked by a pinhole with a radius smaller than 100 times the mean wavelength.
  • an optical fiber light source 510 masked by a pinhole may be configured to provide incident light to the camera module 520.
  • the processor 110 may represent each point light source with a point spread function (PSF) to generate the input image and the ground truth image.
  • the PSFs may have a complex dependency on the direction of the point source relative to the optical center of the camera module. That is, each point light source may provide incident light having a range of incident angles.
  • the camera module 520 may be coupled to a data-acquisition computer 530 (or the processor 110 of FIG.1), and may be arranged on a motorized pan-tilt stage 540.
  • the motorized pan-tilt stage 540 may be programmed to sweep the smartphone across, for example, a roughly 60-degree by 60-degree pan-tilt angular range, so that the camera module 520 may generate a plurality of captured images for obtaining PSFs at different incident angles.
  • the measurement of the PSFs may cover the entire field-of-view of the camera. Therefore, the data-acquisition computer 530 may collect the captured images generated by the camera module 520 and perform post-processing to generate the PSFs.
  • the data-acquisition computer 530 may reconstruct a set of PSFs corresponding to a plurality of sub-regions of the entire image region from the captured images.
  • In the embodiment of the invention, many of the optical processes of the image degradation involve optical diffraction, and one characteristic of a diffraction pattern is the spatial spreading of its power over many higher-order features. While very high-order diffractions are not prominent under common lighting conditions, they are nevertheless noticeable when the scenes of interest contain very strong light sources. For example, the spot lights commonly seen at shopping malls can cause intense diffraction patterns that significantly degrade the intelligibility of other parts of the images. Therefore, to fully characterize the very high-order diffraction patterns, the embodiment may measure the PSFs over an ultra-high dynamic range of more than 1,000,000:1 in brightness.
  • the camera module 520 may also obtain high dynamic ratio images for measuring the PSFs, so that each reconstructed PSF may have a high dynamic range that is greater than a threshold ratio in brightness.
  • each reconstructed PSF may be measured with a series of exposure times for each of the plurality of incident angles, and the weighted average may be applied in the corresponding input image or the corresponding ground truth image.
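The exposure-bracketing merge described above could be sketched as follows, assuming a linear sensor response. `merge_exposures`, the saturation level, and the exposure-time weighting are illustrative choices, not the patent's exact procedure.

```python
import numpy as np

def merge_exposures(frames, exposure_times, sat_level=0.95):
    """Merge bracketed captures of a point light source into one HDR PSF.

    frames: list of 2-D arrays (linear sensor readings in [0, 1]);
    exposure_times: matching exposure durations. Saturated pixels are
    excluded, each frame's radiance estimate (reading / exposure time)
    is averaged with a weight proportional to its exposure time, so
    long exposures dominate in the dim high-order diffraction tails.
    """
    acc = np.zeros_like(frames[0], dtype=np.float64)
    wsum = np.zeros_like(acc)
    for frame, t in zip(frames, exposure_times):
        valid = frame < sat_level          # drop clipped pixels
        w = np.where(valid, t, 0.0)        # weight by exposure time
        acc += w * (frame / t)             # radiance estimate
        wsum += w
    return acc / np.maximum(wsum, 1e-12)
```

With exposures spanning several orders of magnitude, this kind of merge is how a single reconstructed PSF can cover a brightness range far beyond any individual capture.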
  • the data-acquisition computer 530 may further perform data augmentation to enhance the variety of the sampling of the discretely measured PSFs.
  • the applied data augmentation may contain both spatial augmentation based on random affine transformations (covering spatial translation, scaling, shearing, rotation and reflection) and random color gains (covering spectrum ranges of common light sources).
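One way the described augmentation could look in code, restricted for brevity to random rotation and scale (a subset of the affine family mentioned) plus random per-channel color gains; `augment_psf` and all parameter ranges are hypothetical.

```python
import numpy as np

def augment_psf(psf, rng, scale=(0.9, 1.1), max_rot_deg=15.0,
                gain_range=(0.8, 1.2)):
    """One random augmentation of a measured PSF (illustrative only).

    Applies a random rotation/scale about the PSF centre plus
    independent random per-channel colour gains. psf: (H, W, 3).
    """
    h, w, _ = psf.shape
    theta = np.deg2rad(rng.uniform(-max_rot_deg, max_rot_deg))
    s = rng.uniform(*scale)
    # Inverse mapping: for each output pixel, find the source pixel.
    inv = np.array([[np.cos(theta), np.sin(theta)],
                    [-np.sin(theta), np.cos(theta)]]) / s
    ys, xs = np.mgrid[0:h, 0:w]
    coords = np.stack([ys - h / 2, xs - w / 2], axis=-1) @ inv.T
    src_y = np.clip(np.round(coords[..., 0] + h / 2), 0, h - 1).astype(int)
    src_x = np.clip(np.round(coords[..., 1] + w / 2), 0, w - 1).astype(int)
    warped = psf[src_y, src_x]            # nearest-neighbour resample
    gains = rng.uniform(*gain_range, size=3)   # random colour gains
    return warped * gains
```

A production version would add translation, shearing and reflection, and use bilinear interpolation rather than nearest-neighbour sampling.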
  • FIG. 6 is a flow chart of data simulation pipeline according to an embodiment of the invention. Referring to FIG. 1 and FIG. 6, the processor 110 may execute the following steps S601 to S607 to implement the data simulation pipeline. The embodiment of FIG.6 may be used to illustrate a specific implementation detail of the data simulation pipeline of the above embodiment of FIG. 3.
  • the processor 110 may convert (simulate) linear RAW data from the readily available tone-mapped data (in JPEG or similar formats). In step S601, the processor 110 may convert the initial image from the JPEG image format to the RAW image format.
  • the JPEG-to-RAW conversion process is an approximated inversion of the image signal processing (ISP) pipeline, which may include white balance adjustment, color correction, gamma correction, tone mapping, etc.
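A minimal sketch of such an approximated ISP inversion, covering only the gamma and white-balance stages (color correction and tone mapping are omitted); the sRGB transfer function and the `wb_gains` values are assumptions rather than the patent's pipeline.

```python
import numpy as np

def jpeg_to_raw(srgb, wb_gains=(2.0, 1.0, 1.8)):
    """Approximate inversion of a simplified ISP (illustrative sketch).

    Undoes the sRGB gamma curve to recover linear intensities, then
    divides out assumed per-channel white-balance gains.
    srgb: float array in [0, 1], shape (H, W, 3)."""
    srgb = np.clip(srgb, 0.0, 1.0)
    # Inverse sRGB transfer function (gamma correction inversion).
    linear = np.where(srgb <= 0.04045,
                      srgb / 12.92,
                      ((srgb + 0.055) / 1.055) ** 2.4)
    # Undo white balance: divide each channel by its assumed gain.
    return linear / np.asarray(wb_gains)
```

Working in this linearized domain matters because the PSF convolution and the noise models that follow are only physically meaningful on linear sensor data.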
  • the initial image may include a plurality of saturated pixels and a plurality of unsaturated pixels distinct from the plurality of saturated pixels.
  • the processor 110 may perform unsaturated pixels degradation on the plurality of unsaturated pixels of the initial image by using a spatially averaged PSF.
  • the degradation process may be well approximated as a linear system and the inverse problem of recovering the image is generally easier than that with the saturated pixels.
  • Taking an under-display camera as an example, the degradation of its unsaturated pixels is reflected in their fog-like appearance with losses in sharpness and contrast. In the case of an under-display camera, such degradation is known as haze due to its foggy appearance.
  • the processor 110 may convolve the pixels in linear RAW data with the corresponding PSFs. While the PSFs of an imaging system are generally spatially-variant, a spatially-averaged PSF can be used for the purpose of accelerating the simulation while capturing most of the characteristics of the degradation processes for the unsaturated pixels.
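The spatially averaged PSF convolution can be sketched with an FFT-based circular convolution; `degrade_unsaturated` is an illustrative implementation, not the patent's code, and assumes a single-channel linear image.

```python
import numpy as np

def degrade_unsaturated(image, psf):
    """Convolve a linear RAW image with a spatially averaged PSF,
    simulating the haze-like degradation of unsaturated pixels.
    Uses FFT convolution with circular boundary handling."""
    psf = psf / psf.sum()                # energy-preserving kernel
    pad = np.zeros_like(image)
    ph, pw = psf.shape
    pad[:ph, :pw] = psf
    # Centre the kernel so the output is not spatially shifted.
    pad = np.roll(pad, (-(ph // 2), -(pw // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(pad)))
```

Because a single averaged kernel is shared across the frame, one FFT pair suffices per image, which is the speed advantage the text attributes to this approximation over a fully spatially-variant convolution.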
  • the processor 110 may perform saturated pixels degradation on the plurality of saturated pixels by using randomly affine transformed PSFs. In other words, the processor 110 may vary the predefined light source (i.e. the PSFs) for the plurality of unsaturated pixels and the plurality of saturated pixels of the initial image to generate a simulated noise-free degraded image.
  • In step S605, the processor 110 may add noise on the simulated noise-free degraded image.
  • In step S606, the processor 110 may generate the input image.
  • In step S604, the processor 110 may perform saturated pixels degradation by using a normal PSF.
  • the processor 110 may apply an augmented light source to the plurality of saturated pixels of the initial image by using the normal PSF.
  • the augmented light source may be represented by a plurality of PSFs having different shapes and intensity distributions.
  • the processor 110 may generate the ground truth image. Therefore, the electronic system 100 may implement efficient and accurate image synthesis by separating degradation modeling of unsaturated and saturated pixels.
  • FIG. 7 is a flow chart of generating a simulated noise-free degraded image according to an embodiment of the invention.
  • the embodiment of FIG.7 may be used to illustrate a specific implementation detail of synthesis of the degradation of the saturated pixels of the above embodiment of FIG. 6 (i.e. step S603 of FIG. 6).
  • the processor 110 may generate a plurality of augmented PSFs (augmented spatially-variant PSFs).
  • the processor 110 may simulate multiple light sources with various shapes and intensity distributions.
  • the processor 110 may convolve the light sources with the augmented PSFs.
  • the non-degraded light sources may be convolved with the augmented spatially-variant PSFs to obtain the degraded images of the light sources.
  • the processor 110 may generate a linearized image with simulated degradation of unsaturated pixels by the plurality of spatially averaged PSFs.
  • the processor 110 may generate a simulated noise-free degraded image. That is, the degraded images of the light sources are then superimposed upon the linearized images (synthesized degraded images with only unsaturated pixels) to generate the simulated noise-free degraded image.
  • the processor 110 may represent the predefined light source with the plurality of spatially averaged PSFs for degrading the plurality of unsaturated pixels of the initial image, and represent the predefined light source with the plurality of PSFs having different shapes and intensity distributions for degrading the plurality of saturated pixels of the initial image, so as to generate the simulated noise-free degraded image.
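The superposition step described above, where per-source degraded light patterns are added onto the already-blurred linearized image, might look like this; `superimpose_sources` and its argument conventions are assumptions for illustration.

```python
import numpy as np

def superimpose_sources(degraded_base, sources, psfs):
    """Superimpose degraded light sources onto a linearized image.

    degraded_base: (H, W) image whose unsaturated content was already
    blurred with the spatially averaged PSF. Each bright source is
    modelled by its own augmented PSF (different shape / intensity
    distribution), scaled by its intensity and added at its position.
    sources: list of (y, x, intensity); psfs: matching PSF arrays."""
    out = degraded_base.astype(np.float64).copy()
    h, w = out.shape
    for (y, x, intensity), psf in zip(sources, psfs):
        ph, pw = psf.shape
        y0, x0 = y - ph // 2, x - pw // 2
        # Clip the PSF footprint to the image bounds.
        ys, ye = max(y0, 0), min(y0 + ph, h)
        xs, xe = max(x0, 0), min(x0 + pw, w)
        out[ys:ye, xs:xe] += intensity * psf[ys - y0:ye - y0, xs - x0:xe - x0]
    return out
```

Splitting the pipeline this way keeps the cheap averaged-PSF blur for most of the frame while spending per-source PSF detail only where saturation makes it visible.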
  • FIG. 8 is a flow chart of noise synthesis according to an embodiment of the invention. Referring to FIG. 1 and FIG. 8, the embodiment of FIG. 8 may be used to illustrate a specific implementation detail of noise synthesis of the above embodiment of FIG.6 (i.e. step S605 of FIG. 6).
  • the processor 110 may obtain the simulated noise-free degraded image.
  • the processor 110 may obtain the shot noise.
  • the processor 110 may superimpose an image having the shot noise onto the simulated noise-free degraded image.
  • the processor 110 may obtain the read noise.
  • the processor 110 may superimpose an image having the read noise onto the previously superimposed image.
  • the processor 110 may determine the shot noise and the read noise based on camera features (camera statistics), and the shot noise and the read noise are modeled with measured noise statistics.
  • the processor 110 may generate a simulated degraded image with noise.
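A common model consistent with this description treats shot noise as Poisson-distributed photon counts and read noise as additive zero-mean Gaussian noise; the `gain` and `read_std` values below are placeholders for statistics that would be measured from the real camera, as the text describes.

```python
import numpy as np

def add_sensor_noise(clean, rng, gain=0.01, read_std=0.002):
    """Add shot and read noise to a noise-free degraded image.

    Shot noise follows Poisson statistics of the photon count
    (clean / gain electrons); read noise is zero-mean Gaussian.
    Both parameters stand in for values fitted from measured
    camera noise statistics."""
    photons = rng.poisson(np.clip(clean, 0, None) / gain)  # shot noise
    noisy = photons * gain
    noisy += rng.normal(0.0, read_std, size=clean.shape)   # read noise
    return noisy
```

Note that shot noise is signal-dependent (its variance grows with brightness), which is why it must be synthesized on the degraded linear image rather than added as a fixed-variance layer.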
  • the image processing network module 121 may contain tunable parameters to control the de-noise strength.
  • the electronic system and non-transitory computer-readable medium of the invention can automatically generate a large amount of reliable and useful training data by using the existing image database, so as to effectively train the image processing network module to learn to remove the degradation of the unsaturated and the saturated pixels, as well as suppress the noise.
  • the present invention can digitally generate realistic training data, greatly reducing the time and cost required to train image processing network modules.
  • the trained image processing network module may effectively process the image captured by the camera module (e.g. the under-display camera) to generate a corresponding optimized image.
  • Reference Signs List [0057]
    110: Processor
    120: Memory
    121: Image processing network module
    410: Input image
    420: Ground truth image
    510: Optical fiber light source
    520: Camera module
    530: Data-acquisition computer
    540: Motorized pan-tilt stage
    550: Image
    S201 to S204, S301 to S310, S601 to S607, S701 to S705, S801 to S804: Step

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Quality & Reliability (AREA)
  • Image Processing (AREA)

Abstract

An image processing method, an electronic system and a non-transitory computer-readable medium are provided. The image processing method is implemented at an electronic system. The image processing method includes the following steps: obtaining an initial image; degrading the initial image based on a predefined light source of a camera module to generate an input image; merging an image of the predefined light source into a light region of the initial image to generate a ground truth image; and processing the input image using an image processing network module to generate an output image, including training the image processing network module in accordance with a comparison of the output image and the ground truth image.

Description

IMAGE PROCESSING METHOD, ELECTRONIC SYSTEM AND A NON-TRANSITORY COMPUTER-READABLE MEDIUM CROSS-REFERENCE TO RELATED APPLICATION This application claims the priority benefit of US application serial no. 63/334,487, filed on April 25, 2022. The entirety of the above-mentioned patent application is hereby incorporated by reference herein and made a part of this specification. Technical Field [0001] The present invention relates to the field of data synthesis, and specifically, to an image processing method, an electronic system and a non-transitory computer-readable medium. Related Art [0002] In general, a large amount of training data is indispensable for data-driven algorithms such as neural-network-based algorithms. While it is theoretically possible to experimentally capture training data pairs, the cost in time and labor of obtaining hundreds of thousands of images is prohibitively high, making such collection impractical. More specifically, for neural-network-based algorithms that perform image restoration tasks, image pairs that contain both the realistically degraded images and the ideal non-degraded images are required in order to train the neural networks to learn the transformation from the degraded images to the ideal non-degraded images. [0003] However, the time required for the physical measurements alone exceeds the normal cycle allowed by a typical research and development time budget. For example, a typical data-driven neural network training requires the quantity of the training data pairs to be on the order of at least 100,000. Assuming a data collection speed of 100 image pairs per hour, the required time would be more than 100 work days. [0004] Moreover, special hardware systems need to be designed and manufactured to acquire the degraded images and the ideal non-degraded images. Such data acquisition needs to minimize the differences in factors independent of the image degradation process.
Examples of such factors include temporal differences and differences in viewing angle. In reality, undesirable differences between the degraded images and the ideal non-degraded images are unavoidable and will affect the quality of the neural network training. SUMMARY OF INVENTION Technical Problem [0005] A novel data synthesis method for generating realistic training data for training state-of-the-art neural-network-based image restoration algorithms is desirable. Solution to Problem [0006] The image processing method of the invention is implemented at an electronic system. The image processing method includes the following steps: obtaining an initial image; degrading the initial image based on a predefined light source of a camera module to generate an input image; merging an image of the predefined light source into a light region of the initial image to generate a ground truth image; and processing the input image using an image processing network module to generate an output image, including training the image processing network module in accordance with a comparison of the output image and the ground truth image. [0007] In an embodiment of the invention, the camera module is an under-display camera. [0008] In an embodiment of the invention, the image processing method further includes the following steps: determining whether a comparison of image qualities of the output image and the ground truth image satisfies a predefined criterion; and in accordance with a determination that the comparison of image qualities of the output image and the ground truth image satisfies the predefined criterion, completing training of the image processing network module and associating the image processing network module with a plurality of weights and a plurality of biases. [0009] In an embodiment of the invention, the initial image includes a first initial image, the input image includes a first input image, and the ground truth image includes a first ground truth image.
The image processing method further includes the following steps: in accordance with a determination that the comparison of image qualities of the first output image and the first ground truth image does not satisfy the predefined criterion: obtaining a second initial image; generating a second input image and a second ground truth image; and training the image processing network module based on the second input image and the second ground truth image. [0010] In an embodiment of the invention, the initial image is selected from an image database including a plurality of low-noise images having image qualities higher than a threshold level. The image database is configured to provide the plurality of low-noise images for training neural networks configured to process images. [0011] In an embodiment of the invention, the step of degrading the initial image further includes the following step: applying a haze effect, a diffraction effect, and one or more types of noises on the initial image. [0012] In an embodiment of the invention, the one or more types of noises includes a shot noise and a read noise. The image processing method further includes the following step: determining the shot noise and the read noise based on camera statistics. The step of degrading the initial image further includes the following step: adding the shot noise and the read noise into the initial image. [0013] In an embodiment of the invention, the ground truth image does not have the haze effect or the one or more types of noises, and the light source in the ground truth image is convolved with the predefined light source of the camera module. [0014] In an embodiment of the invention, the step of obtaining the initial image includes the following steps: obtaining a digital image having a JPEG image format; and converting the digital image from the JPEG image format to a RAW image format to generate the initial image having the RAW image format.
[0015] In an embodiment of the invention, the predefined light source includes an array of point light sources. Each point light source includes an optical fiber light source masked by a pinhole having a radius smaller than 100 times the light wavelength, and is configured to provide an incident light having a range of incident angles. The image processing method further includes the following step: representing each point light source with a point source function (PSF) to generate the input image and the ground truth image. [0016] In an embodiment of the invention, each PSF has a high dynamic range that is greater than a threshold ratio in brightness, and is applied with a series of exposure times for each of a plurality of incident angles. A weighted average is applied in the corresponding input image or the corresponding ground truth image. [0017] In an embodiment of the invention, the initial image includes a plurality of saturated pixels and a plurality of unsaturated pixels distinct from the plurality of saturated pixels. The step of degrading the initial image includes varying the predefined light source for the plurality of unsaturated pixels and the plurality of saturated pixels, and the step of merging the image of the predefined light source into the light region of the initial image includes applying an augmented light source to the plurality of saturated pixels. [0018] In an embodiment of the invention, the augmented light source is represented by a plurality of PSFs having different shapes and intensity distributions.
[0019] In an embodiment of the invention, the step of varying the predefined light source further includes the following steps: representing the predefined light source with a plurality of spatially averaged PSFs for degrading the plurality of unsaturated pixels; and representing the predefined light source with a plurality of PSFs having different shapes and intensity distributions for degrading the plurality of saturated pixels. [0020] The electronic system of the invention includes one or more processors; and a memory, coupled to the one or more processors, and configured to store a plurality of instructions, which when executed by the one or more processors cause the one or more processors to perform the above image processing method. [0021] The non-transitory computer-readable medium of the invention is configured to store a plurality of instructions, which when executed by one or more processors cause the one or more processors to perform the above image processing method. Effects of Invention [0022] Based on the above, the image processing method, the electronic system and the non-transitory computer-readable medium of the invention can automatically generate a large amount of training data and effectively train the image processing network module. [0023] To make the aforementioned more comprehensible, several embodiments accompanied with drawings are described in detail as follows. BRIEF DESCRIPTION OF DRAWINGS [0024] FIG. 1 is a schematic diagram of an electronic system according to an embodiment of the invention. [0025] FIG. 2 is a flow chart of an image processing method according to an embodiment of the invention. [0026] FIG. 3 is a flow chart of an image processing method according to another embodiment of the invention. [0027] FIG. 4A is a schematic diagram of an input image according to an embodiment of the invention. [0028] FIG. 4B is a schematic diagram of a ground truth image according to an embodiment of the invention. [0029] FIG.
5A is a schematic diagram of measuring a point spread function according to an embodiment of the invention. [0030] FIG. 5B is a schematic diagram of a set of reconstructed point spread functions according to an embodiment of the invention. [0031] FIG. 6 is a flow chart of a data simulation pipeline according to an embodiment of the invention. [0032] FIG. 7 is a flow chart of generating a simulated noise-free degraded image according to an embodiment of the invention. [0033] FIG. 8 is a flow chart of noise synthesis according to an embodiment of the invention. DESCRIPTION OF EMBODIMENTS [0034] FIG. 1 is a schematic diagram of an electronic system according to an embodiment of the invention. Referring to FIG. 1, in the embodiment of the invention, the electronic system 100 includes a processor 110 and a memory 120, and the memory 120 may store an image processing network module 121 and relevant instructions. The processor 110 is (electronically) coupled to the memory 120, and may execute the image processing network module 121 and relevant instructions to implement an image processing method of the invention. In the embodiment of the invention, the electronic system 100 may be one or more personal computers (PCs), one or more server computers, one or more workstation computers, or may be composed of multiple computing devices, but the invention is not limited thereto. In one embodiment of the invention, the electronic system 100 may include a plurality of processors for executing the image processing network module 121 and relevant instructions to implement the image processing method of the invention.
[0035] In the embodiment of the invention, the processor 110 may include, for example, a central processing unit (CPU), a graphic processing unit (GPU), or other programmable general-purpose or special-purpose microprocessor, digital signal processor (DSP), application specific integrated circuit (ASIC), programmable logic device (PLD), other similar processing circuits or a combination of these devices. In the embodiment of the invention, the memory 120 may be a non-transitory computer-readable recording medium, such as a read-only memory (ROM), an erasable programmable read-only memory (EPROM), an electrically-erasable programmable read-only memory (EEPROM) or a non-volatile memory (NVM), but the present invention is not limited thereto. In one embodiment of the invention, the image processing network module 121 may also be stored in the non-transitory computer-readable recording medium of one apparatus, and executed by the processor of another apparatus. [0036] In the embodiment of the invention, the image processing network module 121 includes a neural network model, and the neural network model may be implemented by a U-shaped encoder-decoder convolutional neural network (U-NET) structure, but the invention is not limited thereto. In one embodiment of the invention, the above neural network model may also be implemented by a context aggregation network (CAN) or other convolutional neural networks. [0037] FIG. 2 is a flow chart of an image processing method according to an embodiment of the invention. Referring to FIG. 1 and FIG. 2, the electronic system 100 may execute the following steps S201 to S204 to implement the image processing method. In step S201, the processor 110 may obtain an initial image.
In the embodiment of the invention, the initial image may be selected from an image database including a plurality of low-noise images having image qualities higher than a threshold level, and the image database may be configured to provide the plurality of low-noise images for training the image processing network module 121. The image database may be set in an external storage device or the memory 120. [0038] In step S202, the processor 110 may degrade the initial image based on a predefined light source of a camera module to generate an input image. In the embodiment of the invention, the processor 110 may use the initial image to simulate the image captured by the camera module under the influence of strong point light sources, other noises and equipment mechanisms. The camera module may be an under-display camera, but the invention is not limited thereto. In step S203, the processor 110 may merge an image of the predefined light source into a light region of the initial image to generate a ground truth image. In the embodiment of the invention, the processor 110 may use the same initial image to simulate the ground truth image to represent the desired result of processing by the image processing network module 121. [0039] In step S204, the processor 110 may process the input image using the image processing network module 121 to generate an output image, including training the image processing network module 121 in accordance with a comparison of the output image and the ground truth image. In the embodiment of the invention, the processor 110 may generate the training data (the input image and the ground truth image) according to the initial image provided by the image database. The processor 110 may input the input image into the image processing network module 121, so that the image processing network module 121 may output the output image.
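The generate-and-compare training check of step S204 can be sketched as follows. This is a minimal illustration only, not the disclosed implementation: the PSNR metric, the 35 dB threshold, and the function names are assumptions chosen for the example, since the disclosure leaves the predefined criterion unspecified.

```python
import numpy as np

def psnr(a: np.ndarray, b: np.ndarray, peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio between two images with values in [0, peak]."""
    mse = float(np.mean((a - b) ** 2))
    if mse == 0.0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)

def training_step_done(output_img: np.ndarray,
                       ground_truth: np.ndarray,
                       threshold_db: float = 35.0) -> bool:
    """Step S204 check (sketch): compare the image qualities of the output
    image and the ground truth image against a predefined criterion,
    here a hypothetical PSNR threshold."""
    return psnr(output_img, ground_truth) >= threshold_db
```

When the criterion is not met, a new initial image would be drawn from the image database and the next input/ground-truth pair generated, as described below.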
The processor 110 may compare the output image and the ground truth image to determine whether a comparison of image qualities of the output image and the ground truth image satisfies a predefined criterion. Moreover, in accordance with a determination that the comparison of image qualities of the output image and the ground truth image satisfies the predefined criterion, the processor 110 may complete training of the image processing network module and associate the image processing network module with a plurality of weights and a plurality of biases. Therefore, the electronic system 100 and the image processing method of the embodiment may be able to automatically generate the training data, and effectively train the image processing network module 121 by using the training data. [0040] FIG. 3 is a flow chart of an image processing method according to another embodiment of the invention. The embodiment of FIG. 3 may be used to illustrate a specific implementation detail of the image processing method of the embodiment of FIG. 2. Referring to FIG. 1 and FIG. 3, the electronic system 100 may execute the following steps S301 to S310 to implement the image processing method. It should be noted that the processor 110 may perform a data simulation pipeline (i.e. execute steps S301 to S307) to implement, for example, the under-display camera image synthesis, and perform a neural network processing pipeline (i.e. steps S308 to S310) to implement the neural network training. [0041] In step S301, the processor 110 may obtain the initial image. The initial image may be randomly sampled from the image database. In step S302, the processor 110 may perform haze simulation on the initial image. In step S304, the processor 110 may perform bright light source diffraction simulation on the initial image. In step S305, the processor 110 may perform noise simulation on the initial image. In step S306, the processor 110 may generate the input image (i.e. as a degraded image).
The processor 110 may degrade the initial image based on the predefined light source of the camera module, and may apply the haze effect, the diffraction effect, and one or more types of noises on the initial image. For example, referring to FIG. 4A, FIG. 4A is a schematic diagram of the input image according to an embodiment of the invention. As shown in FIG. 4A, the processor 110 may generate the input image 410. [0042] On the other hand, in step S303, the processor 110 may perform light source simulation on the initial image. In step S307, the processor 110 may generate the ground truth image (i.e. as a target sample). The processor 110 may generate the ground truth image based on the predefined light source of the camera module. For example, referring to FIG. 4B, FIG. 4B is a schematic diagram of the ground truth image according to an embodiment of the invention. As shown in FIG. 4B, the processor 110 may generate the ground truth image 420. The ground truth image 420 does not have the haze effect or the one or more types of noises, and the light source in the ground truth image 420 is convolved with the predefined light source of the camera module. [0043] Then, in step S308, the processor 110 may input the input image into the image processing network module 121, so that the image processing network module 121 may generate the output image. In step S309, the processor 110 may compare the image qualities of the output image and the ground truth image. In step S310, the processor 110 may determine whether the comparison of image qualities satisfies a predefined criterion. If the comparison of image qualities satisfies the predefined criterion, the processor 110 completes training of the image processing network module 121 and associates the image processing network module 121 with the plurality of weights and the plurality of biases.
If the comparison of image qualities does not satisfy the predefined criterion, the processor 110 may generate the next training data, and continue to train the image processing network module 121. [0044] For example, the initial image may include a first initial image, the input image may include a first input image, and the ground truth image may include a first ground truth image. In accordance with the determination that the comparison of image qualities of the first output image and the first ground truth image does not satisfy the predefined criterion, the processor 110 may further obtain a second initial image, generate a second input image and a second ground truth image, and train the image processing network module 121 based on the second input image and the second ground truth image. Therefore, by analogy, the processor 110 may repeatedly execute steps S301 to S307 to generate a large amount of training data, and repeatedly execute steps S308 to S310 to effectively train the image processing network module 121 by using the large amount of training data. [0045] FIG. 5A is a schematic diagram of measuring a point spread function according to an embodiment of the invention. FIG. 5B is a schematic diagram of a set of reconstructed point spread functions according to an embodiment of the invention. Referring to FIG. 1, FIG. 5A and FIG. 5B, in the embodiment of the invention, a method for measuring high dynamic range point source functions (PSFs) is provided below. Specifically, the predefined light source may include an array of point light sources, and each point light source may include an optical fiber light source masked by a pinhole having a radius smaller than 100 times the mean wavelength, so as to approximate an ideal point source. As shown in FIG.
5A, an optical fiber light source 510 masked by a pinhole may be configured to provide incident light to the camera module 520. The processor 110 may represent each point light source with a point source function (PSF) to generate the input image and the ground truth image. [0046] In the embodiment of the invention, since many of the optical processes may include the incident-angle-dependent diffraction, the PSFs may have a complex dependency on the direction of the point source relative to the optical center of the camera module. That is, each point light source may provide an incident light having a range of incident angles. Thus, as shown in FIG. 5A, the camera module 520 may be coupled to a data-acquisition computer 530 (or the processor 110 of FIG. 1), and may be arranged on a motorized pan-tilt stage 540. The motorized pan-tilt stage 540 may be programmed to sweep the smartphone over, for example, a roughly 60-degree by 60-degree pan-tilt angular range, so that the camera module 520 may generate a plurality of captured images for obtaining PSFs at different incident angles. Thus, the measurement of the PSFs may cover the entire field-of-view of the camera. Therefore, the data-acquisition computer 530 may collect the captured images generated by the camera module 520 and perform post-processing to generate the PSFs. As shown in the image 550 of FIG. 5B, the data-acquisition computer 530 may reconstruct a set of PSFs corresponding to a plurality of sub-regions of the entire image region from the captured images. [0047] In the embodiment of the invention, many of the optical processes of the image degradation involve optical diffraction, and one characteristic of a diffraction pattern is the spatial spreading of its power over many higher-order features. While very high order diffractions are not prominent for common lighting conditions, they are nevertheless noticeable when the interested scenes contain very strong light sources.
For example, the spot lights commonly seen at shopping malls can cause intense diffraction patterns that significantly degrade the intelligibility of other parts of the images. Therefore, to fully characterize the very high order diffraction patterns, the embodiment may measure the PSFs over an ultra-high dynamic range of more than 1,000,000:1 in brightness. More than 50 frames, from the lowest exposure-time/analog-gain setting to the highest one, are taken at each incident direction, and weighted averaging is then performed according to their exposure-time/analog-gain settings to obtain the high-dynamic range (HDR) result. Therefore, in the embodiment of the invention, the camera module 520 may also obtain high-dynamic-range images for measuring the PSFs, so that each reconstructed PSF may have a high dynamic range that is greater than a threshold ratio in brightness. Moreover, each reconstructed PSF may be applied with a series of exposure times for each of the plurality of incident angles, and the weighted average may be applied in the corresponding input image or the corresponding ground truth image. [0048] In the embodiment of the invention, the data-acquisition computer 530 may further perform data augmentation to enhance the variety of the sampling of the discretely measured PSFs. The applied data augmentation may contain both spatial augmentation based on random affine transformations (covering spatial translation, scaling, shearing, rotation and reflection) and random color gains (covering spectrum ranges for common light sources). [0049] FIG. 6 is a flow chart of a data simulation pipeline according to an embodiment of the invention. Referring to FIG. 1 and FIG. 6, the processor 110 may execute the following steps S601 to S607 to implement the data simulation pipeline. The embodiment of FIG. 6 may be used to illustrate a specific implementation detail of the data simulation pipeline of the above embodiment of FIG. 3.
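The HDR PSF merging described in paragraph [0047] — many frames per incident direction, combined by weighted averaging over their exposure-time/analog-gain settings — can be sketched as follows. This is an illustrative sketch only: the linear clipping model, the saturation threshold, and the exposure-proportional weighting are assumptions for the example, not details mandated by the disclosure.

```python
import numpy as np

def merge_hdr_psf(frames, exposures, sat_level=0.95):
    """Merge raw frames of a point source into one HDR PSF.

    frames    : list of 2-D arrays, linear sensor readings in [0, 1]
    exposures : effective exposure (exposure time x analog gain) per frame

    Saturated pixels are excluded. Dividing each valid reading by its
    exposure brings all frames to one radiometric scale; weighting each
    scaled reading by its exposure (longer exposure -> less relative
    noise) reduces the sums below to num/den.
    """
    num = np.zeros_like(frames[0], dtype=np.float64)
    den = np.zeros_like(frames[0], dtype=np.float64)
    for frame, t in zip(frames, exposures):
        valid = frame < sat_level            # ignore clipped pixels
        num += np.where(valid, frame, 0.0)   # accumulates radiance * t
        den += np.where(valid, t, 0.0)       # accumulates exposure
    return np.where(den > 0, num / den, 0.0)
```

A dim diffraction sidelobe is then recovered from the long exposures while the bright central peak comes from the short ones, yielding the ultra-high dynamic range described above.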
In the embodiment of the invention, if the image database does not provide enough linear data (in a RAW image format), the processor 110 may convert (simulate) linear RAW data from the readily available tone-mapped data (in JPEG or similar formats). [0050] In step S601, the processor 110 may convert the initial image from a JPEG image format to a RAW image format. In the embodiment of the invention, the JPEG-to-RAW conversion process is an approximated inversion of the image signal processing (ISP) pipeline, which may include white balance adjustment, color correction, gamma correction, tone mapping, etc. The ISP parameters used in the JPEG-to-RAW conversion process are based on the imaging statistics of the camera modules of interest. In the embodiment of the invention, the initial image may include a plurality of saturated pixels and a plurality of unsaturated pixels distinct from the plurality of saturated pixels. [0051] In step S602, the processor 110 may perform unsaturated pixel degradation on the plurality of unsaturated pixels of the initial image by using a spatially averaged PSF. In the embodiment of the invention, for unsaturated pixels, the degradation process may be well approximated as a linear system, and the inverse problem of recovering the image is generally easier than that with the saturated pixels. Taking an under-display camera as an example, the degradation of its unsaturated pixels is reflected in their fog-like appearance with losses in sharpness and contrast; such degradation is known as haze due to the foggy appearance. Therefore, to synthesize the degraded unsaturated pixels, the processor 110 may convolve the pixels in the linear RAW data with the corresponding PSFs.
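The convolution of the linear RAW data with a spatially averaged PSF (step S602) can be sketched as follows. The FFT-based circular convolution and the energy-preserving kernel normalization are illustrative choices for this sketch, not details specified by the disclosure; a production pipeline would pad the image to avoid wrap-around at the borders.

```python
import numpy as np

def degrade_unsaturated(linear_img, psf):
    """Synthesize the haze-like degradation of the unsaturated pixels by
    convolving a linear RAW image with a spatially averaged PSF (sketch)."""
    psf = psf / psf.sum()                 # normalize so total energy is kept
    # embed the kernel in a full-size array and shift its center to the
    # origin so the convolution does not translate the image
    pad = np.zeros_like(linear_img, dtype=np.float64)
    kh, kw = psf.shape
    pad[:kh, :kw] = psf
    pad = np.roll(pad, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    # circular convolution via the FFT convolution theorem
    return np.fft.ifft2(np.fft.fft2(linear_img) * np.fft.fft2(pad)).real
```

Because the operation is linear, the same routine applies per color channel of the linear RAW data; only the PSF changes between channels.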
While the PSFs of an imaging system are generally spatially-variant, a spatially-averaged PSF can be used for the purpose of accelerating the simulation while capturing most of the characteristics of the degradation processes for the unsaturated pixels. In step S603, the processor 110 may perform saturated pixel degradation on the plurality of saturated pixels of the initial image by using randomly affine-transformed PSFs. In other words, the processor 110 may vary the predefined light source (i.e. the PSFs) for the plurality of unsaturated pixels and the plurality of saturated pixels of the initial image to generate a simulated noise-free degraded image. [0052] In step S605, the processor 110 may add noise on the simulated noise-free degraded image. In step S606, the processor 110 may generate the input image. On the other hand, in step S604, the processor 110 may perform saturated pixel degradation by using a normal PSF. The processor 110 may apply an augmented light source to the plurality of saturated pixels of the initial image by using the normal PSF. The augmented light source may be represented by a plurality of PSFs having different shapes and intensity distributions. Thus, in step S607, the processor 110 may generate the ground truth image. Therefore, the electronic system 100 may implement efficient and accurate image synthesis by separating the degradation modeling of unsaturated and saturated pixels. [0053] FIG. 7 is a flow chart of generating a simulated noise-free degraded image according to an embodiment of the invention. Referring to FIG. 1 and FIG. 7, the embodiment of FIG. 7 may be used to illustrate a specific implementation detail of the synthesis of the degradation of the saturated pixels of the above embodiment of FIG. 6 (i.e. step S603 of FIG. 6). In step S701, the processor 110 may generate a plurality of augmented PSFs (augmented spatially-variant PSFs).
In step S702, the processor 110 may simulate multiple light sources with various shapes and intensity distributions. In step S703, the processor 110 may convolve the light sources with the augmented PSFs. That is, the non-degraded light sources may be convolved with the augmented spatially-variant PSFs to obtain the degraded images of the light sources. In step S704, the processor 110 may generate a linearized image with simulated degradation of the unsaturated pixels by the plurality of spatially averaged PSFs. In step S705, the processor 110 may generate a simulated noise-free degraded image. That is, the degraded images of the light sources are then superimposed upon the linearized images (synthesized degraded images with only unsaturated pixels) to generate the simulated noise-free degraded image. [0054] In other words, in the embodiment of the invention, the processor 110 may represent the predefined light source with the plurality of spatially averaged PSFs for degrading the plurality of unsaturated pixels of the initial image, and represent the predefined light source with the plurality of PSFs having different shapes and intensity distributions for degrading the plurality of saturated pixels of the initial image, so as to generate the simulated noise-free degraded image. [0055] FIG. 8 is a flow chart of noise synthesis according to an embodiment of the invention. Referring to FIG. 1 and FIG. 8, the embodiment of FIG. 8 may be used to illustrate a specific implementation detail of the noise synthesis of the above embodiment of FIG. 6 (i.e. step S605 of FIG. 6). In step S801, the processor 110 may obtain the simulated noise-free degraded image. In step S802, the processor 110 may obtain the shot noise. The processor 110 may superimpose an image having the shot noise onto the simulated noise-free degraded image. In step S803, the processor 110 may obtain the read noise. The processor 110 may superimpose an image having the read noise onto the previously superimposed image.
In the embodiment of the invention, the processor 110 may determine the shot noise and the read noise based on camera statistics, and the shot noise and the read noise are modeled with measured noise statistics. In step S804, the processor 110 may generate a simulated degraded image with noise. Thus, the synthesis of the realistic noise of the camera module is integrated into the training data generation pipeline in order to add noise reduction capability to the neural network. In addition, the image processing network module 121 may contain tunable parameters to control the de-noising strength. [0056] In summary, the image processing method, the electronic system and the non-transitory computer-readable medium of the invention can automatically generate a large amount of reliable and useful training data by using an existing image database, so as to effectively train the image processing network module to learn to remove the degradation of the unsaturated and the saturated pixels, as well as to suppress the noise. In other words, without labor-intensive, time-consuming and expensive experimental data collection, the present invention can digitally generate realistic training data, greatly reducing the time and cost required to train image processing network modules. Thus, the trained image processing network module may effectively implement image processing on the image captured by the camera module (e.g. the under-display camera) to generate a corresponding optimized image. Reference Signs List [0057] 110: Processor 120: Memory 121: Image processing network module 410: Input image 420: Ground truth image 510: Optical fiber light source 520: Camera module 530: Data-acquisition computer 540: Motorized pan-tilt stage 550: Image S201~S204, S301~S310, S601~S607, S701~S705, S801~S804: Step

Claims

WHAT IS CLAIMED IS: 1. An image processing method, implemented at an electronic system, comprising: obtaining an initial image; degrading the initial image based on a predefined light source of a camera module to generate an input image; merging an image of the predefined light source into a light region of the initial image to generate a ground truth image; and processing the input image using an image processing network module to generate an output image, including training the image processing network module in accordance with a comparison of the output image and the ground truth image. 2. The image processing method according to claim 1, wherein the camera module is an under-display camera. 3. The image processing method according to claim 1 or 2, further comprising: determining whether a comparison of image qualities of the output image and the ground truth image satisfies a predefined criterion; and in accordance with a determination that the comparison of image qualities of the output image and the ground truth image satisfies the predefined criterion, completing training of the image processing network module and associating the image processing network module with a plurality of weights and a plurality of biases. 4. The image processing method according to any one of claims 1 to 3, wherein the initial image comprises a first initial image, the input image comprises a first input image, the ground truth image comprises a first ground truth image, and the image processing method further comprises: in accordance with a determination that the comparison of image qualities of the first output image and the first ground truth image does not satisfy the predefined criterion: obtaining a second initial image; generating a second input image and a second ground truth image; and training the image processing network module based on the second input image and the second ground truth image. 5.
The image processing method according to any one of claims 1 to 4, wherein the initial image is selected from an image database comprising a plurality of low-noise images having image qualities higher than a threshold level, and the image database is configured to provide the plurality of low-noise images for training the image processing network module.

6. The image processing method according to any one of claims 1 to 5, wherein the step of degrading the initial image comprises:
applying a haze effect, a diffraction effect, and one or more types of noise to the initial image.

7. The image processing method according to claim 6, wherein the one or more types of noise comprise a shot noise and a read noise, and the image processing method further comprises:
determining the shot noise and the read noise based on camera features,
wherein the step of degrading the initial image further comprises:
adding the shot noise and the read noise into the initial image.

8. The image processing method according to claim 6, wherein the ground truth image does not have the haze effect or the one or more types of noise, and the light source in the ground truth image is convolved with the predefined light source of the camera module.

9. The image processing method according to any one of claims 1 to 8, wherein the step of obtaining the initial image comprises:
obtaining a digital image having a JPEG image format; and
converting the digital image from the JPEG image format to a RAW image format to generate the initial image having the RAW image format.

10.
The image processing method according to any one of claims 1 to 9, wherein the predefined light source comprises an array of point light sources, and each point light source comprises an optical fiber light source masked by a pinhole, and is configured to provide an incident light having a range of incident angles,
wherein the image processing method further comprises:
representing each point light source with a point spread function (PSF) to generate the input image and the ground truth image.

11. The image processing method according to claim 10, wherein each PSF has a high dynamic range that is greater than a threshold ratio in brightness, and is applied with a series of exposure times for each of a plurality of incident angles, and a weighted average is applied in the corresponding input image or the corresponding ground truth image.

12. The image processing method according to any one of claims 1 to 11, wherein the initial image comprises a plurality of saturated pixels and a plurality of unsaturated pixels distinct from the plurality of saturated pixels,
wherein the step of degrading the initial image comprises:
varying the predefined light source for the plurality of unsaturated pixels and the plurality of saturated pixels,
wherein the step of merging the image of the predefined light source into the light region of the initial image comprises:
applying an augmented light source to the plurality of saturated pixels.

13. The image processing method according to claim 12, wherein the augmented light source is represented by a plurality of PSFs having different shapes and intensity distributions.

14.
The image processing method according to claim 12, wherein the step of varying the predefined light source comprises:
representing the predefined light source with a plurality of spatially averaged PSFs for degrading the plurality of unsaturated pixels; and
representing the predefined light source with a plurality of PSFs having different shapes and intensity distributions for degrading the plurality of saturated pixels.

15. An electronic system, comprising:
one or more processors; and
a memory, coupled to the one or more processors, and configured to store a plurality of instructions, which when executed by the one or more processors cause the one or more processors to perform the image processing method of any one of claims 1 to 14.

16. A non-transitory computer-readable medium, configured to store a plurality of instructions, which when executed by one or more processors cause the one or more processors to perform the image processing method of any one of claims 1 to 14.
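The degradation step recited in claims 1, 10 and 14 amounts to blurring a clean image with a PSF of the predefined light source. A minimal sketch of that operation, assuming a single spatially averaged PSF and FFT-based circular convolution (the publication itself measures per-angle PSFs with an optical-fiber point source, which this sketch does not reproduce):

```python
import numpy as np

def degrade_with_psf(clean, psf):
    """Blur a clean image with a point spread function via circular
    (FFT-based) convolution, simulating under-display diffraction."""
    psf = psf / psf.sum()                      # preserve total energy
    pad = np.zeros_like(clean, dtype=float)
    kh, kw = psf.shape
    pad[:kh, :kw] = psf
    # Circularly shift so the kernel is centered at the origin;
    # otherwise the output image would be spatially shifted.
    pad = np.roll(pad, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(clean) * np.fft.fft2(pad)))
```

Because the kernel is normalized to unit sum, total image energy is conserved, which matches the intuition that diffraction redistributes rather than removes light.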
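Claim 9 converts a JPEG-format image to a RAW image format before degradation. One common ingredient of such "unprocessing" is inverting the sRGB transfer curve back to linear intensities; the sketch below shows only that gamma-inversion step under the standard sRGB definition, and is not the publication's full JPEG-to-RAW pipeline:

```python
import numpy as np

def srgb_to_linear(srgb):
    """Invert the sRGB transfer curve, mapping encoded values in
    [0, 1] back to linear light (piecewise per IEC 61966-2-1)."""
    srgb = np.asarray(srgb, dtype=float)
    low = srgb <= 0.04045
    # Small values use the linear segment; the rest use the 2.4 power law.
    return np.where(low, srgb / 12.92, ((srgb + 0.055) / 1.055) ** 2.4)
```

A linear image recovered this way can then be mosaicked and noise-injected to approximate RAW sensor data.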
PCT/US2023/019169 2022-04-25 2023-04-20 Image processing method, electronic system and a non-transitory computer-readable medium WO2023211742A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202263334487P 2022-04-25 2022-04-25
US63/334,487 2022-04-25

Publications (1)

Publication Number Publication Date
WO2023211742A1 2023-11-02

Family

ID=88519518

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2023/019169 WO2023211742A1 (en) 2022-04-25 2023-04-20 Image processing method, electronic system and a non-transitory computer-readable medium

Country Status (1)

Country Link
WO (1) WO2023211742A1 (en)


Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190096046A1 (en) * 2017-09-25 2019-03-28 The Regents Of The University Of California Generation of high dynamic range visual media
US20220067889A1 (en) * 2020-08-31 2022-03-03 Samsung Electronics Co., Ltd. Image enhancement method, image enhancement apparatus, and method and apparatus for training image enhancement apparatus


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117237859A (en) * 2023-11-14 2023-12-15 南京信息工程大学 Night expressway foggy day visibility detection method based on low illumination enhancement
CN117237859B (en) * 2023-11-14 2024-02-13 南京信息工程大学 Night expressway foggy day visibility detection method based on low illumination enhancement

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
Ref document number: 23797050
Country of ref document: EP
Kind code of ref document: A1