CN118115375A - Image processing method, apparatus, electronic device, and computer-readable storage medium


Info

Publication number: CN118115375A
Application number: CN202211499327.9A
Authority: CN (China)
Prior art keywords: image, reference frame, feature, original images, features
Legal status: Pending
Other languages: Chinese (zh)
Inventor: 曾辉
Current assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Original assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN202211499327.9A
Publication of CN118115375A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4053 Scaling of whole images or parts thereof based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)

Abstract

The present application relates to an image processing method, an apparatus, an electronic device, a storage medium, and a computer program product. The method comprises the following steps: denoising and demosaicing the first image features extracted from at least two original images based on noise information of reference frames in the at least two original images to obtain second image features; the frequency of the second image feature is lower than a preset frequency threshold; performing super-resolution processing on the second image features to obtain third image features; the frequency of the third image feature is higher than or equal to a preset frequency threshold; and generating a target image according to the second image characteristic and the third image characteristic. By adopting the method, the accuracy of image processing can be improved.

Description

Image processing method, apparatus, electronic device, and computer-readable storage medium
Technical Field
The present application relates to the field of image technology, and in particular, to an image processing method, an image processing device, an electronic device, and a computer readable storage medium.
Background
With the development of image technology, multi-frame image fusion techniques have emerged. In the multi-frame image fusion process, ISP (Image Signal Processing) operations are generally required, including demosaicing, denoising, white balance, tone mapping, contrast enhancement and the like, so that a final image is obtained through fusion.
However, conventional image processing methods suffer from low image processing accuracy.
Disclosure of Invention
Embodiments of the present application provide an image processing method, apparatus, electronic device, computer-readable storage medium, and computer program product, which can improve accuracy of image processing.
In a first aspect, the present application provides an image processing method. The method comprises the following steps:
Denoising and demosaicing the first image features extracted from at least two original images based on noise information of reference frames in the at least two original images to obtain second image features; the frequency of the second image feature is lower than a preset frequency threshold;
performing super-resolution processing on the second image features to obtain third image features; the frequency of the third image feature is higher than or equal to a preset frequency threshold;
and generating a target image according to the second image characteristic and the third image characteristic.
In a second aspect, the present application also provides an image processing apparatus. The device comprises:
The first processing module is used for carrying out denoising processing and demosaicing processing on first image features extracted from at least two original images based on noise information of reference frames in the at least two original images to obtain second image features; the frequency of the second image feature is lower than a preset frequency threshold;
The second processing module is used for performing super-resolution processing on the second image features to obtain third image features; the frequency of the third image feature is higher than or equal to a preset frequency threshold;
And the image generation module is used for generating a target image according to the second image characteristic and the third image characteristic.
In a third aspect, the present application further provides an electronic device. The electronic device comprises a memory and a processor; the memory stores a computer program, and the processor executes the computer program to implement the following steps:
Denoising and demosaicing the first image features extracted from at least two original images based on noise information of reference frames in the at least two original images to obtain second image features; the frequency of the second image feature is lower than a preset frequency threshold;
performing super-resolution processing on the second image features to obtain third image features; the frequency of the third image feature is higher than or equal to a preset frequency threshold;
and generating a target image according to the second image characteristic and the third image characteristic.
In a fourth aspect, the present application also provides a computer-readable storage medium. The computer-readable storage medium has stored thereon a computer program which, when executed by a processor, performs the following steps:
Denoising and demosaicing the first image features extracted from at least two original images based on noise information of reference frames in the at least two original images to obtain second image features; the frequency of the second image feature is lower than a preset frequency threshold;
performing super-resolution processing on the second image features to obtain third image features; the frequency of the third image feature is higher than or equal to a preset frequency threshold;
and generating a target image according to the second image characteristic and the third image characteristic.
In a fifth aspect, the present application also provides a computer program product. The computer program product comprises a computer program which, when executed by a processor, implements the steps of:
Denoising and demosaicing the first image features extracted from at least two original images based on noise information of reference frames in the at least two original images to obtain second image features; the frequency of the second image feature is lower than a preset frequency threshold;
performing super-resolution processing on the second image features to obtain third image features; the frequency of the third image feature is higher than or equal to a preset frequency threshold;
and generating a target image according to the second image characteristic and the third image characteristic.
The image processing method, apparatus, electronic device, computer-readable storage medium and computer program product perform denoising and demosaicing on first image features extracted from at least two original images based on noise information of a reference frame in the at least two original images, obtaining second image features with a frequency lower than a preset frequency threshold, and perform super-resolution processing on the second image features, obtaining third image features with a frequency higher than or equal to the preset frequency threshold. That is, low-frequency information of the image, such as color and contour, can be obtained more accurately through the denoising and demosaicing processing, and high-frequency information, such as texture detail, can be obtained more accurately through the super-resolution processing, so that a denoised, demosaiced and super-resolved target image is generated from the second image features and the third image features, improving the accuracy of image processing.
Drawings
In order to more clearly illustrate the embodiments of the application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of an image processing method in one embodiment;
FIG. 2 is a flowchart of an image processing method in another embodiment;
FIG. 3 is a flowchart of an image processing method in another embodiment;
FIG. 4 is a flowchart of an image processing method in another embodiment;
FIG. 5 is a flowchart of an image processing method in another embodiment;
FIG. 6 is a flowchart of an image processing method in another embodiment;
FIG. 7 is a flow chart of generating multi-frame images to be input into a network in another embodiment;
FIG. 8 is a flowchart of an image processing method in another embodiment;
FIG. 9 is a block diagram showing the structure of an image processing apparatus in one embodiment;
fig. 10 is an internal structural diagram of an electronic device in one embodiment.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
In one embodiment, as shown in fig. 1, an image processing method is provided; the method is described as applied to an electronic device, which may be a terminal or a server. It will be appreciated that the method may also be applied to a system comprising a terminal and a server and be implemented through their interaction. The terminal can be, but is not limited to, a personal computer, notebook computer, smart phone, tablet computer, Internet of Things device or portable wearable device; the Internet of Things device can be a smart speaker, smart television, smart air conditioner, smart vehicle-mounted device, smart automobile and the like. The portable wearable device may be a smart watch, smart bracelet, headset or the like. The server may be implemented as a stand-alone server or as a server cluster composed of a plurality of servers.
In this embodiment, the image processing method includes the steps of:
step S102, denoising and demosaicing are carried out on first image features extracted from at least two original images based on noise information of reference frames in the at least two original images, so as to obtain second image features; the frequency of the second image feature is below a preset frequency threshold.
It will be appreciated that the original image may be a RAW-domain image, i.e. raw image data.
The reference frame is an image used for reference during registration. Alternatively, the reference frame may be the image with the highest sharpness of the at least two original images, or may be the image with the highest brightness of the at least two original images, which is not limited herein. The noise information of the reference frame may be a noise figure of the reference frame.
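As a concrete illustration of picking a reference frame by sharpness, the sketch below scores each frame by its summed squared finite-difference gradients. Both the metric and the function names are hypothetical stand-ins, since the patent does not specify how sharpness is measured.

```python
# Hypothetical sketch: choose the reference frame as the sharpest candidate,
# scoring sharpness by summed squared horizontal/vertical gradients.

def sharpness(frame):
    """Sum of squared finite-difference gradients of a 2D grayscale frame."""
    h, w = len(frame), len(frame[0])
    score = 0
    for y in range(h):
        for x in range(w):
            if x + 1 < w:
                score += (frame[y][x + 1] - frame[y][x]) ** 2
            if y + 1 < h:
                score += (frame[y + 1][x] - frame[y][x]) ** 2
    return score

def pick_reference(frames):
    """Return the index of the frame with the highest sharpness score."""
    return max(range(len(frames)), key=lambda i: sharpness(frames[i]))

blurry = [[5, 5], [5, 5]]   # flat frame: zero gradient energy
sharp = [[0, 9], [9, 0]]    # strong edges: high gradient energy
assert pick_reference([blurry, sharp]) == 1
```

A brightness-based selection, as also mentioned above, would simply swap in a mean-intensity score.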
Denoising refers to the process of reducing noise in a digital image. Demosaicing (also known as de-mosaicing, demosaicking or debayering) is a digital image processing algorithm that reconstructs a full-color image from the incomplete color samples output by a photosensitive element covered with a Color Filter Array (CFA). This method is also known as CFA interpolation or color reconstruction.
The preset frequency threshold may be set as desired. The frequency of the second image feature being below the preset frequency threshold means the second image feature comprises low-frequency image features. Low frequency means the color changes slowly, i.e. the gray level varies gradually, representing continuously graded areas. For an image, the content inside the edges is low frequency, and that content carries most of the image's information, i.e. its general outline and contour; it is the approximate information of the image. Illustratively, the second image feature may include information such as contour and color.
Optionally, the electronic device acquires at least two RAW-domain original images through the image sensor, inputs the at least two RAW-domain original images into a joint demosaicing, denoising and super-resolution network, and within that network performs demosaicing, denoising and super-resolution processing on the at least two RAW-domain original images to obtain the target image.
Optionally, in the joint de-mosaicing, denoising and super-resolution network, extracting first image features from at least two original images, and determining a reference frame from the at least two original images; acquiring noise information of a reference frame; and carrying out denoising and demosaicing on the first image characteristic based on the noise information of the reference frame to obtain a second image characteristic.
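The optional flow above can be sketched end to end. Every stage here is a deliberately trivial stand-in (averaging for feature fusion, per-pixel noise subtraction for denoising, nearest-neighbour repetition for super-resolution) and all names are made up; only the data flow mirrors the description.

```python
# Toy sketch of the joint demosaic/denoise/super-resolution data flow.
# Frames are flat pixel lists; stages are placeholders, not the patent's networks.

def extract_features(frames):
    # fuse per-frame features by pixelwise averaging (stand-in for conv fusion)
    n = len(frames)
    return [sum(px) / n for px in zip(*frames)]

def denoise_demosaic(features, noise_map):
    # subtract estimated noise per pixel (stand-in for the denoising branch)
    return [f - nz for f, nz in zip(features, noise_map)]

def super_resolve(low_freq, scale=2):
    # nearest-neighbour repetition (stand-in for the SR branch)
    return [v for v in low_freq for _ in range(scale)]

def process(frames, noise_map, scale=2):
    low = denoise_demosaic(extract_features(frames), noise_map)   # step S102
    high = super_resolve(low, scale)                              # step S104
    low_up = [v for v in low for _ in range(scale)]               # match resolutions
    return [a + b for a, b in zip(low_up, high)]                  # step S106: fuse

out = process([[4.0, 8.0], [6.0, 10.0]], noise_map=[1.0, 1.0], scale=2)
assert len(out) == 4   # 2x super-resolved output
```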
Step S104, performing super-resolution processing on the second image feature to obtain a third image feature; the frequency of the third image feature is greater than or equal to a preset frequency threshold.
Super-resolution processing increases the resolution of the original image by hardware or software means; super-resolution reconstruction is the process of obtaining a high-resolution image from a series of low-resolution images. The super-resolution magnification may be set as needed. Illustratively, the super-resolution processing may be 2x super-resolution.
The frequency of the third image feature being higher than or equal to the preset frequency threshold means the third image feature comprises high-frequency image features. High frequency means the gray level changes quickly: at image edge positions, where the gray difference between adjacent areas is large, the corresponding gray level changes rapidly, and image details are likewise areas of sharply changing gray values. Illustratively, the third image feature includes information such as edges and details.
Optionally, the electronic device inputs the second image feature into a Super-resolution module (Super-resolution imaging, SR), and performs Super-resolution processing on the second image feature through the Super-resolution module to obtain a third image feature after Super-resolution processing.
Step S106, generating a target image according to the second image feature and the third image feature.
Optionally, the electronic device adds the second image feature and the third image feature to generate the target image.
Optionally, the electronic device multiplies the second image feature and the third image feature by corresponding weight factors respectively, and then adds the second image feature and the third image feature to generate the target image. The weight factors may be set as desired.
Optionally, the electronic device selects a first partial feature from the second image features, selects a second partial feature from the third image features, and adds the first partial feature and the second partial feature to generate the target image.
The manner of generating the target image from the second image feature and the third image feature may be set as needed, and is not limited thereto.
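Two of the fusion options above, direct addition and weighted addition, can be sketched on flat feature vectors; the weight values here are arbitrary examples, not values from the patent.

```python
# Toy sketches of two fusion options for generating the target image.

def fuse_add(f2, f3):
    """Elementwise addition of the second and third image features."""
    return [a + b for a, b in zip(f2, f3)]

def fuse_weighted(f2, f3, w2=0.4, w3=0.6):
    """Weighted addition; the weight factors may be set as desired."""
    return [w2 * a + w3 * b for a, b in zip(f2, f3)]

f2, f3 = [1.0, 2.0], [3.0, 4.0]
assert fuse_add(f2, f3) == [4.0, 6.0]
assert fuse_weighted(f2, f3, 0.5, 0.5) == [2.0, 3.0]
```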
According to the image processing method, denoising and demosaicing are performed on first image features extracted from at least two original images based on noise information of a reference frame in the at least two original images, obtaining second image features with a frequency lower than a preset frequency threshold; super-resolution processing is performed on the second image features, obtaining third image features with a frequency higher than or equal to the preset frequency threshold. That is, low-frequency information such as color and contour can be obtained more accurately through the denoising and demosaicing processing, and high-frequency information such as texture detail can be obtained more accurately through the super-resolution processing, so that a realistic, detail-rich and super-resolved target image is generated from the second and third image features, improving the accuracy of image processing. Meanwhile, the electronic device's requirements on image processing speed, power consumption, computing power, resolution and the like can be met.
Moreover, by outputting the low-frequency second image features and the high-frequency third image features, the joint demosaicing, denoising and super-resolution network architecture improves the utilization of each of its parts: the super-resolution part focuses on restoring high-frequency texture detail, while the joint demosaicing and denoising part focuses on restoring low-frequency information. In addition, the high/low-frequency separation strategy allows greater freedom in denoising strength, super-resolution magnification and the like, and supports customization for different application scenarios.
In one embodiment, denoising and demosaicing are performed on a first image feature extracted from at least two original images based on noise information of a reference frame in the at least two original images, to obtain a second image feature, including: denoising the first image features extracted from at least two original images based on noise information of reference frames in the at least two original images to obtain denoised image features; and performing demosaicing processing on the denoised image characteristics to obtain second image characteristics.
It can be understood that the electronic device performs denoising processing on the first image feature extracted from the at least two original images based on noise information of the reference frames in the at least two original images, so as to obtain a denoised image feature, and performs demosaicing processing on the denoised image feature, so that a denoised and demosaiced second image feature can be accurately obtained.
In another embodiment, the electronic device may perform demosaicing processing on the first image feature first, and then perform denoising processing to obtain the second image feature.
In one embodiment, denoising a first image feature extracted from at least two original images based on noise information of a reference frame in the at least two original images to obtain a denoised image feature, including: performing feature mapping on noise information of reference frames in at least two original images and first image features extracted from the at least two original images to obtain mapping features; and denoising the mapping characteristics to obtain denoised image characteristics.
Optionally, denoising the mapping feature to obtain a denoised image feature, including: sequentially carrying out downsampling and upsampling on the mapping characteristics to obtain upsampling characteristics; the resolution of the up-sampled features is the same as the resolution of the mapped features; and obtaining the denoising image characteristic based on the upsampling characteristic and the mapping characteristic. Wherein the noise information may be a noise figure.
Downsampling, also called subsampling, is a multirate digital signal processing technique that reduces the sampling rate of a signal, and is generally used to reduce the data transmission rate or the data size. Upsampling is also known as interpolation.
The electronic device performs feature mapping on the noise map of the reference frame and the first image features using a convolution layer and an activation function layer, obtaining mapping features that the denoising module can process; it then inputs the mapping features into a UResNet module comprising four 2x downsampling and four 2x upsampling stages, which downsamples and upsamples the mapping features in sequence to obtain the upsampled features; finally, the upsampled features and the mapping features are concatenated in the channel dimension to obtain the denoised image features. The UResNet module is a UNet that uses the residual module as its basic processing unit: after each upsampling, the output is concatenated in the channel dimension with the same-resolution features saved before downsampling and fed into the subsequent upsampling module. The number and magnification of the downsampling and upsampling stages in the UResNet module can be set as needed and are not limited here.
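One level of the UResNet-style skip connection described above might look as follows. This toy version works on a 1D feature list, averages pairs for 2x downsampling, repeats values for 2x upsampling, and omits the residual units entirely; it only illustrates that the skip features and the upsampled output have matching resolution before channel concatenation.

```python
# Minimal sketch of one UResNet level: 2x down, 2x up, then channel-wise
# concatenation with the same-resolution features saved before downsampling.

def down2(x):
    """2x downsampling by averaging adjacent pairs."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]

def up2(x):
    """2x upsampling by repetition."""
    return [v for v in x for _ in range(2)]

def unet_level(x):
    skip = x                    # features kept before downsampling
    y = up2(down2(x))           # inner processing (residual units) omitted
    assert len(y) == len(skip)  # resolutions match, as required for concat
    return list(zip(y, skip))   # channel-dimension concatenation

out = unet_level([2.0, 4.0, 6.0, 8.0])
assert out == [(3.0, 2.0), (3.0, 4.0), (7.0, 6.0), (7.0, 8.0)]
```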
In this embodiment, the electronic device performs feature mapping on noise information of a reference frame in at least two original images and first image features extracted from at least two original images to obtain mapping features, and performs denoising processing on the mapping features to obtain denoised image features with denoising effect, so that denoising effect of the image features is improved.
In one embodiment, demosaicing the de-noised image features to obtain second image features includes: and carrying out up-sampling and residual error processing on the denoised image characteristics to obtain second image characteristics.
Optionally, the electronic device performs 2 times up-sampling on the denoised image features, and then inputs the obtained image features into a network containing 2 depth residual modules for residual processing, so as to obtain a joint demosaicing and denoised second image feature. The number of depth residual modules in the network including the depth residual modules can be set according to the needs.
Optionally, the electronic device may further process the second image feature through two convolution layers to obtain a first image component in the RGB domain.
In one embodiment, performing super-resolution processing on the second image feature to obtain a third image feature, including: residual processing is carried out on the second image feature, and residual processing features are obtained; and upsampling the residual processing feature to obtain a third image feature.
Optionally, the electronic device inputs the second image feature into a network containing a depth residual module, and residual processing is performed through the network containing the depth residual module to obtain a residual processing feature; and upsampling the residual processing feature to obtain a third image feature.
The number of depth residual modules in the network including the depth residual modules can be set according to the needs. Illustratively, to balance processing efficiency and processing effect, the network may include 8 depth residual modules.
Optionally, the electronic device may upsample the residual processing feature by using a nearest neighbor interpolation method to obtain a third image feature.
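Nearest-neighbour upsampling as mentioned above can be sketched on a small 2D grid; each pixel is simply repeated `scale` times horizontally and vertically.

```python
# Nearest-neighbour 2x upsampling of a 2D grid.

def upsample_nearest(img, scale=2):
    out = []
    for row in img:
        wide = [v for v in row for _ in range(scale)]   # widen each row
        out.extend(list(wide) for _ in range(scale))    # repeat each widened row
    return out

img = [[1, 2],
       [3, 4]]
assert upsample_nearest(img) == [[1, 1, 2, 2],
                                 [1, 1, 2, 2],
                                 [3, 3, 4, 4],
                                 [3, 3, 4, 4]]
```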
Optionally, taking 2 times super-resolution processing as an example, the electronic device performs residual processing on the second image feature to obtain a residual processing feature; and up-sampling the residual processing characteristic by 2 times to obtain a third image characteristic.
For super-resolution magnification requirements greater than 1x and less than 2x, the electronic device upsamples the residual processing features by the preset magnification to obtain the third image features, and upsamples the second image features by the same preset magnification to obtain the upsampled second image features, where the preset magnification is greater than 1 and less than 2.
For magnification requirements greater than 2x, the electronic device adds a parallel super-resolution processing branch at the preset magnification to obtain third image features at that magnification; meanwhile, the 2x second image features (or the features at the integral power of 2 closest to, but smaller than, the preset magnification) are upsampled by the corresponding factor to obtain the upsampled second image features.
For an 8x magnification requirement, the electronic device increases the number of upsampling passes to 3 during super-resolution processing and joint denoising and demosaicing processing: the first 2x upsampling yields 2x, the second yields 4x, and the third yields 8x.
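The 8x case reduces to three successive 2x passes; the sketch below uses nearest-neighbour repetition as a stand-in for the network's upsampling stages, checking only that the overall magnification compounds as described.

```python
# Three successive 2x upsampling passes give an overall 8x magnification.

def up2(x):
    """2x upsampling of a 1D feature list by repetition."""
    return [v for v in x for _ in range(2)]

feat = [1.0, 2.0]
for _ in range(3):            # first pass: 2x, second: 4x, third: 8x
    feat = up2(feat)
assert len(feat) == 8 * 2     # 8x the original length
```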
Optionally, the electronic device may further process the third image feature through two convolution layers to obtain a second image component in the RGB domain; the target image in the RGB domain can be obtained by adding the first image component in the RGB domain and the second image component in the RGB domain.
In this embodiment, residual processing is performed on the second image feature to obtain a residual processing feature, and up-sampling is performed on the residual processing feature, so that a third image feature with super resolution can be accurately obtained.
In one embodiment, as shown in fig. 2, there is also provided another image processing method, including the steps of:
step S202, feature mapping is carried out on noise information of a reference frame in at least two original images and first image features extracted from the at least two original images, and mapping features are obtained.
Step S204, sequentially performing downsampling and upsampling on the mapping characteristics to obtain upsampling characteristics; the resolution of the upsampled features is the same as the resolution of the mapped features.
It should be noted that, the electronic device performs downsampling and upsampling on the mapping feature in sequence, so as to achieve a denoising effect.
Step S206, obtaining the denoising image feature based on the up-sampling feature and the mapping feature.
Step S208, up-sampling and residual processing are carried out on the denoised image characteristics to obtain second image characteristics; the frequency of the second image feature is below a preset frequency threshold.
Step S210, carrying out residual processing on the second image feature to obtain a residual processing feature; upsampling the residual processing feature by a preset multiplying power to obtain a third image feature; the frequency of the third image feature is greater than or equal to a preset frequency threshold.
The preset multiplying power can be set according to the requirement. Illustratively, the preset magnification may be 2 times.
Step S212, up-sampling the second image feature with a preset multiplying power to obtain an up-sampled second image feature.
It will be appreciated that the third image feature is obtained from the second image feature with an additional upsampling by the preset magnification, so the resolution of the third image feature is higher than that of the second image feature. To generate the synthesized target image more accurately, the electronic device also upsamples the second image feature by the preset magnification, obtaining the upsampled second image feature; the resolution of the upsampled second image feature then matches that of the third image feature.
Optionally, the electronic device upsamples the second image feature by the preset magnification using bicubic interpolation, obtaining the upsampled second image feature.
Step S214, generating a target image according to the up-sampled second image feature and third image feature.
In this embodiment, the electronic device performs residual processing on the second image feature to obtain a residual processing feature, then performs up-sampling of a preset multiplying power on the residual processing feature to obtain a third image feature, and also performs up-sampling of the preset multiplying power on the second image feature to obtain an up-sampled second image feature, where the resolution of the up-sampled second image feature is consistent with that of the third image feature, so that the target image can be generated more accurately according to the up-sampled second image feature and the up-sampled third image feature that are consistent in resolution.
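Steps S210 to S214 can be sketched on 1D features: upsample the low-frequency feature by the same preset magnification, check that the resolutions match, and fuse by addition. Nearest-neighbour repetition stands in for bicubic interpolation purely for brevity; function names are illustrative only.

```python
# Toy sketch of steps S212-S214: match resolutions, then fuse.

def up(x, scale):
    """Upsampling by repetition (stand-in for bicubic interpolation)."""
    return [v for v in x for _ in range(scale)]

def generate_target(f2, f3, scale=2):
    f2_up = up(f2, scale)          # step S212: upsample second image feature
    assert len(f2_up) == len(f3)   # resolutions are now consistent
    return [a + b for a, b in zip(f2_up, f3)]   # step S214: generate target

f2 = [1.0, 2.0]                    # low-frequency (second) feature
f3 = [0.5, 0.5, 0.5, 0.5]          # high-frequency (third) feature, 2x resolution
assert generate_target(f2, f3) == [1.5, 1.5, 2.5, 2.5]
```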
In one embodiment, extracting the first image feature from at least two original images includes: extracting sub-image features of each original image from at least two original images; and fusing at least two sub-image features to obtain a first image feature.
Optionally, the electronic device extracts sub-image features of each original image from at least two original images through a multi-layer convolution layer, and fuses the at least two sub-image features through the convolution layer or a self-attention mechanism to obtain the first image feature.
In this embodiment, the electronic device extracts sub-image features of each original image from at least two original images, and may fuse at least two sub-image features by using complementary information between at least two sub-image features to obtain a first image feature with more image information.
In one embodiment, as shown in fig. 3, another image processing method is provided, including the steps of:
Step S302, a motion area mask of each non-reference frame except the reference frame in at least two original images is acquired.
Optionally, the electronic device determines a reference frame and a non-reference frame other than the reference frame from at least two original images; the motion regions of the respective non-reference frames are detected, and a motion region mask for each non-reference frame is generated.
Step S304, extracting the sub-image features of each original image based on the reference frame, the non-reference frames and the corresponding motion region masks.
Optionally, the electronic device does not need to perform motion detection on the reference frame; that is, the motion region mask of the reference frame is an all-black image. The electronic device concatenates the reference frame with its motion region mask in the channel dimension, and concatenates each non-reference frame with its corresponding motion region mask in the channel dimension, obtaining a concatenated reference frame and concatenated non-reference frames; the sub-image features of each original image are then extracted from the concatenated reference frame and concatenated non-reference frames through a feature extraction network. The feature extraction network may comprise a plurality of convolution layers. Illustratively, when a RAW-domain original image has 4 channels and the motion region mask is concatenated with it in the channel dimension, the resulting new RAW-domain image has 5 channels: the original 4 channels plus the motion region mask.
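The 4-channel-plus-mask example can be sketched directly. This is an illustrative assumption about data layout (channel-first nested lists); the patent does not prescribe a representation:

```python
def concat_mask_channel(raw_channels, motion_mask):
    """Concatenate a motion-region mask to a RAW image along the channel axis.
    raw_channels: list of H x W planes (e.g. the 4 Bayer sub-channels R, Gr, Gb, B);
    motion_mask:  one H x W plane (all zeros for the reference frame)."""
    h, w = len(raw_channels[0]), len(raw_channels[0][0])
    assert len(motion_mask) == h and len(motion_mask[0]) == w
    return raw_channels + [motion_mask]
```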
Optionally, the electronic device may also input the reference frame and the concatenated non-reference frame into a feature extraction network for feature extraction.
It will be appreciated that concatenating the image and the motion region mask in the channel dimension means that the motion region mask is treated as an additional channel of the image.
And step S306, fusing at least two sub-image features to obtain a first image feature.
Step S308, feature mapping is carried out on noise information of reference frames in at least two original images and the first image features, and mapping features are obtained.
Step S310, sequentially performing downsampling and upsampling on the mapping characteristics to obtain upsampling characteristics; the resolution of the upsampled features is the same as the resolution of the mapped features.
Step S312, obtaining the denoising image feature based on the upsampling feature and the mapping feature.
Step S314, up-sampling and residual processing are carried out on the denoised image characteristics to obtain second image characteristics; the frequency of the second image feature is below a preset frequency threshold.
Step S316, residual processing is carried out on the second image feature, and residual processing features are obtained.
Step S318, up-sampling the residual processing feature with a preset multiplying power to obtain a third image feature; the frequency of the third image feature is greater than or equal to a preset frequency threshold.
Step S320, up-sampling the second image feature with a preset multiplying power to obtain an up-sampled second image feature.
Step S322, generating a target image according to the up-sampled second image feature and third image feature.
In this embodiment, the electronic device may acquire the motion region mask of each non-reference frame other than the reference frame among the at least two original images, and determine the motion-region positions in each image based on the reference frame, the non-reference frames and the corresponding motion region masks, so that the required sub-image features of each original image can be extracted more accurately with respect to motion and non-motion regions, improving the accuracy of image processing.
Moreover, the motion region mask is used to increase the degree of noise reduction in the motion region. It will be appreciated that a joint denoising, demosaicing and super-resolution network tends to use multi-frame information for denoising: since noise is typically zero-mean, the more frames are averaged, the stronger the denoising effect. However, the motion region in each non-reference frame has been replaced with the motion region of the reference frame, meaning that the motion region effectively uses only one frame of information; the joint denoising, demosaicing and super-resolution network can therefore implicitly increase the denoising strength in the motion region and avoid stitching artifacts.
In one embodiment, the method for determining noise information of the reference frame includes: determining shot noise and readout noise corresponding to a reference frame based on shooting parameters of the reference frame; generating a target noise map of the reference frame based on shot noise and readout noise corresponding to the reference frame; the target noise map contains noise information for the reference frame.
Shot noise (shot noise) is noise caused by the non-uniformity of electron emission in active devices such as electron vacuum tubes in communication apparatus. Readout noise (read noise) is noise generated electronically while transferring the charge of each pixel out of the sensor and converting it into a digital value; it combines all noise generated by the system components, such as charge-transfer noise, sense-amplifier reset noise, analog-to-digital conversion quantization noise, and noise caused by crosstalk between the row-transfer clock and the horizontal-register drive clock.
The shooting parameters of the reference frame may include the sensitivity (ISO), and may further include information such as the shutter duration or the aperture value.
Optionally, the electronic device inputs the shooting parameters of the reference frame into a noise model, which outputs the shot noise and readout noise corresponding to those parameters. The noise model may be a Gaussian-Poisson noise model. It will be appreciated that the noise model is obtained by calibrating the image sensor in advance, since different digital gain settings during shooting affect the noise intensity of the captured image.
Optionally, generating a target noise map of the reference frame based on shot noise and readout noise corresponding to the reference frame includes: multiplying each pixel in the reference frame by shot noise respectively to obtain an intermediate noise diagram; and adding the readout noise to the intermediate noise map to generate a target noise map of the reference frame.
The electronic equipment multiplies each pixel in the reference frame by the noise value of shot noise respectively to obtain an intermediate noise diagram; and adding the noise value of the readout noise to each pixel in the intermediate noise map to generate a target noise map of the reference frame. Wherein each pixel value in the target noise map represents noise information of a pixel at a corresponding position of the reference frame.
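The two-step construction just described (multiply each pixel by the shot-noise value, then add the read-noise value) can be sketched as follows; the function name and scalar noise values are illustrative assumptions:

```python
def target_noise_map(reference_frame, shot_noise, read_noise):
    """Per-pixel noise map of the reference frame: each pixel is multiplied by
    the shot-noise value (intermediate noise map), then the read-noise value is
    added; each output element is the noise information of the matching pixel."""
    return [
        [pixel * shot_noise + read_noise for pixel in row]
        for row in reference_frame
    ]
```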
In this embodiment, the electronic device determines shot noise and readout noise corresponding to the reference frame based on the shooting parameters of the reference frame, and can accurately generate a target noise map of the reference frame based on the shot noise and readout noise corresponding to the reference frame, so as to obtain noise information of the reference frame from the target noise map.
In one embodiment, as shown in fig. 4, the electronic device acquires at least two original images (low-quality input images) and a motion region mask for each original image, and concatenates each original image with its motion region mask in the channel dimension to obtain a new original image. It extracts the sub-image features of each of the at least two new original images, and fuses the sub-image features through a convolution layer or a self-attention mechanism to obtain a first image feature. Based on the shooting parameters of the reference frame among the at least two original images, it determines the corresponding shot noise and readout noise, and generates a target noise map of the reference frame. The target noise map and the first image feature are input into a JDD (Joint Demosaicing and Denoising) module, which performs denoising and demosaicing on the first image feature based on the target noise map of the reference frame to obtain a second image feature; the JDD module comprises a UResNet network structure and 2x upsampling. The second image feature is upsampled at the preset magnification to obtain an upsampled second image feature, which is the low-frequency component. The second image feature is also input into an SR (Super-Resolution) module for super-resolution processing, yielding a third image feature, which is the high-frequency component; the super-resolution processing includes upsampling at the preset magnification. Finally, the upsampled second image feature and the third image feature are added to generate the target image.
In one embodiment, as shown in fig. 5, another image processing method is provided, including the steps of:
Step S502, motion detection is carried out on non-reference frames except for the reference frames in at least two original images, and a motion area of each non-reference frame is determined.
Optionally, the electronic device performs motion detection on non-reference frames except the reference frame in at least two original images by using a motion detection algorithm, and determines a motion area of each non-reference frame.
Optionally, the electronic device performs difference processing on non-reference frames except the reference frame in at least two original images and the reference frame respectively to obtain a difference image corresponding to each non-reference frame; and determining the motion area of each non-reference frame from the difference image corresponding to each non-reference frame.
For the difference image corresponding to each non-reference frame, the electronic device may compare each pixel value in the difference image with a preset threshold, treat pixels whose value exceeds the threshold as motion pixels, and form the motion pixels into the motion region of the non-reference frame. The preset threshold can be set as required.
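The thresholding of the difference image can be sketched as follows (a minimal illustration; the function name and binary 0/1 mask encoding are assumptions made here):

```python
def motion_mask(reference, non_reference, threshold):
    """Binary motion-region mask: 1 where the absolute difference between the
    non-reference frame and the reference frame exceeds the preset threshold."""
    return [
        [1 if abs(p - q) > threshold else 0 for p, q in zip(ref_row, non_row)]
        for ref_row, non_row in zip(reference, non_reference)
    ]
```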
Optionally, in order to improve the image processing efficiency, the electronic device may downsample at least two original images, perform motion detection on non-reference frames except for the reference frame in the downsampled at least two original images, and determine a motion area of each non-reference frame.
Step S504, based on the image information of the reference frames, the motion area of each non-reference frame is updated to obtain updated non-reference frames.
Alternatively, the electronic device may update the motion region of each non-reference frame based on the image information of the reference frame corresponding to the motion region position of the non-reference frame, resulting in an updated non-reference frame.
In an alternative embodiment, for each non-reference frame, the motion area of the non-reference frame is replaced with the image information corresponding to the position of the motion area of the non-reference frame in the reference frame, so as to obtain an updated non-reference frame.
In another alternative embodiment, for each non-reference frame, the image information in the reference frame corresponding to the position of the motion region of the non-reference frame is overlaid on the motion region of the non-reference frame to obtain an updated non-reference frame.
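Both alternative embodiments amount to the same per-pixel operation: wherever the mask marks motion, take the reference-frame pixel, otherwise keep the non-reference-frame pixel. A minimal sketch (function name assumed):

```python
def replace_motion_region(non_reference, reference, mask):
    """Overlay reference-frame pixels onto the masked motion region of a
    (registered) non-reference frame; unmasked pixels are kept unchanged."""
    return [
        [ref if m else non for non, ref, m in zip(n_row, r_row, m_row)]
        for n_row, r_row, m_row in zip(non_reference, reference, mask)
    ]
```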
Step S506, taking the reference frame and the updated non-reference frames as at least two new original images, and performing denoising processing and demosaicing processing on the first image feature extracted from the at least two new original images, based on the noise information of the reference frame among the at least two new original images, to obtain a second image feature; the frequency of the second image feature is below a preset frequency threshold.
The electronic equipment performs denoising processing and demosaicing processing on the first image features extracted from the new at least two original images based on noise information of reference frames in the new at least two original images to obtain second image features.
Step S508, performing super-resolution processing on the second image feature to obtain a third image feature; the frequency of the third image feature is greater than or equal to a preset frequency threshold.
Step S510, generating a target image according to the second image feature and the third image feature.
It can be understood that, because there is an interval between the shooting moments of the at least two original images, the photographed subject may move of its own accord; when the motion is large, multi-frame registration can fail in the motion region, so problems such as motion ghosting can appear after the multi-frame images are fused.
In this embodiment, the electronic device performs motion detection on non-reference frames except for the reference frame in at least two original images, and determines a motion area of each non-reference frame; based on the image information of the reference frames, the motion area of each non-reference frame is updated to obtain updated non-reference frames, so that the problems of ghosting and the like after image fusion can be avoided, larger displacement among multiple frames is eliminated, and the accuracy of image processing is improved.
In one embodiment, as shown in fig. 6, another image processing method is provided, including the steps of:
Step S602, registering non-reference frames except the reference frames in at least two original images to the reference frames to obtain registered non-reference frames.
Optionally, the electronic device registers the non-reference frame to the reference frame, which may specifically include operations of displacement, interpolation, and the like, and may also include operations of brightness alignment, image block alignment, and the like.
Optionally, registering the non-reference frames other than the reference frame among the at least two original images to the reference frame to obtain registered non-reference frames includes: determining a motion transformation relationship between the reference frame and each non-reference frame based on the reference frame and the non-reference frames other than the reference frame among the at least two original images; and registering each non-reference frame to the reference frame based on its corresponding motion transformation relationship to obtain the registered non-reference frames.
Wherein the motion transformation relationship includes at least one of an affine transformation matrix and a feature point optical flow.
Optionally, in order to improve image processing efficiency, the electronic device partitions the reference frame and the non-reference frames other than the reference frame among the at least two original images into image blocks, and performs corner detection on each image block to obtain the reference feature points of each reference-frame block and the non-reference feature points of each non-reference-frame block. For each image block in each non-reference frame, the electronic device determines the motion transformation relationship between the non-reference feature points in that block and the reference feature points of the block at the corresponding position in the reference frame, and registers the non-reference-frame block to the corresponding reference-frame block according to this relationship. The registered non-reference frame is then assembled from the registered image blocks.
Optionally, for each image block in each non-reference frame, the electronic device determines an affine transformation matrix or a feature-point optical flow between the non-reference feature points in the block and the reference feature points of the block at the corresponding position in the reference frame, and warps the non-reference-frame block with the affine transformation matrix or optical flow to obtain a registered image block that is accurately aligned with the corresponding position in the reference frame.
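The core of affine registration is mapping coordinates through a 2x3 matrix; how the matrix is estimated from feature-point pairs is outside this sketch. The function name and matrix layout below are illustrative assumptions:

```python
def apply_affine(points, matrix):
    """Map (x, y) feature points through a 2x3 affine matrix
    [[a, b, tx], [c, d, ty]]: x' = a*x + b*y + tx, y' = c*x + d*y + ty.
    Warping an image block applies the same mapping to every pixel coordinate."""
    (a, b, tx), (c, d, ty) = matrix
    return [(a * x + b * y + tx, c * x + d * y + ty) for x, y in points]
```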
Optionally, the electronic device may perform smoothing filtering processing on the adjacent image blocks, and then splice each image block after the smoothing filtering processing to obtain a registered non-reference frame, so that a transition region between the adjacent image blocks in the registered non-reference frame is more natural.
Optionally, each adjacent image block obtained by the electronic device during dicing has coincident pixels, and the coincident pixels of the adjacent image blocks are fused to obtain a non-reference frame after registration, so that a transition region between the adjacent image blocks in the non-reference frame after registration is more natural.
In step S604, motion detection is performed on non-reference frames except for the reference frame in at least two original images, and a motion area of each non-reference frame is determined.
Step S606 updates the motion area of each registered non-reference frame based on the image information of the reference frame, resulting in an updated non-reference frame.
For each non-reference frame, the electronic device replaces the motion region of the registered non-reference frame with the image information at the corresponding position in the reference frame, obtaining an updated non-reference frame.
Step S608, taking the reference frame and the updated non-reference frames as at least two new original images, and performing denoising processing and demosaicing processing on the first image feature extracted from the at least two new original images, based on the noise information of the reference frame among the at least two new original images, to obtain a second image feature; the frequency of the second image feature is below a preset frequency threshold.
Step S610, performing super-resolution processing on the second image feature to obtain a third image feature; the frequency of the third image feature is greater than or equal to a preset frequency threshold.
Step S612, generating a target image according to the second image feature and the third image feature.
It can be understood that in the process of capturing at least two original images, there may be a global or local displacement between the original images, caused for example by movement of the electronic device itself or by local movement of the photographed subject within a short time. The electronic device therefore registers the non-reference frames among the at least two original images to the reference frame, so that the multiple image features can later be fused more accurately, improving the accuracy of image processing.
By using a high-precision, efficient registration and alignment module, the electronic device can recover details that a traditional single-frame algorithm struggles to obtain, through sub-pixel-level complementary information among multiple frames; at the same time, the effective use of multi-frame information greatly improves the denoising capability. In addition, for multi-frame input, the high-precision motion-region detection module and the replacement of non-reference-frame motion regions with reference-frame motion regions effectively avoid ghosting caused by movement of the photographed subject.
In one embodiment, determining a reference frame from at least two original images includes: determining the definition of each of at least two original images; a reference frame is determined from at least two original images based on the sharpness of each original image.
Optionally, the electronic device determines an original image with the highest definition from at least two original images as the reference frame.
Optionally, the electronic device determines a next highest definition original image from the at least two original images as the reference frame.
It should be noted that, the manner of determining the reference frame from at least two original images is not limited.
Optionally, determining the sharpness of each of the at least two original images includes: for each RAW-domain original image, averaging the green channels of the RAW-domain original image to generate a grayscale map; computing a Difference-of-Gaussians response on the grayscale map; and determining the sharpness of the original image based on that response.
Optionally, for each RAW-domain original image, the electronic device averages the 2 green channels of the RAW-domain original image to generate a grayscale map, computes a Difference-of-Gaussians (DoG) response on the grayscale map, and determines the sharpness of the original image based on that response.
Optionally, determining the sharpness of the original image based on the Gaussian difference operator includes: averaging all elements of the DoG response to obtain the sharpness of the original image. The parameters of the operator, such as the Gaussian kernel size and variance, are obtained by analyzing the statistics of the image data.
In the Gaussian difference operator, parameters such as the Gaussian kernel size and variance are fixed after being tuned to the data distribution; that is, the overall distribution of these parameters is measured, a set of kernel sizes and variances is chosen to fit that distribution, and the parameters are then kept fixed.
Optionally, the electronic device may use the mean value of the DoG response directly as the sharpness of the original image, or treat it as a sharpness score and rank the original images by their scores. The DoG response is a matrix with the same height and width as the image; averaging all of its elements yields the sharpness score, and the original image with the highest score is the sharpest.
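The sharpness computation can be sketched end to end: average the two green planes into a grayscale map, blur it at two scales, and average the DoG response. Taking the absolute value before averaging is an assumption made here (so positive and negative responses do not cancel), as are the function names and the default sigmas:

```python
import math

def green_gray(gr_plane, gb_plane):
    """Grayscale proxy for a RAW frame: average of the two Bayer green planes."""
    return [[(a + b) / 2 for a, b in zip(r1, r2)]
            for r1, r2 in zip(gr_plane, gb_plane)]

def gaussian_kernel(sigma):
    radius = max(1, int(2 * sigma))
    k = [math.exp(-(i * i) / (2 * sigma * sigma)) for i in range(-radius, radius + 1)]
    s = sum(k)
    return [v / s for v in k]

def blur_1d(row, kernel):
    r = len(kernel) // 2
    n = len(row)
    return [sum(kernel[j + r] * row[min(max(i + j, 0), n - 1)]  # clamp borders
                for j in range(-r, r + 1))
            for i in range(n)]

def gaussian_blur(img, sigma):
    k = gaussian_kernel(sigma)
    rows = [blur_1d(row, k) for row in img]               # horizontal pass
    cols = [blur_1d(list(col), k) for col in zip(*rows)]  # vertical pass
    return [list(row) for row in zip(*cols)]

def dog_sharpness(gray, sigma1=1.0, sigma2=2.0):
    """Mean absolute Difference-of-Gaussians response as a sharpness score."""
    b1 = gaussian_blur(gray, sigma1)
    b2 = gaussian_blur(gray, sigma2)
    n = len(gray) * len(gray[0])
    return sum(abs(p - q) for r1, r2 in zip(b1, b2)
               for p, q in zip(r1, r2)) / n
```

A flat image scores zero and a sharp edge scores higher, which matches the intended ranking behavior.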
It can be understood that, in a RAW-domain original image, the green channel has the highest sampling rate and carries the most information, and human eyes are most sensitive to green; averaging the 2 green channels of the RAW-domain original image therefore yields a grayscale map with more information, so the sharpness of the original image can be determined more accurately.
By acquiring RAW-domain original images, the electronic device preserves the image signal received by the image sensor to the greatest extent, avoiding the damage to image detail, structural, and color/brightness information caused by traditional upstream algorithms such as conventional demosaicing, denoising, and tone mapping, and also avoiding the information loss caused by dynamic-range compression. Compared with traditional YUV-domain or RGB-domain images, this provides greater latitude and a better visual effect.
In one embodiment, as shown in fig. 7, the electronic device captures images through the image sensor of the lens module and dumps at least two original images; computes the sharpness of each original image, sorts by sharpness, and discards the low-sharpness original images; determines a reference frame and non-reference frames from the remaining original images; partitions the reference frame and non-reference frames into blocks, extracts feature points, and computes the optical flow or affine transformation matrix between the reference frame and each non-reference frame; registers the non-reference frames to the reference frame based on the optical flow or affine transformation matrix; detects the motion region of each non-reference frame and determines its motion region mask; and replaces the motion regions of the registered non-reference frames with reference-frame pixels based on the motion region masks, obtaining the multi-frame images to be input to the network. The lens module may be a telephoto module, and the network refers to the joint denoising, demosaicing and super-resolution network.
In one embodiment, the method further comprises: mapping the second image feature to the RGB domain and mapping the third image feature to the RGB domain; generating a target image from the second image feature and the third image feature, comprising: and adding the second image features of the RGB domain and the third image features of the RGB domain to generate a target image.
Optionally, the electronic device inputs the second image feature into two convolution layers, and maps the second image feature to the RGB domain through the two convolution layers to obtain the second image feature of the RGB domain; inputting the third image features into two convolution layers, and mapping the third image features to an RGB domain through the two convolution layers to obtain third image features of the RGB domain; and adding the second image features of the RGB domain and the third image features of the RGB domain to generate a target image of the RGB domain.
In this embodiment, the electronic device maps both the second image feature and the third image feature to the RGB domain, so that the target image in the RGB domain can be accurately generated.
In one embodiment, the method further comprises: in the case of locking the shooting parameters, at least two original images are shot.
Optionally, the shooting parameters include an auto-exposure parameter (AE), an auto-focus parameter (AF), an auto-white-balance parameter (AWB), and the like. The shooting parameters may also include, without limitation, the sensitivity, the shutter duration, and so on.
Optionally, the electronic device continuously shoots at least two original images for the same shooting scene under the condition of locking shooting parameters.
It will be appreciated that the electronic device stores the original image frames in a queue during continuous shooting; when the user triggers the shutter, at least two of the most recent original images are taken from the queue.
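The frame queue described here behaves like a bounded ring buffer. A minimal sketch under assumed names (`FrameQueue`, `grab`) using a `deque` with a maximum length:

```python
from collections import deque

class FrameQueue:
    """Ring buffer of the most recent original frames during continuous
    shooting; when the shutter fires, the latest `count` frames become
    the burst input."""
    def __init__(self, capacity):
        self.frames = deque(maxlen=capacity)  # oldest frames are dropped

    def push(self, frame):
        self.frames.append(frame)

    def grab(self, count):
        if count > len(self.frames):
            raise ValueError("not enough buffered frames")
        return list(self.frames)[-count:]
```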
Optionally, to avoid a large displacement caused by the movement of the electronic device during the shooting process or the movement of the shooting object, the electronic device may control the shutter time period to be less than or equal to the preset shutter time period. The preset shutter time length can be set according to requirements, and the preset shutter time length is the maximum shutter time length of the electronic equipment in the shooting process.
In this embodiment, when the electronic device captures at least two original images under the condition of locking the capturing parameters, the captured at least two original images can maintain consistency of image features such as brightness and color, so as to improve accuracy of subsequent image processing.
In one embodiment, as shown in fig. 8, the electronic device captures and dumps a plurality of RAW images, say N of them; selects frames from the RAW images, i.e. a reference frame and non-reference frames: the electronic device may compute the sharpness of each RAW image and, by sharpness ranking, select the sharpest image as the reference frame together with several other non-reference frames (the total being no more than N); registers the non-reference frames to the reference frame to obtain registered RAW images, comprising the reference frame and the registered non-reference frames; computes the motion region mask of each non-reference frame relative to the reference frame; after replacing the motion region of each non-reference frame with the image information at the corresponding position of the reference frame according to the motion region mask, concatenates each image with its motion region mask and feeds them into a feature extraction network to extract the features of each image, obtaining the first image feature; and obtains the target noise map of the reference frame from a pre-calibrated sensor noise model, concatenates it with the first image feature, and inputs the result into the joint denoising, demosaicing and super-resolution network to finally obtain the target image. The target image may then be passed to a subsequent image processing engine.
In one embodiment, according to the optimization requirements for image detail, smearing and the like in different scenes, the electronic device can also support sharpening the reference frame and adjusting the noise map during texture tuning, thereby controlling the denoising strength and operations such as overlaying gray grain noise on the output of the JDD module.
Optionally, the electronic device sharpens the reference frame among the input RAW-domain original images, so that more weak texture is preserved when the joint denoising, demosaicing and super-resolution network processes the images, reducing the smeared look while avoiding artifacts such as black and white edges that sharpening in the RGB or YUV domain would cause.
Optionally, the denoising strength of the joint denoising, demosaicing and super-resolution network is influenced by the target noise map of the reference frame. The denoising strength can therefore be controlled by adjusting the target noise map fed into the network at inference time, striking a balance between smearing and noise.
In addition, the joint denoising, demosaicing and super-resolution network can adjust the denoising strength globally or locally. For example, for an image shot in a night scene, the electronic device can make the network raise the denoising strength globally; for an image shot in daytime, the electronic device can make the network identify the dark regions of the image and raise the denoising strength only in those regions.
Optionally, due to the characteristics of human visual perception, granular gray noise can enhance the perceived quality of textured areas. The electronic device can decouple the JDD module from the SR (super-resolution) module, so that noise resembling the original image noise is overlaid on the output of the JDD module, reducing smearing.
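The gray-noise overlay could look like the following sketch (the grain strength and the use of a normal distribution are assumptions, not the patent's calibration; "gray" means the same grain value is applied to every color channel):

```python
import numpy as np

def overlay_grain(jdd_output, strength=0.01, seed=0):
    # add zero-mean monochrome grain so textured areas keep a natural look
    rng = np.random.default_rng(seed)
    grain = rng.normal(0.0, strength, jdd_output.shape[:2])
    if jdd_output.ndim == 3:
        grain = grain[..., None]   # same grain on every channel -> gray noise
    return jdd_output + grain
```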
In one embodiment, there is also provided an image processing method including the steps of:
And step A1, shooting to obtain at least two original images under the condition of locking shooting parameters.
Step A2, for each RAW domain original image, averaging the green channels in the RAW domain original image to generate a grayscale map; extracting a Gaussian difference operator from the grayscale map; and averaging all elements included in the Gaussian difference operator to obtain the sharpness of the original image.
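Step A2 can be sketched in numpy as follows (assuming an RGGB Bayer layout and hand-picked Gaussian sigmas, neither of which is specified by the text):

```python
import numpy as np

def _gauss1d(sigma):
    r = int(3 * sigma) + 1
    x = np.arange(-r, r + 1)
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    return k / k.sum()

def _blur(img, sigma):
    # separable Gaussian blur with edge padding, output size == input size
    k = _gauss1d(sigma)
    pad = len(k) // 2
    conv = lambda v: np.convolve(np.pad(v, pad, mode='edge'), k, 'valid')
    return np.apply_along_axis(conv, 0, np.apply_along_axis(conv, 1, img))

def raw_sharpness(raw):
    # RGGB Bayer assumed: average the two green samples of each 2x2 cell
    gray = (raw[0::2, 1::2] + raw[1::2, 0::2]) / 2.0
    dog = _blur(gray, 1.0) - _blur(gray, 2.0)   # difference of Gaussians
    return float(np.abs(dog).mean())            # mean over all elements
```

A flat frame yields (near-)zero sharpness, while a frame with texture yields a strictly higher score, which is what the reference-frame selection in step A3 relies on.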
And step A3, determining a reference frame from at least two original images based on the definition of each original image.
Step A4, determining a motion transformation relationship between the reference frame and each non-reference frame based on the reference frame and the non-reference frames other than the reference frame in the at least two original images; registering each non-reference frame to the reference frame based on its corresponding motion transformation relationship, to obtain registered non-reference frames; the motion transformation relationship includes at least one of an affine transformation matrix and a feature point optical flow.
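Applying an affine motion transformation to register a non-reference frame can be sketched as below (numpy-only, nearest-neighbour sampling; estimating the matrix itself, e.g. from matched feature points or optical flow, is elided):

```python
import numpy as np

def warp_affine(img, M):
    """Warp `img` by the 2x3 affine matrix M via inverse nearest-neighbour mapping."""
    h, w = img.shape
    inv = np.linalg.inv(np.vstack([M, [0.0, 0.0, 1.0]]))
    ys, xs = np.mgrid[0:h, 0:w]
    src = inv @ np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    sx = np.clip(np.round(src[0]).astype(int), 0, w - 1)
    sy = np.clip(np.round(src[1]).astype(int), 0, h - 1)
    return img[sy, sx].reshape(h, w)

def register(non_ref, M):
    # apply the estimated non-reference -> reference transform
    return warp_affine(non_ref, M)
```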
And step A5, performing motion detection on non-reference frames except the reference frames in at least two original images, and determining a motion area of each non-reference frame.
And step A6, for each non-reference frame, replacing the motion region of the registered non-reference frame with the image information at the position in the reference frame corresponding to that motion region, to obtain an updated non-reference frame.
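The motion-region replacement of step A6 reduces to a masked selection (a minimal sketch; the convention that 1 marks a moving pixel is an assumption):

```python
import numpy as np

def replace_motion_region(registered_non_ref, ref, motion_mask):
    # motion_mask == 1 marks pixels that moved; take reference content there,
    # keep the registered non-reference frame's content everywhere else
    return np.where(motion_mask.astype(bool), ref, registered_non_ref)
```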
Step A7, taking the reference frame and the updated non-reference frame as at least two new original images, and acquiring a motion area mask of each non-reference frame except the reference frame in the at least two new original images; sub-image features of each original image are extracted based on the reference frames, the non-reference frames, and the corresponding motion region masks.
And step A8, fusing at least two sub-image features to obtain a first image feature.
Step A9, determining shot noise and readout noise corresponding to the reference frame based on shooting parameters of the reference frame; multiplying each pixel in the reference frame by shot noise respectively to obtain an intermediate noise diagram; adding readout noise to the intermediate noise map to generate a target noise map of the reference frame; the target noise map contains noise information for the reference frame.
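Step A9 can be sketched as follows (the ISO-to-noise calibration coefficients here are invented for illustration; a real device would substitute its pre-calibrated sensor noise model):

```python
import numpy as np

def noise_params(iso, shot_coef=1e-4, read_coef=1e-6):
    # hypothetical calibration: both noise terms grow with sensor gain (ISO)
    gain = iso / 100.0
    return shot_coef * gain, read_coef * gain ** 2

def target_noise_map(ref, iso):
    shot, read = noise_params(iso)
    # multiply each pixel by the shot noise, then add the readout noise
    return ref * shot + read
```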
And step A10, performing feature mapping on the target noise map of the reference frame and the first image features to obtain mapped features.
Step A11, sequentially performing downsampling and upsampling on the mapping characteristics to obtain upsampling characteristics; the resolution of the up-sampled features is the same as the resolution of the mapped features; based on the up-sampling feature and the mapping feature, obtaining a denoising image feature; up-sampling and residual processing are carried out on the denoising image features to obtain second image features; the frequency of the second image feature is lower than a preset frequency threshold; the second image feature is mapped to the RGB domain.
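The down-sample/up-sample structure with a full-resolution skip connection in step A11 can be illustrated without any learned weights (average pooling and nearest-neighbour upsampling stand in for the network layers; the real network learns these operators):

```python
import numpy as np

def downsample2(x):
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))  # 2x average pool

def upsample2(x):
    return x.repeat(2, axis=0).repeat(2, axis=1)              # nearest-neighbour

def denoise_block(mapped):
    # down- then up-sample back to the mapped feature's resolution, and
    # combine with the full-resolution feature (a skip connection), so the
    # upsampled features have the same resolution as the mapped features
    up = upsample2(downsample2(mapped))
    return mapped + up
```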
Step A12, carrying out residual processing on the second image feature to obtain a residual processing feature; up-sampling the residual processing features at a preset multiplying power to obtain third image features, and mapping the third image features to an RGB domain; the frequency of the third image feature is greater than or equal to a preset frequency threshold.
Step A13, up-sampling the second image features of the RGB domain by a preset multiplying power to obtain up-sampled second image features of the RGB domain; and adding the up-sampled second image features of the RGB domain and the third image features of the RGB domain to generate a target image.
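The final fusion of step A13, upsampling the low-frequency RGB branch by the preset magnification and adding the high-frequency RGB branch, can be sketched as follows (nearest-neighbour upsampling is an assumption; the network may use a learned upsampler):

```python
import numpy as np

def fuse(low_freq_rgb, high_freq_rgb, scale=2):
    # upsample the low-frequency branch to the super-resolved size, then add
    up = low_freq_rgb.repeat(scale, axis=0).repeat(scale, axis=1)
    assert up.shape == high_freq_rgb.shape
    return up + high_freq_rgb
```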
It should be understood that, although the steps in the flowcharts involved in the above embodiments are shown in sequence as indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the execution of these steps is not strictly limited in order, and they may be performed in other orders. Moreover, at least some of the steps in the flowcharts involved in the above embodiments may include multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different moments; their execution order is also not necessarily sequential, and they may be performed in turn or alternately with at least part of the other steps or of the sub-steps or stages of the other steps.
Based on the same inventive concept, an embodiment of the application also provides an image processing apparatus for implementing the above image processing method. The implementation of the solution provided by the apparatus is similar to that described in the above method, so for the specific limitations of one or more embodiments of the image processing apparatus provided below, reference may be made to the limitations of the image processing method above, which are not repeated here.
In one embodiment, as shown in fig. 9, there is provided an image processing apparatus including: a first processing module 902, a second processing module 904, and an image generation module 906, wherein:
the first processing module 902 is configured to perform denoising processing and demosaicing processing on first image features extracted from at least two original images based on noise information of reference frames in at least two original images, so as to obtain second image features; the frequency of the second image feature is below a preset frequency threshold.
The second processing module 904 is configured to perform super-resolution processing on the second image feature to obtain a third image feature; the frequency of the third image feature is greater than or equal to a preset frequency threshold.
An image generation module 906 for generating a target image based on the second image feature and the third image feature.
The above image processing apparatus performs denoising processing and demosaicing processing on the first image features extracted from at least two original images, based on the noise information of the reference frame in the at least two original images, to obtain second image features whose frequency is lower than a preset frequency threshold; and performs super-resolution processing on the second image features to obtain third image features whose frequency is higher than or equal to the preset frequency threshold. That is, information such as the low-frequency color profile of the image can be obtained more accurately through the denoising and demosaicing processing, and information such as the high-frequency texture details of the image can be obtained more accurately through the super-resolution processing, so that a denoised, demosaiced and super-resolved target image is generated according to the second image features and the third image features, improving the accuracy of image processing.
In one embodiment, the first processing module 902 is further configured to perform denoising processing on the first image feature extracted from the at least two original images based on noise information of the reference frames in the at least two original images, so as to obtain a denoised image feature; and performing demosaicing processing on the denoised image characteristics to obtain second image characteristics.
In one embodiment, the first processing module 902 is further configured to perform feature mapping on noise information of a reference frame in at least two original images and first image features extracted from at least two original images to obtain mapped features; and denoising the mapping characteristics to obtain denoised image characteristics.
In one embodiment, the first processing module 902 is further configured to sequentially downsample and upsample the mapping feature to obtain an upsampled feature; the resolution of the up-sampled features is the same as the resolution of the mapped features; and obtaining the denoising image characteristic based on the upsampling characteristic and the mapping characteristic.
In one embodiment, the first processing module 902 is further configured to perform upsampling and residual processing on the denoised image feature to obtain a second image feature.
In one embodiment, the second processing module 904 is further configured to perform residual processing on the second image feature to obtain a residual processing feature; and upsampling the residual processing feature to obtain a third image feature.
In an embodiment, the second processing module 904 is further configured to perform up-sampling of a preset multiplying power on the residual processing feature to obtain a third image feature; the first processing module 902 is further configured to perform upsampling of a preset magnification on the second image feature, to obtain an upsampled second image feature; the image generating module 906 is further configured to generate a target image according to the second image feature and the third image feature after upsampling.
In one embodiment, the first processing module 902 is further configured to extract sub-image features of each original image from at least two original images; and fusing at least two sub-image features to obtain a first image feature.
In one embodiment, the first processing module 902 is further configured to obtain a motion region mask of each non-reference frame except the reference frame in at least two original images; sub-image features of each original image are extracted based on the reference frames, the non-reference frames, and the corresponding motion region masks.
In one embodiment, the first processing module 902 is further configured to determine shot noise and readout noise corresponding to the reference frame based on a shooting parameter of the reference frame; generating a target noise map of the reference frame based on shot noise and readout noise corresponding to the reference frame; the target noise map contains noise information for the reference frame.
In one embodiment, the first processing module 902 is further configured to multiply each pixel in the reference frame by shot noise to obtain an intermediate noise map; and adding the readout noise to the intermediate noise map to generate a target noise map of the reference frame.
In one embodiment, the apparatus further comprises a motion detection module. The motion detection module is configured to perform motion detection on the non-reference frames other than the reference frame in the at least two original images and determine a motion region of each non-reference frame; update the motion region of each non-reference frame based on the image information of the reference frame to obtain updated non-reference frames; and take the reference frame and the updated non-reference frames as at least two new original images, from which the first processing module 902 extracts the first image features for the denoising processing and demosaicing processing.
In one embodiment, the motion detection module is further configured to replace, for each non-reference frame, a motion region of the non-reference frame with image information corresponding to a position of the motion region of the non-reference frame in the reference frame, to obtain an updated non-reference frame.
In one embodiment, the apparatus further comprises a registration module; the registration module is used for registering non-reference frames except the reference frames in at least two original images to the reference frames to obtain registered non-reference frames; the motion detection module is further configured to update a motion region of each registered non-reference frame based on image information of the reference frame, to obtain an updated non-reference frame.
In one embodiment, the registration module is further configured to determine a motion transformation relationship between the reference frame and each of the non-reference frames based on the reference frame and the non-reference frames other than the reference frame in the at least two original images; and register each non-reference frame to the reference frame based on its corresponding motion transformation relationship, to obtain registered non-reference frames.
In one embodiment, the motion transformation relationship includes at least one of an affine transformation matrix and a feature point optical flow.
In one embodiment, the apparatus further comprises a reference frame determination module; the reference frame determining module is used for determining the definition of each original image in at least two original images; a reference frame is determined from at least two original images based on the sharpness of each original image.
In one embodiment, the reference frame determining module is further configured to perform an average process on a green channel in an original image of the RAW domain for each original image of the RAW domain, so as to generate a gray scale map; extracting a Gaussian difference operator from the gray level map; based on the Gaussian difference operator, the sharpness of the original image is determined.
In one embodiment, the reference frame determining module is further configured to average each element included in the gaussian difference operator to obtain sharpness of the original image.
In one embodiment, the first processing module 902 is further configured to map a second image feature to an RGB domain, and the second processing module 904 is further configured to map a third image feature to the RGB domain; the image generating module 906 is further configured to add the second image feature of the RGB domain and the third image feature of the RGB domain to generate the target image.
In one embodiment, the apparatus further comprises a shooting module; the shooting module is configured to capture at least two original images with the shooting parameters locked.
The respective modules in the above image processing apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. Each of the above modules may be embedded in, or independent of, a processor of the electronic device in hardware form, or may be stored in a memory of the electronic device in software form, so that the processor can call and execute the operations corresponding to the above modules.
In one embodiment, an electronic device, which may be a terminal, is provided, and an internal structure thereof may be as shown in fig. 10. The electronic device includes a processor, a memory, an input/output interface, a communication interface, a display unit, and an input device. The processor, the memory and the input/output interface are connected through a system bus, and the communication interface, the display unit and the input device are connected to the system bus through the input/output interface. Wherein the processor of the electronic device is configured to provide computing and control capabilities. The memory of the electronic device includes a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The input/output interface of the electronic device is used to exchange information between the processor and the external device. The communication interface of the electronic device is used for conducting wired or wireless communication with an external terminal, and the wireless communication can be realized through WIFI, a mobile cellular network, NFC (near field communication) or other technologies. The computer program is executed by a processor to implement an image processing method. The display unit of the electronic device is used for forming a visual picture, and can be a display screen, a projection device or a virtual reality imaging device. The display screen can be a liquid crystal display screen or an electronic ink display screen, and the input device of the electronic equipment can be a touch layer covered on the display screen, can also be keys, a track ball or a touch pad arranged on the shell of the electronic equipment, and can also be an external keyboard, a touch pad or a mouse and the like.
It will be appreciated by those skilled in the art that the structure shown in fig. 10 is merely a block diagram of a portion of the structure associated with the present inventive arrangements and is not limiting of the electronic device to which the present inventive arrangements are applied, and that a particular electronic device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
The embodiments of the application also provide a computer-readable storage medium: one or more non-transitory computer-readable storage media containing computer-executable instructions that, when executed by one or more processors, cause the processors to perform the steps of the image processing method.
Embodiments of the present application also provide a computer program product comprising instructions which, when run on a computer, cause the computer to perform an image processing method.
It should be noted that, the user information (including but not limited to user equipment information, user personal information, etc.) and the data (including but not limited to data for analysis, stored data, presented data, etc.) related to the present application are information and data authorized by the user or sufficiently authorized by each party, and the collection, use and processing of the related data need to comply with the related laws and regulations and standards of the related country and region.
Those skilled in the art will appreciate that implementing all or part of the methods described above may be accomplished by a computer program stored on a non-transitory computer-readable storage medium which, when executed, may include the steps of the method embodiments described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. Non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, Resistive Random Access Memory (ReRAM), Magnetoresistive Random Access Memory (MRAM), Ferroelectric Random Access Memory (FRAM), Phase Change Memory (PCM), graphene memory, and the like. Volatile memory can include Random Access Memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM can take various forms such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases referred to in the embodiments provided herein may include at least one of a relational database and a non-relational database. Non-relational databases may include, but are not limited to, blockchain-based distributed databases and the like. The processors referred to in the embodiments provided in the present application may be general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic units, data processing logic units based on quantum computing, and the like, but are not limited thereto.
The technical features of the above embodiments may be arbitrarily combined, and all possible combinations of the technical features in the above embodiments are not described for brevity of description, however, as long as there is no contradiction between the combinations of the technical features, they should be considered as the scope of the description.
The foregoing examples illustrate only a few embodiments of the application and are described in detail, but are not thereby to be construed as limiting the scope of the application. It should be noted that several variations and modifications can be made by those skilled in the art without departing from the spirit of the application, all of which fall within the protection scope of the application. Accordingly, the scope of the application shall be subject to the appended claims.

Claims (25)

1. An image processing method, comprising:
Denoising and demosaicing the first image features extracted from at least two original images based on noise information of reference frames in the at least two original images to obtain second image features; the frequency of the second image feature is lower than a preset frequency threshold;
performing super-resolution processing on the second image features to obtain third image features; the frequency of the third image feature is higher than or equal to a preset frequency threshold;
and generating a target image according to the second image characteristic and the third image characteristic.
2. The method according to claim 1, wherein the performing a de-noising process and a demosaicing process on the first image feature extracted from the at least two original images based on noise information of reference frames in the at least two original images to obtain the second image feature includes:
denoising the first image features extracted from at least two original images based on noise information of reference frames in the at least two original images to obtain denoised image features;
And performing demosaicing processing on the denoised image characteristics to obtain second image characteristics.
3. The method according to claim 2, wherein denoising the first image feature extracted from the at least two original images based on noise information of reference frames in the at least two original images, to obtain a denoised image feature, comprises:
Performing feature mapping on noise information of reference frames in at least two original images and first image features extracted from the at least two original images to obtain mapping features;
and denoising the mapping characteristics to obtain denoised image characteristics.
4. A method according to claim 3, wherein denoising the mapped feature results in a denoised image feature, comprising:
Sequentially performing downsampling and upsampling on the mapping features to obtain upsampling features; the resolution of the upsampled features is the same as the resolution of the mapped features;
And obtaining the denoising image characteristic based on the upsampling characteristic and the mapping characteristic.
5. The method of claim 2, wherein said demosaicing the de-noised image features to obtain second image features comprises:
And carrying out up-sampling and residual error processing on the denoising image features to obtain second image features.
6. The method of claim 1, wherein performing super-resolution processing on the second image feature to obtain a third image feature comprises:
Residual processing is carried out on the second image feature to obtain a residual processing feature;
and upsampling the residual processing feature to obtain a third image feature.
7. The method of claim 6, wherein upsampling the residual processing feature to obtain a third image feature comprises:
Up-sampling the residual processing features at a preset multiplying power to obtain third image features;
The method further comprises the steps of:
upsampling the second image feature by the preset multiplying power to obtain an upsampled second image feature;
the generating a target image according to the second image feature and the third image feature comprises:
and generating a target image according to the second image characteristic after upsampling and the third image characteristic.
8. The method of claim 1, wherein extracting a first image feature from the at least two original images comprises:
extracting sub-image features of each original image from the at least two original images;
And fusing at least two sub-image features to obtain a first image feature.
9. The method of claim 8, wherein extracting sub-image features of each original image from the at least two original images comprises:
Acquiring a motion area mask of each non-reference frame except the reference frame in the at least two original images;
sub-image features of each original image are extracted based on the reference frame, the non-reference frame, and a corresponding motion region mask.
10. The method according to claim 1, wherein the determining the noise information of the reference frame includes:
determining shot noise and readout noise corresponding to the reference frame based on shooting parameters of the reference frame;
Generating a target noise map of the reference frame based on shot noise and readout noise corresponding to the reference frame; the target noise map includes noise information for the reference frame.
11. The method of claim 10, wherein generating the target noise map for the reference frame based on shot noise and readout noise corresponding to the reference frame comprises:
multiplying each pixel in the reference frame by the shot noise respectively to obtain an intermediate noise diagram;
And adding the readout noise to the intermediate noise map to generate a target noise map of the reference frame.
12. The method according to claim 1, wherein the method further comprises:
Performing motion detection on non-reference frames except for reference frames in the at least two original images, and determining a motion area of each non-reference frame;
and updating a motion area of each non-reference frame based on the image information of the reference frame to obtain an updated non-reference frame, taking the reference frame and the updated non-reference frame as at least two new original images, and executing the steps of denoising and demosaicing the first image features extracted from the at least two original images.
13. The method of claim 12, wherein updating the motion region of each non-reference frame based on the image information of the reference frame to obtain updated non-reference frames comprises:
And for each non-reference frame, replacing the motion area of the non-reference frame with the image information of the motion area position corresponding to the non-reference frame in the reference frame to obtain an updated non-reference frame.
14. The method according to claim 12, wherein the method further comprises:
registering non-reference frames except for the reference frames in the at least two original images to the reference frames to obtain registered non-reference frames;
Updating the motion area of each non-reference frame based on the image information of the reference frame to obtain updated non-reference frames, including:
And updating the motion area of each registered non-reference frame based on the image information of the reference frame to obtain updated non-reference frames.
15. The method of claim 14, wherein registering non-reference frames other than the reference frame in the at least two original images to the reference frame results in a registered non-reference frame, comprising:
Determining a motion transformation relationship between the reference frame and each non-reference frame based on the reference frame and non-reference frames other than the reference frame in the at least two original images;
Registering each non-reference frame to the reference frame based on the motion transformation relation corresponding to the non-reference frame, to obtain registered non-reference frames.
16. The method of claim 15, wherein the motion transformation relationship comprises at least one of an affine transformation matrix and a feature point optical flow.
17. The method according to any one of claims 1 to 16, wherein determining a reference frame from at least two original images comprises:
Determining the definition of each of at least two original images;
a reference frame is determined from at least two original images based on the sharpness of each original image.
18. The method of claim 17, wherein determining sharpness of each of the at least two original images comprises:
For each RAW domain original image, carrying out average processing on a green channel in the RAW domain original image to generate a gray level image;
extracting a Gaussian difference operator from the gray level map;
and determining the definition of the original image based on the Gaussian difference operator.
19. The method of claim 18, wherein the determining the sharpness of the original image based on the gaussian difference operator comprises:
and averaging all elements included in the Gaussian difference operator to obtain the definition of the original image.
20. The method according to any one of claims 1 to 16, further comprising:
mapping the second image feature to an RGB domain and mapping the third image feature to an RGB domain;
the generating a target image according to the second image feature and the third image feature comprises:
And adding the second image features of the RGB domain and the third image features of the RGB domain to generate a target image.
21. The method according to any one of claims 1 to 16, further comprising:
In the case of locking the shooting parameters, at least two original images are shot.
22. An image processing apparatus, comprising:
The first processing module is used for carrying out denoising processing and demosaicing processing on first image features extracted from at least two original images based on noise information of reference frames in the at least two original images to obtain second image features; the frequency of the second image feature is lower than a preset frequency threshold;
The second processing module is used for performing super-resolution processing on the second image features to obtain third image features; the frequency of the third image feature is higher than or equal to a preset frequency threshold;
And the image generation module is used for generating a target image according to the second image characteristic and the third image characteristic.
23. An electronic device comprising a memory and a processor, the memory having stored therein a computer program which, when executed by the processor, causes the processor to perform the steps of the image processing method according to any of claims 1 to 21.
24. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method according to any one of claims 1 to 21.
25. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 21.
CN202211499327.9A 2022-11-28 2022-11-28 Image processing method, apparatus, electronic device, and computer-readable storage medium Pending CN118115375A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211499327.9A CN118115375A (en) 2022-11-28 2022-11-28 Image processing method, apparatus, electronic device, and computer-readable storage medium

Publications (1)

Publication Number Publication Date
CN118115375A true CN118115375A (en) 2024-05-31

Family

ID=91209108

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211499327.9A Pending CN118115375A (en) 2022-11-28 2022-11-28 Image processing method, apparatus, electronic device, and computer-readable storage medium

Country Status (1)

Country Link
CN (1) CN118115375A (en)

Similar Documents

Publication Publication Date Title
CN110827200B (en) Image super-resolution reconstruction method, image super-resolution reconstruction device and mobile terminal
CN106027851B (en) Method and system for processing images
US8130278B2 (en) Method for forming an improved image using images with different resolutions
CN113508416B (en) Image fusion processing module
US11334961B2 (en) Multi-scale warping circuit for image fusion architecture
US11816858B2 (en) Noise reduction circuit for dual-mode image fusion architecture
Xu et al. Deep joint demosaicing and high dynamic range imaging within a single shot
CN115115516A (en) Real-world video super-resolution algorithm based on Raw domain
Ma et al. Restoration and enhancement on low exposure raw images by joint demosaicing and denoising
CN113689335B (en) Image processing method and device, electronic equipment and computer readable storage medium
CN113379609A (en) Image processing method, storage medium and terminal equipment
CN114581355A (en) Method, terminal and electronic device for reconstructing HDR image
US11841926B2 (en) Image fusion processor circuit for dual-mode image fusion architecture
Ye et al. LFIENet: light field image enhancement network by fusing exposures of LF-DSLR image pairs
CN117768774A (en) Image processor, image processing method, photographing device and electronic device
Vien et al. Single-shot high dynamic range imaging via multiscale convolutional neural network
WO2024055458A1 (en) Image noise reduction processing method and apparatus, device, storage medium, and program product
JP7025237B2 (en) Image processing equipment and its control method and program
US11803949B2 (en) Image fusion architecture with multimode operations
US11798146B2 (en) Image fusion architecture
CN115035013A (en) Image processing method, image processing apparatus, terminal, and readable storage medium
CN118115375A (en) Image processing method, apparatus, electronic device, and computer-readable storage medium
JP2023111637A (en) Image processing device and method and imaging apparatus
CN113473028A (en) Image processing method, image processing device, camera assembly, electronic equipment and medium
Cho et al. PyNET-QxQ: An Efficient PyNET Variant for QxQ Bayer Pattern Demosaicing in CMOS Image Sensors

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination