WO2023016468A1 - Demosaicing method, electronic device and storage medium - Google Patents


Info

Publication number
WO2023016468A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
rgbir
target
pixel
channel
Prior art date
Application number
PCT/CN2022/111227
Inventor
井敏皓
戢仁和
Original Assignee
北京旷视科技有限公司
北京迈格威科技有限公司
Priority date
Filing date
Publication date
Application filed by 北京旷视科技有限公司, 北京迈格威科技有限公司
Publication of WO2023016468A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4007 Scaling of whole images or parts thereof, e.g. expanding or contracting based on interpolation, e.g. bilinear interpolation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10024 Color image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning

Definitions

  • the RGBIR image sensor is a sensor that can simultaneously sense infrared information and visible light information.
  • the RGBIR image sensor replaces the pixels covering the green filter in the Bayer array with the pixels covering the near-infrared filter, so that the information of two bands of visible light and near-infrared light can be simultaneously sensed through one image sensor.
  • the image required by a computer vision task is a target RGB image and/or a target IR image. Therefore, the target RGB image and/or the target IR image must be obtained from the RGBIR image collected by the RGBIR image sensor. Obtaining the target RGB image and/or the target IR image from the RGBIR image collected by the RGBIR image sensor is called demosaicing.
  • the commonly used demosaicing algorithm is: interpolate the RGBIR image collected by the RGBIR image sensor to generate an RGB image and an IR (Infrared Radiation, near-infrared light) image; then perform color correction on the interpolated RGB image and/or the interpolated IR image to eliminate the interference information in them, which completes demosaicing and yields the target RGB image and/or the target IR image.
  • IR: Infrared Radiation, near-infrared light
  • the commonly used demosaicing algorithm will introduce a large interpolation error, resulting in low quality of the target RGB image and/or the target IR image. Moreover, manual correction is very dependent on expert experience, and correction errors will inevitably occur, resulting in low quality of the target RGB image and target IR image.
  • the commonly used demosaic algorithm is strongly related to the color filter array of the RGBIR sensor, and the demosaic algorithm suitable for the color filter array of one type of RGBIR sensor cannot be applied to the color filter array of another type of RGBIR sensor. The demosaic algorithm must be developed separately for each model of RGBIR sensor, resulting in high cost.
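  • For reference, the interpolation step of the commonly used algorithm described above can be sketched as follows. This is a minimal sketch of the conventional baseline, not the method of this application. The 2×2 filter tile `TILE`, the function names and the 3×3 neighborhood averaging are all illustrative assumptions, since the text does not specify an interpolation scheme or a filter layout.

```python
# Hypothetical 2x2 RGBIR tile (an assumption): one green site of a Bayer
# tile is replaced by a near-infrared (IR) site.
TILE = [["R", "G"],
        ["IR", "B"]]


def channel_at(row, col):
    """Return the filter type covering the pixel at (row, col)."""
    return TILE[row % 2][col % 2]


def interpolate_ir(mosaic):
    """Fill a full-resolution IR plane by averaging nearby IR samples.

    `mosaic` is a list of rows of raw pixel values. IR sites keep their
    measured value; every other site takes the mean of the IR samples in
    its 3x3 neighborhood. Nothing here removes visible-light crosstalk,
    which is why the conventional pipeline needs a later correction step.
    """
    height, width = len(mosaic), len(mosaic[0])
    ir_plane = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            if channel_at(r, c) == "IR":
                ir_plane[r][c] = float(mosaic[r][c])
                continue
            samples = [
                mosaic[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, height))
                for cc in range(max(c - 1, 0), min(c + 2, width))
                if channel_at(rr, cc) == "IR"
            ]
            ir_plane[r][c] = sum(samples) / len(samples)
    return ir_plane


mosaic = [[10, 20, 10, 20],
          [40, 30, 80, 30],
          [10, 20, 10, 20],
          [40, 30, 80, 30]]
ir = interpolate_ir(mosaic)
```

  Because the neighborhood logic is written against one specific tile, a different filter layout would need a different interpolation routine, which illustrates the per-sensor development cost noted above.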
  • Embodiments of the present application provide a demosaicing method, electronic equipment, and a storage medium.
  • the embodiment of this application provides a demosaicing method, including:
  • acquiring a target RGBIR image, wherein the target RGBIR image is a preprocessed RGBIR image, and the preprocessing includes dark level compensation;
  • using a first neural network to perform first processing on the target RGBIR image to obtain a target IR image; and/or using a second neural network to perform second processing on the target RGBIR image to obtain a target RGB image; wherein the first processing includes: eliminating the visible light component in the pixel values of the IR channel pixels in the target RGBIR image, and predicting the pixel values of the color channel pixels in the IR channel; and the second processing includes: for each color channel, eliminating the near-infrared light component in the pixel values of the pixels of that color channel in the target RGBIR image, and predicting, in that color channel, the pixel values of the pixels other than the pixels of that color channel.
  • the embodiment of the present application also provides an electronic device, including:
  • the processor is configured to execute the instructions to implement the demosaic method described above.
  • the embodiment of the present application also provides a storage medium, and when instructions in the storage medium are executed by a processor of the electronic device, the electronic device can execute the demosaicing method described above.
  • An embodiment of the present application further provides a computer program product, including a computer program/instruction, and when the computer program/instruction is executed by a processor, the foregoing demosaicing method is implemented.
  • the demosaicing method provided in the embodiment of the present application uses the first neural network and/or the second neural network to complete the demosaicing of the target RGBIR image to obtain the target IR image and/or the target RGB image.
  • in the demosaicing process, the component that acts as interference information in the pixel value of each pixel of the corresponding channel in the target RGBIR image is eliminated (the interference component being the visible light component or the near-infrared light component), and the pixel value of each corresponding pixel in the corresponding channel is predicted. Pixel values that contain an interference component are not used for interpolation, so no interpolation error is generated.
  • the demosaicing process is completed automatically by the first neural network and/or the second neural network. It does not rely on correction based on expert experience, so no correction error is generated. This avoids the adverse effects of interpolation errors and correction errors on demosaicing, improves the image quality of the obtained target IR image and/or target RGB image, and provides favorable conditions for subsequently generating IR images and/or RGB images with a high signal-to-noise ratio and good color reproduction.
  • the first neural network and/or the second neural network can eliminate the interference component in the pixel value of a pixel of any channel in the target RGBIR image, and can predict the pixel value of any pixel of the target RGBIR image in the corresponding channel. The method is therefore not affected by the color filter array of the sensor and is suitable for demosaicing RGBIR images collected by any RGBIR image sensor.
  • Fig. 1 shows the flowchart of the demosaicing method provided by the embodiment of the present application.
  • Fig. 2 shows a structural schematic diagram suitable for the first processing or the second processing.
  • Fig. 3 shows another structural schematic diagram suitable for the first processing or the second processing.
  • Fig. 4 shows a structural block diagram of a demosaicing device provided in an embodiment of the present application.
  • Fig. 5 shows a structural block diagram of an electronic device provided by an embodiment of the present application.
  • Fig. 1 shows the flowchart of the demosaicing method provided by the embodiment of the present application. The method includes:
  • Step 101: acquire a target RGBIR image.
  • the target RGBIR image is a preprocessed RGBIR image, and the preprocessing includes dark level compensation.
  • the preprocessed RGBIR image can be obtained in advance by a preprocessing device that preprocesses the RGBIR image collected by the RGBIR sensor. Because the RGBIR sensor itself has a dark level, the RGBIR image it collects is affected by that dark level, and this influence must be eliminated.
  • the preprocessing device may perform dark level compensation in advance on the RGBIR image collected by the RGBIR sensor to obtain the preprocessed RGBIR image.
  • acquiring the target RGBIR image includes: acquiring an original RGBIR image.
  • the original RGBIR image is the RGBIR image collected by the RGBIR sensor.
  • dark level compensation can be performed on the original RGBIR image to obtain a dark-level-compensated RGBIR image, and the dark-level-compensated RGBIR image can be used as the target RGBIR image.
  • Compensating the dark level of the original RGBIR image can be expressed as:
  • BLCbayer represents the dark level of the RGBIR sensor, and the dark level of the RGBIR sensor can be provided by the RGBIR sensor manufacturer.
  • R, G, B, and IR on the right side of the equation respectively represent the original pixel values of the R channel pixels, the G channel pixels, the B channel pixels, and the IR channel pixels.
  • R, G, B, and IR on the left side of the equation respectively represent the pixel values of the R channel pixels, the G channel pixels, the B channel pixels, and the IR channel pixels after dark level compensation.
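  • The formula itself is not reproduced in the text above. As a minimal sketch, and assuming that dark level compensation means subtracting the sensor dark level from every raw pixel value, the step might look as follows. The function name, the scalar `blc_bayer` argument and the clamping of negative results to zero are illustrative assumptions, not details taken from the source.

```python
def compensate_dark_level(raw, blc_bayer):
    """Subtract the sensor dark level from every raw pixel value.

    `raw` is a list of rows of raw RGBIR pixel values (R, G, B and IR
    sites alike) and `blc_bayer` is the sensor dark level, treated here
    as a single scalar. Clamping negative results to zero is an added
    assumption, not something stated in the text.
    """
    return [[max(value - blc_bayer, 0) for value in row] for row in raw]


raw = [[80, 120],
       [64, 300]]
compensated = compensate_dark_level(raw, blc_bayer=64)
```

  The same subtraction is applied to every site regardless of its filter type, which matches the description of identical symbols appearing on both sides of the equation.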
  • an R pixel corresponds to a red filter in the RGBIR sensor array. For each R pixel, its position in the RGBIR image is the same as the position of its corresponding red filter in the RGBIR sensor array.
  • a G pixel corresponds to a green filter in the RGBIR sensor array. For each G pixel, its position in the RGBIR image is the same as the position of its corresponding green filter in the RGBIR sensor array.
  • a B pixel corresponds to a blue filter in the RGBIR sensor array. For each B pixel, its position in the RGBIR image is the same as the position of its corresponding blue filter in the RGBIR sensor array.
  • an IR pixel corresponds to a near-infrared filter in the RGBIR sensor array. For each IR pixel, its position in the RGBIR image is the same as the position of its corresponding near-infrared filter in the RGBIR sensor array.
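  • Because pixel positions mirror filter positions, the channel of any pixel can be looked up from the filter layout alone. The sketch below groups pixel positions by channel. The 2×2 `TILE` layout and the function name are hypothetical and chosen only for illustration; real RGBIR sensors use a variety of layouts.

```python
# Hypothetical repeating filter tile (an assumption, not from the source).
TILE = [["R", "G"],
        ["IR", "B"]]


def split_by_channel(image, tile=TILE):
    """Group the (row, col) positions of an image by filter type.

    The filter covering pixel (r, c) is found by tiling `tile` across
    the sensor, so each pixel position equals its filter position.
    """
    tile_h, tile_w = len(tile), len(tile[0])
    positions = {}
    for r, row in enumerate(image):
        for c, _ in enumerate(row):
            channel = tile[r % tile_h][c % tile_w]
            positions.setdefault(channel, []).append((r, c))
    return positions


image = [[0] * 4 for _ in range(4)]
positions = split_by_channel(image)
```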
  • the preprocessing may further include: image normalization.
  • Image normalization makes the pixel values of the normalized image conform to a certain distribution, which makes the normalized image easier for the neural network to process.
  • Image normalization can be performed on the RGBIR image after dark level compensation to obtain the target RGBIR image.
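  • The text does not say which distribution the normalization targets. As one possible reading, the sketch below scales dark-level-compensated values into the range [0, 1]. The function name, the `max_value` parameter and the example 10-bit white level of 1023 are assumptions made only for illustration.

```python
def normalize(image, max_value):
    """Scale dark-level-compensated pixel values into [0, 1].

    `max_value` is the largest value a compensated pixel is expected to
    take (assumed here to be the white level minus the dark level).
    Values above it are clipped to 1.0.
    """
    return [[min(value / max_value, 1.0) for value in row] for row in image]


compensated = [[16, 56],
               [0, 236]]
target_rgbir = normalize(compensated, max_value=1023 - 64)
```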
  • Step 102: use the first neural network to perform the first processing on the target RGBIR image to obtain the target IR image, and/or use the second neural network to perform the second processing on the target RGBIR image to obtain the target RGB image.
  • the first neural network may be a convolutional neural network.
  • the first processing includes: eliminating the visible light component in the pixel values of the IR channel pixels in the target RGBIR image, and predicting the pixel values of the color channel pixels in the IR channel.
  • the pixel value of an IR channel pixel in the RGBIR image is composed of the part affected by near-infrared light, i.e. the near-infrared light component, and the part affected by visible light, i.e. the visible light component.
  • the pixel value of each pixel in the target IR image has no visible light component.
  • for each IR channel pixel in the target RGBIR image, the target IR image includes a corresponding pixel whose pixel value is the near-infrared light component of that IR channel pixel's value; the corresponding pixel has the same position in the target IR image as the IR channel pixel has in the target RGBIR image.
  • for each R channel pixel in the target RGBIR image, the target IR image includes a corresponding pixel whose pixel value is the predicted pixel value of that R channel pixel in the IR channel; the corresponding pixel has the same position in the target IR image as the R channel pixel has in the target RGBIR image.
  • for each G channel pixel in the target RGBIR image, the target IR image includes a corresponding pixel whose pixel value is the predicted pixel value of that G channel pixel in the IR channel; the corresponding pixel has the same position in the target IR image as the G channel pixel has in the target RGBIR image.
  • for each B channel pixel in the target RGBIR image, the target IR image includes a corresponding pixel whose pixel value is the predicted pixel value of that B channel pixel in the IR channel; the corresponding pixel has the same position in the target IR image as the B channel pixel has in the target RGBIR image.
  • before the first neural network is used to obtain the target IR image, it is trained in advance using a plurality of image pairs for training the first neural network.
  • each image pair used to train the first neural network includes an RGBIR image and an IR image without visible light interference.
  • an IR image without visible light interference may be captured by a sensor for capturing such images, for example an IR sensor.
  • the RGBIR image and the IR image without visible light interference in an image pair can be obtained by having the RGBIR image sensor and the sensor used to collect the IR image without visible light interference photograph the same object at the same moment.
  • the RGBIR image sensor is at the same position as the sensor for capturing the IR image without visible light interference.
  • the RGBIR image sensor photographs an object to obtain the RGBIR image in the image pair, and at the same moment the sensor used to collect IR images without visible light interference photographs the same object to obtain the IR image without visible light interference in the image pair.
  • the first neural network is trained with one image pair at a time.
  • a different image pair is used each time. Each time the first neural network is trained, the RGBIR image in the image pair is used as the input of the first neural network, and the IR image without visible light interference in the image pair is used as the label, i.e. the ground truth.
  • through training, the first neural network learns the relationship between the RGBIR image and the IR image without visible light interference. Specifically, it can learn the relationship between the pixel values at the same position in the RGBIR image and in the IR image without visible light interference.
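  • The training procedure described above (one image pair per step, RGBIR image as input, interference-free IR image as label) can be sketched with a toy example. Everything specific here is an assumption made for illustration: a two-parameter per-pixel linear model stands in for the convolutional network, the image pairs are synthetic with a fixed 40% visible-light share, and the loss is mean squared error. The source does not specify the loss function or optimizer.

```python
import random


def make_pair(rng, n_pixels=64, nir_share=0.6):
    """Create one synthetic (input, label) image pair as flat pixel lists.

    The input plays the role of the RGBIR measurement at IR sites; the
    label plays the role of the IR image without visible light
    interference. The fixed linear mixing is purely an assumption.
    """
    inputs = [rng.random() for _ in range(n_pixels)]
    label = [nir_share * x for x in inputs]
    return inputs, label


def train(num_steps=500, lr=0.3, seed=0):
    """Gradient descent on mean squared error, one image pair per step."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0  # two-parameter stand-in for the network weights
    for _ in range(num_steps):
        inputs, label = make_pair(rng)  # a different pair at every step
        errors = [w * x + b - y for x, y in zip(inputs, label)]
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, inputs)) / len(inputs)
        grad_b = 2.0 * sum(errors) / len(errors)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


w, b = train()
```

  After training, `w` approaches the near-infrared share used to build the labels, which is the toy analogue of the network learning the relationship between pixel values at the same position in the two images.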
  • the second neural network may be a convolutional neural network.
  • the second processing includes: for each color channel, eliminating the near-infrared light component in the pixel values of the pixels of that color channel in the target RGBIR image, and predicting, in that color channel, the pixel values of the pixels other than the pixels of that color channel.
  • the R channel, G channel, and B channel are all color channels. Due to the crosstalk between visible light and near-infrared light, the pixel values of R channel pixels, G channel pixels, and B channel pixels in an RGBIR image are each composed of the part affected by visible light, i.e. the visible light component, and the part affected by near-infrared light, i.e. the near-infrared light component.
  • for the R channel, eliminating the near-infrared light component in the pixel value of an R channel pixel yields the visible light component in that pixel value.
  • for the G channel, eliminating the near-infrared light component in the pixel value of a G channel pixel yields the visible light component in that pixel value.
  • for the B channel, eliminating the near-infrared light component in the pixel value of a B channel pixel yields the visible light component in that pixel value.
  • the target R-channel image, target G-channel image, and target B-channel image can be obtained, and the target R-channel image, target G-channel image, and target B-channel image form the target RGB image.
  • for each R channel pixel in the target RGBIR image, the target R channel image includes a corresponding pixel whose pixel value is the visible light component of that R channel pixel's value; the corresponding pixel has the same position in the target R channel image as the R channel pixel has in the target RGBIR image.
  • for each G channel pixel in the target RGBIR image, the target R channel image includes a corresponding pixel whose pixel value is the predicted pixel value of that G channel pixel in the R channel; the corresponding pixel has the same position in the target R channel image as the G channel pixel has in the target RGBIR image.
  • for each B channel pixel in the target RGBIR image, the target R channel image includes a corresponding pixel whose pixel value is the predicted pixel value of that B channel pixel in the R channel; the corresponding pixel has the same position in the target R channel image as the B channel pixel has in the target RGBIR image.
  • for each IR channel pixel in the target RGBIR image, the target R channel image includes a corresponding pixel whose pixel value is the predicted pixel value of that IR channel pixel in the R channel; the corresponding pixel has the same position in the target R channel image as the IR channel pixel has in the target RGBIR image.
  • for each G channel pixel in the target RGBIR image, the target G channel image includes a corresponding pixel whose pixel value is the visible light component of that G channel pixel's value; the corresponding pixel has the same position in the target G channel image as the G channel pixel has in the target RGBIR image.
  • for each R channel pixel in the target RGBIR image, the target G channel image includes a corresponding pixel whose pixel value is the predicted pixel value of that R channel pixel in the G channel; the corresponding pixel has the same position in the target G channel image as the R channel pixel has in the target RGBIR image.
  • for each B channel pixel in the target RGBIR image, the target G channel image includes a corresponding pixel whose pixel value is the predicted pixel value of that B channel pixel in the G channel; the corresponding pixel has the same position in the target G channel image as the B channel pixel has in the target RGBIR image.
  • for each IR channel pixel in the target RGBIR image, the target G channel image includes a corresponding pixel whose pixel value is the predicted pixel value of that IR channel pixel in the G channel; the corresponding pixel has the same position in the target G channel image as the IR channel pixel has in the target RGBIR image.
  • for each B channel pixel in the target RGBIR image, the target B channel image includes a corresponding pixel whose pixel value is the visible light component of that B channel pixel's value; the corresponding pixel has the same position in the target B channel image as the B channel pixel has in the target RGBIR image.
  • for each R channel pixel in the target RGBIR image, the target B channel image includes a corresponding pixel whose pixel value is the predicted pixel value of that R channel pixel in the B channel; the corresponding pixel has the same position in the target B channel image as the R channel pixel has in the target RGBIR image.
  • for each G channel pixel in the target RGBIR image, the target B channel image includes a corresponding pixel whose pixel value is the predicted pixel value of that G channel pixel in the B channel; the corresponding pixel has the same position in the target B channel image as the G channel pixel has in the target RGBIR image.
  • for each IR channel pixel in the target RGBIR image, the target B channel image includes a corresponding pixel whose pixel value is the predicted pixel value of that IR channel pixel in the B channel; the corresponding pixel has the same position in the target B channel image as the IR channel pixel has in the target RGBIR image.
  • before the second neural network is used to obtain the target RGB image, it is trained in advance using a plurality of image pairs for training the second neural network.
  • each image pair used to train the second neural network includes an RGBIR image and an RGB image without near-infrared light interference.
  • the RGB image without near-infrared light interference is collected by a sensor for collecting such images.
  • for example, an RGB image without near-infrared light interference can be captured by an RGB sensor.
  • the RGBIR image and the RGB image without near-infrared light interference in an image pair can be obtained by having the RGBIR image sensor and the sensor used to collect the RGB image without near-infrared light interference photograph the same object at the same moment.
  • the position of the RGBIR image sensor is the same as the position of the sensor for collecting the RGB image without near-infrared light interference.
  • the RGBIR image sensor photographs an object to obtain the RGBIR image in the image pair, and at the same moment the sensor for collecting RGB images without near-infrared light interference photographs the same object to obtain the RGB image without near-infrared light interference in the image pair.
  • the second neural network is trained with one image pair at a time.
  • a different image pair is used each time.
  • each time the second neural network is trained, the RGBIR image in the image pair is used as the input of the second neural network, and the RGB image without near-infrared light interference in the image pair is used as the label.
  • through training, the second neural network learns the relationship between the RGBIR image and the RGB image without near-infrared light interference. Specifically, it can learn the relationship between the pixel values of the pixels in the RGBIR image and in the RGB image without near-infrared light interference.
  • the first neural network and/or the second neural network are used to complete demosaicing to obtain the target IR image and/or target RGB image.
  • in the demosaicing process, the component that acts as interference information in the pixel value of each pixel of the corresponding channel in the target RGBIR image is eliminated (the interference component being the visible light component or the near-infrared light component), and the pixel value of each corresponding pixel in the corresponding channel is predicted. Pixel values that contain an interference component are not used for interpolation, so no interpolation error is generated.
  • the demosaicing process is automatically completed by the first neural network and/or the second neural network without relying on expert experience.
  • the first neural network and/or the second neural network can eliminate the interference component in the pixel value of a pixel of any channel in the target RGBIR image, and can predict the pixel value of any pixel of the target RGBIR image in the corresponding channel. The method is therefore not affected by the color filter array of the sensor and is suitable for demosaicing RGBIR images collected by any RGBIR image sensor.
  • the size of the target IR image matches the size of the target RGBIR image
  • the size of the target RGB image matches the size of the target RGBIR image
  • eliminating the visible light component in the pixel values of the IR channel pixels in the target RGBIR image and predicting the pixel values of the color channel pixels in the IR channel includes: performing feature extraction processing on the target RGBIR image to obtain a first feature, where the first feature includes near-infrared light band information of the target RGBIR image; and performing IR image reconstruction processing, where the IR image reconstruction processing includes: based on the first feature, eliminating the visible light component in the pixel values of the IR channel pixels to obtain the near-infrared light component in those pixel values; and, based on the near-infrared light component in the pixel values of the IR channel pixels, predicting the pixel values of the color channel pixels in the IR channel.
  • the first neural network may include a feature extraction module and a reconstruction module.
  • the feature extraction module in the first neural network may perform feature extraction processing on the target RGBIR image to obtain the first feature.
  • the IR image reconstruction processing can be performed by the reconstruction module in the first neural network.
  • the feature extraction module in the first neural network may include blocks and pooling layers, and each block may include multiple convolutional layers.
  • the reconstruction module in the first neural network may include blocks and upsampling layers.
  • the input of the feature extraction module in the first neural network is the target RGBIR image, and its output is the first feature.
  • the input of the reconstruction module in the first neural network is the first feature, and its output is the target IR image.
  • the near-infrared light component in the pixel value corresponding to the color channel pixel in the target RGBIR image is eliminated, and the pixels other than the color channel pixel are predicted to be in the color channel
  • the pixel value of the color channel includes: performing feature extraction processing on the target RGBIR image to obtain a second feature, the second feature includes: visible light band information of the target RGBIR image; performing RGB image reconstruction processing, and the RGB image reconstruction process includes: For each color channel, based on the second feature, eliminate the near-infrared light component in the pixel value corresponding to the color channel pixel point in the target RGBIR image, and obtain the visible light component in the pixel value corresponding to the color channel pixel point; based on the color channel pixel point For the visible light component in the corresponding pixel value, predict the pixel value of the pixel point in the color channel except for the pixel point of the color channel.
  • the first neural network may include a feature extraction module and a reconstruction module.
  • the feature extraction module in the second neural network can perform feature extraction processing on the target RGBIR image to obtain the second feature.
  • the RGB image reconstruction process can be performed by the reconstruction module in the second neural network.
  • the feature extraction module in the second neural network may include blocks and pooling layers, and the blocks may include multiple convolutional layers.
  • the reconstruction module in the second neural network may include blocks, upsampling layers.
  • the input of the feature extraction module in the second neural network is the target RGBIR image, and its output is the second feature; the input of the reconstruction module in the second neural network is the second feature, and its output is the target RGB image.
  • FIG. 2 shows a schematic diagram of one structure suitable for the first processing or the second processing.
  • if the structure in Fig. 2 is used for the first processing, the feature extraction module and the reconstruction module in Fig. 2 are the feature extraction module and the reconstruction module in the first neural network.
  • if the structure in Fig. 2 is used for the second processing, the feature extraction module and the reconstruction module in Fig. 2 are the feature extraction module and the reconstruction module in the second neural network.
  • the feature extraction module includes block 1, block 2, and pooling layer.
  • the reconstruction module includes: block 3, block 4, and upsampling layer.
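  • the input of the feature extraction module, that is, the input of block 1, is an RGBIR image. If the structure in Fig. 2 is used for the first processing, the output of the reconstruction module, i.e. the output of block 4, is the target IR image. If the structure in Fig. 2 is used for the second processing, the output of the reconstruction module is the target RGB image.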
  • FIG. 3 shows a schematic diagram of another structure suitable for the first processing or the second processing.
  • if the structure in Fig. 3 is used for the first processing, the feature extraction module and the reconstruction module in Fig. 3 are the feature extraction module and the reconstruction module in the first neural network.
  • if the structure in Fig. 3 is used for the second processing, the feature extraction module and the reconstruction module in Fig. 3 are the feature extraction module and the reconstruction module in the second neural network.
  • the feature extraction module includes block 1, block 2, block 3, and multiple pooling layers.
  • the reconstruction module includes: block 4, block 5, block 6, multiple upsampling layers.
  • the input of the feature extraction module that is, the input of block 1, is an RGBIR image. If the structure in Fig. 3 is the structure used for the first processing, the output of the reconstruction module, ie the output of the up-sampling layer connected to block 6, is the target IR image. If the structure in FIG. 3 is used for the second processing, the output of the reconstruction module is the target RGB image.
  • the residual connection in Figure 3 is an optional connection, and the residual connection is used to add the outputs of the two layers to obtain a connection result, which is used as the input of a certain block. If the output of the pooling layer connected to block 1 and the output of the upsampling layer connected to block 5 are added through the residual connection, the resulting connection result is used as the input of block 6. If the output of the pooling layer connected to block 2 and the output of the upsampling layer connected to block 4 are added through the residual connection, the resulting connection result is used as the input of block 5.
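The data flow of Fig. 3, including the optional residual additions, can be traced with a minimal pure-Python sketch. It assumes one 2x2 average-pooling layer after each of blocks 1 to 3 and one nearest-neighbour 2x upsampling layer after each of blocks 4 to 6. The blocks themselves are identity stand-ins for the convolutional layers, and all function names are hypothetical. None of these choices is specified by the application; the sketch only shows why the two residual additions are size-compatible.

```python
def block(x):
    # Stand-in for a block of several convolutional layers (assumption: identity).
    return [row[:] for row in x]

def pool(x):
    # 2x2 average pooling: halves height and width (both must be even).
    h, w = len(x), len(x[0])
    return [[(x[i][j] + x[i][j + 1] + x[i + 1][j] + x[i + 1][j + 1]) / 4.0
             for j in range(0, w, 2)] for i in range(0, h, 2)]

def upsample(x):
    # Nearest-neighbour upsampling: doubles height and width.
    out = []
    for row in x:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(wide[:])
    return out

def add(a, b):
    # Residual connection: element-wise sum of two same-sized feature maps.
    if len(a) != len(b) or len(a[0]) != len(b[0]):
        raise ValueError("residual inputs must have the same size")
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def fig3_forward(image, use_residual=True):
    p1 = pool(block(image))    # block 1 and its pooling layer
    p2 = pool(block(p1))       # block 2 and its pooling layer
    p3 = pool(block(p2))       # block 3 and its pooling layer
    u4 = upsample(block(p3))   # block 4 and its upsampling layer
    x5 = add(p2, u4) if use_residual else u4   # input of block 5
    u5 = upsample(block(x5))   # block 5 and its upsampling layer
    x6 = add(p1, u5) if use_residual else u5   # input of block 6
    return upsample(block(x6)) # block 6 and its upsampling layer
```

With three pooling and three upsampling stages, the output has the same height and width as the input, and each residual addition pairs feature maps of equal size.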
  • before acquiring the target RGBIR image, the method further includes: acquiring a plurality of first training image pairs, where each first training image pair includes a first RGBIR training image and the label image corresponding to the first RGBIR training image, the label image corresponding to the first RGBIR training image is an IR image without visible light interference, and the label image is obtained by performing a first integration operation on the multispectral image corresponding to the first RGBIR training image; and training the first neural network using the plurality of first training image pairs.
  • the image used to train the first neural network may be referred to as the first RGBIR training image.
  • each time, one first training image pair is used to train the first neural network.
  • the first training image pair includes: a first RGBIR training image and a label image corresponding to the first RGBIR training image.
  • the first RGBIR training image in the first training image pair used to train the first neural network is different each time.
  • the label image corresponding to the first RGBIR training image is an IR image without visible light interference, that is, in the label image corresponding to the first RGBIR training image, the pixel value of each pixel does not have a visible light component.
  • the multispectral image corresponding to the first RGBIR training image describes the IR response value, in each of a plurality of bands, of every pixel in the label image corresponding to the first RGBIR training image, so the first integration operation can be performed on that multispectral image. Through the first integration operation, the pixel value of each pixel in the label image corresponding to the first RGBIR training image is determined from the IR response values of that pixel in the plurality of bands. After the pixel value of every pixel has been determined, the label image corresponding to the first RGBIR training image is obtained.
  • the multispectral image corresponding to the first RGBIR training image can be collected at the same time as the first RGBIR training image is collected in advance: the first RGBIR training image and its corresponding multispectral image can be obtained by an RGBIR image sensor and a multispectral image sensor shooting the same object at the same moment.
  • at that moment, the RGBIR image sensor is at the same position as the multispectral image sensor.
  • a first RGBIR training image in a first training image pair is input into the first neural network to obtain a predicted IR image output by the first neural network.
  • based on the predicted IR image and the label image corresponding to the first RGBIR training image, the parameter values of the first neural network are updated.
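The application does not say how the predicted IR image is compared with its label image before the parameters are updated. As a hedged illustration only, the sketch below assumes a mean absolute error (L1) loss over a single-channel image stored as nested lists; the function name `l1_loss` is hypothetical.

```python
def l1_loss(predicted, label):
    """Mean absolute difference between two same-sized single-channel images.

    Assumption: the application does not name a loss function; L1 is used
    here purely as an example of a per-pixel comparison.
    """
    if len(predicted) != len(label) or any(
        len(p) != len(q) for p, q in zip(predicted, label)
    ):
        raise ValueError("predicted and label images must have the same size")
    total = sum(
        abs(p - q)
        for pred_row, label_row in zip(predicted, label)
        for p, q in zip(pred_row, label_row)
    )
    count = sum(len(row) for row in label)
    return total / count
```

A loss of zero means the predicted IR image equals the label image at every pixel position; in a real training setup the gradient of such a loss would drive the parameter update.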
  • because the multispectral image corresponding to the first RGBIR training image describes the IR response value, in each of a plurality of bands, of every pixel in the label image corresponding to the first RGBIR training image, performing the first integration operation on that multispectral image accurately determines the pixel value of each pixel in the label image, so the resulting label image corresponding to the first RGBIR training image has high accuracy.
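The first integration operation can be illustrated with a minimal sketch. The application does not give the integration formula or the band limits, so the sketch assumes trapezoidal integration of each pixel's sampled response over an assumed near-infrared range of 780 nm to 1000 nm, with wavelengths listed in ascending order. The function names and the layout of the multispectral data (`cube[row][col]` holding the per-band responses of one pixel) are hypothetical.

```python
def integrate_response(wavelengths_nm, responses, lo_nm, hi_nm):
    """Trapezoidal integral of one pixel's response over [lo_nm, hi_nm].

    Only samples whose wavelength lies inside the range are used;
    wavelengths_nm is assumed to be sorted in ascending order.
    """
    pts = [(w, r) for w, r in zip(wavelengths_nm, responses) if lo_nm <= w <= hi_nm]
    total = 0.0
    for (w0, r0), (w1, r1) in zip(pts, pts[1:]):
        total += (w1 - w0) * (r0 + r1) / 2.0
    return total

def ir_label_image(wavelengths_nm, cube, lo_nm=780, hi_nm=1000):
    """Build a label image with one integrated IR value per pixel.

    The 780-1000 nm default range is an assumption, not a value from the
    application.
    """
    return [
        [integrate_response(wavelengths_nm, px, lo_nm, hi_nm) for px in row]
        for row in cube
    ]
```

Because samples outside the chosen range are ignored, the resulting label value carries no contribution from visible-band responses, which matches the requirement that the label image be free of visible light interference.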
  • acquiring a plurality of first training image pairs includes: acquiring first RGBIR training images of a plurality of different scenes and the multispectral images corresponding to those first RGBIR training images; for each first RGBIR training image, performing the first integration operation on the multispectral image corresponding to the first RGBIR training image to obtain the label image corresponding to the first RGBIR training image; and combining the first RGBIR training image and its corresponding label image into a first training image pair.
  • a plurality of different scenes may refer to a plurality of scenes with different lighting conditions.
  • the first RGBIR training image of the scene refers to the first RGBIR training image collected in advance in the scene.
  • for each scene, a plurality of first RGBIR training images of the scene can be collected in advance in that scene, and for each first RGBIR training image of the scene, the multispectral image corresponding to the first RGBIR training image can be collected while the first RGBIR training image is collected.
  • for each first RGBIR training image, the first integration operation is performed on the corresponding multispectral image to obtain the label image corresponding to the first RGBIR training image, and the first RGBIR training image and its corresponding label image are combined into a first training image pair.
  • the plurality of first training image pairs may thus include first RGBIR training images of a plurality of different scenes and the label images corresponding to those first RGBIR training images, and the first neural network may be trained with them. After the training of the first neural network is complete, the first neural network is suitable for processing an RGBIR image collected in any one of the plurality of scenes.
  • before acquiring the target RGBIR image, the method further includes: acquiring a plurality of second training image pairs, where each second training image pair includes a second RGBIR training image and the label image corresponding to the second RGBIR training image, the label image corresponding to the second RGBIR training image is an RGB image without near-infrared light interference, and the label image is obtained by performing a second integration operation on the multispectral image corresponding to the second RGBIR training image; and training the second neural network using the plurality of second training image pairs.
  • the image used for training the second neural network may be referred to as the second RGBIR training image.
  • each time, one second training image pair is used to train the second neural network.
  • the second training image pair includes: a second RGBIR training image and a label image corresponding to the second RGBIR training image.
  • the second RGBIR training image in the second training image pair used to train the second neural network is different each time.
  • the label image corresponding to the second RGBIR training image is an RGB image without near-infrared light interference, that is, in the label image corresponding to the second RGBIR training image, the pixel value of each pixel does not have an IR component.
  • the multispectral image corresponding to the second RGBIR training image describes the visible light response value, in each of a plurality of bands, of every pixel in the label image corresponding to the second RGBIR training image, so the second integration operation can be performed on that multispectral image. Through the second integration operation, the pixel value of each pixel in the label image corresponding to the second RGBIR training image is determined from the visible light response values of that pixel in the plurality of bands. After the pixel value of every pixel has been determined, the label image corresponding to the second RGBIR training image is obtained.
  • the multispectral image corresponding to the second RGBIR training image can be collected at the same time as the second RGBIR training image is collected in advance: the second RGBIR training image and its corresponding multispectral image can be obtained by an RGBIR image sensor and a multispectral image sensor shooting the same object at the same moment.
  • at that moment, the RGBIR image sensor is at the same position as the multispectral image sensor.
  • a second RGBIR training image in a second training image pair is input into the second neural network to obtain a predicted RGB image output by the second neural network.
  • backpropagation is performed to update the parameter values of the parameters of the second neural network.
  • because the multispectral image corresponding to the second RGBIR training image describes the visible light response value, in each of a plurality of bands, of every pixel in the label image corresponding to the second RGBIR training image, performing the second integration operation on that multispectral image accurately determines the pixel value of each pixel in the label image, so the resulting label image corresponding to the second RGBIR training image has high accuracy.
  • acquiring a plurality of second training image pairs includes: acquiring second RGBIR training images of a plurality of different scenes and the multispectral images corresponding to those second RGBIR training images; for each second RGBIR training image, performing the second integration operation on the multispectral image corresponding to the second RGBIR training image to obtain the label image corresponding to the second RGBIR training image; and combining the second RGBIR training image and its corresponding label image into a second training image pair.
  • the multiple scenes may refer to multiple scenes with different lighting conditions.
  • the second RGBIR training image of the scene refers to the second RGBIR training image collected in advance in the scene.
  • for each scene, a plurality of second RGBIR training images of the scene can be collected in advance in that scene, and for each second RGBIR training image of the scene, the multispectral image corresponding to the second RGBIR training image can be collected while the second RGBIR training image is collected.
  • the second integration operation is performed on the multispectral image corresponding to the second RGBIR training image to obtain the label image corresponding to the second RGBIR training image, and the second RGBIR training image and The label images corresponding to the second RGBIR training image are combined into a second training image pair.
  • the plurality of second training image pairs may thus include second RGBIR training images of a plurality of different scenes and the label images corresponding to those second RGBIR training images, and the second neural network may be trained with them. After the training of the second neural network is complete, the second neural network is suitable for processing an RGBIR image collected in any one of the plurality of scenes.
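Assembling training image pairs across several scenes, as described above, can be sketched as follows. The record layout, the function names and the `make_label` callable (standing in for the integration operation applied to a multispectral image) are all hypothetical; the application does not prescribe any data structure.

```python
def build_training_pairs(captures, make_label):
    """Turn simultaneous captures into (input, label) training pairs.

    captures: iterable of (scene, rgbir_image, multispectral_image) tuples,
    where both images of one tuple were shot at the same moment.
    make_label: callable that derives the label image from the multispectral
    image (a stand-in for the integration operation).
    """
    pairs = []
    for scene, rgbir_image, multispectral_image in captures:
        label = make_label(multispectral_image)
        pairs.append({"scene": scene, "input": rgbir_image, "label": label})
    return pairs

def scenes_covered(pairs):
    # Set of scenes (for example, lighting conditions) represented in the pairs.
    return {pair["scene"] for pair in pairs}
```

Keeping the scene with each pair makes it easy to check that the training set spans all the lighting conditions the trained network is expected to handle.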
  • FIG. 4 shows a structural block diagram of a demosaic device provided by an embodiment of the present application.
  • the demosaic device includes: an acquisition unit 401 and a demosaic unit 402 .
  • the acquisition unit 401 is configured to acquire a target RGBIR image; wherein, the target RGBIR image is a preprocessed RGBIR image, and the preprocessing includes dark level compensation;
  • the demosaic unit 402 is configured to use a first neural network to perform first processing on the target RGBIR image to obtain a target IR image; and/or use a second neural network to perform second processing on the target RGBIR image to obtain a target RGB image;
  • the first processing includes: eliminating the visible light component in the pixel values corresponding to the IR channel pixels in the target RGBIR image, and predicting the pixel values of the color channel pixels in the IR channel; the second processing includes: for each color channel, eliminating the near-infrared light component in the pixel values corresponding to the pixels of that color channel in the target RGBIR image, and predicting the pixel values in that color channel of the pixels other than the pixels of that color channel.
  • the demosaic unit 402 is further configured to perform feature extraction processing on the target RGBIR image to obtain a first feature, the first feature including near-infrared band information of the target RGBIR image; and to perform IR image reconstruction processing, the IR image reconstruction processing including: based on the first feature, eliminating the visible light component in the pixel values corresponding to the IR channel pixels to obtain the near-infrared light component of those pixel values; and, based on that near-infrared light component, predicting the pixel values of the color channel pixels in the IR channel.
  • the demosaic unit 402 is further configured to perform feature extraction processing on the target RGBIR image to obtain a second feature, the second feature including visible light band information of the target RGBIR image; and to perform RGB image reconstruction processing, the RGB image reconstruction processing including: for each color channel, based on the second feature, eliminating the near-infrared light component in the pixel values corresponding to the pixels of that color channel in the target RGBIR image to obtain the visible light component of those pixel values; and, based on that visible light component, predicting the pixel values in that color channel of the pixels other than the pixels of that color channel.
  • the size of the target IR image matches the size of the target RGBIR image
  • the size of the target RGB image matches the size of the target RGBIR image
  • the demosaicing device includes:
  • the first training unit is configured to acquire a plurality of first training image pairs before the target RGBIR image is acquired, where each first training image pair includes a first RGBIR training image and the label image corresponding to the first RGBIR training image, the label image is an IR image without visible light interference, and the label image is obtained by performing a first integration operation on the multispectral image corresponding to the first RGBIR training image; and to train the first neural network using the plurality of first training image pairs.
  • the first training unit is further configured to acquire first RGBIR training images of a plurality of different scenes and the multispectral images corresponding to those first RGBIR training images; for each first RGBIR training image, perform the first integration operation on the multispectral image corresponding to the first RGBIR training image to obtain the label image corresponding to the first RGBIR training image; and combine the first RGBIR training image and its corresponding label image into a first training image pair.
  • the demosaic device includes:
  • the second training unit is configured to acquire a plurality of second training image pairs before the target RGBIR image is acquired, where each second training image pair includes a second RGBIR training image and the label image corresponding to the second RGBIR training image, the label image corresponding to the second RGBIR training image is an RGB image without near-infrared light interference, and the label image is obtained by performing a second integration operation on the multispectral image corresponding to the second RGBIR training image; and to train the second neural network using the plurality of second training image pairs.
  • the second training unit is further configured to acquire second RGBIR training images of a plurality of different scenes and the multispectral images corresponding to those second RGBIR training images; for each second RGBIR training image, perform the second integration operation on the multispectral image corresponding to the second RGBIR training image to obtain the label image corresponding to the second RGBIR training image; and combine the second RGBIR training image and its corresponding label image into a second training image pair.
  • the demosaic device also includes:
  • a preprocessing unit configured to acquire an original RGBIR image; perform the preprocessing on the original RGBIR image to obtain the target RGBIR image.
  • any one of the steps and specific operations in any one of the embodiments of the demosaicing method provided in this application can be completed by corresponding units in the demosaicing device.
  • For the corresponding operation process completed by each unit in the demosaicing device refer to the corresponding operation process described in the embodiment of the demosaicing method.
  • Demosaicing is completed by the demosaicing device, and the target IR image and/or the target RGB image can be obtained.
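The "and/or" relationship between the two networks can be made concrete with a small sketch. The function name `demosaic` and the convention of returning `None` for an output that was not requested are assumptions for illustration; the networks are passed in as plain callables and stand in for the trained first and second neural networks.

```python
def demosaic(target_rgbir, first_net=None, second_net=None):
    """Run the first and/or second processing on a preprocessed RGBIR image.

    first_net maps the target RGBIR image to a target IR image;
    second_net maps it to a target RGB image. At least one must be given.
    Returns (target_ir, target_rgb), with None for an output not produced.
    """
    if first_net is None and second_net is None:
        raise ValueError("at least one of first_net and second_net is required")
    target_ir = first_net(target_rgbir) if first_net is not None else None
    target_rgb = second_net(target_rgbir) if second_net is not None else None
    return target_ir, target_rgb
```

Both networks receive the same target RGBIR image, so either output can be produced independently of the other.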
  • during demosaicing, the component that acts as interference information in the pixel values corresponding to the pixels of the relevant channel in the target RGBIR image is eliminated (the interfering component being the visible light component or the near-infrared light component), and the pixel values of the relevant pixels in the relevant channel are predicted.
  • pixel values carrying the interfering component are not used for interpolation, so no interpolation error is generated.
  • demosaicing is completed automatically by the first neural network and/or the second neural network, without relying on expert experience or manual correction, so no correction error is generated. This avoids the adverse effects of interpolation errors and correction errors on demosaicing, improves the image quality of the resulting target IR image and/or target RGB image, and provides favorable conditions for subsequently generating IR images with a high signal-to-noise ratio and good color reproduction and/or RGB images with good color reproduction.
  • the first neural network and/or the second neural network can eliminate the interfering component in the pixel values corresponding to the pixels of any channel in the target RGBIR image and predict the pixel value of any pixel of the target RGBIR image in the relevant channel; they are not affected by the color filter array of the sensor and are suitable for demosaicing RGBIR images collected by any RGBIR image sensor.
  • Fig. 5 is a structural block diagram of an electronic device provided in this embodiment.
  • the electronic device includes a processing component 522 , which further includes one or more processors, and a memory resource, represented by memory 532 , for storing instructions executable by the processing component 522 , such as application programs.
  • the application program stored in memory 532 may include one or more modules each corresponding to a set of instructions.
  • the processing component 522 is configured to execute instructions to perform the above method.
  • the electronic device may also include a power supply component 526 configured to perform power management of the electronic device, a wired or wireless network interface 550 configured to connect the electronic device to a network, and an input-output (I/O) interface 558.
  • the electronic device can operate based on an operating system stored in the memory 532, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™ or the like.
  • a storage medium including instructions is also provided, such as a memory including instructions, where the instructions can be executed by an electronic device to complete the above method.
  • the storage medium may be a non-transitory computer readable storage medium, such as a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, etc.

Abstract

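Embodiments of the present application provide a demosaicing method, an electronic device and a storage medium. The method includes: acquiring a target RGBIR image; performing first processing on the target RGBIR image using a first neural network to obtain a target IR image; and/or performing second processing on the target RGBIR image using a second neural network to obtain a target RGB image; wherein the first processing includes: eliminating the visible light component in the pixel values corresponding to the IR channel pixels in the target RGBIR image, and predicting the pixel values of the color channel pixels in the IR channel; and the second processing includes: for each color channel, eliminating the near-infrared light component in the pixel values corresponding to the pixels of that color channel in the target RGBIR image, and predicting the pixel values in that color channel of the pixels other than the pixels of that color channel.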

Description

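Demosaicing method, electronic device and storage medium
This application claims priority to Chinese patent application No. 202110919874.7, entitled "Demosaicing method, apparatus, electronic device and storage medium", filed with the China Patent Office on August 11, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
The field of image processing.
Background
An RGBIR image sensor is a sensor that can sense infrared information and visible light information at the same time. An RGBIR image sensor replaces pixels covered by green filters in the Bayer array with pixels covered by near-infrared filters, so that a single image sensor senses information in both the visible light band and the near-infrared band. Computer vision tasks, however, require a target RGB image and/or a target IR image, so the target RGB image and/or the target IR image must be obtained from the RGBIR image collected by the RGBIR image sensor. Obtaining the target RGB image and/or the target IR image from the RGBIR image collected by the RGBIR image sensor is called demosaicing.
At present, the commonly used demosaicing algorithm interpolates the RGBIR image collected by the RGBIR image sensor to generate an RGB image and an IR (Infrared Radiation, near-infrared light) image. Color correction is then performed on the interpolated RGB image and/or the interpolated IR image to eliminate the interference information in them, which completes demosaicing and yields the target RGB image and the target IR image.
The commonly used demosaicing algorithm introduces large interpolation errors, resulting in low quality of the target RGB image and/or the target IR image. In addition, the correction is performed manually and relies heavily on expert experience, so correction errors are hard to avoid, which also lowers the quality of the target RGB image and the target IR image. The commonly used demosaicing algorithm is strongly tied to the color filter array of the RGBIR sensor: a demosaicing algorithm suited to the color filter array of one RGBIR sensor model cannot be applied to the color filter array of another RGBIR sensor model. A demosaicing algorithm must therefore be developed separately for each RGBIR sensor model, resulting in high cost.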
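Summary
Embodiments of the present application provide a demosaicing method, an electronic device and a storage medium.
An embodiment of the present application provides a demosaicing method, including:
acquiring a target RGBIR image, wherein the target RGBIR image is a preprocessed RGBIR image and the preprocessing includes dark level compensation;
performing first processing on the target RGBIR image using a first neural network to obtain a target IR image; and/or performing second processing on the target RGBIR image using a second neural network to obtain a target RGB image; wherein the first processing includes: eliminating the visible light component in the pixel values corresponding to the IR channel pixels in the target RGBIR image, and predicting the pixel values of the color channel pixels in the IR channel; and the second processing includes: for each color channel, eliminating the near-infrared light component in the pixel values corresponding to the pixels of that color channel in the target RGBIR image, and predicting the pixel values in that color channel of the pixels other than the pixels of that color channel.
An embodiment of the present application further provides an electronic device, including:
a processor;
a memory for storing instructions executable by the processor;
wherein the processor is configured to execute the instructions to implement the above demosaicing method.
An embodiment of the present application further provides a storage medium; when instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the above demosaicing method.
An embodiment of the present application further provides a computer program product, including a computer program/instructions which, when executed by a processor, implement the above demosaicing method.
The demosaicing method provided by the embodiments of the present application uses the first neural network and/or the second neural network to demosaic the target RGBIR image and obtain the target IR image and/or the target RGB image. During demosaicing, the component that acts as interference information in the pixel values corresponding to the pixels of the relevant channel in the target RGBIR image is eliminated (the interfering component being the visible light component or the near-infrared light component), and the pixel values of the relevant pixels in the relevant channel are predicted. Pixel values carrying the interfering component are not used for interpolation, so no interpolation error is generated. Moreover, demosaicing is completed automatically by the first neural network and/or the second neural network, without relying on expert experience or manual correction, so no correction error is generated. The adverse effects of interpolation errors and correction errors on demosaicing are thereby avoided, the image quality of the resulting target IR image and/or target RGB image is improved, and favorable conditions are provided for subsequently generating IR images with a high signal-to-noise ratio and good color reproduction and/or RGB images with good color reproduction. Furthermore, the first neural network and/or the second neural network can eliminate the interfering component in the pixel values corresponding to the pixels of any channel in the target RGBIR image and predict the pixel value of any pixel of the target RGBIR image in the relevant channel; they are not affected by the color filter array of the sensor and are suitable for demosaicing RGBIR images collected by any RGBIR image sensor.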
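Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and, together with the specification, serve to explain the principles of the present application.
FIG. 1 shows a flowchart of the demosaicing method provided by an embodiment of the present application;
FIG. 2 shows a schematic diagram of one structure suitable for the first processing or the second processing;
FIG. 3 shows a schematic diagram of another structure suitable for the first processing or the second processing;
FIG. 4 shows a structural block diagram of the demosaicing device provided by an embodiment of the present application;
FIG. 5 shows a structural block diagram of the electronic device provided by an embodiment of the present application.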
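Detailed Description
The present application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here merely explain the relevant invention and do not limit it. It should also be noted that, for ease of description, only the parts related to the invention are shown in the drawings.
It should be noted that, where no conflict arises, the embodiments in the present application and the features in the embodiments may be combined with one another. The present application is described in detail below with reference to the drawings and in conjunction with the embodiments.
FIG. 1 shows a flowchart of the demosaicing method provided by an embodiment of the present application. The method includes:
Step 101: acquire a target RGBIR image.
In the present application, the target RGBIR image is a preprocessed RGBIR image, and the preprocessing includes dark level compensation.
The preprocessed RGBIR image may be obtained by a preprocessing device performing dark level compensation in advance on the RGBIR image collected by the RGBIR sensor. Because the RGBIR sensor itself has a dark level, the RGBIR image it collects is affected by that dark level, and this influence needs to be eliminated; dark level compensation of the collected RGBIR image yields the preprocessed RGBIR image.
In some embodiments, acquiring the target RGBIR image includes: acquiring an original RGBIR image;
performing preprocessing on the original RGBIR image to obtain the target RGBIR image. The original RGBIR image is the RGBIR image collected by the RGBIR sensor. When the original RGBIR image is preprocessed, dark level compensation may be performed on it to obtain a dark-level-compensated RGBIR image, which may be used as the target RGBIR image.
Dark level compensation of the original RGBIR image can be expressed as: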
Figure PCTCN2022111227-appb-000001
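where BLCbayer denotes the dark level of the RGBIR sensor, which may be provided by the RGBIR sensor manufacturer.
R, G, B and IR on the right-hand side of the equation denote the original pixel values of the pixels in the R channel, the G channel, the B channel and the IR channel, respectively; R, G, B and IR on the left-hand side denote the dark-level-compensated pixel values of the R channel pixels, the G channel pixels, the B channel pixels and the IR channel pixels, respectively.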
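The published equation is only available as an image and is not reproduced here. As a hedged sketch of what dark level compensation commonly looks like, the code below assumes that a single scalar dark level is subtracted from every raw sample, regardless of channel, and that negative results are clamped to zero. Both the clamping and the function name `dark_level_compensate` are assumptions, not details taken from the application.

```python
def dark_level_compensate(raw, blc_bayer):
    """Subtract the sensor dark level from every raw sample.

    raw: mosaic image as a list of rows of raw pixel values.
    blc_bayer: scalar dark level of the sensor (for example, a value
    supplied by the sensor manufacturer).
    Assumption: results below zero are clamped to zero.
    """
    return [[max(value - blc_bayer, 0) for value in row] for row in raw]
```

For example, with a dark level of 64, a raw sample of 70 becomes 6 and a raw sample of 10 becomes 0 under the clamping assumption.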
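In the present application, an R pixel corresponds to a red filter in the array of the RGBIR sensor: the position of an R pixel in the RGBIR image is the same as the position of its corresponding red filter in the array of the RGBIR sensor. Likewise, a G pixel corresponds to a green filter, a B pixel corresponds to a blue filter, and an IR pixel corresponds to a near-infrared filter in the array of the RGBIR sensor, and in each case the position of the pixel in the RGBIR image is the same as the position of its corresponding filter in the array of the RGBIR sensor.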
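The correspondence between pixel position and filter position can be sketched with a repeating filter tile. The 4x4 layout below is a hypothetical RGB-IR pattern chosen for illustration; the application does not fix a particular color filter array, and the names `CFA_TILE`, `channel_at` and `channel_mask` are likewise assumptions.

```python
# Hypothetical 4x4 RGB-IR filter tile; real sensors may use other layouts.
CFA_TILE = [
    ["B", "G", "R", "G"],
    ["G", "IR", "G", "IR"],
    ["R", "G", "B", "G"],
    ["G", "IR", "G", "IR"],
]

def channel_at(row, col, tile=CFA_TILE):
    # The channel of a pixel is the filter at the same position in the
    # sensor array, which repeats the tile in both directions.
    return tile[row % len(tile)][col % len(tile[0])]

def channel_mask(height, width, channel, tile=CFA_TILE):
    # True where the pixel at (row, col) belongs to the given channel.
    return [
        [channel_at(r, c, tile) == channel for c in range(width)]
        for r in range(height)
    ]
```

A mask such as `channel_mask(height, width, "IR")` identifies the IR channel pixels, whose measured values are corrected by the first processing, while the remaining positions are the ones whose IR values are predicted.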
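In a specific implementation, the preprocessing may further include image normalization. Image normalization makes the pixel values of the resulting image follow a certain distribution, which makes that image easier to process for the neural network that handles it. Image normalization may be performed on the dark-level-compensated RGBIR image to obtain the target RGBIR image.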
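The application does not state which distribution the normalized pixel values should follow. One common choice, assumed here purely for illustration, is to standardize the image to zero mean and unit variance; the function name `standardize` is hypothetical.

```python
import math

def standardize(image):
    """Shift and scale pixel values to zero mean and unit variance.

    Assumption: the application only says the values should follow
    "a certain distribution"; standardization is one possible choice.
    A constant image is mapped to all zeros to avoid dividing by zero.
    """
    values = [v for row in image for v in row]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(variance)
    if std == 0:
        return [[0.0 for _ in row] for row in image]
    return [[(v - mean) / std for v in row] for row in image]
```

In the preprocessing order described above, this step would be applied to the dark-level-compensated image, and its output would serve as the target RGBIR image.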
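Step 102: perform first processing on the target RGBIR image using the first neural network to obtain the target IR image, and/or perform second processing on the target RGBIR image using the second neural network to obtain the target RGB image.
In the present application, the first neural network may be a convolutional neural network. The first processing includes: eliminating the visible light component in the pixel values corresponding to the IR channel pixels in the target RGBIR image, and predicting the pixel values of the color channel pixels in the IR channel.
Because of crosstalk between visible light and near-infrared light, the pixel value of an IR channel pixel in an RGBIR image consists of a part affected by near-infrared light, i.e. the near-infrared light component, and a part affected by visible light, i.e. the visible light component. Eliminating the visible light component in the pixel value corresponding to an IR channel pixel yields the near-infrared light component of that pixel value.
In the present application, the pixel value of every pixel in the target IR image has no visible light component.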
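For each IR channel pixel in the target RGBIR image, the target IR image includes a corresponding pixel whose pixel value is the near-infrared light component of the pixel value corresponding to that IR channel pixel, and whose position in the target IR image is the same as the position of that IR channel pixel in the target RGBIR image.
For each R channel pixel in the target RGBIR image, the target IR image includes a corresponding pixel whose pixel value is the predicted pixel value of that R channel pixel in the IR channel, and whose position in the target IR image is the same as the position of that R channel pixel in the target RGBIR image.
For each G channel pixel in the target RGBIR image, the target IR image includes a corresponding pixel whose pixel value is the predicted pixel value of that G channel pixel in the IR channel, and whose position in the target IR image is the same as the position of that G channel pixel in the target RGBIR image.
For each B channel pixel in the target RGBIR image, the target IR image includes a corresponding pixel whose pixel value is the predicted pixel value of that B channel pixel in the IR channel, and whose position in the target IR image is the same as the position of that B channel pixel in the target RGBIR image.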
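In the present application, before the first neural network is used to obtain the target IR image, the first neural network is trained in advance using a plurality of image pairs for training the first neural network.
An image pair for training the first neural network includes an RGBIR image and an IR image without visible light interference. The IR image without visible light interference may be collected by a sensor used for collecting IR images without visible light interference, for example an IR sensor. For each image pair for training the first neural network, the RGBIR image and the IR image without visible light interference in the pair may be obtained by an RGBIR image sensor and the sensor used for collecting IR images without visible light interference shooting the same object at the same moment. At that moment, the two sensors are at the same position: the RGBIR image sensor shoots the object to obtain the RGBIR image of the pair, and the sensor used for collecting IR images without visible light interference shoots the same object to obtain the IR image without visible light interference of the pair.
Each time, one image pair for training the first neural network is used to train the first neural network, and a different image pair is used each time. In each training iteration, the RGBIR image of the pair serves as the input of the first neural network, and the IR image without visible light interference of the pair serves as the label, i.e. the ground truth. The first neural network learns the relationship between the RGBIR image and the IR image without visible light interference; specifically, it can learn the relationship between the pixel values of pixels at the same position in the RGBIR image and in the IR image without visible light interference.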
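In the present application, the second neural network may be a convolutional neural network. The second processing includes: for each color channel, eliminating the near-infrared light component in the pixel values corresponding to the pixels of that color channel in the target RGBIR image, and predicting the pixel values in that color channel of the pixels other than the pixels of that color channel.
The R channel, the G channel and the B channel are all color channels. Because of crosstalk between visible light and near-infrared light, the pixel values of the R channel pixels, G channel pixels and B channel pixels in an RGBIR image consist of a part affected by visible light, i.e. the visible light component, and a part affected by near-infrared light, i.e. the near-infrared light component.
Eliminating the near-infrared light component in the pixel value corresponding to an R channel pixel yields the visible light component of that pixel value. Eliminating the near-infrared light component in the pixel value corresponding to a G channel pixel yields the visible light component of that pixel value. Eliminating the near-infrared light component in the pixel value corresponding to a B channel pixel yields the visible light component of that pixel value.
In the present application, the second processing yields a target R channel image, a target G channel image and a target B channel image, which together make up the target RGB image.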
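For each R-channel pixel in the target RGBIR image, the target R-channel image includes a corresponding pixel whose pixel value is the visible-light component of the pixel value of that R-channel pixel and whose position in the target R-channel image is the same as the position of that R-channel pixel in the target RGBIR image.
For each G-channel pixel in the target RGBIR image, the target R-channel image includes a corresponding pixel whose pixel value is the predicted pixel value of that G-channel pixel in the R channel and whose position in the target R-channel image is the same as the position of that G-channel pixel in the target RGBIR image.
For each B-channel pixel in the target RGBIR image, the target R-channel image includes a corresponding pixel whose pixel value is the predicted pixel value of that B-channel pixel in the R channel and whose position in the target R-channel image is the same as the position of that B-channel pixel in the target RGBIR image.
For each IR-channel pixel in the target RGBIR image, the target R-channel image includes a corresponding pixel whose pixel value is the predicted pixel value of that IR-channel pixel in the R channel and whose position in the target R-channel image is the same as the position of that IR-channel pixel in the target RGBIR image.
For each G-channel pixel in the target RGBIR image, the target G-channel image includes a corresponding pixel whose pixel value is the visible-light component of the pixel value of that G-channel pixel and whose position in the target G-channel image is the same as the position of that G-channel pixel in the target RGBIR image.
For each R-channel pixel in the target RGBIR image, the target G-channel image includes a corresponding pixel whose pixel value is the predicted pixel value of that R-channel pixel in the G channel and whose position in the target G-channel image is the same as the position of that R-channel pixel in the target RGBIR image.
For each B-channel pixel in the target RGBIR image, the target G-channel image includes a corresponding pixel whose pixel value is the predicted pixel value of that B-channel pixel in the G channel and whose position in the target G-channel image is the same as the position of that B-channel pixel in the target RGBIR image.
For each IR-channel pixel in the target RGBIR image, the target G-channel image includes a corresponding pixel whose pixel value is the predicted pixel value of that IR-channel pixel in the G channel and whose position in the target G-channel image is the same as the position of that IR-channel pixel in the target RGBIR image.
For each B-channel pixel in the target RGBIR image, the target B-channel image includes a corresponding pixel whose pixel value is the visible-light component of the pixel value of that B-channel pixel and whose position in the target B-channel image is the same as the position of that B-channel pixel in the target RGBIR image.
For each R-channel pixel in the target RGBIR image, the target B-channel image includes a corresponding pixel whose pixel value is the predicted pixel value of that R-channel pixel in the B channel and whose position in the target B-channel image is the same as the position of that R-channel pixel in the target RGBIR image.
For each G-channel pixel in the target RGBIR image, the target B-channel image includes a corresponding pixel whose pixel value is the predicted pixel value of that G-channel pixel in the B channel and whose position in the target B-channel image is the same as the position of that G-channel pixel in the target RGBIR image.
For each IR-channel pixel in the target RGBIR image, the target B-channel image includes a corresponding pixel whose pixel value is the predicted pixel value of that IR-channel pixel in the B channel and whose position in the target B-channel image is the same as the position of that IR-channel pixel in the target RGBIR image.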
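The position-by-position rules for the three target channel images can be summarized in a small sketch. The 4x4 filter tile `CFA_TILE` and the function name `provenance` are hypothetical and chosen only for illustration; the application does not fix a particular color filter array. For a given target channel, the sketch marks each position as either holding the visible-light component of the measured value ("visible") or holding a predicted value ("predicted").

```python
# Hypothetical 4x4 RGB-IR colour filter array tile (an assumption; real sensors differ).
CFA_TILE = [
    ["B", "G", "R", "G"],
    ["G", "IR", "G", "IR"],
    ["R", "G", "B", "G"],
    ["G", "IR", "G", "IR"],
]


def provenance(tile, channel):
    """Label every position of one target colour plane by how its value arises.

    "visible":   the site belongs to `channel`, so the target plane holds the
                 visible-light component of the measured pixel value.
    "predicted": the site belongs to another channel (including IR), so the
                 target plane holds a value predicted by the network.
    """
    return [
        ["visible" if site == channel else "predicted" for site in row]
        for row in tile
    ]


if __name__ == "__main__":
    for ch in ("R", "G", "B"):
        print(ch)
        for row in provenance(CFA_TILE, ch):
            print("  ", row)
```

Every target plane has the same height and width as the tile, which reflects the requirement that each output pixel sits at the same position as its source pixel.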
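In the present application, before the second neural network is used to obtain the target RGB image, the second neural network is trained in advance with a plurality of image pairs for training the second neural network. Each such image pair includes an RGBIR image and an RGB image free of near-infrared interference. The RGB image free of near-infrared interference is captured by a sensor for capturing RGB images free of near-infrared interference, for example an RGB sensor.
For each image pair used to train the second neural network, the RGBIR image and the RGB image free of near-infrared interference may be obtained by an RGBIR image sensor and a sensor for capturing RGB images free of near-infrared interference photographing the same object at the same moment. At that moment, the RGBIR image sensor is at the same position as the sensor for capturing RGB images free of near-infrared interference. At that moment, the RGBIR image sensor photographs the object to obtain the RGBIR image of the pair, and the other sensor photographs the same object to obtain the RGB image free of near-infrared interference of the pair.
Each training iteration of the second neural network uses one image pair, and a different image pair is used in each iteration. In each iteration, the RGBIR image of the pair serves as the input of the second neural network, and the RGB image free of near-infrared interference serves as the label. During training, the second neural network learns the association between RGBIR images and RGB images free of near-infrared interference; specifically, it may learn the association between the pixel values of pixels at the same position in the RGBIR image and in the RGB image free of near-infrared interference.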
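In the present application, demosaicing is performed with the first neural network and/or the second neural network to obtain the target IR image and/or the target RGB image. During demosaicing, the component that acts as interference (the visible-light component or the near-infrared component) is eliminated from the pixel values of the pixels of the corresponding channel in the target RGBIR image, and the pixel values of the corresponding pixels in the corresponding channel are predicted. Because no interpolation is performed on pixel values that carry the interfering component, no interpolation error arises. Moreover, demosaicing is performed automatically by the first neural network and/or the second neural network, with no reliance on expert experience or manual correction, so no correction error arises. The adverse effects of interpolation error and correction error on demosaicing are thereby avoided, the image quality of the resulting target IR image and/or target RGB image is improved, and favorable conditions are provided for subsequently generating an IR image with a high signal-to-noise ratio and good color reproduction and/or an RGB image with good color reproduction. In addition, the first neural network and/or the second neural network can eliminate the interfering component from the pixel value of any channel pixel in the target RGBIR image and predict the pixel value of any pixel of the target RGBIR image in the corresponding channel. This does not depend on the color filter array of the sensor, so the method is suitable for demosaicing RGBIR images captured by any RGBIR image sensor.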
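In some embodiments, the size of the target IR image is the same as the size of the target RGBIR image, and the size of the target RGB image is the same as the size of the target RGBIR image.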
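In some embodiments, eliminating the visible-light component from the pixel values of the IR-channel pixels in the target RGBIR image and predicting the pixel values of the color-channel pixels in the IR channel includes: performing feature extraction on the target RGBIR image to obtain a first feature, the first feature including near-infrared band information of the target RGBIR image; and performing IR image reconstruction, which includes: based on the first feature, eliminating the visible-light component from the pixel values of the IR-channel pixels to obtain the near-infrared component of those pixel values; and, based on the near-infrared component of the pixel values of the IR-channel pixels, predicting the pixel values of the color-channel pixels in the IR channel.
In the present application, the first neural network may include a feature extraction module and a reconstruction module. The feature extraction module of the first neural network may perform the feature extraction on the target RGBIR image to obtain the first feature, and the reconstruction module of the first neural network may perform the IR image reconstruction. The feature extraction module of the first neural network may include blocks and a pooling layer, and a block may include a plurality of convolutional layers. The reconstruction module of the first neural network may include blocks and an upsampling layer. The input of the feature extraction module of the first neural network is the target RGBIR image and its output is the first feature; the input of the reconstruction module of the first neural network is the first feature and its output is the target IR image.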
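In some embodiments, for each color channel, eliminating the near-infrared component from the pixel values of the pixels of that color channel in the target RGBIR image and predicting the pixel values, in that color channel, of the pixels other than the pixels of that color channel includes: performing feature extraction on the target RGBIR image to obtain a second feature, the second feature including visible-light band information of the target RGBIR image; and performing RGB image reconstruction, which includes: for each color channel, based on the second feature, eliminating the near-infrared component from the pixel values of the pixels of that color channel in the target RGBIR image to obtain the visible-light component of those pixel values; and, based on the visible-light component of the pixel values of the pixels of that color channel, predicting the pixel values, in that color channel, of the pixels other than the pixels of that color channel.
In the present application, the second neural network may include a feature extraction module and a reconstruction module. The feature extraction module of the second neural network may perform the feature extraction on the target RGBIR image to obtain the second feature, and the reconstruction module of the second neural network may perform the RGB image reconstruction. The feature extraction module of the second neural network may include blocks and a pooling layer, and a block may include a plurality of convolutional layers. The reconstruction module of the second neural network may include blocks and an upsampling layer. The input of the feature extraction module of the second neural network is the target RGBIR image and its output is the second feature; the input of the reconstruction module of the second neural network is the second feature and its output is the target RGB image.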
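Refer to FIG. 2, which shows a schematic diagram of one structure suitable for the first processing or the second processing.
If the structure in FIG. 2 is used for the first processing, the feature extraction module in FIG. 2 is the feature extraction module of the first neural network, and the reconstruction module in FIG. 2 is the reconstruction module of the first neural network. If the structure in FIG. 2 is used for the second processing, the feature extraction module in FIG. 2 is the feature extraction module of the second neural network, and the reconstruction module in FIG. 2 is the reconstruction module of the second neural network. The feature extraction module includes block 1, block 2, and a pooling layer. The reconstruction module includes block 3, block 4, and an upsampling layer. The input of the feature extraction module, that is, the input of block 1, is the RGBIR image. If the structure in FIG. 2 is used for the first processing, the output of the reconstruction module, that is, the output of block 4, is the target IR image. If the structure in FIG. 2 is used for the second processing, the output of the reconstruction module is the target RGB image.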
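The FIG. 2 structure can be checked with a simple trace of spatial sizes. This is a sketch under assumptions not stated in the application: the convolutions inside a block are taken to preserve height and width, the pooling layer is taken to halve them, the upsampling layer is taken to double them, and the layer order (block 1, pooling, block 2, block 3, upsampling, block 4) follows the arrangement in which the pooling layer sits between the two feature extraction blocks and the upsampling layer sits between the two reconstruction blocks. All function names are hypothetical.

```python
def conv_block(size):
    """Assumption: 'same'-padded convolutions keep height and width unchanged."""
    return size


def pool(size, factor=2):
    """Assumption: the pooling layer divides height and width by `factor`."""
    h, w = size
    return (h // factor, w // factor)


def upsample(size, factor=2):
    """Assumption: the upsampling layer multiplies height and width by `factor`."""
    h, w = size
    return (h * factor, w * factor)


def figure2_trace(size):
    """Return (layer name, spatial size) after each stage of the FIG. 2 structure."""
    trace = [("input", size)]
    # Feature extraction module: block 1 -> pooling -> block 2
    size = conv_block(size)
    trace.append(("block1", size))
    size = pool(size)
    trace.append(("pool", size))
    size = conv_block(size)
    trace.append(("block2", size))
    # Reconstruction module: block 3 -> upsampling -> block 4
    size = conv_block(size)
    trace.append(("block3", size))
    size = upsample(size)
    trace.append(("upsample", size))
    size = conv_block(size)
    trace.append(("block4", size))
    return trace


if __name__ == "__main__":
    for name, size in figure2_trace((8, 8)):
        print(name, size)
```

With one pooling layer and one upsampling layer of the same factor, the output of block 4 has the same size as the input, which matches the embodiment in which the target image has the same size as the target RGBIR image.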
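Refer to FIG. 3, which shows a schematic diagram of another structure suitable for the first processing or the second processing.
If the structure in FIG. 3 is used for the first processing, the feature extraction module in FIG. 3 is the feature extraction module of the first neural network, and the reconstruction module in FIG. 3 is the reconstruction module of the first neural network. If the structure in FIG. 3 is used for the second processing, the feature extraction module in FIG. 3 is the feature extraction module of the second neural network, and the reconstruction module in FIG. 3 is the reconstruction module of the second neural network.
The feature extraction module includes block 1, block 2, block 3, and a plurality of pooling layers. The reconstruction module includes block 4, block 5, block 6, and a plurality of upsampling layers. The input of the feature extraction module, that is, the input of block 1, is the RGBIR image. If the structure in FIG. 3 is used for the first processing, the output of the reconstruction module, that is, the output of the upsampling layer connected to block 6, is the target IR image. If the structure in FIG. 3 is used for the second processing, the output of the reconstruction module is the target RGB image.
The residual connections in FIG. 3 are optional. A residual connection adds the outputs of two layers to obtain a connection result, which serves as the input of a block. If a residual connection adds the output of the pooling layer connected to block 1 and the output of the upsampling layer connected to block 5, the connection result serves as the input of block 6. If a residual connection adds the output of the pooling layer connected to block 2 and the output of the upsampling layer connected to block 4, the connection result serves as the input of block 5.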
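The data flow of FIG. 3, including the two optional residual connections, can be sketched on a single-channel feature map. The sketch makes several assumptions that the application does not state: each block is replaced by an identity placeholder, pooling is 2x2 average pooling, upsampling is 2x nearest-neighbour, and the input height and width are multiples of 8. All function names are hypothetical.

```python
def block(x):
    """Placeholder for a block of several convolutional layers (identity here)."""
    return [row[:] for row in x]


def pool2(x):
    """2x2 average pooling; assumes even height and width."""
    return [
        [(x[i][j] + x[i][j + 1] + x[i + 1][j] + x[i + 1][j + 1]) / 4.0
         for j in range(0, len(x[0]), 2)]
        for i in range(0, len(x), 2)
    ]


def up2(x):
    """2x nearest-neighbour upsampling."""
    out = []
    for row in x:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(wide[:])
    return out


def add(a, b):
    """Residual connection: element-wise sum of two maps of equal size."""
    assert len(a) == len(b) and len(a[0]) == len(b[0]), "sizes must match"
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def figure3_forward(x, use_residual=True):
    # Feature extraction module: each block is followed by a pooling layer.
    p1 = pool2(block(x))   # block 1 + pooling: 1/2 resolution
    p2 = pool2(block(p1))  # block 2 + pooling: 1/4 resolution
    p3 = pool2(block(p2))  # block 3 + pooling: 1/8 resolution
    # Reconstruction module: each block is followed by an upsampling layer.
    u4 = up2(block(p3))    # block 4 + upsampling: 1/4 resolution
    in5 = add(p2, u4) if use_residual else u4   # input of block 5
    u5 = up2(block(in5))   # block 5 + upsampling: 1/2 resolution
    in6 = add(p1, u5) if use_residual else u5   # input of block 6
    return up2(block(in6))  # block 6 + upsampling: full resolution


if __name__ == "__main__":
    image = [[1.0] * 8 for _ in range(8)]
    out = figure3_forward(image)
    print(len(out), len(out[0]), out[0][0])
```

The size check in `add` passes because the pooling output after block 1 and the upsampling output after block 5 are both at half resolution, and the pooling output after block 2 and the upsampling output after block 4 are both at quarter resolution, which is why those particular pairs can be summed.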
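In some embodiments, before the target RGBIR image is acquired, the method further includes: acquiring a plurality of first training image pairs, each first training image pair including a first RGBIR training image and a label image corresponding to the first RGBIR training image, where the label image corresponding to the first RGBIR training image is an IR image free of visible-light interference and is obtained by performing a first integration operation on a multispectral image corresponding to the first RGBIR training image; and training the first neural network with the plurality of first training image pairs.
In the present application, an image used to train the first neural network may be referred to as a first RGBIR training image.
Each training iteration of the first neural network uses one first training image pair.
A first training image pair includes a first RGBIR training image and the label image corresponding to that first RGBIR training image. The first RGBIR training image in the first training image pair differs from one training iteration to the next.
The label image corresponding to a first RGBIR training image is an IR image free of visible-light interference; that is, in the label image corresponding to the first RGBIR training image, no pixel value has a visible-light component.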
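For each first RGBIR training image, the corresponding multispectral image may describe the IR response values, in a plurality of wavebands, of each pixel in the label image corresponding to the first RGBIR training image. A first integration operation may be performed on that multispectral image: for each pixel in the label image corresponding to the first RGBIR training image, the first integration operation determines the pixel value of the pixel from its IR response values in the plurality of wavebands. Once the pixel value of every pixel in the label image has been determined, the label image corresponding to the first RGBIR training image is obtained.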
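One way to picture the first integration operation is as a numerical integral of each pixel's sampled response over the near-infrared wavebands. The application does not specify the form of the integral, so the trapezoidal rule, the 700 nm to 1000 nm band limits, and the names `integrate_band_responses` and `ir_label_image` below are all assumptions made for illustration.

```python
def integrate_band_responses(wavelengths_nm, responses, lo_nm, hi_nm):
    """Trapezoidal integral of a sampled response over [lo_nm, hi_nm].

    Only intervals lying entirely inside the band contribute; the band edges
    are assumed to coincide with sample wavelengths.
    """
    total = 0.0
    for k in range(len(wavelengths_nm) - 1):
        w0, w1 = wavelengths_nm[k], wavelengths_nm[k + 1]
        if w0 >= lo_nm and w1 <= hi_nm:
            total += 0.5 * (responses[k] + responses[k + 1]) * (w1 - w0)
    return total


def ir_label_image(cube, wavelengths_nm, lo_nm=700, hi_nm=1000):
    """Build an IR label image from a multispectral cube.

    `cube[y][x]` is the list of response values of one pixel, one value per
    sample wavelength. The default band limits are illustrative assumptions.
    """
    return [
        [integrate_band_responses(wavelengths_nm, pixel, lo_nm, hi_nm)
         for pixel in row]
        for row in cube
    ]


if __name__ == "__main__":
    wavelengths = [600, 700, 800, 900, 1000]
    cube = [[[5, 1, 1, 1, 1]]]  # one pixel; the 600 nm sample lies outside the band
    print(ir_label_image(cube, wavelengths))
```

Because the visible-light sample at 600 nm lies outside the integration band, it does not contribute to the label value, which illustrates how a label image free of visible-light interference could be derived.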
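For each first RGBIR training image, the corresponding multispectral image may be captured at the same time as the first RGBIR training image is captured in advance. The first RGBIR training image and its corresponding multispectral image may be obtained by an RGBIR image sensor and a multispectral image sensor photographing the same object at the same moment. At that moment, the RGBIR image sensor is at the same position as the multispectral image sensor.
In one training iteration, the first RGBIR training image of a first training image pair is input into the first neural network to obtain a predicted IR image output by the first neural network. A squared loss function is used to compute the loss between the predicted IR image and the label image corresponding to the first RGBIR training image. Based on the loss between the predicted IR image and the label image corresponding to the first RGBIR training image, the parameter values of the parameters of the first neural network are updated.
Because the multispectral image corresponding to a first RGBIR training image may describe the IR response values, in a plurality of wavebands, of each pixel in the corresponding label image, performing the first integration operation on the multispectral image makes it possible to determine the pixel value of each pixel in the label image precisely, so the resulting label image has high accuracy.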
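In some embodiments, acquiring the plurality of first training image pairs includes: acquiring first RGBIR training images of a plurality of different scenes and the multispectral images corresponding to the first RGBIR training images of the plurality of different scenes; for each acquired first RGBIR training image, performing the first integration operation on the multispectral image corresponding to the first RGBIR training image to obtain the label image corresponding to the first RGBIR training image; and determining the first RGBIR training image and the label image corresponding to the first RGBIR training image as a first training image pair.
The plurality of different scenes may be a plurality of scenes with different lighting conditions. For each scene, a first RGBIR training image of the scene is a first RGBIR training image captured in advance in that scene. For each scene, a plurality of first RGBIR training images of the scene may be captured in advance in that scene, and for each of them the corresponding multispectral image may be captured at the same time as the first RGBIR training image.
For each acquired first RGBIR training image, the first integration operation is performed on the corresponding multispectral image to obtain the label image corresponding to the first RGBIR training image, and the first RGBIR training image and its label image are combined into a first training image pair.
In the present application, the plurality of first training image pairs may include first RGBIR training images of a plurality of different scenes and the label images corresponding to them, and the first neural network may be trained with these first RGBIR training images and label images. After training is complete, the first neural network is suitable for processing an RGBIR image captured in any one of the plurality of scenes.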
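In some embodiments, before the target RGBIR image is acquired, the method further includes: acquiring a plurality of second training image pairs, each second training image pair including a second RGBIR training image and a label image corresponding to the second RGBIR training image, where the label image corresponding to the second RGBIR training image is an RGB image free of near-infrared interference and is obtained by performing a second integration operation on a multispectral image corresponding to the second RGBIR training image; and training the second neural network with the plurality of second training image pairs.
In the present application, an image used to train the second neural network may be referred to as a second RGBIR training image.
Each training iteration of the second neural network uses one second training image pair. A second training image pair includes a second RGBIR training image and the label image corresponding to that second RGBIR training image. The second RGBIR training image in the second training image pair differs from one training iteration to the next.
The label image corresponding to a second RGBIR training image is an RGB image free of near-infrared interference; that is, in the label image corresponding to the second RGBIR training image, no pixel value has an IR component.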
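For each second RGBIR training image, the corresponding multispectral image may describe the visible-light response values, in a plurality of wavebands, of each pixel in the label image corresponding to the second RGBIR training image. A second integration operation may be performed on that multispectral image: for each pixel in the label image corresponding to the second RGBIR training image, the second integration operation determines the pixel value of the pixel from its visible-light response values in the plurality of wavebands. Once the pixel value of every pixel in the label image has been determined, the label image corresponding to the second RGBIR training image is obtained.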
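For an RGB label, each pixel needs three values, so one plausible reading of the second integration operation is a sensitivity-weighted sum over the visible wavebands, computed once per color channel. The application does not give the weighting, so the per-channel sensitivity curves, the uniform band spacing `step_nm`, and the function names below are hypothetical.

```python
def weighted_band_sum(responses, sensitivity, step_nm):
    """Riemann-sum approximation of the integral of response x sensitivity.

    Assumes both lists are sampled at the same, uniformly spaced wavelengths.
    """
    return sum(r * s for r, s in zip(responses, sensitivity)) * step_nm


def rgb_label_pixel(responses, curves, step_nm=10.0):
    """Return the (R, G, B) label value of one pixel.

    `curves` maps "R", "G", and "B" to assumed spectral sensitivity samples.
    """
    return tuple(
        weighted_band_sum(responses, curves[c], step_nm) for c in ("R", "G", "B")
    )


if __name__ == "__main__":
    # Three visible bands (short, middle, long wavelength) with toy sensitivities.
    curves = {"R": [0, 0, 1], "G": [0, 1, 0], "B": [1, 0, 0]}
    print(rgb_label_pixel([1.0, 2.0, 3.0], curves))
```

Since only visible-light response values enter the sum, the resulting label carries no near-infrared contribution, in line with the requirement that the label image has no IR component.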
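For each second RGBIR training image, the corresponding multispectral image may be captured at the same time as the second RGBIR training image is captured in advance. The second RGBIR training image and its corresponding multispectral image may be obtained by an RGBIR image sensor and a multispectral image sensor photographing the same object at the same moment. At that moment, the RGBIR image sensor is at the same position as the multispectral image sensor.
In one training iteration, the second RGBIR training image of a second training image pair is input into the second neural network to obtain a predicted RGB image output by the second neural network. A squared loss function is used to compute the loss between the predicted RGB image and the label image corresponding to the second RGBIR training image. Based on the loss between the predicted RGB image and the label image corresponding to the second RGBIR training image, back-propagation is performed and the parameter values of the parameters of the second neural network are updated.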
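The training iteration described above (forward pass, squared loss, gradient-based parameter update) can be illustrated with a deliberately tiny stand-in model. This is not the network of the application: the per-pixel affine model `w * x + b`, the learning rate, and the function names are assumptions chosen so that the gradients can be written out by hand instead of relying on a deep-learning library.

```python
def squared_loss(pred, label):
    """Mean squared difference between predicted and label pixel values."""
    return sum((p - t) ** 2 for p, t in zip(pred, label)) / len(pred)


def train_step(w, b, inputs, labels, lr=0.1):
    """One gradient-descent update of a toy per-pixel model pred = w * x + b.

    Returns the updated (w, b) and the loss measured before the update.
    """
    n = len(inputs)
    preds = [w * x + b for x in inputs]            # forward pass
    loss = squared_loss(preds, labels)
    # Gradients of the mean squared loss with respect to w and b.
    grad_w = sum(2.0 * (p - t) * x for p, t, x in zip(preds, labels, inputs)) / n
    grad_b = sum(2.0 * (p - t) for p, t in zip(preds, labels)) / n
    return w - lr * grad_w, b - lr * grad_b, loss


if __name__ == "__main__":
    inputs = [0.0, 1.0]   # flattened pixel values of a toy training image
    labels = [0.0, 0.5]   # flattened pixel values of its label image
    w, b = 1.0, 0.0
    for step in range(500):
        w, b, loss = train_step(w, b, inputs, labels)
    print(w, b, loss)
```

In a real network the hand-written gradients would be replaced by back-propagation through all layers, but the loop has the same shape: predict, compare with the label using the squared loss, and move the parameters against the gradient.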
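Because the multispectral image corresponding to a second RGBIR training image may describe the visible-light response values, in a plurality of wavebands, of each pixel in the corresponding label image, performing the second integration operation on the multispectral image makes it possible to determine the pixel value of each pixel in the label image precisely, so the resulting label image has high accuracy.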
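In some embodiments, acquiring the plurality of second training image pairs includes: acquiring second RGBIR training images of a plurality of different scenes and the multispectral images corresponding to the second RGBIR training images of the plurality of different scenes; for each acquired second RGBIR training image, performing the second integration operation on the multispectral image corresponding to the second RGBIR training image to obtain the label image corresponding to the second RGBIR training image; and combining the second RGBIR training image and the label image corresponding to the second RGBIR training image into a second training image pair.
The plurality of different scenes may be a plurality of scenes with different lighting conditions. For each scene, a second RGBIR training image of the scene is a second RGBIR training image captured in advance in that scene. For each scene, a plurality of second RGBIR training images may be captured in advance in that scene, and for each of them the corresponding multispectral image may be captured at the same time as the second RGBIR training image.
For each acquired second RGBIR training image, the second integration operation is performed on the corresponding multispectral image to obtain the label image corresponding to the second RGBIR training image, and the second RGBIR training image and its label image are combined into a second training image pair.
In the present application, the plurality of second training image pairs may include second RGBIR training images of a plurality of different scenes and the label images corresponding to them, and the second neural network may be trained with these second RGBIR training images and label images. After training is complete, the second neural network is suitable for processing an RGBIR image captured in any one of the plurality of scenes.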
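Refer to FIG. 4, which shows a structural block diagram of a demosaicing apparatus provided by an embodiment of the present application. The demosaicing apparatus includes an acquisition unit 401 and a demosaicing unit 402.
The acquisition unit 401 is configured to acquire a target RGBIR image, where the target RGBIR image is an RGBIR image that has undergone preprocessing, and the preprocessing includes dark level compensation.
The demosaicing unit 402 is configured to perform first processing on the target RGBIR image with a first neural network to obtain a target IR image; and/or perform second processing on the target RGBIR image with a second neural network to obtain a target RGB image; where the first processing includes: eliminating the visible-light component from the pixel values of the IR-channel pixels in the target RGBIR image, and predicting the pixel values of the color-channel pixels in the IR channel; and the second processing includes: for each color channel, eliminating the near-infrared component from the pixel values of the pixels of that color channel in the target RGBIR image, and predicting the pixel values, in that color channel, of the pixels other than the pixels of that color channel.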
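The division of work between the acquisition unit and the demosaicing unit, together with the "and/or" choice between the two networks, can be sketched as a small class. The class name `DemosaicApparatus`, its method names, and the use of plain callables in place of trained networks are hypothetical; the sketch only shows how either network, or both, might be wired in.

```python
from typing import Callable, List, Optional, Tuple

Image = List[List[float]]


class DemosaicApparatus:
    """Toy counterpart of acquisition unit 401 and demosaicing unit 402."""

    def __init__(
        self,
        preprocess: Callable[[Image], Image],
        first_net: Optional[Callable[[Image], object]] = None,
        second_net: Optional[Callable[[Image], object]] = None,
    ) -> None:
        if first_net is None and second_net is None:
            raise ValueError("at least one of first_net and second_net is required")
        self.preprocess = preprocess
        self.first_net = first_net
        self.second_net = second_net

    def acquire(self, raw_rgbir: Image) -> Image:
        """Acquisition unit: return the preprocessed (target) RGBIR image."""
        return self.preprocess(raw_rgbir)

    def demosaic(self, target_rgbir: Image) -> Tuple[object, object]:
        """Demosaicing unit: run whichever networks are configured.

        Returns (target IR image or None, target RGB image or None).
        """
        ir = self.first_net(target_rgbir) if self.first_net is not None else None
        rgb = self.second_net(target_rgbir) if self.second_net is not None else None
        return ir, rgb


if __name__ == "__main__":
    # Stand-ins for trained networks: they only echo the input.
    apparatus = DemosaicApparatus(preprocess=lambda img: img, first_net=lambda img: img)
    target = apparatus.acquire([[0.1, 0.2], [0.3, 0.4]])
    print(apparatus.demosaic(target))
```

Keeping the two networks as independent optional members mirrors the text: an apparatus that only needs the target IR image can omit the second network, and vice versa.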
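In some embodiments, the demosaicing unit 402 is further configured to perform feature extraction on the target RGBIR image to obtain a first feature, the first feature including near-infrared band information of the target RGBIR image; and perform IR image reconstruction, which includes: based on the first feature, eliminating the visible-light component from the pixel values of the IR-channel pixels to obtain the near-infrared component of those pixel values; and, based on the near-infrared component of the pixel values of the IR-channel pixels, predicting the pixel values of the color-channel pixels in the IR channel.
In some embodiments, the demosaicing unit 402 is further configured to perform feature extraction on the target RGBIR image to obtain a second feature, the second feature including visible-light band information of the target RGBIR image; and perform RGB image reconstruction, which includes: for each color channel, based on the second feature, eliminating the near-infrared component from the pixel values of the pixels of that color channel in the target RGBIR image to obtain the visible-light component of those pixel values; and, based on the visible-light component of the pixel values of the pixels of that color channel, predicting the pixel values, in that color channel, of the pixels other than the pixels of that color channel.
In some embodiments, the size of the target IR image is the same as the size of the target RGBIR image, and the size of the target RGB image is the same as the size of the target RGBIR image.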
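In some embodiments, the demosaicing apparatus includes:
a first training unit, configured to, before the target RGBIR image is acquired, acquire a plurality of first training image pairs, each first training image pair including a first RGBIR training image and a label image corresponding to the first RGBIR training image, where the label image is an IR image free of visible-light interference and is obtained by performing a first integration operation on a multispectral image corresponding to the first RGBIR training image; and train the first neural network with the plurality of first training image pairs.
In some embodiments, the first training unit is further configured to acquire first RGBIR training images of a plurality of different scenes and the multispectral images corresponding to the first RGBIR training images of the plurality of different scenes; for each acquired first RGBIR training image, perform the first integration operation on the multispectral image corresponding to the first RGBIR training image to obtain the label image corresponding to the first RGBIR training image; and determine the first RGBIR training image and the label image corresponding to the first RGBIR training image as a first training image pair.
In some embodiments, the demosaicing apparatus includes:
a second training unit, configured to, before the target RGBIR image is acquired, acquire a plurality of second training image pairs, each second training image pair including a second RGBIR training image and a label image corresponding to the second RGBIR training image, where the label image corresponding to the second RGBIR training image is an RGB image free of near-infrared interference and is obtained by performing a second integration operation on a multispectral image corresponding to the second RGBIR training image; and train the second neural network with the plurality of second training image pairs.
In some embodiments, the second training unit is further configured to acquire second RGBIR training images of a plurality of different scenes and the multispectral images corresponding to the second RGBIR training images of the plurality of different scenes; for each acquired second RGBIR training image, perform the second integration operation on the multispectral image corresponding to the second RGBIR training image to obtain the label image corresponding to the second RGBIR training image; and combine the second RGBIR training image and the label image corresponding to the second RGBIR training image into a second training image pair.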
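In some embodiments, the demosaicing apparatus further includes:
a preprocessing unit, configured to acquire an original RGBIR image and perform the preprocessing on the original RGBIR image to obtain the target RGBIR image.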
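Dark level compensation, the one preprocessing step the application names, is commonly done by subtracting a fixed offset from every raw sample and clamping negative results to zero. The sketch below assumes that common form; the single global `dark_level` value and the function name are hypothetical, and the application does not say how the dark level is determined.

```python
def dark_level_compensation(raw, dark_level):
    """Subtract the dark level from every raw sample and clamp at zero.

    `raw` is a 2-D list of integer sensor values; `dark_level` is an assumed
    global offset (real pipelines may use per-channel or per-region values).
    """
    return [[max(value - dark_level, 0) for value in row] for row in raw]


if __name__ == "__main__":
    original_rgbir = [[60, 64, 100], [70, 64, 1023]]
    target_rgbir = dark_level_compensation(original_rgbir, dark_level=64)
    print(target_rgbir)
```

The result plays the role of the target RGBIR image that is passed on to the first and/or second neural network.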
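Any step of the embodiments of the demosaicing method provided by the present application, and any specific operation within any step, may be carried out by the corresponding unit of the demosaicing apparatus. For the operations carried out by each unit of the demosaicing apparatus, refer to the corresponding operations described in the embodiments of the demosaicing method.
Demosaicing performed by the demosaicing apparatus yields the target IR image and/or the target RGB image. During demosaicing, the component that acts as interference (the visible-light component or the near-infrared component) is eliminated from the pixel values of the pixels of the corresponding channel in the target RGBIR image, and the pixel values of the corresponding pixels in the corresponding channel are predicted. Because no interpolation is performed on pixel values that carry the interfering component, no interpolation error arises. Moreover, demosaicing is performed automatically by the first neural network and/or the second neural network, with no reliance on expert experience or manual correction, so no correction error arises. The adverse effects of interpolation error and correction error on demosaicing are thereby avoided, the image quality of the resulting target IR image and/or target RGB image is improved, and favorable conditions are provided for subsequently generating an IR image with a high signal-to-noise ratio and good color reproduction and/or an RGB image with good color reproduction. In addition, the first neural network and/or the second neural network can eliminate the interfering component from the pixel value of any channel pixel in the target RGBIR image and predict the pixel value of any pixel of the target RGBIR image in the corresponding channel. This does not depend on the color filter array of the sensor, so the apparatus is suitable for demosaicing RGBIR images captured by any RGBIR image sensor.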
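FIG. 5 is a structural block diagram of an electronic device provided by this embodiment. The electronic device includes a processing component 522, which further includes one or more processors, and memory resources represented by a memory 532 for storing instructions executable by the processing component 522, such as application programs. The application programs stored in the memory 532 may include one or more modules, each corresponding to a set of instructions. In addition, the processing component 522 is configured to execute the instructions so as to perform the method described above.
The electronic device may further include a power supply component 526 configured to perform power management of the electronic device, a wired or wireless network interface 550 configured to connect the electronic device to a network, and an input/output (I/O) interface 558. The electronic device may operate based on an operating system stored in the memory 532, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
In an exemplary embodiment, a storage medium including instructions is further provided, for example a memory including instructions, where the instructions are executable by the electronic device to perform the method described above. Optionally, the storage medium may be a non-transitory computer-readable storage medium, for example a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, or an optical data storage device.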
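Other embodiments of the present application will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. The present application is intended to cover any variations, uses, or adaptations of the present application that follow its general principles and include common general knowledge or customary technical means in the art not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the present application indicated by the following claims.
It should be understood that the present application is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present application is limited only by the appended claims.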

Claims (16)

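  1. A demosaicing method, characterized in that the method comprises:
    acquiring a target RGBIR image, wherein the target RGBIR image is an RGBIR image that has undergone preprocessing, and the preprocessing comprises dark level compensation;
    performing first processing on the target RGBIR image with a first neural network to obtain a target IR image; and/or performing second processing on the target RGBIR image with a second neural network to obtain a target RGB image;
    wherein the first processing comprises: eliminating the visible-light component from the pixel values of the IR-channel pixels in the target RGBIR image, and predicting the pixel values of the color-channel pixels in the IR channel; and the second processing comprises: for each color channel, eliminating the near-infrared component from the pixel values of the pixels of that color channel in the target RGBIR image, and predicting the pixel values, in that color channel, of the pixels other than the pixels of that color channel.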
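  2. The method according to claim 1, characterized in that eliminating the visible-light component from the pixel values of the IR-channel pixels in the target RGBIR image and predicting the pixel values of the color-channel pixels in the IR channel comprises:
    performing feature extraction on the target RGBIR image to obtain a first feature, the first feature comprising near-infrared band information of the target RGBIR image;
    performing IR image reconstruction, the IR image reconstruction comprising: based on the first feature, eliminating the visible-light component from the pixel values of the IR-channel pixels to obtain the near-infrared component of the pixel values of the IR-channel pixels; and, based on the near-infrared component of the pixel values of the IR-channel pixels, predicting the pixel values of the color-channel pixels in the IR channel.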
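  3. The method according to claim 2, characterized in that the first neural network comprises:
    a feature extraction module and a reconstruction module, the feature extraction module being used to perform the feature extraction on the target RGBIR image, the reconstruction module being used to perform the IR image reconstruction, the feature extraction module comprising blocks and a pooling layer, and the reconstruction module comprising blocks and an upsampling layer; wherein each block comprises a plurality of convolutional layers.
  4. The method according to claim 3, characterized in that the feature extraction module comprises two blocks with the pooling layer arranged between the two blocks, and the reconstruction module comprises two blocks with the upsampling layer arranged between the two blocks;
    or,
    the feature extraction module comprises three blocks, each of which is followed by a pooling layer, and the reconstruction module comprises three blocks, each of which is followed by an upsampling layer.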
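  5. The method according to claim 1, characterized in that, for each color channel, eliminating the near-infrared component from the pixel values of the pixels of that color channel in the target RGBIR image and predicting the pixel values, in that color channel, of the pixels other than the pixels of that color channel comprises:
    performing feature extraction on the target RGBIR image to obtain a second feature, the second feature comprising visible-light band information of the target RGBIR image;
    performing RGB image reconstruction, the RGB image reconstruction comprising: for each color channel, based on the second feature, eliminating the near-infrared component from the pixel values of the pixels of that color channel in the target RGBIR image to obtain the visible-light component of the pixel values of the pixels of that color channel; and, based on the visible-light component of the pixel values of the pixels of that color channel, predicting the pixel values, in that color channel, of the pixels other than the pixels of that color channel.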
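  6. The method according to claim 5, characterized in that the second neural network comprises: a feature extraction module and a reconstruction module, the feature extraction module being used to perform the feature extraction on the target RGBIR image, the reconstruction module being used to perform the RGB image reconstruction, the feature extraction module comprising blocks and a pooling layer, and the reconstruction module comprising blocks and an upsampling layer; wherein each block comprises a plurality of convolutional layers.
  7. The method according to claim 6, characterized in that the feature extraction module comprises two blocks with the pooling layer arranged between the two blocks, and the reconstruction module comprises two blocks with the upsampling layer arranged between the two blocks;
    or,
    the feature extraction module comprises three blocks, each of which is followed by a pooling layer, and the reconstruction module comprises three blocks, each of which is followed by an upsampling layer.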
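  8. The method according to any one of claims 1-7, characterized in that the size of the target IR image is the same as the size of the target RGBIR image, and the size of the target RGB image is the same as the size of the target RGBIR image.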
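  9. The method according to any one of claims 1-7, characterized in that, before the target RGBIR image is acquired, the method further comprises:
    acquiring a plurality of first training image pairs, each first training image pair comprising a first RGBIR training image and a label image corresponding to the first RGBIR training image, wherein the label image is an IR image free of visible-light interference and is obtained by performing a first integration operation on a multispectral image corresponding to the first RGBIR training image;
    training the first neural network with the plurality of first training image pairs.
  10. The method according to claim 9, characterized in that acquiring the plurality of first training image pairs comprises:
    acquiring first RGBIR training images of a plurality of different scenes and the multispectral images corresponding to the first RGBIR training images of the plurality of different scenes;
    for each acquired first RGBIR training image, performing the first integration operation on the multispectral image corresponding to the first RGBIR training image to obtain the label image corresponding to the first RGBIR training image;
    determining the first RGBIR training image and the label image corresponding to the first RGBIR training image as a first training image pair.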
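  11. The method according to any one of claims 1-7, characterized in that, before the target RGBIR image is acquired, the method further comprises:
    acquiring a plurality of second training image pairs, each second training image pair comprising a second RGBIR training image and a label image corresponding to the second RGBIR training image, wherein the label image corresponding to the second RGBIR training image is an RGB image free of near-infrared interference and is obtained by performing a second integration operation on a multispectral image corresponding to the second RGBIR training image;
    training the second neural network with the plurality of second training image pairs.
  12. The method according to claim 11, characterized in that acquiring the plurality of second training image pairs comprises:
    acquiring second RGBIR training images of a plurality of different scenes and the multispectral images corresponding to the second RGBIR training images of the plurality of different scenes;
    for each acquired second RGBIR training image, performing the second integration operation on the multispectral image corresponding to the second RGBIR training image to obtain the label image corresponding to the second RGBIR training image; and combining the second RGBIR training image and the label image corresponding to the second RGBIR training image into a second training image pair.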
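  13. The method according to any one of claims 1-12, characterized in that acquiring the target RGBIR image comprises:
    acquiring an original RGBIR image;
    performing the preprocessing on the original RGBIR image to obtain the target RGBIR image.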
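  14. An electronic device, characterized by comprising:
    a processor;
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to execute the instructions to implement the method according to any one of claims 1 to 13.
  15. A storage medium, characterized in that, when instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the method according to any one of claims 1 to 13.
  16. A computer program product, comprising a computer program/instructions, characterized in that the computer program/instructions, when executed by a processor, implement the method according to any one of claims 1-13.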
PCT/CN2022/111227 2021-08-11 2022-08-09 解马赛克方法、电子设备及存储介质 WO2023016468A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110919874.7A CN113781326A (zh) 2021-08-11 2021-08-11 解马赛克方法、装置、电子设备及存储介质
CN202110919874.7 2021-08-11

Publications (1)

Publication Number Publication Date
WO2023016468A1 true WO2023016468A1 (zh) 2023-02-16

Family

ID=78837375

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/111227 WO2023016468A1 (zh) 2021-08-11 2022-08-09 解马赛克方法、电子设备及存储介质

Country Status (2)

Country Link
CN (1) CN113781326A (zh)
WO (1) WO2023016468A1 (zh)

Also Published As

Publication number Publication date
CN113781326A (zh) 2021-12-10

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22855449

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE