WO2019228456A1 - Image processing method, apparatus and device, and machine-readable storage medium - Google Patents

Image processing method, apparatus and device, and machine-readable storage medium Download PDF

Info

Publication number
WO2019228456A1
WO2019228456A1 (PCT/CN2019/089272)
Authority
WO
WIPO (PCT)
Prior art keywords
image
pixel
data format
neural network
intermediate image
Prior art date
Application number
PCT/CN2019/089272
Other languages
French (fr)
Chinese (zh)
Inventor
姜子伦
肖飞
范蒙
俞海
Original Assignee
杭州海康威视数字技术股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 杭州海康威视数字技术股份有限公司 filed Critical 杭州海康威视数字技术股份有限公司
Publication of WO2019228456A1 publication Critical patent/WO2019228456A1/en

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/70Denoising; Smoothing

Definitions

  • the present application relates to the field of image technology, and in particular, to an image processing method, apparatus, device, and machine-readable storage medium.
  • the original image in the first data format collected by the imaging device cannot usually be directly displayed or transmitted. Therefore, the original image in the first data format can also be converted into a target image in the second data format for display or transmission.
  • an ISP (Image Signal Processing) algorithm can be used to convert the original image into a target image.
  • the ISP algorithm handles image processing tasks such as brightness and color compensation and correction.
  • when the imaging device uses the ISP algorithm to convert the original image into the target image, defects of the ISP algorithm itself and the accumulated losses of each processing module cause the target image to lose original image information to a certain extent. If that loss is severe, it may be impossible to repair later.
  • moreover, an original image collected under poor lighting conditions exhibits heavy noise after being processed by the ISP algorithm.
  • the present disclosure provides an image processing method, apparatus, device, and machine-readable storage medium, which can effectively remove noise, improve the quality of the target image, and improve user experience.
  • the present disclosure provides an image processing method, which includes:
  • obtaining an original image in a first data format; converting the original image into an intermediate image in a second data format by using a neural network; and performing noise reduction processing on the intermediate image by using a cached image to obtain a target image in the second data format, where the cached image includes a target image corresponding to a previous frame of original image adjacent to the original image.
  • the present disclosure provides an image processing apparatus including:
  • An image processing module configured to obtain an original image in a first data format, and convert the original image into an intermediate image in a second data format using a neural network;
  • a video processing module configured to perform noise reduction processing on the intermediate image by using a cached image to obtain a target image in the second data format, where the cached image includes the target image corresponding to the previous frame of original image adjacent to the original image.
  • the present disclosure provides an image processing device including a processor and a machine-readable storage medium, where the machine-readable storage medium stores machine-executable instructions that can be executed by the processor; the processor is configured to execute the machine-executable instructions to implement the method steps described above.
  • the present disclosure provides a machine-readable storage medium.
  • Computer instructions are stored on the machine-readable storage medium; when the computer instructions are executed, the above method steps are implemented.
  • after the original image in the first data format is converted into the intermediate image in the second data format by the neural network, the cached image can be used to perform noise reduction processing on the intermediate image to obtain the target image in the second data format. Since the cached image is the target image corresponding to the previous frame of original image adjacent to the original image, the two frames (i.e., the cached image and the intermediate image) are closely related in time and space.
  • the correlation between the two frames can be used to distinguish the signal and the noise in the image, and noise reduction processing on the intermediate image can effectively remove the noise. Therefore, noise can be effectively removed even under poor lighting conditions, so that noise in the image is effectively suppressed and the quality of the target image is improved.
  • the original image in the first data format is converted into the intermediate image in the second data format by a neural network, which reduces the loss of original image information in the intermediate image, so that the image can still be repaired later.
  • FIGS. 1A and 1B are schematic diagrams of a neural network in an embodiment of the present disclosure.
  • FIGS. 2A-2C are schematic diagrams of an offline training neural network in an embodiment of the present disclosure.
  • FIG. 3 is a flowchart of an image processing method according to an embodiment of the present disclosure.
  • 4A-4D are schematic diagrams of image processing in an embodiment of the present disclosure.
  • FIG. 5 is a structural diagram of an image processing apparatus in an embodiment of the present disclosure.
  • FIG. 6 is a hardware configuration diagram of an image processing apparatus in an embodiment of the present disclosure.
  • first, second, third, etc. may be used in the embodiments of the present disclosure to describe various kinds of information, and these descriptions are only used to distinguish the same type of information from each other.
  • For example, without departing from the scope of the present disclosure, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".
  • An embodiment of the present disclosure provides an image processing method, which can be applied to an image processing device.
  • the image processing device may be an imaging device, such as a video camera, and the type of the image processing device is not limited.
  • in this method, the original image in the first data format may be converted into an intermediate image in the second data format by using a neural network, and then a cached image is used to perform noise reduction processing on the intermediate image to obtain the target image in the second data format. That is, the target image is an image that has undergone noise reduction processing.
  • because the two frames of images (that is, the cached image and the intermediate image) are closely related in time and space, the cached image can be used to perform noise reduction processing on the intermediate image. This effectively removes noise even when the lighting conditions are poor, so that the noise in the image can be effectively suppressed.
  • the image collected by the image processing device may be an original image, and a data format of the original image may be a first data format.
  • the first data format is an original image format, and usually includes image data of one or more spectral bands.
  • the original image in the first data format cannot be directly displayed or transmitted; that is, an abnormality occurs if the original image in the first data format is directly displayed or transmitted.
  • the first data format may include a Bayer format.
  • the Bayer format is only an example, and the first data format is not limited to it; all raw-image data formats fall within the protection scope of the present disclosure.
  • after the image processing device uses the neural network to convert the original image, the intermediate image is obtained.
  • the intermediate image is the output image of the neural network, and is not the final target image.
  • the image processing device performs noise reduction processing on the intermediate image by using the cache image, the target image is obtained, that is, the final output image.
  • the data format of the intermediate image and the target image may be a second data format, and the second data format is any image format suitable for display or transmission. For example, when a target image in a second data format is displayed or transmitted, no abnormality occurs.
  • the second data format may include an RGB (Red Green Blue) format, a YUV (Luminance Chrominance) format, and the like.
  • All image formats suitable for display or transmission are within the protection scope of the present disclosure.
  • the following describes the neural network in the embodiment of the present disclosure, which can be used to convert an original image in a first data format into an intermediate image in a second data format.
  • the neural network can also be used to optimize the original image, for example by adjusting attributes of the original image such as its brightness, color, contrast, signal-to-noise ratio, and size; this optimization method is not limited.
  • the neural network in the present disclosure may include, but is not limited to, a convolutional neural network (CNN for short), a recurrent neural network (RNN for short), a fully connected network, etc.
  • in the following description, the convolutional neural network is taken as an example.
  • the structural units of the neural network in the present disclosure may include, but are not limited to, one or any combination of the following: a convolutional layer, a pooling layer, an excitation layer, a fully connected layer, and the like.
  • for example, the neural network may include: at least one convolutional layer, at least one pooling layer, and at least one fully connected layer.
  • the neural network may include: at least one convolutional layer and at least one excitation layer.
  • as shown in FIGS. 1A and 1B, there are two examples of the neural network used in this embodiment.
  • the neural network may be composed of several convolutional layers (Conv), several pooling layers (Pool), and a fully connected layer (FC). There are no restrictions on the number of convolution layers and the number of pooling layers.
  • the neural network may be composed of several convolutional layers and several excitation layers. There are no restrictions on the number of convolutional layers or the number of excitation layers.
  • the neural network used in the present disclosure for converting the original image in the first data format into the intermediate image in the second data format may also have other structures; this is not limited as long as the network includes at least one convolutional layer. It is not limited to FIG. 1A or FIG. 1B; the neural networks illustrated there are merely examples.
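  • As an illustration only, below is a minimal PyTorch-style sketch of a FIG. 1B-style structure (convolutional layers interleaved with excitation layers). The layer count, channel widths, and kernel sizes are assumptions for demonstration, as is presenting the raw input as a single-channel tensor; the patent fixes none of these details.

```python
# Illustrative sketch only: layer counts, channel widths, and kernel sizes
# are assumptions; the patent does not fix them.
import torch.nn as nn

class RawToRgbNet(nn.Module):
    """Maps a 1-channel raw (e.g., Bayer) image to a 3-channel image,
    using convolutional layers interleaved with excitation (ReLU) layers,
    in the spirit of the FIG. 1B structure."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1),   # convolutional layer
            nn.ReLU(),                                    # excitation layer
            nn.Conv2d(32, 32, kernel_size=3, padding=1),  # convolutional layer
            nn.ReLU(),                                    # excitation layer
            nn.Conv2d(32, 3, kernel_size=3, padding=1),   # intermediate image output
        )

    def forward(self, raw):
        # raw: (N, 1, H, W) tensor in the first data format
        return self.body(raw)
```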
  • the image features are enhanced by performing a convolution operation on the image using a convolution kernel.
  • the convolution kernel can be a matrix of size m * n.
  • the input of the convolution layer and the convolution kernel are convolved to obtain the output of the convolution layer.
  • the convolution operation is actually a filtering process.
  • in the convolution operation, the pixel value f(x, y) at point (x, y) on the image is convolved with the convolution kernel w(x, y). For example, suppose a 4 * 4 convolution kernel is provided; it contains 16 values whose magnitudes can be configured as required. Sliding a 4 * 4 window over the image in order yields multiple 4 * 4 sliding windows, and convolving the kernel with each sliding window produces multiple convolution features.
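  • A minimal numpy sketch of this sliding-window convolution follows; the stride of 1, the valid (no-padding) output size, and the example kernel values are assumptions for illustration.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slides the kernel over the image (stride 1, no padding) and sums
    the element-wise products in each window, producing one convolution
    feature per sliding window."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            window = image[y:y + kh, x:x + kw]    # one 4 x 4 sliding window
            out[y, x] = np.sum(window * kernel)   # convolve kernel with window
    return out

kernel = np.ones((4, 4)) / 16.0   # the 16 values can be configured as required
features = conv2d_valid(np.random.rand(8, 8), kernel)
```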
  • the processing of the pooling layer is actually a process of downsampling. By performing operations such as maximizing, minimizing, and averaging multiple convolutional features output by the convolutional layer, the amount of calculation can be reduced and feature invariance can be maintained.
  • the principle of local image correlation can be used to sub-sample the image, which can reduce the amount of data processing and retain useful information.
  • for example, the following maximum-pooling formula can be used to pool the convolution features and obtain the pooled features:

    y^i_{j,k} = max_{0 ≤ m < s, 0 ≤ n < s} ( x^i_{j·s+m, k·s+n} )

    where s represents the window size (s * s) used during the pooling process, m and n are offsets within the pooling window, x^i_{j,k} denotes the convolution features output by the convolution layer for the i-th image, and y^i_{j,k} represents the feature obtained by pooling the i-th image.
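  • As a hedged illustration of this formula, the following numpy sketch performs non-overlapping maximum pooling; the window size s is a parameter and the example input size is an assumption.

```python
import numpy as np

def max_pool(x, s):
    """Non-overlapping s x s maximum pooling of one feature map,
    i.e. y[j, k] = max over the window x[j*s+m, k*s+n], 0 <= m, n < s."""
    h, w = x.shape
    h_out, w_out = h // s, w // s
    x = x[:h_out * s, :w_out * s]                  # drop edge remainders
    return x.reshape(h_out, s, w_out, s).max(axis=(1, 3))

pooled = max_pool(np.random.rand(8, 8), s=2)       # yields a 4 x 4 pooled map
```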
  • the activation function (such as a non-linear function) can be used to map the features of the pooling layer output, thereby introducing non-linear factors, so that the neural network can enhance the expression ability through non-linear combination.
  • the activation function of the excitation layer may include, but is not limited to, the ReLU (Rectified Linear Unit) function. Taking the ReLU function f(x) = max(0, x) as an example: in the output of the pooling layer, it sets all features x that are less than or equal to 0 to 0 and keeps features greater than 0 unchanged.
  • each node of the fully connected layer is connected to all nodes in the previous layer and is used to fully connect all features input to the fully connected layer to obtain a feature vector; the feature vector may include multiple features.
  • the fully connected layer may also use a 1 * 1 convolution layer, so that a fully convolutional network can be formed.
  • one or more convolutional layers, one or more pooling layers, one or more excitation layers, and one or more fully connected layers can be combined to construct a neural network.
  • the input of the neural network is the original image in the first data format
  • the output of the neural network is the intermediate image in the second data format. That is, after the original image in the first data format is input to the neural network and processed by the network's structural units (such as convolutional layers, pooling layers, excitation layers, and fully connected layers), the intermediate image in the second data format can be output.
  • the neural network can be trained offline, which mainly trains the neural network parameters, such as convolutional layer parameters (e.g., convolution kernel parameters), pooling layer parameters, and excitation layer parameters. There is no limitation on this; all parameters involved in the neural network are within the protection scope of this embodiment.
  • the neural network can fit the mapping relationship between input and output, that is, the mapping relationship between the original image in the first data format and the intermediate image in the second data format. In this way, when the input of the neural network is the original image in the first data format, after processing by the neural network, the output is the intermediate image in the second data format.
  • the following describes the process of offline training of neural networks in detail in combination with specific application scenarios.
  • two training methods are introduced.
  • the neural networks obtained by the two training methods may be referred to as a first neural network and a second neural network, respectively.
  • In the first training method, a training image in the first data format and a training image in the second data format may be collected, and the two may be stored in association to obtain an image data set. The image data set is output to the first neural network, and the first neural network uses the image data set to train each of its neural network parameters.
  • the process of obtaining the image data set can include:
  • Method 1: For the same frame of image, the imaging device A collects the training image A1 in the first data format, the imaging device B synchronously acquires the training image B1 in the second data format, and the training image A1 and the training image B1 are stored in association. Similarly, the imaging device A acquires the training image A2, the imaging device B acquires the training image B2, and the training image A2 and the training image B2 are stored in association.
  • the image data set may include the corresponding relationship between each group of training images An and training images Bn.
  • Method 2: The imaging device A collects the training image A1 in the first data format and processes the training image A1 (for example, white balance correction, color interpolation, curve mapping, etc.; the processing method is not limited) to obtain the training image A1' in the second data format, and the training image A1 and the training image A1' are stored in association.
  • the imaging device A collects the training image A2 in the first data format, and processes the training image A2 to obtain the training image A2 ′ in the second data format, and stores the training image A2 and the training image A2 ′ in association with each other.
  • the final image data set may include the correspondence between each group of training images An and training images An ′.
  • an image data set can be obtained, and the image data set includes a correspondence between a training image in a first data format and a training image in a second data format.
  • a pre-designed first neural network can be trained, that is, each neural network parameter is trained, so that the first neural network fits the mapping relationship between the image in the first data format and the image in the second data format.
  • the training method is not limited; for example, back propagation, resilient propagation, or conjugate gradient methods may be used.
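  • As a hedged sketch only, offline training with back propagation might look as follows in PyTorch; the loss function, optimizer, and data iteration are illustrative assumptions, since the patent names the training methods but fixes no such details.

```python
# Hedged sketch: the loss, optimizer, and data handling are assumptions.
import torch

def train_first_network(net, dataset, epochs=10):
    """dataset yields (raw, target) pairs: a first-data-format training
    image and the associated second-data-format training image."""
    optimizer = torch.optim.Adam(net.parameters(), lr=1e-4)
    loss_fn = torch.nn.L1Loss()
    for _ in range(epochs):
        for raw, target in dataset:
            pred = net(raw)                # intermediate image, second format
            loss = loss_fn(pred, target)
            optimizer.zero_grad()
            loss.backward()                # back propagation
            optimizer.step()
    return net
```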
  • the original image may be input into the trained first neural network to convert the original image in the first data format into the intermediate image in the second data format by the first neural network.
  • the first neural network can also be adjusted online. That is, the original image in the first data format and the target image in the second data format can be used to re-optimize the parameters of each neural network in the first neural network, and there is no limitation on the online adjustment process of the first neural network.
  • In the second training method, training images in the first data format and training images in the second data format may be collected, together with device parameters (that is, parameters of the imaging device that acquired the training images in the first data format). The training image in the first data format, the training image in the second data format, and the device parameters are stored in association to obtain an image data set, which is output to the second neural network; the second neural network uses the image data set to train each of its neural network parameters.
  • the process of obtaining the image data set may include, but is not limited to:
  • Method 1: For the same frame of image, the imaging device A acquires the training image A1 in the first data format along with the device parameter 1 of the imaging device A, and the imaging device B synchronously acquires the training image B1 in the second data format; then the training image A1, the device parameter 1, and the training image B1 are stored in association.
  • Similarly, the imaging device A acquires the training image A2 along with the device parameter 2, the imaging device B acquires the training image B2, and the training image A2, the device parameter 2, and the training image B2 are stored in association.
  • the image data set may include the correspondence between each group of training images An, device parameters n, and training images Bn.
  • the above device parameters may be fixed parameters that are not related to the environment (such as sensor sensitivity), or shooting parameters that are related to the environment (such as aperture size).
  • device parameters may include, but are not limited to, one or any combination of the following: sensor sensitivity, dynamic range, signal-to-noise ratio, pixel size, target surface size, resolution, frame rate, number of pixels, spectral response, photoelectric response, array mode, lens aperture diameter, focal length, aperture size, hood model, filter aperture, viewing angle, etc.; there are no restrictions on these.
  • Method 2: The imaging device 1 collects the training image set 1 (including a large number of training images) in the first data format and obtains the device parameter 1 of the imaging device 1; each training image in the training image set 1 is processed (for example, white balance correction, color interpolation, curve mapping, etc.; the processing is not limited) to obtain the training image set 1 in the second data format, and the training image set 1 in the first data format, the device parameter 1, and the training image set 1 in the second data format are stored in association. Similarly, the imaging device 2 collects the training image set 2 in the first data format, acquires the device parameter 2 of the imaging device 2, and processes each training image in the training image set 2 to obtain the training image set 2 in the second data format. The final image data set may include the correspondence between each group: the training image set K in the first data format, the device parameter K, and the training image set K in the second data format, as shown in FIG. 2C.
  • An image data set can be obtained through either of the above two methods, and the image data set includes the correspondence among a training image in the first data format, a device parameter, and a training image in the second data format. Based on the image data set, a pre-designed second neural network can be trained, that is, its neural network parameters are trained, so that the second neural network fits the mapping relationship among the image in the first data format, the device parameters, and the image in the second data format. The training method is not limited; for example, back propagation, resilient propagation, or conjugate gradient methods may be used. It should be noted that the mapping relationship fitted by the second neural network includes the device parameters.
  • after training, the original image and the device parameters of the device that collected the original image can be obtained, and the original image and the device parameters can be input to the trained second neural network, so that the second neural network converts the data format of the original image from the first data format to the second data format according to the device parameters, thereby obtaining an intermediate image in the second data format.
  • as shown in FIG. 3, the method may include the following steps.
  • Step 301 Obtain an original image in a first data format.
  • in step 301, a light signal in a first wavelength range may be sampled to obtain the original image; or a light signal in a second wavelength range may be sampled to obtain the original image; or light signals in both the first wavelength range and the second wavelength range may be sampled to obtain the original image.
  • the first wavelength range and the second wavelength range are merely examples and are not restrictive.
  • the first wavelength range may be a visible light wavelength range from 380 nm to 780 nm
  • the second wavelength range may be an infrared wavelength range from 780 nm to 2500 nm.
  • Step 302 Use the trained neural network to convert the original image into an intermediate image in a second data format.
  • in step 302, the original image may be input to the trained first neural network, and the data format of the original image is converted from the first data format to the second data format by the first neural network. Alternatively, the device parameters of the device that collected the original image may be obtained, and the original image and the device parameters are then input to the trained second neural network, so that the second neural network converts the original image in the first data format into the intermediate image in the second data format according to the device parameters.
  • Step 303: Use the cached image to perform noise reduction processing on the intermediate image in the second data format to obtain a target image in the second data format.
  • the buffer image includes a target image corresponding to a previous frame of the original image adjacent to the original image.
  • in addition, the cached image may then be updated to this target image, so that it serves as the cached image for the next frame of original image.
  • for the first frame, the acquisition method of the target image 1 is not limited; for example, the intermediate image 1 of the original image 1 can be directly determined as the target image 1.
  • then, the cached image (that is, the target image 1) is used to perform noise reduction processing on the intermediate image 2 to obtain the target image 2, and the cached image in the cache is updated to the target image 2; that is, the target image 1 is no longer the cached image. Next, the cached image (that is, the target image 2) is used to perform noise reduction processing on the intermediate image 3 to obtain the target image 3, and the cached image in the cache is updated to the target image 3; that is, the target image 2 is no longer the cached image.
  • by analogy, the target image corresponding to the previous frame of original image is used to continuously update the cached image in the cache, and the cached image is used to perform noise reduction processing on the intermediate image corresponding to the current frame of original image, thereby obtaining the target image corresponding to the current frame of original image.
  • using the cached image to perform noise reduction processing on the intermediate image to obtain the target image in the second data format includes, but is not limited to: obtaining a motion estimation value of each pixel in the intermediate image according to the intermediate image and the cached image; and converting the intermediate image into the target image in the second data format according to the motion estimation values.
  • in summary, after the original image in the first data format is converted into the intermediate image in the second data format by the neural network, the cached image can be used to perform noise reduction processing on the intermediate image to obtain the target image in the second data format. Since the cached image is the target image corresponding to the previous frame of original image adjacent to the original image, the two adjacent frames (i.e., the cached image and the intermediate image) are closely related in time and space. The correlation between the two adjacent frames can be used to distinguish the signal and the noise in the intermediate image, and noise reduction processing can be performed on the intermediate image to effectively remove the noise. Therefore, noise can be effectively removed even under poor lighting conditions, so that the noise in the target image can be effectively suppressed.
  • the above method flow may be executed by the image processing apparatus 100.
  • the image processing apparatus 100 may include three modules: an image processing module 101, a video processing module 102, and a training and learning module 103.
  • the image processing module 101 is configured to perform the foregoing steps 301 and 302, and the video processing module 102 is configured to perform the foregoing step 303.
  • the training and learning module 103 may be an offline module.
  • the neural network is trained and adjusted in advance using the image data set, and the trained and adjusted neural network is output to the image processing module 101.
  • for example, the training and learning module 103 may perform the offline training of the first neural network and the second neural network described in the above embodiments.
  • the image processing module 101 obtains a pre-adjusted neural network from the training and learning module 103, processes the original image of the inputted first data format based on the neural network, and outputs the intermediate image of the second data format.
  • the video processing module 102 receives a frame of intermediate image in the second data format output by the image processing module 101, performs noise reduction processing in combination with the information of the cached image stored in the cache to obtain the target image, and stores the processed target image in the cache as the cached image for the next frame. That is, the video processing module 102 may keep one frame of image in the cache as the cached image.
  • the video processing module 102 is composed of a motion estimation unit 201 and a time domain processing unit 202.
  • step S1 may be performed by the motion estimation unit 201
  • step S2 may be performed by the time domain processing unit 202
  • step 303 described above may be implemented by the motion estimation unit 201 and the time domain processing unit 202.
  • step S1 a motion estimation value of each pixel in the intermediate image is obtained according to the intermediate image and the cache image.
  • the cache image includes a target image corresponding to a previous frame of the original image adjacent to the original image.
  • step S2 the intermediate image is converted into a target image in a second data format according to the motion estimation value.
  • the above-mentioned video processing module 102 composed of the motion estimation unit 201 and the time-domain processing unit 202 is just an example.
  • the video processing module 102 may further include other units, such as a noise estimation unit, a spatial processing unit, and the like.
  • the noise estimation unit is configured to perform noise estimation on the image
  • the spatial processing unit is configured to perform spatial processing on the image, and there is no limitation on the processing process.
  • the video processing module 102 is composed of a motion estimation unit 201, a time-domain processing unit 202, a noise estimation unit 203, and a spatial-domain processing unit 204.
  • the aforementioned step 303 may be implemented by the aforementioned motion estimation unit 201, time domain processing unit 202, noise estimation unit 203, and spatial domain processing unit 204.
  • the video processing module 102 is composed of a motion estimation unit 201, a time domain processing unit 202, and a spatial domain processing unit 204.
  • the above-mentioned step 303 may be implemented by the above-mentioned motion estimation unit 201, time-domain processing unit 202, and space-domain processing unit 204.
  • step S1 is implemented through steps 4031 and 4032, which are specifically:
  • Step 4031 Obtain a correlation image according to the intermediate image and the cache image.
  • the correlation image is obtained from the pixel values of the corresponding positions of the intermediate image and the cache image according to a preset calculation method.
  • the preset calculation method may be a frame difference method, a convolution method, a cross-correlation method, etc., and there is no limitation on this.
  • Obtaining a correlation image according to the intermediate image and the cache image may include, but is not limited to:
  • Method 1: Use the frame difference method to calculate the correlation image; that is, the correlation image is obtained by taking the difference between the pixel values of corresponding pixels of the intermediate image and the cached image. For example, for each pixel in the correlation image, the first pixel value corresponding to that pixel in the intermediate image and the second pixel value corresponding to that pixel in the cached image are obtained, and the difference between the first pixel value and the second pixel value is determined as the pixel value of that pixel in the correlation image.
  • the size of the intermediate image is 6 * 4
  • the size of the cache image is 6 * 4
  • the size of the correlation image is 6 * 4.
  • 6 represents the number of pixels in the horizontal direction
  • 4 represents the number of pixels in the vertical direction.
  • 6 and 4 are just an example. In practical applications, the number of pixels in the horizontal direction is much larger than 6, and the number of pixels in the vertical direction is much larger than 4.
  • that is, the intermediate image, the cached image, and the correlation image are the same size.
  • the pixels of the intermediate image are A11-A16, A21-A26, A31-A36, A41-A46
  • the pixels of the cache image are B11-B16, B21-B26, B31-B36, B41- B46
  • the pixels of the correlation image are C11-C16, C21-C26, C31-C36, and C41-C46 in this order.
  • since the pixel value of each pixel of the intermediate image and the cached image is known, the pixel value of each pixel of the correlation image can be calculated as follows: for the pixel C11 in the correlation image, obtain the first pixel value (the pixel value of pixel A11) corresponding to C11 in the intermediate image and the second pixel value (the pixel value of pixel B11) corresponding to C11 in the cached image, and then determine the difference between the first pixel value and the second pixel value as the pixel value of pixel C11 in the correlation image. The other pixels in the correlation image are processed in the same way as pixel C11, and the details are not repeated.
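  • A minimal numpy sketch of Method 1 follows; whether the difference is signed or absolute is not specified by the patent, so a signed per-pixel difference is assumed.

```python
import numpy as np

def correlation_by_frame_difference(intermediate, cached):
    """Method 1: each correlation-image pixel is the difference between
    the corresponding pixels of the intermediate and cached images."""
    assert intermediate.shape == cached.shape   # same size, e.g. 6 x 4
    return intermediate - cached                # C[y, x] = A[y, x] - B[y, x]
```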
  • Method 2: Use the convolution method to calculate the correlation image; that is, convolve image blocks of the intermediate image and the cached image to obtain the correlation image. The size of the image block is preset, such as 3 * 3. For example, for each pixel in the correlation image, a first image block corresponding to that pixel is selected from the intermediate image, a second image block corresponding to that pixel is selected from the cached image, and the convolution value of the first image block and the second image block (that is, the convolution value of the two matrices) is determined as the pixel value of that pixel in the correlation image.
  • the pixels of the intermediate image are A11-A16, A21-A26, A31-A36, A41-A46
  • the pixels of the cache image are B11-B16, B21-B26, B31-B36, B41-B46. It is assumed that the pixels of the correlation image are C11-C16, C21-C26, C31-C36, and C41-C46 in this order.
  • since the pixel value of each pixel of the intermediate image and the cached image is known, the pixel value of each pixel of the correlation image can be calculated as follows: for the pixel C11 in the correlation image, a first image block corresponding to C11 is selected from the intermediate image; the first image block is a 3 * 3 matrix whose first row includes pixels A11, A12, and A13, whose second row includes pixels A21, A22, and A23, and whose third row includes pixels A31, A32, and A33. A second image block corresponding to C11 is selected from the cached image; the second image block is a 3 * 3 matrix whose first row includes pixels B11, B12, and B13, whose second row includes pixels B21, B22, and B23, and whose third row includes pixels B31, B32, and B33. The convolution value of the first image block and the second image block is then determined as the pixel value of pixel C11 in the correlation image; the other pixels are processed in the same way.
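  • A hedged numpy sketch of Method 2 follows; the 3 * 3 block size matches the example above, while the border handling (edge replication) is an assumption the patent does not specify.

```python
import numpy as np

def correlation_by_convolution(intermediate, cached, block=3):
    """Method 2: for each pixel, the correlation-image value is the
    convolution value (sum of element-wise products) of the image blocks
    around that pixel in the intermediate and cached images."""
    pad = block // 2
    a = np.pad(intermediate, pad, mode="edge")  # border handling: assumption
    b = np.pad(cached, pad, mode="edge")
    h, w = intermediate.shape
    out = np.empty((h, w))
    for y in range(h):
        for x in range(w):
            blk_a = a[y:y + block, x:x + block]   # first image block
            blk_b = b[y:y + block, x:x + block]   # second image block
            out[y, x] = np.sum(blk_a * blk_b)     # convolution value of blocks
    return out
```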
  • Step 4032 Obtain a motion estimation value of each pixel in the intermediate image according to the correlation image.
  • the motion estimation value may be binary or continuous, and at least one of a smoothing process, a mapping process, and a threshold process may be adopted to obtain the motion estimation value of each pixel in the intermediate image.
  • the smoothing processing may include: an image filtering operation having a smoothing characteristic, such as an average filtering operation, a median filtering operation, a Gaussian filtering operation, and the like, and there is no limitation on this image filtering operation.
  • the mapping process may include a linear scaling operation and a panning operation.
  • the threshold processing may include: determining the motion estimation value according to the magnitude relationship between the pixel value and a threshold, that is, limiting the motion estimation value to the ranges divided by the threshold; there is no limitation on this.
  • the process of obtaining the motion estimation value of each pixel can include, but is not limited to:
  • Method 1: If the motion estimation value is binary, and smoothing and threshold processing are used to obtain the motion estimation value of each pixel in the intermediate image, then an average filtering operation and threshold processing can be performed on the correlation image to obtain the motion estimation value of each pixel in the intermediate image.
  • specifically, for each pixel in the intermediate image, a third image block corresponding to that pixel can be selected from the correlation image, and the third image block is average-filtered to obtain a pixel value corresponding to that pixel. If the pixel value is greater than the threshold, the motion estimation value of the pixel is determined to be a first value (such as 1); if the pixel value is not greater than the threshold, the motion estimation value of the pixel is determined to be a second value (such as 0).
  • the pixels of the intermediate image are A11-A16, A21-A26, A31-A36, A41-A46
  • the pixels of the correlation image are C11-C16, C21-C26, C31-C36, C41-C46.
  • since the correlation image is known (see step 4031), that is, the pixel value of each pixel of the correlation image is known, the motion estimation value of each pixel in the intermediate image can be calculated as follows:
  • for the pixel A11 in the intermediate image, a third image block corresponding to A11 can be selected from the correlation image. The third image block can be a 3 * 3 matrix (a matrix of another size is also possible; this is not limited) whose first row includes pixels C11, C12, and C13, whose second row includes pixels C21, C22, and C23, and whose third row includes pixels C31, C32, and C33.
  • an average filter may be performed on the 9 pixels of the third image block (that is, the average of the 9 pixel values is calculated), and the average filter result is the pixel value corresponding to the pixel A11.
  • if this pixel value is greater than the threshold, the motion estimation value of pixel A11 may be 1; if it is not greater than the threshold, the motion estimation value of pixel A11 may be 0.
  • the other pixels in the intermediate image are processed in the same way as pixel A11, which is not repeated here.
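  • A hedged sketch of Method 1 using scipy's mean filter follows; the 3 * 3 window and the threshold value are illustrative choices, not fixed by the patent.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def motion_estimate_binary(correlation, threshold):
    """Method 1: average-filter the correlation image (smoothing), then
    threshold it: 1 where the filtered value exceeds the threshold,
    0 otherwise (threshold processing)."""
    smoothed = uniform_filter(correlation, size=3)    # 3x3 mean filter
    return np.where(smoothed > threshold, 1.0, 0.0)
```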
  • Method 2: If the motion estimation value is continuous, and smoothing and mapping processing are used to obtain the motion estimation value of each pixel in the intermediate image, then a median filtering operation and a linear scaling operation can be performed on the correlation image to obtain the motion estimation value of each pixel in the intermediate image.
  • specifically, after median filtering and linear scaling of the correlation image, a filter value located in a specific interval (such as the interval 0-1) is obtained for each position. Then, for each pixel in the intermediate image, the filter value corresponding to that pixel is determined as the motion estimation value of the pixel; that is, the motion estimation value is also a value located in the specific interval (such as the interval 0-1).
  • the pixels of the intermediate image are A11-A16, A21-A26, A31-A36, A41-A46
  • the pixels of the correlation image are C11-C16, C21-C26, C31-C36, C41-C46. Since the pixel value of each pixel of the correlation image is known, the motion estimation value of each pixel in the intermediate image can be calculated as follows:
  • for the pixel A11 in the intermediate image, the filter value of the corresponding pixel C11 is 0.039; that is, the motion estimation value of pixel A11 is 0.039. For the pixel A12 in the intermediate image, the filter value of the corresponding pixel C12 is 0.196; that is, the motion estimation value of pixel A12 is 0.196. And so on.
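  • A hedged sketch of Method 2 follows; min-max scaling into [0, 1] is an assumption, since the patent only requires a linear scaling (and translation) into a specific interval.

```python
import numpy as np
from scipy.ndimage import median_filter

def motion_estimate_continuous(correlation):
    """Method 2: median-filter the correlation image (smoothing), then
    linearly scale the result into the interval [0, 1] (mapping)."""
    smoothed = median_filter(correlation, size=3)     # 3x3 median filter
    lo, hi = smoothed.min(), smoothed.max()
    if hi == lo:                                      # flat input: no spread
        return np.zeros_like(smoothed)
    return (smoothed - lo) / (hi - lo)                # e.g. 0.039, 0.196, ...
```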
  • step S2 can be implemented through steps 4041 and 4042.
  • the target image in the second data format can also be obtained by other methods, which is not limited, as long as the target image can be obtained based on the motion estimation value.
  • Step 4041 and step 4042 are specifically:
  • Step 4041 Obtain a low-noise image according to the intermediate image and the cached image.
  • the low-noise image is an image with a lower noise level than the intermediate image, and there is no restriction on its acquisition method. For example, the low-noise image may be the average image of the intermediate image and the cached image.
  • the pixels of the intermediate image are A11-A16, A21-A26, A31-A36, A41-A46
  • the pixels of the cache image are B11-B16, B21-B26, B31-B36, and B41-B46.
  • the pixels of the low-noise image are D11-D16, D21-D26, D31-D36, and D41-D46 in this order. Based on this, since the pixel value of each pixel point of the intermediate image and the cache image is known, the pixel value of each pixel point of the low-noise image can be calculated as follows:
  • for the pixel D11 in the low-noise image, obtain the pixel value corresponding to D11 in the intermediate image (the pixel value of pixel A11) and the pixel value corresponding to D11 in the cached image (the pixel value of pixel B11), and determine the average of these two pixel values as the pixel value of pixel D11 in the low-noise image. The other pixels are processed in the same way as pixel D11, and the details are not repeated.
  • Step 4042 Obtain a target image according to the intermediate image, the low-noise image, and the motion estimation value.
  • acquiring the target image according to the intermediate image, the low-noise image, and the motion estimation value may include, but is not limited to, determining the pixel value of a first pixel in the target image, where the first pixel is any pixel in the target image. The pixel value of the first pixel is determined as follows: according to the motion estimation value of the first pixel, determine a first weight of the first pixel in the intermediate image and a second weight of the first pixel in the low-noise image; then determine the pixel value of the first pixel in the target image according to the pixel value corresponding to the first pixel in the intermediate image and the first weight, and the pixel value corresponding to the first pixel in the low-noise image and the second weight. The target image is determined from the pixel values of all of its pixels.
  • for example, the first weight of the first pixel in the intermediate image may be A, and the second weight of the first pixel in the low-noise image may be (1-A). If the pixel value corresponding to the first pixel in the intermediate image is N and the pixel value corresponding to the first pixel in the low-noise image is M, then the pixel value of the first pixel in the target image is N * A + M * (1-A).
  • the above method is only an example, and the pixel value of the first pixel point in the target image may also be obtained by other methods, which is not limited.
  • the pixels of the intermediate image are A11-A16, A21-A26, A31-A36, A41-A46
  • the pixels of the low-noise image are D11-D16, D21-D26, D31-D36, D41-D46
  • the pixels of the target image are E11-E16, E21-E26, E31-E36, E41-E46. Since the pixel value of each pixel of the intermediate image and the low-noise image is known, the pixel value of each pixel in the target image can be calculated as follows:
  • for the pixel E11 in the target image, take the motion estimation value of the corresponding pixel A11 as the first weight and 1 minus that motion estimation value as the second weight, then compute the pixel value of A11 multiplied by the first weight plus the pixel value of D11 multiplied by the second weight; the result of this calculation is the pixel value of pixel E11. The other pixels are processed in the same way as pixel E11, which is not repeated.
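  • A minimal numpy sketch of steps 4041 and 4042 combined, directly following the formula N * A + M * (1 - A):

```python
import numpy as np

def temporal_denoise(intermediate, cached, motion):
    """Step 4041: the low-noise image is the average of the intermediate
    and cached images. Step 4042: each target pixel is N * A + M * (1 - A),
    with A the motion estimate, N the intermediate pixel value, and M the
    low-noise pixel value."""
    low_noise = (intermediate + cached) / 2.0
    return motion * intermediate + (1.0 - motion) * low_noise
```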
  • when the motion estimation value is larger, the first weight is larger and the second weight is smaller; that is, the pixel value of the intermediate image accounts for a larger proportion, and the pixel value of the low-noise image accounts for a smaller proportion.
  • for example, if the motion estimation value of pixel E11 is relatively large (such as greater than 0.5), the first weight is greater than the second weight. A large motion estimation value indicates that the pixel has changed significantly relative to the previous frame, that is, the pixel value of the current frame is more reliable. Because the first weight is greater than the second weight, the pixel value from the intermediate image of the current frame accounts for a larger proportion, which matches what a large motion estimation value requires. In this way, the pixel value of pixel E11 in the target image is more accurate.
  • the pixel value of the pixel point may include, but is not limited to, the gray value, brightness value, and chrominance value of the pixel point, and the type of the pixel value is not limited, and may be related to actual image processing.
  • in summary, after the original image in the first data format is converted into the intermediate image in the second data format by the neural network, the cached image can be used to perform noise reduction processing on the intermediate image to obtain the target image in the second data format. Since the cached image is the target image corresponding to the previous frame of original image adjacent to the original image, the two adjacent frames (i.e., the cached image and the intermediate image) are closely related in time and space. The correlation between the two adjacent frames can be used to distinguish the signal and the noise in the images, and the intermediate image is processed for noise reduction. Therefore, noise can be effectively removed even under poor lighting conditions, so that the noise in the target image can be effectively suppressed.
  • an embodiment of the present disclosure also proposes an image processing device. As shown in FIG. 5, it is a structural diagram of the image processing device.
  • the image processing device includes:
  • An image processing module 501 configured to obtain an original image in a first data format, and use a neural network to convert the original image into an intermediate image in a second data format;
  • the video processing module 502 is configured to perform noise reduction processing on the intermediate image by using a cached image to obtain a target image in the second data format, where the cached image includes the target image corresponding to the previous frame of original image adjacent to the original image.
  • when the image processing module 501 uses a neural network to convert the original image into an intermediate image in the second data format, the image processing module 501 is specifically configured to: input the original image into a first neural network to convert the original image in the first data format into an intermediate image in the second data format by the first neural network; or obtain the device parameters of the device that collected the original image, and input the original image and the device parameters into a second neural network to convert the original image in the first data format into an intermediate image in the second data format according to the device parameters by the second neural network.
  • when the video processing module 502 obtains the motion estimation value of each pixel in the intermediate image according to the intermediate image and the cached image, the video processing module 502 is specifically configured to: obtain a correlation image according to the intermediate image and the cached image; and obtain the motion estimation value of each pixel in the intermediate image according to the correlation image.
  • the video processing module 502 obtains a motion estimation value of each pixel in the intermediate image by using at least one of a smoothing process, a mapping process, and a threshold process.
  • when the video processing module 502 converts the intermediate image into the target image in the second data format according to the motion estimation value, the video processing module 502 is specifically configured to: obtain a low-noise image according to the intermediate image and the cached image; and obtain the target image according to the intermediate image, the low-noise image, and the motion estimation value.
  • when the video processing module 502 obtains the target image according to the intermediate image, the low-noise image, and the motion estimation value, the video processing module 502 is specifically configured to: determine the pixel value of a first pixel in the target image, where the first pixel is any pixel in the target image, and the pixel value of the first pixel is determined as follows: according to the motion estimation value corresponding to the first pixel, determine a first weight of the first pixel in the intermediate image and a second weight of the first pixel in the low-noise image; determine the pixel value of the first pixel in the target image according to the pixel value corresponding to the first pixel in the intermediate image and the first weight, and the pixel value corresponding to the first pixel in the low-noise image and the second weight; and determine the target image according to the pixel values of all pixels of the target image.
  • the schematic diagram of the hardware architecture of the image processing device provided by the embodiment of the present disclosure can be specifically shown in FIG. 6, and includes: a processor 601 and a machine-readable storage medium 602, wherein: the machine-readable storage medium 602 stores machine-executable instructions that can be executed by the processor 601; the processor 601 is configured to execute machine-executable instructions to implement the image processing method disclosed in the above examples of the present disclosure.
  • the machine-readable storage medium may be any electronic, magnetic, optical, or other physical storage device, and may contain or store information, such as executable instructions, data, and so on.
  • the machine-readable storage medium may be: RAM (Random Access Memory), volatile memory, non-volatile memory, flash memory, a storage drive (such as a hard drive), a solid-state drive, any type of storage disc (such as an optical disc or DVD), or a similar storage medium, or a combination thereof.
  • the system, device, module, or unit described in the foregoing embodiments may be specifically implemented by a computer chip or entity, or a product with a certain function.
  • a typical implementation device is a computer, and the specific form of the computer may be a personal computer, a laptop computer, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email transceiver device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
  • the embodiments of the present disclosure may be provided as a method, a system, or a computer program product. Therefore, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the embodiments of the present disclosure may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
  • these computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device that implements the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
  • These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operation steps are performed on the computer or other programmable device to produce a computer-implemented process; the instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.

Abstract

Provided in the present disclosure are an image processing method, apparatus and device, and a machine-readable storage medium. The method comprises: acquiring an original image of a first data format; using a neural network to convert the original image into an intermediate image of a second data format; and using a buffered image to denoise the intermediate image to obtain a target image of the second data format, wherein the buffered image comprises a target image corresponding to a previous-frame original image adjacent to the original image.

Description

Image processing method, apparatus, device, and machine-readable storage medium

Cross-Reference to Related Applications

This patent application claims priority to Chinese patent application No. 201810556530.2, filed on May 31, 2018 and entitled "Image processing method, apparatus, device, and machine-readable storage medium", the entire content of which is incorporated herein by reference.

Technical Field

The present application relates to the field of image technology, and in particular, to an image processing method, apparatus, device, and machine-readable storage medium.

Background

The original image in the first data format collected by an imaging device usually cannot be directly displayed or transmitted. Therefore, the original image in the first data format may be converted into a target image in a second data format for display or transmission. For example, an ISP (Image Signal Processing) algorithm can be used to convert the original image into the target image; the ISP algorithm handles image processing tasks such as brightness and color compensation and correction.

When the imaging device uses the ISP algorithm to convert the original image into the target image, defects of the ISP algorithm itself and the accumulated losses of each processing module cause the target image to lose original image information to a certain extent. If that loss is severe, it may be impossible to repair later. Moreover, an original image collected under poor lighting conditions exhibits heavy noise after being processed by the ISP algorithm.
Summary of the Invention

The present disclosure provides an image processing method, apparatus, device, and machine-readable storage medium, which can effectively remove noise, improve the quality of the target image, and improve user experience.

The present disclosure provides an image processing method, which includes:

obtaining an original image in a first data format;

converting the original image into an intermediate image in a second data format by using a neural network; and

performing noise reduction processing on the intermediate image by using a cached image to obtain a target image in the second data format, where the cached image includes a target image corresponding to a previous frame of original image adjacent to the original image.

The present disclosure provides an image processing apparatus, which includes:

an image processing module, configured to obtain an original image in a first data format and convert the original image into an intermediate image in a second data format by using a neural network; and

a video processing module, configured to perform noise reduction processing on the intermediate image by using a cached image to obtain a target image in the second data format, where the cached image includes a target image corresponding to a previous frame of original image adjacent to the original image.

The present disclosure provides an image processing device, including a processor and a machine-readable storage medium. The machine-readable storage medium stores machine-executable instructions that can be executed by the processor, and the processor is configured to execute the machine-executable instructions to implement the method steps described above.

The present disclosure provides a machine-readable storage medium storing computer instructions; when the computer instructions are executed, the above method steps are implemented. As can be seen from the above technical solutions, in the embodiments of the present disclosure, after the original image in the first data format is converted into the intermediate image in the second data format by a neural network, the cached image can be used to perform noise reduction processing on the intermediate image to obtain the target image in the second data format. Since the cached image is the target image corresponding to the previous frame of original image adjacent to the original image, the two frames (i.e., the cached image and the intermediate image) are closely related in time and space. The correlation between the two frames can be used to distinguish the signal and the noise in the image, and noise reduction processing on the intermediate image can effectively remove the noise. Therefore, noise can be effectively removed even under poor lighting conditions, so that the noise in the image is effectively suppressed and the quality of the target image is improved. Moreover, in the above manner, the original image in the first data format is converted into the intermediate image in the second data format by a neural network, which reduces the loss of original image information in the intermediate image, so that the image can still be repaired later.
Brief Description of the Drawings

FIG. 1A and FIG. 1B are schematic diagrams of a neural network in an embodiment of the present disclosure.

FIG. 2A to FIG. 2C are schematic diagrams of offline training of a neural network in an embodiment of the present disclosure.

FIG. 3 is a flowchart of an image processing method in an embodiment of the present disclosure.

FIG. 4A to FIG. 4D are schematic diagrams of image processing in an embodiment of the present disclosure.

FIG. 5 is a structural diagram of an image processing apparatus in an embodiment of the present disclosure.

FIG. 6 is a hardware structure diagram of an image processing device in an embodiment of the present disclosure.
Detailed Description

The terminology used in the embodiments of the present disclosure is for the purpose of describing particular embodiments only and is not intended to limit the present disclosure. The singular forms "a", "said", and "the" used in the present disclosure and the claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to any or all possible combinations of one or more of the associated listed items.

It should be understood that although the terms first, second, third, and so on may be used in the embodiments of the present disclosure to describe various kinds of information, these descriptions are only used to distinguish information of the same type from each other. For example, without departing from the scope of the present disclosure, the first information may also be referred to as the second information, and similarly, the second information may also be referred to as the first information. In addition, depending on the context, the word "if" may be interpreted as "when", "while", or "in response to determining".
An embodiment of the present disclosure provides an image processing method, which can be applied to an image processing device. The image processing device may be an imaging device, such as a camera; the type of the image processing device is not limited.

In the embodiments of the present disclosure, after the original image in the first data format is obtained, a neural network may be used to convert the original image into an intermediate image in the second data format, and a cached image may then be used to perform noise reduction processing on the intermediate image, thereby obtaining the target image in the second data format. That is, the target image is an image that has undergone noise reduction. In this way, two frames (the cached image and the intermediate image) are used to denoise the intermediate image, which can effectively remove noise even under poor lighting conditions, so that noise in the image is effectively suppressed, the quality of the target image is improved, and the user experience is improved.

To explain the present disclosure more clearly, the following concepts are briefly described first:
1. The first data format and the original image.

The image collected by the image processing device may be an original image, and the data format of the original image may be the first data format. The first data format is a raw image format, which usually contains image data of one or more spectral bands. An original image in the first data format cannot be displayed or transmitted directly; that is, an anomaly occurs when an original image in the first data format is displayed or transmitted.

For example, the first data format may include the Bayer format. Of course, the Bayer format is only an example and the first data format is not limited to it; all raw image data formats fall within the protection scope of the present disclosure.

2. The second data format, the intermediate image, and the target image.

After the image processing device converts the original image using the neural network, an intermediate image is obtained. The intermediate image is the output image of the neural network, not the final target image. Only after the image processing device performs noise reduction processing on the intermediate image using the cached image is the target image, i.e., the final output image, obtained.

The data formats of the intermediate image and the target image may both be the second data format, which is any image format suitable for display or transmission. For example, no anomaly occurs when a target image in the second data format is displayed or transmitted.

For example, the second data format may include the RGB (Red Green Blue) format, the YUV (luminance-chrominance) format, and the like. Of course, the RGB and YUV formats are just examples and the second data format is not limited to them; all image formats suitable for display or transmission fall within the protection scope of the present disclosure.
The following describes the neural network in the embodiments of the present disclosure, which can be used to convert an original image in the first data format into an intermediate image in the second data format. During the conversion, the neural network can also optimize the original image, for example by adjusting its attributes such as brightness, color, contrast, signal-to-noise ratio, and size; the optimization method is not limited here.

The neural network in the present disclosure may include, but is not limited to, a convolutional neural network (CNN), a recurrent neural network (RNN), a fully connected network, and the like. In this embodiment, a convolutional neural network is taken as an example.

The structural units of the neural network in the present disclosure may include, but are not limited to, one or any combination of the following: a convolutional layer, a pooling layer, an activation layer, a fully connected layer, and the like. The structural units included in the neural network are not limited, as long as at least one convolutional layer is included. For example, in one example, the neural network may include at least one convolutional layer, at least one pooling layer, and at least one fully connected layer. Alternatively, in another example, the neural network may include at least one convolutional layer and at least one activation layer.

For example, FIG. 1A and FIG. 1B show two examples of the neural network used in this embodiment.

In FIG. 1A, the neural network may be composed of several convolutional layers (Conv), several pooling layers (Pool), and one fully connected layer (FC). The numbers of convolutional layers and pooling layers are not limited.

In FIG. 1B, the neural network may be composed of several convolutional layers and several activation layers. Neither the number of convolutional layers nor the number of activation layers is limited.

Of course, the neural network used in the present disclosure for converting the original image in the first data format into the intermediate image in the second data format may also have other structures, without limitation, as long as it includes at least one convolutional layer; it is not limited to FIG. 1A or FIG. 1B, which are merely examples.
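As a concrete illustration, the following is a minimal PyTorch sketch of a FIG. 1B-style structure (alternating convolutional and activation layers). The layer count, channel widths, and kernel sizes are illustrative assumptions, not values fixed by this disclosure.

```python
# Minimal sketch of a FIG. 1B-style network: alternating convolution and
# activation layers. Layer count, channel widths, and kernel sizes are
# illustrative assumptions, not values specified by this disclosure.
import torch
import torch.nn as nn

class RawToRGBNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1),   # raw (Bayer) plane in
            nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 3, kernel_size=3, padding=1),   # RGB intermediate image out
        )

    def forward(self, raw):
        return self.layers(raw)

net = RawToRGBNet()
intermediate = net(torch.randn(1, 1, 64, 64))  # e.g. a 64x64 raw patch
```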
The algorithm and function of each computing layer in the neural network are described below.

In the convolutional layer, image features are enhanced by performing a convolution operation on the image with a convolution kernel. The convolution kernel may be a matrix of size m*n; convolving the input of the convolutional layer with the kernel yields the output of the convolutional layer. The convolution operation is in fact a filtering process: the pixel value f(x, y) at point (x, y) of the image is convolved with the convolution kernel w(x, y). For example, given a 4*4 convolution kernel containing 16 values whose magnitudes can be configured as needed, the kernel is slid across the image to obtain a series of 4*4 sliding windows, and convolving the kernel with each sliding window yields a set of convolution features. These convolution features are the output of the convolutional layer and are provided to the pooling layer.
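The sliding-window filtering described above can be sketched as follows, assuming "valid" borders and a stride of 1; the 8*8 image and the random 4*4 kernel values are placeholders.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image and sum the elementwise products in
    each window (the filtering process described above; 'valid' borders,
    stride 1)."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out

features = conv2d_valid(np.random.rand(8, 8), np.random.rand(4, 4))  # 4*4 kernel
```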
The processing in the pooling layer is essentially a down-sampling process. By taking the maximum, the minimum, or the average of the convolution features output by the convolutional layer, the amount of computation is reduced while feature invariance is maintained. In the pooling layer, the principle of local image correlation can be used to sub-sample the image, which reduces the amount of data to process while retaining useful information. In one example, the following max-pooling formula may be used to pool the convolution features and obtain the pooled features:
$$y_{j,k}^{i} = \max_{0 \le m < s,\; 0 \le n < s} x_{\,j \cdot s + m,\; k \cdot s + n}^{i}$$
where s denotes the side length of the pooling window (an s*s window), m and n index positions within the window, i indicates the i-th image, x^i denotes the convolution features output by the convolutional layer for the i-th image, and y^i_{j,k} denotes the feature at position (j, k) obtained by pooling the i-th image.
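A minimal sketch of the max-pooling formula above, assuming non-overlapping s*s windows (any rows or columns left over when the dimensions are not divisible by s are dropped):

```python
import numpy as np

def max_pool(feature, s):
    """Non-overlapping s*s max pooling, matching the formula above:
    the output at (j, k) is the maximum of the (j, k)-th s*s window."""
    h, w = feature.shape
    blocks = feature[:h - h % s, :w - w % s].reshape(h // s, s, w // s, s)
    return blocks.max(axis=(1, 3))

y = max_pool(np.random.rand(8, 8), 2)  # y[j, k] = max over the (j, k)-th 2*2 window
```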
In the activation layer that follows the pooling layer, an activation function (such as a non-linear function) can be used to map the features output by the pooling layer, thereby introducing non-linear factors so that the neural network can strengthen its expressive power through non-linear combinations. The activation function of the activation layer may include, but is not limited to, the ReLU (Rectified Linear Unit) function. Taking the following ReLU function as an example, it sets all features x output by the pooling layer that are less than or equal to 0 to 0, while features greater than 0 remain unchanged:
$$f(x) = \max(0, x) = \begin{cases} x, & x > 0 \\ 0, & x \le 0 \end{cases}$$
In the fully connected layer, every node is connected to all nodes of the previous layer and is used to fully connect all features input to this layer, thereby obtaining a feature vector that may include multiple features. Furthermore, the fully connected layer may also be implemented as a 1*1 convolutional layer, which makes the network fully convolutional.

In practical applications, one or more convolutional layers, one or more pooling layers, one or more activation layers, and one or more fully connected layers can be combined according to different requirements to construct the neural network.

In this embodiment, the input of the neural network is the original image in the first data format, and the output of the neural network is the intermediate image in the second data format. That is, after the original image in the first data format is input to the neural network and processed by the structural units in the network (such as the convolutional, pooling, activation, and fully connected layers), the intermediate image in the second data format is output.

To achieve the above function, the neural network can be trained offline; this mainly means training the neural network parameters, such as the convolutional layer parameters (e.g., convolution kernel parameters), pooling layer parameters, and activation layer parameters. No limitation is imposed here: all parameters involved in the neural network fall within the protection scope of this embodiment. By training the parameters, the neural network can fit the mapping between input and output, i.e., the mapping between the original image in the first data format and the intermediate image in the second data format. Then, when the input of the neural network is an original image in the first data format, the output of the neural network after processing is an intermediate image in the second data format.
The process of offline training of the neural network is described in detail below with reference to specific application scenarios. In this embodiment, two training methods are introduced; for convenience of distinction, the neural networks obtained by the two training methods are called the first neural network and the second neural network, respectively.

As shown in FIG. 2A, to train the first neural network offline, training images in the first data format and training images in the second data format can be collected and stored in association to obtain an image data set. The image data set is output to the first neural network, which uses it to train the neural network parameters of the first neural network.

The process of obtaining the image data set may include:

Method 1: For the same frame, imaging device A collects training image A1 in the first data format while imaging device B synchronously collects training image B1 in the second data format; training image A1 and training image B1 are then stored in association. Similarly, imaging device A collects training image A2 and imaging device B collects training image B2, which are stored in association. By analogy, the final image data set may include the correspondence between each pair of training images An and Bn.

Method 2: Imaging device A collects training image A1 in the first data format and processes it (e.g., white balance correction, color interpolation, curve mapping; the processing method is not limited) to obtain training image A1' in the second data format; training image A1 and training image A1' are stored in association. Similarly, imaging device A collects training image A2 in the first data format and processes it to obtain training image A2' in the second data format, storing A2 and A2' in association. By analogy, the final image data set may include the correspondence between each pair of training images An and An'.
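A minimal Python sketch of the associated storage in Method 2; capture_raw and isp_process are hypothetical stand-ins for the acquisition and processing steps, and the image sizes and pair count are illustrative assumptions.

```python
import numpy as np

def capture_raw():
    # Hypothetical stand-in for imaging device A capturing a first-data-format
    # (e.g. Bayer) training image.
    return np.random.randint(0, 1024, size=(64, 64), dtype=np.uint16)

def isp_process(raw):
    # Hypothetical stand-in for white balance correction, color interpolation,
    # curve mapping, etc., yielding a second-data-format (RGB) counterpart.
    return np.stack([raw, raw, raw], axis=-1).astype(np.float32) / 1023.0

dataset = []
for _ in range(100):                         # 100 pairs as an illustrative size
    raw = capture_raw()                      # training image An
    dataset.append((raw, isp_process(raw)))  # stored in association with An'
```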
Clearly, either of the above two methods yields an image data set that includes the correspondence between training images in the first data format and training images in the second data format. Based on this image data set, the pre-designed first neural network can be trained, i.e., the neural network parameters are trained so that the first neural network fits the mapping between images in the first data format and images in the second data format. The training method is not limited; back propagation, resilient propagation, conjugate gradient, and similar methods may be used.
After the training of the first neural network is completed, the original image can be input into the trained first neural network so that the first neural network converts the original image in the first data format into an intermediate image in the second data format. Moreover, after training is completed, the first neural network can also be adjusted online. That is, the original image in the first data format and the target image in the second data format can be used to re-optimize the neural network parameters of the first neural network; the online adjustment process of the first neural network is not limited here.

As shown in FIG. 2B, to train the second neural network offline, training images in the first data format and training images in the second data format can be collected, and the device parameters (i.e., the parameters of the imaging device that collects the training images in the first data format) can be obtained. The training images in the first data format, the training images in the second data format, and the device parameters are stored in association to obtain an image data set, which is output to the second neural network; the second neural network uses this image data set to train its neural network parameters.

The process of obtaining the image data set may include, but is not limited to:

Method 1: For the same frame, imaging device A collects training image A1 in the first data format and its device parameters 1 are obtained, while imaging device B synchronously collects training image B1 in the second data format; training image A1, device parameters 1, and training image B1 are then stored in association. Similarly, imaging device A collects training image A2 and device parameters 2 are obtained, imaging device B collects training image B2, and training image A2, device parameters 2, and training image B2 are stored in association. By analogy, the final image data set may include the correspondence between each group of training image An, device parameters n, and training image Bn.

The above device parameters may be fixed parameters independent of the environment (such as sensor sensitivity) or shooting parameters related to the environment (such as aperture size). For example, the device parameters may include, but are not limited to, one or any combination of the following: sensor sensitivity, dynamic range, signal-to-noise ratio, pixel size, target surface size, resolution, frame rate, number of pixels, spectral response, photoelectric response, array pattern, lens aperture diameter, focal length, aperture size, hood model, filter diameter, field of view, and so on; no limitation is imposed here.

Method 2: Imaging device 1 collects training image set 1 (including a large number of training images) in the first data format, its device parameters 1 are obtained, and each training image in training image set 1 is processed (e.g., white balance correction, color interpolation, curve mapping; not limited) to obtain training image set 1 in the second data format; the first-data-format training image set 1, device parameters 1, and second-data-format training image set 1 are stored in association. Similarly, imaging device 2 collects training image set 2 in the first data format, its device parameters 2 are obtained, and each training image in training image set 2 is processed to obtain training image set 2 in the second data format; the first-data-format training image set 2, device parameters 2, and second-data-format training image set 2 are stored in association. By analogy, the final image data set may include the correspondence between each group's first-data-format training image set K, device parameters K, and second-data-format training image set K, as shown in FIG. 2C.

Either of the above two methods yields an image data set that includes the correspondence between training images in the first data format, device parameters, and training images in the second data format. Based on this image data set, the pre-designed second neural network can be trained, i.e., the neural network parameters are trained so that the second neural network fits the mapping between images in the first data format plus device parameters and images in the second data format. The training method is not limited; back propagation, resilient propagation, conjugate gradient, and similar methods may be used. Note that the mapping fitted by the second neural network includes the device parameters.

After the training of the second neural network is completed, the original image and the device parameters of the device that collected it can be obtained and input into the trained second neural network, so that the second neural network converts the data format of the original image from the first data format to the second data format according to the device parameters, thereby obtaining an intermediate image in the second data format.
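One possible way to feed the device parameters into the second neural network is to broadcast them to constant feature planes and concatenate them with the raw input as extra channels, as in the sketch below. This fusion scheme is an assumption; the disclosure does not fix how the device parameters enter the network.

```python
import torch
import torch.nn as nn

class RawToRGBWithParams(nn.Module):
    """Sketch of a second-network-style model. The device parameters are
    broadcast to constant per-pixel planes and concatenated with the raw
    image as extra input channels; this fusion scheme and the layer sizes
    are assumptions for illustration."""
    def __init__(self, num_params=4):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1 + num_params, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),
        )

    def forward(self, raw, params):  # raw: (N,1,H,W), params: (N,num_params)
        n, _, h, w = raw.shape
        planes = params.view(n, -1, 1, 1).expand(n, params.shape[1], h, w)
        return self.body(torch.cat([raw, planes], dim=1))

net = RawToRGBWithParams()
out = net(torch.randn(2, 1, 64, 64), torch.randn(2, 4))  # e.g. sensitivity, aperture, ...
```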
Under the above application scenarios, the image processing method is described below with reference to several specific embodiments.

FIG. 3 is a schematic flowchart of the image processing method, which may include the following steps.

Step 301: Obtain an original image in the first data format.

In one example, an optical signal in a first wavelength range may be sampled to obtain the original image; or an optical signal in a second wavelength range may be sampled to obtain the original image; or optical signals in both the first wavelength range and the second wavelength range may be sampled to obtain the original image. The first and second wavelength ranges are merely examples and are not restrictive; for example, the first wavelength range may be the visible-light range of 380 nm to 780 nm, and the second wavelength range may be the infrared range of 780 nm to 2500 nm.

Step 302: Use the trained neural network to convert the original image into an intermediate image in the second data format. Referring to the above embodiments, the original image may be input into the trained first neural network so that the first neural network converts the data format of the original image from the first data format to the second data format; or the device parameters of the device that collected the original image in the first data format are obtained first, and then the original image and the device parameters are input into the trained second neural network so that the second neural network converts the original image in the first data format into an intermediate image in the second data format according to the device parameters.

Step 303: Perform noise reduction processing on the intermediate image in the second data format using a cached image to obtain a target image in the second data format, where the cached image includes the target image corresponding to the previous frame of original image adjacent to the current original image. After step 303, the cached image used for the next frame of original image may be updated to this target image.

For example, for the first frame of original image 1, after target image 1 of original image 1 is obtained (the way target image 1 is obtained is not limited; for example, intermediate image 1 of original image 1 may be taken directly as target image 1), the cached image in the cache is updated to target image 1. For the second frame of original image 2, after intermediate image 2 of original image 2 is obtained, the cached image (i.e., target image 1) is used to denoise intermediate image 2 to obtain target image 2, and the cached image in the cache is updated to target image 2, so that target image 1 is no longer the cached image. For the third frame of original image 3, after intermediate image 3 of original image 3 is obtained, the cached image (i.e., target image 2) is used to denoise intermediate image 3 to obtain target image 3, and the cached image in the cache is updated to target image 3, so that target image 2 is no longer the cached image.

By analogy, the target image corresponding to the previous frame of original image is used to continually update the cached image in the cache, so that the intermediate image corresponding to the current frame of original image can be denoised to obtain the target image corresponding to the current frame.
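The cache update described above can be sketched as a simple loop; net and denoise stand for the neural-network conversion (step 302) and the noise reduction (step 303), and are passed in as functions.

```python
def process_stream(raw_frames, net, denoise):
    """Sketch of the per-frame cache update: the target image of each frame
    becomes the cached image used to denoise the next frame."""
    cached = None
    targets = []
    for raw in raw_frames:
        intermediate = net(raw)                 # step 302
        # First frame: no cached image exists yet, so the intermediate image
        # is taken directly as the target image (one option the text allows).
        target = intermediate if cached is None else denoise(intermediate, cached)
        targets.append(target)
        cached = target                         # update the cache for the next frame
    return targets
```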
利用缓存图像对中间图像进行降噪处理,得到第二数据格式的目标图像,包括但不限于:根据中间图像和缓存图像获取中间图像中的每个像素点的运动估计值;根据运动估计值将中间图像转换为第二数据格式的目标图像。The buffer image is used to perform noise reduction processing on the intermediate image to obtain a target image in a second data format, including but not limited to: obtaining a motion estimation value of each pixel in the intermediate image according to the intermediate image and the buffer image; The intermediate image is converted into a target image in a second data format.
由以上技术方案可见,本公开实施例中,在利用神经网络将第一数据格式的原始图像转换为第二数据格式的中间图像后,可以利用缓存图像对中间图像进行降噪处理,得到第二数据格式的目标图像。由于缓存图像是与原始图像相邻的上一帧原始图像对应的目标图像,相邻两帧图像(即缓存图像和中间图像)在时间和空间上紧密相关。可以利 用相邻两帧图像的相关性来区分中间图像中的信号和噪声,对中间图像进行降噪处理,从而可以有效去除噪声。因此在光照条件较差时也能够有效去除噪声,使得目标图像中的噪声可以得到有效抑制。As can be seen from the above technical solutions, in the embodiment of the present disclosure, after the original image in the first data format is converted into the intermediate image in the second data format by using a neural network, the buffer image can be used to perform noise reduction processing on the intermediate image to obtain the second The target image in the data format. Since the cached image is the target image corresponding to the previous frame of the original image adjacent to the original image, the two adjacent frames (ie, the cached image and the intermediate image) are closely related in time and space. You can use the correlation between two adjacent frames to distinguish the signal and noise in the intermediate image, and perform noise reduction processing on the intermediate image to effectively remove the noise. Therefore, noise can be effectively removed even in poor lighting conditions, so that the noise in the target image can be effectively suppressed.
在一个实施例中,上述方法流程可以由图像处理装置100执行,如图4A所示,该图像处理装置100可以包含3个模块:图像处理模块101、视频处理模块102和训练学习模块103。其中,图像处理模块101用于执行上述步骤301和步骤302,视频处理模块102用于执行上述步骤303。In one embodiment, the above method flow may be executed by the image processing apparatus 100. As shown in FIG. 4A, the image processing apparatus 100 may include three modules: an image processing module 101, a video processing module 102, and a training and learning module 103. The image processing module 101 is configured to perform the foregoing steps 301 and 302, and the video processing module 102 is configured to perform the foregoing step 303.
其中,训练学习模块103可以是离线模块,预先利用图像数据集对神经网络进行训练调整,再将训练调整后的神经网络输出到图像处理模块101,可以执行上述实施例中介绍的第一神经网络和第二神经网络的离线训练。The training and learning module 103 may be an offline module. The neural network is trained and adjusted in advance using the image data set, and the trained and adjusted neural network is output to the image processing module 101. The first neural network described in the above embodiment may be executed. And offline training of the second neural network.
图像处理模块101从训练学习模块103获取预先调整好的神经网络,基于该神经网络对输入的第一数据格式的原始图像进行处理,输出第二数据格式的中间图像。The image processing module 101 obtains a pre-adjusted neural network from the training and learning module 103, processes the original image of the inputted first data format based on the neural network, and outputs the intermediate image of the second data format.
视频处理模块102接收图像处理模块101输出的一帧第二数据格式的中间图像,结合缓存中存储的缓存图像的信息进行降噪处理,得到目标图像,并将处理后的目标图像存入缓存中作为下一帧所对应的缓存图像,其中,视频处理模块102可以在缓存中记录一帧图像作为缓存图像。The video processing module 102 receives a frame of intermediate image in the second data format output by the image processing module 101, performs noise reduction processing in combination with the information of the cached image stored in the cache, obtains the target image, and stores the processed target image in the cache. As a cache image corresponding to the next frame, the video processing module 102 may record a frame of the image in the cache as a cache image.
在一个实施例中,如图4B所示,视频处理模块102由运动估计单元201和时域处理单元202构成。在这种情况下,可由运动估计单元201执行步骤S1,由时域处理单元202执行步骤S2,由运动估计单元201和时域处理单元202实现上述步骤303。In one embodiment, as shown in FIG. 4B, the video processing module 102 is composed of a motion estimation unit 201 and a time domain processing unit 202. In this case, step S1 may be performed by the motion estimation unit 201, step S2 may be performed by the time domain processing unit 202, and step 303 described above may be implemented by the motion estimation unit 201 and the time domain processing unit 202.
步骤S1,根据该中间图像和缓存图像获取该中间图像中的每个像素点的运动估计值。其中,缓存图像包括与该原始图像相邻的上一帧原始图像对应的目标图像。In step S1, a motion estimation value of each pixel in the intermediate image is obtained according to the intermediate image and the cache image. The cache image includes a target image corresponding to a previous frame of the original image adjacent to the original image.
步骤S2,根据所述运动估计值将中间图像转换为第二数据格式的目标图像。In step S2, the intermediate image is converted into a target image in a second data format according to the motion estimation value.
当然,上述由运动估计单元201和时域处理单元202组成的视频处理模块102只是一个示例,在实际应用中,视频处理模块102还可以包括其它单元,如噪声估计单元、空域处理单元等,噪声估计单元用于对图像进行噪声估计,而空域处理单元用于对图像进行空域处理,对此处理过程不做限制。Of course, the above-mentioned video processing module 102 composed of the motion estimation unit 201 and the time-domain processing unit 202 is just an example. In practical applications, the video processing module 102 may further include other units, such as a noise estimation unit, a spatial processing unit, and the like. The estimation unit is configured to perform noise estimation on the image, and the spatial processing unit is configured to perform spatial processing on the image, and there is no limitation on the processing process.
在另一个实施例中,如图4C所示,视频处理模块102由运动估计单元201、时域处理单元202、噪声估计单元203和空域处理单元204构成。在这种情况下,可由上述运动估计单元201、时域处理单元202、噪声估计单元203和空域处理单元204实现上述步骤303。或者,如图4D所示,视频处理模块102由运动估计单元201、时域处理单元202和空域处理单元204构成。在这种情况下,可由上述运动估计单元201、时域处理单元202和空域处理单元204实现上述步骤303。In another embodiment, as shown in FIG. 4C, the video processing module 102 is composed of a motion estimation unit 201, a time-domain processing unit 202, a noise estimation unit 203, and a spatial-domain processing unit 204. In this case, the aforementioned step 303 may be implemented by the aforementioned motion estimation unit 201, time domain processing unit 202, noise estimation unit 203, and spatial domain processing unit 204. Alternatively, as shown in FIG. 4D, the video processing module 102 is composed of a motion estimation unit 201, a time domain processing unit 202, and a spatial domain processing unit 204. In this case, the above-mentioned step 303 may be implemented by the above-mentioned motion estimation unit 201, time-domain processing unit 202, and space-domain processing unit 204.
In one embodiment, step S1 is implemented through steps 4031 and 4032, specifically:

Step 4031: Obtain a correlation image according to the intermediate image and the cached image. The correlation image is computed from the pixel values at corresponding positions of the intermediate image and the cached image according to a preset calculation method, which may be a frame-difference method, a convolution method, a cross-correlation method, or the like, without limitation.

Obtaining the correlation image according to the intermediate image and the cached image may include, but is not limited to:

Method 1: Compute the correlation image with the frame-difference method, i.e., take the difference of the pixel values of the intermediate image and the cached image at each pixel to obtain the correlation image. For example, for each pixel of the correlation image, obtain the first pixel value of that pixel in the intermediate image and the second pixel value of that pixel in the cached image, and take the difference between the first pixel value and the second pixel value as the pixel value of that pixel in the correlation image.

For example, suppose the intermediate image, the cached image, and the correlation image are all of size 6*4, where 6 is the number of pixels in the horizontal direction and 4 the number in the vertical direction. Of course, 6 and 4 are just an example: in practical applications the number of horizontal pixels is far greater than 6 and the number of vertical pixels far greater than 4, without limitation, but the intermediate image, the cached image, and the correlation image have the same size.

Further, suppose the pixels of the intermediate image are A11-A16, A21-A26, A31-A36, A41-A46, the pixels of the cached image are B11-B16, B21-B26, B31-B36, B41-B46, and the pixels of the correlation image are C11-C16, C21-C26, C31-C36, C41-C46.

In this scenario, since the pixel value of every pixel of the intermediate image and of the cached image is known, the pixel value of each pixel of the correlation image can be computed as follows: for pixel C11 of the correlation image, obtain the first pixel value of C11 in the intermediate image (the pixel value of A11) and the second pixel value of C11 in the cached image (the pixel value of B11), and then take the difference between the first and second pixel values as the pixel value of C11 in the correlation image. The other pixels of the correlation image are processed in the same way as C11, so the details are not repeated.
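A minimal NumPy sketch of Method 1, computing the signed per-pixel difference (e.g., C11 = A11 - B11); casting to a signed type to avoid underflow with unsigned inputs is an implementation choice.

```python
import numpy as np

def correlation_frame_diff(intermediate, cached):
    """Method 1: per-pixel difference between the intermediate image and the
    cached image; the result has the same size as both inputs."""
    return intermediate.astype(np.int32) - cached.astype(np.int32)
```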
Method 2: Compute the correlation image with the convolution method, i.e., convolve image blocks of the intermediate image and the cached image to obtain the correlation image, the block size being preset, e.g., 3*3. For example, for each pixel of the correlation image, select the first image block corresponding to that pixel from the intermediate image, select the second image block corresponding to that pixel from the cached image, and take the convolution value of the first image block with the second image block (i.e., the convolution value of the two matrices) as the pixel value of that pixel in the correlation image.

For example, suppose the pixels of the intermediate image are A11-A16, A21-A26, A31-A36, A41-A46, the pixels of the cached image are B11-B16, B21-B26, B31-B36, B41-B46, and the pixels of the correlation image are C11-C16, C21-C26, C31-C36, C41-C46.

In this scenario, since the pixel value of every pixel of the intermediate image and of the cached image is known, the pixel value of each pixel of the correlation image can be computed as follows: for pixel C11 of the correlation image, select from the intermediate image the first image block corresponding to C11, a 3*3 matrix whose first row contains pixels A11, A12, and A13, second row A21, A22, and A23, and third row A31, A32, and A33; select from the cached image the second image block corresponding to C11, a 3*3 matrix whose first row contains pixels B11, B12, and B13, second row B21, B22, and B23, and third row B31, B32, and B33. Then compute the convolution value of the first image block with the second image block; this convolution value is the pixel value of C11. The other pixels of the correlation image are processed in the same way as C11, so the details are not repeated.
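A minimal NumPy sketch of Method 2 for single-channel images, taking the "convolution value" of co-located blocks as the sum of their elementwise products; replicate padding at the borders is an assumption, since the disclosure does not specify how edge pixels are handled.

```python
import numpy as np

def correlation_patch_conv(intermediate, cached, s=3):
    """Method 2: for each pixel, take the s*s block around it in both images
    and use the sum of their elementwise products as the correlation value."""
    pad = s // 2
    a = np.pad(intermediate.astype(np.float64), pad, mode='edge')
    b = np.pad(cached.astype(np.float64), pad, mode='edge')
    h, w = intermediate.shape
    corr = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            corr[y, x] = np.sum(a[y:y + s, x:x + s] * b[y:y + s, x:x + s])
    return corr
```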
Of course, the above Method 1 and Method 2 are just two examples of obtaining the correlation image; other methods may also be used, such as computing the correlation image with the cross-correlation method, without limitation.

Step 4032: Obtain the motion estimation value of each pixel in the intermediate image according to the correlation image, where the motion estimation values may be binarized or continuous.

For example, at least one of smoothing, mapping, and thresholding may be used to obtain the motion estimation value of each pixel in the intermediate image. The smoothing may include image filtering operations with smoothing characteristics, such as mean filtering, median filtering, and Gaussian filtering; the filtering operation is not limited. The mapping may include linear scaling and translation operations. The thresholding may include determining the motion estimation value according to the relationship between the pixel value and a threshold, confining the motion estimation value to within the range delimited by the threshold; no limitation is imposed here.

The process of obtaining the motion estimation value of each pixel may include, but is not limited to:

Method 1: If the motion estimation values are binarized and obtained through smoothing and thresholding, the correlation image may be subjected to mean filtering and thresholding to obtain the motion estimation value of each pixel in the intermediate image.

For example, for each pixel of the intermediate image, a third image block corresponding to that pixel may be selected from the correlation image and mean-filtered to obtain the pixel value corresponding to that pixel; if this pixel value is greater than the threshold, the motion estimation value of the pixel is determined to be a first value (e.g., 1), and if it is not greater than the threshold, the motion estimation value is determined to be a second value (e.g., 0).

For example, suppose the pixels of the intermediate image are A11-A16, A21-A26, A31-A36, A41-A46, and the pixels of the correlation image are C11-C16, C21-C26, C31-C36, C41-C46. Since the correlation image is known (see step 4031), the pixel value of each of its pixels is known, and the motion estimation value of each pixel in the intermediate image can be computed as follows:

For pixel A11 of the intermediate image, a third image block corresponding to A11 may be selected from the correlation image; the third image block may be a 3*3 matrix (or a matrix of another size, without limitation) whose first row contains pixels C11, C12, and C13, second row C21, C22, and C23, and third row C31, C32, and C33. Then the 9 pixels of the third image block are mean-filtered (i.e., the mean of the 9 pixel values is computed), and the filtering result is the pixel value corresponding to A11. If this pixel value is greater than the threshold, the motion estimation value of A11 may be 1; if it is not greater than the threshold, the motion estimation value of A11 may be 0. The other pixels of the intermediate image are processed in the same way as A11, so the details are not repeated.
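A minimal sketch of Method 1, assuming a 3*3 mean-filter window; taking the absolute value first (useful when the correlation image is a signed frame difference) is also an assumption.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def motion_estimate_binary(corr, threshold):
    """Method 1: 3*3 mean filtering of the correlation image followed by
    thresholding; 1 marks a moving pixel, 0 a static one."""
    smoothed = uniform_filter(np.abs(corr).astype(np.float64), size=3)
    return np.where(smoothed > threshold, 1.0, 0.0)
```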
Method 2: If the motion estimation values are continuous and obtained through smoothing and mapping, the correlation image may be subjected to median filtering and linear scaling to obtain the motion estimation value of each pixel in the intermediate image.

For example, after median filtering and linear scaling are applied to the pixel value of each pixel of the correlation image, a filtered value within a specific interval (such as the interval [0, 1]) is obtained; then, for each pixel of the intermediate image, the corresponding filtered value within that interval is obtained and determined to be the motion estimation value of that pixel, i.e., the motion estimation value is also a value within the specific interval (such as [0, 1]).

For example, suppose the pixels of the intermediate image are A11-A16, A21-A26, A31-A36, A41-A46, and the pixels of the correlation image are C11-C16, C21-C26, C31-C36, C41-C46. Since the pixel value of each pixel of the correlation image is known, the motion estimation value of each pixel in the intermediate image can be computed as follows:

First, median-filter the pixel value of each pixel in the correlation image and linearly scale the filtered pixel values into the range [0, 1] to obtain filtered values within the interval [0, 1]. Supposing the pixel values range from 0 to 255, the pixel value of each pixel can be divided by 255 to obtain a filtered value within [0, 1]. For example, dividing the pixel value 10 of pixel C11 by 255 gives the filtered value 0.039, dividing the pixel value 50 of pixel C12 by 255 gives the filtered value 0.196, and so on. Then, pixel A11 of the intermediate image corresponds to the filtered value 0.039 of pixel C11, i.e., the motion estimation value of A11 is 0.039; pixel A12 corresponds to the filtered value 0.196 of pixel C12, i.e., the motion estimation value of A12 is 0.196; and so on.
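A minimal sketch of Method 2, assuming a 3*3 median-filter window and a 0-255 input range for the linear scaling:

```python
import numpy as np
from scipy.ndimage import median_filter

def motion_estimate_continuous(corr, max_value=255.0):
    """Method 2: median filtering of the correlation image followed by
    linear scaling into [0, 1] (dividing by the assumed 0-255 range)."""
    smoothed = median_filter(np.abs(corr).astype(np.float64), size=3)
    return np.clip(smoothed / max_value, 0.0, 1.0)
```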
当然,上述方式一和方式二只是获取中间图像中的每个像素点的运动估计值的两个示例,还可以采用其它方式获取运动估计值,对此不做限制。Of course, the above manners 1 and 2 are only two examples of acquiring the motion estimation value of each pixel in the intermediate image, and other manners can also be used to acquire the motion estimation value, which is not limited.
在一个实施例中,可通过步骤4041和步骤4042来实现上述步骤S2,当然,还可以采用其它方式获取第二数据格式的目标图像,对此不做限制,只要能够根据运动估计值得到目标图像即可。步骤4041和步骤4042具体为:In one embodiment, the above step S2 can be implemented through steps 4041 and 4042. Of course, the target image in the second data format can also be obtained by other methods, which is not limited as long as the target image can be obtained based on the motion estimation value. Just fine. Step 4041 and step 4042 are specifically:
步骤4041,根据该中间图像和该缓存图像获取低噪图像,低噪图像是噪声程度低于中间图像的图像,对其获取方式不做限制,例如低噪图像可以是中间图像和缓存图像的均值图像。Step 4041: Obtain a low-noise image according to the intermediate image and the cached image. The low-noise image is an image with a lower noise level than the intermediate image, and there is no restriction on the acquisition method. For example, the low-noise image may be the average of the intermediate image and the cached image. image.
例如,假设中间图像的像素点依次是A11-A16、A21-A26、A31-A36、A41-A46,假设缓存图像的像素点依次是B11-B16、B21-B26、B31-B36、B41-B46,假设低噪图像的像素点依次是D11-D16、D21-D26、D31-D36、D41-D46。基于此,由于中间图像和缓存图像的每个像素点的像素值为已知,计算低噪图像的每个像素点的像素值可以采用如下方式:For example, assume that the pixels of the intermediate image are A11-A16, A21-A26, A31-A36, A41-A46, and the pixels of the cache image are B11-B16, B21-B26, B31-B36, and B41-B46. Assume that the pixels of the low-noise image are D11-D16, D21-D26, D31-D36, and D41-D46 in this order. Based on this, since the pixel value of each pixel point of the intermediate image and the cache image is known, the pixel value of each pixel point of the low-noise image can be calculated as follows:
针对低噪图像中的像素点D11,获取像素点D11在中间图像对应的像素值(像素点A11的像素值)、像素点D11在缓存图像对应的像素值(像素点B11的像素值),将上述两个像素值的均值确定为像素点D11在低噪图像的像素值。对于低噪图像中的其它像素点,处理方式参见像素点D11,不再重复赘述。For the pixel point D11 in the low-noise image, obtain the pixel value corresponding to the pixel point D11 in the intermediate image (the pixel value of the pixel point A11) and the pixel value corresponding to the pixel point D11 in the cache image (the pixel value of the pixel point B11). The average value of the above two pixel values is determined as the pixel value of the pixel point D11 in the low-noise image. For other pixels in the low-noise image, refer to pixel D11 for the processing method, and the details will not be repeated.
Step 4042: obtain the target image according to the intermediate image, the low-noise image, and the motion estimation values.
In an example, obtaining the target image according to the intermediate image, the low-noise image, and the motion estimation values may include, but is not limited to: determining the pixel value of a first pixel in the target image, the first pixel being any pixel in the target image, where the pixel value of the first pixel is determined in the following manner: determining a first weight of the first pixel in the intermediate image and a second weight of the first pixel in the low-noise image according to the motion estimation value of the first pixel; determining the pixel value of the first pixel in the target image according to the pixel value corresponding to the first pixel in the intermediate image and the first weight, and the pixel value corresponding to the first pixel in the low-noise image and the second weight; and determining the target image according to the pixel values of all pixels of the target image.
Assuming the motion estimation value of the first pixel is A, the first weight of the first pixel in the intermediate image may be A, and the second weight of the first pixel in the low-noise image may be (1-A). Assuming further that the pixel value corresponding to the first pixel in the intermediate image is N and the pixel value corresponding to the first pixel in the low-noise image is M, the pixel value of the first pixel in the target image is N*A + M*(1-A). Of course, this is merely an example; the pixel value of the first pixel in the target image may also be obtained in other manners, and no limitation is imposed in this regard.
For example, suppose the pixels of the intermediate image are, in order, A11-A16, A21-A26, A31-A36, A41-A46; the pixels of the low-noise image are, in order, D11-D16, D21-D26, D31-D36, D41-D46; and the pixels of the target image are, in order, E11-E16, E21-E26, E31-E36, E41-E46. Since the pixel value of every pixel in the intermediate image and the low-noise image is known, the pixel value of each pixel in the target image may be calculated as follows:
For pixel E11 in the target image, determine the first weight as the motion estimation value of E11 and the second weight as 1 minus the motion estimation value of E11, and then compute: pixel value of A11 * first weight + pixel value of D11 * second weight; the result of this calculation is the pixel value of E11. The other pixels in the target image are processed in the same way as E11, and the details are not repeated here.
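Likewise, the per-pixel weighted blend of step 4042 may be sketched as follows, assuming the motion estimation values have already been normalized to the range [0, 1] per pixel; the helper name blend_target_image is again hypothetical.

```python
import numpy as np

def blend_target_image(intermediate: np.ndarray,
                       low_noise: np.ndarray,
                       motion: np.ndarray) -> np.ndarray:
    """Blend the intermediate and low-noise images per pixel.

    A pixel with a large motion estimate (close to 1) takes its value mostly
    from the current intermediate image; a pixel with a small motion estimate
    (close to 0) takes its value mostly from the temporally averaged low-noise
    image, i.e. target = motion * N + (1 - motion) * M.
    """
    motion = np.clip(motion.astype(np.float32), 0.0, 1.0)
    target = (motion * intermediate.astype(np.float32)
              + (1.0 - motion) * low_noise.astype(np.float32))
    return target.astype(intermediate.dtype)
```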
As can be seen from the above embodiments, a larger motion estimation value means a larger first weight and a smaller second weight, i.e., the pixel value of the intermediate image takes a larger proportion and the pixel value of the low-noise image a smaller one; conversely, a smaller motion estimation value means a smaller first weight and a larger second weight, i.e., the pixel value of the intermediate image takes a smaller proportion and the pixel value of the low-noise image a larger one.
Based on this principle, when the motion estimation value of pixel E11 is relatively large (e.g., greater than 0.5), the first weight is greater than the second weight. A relatively large motion estimation value indicates that pixel E11 has changed significantly compared with the previous frame, i.e., the pixel value of the current frame is more reliable. Clearly, since the first weight is greater than the second weight, the pixel value of the intermediate image of the current frame takes the larger proportion, which matches the case of a large motion estimation value; in this way, the pixel value of E11 in the target image is more accurate.
In the above embodiments, the pixel value of a pixel may include, but is not limited to, the gray value, luminance value, or chrominance value of the pixel; the type of pixel value is not limited and may depend on the actual image processing.
It can be seen from the above technical solutions that, in the embodiments of the present disclosure, after the original image in the first data format is converted into the intermediate image in the second data format by using a neural network, the cached image can be used to perform noise reduction processing on the intermediate image to obtain the target image in the second data format. Since the cached image is the target image corresponding to the previous frame of original image adjacent to the original image, the two adjacent frames (i.e., the cached image and the intermediate image) are closely correlated in time and space. This correlation between adjacent frames can be used to distinguish signal from noise in the images and to perform noise reduction processing on the intermediate image. Noise can therefore be effectively removed even under poor lighting conditions, so that noise in the target image is effectively suppressed.
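To make the recursion explicit, the following non-authoritative sketch runs the noise-reduction stage over a stream of intermediate images; the names denoise_stream and estimate_motion are hypothetical, the manner of obtaining the motion estimation values is left to the caller (the disclosure permits several), and the pass-through handling of the first frame is an assumption not stated in the text.

```python
import numpy as np

def denoise_stream(frames, estimate_motion):
    """Recursive temporal denoising over a sequence of intermediate images.

    `frames` yields intermediate images (second data format) produced by the
    neural network; `estimate_motion(intermediate, cached)` must return
    per-pixel motion estimation values in [0, 1]. The first frame has no
    cached image and is passed through unchanged.
    """
    cached = None
    for intermediate in frames:
        if cached is None:
            target = intermediate  # no previous target image available yet
        else:
            inter_f = intermediate.astype(np.float32)
            low_noise = (inter_f + cached.astype(np.float32)) / 2.0   # step 4041
            motion = np.clip(estimate_motion(intermediate, cached), 0.0, 1.0)
            target = (motion * inter_f                                 # step 4042
                      + (1.0 - motion) * low_noise).astype(intermediate.dtype)
        cached = target  # the target image becomes the cached image for the next frame
        yield target
```

Because each target image is fed back as the cached image for the next frame, noise suppression accumulates over static regions, while regions with large motion estimates continue to follow the current frame.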
Based on the same concept as the above method, an embodiment of the present disclosure further provides an image processing apparatus. As shown in FIG. 5, which is a structural diagram of the image processing apparatus, the image processing apparatus includes:
an image processing module 501, configured to obtain an original image in a first data format and convert the original image into an intermediate image in a second data format by using a neural network; and
a video processing module 502, configured to perform noise reduction processing on the intermediate image by using a cached image to obtain a target image in the second data format, wherein the cached image includes a target image corresponding to a previous frame of original image adjacent to the original image.
When using a neural network to convert the original image into the intermediate image in the second data format, the image processing module 501 is specifically configured to:
input the original image into a first neural network, so that the first neural network converts the original image in the first data format into the intermediate image in the second data format; or
obtain device parameters of the device that captures the original image in the first data format; and
input the original image and the device parameters into a second neural network, so that the second neural network converts the original image in the first data format into the intermediate image in the second data format according to the device parameters.
When obtaining the motion estimation value of each pixel in the intermediate image according to the intermediate image and the cached image, the video processing module 502 is specifically configured to: obtain a correlation image according to the intermediate image and the cached image; and obtain the motion estimation value of each pixel in the intermediate image according to the correlation image.
The video processing module 502 obtains the motion estimation value of each pixel in the intermediate image by using at least one of smoothing processing, mapping processing, and threshold processing.
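By way of illustration only, one plausible combination of smoothing, mapping, and threshold processing over a correlation image is sketched below; the absolute-difference correlation image, the Gaussian smoothing, and the linear-mapping constants low and high are all assumptions, since the disclosure does not fix the concrete operators.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def estimate_motion(intermediate: np.ndarray, cached: np.ndarray,
                    sigma: float = 1.5,
                    low: float = 0.05, high: float = 0.5) -> np.ndarray:
    """Hypothetical per-pixel motion estimate in [0, 1].

    1. Correlation image: absolute difference between the two frames.
    2. Smoothing: Gaussian filtering to suppress noise-only differences.
    3. Mapping + thresholding: linearly rescale between `low` and `high`
       and clip, so small differences map to 0 and large ones to 1.
    """
    diff = np.abs(intermediate.astype(np.float32) - cached.astype(np.float32))
    diff /= max(float(diff.max()), 1e-6)        # normalize to [0, 1]
    smoothed = gaussian_filter(diff, sigma=sigma)
    motion = (smoothed - low) / (high - low)    # linear mapping
    return np.clip(motion, 0.0, 1.0)            # threshold to [0, 1]
```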
When converting the intermediate image into the target image in the second data format according to the motion estimation values, the video processing module 502 is specifically configured to: obtain a low-noise image according to the intermediate image and the cached image; and obtain the target image according to the intermediate image, the low-noise image, and the motion estimation values.
When obtaining the target image according to the intermediate image, the low-noise image, and the motion estimation values, the video processing module 502 is specifically configured to:
determine a pixel value of a first pixel in the target image, the first pixel being any pixel in the target image, where the pixel value of the first pixel is determined in the following manner: determining, according to the motion estimation value corresponding to the first pixel, a first weight of the first pixel in the intermediate image and a second weight of the first pixel in the low-noise image; and determining the pixel value of the first pixel in the target image according to the pixel value corresponding to the first pixel in the intermediate image and the first weight, and the pixel value corresponding to the first pixel in the low-noise image and the second weight; and
determine the target image according to the pixel values of all pixels of the target image.
In terms of hardware, a schematic diagram of the hardware architecture of the image processing device provided by the embodiments of the present disclosure may be as shown in FIG. 6, which includes a processor 601 and a machine-readable storage medium 602, wherein the machine-readable storage medium 602 stores machine-executable instructions executable by the processor 601, and the processor 601 is configured to execute the machine-executable instructions to implement the image processing method disclosed in the above examples of the present disclosure.
The machine-readable storage medium may be any electronic, magnetic, optical, or other physical storage device, and may contain or store information such as executable instructions and data. For example, the machine-readable storage medium may be a RAM (Random Access Memory), a volatile memory, a non-volatile memory, a flash memory, a storage drive (such as a hard disk drive), a solid-state drive, any type of storage disk (such as an optical disc or a DVD), a similar storage medium, or a combination thereof.
The system, apparatus, module, or unit described in the above embodiments may be implemented by a computer chip or an entity, or by a product having a certain function. A typical implementation device is a computer, and the specific form of the computer may be a personal computer, a laptop computer, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email transceiver, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above apparatus is described with its functions divided into various units. Of course, when implementing the present disclosure, the functions of the units may be implemented in one or more pieces of software and/or hardware.
Those skilled in the art should understand that the embodiments of the present disclosure may be provided as a method, a system, or a computer program product. Therefore, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the embodiments of the present disclosure may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, and the like) containing computer-usable program code.
The present disclosure is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to the embodiments of the present disclosure. It should be understood that each process and/or block in the flowcharts and/or block diagrams, and combinations of processes and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or the other programmable data processing device produce an apparatus for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
Furthermore, these computer program instructions may also be stored in a computer-readable memory capable of directing a computer or another programmable data processing device to operate in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus, which implements the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, so that a series of operation steps are performed on the computer or the other programmable device to produce computer-implemented processing; the instructions executed on the computer or the other programmable device thereby provide steps for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
The above are merely embodiments of the present disclosure and are not intended to limit the present disclosure. Those skilled in the art may make various modifications and variations to the present disclosure. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present disclosure shall fall within the scope of the claims of the present disclosure.

Claims (15)

  1. An image processing method, comprising:
    obtaining an original image in a first data format;
    converting the original image into an intermediate image in a second data format by using a neural network; and
    performing noise reduction processing on the intermediate image by using a cached image to obtain a target image in the second data format, wherein the cached image comprises a target image corresponding to a previous frame of original image adjacent to the original image.
  2. The method according to claim 1, wherein the neural network comprises:
    at least one convolutional layer, at least one pooling layer, and at least one fully connected layer; or
    at least one convolutional layer and at least one activation layer.
  3. The method according to claim 1, wherein converting the original image into the intermediate image in the second data format by using the neural network comprises:
    inputting the original image into a first neural network, so that the first neural network converts the original image in the first data format into the intermediate image in the second data format; or
    obtaining device parameters of a device that captures the original image in the first data format; and
    inputting the original image and the device parameters into a second neural network, so that the second neural network converts the original image in the first data format into the intermediate image in the second data format according to the device parameters.
  4. The method according to claim 1, wherein performing noise reduction processing on the intermediate image by using the cached image to obtain the target image in the second data format comprises:
    obtaining a motion estimation value of each pixel in the intermediate image according to the intermediate image and the cached image; and
    converting the intermediate image into the target image in the second data format according to the motion estimation values.
  5. The method according to claim 4, wherein the motion estimation value of each pixel in the intermediate image is obtained by using at least one of smoothing processing, mapping processing, and threshold processing.
  6. The method according to claim 4, wherein converting the intermediate image into the target image in the second data format according to the motion estimation values comprises:
    obtaining a low-noise image according to the intermediate image and the cached image; and
    obtaining the target image according to the intermediate image, the low-noise image, and the motion estimation values.
  7. The method according to claim 6, wherein obtaining the target image according to the intermediate image, the low-noise image, and the motion estimation values comprises:
    determining a pixel value of a first pixel in the target image, the first pixel being any pixel in the target image,
    wherein the pixel value of the first pixel is determined in the following manner:
    determining, according to the motion estimation value corresponding to the first pixel, a first weight of the first pixel in the intermediate image and a second weight of the first pixel in the low-noise image; and
    determining the pixel value of the first pixel in the target image according to the pixel value corresponding to the first pixel in the intermediate image and the first weight, and the pixel value corresponding to the first pixel in the low-noise image and the second weight; and
    determining the target image according to the pixel values of all pixels of the target image.
  8. An image processing apparatus, comprising:
    an image processing module, configured to obtain an original image in a first data format and convert the original image into an intermediate image in a second data format by using a neural network; and
    a video processing module, configured to perform noise reduction processing on the intermediate image by using a cached image to obtain a target image in the second data format, wherein the cached image comprises a target image corresponding to a previous frame of original image adjacent to the original image.
  9. The apparatus according to claim 8, wherein, when using a neural network to convert the original image into the intermediate image in the second data format, the image processing module is specifically configured to:
    input the original image into a first neural network, so that the first neural network converts the original image in the first data format into the intermediate image in the second data format; or
    obtain device parameters of a device that captures the original image in the first data format; and
    input the original image and the device parameters into a second neural network, so that the second neural network converts the original image in the first data format into the intermediate image in the second data format according to the device parameters.
  10. The apparatus according to claim 8, wherein, when performing noise reduction processing on the intermediate image by using the cached image to obtain the target image in the second data format, the video processing module is specifically configured to:
    obtain a motion estimation value of each pixel in the intermediate image according to the intermediate image and the cached image; and
    convert the intermediate image into the target image in the second data format according to the motion estimation values.
  11. The apparatus according to claim 10, wherein
    the video processing module obtains the motion estimation value of each pixel in the intermediate image by using at least one of smoothing processing, mapping processing, and threshold processing.
  12. The apparatus according to claim 10, wherein,
    when converting the intermediate image into the target image in the second data format according to the motion estimation values, the video processing module is specifically configured to:
    obtain a low-noise image according to the intermediate image and the cached image; and
    obtain the target image according to the intermediate image, the low-noise image, and the motion estimation values.
  13. The apparatus according to claim 12, wherein, when obtaining the target image according to the intermediate image, the low-noise image, and the motion estimation values, the video processing module is specifically configured to:
    determine a pixel value of a first pixel in the target image, the first pixel being any pixel in the target image,
    wherein the pixel value of the first pixel is determined in the following manner:
    determining, according to the motion estimation value corresponding to the first pixel, a first weight of the first pixel in the intermediate image and a second weight of the first pixel in the low-noise image; and
    determining the pixel value of the first pixel in the target image according to the pixel value corresponding to the first pixel in the intermediate image and the first weight, and the pixel value corresponding to the first pixel in the low-noise image and the second weight; and
    determine the target image according to the pixel values of all pixels of the target image.
  14. An image processing device, comprising a processor and a machine-readable storage medium, wherein the machine-readable storage medium stores machine-executable instructions executable by the processor, and the processor is configured to execute the machine-executable instructions to implement the method steps according to any one of claims 1-7.
  15. A machine-readable storage medium having computer instructions stored thereon, wherein, when the computer instructions are executed, the method steps according to any one of claims 1-7 are implemented.
PCT/CN2019/089272 2018-05-31 2019-05-30 Image processing method, apparatus and device, and machine-readable storage medium WO2019228456A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810556530.2 2018-05-31
CN201810556530.2A CN110555808B (en) 2018-05-31 2018-05-31 Image processing method, device, equipment and machine-readable storage medium

Publications (1)

Publication Number Publication Date
WO2019228456A1 true WO2019228456A1 (en) 2019-12-05

Family

ID=68698716

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/089272 WO2019228456A1 (en) 2018-05-31 2019-05-30 Image processing method, apparatus and device, and machine-readable storage medium

Country Status (2)

Country Link
CN (1) CN110555808B (en)
WO (1) WO2019228456A1 (en)

Families Citing this family (5)

Publication number Priority date Publication date Assignee Title
CN111402153B (en) * 2020-03-10 2023-06-13 上海富瀚微电子股份有限公司 Image processing method and system
CN111476730B (en) * 2020-03-31 2024-02-13 珠海格力电器股份有限公司 Image restoration processing method and device
CN111583142B (en) * 2020-04-30 2023-11-28 深圳市商汤智能传感科技有限公司 Image noise reduction method and device, electronic equipment and storage medium
CN113724142B (en) * 2020-05-26 2023-08-25 杭州海康威视数字技术股份有限公司 Image Restoration System and Method
TWI828160B (en) * 2022-05-24 2024-01-01 國立陽明交通大學 Analyzing method of cilia images

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
EP1865460B1 (en) * 2005-03-31 2011-05-04 Nikon Corporation Image processing method
CN102769722B (en) * 2012-07-20 2015-04-29 上海富瀚微电子股份有限公司 Time-space domain hybrid video noise reduction device and method
CN105163102B (en) * 2015-06-30 2017-04-05 北京空间机电研究所 A kind of real time imaging auto white balance system and method based on FPGA
CN107767343B (en) * 2017-11-09 2021-08-31 京东方科技集团股份有限公司 Image processing method, processing device and processing equipment
CN107730474B (en) * 2017-11-09 2022-02-22 京东方科技集团股份有限公司 Image processing method, processing device and processing equipment
CN107767408B (en) * 2017-11-09 2021-03-12 京东方科技集团股份有限公司 Image processing method, processing device and processing equipment

Patent Citations (3)

Publication number Priority date Publication date Assignee Title
CN106803090A (en) * 2016-12-05 2017-06-06 中国银联股份有限公司 A kind of image-recognizing method and device
CN107424184A (en) * 2017-04-27 2017-12-01 厦门美图之家科技有限公司 A kind of image processing method based on convolutional neural networks, device and mobile terminal
CN107220951A (en) * 2017-05-31 2017-09-29 广东欧珀移动通信有限公司 Facial image noise-reduction method, device, storage medium and computer equipment

Cited By (5)

Publication number Priority date Publication date Assignee Title
CN112001912A (en) * 2020-08-27 2020-11-27 北京百度网讯科技有限公司 Object detection method and device, computer system and readable storage medium
CN112001912B (en) * 2020-08-27 2024-04-05 北京百度网讯科技有限公司 Target detection method and device, computer system and readable storage medium
CN115661156A (en) * 2022-12-28 2023-01-31 成都数联云算科技有限公司 Image generation method, image generation device, storage medium, equipment and computer program product
CN116385307A (en) * 2023-04-11 2023-07-04 任成付 Picture information filtering effect identification system
CN116385307B (en) * 2023-04-11 2024-05-03 衡阳市欣嘉传媒有限公司 Picture information filtering effect identification system

Also Published As

Publication number Publication date
CN110555808B (en) 2022-05-31
CN110555808A (en) 2019-12-10

Similar Documents

Publication Publication Date Title
WO2019228456A1 (en) Image processing method, apparatus and device, and machine-readable storage medium
US9615039B2 (en) Systems and methods for reducing noise in video streams
KR102574141B1 (en) Image display method and device
US9262807B2 (en) Method and system for correcting a distorted input image
WO2021047345A1 (en) Image noise reduction method and apparatus, and storage medium and electronic device
GB2592835A (en) Configurable convolution engine for interleaved channel data
US8315474B2 (en) Image processing device and method, and image sensing apparatus
WO2020152521A1 (en) Systems and methods for transforming raw sensor data captured in low-light conditions to well-exposed images using neural network architectures
WO2021082883A1 (en) Main body detection method and apparatus, and electronic device and computer readable storage medium
CN108876753A (en) Optional enhancing is carried out using navigational figure pairing growth exposure image
US9282253B2 (en) System and method for multiple-frame based super resolution interpolation for digital cameras
CN107071234A (en) A kind of camera lens shadow correction method and device
US8861846B2 (en) Image processing apparatus, image processing method, and program for performing superimposition on raw image or full color image
CN106134180B (en) Image processing apparatus and method, storage being capable of recording mediums by the image processing program that computer is temporarily read, photographic device
WO2021035524A1 (en) Image processing method and apparatus, electronic device, and computer-readable storage medium
CN113784014B (en) Image processing method, device and equipment
Zhao et al. End-to-end denoising of dark burst images using recurrent fully convolutional networks
WO2019228450A1 (en) Image processing method, device, and equipment, and readable medium
CN114390188B (en) Image processing method and electronic equipment
CN113744355B (en) Pulse signal processing method, device and equipment
CN110930440A (en) Image alignment method and device, storage medium and electronic equipment
US11195247B1 (en) Camera motion aware local tone mapping
US11270412B2 (en) Image signal processor, method, and system for environmental mapping
CN104776919B (en) Infrared focal plane array ribbon Nonuniformity Correction system and method based on FPGA
JP4664259B2 (en) Image correction apparatus and image correction method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19811302

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19811302

Country of ref document: EP

Kind code of ref document: A1
