WO2019228456A1 - Image processing method, apparatus and device, and machine-readable storage medium

Image processing method, apparatus and device, and machine-readable storage medium

Info

Publication number
WO2019228456A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
pixel
data format
neural network
intermediate image
Prior art date
Application number
PCT/CN2019/089272
Other languages
English (en)
Chinese (zh)
Inventor
姜子伦
肖飞
范蒙
俞海
Original Assignee
杭州海康威视数字技术股份有限公司 (Hangzhou Hikvision Digital Technology Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 杭州海康威视数字技术股份有限公司 (Hangzhou Hikvision Digital Technology Co., Ltd.)
Publication of WO2019228456A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 - Image enhancement or restoration
    • G06T5/70 - Denoising; Smoothing

Definitions

  • the present application relates to the field of image technology, and in particular, to an image processing method, apparatus, device, and machine-readable storage medium.
  • The original image in the first data format collected by the imaging device usually cannot be directly displayed or transmitted. Therefore, the original image in the first data format needs to be converted into a target image in a second data format for display or transmission.
  • an ISP (Image Signal Processing) algorithm can be used to convert the original image into a target image.
  • The ISP algorithm handles image processing tasks such as brightness and color compensation and correction.
  • When the imaging device uses the ISP algorithm to convert the original image into the target image, the target image loses some of the original image information. If the loss of original image information is severe, it may not be repairable later.
  • For example, an original image collected under poor lighting conditions contains significant noise after being processed by the ISP algorithm.
  • The present disclosure provides an image processing method, apparatus, device, and machine-readable storage medium, which can effectively remove noise, improve the quality of the target image, and improve user experience.
  • The present disclosure provides an image processing method, which includes: obtaining an original image in a first data format; converting the original image into an intermediate image in a second data format using a neural network; and performing noise reduction processing on the intermediate image using a cached image to obtain a target image in the second data format.
  • The cached image includes a target image corresponding to a previous frame of the original image adjacent to the original image.
  • the present disclosure provides an image processing apparatus including:
  • An image processing module configured to obtain an original image in a first data format, and convert the original image into an intermediate image in a second data format using a neural network;
  • a video processing module, configured to perform noise reduction processing on the intermediate image using a cached image to obtain a target image in the second data format, where the cached image includes the target image corresponding to the previous frame of the original image adjacent to the original image.
  • The present disclosure provides an image processing device including a processor and a machine-readable storage medium, where the machine-readable storage medium stores machine-executable instructions that can be executed by the processor, and the processor is configured to execute the machine-executable instructions to implement the method steps described above.
  • the present disclosure provides a machine-readable storage medium.
  • Computer instructions are stored on the machine-readable storage medium.
  • In the present disclosure, after the original image in the first data format is converted into the intermediate image in the second data format, the cached image can be used to perform noise reduction processing on the intermediate image to obtain the target image in the second data format. Since the cached image is the target image corresponding to the previous frame of the original image adjacent to the original image, the two frames of images (that is, the cached image and the intermediate image) are closely related in time and space.
  • The correlation between the two frames can be used to distinguish signal from noise in the image, and noise reduction processing can be performed on the intermediate image to effectively remove the noise. Therefore, noise can be effectively removed even under poor lighting conditions, so that noise in the image is effectively suppressed and the quality of the target image is improved.
  • In addition, converting the original image in the first data format into the intermediate image in the second data format using a neural network can reduce the loss of original image information in the intermediate image, so that the image can be repaired later.
  • FIGS. 1A and 1B are schematic diagrams of a neural network in an embodiment of the present disclosure.
  • FIGS. 2A-2C are schematic diagrams of an offline training neural network in an embodiment of the present disclosure.
  • FIG. 3 is a flowchart of an image processing method according to an embodiment of the present disclosure.
  • FIGS. 4A-4D are schematic diagrams of image processing in an embodiment of the present disclosure.
  • FIG. 5 is a structural diagram of an image processing apparatus in an embodiment of the present disclosure.
  • FIG. 6 is a hardware configuration diagram of an image processing apparatus in an embodiment of the present disclosure.
  • The terms first, second, third, etc. may be used in the embodiments of the present disclosure to describe various kinds of information; these terms are only used to distinguish information of the same type from each other. For example, first information may also be referred to as second information, and second information may also be referred to as first information. Depending on the context, the word "if" can be interpreted as "when", "upon", or "in response to determining".
  • An embodiment of the present disclosure provides an image processing method, which can be applied to an image processing device.
  • the image processing device may be an imaging device, such as a video camera, and the type of the image processing device is not limited.
  • In the method, the original image in the first data format may be converted into an intermediate image in the second data format using a neural network, and then the cached image is used to perform noise reduction processing on the intermediate image to obtain the target image in the second data format. That is, the target image is an image that has undergone noise reduction processing.
  • Since the two frames of images (that is, the cached image and the intermediate image) are closely related, the cached image can be used to perform noise reduction processing on the intermediate image. This effectively removes noise even when lighting conditions are poor, so that noise in the image can be effectively suppressed.
  • the image collected by the image processing device may be an original image, and a data format of the original image may be a first data format.
  • the first data format is an original image format, and usually includes image data of one or more spectral bands.
  • The original image in the first data format cannot be directly displayed or transmitted; that is, an abnormality occurs if the original image in the first data format is displayed or transmitted directly.
  • the first data format may include a Bayer format.
  • The Bayer format is only an example; the first data format is not limited to it, and all original image data formats are within the protection scope of the present disclosure.
  • After the image processing device uses the neural network to convert the original image, the intermediate image is obtained.
  • The intermediate image is the output image of the neural network, not the final target image.
  • After the image processing device performs noise reduction processing on the intermediate image using the cached image, the target image, that is, the final output image, is obtained.
  • the data format of the intermediate image and the target image may be a second data format, and the second data format is any image format suitable for display or transmission. For example, when a target image in a second data format is displayed or transmitted, no abnormality occurs.
  • The second data format may include an RGB (Red, Green, Blue) format, a YUV (luminance-chrominance) format, and the like.
  • All image formats suitable for display or transmission are within the protection scope of the present disclosure.
  • the following describes the neural network in the embodiment of the present disclosure, which can be used to convert an original image in a first data format into an intermediate image in a second data format.
  • The neural network can also be used to optimize the original image, such as adjusting attributes of the original image including brightness, color, contrast, signal-to-noise ratio, and size. This optimization method is not limited.
  • The neural network in the present disclosure may include, but is not limited to, a convolutional neural network (CNN), a recurrent neural network (RNN), a fully connected network, etc.
  • In the following, the convolutional neural network is taken as an example.
  • the structural units of the neural network in the present disclosure may include, but are not limited to, one or any combination of the following: a convolutional layer, a pooling layer, an excitation layer, a fully connected layer, and the like.
  • For example, the neural network may include: at least one convolutional layer, at least one pooling layer, and at least one fully connected layer.
  • Alternatively, the neural network may include: at least one convolutional layer and at least one excitation layer.
  • FIGS. 1A and 1B show two examples of the neural network used in this embodiment.
  • As shown in FIG. 1A, the neural network may be composed of several convolutional layers (Conv), several pooling layers (Pool), and a fully connected layer (FC). There are no restrictions on the number of convolutional layers or the number of pooling layers.
  • As shown in FIG. 1B, the neural network may be composed of several convolutional layers and several excitation layers. There are no restrictions on the number of convolutional layers or the number of excitation layers.
  • The neural network used in the present disclosure for converting the original image in the first data format into the intermediate image in the second data format may also have other structures, as long as it includes at least one convolutional layer; it is not limited to the structures of FIG. 1A or FIG. 1B, which are merely examples.
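  • As a concrete illustration of these two layouts, the following sketch (PyTorch; all layer counts, channel widths, and sizes are illustrative assumptions, not values taken from the patent) builds one network in the style of FIG. 1A and one in the style of FIG. 1B:

```python
import torch.nn as nn

# FIG. 1A style: convolutional layers, pooling layers, then a fully connected layer.
# The FC layer emits a flat vector that would be reshaped into the output image.
fig_1a_style = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),  # e.g. a single-plane raw input
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.LazyLinear(3 * 64 * 64),                  # hypothetical 64x64 RGB output
)

# FIG. 1B style: alternating convolutional and excitation (ReLU) layers only.
fig_1b_style = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 3, kernel_size=3, padding=1),  # 3 output channels, e.g. RGB
)
```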
  • the image features are enhanced by performing a convolution operation on the image using a convolution kernel.
  • the convolution kernel can be a matrix of size m * n.
  • the input of the convolution layer and the convolution kernel are convolved to obtain the output of the convolution layer.
  • the convolution operation is actually a filtering process.
  • the pixel value f (x, y) of the point (x, y) on the image is convolved with the convolution kernel w (x, y).
  • For example, a 4 * 4 convolution kernel may be used. The kernel contains 16 values, which can be configured as required. Sliding a 4 * 4 window over the image in order yields multiple 4 * 4 sliding windows, and the kernel is convolved with each sliding window to obtain multiple convolution features.
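  • The sliding-window operation described above can be written out directly. A minimal sketch (NumPy; a deliberately unoptimized loop, and, following the usual CNN convention, the kernel is not flipped):

```python
import numpy as np

def conv_features(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide the kernel over the image and compute one convolution feature
    per window, as described above."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for y in range(out_h):
        for x in range(out_w):
            window = image[y:y + kh, x:x + kw]   # e.g. one 4 * 4 sliding window
            out[y, x] = np.sum(window * kernel)  # convolution feature for this window
    return out
```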
  • the processing of the pooling layer is actually a process of downsampling. By performing operations such as maximizing, minimizing, and averaging multiple convolutional features output by the convolutional layer, the amount of calculation can be reduced and feature invariance can be maintained.
  • the principle of local image correlation can be used to sub-sample the image, which can reduce the amount of data processing and retain useful information.
  • the following formula for performing maximum pooling can be used to pool the convolution features and obtain the pooled features.
  • s represents the corresponding window size (s * s) during the pooling process
  • m and n are set values
  • j and k are convolution features output by the convolution layer
  • i represents the i-th image
  • y i j, k represents the features obtained by pooling the i-th image.
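  • A minimal sketch of this maximum pooling formula (NumPy; edge rows or columns that do not fill a complete window are simply dropped, which is one common convention):

```python
import numpy as np

def max_pool(features: np.ndarray, s: int) -> np.ndarray:
    """Max pooling with an s x s window and stride s, matching the formula above;
    `features` is one channel of convolution features."""
    h, w = features.shape
    out = np.empty((h // s, w // s))
    for j in range(h // s):
        for k in range(w // s):
            out[j, k] = features[j * s:(j + 1) * s, k * s:(k + 1) * s].max()
    return out
```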
  • the activation function (such as a non-linear function) can be used to map the features of the pooling layer output, thereby introducing non-linear factors, so that the neural network can enhance the expression ability through non-linear combination.
  • The activation function of the excitation layer may include, but is not limited to, a ReLU (Rectified Linear Unit) function. Taking the ReLU function y = max(0, x) as an example, it sets every feature x that is less than or equal to 0 in the output of the pooling layer to 0, and keeps features greater than 0 unchanged.
  • Each node of the fully connected layer is connected to all nodes in the previous layer and is used to fully connect all features input to the fully connected layer to obtain a feature vector, which may include multiple features.
  • The fully connected layer may also be implemented as a 1 * 1 convolutional layer, forming a fully convolutional network.
  • one or more convolutional layers, one or more pooling layers, one or more excitation layers, and one or more fully connected layers can be combined to construct a neural network.
  • The input of the neural network is the original image in the first data format, and the output of the neural network is the intermediate image in the second data format.
  • That is, after the original image in the first data format is input to the neural network and processed by its structural units (such as convolutional layers, pooling layers, excitation layers, and fully connected layers), the intermediate image in the second data format can be output.
  • The neural network can be trained offline, which mainly trains the neural network parameters, such as convolutional layer parameters (for example, convolution kernel parameters), pooling layer parameters, and excitation layer parameters. There is no limitation on this; all parameters involved in the neural network are within the protection scope of this embodiment.
  • Through training, the neural network can fit the mapping relationship between input and output, that is, the mapping relationship between the original image in the first data format and the intermediate image in the second data format. In this way, when the input of the neural network is the original image in the first data format, after processing by the neural network, the output is the intermediate image in the second data format.
  • the following describes the process of offline training of neural networks in detail in combination with specific application scenarios.
  • two training methods are introduced.
  • the neural networks obtained by the two training methods may be referred to as a first neural network and a second neural network, respectively.
  • Training method 1: Training images in the first data format and training images in the second data format may be collected and stored in association to obtain an image data set. The image data set is output to the first neural network, and the first neural network uses the image data set to train each of its neural network parameters.
  • the process of obtaining the image data set can include:
  • Method 1: For the same frame of image, imaging device A collects training image A1 in the first data format while imaging device B synchronously acquires training image B1 in the second data format, and then training image A1 and training image B1 are stored in association. Similarly, imaging device A acquires training image A2, imaging device B acquires training image B2, and training image A2 and training image B2 are stored in association.
  • The final image data set may include the correspondence between each group of training images An and training images Bn.
  • Method 2: Imaging device A collects training image A1 in the first data format and processes it (for example, white balance correction, color interpolation, curve mapping, etc.; the processing method is not limited) to obtain training image A1′ in the second data format; training image A1 and training image A1′ are stored in association.
  • Similarly, imaging device A collects training image A2 in the first data format and processes it to obtain training image A2′ in the second data format, and stores training image A2 and training image A2′ in association.
  • The final image data set may include the correspondence between each group of training images An and training images An′.
  • an image data set can be obtained, and the image data set includes a correspondence between a training image in a first data format and a training image in a second data format.
  • a pre-designed first neural network can be trained, that is, each neural network parameter is trained, so that the first neural network fits the mapping relationship between the image in the first data format and the image in the second data format.
  • There is no limitation on the training method; for example, back propagation, resilient propagation, or conjugate gradient methods may be used.
  • the original image may be input into the trained first neural network to convert the original image in the first data format into the intermediate image in the second data format by the first neural network.
  • the first neural network can also be adjusted online. That is, the original image in the first data format and the target image in the second data format can be used to re-optimize the parameters of each neural network in the first neural network, and there is no limitation on the online adjustment process of the first neural network.
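  • A minimal sketch of such an offline training loop (PyTorch; the loss function, optimizer, and hyperparameters are illustrative assumptions, since the patent only names candidate training methods such as back propagation):

```python
import torch
import torch.nn as nn

def train_first_network(model, loader, epochs=10, lr=1e-4):
    """`loader` is assumed to yield (raw, target) pairs: a training image An in
    the first data format and its associated training image Bn (or An') in the
    second data format."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.L1Loss()  # the patent does not fix a loss; L1 is a common choice
    for _ in range(epochs):
        for raw, target in loader:
            opt.zero_grad()
            loss = loss_fn(model(raw), target)
            loss.backward()  # back propagation, one of the methods mentioned above
            opt.step()
    return model
```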
  • Training method 2: Training images in the first data format and training images in the second data format may be collected, and device parameters (that is, parameters of the imaging device that acquires the training images in the first data format) may be obtained. The training images in the first data format, the training images in the second data format, and the device parameters are associated to obtain an image data set, which is output to the second neural network; the second neural network uses the image data set to train each of its neural network parameters.
  • the process of obtaining the image data set may include, but is not limited to:
  • Method 1: Imaging device A acquires training image A1 in the first data format, and device parameter 1 of imaging device A is obtained. Imaging device B synchronously acquires training image B1 in the second data format. Training image A1, device parameter 1, and training image B1 are then stored in association.
  • Similarly, imaging device A acquires training image A2 and device parameter 2 is obtained, imaging device B acquires training image B2, and training image A2, device parameter 2, and training image B2 are stored in association.
  • The final image data set may include the correspondence between each group of training images An, device parameters n, and training images Bn.
  • the above device parameters may be fixed parameters that are not related to the environment (such as sensor sensitivity), or shooting parameters that are related to the environment (such as aperture size).
  • The device parameters may include, but are not limited to, one or any combination of the following: sensor sensitivity, dynamic range, signal-to-noise ratio, pixel size, target surface size, resolution, frame rate, number of pixels, spectral response, photoelectric response, array mode, lens aperture diameter, focal length, aperture size, hood model, filter aperture, viewing angle, etc. There are no restrictions on these parameters.
  • Method 2: Imaging device 1 collects training image set 1 (including a large number of training images) in the first data format, device parameter 1 of imaging device 1 is obtained, and each training image in training image set 1 is processed (for example, white balance correction, color interpolation, curve mapping, etc.; the processing is not limited) to obtain training image set 1′ in the second data format. Training image set 1 in the first data format, device parameter 1, and training image set 1′ in the second data format are then stored in association.
  • Similarly, imaging device 2 collects training image set 2 in the first data format, device parameter 2 of imaging device 2 is acquired, and each training image in training image set 2 is processed to obtain training image set 2′ in the second data format.
  • The final image data set may include the correspondence between each group of training image set K in the first data format, device parameter K, and training image set K′ in the second data format, as shown in FIG. 2C.
  • An image data set can be obtained through either of the above two methods; the image data set includes the correspondence between training images in the first data format, device parameters, and training images in the second data format.
  • Based on the image data set, a pre-designed second neural network can be trained, that is, its neural network parameters are trained, so that the second neural network fits the mapping relationship among the image in the first data format, the device parameters, and the image in the second data format.
  • There is no limitation on the training method; for example, back propagation, resilient propagation, or conjugate gradient methods may be used. It should be noted that the mapping relationship fitted by the second neural network includes the device parameters.
  • When the second neural network is used, the original image and the device parameters of the device that collected the original image can be obtained and input to the trained second neural network, so that the second neural network converts the data format of the original image from the first data format to the second data format according to the device parameters, thereby obtaining an intermediate image in the second data format.
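  • The patent does not specify how the device parameters enter the second neural network. One plausible mechanism (an assumption for illustration only) is to broadcast each scalar parameter to a constant plane and concatenate it with the image channels:

```python
import torch

def add_device_parameter_channels(raw: torch.Tensor, params: torch.Tensor) -> torch.Tensor:
    """raw:    (N, C, H, W) batch of original images
    params: (N, P) per-image device parameters, e.g. sensor sensitivity, aperture size
    Returns an (N, C + P, H, W) tensor to feed the second neural network."""
    n, _, h, w = raw.shape
    planes = params[:, :, None, None].expand(n, params.shape[1], h, w)
    return torch.cat([raw, planes], dim=1)
```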
  • Referring to FIG. 3, the image processing method may include the following steps.
  • Step 301: Obtain an original image in a first data format.
  • In a possible implementation, a light signal in a first wavelength range may be sampled to obtain the original image; or a light signal in a second wavelength range may be sampled to obtain the original image; or light signals in both the first wavelength range and the second wavelength range may be sampled to obtain the original image.
  • the first wavelength range and the second wavelength range are merely examples and are not restrictive.
  • the first wavelength range may be a visible light wavelength range from 380 nm to 780 nm
  • the second wavelength range may be an infrared wavelength range from 780 nm to 2500 nm.
  • Step 302: Use the trained neural network to convert the original image into an intermediate image in the second data format.
  • In a possible implementation, the original image may be input to the trained first neural network, and the first neural network converts the data format of the original image from the first data format to the second data format. Alternatively, the device parameters of the device that collected the original image may be obtained, and the original image and the device parameters may be input to the trained second neural network, so that the second neural network converts the original image in the first data format into the intermediate image in the second data format according to the device parameters.
  • Step 303: Use the cached image to perform noise reduction processing on the intermediate image in the second data format to obtain the target image in the second data format.
  • The cached image includes the target image corresponding to the previous frame of the original image adjacent to the original image.
  • After the target image is obtained, it may be stored as the cached image for the next frame's original image.
  • For example, for original image 1, target image 1 is obtained (the acquisition method of target image 1 is not limited; for example, intermediate image 1 of original image 1 can be directly determined as target image 1), and target image 1 is stored in the cache as the cached image.
  • For original image 2, the cached image (that is, target image 1) is used to perform noise reduction processing on intermediate image 2 to obtain target image 2, and the cached image in the cache is updated to target image 2; that is, target image 1 is no longer the cached image.
  • For original image 3, the cached image (that is, target image 2) is used to perform noise reduction processing on intermediate image 3 to obtain target image 3, and the cached image in the cache is updated to target image 3; that is, target image 2 is no longer the cached image.
  • In this way, the target image corresponding to the previous frame's original image continuously updates the cached image in the cache, and the cached image is used to perform noise reduction processing on the intermediate image corresponding to the current frame's original image, thereby obtaining the target image corresponding to the current frame's original image.
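  • A minimal sketch of this per-frame loop (Python; `neural_network` and `denoise` stand for the conversion and noise reduction steps described above, and the first frame is handled by taking the intermediate image directly, as in the example):

```python
def process_stream(frames, neural_network, denoise):
    """Convert each raw frame, denoise it against the cached previous target
    image, then make the new target image the cache for the next frame."""
    cached = None
    for raw in frames:
        intermediate = neural_network(raw)          # step 302
        if cached is None:
            target = intermediate                   # first frame: no cache yet
        else:
            target = denoise(intermediate, cached)  # step 303
        cached = target                             # cache update for the next frame
        yield target
```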
  • In a possible implementation, using the cached image to perform noise reduction processing on the intermediate image to obtain the target image in the second data format includes, but is not limited to: obtaining a motion estimation value of each pixel in the intermediate image according to the intermediate image and the cached image; and converting the intermediate image into the target image in the second data format according to the motion estimation values.
  • In summary, the cached image can be used to perform noise reduction processing on the intermediate image to obtain the target image in the second data format. Since the cached image is the target image corresponding to the previous frame of the original image adjacent to the original image, the two adjacent frames (that is, the cached image and the intermediate image) are closely related in time and space. The correlation between the two adjacent frames can be used to distinguish signal from noise in the intermediate image, and noise reduction processing is performed on the intermediate image to effectively remove the noise. Therefore, noise can be effectively removed even under poor lighting conditions, so that the noise in the target image can be effectively suppressed.
  • the above method flow may be executed by the image processing apparatus 100.
  • the image processing apparatus 100 may include three modules: an image processing module 101, a video processing module 102, and a training and learning module 103.
  • the image processing module 101 is configured to perform the foregoing steps 301 and 302, and the video processing module 102 is configured to perform the foregoing step 303.
  • the training and learning module 103 may be an offline module.
  • the neural network is trained and adjusted in advance using the image data set, and the trained and adjusted neural network is output to the image processing module 101.
  • That is, the offline training of the first neural network and the second neural network described in the above embodiments may be performed by the training and learning module 103.
  • The image processing module 101 obtains the trained neural network from the training and learning module 103, processes the input original image in the first data format based on the neural network, and outputs the intermediate image in the second data format.
  • The video processing module 102 receives a frame of intermediate image in the second data format output by the image processing module 101, performs noise reduction processing on it in combination with the information of the cached image stored in the cache to obtain the target image, and stores the processed target image in the cache as the cached image for the next frame.
  • the video processing module 102 is composed of a motion estimation unit 201 and a time domain processing unit 202.
  • Step S1 may be performed by the motion estimation unit 201, and step S2 may be performed by the time domain processing unit 202; that is, step 303 described above may be implemented by the motion estimation unit 201 and the time domain processing unit 202.
  • In step S1, a motion estimation value of each pixel in the intermediate image is obtained according to the intermediate image and the cached image.
  • the cache image includes a target image corresponding to a previous frame of the original image adjacent to the original image.
  • In step S2, the intermediate image is converted into the target image in the second data format according to the motion estimation values.
  • the above-mentioned video processing module 102 composed of the motion estimation unit 201 and the time-domain processing unit 202 is just an example.
  • the video processing module 102 may further include other units, such as a noise estimation unit, a spatial processing unit, and the like.
  • The noise estimation unit is configured to perform noise estimation on the image, and the spatial processing unit is configured to perform spatial-domain processing on the image; there is no limitation on these processing processes.
  • the video processing module 102 is composed of a motion estimation unit 201, a time-domain processing unit 202, a noise estimation unit 203, and a spatial-domain processing unit 204.
  • the aforementioned step 303 may be implemented by the aforementioned motion estimation unit 201, time domain processing unit 202, noise estimation unit 203, and spatial domain processing unit 204.
  • the video processing module 102 is composed of a motion estimation unit 201, a time domain processing unit 202, and a spatial domain processing unit 204.
  • the above-mentioned step 303 may be implemented by the above-mentioned motion estimation unit 201, time-domain processing unit 202, and space-domain processing unit 204.
  • step S1 is implemented through steps 4031 and 4032, which are specifically:
  • Step 4031: Obtain a correlation image according to the intermediate image and the cached image.
  • the correlation image is obtained from the pixel values of the corresponding positions of the intermediate image and the cache image according to a preset calculation method.
  • the preset calculation method may be a frame difference method, a convolution method, a cross-correlation method, etc., and there is no limitation on this.
  • Obtaining the correlation image according to the intermediate image and the cached image may include, but is not limited to, the following methods.
  • Method 1: Use the frame difference method to calculate the correlation image, that is, take the difference between the pixel values of corresponding pixels of the intermediate image and the cached image. For example, for each pixel in the correlation image, obtain the first pixel value corresponding to that pixel in the intermediate image and the second pixel value corresponding to that pixel in the cached image, and determine the difference between the first pixel value and the second pixel value as the pixel value of that pixel in the correlation image.
  • For example, the size of the intermediate image is 6 * 4, the size of the cached image is 6 * 4, and the size of the correlation image is 6 * 4, where 6 represents the number of pixels in the horizontal direction and 4 represents the number of pixels in the vertical direction.
  • 6 and 4 are just an example; in practical applications, the number of pixels in the horizontal direction is much larger than 6, and the number of pixels in the vertical direction is much larger than 4. In the following examples, the intermediate image, the cached image, and the correlation image are the same size.
  • Assume the pixels of the intermediate image are A11-A16, A21-A26, A31-A36, A41-A46; the pixels of the cached image are B11-B16, B21-B26, B31-B36, B41-B46; and the pixels of the correlation image are, in order, C11-C16, C21-C26, C31-C36, C41-C46.
  • Since the pixel value of each pixel of the intermediate image and the cached image is known, the pixel value of each pixel of the correlation image can be calculated as follows: for pixel C11 in the correlation image, obtain the first pixel value corresponding to C11 in the intermediate image (the pixel value of pixel A11) and the second pixel value corresponding to C11 in the cached image (the pixel value of pixel B11), and determine the difference between the first pixel value and the second pixel value as the pixel value of pixel C11 in the correlation image. The other pixels of the correlation image are processed in the same way as pixel C11, and the details are not repeated.
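  • A minimal sketch of the frame difference method (NumPy; casting to float is an implementation choice so that differences can be negative):

```python
import numpy as np

def correlation_image_frame_diff(intermediate: np.ndarray, cached: np.ndarray) -> np.ndarray:
    """Method 1: each correlation pixel is the difference of the corresponding
    pixel values of the intermediate image and the cached image (C11 = A11 - B11,
    and so on). The two images must be the same size."""
    assert intermediate.shape == cached.shape
    return intermediate.astype(np.float64) - cached.astype(np.float64)
```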
  • Method 2: Use the convolution method to calculate the correlation image, that is, convolve corresponding image blocks of the intermediate image and the cached image to obtain the correlation image. The size of the image block is preset, such as 3 * 3. For example, for each pixel in the correlation image, select a first image block corresponding to that pixel from the intermediate image and a second image block corresponding to that pixel from the cached image, and determine the convolution value of the first image block and the second image block (that is, the convolution value of two matrices) as the pixel value of that pixel in the correlation image.
  • Assume the pixels of the intermediate image are A11-A16, A21-A26, A31-A36, A41-A46; the pixels of the cached image are B11-B16, B21-B26, B31-B36, B41-B46; and the pixels of the correlation image are, in order, C11-C16, C21-C26, C31-C36, C41-C46.
  • Since the pixel value of each pixel of the intermediate image and the cached image is known, the pixel value of each pixel of the correlation image can be calculated as follows: for pixel C11 in the correlation image, select the first image block corresponding to C11 from the intermediate image, a 3 * 3 matrix whose first row includes pixels A11, A12, and A13, whose second row includes pixels A21, A22, and A23, and whose third row includes pixels A31, A32, and A33. Then select the second image block corresponding to C11 from the cached image, a 3 * 3 matrix whose first row includes pixels B11, B12, and B13, whose second row includes pixels B21, B22, and B23, and whose third row includes pixels B31, B32, and B33. The convolution value of the first image block and the second image block is determined as the pixel value of pixel C11 in the correlation image; the other pixels are processed in the same way.
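  • A minimal sketch of the convolution method (NumPy; edge handling is an assumption, since the patent does not say how blocks are formed at image borders, so the images are padded by edge replication here):

```python
import numpy as np

def correlation_image_conv(intermediate: np.ndarray, cached: np.ndarray, b: int = 3) -> np.ndarray:
    """Method 2: for each pixel, take the b x b blocks around it in both images
    and use their convolution value (the element-wise product sum of the two
    matrices) as the correlation pixel value."""
    r = b // 2
    a = np.pad(intermediate.astype(np.float64), r, mode="edge")
    c = np.pad(cached.astype(np.float64), r, mode="edge")
    h, w = intermediate.shape
    out = np.empty((h, w))
    for y in range(h):
        for x in range(w):
            out[y, x] = np.sum(a[y:y + b, x:x + b] * c[y:y + b, x:x + b])
    return out
```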
  • Step 4032: Obtain a motion estimation value of each pixel in the intermediate image according to the correlation image.
  • the value of the motion estimation value may be binarized or continuous.
  • In a possible implementation, at least one of smoothing processing, mapping processing, and threshold processing may be adopted to obtain the motion estimation value of each pixel in the intermediate image.
  • The smoothing processing may include an image filtering operation with a smoothing characteristic, such as an average filtering operation, a median filtering operation, or a Gaussian filtering operation; there is no limitation on this filtering operation.
  • The mapping processing may include a linear scaling operation and a translation operation.
  • The threshold processing may include determining the motion estimation value according to the magnitude relationship between the pixel value and a threshold, limiting the motion estimation value to a range divided by the threshold; there is no limitation on this.
  • the process of obtaining the motion estimation value of each pixel can include, but is not limited to:
  • Method 1: If the motion estimation value is binarized, and smoothing processing and threshold processing are used to obtain the motion estimation value of each pixel in the intermediate image, then an average filtering operation and threshold processing may be performed on the correlation image to obtain the motion estimation value of each pixel in the intermediate image.
  • Specifically, for each pixel, a third image block corresponding to the pixel can be selected from the correlation image, and the third image block is average-filtered to obtain a pixel value corresponding to the pixel. If that pixel value is greater than the threshold, the motion estimation value of the pixel is determined to be a first value (such as 1); if it is not greater than the threshold, the motion estimation value of the pixel is determined to be a second value (such as 0).
  • Assume the pixels of the intermediate image are A11-A16, A21-A26, A31-A36, A41-A46, and the pixels of the correlation image are C11-C16, C21-C26, C31-C36, C41-C46. Since the correlation image is known (see step 4031), the pixel value of each pixel of the correlation image is known, and the motion estimation value of each pixel in the intermediate image can be calculated as follows:
  • For pixel A11 in the intermediate image, a third image block corresponding to A11 can be selected from the correlation image. The third image block can be a 3 * 3 matrix (a matrix of another size may also be used; this is not limited) whose first row includes pixels C11, C12, and C13, whose second row includes pixels C21, C22, and C23, and whose third row includes pixels C31, C32, and C33.
  • Average filtering is performed on the 9 pixels of the third image block (that is, the average of the 9 pixel values is calculated), and the filtering result is the pixel value corresponding to pixel A11.
  • If this pixel value is greater than the threshold, the motion estimation value of pixel A11 may be 1; if it is not greater than the threshold, the motion estimation value of pixel A11 may be 0.
  • The other pixels are processed in the same way as pixel A11, which is not repeated here.
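  • A minimal sketch of this binarized method (Python with SciPy; the 3 * 3 averaging window and the use of scipy.ndimage are implementation choices):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def motion_binary(correlation: np.ndarray, threshold: float, size: int = 3) -> np.ndarray:
    """Method 1: average filtering (smoothing) followed by threshold processing.
    A smoothed value greater than the threshold gives motion estimate 1 (first
    value); otherwise 0 (second value)."""
    smoothed = uniform_filter(correlation.astype(np.float64), size=size)
    return np.where(smoothed > threshold, 1.0, 0.0)
```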
  • Method 2: If the motion estimation value is continuous, and smoothing processing and mapping processing are used to obtain the motion estimation value of each pixel in the intermediate image, then a median filtering operation and a linear scaling operation may be performed on the correlation image to obtain the motion estimation value of each pixel in the intermediate image.
  • Specifically, after median filtering and linear scaling are performed on the correlation image, a filter value located in a specific interval (such as the interval [0, 1]) is obtained for each position. For each pixel in the intermediate image, the filter value corresponding to that pixel is determined as its motion estimation value; that is, the motion estimation value is also a value located in the specific interval (such as [0, 1]).
  • Assume the pixels of the intermediate image are A11-A16, A21-A26, A31-A36, A41-A46, and the pixels of the correlation image are C11-C16, C21-C26, C31-C36, C41-C46. Since the pixel value of each pixel of the correlation image is known, the motion estimation value of each pixel in the intermediate image can be calculated as follows: for pixel A11 in the intermediate image, if the filter value of the corresponding pixel C11 is 0.039, the motion estimation value of pixel A11 is 0.039; for pixel A12, if the filter value of the corresponding pixel C12 is 0.196, the motion estimation value of pixel A12 is 0.196; and so on.
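  • A minimal sketch of this continuous method (Python with SciPy; min-max scaling into [0, 1] is an assumption, since the patent only requires a linear scaling into a specific interval):

```python
import numpy as np
from scipy.ndimage import median_filter

def motion_continuous(correlation: np.ndarray, size: int = 3) -> np.ndarray:
    """Method 2: median filtering (smoothing) followed by linear scaling
    (mapping) into the interval [0, 1]."""
    smoothed = median_filter(correlation.astype(np.float64), size=size)
    lo, hi = smoothed.min(), smoothed.max()
    if hi == lo:                      # flat correlation image: no spread to scale
        return np.zeros_like(smoothed)
    return (smoothed - lo) / (hi - lo)
```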
  • In an example, step S2 can be implemented through steps 4041 and 4042. Of course, the target image in the second data format can also be obtained by other methods; there is no limitation on this, as long as the target image can be obtained based on the motion estimation values.
  • Steps 4041 and 4042 are specifically:
  • Step 4041: Obtain a low-noise image according to the intermediate image and the cached image.
  • The low-noise image is an image with a lower noise level than the intermediate image, and there is no restriction on its acquisition method. For example, the low-noise image may be the average image of the intermediate image and the cached image.
  • Assume the pixels of the intermediate image are A11-A16, A21-A26, A31-A36, A41-A46; the pixels of the cached image are B11-B16, B21-B26, B31-B36, B41-B46; and the pixels of the low-noise image are, in order, D11-D16, D21-D26, D31-D36, D41-D46. Since the pixel value of each pixel of the intermediate image and the cached image is known, the pixel value of each pixel of the low-noise image can be calculated as follows: for pixel D11 in the low-noise image, obtain the pixel value corresponding to D11 in the intermediate image (the pixel value of pixel A11) and the pixel value corresponding to D11 in the cached image (the pixel value of pixel B11), and determine the average of these two pixel values as the pixel value of pixel D11 in the low-noise image. The other pixels are processed in the same way as pixel D11, and the details are not repeated.
  • Step 4042: Obtain the target image according to the intermediate image, the low-noise image, and the motion estimation values.
  • In a possible implementation, acquiring the target image according to the intermediate image, the low-noise image, and the motion estimation values may include, but is not limited to, determining the pixel value of a first pixel in the target image, where the first pixel is any pixel of the target image. The pixel value of the first pixel is determined as follows: determine a first weight of the first pixel in the intermediate image and a second weight of the first pixel in the low-noise image according to the motion estimation value of the first pixel; then determine the pixel value of the first pixel in the target image according to the pixel value corresponding to the first pixel in the intermediate image and the first weight, and the pixel value corresponding to the first pixel in the low-noise image and the second weight. The target image is determined by the pixel values of all its pixels.
  • For example, if the first weight of the first pixel in the intermediate image is A, the second weight of the first pixel in the low-noise image may be (1 - A). If the pixel value corresponding to the first pixel in the intermediate image is N and the pixel value corresponding to the first pixel in the low-noise image is M, then the pixel value of the first pixel in the target image is N * A + M * (1 - A).
  • the above method is only an example, and the pixel value of the first pixel point in the target image may also be obtained by other methods, which is not limited.
  • Assume the pixels of the intermediate image are A11-A16, A21-A26, A31-A36, A41-A46; the pixels of the low-noise image are D11-D16, D21-D26, D31-D36, D41-D46; and the pixels of the target image are E11-E16, E21-E26, E31-E36, E41-E46. Since the pixel value of each pixel of the intermediate image and the low-noise image is known, the pixel value of each pixel of the target image can be calculated as follows: for pixel E11, take the first weight as the motion estimation value of pixel E11 and the second weight as 1 minus the motion estimation value of pixel E11; multiply the pixel value of A11 by the first weight, multiply the pixel value of D11 by the second weight, and add the two products. The result of this calculation is the pixel value of pixel E11. The other pixels are processed in the same way as pixel E11, which is not repeated.
  • When the motion estimation value is larger, the first weight is larger and the second weight is smaller; that is, the pixel value of the intermediate image accounts for a larger proportion and the pixel value of the low-noise image accounts for a smaller proportion.
  • For example, when the motion estimation value of pixel E11 is relatively large (such as greater than 0.5), the first weight is greater than the second weight. A large motion estimation value indicates that the pixel has changed significantly compared with the previous frame; that is, the pixel value of the current frame is more reliable.
  • Because the first weight is greater than the second weight, the pixel value of the current frame's intermediate image accounts for a larger proportion, which matches the larger motion estimate. In this way, the pixel value of pixel E11 in the target image is more accurate.
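  • Steps 4041 and 4042 combine into a few lines. A minimal sketch (NumPy; the averaging in step 4041 and the weighting N * A + M * (1 - A) follow the examples above):

```python
import numpy as np

def blend_target(intermediate: np.ndarray, cached: np.ndarray, motion: np.ndarray) -> np.ndarray:
    """`motion` holds the per-pixel motion estimation values A in [0, 1]."""
    # Step 4041: the low-noise image is the average of the intermediate and cached images.
    low_noise = (intermediate.astype(np.float64) + cached.astype(np.float64)) / 2.0
    # Step 4042: per-pixel weighted sum N * A + M * (1 - A).
    return motion * intermediate + (1.0 - motion) * low_noise
```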
  • the pixel value of the pixel point may include, but is not limited to, the gray value, brightness value, and chrominance value of the pixel point, and the type of the pixel value is not limited, and may be related to actual image processing.
  • In summary, the cached image can be used to perform noise reduction processing on the intermediate image to obtain the target image in the second data format. Since the cached image is the target image corresponding to the previous frame of the original image adjacent to the original image, the two adjacent frames (that is, the cached image and the intermediate image) are closely related in time and space. The correlation between the two adjacent frames can be used to distinguish signal from noise in the images, and noise reduction processing is performed on the intermediate image. Therefore, noise can be effectively removed even under poor lighting conditions, so that the noise in the target image can be effectively suppressed.
  • an embodiment of the present disclosure also proposes an image processing device. As shown in FIG. 5, it is a structural diagram of the image processing device.
  • the image processing device includes:
  • An image processing module 501 configured to obtain an original image in a first data format, and use a neural network to convert the original image into an intermediate image in a second data format;
  • a video processing module 502, configured to perform noise reduction processing on the intermediate image using a cached image to obtain a target image in the second data format, where the cached image includes the target image corresponding to the previous frame of the original image adjacent to the original image.
  • When the image processing module 501 uses the neural network to convert the original image into the intermediate image in the second data format, the image processing module 501 is specifically configured to: input the original image into a first neural network to convert the original image in the first data format into the intermediate image in the second data format by the first neural network; or obtain device parameters of the device that collected the original image, and input the original image and the device parameters into a second neural network to convert the original image in the first data format into the intermediate image in the second data format according to the device parameters by the second neural network.
  • When the video processing module 502 obtains the motion estimation value of each pixel in the intermediate image according to the intermediate image and the cached image, the video processing module 502 is specifically configured to: obtain a correlation image according to the intermediate image and the cached image, and acquire the motion estimation value of each pixel in the intermediate image according to the correlation image.
  • the video processing module 502 obtains a motion estimation value of each pixel in the intermediate image by using at least one of a smoothing process, a mapping process, and a threshold process.
  • When the video processing module 502 converts the intermediate image into the target image in the second data format according to the motion estimation values, the video processing module 502 is specifically configured to: obtain a low-noise image according to the intermediate image and the cached image, and obtain the target image according to the intermediate image, the low-noise image, and the motion estimation values.
  • When obtaining the target image according to the intermediate image, the low-noise image, and the motion estimation values, the video processing module 502 is specifically configured to:
  • determine the pixel value of a first pixel in the target image, where the first pixel is any pixel of the target image, and the pixel value of the first pixel is determined as follows: determine a first weight of the first pixel in the intermediate image and a second weight of the first pixel in the low-noise image according to the motion estimation value corresponding to the first pixel; determine the pixel value of the first pixel in the target image according to the pixel value corresponding to the first pixel in the intermediate image and the first weight, and the pixel value corresponding to the first pixel in the low-noise image and the second weight; and
  • determine the target image according to the pixel values of all pixels of the target image.
  • The schematic diagram of the hardware architecture of the image processing device provided by the embodiment of the present disclosure is shown in FIG. 6, and includes a processor 601 and a machine-readable storage medium 602, where the machine-readable storage medium 602 stores machine-executable instructions that can be executed by the processor 601, and the processor 601 is configured to execute the machine-executable instructions to implement the image processing method disclosed in the above examples of the present disclosure.
  • the machine-readable storage medium may be any electronic, magnetic, optical, or other physical storage device, and may contain or store information, such as executable instructions, data, and so on.
  • For example, the machine-readable storage medium may be: RAM (Random Access Memory), volatile memory, non-volatile memory, flash memory, a storage drive (such as a hard drive), a solid-state drive, any type of storage disc (such as an optical disc or DVD), or a similar storage medium, or a combination thereof.
  • the system, device, module, or unit described in the foregoing embodiments may be specifically implemented by a computer chip or entity, or a product with a certain function.
  • A typical implementation device is a computer, and the specific form of the computer may be a personal computer, a laptop computer, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email sending and receiving device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
  • The embodiments of the present disclosure may be provided as a method, a system, or a computer program product. Therefore, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the embodiments of the present disclosure may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device, which implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing device, so that a series of operation steps are performed on the computer or other programmable device to produce a computer-implemented process; the instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The present invention relates to an image processing method, apparatus, and device, and a machine-readable storage medium. The method comprises: acquiring an original image in a first data format; using a neural network to convert the original image into an intermediate image in a second data format; and using a cached image to perform noise reduction on the intermediate image to obtain a target image in the second data format, the cached image comprising a target image corresponding to the original image of the previous frame adjacent to the original image.
PCT/CN2019/089272 2018-05-31 2019-05-30 Image processing method, apparatus and device, and machine-readable storage medium WO2019228456A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810556530.2A CN110555808B (zh) 2018-05-31 2018-05-31 Image processing method, apparatus, device, and machine-readable storage medium
CN201810556530.2 2018-05-31

Publications (1)

Publication Number Publication Date
WO2019228456A1 (fr)

Family

ID=68698716

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/089272 WO2019228456A1 (fr) 2018-05-31 2019-05-30 Image processing method, apparatus and device, and machine-readable storage medium

Country Status (2)

Country Link
CN (1) CN110555808B (fr)
WO (1) WO2019228456A1 (fr)


Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111402153B (zh) * 2020-03-10 2023-06-13 上海富瀚微电子股份有限公司 Image processing method and system
CN111476730B (zh) * 2020-03-31 2024-02-13 珠海格力电器股份有限公司 Image restoration processing method and device
CN111583142B (zh) * 2020-04-30 2023-11-28 深圳市商汤智能传感科技有限公司 Image noise reduction method and device, electronic device, and storage medium
CN113724142B (zh) * 2020-05-26 2023-08-25 杭州海康威视数字技术股份有限公司 Image restoration system and method
TWI828160B (zh) * 2022-05-24 2024-01-01 國立陽明交通大學 Method for analyzing cilia images

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106803090A (zh) * 2016-12-05 2017-06-06 中国银联股份有限公司 Image recognition method and device
CN107220951A (zh) * 2017-05-31 2017-09-29 广东欧珀移动通信有限公司 Face image noise reduction method and device, storage medium, and computer device
CN107424184A (zh) * 2017-04-27 2017-12-01 厦门美图之家科技有限公司 Image processing method and device based on a convolutional neural network, and mobile terminal

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE602006021728D1 (de) * 2005-03-31 2011-06-16 Nippon Kogaku Kk Image processing method
CN102769722B (zh) * 2012-07-20 2015-04-29 上海富瀚微电子股份有限公司 Video noise reduction device and method combining the time domain and the space domain
CN105163102B (zh) * 2015-06-30 2017-04-05 北京空间机电研究所 FPGA-based real-time automatic white balance system and method for images
CN107767343B (zh) * 2017-11-09 2021-08-31 京东方科技集团股份有限公司 Image processing method, processing apparatus and processing device
CN107730474B (zh) * 2017-11-09 2022-02-22 京东方科技集团股份有限公司 Image processing method, processing apparatus and processing device
CN107767408B (zh) * 2017-11-09 2021-03-12 京东方科技集团股份有限公司 Image processing method, processing apparatus and processing device


Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112001912A (zh) * 2020-08-27 2020-11-27 北京百度网讯科技有限公司 Target detection method and device, computer system, and readable storage medium
CN112001912B (zh) * 2020-08-27 2024-04-05 北京百度网讯科技有限公司 Target detection method and device, computer system, and readable storage medium
CN115661156A (zh) * 2022-12-28 2023-01-31 成都数联云算科技有限公司 Image generation method and device, storage medium, device, and computer program product
CN116385307A (zh) * 2023-04-11 2023-07-04 任成付 Picture information filtering effect identification system
CN116385307B (zh) * 2023-04-11 2024-05-03 衡阳市欣嘉传媒有限公司 Picture information filtering effect identification system

Also Published As

Publication number Publication date
CN110555808A (zh) 2019-12-10
CN110555808B (zh) 2022-05-31

Similar Documents

Publication Publication Date Title
WO2019228456A1 (fr) Image processing method, apparatus and device, and machine-readable storage medium
CN115442515B (zh) Image processing method and device
US9615039B2 (en) Systems and methods for reducing noise in video streams
US9262807B2 (en) Method and system for correcting a distorted input image
WO2021047345A1 (fr) Image noise reduction method and apparatus, storage medium, and electronic device
GB2592835A (en) Configurable convolution engine for interleaved channel data
US8315474B2 (en) Image processing device and method, and image sensing apparatus
WO2021082883A1 (fr) Subject detection method and apparatus, electronic device, and computer-readable storage medium
CN108876753A (zh) Optional enhancement of synthetic long-exposure images using guide images
US9282253B2 (en) System and method for multiple-frame based super resolution interpolation for digital cameras
CN107071234A (zh) Lens shading correction method and device
US8861846B2 (en) Image processing apparatus, image processing method, and program for performing superimposition on raw image or full color image
CN106134180B (zh) Image processing apparatus and method, recording medium storing image processing program temporarily readable by a computer, and imaging apparatus
WO2021035524A1 (fr) Image processing method and apparatus, electronic device, and computer-readable storage medium
CN113784014B (zh) Image processing method, apparatus, and device
Zhao et al. End-to-end denoising of dark burst images using recurrent fully convolutional networks
Tan et al. A real-time video denoising algorithm with FPGA implementation for Poisson–Gaussian noise
CN110782480A (zh) Infrared pedestrian tracking method based on online template prediction
CN114390188B (zh) Image processing method and electronic device
CN113744355B (zh) Pulse signal processing method, apparatus, and device
CN110930440A (zh) Image alignment method and apparatus, storage medium, and electronic device
US11270412B2 (en) Image signal processor, method, and system for environmental mapping
CN112241670B (zh) Image processing method and apparatus
CN104776919B (zh) FPGA-based stripe non-uniformity correction system and method for infrared focal plane arrays
JP4664259B2 (ja) Image correction apparatus and image correction method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19811302

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19811302

Country of ref document: EP

Kind code of ref document: A1
