WO2020187029A1 - Image processing method and device, neural network training method, and storage medium - Google Patents


Info

Publication number
WO2020187029A1
Authority
WO
WIPO (PCT)
Prior art keywords
sampling
output
processing
sampling process
level
Application number
PCT/CN2020/077763
Other languages
French (fr)
Chinese (zh)
Inventor
Hanwen LIU (刘瀚文)
Lijie ZHANG (张丽杰)
Dan ZHU (朱丹)
Yanbo NA (那彦波)
Original Assignee
BOE Technology Group Co., Ltd. (京东方科技集团股份有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by BOE Technology Group Co., Ltd. (京东方科技集团股份有限公司)
Publication of WO2020187029A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00: Image enhancement or restoration
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G06T5/60
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/20: Special algorithmic details
    • G06T2207/20016: Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; Pyramid transform
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/20: Special algorithmic details
    • G06T2207/20081: Training; Learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/20: Special algorithmic details
    • G06T2207/20084: Artificial neural networks [ANN]

Definitions

  • the embodiments of the present disclosure relate to an image processing method, an image processing device, a training method of a neural network, and a storage medium.
  • CNN: Convolutional Neural Network
  • At least one embodiment of the present disclosure provides an image processing method, including: receiving a first feature image; and performing multi-scale cyclic sampling processing on the first feature image at least once;
  • the multi-scale cyclic sampling processing includes nested first-level sampling processing and second-level sampling processing
  • the first-level sampling processing includes first down-sampling processing, first up-sampling processing, and first residual link addition processing
  • the first down-sampling processing performs down-sampling processing based on the input of the first-level sampling processing to obtain a first down-sampled output
  • the first up-sampling processing performs up-sampling processing based on the first down-sampled output to obtain a first up-sampled output
  • the first residual link addition processing performs a first residual link addition on the input of the first-level sampling processing and the first up-sampled output, and the result of the first residual link addition is used as the output of the first-level sampling processing
  • the second-level sampling processing is nested between the first down-sampling processing and the first up-sampling processing; it receives the first down-sampled output as the input of the second-level sampling processing, and its output is provided as the input of the first up-sampling processing, so that the first up-sampling processing performs up-sampling based on the first down-sampled output; the second-level sampling processing includes second down-sampling processing, second up-sampling processing, and second residual link addition processing
  • the size of the output of the first up-sampling processing is the same as the size of the input of the first down-sampling processing; the size of the output of the second up-sampling processing is the same as the size of the input of the second down-sampling processing.
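To make the nested data flow concrete, here is a minimal numerical sketch (an illustration only, not the patent's learned networks): average pooling stands in for the learned down-sampling convolutions and nearest-neighbor repetition stands in for the learned up-sampling layers, while the residual link additions and the nesting of the second level between the first down-sampling and first up-sampling follow the structure described above.

```python
import numpy as np

def downsample(x, factor=2):
    # Average pooling as a stand-in for a learned down-sampling convolution.
    h, w = x.shape
    return x.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def upsample(x, factor=2):
    # Nearest-neighbor repetition as a stand-in for a learned up-sampling layer.
    return np.repeat(np.repeat(x, factor, axis=0), factor, axis=1)

def second_level(x):
    down2 = downsample(x)   # second down-sampling processing
    up2 = upsample(down2)   # second up-sampling processing
    return x + up2          # second residual link addition

def first_level(x):
    down1 = downsample(x)         # first down-sampling processing
    nested = second_level(down1)  # second level nested in between
    up1 = upsample(nested)        # first up-sampling processing
    return x + up1                # first residual link addition

feature = np.random.rand(8, 8)
out = first_level(feature)
assert out.shape == feature.shape  # output size equals input size
```

Because each up-sampling output has the same size as the matching down-sampling input, both residual additions are well defined.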
  • the multi-scale cyclic sampling processing further includes third-level sampling processing. The third-level sampling processing is nested between the second down-sampling processing and the second up-sampling processing; it receives the second down-sampled output as the input of the third-level sampling processing, and its output is provided as the input of the second up-sampling processing, so that the second up-sampling processing performs up-sampling based on the second down-sampled output. The third-level sampling processing includes third down-sampling processing, third up-sampling processing, and third residual link addition processing, where the third down-sampling processing performs down-sampling based on the input of the third-level sampling processing to obtain a third down-sampled output, the third up-sampling processing performs up-sampling based on the third down-sampled output to obtain a third up-sampled output, and the third residual link addition processing performs a third residual link addition on the input of the third-level sampling processing and the third up-sampled output, with the result of the third residual link addition used as the output of the third-level sampling processing.
  • the multi-scale cyclic sampling processing includes the second-level sampling processing executed sequentially multiple times; the first second-level sampling processing receives the first down-sampled output as its input, each second-level sampling processing except the first receives the output of the previous second-level sampling processing as its input, and the output of the last second-level sampling processing is used as the input of the first up-sampling processing.
  • the at least one multi-scale cyclic sampling processing includes the multi-scale cyclic sampling processing performed sequentially multiple times; each time, the input of the multi-scale cyclic sampling processing is used as the input of the first-level sampling processing in that pass, and the output of the first-level sampling processing in that pass is used as the output of that multi-scale cyclic sampling processing. The first multi-scale cyclic sampling processing receives the first feature image as its input, each multi-scale cyclic sampling processing except the first receives the output of the previous multi-scale cyclic sampling processing as its input, and the output of the last multi-scale cyclic sampling processing is used as the output of the at least one multi-scale cyclic sampling processing.
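The sequential repetition can be sketched in the same spirit. The pass function below is a toy placeholder (average-pool down, repeat up, residual add), not the patent's network; the point is the chaining rule: the first pass receives the first feature image, every later pass receives the previous pass's output, and the last pass's output is the overall result.

```python
import numpy as np

def one_cyclic_pass(x):
    # Toy placeholder for one multi-scale cyclic sampling pass.
    h, w = x.shape
    down = x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))  # down-sample
    up = np.repeat(np.repeat(down, 2, axis=0), 2, axis=1)     # up-sample
    return x + up                                             # residual link

def repeated_sampling(x, times=3):
    for _ in range(times):
        x = one_cyclic_pass(x)  # each pass consumes the previous pass's output
    return x

feature = np.random.rand(8, 8)
result = repeated_sampling(feature, times=3)
assert result.shape == feature.shape
```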
  • the multi-scale cyclic sampling processing further includes: after performing the first down-sampling processing, the first up-sampling processing, the second down-sampling processing, and the second up-sampling processing, performing instance standardization processing or layer standardization processing on the first down-sampled output, the first up-sampled output, the second down-sampled output, and the second up-sampled output, respectively.
  • the image processing method provided by some embodiments of the present disclosure further includes: using a first convolutional neural network to perform the multi-scale cyclic sampling processing; wherein the first convolutional neural network includes: a first meta-network for performing the first-level sampling processing; and a second meta-network for performing the second-level sampling processing.
  • the first meta-network includes: a first sub-network for performing the first down-sampling processing; and a second sub-network for performing the first up-sampling processing; the second meta-network includes: a third sub-network for performing the second down-sampling processing; and a fourth sub-network for performing the second up-sampling processing.
  • each of the first sub-network, the second sub-network, the third sub-network, and the fourth sub-network includes one of a convolutional layer, a residual network, and a dense network.
  • each of the first sub-network, the second sub-network, the third sub-network, and the fourth sub-network includes an instance standardization layer or a layer standardization layer; the instance standardization layer is used to perform instance standardization processing, and the layer standardization layer is used to perform layer standardization processing.
  • the image processing method provided by some embodiments of the present disclosure further includes: acquiring an input image; using an analysis network to convert the input image into the first feature image; and using a synthesis network to convert the output of the at least one multi-scale cyclic sampling processing into an output image.
  • At least one embodiment of the present disclosure further provides a neural network training method, wherein the neural network includes: an analysis network, a first sub-neural network, and a synthesis network; the analysis network processes the input image to obtain the first feature image, the first sub-neural network performs multi-scale cyclic sampling processing on the first feature image at least once to obtain a second feature image, and the synthesis network processes the second feature image to obtain an output image;
  • the training method includes: obtaining a training input image; using the analysis network to process the training input image to provide a first training feature image; using the first sub-neural network to perform the at least one multi-scale cyclic sampling processing on the first training feature image to obtain a second training feature image; using the synthesis network to process the second training feature image to obtain a training output image; calculating the loss value of the neural network through a loss function based on the training output image; and correcting the parameters of the neural network according to the loss value;
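The training steps listed above (analysis network, then first sub-neural network, then synthesis network, then loss computation and parameter correction) can be sketched with a deliberately tiny stand-in model, where each of the three stages is reduced to a single learnable scale factor and the loss is a plain mean squared error; all names and the toy loss are illustrative assumptions, not the patent's actual networks or loss function.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((4, 4))   # training input image
target = x               # toy training goal: reproduce the input

# Toy stand-ins for the three stages, each a single learnable scale factor.
w_analysis = w_sampling = w_synthesis = 0.5
lr = 0.1
losses = []

for step in range(200):
    f1 = w_analysis * x      # analysis network -> first training feature image
    f2 = w_sampling * f1     # first sub-neural network -> second training feature image
    out = w_synthesis * f2   # synthesis network -> training output image
    loss = np.mean((out - target) ** 2)   # loss value of the network
    losses.append(loss)
    # Correct the parameters according to the loss value (gradient descent).
    g_out = 2.0 * (out - target) / out.size
    g_syn = np.sum(g_out * f2)
    g_samp = np.sum(g_out * w_synthesis * f1)
    g_ana = np.sum(g_out * w_synthesis * w_sampling * x)
    w_synthesis -= lr * g_syn
    w_sampling -= lr * g_samp
    w_analysis -= lr * g_ana
```

The loop mirrors the claimed sequence of steps; with real networks, the hand-written gradients would be replaced by backpropagation through each sub-network.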
  • the multi-scale cyclic sampling processing includes nested first-level sampling processing and second-level sampling processing
  • the first-level sampling processing includes first down-sampling processing, first up-sampling processing, and first residual link addition processing
  • the first down-sampling processing performs down-sampling processing based on the input of the first-level sampling processing to obtain a first down-sampled output
  • the first up-sampling processing performs up-sampling processing based on the first down-sampled output to obtain a first up-sampled output
  • the first residual link addition processing performs a first residual link addition on the input of the first-level sampling processing and the first up-sampled output, and the result of the first residual link addition is used as the output of the first-level sampling processing
  • the second-level sampling processing is nested between the first down-sampling processing and the first up-sampling processing; it receives the first down-sampled output as the input of the second-level sampling processing, and its output is provided as the input of the first up-sampling processing; the second-level sampling processing includes second down-sampling processing, second up-sampling processing, and second residual link addition processing
  • the size of the output of the first up-sampling processing is the same as the size of the input of the first down-sampling processing; the size of the output of the second up-sampling processing is the same as the size of the input of the second down-sampling processing.
  • the multi-scale cyclic sampling processing further includes third-level sampling processing, and the third-level sampling processing is nested between the second down-sampling processing and the second up-sampling processing;
  • the second down-sampling output is received as the input of the third-level sampling process, and the output of the third-level sampling process is provided as the input of the second up-sampling process, so that the second up-sampling process Up-sampling processing is performed based on the second down-sampling output;
  • the third-level sampling processing includes third down-sampling processing, third up-sampling processing, and third residual link addition processing, where the third down-sampling processing performs down-sampling based on the input of the third-level sampling processing to obtain a third down-sampled output, the third up-sampling processing performs up-sampling based on the third down-sampled output to obtain a third up-sampled output, and the third residual link addition processing performs a third residual link addition on the input of the third-level sampling processing and the third up-sampled output, and then uses the result of the third residual link addition as the output of the third-level sampling processing.
  • the multi-scale cyclic sampling processing includes the second-level sampling processing executed sequentially multiple times; the first second-level sampling processing receives the first down-sampled output as its input, each second-level sampling processing except the first receives the output of the previous second-level sampling processing as its input, and the output of the last second-level sampling processing is used as the input of the first up-sampling processing.
  • the at least one multi-scale cyclic sampling processing includes the multi-scale cyclic sampling processing performed sequentially multiple times; each time, the input of the multi-scale cyclic sampling processing is used as the input of the first-level sampling processing in that pass, and the output of the first-level sampling processing in that pass is used as the output of that multi-scale cyclic sampling processing. The first multi-scale cyclic sampling processing receives the first training feature image as its input, each multi-scale cyclic sampling processing except the first receives the output of the previous multi-scale cyclic sampling processing as its input, and the output of the last multi-scale cyclic sampling processing is used as the output of the at least one multi-scale cyclic sampling processing.
  • the multi-scale cyclic sampling processing further includes: after performing the first down-sampling processing, the first up-sampling processing, the second down-sampling processing, and the second up-sampling processing, performing instance standardization processing or layer standardization processing on the first down-sampled output, the first up-sampled output, the second down-sampled output, and the second up-sampled output, respectively.
  • the first sub-neural network includes: a first meta-network for performing the first-level sampling processing; and a second meta-network for performing the second-level sampling processing.
  • the first meta-network includes: a first sub-network for performing the first down-sampling processing; and a second sub-network for performing the first up-sampling processing; the second meta-network includes: a third sub-network for performing the second down-sampling processing; and a fourth sub-network for performing the second up-sampling processing.
  • each of the first sub-network, the second sub-network, the third sub-network, and the fourth sub-network includes one of a convolutional layer, a residual network, and a dense network.
  • each of the first sub-network, the second sub-network, the third sub-network, and the fourth sub-network includes an instance standardization layer or a layer standardization layer
  • the instance standardization layer is used to perform instance standardization processing on the first down-sampled output, the first up-sampled output, the second down-sampled output, and the second up-sampled output, respectively
  • the layer standardization layer is used to perform layer standardization processing on the first down-sampled output, the first up-sampled output, the second down-sampled output, and the second up-sampled output, respectively.
  • At least one embodiment of the present disclosure further provides an image processing device, including: a memory for non-transitory storage of computer-readable instructions; and a processor for running the computer-readable instructions; when the computer-readable instructions are run by the processor, the image processing method provided by any embodiment of the present disclosure or the neural network training method provided by any embodiment of the present disclosure is executed.
  • At least one embodiment of the present disclosure further provides a storage medium that non-transitorily stores computer-readable instructions; when the computer-readable instructions are executed by a computer, the instructions of the image processing method provided by any embodiment of the present disclosure or the instructions of the neural network training method provided by any embodiment of the present disclosure can be executed.
  • Figure 1 is a schematic diagram of a convolutional neural network
  • Figure 2A is a schematic diagram of a convolutional neural network
  • Figure 2B is a schematic diagram of the working process of a convolutional neural network
  • FIG. 3 is a flowchart of an image processing method provided by an embodiment of the disclosure.
  • FIG. 4A is a schematic flow chart of a multi-scale cyclic sampling process corresponding to the image processing method shown in FIG. 3 according to an embodiment of the present disclosure
  • FIG. 4B is a schematic flowchart diagram corresponding to the multi-scale cyclic sampling processing in the image processing method shown in FIG. 3 according to another embodiment of the present disclosure
  • FIG. 4C is a schematic flowchart diagram corresponding to the multi-scale cyclic sampling processing in the image processing method shown in FIG. 3 according to still another embodiment of the present disclosure
  • FIG. 4D is a schematic flowchart diagram corresponding to the multi-scale cyclic sampling processing in the image processing method shown in FIG. 3 according to another embodiment of the present disclosure
  • FIG. 5 is a flowchart of an image processing method provided by another embodiment of the present disclosure.
  • FIG. 6A is a schematic diagram of an input image
  • FIG. 6B is a schematic diagram of an output image obtained by processing the input image shown in FIG. 6A according to an image processing method provided by an embodiment of the present disclosure
  • FIG. 7A is a schematic structural diagram of a neural network provided by an embodiment of the disclosure.
  • FIG. 7B is a flowchart of a neural network training method provided by an embodiment of the disclosure.
  • FIG. 7C is a schematic structural block diagram of training the neural network shown in FIG. 7A corresponding to the training method shown in FIG. 7B according to an embodiment of the present disclosure
  • FIG. 8 is a schematic block diagram of an image processing apparatus provided by an embodiment of the present disclosure.
  • FIG. 9 is a schematic diagram of a storage medium provided by an embodiment of the disclosure.
  • Image enhancement is one of the research hotspots in the field of image processing. Due to the limitations of various physical factors in the image acquisition process (for example, the size of the image sensor of the mobile phone camera is too small and other software and hardware limitations) and the interference of environmental noise, the image quality will be greatly reduced.
  • the purpose of image enhancement is to improve the grayscale histogram of the image and the contrast of the image through image enhancement technology, thereby highlighting the detailed information of the image and improving the visual effect of the image.
  • the use of deep neural networks for image enhancement is a technology emerging with the development of deep learning technology.
  • the quality of the output images can be close to that of photos taken by a digital single-lens reflex (DSLR) camera.
  • For example, the peak signal-to-noise ratio (PSNR) index is commonly used to characterize image quality; the higher the PSNR value, the closer the image is to a real photo taken by a digital single-lens reflex camera.
  • Andrey Ignatov et al. proposed a convolutional neural network method to achieve image enhancement; please refer to the literature: Andrey Ignatov, Nikolay Kobyshev, Kenneth Vanhoey, Radu Timofte, Luc Van Gool, DSLR-Quality Photos on Mobile Devices with Deep Convolutional Networks. arXiv:1704.02470v2 [cs.CV], September 5, 2017. This document is hereby incorporated by reference in its entirety as a part of this application.
  • This method mainly uses convolutional layers, batch normalization layers and residual connections to construct a single-scale convolutional neural network.
  • the network can be used to process input low-quality images (for example, images with low contrast, underexposure or overexposure, or images that are too dark or too bright overall) into higher-quality images.
  • Using color loss, texture loss, and content loss as the loss function in training can achieve better processing results.
  • At least one embodiment of the present disclosure provides an image processing method, an image processing device, a neural network training method, and a storage medium.
  • This image processing method proposes a multi-scale cyclic sampling method based on a convolutional neural network. By repeatedly sampling at multiple scales to obtain higher image fidelity, the quality of the output image can be greatly improved, making the method suitable for offline applications with high quality requirements, such as batch image processing.
  • FIG. 1 shows a schematic diagram of a convolutional neural network.
  • the convolutional neural network can be used for image processing, which uses images as input and output, and replaces scalar weights with convolution kernels.
  • FIG. 1 only shows a convolutional neural network with a 3-layer structure, which is not limited in the embodiment of the present disclosure.
  • the convolutional neural network includes an input layer 101, a hidden layer 102, and an output layer 103.
  • the input layer 101 has 4 inputs
  • the hidden layer 102 has 3 outputs
  • the output layer 103 has 2 outputs.
  • the convolutional neural network finally outputs 2 images.
  • the 4 inputs of the input layer 101 may be 4 images, or 4 feature images of 1 image.
  • the three outputs of the hidden layer 102 may be feature images of the image input through the input layer 101.
  • the convolutional layer has weights w_ij^k and biases b_i^k. The weights w_ij^k represent the convolution kernels, and the biases b_i^k are scalars superimposed on the outputs of the convolutional layer, where k is the label of the input layer 101, and i and j are the labels of the unit of the input layer 101 and the unit of the hidden layer 102, respectively.
  • the first convolutional layer 201 includes a first set of convolution kernels (the kernels w_ij^1 in FIG. 1) and a first set of biases (the biases b_i^1 in FIG. 1).
  • the second convolutional layer 202 includes a second set of convolution kernels (the kernels w_ij^2 in FIG. 1) and a second set of biases (the biases b_i^2 in FIG. 1).
  • each convolutional layer includes tens or hundreds of convolution kernels. If the convolutional neural network is a deep convolutional neural network, it may include at least five convolutional layers.
  • the convolutional neural network further includes a first activation layer 203 and a second activation layer 204.
  • the first activation layer 203 is located behind the first convolutional layer 201
  • the second activation layer 204 is located behind the second convolutional layer 202.
  • the activation layer (for example, the first activation layer 203 and the second activation layer 204) includes activation functions, which are used to introduce nonlinear factors into the convolutional neural network so that the convolutional neural network can better solve more complex problems.
  • the activation function may include a rectified linear unit (ReLU) function, a sigmoid function, or a hyperbolic tangent function (tanh function).
  • the ReLU function is an unsaturated nonlinear function
  • the Sigmoid function and tanh function are saturated nonlinear functions.
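A quick numerical check of this saturation behavior, in plain NumPy for illustration:

```python
import numpy as np

def relu(x):
    # Rectified linear unit: passes positive values through unchanged.
    return np.maximum(x, 0.0)

def sigmoid(x):
    # Saturating nonlinearity: flattens toward 0 and 1 for large |x|.
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-10.0, 0.0, 10.0])
print(relu(x))      # [ 0.  0. 10.] -- unsaturated for positive inputs
print(sigmoid(x))   # approaches 0 and 1 at the extremes
print(np.tanh(x))   # approaches -1 and 1 at the extremes
```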
  • the activation layer can be used as a layer of the convolutional neural network alone, or the activation layer can be included in a convolutional layer (for example, the first convolutional layer 201 can include the first activation layer 203, and the second convolutional layer 202 can include the second activation layer 204).
  • For example, in the first convolutional layer 201, first, several convolution kernels w_ij^1 of the first set of convolution kernels and several biases b_i^1 of the first set of biases are applied to each input to obtain the output of the first convolutional layer 201; then, the output of the first convolutional layer 201 can be processed by the first activation layer 203 to obtain the output of the first activation layer 203.
  • In the second convolutional layer 202, first, several convolution kernels w_ij^2 of the second set of convolution kernels and several biases b_i^2 of the second set of biases are applied to the output of the first activation layer 203 to obtain the output of the second convolutional layer 202; then, the output of the second convolutional layer 202 can be processed by the second activation layer 204 to obtain the output of the second activation layer 204.
  • the output of the first convolutional layer 201 can be the result of applying the convolution kernels w_ij^1 to its input and then adding the biases b_i^1; the output of the second convolutional layer 202 can be the result of applying the convolution kernels w_ij^2 to the output of the first activation layer 203 and then adding the biases b_i^2.
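The per-kernel computation just described (convolve the input with a kernel, add a scalar bias, then apply the activation) can be written out directly; the 4×4 image, 3×3 averaging kernel, and bias value below are arbitrary illustrative choices.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def conv2d_valid(image, kernel, bias):
    # 'Valid' cross-correlation of a single-channel image with one kernel,
    # plus a scalar bias added to every output element.
    kh, kw = kernel.shape
    h = image.shape[0] - kh + 1
    w = image.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel) + bias
    return out

img = np.arange(16.0).reshape(4, 4)   # toy 4x4 input image
k = np.ones((3, 3)) / 9.0             # 3x3 averaging kernel
feat = relu(conv2d_valid(img, k, bias=-5.0))
print(feat)   # 2x2 feature image: [[0. 1.] [4. 5.]]
```

One kernel produces one feature image; applying several kernels in parallel would produce as many feature images as there are kernels.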
  • Before using the convolutional neural network for image processing, the convolutional neural network needs to be trained. After training, the convolution kernels and biases of the convolutional neural network remain unchanged during image processing. In the training process, each convolution kernel and bias is adjusted through multiple sets of input/output example images and optimization algorithms to obtain an optimized convolutional neural network model.
  • FIG. 2A shows a schematic diagram of the structure of a convolutional neural network
  • FIG. 2B shows a schematic diagram of the working process of a convolutional neural network.
  • the main components of a convolutional neural network can include multiple convolutional layers, multiple downsampling layers, and fully connected layers.
  • each of these layers refers to a corresponding processing operation, that is, convolution processing, downsampling processing, fully connected processing, etc.
  • similarly, each layer of the described neural network refers to the corresponding processing operation; the instance standardization layer or layer standardization layer to be described below is similar to this, and the description is not repeated here.
  • a complete convolutional neural network can be composed of these three layers.
  • FIG. 2A only shows three levels of a convolutional neural network, namely the first level, the second level, and the third level.
  • each level may include a convolution module and a downsampling layer.
  • each convolution module may include a convolution layer.
  • the processing process of each level may include: convolution and down-sampling of the input image.
  • each convolution module may also include an instance normalization layer or a layer normalization layer, so that the processing process of each level may also include instance normalization processing or layer normalization processing.
  • the instance standardization layer is used to perform instance standardization processing on the feature image output by the convolutional layer, so that the gray value of the pixel of the feature image changes within a predetermined range, thereby simplifying the image generation process and improving the effect of image enhancement.
  • the predetermined range may be [-1, 1].
  • the instance standardization layer performs instance standardization processing on each feature image according to its own mean and variance.
  • the instance standardization layer can also be used to perform instance standardization processing on a single image.
  • the instance standardization formula of the instance standardization layer can be expressed as follows:
  • y_tijk = (x_tijk − μ_ti) / sqrt(σ_ti² + ε1), where μ_ti = (1/(H·W)) · Σ_j Σ_k x_tijk and σ_ti² = (1/(H·W)) · Σ_j Σ_k (x_tijk − μ_ti)².
  • x_tijk is the value in the j-th row and k-th column of the i-th feature image of the t-th sample in the set of feature images output by the convolutional layer.
  • y_tijk represents the result obtained after processing x_tijk by the instance standardization layer.
  • ε1 is a small positive number to avoid a zero denominator.
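The instance standardization above is straightforward to express in NumPy; this sketch assumes the feature set is stored as a (T, C, H, W) array and normalizes each feature image with its own mean and variance:

```python
import numpy as np

def instance_norm(x, eps=1e-5):
    # x has shape (T, C, H, W); each (t, i) feature image is standardized
    # with its own spatial mean and variance.
    mean = x.mean(axis=(2, 3), keepdims=True)
    var = x.var(axis=(2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

x = np.random.default_rng(1).random((2, 3, 4, 4))
y = instance_norm(x)
# After standardization every feature image has (near-)zero mean and unit variance.
```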
  • the layer standardization layer is similar to the instance standardization layer, and is also used to perform layer standardization processing on the feature image output by the convolutional layer, so that the gray value of the pixel of the feature image changes within a predetermined range, thereby simplifying the image generation process and improving The effect of image enhancement.
  • the predetermined range may be [-1, 1].
  • the layer standardization layer performs layer standardization processing on each column of the characteristic image according to the mean value and variance of each column of each characteristic image, thereby realizing the layer standardization processing of the characteristic image.
  • the layer standardization layer can also be used to perform layer standardization processing on a single image.
  • the shape of the set of feature images is expressed as (T, C, H, W). Accordingly, the layer standardization formula of the layer standardization layer can be expressed as follows:
  • y′_tijk = (x_tijk − μ_t) / sqrt(σ_t² + ε2), where μ_t = (1/(C·H·W)) · Σ_i Σ_j Σ_k x_tijk and σ_t² = (1/(C·H·W)) · Σ_i Σ_j Σ_k (x_tijk − μ_t)².
  • x_tijk is the value in the j-th row and k-th column of the i-th feature image of the t-th sample in the set of feature images output by the convolutional layer.
  • y′_tijk represents the result obtained after processing x_tijk by the layer standardization layer.
  • ε2 is a small positive number to avoid a zero denominator.
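For comparison, a standard layer standardization normalizes each sample over all of its channels and spatial positions; this axis choice is an assumption where the per-column description above is ambiguous.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # x has shape (T, C, H, W); each sample t is standardized using one mean
    # and variance computed over all its channels, rows, and columns.
    mean = x.mean(axis=(1, 2, 3), keepdims=True)
    var = x.var(axis=(1, 2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

x = np.random.default_rng(2).random((2, 3, 4, 4))
y = layer_norm(x)
```

Unlike instance standardization, the statistics here are shared across all feature images of one sample, so the relative scale between channels is preserved.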
  • the convolutional layer is the core layer of the convolutional neural network.
  • a neuron is only connected to some of the neurons in the adjacent layer.
  • the convolutional layer can apply several convolution kernels (also called filters) to the input image to extract multiple types of features of the input image.
  • Each convolution kernel can extract one type of feature.
  • the convolution kernel is generally initialized as a matrix of random small decimal values; during the training of the convolutional neural network, the convolution kernel learns reasonable weights.
  • the result obtained after applying a convolution kernel to the input image is called a feature image (feature map), and the number of feature images is equal to the number of convolution kernels.
  • Each feature image is composed of some rectangularly arranged neurons, and the neurons of the same feature image share weights, and the shared weights here are the convolution kernels.
  • the feature image output by the convolutional layer of one level can be input to the convolutional layer of the next level and processed again to obtain a new feature image.
  • the first-level convolutional layer may output a first-level feature image
  • the first-level feature image is input to the second-level convolutional layer and processed again to obtain a second-level feature image.
  • the convolutional layer can use different convolution kernels to convolve the data of a certain local receptive field of the input image; the convolution result is input to the activation layer, which computes the corresponding activation function to obtain the feature information of the input image.
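To make the kernel/feature-map relationship concrete, here is a minimal single-channel "valid" convolution sketch; the image values and the two hand-picked kernels are purely illustrative:

```python
import numpy as np

def conv2d_valid(image, kernels):
    """Apply each kernel to a single-channel image with 'valid'
    padding; each kernel yields one feature image, so the number of
    feature images equals the number of kernels."""
    kh, kw = kernels.shape[1:]
    H, W = image.shape
    out = np.empty((len(kernels), H - kh + 1, W - kw + 1))
    for n, k in enumerate(kernels):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                out[n, i, j] = np.sum(image[i:i + kh, j:j + kw] * k)
    return out

image = np.arange(25.0).reshape(5, 5)
kernels = np.stack([np.ones((3, 3)) / 9.0,                     # mean filter
                    np.array([[1, 0, -1]] * 3, dtype=float)])  # edge filter
maps = conv2d_valid(image, kernels)   # two kernels -> two 3x3 feature images
```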
  • the down-sampling layer, arranged between adjacent convolutional layers, implements a form of down-sampling.
  • on the one hand, the down-sampling layer can be used to reduce the scale of the input image, simplify the computational complexity, and reduce over-fitting to a certain extent; on the other hand, the down-sampling layer can also perform feature compression to extract the main features of the input image.
  • the down-sampling layer can reduce the size of feature images, but does not change the number of feature images.
  • for example, for a 12×12 input image, a 6×6 down-sampling filter yields a 2×2 output image, which means that 36 pixels of the input image are merged into 1 pixel of the output image.
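The merge described above can be sketched with average pooling; the 12×12 input and 6×6 filter sizes are just one consistent choice, and max pooling or strided convolution would work analogously:

```python
import numpy as np

def average_pool(x, f):
    """Down-sample a square image with an f x f averaging filter,
    merging f*f input pixels into one output pixel."""
    H, W = x.shape
    return x.reshape(H // f, f, W // f, f).mean(axis=(1, 3))

x = np.arange(144.0).reshape(12, 12)   # 12x12 input image
y = average_pool(x, 6)                 # 2x2 output: 36 pixels -> 1 pixel
```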
  • the last downsampling layer or convolutional layer can be connected to one or more fully connected layers, which are used to connect all the extracted features.
  • the output of the fully connected layer is a one-dimensional matrix, which is a vector.
  • FIG. 3 is a flowchart of an image processing method provided by an embodiment of the disclosure.
  • the image processing method includes:
  • Step S110 Receive a first characteristic image
  • Step S120 Perform at least one multi-scale cyclic sampling process on the first feature image.
  • the first feature image may include a feature image obtained after the input image is processed by one of a convolutional layer, a residual network, a dense network, etc. (for example, refer to FIG. 2B).
  • the residual network retains a certain proportion of its input in its output, for example by adding residual connections.
  • a dense network includes a bottleneck layer and a convolution layer.
  • the bottleneck layer is used to reduce the dimensionality of the data so as to reduce the number of parameters in the subsequent convolution operation; for example, the convolution kernel of the bottleneck layer is a 1×1 convolution kernel.
  • the convolution kernel of the convolution layer is a 3 ⁇ 3 convolution kernel; the present disclosure includes but is not limited to this.
  • the input image is processed by convolution, down-sampling, etc. to obtain the first feature image.
  • this embodiment does not limit the acquisition method of the first characteristic image.
  • the first characteristic image may include a plurality of characteristic images, but is not limited thereto.
  • the first feature image received in step S110 is used as the input of the multi-scale cyclic sampling process in step S120.
  • the multi-scale cyclic sampling process may have various forms, including but not limited to the three forms shown in FIGS. 4A-4C which will be described below.
  • FIG. 4A is a schematic flowchart diagram corresponding to the multi-scale cyclic sampling processing in the image processing method shown in FIG. 3 according to an embodiment of the present disclosure.
  • the multi-scale cyclic sampling processing includes nested first-level sampling processing and second-level sampling processing.
  • the input of the multi-scale cyclic sampling processing is used as the input of the first-level sampling processing
  • the output of the first-level sampling processing is used as the output of the multi-scale cyclic sampling processing.
  • the output of the multi-scale cyclic sampling process is called the second feature image.
  • the size of the second feature image (the number of rows and columns of the pixel array) may be the same as the size of the first feature image.
  • the first-level sampling process includes a first down-sampling process, a first up-sampling process, and a first residual link addition process that are sequentially executed.
  • the first down-sampling process is performed based on the input of the first-level sampling process to obtain the first down-sampled output.
  • the first down-sampling process can directly down-sample the input of the first-level sampling process to obtain the first down-sampling output.
  • the first up-sampling process performs up-sampling based on the first down-sampling output to obtain the first up-sampling output; for example, the first down-sampling output is first subjected to the second-level sampling process and the result is then up-sampled to obtain the first up-sampling output, i.e., the first up-sampling process can indirectly up-sample the first down-sampling output.
  • the first residual link addition process performs the first residual link addition on the input of the first-level sampling process and the first up-sampling output, and then uses the result of the first residual link addition as the output of the first-level sampling process.
  • the size of the output of the first up-sampling process (i.e., the first up-sampling output) is the same as the size of the input of the first-level sampling process (i.e., the input of the first down-sampling process), so that, after the first residual link addition,
  • the size of the output of the first-level sampling process is the same as the size of the input of the first-level sampling process.
  • the second-level sampling process is nested between the first down-sampling process and the first up-sampling process of the first-level sampling process: it receives the first down-sampling output as the input of the second-level sampling process and provides the output of the second-level sampling process as the input of the first up-sampling process, so that the first up-sampling process performs up-sampling based on the first down-sampling output.
  • the second-level sampling process includes a second down-sampling process, a second up-sampling process, and a second residual link addition process that are sequentially executed.
  • the second down-sampling process performs down-sampling based on the input of the second-level sampling process to obtain the second down-sampled output.
  • the second down-sampling process can directly down-sample the input of the second-level sampling process to obtain the second down-sampling output.
  • the second up-sampling process performs up-sampling based on the second down-sampled output to obtain the second up-sampled output.
  • the second up-sampling process may directly up-sample the second down-sampled output to obtain the second up-sampled output.
  • the second residual link addition process performs the second residual link addition on the input of the second-level sampling process and the second up-sampling output, and then uses the result of the second residual link addition as the output of the second-level sampling process.
  • the size of the output of the second up-sampling process (i.e., the second up-sampling output) is the same as the size of the input of the second-level sampling process (i.e., the input of the second down-sampling process), so that, after the second residual link addition,
  • the size of the output of the second-level sampling process is the same as the size of the input of the second-level sampling process.
  • the procedures of the sampling processing at each level (for example, the first-level sampling processing, the second-level sampling processing, and the third-level sampling processing introduced in the embodiment shown in FIG. 4B) are similar, each including down-sampling processing, up-sampling processing, and residual link addition processing.
  • the residual link addition processing may include correspondingly adding the values of each row and each column of the matrix of the two feature images, but it is not limited to this.
  • “nested” means that an object includes another object that is similar or identical to the object, and the object includes but is not limited to a process or a network structure.
  • in the sampling processing of each level, the size of the output of the up-sampling process (for example, a feature image) is the same as the size of the input of the down-sampling process (for example, also a feature image), so that, after the residual link addition, the size of the output of the sampling processing of each level (for example, a feature image) is the same as the size of the input of the sampling processing of that level.
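The nesting just described can be sketched recursively. The sketch below uses single-channel feature images, a factor-2 average pool for down-sampling, and nearest-neighbor repetition for up-sampling; real embodiments would use learned convolutional sub-networks instead, so all operators here are illustrative stand-ins:

```python
import numpy as np

def downsample(x):
    """Halve each dimension by 2x2 average pooling (stand-in for a
    learned down-sampling sub-network)."""
    H, W = x.shape
    return x.reshape(H // 2, 2, W // 2, 2).mean(axis=(1, 3))

def upsample(x):
    """Double each dimension by nearest-neighbor repetition (stand-in
    for a learned up-sampling sub-network)."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def sampling_level(x, depth):
    """One sampling level: down-sample, optionally run the nested next
    level on the down-sampled output, up-sample, then add the level's
    input back via the residual link. Output size equals input size."""
    down = downsample(x)
    if depth > 1:                       # nest the next-level sampling process
        down = sampling_level(down, depth - 1)
    up = upsample(down)
    return x + up                       # residual link addition

x = np.random.rand(8, 8)                # input of the first-level process
y = sampling_level(x, depth=2)          # first level nesting a second level
```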
  • multi-scale cyclic sampling processing can be implemented by a convolutional neural network.
  • the first convolutional neural network may be used to perform multi-scale cyclic sampling processing.
  • the first convolutional neural network may include a nested first meta network and second meta network; the first meta network is used to perform the first-level sampling processing, and the second meta network is used to perform the second-level sampling processing.
  • the first meta-network may include a first sub-network and a second sub-network, the first sub-network is used to perform the first down-sampling process, and the second sub-network is used to perform the first up-sampling process.
  • the second meta network is nested between the first sub-network and the second sub-network of the first meta network.
  • the second meta-network may include a third sub-network and a fourth sub-network, the third sub-network is used to perform the second down-sampling process, and the fourth sub-network is used to perform the second up-sampling process.
  • both the first meta network and the second meta network are similar to the aforementioned residual network.
  • each of the first sub-network, the second sub-network, the third sub-network, and the fourth sub-network includes one of a convolutional layer, a residual network, a dense network, and the like.
  • the first sub-network and the third sub-network may include a convolutional layer (down-sampling layer) with down-sampling function, and may also include one of residual networks and dense networks with down-sampling function;
  • the second sub-network and the fourth sub-network may include a convolutional layer with an up-sampling function (an up-sampling layer), and may also include one of a residual network, a dense network, or the like with an up-sampling function.
  • the first sub-network and the third sub-network may have the same structure or different structures; the second sub-network and the fourth sub-network may have the same structure or different structures; the embodiments of the present disclosure do not limit this.
  • Down-sampling is used to reduce the size of the feature image, thereby reducing the data amount of the feature image.
  • down-sampling can be performed through the down-sampling layer, but is not limited to this.
  • for example, the down-sampling layer can implement down-sampling by methods such as max pooling, average pooling, strided convolution, decimation (e.g., selecting fixed pixels), or demultiplexed output (demuxout, which splits the input image into multiple smaller images).
  • Upsampling is used to increase the size of the feature image, thereby increasing the data volume of the feature image.
  • upsampling can be performed through an upsampling layer, but is not limited to this.
  • the up-sampling layer can adopt up-sampling methods such as strided transposed convolution and interpolation algorithms to implement up-sampling processing.
  • the interpolation algorithm may include, for example, bilinear interpolation and bicubic interpolation.
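As one concrete example of an interpolation-based up-sampling layer, a bilinear up-sampler might look as follows; the pixel-centre coordinate convention used here is one of several possible choices:

```python
import numpy as np

def bilinear_upsample(x, factor):
    """Up-sample a 2-D feature image by bilinear interpolation."""
    H, W = x.shape
    out_h, out_w = H * factor, W * factor
    # map each output pixel centre back into input coordinates
    rows = (np.arange(out_h) + 0.5) / factor - 0.5
    cols = (np.arange(out_w) + 0.5) / factor - 0.5
    r0 = np.clip(np.floor(rows).astype(int), 0, H - 2)
    c0 = np.clip(np.floor(cols).astype(int), 0, W - 2)
    dr = np.clip(rows - r0, 0, 1)[:, None]   # vertical blend weights
    dc = np.clip(cols - c0, 0, 1)[None, :]   # horizontal blend weights
    tl = x[np.ix_(r0, c0)];     tr = x[np.ix_(r0, c0 + 1)]
    bl = x[np.ix_(r0 + 1, c0)]; br = x[np.ix_(r0 + 1, c0 + 1)]
    top = tl * (1 - dc) + tr * dc
    bot = bl * (1 - dc) + br * dc
    return top * (1 - dr) + bot * dr

x = np.array([[0.0, 1.0], [2.0, 3.0]])
y = bilinear_upsample(x, 2)   # 2x2 -> 4x4, corners preserved
```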
  • at the same level, the down-sampling factor of the down-sampling process corresponds to the up-sampling factor of the up-sampling process; that is, when the down-sampling factor of the down-sampling process is 1/y, the up-sampling factor of the up-sampling process is y, where y is a positive integer, usually greater than or equal to 2.
  • it should be noted that the parameters of the down-sampling processes at different levels (that is, the parameters of the networks corresponding to the down-sampling processes) may be the same or different; the parameters of the up-sampling processes at different levels (that is, the parameters of the networks corresponding to the up-sampling processes) may be the same or different; and the parameters of the residual link additions at different levels may be the same or different. The present disclosure does not limit this.
  • the multi-scale cyclic sampling processing may also include: after the first down-sampling process, the first up-sampling process, the second down-sampling process, and the second up-sampling process, respectively performing instance standardization processing or layer standardization processing on the first down-sampling output, the first up-sampling output, the second down-sampling output, and the second up-sampling output.
  • the first down-sampling output, the first up-sampling output, the second down-sampling output, and the second up-sampling output can use the same standardization processing method (instance standardization processing or layer standardization processing) or different standardization processing methods; the present disclosure does not limit this.
  • correspondingly, the first sub-network, the second sub-network, the third sub-network, and the fourth sub-network each also include an instance standardization layer or a layer standardization layer; the instance standardization layer is used to perform instance standardization processing, and the layer standardization layer is used to perform layer standardization processing. For example, the instance standardization layer can perform instance standardization processing according to the aforementioned instance standardization formula, and the layer standardization layer can perform layer standardization processing according to the aforementioned layer standardization formula; this is not limited in the present disclosure.
  • the first sub-network, the second sub-network, the third sub-network, and the fourth sub-network may include the same standardization layer (instance standardization layer or layer standardization layer) or different standardization layers; the present disclosure does not limit this.
  • FIG. 4B is a schematic flowchart diagram corresponding to the multi-scale cyclic sampling processing in the image processing method shown in FIG. 3 according to another embodiment of the present disclosure.
  • the multi-scale cyclic sampling process further includes a third-level sampling process.
  • the other procedures of the multi-scale cyclic sampling processing shown in FIG. 4B are basically the same as the procedures of the multi-scale cyclic sampling processing shown in FIG. 4A, and the repeated parts are not described here again.
  • the third-level sampling process is nested between the second down-sampling process and the second up-sampling process of the second-level sampling process: it receives the second down-sampling output as the input of the third-level sampling process and provides the output of the third-level sampling process as the input of the second up-sampling process, so that the second up-sampling process indirectly up-samples the second down-sampling output.
  • the third-level sampling process includes a third down-sampling process, a third up-sampling process, and a third residual link addition process that are sequentially executed.
  • the third down-sampling process is performed based on the input of the third-level sampling process to obtain the third down-sampled output.
  • the third down-sampling process can directly down-sample the input of the third-level sampling process to obtain the third down-sampling output.
  • the third up-sampling process performs up-sampling based on the third down-sampled output to obtain the third up-sampled output.
  • the third up-sampling process may directly up-sample the third down-sampled output to obtain the third up-sampled output.
  • the third residual link addition process performs the third residual link addition on the input of the third-level sampling process and the third up-sampling output, and then uses the result of the third residual link addition as the output of the third-level sampling process.
  • the size of the output of the third up-sampling process (i.e., the third up-sampling output) is the same as the size of the input of the third-level sampling process (i.e., the input of the third down-sampling process), so that, after the third residual link addition, the size of the output of the third-level sampling process is the same as the size of the input of the third-level sampling process.
  • it should be noted that the multi-scale cyclic sampling processing may also include more levels of sampling processing, for example, a fourth-level sampling processing nested in the third-level sampling processing, a fifth-level sampling processing nested in the fourth-level sampling processing, and so on; the nesting manner is similar to that of the second-level and third-level sampling processing described above, and the present disclosure does not limit this.
  • FIG. 4C is a schematic flowchart diagram corresponding to the multi-scale cyclic sampling processing in the image processing method shown in FIG. 3 according to another embodiment of the present disclosure.
  • in this embodiment, the multi-scale cyclic sampling process includes multiple second-level sampling processes that are executed sequentially.
  • the other procedures of the multi-scale cyclic sampling processing shown in FIG. 4C are basically the same as the procedures of the multi-scale cyclic sampling processing shown in FIG. 4A, and the repeated parts are not described here again.
  • the inclusion of two second-level sampling processes in FIG. 4C is exemplary; the multi-scale cyclic sampling processing may include two or more second-level sampling processes performed sequentially. It should be noted that the number of second-level sampling processes can be selected according to actual needs, and the present disclosure does not limit this.
  • the inventors of the present application found that, compared with image processing methods using one or three second-level sampling processes, an image processing method using two second-level sampling processes achieves a better image enhancement effect; however, this should not be seen as a limitation of the present disclosure.
  • the first second-level sampling process receives the first down-sampling output as its input; every second-level sampling process except the first receives the output of the previous second-level sampling process as its input; and the output of the last second-level sampling process is used as the input of the first up-sampling process.
  • it should be noted that, among sampling processes of the same level executed in different orders, the parameters of the down-sampling processes may be the same or different, the parameters of the up-sampling processes may be the same or different, and the parameters of the residual link additions may be the same or different. The present disclosure does not limit this.
  • it should be noted that the first-level sampling process can nest multiple sequentially executed second-level sampling processes; further, at least some of the second-level sampling processes may nest one or more sequentially executed third-level sampling processes, and the numbers of third-level sampling processes nested in different second-level sampling processes may be the same or different; further, the third-level sampling processes can nest fourth-level sampling processes in the same manner as the second-level sampling processes nest the third-level sampling processes; and so on.
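The FIG. 4C-style arrangement, in which the first-level process nests several sequentially executed second-level processes, can be sketched as follows; factor-2 pooling and nearest-neighbor repetition again stand in for learned sub-networks, and n_second=2 matches the exemplary figure:

```python
import numpy as np

def down(x):
    H, W = x.shape
    return x.reshape(H // 2, 2, W // 2, 2).mean(axis=(1, 3))  # 2x2 avg pool

def up(x):
    return x.repeat(2, axis=0).repeat(2, axis=1)  # nearest-neighbor x2

def second_level(x):
    """One second-level sampling process: down, up, residual add."""
    return x + up(down(x))

def first_level(x, n_second=2):
    """First-level sampling process nesting n_second sequentially
    executed second-level processes between its down- and up-sampling."""
    h = down(x)                   # first down-sampling output
    for _ in range(n_second):     # each takes the previous one's output
        h = second_level(h)
    return x + up(h)              # first up-sampling + residual link addition

x = np.random.rand(8, 8)
y = first_level(x)                # output has the same size as the input
```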
  • FIGS. 4A-4C show situations where the image processing method provided by an embodiment of the present disclosure includes one multi-scale cyclic sampling process.
  • at least one multi-scale cyclic sampling processing includes one multi-scale cyclic sampling processing.
  • the multi-scale cyclic sampling processing receives the first feature image as the input of the multi-scale cyclic sampling processing, and the input of the multi-scale cyclic sampling processing is used as the input of the first-level sampling processing in the multi-scale cyclic sampling processing.
  • the output of the first-level sampling processing is used as the output of the multi-scale cyclic sampling processing, and the output of the multi-scale cyclic sampling processing is used as the output of the at least one multi-scale cyclic sampling processing.
  • the present disclosure includes but is not limited to this.
  • FIG. 4D is a schematic flow chart corresponding to the multi-scale cyclic sampling processing in the image processing method shown in FIG. 3 according to another embodiment of the present disclosure.
  • in the embodiment shown in FIG. 4D, the at least one multi-scale cyclic sampling processing includes multiple multi-scale cyclic sampling processes executed sequentially; for example, it may include two or three sequentially executed multi-scale cyclic sampling processes, but is not limited to this. It should be noted that, in the embodiments of the present disclosure, the number of multi-scale cyclic sampling processes can be selected according to actual needs, and the present disclosure does not limit this.
  • the inventors of the present application found that, compared with image processing methods using one or three multi-scale cyclic sampling processes, an image processing method using two multi-scale cyclic sampling processes achieves a better image enhancement effect; however, this should not be seen as a limitation of the present disclosure.
  • the input of each multi-scale cyclic sampling process is used as the input of the first-level sampling process in that multi-scale cyclic sampling process, and the output of the first-level sampling process in each multi-scale cyclic sampling process is used as the output of that multi-scale cyclic sampling process.
  • the first multi-scale cyclic sampling process receives the first feature image as its input; each multi-scale cyclic sampling process except the first receives the output of the previous multi-scale cyclic sampling process as its input; and the output of the last multi-scale cyclic sampling process is used as the output of the at least one multi-scale cyclic sampling processing.
  • Fig. 5 is a flowchart of an image processing method provided by another embodiment of the present disclosure.
  • the image processing method includes step S210 to step S250.
  • steps S230 to S240 of the image processing method shown in FIG. 5 are the same as steps S110 to S120 of the image processing method shown in FIG. 3; that is, the image processing method shown in FIG. 5 includes the image processing method shown in FIG. 3. Therefore, for steps S230 to S240 of the image processing method shown in FIG. 5, reference may be made to the foregoing description of steps S110 to S120 of the image processing method shown in FIG. 3, as well as to the embodiments shown in FIGS. 4A to 4D.
  • steps S210 to S250 of the image processing method shown in FIG. 5 will be described in detail.
  • Step S210 Obtain an input image.
  • the input image may include photos captured by the camera of a smart phone, the camera of a tablet computer, the camera of a personal computer, the lens of a digital camera, a surveillance camera, or a web camera, etc., and may include images of people, images of animals and plants, or landscape images, etc.; this is not limited in the present disclosure.
  • the quality of the input image is lower than the quality of photos taken by a real digital single-lens reflex camera, that is, the input image is a low-quality image.
  • the input image may include a 3-channel RGB image; in other examples, the input image may include a 3-channel YUV image.
  • the input image includes an RGB image as an example, but the embodiment of the present disclosure is not limited to this.
  • Step S220 Use the analysis network to convert the input image into a first feature image.
  • the analysis network may be a convolutional neural network including one of a convolutional layer, a residual network, and a dense network.
  • the analysis network can convert 3 channel RGB images (ie, input images) into multiple first feature images, such as 64 first feature images.
  • the present disclosure includes but is not limited to this.
  • the embodiment of the present disclosure does not limit the structure and parameters of the analysis network, as long as it can convert the input image to the convolution feature dimension (ie, convert it to the first feature image).
  • Step S230 Receive the first characteristic image
  • Step S240 Perform at least one multi-scale cyclic sampling process on the first feature image.
  • step S230 to step S240 reference may be made to the foregoing description of step S110 to step S120, which will not be repeated in this disclosure.
  • Step S250 Use a synthesis network to convert the output of at least one multi-scale cyclic sampling process into an output image.
  • the synthesis network may be a convolutional neural network including one of a convolutional layer, a residual network, a dense network, and the like.
  • the output of at least one multi-scale cyclic sampling process can be referred to as the second feature image.
  • the number of second feature images may be multiple, but is not limited to this.
  • the synthesis network may convert multiple second feature images into output images.
  • the output image may include a 3-channel RGB image; the present disclosure includes but is not limited to this.
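Putting steps S210 to S250 together, the overall data flow can be sketched as below. The random 1×1 convolutions standing in for the analysis and synthesis networks, the 64-feature width, and the doubling placeholder for the multi-scale cyclic sampling pass are all illustrative assumptions; in practice these are trained networks:

```python
import numpy as np

rng = np.random.default_rng(0)

def analysis(img, n_features=64):
    """Stand-in for the analysis network: a random 1x1 convolution
    lifting a 3-channel RGB image to n_features feature images."""
    w = rng.standard_normal((n_features, img.shape[0]))
    return np.einsum('fc,chw->fhw', w, img)

def multi_scale_cyclic(feats):
    """Stand-in for one multi-scale cyclic sampling pass that keeps the
    feature-image size unchanged (placeholder residual doubling)."""
    return feats + feats

def synthesis(feats):
    """Stand-in for the synthesis network: a random 1x1 convolution
    mapping the second feature images back to a 3-channel image."""
    w = rng.standard_normal((3, feats.shape[0]))
    return np.einsum('cf,fhw->chw', w, feats)

img = rng.random((3, 16, 16))            # step S210: 3-channel input image
feats = analysis(img)                    # step S220: 64 first feature images
for _ in range(2):                       # steps S230-S240: two sampling passes
    feats = multi_scale_cyclic(feats)
out = synthesis(feats)                   # step S250: 3-channel output image
```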
  • FIG. 6A is a schematic diagram of an input image
  • FIG. 6B is a schematic diagram of the output image obtained by processing the input image shown in FIG. 6A according to an image processing method provided by an embodiment of the present disclosure (for example, the image processing method shown in FIG. 5).
  • the output image retains the content of the input image, while the contrast of the image is improved and the problem of the input image being too dark is remedied, so that the quality of the output image can be close to that of a photo taken by a digital single-lens reflex camera.
  • the output image is a high-quality image.
  • the embodiment of the present disclosure does not limit the structure and parameters of the synthesis network, as long as it can convert the convolution feature dimension (ie, the second feature image) into an output image.
  • the image processing method provided by the embodiments of the present disclosure can perform image enhancement processing on low-quality input images, and by repeatedly sampling at multiple scales to obtain higher image fidelity, the quality of output images can be greatly improved.
  • for example, the PSNR of the image output by the image enhancement method proposed by Andrey Ignatov et al. is 20.08, while the PSNR of the output image obtained by the image processing method provided in the embodiment shown in FIG. 4C of the present disclosure can reach 23.35; that is, the image obtained by the image processing method provided by the embodiments of the present disclosure can be closer to a real photo taken by a digital single-lens reflex camera.
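PSNR, the metric quoted in this comparison, can be computed as follows; the 4×4 constant images are a toy illustration, not the patent's test data:

```python
import numpy as np

def psnr(ref, img, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference image and a
    processed image; higher values mean the images are closer."""
    mse = np.mean((ref.astype(float) - img.astype(float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

ref = np.full((4, 4), 128.0)
noisy = ref + 8.0                  # constant error of 8 grey levels, MSE = 64
value = psnr(ref, noisy)           # 10*log10(255^2 / 64) ~ 30.07 dB
```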
  • FIG. 7A is a schematic structural diagram of a neural network provided by an embodiment of the disclosure
  • FIG. 7B is a flowchart of a neural network training method provided by an embodiment of the disclosure
  • FIG. 7C is a schematic block diagram, provided by an embodiment of the disclosure, corresponding to training the neural network shown in FIG. 7A with the training method shown in FIG. 7B.
  • the neural network 300 includes an analysis network 310, a first sub-neural network 320, and a synthesis network 330.
  • the analysis network 310 processes the input image to obtain the first feature image
  • the first sub-neural network 320 performs at least one multi-scale cyclic sampling process on the first feature image to obtain the second feature image
  • the synthesis network 330 processes the second feature image to obtain an output image.
  • for the structure of the analysis network 310, reference may be made to the description of the analysis network in the aforementioned step S220; for the structure of the first sub-neural network 320, reference may be made to the description of the multi-scale cyclic sampling processing in the aforementioned step S120 (that is, step S240), and the first sub-neural network may include, but is not limited to, the aforementioned first convolutional neural network; for the structure of the synthesis network 330, reference may be made to the description of the synthesis network in the aforementioned step S250. The present disclosure does not limit these structures.
  • the input image and the output image can also refer to the description of the input image and the output image in the image processing method provided in the foregoing embodiment, which will not be repeated in this disclosure.
  • the training method of the neural network includes step S410 to step S460.
  • Step S410 Obtain training input images.
  • the training input image may also include photos taken by the camera of a smart phone, a camera of a tablet computer, a camera of a personal computer, a lens of a digital camera, a surveillance camera, or a web camera. It may include images of people, images of animals and plants, or landscapes, etc., which is not limited in the present disclosure.
  • the quality of the training input image is lower than the quality of photos taken by a real digital single-lens reflex camera, that is, the training input image is a low-quality image.
  • the training input image may include 3 channel RGB images.
  • Step S420 Use the analysis network to process the training input image to provide a first training feature image.
  • the analysis network 310 may be a convolutional neural network including one of a convolutional layer, a residual network, and a dense network.
  • the analysis network can convert 3 channel RGB images (ie, training input images) into multiple first training feature images, such as 64 first training feature images.
  • Step S430 Use the first sub-neural network to perform multi-scale cyclic sampling processing on the first training feature image at least once to obtain a second training feature image.
  • the multi-scale cyclic sampling process can be implemented as the multi-scale cyclic sampling process in any of the embodiments shown in FIGS. 4A-4D, but is not limited thereto.
  • the following description takes, as an example, the case in which the multi-scale cyclic sampling processing in step S430 is implemented as the multi-scale cyclic sampling processing shown in FIG. 4A.
  • the multi-scale cyclic sampling process includes a nested first-level sampling process and second-level sampling process.
  • the input of the multi-scale cyclic sampling process (i.e., the first training feature image) serves as the input of the first-level sampling process, and the output of the first-level sampling process serves as the output of the multi-scale cyclic sampling process (i.e., the second training feature image).
  • the size of the second training feature image may be the same as the size of the first training feature image.
  • the first-level sampling process includes a first down-sampling process, a first up-sampling process, and a first residual link addition process that are sequentially executed.
  • the first down-sampling process is performed based on the input of the first-level sampling process to obtain the first down-sampled output.
  • for example, the first down-sampling process can directly down-sample the input of the first-level sampling process to obtain the first down-sampled output.
  • the first up-sampling process performs up-sampling based on the first down-sampled output to obtain the first up-sampled output; for example, the first down-sampled output is first subjected to the second-level sampling process and then up-sampled to obtain the first up-sampled output, that is, the first up-sampling process can indirectly up-sample the first down-sampled output.
  • the first residual link addition process performs the first residual link addition on the input of the first-level sampling process and the first up-sampled output, and then uses the result of the first residual link addition as the output of the first-level sampling process.
  • the size of the output of the first up-sampling process (i.e., the first up-sampled output) is the same as the size of the input of the first-level sampling process (i.e., the input of the first down-sampling process), so that, through the first residual link addition, the size of the output of the first-level sampling process is the same as the size of the input of the first-level sampling process.
  • the second-level sampling process is nested between the first down-sampling process and the first up-sampling process of the first-level sampling process; it receives the first down-sampled output as the input of the second-level sampling process and provides the output of the second-level sampling process as the input of the first up-sampling process, so that the first up-sampling process performs up-sampling based on the first down-sampled output.
  • the second-level sampling process includes a second down-sampling process, a second up-sampling process, and a second residual link addition process that are sequentially executed.
  • the second down-sampling process performs down-sampling based on the input of the second-level sampling process to obtain the second down-sampled output.
  • for example, the second down-sampling process can directly down-sample the input of the second-level sampling process to obtain the second down-sampled output.
  • the second up-sampling process performs up-sampling based on the second down-sampled output to obtain the second up-sampled output.
  • the second up-sampling process may directly up-sample the second down-sampled output to obtain the second up-sampled output.
  • the second residual link addition process performs a second residual link addition on the input of the second-level sampling process and the second up-sampled output, and then uses the result of the second residual link addition as the output of the second-level sampling process.
  • the size of the output of the second up-sampling process (i.e., the second up-sampled output) is the same as the size of the input of the second-level sampling process (i.e., the input of the second down-sampling process), so that, through the second residual link addition, the size of the output of the second-level sampling process is the same as the size of the input of the second-level sampling process.
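The nested two-level structure described above (down-sample, recurse, up-sample, residual add) can be illustrated with a minimal NumPy sketch. This is only a shape-level toy: the `down2`/`up2` helpers (2x2 average pooling and nearest-neighbour repetition) are hypothetical stand-ins for the convolutional down-/up-sampling layers that the disclosure actually uses, and the feature image is a single channel for brevity.

```python
import numpy as np

def down2(x):
    # 2x2 average-pool down-sampling (an assumed choice; the patent only
    # requires some down-sampling layer, e.g. a strided convolution)
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def up2(x):
    # nearest-neighbour up-sampling back to twice the size in each dimension
    return x.repeat(2, axis=0).repeat(2, axis=1)

def second_level(x):
    # second down-sampling -> second up-sampling -> second residual addition
    d2 = down2(x)
    u2 = up2(d2)
    return x + u2          # output has the same size as the input

def first_level(x):
    # first down-sampling, then the nested second-level process,
    # then first up-sampling and the first residual addition
    d1 = down2(x)
    u1 = up2(second_level(d1))
    return x + u1          # again size-preserving

feat = np.arange(16.0).reshape(4, 4)   # toy "first feature image"
out = first_level(feat)
assert out.shape == feat.shape         # multi-scale cyclic sampling keeps size
```

Because each level ends with a residual addition whose up-sampled branch matches the level's input size, the whole nest is size-preserving, which is exactly the property the two preceding paragraphs state.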
  • the first sub-neural network 320 may be implemented as the aforementioned first convolutional neural network.
  • the first sub-neural network 320 may include a nested first meta-network and a second meta-network, the first meta-network is used to perform the first-level sampling processing, and the second meta-network is used to perform the second-level sampling processing.
  • the first meta-network may include a first sub-network and a second sub-network, the first sub-network is used to perform the first down-sampling process, and the second sub-network is used to perform the first up-sampling process.
  • the second meta-network is nested between the first sub-network and the second sub-network of the first meta-network (i.e., between the first down-sampling process and the first up-sampling process).
  • the second meta-network may include a third sub-network and a fourth sub-network, the third sub-network is used to perform the second down-sampling process, and the fourth sub-network is used to perform the second up-sampling process.
  • each of the first sub-network, the second sub-network, the third sub-network, and the fourth sub-network includes one of a convolutional layer, a residual network, a dense network, and the like.
  • the first sub-network and the third sub-network may each include one of a convolutional layer with a down-sampling function (a down-sampling layer), a residual network, a dense network, etc.; the second sub-network and the fourth sub-network may each include one of a convolutional layer with an up-sampling function (an up-sampling layer), a residual network, a dense network, etc.
  • the first sub-network and the third sub-network may have the same structure or different structures, and the second sub-network and the fourth sub-network may have the same structure or different structures; the present disclosure does not limit this.
  • the multi-scale cyclic sampling processing may further include: after the first down-sampling process, the first up-sampling process, the second down-sampling process, and the second up-sampling process, performing instance standardization processing or layer standardization processing on the first down-sampled output, the first up-sampled output, the second down-sampled output, and the second up-sampled output, respectively.
  • the first down-sampled output, the first up-sampled output, the second down-sampled output, and the second up-sampled output may use the same standardization processing method (instance standardization processing or layer standardization processing) or different standardization processing methods; the present disclosure does not limit this.
  • correspondingly, the first sub-network, the second sub-network, the third sub-network, and the fourth sub-network each further include an instance standardization layer or a layer standardization layer; the instance standardization layer is used to perform instance standardization processing, and the layer standardization layer is used to perform layer standardization processing. For example, the instance standardization layer can perform instance standardization processing according to the aforementioned instance standardization formula, and the layer standardization layer can perform layer standardization processing according to the aforementioned layer standardization formula, which is not limited in the present disclosure.
  • the first sub-network, the second sub-network, the third sub-network, and the fourth sub-network may include the same type of standardization layer (instance standardization layer or layer standardization layer) or different types of standardization layers; the present disclosure does not limit this.
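As a rough illustration of the difference between the two standardization options, the NumPy sketch below normalizes a batch of feature maps per (sample, channel) for instance standardization and per sample for layer standardization. The axis choices follow the usual definitions of these operations and are an assumption here, since the disclosure's exact formulas are given elsewhere in the specification.

```python
import numpy as np

def instance_norm(x, eps=1e-5):
    # instance standardization: each (sample, channel) feature map is
    # normalized over its own H*W spatial values
    mean = x.mean(axis=(2, 3), keepdims=True)
    var = x.var(axis=(2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def layer_norm(x, eps=1e-5):
    # layer standardization: each sample is normalized over all of its
    # channels and spatial positions together
    mean = x.mean(axis=(1, 2, 3), keepdims=True)
    var = x.var(axis=(1, 2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

x = np.random.rand(2, 64, 8, 8)   # a batch of 64-channel feature maps
y = instance_norm(x)
# each per-channel feature map of y now has (approximately) zero mean
```

Either function could be applied to the first/second down-sampled and up-sampled outputs mentioned above; which one is used (and whether all four outputs share the same choice) is left open by the disclosure.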
  • for more implementation methods and details of the multi-scale cyclic sampling processing in step S430, please refer to the foregoing step S120 (i.e., step S240) and the multi-scale cyclic sampling processing in the embodiments shown in FIGS. 4A-4D; this disclosure will not repeat the description. It should also be noted that when the multi-scale cyclic sampling processing in step S430 is implemented in other forms, the first sub-neural network 320 should be changed accordingly to implement those forms, which will likewise not be repeated in this disclosure.
  • the number of second training feature images may be multiple, but is not limited thereto.
  • Step S440 Use the synthesis network to process the second training feature image to obtain a training output image.
  • the synthesis network 330 may be a convolutional neural network including one of a convolutional layer, a residual network, a dense network, and the like.
  • the synthesis network may convert multiple second training feature images into training output images.
  • the training output image may include a 3-channel RGB image, but the present disclosure is not limited to this.
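As a toy illustration of how a synthesis network might map many second training feature images back to a 3-channel RGB image, the sketch below uses a single 1x1-convolution-like channel-mixing step. The kernel, sizes, and random values are hypothetical; a real synthesis network would use trained convolutional, residual, or dense layers as the surrounding text describes.

```python
import numpy as np

rng = np.random.default_rng(1)
features = rng.random((64, 16, 16))   # 64 second training feature images
weights = rng.random((3, 64)) / 64    # hypothetical 1x1 conv kernel: 64 -> 3

# a 1x1 convolution is just a per-pixel linear mix of the 64 channels
rgb = np.tensordot(weights, features, axes=([1], [0]))
assert rgb.shape == (3, 16, 16)       # a 3-channel "training output image"
```

The point is only the shape transformation: the synthesis network collapses the many feature channels into the three channels of the output image while keeping the spatial resolution.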
  • Step S450 Based on the training output image, calculate the loss value of the neural network through the loss function.
  • the parameters of the neural network 300 include the parameters of the analysis network 310, the parameters of the first sub-neural network 320, and the parameters of the synthesis network 330.
  • the initial parameter of the neural network 300 may be a random number, for example, the random number conforms to a Gaussian distribution, which is not limited in the embodiment of the present disclosure.
  • the loss function of this embodiment can refer to the loss function in the literature provided by Andrey Ignatov et al.
  • the loss function can include a color loss function, a texture loss function, and a content loss function; correspondingly, the specific process of calculating the loss value of the parameters of the neural network 300 through the loss function can also refer to the description in that literature.
  • the embodiment of the present disclosure does not limit the specific form of the loss function, which includes but is not limited to the form of the loss function in the above-mentioned documents.
  • Step S460 Correct the parameters of the neural network according to the loss value.
  • the training process of the neural network 300 may also include an optimization function (not shown in FIG. 7C).
  • the optimization function may calculate the error values of the parameters of the neural network 300 according to the loss value calculated by the loss function, and the parameters of the neural network 300 are corrected according to the error values.
  • the optimization function may use a stochastic gradient descent (SGD) algorithm, a batch gradient descent (BGD) algorithm, etc., to calculate the error value of the parameters of the neural network 300.
  • the training method of the neural network may further include: judging whether the training of the neural network satisfies a predetermined condition; if the predetermined condition is not met, the above training process (i.e., step S410 to step S460) is repeated; if the predetermined condition is met, the above training process is stopped and a trained neural network is obtained.
  • for example, the foregoing predetermined condition is that the loss values corresponding to two (or more) consecutive training output images no longer significantly decrease; alternatively, the predetermined condition is that the number of training iterations or training epochs of the neural network reaches a predetermined number. This disclosure does not limit this.
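The train-until-a-predetermined-condition loop of steps S410-S460 can be caricatured with a one-parameter model: compute a loss, correct the parameter from the loss via SGD, and stop when either the loss stops decreasing or a maximum iteration count is reached. Everything here (the toy model, learning rate, thresholds) is an illustrative assumption, not the patent's actual network or loss function.

```python
import numpy as np

# Toy stand-in for the full pipeline: a single weight is "the network",
# mean-squared error is "the loss function", plain SGD is "the optimizer".
rng = np.random.default_rng(0)
w = rng.normal()              # initial parameter drawn from a Gaussian (S450 setup)
lr, prev_loss, epoch = 0.1, float("inf"), 0

while epoch < 1000:           # predetermined maximum number of iterations
    x = rng.normal(size=32)   # a "training batch"
    target = 3.0 * x          # ground truth for the toy task
    loss = np.mean((w * x - target) ** 2)            # loss value (S450)
    grad = np.mean(2 * (w * x - target) * x)         # error value of the parameter
    w -= lr * grad            # correct the parameter from the loss (S460)
    if abs(prev_loss - loss) < 1e-9:                 # loss no longer decreasing
        break
    prev_loss, epoch = loss, epoch + 1
```

Both stopping rules from the preceding paragraphs appear here: the `abs(prev_loss - loss)` check mirrors "loss values no longer significantly decrease", and the `epoch < 1000` bound mirrors "reaches a predetermined number".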
  • the training output image output by the trained neural network 300 retains the content of the training input image, but the quality of the training output image can be close to the quality of photos taken by a real digital single-lens reflex camera; that is, the training output image is a high-quality image.
  • the above-mentioned embodiments only schematically illustrate the training process of the neural network.
  • the training process of each sample image may include multiple iterations to correct the parameters of the neural network.
  • the training phase may also include fine-tuning the parameters of the neural network to obtain more optimized parameters.
  • the neural network training method provided by the embodiments of the present disclosure can train the neural network used in the image processing method of the embodiments of the present disclosure; the neural network trained by this method can perform image enhancement processing on low-quality input images and, by repeatedly sampling at multiple scales, obtain higher image fidelity, so that the quality of the output image can be greatly improved; it is suitable for offline applications with high image-quality requirements, such as batch processing.
  • FIG. 8 is a schematic block diagram of an image processing device provided by an embodiment of the present disclosure.
  • the image processing apparatus 500 includes a memory 510 and a processor 520.
  • the memory 510 is used to non-transitorily store computer-readable instructions, and the processor 520 is used to run those computer-readable instructions; when the computer-readable instructions are run by the processor 520, the image processing method provided by the embodiments of the present disclosure is executed.
  • the memory 510 and the processor 520 may directly or indirectly communicate with each other.
  • components such as the memory 510 and the processor 520 may communicate through a network connection.
  • the network may include a wireless network, a wired network, and/or any combination of a wireless network and a wired network.
  • the network may include a local area network, the Internet, a telecommunication network, the Internet of Things (IoT) based on the Internet and/or a telecommunication network, and/or any combination of the above networks, etc.
  • the wired network may, for example, use twisted pair, coaxial cable, or optical fiber transmission for communication, and the wireless network may use, for example, a 3G/4G/5G mobile communication network, Bluetooth, Zigbee, or WiFi.
  • the present disclosure does not limit the types and functions of the network here.
  • the processor 520 may control other components in the image processing apparatus to perform desired functions.
  • the processor 520 may be a central processing unit (CPU), a tensor processing unit (TPU), a graphics processing unit (GPU), or another device with data processing capabilities and/or program execution capabilities.
  • the central processing unit (CPU) can be an X86 or ARM architecture.
  • the GPU can be directly integrated on the motherboard alone or built into the north bridge chip of the motherboard.
  • the GPU can also be built into the central processing unit (CPU).
  • the memory 510 may include any combination of one or more computer program products, and the computer program products may include various forms of computer-readable storage media, such as volatile memory and/or nonvolatile memory.
  • Volatile memory may include random access memory (RAM) and/or cache memory (cache), for example.
  • the non-volatile memory may include, for example, read only memory (ROM), hard disk, erasable programmable read only memory (EPROM), portable compact disk read only memory (CD-ROM), USB memory, flash memory, etc.
  • one or more computer instructions may be stored in the memory 510, and the processor 520 may execute the computer instructions to implement various functions.
  • the computer-readable storage medium may also store various application programs and various data, such as training input images, and various data used and/or generated by the application programs.
  • for example, when some of the computer instructions are executed by the processor, one or more steps of the image processing method described above may be performed; when other computer instructions are executed, one or more steps of the neural network training method described above may be performed.
  • the image processing device provided by the above embodiments of the present disclosure is exemplary rather than restrictive; according to actual application requirements, the image processing device may also include other conventional components or structures. For example, to realize the necessary functions of the image processing device, those skilled in the art may set other conventional components or structures according to the specific application scenario, which is not limited in the embodiments of the present disclosure.
  • FIG. 9 is a schematic diagram of a storage medium provided by an embodiment of the disclosure.
  • the storage medium 600 non-transitory stores computer-readable instructions 601.
  • when the non-transitory computer-readable instructions 601 are executed by a computer (including a processor), the image processing method provided by any embodiment of the present disclosure can be executed.
  • one or more computer instructions may be stored on the storage medium 600.
  • Some computer instructions stored on the storage medium 600 may be, for example, instructions for implementing one or more steps in the foregoing image processing method.
  • the other computer instructions stored on the storage medium may be, for example, instructions for implementing one or more steps in the above-mentioned neural network training method.
  • the storage medium may include the storage components of a tablet computer, the hard disk of a personal computer, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), compact-disc read-only memory (CD-ROM), flash memory, or any combination of the above storage media; it may also be another suitable storage medium.

Abstract

An image processing method, an image processing device, a neural network training method, and a storage medium. The image processing method comprises: receiving a first feature image; and performing at least one instance of multi-scale loop sampling processing on the first feature image; wherein the multi-scale loop sampling processing comprises nested first-level sampling processing and second-level sampling processing; the first-level sampling processing comprises, successively executed, first downsampling processing, first upsampling processing, and first residual connection addition processing; the second-level sampling processing is nested between the first downsampling processing and the first upsampling processing, and comprises, successively executed, second downsampling processing, second upsampling processing, and second residual connection addition processing.

Description

Image processing method and device, neural network training method, and storage medium
This application claims priority to Chinese patent application No. 201910209662.2, filed on March 19, 2019; the entire disclosure of the above Chinese patent application is incorporated herein by reference as a part of this application.
Technical field
The embodiments of the present disclosure relate to an image processing method, an image processing device, a training method of a neural network, and a storage medium.
Background art
Currently, deep learning technology based on artificial neural networks has made great progress in fields such as image classification, image capture and search, facial recognition, and age and speech recognition. The advantage of deep learning is that it can solve very different technical problems with relatively similar systems by using a common structure. A convolutional neural network (CNN) is an artificial neural network that has been developed in recent years and has attracted wide attention; it is a special method of image recognition and a very effective network with forward feedback. Now, the application scope of CNN is no longer limited to the field of image recognition; it can also be applied to face recognition, text recognition, image processing, and other applications.
Summary of the invention
At least one embodiment of the present disclosure provides an image processing method, including: receiving a first feature image; and performing multi-scale cyclic sampling processing on the first feature image at least once;
Wherein, the multi-scale cyclic sampling processing includes nested first-level sampling processing and second-level sampling processing. The first-level sampling processing includes first down-sampling processing, first up-sampling processing, and first residual link addition processing, wherein the first down-sampling processing performs down-sampling based on the input of the first-level sampling processing to obtain a first down-sampled output, the first up-sampling processing performs up-sampling based on the first down-sampled output to obtain a first up-sampled output, and the first residual link addition processing performs a first residual link addition on the input of the first-level sampling processing and the first up-sampled output and then uses the result of the first residual link addition as the output of the first-level sampling processing. The second-level sampling processing is nested between the first down-sampling processing and the first up-sampling processing; it receives the first down-sampled output as the input of the second-level sampling processing and provides the output of the second-level sampling processing as the input of the first up-sampling processing, so that the first up-sampling processing performs up-sampling based on the first down-sampled output. The second-level sampling processing includes second down-sampling processing, second up-sampling processing, and second residual link addition processing, wherein the second down-sampling processing performs down-sampling based on the input of the second-level sampling processing to obtain a second down-sampled output, the second up-sampling processing performs up-sampling based on the second down-sampled output to obtain a second up-sampled output, and the second residual link addition processing performs a second residual link addition on the input of the second-level sampling processing and the second up-sampled output and then uses the result of the second residual link addition as the output of the second-level sampling processing.
For example, in the image processing method provided by some embodiments of the present disclosure, the size of the output of the first up-sampling processing is the same as the size of the input of the first down-sampling processing, and the size of the output of the second up-sampling processing is the same as the size of the input of the second down-sampling processing.
For example, in the image processing method provided by some embodiments of the present disclosure, the multi-scale cyclic sampling processing further includes third-level sampling processing. The third-level sampling processing is nested between the second down-sampling processing and the second up-sampling processing; it receives the second down-sampled output as the input of the third-level sampling processing and provides the output of the third-level sampling processing as the input of the second up-sampling processing, so that the second up-sampling processing performs up-sampling based on the second down-sampled output. The third-level sampling processing includes third down-sampling processing, third up-sampling processing, and third residual link addition processing, wherein the third down-sampling processing performs down-sampling based on the input of the third-level sampling processing to obtain a third down-sampled output, the third up-sampling processing performs up-sampling based on the third down-sampled output to obtain a third up-sampled output, and the third residual link addition processing performs a third residual link addition on the input of the third-level sampling processing and the third up-sampled output and then uses the result of the third residual link addition as the output of the third-level sampling processing.
For example, in the image processing method provided by some embodiments of the present disclosure, the multi-scale cyclic sampling processing includes the second-level sampling processing executed multiple times in sequence; the first second-level sampling processing receives the first down-sampled output as its input, each second-level sampling processing other than the first receives the output of the previous second-level sampling processing as its input, and the output of the last second-level sampling processing serves as the input of the first up-sampling processing.
For example, in the image processing method provided by some embodiments of the present disclosure, the at least one multi-scale cyclic sampling processing includes the multi-scale cyclic sampling processing executed multiple times in sequence; the input of each multi-scale cyclic sampling processing serves as the input of the first-level sampling processing in that multi-scale cyclic sampling processing, and the output of the first-level sampling processing in each multi-scale cyclic sampling processing serves as the output of that multi-scale cyclic sampling processing; the first multi-scale cyclic sampling processing receives the first feature image as its input, each multi-scale cyclic sampling processing other than the first receives the output of the previous multi-scale cyclic sampling processing as its input, and the output of the last multi-scale cyclic sampling processing serves as the output of the at least one multi-scale cyclic sampling processing.
For example, in the image processing method provided by some embodiments of the present disclosure, the multi-scale cyclic sampling processing further includes: after the first down-sampling processing, the first up-sampling processing, the second down-sampling processing, and the second up-sampling processing, performing instance standardization processing or layer standardization processing on the first down-sampled output, the first up-sampled output, the second down-sampled output, and the second up-sampled output, respectively.
For example, the image processing method provided by some embodiments of the present disclosure further includes: using a first convolutional neural network to perform the multi-scale cyclic sampling processing, wherein the first convolutional neural network includes a first meta-network for performing the first-level sampling processing and a second meta-network for performing the second-level sampling processing.
For example, in the image processing method provided by some embodiments of the present disclosure, the first meta-network includes a first sub-network for performing the first down-sampling processing and a second sub-network for performing the first up-sampling processing; the second meta-network includes a third sub-network for performing the second down-sampling processing and a fourth sub-network for performing the second up-sampling processing.
For example, in the image processing method provided by some embodiments of the present disclosure, each of the first sub-network, the second sub-network, the third sub-network, and the fourth sub-network includes one of a convolutional layer, a residual network, and a dense network.
For example, in the image processing method provided by some embodiments of the present disclosure, each of the first sub-network, the second sub-network, the third sub-network, and the fourth sub-network includes an instance standardization layer or a layer standardization layer; the instance standardization layer is used to perform instance standardization processing, and the layer standardization layer is used to perform layer standardization processing.
For example, the image processing method provided by some embodiments of the present disclosure further includes: acquiring an input image; using an analysis network to convert the input image into the first feature image; and using a synthesis network to convert the output of the at least one multi-scale cyclic sampling processing into an output image.
At least one embodiment of the present disclosure further provides a neural network training method, wherein the neural network includes an analysis network, a first sub-neural network, and a synthesis network; the analysis network processes an input image to obtain a first feature image, the first sub-neural network performs multi-scale cyclic sampling processing on the first feature image at least once to obtain a second feature image, and the synthesis network processes the second feature image to obtain an output image;
The training method includes: acquiring a training input image; processing the training input image by using the analysis network to provide a first training feature image; performing the at least one multi-scale cyclic sampling process on the first training feature image by using the first sub-neural network to obtain a second training feature image; processing the second training feature image by using the synthesis network to obtain a training output image; computing a loss value of the neural network through a loss function based on the training output image; and correcting the parameters of the neural network according to the loss value.
The multi-scale cyclic sampling process includes a nested first-level sampling process and second-level sampling process. The first-level sampling process includes a first down-sampling process, a first up-sampling process, and a first residual connection addition, where the first down-sampling process performs down-sampling on the input of the first-level sampling process to obtain a first down-sampled output, the first up-sampling process performs up-sampling based on the first down-sampled output to obtain a first up-sampled output, and the first residual connection addition adds the input of the first-level sampling process to the first up-sampled output, the result of which serves as the output of the first-level sampling process. The second-level sampling process is nested between the first down-sampling process and the first up-sampling process: it receives the first down-sampled output as its input and provides its output as the input of the first up-sampling process, so that the first up-sampling process performs its up-sampling based on the first down-sampled output. The second-level sampling process includes a second down-sampling process, a second up-sampling process, and a second residual connection addition, where the second down-sampling process performs down-sampling on the input of the second-level sampling process to obtain a second down-sampled output, the second up-sampling process performs up-sampling based on the second down-sampled output to obtain a second up-sampled output, and the second residual connection addition adds the input of the second-level sampling process to the second up-sampled output, the result of which serves as the output of the second-level sampling process.
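As an illustration only, and not the disclosed implementation, the nesting described above can be sketched in a few lines of Python. Here a fixed 2×2 average pooling stands in for the learned down-sampling sub-networks and nearest-neighbor duplication stands in for the learned up-sampling sub-networks; in the actual method each of these operations is a trained convolutional sub-network, while the residual connection additions are as described:

```python
def downsample(img):
    # 2x2 average pooling (assumes even dimensions): stand-in for a
    # learned down-sampling sub-network
    h, w = len(img), len(img[0])
    return [[(img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4.0
             for c in range(0, w, 2)] for r in range(0, h, 2)]

def upsample(img):
    # nearest-neighbor 2x duplication: stand-in for a learned
    # up-sampling sub-network
    out = []
    for row in img:
        wide = [v for v in row for _ in (0, 1)]
        out.append(wide)
        out.append(list(wide))
    return out

def residual_add(a, b):
    # element-wise residual connection addition
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def second_level(x):
    # nested level: down-sample, up-sample, then add the residual connection
    return residual_add(x, upsample(downsample(x)))

def first_level(x):
    # outer level: the second-level process runs between the first
    # down-sampling and the first up-sampling
    down = downsample(x)          # first down-sampled output
    inner = second_level(down)    # nested second-level sampling process
    up = upsample(inner)          # first up-sampled output
    return residual_add(x, up)    # first residual connection addition
```

Because each up-sampling restores the size reduced by the matching down-sampling, the output of `first_level` has the same size as its input.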
For example, in the training method provided by some embodiments of the present disclosure, the size of the output of the first up-sampling process is the same as the size of the input of the first down-sampling process, and the size of the output of the second up-sampling process is the same as the size of the input of the second down-sampling process.
For example, in the training method provided by some embodiments of the present disclosure, the multi-scale cyclic sampling process further includes a third-level sampling process. The third-level sampling process is nested between the second down-sampling process and the second up-sampling process: it receives the second down-sampled output as its input and provides its output as the input of the second up-sampling process, so that the second up-sampling process performs its up-sampling based on the second down-sampled output. The third-level sampling process includes a third down-sampling process, a third up-sampling process, and a third residual connection addition, where the third down-sampling process performs down-sampling on the input of the third-level sampling process to obtain a third down-sampled output, the third up-sampling process performs up-sampling based on the third down-sampled output to obtain a third up-sampled output, and the third residual connection addition adds the input of the third-level sampling process to the third up-sampled output, the result of which serves as the output of the third-level sampling process.
For example, in the training method provided by some embodiments of the present disclosure, the multi-scale cyclic sampling process includes the second-level sampling process executed multiple times in sequence. The first execution of the second-level sampling process receives the first down-sampled output as its input; each execution other than the first receives the output of the previous execution as its input; and the output of the last execution serves as the input of the first up-sampling process.
For example, in the training method provided by some embodiments of the present disclosure, the at least one multi-scale cyclic sampling process includes the multi-scale cyclic sampling process executed multiple times in sequence, where the input of each execution serves as the input of the first-level sampling process in that execution, and the output of the first-level sampling process in each execution serves as the output of that execution. The first execution receives the first training feature image as its input; each execution other than the first receives the output of the previous execution as its input; and the output of the last execution serves as the output of the at least one multi-scale cyclic sampling process.
For example, in the image processing method provided by some embodiments of the present disclosure, the multi-scale cyclic sampling process further includes: after the first down-sampling process, the first up-sampling process, the second down-sampling process, and the second up-sampling process, performing instance normalization or layer normalization on the first down-sampled output, the first up-sampled output, the second down-sampled output, and the second up-sampled output, respectively.
For example, in the training method provided by an embodiment of the present disclosure, the first sub-neural network includes: a first meta-network for performing the first-level sampling process; and a second meta-network for performing the second-level sampling process.
For example, in the training method provided by an embodiment of the present disclosure, the first meta-network includes: a first sub-network for performing the first down-sampling process; and a second sub-network for performing the first up-sampling process; the second meta-network includes: a third sub-network for performing the second down-sampling process; and a fourth sub-network for performing the second up-sampling process.
For example, in the training method provided by an embodiment of the present disclosure, each of the first sub-network, the second sub-network, the third sub-network, and the fourth sub-network includes one of a convolutional layer, a residual network, and a dense network.
For example, in the training method provided by an embodiment of the present disclosure, each of the first sub-network, the second sub-network, the third sub-network, and the fourth sub-network includes an instance normalization layer or a layer normalization layer, where the instance normalization layer is configured to perform instance normalization on the first down-sampled output, the first up-sampled output, the second down-sampled output, and the second up-sampled output, respectively, and the layer normalization layer is configured to perform layer normalization on the first down-sampled output, the first up-sampled output, the second down-sampled output, and the second up-sampled output, respectively.
At least one embodiment of the present disclosure further provides an image processing device, including: a memory for non-transitory storage of computer-readable instructions; and a processor for running the computer-readable instructions, where the computer-readable instructions, when run by the processor, execute the image processing method provided by any embodiment of the present disclosure or the neural network training method provided by any embodiment of the present disclosure.
At least one embodiment of the present disclosure further provides a storage medium that non-transitorily stores computer-readable instructions, where the computer-readable instructions, when executed by a computer, can execute the image processing method provided by any embodiment of the present disclosure or the neural network training method provided by any embodiment of the present disclosure.
Description of the Drawings

In order to explain the technical solutions of the embodiments of the present disclosure more clearly, the drawings of the embodiments are briefly introduced below. Obviously, the drawings described below relate only to some embodiments of the present disclosure and are not a limitation of the present disclosure.

Figure 1 is a schematic diagram of a convolutional neural network;

Figure 2A is a schematic structural diagram of a convolutional neural network;

Figure 2B is a schematic diagram of the working process of a convolutional neural network;

Figure 3 is a flowchart of an image processing method provided by an embodiment of the present disclosure;

Figure 4A is a schematic flow diagram of the multi-scale cyclic sampling process in the image processing method shown in Figure 3, provided by an embodiment of the present disclosure;

Figure 4B is a schematic flow diagram of the multi-scale cyclic sampling process in the image processing method shown in Figure 3, provided by another embodiment of the present disclosure;

Figure 4C is a schematic flow diagram of the multi-scale cyclic sampling process in the image processing method shown in Figure 3, provided by still another embodiment of the present disclosure;

Figure 4D is a schematic flow diagram of the multi-scale cyclic sampling process in the image processing method shown in Figure 3, provided by yet another embodiment of the present disclosure;

Figure 5 is a flowchart of an image processing method provided by another embodiment of the present disclosure;

Figure 6A is a schematic diagram of an input image;

Figure 6B is a schematic diagram of an output image obtained by processing the input image shown in Figure 6A according to an image processing method provided by an embodiment of the present disclosure;

Figure 7A is a schematic structural diagram of a neural network provided by an embodiment of the present disclosure;

Figure 7B is a flowchart of a neural network training method provided by an embodiment of the present disclosure;

Figure 7C is a schematic architectural block diagram of training the neural network shown in Figure 7A according to the training method shown in Figure 7B, provided by an embodiment of the present disclosure;

Figure 8 is a schematic block diagram of an image processing device provided by an embodiment of the present disclosure; and

Figure 9 is a schematic diagram of a storage medium provided by an embodiment of the present disclosure.
Detailed Description
To make the objectives, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions of the embodiments of the present disclosure are described clearly and completely below in conjunction with the accompanying drawings of the embodiments. Obviously, the described embodiments are some, rather than all, of the embodiments of the present disclosure. Based on the described embodiments of the present disclosure, all other embodiments obtained by those of ordinary skill in the art without creative labor fall within the protection scope of the present disclosure.
Unless otherwise defined, the technical or scientific terms used in the present disclosure shall have the ordinary meanings understood by persons of ordinary skill in the field to which the present disclosure belongs. The terms "first", "second", and the like used in the present disclosure do not indicate any order, quantity, or importance, but are only used to distinguish different components. Words such as "include" or "comprise" mean that the element or item preceding the word covers the elements or items listed after the word and their equivalents, without excluding other elements or items. Words such as "connect" or "connected" are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "Up", "down", "left", "right", and the like are only used to indicate relative positional relationships; when the absolute position of a described object changes, the relative positional relationship may change accordingly.
The present disclosure is described below through several specific embodiments. To keep the following description of the embodiments of the present disclosure clear and concise, detailed descriptions of known functions and known components may be omitted. When any component of an embodiment of the present disclosure appears in more than one drawing, the component is denoted by the same or similar reference numeral in each drawing.
Image enhancement is one of the research hotspots in the field of image processing. Limitations imposed by various physical factors during image acquisition (for example, the small size of a mobile-phone camera's image sensor, along with other software and hardware limitations) and interference from environmental noise can greatly reduce image quality. The purpose of image enhancement is to improve the gray-scale histogram and the contrast of an image through image enhancement technology, thereby highlighting image details and improving the visual effect of the image.
Using deep neural networks for image enhancement is a technology that has emerged with the development of deep learning. For example, based on a convolutional neural network, low-quality photos (input images) taken by a mobile phone can be processed to obtain high-quality output images whose quality approaches that of photos taken by a digital single-lens reflex camera (often abbreviated as DSLR). For example, the peak signal-to-noise ratio (PSNR) is commonly used to characterize image quality; a higher PSNR value indicates that an image is closer to a real photo taken by a digital single-lens reflex camera.
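For reference, PSNR is defined as 10·log₁₀(MAX²/MSE), where MAX is the peak pixel value (255 for 8-bit images) and MSE is the mean squared error against the reference image. A minimal, generic implementation, not specific to this disclosure, is:

```python
import math

def psnr(img, ref, peak=255.0):
    # PSNR = 10 * log10(peak^2 / MSE); higher means closer to the reference
    diffs = [(a - b) ** 2 for ra, rb in zip(img, ref) for a, b in zip(ra, rb)]
    mse = sum(diffs) / len(diffs)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

For example, an 8-bit image whose every pixel differs from the reference by 1 gray level has MSE = 1 and thus a PSNR of 10·log₁₀(255²) ≈ 48.1 dB.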
For example, Andrey Ignatov et al. proposed a method of implementing image enhancement with a convolutional neural network; see Andrey Ignatov, Nikolay Kobyshev, Kenneth Vanhoey, Radu Timofte, Luc Van Gool, DSLR-Quality Photos on Mobile Devices with Deep Convolutional Networks, arXiv:1704.02470v2 [cs.CV], September 5, 2017, which is hereby incorporated by reference in its entirety as a part of this application. This method uses convolutional layers, batch normalization layers, and residual connections to construct a single-scale convolutional neural network, which can process an input low-quality image (for example, an image with low contrast, an under-exposed or over-exposed image, or an image that is overall too dark or too bright) into a higher-quality image. Using color loss, texture loss, and content loss as the loss functions during training achieves good processing results.
At least one embodiment of the present disclosure provides an image processing method, an image processing device, a neural network training method, and a storage medium. The image processing method proposes a multi-scale cyclic sampling method based on a convolutional neural network; by sampling repeatedly at multiple scales to obtain higher image fidelity, it can greatly improve the quality of the output image, and it is suitable for offline applications with high image-quality requirements, such as batch processing.
Initially, convolutional neural networks (CNNs) were mainly used to recognize two-dimensional shapes, for which they are highly invariant to translation, scaling, tilting, or other forms of deformation of the image. A CNN simplifies the complexity of the neural network model and reduces the number of weights mainly through local receptive fields and weight sharing. With the development of deep learning, the application scope of CNNs is no longer limited to image recognition; they can also be applied in fields such as face recognition, text recognition, animal classification, and image processing.
Figure 1 shows a schematic diagram of a convolutional neural network. For example, the convolutional neural network can be used for image processing; it uses images as input and output, and replaces scalar weights with convolution kernels. Figure 1 shows only a convolutional neural network with a three-layer structure, which is not a limitation of the embodiments of the present disclosure. As shown in Figure 1, the convolutional neural network includes an input layer 101, a hidden layer 102, and an output layer 103. The input layer 101 has four inputs, the hidden layer 102 has three outputs, and the output layer 103 has two outputs, so the convolutional neural network finally outputs two images.
For example, the four inputs of the input layer 101 may be four images, or four feature images of one image. The three outputs of the hidden layer 102 may be feature images of the image input through the input layer 101.
For example, as shown in Figure 1, each convolutional layer has weights $w_{ij}^{k}$ and biases $b_{i}$. The weights $w_{ij}^{k}$ represent convolution kernels, and the biases $b_{i}$ are scalars superimposed on the output of the convolutional layer, where k is a label of the input layer 101, and i and j are labels of the units of the input layer 101 and the units of the hidden layer 102, respectively. For example, the first convolutional layer 201 includes a first set of convolution kernels ($w_{ij}^{1}$ in Figure 1) and a first set of biases ($b_{i}^{1}$ in Figure 1). The second convolutional layer 202 includes a second set of convolution kernels ($w_{ij}^{2}$ in Figure 1) and a second set of biases ($b_{i}^{2}$ in Figure 1). Generally, each convolutional layer includes tens or hundreds of convolution kernels; if the convolutional neural network is a deep convolutional neural network, it may include at least five convolutional layers.
For example, as shown in Figure 1, the convolutional neural network further includes a first activation layer 203 and a second activation layer 204. The first activation layer 203 follows the first convolutional layer 201, and the second activation layer 204 follows the second convolutional layer 202. The activation layers (for example, the first activation layer 203 and the second activation layer 204) include activation functions, which introduce nonlinear factors into the convolutional neural network so that it can better solve relatively complex problems. The activation function may include a rectified linear unit (ReLU) function, a sigmoid function, or a hyperbolic tangent (tanh) function. The ReLU function is a non-saturating nonlinear function, while the sigmoid and tanh functions are saturating nonlinear functions. For example, an activation layer may stand alone as a layer of the convolutional neural network, or it may be included in a convolutional layer (for example, the first convolutional layer 201 may include the first activation layer 203, and the second convolutional layer 202 may include the second activation layer 204).
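The three activation functions named above are standard; minimal Python definitions are shown below to make the saturating versus non-saturating behavior concrete:

```python
import math

def relu(x):
    # non-saturating: unbounded for positive inputs, zero for negative ones
    return max(0.0, x)

def sigmoid(x):
    # saturating: output confined to the open interval (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    # saturating: output confined to the open interval (-1, 1)
    return math.tanh(x)
```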
For example, in the first convolutional layer 201, first, several convolution kernels $w_{ij}^{1}$ of the first set of convolution kernels and several biases $b_{i}^{1}$ of the first set of biases are applied to each input to obtain the output of the first convolutional layer 201; then, the output of the first convolutional layer 201 can be processed by the first activation layer 203 to obtain the output of the first activation layer 203. In the second convolutional layer 202, first, several convolution kernels $w_{ij}^{2}$ of the second set of convolution kernels and several biases $b_{i}^{2}$ of the second set of biases are applied to the output of the first activation layer 203 to obtain the output of the second convolutional layer 202; then, the output of the second convolutional layer 202 can be processed by the second activation layer 204 to obtain the output of the second activation layer 204. For example, the output of the first convolutional layer 201 may be the result of applying the convolution kernels $w_{ij}^{1}$ to its input and then adding the biases $b_{i}^{1}$, and the output of the second convolutional layer 202 may be the result of applying the convolution kernels $w_{ij}^{2}$ to the output of the first activation layer 203 and then adding the biases $b_{i}^{2}$.
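The computation just described, applying a kernel to the input, adding the bias, and then passing the result through the activation layer, can be sketched for a single channel and a single kernel as follows (a toy illustration, not the networks of Figure 1):

```python
def conv2d_valid(img, kernel, bias):
    # valid cross-correlation of one channel with one kernel,
    # plus a scalar bias added to every output value
    kh, kw = len(kernel), len(kernel[0])
    oh = len(img) - kh + 1
    ow = len(img[0]) - kw + 1
    out = []
    for r in range(oh):
        row = []
        for c in range(ow):
            acc = bias
            for u in range(kh):
                for v in range(kw):
                    acc += img[r + u][c + v] * kernel[u][v]
            row.append(acc)
        out.append(row)
    return out

def conv_layer(img, kernel, bias):
    # convolution followed by the ReLU activation of the activation layer
    return [[max(0.0, v) for v in row] for row in conv2d_valid(img, kernel, bias)]
```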
Before a convolutional neural network is used for image processing, it needs to be trained. After training, the convolution kernels and biases of the convolutional neural network remain unchanged during image processing. During training, each convolution kernel and bias is adjusted through multiple sets of input/output example images and an optimization algorithm to obtain an optimized convolutional neural network model.
Figure 2A shows a schematic structural diagram of a convolutional neural network, and Figure 2B shows a schematic diagram of the working process of a convolutional neural network. For example, as shown in Figures 2A and 2B, after an input image is fed into the convolutional neural network through the input layer, it passes in turn through several processing stages (each level in Figure 2A) and then a category identification is output. The main components of a convolutional neural network may include multiple convolutional layers, multiple down-sampling layers, and a fully connected layer. In the present disclosure, it should be understood that each of these layers, such as the multiple convolutional layers, the multiple down-sampling layers, and the fully connected layer, refers to the corresponding processing operation, that is, convolution processing, down-sampling processing, fully connected processing, and so on; the described neural networks likewise refer to the corresponding processing operations, and the instance normalization layers or layer normalization layers described below are similar, so the explanation is not repeated here. For example, a complete convolutional neural network may be formed by stacking these three kinds of layers. For example, Figure 2A shows only three levels of a convolutional neural network, namely a first level, a second level, and a third level. For example, each level may include a convolution module and a down-sampling layer. For example, each convolution module may include a convolutional layer. Thus, the processing at each level may include convolution and sub-sampling/down-sampling of the input image. For example, according to actual needs, each convolution module may further include an instance normalization layer or a layer normalization layer, so that the processing at each level may further include instance normalization or layer normalization.
For example, the instance normalization layer is used to perform instance normalization processing on the feature images output by the convolutional layer, so that the gray values of the pixels of each feature image vary within a predetermined range, thereby simplifying the image generation process and improving the effect of image enhancement. For example, the predetermined range may be [-1, 1]. The instance normalization layer normalizes each feature image according to that feature image's own mean and variance. For example, the instance normalization layer can also be used to perform instance normalization processing on a single image.
For example, assume that the mini-batch size of mini-batch gradient descent is T, the number of feature images output by a certain convolutional layer is C, and each feature image is a matrix of H rows and W columns; the set of feature images is then represented as (T, C, H, W). The instance normalization formula of the instance normalization layer can accordingly be expressed as follows:
$$y_{tijk} = \frac{x_{tijk} - \mu_{ti}}{\sqrt{\sigma_{ti}^{2} + \varepsilon_{1}}}, \qquad \mu_{ti} = \frac{1}{HW}\sum_{j=1}^{H}\sum_{k=1}^{W} x_{tijk}, \qquad \sigma_{ti}^{2} = \frac{1}{HW}\sum_{j=1}^{H}\sum_{k=1}^{W}\left(x_{tijk} - \mu_{ti}\right)^{2}$$
where x_tijk is the value at the t-th patch (feature block), the i-th feature image, the j-th row, and the k-th column in the set of feature images output by the convolutional layer; y_tijk denotes the result obtained after the instance normalization layer processes x_tijk; and ε_1 is a very small positive number, used to avoid a zero denominator.
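As an illustrative sketch only (not the claimed implementation), the per-feature-image normalization described above can be written in NumPy as follows; the function name and the value of `eps` are assumptions:

```python
import numpy as np

def instance_norm(x, eps=1e-5):
    # Normalize each feature image (fixed t and i) by its own spatial
    # mean and variance, as in the instance normalization formula above.
    # x has shape (T, C, H, W); eps plays the role of the small constant
    # epsilon_1 that keeps the denominator non-zero.
    mu = x.mean(axis=(2, 3), keepdims=True)
    var = x.var(axis=(2, 3), keepdims=True)
    return (x - mu) / np.sqrt(var + eps)
```

After this operation, every feature image has approximately zero mean and unit variance, so its pixel values fall into a common range.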
For example, the layer normalization layer is similar to the instance normalization layer, and is also used to perform layer normalization processing on the feature images output by the convolutional layer, so that the gray values of the pixels of each feature image vary within a predetermined range, thereby simplifying the image generation process and improving the effect of image enhancement. For example, the predetermined range may be [-1, 1]. Unlike the instance normalization layer, the layer normalization layer normalizes each column of a feature image according to that column's mean and variance, thereby realizing layer normalization of the feature image. For example, the layer normalization layer can also be used to perform layer normalization processing on a single image.
For example, still taking the above mini-batch gradient descent as an example, the set of feature images is represented as (T, C, H, W). The layer normalization formula of the layer normalization layer can then be expressed as follows:
$$y'_{tijk} = \frac{x_{tijk} - \mu_{tik}}{\sqrt{\sigma_{tik}^{2} + \varepsilon_{2}}}, \qquad \mu_{tik} = \frac{1}{H}\sum_{j=1}^{H} x_{tijk}, \qquad \sigma_{tik}^{2} = \frac{1}{H}\sum_{j=1}^{H}\left(x_{tijk} - \mu_{tik}\right)^{2}$$
where x_tijk is the value at the t-th patch (feature block), the i-th feature image, the j-th row, and the k-th column in the set of feature images output by the convolutional layer; y′_tijk denotes the result obtained after the layer normalization layer processes x_tijk; and ε_2 is a very small positive number, used to avoid a zero denominator.
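Following the per-column description above (statistics taken over the row index j for each column k of each feature image), a hedged NumPy sketch of this layer normalization could look like the following; the function name and `eps` value are assumptions:

```python
import numpy as np

def layer_norm_columns(x, eps=1e-5):
    # Normalize each column (index k) of each feature image by that
    # column's own mean and variance, i.e. statistics over the row
    # index j (axis 2). x has shape (T, C, H, W); eps stands in for
    # the small constant epsilon_2.
    mu = x.mean(axis=2, keepdims=True)
    var = x.var(axis=2, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)
```

Each column of the output then has approximately zero mean, matching the per-column normalization the text describes.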
The convolutional layer is the core layer of a convolutional neural network. In the convolutional layer of a convolutional neural network, a neuron is connected only to some of the neurons in the adjacent layer. The convolutional layer can apply several convolution kernels (also called filters) to the input image to extract multiple types of features from it. Each convolution kernel can extract one type of feature. A convolution kernel is generally initialized as a matrix of small random values; during the training of the convolutional neural network, the kernel learns to obtain reasonable weights. The result obtained by applying one convolution kernel to the input image is called a feature map (feature image), and the number of feature images equals the number of convolution kernels. Each feature image consists of neurons arranged in a rectangular grid; the neurons of the same feature image share weights, and the shared weights are exactly the convolution kernel. The feature images output by the convolutional layer of one level can be input to the convolutional layer of the adjacent next level and processed again to obtain new feature images. For example, as shown in FIG. 2A, the convolutional layer of the first level may output first-level feature images, which are input to the convolutional layer of the second level and processed again to obtain second-level feature images.
For example, as shown in FIG. 2B, the convolutional layer can use different convolution kernels to convolve the data of a local receptive field of the input image; the convolution result is fed into the activation layer, which computes according to the corresponding activation function to obtain the feature information of the input image.
For example, as shown in FIGS. 2A and 2B, a down-sampling layer is arranged between adjacent convolutional layers; the down-sampling layer is one form of down-sampling. On the one hand, the down-sampling layer can be used to reduce the scale of the input image, simplify the computational complexity, and reduce over-fitting to a certain extent; on the other hand, it can also perform feature compression to extract the main features of the input image. The down-sampling layer can reduce the size of the feature images, but does not change their number. For example, if an input image of size 12×12 is sampled with a 6×6 kernel, a 2×2 output image is obtained, which means that every 36 pixels of the input image are merged into 1 pixel of the output image. The last down-sampling layer or convolutional layer can be connected to one or more fully connected layers, which are used to connect all the extracted features. The output of a fully connected layer is a one-dimensional matrix, that is, a vector.
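The 12×12 to 2×2 size arithmetic above can be checked with a toy pooling operator; max pooling is used here only as one illustrative down-sampling method among those the disclosure lists:

```python
import numpy as np

def pool(img, k):
    # Non-overlapping k-by-k max pooling: every k*k block of the input
    # collapses into one output pixel, shrinking the feature image
    # without changing the number of feature images.
    h, w = img.shape
    return img.reshape(h // k, k, w // k, k).max(axis=(1, 3))

out = pool(np.arange(144.0).reshape(12, 12), 6)
# A 12x12 input pooled with a 6x6 window yields a 2x2 output,
# i.e. 36 input pixels merge into each output pixel.
```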
Hereinafter, some embodiments of the present disclosure and examples thereof will be described in detail with reference to the accompanying drawings.
FIG. 3 is a flowchart of an image processing method provided by an embodiment of the present disclosure. For example, as shown in FIG. 3, the image processing method includes:
Step S110: receiving a first feature image;
Step S120: performing at least one multi-scale cyclic sampling process on the first feature image.
For example, in step S110, the first feature image may include a feature image obtained after the input image is processed by one of a convolutional layer, a residual network, a dense network, etc. (see, e.g., FIG. 2B). For example, a residual network keeps a certain proportion of its input in its output by means of, e.g., residual-connection addition. For example, a dense network includes a bottleneck layer and a convolutional layer; in some examples, the bottleneck layer is used to reduce the dimensionality of the data so as to reduce the number of parameters in the subsequent convolution operation; for example, the convolution kernel of the bottleneck layer is a 1×1 kernel, and the convolution kernel of the convolutional layer is a 3×3 kernel; the present disclosure includes but is not limited to this. For example, the input image is processed by convolution, down-sampling, etc., to obtain the first feature image. It should be noted that this embodiment does not limit the manner in which the first feature image is obtained. For example, the first feature image may include multiple feature images, but is not limited thereto.
For example, the first feature image received in step S110 serves as the input of the multi-scale cyclic sampling process in step S120. For example, the multi-scale cyclic sampling process may take various forms, including but not limited to the three forms shown in FIGS. 4A-4C, which will be described below.
FIG. 4A is a schematic flow diagram of the multi-scale cyclic sampling process in the image processing method shown in FIG. 3, provided by an embodiment of the present disclosure. As shown in FIG. 4A, the multi-scale cyclic sampling process includes a nested first-level sampling process and second-level sampling process.
For example, as shown in FIG. 4A, the input of the multi-scale cyclic sampling process serves as the input of the first-level sampling process, and the output of the first-level sampling process serves as the output of the multi-scale cyclic sampling process. For example, the output of the multi-scale cyclic sampling process is called the second feature image; for example, the size of the second feature image (the numbers of rows and columns of the pixel array) may be the same as that of the first feature image.
For example, as shown in FIG. 4A, the first-level sampling process includes a first down-sampling process, a first up-sampling process, and a first residual-link addition process executed in sequence. The first down-sampling process performs down-sampling based on the input of the first-level sampling process to obtain a first down-sampled output; for example, the first down-sampling process can directly down-sample the input of the first-level sampling process to obtain the first down-sampled output. The first up-sampling process performs up-sampling based on the first down-sampled output to obtain a first up-sampled output; for example, the first down-sampled output first undergoes the second-level sampling process and is then up-sampled to obtain the first up-sampled output, that is, the first up-sampling process up-samples the first down-sampled output indirectly. The first residual-link addition process adds the input of the first-level sampling process and the first up-sampled output via a first residual link, and the result of the first residual-link addition then serves as the output of the first-level sampling process. For example, the size of the output of the first up-sampling process (i.e., the first up-sampled output) is the same as the size of the input of the first-level sampling process (i.e., the input of the first down-sampling process), so that after the first residual-link addition, the size of the output of the first-level sampling process is the same as the size of its input.
For example, as shown in FIG. 4A, the second-level sampling process is nested between the first down-sampling process and the first up-sampling process of the first-level sampling process; it receives the first down-sampled output as the input of the second-level sampling process and provides the output of the second-level sampling process as the input of the first up-sampling process, so that the first up-sampling process performs up-sampling based on the first down-sampled output.
For example, as shown in FIG. 4A, the second-level sampling process includes a second down-sampling process, a second up-sampling process, and a second residual-link addition process executed in sequence. The second down-sampling process performs down-sampling based on the input of the second-level sampling process to obtain a second down-sampled output; for example, the second down-sampling process can directly down-sample the input of the second-level sampling process to obtain the second down-sampled output. The second up-sampling process performs up-sampling based on the second down-sampled output to obtain a second up-sampled output; for example, the second up-sampling process can directly up-sample the second down-sampled output to obtain the second up-sampled output. The second residual-link addition process adds the input of the second-level sampling process and the second up-sampled output via a second residual link, and the result of the second residual-link addition then serves as the output of the second-level sampling process. For example, the size of the output of the second up-sampling process (i.e., the second up-sampled output) is the same as the size of the input of the second-level sampling process (i.e., the input of the second down-sampling process), so that after the second residual-link addition, the size of the output of the second-level sampling process is the same as the size of its input.
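The data flow of FIG. 4A can be sketched with simple stand-in sampling operators (2×2 average pooling for down-sampling, nearest-neighbour repetition for up-sampling); real embodiments use learned convolutional sub-networks instead, so this is only a structural illustration:

```python
import numpy as np

def down(x):
    # Stand-in 1/2 down-sampling (2x2 average pooling).
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def up(x):
    # Stand-in 2x up-sampling (nearest-neighbour repetition).
    return x.repeat(2, axis=0).repeat(2, axis=1)

def second_level(x):
    # Second-level sampling: down-sample, directly up-sample,
    # then residual-link addition with the level's own input.
    return x + up(down(x))

def first_level(x):
    # First-level sampling: the second-level process is nested between
    # the first down-sampling and the first up-sampling, so the first
    # up-sampling acts on the first down-sampled output indirectly.
    return x + up(second_level(down(x)))

y = first_level(np.random.RandomState(1).randn(8, 8))
```

Because each up-sampling restores the size reduced by the matching down-sampling, the output of each level has the same size as its input, as the residual-link addition requires.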
It should be noted that, in some embodiments of the present disclosure (not limited to this embodiment), the flows of the sampling processes at the different levels (for example, the first-level sampling process, the second-level sampling process, and the third-level sampling process to be introduced in the embodiment shown in FIG. 4B) are similar, each including a down-sampling process, an up-sampling process, and a residual-link addition process. In addition, taking feature images as an example, the residual-link addition process may include adding the values of corresponding rows and columns of the matrices of two feature images element by element, but is not limited to this.
In the present disclosure, "nested" means that one object includes another object similar or identical to it; such objects include, but are not limited to, flows or network structures.
It should be noted that, in some embodiments of the present disclosure, in the sampling process of each level, the size of the output of the up-sampling process (for example, the output of the up-sampling process is a feature image) is the same as the size of the input of the down-sampling process (for example, the input of the down-sampling process is a feature image), so that after the residual-link addition, the size of the output of the sampling process of each level is the same as the size of the input of the sampling process of that level (both of which may be feature images, for example).
It should be noted that, in some embodiments of the present disclosure, the multi-scale cyclic sampling process can be implemented by a convolutional neural network. For example, in some embodiments of the present disclosure, a first convolutional neural network can be used to perform the multi-scale cyclic sampling process. For example, in some examples, the first convolutional neural network may include a nested first meta-network and second meta-network; the first meta-network is used to perform the first-level sampling process, and the second meta-network is used to perform the second-level sampling process.
For example, in some examples, the first meta-network may include a first sub-network and a second sub-network; the first sub-network is used to perform the first down-sampling process, and the second sub-network is used to perform the first up-sampling process. The second meta-network is nested between the first sub-network and the second sub-network of the first meta-network. For example, in some examples, the second meta-network may include a third sub-network and a fourth sub-network; the third sub-network is used to perform the second down-sampling process, and the fourth sub-network is used to perform the second up-sampling process. For example, both the first meta-network and the second meta-network are similar in form to the aforementioned residual network.
For example, in some examples, each of the first sub-network, the second sub-network, the third sub-network, and the fourth sub-network includes one of a convolutional layer, a residual network, a dense network, etc. Specifically, the first sub-network and the third sub-network may include a convolutional layer with a down-sampling function (a down-sampling layer), or may include one of a residual network, a dense network, etc. with a down-sampling function; the second sub-network and the fourth sub-network may include a convolutional layer with an up-sampling function (an up-sampling layer), or may include one of a residual network, a dense network, etc. with an up-sampling function. It should be noted that the first sub-network and the third sub-network may have the same structure or different structures, and the second sub-network and the fourth sub-network may have the same structure or different structures; the embodiments of the present disclosure do not limit this.
Down-sampling is used to reduce the size of the feature images and thereby reduce their data volume; for example, the down-sampling process can be performed by a down-sampling layer, but is not limited to this. For example, the down-sampling layer can implement the down-sampling process using methods such as max pooling, average pooling, strided convolution, decimation (e.g., selecting fixed pixels), and demuxout (splitting the input image into multiple smaller images).
Up-sampling is used to increase the size of the feature images and thereby increase their data volume; for example, the up-sampling process can be performed by an up-sampling layer, but is not limited to this. For example, the up-sampling layer can implement the up-sampling process using methods such as strided transposed convolution and interpolation algorithms. Interpolation algorithms may include, for example, nearest-neighbor interpolation, bilinear interpolation, and bicubic interpolation.
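Of the interpolation methods listed above, nearest-neighbor interpolation is the simplest to illustrate; the following sketch (not part of the claimed embodiments) shows how it enlarges a feature image by an integer factor:

```python
import numpy as np

def nearest_upsample(x, factor):
    # Nearest-neighbour interpolation: each pixel is repeated `factor`
    # times along each axis, so an H x W image becomes (factor*H) x (factor*W).
    return x.repeat(factor, axis=0).repeat(factor, axis=1)

b = nearest_upsample(np.array([[1.0, 2.0], [3.0, 4.0]]), 2)
# b is a 4x4 image whose top-left 2x2 block is all 1.0.
```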
It should be noted that, in some embodiments of the present disclosure, the down-sampling factor of the down-sampling process at a given level corresponds to the up-sampling factor of the up-sampling process at the same level, that is, when the down-sampling factor of the down-sampling process is 1/y, the up-sampling factor of the up-sampling process is y, where y is a positive integer, usually greater than or equal to 2. This ensures that the output of the up-sampling process and the input of the down-sampling process at the same level have the same size.
It should be noted that, in some embodiments of the present disclosure (not limited to this embodiment), the parameters of the down-sampling processes at different levels (i.e., the parameters of the networks corresponding to those down-sampling processes) may be the same or different; the parameters of the up-sampling processes at different levels (i.e., the parameters of the networks corresponding to those up-sampling processes) may be the same or different; and the parameters of the residual-link additions at different levels may be the same or different. The present disclosure does not limit this.
For example, in some embodiments of the present disclosure (not limited to this embodiment), in order to improve global features of the feature images such as brightness and contrast, the multi-scale cyclic sampling process may further include: after the first down-sampling process, the first up-sampling process, the second down-sampling process, and the second up-sampling process, performing instance normalization processing or layer normalization processing on the first down-sampled output, the first up-sampled output, the second down-sampled output, and the second up-sampled output, respectively. It should be noted that the first down-sampled output, the first up-sampled output, the second down-sampled output, and the second up-sampled output may be processed with the same normalization method (instance normalization or layer normalization) or with different normalization methods; the present disclosure does not limit this.
Correspondingly, the first sub-network, the second sub-network, the third sub-network, and the fourth sub-network each further include an instance normalization layer or a layer normalization layer; the instance normalization layer is used to perform instance normalization processing, and the layer normalization layer is used to perform layer normalization processing. For example, the instance normalization layer can perform instance normalization according to the aforementioned instance normalization formula, and the layer normalization layer can perform layer normalization according to the aforementioned layer normalization formula; the present disclosure does not limit this. It should be noted that the first sub-network, the second sub-network, the third sub-network, and the fourth sub-network may include the same kind of normalization layer (instance normalization layer or layer normalization layer) or different kinds; the present disclosure does not limit this either.
FIG. 4B is a schematic flow diagram of the multi-scale cyclic sampling process in the image processing method shown in FIG. 3, provided by another embodiment of the present disclosure. As shown in FIG. 4B, on the basis of the multi-scale cyclic sampling process shown in FIG. 4A, this multi-scale cyclic sampling process further includes a third-level sampling process. It should be noted that the other flows of the multi-scale cyclic sampling process shown in FIG. 4B are basically the same as those of the multi-scale cyclic sampling process shown in FIG. 4A, and the repeated parts are not described here again.
For example, as shown in FIG. 4B, the third-level sampling process is nested between the second down-sampling process and the second up-sampling process of the second-level sampling process; it receives the second down-sampled output as the input of the third-level sampling process and provides the output of the third-level sampling process as the input of the second up-sampling process, so that the second up-sampling process performs up-sampling based on the second down-sampled output. It should be noted that, in this case, similar to the first up-sampling process indirectly up-sampling the first down-sampled output, the second up-sampling process also indirectly up-samples the second down-sampled output.
The third-level sampling process includes a third down-sampling process, a third up-sampling process, and a third residual-link addition process executed in sequence. The third down-sampling process performs down-sampling based on the input of the third-level sampling process to obtain a third down-sampled output; for example, the third down-sampling process can directly down-sample the input of the third-level sampling process to obtain the third down-sampled output. The third up-sampling process performs up-sampling based on the third down-sampled output to obtain a third up-sampled output; for example, the third up-sampling process can directly up-sample the third down-sampled output to obtain the third up-sampled output. The third residual-link addition process adds the input of the third-level sampling process and the third up-sampled output via a third residual link, and the result of the third residual-link addition then serves as the output of the third-level sampling process. For example, the size of the output of the third up-sampling process (i.e., the third up-sampled output) is the same as the size of the input of the third-level sampling process (i.e., the input of the third down-sampling process), so that after the third residual-link addition, the size of the output of the third-level sampling process is the same as the size of its input.
It should be noted that, for more details and implementations (i.e., network structures) of the third-level sampling process, reference may be made to the descriptions of the first-level sampling process and the second-level sampling process in the embodiment shown in FIG. 4A, which are not repeated in this disclosure.
It should be noted that, based on this embodiment, those skilled in the art should understand that the multi-scale cyclic sampling process may further include sampling processes at more levels, for example, a fourth-level sampling process nested in the third-level sampling process, a fifth-level sampling process nested in the fourth-level sampling process, and so on; the nesting manner is similar to that of the second-level and third-level sampling processes described above, and the present disclosure does not limit this.
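Because each deeper level repeats the same down-sample / nest / up-sample / residual-add pattern, the nesting to an arbitrary number of levels can be sketched recursively. This is a structural illustration only, again with stand-in pooling and repetition operators rather than the learned sub-networks of the embodiments:

```python
import numpy as np

def down(x):
    # Stand-in 1/2 down-sampling (2x2 average pooling).
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def up(x):
    # Stand-in 2x up-sampling (nearest-neighbour repetition).
    return x.repeat(2, axis=0).repeat(2, axis=1)

def sampling(x, depth):
    # One level of sampling; when depth > 1 the next-level sampling
    # process is nested between this level's down-sampling and
    # up-sampling, mirroring the second/third/fourth-level nesting
    # described above.
    inner = down(x)
    if depth > 1:
        inner = sampling(inner, depth - 1)
    return x + up(inner)

y = sampling(np.ones((16, 16)), 3)  # three nested levels
```

The input must be large enough to halve `depth` times; each level's output keeps the size of its input, so `y` matches the original 16×16 input.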
FIG. 4C is a schematic flow diagram of the multi-scale cyclic sampling process in the image processing method shown in FIG. 3, provided by still another embodiment of the present disclosure. As shown in FIG. 4C, on the basis of the multi-scale cyclic sampling process shown in FIG. 4A, this multi-scale cyclic sampling process includes a plurality of second-level sampling processes executed in sequence. It should be noted that the other flows of the multi-scale cyclic sampling process shown in FIG. 4C are basically the same as those of the multi-scale cyclic sampling process shown in FIG. 4A, and the repeated parts are not described here again. It should also be noted that the inclusion of two second-level sampling processes in FIG. 4C is exemplary; in the embodiments of the present disclosure, the multi-scale cyclic sampling process may include two or more second-level sampling processes executed in sequence. It should be noted that, in the embodiments of the present disclosure, the number of second-level sampling processes can be selected according to actual needs, which is not limited by the present disclosure. For example, in some examples, the inventors of the present application found that, compared with an image processing method having one or three second-level sampling processes, an image processing method having two second-level sampling processes achieves a better image enhancement effect; however, this should not be regarded as a limitation of the present disclosure.
For example, the first second-level sampling process receives the first down-sampling output as its input; each second-level sampling process other than the first receives the output of the previous second-level sampling process as its input; and the output of the last second-level sampling process serves as the input of the first up-sampling process.
It should be noted that, for more details and implementations of each second-level sampling process, reference may be made to the description of the second-level sampling process in the embodiment shown in FIG. 4A, which is not repeated here.
It should be noted that, in some embodiments of the present disclosure (not limited to this embodiment), the parameters of down-sampling processes of the same level in different orders may be the same or different; the parameters of up-sampling processes of the same level in different orders may be the same or different; and the parameters of residual-link additions of the same level in different orders may be the same or different. The present disclosure does not limit this.
It should be noted that, based on this embodiment, those skilled in the art should understand that, in the multi-scale cyclic sampling process, the first-level sampling process may nest multiple second-level sampling processes executed in sequence; further, at least some of the second-level sampling processes may nest one or more third-level sampling processes executed in sequence, and the numbers of third-level sampling processes nested in those second-level sampling processes may be the same or different; further, a third-level sampling process may nest a fourth-level sampling process in the same manner in which a second-level sampling process nests a third-level sampling process; and so on.
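The arbitrary-depth nesting described above can be expressed recursively: the level-k process down-samples, hands the result to the level-(k+1) process, up-samples, and applies a residual-link addition. The sketch below is only an illustration of this structure, not the disclosed network: the learned down-sampling and up-sampling layers are replaced by hypothetical 2×2 average-pooling and nearest-neighbor stand-ins.

```python
import numpy as np

def downsample(x):
    # Stand-in for a learned down-sampling layer: 2x2 average pooling.
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(x):
    # Stand-in for a learned up-sampling layer: nearest-neighbor by 2.
    return x.repeat(2, axis=0).repeat(2, axis=1)

def nested_sampling(x, depth):
    # Level-k sampling nests level-(k+1) between its down-sampling and
    # up-sampling steps; the recursion expresses the "and so on" nesting.
    down = downsample(x)
    if depth > 1:
        down = nested_sampling(down, depth - 1)
    return x + upsample(down)  # residual-link addition preserves the size

feat = np.ones((16, 16))
out = nested_sampling(feat, depth=3)  # three nested sampling levels
assert out.shape == feat.shape
```

Because every level ends with a residual-link addition of its own input, the output size at every level equals the input size, which is what allows levels to nest to any depth.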
It should be noted that FIGS. 4A-4C show the case in which the image processing method provided by the embodiments of the present disclosure includes one multi-scale cyclic sampling process. In the image processing methods provided by the embodiments shown in FIGS. 4A-4C, the at least one multi-scale cyclic sampling process includes a single multi-scale cyclic sampling process. That process receives the first feature image as its input; the input of the multi-scale cyclic sampling process serves as the input of the first-level sampling process within it; the output of the first-level sampling process serves as the output of the multi-scale cyclic sampling process; and the output of the multi-scale cyclic sampling process serves as the output of the at least one multi-scale cyclic sampling process. The present disclosure includes but is not limited to this.
FIG. 4D is a schematic flowchart of the multi-scale cyclic sampling process in the image processing method shown in FIG. 3 according to still another embodiment of the present disclosure. As shown in FIG. 4D, in the image processing method provided by this embodiment, the at least one multi-scale cyclic sampling process includes multiple multi-scale cyclic sampling processes executed in sequence; for example, it may include two or three multi-scale cyclic sampling processes executed in sequence, but is not limited to this. It should be noted that, in the embodiments of the present disclosure, the number of multi-scale cyclic sampling processes can be selected according to actual needs, and the present disclosure does not limit this. For example, in some examples, the inventors of the present application found that an image processing method with two multi-scale cyclic sampling processes yields better image enhancement than one with one or three such processes, but this should not be regarded as a limitation of the present disclosure.
For example, the input of each multi-scale cyclic sampling process serves as the input of the first-level sampling process within that multi-scale cyclic sampling process, and the output of the first-level sampling process within each multi-scale cyclic sampling process serves as the output of that multi-scale cyclic sampling process.
For example, as shown in FIG. 4D, the first multi-scale cyclic sampling process receives the first feature image as its input; each multi-scale cyclic sampling process other than the first receives the output of the previous multi-scale cyclic sampling process as its input; and the output of the last multi-scale cyclic sampling process serves as the output of the at least one multi-scale cyclic sampling process.
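The sequential composition described above (each process consuming the previous process's output) amounts to folding an input through a list of processes. A minimal sketch, with hypothetical placeholder functions standing in for the learned, size-preserving multi-scale cyclic sampling processes:

```python
def run_in_sequence(first_feature_image, processes):
    # The first process receives the first feature image; every later
    # process receives the previous process's output; the output of the
    # last process is the output of the whole chain.
    out = first_feature_image
    for process in processes:
        out = process(out)
    return out

# Placeholder size-preserving processes (illustrative only).
p1 = lambda x: [v + 1 for v in x]
p2 = lambda x: [v * 2 for v in x]
result = run_in_sequence([1, 2, 3], [p1, p2])
assert result == [4, 6, 8]
```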
It should be noted that, for more details and implementations of each multi-scale cyclic sampling process, reference may be made to the descriptions of the multi-scale cyclic sampling process in the embodiments shown in FIGS. 4A-4D, which are not repeated here. It should also be noted that the implementations (i.e., network structures) and parameters of multi-scale cyclic sampling processes in different orders may be the same or different, and the present disclosure does not limit this.
FIG. 5 is a flowchart of an image processing method provided by another embodiment of the present disclosure. As shown in FIG. 5, the image processing method includes steps S210 to S250. It should be noted that steps S230 to S240 of the image processing method shown in FIG. 5 correspond to steps S110 to S120 of the image processing method shown in FIG. 3; that is, the image processing method shown in FIG. 5 includes the image processing method shown in FIG. 3. Therefore, for steps S230 to S240 of the image processing method shown in FIG. 5, reference may be made to the foregoing description of steps S110 to S120 of the image processing method shown in FIG. 3, and of course also to the methods of the embodiments shown in FIGS. 4A-4D. Steps S210 to S250 of the image processing method shown in FIG. 5 are described in detail below.
Step S210: Obtain an input image.
For example, in step S210, the input image may include a photo captured by the camera of a smartphone, the camera of a tablet computer, the camera of a personal computer, the lens of a digital camera, a surveillance camera, a web camera, or the like, and may include an image of a person, an image of animals or plants, a landscape image, or the like; the present disclosure does not limit this. For example, the quality of the input image is lower than that of a photo taken by a real digital single-lens reflex camera; that is, the input image is a low-quality image. For example, in some examples, the input image may be a 3-channel RGB image; in other examples, the input image may be a 3-channel YUV image. In the following, the input image is described as an RGB image by way of example, but the embodiments of the present disclosure are not limited to this.
Step S220: Use an analysis network to convert the input image into a first feature image.
For example, in step S220, the analysis network may be a convolutional neural network including one of a convolutional layer, a residual network, a dense network, or the like. For example, in some examples, the analysis network may convert the 3-channel RGB image (i.e., the input image) into multiple first feature images, for example 64 first feature images; the present disclosure includes but is not limited to this.
It should be noted that the embodiments of the present disclosure do not limit the structure and parameters of the analysis network, as long as it can convert the input image into the convolutional feature dimension (i.e., into the first feature image).
Step S230: Receive the first feature image.
Step S240: Perform at least one multi-scale cyclic sampling process on the first feature image.
It should be noted that, for steps S230 to S240, reference may be made to the foregoing description of steps S110 to S120, which is not repeated here.
Step S250: Use a synthesis network to convert the output of the at least one multi-scale cyclic sampling process into an output image.
For example, in step S250, the synthesis network may be a convolutional neural network including one of a convolutional layer, a residual network, a dense network, or the like. For example, the output of the at least one multi-scale cyclic sampling process may be referred to as the second feature image. For example, there may be multiple second feature images, but this is not limited. For example, in some examples, the synthesis network may convert the multiple second feature images into an output image; for example, the output image may be a 3-channel RGB image. The present disclosure includes but is not limited to this.
FIG. 6A is a schematic diagram of an input image, and FIG. 6B is a schematic diagram of an output image obtained by processing the input image shown in FIG. 6A with an image processing method (for example, the image processing method shown in FIG. 5) provided by an embodiment of the present disclosure.
For example, as shown in FIG. 6A and FIG. 6B, the output image retains the content of the input image, but the contrast is improved and the problem of the input image being too dark is alleviated; thus, compared with the input image, the quality of the output image can approach that of a photo taken by a real digital single-lens reflex camera, i.e., the output image is a high-quality image.
It should be noted that the embodiments of the present disclosure do not limit the structure and parameters of the synthesis network, as long as it can convert the convolutional feature dimension (i.e., the second feature image) into the output image.
The image processing method provided by the embodiments of the present disclosure can perform image enhancement on low-quality input images; by repeatedly sampling at multiple scales to obtain higher image fidelity, it can greatly improve the quality of the output image, and is suitable for offline applications with high image-quality requirements, such as batch processing. Specifically, the PSNR of images output by the image enhancement method proposed in the literature by Andrey Ignatov et al. is 20.08, while the PSNR of output images obtained by the image processing method provided by the embodiment shown in FIG. 4C of the present disclosure can reach 23.35; that is, the images obtained by the image processing method provided by the embodiments of the present disclosure can be closer to real photos taken by a digital single-lens reflex camera.
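PSNR, the metric quoted above, measures how close a processed image is to a reference image on a logarithmic decibel scale (higher is closer). The following is the standard definition for 8-bit images, not code from the disclosure:

```python
import numpy as np

def psnr(reference, test, max_val=255.0):
    # Peak signal-to-noise ratio in dB between two images of equal shape.
    mse = np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)

ref = np.full((8, 8), 200.0)
noisy = ref + 10.0                 # a constant error of 10 gray levels
print(round(psnr(ref, noisy), 2))  # → 28.13
```

A gain from 20.08 dB to 23.35 dB corresponds to roughly halving the mean squared error against the reference photo.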
At least one embodiment of the present disclosure further provides a training method for a neural network. FIG. 7A is a schematic structural diagram of a neural network provided by an embodiment of the present disclosure, FIG. 7B is a flowchart of a neural network training method provided by an embodiment of the present disclosure, and FIG. 7C is a schematic block diagram of training the neural network shown in FIG. 7A according to the training method shown in FIG. 7B.
For example, as shown in FIG. 7A, the neural network 300 includes an analysis network 310, a first sub-neural network 320, and a synthesis network 330. For example, the analysis network 310 processes an input image to obtain a first feature image; the first sub-neural network 320 performs at least one multi-scale cyclic sampling process on the first feature image to obtain a second feature image; and the synthesis network 330 processes the second feature image to obtain an output image.
For example, for the structure of the analysis network 310, reference may be made to the description of the analysis network in the foregoing step S220, which is not limited in the present disclosure. For the structure of the first sub-neural network 320, reference may be made to the description of the implementation of the multi-scale cyclic sampling process in the foregoing step S120 (i.e., step S240); for example, the first sub-neural network may include but is not limited to the aforementioned first convolutional neural network, which is not limited in the present disclosure. For example, for the synthesis network 330, reference may be made to the description of the synthesis network in the foregoing step S250, which is not limited in the present disclosure.
For example, for the input image and the output image, reference may also be made to the descriptions of the input image and the output image in the image processing methods provided by the foregoing embodiments, which are not repeated here.
For example, as shown in FIG. 7B and FIG. 7C, the training method of the neural network includes steps S410 to S460.
Step S410: Obtain a training input image.
For example, similar to the input image in the foregoing step S210, the training input image may also include a photo captured by the camera of a smartphone, the camera of a tablet computer, the camera of a personal computer, the lens of a digital camera, a surveillance camera, a web camera, or the like, and may include an image of a person, an image of animals or plants, a landscape image, or the like; the present disclosure does not limit this. For example, the quality of the training input image is lower than that of a photo taken by a real digital single-lens reflex camera, i.e., the training input image is a low-quality image. For example, in some examples, the training input image may be a 3-channel RGB image.
Step S420: Use the analysis network to process the training input image to provide a first training feature image.
For example, similar to the analysis network in the foregoing step S220, the analysis network 310 may be a convolutional neural network including one of a convolutional layer, a residual network, a dense network, or the like. For example, in some examples, the analysis network may convert the 3-channel RGB image (i.e., the training input image) into multiple first training feature images, for example 64 first training feature images; the present disclosure includes but is not limited to this.
Step S430: Use the first sub-neural network to perform at least one multi-scale cyclic sampling process on the first training feature image to obtain a second training feature image.
For example, in step S430, the multi-scale cyclic sampling process may be implemented as the multi-scale cyclic sampling process in any of the embodiments shown in FIGS. 4A-4D, but is not limited to this. In the following, the case in which the multi-scale cyclic sampling process in step S430 is implemented as the multi-scale cyclic sampling process shown in FIG. 4A is taken as an example for description.
For example, as shown in FIG. 4A, the multi-scale cyclic sampling process includes nested first-level and second-level sampling processes.
For example, as shown in FIG. 4A, the input of the multi-scale cyclic sampling process (i.e., the first training feature image) serves as the input of the first-level sampling process, and the output of the first-level sampling process serves as the output of the multi-scale cyclic sampling process (i.e., the second training feature image). For example, the size of the second training feature image may be the same as that of the first training feature image.
For example, as shown in FIG. 4A, the first-level sampling process includes a first down-sampling process, a first up-sampling process, and a first residual-link addition that are executed in sequence. The first down-sampling process performs down-sampling based on the input of the first-level sampling process to obtain a first down-sampling output; for example, the first down-sampling process may directly down-sample the input of the first-level sampling process to obtain the first down-sampling output. The first up-sampling process performs up-sampling based on the first down-sampling output to obtain a first up-sampling output; for example, after the first down-sampling output passes through the second-level sampling process, up-sampling is performed to obtain the first up-sampling output, i.e., the first up-sampling process may indirectly up-sample the first down-sampling output. The first residual-link addition adds the input of the first-level sampling process and the first up-sampling output via a first residual link, and the result of the first residual-link addition serves as the output of the first-level sampling process. For example, the size of the output of the first up-sampling process (i.e., the first up-sampling output) is the same as that of the input of the first-level sampling process (i.e., the input of the first down-sampling process), so that after the first residual-link addition, the size of the output of the first-level sampling process is the same as that of its input.
For example, as shown in FIG. 4A, the second-level sampling process is nested between the first down-sampling process and the first up-sampling process of the first-level sampling process; it receives the first down-sampling output as its input and provides its output as the input of the first up-sampling process, so that the first up-sampling process performs up-sampling based on the first down-sampling output.
For example, as shown in FIG. 4A, the second-level sampling process includes a second down-sampling process, a second up-sampling process, and a second residual-link addition that are executed in sequence. The second down-sampling process performs down-sampling based on the input of the second-level sampling process to obtain a second down-sampling output; for example, the second down-sampling process may directly down-sample the input of the second-level sampling process to obtain the second down-sampling output. The second up-sampling process performs up-sampling based on the second down-sampling output to obtain a second up-sampling output; for example, the second up-sampling process may directly up-sample the second down-sampling output to obtain the second up-sampling output. The second residual-link addition adds the input of the second-level sampling process and the second up-sampling output via a second residual link, and the result of the second residual-link addition serves as the output of the second-level sampling process. For example, the size of the output of the second up-sampling process (i.e., the second up-sampling output) is the same as that of the input of the second-level sampling process (i.e., the input of the second down-sampling process), so that after the second residual-link addition, the size of the output of the second-level sampling process is the same as that of its input.
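The two-level structure of FIG. 4A can be sketched concretely. This is only an illustration of the data flow, assuming simple 2×2 average-pooling and nearest-neighbor stand-ins in place of the learned down-sampling and up-sampling layers:

```python
import numpy as np

def downsample(x):
    # Stand-in for a learned down-sampling layer: 2x2 average pooling.
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(x):
    # Stand-in for a learned up-sampling layer: nearest-neighbor by 2.
    return x.repeat(2, axis=0).repeat(2, axis=1)

def second_level(x):
    # Second down-sampling, second up-sampling, second residual-link
    # addition; the output has the same size as the input.
    down2 = downsample(x)
    up2 = upsample(down2)
    return x + up2

def first_level(x):
    # First down-sampling, then the nested second-level process, then
    # first up-sampling and first residual-link addition.
    down1 = downsample(x)
    up1 = upsample(second_level(down1))
    return x + up1

feat = np.arange(64, dtype=np.float64).reshape(8, 8)
out = first_level(feat)
assert out.shape == feat.shape  # sizes are preserved at both levels
```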
For example, correspondingly, the first sub-neural network 320 may be implemented as the aforementioned first convolutional neural network. For example, the first sub-neural network 320 may include a nested first meta-network and second meta-network, where the first meta-network is used to execute the first-level sampling process and the second meta-network is used to execute the second-level sampling process.
For example, the first meta-network may include a first sub-network and a second sub-network, where the first sub-network is used to execute the first down-sampling process and the second sub-network is used to execute the first up-sampling process. The second meta-network is nested between the first sub-network and the second sub-network of the first meta-network. For example, the second meta-network may include a third sub-network and a fourth sub-network, where the third sub-network is used to execute the second down-sampling process and the fourth sub-network is used to execute the second up-sampling process.
For example, each of the first, second, third, and fourth sub-networks includes one of a convolutional layer, a residual network, a dense network, or the like. Specifically, the first and third sub-networks may include one of a convolutional layer with a down-sampling function (a down-sampling layer), a residual network, a dense network, or the like; the second and fourth sub-networks may include one of a convolutional layer with an up-sampling function (an up-sampling layer), a residual network, a dense network, or the like. It should be noted that the first and third sub-networks may have the same or different structures, and the second and fourth sub-networks may have the same or different structures; the present disclosure does not limit this.
For example, in the embodiments of the present disclosure, in order to improve global features of the feature images such as brightness and contrast, the multi-scale cyclic sampling process may further include: after the first down-sampling process, the first up-sampling process, the second down-sampling process, and the second up-sampling process, performing instance normalization or layer normalization on the first down-sampling output, the first up-sampling output, the second down-sampling output, and the second up-sampling output, respectively. It should be noted that the same normalization method (instance normalization or layer normalization) or different normalization methods may be applied to the first down-sampling output, the first up-sampling output, the second down-sampling output, and the second up-sampling output; the present disclosure does not limit this.
Correspondingly, each of the first, second, third, and fourth sub-networks further includes an instance normalization layer or a layer normalization layer, where the instance normalization layer is used to perform instance normalization and the layer normalization layer is used to perform layer normalization. For example, the instance normalization layer may perform instance normalization according to the aforementioned instance normalization formula, and the layer normalization layer may perform layer normalization according to the aforementioned layer normalization formula; the present disclosure does not limit this. It should be noted that the first, second, third, and fourth sub-networks may include the same normalization layer (instance normalization layer or layer normalization layer) or different normalization layers; the present disclosure does not limit this either.
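The difference between the two normalization variants is which axes the statistics are taken over. The sketch below uses the standard textbook definitions (per-channel spatial statistics for instance normalization, per-sample statistics over all channels and positions for layer normalization), without the learnable scale and shift that a trained normalization layer would typically add; the patent's own formulas appear earlier in the disclosure and are not reproduced here.

```python
import numpy as np

def instance_norm(x, eps=1e-5):
    # Normalize each feature map of each sample independently over its
    # spatial dimensions (H, W); x has shape (N, C, H, W).
    mean = x.mean(axis=(2, 3), keepdims=True)
    var = x.var(axis=(2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def layer_norm(x, eps=1e-5):
    # Normalize each sample over all of its channels and spatial
    # positions (C, H, W) together.
    mean = x.mean(axis=(1, 2, 3), keepdims=True)
    var = x.var(axis=(1, 2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

x = np.random.default_rng(0).normal(size=(2, 3, 4, 4))
y = instance_norm(x)
# Each (sample, channel) map now has approximately zero mean.
assert np.allclose(y.mean(axis=(2, 3)), 0.0, atol=1e-6)
```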
It should be noted that, for more implementations and details of the multi-scale cyclic sampling process in step S430, reference may be made to the foregoing step S120 (i.e., step S240) and the descriptions of the multi-scale cyclic sampling process in the embodiments shown in FIGS. 4A-4D, which are not repeated here. It should also be noted that, when the multi-scale cyclic sampling process in step S430 is implemented in another form, the first sub-neural network 320 should be changed accordingly to implement that other form of multi-scale cyclic sampling process, which is not elaborated here.
For example, in step S430, there may be multiple second training feature images, but this is not limited.
Step S440: Use the synthesis network to process the second training feature image to obtain a training output image.
For example, similar to the synthesis network in the foregoing step S250, the synthesis network 330 may be a convolutional neural network including one of a convolutional layer, a residual network, a dense network, or the like. For example, in some examples, the synthesis network may convert the multiple second training feature images into a training output image; for example, the training output image may be a 3-channel RGB image. The present disclosure includes but is not limited to this.
Step S450: Based on the training output image, calculate a loss value of the neural network through a loss function.
For example, the parameters of the neural network 300 include the parameters of the analysis network 310, the parameters of the first sub-neural network 320, and the parameters of the synthesis network 330. For example, the initial parameters of the neural network 300 may be random numbers, for example random numbers conforming to a Gaussian distribution; the embodiments of the present disclosure do not limit this.
For example, for the loss function of this embodiment, reference may be made to the loss function in the literature provided by Andrey Ignatov et al. For example, similar to the loss function in that literature, the loss function may include a color loss function, a texture loss function, and a content loss function; correspondingly, for the specific process of calculating the loss value of the parameters of the neural network 300 through the loss function, reference may also be made to the description in that literature. It should be noted that the embodiments of the present disclosure do not limit the specific form of the loss function, which includes but is not limited to the form of the loss function in the above literature.
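A multi-term loss of the kind named above is typically a weighted sum of its components. The sketch below only shows that combination; the weights and the individual loss terms are placeholders, not the values or formulas used by Ignatov et al. or by this disclosure:

```python
def total_loss(color, texture, content,
               w_color=1.0, w_texture=0.4, w_content=1.0):
    # Weighted sum of the three loss terms; the weights here are
    # illustrative placeholders, chosen arbitrarily for the example.
    return w_color * color + w_texture * texture + w_content * content

# With all three terms equal to 1.0, the total is the sum of the weights.
assert abs(total_loss(1.0, 1.0, 1.0) - 2.4) < 1e-9
```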
Step S460: correct the parameters of the neural network according to the loss value.
For example, the training process of the neural network 300 may further include an optimization function (not shown in FIG. 7C). The optimization function may calculate error values of the parameters of the neural network 300 according to the loss value calculated by the loss function, and correct the parameters of the neural network 300 according to the error values. For example, the optimization function may use a stochastic gradient descent (SGD) algorithm, a batch gradient descent (BGD) algorithm, or the like to calculate the error values of the parameters of the neural network 300.
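The parameter correction performed by such an optimization function reduces, in the SGD case, to moving each parameter against its gradient. A minimal sketch on a toy scalar objective (the real gradients would come from backpropagation through the network):

```python
def sgd_step(params, grads, lr=0.01):
    """One stochastic-gradient-descent correction: each parameter moves
    against its error gradient, scaled by the learning rate."""
    return [p - lr * g for p, g in zip(params, grads)]

# minimise f(p) = p^2 for a single scalar parameter; its gradient is 2p
p = [10.0]
for _ in range(100):
    p = sgd_step(p, [2.0 * p[0]], lr=0.1)
print(abs(p[0]) < 1e-3)  # True: the parameter converges toward the optimum
```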
For example, the training method of the neural network may further include: judging whether the training of the neural network satisfies a predetermined condition; if the predetermined condition is not satisfied, repeating the above training process (i.e., step S410 to step S460); if the predetermined condition is satisfied, stopping the above training process to obtain a trained neural network. For example, in one example, the predetermined condition is that the loss values corresponding to two (or more) consecutive training output images no longer decrease significantly. For example, in another example, the predetermined condition is that the number of training iterations or training epochs of the neural network reaches a predetermined number. The present disclosure does not limit this.
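Both stopping conditions can be combined in a simple loop: stop early when the loss has stopped decreasing significantly over consecutive outputs, otherwise stop at an iteration cap. The tolerance and patience values below are illustrative assumptions:

```python
def train(step_fn, max_iters=1000, tol=1e-4, patience=2):
    """Run step_fn (which performs one training step and returns the loss)
    until the loss has failed to decrease by more than `tol` for `patience`
    consecutive outputs, or until `max_iters` iterations are reached."""
    prev, stall = float("inf"), 0
    for i in range(max_iters):
        loss = step_fn()
        stall = stall + 1 if prev - loss <= tol else 0
        if stall >= patience:
            return i + 1, loss  # converged: loss no longer decreasing significantly
        prev = loss
    return max_iters, loss

# a toy loss that decays geometrically stands in for the real training step
losses = iter(0.5 ** k for k in range(1000))
iters, final = train(lambda: next(losses))
print(iters < 1000)  # True: training stopped well before the iteration cap
```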
For example, the training output image produced by the trained neural network 300 retains the content of the training input image, but the quality of the training output image can approach the quality of a photograph taken by a real digital single-lens reflex camera; that is, the training output image is a high-quality image.
It should be noted that the above-mentioned embodiments only schematically illustrate the training process of the neural network. Those skilled in the art should understand that, in the training phase, a large number of sample images need to be used to train the neural network; meanwhile, the training process for each sample image may include multiple iterations to correct the parameters of the neural network. For another example, the training phase may also include fine-tuning the parameters of the neural network to obtain more optimized parameters.
The neural network training method provided by the embodiments of the present disclosure can train the neural network used in the image processing method of the embodiments of the present disclosure. The neural network trained by this training method can perform image enhancement processing on low-quality input images; by repeatedly sampling at multiple scales to obtain higher image fidelity, the quality of the output image can be greatly improved, which is suitable for offline applications, such as batch processing, that have high requirements on image quality.
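The multi-scale cyclic sampling at the heart of this method can be sketched as follows: each level down-samples its input, recurses into the nested level, up-samples the result back to the input size, and adds the residual link; the whole pass may then be repeated several times. Average pooling and nearest-neighbour up-sampling stand in here for the learned sub-networks:

```python
import numpy as np

def downsample(x):
    # 2x2 average pooling: a stand-in for a learned down-sampling sub-network
    return 0.25 * (x[::2, ::2] + x[1::2, ::2] + x[::2, 1::2] + x[1::2, 1::2])

def upsample(x):
    # nearest-neighbour doubling: a stand-in for a learned up-sampling sub-network
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def sample_level(x, depth):
    """One sampling level: down-sample, recurse into the nested level
    (if any), up-sample, then add the residual link from the input."""
    down = downsample(x)
    inner = sample_level(down, depth - 1) if depth > 1 else down
    return x + upsample(inner)  # residual link addition

def multi_scale_cyclic_sampling(x, levels=2, cycles=3):
    # the sampling processing may be executed multiple times in sequence
    for _ in range(cycles):
        x = sample_level(x, levels)
    return x

feat = np.arange(64, dtype=float).reshape(8, 8)
out = multi_scale_cyclic_sampling(feat)
print(out.shape)  # (8, 8): each level's output keeps its input's size
```

Because of the residual addition, each level's output has the same size as its input, matching the size constraint stated for the up-sampling and down-sampling pairs.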
At least one embodiment of the present disclosure further provides an image processing device. FIG. 8 is a schematic block diagram of an image processing device provided by an embodiment of the present disclosure. For example, as shown in FIG. 8, the image processing device 500 includes a memory 510 and a processor 520. For example, the memory 510 is used for non-transitory storage of computer-readable instructions, and the processor 520 is used to run the computer-readable instructions; when the computer-readable instructions are run by the processor 520, the image processing method provided by the embodiments of the present disclosure is executed.
For example, the memory 510 and the processor 520 may communicate with each other directly or indirectly. For example, components such as the memory 510 and the processor 520 may communicate through a network connection. The network may include a wireless network, a wired network, and/or any combination of wireless and wired networks. The network may include a local area network, the Internet, a telecommunication network, an Internet of Things based on the Internet and/or a telecommunication network, and/or any combination of the above networks, etc. The wired network may, for example, communicate by means of twisted pair, coaxial cable, or optical fiber transmission; the wireless network may, for example, use a 3G/4G/5G mobile communication network, Bluetooth, Zigbee, or WiFi. The present disclosure does not limit the type and function of the network.
For example, the processor 520 may control other components in the image processing device to perform desired functions. The processor 520 may be a device with data processing capability and/or program execution capability, such as a central processing unit (CPU), a tensor processing unit (TPU), or a graphics processing unit (GPU). The central processing unit (CPU) may be of an X86 or ARM architecture, etc. The GPU may be integrated directly on the motherboard alone, or built into the north bridge chip of the motherboard. The GPU may also be built into the central processing unit (CPU).
For example, the memory 510 may include any combination of one or more computer program products, and the computer program products may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, erasable programmable read-only memory (EPROM), compact disc read-only memory (CD-ROM), USB memory, flash memory, etc.
For example, one or more computer instructions may be stored in the memory 510, and the processor 520 may run the computer instructions to implement various functions. Various application programs and various data, such as training input images and various data used and/or generated by the application programs, may also be stored in the computer-readable storage medium.
For example, when some computer instructions stored in the memory 510 are executed by the processor 520, one or more steps of the image processing method described above may be performed. For another example, when other computer instructions stored in the memory 510 are executed by the processor 520, one or more steps of the neural network training method described above may be performed.
For example, for a detailed description of the processing procedure of the image processing method, reference may be made to the relevant description in the embodiments of the image processing method above; for a detailed description of the processing procedure of the neural network training method, reference may be made to the relevant description in the embodiments of the neural network training method above. Repeated parts will not be described again.
需要说明的是,本公开的上述实施例提供的图像处理装置是示例性的,而非限制性的,根据实际应用需要,该图像处理装置还可以包括其他常规部件或结构,例如,为实现图像处理装置的必要功能,本领域技术人员可以根据具体应用场景设置其他的常规部件或结构,本公开的实施例对此不作限制。It should be noted that the image processing device provided by the above-mentioned embodiments of the present disclosure is exemplary rather than restrictive. According to actual application requirements, the image processing device may also include other conventional components or structures, for example, to realize image processing. For necessary functions of the processing device, those skilled in the art can set other conventional components or structures according to specific application scenarios, which are not limited in the embodiments of the present disclosure.
本公开的上述实施例提供的图像处理装置的技术效果可以参考上述实施例中关于图像处理方法以及神经网络的训练方法的相应描述,在此不再赘述。For the technical effects of the image processing device provided in the foregoing embodiment of the present disclosure, reference may be made to the corresponding description of the image processing method and the neural network training method in the foregoing embodiment, which will not be repeated here.
本公开至少一实施例还提供一种存储介质。图9为本公开一实施例提供的一种存储介质的示意图。例如,如图9所示,该存储介质600非暂时性地存储计算机可读指令601,当非暂时性计算机可读指令601由计算机(包括处理器)执行时可以执行本公开任一实施例提供的图像处理方法的指令。At least one embodiment of the present disclosure also provides a storage medium. FIG. 9 is a schematic diagram of a storage medium provided by an embodiment of the disclosure. For example, as shown in FIG. 9, the storage medium 600 non-transitory stores computer-readable instructions 601. When the non-transitory computer-readable instructions 601 are executed by a computer (including a processor), any of the embodiments of the present disclosure can be executed. Instructions for the image processing method.
For example, one or more computer instructions may be stored on the storage medium 600. Some of the computer instructions stored on the storage medium 600 may be, for example, instructions for implementing one or more steps of the image processing method described above. Other computer instructions stored on the storage medium may be, for example, instructions for implementing one or more steps of the neural network training method described above.
For example, the storage medium may include the storage component of a tablet computer, the hard disk of a personal computer, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), compact disc read-only memory (CD-ROM), flash memory, any combination of the above storage media, or other suitable storage media.
For the technical effects of the storage medium provided by the embodiments of the present disclosure, reference may be made to the corresponding descriptions of the image processing method and the neural network training method in the above embodiments, which will not be repeated here.
For the present disclosure, the following points need to be explained:
(1) The drawings of the embodiments of the present disclosure only involve the structures related to the embodiments of the present disclosure; for other structures, reference may be made to common designs.
(2) Without conflict, the features in the same embodiment and in different embodiments of the present disclosure may be combined with each other.
The above are only specific implementations of the present disclosure, but the protection scope of the present disclosure is not limited thereto. Any person skilled in the art can easily conceive of changes or substitutions within the technical scope disclosed by the present disclosure, which shall all be covered within the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (19)

  1. An image processing method, comprising:
    receiving a first feature image; and
    performing multi-scale cyclic sampling processing on the first feature image at least once;
    wherein the multi-scale cyclic sampling processing includes nested first-level sampling processing and second-level sampling processing,
    the first-level sampling processing includes first down-sampling processing, first up-sampling processing, and first residual link addition processing, wherein the first down-sampling processing performs down-sampling processing based on the input of the first-level sampling processing to obtain a first down-sampling output, the first up-sampling processing performs up-sampling processing based on the first down-sampling output to obtain a first up-sampling output, and the first residual link addition processing performs a first residual link addition on the input of the first-level sampling processing and the first up-sampling output, and then takes the result of the first residual link addition as the output of the first-level sampling processing;
    the second-level sampling processing is nested between the first down-sampling processing and the first up-sampling processing, receives the first down-sampling output as the input of the second-level sampling processing, and provides the output of the second-level sampling processing as the input of the first up-sampling processing, so that the first up-sampling processing performs up-sampling processing based on the first down-sampling output; and
    the second-level sampling processing includes second down-sampling processing, second up-sampling processing, and second residual link addition processing, wherein the second down-sampling processing performs down-sampling processing based on the input of the second-level sampling processing to obtain a second down-sampling output, the second up-sampling processing performs up-sampling processing based on the second down-sampling output to obtain a second up-sampling output, and the second residual link addition processing performs a second residual link addition on the input of the second-level sampling processing and the second up-sampling output, and then takes the result of the second residual link addition as the output of the second-level sampling processing.
  2. The image processing method according to claim 1, wherein the size of the output of the first up-sampling processing is the same as the size of the input of the first down-sampling processing; and
    the size of the output of the second up-sampling processing is the same as the size of the input of the second down-sampling processing.
  3. The image processing method according to claim 1 or 2, wherein the multi-scale cyclic sampling processing further includes third-level sampling processing,
    the third-level sampling processing is nested between the second down-sampling processing and the second up-sampling processing, receives the second down-sampling output as the input of the third-level sampling processing, and provides the output of the third-level sampling processing as the input of the second up-sampling processing, so that the second up-sampling processing performs up-sampling processing based on the second down-sampling output; and
    the third-level sampling processing includes third down-sampling processing, third up-sampling processing, and third residual link addition processing, wherein the third down-sampling processing performs down-sampling processing based on the input of the third-level sampling processing to obtain a third down-sampling output, the third up-sampling processing performs up-sampling processing based on the third down-sampling output to obtain a third up-sampling output, and the third residual link addition processing performs a third residual link addition on the input of the third-level sampling processing and the third up-sampling output, and then takes the result of the third residual link addition as the output of the third-level sampling processing.
  4. The image processing method according to claim 1 or 2, wherein the multi-scale cyclic sampling processing includes the second-level sampling processing executed multiple times in sequence,
    the first second-level sampling processing receives the first down-sampling output as the input of the first second-level sampling processing,
    each second-level sampling processing other than the first second-level sampling processing receives the output of the previous second-level sampling processing as the input of the current second-level sampling processing, and
    the output of the last second-level sampling processing is taken as the input of the first up-sampling processing.
  5. The image processing method according to any one of claims 1-4, wherein the at least one multi-scale cyclic sampling processing includes the multi-scale cyclic sampling processing executed multiple times in sequence,
    the input of each multi-scale cyclic sampling processing is taken as the input of the first-level sampling processing in the current multi-scale cyclic sampling processing, and the output of the first-level sampling processing in each multi-scale cyclic sampling processing is taken as the output of the current multi-scale cyclic sampling processing;
    the first multi-scale cyclic sampling processing receives the first feature image as the input of the first multi-scale cyclic sampling processing,
    each multi-scale cyclic sampling processing other than the first multi-scale cyclic sampling processing receives the output of the previous multi-scale cyclic sampling processing as the input of the current multi-scale cyclic sampling processing, and
    the output of the last multi-scale cyclic sampling processing is taken as the output of the at least one multi-scale cyclic sampling processing.
  6. The image processing method according to any one of claims 1-5, wherein the multi-scale cyclic sampling processing further includes:
    after the first down-sampling processing, the first up-sampling processing, the second down-sampling processing, and the second up-sampling processing, performing instance normalization processing or layer normalization processing on the first down-sampling output, the first up-sampling output, the second down-sampling output, and the second up-sampling output, respectively.
  7. The image processing method according to any one of claims 1-6, further comprising: using a first convolutional neural network to perform the multi-scale cyclic sampling processing;
    wherein the first convolutional neural network includes:
    a first meta-network for performing the first-level sampling processing; and
    a second meta-network for performing the second-level sampling processing.
  8. The image processing method according to claim 7, wherein
    the first meta-network includes:
    a first sub-network for performing the first down-sampling processing; and
    a second sub-network for performing the first up-sampling processing; and
    the second meta-network includes:
    a third sub-network for performing the second down-sampling processing; and
    a fourth sub-network for performing the second up-sampling processing.
  9. The image processing method according to claim 8, wherein each of the first sub-network, the second sub-network, the third sub-network, and the fourth sub-network includes one of a convolutional layer, a residual network, and a dense network.
  10. The image processing method according to claim 9, wherein each of the first sub-network, the second sub-network, the third sub-network, and the fourth sub-network includes an instance normalization layer or a layer normalization layer, and
    the instance normalization layer is used to perform instance normalization processing, and the layer normalization layer is used to perform layer normalization processing.
  11. The image processing method according to any one of claims 1-10, further comprising:
    acquiring an input image;
    using an analysis network to convert the input image into the first feature image; and
    using a synthesis network to convert the output of the at least one multi-scale cyclic sampling processing into an output image.
  12. A neural network training method, wherein
    the neural network includes an analysis network, a first sub-neural network, and a synthesis network,
    and the training method includes:
    acquiring a training input image;
    using the analysis network to process the training input image to provide a first training feature image;
    using the first sub-neural network to perform multi-scale cyclic sampling processing on the first training feature image at least once to obtain a second training feature image;
    using the synthesis network to process the second training feature image to obtain a training output image;
    based on the training output image, calculating a loss value of the neural network through a loss function; and
    correcting parameters of the neural network according to the loss value;
    wherein the multi-scale cyclic sampling processing includes nested first-level sampling processing and second-level sampling processing,
    the first-level sampling processing includes first down-sampling processing, first up-sampling processing, and first residual link addition processing, wherein the first down-sampling processing performs down-sampling processing based on the input of the first-level sampling processing to obtain a first down-sampling output, the first up-sampling processing performs up-sampling processing based on the first down-sampling output to obtain a first up-sampling output, and the first residual link addition processing performs a first residual link addition on the input of the first-level sampling processing and the first up-sampling output, and then takes the result of the first residual link addition as the output of the first-level sampling processing;
    the second-level sampling processing is nested between the first down-sampling processing and the first up-sampling processing, receives the first down-sampling output as the input of the second-level sampling processing, and provides the output of the second-level sampling processing as the input of the first up-sampling processing, so that the first up-sampling processing performs up-sampling processing based on the first down-sampling output; and
    the second-level sampling processing includes second down-sampling processing, second up-sampling processing, and second residual link addition processing, wherein the second down-sampling processing performs down-sampling processing based on the input of the second-level sampling processing to obtain a second down-sampling output, the second up-sampling processing performs up-sampling processing based on the second down-sampling output to obtain a second up-sampling output, and the second residual link addition processing performs a second residual link addition on the input of the second-level sampling processing and the second up-sampling output, and then takes the result of the second residual link addition as the output of the second-level sampling processing.
  13. The training method according to claim 12, wherein the size of the output of the first up-sampling processing is the same as the size of the input of the first down-sampling processing; and
    the size of the output of the second up-sampling processing is the same as the size of the input of the second down-sampling processing.
  14. The training method according to claim 12 or 13, wherein the multi-scale cyclic sampling processing further includes third-level sampling processing,
    the third-level sampling processing is nested between the second down-sampling processing and the second up-sampling processing, receives the second down-sampling output as the input of the third-level sampling processing, and provides the output of the third-level sampling processing as the input of the second up-sampling processing, so that the second up-sampling processing performs up-sampling processing based on the second down-sampling output; and
    the third-level sampling processing includes third down-sampling processing, third up-sampling processing, and third residual link addition processing, wherein the third down-sampling processing performs down-sampling processing based on the input of the third-level sampling processing to obtain a third down-sampling output, the third up-sampling processing performs up-sampling processing based on the third down-sampling output to obtain a third up-sampling output, and the third residual link addition processing performs a third residual link addition on the input of the third-level sampling processing and the third up-sampling output, and then takes the result of the third residual link addition as the output of the third-level sampling processing.
  15. The training method according to claim 12 or 13, wherein the multi-scale cyclic sampling processing includes the second-level sampling processing executed multiple times in sequence,
    the first second-level sampling processing receives the first down-sampling output as the input of the first second-level sampling processing,
    each second-level sampling processing other than the first second-level sampling processing receives the output of the previous second-level sampling processing as the input of the current second-level sampling processing, and
    the output of the last second-level sampling processing is taken as the input of the first up-sampling processing.
  16. The training method according to any one of claims 12-15, wherein the at least one multi-scale cyclic sampling processing includes the multi-scale cyclic sampling processing executed multiple times in sequence,
    the input of each multi-scale cyclic sampling processing is taken as the input of the first-level sampling processing in the current multi-scale cyclic sampling processing, and the output of the first-level sampling processing in each multi-scale cyclic sampling processing is taken as the output of the current multi-scale cyclic sampling processing;
    the first multi-scale cyclic sampling processing receives the first training feature image as the input of the first multi-scale cyclic sampling processing,
    each multi-scale cyclic sampling processing other than the first multi-scale cyclic sampling processing receives the output of the previous multi-scale cyclic sampling processing as the input of the current multi-scale cyclic sampling processing, and
    the output of the last multi-scale cyclic sampling processing is taken as the output of the at least one multi-scale cyclic sampling processing.
  17. The training method according to any one of claims 12-16, wherein the multi-scale cyclic sampling processing further includes:
    after the first down-sampling processing, the first up-sampling processing, the second down-sampling processing, and the second up-sampling processing, performing instance normalization processing or layer normalization processing on the first down-sampling output, the first up-sampling output, the second down-sampling output, and the second up-sampling output, respectively.
  18. An image processing device, comprising:
    a memory for non-transitory storage of computer-readable instructions; and
    a processor for running the computer-readable instructions, wherein the computer-readable instructions, when run by the processor, perform the image processing method according to any one of claims 1-11 or the neural network training method according to any one of claims 12-17.
  19. A storage medium that non-transitorily stores computer-readable instructions, wherein the computer-readable instructions, when executed by a computer, are capable of performing the image processing method according to any one of claims 1-11 or the neural network training method according to any one of claims 12-17.
PCT/CN2020/077763 2019-03-19 2020-03-04 Image processing method and device, neural network training method, and storage medium WO2020187029A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910209662.2A CN111724309B (en) 2019-03-19 2019-03-19 Image processing method and device, training method of neural network and storage medium
CN201910209662.2 2019-03-19

Publications (1)

Publication Number Publication Date
WO2020187029A1 true WO2020187029A1 (en) 2020-09-24

Family

ID=72519587

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/077763 WO2020187029A1 (en) 2019-03-19 2020-03-04 Image processing method and device, neural network training method, and storage medium

Country Status (2)

Country Link
CN (1) CN111724309B (en)
WO (1) WO2020187029A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114501012A * 2021-12-31 2022-05-13 Zhejiang Dahua Technology Co., Ltd. Image filtering, coding and decoding method and related equipment

Families Citing this family (1)

Publication number Priority date Publication date Assignee Title
CN113538287B (en) * 2021-07-29 2024-03-29 广州安思创信息技术有限公司 Video enhancement network training method, video enhancement method and related devices

Citations (3)

Publication number Priority date Publication date Assignee Title
CN107767408A * 2017-11-09 2018-03-06 BOE Technology Group Co., Ltd. Image processing method, processing device and processing equipment
WO2018102748A1 * 2016-12-01 2018-06-07 Berkeley Lights, Inc. Automated detection and repositioning of micro-objects in microfluidic devices
CN109360151A * 2018-09-30 2019-02-19 BOE Technology Group Co., Ltd. Image processing method and system, resolution enhancement method, and readable storage medium

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
US10303977B2 (en) * 2016-06-28 2019-05-28 Conduent Business Services, Llc System and method for expanding and training convolutional neural networks for large size input images
CN107730474B * 2017-11-09 2022-02-22 BOE Technology Group Co., Ltd. Image processing method, processing device and processing equipment


Also Published As

Publication number Publication date
CN111724309A (en) 2020-09-29
CN111724309B (en) 2023-07-14

Similar Documents

Publication Publication Date Title
WO2021073493A1 (en) Image processing method and device, neural network training method, image processing method of combined neural network model, construction method of combined neural network model, neural network processor and storage medium
WO2020239026A1 (en) Image processing method and device, method for training neural network, and storage medium
US11461639B2 (en) Image processing method, image processing device, and training method of neural network
WO2020200030A1 (en) Neural network training method, image processing method, image processing device, and storage medium
US11551333B2 (en) Image reconstruction method and device
US10678508B2 (en) Accelerated quantized multiply-and-add operations
WO2019091181A1 (en) Image processing method, processing apparatus and processing device
EP3923233A1 (en) Image denoising method and apparatus
CN111402130B (en) Data processing method and data processing device
WO2021018163A1 (en) Neural network search method and apparatus
CN111914997B (en) Method for training neural network, image processing method and device
CN112446834A (en) Image enhancement method and device
WO2022134971A1 (en) Noise reduction model training method and related apparatus
WO2020187029A1 (en) Image processing method and device, neural network training method, and storage medium
CN113011562A (en) Model training method and device
CN109754357B (en) Image processing method, processing device and processing equipment
CN113096023B (en) Training method, image processing method and device for neural network and storage medium
TW202133032A (en) Image normalization processing method, apparatus and storage medium
CN113076966B (en) Image processing method and device, training method of neural network and storage medium
WO2023029559A1 (en) Data processing method and apparatus
CN113256556A (en) Image selection method and device
WO2022183325A1 (en) Video block processing method and apparatus, neural network training method, and storage medium
US20240135490A1 (en) Image processing method and device, training method of neural network, image processing method based on combined neural network model, constructing method of combined neural network model, neural network processor, and storage medium
WO2023028866A1 (en) Image processing method and apparatus, and vehicle
CN111767979B (en) Training method, image processing method and image processing device for neural network

Legal Events

Code  Description
121   Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 20773582; Country of ref document: EP; Kind code of ref document: A1)
NENP  Non-entry into the national phase (Ref country code: DE)
122   Ep: pct application non-entry in european phase (Ref document number: 20773582; Country of ref document: EP; Kind code of ref document: A1)
32PN  Ep: public notification in the ep bulletin as address of the addressee cannot be established (Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 07.02.2022))
