WO2021179147A1 - Neural network-based image processing method and apparatus - Google Patents

Neural network-based image processing method and apparatus

Info

Publication number
WO2021179147A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
processed
neural network
images
frame
Prior art date
Application number
PCT/CN2020/078484
Other languages
English (en)
Chinese (zh)
Inventor
李蒙
胡慧
陈海
郑成林
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority to PCT/CN2020/078484 priority Critical patent/WO2021179147A1/fr
Priority to CN202080098211.7A priority patent/CN115244569A/zh
Publication of WO2021179147A1 publication Critical patent/WO2021179147A1/fr

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/60Image enhancement or restoration using machine learning, e.g. neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10024Color image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Definitions

  • This application relates to the field of image processing technology, and in particular to a neural network-based image processing method and device.
  • the mobile terminal performs image signal processing (ISP) on the image signal.
  • ISP image signal processing
  • the main function of the ISP is to perform post-processing on the image signal output by the front-end image sensor; relying on the ISP, images obtained under different optical conditions can better restore the details of the scene.
  • the ISP processing flow is shown in Figure 1.
  • the natural scene 101 obtains a Bayer image through a lens 102, then obtains an analog electrical signal 105 through photoelectric conversion 104, and further obtains a digital electrical signal (i.e., a raw image) 107 through denoising and analog-to-digital processing 106, which then enters the digital signal processing chip 100.
  • the steps in the digital signal processing chip 100 are the core steps of ISP processing.
  • the digital signal processing chip 100 generally includes modules such as black level compensation (BLC) 108, lens shading correction 109, bad pixel correction (BPC) 110, demosaic 111, Bayer-domain noise reduction (denoise) 112, auto white balance (AWB) 113, Ygamma 114, auto exposure (AE) 115, auto focus (AF) (not shown in Figure 1), color correction (CC) 116, gamma correction 117, color gamut conversion 118, color denoising/detail enhancement 119, color enhancement (CE) 120, formatter 121, and input/output (I/O) control 122.
  • ISP based on deep learning has achieved certain results in the application of many tasks.
  • the ISP based on deep learning will process the image data through a neural network and then output it.
  • the processing complexity of the neural network is generally very high.
  • although the expected results can be achieved, scenarios that require real-time processing generally suffer from problems such as high energy consumption and long running time.
  • ISP based on neural network needs to be further optimized.
  • This application provides a neural network-based image processing method and device, in order to optimize the neural network-based image signal processing performance.
  • a neural network-based image processing method which uses a first neural network and a second neural network to process an image to be processed, and output the processed image.
  • the image to be processed includes a first component image and a second component image.
  • the image to be processed is input to the first neural network for computation to obtain a first image; the first image is the first component image of the image to be processed after processing by the first neural network.
  • the first image and the image to be processed are vector-spliced to obtain a first to-be-processed image matrix.
  • the first to-be-processed image matrix is input to the second neural network for computation to obtain a second image; the second image is the second component image of the image to be processed after processing by the second neural network. Based on the second image, the processed image is obtained.
  • in this way, the first neural network processes one component image of the image to be processed, and the first image obtained from the first neural network operation serves as an intermediate result.
  • the first image is spliced with the image to be processed, and the spliced result is processed by the second neural network to obtain the second image.
  • the intermediate results can be applied to the processing of the second neural network, reducing the computational complexity of the second neural network and ensuring the quality of image processing.
  • the second component image is the brightness component of the image to be processed.
  • the brightness component is an important component in the image processing process, which occupies a relatively high proportion of network complexity.
  • the first neural network can process the brightness component first.
  • the processing result of the brightness component is input into the second neural network as an intermediate result, and the complexity requirement of the second neural network will be reduced.
  • the first image and the second image are combined to generate the processed image.
  • a third image is obtained at the same time, and the third image is the first component image of the image to be processed after processing by the second neural network; correspondingly, the third image and the second image are combined to generate the processed image.
  • the complexity of the second neural network is lower than the complexity of the first neural network.
  • the computing power required by the first component image is higher than the computing power required by the second component image.
  • the image to be processed may be one or more frames.
  • the image to be processed includes multiple frames of temporally adjacent images.
  • the first image is multiple frames
  • the second image is multiple frames.
  • each frame of the image to be processed corresponds to one frame of the first image and one frame of the second image.
  • the first neural network is used to handle the computation-intensive processing between multiple frames of images
  • the second neural network is used to handle the less computation-intensive processing within each frame of the multi-frame images, and to output the multi-frame processed images
  • the combined computing power of the first neural network and the second neural network is thus spread over multiple frames of images, so that the processing complexity of each frame is reduced compared with the above-mentioned solutions while the quality of the image or video is guaranteed.
  • the first component image is the luminance channel
  • the second component image is the chrominance channel.
  • the first neural network can solve the problem of inter-frame motion between multiple frames of images.
  • the luminance channel and the chrominance channel of the corresponding frame are input into the second neural network, which processes the chromaticity of each frame of image; in this way, the result of the processed luminance channel can be used as a guide, and a second neural network with smaller computing power can solve the intra-frame color problem.
  • the image processing system provided by the present application has lower complexity in image processing and guarantees the quality of the image or video.
  • the computing power required by the color channel is less than the computing power required by the brightness channel.
  • YUV images generally use the 4:2:0 sampling format, that is, the resolution of the chroma channels is half that of the luma channel in each dimension.
  • the computation performed by the second neural network includes the following steps: obtaining a feature map matrix of the first to-be-processed image matrix according to the first to-be-processed image matrix, and vector-splicing the feature map matrix with each frame of the first image to obtain a plurality of second to-be-processed image matrices, where each frame of the second image is obtained from one second to-be-processed image matrix.
  • the vector splicing of the first image and the to-be-processed image may be implemented in the following manner: grouping the multiple frames of temporally adjacent images to obtain multiple groups of sub-group images; each frame of the first image is vector-spliced with one group of sub-group images to generate multiple to-be-processed image sub-matrices.
  • the first image and the sub-group of images for vector stitching correspond to the same frame of image to be processed.
  • the first component image is the brightness component of the image to be processed.
  • the second component image is one or more chrominance components, or one or more color components of the image to be processed.
  • the first component image and the second component image are respectively different color components of the image to be processed.
  • the first neural network and the second neural network constitute an image processing system, and the image processing system is used to process the image to be processed for noise reduction and mosaic effect elimination.
  • the format of the image to be processed may be a red-green-blue (RGB) format, a luma-chroma separation (YUV) format, or a Bayer format.
  • a neural network-based image processing device can be a terminal device, a device in a terminal device (such as a chip, or a chip system, or a circuit), or a device that can be matched with the terminal device.
  • the device may include modules that correspond one-to-one to the methods/operations/steps/actions described in the first aspect.
  • the modules may be hardware circuits, software, or hardware circuits combined with software.
  • the device processes the image to be processed and obtains the processed image.
  • the image to be processed includes a first component image and a second component image.
  • the device may include an arithmetic module and a splicing module.
  • the processing module is used to call the communication module to perform the function of receiving and/or sending.
  • An arithmetic module configured to input a to-be-processed image into a first neural network for calculation to obtain a first image, the first image being a first component image of the to-be-processed image processed by the first neural network;
  • the splicing module is configured to vector-splice the first image and the to-be-processed image to obtain a first to-be-processed image matrix;
  • the arithmetic module is further configured to input the first to-be-processed image matrix into a second neural network for computation to obtain a second image, the second image being the second component image of the image to be processed after processing by the second neural network, and to obtain the processed image based on the second image.
  • when the arithmetic module obtains the processed image based on the second image, the arithmetic module is specifically configured to:
  • combine the first image and the second image to generate the processed image.
  • a third image is obtained at the same time, and the third image is the first component image of the image to be processed after processing by the second neural network; correspondingly, the third image and the second image are combined to generate the processed image.
  • the complexity of the second neural network is lower than the complexity of the first neural network.
  • the computing power required by the first component image is higher than the computing power required by the second component image.
  • the image to be processed may be one or more frames.
  • the image to be processed includes multiple frames of temporally adjacent images.
  • the first image is multiple frames
  • the second image is multiple frames.
  • each frame of the image to be processed corresponds to one frame of the first image and one frame of the second image.
  • in the computation performed by the second neural network, the arithmetic module is configured to: obtain the feature map matrix of the first to-be-processed image matrix according to the first to-be-processed image matrix, and vector-splice the feature map matrix with each frame of the first image to obtain a plurality of second to-be-processed image matrices, where each frame of the second image is obtained from one second to-be-processed image matrix.
  • when performing vector splicing of the first image and the to-be-processed image, the splicing module is configured to: group the multiple frames of temporally adjacent images to obtain multiple groups of sub-group images, and vector-splice each frame of the first image with one group of sub-group images to generate multiple to-be-processed image sub-matrices.
  • the first image and the sub-group of images for vector stitching correspond to the same frame of image to be processed.
  • the first component image is the brightness component of the image to be processed.
  • the second component image is one or more chrominance components, or one or more color components of the image to be processed.
  • the first component image and the second component image are respectively different color components of the image to be processed.
  • the first neural network and the second neural network constitute an image processing system, and the image processing system is used to process the image to be processed for noise reduction and mosaic effect elimination.
  • the format of the image to be processed may be a red-green-blue (RGB) format, a luma-chroma separation (YUV) format, or a Bayer format.
  • an embodiment of the present application provides an image processing device based on a neural network.
  • the device includes a processor, and the processor is used to call a set of programs, instructions, or data to execute the method described in the first aspect or any one of the possible designs of the first aspect.
  • the device may also include a memory for storing programs, instructions or data called by the processor.
  • the memory is coupled with the processor, and when the processor executes the instructions or data stored in the memory, it can implement the method described in the first aspect or any possible design.
  • an embodiment of the present application provides a chip system, which includes a processor and may also include a memory, for implementing the method described in the first aspect or any one of the possible designs of the first aspect.
  • the chip system can be composed of chips, and can also include chips and other discrete devices.
  • the embodiments of the present application also provide a computer-readable storage medium.
  • the computer-readable storage medium stores computer-readable instructions.
  • when the computer-readable instructions are run on a computer, the method described in the first aspect or any one of the possible designs of the first aspect is executed.
  • the embodiments of the present application also provide a computer program product containing instructions which, when run on a computer, cause the computer to execute the method described in the first aspect or any one of the possible designs of the first aspect.
  • FIG. 1 is a schematic diagram of an ISP processing flow in the prior art
  • FIG. 2 is a schematic structural diagram of the system architecture in an embodiment of the application
  • FIG. 3 is a schematic diagram of the principle of a neural network in an embodiment of the application.
  • FIG. 4 is a schematic flowchart of an image processing method based on a neural network in an embodiment of the application
  • FIG. 5a is a schematic diagram of implementation manner 1 of image processing in an embodiment of the application.
  • FIG. 5b is a schematic diagram of implementation manner 2 of image processing in an embodiment of the application.
  • FIG. 6 is the first schematic diagram of an RGrGbB image processing method in an embodiment of the application.
  • FIG. 7 is a second schematic diagram of the RGrGbB image processing method in an embodiment of the application.
  • FIG. 8a is one of the schematic structural diagrams of the first neural network in an embodiment of the application.
  • FIG. 8b is the second schematic diagram of the structure of the first neural network in the embodiment of this application.
  • FIG. 9a is a schematic diagram of part of the processing process of a typical convolutional neural network in an embodiment of the application.
  • FIG. 9b is a schematic diagram of part of the processing process of a multi-branch neural network in an embodiment of the application.
  • FIG. 10a is a schematic structural diagram of a second neural network in an embodiment of the application.
  • FIG. 10b is a partial schematic diagram of the multi-branch operation of the second neural network in an embodiment of the application.
  • FIG. 11 is a partial schematic diagram of a typical neural network operation used in a second neural network in an embodiment of the application;
  • FIG. 12 is a schematic diagram of vector stitching of a first image and a feature map matrix in an embodiment of this application;
  • FIG. 13 is one of the schematic structural diagrams of the image processing device based on neural network in an embodiment of the application.
  • FIG. 14 is the second structural diagram of the image processing device based on neural network in an embodiment of the application.
  • the image processing method and device based on a neural network (NN) provided by the embodiments of this application can be applied to electronic equipment. The electronic equipment may be a mobile device such as a mobile terminal, a mobile station (MS), or user equipment (UE), or a fixed device such as a fixed phone or a desktop computer, or a video monitor.
  • the electronic device is an image acquisition and processing device with image signal acquisition and processing functions, and has an ISP processing function.
  • the electronic device can also optionally have a wireless connection function to provide users with a handheld device with voice and/or data connectivity, or other processing devices connected to a wireless modem.
  • the electronic device can be a mobile phone (also called a "cellular" phone) or a computer with a mobile terminal; it can also be a portable, pocket-sized, handheld, computer-built-in, or vehicle-mounted mobile device, or a wearable device (such as a smart watch or smart bracelet), a tablet computer, a personal computer (PC), a personal digital assistant (PDA), a point of sale (POS) terminal, etc.
  • a terminal device may be used as an example for description.
  • FIG. 2 is a schematic diagram of an optional hardware structure of a terminal device 200 related to an embodiment of the application.
  • the terminal device 200 mainly includes a chipset and peripheral devices.
  • Components such as USB interface, memory, display screen, battery/mains power, earphone/speaker, antenna, sensor, etc. can be understood as peripheral devices.
  • the arithmetic processor, RAM, I/O, display interface, ISP, Sensor hub, baseband and other components in the chipset can form a system-on-a-chip (SOC), which is the main part of the chipset.
  • the components in the SOC can all be integrated into a complete chip, or part of the components in the SOC can be integrated, and the other parts are not integrated.
  • the baseband communication module in the SOC can not be integrated with other parts and become an independent part.
  • the components in the SOC can be connected to each other through a bus or other connecting lines.
  • the PMU, voice codec, RF, and other components usually contain analog circuit parts, so they often remain outside the SOC and are not integrated with it.
  • the PMU is used to connect to the mains or battery to supply power to the SOC, and the mains can be used to charge the battery.
  • the voice codec is used as the sound codec unit to connect with earphones or speakers to realize the conversion between natural analog voice signals and digital voice signals that can be processed by the SOC.
  • the short-range module can include wireless fidelity (WiFi) and Bluetooth, and can also optionally include infrared, near field communication (NFC), frequency modulation (FM) radio, or global positioning system (GPS) modules, etc.
  • the RF is connected with the baseband communication module in the SOC to realize the conversion between the air interface RF signal and the baseband signal, that is, mixing. For mobile phones, receiving is down-conversion, and sending is up-conversion.
  • Both the short-range module and the RF can have one or more antennas for signal transmission or reception.
  • the baseband is used for baseband communication, supporting one or more of a variety of communication modes, and processes wireless communication protocols, including the physical layer (layer 1), medium access control (MAC, layer 2), radio resource control (RRC, layer 3), and other protocol layers. It can support various cellular communication standards, such as long term evolution (LTE) communication or 5G new radio (NR) communication.
  • the Sensor hub is an interface between the SOC and external sensors, and is used to collect and process data from at least one external sensor.
  • the external sensors can be, for example, accelerometers, gyroscopes, control sensors, image sensors, and so on.
  • the arithmetic processor can be a general-purpose processor, such as a central processing unit (CPU), or one or more integrated circuits, such as one or more application specific integrated circuits (ASICs), one or more digital signal processors (DSPs), microprocessors, or one or more field programmable gate arrays (FPGAs), etc.
  • the arithmetic processor can include one or more cores, and can selectively schedule other units.
  • RAM can store some intermediate data during calculation or processing, such as intermediate calculation data of CPU and baseband.
  • ISP is used to process the data collected by the image sensor.
  • I/O is used for the SOC to interact with various external interfaces, such as the universal serial bus (USB) interface for data transmission.
  • the memory can be a chip or a group of chips.
  • the display screen can be a touch screen, which is connected to the bus through a display interface.
  • the display interface can be used for data processing before image display, such as aliasing of multiple layers to be displayed, buffering of display data, or control and adjustment of screen brightness.
  • the terminal device 200 involved in the embodiments of the present application includes an image sensor, which can collect external signals such as light from the outside, and process and convert the external signals into sensor signals, that is, electrical signals.
  • the sensor signal can be a static image signal or a dynamic video image signal.
  • the image sensor may be a camera, for example.
  • the terminal device 200 involved in the embodiment of the present application also includes an image signal processor.
  • the image sensor collects sensor signals and transmits them to the image signal processor.
  • the image signal processor receives the sensor signal and can perform image signal processing on it, in order to obtain an image signal whose sharpness, color, brightness, and other characteristics match those of the human eye.
  • the image signal processor involved in the embodiment of the present application may be one or a group of chips, that is, it may be integrated or independent.
  • the image signal processor included in the terminal device 200 may be an integrated ISP chip integrated in the arithmetic processor.
  • the terminal device 200 involved in the embodiments of the present application has the function of taking photos or recording videos.
  • the neural network-based image processing method provided in the embodiments of the present application mainly focuses on how to perform image signal processing based on the neural network.
  • a neural network is used to process the multi-frame images to be processed.
  • a neural network, also abbreviated NN, is a network structure that imitates the behavioral characteristics of animal neural networks for information processing.
  • the neural network can be composed of neural units.
  • the neural unit can refer to an arithmetic unit that takes the input signals x_s and an intercept of 1 as input.
  • the output of the arithmetic unit can be as shown in formula (1-1):
  • h_{W,b}(x) = f(W^T x) = f( ∑_{s=1}^{n} W_s · x_s + b )    (1-1)
  • where s = 1, 2, ..., n, n is a natural number greater than 1, W_s is the weight of x_s, and b is the bias of the neural unit.
  • f is the activation function of the neural unit, which is used to introduce nonlinear characteristics into the neural network to convert the input signal in the neural unit into an output signal.
  • the output signal of the activation function can be used as the input of the next convolutional layer, and the activation function can be a sigmoid function.
  • a neural network is a network formed by connecting multiple above-mentioned single neural units together, that is, the output of one neural unit can be the input of another neural unit.
  • the input of each neural unit can be connected with the local receptive field of the previous layer to extract the characteristics of the local receptive field.
  • the local receptive field can be a region composed of several neural units.
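  • For illustration only (not part of the patent text), a minimal Python sketch of the neural unit in formula (1-1), assuming a sigmoid activation; all names are hypothetical:

        import numpy as np

        def sigmoid(z):
            return 1.0 / (1.0 + np.exp(-z))

        def neural_unit(x, w, b):
            # formula (1-1): h_{W,b}(x) = f(sum_s W_s * x_s + b)
            return sigmoid(np.dot(w, x) + b)

        x = np.array([0.5, -1.2, 3.0])   # input signals x_s
        w = np.array([0.8, 0.1, -0.4])   # weights W_s
        b = 0.2                          # bias of the neural unit
        print(neural_unit(x, w, b))      # output signal of the unit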
  • the neural network 300 has N processing layers, where N ≥ 3 and N is a natural number.
  • the first layer of the neural network is the input layer 301, which is responsible for receiving input signals.
  • the last layer of the neural network is the output layer 303, which outputs the processing results of the neural network.
  • the layers other than the first and last layers are intermediate layers 304, and these intermediate layers together form the hidden layer 302.
  • each intermediate layer of the hidden layer can receive input signals and output signals, and the hidden layer is responsible for processing the input signal.
  • Each layer represents a logic level of signal processing. Through multiple layers, data signals can be processed by multiple levels of logic.
  • the input signal of the neural network may be a signal in various forms such as a voice signal, a text signal, an image signal, and a temperature signal.
  • the processed image signals may be various sensor signals, such as landscape signals captured by a camera (image sensor), image signals of a community environment captured by a video monitoring device, and face signals acquired by an access control system.
  • the input signals of the neural network also include various other engineering signals that can be processed by a computer, which are not listed one by one here. If the neural network is used for deep learning of the image signal, the image quality can be improved.
  • a deep neural network (DNN) is also known as a multi-layer neural network.
  • the DNN is divided according to the positions of the different layers.
  • the layers inside the DNN can be divided into three categories: input layer, hidden layer, and output layer.
  • the first layer is the input layer
  • the last layer is the output layer
  • the number of layers in the middle are all hidden layers.
  • the layers are fully connected, that is to say, any neuron in the i-th layer must be connected to any neuron in the i+1th layer.
  • simply put, each layer performs the operation y = α(W·x + b), where x is the input vector, y is the output vector, b is the offset vector, W is the weight matrix (also called the coefficients), and α() is the activation function.
  • each layer simply performs this operation on the input vector x to obtain the output vector y. Because a DNN has many layers, the number of coefficients W and offset vectors b is also large.
  • these parameters are defined in the DNN as follows, taking the coefficient W as an example: in a three-layer DNN, the linear coefficient from the fourth neuron in the second layer to the second neuron in the third layer is defined as W^3_{24}. The superscript 3 represents the layer in which the coefficient W is located, and the subscripts correspond to the output index 2 in the third layer and the input index 4 in the second layer.
  • in general, the coefficient from the k-th neuron in the (L-1)-th layer to the j-th neuron in the L-th layer is defined as W^L_{jk}.
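  • As a non-authoritative illustration of the per-layer operation y = α(W·x + b) described above (a sketch, not the patent's networks; the layer sizes and the ReLU activation are arbitrary assumptions):

        import numpy as np

        def relu(z):
            return np.maximum(0.0, z)

        def dnn_forward(x, layers):
            # layers: list of (W, b) pairs; each layer computes y = alpha(W @ x + b)
            for W, b in layers:
                x = relu(W @ x + b)
            return x

        rng = np.random.default_rng(0)
        layers = [(rng.standard_normal((8, 4)), np.zeros(8)),  # input -> hidden
                  (rng.standard_normal((8, 8)), np.zeros(8)),  # hidden -> hidden
                  (rng.standard_normal((2, 8)), np.zeros(2))]  # hidden -> output
        print(dnn_forward(rng.standard_normal(4), layers))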
  • Convolutional neural network (convolutional neuron network, CNN) is a deep neural network with a convolutional structure.
  • the convolutional neural network contains a feature extractor composed of a convolutional layer and a sub-sampling layer.
  • the feature extractor can be regarded as a filter.
  • the convolutional layer refers to the neuron layer that performs convolution processing on the input signal in the convolutional neural network.
  • a neuron can be connected to only part of the neighboring neurons.
  • a convolutional layer usually contains several feature planes, and each feature plane can be composed of some rectangularly arranged neural units. Neural units in the same feature plane share weights, and the shared weights here are the convolution kernels.
  • Sharing weight can be understood as the way of extracting image information has nothing to do with location.
  • the convolution kernel can be initialized in the form of a matrix of random size. In the training process of the convolutional neural network, the convolution kernel can obtain reasonable weights through learning. In addition, the direct benefit of sharing weights is to reduce the connections between the layers of the convolutional neural network, and at the same time reduce the risk of overfitting.
  • the neural network in the embodiment of the present application may be a convolutional neural network, and of course, it may also be another type of neural network, such as a recurrent neural network (recurrent neural network, RNN).
  • the images in the embodiments of the present application may be static images (or referred to as static pictures) or dynamic images (or referred to as dynamic pictures).
  • the images in the present application may be videos or dynamic pictures, or the present application
  • the images in can also be static pictures or photos.
  • static images or dynamic images are collectively referred to as images.
  • This method is executed by an image processing device based on a neural network.
  • the neural network-based image processing device may be any device or apparatus with image processing functions; for example, the method may be executed by the terminal device 200 shown in FIG. 2, by a device related to the terminal device, or by part of the equipment included in the terminal device.
  • multiple neural networks are used for image processing, for example, two neural networks are used to process the image to be processed, and the two neural networks are denoted as the first neural network and the second neural network.
  • the first neural network and the second neural network conform to the above description of the neural network.
  • the image to be processed includes component images of one or more dimensions.
  • the image to be processed includes a first component image and a second component image.
  • the process of performing image processing on the image to be processed includes processing the first component image and the second component image.
  • the neural network-based image processing method provided by the embodiment of the present application is as follows.
  • S401 Input the image to be processed into the first neural network for calculation to obtain a first image.
  • the first image is the first component image of the image to be processed after processing by the first neural network.
  • S402 Perform vector concatenation of the first image and the to-be-processed image to obtain a first to-be-processed image matrix.
  • S403 Input the first image matrix to be processed into the second neural network for calculation to obtain a second image.
  • the second image is the second component image of the image to be processed after processing by the second neural network.
  • the first image, obtained after the first neural network operates on the image to be processed, is the processing result of one component image of the image to be processed and serves as an intermediate result.
  • the first image is spliced with the image to be processed, and the spliced result is processed by the second neural network to obtain the second image.
  • the intermediate results can be applied to the processing of the second neural network, reducing the computational complexity of the second neural network and ensuring the quality of image processing.
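  • For illustration only, a minimal PyTorch sketch of steps S401-S403, assuming net1 and net2 are the trained first and second neural networks and that vector splicing is concatenation along the channel dimension (all names are hypothetical):

        import torch

        def process(frames, net1, net2):
            # frames: (N, C, H, W) tensor holding the image(s) to be processed
            first_image = net1(frames)                         # S401: processed first component image
            stacked = torch.cat([first_image, frames], dim=1)  # S402: first to-be-processed image matrix
            second_image = net2(stacked)                       # S403: processed second component image
            # combine the processed component images into the processed image
            return torch.cat([first_image, second_image], dim=1)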
  • the first component image is the brightness component of the image to be processed.
  • the brightness component is an important component in the image processing process, which occupies a relatively high proportion of network complexity, and the brightness component can be processed first through the first neural network.
  • the processing result of the brightness component is input into the second neural network as an intermediate result, and the complexity requirement of the second neural network will be reduced.
  • the image to be processed includes a first component image and a second component image, and the processed image also includes the first component image and the second component image.
  • the first image is obtained based on the first neural network
  • the second image is obtained by the second neural network
  • the first image and the second image are combined to obtain the processed image.
  • combining the first image and the second image can be understood as combining the processed first component image and the processed second component image, because the first image is the first component image processed by the first neural network and the second image is the second component image processed by the second neural network.
  • combining the first image and the second image, that is, combining the processed first component image and the processed second component image, yields the processed image.
  • the first image matrix to be processed is input to the second neural network for calculation, and when the second image is obtained, the third image is also obtained at the same time.
  • the third image is the first component image of the image to be processed after processing by the second neural network. In this way, the third image and the second image can be combined to generate the processed image.
  • the image to be processed may be one frame or multiple adjacent frames in the time domain.
  • the adjacent multi-frames in the time domain include consecutive multi-frames in the time domain.
  • the adjacent multi-frames in the time domain are referred to as multi-frames in the description below.
  • the processed image is also corresponding to multiple frames.
  • the first component image and the second component image are each multiple frames
  • the first image obtained after processing by the first neural network is multiple frames
  • the second image obtained by the second neural network is also multiple frames
  • each frame of the to-be-processed image corresponds to one frame of the first image and one frame of the second image.
  • Each frame of processed image corresponds to one frame of first image and one frame of second image, or each frame of processed image corresponds to one frame of first image and one frame of third image.
  • the solution of the embodiment of the present application is as follows. Input the multi-frame images to be processed into the first neural network for processing to obtain the first image.
  • the first image is the multi-frame first component image of the multi-frame image to be processed after processing by the first neural network. After each frame of the image to be processed passes through the first neural network, one frame of the first component image is obtained.
  • the first image and the image to be processed are vector stitched to obtain a first image matrix to be processed, and correspondingly, there are multiple first image matrices to be processed. Specifically, each frame of the first image and the corresponding frame of the image to be processed are vector stitched to obtain the first image matrix to be processed.
  • the first image matrix to be processed is input into the second neural network for calculation to obtain a second image.
  • the second image is a multi-frame second component image processed by the second neural network of the multi-frame image to be processed. A processed image is obtained based on the second image.
  • two optional manners are as follows.
  • multiple frames of first component images processed by the first neural network and multiple frames of second component images processed by the second neural network are combined to generate a multi-frame processed image.
  • a third image is obtained.
  • the third image is a multi-frame first component image processed by the second neural network of the multi-frame image to be processed.
  • the third image and the second image are combined to generate the processed image.
  • the first component image may be the brightness component or the brightness channel of the image to be processed.
  • the second component image is one or more chrominance components of the image to be processed, or the second component image is one or more color components of the image to be processed, or the second component image is one or more color channels of the image to be processed Or chroma channel.
  • alternatively, the first component image is one or more chrominance components of the image to be processed
  • the second component image is one or more other chrominance components of the image to be processed
  • that is, the first component image and the second component image are different chrominance components of the image to be processed.
  • the chrominance component may also be referred to as a chrominance channel or a color component or a color channel.
  • the format of the image to be processed may be a red-green-blue (RGB) format, a luma-chroma separation (YUV) format, or a Bayer format; this application does not limit the format.
  • the format of the image to be processed is RGB
  • the first component image may be a G channel
  • the second component image is the R and B channels.
  • the first image matrix to be processed is input to the second neural network for calculation to obtain a second image.
  • the second image is multiple frames.
  • the first to-be-processed image matrix is formed by vector stitching of the first image and the to-be-processed image.
  • the first image is multiple frames.
  • the first image matrix to be processed is multiple matrices or includes multiple image sub-matrices to be processed.
  • the feature map matrix of the first to-be-processed image matrix is obtained according to the first to-be-processed image matrix, and the feature map matrix is vector-spliced with each frame of the first image to obtain multiple second to-be-processed image matrices, where each frame of the second image is obtained from one second to-be-processed image matrix.
  • vector stitching is performed on the first image and the image to be processed to obtain the first image matrix to be processed.
  • the first image is multiple frames
  • the to-be-processed image is multiple frames.
  • Multiple frames of to-be-processed images can be grouped to obtain multiple sets of sub-groups of images, and the first image of each frame and a set of sub-groups of images can be vector stitched to obtain multiple sub-matrices of to-be-processed images.
  • the first to-be-processed image matrix includes the plurality of to-be-processed image sub-matrices, or in other words, the plurality of to-be-processed image sub-matrices form the first to-be-processed image matrix.
  • the first image for vector stitching and a group of sub-group images correspond to the same frame of image to be processed.
  • the number of multi-frame images to be processed is 4 frames, and 4 frames of to-be-processed images are input to the first neural network for calculation to obtain 4 first images.
  • each frame of the to-be-processed image corresponds to a first image, for example, the first frame of the to-be-processed image corresponds to the first first image; the second frame of the to-be-processed image corresponds to the second first image.
  • the first group of sub-group images corresponds to the first frame of images to be processed
  • the second group of sub-group images corresponds to the second frame of images to be processed.
  • the first group of sub-group images and the first first image are vector stitched.
  • the second group of sub-group images and the second first image are vector stitched.
  • the vector stitching of multiple frames of the first image and multiple groups of sub-group images can be regarded as the internal processing process of the second neural network.
  • the input to the second neural network is an overall matrix, that is, the first image matrix to be processed.
  • the process of splicing into the first image matrix to be processed can be decomposed into the splicing of the aforementioned multiple frames of first images and multiple groups of sub-group images.
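  • For illustration only, a sketch of the grouping and per-frame vector splicing described above, assuming 4 frames, channel-wise concatenation, and hypothetical tensor shapes:

        import torch

        def build_first_matrix(first_images, subgroups):
            # first_images: list of 4 tensors (1, H, W), one first image per frame
            # subgroups: list of 4 tensors (C, H, W), one sub-group image per frame
            submatrices = []
            for g, f in zip(first_images, subgroups):
                # each frame's first image is spliced with the sub-group image
                # that corresponds to the same frame of the image to be processed
                submatrices.append(torch.cat([g, f], dim=0))
            # the to-be-processed image sub-matrices together form the
            # first to-be-processed image matrix fed to the second neural network
            return torch.stack(submatrices)   # (4, 1+C, H, W)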
  • the first neural network and the second neural network can be combined into an image processing system, and the image processing system is used to process the image to be processed to improve the quality of the image or video.
  • the processing process can include noise reduction, mosaic effect elimination and other processing.
  • the complexity of the first neural network is higher than the complexity of the second neural network.
  • multiple frames of images are often synthesized into one frame of output through a neural network to improve image or video quality.
  • such a neural network requires high complexity, and in a video scenario a high processing speed is required.
  • terminal video processing needs to achieve a processing speed of 30 frames/s for a video with a resolution of 8K, that is, a frame rate of 30.
  • when a neural network is used to synthesize multiple frames of images into one output frame, it faces problems of computational complexity and computing-resource consumption, and a large time delay is incurred. Blindly reducing the complexity of the neural network and using a lower-complexity network will degrade the quality of the image or video.
  • the first neural network is used to handle the computation-intensive processing between multiple frames of images
  • the second neural network is used to handle the less computation-intensive processing within each frame of the multi-frame images, and to output the multi-frame processed images
  • the combined computing power of the first neural network and the second neural network is allocated across multiple frames of images, so that the processing complexity of each frame is reduced compared with the above-mentioned solution while the quality of the image or video is guaranteed.
  • the first component image is the luminance channel
  • the second component image is the chrominance channel.
  • the first neural network can solve the problem of inter-frame motion between multiple frames of images
  • the second neural network processes the chrominance of each frame of image. In this way, through the joint processing of the two neural networks, the image processing system provided by the present application has lower complexity in image processing while ensuring the quality of the image or video, which improves the applicability of deep learning technology in the field of image signal processing.
  • the first neural network and the second neural network are convolutional neural networks as an example.
  • the image to be processed is 4 frames, and the processed image is 4 frames.
  • the format of the image to be processed is a Bayer format image, in particular the image format is an RGrGbB format, and one frame of an RGrGbB format image includes 4 channels (R, Gr, Gb, B).
  • the image processing system includes a first neural network and a second neural network.
  • the 16 channels include (R1, Gr1, Gb1, B1, R2, Gr2, Gb2, B2, R3, Gr3, Gb3, B3, R4, Gr4, Gb4, B4).
  • the first component image is a Gr channel
  • 4 consecutive Gr channel images are obtained.
  • the four consecutive Gr channel images include a first frame of Gr channel image Gr1, a second frame of Gr channel image Gr2, a third frame of Gr channel image Gr3, and a fourth frame of Gr channel image Gr4.
  • the second component image is the R, Gb, and B channels, and 4 consecutive RGbB images (R1, Gb1, B1, R2, Gb2, B2, R3, Gb3, B3, R4, Gb4, B4) are obtained.
  • 4 frames of continuous RGbB channel images include the first frame of RGbB channel images R1, Gb1, B1, the second frame of RGbB channel images R2, Gb2, B2, the third frame of RGbB channel images R3, Gb3, B3, and the fourth frame of RGbB channel images R4, Gb4, B4.
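  • For illustration only, a numpy sketch of packing 4 consecutive RGrGbB (RGGB-pattern) Bayer frames into the 16 input channels listed above; the RGGB phase and the tensor shapes are assumptions:

        import numpy as np

        def pack_rggb(frames):
            # frames: (4, H, W) raw Bayer mosaics with an RGGB pattern
            chans = []
            for raw in frames:
                chans += [raw[0::2, 0::2],   # R
                          raw[0::2, 1::2],   # Gr
                          raw[1::2, 0::2],   # Gb
                          raw[1::2, 1::2]]   # B
            # -> (16, H/2, W/2): (R1, Gr1, Gb1, B1, ..., R4, Gr4, Gb4, B4)
            return np.stack(chans)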
  • the 16 channels include (R1, Gr1, Gb1, B1, R2, Gr2, Gb2, B2, R3, Gr3, Gb3, B3, R4, Gr4, Gb4, B4).
  • the first component image is a Gr channel
  • 4 consecutive Gr channel images are obtained.
  • the four consecutive Gr channel images include a first frame of Gr channel image Gr1, a second frame of Gr channel image Gr2, a third frame of Gr channel image Gr3, and a fourth frame of Gr channel image Gr4.
  • 4 frames of continuous Gr channel images and 4 frames of continuous RGrGbB images to be processed are vector stitched, and the obtained first to be processed image matrix is input into the second neural network for processing, and 4 continuous frames of processed images are obtained.
  • 4 consecutive frames of processed images can also be regarded as 4 consecutive frames of second images and 4 consecutive frames of third images.
  • the third image is the Gr channel, and the 4 consecutive third images are (Gr1, Gr2, Gr3, Gr4).
  • the second component image is the RGbB channels, and 4 consecutive frames of RGbB images (R1, Gb1, B1, R2, Gb2, B2, R3, Gb3, B3, R4, Gb4, B4) are obtained.
  • the four consecutive frames of RGbB channel images include the first frame of RGbB channel images R1, Gb1, B1, the second frame of RGbB channel images R2, Gb2, B2, the third frame of RGbB channel images R3, Gb3, B3, and the fourth frame of RGbB channel images R4, Gb4, B4.
  • the architecture of the first neural network is shown in Figs. 8a and 8b. Because the drawing of the first neural network is too large, it is divided into two parts, shown in Fig. 8a and Fig. 8b respectively; Figures 8a and 8b together form the architecture of the first neural network. The output of the add operation in Figure 8a connects to the first layer in Figure 8b.
  • the convolutional layer is represented by a rectangular box.
  • Conv2d represents the 2-dimensional convolution
  • bias represents the bias term
  • 1x1/3x3 represents the size of the convolution kernel
  • Stride represents the step size
  • _32_16 represents the number of input and output feature maps
  • 32 represents the number of feature maps input to the layer is 32
  • 16 means that the number of feature maps of the output layer is 16.
  • Split represents the split layer, which splits the feature map in the channel dimension.
  • Split 2 means to split the image in the feature map dimension. For example, an image input with 32 feature maps will become two images with 16 feature maps after the above operation.
  • concat represents the skip-connection (concatenation) layer, which merges images in the feature map dimension; for example, two images with 16 feature maps each are merged into one image with 32 feature maps.
  • add represents a matrix addition operation.
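  • For illustration only, a PyTorch sketch of the operations named above (Conv2d with bias, split, concat, add); the tensor sizes are arbitrary assumptions:

        import torch
        import torch.nn as nn

        conv = nn.Conv2d(32, 16, kernel_size=3, stride=1, padding=1, bias=True)  # "_32_16"

        x = torch.randn(1, 32, 64, 64)       # 32 input feature maps
        a, b = torch.split(x, 16, dim=1)     # split 2: 32 feature maps -> two groups of 16
        y = torch.cat([a, b], dim=1)         # concat: merge back into 32 feature maps
        z = a + b                            # add: element-wise matrix addition
        out = conv(x)                        # 3x3 convolution, stride 1, 32 -> 16 maps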
  • the convolutional layers of the first neural network shown in Figure 8a and Figure 8b adopt multi-branch operations, which can resolve the interference of motion information between the brightness channels of multiple frames; that is, the motion interference between the brightness channels of multiple frames of images is solved through a complex convolutional neural network.
  • when the neural network system in the embodiment of the present application is a network for noise reduction, the above-mentioned multi-branch operation can be used to obtain the multi-frame luminance-channel noise-reduction results, and can ensure that the denoised results do not suffer from problems such as motion blur and motion smearing.
  • the convolutional layers of the first neural network may also adopt a group convolution operation, where group convolution is a special type of convolutional layer.
  • assume the number of input channels is N and the number of groups of the group convolution is M.
  • the group convolutional layer first divides the N channels into M groups, each group corresponding to N/M channels; the convolution of each group is performed independently, and after completion the output feature maps are vector-concatenated (concat) together as the output channels of this layer.
  • the group convolution operation can achieve the same or a similar technical effect as the multi-branch mode.
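  • A minimal PyTorch sketch of such a group convolution, assuming N = 32 input channels and M = 4 groups (values chosen only for illustration):

        import torch
        import torch.nn as nn

        N, M = 32, 4  # N channels divided into M groups of N/M = 8 channels each
        gconv = nn.Conv2d(N, N, kernel_size=3, padding=1, groups=M)

        x = torch.randn(1, N, 64, 64)
        y = gconv(x)  # each group is convolved independently; the per-group
                      # outputs are concatenated as the output channels of the layer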
  • if the multi-branch structure is not used, a typical convolutional neural network can also be used. In that case, what differs from Figure 8b is the operation after the concat result of the last skip-connection layer, before the 4 frames of continuous Gr channel images are obtained. For a better comparison, the multi-branch operation in Figure 8b is shown in Figure 9b.
  • the 32-layer feature maps of the same jump-connected layer result are used as the input of the classic neural network and the multi-branch neural network, respectively. In the classic neural network, multiple convolutional layers are shared.
  • the 32-layer feature map is operated by 4 convolutional layers, and finally 4 layers of feature maps (4 frames of Gr channel images) are output.
  • the multi-branch neural network adopts a 4-branch method. Each branch independently obtains the feature map output of one channel through a 4-layer convolution operation as the result of the luminance channel of one frame, and the four branches respectively obtain 4 luminance channel results.
  • the neural network that uses a mixture of classic convolutional layers and multi-branch convolutional layers can not only solve the problem of image noise reduction, but also ensure that there are no problems such as motion blur and motion smearing when multi-frame Gr channels are output at the same time.
  • the architecture of the second neural network is described below. Exemplarily, the architecture in the second neural network is shown in Fig. 10a.
  • _20_16 indicates that the number of feature maps input to this layer is 20, and the number of feature maps output to this layer is 16.
  • the convolutional layer of the second neural network shown in Fig. 10a adopts a multi-branch operation. If a multi-branch neural network is not used, a typical convolutional neural network can also be used. For better comparison, the multi-branch operation part in Figure 10a is shown in Figure 10b. The operating part of a typical convolutional neural network is shown in Figure 11. Figure 10b and Figure 11 also display the output Gr channel image and RGbB channel image.
  • the leftmost chain jump layer (concat) in FIG. 10b and the leftmost chain jump layer (concat) in FIG. 11 output the same results, and the operations after the concat are different.
  • Each _17_3 in FIG. 11 indicates that the number of feature maps input to this layer is 17, and the number of feature maps output to this layer is 3.
  • _17_12 in Figure 10b indicates that the number of feature maps input to this layer is 17, and the number of feature maps output to this layer is 12.
  • in Figure 10b and Figure 11, the result of the skip-connection layer is the same.
  • the 17-layer feature maps are used as the input of the classic neural network and the multi-branch neural network, respectively. In the classic neural network, multiple convolutional layers are shared.
  • the 17-layer feature map is operated by one convolutional layer, and finally 12-layer feature map is output (4 frames of R ⁇ Gb ⁇ B channel images).
  • the multi-branch neural network adopts a 4-branch method. By means of a skip connection, each branch links the Gr channel of the first neural network's result for the corresponding frame; after one convolutional layer, a 3-channel feature map output is obtained as the result of one frame of color-channel (R\Gb\B) image, and the four branches obtain the 4 frames of color results respectively. Using the multi-branch convolutional neural network, and using the skip-connection layer to link the corresponding denoised, clean luminance channel, can solve the problem of image noise reduction with a very low-complexity network, and can also ensure that the simultaneous output of multi-frame R\Gb\B channels does not suffer from problems such as motion blur and motion smearing.
  • 4 consecutive Gr channel images (Gr1, Gr2, Gr3, Gr4) are obtained.
  • the 4 groups of sub-group images are: (R1, Gr1, Gb1, B1), (R2, Gr2, Gb2, B2), (R3, Gr3, Gb3, B3), (R4, Gr4, Gb4, B4) .
  • the first image and the sub-group of images to be vector stitched correspond to the same frame of image to be processed.
  • the first to-be-processed image matrix is obtained after vector splicing.
  • the feature map matrix of the first image matrix to be processed is obtained according to the first image matrix to be processed.
  • the feature map matrix is respectively vector-spliced with the first image of each frame to obtain multiple second to-be-processed image matrices.
  • the first image is obtained by the first neural network.
  • the first image of each frame is input to the second neural network, and vector stitching is performed with the feature map matrix.
  • the feature map matrix of the first image matrix to be processed corresponds to the result of the concat of the chain skipping layer on the leftmost side in FIG. 12.
  • obtaining multiple second to-be-processed image matrices is taken as an example.
  • the neural network model needs to be trained before the first neural network and the second neural network are used.
  • the training data can include training images and ground truth images.
  • the training image includes a first component image and a second component image.
  • the output image is compared with the true-value image of the first component image until the network converges, at which point training of the first neural network model is complete.
  • the so-called network convergence may mean, for example, that the difference between the output image and the first true-value image is less than a set first threshold.
  • next, the parameters obtained by training the first neural network are fixed, the collected training images are processed to obtain an output image, and the true-value image of the second component image is used for supervision.
  • the output image is compared with the true-value image of the second component image until the network converges, at which point training of the second neural network model is complete.
  • here, network convergence may mean that the difference between the output image and the true-value image of the second component image is less than a set second threshold; a schematic two-stage training loop follows this list item.
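The two-stage procedure just described can be written as the following schematic training loop. The convergence tests mirror the threshold criteria above; the L1 loss, the Adam optimizer, and the loader yielding (input, luminance ground truth, color ground truth) triples are assumptions, not taken from the source.

```python
import torch
import torch.nn as nn

def train_two_stage(net1, net2, loader, thr1=1e-3, thr2=1e-3, epochs=100):
    """Stage 1: train the first (inter-frame luminance) network against the
    luminance ground truth; stage 2: fix its parameters and train the second
    (intra-frame color) network against the color ground truth."""
    loss_fn = nn.L1Loss()

    opt1 = torch.optim.Adam(net1.parameters())
    for _ in range(epochs):                       # stage 1
        last = float("inf")
        for x, gt_luma, _ in loader:
            loss = loss_fn(net1(x), gt_luma)
            opt1.zero_grad()
            loss.backward()
            opt1.step()
            last = loss.item()
        if last < thr1:        # "network converges": below the first threshold
            break

    for p in net1.parameters():                   # fix stage-1 parameters
        p.requires_grad_(False)

    opt2 = torch.optim.Adam(net2.parameters())
    for _ in range(epochs):                       # stage 2
        last = float("inf")
        for x, _, gt_color in loader:
            luma = net1(x)
            out = net2(torch.cat([x, luma], dim=1))  # concat, as in the method
            loss = loss_fn(out, gt_color)
            opt2.zero_grad()
            loss.backward()
            opt2.step()
            last = loss.item()
        if last < thr2:        # below the second threshold
            break
```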
  • the input training image is four adjacent or continuous frames in the time domain.
  • the four consecutive training images are R1G1B1, R2G2B2, R3G3B3, and R4G4B4.
  • the first component image is a luminance channel
  • the second component image is a color channel.
  • the training process of the first neural network (also called the inter-frame luminance network) and the second neural network (also called the intra-frame color network) is as follows: construct two true-value images, one for the luminance channel and one for the color channel.
  • the models of the two networks can be tested.
  • during testing, there are two ways to output test results.
  • in the first way, four frames of images R1G1B1, R2G2B2, R3G3B3, R4G4B4 are used as input, and G′ is output through the inter-frame luminance network.
  • in the second way, the four frames of images R1G1B1, R2G2B2, R3G3B3, R4G4B4 are used as input, G′1G′2G′3G′4 are output through the inter-frame luminance network, and then R′1G′1B′1, R′2G′2B′2, R′3G′3B′3, R′4G′4B′4 are output through the intra-frame color network; both modes are sketched below.
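The two test modes can be expressed as two short functions. The tensor shapes and the exact concatenation between the two networks are assumptions carried over from the sketches above.

```python
import torch

def test_mode_1(luma_net, frames):
    """Mode 1: four RGB frames in, the luminance result G' out through the
    inter-frame luminance network only."""
    return luma_net(frames)

def test_mode_2(luma_net, color_net, frames):
    """Mode 2: the luminance network first outputs G'_1..G'_4, and the
    intra-frame color network then outputs R'_i G'_i B'_i for each frame."""
    luma = luma_net(frames)                        # G'_1 .. G'_4
    return color_net(torch.cat([frames, luma], dim=1))
```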
  • an image processing system is formed by a first neural network and a second neural network, and the image processing system is used to process multiple frames of to-be-processed images, and output multiple frames of processed images.
  • the complexity of the second neural network is lower than the complexity of the first neural network.
  • compared with schemes in some technologies that process multiple frames of images into one frame through a basic network, the amount of calculation the image processing system spends on each frame of the image to be processed is reduced to a certain extent. In turn, the image processing delay can be reduced while the quality of the image or video is guaranteed.
  • the computing power of the two neural networks for processing multiple frames of images to be processed will be illustrated below with examples.
  • the processed image output by the first neural network and the second neural network is 4 frames.
  • a frame is output after basic network processing.
  • the first neural network is shown in Figure 8a and Figure 8b, and the second neural network is shown in Figure 10a.
  • the calculation amount of the first neural network is about the same as that of the basic network, which is about 12000 MAC.
  • the calculation process of the network complexity of the basic network is as follows:
  • the calculation amount of the second neural network is assumed to be about 6000 MAC.
  • the multi-frame-input, multi-frame-output scheme performed by the first neural network and the second neural network can reduce the amount of calculation, thereby reducing the image processing delay, and can meet the image-processing delay requirements of video scenarios.
  • the network computing power budget of a video with a resolution of 8 thousand (K) pixels and a frame rate of 30 frames per second is about 3000 MAC.
  • if the embodiment of this application outputs 8 frames, the per-frame calculation amount of the image processing system can basically meet the network computing power requirement of 8K 30 fps video; a back-of-the-envelope check follows this list item.
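As a back-of-the-envelope check of the figures quoted above (the MAC numbers are as stated in this embodiment; the way the 8-frame variant amortizes the second network is an assumption, not stated in the source):

```python
first_net  = 12000   # first neural network, about the same as the basic network
second_net = 6000    # second neural network (assumed figure from above)
budget     = 3000    # stated computing power budget for 8K 30 fps video

# 4-frame output: total cost amortized over the 4 output frames.
print((first_net + second_net) / 4)        # 4500.0 MAC per frame

# Hypothetical 8-frame variant: if the second network runs once per 4 frames,
# it runs twice for 8 frames (an assumption, not stated in the source).
print((first_net + 2 * second_net) / 8)    # 3000.0 MAC per frame, within budget
```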
  • the neural network-based image processing device may include a hardware structure and/or a software module, and implement the above functions in the form of a hardware structure, a software module, or a hardware structure plus a software module.
  • whether a certain one of the above functions is executed by a hardware structure, a software module, or a hardware structure plus a software module depends on the specific application and the design constraints of the technical solution.
  • an embodiment of the present application also provides a neural network-based image processing device 1300.
  • the neural network-based image processing device 1300 may be a mobile terminal or any device with image processing functions.
  • the neural network-based image processing device 1300 may include modules in one-to-one correspondence with the methods/operations/steps/actions in the foregoing method embodiments.
  • the modules may be implemented by hardware circuits, by software, or by hardware circuits combined with software.
  • the neural network-based image processing device 1300 may include an arithmetic module 1301 and a splicing module 1302.
  • the calculation module 1301 is configured to input the image to be processed into the first neural network for calculation to obtain a first image, the first image being the first component image of the image to be processed after processing by the first neural network;
  • the splicing module 1302 is configured to vector-concatenate (concatenate) the first image and the image to be processed to obtain the first to-be-processed image matrix;
  • the calculation module 1301 is further configured to input the first to-be-processed image matrix into the second neural network for calculation to obtain a second image, the second image being the second component image of the image to be processed after processing by the second neural network; based on the second image, the processed image is obtained.
  • the calculation module 1301 and the splicing module 1302 can also be used to perform other corresponding steps or operations in the above method embodiments, which are not repeated here; a sketch of how the two modules cooperate follows this list item.
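A schematic of how the two modules could cooperate is shown below; beyond the module numbers 1301 and 1302 taken from the text, the class and method names are hypothetical.

```python
import torch

class ImageProcessingDevice1300:
    """Sketch of device 1300: the calculation module 1301 runs the two
    networks; the splicing module 1302 performs the vector concatenation."""
    def __init__(self, net1, net2):
        self.net1, self.net2 = net1, net2          # calculation module 1301

    def splice(self, first_image, to_process):     # splicing module 1302
        return torch.cat([first_image, to_process], dim=1)

    def process(self, to_process):
        first_image = self.net1(to_process)            # first component image
        matrix = self.splice(first_image, to_process)  # first to-be-processed matrix
        second_image = self.net2(matrix)               # second component image
        return first_image, second_image               # -> processed image
```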
  • the division of modules in the embodiments of this application is illustrative and is only a division by logical function; in actual implementation, there may be other division methods.
  • the functional modules in the various embodiments of this application may be integrated into one processing module, may exist alone physically, or two or more modules may be integrated into one module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or software functional modules.
  • an embodiment of the present application also provides an image processing device 1400 based on a neural network.
  • the neural network-based image processing device 1400 includes a processor 1401.
  • the processor 1401 is used to call a group of programs, so that the foregoing method embodiments are executed.
  • the neural network image processing device 1400 further includes a memory 1402, and the memory 1402 is used to store program instructions and/or data executed by the processor 1401.
  • the memory 1402 is coupled with the processor 1401.
  • the coupling in the embodiments of the present application is an indirect coupling or communication connection between devices, units or modules, and may be in electrical, mechanical or other forms, and is used for information exchange between devices, units or modules.
  • the processor 1401 may operate in cooperation with the memory 1402.
  • the processor 1401 may execute program instructions stored in the memory 1402.
  • the memory 1402 may be included in the processor 1401.
  • the neural network-based image processing device 1400 may be a chip system.
  • the chip system may be composed of chips, or may include chips and other discrete devices.
  • the processor 1401 is configured to input the image to be processed into the first neural network for calculation to obtain a first image, the first image being the first component image of the image to be processed after processing by the first neural network;
  • the first image and the image to be processed are vector-concatenated (concatenate) to obtain a first to-be-processed image matrix;
  • the first to-be-processed image matrix is input into the second neural network for calculation to obtain a second image, the second image being the second component image of the image to be processed after processing by the second neural network; based on the second image, the processed image is obtained.
  • the processor 1401 may also be used to perform other corresponding steps or operations in the foregoing method embodiments, which will not be repeated here.
  • the processor 1401 may be a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of this application.
  • the general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments of this application may be directly embodied as being executed by a hardware processor, or executed by a combination of hardware and software modules in the processor.
  • the memory 1402 may be a non-volatile memory, such as a hard disk drive (HDD) or a solid-state drive (SSD), or a volatile memory, such as random-access memory (RAM).
  • the memory is any other medium that can be used to carry or store desired program codes in the form of instructions or data structures and that can be accessed by a computer, but is not limited to this.
  • the memory in the embodiments of the present application may also be a circuit or any other device capable of realizing a storage function for storing program instructions and/or data.
  • An embodiment of the present application also provides a chip including a processor, which is used to support the neural network-based image processing device to implement the functions involved in the foregoing method embodiments.
  • the chip is connected to a memory or the chip includes a memory, and the memory is used to store the program instructions and data necessary for the image processing device.
  • the embodiment of the present application provides a computer-readable storage medium that stores a computer program, and the computer program includes instructions for executing the foregoing method embodiments.
  • the embodiments of the present application provide a computer program product containing instructions, which when run on a computer, cause the computer to execute the foregoing method embodiments.
  • this application can be provided as methods, systems, or computer program products. Therefore, this application may adopt the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware. Moreover, this application may adopt the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
  • these computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing equipment to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device.
  • the instruction device implements the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
  • these computer program instructions may also be loaded onto a computer or other programmable data processing equipment, so that a series of operating steps are executed on the computer or other programmable equipment to produce computer-implemented processing.
  • the instructions executed on the computer or other programmable equipment thus provide steps for implementing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

A neural network-based image processing method and apparatus, used to reduce image processing delay while guaranteeing image quality. The method comprises: inputting an image to be processed into a first neural network for an operation to obtain a first image, the image to be processed comprising a first component image and a second component image, and the first image being the first component image of the image to be processed after processing by the first neural network; performing vector concatenation on the first image and the image to be processed to obtain a first to-be-processed image matrix; inputting the first to-be-processed image matrix into a second neural network for an operation to obtain a second image, the second image being the second component image of the image to be processed after processing by the second neural network; and obtaining a processed image on the basis of the second image.
PCT/CN2020/078484 2020-03-09 2020-03-09 Procédé et appareil de traitement d'image basés sur un réseau neuronal WO2021179147A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2020/078484 WO2021179147A1 (fr) 2020-03-09 2020-03-09 Procédé et appareil de traitement d'image basés sur un réseau neuronal
CN202080098211.7A CN115244569A (zh) 2020-03-09 2020-03-09 一种基于神经网络的图像处理方法及装置

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/078484 WO2021179147A1 (fr) 2020-03-09 2020-03-09 Procédé et appareil de traitement d'image basés sur un réseau neuronal

Publications (1)

Publication Number Publication Date
WO2021179147A1 true WO2021179147A1 (fr) 2021-09-16

Family

ID=77671071

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/078484 WO2021179147A1 (fr) 2020-03-09 2020-03-09 Procédé et appareil de traitement d'image basés sur un réseau neuronal

Country Status (2)

Country Link
CN (1) CN115244569A (fr)
WO (1) WO2021179147A1 (fr)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3054279A1 (fr) * 2015-02-06 2016-08-10 St. Anna Kinderkrebsforschung e.V. Procédés de classification et de visualisation de populations cellulaires sur un niveau de cellule unique sur la base d'images de microscopie
CN109102475A (zh) * 2018-08-13 2018-12-28 北京飞搜科技有限公司 一种图像去雨方法及装置
CN110415187A (zh) * 2019-07-04 2019-11-05 深圳市华星光电技术有限公司 图像处理方法及图像处理系统

Also Published As

Publication number Publication date
CN115244569A (zh) 2022-10-25

Similar Documents

Publication Publication Date Title
US11430209B2 (en) Image signal processing method, apparatus, and device
EP2677732B1 (fr) Procédé, appareil et produit programme d'ordinateur pour capturer un contenu vidéo
US10136110B2 (en) Low-light image quality enhancement method for image processing device and method of operating image processing system performing the method
CN107431770A (zh) 自适应线性亮度域视频流水线架构
CN113962884B (zh) Hdr视频获取方法、装置、电子设备以及存储介质
EP2791898A2 (fr) Procédé, appareil et produit programme d'ordinateur pour capturer des images
CN111741303B (zh) 深度视频处理方法、装置、存储介质与电子设备
US20190114750A1 (en) Color Correction Integrations for Global Tone Mapping
US20190281230A1 (en) Processor, image processing device including same, and method for image processing
CN112954251B (zh) 视频处理方法、视频处理装置、存储介质与电子设备
US11825179B2 (en) Auto exposure for spherical images
CN113850367A (zh) 网络模型的训练方法、图像处理方法及其相关设备
WO2024027287A9 (fr) Système et procédé de traitement d'image, et support lisible par ordinateur et dispositif électronique associés
WO2023010755A1 (fr) Procédé et appareil de conversion vidéo hdr, dispositif, et support de stockage informatique
WO2021179147A1 (fr) Procédé et appareil de traitement d'image basés sur un réseau neuronal
KR20130018899A (ko) 단일 파이프라인 스테레오 이미지 캡처
WO2021196050A1 (fr) Procédé et appareil de traitement d'image sur la base d'un réseau neuronal
CN102509320B (zh) 一种基于电子终端的图片处理方法和装置
WO2021179142A1 (fr) Procédé de traitement d'image et appareil associé
US11941789B2 (en) Tone mapping and tone control integrations for image processing
CN114363693B (zh) 画质调整方法及装置
CN109688333B (zh) 彩色图像获取方法、装置、设备以及存储介质
WO2022218080A1 (fr) Appareil de traitement de signal d'image prépositif et produit associé
KR20240047283A (ko) 인공지능 기반의 영상 처리 방법 및 이를 지원하는 전자 장치
CN115801987A (zh) 一种视频插帧方法及装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20924725

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20924725

Country of ref document: EP

Kind code of ref document: A1