WO2021114184A1 - Training method for neural network model, image processing method, and apparatus thereof - Google Patents

Training method for neural network model, image processing method, and apparatus thereof

Info

Publication number
WO2021114184A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
color
neural network
training
color adjustment
Prior art date
Application number
PCT/CN2019/124911
Other languages
English (en)
French (fr)
Inventor
李蒙
胡慧
陈海
郑成林
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Priority to CN201980102164.6A (CN114730456A)
Priority to PCT/CN2019/124911
Publication of WO2021114184A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00: Image enhancement or restoration
    • G06T 5/60: Image enhancement or restoration using machine learning, e.g. neural networks
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 1/00: Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
    • H04N 1/46: Colour picture communication systems
    • H04N 1/56: Processing of colour picture signals
    • H04N 1/60: Colour correction or control
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/20: Special algorithmic details
    • G06T 2207/20081: Training; Learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/20: Special algorithmic details
    • G06T 2207/20084: Artificial neural networks [ANN]

Definitions

  • This application relates to the field of image processing, and more specifically, to a training method of a neural network model, an image processing method and a device thereof.
  • The image signal processing (ISP) flow is shown in Figure 1.
  • Light from the natural scene 101 passes through the lens 102 to obtain a Bayer image; photoelectric conversion 104 then produces an analog electrical signal 105; denoising and analog-to-digital processing 106 further yield a digital image signal (i.e., a raw image) 107, which then enters the digital signal processing chip 100.
  • the steps in the digital signal processing chip 100 are the core steps of ISP processing.
  • The digital signal processing chip 100 generally includes modules such as black level compensation (BLC) 108, lens shading correction 109, bad pixel correction (BPC) 110, demosaic 111, Bayer-domain noise reduction (denoise) 112, auto white balance (AWB) 113, Ygamma 114, auto exposure (AE) 115, auto focus (AF) (not shown in Figure 1), color correction (CC) 116, gamma correction 117, color gamut conversion 118, color denoising/detail enhancement 119, color enhancement (CE) 120, formatter 121, and input/output (I/O) control 122.
  • the modules related to global color in ISP processing mainly include AWB113, CC116, etc.
  • In the traditional method, images are classified according to the scene, an applicable color adjustment matrix is selected from several fixed color adjustment matrices, and the selected color adjustment matrix is then used to process the image.
  • In this way, the color adjustment matrix can only be determined from several fixed color adjustment matrices, so the effect of image color adjustment is not ideal.
  • the application provides a neural network model training method, image processing method and device, which can improve the effect of image color adjustment.
  • In a first aspect, this application provides a method for training a neural network model.
  • The method includes: acquiring a true value image and a training image, where the true value image is constructed according to the standard values of a color card and the training image is an image containing the color card; and training a neural network model according to the true value image and the training image, where the output target of the neural network model is a color adjustment matrix for color adjustment of an image.
  • In the above technical solution, the output target of the trained neural network model is a color adjustment matrix. Therefore, using the neural network model to process an image to be processed can generate a color adjustment matrix for that image; compared with the traditional method, the obtained color adjustment matrix is more consistent with the scene of the image to be processed. In this way, the color adjustment matrix obtained by using the above technical solution can improve the effect of image color adjustment.
  • The above technical solution uses the standard values of the color card to construct the true value image during the training process of the neural network model, and uses images containing the color card as training images, so that neural network model training can be completed even in the absence of ready-made training images or true value images.
  • In some possible implementations, the training of a neural network model according to the true value image and the training image includes: obtaining a candidate image according to the training image, where the candidate image is an image composed of the color values at the location of the color card in the training image; obtaining the color adjustment matrix corresponding to the training image according to the training image and the neural network model; using the color adjustment matrix to perform color adjustment on the candidate image to obtain a color-adjusted image; and adjusting the model parameters of the neural network model according to the color-adjusted image and the true value image.
  • Optionally, the candidate image is an image in which the color values at the location of the color card contained in the training image have been filtered. The filtering can be any feasible filtering method, such as mean filtering or median filtering.
  • In the above technical solution, the model parameters of the neural network model can be adjusted according to the image loss or image error between the color-adjusted image and the true value image. Specifically, according to the obtained image loss or image error, the model parameters of the neural network can be adjusted through gradient backpropagation, so as to obtain the final neural network model.
  • In the above technical solution, the candidate image is constructed according to the color values of the color card in the training image, the color adjustment matrix output by the neural network model is used to adjust the color of the candidate image to obtain the color-adjusted image, and the parameters of the neural network model are then adjusted according to the color-adjusted image and the true value image.
  • In this way, the standard values of the color card can be used as supervision values to adjust the model parameters of the neural network model, so as to obtain more accurate model parameters.
  • In some possible implementations, obtaining the color adjustment matrix corresponding to the training image according to the training image and the neural network model includes: dividing the training image to obtain multiple sub-images; and using the neural network model to process the multiple sub-images respectively to obtain multiple color adjustment matrices, where the multiple sub-images correspond one-to-one to the multiple color adjustment matrices.
  • the above-mentioned sub-image may be a part of the above-mentioned training image, and different sub-images may or may not overlap.
  • One possible way to divide is to extract patches from the training image.
  • In the above technical solution, the training image is divided to obtain multiple sub-images, so that the neural network model can be trained separately with the training image and its sub-images; that is, the demand for the number of training images is reduced through data augmentation, which is conducive to realizing the training process of the neural network model (see the sketch below).
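  • As an illustration of the division described above, the following is a minimal NumPy sketch of splitting a training image into sub-images; the patch size of 256 and stride of 128 are illustrative assumptions (a stride smaller than the patch size yields overlapping sub-images):

```python
import numpy as np

def split_into_patches(image: np.ndarray, patch: int = 256, stride: int = 128):
    """Divide a training image of shape (H, W, C) into sub-images ("patches").

    stride < patch gives overlapping sub-images; stride == patch gives
    non-overlapping ones.
    """
    h, w = image.shape[:2]
    patches = []
    for top in range(0, h - patch + 1, stride):
        for left in range(0, w - patch + 1, stride):
            patches.append(image[top:top + patch, left:left + patch])
    return patches

# One 768x1024 capture yields dozens of training samples via this augmentation.
img = np.random.rand(768, 1024, 3).astype(np.float32)
subs = split_into_patches(img)
print(len(subs), subs[0].shape)  # 35 (256, 256, 3)
```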
  • the color adjustment matrix is used to perform automatic white balance correction and/or color correction on the image.
  • the training image may be an image after automatic white balance correction, or an image before automatic white balance correction.
  • the training image is a raw image.
  • Since the raw image is the raw data obtained by a complementary metal oxide semiconductor (CMOS) or charge coupled device (CCD) image sensor converting the captured light signal into a digital signal, it is lossless and therefore contains the original information of the object. In this way, using raw images to train the neural network model can improve the training effect of the neural network model.
  • In a second aspect, the present application provides an image processing method, which includes: acquiring an image to be processed; processing the image to be processed through a neural network model to obtain a color adjustment matrix of the image to be processed, where the neural network model is obtained by training according to a true value image and a training image, the true value image is constructed according to the standard values of the color card, and the training image is an image containing the color card; and performing color adjustment on the image to be processed according to the color adjustment matrix of the image to be processed to obtain a color-adjusted image.
  • the neural network model may be obtained by training in the first aspect or any one of the possible implementations of the first aspect.
  • In the above technical solution, the neural network model can generate a color adjustment matrix for the image to be processed according to the image to be processed; compared with the traditional method, the obtained color adjustment matrix is better suited to the scene of the image to be processed. In this way, the color adjustment matrix obtained by using the above technical solution can improve the effect of image color adjustment.
  • The neural network model in the above technical solution is obtained by training with a true value image constructed from the standard values of the color card and training images containing the color card, so neural network model training can be completed even in the absence of ready-made training images or true value images.
  • the color adjustment matrix of the image to be processed is used to perform automatic white balance correction and/or color correction on the image to be processed.
  • In some possible implementations, performing color adjustment on the image to be processed according to the color adjustment matrix of the image to be processed includes: performing a matrix operation between the color adjustment matrix of the image to be processed and the image to be processed to obtain the color-adjusted image (see the sketch below).
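  • As an illustration of the matrix operation described above, here is a minimal NumPy sketch assuming a 3x3 color adjustment matrix applied per pixel to an (H, W, 3) image; the diagonal gain values are placeholders:

```python
import numpy as np

def apply_color_matrix(image: np.ndarray, ccm: np.ndarray) -> np.ndarray:
    """Apply a 3x3 color adjustment matrix to every pixel of an (H, W, 3) image.

    Each pixel p = (R, G, B) is mapped to p' = ccm @ p; a diagonal ccm acts as
    white-balance gains, a full ccm performs color correction.
    """
    h, w, c = image.shape
    flat = image.reshape(-1, 3)     # (H*W, 3)
    adjusted = flat @ ccm.T         # row-vector form of ccm @ p per pixel
    return adjusted.reshape(h, w, c)

# Toy white-balance-style gains as the color adjustment matrix.
ccm = np.diag([1.8, 1.0, 1.4]).astype(np.float32)
img = np.random.rand(4, 4, 3).astype(np.float32)
out = apply_color_matrix(img, ccm)
```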
  • the image to be processed is a raw image.
  • Since the raw image is the original data obtained by a CMOS or CCD image sensor converting the captured light signal into a digital signal, it is lossless and therefore contains the original information of the object.
  • The input of the neural network model is a raw image, so the image information is preserved to the greatest extent; the obtained color adjustment matrix can therefore reflect the information of the real scene, and the image color adjustment effect is also better.
  • In a third aspect, this application provides a neural network model training device, which includes: a memory for storing a program; and a processor for executing the program stored in the memory. When the program stored in the memory is executed, the processor is configured to perform the following process: obtain a true value image and a training image, where the true value image is constructed according to the standard values of the color card and the training image is an image containing the color card; and train a neural network model according to the true value image and the training image, where the output target of the neural network model is a color adjustment matrix for color adjustment of an image.
  • In the above technical solution, the output target of the trained neural network model is a color adjustment matrix. Therefore, using the neural network model to process an image to be processed can generate a color adjustment matrix for that image; compared with the traditional method, the obtained color adjustment matrix is more consistent with the scene of the image to be processed. In this way, the color adjustment matrix obtained by using the above technical solution can improve the effect of image color adjustment.
  • The above technical solution uses the standard values of the color card to construct the true value image during the training process of the neural network model, and uses images containing the color card as training images, so that neural network model training can be completed even in the absence of ready-made training images or true value images.
  • In some possible implementations, the processor is specifically configured to perform the following process: obtain a candidate image according to the training image, where the candidate image is an image formed from the color values at the position of the color card in the training image; obtain the color adjustment matrix corresponding to the training image according to the training image and the neural network model; use the color adjustment matrix to adjust the color of the candidate image to obtain a color-adjusted image; and adjust the model parameters of the neural network model according to the color-adjusted image and the true value image.
  • Optionally, the candidate image is an image in which the color values at the location of the color card contained in the training image have been filtered. The filtering can be any feasible filtering method, such as mean filtering or median filtering.
  • In the above technical solution, the model parameters of the neural network model can be adjusted according to the image loss or image error between the color-adjusted image and the true value image. Specifically, according to the obtained image loss or image error, the model parameters of the neural network can be adjusted through gradient backpropagation, so as to obtain the final neural network model.
  • In the above technical solution, the candidate image is constructed according to the color values of the color card in the training image, the color adjustment matrix output by the neural network model is used to adjust the color of the candidate image to obtain the color-adjusted image, and the parameters of the neural network model are then adjusted according to the color-adjusted image and the true value image.
  • In this way, the standard values of the color card can be used as supervision values to adjust the model parameters of the neural network model, so as to obtain more accurate model parameters.
  • In some possible implementations, the processor is specifically configured to perform the following process: divide the training image to obtain multiple sub-images; and use the neural network model to process the multiple sub-images separately to obtain multiple color adjustment matrices, where the multiple sub-images correspond one-to-one to the multiple color adjustment matrices.
  • the above-mentioned sub-image may be a part of the above-mentioned training image, and different sub-images may or may not overlap.
  • One possible way to divide is to extract patches from the training image.
  • In the above technical solution, the training image is divided to obtain multiple sub-images, so that the neural network model can be trained separately with the training image and its sub-images; that is, the demand for the number of training images is reduced through data augmentation, which is conducive to realizing the training process of the neural network model.
  • the color adjustment matrix is used to perform automatic white balance correction and/or color correction on the image.
  • the training image may be an image after automatic white balance correction, or an image before automatic white balance correction.
  • the training image is a raw image.
  • Since the raw image is the original data obtained by a CMOS or CCD image sensor converting the captured light signal into a digital signal, it is lossless and therefore contains the original information of the object. In this way, using raw images to train the neural network model can improve the training effect of the neural network model.
  • In a fourth aspect, the present application provides an image processing device that includes: a memory for storing a program; and a processor for executing the program stored in the memory. When the program stored in the memory is executed, the processor is used to perform the following process: obtain the image to be processed; process the image to be processed through a neural network model to obtain the color adjustment matrix of the image to be processed, where the neural network model is obtained by training according to a true value image and a training image, the true value image is constructed according to the standard values of the color card, and the training image is an image containing the color card; and perform color adjustment on the image to be processed according to the color adjustment matrix of the image to be processed to obtain a color-adjusted image.
  • the neural network model may be obtained by training in the first aspect or any one of the possible implementations of the first aspect.
  • In the above technical solution, the neural network model can generate a color adjustment matrix for the image to be processed according to the image to be processed; compared with the traditional method, the obtained color adjustment matrix is better suited to the scene of the image to be processed. In this way, the color adjustment matrix obtained by using the above technical solution can improve the effect of image color adjustment.
  • The neural network model in the above technical solution is obtained by training with a true value image constructed from the standard values of the color card and training images containing the color card, so neural network model training can be completed even in the absence of ready-made training images or true value images.
  • the color adjustment matrix of the image to be processed is used to perform automatic white balance correction and/or color correction on the image to be processed.
  • In some possible implementations, the processor is specifically configured to perform the following process: perform a matrix operation between the color adjustment matrix of the image to be processed and the image to be processed to obtain the color-adjusted image.
  • the image to be processed is a raw image.
  • Since the raw image is the original data obtained by a CMOS or CCD image sensor converting the captured light signal into a digital signal, it is lossless and therefore contains the original information of the object.
  • The input of the neural network model is a raw image, so the image information is preserved to the greatest extent; the obtained color adjustment matrix can therefore reflect the information of the real scene, and the image color adjustment effect is also better.
  • In a fifth aspect, a training device for a neural network model is provided, which includes: a memory for storing a program; and a processor for executing the program stored in the memory. When the program stored in the memory is executed, the processor is configured to execute the method in the first aspect or any one of the foregoing implementations of the first aspect.
  • In a sixth aspect, an image processing device is provided, which includes: a memory for storing a program; and a processor for executing the program stored in the memory. When the program stored in the memory is executed, the processor is used to execute the method in the second aspect or any one of the foregoing implementations of the second aspect.
  • The processors in the fifth and sixth aspects described above can be either a central processing unit (CPU), or a combination of a CPU and a neural network processing unit, where the neural network processing unit can include a graphics processing unit (GPU), a neural-network processing unit (NPU), a tensor processing unit (TPU), and so on.
  • TPU is an artificial intelligence accelerator application-specific integrated circuit fully customized by Google for machine learning.
  • In a seventh aspect, a computer-readable medium is provided, which stores program code for execution by a device; when the program code runs on a computer, the computer executes the method in the first aspect or any one of the implementations of the first aspect.
  • In an eighth aspect, a computer program product containing instructions is provided. When the computer program product runs on a computer, the computer executes the method in the first aspect or any one of the implementations of the first aspect, or the method in the second aspect or any one of the implementations of the second aspect.
  • In a ninth aspect, a chip is provided, which includes a processor and a data interface. The processor reads instructions stored in a memory through the data interface, and executes the method in the first aspect or any one of the implementations of the first aspect, or the method in the second aspect or any one of the implementations of the second aspect.
  • Optionally, as an implementation, the chip may further include a memory in which instructions are stored; the processor is configured to execute the instructions stored in the memory, and when the instructions are executed, the processor is configured to execute the method in the first aspect or any implementation of the first aspect, or the method in the second aspect or any implementation of the second aspect.
  • the aforementioned chip may specifically be a field-programmable gate array (FPGA) or an application-specific integrated circuit (ASIC).
  • In a tenth aspect, a computing device is provided, which includes the neural network model training device in any one of the implementations of the third aspect, or includes the image processing device in any one of the implementations of the fourth aspect.
  • The computing device may specifically be a server.
  • The computing device may also be a terminal device.
  • Fig. 1 is a schematic flowchart of ISP processing.
  • Fig. 2 is a schematic structural diagram of a system architecture provided by an embodiment of the present application.
  • Fig. 3 is a schematic block diagram of a convolutional neural network model provided by an embodiment of the present application.
  • Fig. 4 is a schematic diagram of a chip hardware structure provided by an embodiment of the present application.
  • Fig. 5 is a schematic flowchart of a neural network model training method according to an embodiment of the present application.
  • Fig. 6 is a schematic diagram of a neural network model training process provided by an embodiment of the present application.
  • Fig. 7 is a schematic diagram of a neural network model provided by an embodiment of the present application.
  • Fig. 8 is a schematic flowchart of an image processing method provided by an embodiment of the present application.
  • Fig. 9 is a schematic diagram of an image processing process provided by an embodiment of the present application.
  • Fig. 10 is a schematic diagram of the hardware structure of a neural network model training device according to an embodiment of the present application.
  • Fig. 11 is a schematic diagram of the hardware structure of an image processing apparatus according to an embodiment of the present application.
  • The technical solutions provided in the embodiments of the present application can be applied to image retrieval, album management, safe city, human-computer interaction, and other scenarios that require image color adjustment.
  • The images in the embodiments of the present application may be static images (also referred to as static pictures) or dynamic images (also referred to as dynamic pictures). For example, the images in the present application may be videos or dynamic pictures, or may be static pictures or photos. For ease of description, static images and dynamic images are collectively referred to as images in the following.
  • the embodiments of the present application involve a large number of related applications of neural networks.
  • the following first introduces related terms and other related concepts of the neural networks that may be involved in the embodiments of the present application.
  • a neural network can be composed of neural units.
  • A neural unit can refer to an arithmetic unit that takes x_s and an intercept of 1 as inputs.
  • The output of the arithmetic unit can be as shown in formula (1-1):

  $$h_{W,b}(x) = f\left(W^{T}x\right) = f\left(\sum_{s=1}^{n} W_{s}x_{s} + b\right) \qquad (1\text{-}1)$$

  • where s = 1, 2, ..., n, n is a natural number greater than 1, W_s is the weight of x_s, and b is the bias of the neural unit. f is the activation function of the neural unit, which is used to introduce nonlinear characteristics into the neural network to convert the input signal of the neural unit into an output signal. The output signal of the activation function can be used as the input of the next convolutional layer, and the activation function can be a sigmoid function (a numeric sketch follows below).
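  • As a numeric illustration of formula (1-1), here is a minimal sketch of a single neural unit with a sigmoid activation; the input, weight, and bias values are arbitrary examples:

```python
import numpy as np

def sigmoid(z: float) -> float:
    """Sigmoid activation f, introducing the nonlinearity described above."""
    return 1.0 / (1.0 + np.exp(-z))

def neural_unit(x: np.ndarray, w: np.ndarray, b: float) -> float:
    """Output of one neural unit per formula (1-1): f(sum_s W_s * x_s + b)."""
    return sigmoid(np.dot(w, x) + b)

x = np.array([0.5, -1.2, 3.0])    # inputs x_s
w = np.array([0.8, 0.1, -0.4])    # weights W_s
print(neural_unit(x, w, b=0.2))   # output signal passed to the next layer
```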
  • a neural network is a network formed by connecting multiple above-mentioned single neural units together, that is, the output of one neural unit can be the input of another neural unit.
  • the input of each neural unit can be connected with the local receptive field of the previous layer to extract the characteristics of the local receptive field.
  • the local receptive field can be a region composed of several neural units.
  • A deep neural network (DNN), also known as a multi-layer neural network, can be understood as a neural network with multiple hidden layers. The DNN is divided according to the positions of different layers: the layers inside the DNN can be divided into three categories: input layer, hidden layers, and output layer.
  • Generally speaking, the first layer is the input layer, the last layer is the output layer, and the layers in between are all hidden layers. The layers are fully connected, that is to say, any neuron in the i-th layer must be connected to any neuron in the (i+1)-th layer.
  • Although DNN looks complicated, the work of each layer is not complicated. Simply put, it is the following linear relationship expression: $\vec{y} = \alpha(W\vec{x} + \vec{b})$, where $\vec{x}$ is the input vector, $\vec{y}$ is the output vector, $\vec{b}$ is the offset vector, $W$ is the weight matrix (also called coefficients), and $\alpha(\cdot)$ is the activation function. Each layer simply performs this operation on the input vector $\vec{x}$ to obtain the output vector $\vec{y}$. Since a DNN has many layers, the number of coefficient matrices $W$ and offset vectors $\vec{b}$ is also large.
  • The definitions of these parameters in the DNN are as follows, taking the coefficient $W$ as an example. Suppose that in a three-layer DNN, the linear coefficient from the fourth neuron in the second layer to the second neuron in the third layer is defined as $W_{24}^{3}$: the superscript 3 represents the layer where the coefficient $W$ is located, and the subscript corresponds to the output index 2 of the third layer and the input index 4 of the second layer.
  • In summary, the coefficient from the k-th neuron in the (L-1)-th layer to the j-th neuron in the L-th layer is defined as $W_{jk}^{L}$.
  • Convolutional neural network (convolutional neuron network, CNN) is a deep neural network with a convolutional structure.
  • the convolutional neural network contains a feature extractor composed of a convolutional layer and a sub-sampling layer.
  • the feature extractor can be regarded as a filter.
  • the convolutional layer refers to the neuron layer that performs convolution processing on the input signal in the convolutional neural network.
  • In the convolutional layer, a neuron can be connected to only part of the neurons in adjacent layers.
  • a convolutional layer usually contains several feature planes, and each feature plane can be composed of some rectangularly arranged neural units. Neural units in the same feature plane share weights, and the shared weights here are the convolution kernels.
  • Sharing weight can be understood as the way of extracting image information has nothing to do with location.
  • The convolution kernel can be initialized in the form of a matrix of random size; in the training process of the convolutional neural network, the convolution kernel can obtain reasonable weights through learning. In addition, the direct benefit of sharing weights is to reduce the connections between the layers of the convolutional neural network while also reducing the risk of overfitting (see the sketch below).
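  • As an illustration of weight sharing, the following is a minimal PyTorch sketch of a convolutional layer whose kernels are shared across all spatial positions of the input; the channel counts and kernel size are illustrative assumptions, not the patent's configuration:

```python
import torch
import torch.nn as nn

# 16 feature planes: each of the 16 kernels (weight matrices) is slid over the
# whole input, so the same weights are used at every spatial location.
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)

x = torch.randn(1, 3, 64, 64)   # one RGB input image
features = conv(x)              # (1, 16, 64, 64): 16 feature planes
print(conv.weight.shape)        # torch.Size([16, 3, 3, 3]): the shared kernels
```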
  • The neural network can use the back propagation (BP) algorithm to modify the parameters in the initial neural network model during training, so that the reconstruction error loss of the neural network model becomes smaller and smaller. Specifically, passing the input signal forward to the output produces an error loss, and the parameters in the initial neural network model are updated by back-propagating the error loss information, so that the error loss converges.
  • The back-propagation algorithm is a back-propagation motion dominated by error loss, aiming to obtain the optimal parameters of the neural network model, such as the weight matrix.
  • the pixel value of the image can be a red-green-blue (RGB) color value, and the pixel value can be a long integer representing the color.
  • For example, the pixel value is 256*Red + 100*Green + 76*Blue, where Blue represents the blue component, Green represents the green component, and Red represents the red component. In each color component, the smaller the value, the lower the brightness, and the larger the value, the higher the brightness.
  • the pixel values can be grayscale values.
  • an embodiment of the present application provides a system architecture 200.
  • a data collection device 260 is used to collect training data.
  • the training data may include training images and truth-value images.
  • the data collection device 260 stores the training data in the database 230, and the training device 220 trains to obtain the target model/rule 201 based on the training data maintained in the database 230.
  • The training device 220 processes the input training image and compares the output image with the true value image until the difference between the image output by the training device 220 and the true value image is less than a certain threshold, thereby completing the training of the target model/rule 201.
  • the above-mentioned target model/rule 201 can be used to implement the image processing method of the embodiment of the present application, that is, the image to be processed is input into the target model/rule 201 after relevant preprocessing, to obtain a color-adjusted image.
  • the target model/rule 201 in the embodiment of the application may specifically be the neural network model in the embodiment of the application.
  • It should be noted that the training data maintained in the database 230 may not all come from the data collection device 260 and may also be received from other devices.
  • In addition, the training device 220 does not necessarily train the target model/rule 201 entirely based on the training data maintained in the database 230; it may also obtain training data from the cloud or elsewhere for model training. The above description should not be construed as a limitation on the embodiments of this application.
  • The target model/rule 201 trained by the training device 220 can be applied to different systems or devices, such as the execution device 210 shown in FIG. 2. The execution device 210 can be a terminal, such as a mobile phone terminal, tablet computer, notebook computer, augmented reality (AR)/virtual reality (VR) device, or vehicle-mounted terminal, or it can be a server or a cloud device.
  • the execution device 210 is configured with an input/output (input/output, I/O) interface 212 for data interaction with external devices.
  • the user can input data to the I/O interface 212 through the client device 240.
  • the input data in this embodiment of the present application may include: a to-be-processed image input by the client device.
  • The preprocessing module 213 and the preprocessing module 214 are used to perform preprocessing according to the input data (such as the image to be processed) received by the I/O interface 212. In the embodiments of the present application, the preprocessing module 213 and the preprocessing module 214 may not be provided (or only one preprocessing module may be provided), and the calculation module 211 is directly used to process the input data.
  • When the execution device 210 preprocesses the input data, or when the calculation module 211 of the execution device 210 performs calculation and other related processing, the execution device 210 can call the data, code, etc. in the data storage system 250 for the corresponding processing, and the data, instructions, etc. obtained by the corresponding processing may also be stored in the data storage system 250.
  • Finally, the I/O interface 212 returns the processing result, such as the color-adjusted image obtained as described above, to the client device 240, so as to provide it to the user.
  • It is worth noting that the training device 220 can generate corresponding target models/rules 201 based on different training data for different goals or tasks, and the corresponding target models/rules 201 can be used to achieve the above goals or complete the above tasks, thereby providing users with the desired results.
  • the user can manually set input data, and the manual setting can be operated through the interface provided by the I/O interface 212.
  • In another case, the client device 240 can automatically send input data to the I/O interface 212. If automatically sending input data requires the user's authorization, the user can set the corresponding permission in the client device 240. The user can view the result output by the execution device 210 on the client device 240, and the specific presentation form can be display, sound, action, or another specific manner.
  • the client device 240 can also be used as a data collection terminal to collect the input data of the input I/O interface 212 and the output result of the output I/O interface 212 as new sample data, and store it in the database 230 as shown in the figure.
  • Of course, the data may also not be collected by the client device 240; instead, the I/O interface 212 directly stores the input data input to the I/O interface 212 and the output result of the I/O interface 212, as shown in the figure, in the database 230 as new sample data.
  • It is worth noting that FIG. 2 is only a schematic diagram of a system architecture provided by an embodiment of the present application, and the positional relationships between the devices, components, modules, etc. shown in the figure do not constitute any limitation.
  • For example, in FIG. 2, the data storage system 250 is an external memory relative to the execution device 210; in other cases, the data storage system 250 may also be placed in the execution device 210.
  • the target model/rule 201 is obtained by training according to the training device 220.
  • the target model/rule 201 may be the neural network model in the embodiment of the application.
  • The neural network model provided in the embodiments of the application may include one or more neural networks, and the one or more neural networks may include CNN, deep convolutional neural networks (DCNN), recurrent neural networks (RNNs), and/or the like.
  • CNN is a very common neural network
  • the structure of CNN will be introduced in detail below in conjunction with Figure 3.
  • As introduced above, a convolutional neural network is a deep neural network with a convolutional structure and is a deep learning architecture. A deep learning architecture refers to multiple levels of learning at different abstraction levels through machine learning algorithms. As a deep learning architecture, CNN is a feed-forward artificial neural network in which each neuron can respond to the image input into it.
  • a convolutional neural network (CNN) 300 may include an input layer 310, a convolutional layer/pooling layer 320 (the pooling layer is optional), and a neural network layer 330.
  • The convolutional layer/pooling layer 320 may include layers 321-326. For example, in one implementation, layer 321 is a convolutional layer, layer 322 is a pooling layer, layer 323 is a convolutional layer, layer 324 is a pooling layer, layer 325 is a convolutional layer, and layer 326 is a pooling layer; in another implementation, 321 and 322 are convolutional layers, 323 is a pooling layer, 324 and 325 are convolutional layers, and 326 is a pooling layer. That is, the output of a convolutional layer can be used as the input of a subsequent pooling layer, or as the input of another convolutional layer to continue the convolution operation.
  • the convolution layer 321 can include many convolution operators.
  • The convolution operator is also called a kernel. Its role in image processing is equivalent to a filter that extracts specific information from the input image matrix. The convolution operator can essentially be a weight matrix, which is usually pre-defined. In the process of convolving the image, the weight matrix is usually moved along the horizontal direction of the input image one pixel at a time (or two pixels at a time, depending on the value of the stride), to complete the work of extracting a specific feature from the image.
  • The size of the weight matrix should be related to the size of the image. It should be noted that the depth dimension of the weight matrix is the same as the depth dimension of the input image; during the convolution operation, the weight matrix extends to the entire depth of the input image. Therefore, convolution with a single weight matrix produces a convolution output with a single depth dimension, but in most cases a single weight matrix is not used; instead, multiple weight matrices of the same size (rows x columns), that is, multiple matrices of the same type, are applied. The outputs of the weight matrices are stacked to form the depth dimension of the convolved image, where the dimension can be understood as being determined by the "multiple" mentioned above.
  • Different weight matrices can be used to extract different features in the image. For example, one weight matrix is used to extract edge information of the image, another weight matrix is used to extract specific colors of the image, and another weight matrix is used to eliminate unwanted noise in the image.
  • The multiple weight matrices have the same size (rows x columns), so the feature maps extracted by them also have the same size; the extracted feature maps of the same size are then combined to form the output of the convolution operation.
  • The weight values in these weight matrices need to be obtained through a lot of training in practical applications. Each weight matrix formed by the trained weight values can be used to extract information from the input image, so that the convolutional neural network 300 can make correct predictions.
  • When the convolutional neural network 300 has multiple convolutional layers, the initial convolutional layer (such as 321) often extracts more general features, which can also be called low-level features; as the depth of the convolutional neural network 300 increases, the features extracted by the later convolutional layers (such as 326) become more and more complex, such as high-level semantic features, and features with higher semantics are more suitable for the problem to be solved.
  • Since it is often necessary to reduce the number of training parameters, a pooling layer often needs to be periodically introduced after a convolutional layer: it can be one convolutional layer followed by one pooling layer, or multiple convolutional layers followed by one or more pooling layers. In image processing, the only purpose of the pooling layer is to reduce the spatial size of the image.
  • the pooling layer may include an average pooling operator and/or a maximum pooling operator for sampling the input image to obtain an image with a smaller size.
  • the average pooling operator can calculate the pixel values in the image within a specific range to generate an average value as the result of the average pooling.
  • the maximum pooling operator can take the pixel with the largest value within a specific range as the result of the maximum pooling.
  • the operators in the pooling layer should also be related to the image size.
  • The size of the image output after processing by the pooling layer can be smaller than the size of the image input to the pooling layer, and each pixel in the image output by the pooling layer represents the average or maximum value of the corresponding sub-region of the image input to the pooling layer (see the sketch below).
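  • As an illustration of the average and maximum pooling operators, here is a minimal PyTorch sketch; the tensor sizes and the 2x2 pooling window are illustrative assumptions:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 8, 8)              # (N, C, H, W) input feature map

avg_pool = nn.AvgPool2d(kernel_size=2)   # average pooling operator
max_pool = nn.MaxPool2d(kernel_size=2)   # maximum pooling operator

# Each output pixel is the mean (or max) of the corresponding 2x2 sub-region,
# so the spatial size shrinks from 8x8 to 4x4.
print(avg_pool(x).shape)                 # torch.Size([1, 3, 4, 4])
print(max_pool(x).shape)                 # torch.Size([1, 3, 4, 4])
```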
  • Neural network layer 330
  • After processing by the convolutional layer/pooling layer 320, the convolutional neural network 300 is not yet able to output the required output information, because, as mentioned above, the convolutional layer/pooling layer 320 only extracts features and reduces the parameters brought by the input image. In order to generate the final output information (the required class information or other related information), the convolutional neural network 300 needs to use the neural network layer 330 to generate one or a group of outputs of the required number of classes. Therefore, the neural network layer 330 can include multiple hidden layers (331, 332 to 33n as shown in FIG. 3) and an output layer 340. The parameters contained in the multiple hidden layers can be obtained by pre-training based on relevant training data of a specific task type; for example, the task type can include image color processing, image recognition, image classification, image super-resolution reconstruction, and so on.
  • After the multiple hidden layers in the neural network layer 330, the final layer of the entire convolutional neural network 300 is the output layer 340. The output layer 340 has a loss function similar to categorical cross entropy, which is specifically used to calculate the prediction error.
  • the convolutional neural network 300 shown in FIG. 3 is only used as an example of a convolutional neural network. In specific applications, the convolutional neural network may also exist in the form of other network models.
  • the neural network model may include the convolutional neural network 300 shown in FIG. 3, and the neural network model may process the image to be processed to obtain a color processing matrix corresponding to the image to be processed.
  • FIG. 4 is a hardware structure of a chip provided by an embodiment of the application, and the chip includes a neural network processor 40.
  • the chip can be set in the execution device 210 as shown in FIG. 2 to complete the calculation work of the calculation module 211.
  • the chip can also be set in the training device 220 shown in FIG. 2 to complete the training work of the training device 220 and output the target model/rule 201.
  • the algorithms of each layer in the convolutional neural network shown in Figure 3 can be implemented in the chip shown in Figure 4.
  • the convolutional neural network can be (one or more) included in the above neural network model. A) one of the neural networks.
  • the neural network processor NPU 40 is mounted on a host CPU (host CPU) as a coprocessor, and the host CPU distributes tasks.
  • the core part of the NPU is the arithmetic circuit 403, and the controller 404 controls the arithmetic circuit 403 to extract data from the memory (weight memory or input memory) and perform calculations.
  • the arithmetic circuit 403 includes multiple processing units (process engines, PE). In some implementations, the arithmetic circuit 403 is a two-dimensional systolic array. The arithmetic circuit 403 may also be a one-dimensional systolic array or other electronic circuit capable of performing mathematical operations such as multiplication and addition. In some implementations, the arithmetic circuit 403 is a general-purpose matrix processor.
  • the arithmetic circuit 403 fetches the data corresponding to matrix B from the weight memory 402 and caches it on each PE in the arithmetic circuit 403.
  • The arithmetic circuit 403 fetches the data of matrix A from the input memory 401, performs matrix operations on it with matrix B, and stores the partial or final result of the obtained matrix in the accumulator 408.
  • the vector calculation unit 407 can perform further processing on the output of the arithmetic circuit 403, such as vector multiplication, vector addition, exponential operation, logarithmic operation, size comparison, and so on.
  • the vector calculation unit 407 can be used for network calculations in the non-convolutional/non-FC layer of the neural network, such as pooling, batch normalization, local response normalization, etc. .
  • the vector calculation unit 407 can store the processed output vector to the unified buffer 406.
  • the vector calculation unit 407 may apply a nonlinear function to the output of the arithmetic circuit 403, such as a vector of accumulated values, to generate the activation value.
  • the vector calculation unit 407 generates a normalized value, a combined value, or both.
  • In some implementations, the processed output vector can be used as an activation input to the arithmetic circuit 403, for example, for use in a subsequent layer in the neural network.
  • the unified memory 406 is used to store input data and output data.
  • The direct memory access controller (DMAC) 405 is used to transfer the input data in the external memory to the input memory 401 and/or the unified memory 406, store the weight data in the external memory into the weight memory 402, and store the data in the unified memory 406 into the external memory.
  • The bus interface unit (BIU) 410 is used to implement interaction between the main CPU, the DMAC, and the instruction fetch buffer 409 through the bus.
  • The instruction fetch buffer 409 connected to the controller 404 is used to store the instructions used by the controller 404; the controller 404 is used to call the instructions cached in the instruction fetch buffer 409 to control the working process of the computing accelerator.
  • the unified memory 406, the input memory 401, the weight memory 402, and the instruction fetch memory 409 are all on-chip (On-Chip) memories.
  • the external memory is a memory external to the NPU.
  • The external memory can be a double data rate synchronous dynamic random access memory (DDR SDRAM), a high bandwidth memory (HBM), or another readable and writable memory.
  • each layer in the convolutional neural network shown in FIG. 3 may be executed by the arithmetic circuit 403 or the vector calculation unit 407.
  • the execution device 210 in FIG. 2 introduced above can execute each step of the method embodiments of the present application.
  • the execution device 210 in FIG. 2 may include the CNN model shown in FIG. 3 and the chip shown in FIG. 4 .
  • the method embodiments of the present application will be described in detail below with reference to the accompanying drawings.
  • As mentioned above, the traditional way is to classify according to the scene, select the applicable color adjustment matrix from several fixed color adjustment matrices, and then use the selected color adjustment matrix to process the image. In this way, the color adjustment matrix can only be determined from several fixed color adjustment matrices, which makes the effect of image color adjustment unsatisfactory.
  • this application proposes an image processing method and a neural network model training method, which can improve the effect of image color adjustment.
  • Fig. 5 is a schematic flowchart of a neural network model training method according to an embodiment of the present application.
  • the method shown in FIG. 5 may be executed by a device with strong computing capabilities such as a computer device, a server device, or a computing device.
  • the method may be executed by the terminal device in FIG. 4.
  • the method shown in FIG. 5 includes steps 510 and 520, which are respectively described in detail below.
  • In step 510, the true value image and the training image are obtained.
  • the true value image is the reference image used when training the neural network model.
  • the truth image can also have other names, for example, target image, desired image, truth map, etc., which are collectively referred to as truth image hereinafter.
  • the standard value of the color card can be used to form a true value image.
  • The standard values of the color card are the standard color values of the color card, which can be obtained from the provider of the color card. For example, the standard values of the color card can be obtained by querying the color standard value table provided by the color card provider.
  • the color value can be an RGB value or the like.
  • the embodiment of the present application does not specifically limit the type of the color card.
  • the color card can be a 140-color color card, a 24-color color card, and so on.
  • the 24-color color card has 24 standard values, and these 24 standard values are used to construct a true value image.
  • the true value image is constructed according to the distribution of the color blocks of the color card.
  • For example, the color card includes 24 color blocks of different colors arranged in 6 rows and 4 columns, and each color block corresponds to one standard value. In this case, a 6*4 standard value matrix can be constructed, in which the value of each element is the standard color value of the color block at the corresponding position; this 6*4 standard value matrix is the true value image (see the sketch below).
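  • As an illustration of this construction, here is a minimal NumPy sketch; the standard values used are synthetic placeholders, since the real values come from the color card provider's reference table:

```python
import numpy as np

# Placeholder standard RGB values for a 24-block color card; a real table
# would be queried from the color card provider.
standard_values = np.arange(24 * 3, dtype=np.float32).reshape(24, 3) / 71.0

# Arrange the 24 standard values by the 6-row x 4-column layout of the color
# blocks: the resulting 6x4 standard value matrix is the true value image.
truth_image = standard_values.reshape(6, 4, 3)
print(truth_image.shape)  # (6, 4, 3)
```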
  • the true value image can also be obtained by extracting features of an image that meets the expected effect.
  • the image that satisfies the expected effect may be an image taken by an imaging device, or a synthesized image, which is not specifically limited in the embodiment of the present application.
  • Training images can also be called training samples, sample data, etc., and are used to train neural network models.
  • For example, when the true value image is composed of the standard color values of the color card, the training image can be a captured image that includes the color card, so that the color values of the actually captured color card can be extracted by feature extraction. In this way, by shooting a large number of images that include the color card, a large number of training images can be obtained.
  • the training image may be a raw image.
  • In step 520, a neural network model is trained according to the true value image and the training image, and the output target of the neural network model is a color adjustment matrix for color adjustment of an image.
  • the target output of the neural network model (or neural network) in the embodiment of the present application is a color adjustment matrix adapted to the input image to be processed.
  • the neural network model of the embodiment of the present application can generate a color adjustment matrix suitable for each image to be processed. Further using the color adjustment matrix to adjust the color of the image to be processed can improve the color adjustment effect of the image.
  • the color adjustment matrix can be used for global color adjustment of the image to be processed.
  • the color adjustment matrix can perform automatic white balance correction on the image to be processed, color correction on the image to be processed, or both automatic white balance correction and color correction on the image to be processed.
  • In this way, when adjusting the color of an image, the method is no longer limited to a few pre-calculated color adjustment matrices, unlike the traditional method.
  • Optionally, a candidate image is obtained according to the training image, where the candidate image is an image constructed according to the color values at the position of the color card in the training image; the color adjustment matrix corresponding to the training image is obtained according to the training image and the neural network model; the obtained color adjustment matrix is used to adjust the color of the candidate image to obtain a color-adjusted image; and the model parameters of the neural network model are adjusted according to the color-adjusted image and the true value image.
  • the image error between the true value image and the color-adjusted image can be determined through a loss function.
  • Optionally, a candidate image is obtained according to the training image, where the candidate image is an image constructed according to the color values at the position of the color card in the training image; the training image is divided to obtain multiple sub-images; the multiple sub-images are processed separately to obtain multiple color adjustment matrices, where the multiple sub-images correspond one-to-one to the multiple color adjustment matrices; color-adjusted images are then obtained according to the obtained color adjustment matrices and the training image; and the model parameters of the neural network model are adjusted accordingly. Understandably, after the multiple sub-images are obtained, each of the multiple sub-images can be used in turn to train the neural network model.
  • For example, the multiple sub-images include a first sub-image and a second sub-image; after the neural network model is trained using the first sub-image, it is then trained using the second sub-image.
  • the above-mentioned sub-image may be a part of the above-mentioned training image, and different sub-images may or may not overlap.
  • One possible way to divide is to patch the training images.
  • The above-mentioned candidate image is an image in which the color values at the location of the color card included in the training image have been filtered.
  • the filtering processing can be any possible filtering methods such as mean filtering and median filtering.
  • the model parameters of the neural network model can be adjusted according to the image loss or the image error between the color-adjusted image and the true value image.
  • Specifically, according to the obtained image loss or image error, the model parameters of the neural network can be adjusted through gradient backpropagation, that is, the back-propagation algorithm described above, so as to obtain the final neural network model.
  • the image error between the true value image and the color-adjusted image can be determined through a loss function.
  • Fig. 6 is a schematic diagram of a neural network model training process provided by an embodiment of the present application.
  • As shown in Fig. 6, the training image, or a sub-image obtained from the training image, is input into the neural network model to obtain the output of the neural network model, that is, the color adjustment matrix.
  • Feature extraction is performed on the training image to obtain the color value of each color block of the color card included in the training image, and the obtained color values are used to construct a candidate image. The obtained color adjustment matrix is matrix-multiplied with the candidate image to obtain an RGB image, that is, the color-adjusted image. The color-adjusted image is compared with the true value image constructed from the standard color values of the color card to obtain the error between the two; according to the obtained error, the model parameters of the neural network model are adjusted; and the above operations are repeated until the obtained error meets the conditions (see the sketch below).
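  • As an illustration of the training loop in Fig. 6, the following is a minimal PyTorch sketch; the network architecture, tensor sizes, optimizer, and loss settings are illustrative assumptions rather than the patent's exact model:

```python
import torch
import torch.nn as nn

class CCMNet(nn.Module):
    """Toy network whose output target is a 3x3 color adjustment matrix."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1))       # global pooling to a 16-vector
        self.fc = nn.Linear(16, 9)         # the 9 entries of a 3x3 matrix

    def forward(self, x):                  # x: (N, 3, H, W) training image
        v = self.features(x).flatten(1)
        return self.fc(v).view(-1, 3, 3)   # color adjustment matrix

net = CCMNet()
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

train_img = torch.rand(1, 3, 64, 64)       # image containing the color card
candidate = torch.rand(1, 24, 3)           # color values at the card blocks
truth = torch.rand(1, 24, 3)               # color card standard values

for step in range(100):
    ccm = net(train_img)                       # predicted color adjustment matrix
    adjusted = candidate @ ccm.transpose(1, 2) # color-adjusted candidate image
    loss = loss_fn(adjusted, truth)            # error vs. the true value image
    opt.zero_grad()
    loss.backward()                            # gradient backpropagation
    opt.step()                                 # adjust the model parameters
```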
  • Optionally, the training image may be a raw image (also called raw data), and performing feature extraction on the training image may consist of reading the raw-data values at the position of the color card in the raw image.
  • Optionally, before feature extraction, the training image may also be preprocessed, for example by demosaicing and noise reduction.
  • There are many ways to perform the matrix operation between the candidate image and the color adjustment matrix. For example, the following form can be used, in which each color channel receives its own gain:

$$\begin{pmatrix} R' \\ G' \\ B' \end{pmatrix} = \begin{pmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{pmatrix}\begin{pmatrix} R \\ G \\ B \end{pmatrix}$$

  • Here a, b, and c can be determined by the neural network model; R', G', and B' are the values of color channels R, G, and B of the image after color adjustment; and R, G, and B are the values of color channels R, G, and B of the image before color adjustment.
  • As another example, a full 3×3 color adjustment matrix M can be used, with the channel values defined as above:

$$\begin{pmatrix} R' \\ G' \\ B' \end{pmatrix} = M_{3\times 3}\begin{pmatrix} R \\ G \\ B \end{pmatrix}$$

  • A sketch of applying both forms follows below.
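The two linear forms above, applied per pixel with NumPy; the gain and matrix values are illustrative only, not values from the patent.

```python
import numpy as np

rgb = np.random.rand(4, 6, 3)                 # image before color adjustment

# Diagonal form: one gain per channel, as in white-balance correction.
a, b, c = 2.0, 1.0, 1.6                       # illustrative gains
balanced = rgb * np.array([a, b, c])

# Full 3x3 form: each output channel mixes all three input channels.
M = np.array([[ 1.8, -0.5, -0.3],
              [-0.2,  1.6, -0.4],
              [-0.1, -0.6,  1.7]])            # illustrative matrix
corrected = np.einsum('hwc,dc->hwd', balanced, M)   # (R',G',B')^T = M (R,G,B)^T
```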
  • In addition, the embodiments of the present application may also use quadratic terms, cubic terms, and square-root terms for color adjustment, for example the quadratic expansion

$$\rho_{2,3} = \left(R,\, G,\, B,\, R^{2},\, G^{2},\, B^{2},\, RG,\, GB,\, RB\right)^{T}$$

  • where R, G, and B are the values of color channels R, G, and B of the image before color adjustment, and T denotes transposition.
  • Corresponding to these expansions, the color adjustment matrix used for color adjustment takes different shapes. Taking the matrix with quadratic terms as an example, the color adjustment of the training image can be processed according to the following formula, in which the color adjustment matrix M can be a 3*10 matrix acting on the expanded term vector ρ:

$$\begin{pmatrix} R' \\ G' \\ B' \end{pmatrix} = M\,\rho$$

  • where R', G', and B' are the values of color channels R, G, and B of the image after color adjustment, and R, G, and B are the values of those channels before adjustment. A sketch of the quadratic expansion follows below.
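A sketch of quadratic color adjustment. The nine polynomial terms come from the expansion above; the trailing constant 1 is an assumption added only so that the vector length matches the 3*10 matrix the text mentions.

```python
import numpy as np

def expand_quadratic(rgb: np.ndarray) -> np.ndarray:
    """Map (..., 3) RGB values to (R, G, B, R^2, G^2, B^2, RG, GB, RB, 1)."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return np.stack([r, g, b, r * r, g * g, b * b,
                     r * g, g * b, r * b, np.ones_like(r)], axis=-1)

rgb = np.random.rand(4, 6, 3)            # image before color adjustment
M = np.random.rand(3, 10)                # stand-in for a learned 3x10 matrix
adjusted = expand_quadratic(rgb) @ M.T   # (..., 10) @ (10, 3) -> (..., 3)
```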
  • It should be noted that FIG. 6 takes a color-adjusted image in RGB format only as an example; the above training process is equally applicable to images in other formats, for example, images in YUV format.
  • Fig. 7 is a schematic diagram of a neural network model provided by an embodiment of the present application. It should be understood that FIG. 7 is merely exemplary; it is intended to help those skilled in the art understand the embodiments of the present application, not to limit them to the specific scenario illustrated. The neural network model of the embodiments of the present application may also take other structural forms, as long as it can implement the methods of the embodiments of the present application.
  • FIG. 7 shows only the high-level network part of the neural network model, which comprises a convolution part, a convolution-pooling part, and a fully connected (fc) part. The convolution part may include the first M layers shown in FIG. 7, used to extract basic features. The convolution-pooling part may include the middle N layers and the global pooling layer shown in FIG. 7, used to extract mid-level features; each of these N layers consists of a convolutional layer followed by an average pooling layer. The fully connected part may include the last K layers shown in FIG. 7, used to extract high-level features. In the notation H*W*C of FIG. 7, H, W, and C denote the height, width, and number of channels of the image, respectively. For the functions of the convolutional and pooling layers, reference can be made to the related description of FIG. 3, which is not repeated here. A sketch of one possible instantiation follows below.
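A minimal PyTorch sketch of the Fig. 7 layout. M, N, K and all channel widths are unspecified in the patent; M=2, N=2, K=2 and the widths below are assumptions, as is the nine-element output reshaped into a 3x3 color adjustment matrix.

```python
import torch
import torch.nn as nn

class CCMNet(nn.Module):
    def __init__(self, out_terms: int = 9):        # 9 entries -> one 3x3 matrix
        super().__init__()
        self.front = nn.Sequential(                # first M layers: basic features
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.middle = nn.Sequential(               # middle N layers: conv + avg pool
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.AvgPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.AvgPool2d(2),
            nn.AdaptiveAvgPool2d(1))               # global pooling layer
        self.fc = nn.Sequential(                   # last K layers: high-level features
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, out_terms))

    def forward(self, x):                          # x: B x 1 x H x W raw image
        feats = self.middle(self.front(x)).flatten(1)
        return self.fc(feats).view(-1, 3, 3)       # one matrix per input image

print(CCMNet()(torch.zeros(1, 1, 64, 64)).shape)   # torch.Size([1, 3, 3])
```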
  • FIG. 8 shows a schematic flowchart of an image processing method provided by an embodiment of the present application.
  • The method shown in FIG. 8 can be executed by a device with strong computing capability, such as a computer device, a server device, or a computing device; for example, it can be executed by the terminal device in FIG. 4.
  • the method shown in FIG. 8 includes steps 810-830, and these steps are respectively described in detail below.
  • In 810, an image to be processed is acquired.
  • The image to be processed in the embodiments of the present application may be acquired image data, an image signal, or the like.
  • the image to be processed may be obtained through an image acquisition device (for example, a lens and a sensor), or may be received from another device, which is not specifically limited in the embodiment of the present application.
  • The image to be processed may be a raw image (also called raw data). Of course, it may also be an image that has already undergone image processing steps other than color adjustment, such as any one or a combination of black level correction, lens shading correction, dead pixel correction, demosaicing, Bayer-domain noise reduction, auto exposure, and auto focus.
  • In 820, the image to be processed is processed through a neural network model to obtain a color adjustment matrix of the image to be processed.
  • When the image to be processed is a raw image: since a raw image is the original data obtained by a CMOS or CCD image sensor converting the captured light signal into a digital signal, it is lossless and therefore contains the original information of the scene. With a raw image as the input of the neural network model, the image information is preserved to the greatest extent, so the resulting color adjustment matrix can reflect the information of the real scene, and the color adjustment effect is correspondingly better.
  • The neural network model of the embodiments of the present application is obtained by training on the true value image and the training image, where the true value image is constructed from the true values of the color card and the training image includes an image of the color card.
  • the training process of the neural network model can be referred to the training method shown in FIG. 5, which will not be repeated here.
  • In 830, color adjustment is performed on the image to be processed according to the color adjustment matrix of the image to be processed.
  • the target output of the neural network model (or neural network) in the embodiment of the present application is a color adjustment matrix adapted to the input image to be processed.
  • the neural network model of the embodiment of the present application can generate a color adjustment matrix suitable for each image to be processed. Further using the color adjustment matrix to adjust the color of the image to be processed can improve the color adjustment effect of the image.
  • the color adjustment matrix can be used for global color adjustment of the image to be processed.
  • the color adjustment matrix can perform automatic white balance correction on the image to be processed, color correction on the image to be processed, or both automatic white balance correction and color correction on the image to be processed.
  • Compared with the traditional method, color adjustment of an image is thus no longer limited to a few pre-calculated color adjustment matrices.
  • Fig. 9 is a schematic diagram of an image processing process provided by an embodiment of the present application.
  • Specifically, the image to be processed is input into the neural network model to obtain the network output, that is, the color adjustment matrix of this image; at the same time, the image to be processed is preprocessed, for example by demosaicing and noise reduction; matrix multiplication is then performed between the obtained color adjustment matrix and the preprocessed image to obtain an RGB image, that is, the color-adjusted image. A sketch of this flow follows below.
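The Fig. 9 flow end to end, as a hedged sketch: the preprocessing step is stubbed out (a real ISP would demosaic and denoise), and the model is assumed to return one 3x3 matrix per image, as in the training sketch above.

```python
import numpy as np
import torch

def preprocess(raw: np.ndarray) -> np.ndarray:
    # Placeholder for demosaicing/denoising: replicate the raw plane 3x.
    return np.repeat(raw[..., None], 3, axis=-1)

def color_adjust(raw: np.ndarray, model: torch.nn.Module) -> np.ndarray:
    """Predict this image's color adjustment matrix, then apply it."""
    with torch.no_grad():
        x = torch.from_numpy(raw).float()[None, None]   # 1x1xHxW
        ccm = model(x)[0].numpy()                       # 3x3 matrix for this image
    rgb = preprocess(raw)                               # preprocessed image
    return np.einsum('hwc,dc->hwd', rgb, ccm)           # color-adjusted RGB image
```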
  • the trained neural network model can generate a color adjustment matrix for the image to be processed according to the image to be processed.
  • the color adjustment matrix obtained is more suitable for the scene of the image to be processed. In this way, the color adjustment matrix obtained by using the above technical solution can improve the effect of image color adjustment.
  • It should be understood that the sequence numbers of the above processes do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
  • FIG. 10 is a schematic diagram of the hardware structure of the neural network model training device according to an embodiment of the present application.
  • the neural network model training device 1000 shown in FIG. 10 includes a memory 1010, a processor 1020, a communication interface 1030, and a bus 1040.
  • the memory 1010, the processor 1020, and the communication interface 1030 implement a communication connection between each other through a bus 1040.
  • the memory 1010 may be a read only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM).
  • The memory 1010 may store a program. When the program stored in the memory 1010 is executed by the processor 1020, the processor 1020 and the communication interface 1030 are used to execute the steps of the neural network model training method of the embodiments of the present application.
  • The processor 1020 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), a graphics processing unit (GPU), or one or more integrated circuits, and is used to execute related programs to realize the functions required by the units in the neural network model training device of the embodiments of the present application, or to execute the neural network training method of the method embodiments of the present application.
  • the processor 1020 may also be an integrated circuit chip with signal processing capability.
  • each step of the neural network model training method of the embodiment of the present application can be completed by an integrated logic circuit of hardware in the processor 1020 or instructions in the form of software.
  • The aforementioned processor 1020 may also be a general-purpose processor, a digital signal processor (DSP), an ASIC, a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The aforementioned general-purpose processor may be a microprocessor, or any conventional processor. The steps of the methods disclosed in the embodiments of the present application may be embodied directly as execution by a hardware decoding processor, or execution by a combination of hardware and software modules in a decoding processor. The software module can be located in a storage medium mature in the field, such as a random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or register. The storage medium is located in the memory 1010; the processor 1020 reads the information in the memory 1010 and, in combination with its hardware, completes the functions required by the units included in the neural network model training device of the embodiments of the present application, or performs the neural network model training method of the method embodiments of the present application.
  • The communication interface 1030 uses a transceiving device, such as but not limited to a transceiver, to implement communication between the device 1000 and other devices or a communication network.
  • the training image can be obtained through the communication interface 1030.
  • the bus 1040 may include a path for transferring information between various components of the device 1000 (for example, the memory 1010, the processor 1020, and the communication interface 1030).
  • It should be understood that after the neural network model is trained by the neural network model training device 1000 shown in FIG. 10, the trained model can be used to execute the image processing method of this application. Specifically, training the neural network model with the device 1000 yields the neural network model used in the methods shown in FIG. 5 and FIG. 8.
  • Specifically, the device shown in FIG. 10 can obtain the training image, the true value image, and the neural network model to be trained from the outside through the communication interface 1030; the processor then trains the neural network model according to the training image and the true value image.
  • FIG. 11 is a schematic diagram of the hardware structure of an image processing apparatus according to an embodiment of the present application. Similar to the device 1000 described above, the image processing device 1100 shown in FIG. 11 includes a memory 1110, a processor 1120, a communication interface 1130, and a bus 1140. Among them, the memory 1110, the processor 1120, and the communication interface 1130 implement a communication connection between each other through the bus 1140.
  • The memory 1110 may store a program. When the program stored in the memory 1110 is executed by the processor 1120, the processor 1120 is configured to execute the steps of the image processing method of the embodiments of the present application.
  • The processor 1120 may be a general-purpose CPU, a microprocessor, an ASIC, a GPU, or one or more integrated circuits, and is used to execute related programs to realize the functions required by the units in the image processing apparatus of the embodiments of the present application, or to execute the image processing method of the method embodiments of the present application. The processor 1120 may also be an integrated circuit chip with signal processing capability; in implementation, the steps of the image processing method of the embodiments of the present application can be completed by integrated logic circuits of hardware in the processor 1120 or by instructions in the form of software.
  • The aforementioned processor 1120 may also be a general-purpose processor, a digital signal processor (DSP), an ASIC, a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The aforementioned general-purpose processor may be a microprocessor, or any conventional processor. The steps of the methods disclosed in the embodiments of the present application may be embodied directly as execution by a hardware decoding processor, or execution by a combination of hardware and software modules in a decoding processor. The software module can be located in a storage medium mature in the field, such as a random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or register. The storage medium is located in the memory 1110; the processor 1120 reads the information in the memory 1110 and, in combination with its hardware, completes the functions required by the units included in the image processing apparatus of the embodiments of the present application, or performs the image processing method of the method embodiments of the present application.
  • The communication interface 1130 uses a transceiving device, such as but not limited to a transceiver, to implement communication between the device 1100 and other devices or a communication network.
  • the image to be processed can be acquired through the communication interface 1130.
  • the bus 1140 may include a path for transferring information between various components of the device 1100 (for example, the memory 1110, the processor 1120, and the communication interface 1130).
  • It should be noted that although the device 1000 and the device 1100 are shown with only a memory, a processor, and a communication interface, in specific implementations those skilled in the art should understand that they may also include other components necessary for normal operation, may further include hardware components implementing additional functions as needed, and may equally include only the components necessary to implement the embodiments of the present application rather than all the components shown in FIG. 10 and FIG. 11.
  • the memory in the embodiments of the present application may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory.
  • The non-volatile memory can be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory.
  • The volatile memory may be a random access memory (RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct rambus RAM (DR RAM).
  • the above-mentioned embodiments may be implemented in whole or in part by software, hardware, firmware or any other combination.
  • the above-mentioned embodiments may be implemented in the form of a computer program product in whole or in part.
  • the computer program product includes one or more computer instructions or computer programs.
  • When the computer instructions or computer programs are loaded or executed on a computer, the processes or functions described in the embodiments of the present application are produced in whole or in part.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire or wirelessly (for example, infrared, radio, or microwave).
  • the computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server or a data center that includes one or more sets of available media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium.
  • the semiconductor medium may be a solid state drive.
  • In this application, "at least one" refers to one or more, and "multiple" refers to two or more. "At least one of the following items" or similar expressions refer to any combination of those items, including any combination of single items or plural items. For example, at least one of a, b, or c can mean: a, b, c, a-b, a-c, b-c, or a-b-c, where each of a, b, and c can be single or multiple.
  • It should be understood that, in the various embodiments of the present application, the sequence numbers of the above processes do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
  • the disclosed system, device, and method can be implemented in other ways.
  • The device embodiments described above are merely illustrative. For example, the division into units is only a logical functional division; in actual implementation there may be other divisions, for example multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • If the functions are implemented in the form of a software functional unit and sold or used as an independent product, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or a part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • The aforementioned storage media include various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

A training method for a neural network model, an image processing method, and apparatuses therefor, relating to image color processing technology in the field of image processing. The training method comprises: acquiring a true value image and a training image, the true value image being constructed according to the standard values of a color card, and the training image being an image containing the color card (510); and training the neural network model according to the true value image and the training image, the output target of the neural network model being a color adjustment matrix used to perform color adjustment on an image (520). The color adjustment matrix obtained through the neural network model fits the scene of the image to be processed better; using it to process the image to be processed can improve the effect of image color adjustment.

Description

神经网络模型的训练方法、图像处理方法及其装置 技术领域
本申请涉及图像处理领域,并且更具体地,涉及神经网络模型的训练方法、图像处理方法及其装置。
背景技术
图像信号处理(image signal processing,ISP)主要作用是对前端图像传感器输出的图像信号进行后期处理。依赖于ISP,在不同的光学条件下得到的图像才能较好的还原现场细节。
ISP处理流程如图1所示,自然景物101通过镜头(lens)102获得拜耳(bayer)图像,然后通过光电转换104得到模拟电信号105,进一步通过消噪和模拟转数字处理106获得数字图像信号(即原始图像(raw image))107,接下来会进入数字信号处理芯片100中。在数字信号处理芯片100中的步骤是ISP处理的核心步骤,数字信号处理芯片100一般包含黑电平矫正(black level compensation,BLC)108、镜头阴影矫正(lens shading correction)109、坏点矫正(bad pixel correction,BPC)110、去马赛克(demosaic)111、拜耳域降噪(denoise)112、自动白平衡(auto white balance,AWB)113、Ygamma114、自动曝光(auto exposure,AE)115、自动对焦(auto focus,AF)(图1中未示出)、色彩矫正(color correction,CC)116、伽玛(gamma)矫正117、色域转换118、色彩去噪/细节增强119、色彩增强(color enhance,CE)120、编织器(formater)121、输入输出(input/output,I/O)控制122等模块。
ISP处理中与全局色彩相关的模块主要包括AWB113、CC116等。对于全局颜色处理模块,传统方式是根据场景分类,从固定的几种色彩调整矩阵中选择适用的色彩调整矩阵,再使用选择的色彩调整矩阵对图像进行处理。上述方式中,色彩调整矩阵只能从固定的几种色彩调整矩阵中确定,使得图像色彩调整的效果并不理想。
发明内容
本申请提供神经网络模型的训练方法、图像处理方法及其装置,能够提高图像色彩调整的效果。
第一方面,本申请提供了一种神经网络模型的训练方法,该方法包括:获取真值图像和训练图像,所述真值图像是根据色卡的标准值构造的,所述训练图像为包含所述色卡的图像;根据所述真值图像和所述训练图像,对神经网络模型进行训练,所述神经网络模型的输出目标是用于对图像进行色彩调整的色彩调整矩阵。
在上述技术方案中,训练得到的神经网络模型的输出目标是色彩调整矩阵,因此使用该神经网络模型对待处理图像进行处理,可以生成针对于待处理图像的色彩调整矩阵,相较于传统方式,得到的色彩调整矩阵与待处理图像的场景更为契合。这样,使用上述技术 方案得到的色彩调整矩阵对待处理图像,可以提高图像色彩调整的效果。
此外,上述技术方案在神经网络模型的训练过程中,使用色卡的标准值构造真值图像,使用带有色卡的图像作为训练图像,可以在缺少训练图像或者真值图像的场景下,完成神经网络模型的训练。
在一种可能的实现方式中,所述根据所述真值图像和所述训练图像,对神经网络模型进行训练,包括:根据所述训练图像,得到候选图像,所述候选图像为根据所述训练图像中所述色卡所在位置的色彩值构成的图像;根据所述训练图像和神经网络模型,得到与所述训练图像对应的色彩调整矩阵;使用所述色彩调整矩阵对所述候选图像进行色彩调整,得到色彩调整后的图像;根据所述色彩调整后的图像和所述真值图像,对所述神经网络模型的模型参数进行调整。
可选地,候选图像为训练图像包含的色卡所在位置的色彩值经过滤波处理后的图像。其中,滤波处理可以是均值滤波、中值滤波等任意可能的滤波方式。
可选地,可以根据色彩调整后的图像与真值图像之间的图像损失或者图像误差,对神经网络模型的模型参数进行调整。具体地,可以根据得到的图像损失或者图像误差,通过梯度反向传递函数,调整所述神经网络的模型参数,从而获取最终的神经网络模型。
上述技术方案中,根据训练图像中色卡所在位置的色彩值构建候选图像,并使用神经网络模型输出的色彩调整矩阵对候选图像进行色彩调整,从而得到色彩调整后的图像,再将色彩调整后的图像与真值图像调整神经网络模型的参数。这样,可以实现以色卡标准值作为监督值,对神经网络模型的模型参数进行调整,从而得到更准确的模型参数。
在一种可能的实现方式中,所述根据所述训练图像和神经网络模型,得到与所述训练图像对应的色彩调整矩阵,包括:对所述训练图像进行划分,得到多张子图像;使用所述神经网络模型对所述多张子图像分别进行处理,得到多个所述色彩调整矩阵,所述多张子图像与所述多个所述色彩调整矩阵一一对应。
上述子图像可以是上述训练图像的一部分,不同的子图像之间可以重叠或不重叠。一种可能的划分方式为对训练图像进行patch处理。
在上述技术方案中,对训练图像进行划分,得到多张子图像,这样可以通过训练图像和训练图像的子图像分别对神经网络模型进行训练,即通过数据增强的方式降低对训练图像数量的需求,有利于神经网络模型的训练过程的实现。
在一种可能的实现方式中,所述色彩调整矩阵用于对图像进行自动白平衡校正和/或色彩校正。
或者说,训练图像可以是经过自动白平衡校正之后的图像,也可以是经过自动白平衡校正之前的图像。
在一种可能的实现方式中,所述训练图像为raw图像。
由于raw图像是互补金属氧化物半导体(complementary metal oxide semiconductor,CMOS)或者电行耦合元件(charge coupled device,CCD)图像传感器将捕捉到的光源信号转化为数字信号得到的原始数据,是无损的,故其包含了物体原始的信息。这样,使用raw图像对神经网络模型进行训练,可以提高神经网络模型的训练效果。
第二方面,本申请提供了一种图像处理方法,该方法包括:获取待处理图像;通过神经网络模型对所述待处理图像进行处理,以得到所述待处理图像的色彩调整矩阵,所述神 经网络模型是根据真值图像和训练图像训练得到的,所述真值图像是根据色卡的标准值构造的,所述训练图像为包含所述色卡的图像;根据所述待处理图像的色彩调整矩阵对所述待处理图像进行色彩调整,以获取色彩调整后的图像。
可以理解地,神经网络模型可以是通过第一方面或者第一方面任意一种可能的实现方式中的训练方法训练得到的。
在上述技术方案中,神经网络模型可以根据待处理图像生成针对于待处理图像的色彩调整矩阵,相较于传统方式,得到的色彩调整矩阵与待处理图像的场景更为契合。这样,使用上述技术方案得到的色彩调整矩阵对待处理图像,可以提高图像色彩调整的效果。
此外,上述技术方案中的神经网络模型是通过色卡的标准值构造的真值图像和带有色卡的图像的训练图像训练得到的,可以在缺少训练图像或者真值图像的场景下,完成神经网络模型的训练。
在一种可能的实现方式中,所述待处理图像的色彩调整矩阵用于对所述待处理图像进行自动白平衡校正和/或色彩校正。
在一种可能的实现方式中,所述根据所述待处理图像的色彩调整矩阵对所述待处理图像进行色彩调整,包括:将所述待处理图像的色彩调整矩阵与所述待处理图像进行矩阵运算,以得到色彩处理调整后的待处理图像。
在一种可能的实现方式中,所述待处理图像为raw图像。
由于raw图像是CMOS或者CCD图像传感器将捕捉到的光源信号转化为数字信号得到的原始数据,是无损的,故其包含了物体原始的信息。神将网络模型的输入为raw图像,这样最大程度的保留了图像的信息,这样得到的色彩调整矩阵可以反映真实场景的信息,进而图像色彩调整效果也更好。
第三方面,本申请提供一种神经网络模型的训练装置,该装置包括:存储器,用于存储程序;处理器,用于执行所述存储器存储的程序,当所述存储器存储的程序被执行时,所述处理器用于执行以下过程:获取真值图像和训练图像,所述真值图像是根据色卡的标准值构造的,所述训练图像为包含所述色卡的图像;根据所述真值图像和所述训练图像,对神经网络模型进行训练,所述神经网络模型的输出目标是用于对图像进行色彩调整的色彩调整矩阵。
在上述技术方案中,训练得到的神经网络模型的输出目标是色彩调整矩阵,因此使用该神经网络模型对待处理图像进行处理,可以生成针对于待处理图像的色彩调整矩阵,相较于传统方式,得到的色彩调整矩阵与待处理图像的场景更为契合。这样,使用上述技术方案得到的色彩调整矩阵对待处理图像,可以提高图像色彩调整的效果。
此外,上述技术方案在神经网络模型的训练过程中,使用色卡的标准值构造真值图像,使用带有色卡的图像作为训练图像,可以在缺少训练图像或者真值图像的场景下,完成神经网络模型的训练。
在一种可能的实现方式中,所述处理器具体用于执行以下过程:根据所述训练图像,得到候选图像,所述候选图像为根据所述训练图像中所述色卡所在位置的色彩值构成的图像;根据所述训练图像和神经网络模型,得到与所述训练图像对应的色彩调整矩阵;使用所述色彩调整矩阵对所述候选图像进行色彩调整,得到色彩调整后的图像;根据所述色彩调整后的图像和所述真值图像,对所述神经网络模型的模型参数进行调整。
可选地,候选图像为训练图像包含的色卡所在位置的色彩值经过滤波处理后的图像。其中,滤波处理可以是均值滤波、中值滤波等任意可能的滤波方式。
可选地,可以根据色彩调整后的图像与真值图像之间的图像损失或者图像误差,对神经网络模型的模型参数进行调整。具体地,可以根据得到的图像损失或者图像误差,通过梯度反向传递函数,调整所述神经网络的模型参数,从而获取最终的神经网络模型。
上述技术方案中,根据训练图像中色卡所在位置的色彩值构建候选图像,并使用神经网络模型输出的色彩调整矩阵对候选图像进行色彩调整,从而得到色彩调整后的图像,再将色彩调整后的图像与真值图像调整神经网络模型的参数。这样,可以实现以色卡标准值作为监督值,对神经网络模型的模型参数进行调整,从而得到更准确的模型参数。
在一种可能的实现方式中,所述处理器具体用于执行以下过程:对所述训练图像进行划分,得到多张子图像;使用所述神经网络模型对所述多张子图像分别进行处理,得到多个所述色彩调整矩阵,所述多张子图像与所述多个所述色彩调整矩阵一一对应。
上述子图像可以是上述训练图像的一部分,不同的子图像之间可以重叠或不重叠。一种可能的划分方式为对训练图像进行patch处理。
在上述技术方案中,对训练图像进行划分,得到多张子图像,这样可以通过训练图像和训练图像的子图像分别对神经网络模型进行训练,即通过数据增强的方式降低对训练图像数量的需求,有利于神经网络模型的训练过程的实现。
在一种可能的实现方式中,所述色彩调整矩阵用于对图像进行自动白平衡校正和/或色彩校正。
或者说,训练图像可以是经过自动白平衡校正之后的图像,也可以是经过自动白平衡校正之前的图像。
在一种可能的实现方式中,所述训练图像为raw图像。
由于raw图像是CMOS或者CCD图像传感器将捕捉到的光源信号转化为数字信号得到的原始数据,是无损的,故其包含了物体原始的信息。这样,使用raw图像对神经网络模型进行训练,可以提高神经网络模型的训练效果。
第四方面,本申请提供一种图像处理装置,该装置包括:存储器,用于存储程序;处理器,用于执行所述存储器存储的程序,当所述存储器存储的程序被执行时,所述处理器用于执行以下过程:获取待处理图像;通过神经网络模型对所述待处理图像进行处理,以得到所述待处理图像的色彩调整矩阵,所述神经网络模型是根据真值图像和训练图像训练得到的,所述真值图像是根据色卡的标准值构造的,所述训练图像为包含所述色卡的图像;根据所述待处理图像的色彩调整矩阵对所述待处理图像进行色彩调整,以获取色彩调整后的图像。
可以理解地,神经网络模型可以是通过第一方面或者第一方面任意一种可能的实现方式中的训练方法训练得到的。
在上述技术方案中,神经网络模型可以根据待处理图像生成针对于待处理图像的色彩调整矩阵,相较于传统方式,得到的色彩调整矩阵与待处理图像的场景更为契合。这样,使用上述技术方案得到的色彩调整矩阵对待处理图像,可以提高图像色彩调整的效果。
此外,上述技术方案中的神经网络模型是通过色卡的标准值构造的真值图像和带有色卡的图像的训练图像训练得到的,可以在缺少训练图像或者真值图像的场景下,完成神经 网络模型的训练。
在一种可能的实现方式中,所述待处理图像的色彩调整矩阵用于对所述待处理图像进行自动白平衡校正和/或色彩校正。
在一种可能的实现方式中,所述处理器具体用于执行以下过程:将所述待处理图像的色彩调整矩阵与所述待处理图像进行矩阵运算,以得到色彩调整后的图像。
在一种可能的实现方式中,所述待处理图像为raw图像。
由于raw图像是CMOS或者CCD图像传感器将捕捉到的光源信号转化为数字信号得到的原始数据,是无损的,故其包含了物体原始的信息。神将网络模型的输入为raw图像,这样最大程度的保留了图像的信息,这样得到的色彩调整矩阵可以反映真实场景的信息,进而图像色彩调整效果也更好。
第五方面,提供了一种神经网络模型的训练装置,该装置包括:存储器,用于存储程序;处理器,用于执行所述存储器存储的程序,当所述存储器存储的程序被执行时,所述处理器用于执行上述第一方面中的任意一种实现方式中的方法。
第六方面,提供了一种图像处理装置,该装置包括:存储器,用于存储程序;处理器,用于执行所述存储器存储的程序,当所述存储器存储的程序被执行时,所述处理器用于执行上述第二方面中的任意一种实现方式中的方法。
上述第五方面和第六方面中的处理器既可以是中央处理器(central processing unit,CPU),也可以是CPU与神经网络运算处理器的组合,这里的神经网络运算处理器可以包括图形处理器(graphics processing unit,GPU)、神经网络处理器(neural-network processing unit,NPU)和张量处理器(tensor processing unit,TPU)等等。其中,TPU是谷歌(google)为机器学习全定制的人工智能加速器专用集成电路。
第七方面,提供一种计算机可读介质,该计算机可读介质存储用于设备执行的程序代码,当该程序代码在计算机上运行时,使得计算机执行第一方面或者第一方面的任意一种实现方式中的方法,或者执行第二方面或者第二方面的任意一种实现方式中的方法。
第八方面,提供一种包含指令的计算机程序产品,当该计算机程序产品在计算机上运行时,使得计算机执行第一方面或者第一方面的任意一种实现方式中的方法,或者执行第二方面或者第二方面的任意一种实现方式中的方法。
第九方面,提供一种芯片,所述芯片包括处理器与数据接口,所述处理器通过所述数据接口读取存储器上存储的指令,执行第一方面或者第一方面的任意一种实现方式中的方法,或者执行第二方面或者第二方面的任意一种实现方式中的方法。
可选地,作为一种实现方式,所述芯片还可以包括存储器,所述存储器中存储有指令,所述处理器用于执行所述存储器上存储的指令,当所述指令被执行时,所述处理器用于执行第一方面或者第一方面的任意一种实现方式中的方法,或者执行第二方面或者第二方面的任意一种实现方式中的方法。
上述芯片具体可以是现场可编程门阵列(field-programmable gate array,FPGA)或者专用集成电路(application-specific integrated circuit,ASIC)。
第十方面,提供了一种计算设备,该计算设备包括第三方面中的任意一个方面中的神经网络模型的训练装置,或者该计算设备包括第四方面中的任意一个方面中的图像处理装置。
可选地,当上述计算设备包括第三方面中的任意一个方面中的神经网络模型的训练装置时,该电子设备具体可以是服务器。
可选地,当上述电子设备包括上述第四方面中的任意一个方面中的图像处理装置时,该电子设备具体可以是终端设备。
附图说明
图1是ISP处理的示意性流程图。
图2是本申请实施例提供的系统架构的结构示意图。
图3是本申请实施例提供的卷积神经网络模型的示意性框图。
图4是本申请实施例提供的一种芯片硬件结构示意图。
图5是本申请实施例的神经网络模型的训练方法的示意性流程图。
图6是本申请实施例提供的神经网络模型训练过程的示意图。
图7是本申请实施例提供的神经网络模型的示意图。
图8是本申请实施例提供的图像处理方法的示意性流程图。
图9是本申请实施例提供的图像处理过程的示意图。
图10是本申请实施例的神经网络模型训练装置的硬件结构示意图。
图11是本申请实施例的图像处理装置的硬件结构示意图。
具体实施方式
下面将结合附图,对本申请中的技术方案进行描述。
本申请实施例提供的各实施例能够应用在图片检索、相册管理、平安城市、人机交互以及其他需要进行图像进行色彩调整的场景。应理解,本申请实施例中的图像可以为静态图像(或称为静态画面)或动态图像(或称为动态画面),例如,本申请中的图像可以为视频或动态图片,或者,本申请中的图像也可以为静态图片或照片。为了便于描述,本申请在下述实施例中将静态图像或动态图像统一称为图像。
本申请实施例涉及了大量神经网络的相关应用,为了更好地理解本申请实施例的方案,下面先对本申请实施例可能涉及的神经网络的相关术语和其他相关概念进行介绍。
(1)神经网络
神经网络可以是由神经单元组成的,神经单元可以是指以x s和截距1为输入的运算单元,该运算单元的输出可以如公式(1-1)所示:
$$h_{W,b}(x) = f\left(W^{T}x\right) = f\left(\sum_{s=1}^{n} W_{s}x_{s} + b\right) \tag{1-1}$$
其中,s=1、2、……n,n为大于1的自然数,W s为x s的权重,b为神经单元的偏置。f为神经单元的激活函数(activation functions),用于将非线性特性引入神经网络中,来将神经单元中的输入信号转换为输出信号。该激活函数的输出信号可以作为下一层卷积层的输入,激活函数可以是sigmoid函数。神经网络是将多个上述单一的神经单元联结在一起形成的网络,即一个神经单元的输出可以是另一个神经单元的输入。每个神经单元的输入可以与前一层的局部接受域相连,来提取局部接受域的特征,局部接受域可以是由若干个神经单元组成的区域。
(2)深度神经网络
深度神经网络(deep neural network,DNN),也称多层神经网络,可以理解为具有多层隐含层的神经网络。按照不同层的位置对DNN进行划分,DNN内部的神经网络可以分为三类:输入层,隐含层,输出层。一般来说第一层是输入层,最后一层是输出层,中间的层数都是隐含层。层与层之间是全连接的,也就是说,第i层的任意一个神经元一定与第i+1层的任意一个神经元相连。
虽然DNN看起来很复杂,但是就每一层的工作来说,其实并不复杂,简单来说就是如下线性关系表达式:
$$\vec{y} = \alpha\left(W\vec{x} + \vec{b}\right)$$
其中，$\vec{x}$是输入向量，$\vec{y}$是输出向量，$\vec{b}$是偏移向量，W是权重矩阵（也称系数），α()是激活函数。每一层仅仅是对输入向量$\vec{x}$经过如此简单的操作得到输出向量$\vec{y}$。由于DNN层数多，系数W和偏移向量$\vec{b}$的数量也比较多。这些参数在DNN中的定义如下所述：以系数W为例：假设在一个三层的DNN中，第二层的第4个神经元到第三层的第2个神经元的线性系数定义为$W_{24}^{3}$，上标3代表系数W所在的层数，而下标对应的是输出的第三层索引2和输入的第二层索引4。综上，第L-1层的第k个神经元到第L层的第j个神经元的系数定义为$W_{jk}^{L}$。
需要注意的是,输入层是没有W参数的。在深度神经网络中,更多的隐含层让网络更能够刻画现实世界中的复杂情形。理论上而言,参数越多的模型复杂度越高,“容量”也就越大,也就意味着它能完成更复杂的学习任务。训练深度神经网络的也就是学习权重矩阵的过程,其最终目的是得到训练好的深度神经网络的所有层的权重矩阵(由很多层的向量W形成的权重矩阵)。
(3)卷积神经网络
卷积神经网络(convolutional neuron network,CNN)是一种带有卷积结构的深度神经网络。卷积神经网络包含了一个由卷积层和子采样层构成的特征抽取器,该特征抽取器可以看作是滤波器。卷积层是指卷积神经网络中对输入信号进行卷积处理的神经元层。在卷积神经网络的卷积层中,一个神经元可以只与部分邻层神经元连接。一个卷积层中,通常包含若干个特征平面,每个特征平面可以由一些矩形排列的神经单元组成。同一特征平面的神经单元共享权重,这里共享的权重就是卷积核。共享权重可以理解为提取图像信息的方式与位置无关。卷积核可以以随机大小的矩阵的形式初始化,在卷积神经网络的训练过程中卷积核可以通过学习得到合理的权重。另外,共享权重带来的直接好处是减少卷积神经网络各层之间的连接,同时又降低了过拟合的风险。
(4)损失函数
在训练深度神经网络的过程中,因为希望深度神经网络的输出尽可能的接近真正想要预测的值,所以可以通过比较当前网络的预测值和真正想要的目标值,再根据两者之间的差异情况来更新每一层神经网络的权重向量(当然,在第一次更新之前通常会有初始化的过程,即为深度神经网络中的各层预先配置参数),比如,如果网络的预测值高了,就调整权重向量让它预测低一些,不断地调整,直到深度神经网络能够预测出真正想要的目标值或与真正想要的目标值非常接近的值。因此,就需要预先定义“如何比较预测值和目标值之间的差异”,这便是损失函数(loss function)或目标函数(objective function),它们是用于衡量预测值和目标值的差异的重要方程。其中,以损失函数举例,损失函数的输出值(loss)越高表示差异越大,那么深度神经网络的训练就变成了尽可能缩小这个loss的过程。
(5)反向传播算法
神经网络可以采用误差反向传播(back propagation,BP)算法在训练过程中修正初始的神经网络模型中参数的大小,使得神经网络模型的重建误差损失越来越小。具体地,前向传递输入信号直至输出会产生误差损失,通过反向传播误差损失信息来更新初始的神经网络模型中参数,从而使误差损失收敛。反向传播算法是以误差损失为主导的反向传播运动,旨在得到最优的神经网络模型的参数,例如权重矩阵。
(6)像素值
图像的像素值可以是一个红绿蓝(RGB)颜色值,像素值可以是表示颜色的长整数。例如,像素值为256*Red+100*Green+76Blue,其中,Blue代表蓝色分量,Green代表绿色分量,Red代表红色分量。各个颜色分量中,数值越小,亮度越低,数值越大,亮度越高。对于灰度图像来说,像素值可以是灰度值。
下面结合图2对本申请实施例适用的系统架构进行介绍。
如图2所示,本申请实施例提供了一种系统架构200。在图2中,数据采集设备260用于采集训练数据。针对本申请实施例的神经网络模型的训练方法来说,训练数据可以包括训练图像以及真值图像。
在采集到训练数据之后,数据采集设备260将这些训练数据存入数据库230,训练设备220基于数据库230中维护的训练数据训练得到目标模型/规则201。
下面对训练设备120基于训练数据得到目标模型/规则201进行描述,训练设备220对输入的训练图像进行处理,将输出的图像与真值图像进行对比,直到训练设备220输出的图像与真值图像的差值小于一定的阈值,从而完成目标模型/规则201的训练。
上述目标模型/规则201能够用于实现本申请实施例的图像处理方法,即,将待处理图像通过相关预处理后输入该目标模型/规则201,即可得到色彩调整后的图像。本申请实施例中的目标模型/规则201具体可以为本申请实施例中的神经网络模型。需要说明的是,在实际的应用中,所述数据库230中维护的训练数据不一定都来自于数据采集设备260的采集,也有可能是从其他设备接收得到的。另外需要说明的是,训练设备220也不一定完全基于数据库230维护的训练数据进行目标模型/规则201的训练,也有可能从云端或其他地方获取训练数据进行模型训练,上述描述不应该作为对本申请实施例的限定。
根据训练设备220训练得到的目标模型/规则201可以应用于不同的系统或设备中,如应用于图2所示的执行设备210,所述执行设备210可以是终端,如手机终端,平板电脑,笔记本电脑,增强现实(augmented reality,AR)/虚拟现实(virtual reality,VR),车载终端等,还可以是服务器或者云端设备等。在图2中,执行设备210配置输入/输出(input/output,I/O)接口212,用于与外部设备进行数据交互,用户可以通过客户设备240向I/O接口212输入数据,所述输入数据在本申请实施例中可以包括:客户设备输入的待处理图像。
预处理模块213和预处理模块214用于根据I/O接口212接收到的输入数据(如待处理图像)进行预处理,在本申请实施例中,也可以没有预处理模块213和预处理模块214(也可以只有其中的一个预处理模块),而直接采用计算模块211对输入数据进行处理。
在执行设备210对输入数据进行预处理,或者在执行设备210的计算模块211执行计算等相关的处理过程中,执行设备210可以调用数据存储系统250中的数据、代码等以用 于相应的处理,也可以将相应处理得到的数据、指令等存入数据存储系统250中。
最后,I/O接口212将处理结果,如上述得到的经过色彩处理后的图像返回给客户设备240,从而提供给用户。
值得说明的是,训练设备220可以针对不同的目标或称不同的任务,基于不同的训练数据生成相应的目标模型/规则201,该相应的目标模型/规则201即可以用于实现上述目标或完成上述任务,从而为用户提供所需的结果。
在图2所示情况下,用户可以手动给定输入数据,该手动给定可以通过I/O接口212提供的界面进行操作。另一种情况下,客户设备240可以自动地向I/O接口212发送输入数据,如果要求客户设备240自动发送输入数据需要获得用户的授权,则用户可以在客户设备240中设置相应权限。用户可以在客户设备240查看执行设备210输出的结果,具体的呈现形式可以是显示、声音、动作等具体方式。客户设备240也可以作为数据采集端,采集如图所示输入I/O接口212的输入数据及输出I/O接口212的输出结果作为新的样本数据,并存入数据库230。当然,也可以不经过客户设备240进行采集,而是由I/O接口212直接将如图所示输入I/O接口212的输入数据及输出I/O接口212的输出结果,作为新的样本数据存入数据库230。
值得注意的是,图2仅是本申请实施例提供的一种系统架构的示意图,图中所示设备、器件、模块等之间的位置关系不构成任何限制,例如,在图2中,数据存储系统250相对执行设备210是外部存储器,在其它情况下,也可以将数据存储系统250置于执行设备210中。
如图2所示,根据训练设备220训练得到目标模型/规则201,该目标模型/规则201在本申请实施例中可以是本申请中的神经网络模型,具体的,本申请实施例提供的神经网络模型可以包括一个或多个神经网络,该一个或多个神经网络可以包括CNN、深度卷积神经网络(deep convolutional neural networks,DCNN)和/或循环神经网络(recurrent neural network,RNNS)等等。
由于CNN是一种非常常见的神经网络,下面结合图3重点对CNN的结构进行详细的介绍。如上文的基础概念介绍所述,卷积神经网络是一种带有卷积结构的深度神经网络,是一种深度学习(deep learning)架构,深度学习架构是指通过机器学习的算法,在不同的抽象层级上进行多个层次的学习。作为一种深度学习架构,CNN是一种前馈(feed-forward)人工神经网络,该前馈人工神经网络中的各个神经元可以对输入其中的图像作出响应。
如图3所示,卷积神经网络(CNN)300可以包括输入层310,卷积层/池化层320(其中池化层为可选的),以及神经网络层330。下面对这些层的相关内容做详细介绍。
卷积层/池化层320:
卷积层:
如图3所示卷积层/池化层320可以包括如示例321-326层,举例来说:在一种实现方式中,321层为卷积层,322层为池化层,323层为卷积层,324层为池化层,325为卷积层,326为池化层;在另一种实现方式中,321、322为卷积层,323为池化层,324、325为卷积层,326为池化层。即卷积层的输出可以作为随后的池化层的输入,也可以作为另一个卷积层的输入以继续进行卷积操作。
下面将以卷积层321为例,介绍一层卷积层的内部工作原理。
卷积层321可以包括很多个卷积算子,卷积算子也称为核,其在图像处理中的作用相当于一个从输入图像矩阵中提取特定信息的过滤器,卷积算子本质上可以是一个权重矩阵,这个权重矩阵通常被预先定义,在对图像进行卷积操作的过程中,权重矩阵通常在输入图像上沿着水平方向一个像素接着一个像素(或两个像素接着两个像素……这取决于步长stride的取值)的进行处理,从而完成从图像中提取特定特征的工作。该权重矩阵的大小应该与图像的大小相关,需要注意的是,权重矩阵的纵深维度(depth dimension)和输入图像的纵深维度是相同的,在进行卷积运算的过程中,权重矩阵会延伸到输入图像的整个深度。因此,和一个单一的权重矩阵进行卷积会产生一个单一纵深维度的卷积化输出,但是大多数情况下不使用单一权重矩阵,而是应用多个尺寸(行×列)相同的权重矩阵,即多个同型矩阵。每个权重矩阵的输出被堆叠起来形成卷积图像的纵深维度,这里的维度可以理解为由上面所述的“多个”来决定。不同的权重矩阵可以用来提取图像中不同的特征,例如一个权重矩阵用来提取图像边缘信息,另一个权重矩阵用来提取图像的特定颜色,又一个权重矩阵用来对图像中不需要的噪点进行模糊化等。该多个权重矩阵尺寸(行×列)相同,经过该多个尺寸相同的权重矩阵提取后的特征图的尺寸也相同,再将提取到的多个尺寸相同的特征图合并形成卷积运算的输出。
这些权重矩阵中的权重值在实际应用中需要经过大量的训练得到,通过训练得到的权重值形成的各个权重矩阵可以用来从输入图像中提取信息,从而使得卷积神经网络300进行正确的预测。
当卷积神经网络300有多个卷积层的时候,初始的卷积层(例如321)往往提取较多的一般特征,该一般特征也可以称之为低级别的特征;随着卷积神经网络300深度的加深,越往后的卷积层(例如326)提取到的特征越来越复杂,比如高级别的语义之类的特征,语义越高的特征越适用于待解决的问题。
池化层/池化层320:
由于常常需要减少训练参数的数量,因此卷积层之后常常需要周期性的引入池化层,在如图3中320所示例的321-326各层,可以是一层卷积层后面跟一层池化层,也可以是多层卷积层后面接一层或多层池化层。在图像处理过程中,池化层的唯一目的就是减少图像的空间大小。池化层可以包括平均池化算子和/或最大池化算子,以用于对输入图像进行采样得到较小尺寸的图像。平均池化算子可以在特定范围内对图像中的像素值进行计算产生平均值作为平均池化的结果。最大池化算子可以在特定范围内取该范围内值最大的像素作为最大池化的结果。另外,就像卷积层中用权重矩阵的大小应该与图像尺寸相关一样,池化层中的运算符也应该与图像的大小相关。通过池化层处理后输出的图像尺寸可以小于输入池化层的图像的尺寸,池化层输出的图像中每个像素点表示输入池化层的图像的对应子区域的平均值或最大值。
神经网络层330:
在经过卷积层/池化层320的处理后,卷积神经网络300还不足以输出所需要的输出信息。因为如前所述,卷积层/池化层320只会提取特征,并减少输入图像带来的参数。然而为了生成最终的输出信息(所需要的类信息或其他相关信息),卷积神经网络300需要利用神经网络层330来生成一个或者一组所需要的类的数量的输出。因此,在神经网络 层330中可以包括多层隐含层(如图3所示的331、332至33n)以及输出层340,该多层隐含层中所包含的参数可以根据具体的任务类型的相关训练数据进行预先训练得到,例如该任务类型可以包括图像色彩处理、图像识别,图像分类,图像超分辨率重建等等。
在神经网络层330中的多层隐含层之后,也就是整个卷积神经网络300的最后层为输出层340,该输出层240具有类似分类交叉熵的损失函数,具体用于计算预测误差,一旦整个卷积神经网络300的前向传播(如图3由310至340方向的传播为前向传播)完成,反向传播(如图3由340至310方向的传播为反向传播)就会开始更新前面提到的各层的权重值以及偏差,以减少卷积神经网络300的损失,及卷积神经网络300通过输出层输出的结果和理想结果之间的误差。
需要说明的是,如图3所示的卷积神经网络300仅作为一种卷积神经网络的示例,在具体的应用中,卷积神经网络还可以以其他网络模型的形式存在。
本申请中,神经网络模型可以包括图3所示的卷积神经网络300,该神经网络模型可以对待处理图像进行处理,得到对应于待处理图像的色彩处理矩阵。
图4为本申请实施例提供的一种芯片硬件结构,该芯片包括神经网络处理器40。该芯片可以被设置在如图2所示的执行设备210中,用以完成计算模块211的计算工作。该芯片也可以被设置在如图2所示的训练设备220中,用以完成训练设备220的训练工作并输出目标模型/规则201。如图3所示的卷积神经网络中各层的算法均可在如图4所示的芯片中得以实现,可选地,该卷积神经网络可以为上述神经网络模型包括的(一个或多个)神经网络中的一个。
神经网络处理器NPU 40作为协处理器挂载到主CPU(host CPU)上,由主CPU分配任务。NPU的核心部分为运算电路403,控制器404控制运算电路403提取存储器(权重存储器或输入存储器)中的数据并进行运算。
在一些实现方式中,运算电路403内部包括多个处理单元(process engine,PE)。在一些实现方式中,运算电路403是二维脉动阵列。运算电路403还可以是一维脉动阵列或者能够执行例如乘法和加法这样的数学运算的其它电子线路。在一些实现方式中,运算电路403是通用的矩阵处理器。
举例来说,假设有输入矩阵A,权重矩阵B,输出矩阵C。运算电路403从权重存储器402中取矩阵B相应的数据,并缓存在运算电路403中每一个PE上。运算电路403从输入存储器401中取矩阵A数据与矩阵B进行矩阵运算,得到的矩阵的部分结果或最终结果,保存在累加器(accumulator)408中。
向量计算单元407可以对运算电路403的输出做进一步处理,如向量乘,向量加,指数运算,对数运算,大小比较等等。例如,向量计算单元407可以用于神经网络中非卷积/非FC层的网络计算,如池化(pooling),批归一化(batch normalization),局部响应归一化(local response normalization)等。
在一些实现方式中,向量计算单元能407将经处理的输出的向量存储到统一缓存器406。例如,向量计算单元407可以将非线性函数应用到运算电路403的输出,例如累加值的向量,用以生成激活值。在一些实现方式中,向量计算单元407生成归一化的值、合并值,或二者均有。在一些实现方式中,处理过的输出的向量能够用作到运算电路303的激活输入,例如用于在神经网络中的后续层中的使用。
统一存储器406用于存放输入数据以及输出数据。
权重数据直接通过存储单元访问控制器405(direct memory access controller,DMAC)将外部存储器中的输入数据搬运到输入存储器401和/或统一存储器406、将外部存储器中的权重数据存入权重存储器402,以及将统一存储器406中的数据存入外部存储器。
总线接口单元(bus interface unit,BIU)410,用于通过总线实现主CPU、DMAC和取指存储器409之间进行交互。
与控制器404连接的取指存储器(instruction fetch buffer)409,用于存储控制器404使用的指令;
控制器404,用于调用指存储器409中缓存的指令,实现控制该运算加速器的工作过程。
一般地,统一存储器406,输入存储器401,权重存储器402以及取指存储器409均为片上(On-Chip)存储器,外部存储器为该NPU外部的存储器,该外部存储器可以为双倍数据率同步动态随机存储器(double data rate synchronous dynamic random access memory,简称DDR SDRAM)、高带宽存储器(high bandwidth memory,HBM)或其他可读可写的存储器。
其中,图3所示的卷积神经网络中各层的运算可以由运算电路403或向量计算单元407执行。
上文中介绍的图2中的执行设备210能够执行本申请各方法实施例的各个步骤,可选地,图2中的执行设备210可以包括图3所示的CNN模型和图4所示的芯片。下面结合附图对本申请的方法实施例进行详细的介绍。
如背景技术部分所述,ISP处理中,对于全局颜色处理模块,传统方式是根据场景分类,从固定的几种色彩调整矩阵中选择适用的色彩调整矩阵,再使用选择的色彩调整矩阵对图像进行处理。该方式中色彩调整矩阵只能从固定的几种色彩调整矩阵中确定,使得图像色彩调整的效果并不理想。
针对上述问题,本申请提出了图像处理方法和神经网络模型的训练方法,能够提高图像色彩调整的效果。
图5是本申请实施例的神经网络模型的训练方法的示意性流程图。图5所示的方法可以由计算机设备、服务器设备或者运算设备等运算能力较强的设备来执行,例如,该方法可以由图4中的终端设备执行。图5所示的方法包括步骤510和520,下面分别对这几个步骤进行详细的介绍。
在510中,获取真值图像和训练图像。
其中,真值图像为对神经网络模型进行训练时使用的参考图像。真值图像也可以有其他叫法,例如,目标图像、期望图像、真值图等,下文统一称为真值图像。
获取真值图像的方式有很多,本申请实施例不作具体限定。
作为一个示例,可以通过色卡的标准值,构成真值图像。色卡的标准值为色卡色彩值的标准值,可以通过从色卡的提供商获取,例如,可以通过查询色卡提供商提供的色彩标准值表获取色卡的标准值。色彩值可以是RGB值等。本申请实施例对色卡的类型不作具体限定。例如,色卡可以是140色色卡、24色色卡等。
以24色色卡为例,对本申请实施例的真值图像的构造方法进行描述。24色色卡具有 24个标准值,使用这24个标准值构造真值图像。可选地,根据色卡色块的分布情况,构造真值图像。例如,色卡包括6行4列共24个不同颜色的色块,每个色块对应一个标准值,可以构造一个6*4的标准值矩阵,标准值矩阵中的元素的值分别为色卡相应位置的色块的色彩的标准值,所述6*4的标准值矩阵即为真值图像。
作为另一个示例,还可以通过对满足预期效果的图像进行特征提取得到真值图像。满足预期效果的图像可以是通过摄像设备拍摄的图像,也可以是合成的图像,本申请实施例不作具体限定。
训练图像也可以称为训练样本、样本数据等,用于对神经网络模型进行训练。在本申请实施例中,当真值图像由色卡色彩的标准值构成时,训练图像可以是拍摄的包括色卡的图像,这样就可以通过特征提取的方式,提取到实际拍摄的色卡的色彩值。这样,通过拍摄大量的包括色卡的图像,即可获取大量的训练图像。在本申请中,训练图像可以是raw图像。
在520中,根据所述真值图像和所述训练图像,对神经网络模型进行训练,所述神经网络模型的输出目标是用于对图像进行色彩调整的色彩调整矩阵。
本申请实施例的神经网络模型(或称神经网络)的目标输出为与输入的待处理图像相适应的色彩调整矩阵。也就是说,本申请实施例的神经网络模型可以根据每张待处理图像生成与之相适应的色彩调整矩阵。进一步使用该色彩调整矩阵对待处理图像进行色彩调整,可以提高图像的色彩调整效果。
色彩调整矩阵可以用于对待处理图像进行全局色彩调整。例如,色彩调整矩阵可以对待处理图像进行自动白平衡校正,对待处理图像进行色彩矫正,或对待处理图像同时进行自动白平衡校正和色彩矫正。相较于传统方式,对图像进行色彩调整时,可以不局限于预计算的几种色彩调整矩阵。
根据所述真值图像和训练图像,对神经网络模型进行训练的方式有很多,本申请实施例不作具体限定。
作为一个示例,根据训练图像,得到候选图像,候选图像为根据训练图像中所述色卡所在位置的色彩值构成的图像;根据训练图像和神经网络模型,得到与训练图像对应的色彩调整矩阵;使用得到的色彩调整矩阵对候选图像进行色彩调整,得到色彩调整后的图像;根据色彩调整后的图像和真值图像,对神经网络模型的模型参数进行调整。可选地,可以通过损失函数确定真值图像和色彩调整后的图像之间的图像误差。
作为另一个示例,根据训练图像,得到候选图像,候选图像为根据训练图像中所述色卡所在位置的色彩值构成的图像;对训练图像进行划分,得到多张子图像;通过神经网络模型对多张子图像分别进行处理,得到多个色彩调整矩阵,多张子图像与多个色彩调整矩阵一一对应;然后根据得到的多个色彩调整矩阵和训练图像,分别得到色彩调整后的图像;根据得到的色彩调整后的图像和真值图像,对神经网络模型的模型参数进行调整。可以理解地,在得到多张子图像之后,可以依次使用多张子图像中的每一张子图像对神经网络模型进行训练。例如,多张子图像包括第一子图像和第二子图像,使用第一子图像对神经网络模型进行训练后,在使用第二子图像对神经网络进行训练。
上述子图像可以是上述训练图像的一部分,不同的子图像之间可以重叠或不重叠。一种可能的划分方式为对训练图像进行patch处理。
示例性地,上述候选图像为训练图像包含的色卡所在位置的色彩值经过滤波处理后的图像。其中,滤波处理可以是均值滤波、中值滤波等任意可能的滤波方式。
本申请实施例中,具体地,可以根据色彩调整后的图像与真值图像之间的图像损失或者图像误差,对神经网络模型的模型参数进行调整。例如,可以根据得到的图像损失或者图像误差,通过梯度反向传递函数或者上文所述的反向传播算法,调整所述神经网络的模型参数,从而获取最终的神经网络模型。可选地,可以通过损失函数确定真值图像和色彩调整后的图像之间的图像误差。
图6是本申请实施例提供的神经网络模型训练过程的示意图。如图6所示,将训练图像或者训练图像划分得到的子图像输入到神经网络模型,得到神经网络模型的输出,即色彩调整矩阵,同时对该训练图像进行特征提取,得到训练图像包括的色卡的各个色块的色彩值,并使用得到的色彩值构建候选图像;将得到的色彩调整矩阵与候选图像进行矩阵乘法,以得到RGB图像,即色彩调整后的图像;将RGB图像与使用色卡色彩值标准值构造的真值图像进行比较,得到两者之间的误差;根据得到的误差,调整神经网络模型的模型参数;反复执行上述操作,直到得到的误差满足条件。可选地,训练图像可以为raw图像(或称rawdata),对训练图像进行特征提取可以是获取raw图像中色卡所在位置的rawdata数值。可选地,在对训练图像进行特征提取之前,还可以对训练图像进行预处理,例如,去马赛克处理、降噪处理等。
将候选图像与色彩调整矩阵进行矩阵运算有多种方式。
例如,可以采用以下的形式:
$$\begin{pmatrix} R' \\ G' \\ B' \end{pmatrix} = \begin{pmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{pmatrix}\begin{pmatrix} R \\ G \\ B \end{pmatrix}$$
其中a、b、c可以由神经网络模型确定,R'、G'、B'分别为经过色彩调整后的图像的色彩通路通道R的数值、色彩通路通道G的数值和色彩通路通道B通的数值,R、G、B分别为经过色彩调整前的图像的色彩通路通道R的数值、色彩通路通道G的数值和色彩通路通道B通的数值。
例如,可以采用以下的形式:
$$\begin{pmatrix} R' \\ G' \\ B' \end{pmatrix} = M_{3\times 3}\begin{pmatrix} R \\ G \\ B \end{pmatrix}$$
其中R'、G'、B'分别为经过色彩调整后的图像的色彩通路通道R的数值、色彩通路通道G的数值和色彩通路通道B通的数值,R、G、B分别为经过色彩调整前的图像的色彩通路通道R的数值、色彩通路通道G的数值和色彩通路通道B通的数值。
本申请实施例还可以使用如下的二次项、三次项、开方项等形式进行色彩调整:
$$\rho_{2,3} = \left(R,\, G,\, B,\, R^{2},\, G^{2},\, B^{2},\, RG,\, GB,\, RB\right)^{T}$$
其中R、G、B分别为经过色彩调整前的图像的色彩通路通道R的数值、色彩通路通道G的数值和色彩通路通道B通的数值,T表示转置。对应于上述格式,用于色彩调整的色彩调整矩阵会有不同的格式,以带有二次项的色彩调整矩阵为例,对训练图像进行色彩调整可以按照以下公式处理,此时用于色彩调整矩阵M可以是3*10的矩阵:
$$\begin{pmatrix} R' \\ G' \\ B' \end{pmatrix} = M\,\rho$$
其中R'、G'、B'分别为经过色彩调整后的图像的色彩通路通道R的数值、色彩通路通道G的数值和色彩通路通道B通的数值,R、G、B分别为经过色彩调整前的图像的色彩通路通道R的数值、色彩通路通道G的数值和色彩通路通道B通的数值。
需要说明的是,图6仅以色彩调整后的图像为RGB格式的图像为例,上述训练流程同样适用于其他格式的图像,例如,YUV格式的图像等。
图7是本申请实施例提供的神经网络模型的示意图。应理解,图7仅为示例性的,仅仅是为了帮助本领域技术人员理解本申请实施例,而非要将本申请实施例限于所例示的具体场景。本申请实施例的神经网络模型还可以是其他结构形式,只要可以是实现本申请实施例的方法即可。
图7仅示出了神经网络模型的高层网络部分,该高层网络部分包括卷积部分、卷积池化部分和全连接(full connected,fc)部分。卷积部分可以包括如图7所示的前M层,用于提取基础特征;卷积池化部分可以包括如图7所示的中间的N层和全局池化层,用于提取中层特征,该部分的中间的N层中的每一层又包括卷积层和平均池化层;全连接部分可以包括如图7所示的后K层,用于提取高层特征。图7中的H*W*C中的H、W、C分别表示图像的高度、宽度和通道数,卷积层、池化层的功能可以参考图3的相关描述,在此不再赘述。
图8示出了本申请实施例提供的图像处理方法的示意性流程图,图8所示的方法可以由计算机设备、服务器设备或者运算设备等运算能力较强的设备来执行,例如,该方法可以由图4中的终端设备执行。图8所示的方法包括步骤810-830,下面分别对这几个步骤进行详细的介绍。
在810中,获取待处理图像。
本申请实施例的待处理图像可以是获取得到的图像数据、图像信号等。在实际的应用中,待处理图像可以是通过图像采集设备(例如,镜头和传感器)获取的,也可以是从其他设备接收得到的,本申请实施例不作具体限定。
待处理图像可以是raw图像(或称原始图像、rawdata、raw image等)。当然,待处理图像也可以是经过除色彩调整之外的其他图像处理过程之后的图像。其他图像处理过程包括黑电平矫正、镜头阴影矫正、坏点矫正、去马赛克、拜耳域降噪、自动曝光、自动对焦等中的任意一个或任意多个的组合。
在820中,通过神经网络模型对所述待处理图像进行处理,以得到所述待处理图像的色彩调整矩阵。
当待处理图像为raw图像时,由于raw图像是CMOS或者CCD图像传感器将捕捉到的光源信号转化为数字信号得到的原始数据,是无损的,故其包含了物体原始的信息。神经网络模型的输入为raw图像,这样最大程度的保留了图像的信息,这样得到的色彩调整矩阵可以反映真实场景的信息,进而图像色彩调整效果也更好。
本申请实施例的神经网络模型是根据真值图像和训练图像训练得到的,其中真值图像是根据色卡真值构造的,训练图像包括所述色卡的图像。神经网络模型的训练过程可以参见图5所示的训练方法,在此不再赘述。
在830中,根据所述待处理图像的色彩调整矩阵对所述待处理图像进行色彩调整。
本申请实施例的神经网络模型(或称神经网络)的目标输出为与输入的待处理图像相适应的色彩调整矩阵。也就是说,本申请实施例的神经网络模型可以根据每张待处理图像生成与之相适应的色彩调整矩阵。进一步使用该色彩调整矩阵对待处理图像进行色彩调整,可以提高图像的色彩调整效果。
色彩调整矩阵可以用于对待处理图像进行全局色彩调整。例如,色彩调整矩阵可以对待处理图像进行自动白平衡校正,对待处理图像进行色彩矫正,或对待处理图像同时进行自动白平衡校正和色彩矫正。相较于传统方式,对图像进行色彩调整时,可以不局限于预计算的几种色彩调整矩阵。
图9是本申请实施例提供的图像处理过程的示意图。如图9所示,具体地,将待处理图像输入到神经网络模型,得到神经网络的输出,即该待处理图像的色彩调整矩阵,同时对该待处理图像进行预处理,例如去马赛克处理、降噪处理等;将得到的色彩调整矩阵与预处理后的待处理图像进行矩阵乘法,以得到RGB图像,即色彩调整后的图像。
在本申请中,训练得到的神经网络模型可以根据待处理图像生成针对于待处理图像的色彩调整矩阵,相较于传统方式,得到的色彩调整矩阵与待处理图像的场景更为契合。这样,使用上述技术方案得到的色彩调整矩阵对待处理图像,可以提高图像色彩调整的效果。
本文中描述的各个实施例可以为独立的方案,也可以根据内在逻辑进行组合,这些方案都落入本申请的保护范围中。
应理解,本申请实施例中的具体的例子只是为了帮助本领域技术人员更好地理解本申请实施例,而非限制本申请实施例的范围。
应理解,在本申请的各种实施例中,上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。
以上,结合图5至图9详细说明了本申请实施例提供的方法。以下,结合图10至图11详细说明本申请实施例提供的装置。应理解,装置实施例的描述与方法实施例的描述相互对应,因此,未详细描述的内容可以参见上文方法实施例,为了简洁,这里不再赘述。
图10是本申请实施例的神经网络模型训练装置的硬件结构示意图。图10所示的神经网络模型训练装置1000包括存储器1010、处理器1020、通信接口1030以及总线1040。其中,存储器1010、处理器1020、通信接口1030通过总线1040实现彼此之间的通信连接。
存储器1010可以是只读存储器(read only memory,ROM),静态存储设备,动态存储设备或者随机存取存储器(random access memory,RAM)。存储器1010可以存储程序,当存储器1010中存储的程序被处理器1020执行时,处理器1020和通信接口1020用于执行本申请实施例的神经网络模型训练方法的各个步骤。
处理器1020可以采用通用的中央处理器(central processing unit,CPU),微处理器,应用专用集成电路(application specific integrated circuit,ASIC),图形处理器(graphics processing unit,GPU)或者一个或多个集成电路,用于执行相关程序,以实现本申请实施例的神经网络模型训练装置中的单元所需执行的功能,或者执行本申请方法实施例的神经网络的训练方法。
处理器1020还可以是一种集成电路芯片,具有信号的处理能力。在实现过程中,本申请实施例的神经网络模型的训练方法的各个步骤可以通过处理器1020中的硬件的集成逻辑电路或者软件形式的指令完成。
上述处理器1020还可以是通用处理器、数字信号处理器(digital signal processing,DSP)、ASIC、现成可编程门阵列(field programmable gate array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。上述通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。结合本申请实施例所公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器1010,处理器1020读取存储器1010中的信息,结合其硬件完成本申请实施例的神经网络模型训练装置中包括的单元所需执行的功能,或者执行本申请方法实施例的神经网络模型训练方法。
通信接口1030使用例如但不限于收发器一类的收发装置,来实现装置1000与其他设备或通信网络之间的通信。例如,可以通过通信接口1030获取训练图像。
总线1040可包括在装置1000各个部件(例如,存储器1010、处理器1020、通信接口1030)之间传送信息的通路。
应理解,通过图10所示的神经网络模型的训练装置1000对神经网络模型进行训练,训练得到的神经网络模型就可以用于执行本申请中的图像处理方法了。具体地,通过装置1000对神经网络模型进行训练能够得到图5以及图8所示的方法中的神经网络模型。
具体地,图10所示的装置可以通过通信接口1030从外界获取训练图像、真值图像以及待训练的神经网络模型,然后由处理器根据训练图像和真值图像对待训练的神经网络模型进行训练。
图11是本申请实施例的图像处理装置的硬件结构示意图。与上述装置1000类似,图11所示的图像处理装置1100包括存储器1110、处理器1120、通信接口1130以及总线1140。其中,存储器1110、处理器1120、通信接口1130通过总线1140实现彼此之间的通信连接。
存储器1110可以存储程序,当存储器1110中存储的程序被处理器1120执行时,处理器1120用于执行本申请实施例的图像处理方法的各个步骤。
处理器1120可以采用通用的CPU,微处理器,ASIC,GPU或者一个或多个集成电路,用于执行相关程序,以实现本申请实施例的图像处理方法。
处理器1120还可以是一种集成电路芯片,具有信号的处理能力。在实现过程中,本申请实施例的图像处理方法的各个步骤可以通过处理器1120中的硬件的集成逻辑电路或者软件形式的指令完成。
处理器1120可以采用通用的中央处理器(central processing unit,CPU),微处理器,应用专用集成电路(application specific integrated circuit,ASIC),图形处理器(graphics processing unit,GPU)或者一个或多个集成电路,用于执行相关程序,以实现本申请实施例的图像处理装置中的单元所需执行的功能,或者执行本申请方法实施例的图像处理方法。
处理器1120还可以是一种集成电路芯片,具有信号的处理能力。在实现过程中,本申请实施例的图像处理方法的各个步骤可以通过处理器1120中的硬件的集成逻辑电路或者软件形式的指令完成。
上述处理器1120还可以是通用处理器、数字信号处理器(digital signal processing,DSP)、ASIC、现成可编程门阵列(field programmable gate array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。上述通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。结合本申请实施例所公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器1010,处理器1120读取存储器1110中的信息,结合其硬件完成本申请实施例的图像处理装置中包括的单元所需执行的功能,或者执行本申请方法实施例的图像处理方法。
通信接口1130使用例如但不限于收发器一类的收发装置,来实现装置1100与其他设备或通信网络之间的通信。例如,可以通过通信接口1130获取待处理图像。
总线1140可包括在装置1100各个部件(例如,存储器1110、处理器1120、通信接口1130)之间传送信息的通路。
应注意,尽管上述装置1000和装置1100仅仅示出了存储器、处理器、通信接口,但是在具体实现过程中,本领域的技术人员应当理解,装置1000和装置1100还可以包括实现正常运行所必须的其他器件。同时,根据具体需要,本领域的技术人员应当理解,装置1000和装置1100还可包括实现其他附加功能的硬件器件。此外,本领域的技术人员应当理解,装置1000和装置1100也可仅仅包括实现本申请实施例所必须的器件,而不必包括图10和图11中所示的全部器件。
应理解,本申请实施例中的存储器可以是易失性存储器或非易失性存储器,或可包括 易失性和非易失性存储器两者。其中,非易失性存储器可以是只读存储器(read-only memory,ROM)、可编程只读存储器(programmable ROM,PROM)、可擦除可编程只读存储器(erasable PROM,EPROM)、电可擦除可编程只读存储器(electrically EPROM,EEPROM)或闪存。易失性存储器可以是随机存取存储器(random access memory,RAM),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的随机存取存储器(random access memory,RAM)可用,例如静态随机存取存储器(static RAM,SRAM)、动态随机存取存储器(DRAM)、同步动态随机存取存储器(synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(double data rate SDRAM,DDR SDRAM)、增强型同步动态随机存取存储器(enhanced SDRAM,ESDRAM)、同步连接动态随机存取存储器(synchlink DRAM,SLDRAM)和直接内存总线随机存取存储器(direct rambus RAM,DR RAM)。
上述实施例,可以全部或部分地通过软件、硬件、固件或其他任意组合来实现。当使用软件实现时,上述实施例可以全部或部分地以计算机程序产品的形式实现。所述计算机程序产品包括一个或多个计算机指令或计算机程序。在计算机上加载或执行所述计算机指令或计算机程序时,全部或部分地产生按照本申请实施例所述的流程或功能。所述计算机可以为通用计算机、专用计算机、计算机网络、或者其他可编程装置。所述计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质传输,例如,所述计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。所述计算机可读存储介质可以是计算机能够存取的任何可用介质或者是包含一个或多个可用介质集合的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质(例如,软盘、硬盘、磁带)、光介质(例如,DVD)、或者半导体介质。半导体介质可以是固态硬盘。
应理解,本文中术语“和/或”,仅仅是一种描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况,其中A,B可以是单数或者复数。另外,本文中字符“/”,一般表示前后关联对象是一种“或”的关系,但也可能表示的是一种“和/或”的关系,具体可参考前后文进行理解。
本申请中,“至少一个”是指一个或者多个,“多个”是指两个或两个以上。“以下至少一项(个)”或其类似表达,是指的这些项中的任意组合,包括单项(个)或复数项(个)的任意组合。例如,a,b,或c中的至少一项(个),可以表示:a,b,c,a-b,a-c,b-c,或a-b-c,其中a,b,c可以是单个,也可以是多个。
应理解,在本申请的各种实施例中,上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本 申请的范围。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。
所述功能如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(Read-Only Memory,ROM)、随机存取存储器(Random Access Memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。

Claims (20)

  1. 一种神经网络模型训练方法,其特征在于,包括:
    获取真值图像和训练图像,所述真值图像是根据色卡的标准值构造的,所述训练图像为包含所述色卡的图像;
    根据所述真值图像和所述训练图像,对神经网络模型进行训练,所述神经网络模型的输出目标是用于对图像进行色彩调整的色彩调整矩阵。
  2. 根据权利要求1所述的方法,其特征在于,所述根据所述真值图像和所述训练图像,对神经网络模型进行训练,包括:
    根据所述训练图像,得到候选图像,所述候选图像为根据所述训练图像中所述色卡所在位置的色彩值构成的图像;
    根据所述训练图像和神经网络模型,得到与所述训练图像对应的色彩调整矩阵;
    使用所述色彩调整矩阵对所述候选图像进行色彩调整,得到色彩调整后的图像;
    根据所述色彩调整后的图像和所述真值图像,对所述神经网络模型的模型参数进行调整。
  3. 根据权利要求2所述的方法,其特征在于,所述根据所述训练图像和神经网络模型,得到与所述训练图像对应的色彩调整矩阵,包括:
    对所述训练图像进行划分,得到多张子图像;
    使用所述神经网络模型对所述多张子图像分别进行处理,得到多个所述色彩调整矩阵,所述多张子图像与所述多个所述色彩调整矩阵一一对应。
  4. 根据权利要求1至3中任一项所述的方法,其特征在于,所述色彩调整矩阵用于对图像进行自动白平衡校正和/或色彩校正。
  5. 根据权利要求1至4中任一项所述的方法,其特征在于,所述训练图像为raw图像。
  6. 一种图像处理方法,其特征在于,包括:
    获取待处理图像;
    通过神经网络模型对所述待处理图像进行处理,以得到所述待处理图像的色彩调整矩阵,所述神经网络模型是根据真值图像和训练图像训练得到的,所述真值图像是根据色卡的标准值构造的,所述训练图像为包含所述色卡的图像;
    根据所述待处理图像的色彩调整矩阵对所述待处理图像进行色彩调整,以获取色彩调整后的图像。
  7. 根据权利要求6所述的方法,其特征在于,所述待处理图像的色彩调整矩阵用于对所述待处理图像进行自动白平衡校正和/或色彩校正。
  8. 根据权利要求6或7所述的方法,其特征在于,所述根据所述待处理图像的色彩调整矩阵对所述待处理图像进行色彩调整,包括:
    将所述待处理图像的色彩调整矩阵与所述待处理图像进行矩阵运算,以得到色彩调整后的图像。
  9. 根据权利要求6至8中任一项所述的方法,其特征在于,所述待处理图像为raw 图像。
  10. 一种神经网络模型训练装置,其特征在于,包括:
    存储器,用于存储程序;
    处理器,用于执行所述存储器存储的程序,当所述存储器存储的程序被执行时,所述处理器用于执行以下过程:
    获取真值图像和训练图像,所述真值图像是根据色卡的标准值构造的,所述训练图像为包含所述色卡的图像;
    根据所述真值图像和所述训练图像,对神经网络模型进行训练,所述神经网络模型的输出目标是用于对图像进行色彩调整的色彩调整矩阵。
  11. 根据权利要求10所述的装置,其特征在于,所述处理器具体用于执行以下过程:
    根据所述训练图像,得到候选图像,所述候选图像为根据所述训练图像中所述色卡所在位置的色彩值构成的图像;
    根据所述训练图像和神经网络模型,得到与所述训练图像对应的色彩调整矩阵;
    使用所述色彩调整矩阵对所述候选图像进行色彩调整,得到色彩调整后的图像;
    根据所述色彩调整后的图像和所述真值图像,对所述神经网络模型的模型参数进行调整。
  12. 根据权利要求11所述的装置,其特征在于,所述处理器具体用于执行以下过程:
    对所述训练图像进行划分,得到多张子图像;
    使用所述神经网络模型对所述多张子图像分别进行处理,得到多个所述色彩调整矩阵,所述多张子图像与所述多个所述色彩调整矩阵一一对应。
  13. 根据权利要求10至12中任一项所述的装置,其特征在于,所述色彩调整矩阵用于对图像进行自动白平衡校正和/或色彩校正。
  14. 根据权利要求10至13中任一项所述的装置,其特征在于,所述训练图像为raw图像。
  15. 一种图像处理装置,其特征在于,包括:
    存储器,用于存储程序;
    处理器,用于执行所述存储器存储的程序,当所述存储器存储的程序被执行时,所述处理器用于执行以下过程:
    获取待处理图像;
    通过神经网络模型对所述待处理图像进行处理,以得到所述待处理图像的色彩调整矩阵,所述神经网络模型是根据真值图像和训练图像训练得到的,所述真值图像是根据色卡的标准值构造的,所述训练图像为包含所述色卡的图像;
    根据所述待处理图像的色彩调整矩阵对所述待处理图像进行色彩调整,以获取色彩调整后的图像。
  16. 根据权利要求15所述的装置,其特征在于,所述待处理图像的色彩调整矩阵用于对所述待处理图像进行自动白平衡校正和/或色彩校正。
  17. 根据权利要求15或16所述的装置,其特征在于,所述处理器具体用于执行以下过程:
    将所述待处理图像的色彩调整矩阵与所述待处理图像进行矩阵运算,以得到色彩调整 后的图像。
  18. 根据权利要求15至17中任一项所述的装置,其特征在于,所述待处理图像为raw图像。
  19. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质中存储有程序指令,当所述程序指令由处理器运行时,实现如权利要求1至5中任一项所述的方法,或者实现如权利要求6至9中任一项所述的方法。
  20. 一种芯片,其特征在于,所述芯片包括处理器与数据接口,所述处理器通过所述数据接口读取存储器上存储的指令,以执行如权利要求1至5中任一项所述的方法,或者执行如权利要求6至9中任一项所述的方法。
PCT/CN2019/124911 2019-12-12 2019-12-12 神经网络模型的训练方法、图像处理方法及其装置 WO2021114184A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201980102164.6A CN114730456A (zh) 2019-12-12 2019-12-12 神经网络模型的训练方法、图像处理方法及其装置
PCT/CN2019/124911 WO2021114184A1 (zh) 2019-12-12 2019-12-12 神经网络模型的训练方法、图像处理方法及其装置

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/124911 WO2021114184A1 (zh) 2019-12-12 2019-12-12 神经网络模型的训练方法、图像处理方法及其装置

Publications (1)

Publication Number Publication Date
WO2021114184A1 true WO2021114184A1 (zh) 2021-06-17

Family

ID=76329318

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/124911 WO2021114184A1 (zh) 2019-12-12 2019-12-12 神经网络模型的训练方法、图像处理方法及其装置

Country Status (2)

Country Link
CN (1) CN114730456A (zh)
WO (1) WO2021114184A1 (zh)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115424118B (zh) * 2022-11-03 2023-05-12 荣耀终端有限公司 一种神经网络训练方法、图像处理方法及装置


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107507250A (zh) * 2017-06-02 2017-12-22 北京工业大学 一种基于卷积神经网络的面色舌色图像颜色校正方法
US20190069821A1 (en) * 2017-09-05 2019-03-07 Cnoga Medical Ltd. Method and apparatus for non-invasive glucose measurement
CN109523485A (zh) * 2018-11-19 2019-03-26 Oppo广东移动通信有限公司 图像颜色校正方法、装置、存储介质及移动终端
CN109859117A (zh) * 2018-12-30 2019-06-07 南京航空航天大学 一种采用神经网络直接校正rgb值的图像颜色校正方法
CN109903256A (zh) * 2019-03-07 2019-06-18 京东方科技集团股份有限公司 模型训练方法、色差校正方法、装置、介质和电子设备
CN110400278A (zh) * 2019-07-30 2019-11-01 广东工业大学 一种图像颜色和几何畸变的全自动校正方法、装置及设备

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113556526A (zh) * 2021-07-18 2021-10-26 北京理工大学 一种基于rgbw滤光阵列的彩色夜视设备色彩增强方法
CN113592733A (zh) * 2021-07-22 2021-11-02 北京小米移动软件有限公司 图像处理方法、装置、存储介质及电子设备
CN113506332A (zh) * 2021-09-09 2021-10-15 北京的卢深视科技有限公司 目标对象识别的方法、电子设备及存储介质
CN113506332B (zh) * 2021-09-09 2021-12-17 北京的卢深视科技有限公司 目标对象识别的方法、电子设备及存储介质
CN114494523A (zh) * 2022-01-25 2022-05-13 合肥工业大学 一种有限色彩空间下的线稿自动上色模型训练方法、装置、电子设备及存储介质
CN114580630A (zh) * 2022-03-01 2022-06-03 厦门大学 用于ai芯片设计的神经网络模型训练方法及图形分类方法
CN114580630B (zh) * 2022-03-01 2024-05-31 厦门大学 用于ai芯片设计的神经网络模型训练方法及图形分类方法
WO2023207137A1 (zh) * 2022-04-28 2023-11-02 华为技术有限公司 图像处理方法及装置
CN115103168A (zh) * 2022-06-27 2022-09-23 展讯通信(上海)有限公司 图像生成方法、装置、电子设备及存储介质

Also Published As

Publication number Publication date
CN114730456A (zh) 2022-07-08

Similar Documents

Publication Publication Date Title
WO2021114184A1 (zh) 神经网络模型的训练方法、图像处理方法及其装置
WO2021043273A1 (zh) 图像增强方法和装置
WO2020192483A1 (zh) 图像显示方法和设备
CN110188795B (zh) 图像分类方法、数据处理方法和装置
WO2021043168A1 (zh) 行人再识别网络的训练方法、行人再识别方法和装置
WO2021073493A1 (zh) 图像处理方法及装置、神经网络的训练方法、合并神经网络模型的图像处理方法、合并神经网络模型的构建方法、神经网络处理器及存储介质
US20230214976A1 (en) Image fusion method and apparatus and training method and apparatus for image fusion model
WO2021018163A1 (zh) 神经网络的搜索方法及装置
US20230177641A1 (en) Neural network training method, image processing method, and apparatus
WO2020177607A1 (zh) 图像去噪方法和装置
EP4109392A1 (en) Image processing method and image processing device
US12039440B2 (en) Image classification method and apparatus, and image classification model training method and apparatus
WO2021135657A1 (zh) 图像处理方法、装置和图像处理系统
WO2021018245A1 (zh) 图像分类方法及装置
CN113034358B (zh) 一种超分辨率图像处理方法以及相关装置
US20220157046A1 (en) Image Classification Method And Apparatus
WO2021227787A1 (zh) 训练神经网络预测器的方法、图像处理方法及装置
CN114339054B (zh) 拍照模式的生成方法、装置和计算机可读存储介质
WO2022179606A1 (zh) 一种图像处理方法及相关装置
CN114627034A (zh) 一种图像增强方法、图像增强模型的训练方法及相关设备
WO2023083231A1 (en) System and methods for multiple instance segmentation and tracking
CN113096023A (zh) 神经网络的训练方法、图像处理方法及装置、存储介质
WO2023029559A1 (zh) 一种数据处理方法以及装置
CN111861877A (zh) 视频超分变率的方法和装置
WO2022193132A1 (zh) 图像检测方法、装置和电子设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 19955660; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 19955660; Country of ref document: EP; Kind code of ref document: A1)