WO2022078413A1 - Deep learning-based image toning method and apparatus, electronic device, and computer-readable storage medium - Google Patents
Deep learning-based image toning method and apparatus, electronic device, and computer-readable storage medium
- Publication number
- WO2022078413A1 (PCT/CN2021/123631)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- toning
- neural network
- network model
- input
- Prior art date
- 2020-10-13
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/90—Dynamic range modification of images or parts thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/60—Image enhancement or restoration using machine learning, e.g. neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
Definitions
- The present application relates to the technical field of video processing, and in particular to a deep learning-based image toning method, apparatus, electronic device, and computer-readable storage medium.
- Image toning enhancement refers to adjusting the contrast, saturation, hue, etc. of a picture or video frame through algorithms to change its overall or local color, including processing underexposure, overexposure, and low saturation so that the picture or video frame looks fuller and more vivid.
- Image toning enhancement technology is widely used in film and television production, photography, medical imaging, remote sensing imaging, and other fields.
- Image toning enhancement algorithms can also serve as preprocessing for image processing algorithms such as target recognition, target tracking, feature point matching, image fusion, and super-resolution reconstruction.
- In recent years, image enhancement technology based on deep learning has developed rapidly. These algorithms use supervised or semi-supervised learning to learn the mapping between images before and after enhancement from a large number of training data pairs.
- An existing deep learning method first classifies the image with a neural network and then loads the filter of the corresponding category to tone the image.
- The classification of this method cannot cover all types of shooting scenes and backgrounds, and it cannot accurately tone different regions of an image or different segments of a video, which easily causes color distortion, noise, artifacts, and insufficient restoration; there is also the problem that training the neural network model takes too long.
- The purpose of the present invention is to provide a deep learning-based image toning method, apparatus, electronic device, and computer-readable storage medium, aiming to overcome the defects of existing deep learning-based image toning.
- The present invention provides a deep learning-based image toning method, comprising: acquiring image training samples; constructing an image toning neural network model; accelerating the image toning neural network model; inputting the image to be toned into the image toning neural network model for toning; and outputting the toned image.
- The present invention provides a deep learning-based image toning apparatus, comprising: an acquisition module for acquiring image training samples; a construction module for constructing an image toning neural network model; an acceleration module for accelerating the image toning neural network model; an input module for inputting the image to be toned into the image toning neural network model for toning; and an output module for outputting the toned image.
- The present invention provides an electronic device, comprising a memory for storing executable instructions and a processor for implementing the above deep learning-based image toning method when executing the executable instructions stored in the memory.
- The deep learning-based image toning method of the present invention offers fast model construction, low implementation cost, and fast toning, and its toning results have a high aesthetic level, effectively reducing the time creators spend toning images or videos and improving both the quality of their work and their toning experience.
- FIG. 2 is a structural block diagram of an image toning apparatus in an embodiment of the present invention.
- FIG. 3 is a structural block diagram of an electronic device in an embodiment of the present invention.
- The deep learning-based image toning method in this embodiment includes the following steps.
- The image training samples are obtained in the following manner: the original image captured by the photographing device is used as the input sample, and a manually toned version of the input sample is used as the target sample.
- Specifically, multiple untoned plane images captured by a photographing device are collected; professional colorists tone the untoned plane images; other experts with a professional aesthetic level then compare the untoned and toned images and evaluate the toned images by subjective scoring; the toned images with scores greater than a set score threshold and the corresponding untoned original images are then selected as the target samples and input samples, respectively.
- The photographing device in this embodiment is any device that can obtain digital images, including but not limited to a single-lens reflex camera, a mirrorless camera, a mobile phone with photo and video functions, an action camera, and a panoramic camera. If the photographing device is a panoramic camera, the plane image is obtained by projecting or cropping the panoramic video obtained by the panoramic camera.
- Image training samples can also be obtained as follows: acquire an image without toning defects, degrade it, use the defect-free image as the target sample, and use the corresponding degraded image as the input sample. Specifically, a degraded image is obtained by applying at least one operation such as overexposure, underexposure, contrast reduction, color saturation reduction, or resolution reduction to an aesthetically pleasing image with full image quality; the degraded image is then used as the input sample, and the high-quality original is used as the target sample. It follows that multiple input samples can be obtained by performing different degradation operations on the same image without toning defects.
- To construct the model, a deep neural network model with bilateral guided upsampling is first built; the image training samples from S1 are input into it for training to obtain a trained deep neural network model; images with toning defects are then input to test the toning effect of the trained model, and the trained model is optimized according to that effect to obtain the image toning neural network model.
- The deep neural network model with bilateral guided upsampling in this embodiment is constructed as follows: downsample the input image, extract image features with a convolutional neural network to obtain a bilateral grid and smooth it, apply the bilateral grid for upsampling according to a guide map of the input image, and finally output the toned image.
- The model construction can refer to the paper: Gharbi M, Chen J, Barron J T, et al. Deep bilateral learning for real-time image enhancement[J]. ACM Transactions on Graphics (TOG), 2017, 36(4): 118.
- The original images captured by the camera or the degraded images are used as input samples, the corresponding toned images or images without toning defects are used as target samples, and these are input into the deep neural network model with bilateral guided upsampling for training.
- The mean squared error (MSE) loss function is used to evaluate the error between the output image of the deep neural network model and the target sample; when the loss value falls below 2×10⁻⁴, the loss function is considered converged and training is complete.
- A certain number (such as 10) of images with toning defects, outside the image training samples, are input into the trained neural network model to obtain multiple toned images.
- In step S23, it is determined whether the toning effect of each toned image has reached the predetermined toning effect; if not, the process goes to step S24; if so, it goes to step S25.
- Specifically, experts with a professional aesthetic level use subjective scoring to evaluate the toning effect of each toned image. If the scores of all images are greater than or equal to the set score threshold, the toning effect of the image toning neural network model is considered to have a high aesthetic level, and the process goes to step S25, i.e., the trained neural network model can be used as the image toning neural network model; if the score of at least one image is less than the set score threshold, the image toning neural network model is considered to still need optimization, and the process goes to step S24.
- In step S24, the images that did not reach the predetermined toning effect are manually toned until they achieve the ideal toning effect; the toned image is then subjected to different degradation operations to obtain multiple degraded images; the degraded images and the manually toned image are used as image training samples, and the process returns to step S21.
- Specifically, the one or more images that did not achieve the ideal toning effect are toned by professional colorists using toning software to obtain a toned image endorsed at a professional aesthetic level; the toned image is then subjected to different degradation operations such as overexposure, underexposure, contrast reduction, color saturation reduction, and resolution reduction; the resulting degraded images are used as input samples and the toned image as the target sample, and the process returns to step S21 to optimize the toning effect of the deep neural network model.
- Since the trained neural network model achieved the ideal toning effect on a certain number of images, its toning effect can be considered to have a high aesthetic level; no further training is required, and it can be used as the image toning neural network model.
- Through targeted training and continuous optimization of the deep neural network model, this embodiment improves the construction speed and toning effect of the image toning neural network model.
- A GPU (Graphics Processing Unit) is applied to accelerate the deep neural network model with bilateral guided upsampling so that the model runs in real time. Specifically: the parameters of the deep neural network model that has reached a high aesthetic level are exported as a binary file; an open-source deep neural network inference engine is integrated into the toning program, and the binary model parameters are imported; inference is run on the input image through the open-source inference engine API to obtain the bilateral grid and the guide map; the bilateral grid upsampling is accelerated on a programmable GPU using a graphics engine API; and the toned image is finally output.
- In step S4, the image to be toned is input into the image toning neural network model for toning.
- The untoned plane image or video is a digital plane image or video obtained by the photographing device. If the photographing device is a panoramic camera, the plane image or video is obtained by projecting or cropping the panoramic picture or video captured by the panoramic camera.
- If the input is a plane image, the toned image is output after the model tones it; if the input is a plane video, the video is split into plane video frames, each frame is toned by the model, and the toned frames are then re-stitched into a toned plane video.
- The toned image is output through the display of an electronic device such as a camera or mobile phone.
- This embodiment discloses a deep learning-based image toning apparatus, including: an acquisition module for acquiring image training samples; a construction module for constructing an image toning neural network model; an acceleration module for accelerating the image toning neural network model; an input module for inputting the image to be toned into the image toning neural network model for toning; and an output module for outputting the toned image.
- This embodiment discloses an electronic device including a memory and a processor. The memory stores executable instructions; the processor implements the deep learning-based image toning method of Embodiment 1 when executing the executable instructions stored in the memory.
- Executable instructions in this embodiment may take the form of programs, software, software modules, scripts, or code, written in any form of programming language (including compiled or interpreted languages, or declarative or procedural languages), and may be deployed in any form, including as a standalone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
- Executable instructions may, but need not, correspond to files in a file system; they may be stored as part of a file that holds other programs or data, for example in one or more scripts within a HyperText Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple cooperating files (e.g., files that store one or more modules, subprograms, or code sections).
- Executable instructions may be deployed to be executed on one computing device, on multiple computing devices located at one site, or on multiple computing devices distributed across multiple sites and interconnected by a communication network.
- This embodiment provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the deep learning-based image toning method in Embodiment 1 is implemented.
- The storage medium can be a computer-readable storage medium, for example, ferroelectric random access memory (FRAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), flash memory, magnetic surface memory, an optical disc, or compact disc read-only memory (CD-ROM); it can also be any device including one of or any combination of the above memories.
Abstract
The present invention provides a deep learning-based image toning method, comprising: acquiring image training samples; constructing an image toning neural network model; accelerating the image toning neural network model; inputting the image to be toned into the image toning neural network model for toning; and outputting the toned image. Compared with the prior art, the deep learning-based image toning method of the present invention offers fast model construction, low implementation cost, and fast toning, and its toning results have a high aesthetic level, effectively reducing the time creators spend toning images or videos and improving both the quality of their work and their toning experience.
Description
The present application relates to the technical field of video processing, and in particular to a deep learning-based image toning method and apparatus, an electronic device, and a computer-readable storage medium.
Image toning enhancement refers to adjusting the contrast, saturation, hue, etc. of a picture or video frame through algorithms to change its overall or local color, including processing underexposure, overexposure, and low saturation so that the picture or video frame looks fuller and more vivid. Image toning enhancement technology is widely used in film and television production, photography and videography, medical imaging, remote sensing imaging, and other fields. In addition, image toning enhancement algorithms can serve as preprocessing for image processing algorithms such as target recognition, target tracking, feature point matching, image fusion, and super-resolution reconstruction.
Traditional toning software works by applying preset image processing parameters, or a color lookup table generated from a previous toning session (commonly called a "preset" or "filter"), to map the pixel colors of the image or video being toned to new colors and so achieve the toning effect. Under this scheme, each preset suits only scenes within a specific color range; if the image contains rich colors or complex lighting, or the scenes in a video vary widely, the toned result suffers from inconsistent colors, color casts, and output that does not match the user's sense of aesthetics. When users need to adjust the color of a local image region or a video segment, they must select and fine-tune it manually, which cannot meet the need for fast toning.
In recent years, deep learning-based image enhancement has developed rapidly. These algorithms use supervised or semi-supervised learning to let a neural network learn the mapping between images before and after enhancement from a large number of training data pairs. An existing deep learning method first classifies the image with a neural network and then loads a filter of the corresponding category to tone the image. However, its classification cannot cover all types of scenes and backgrounds, and it cannot separately and accurately tone different regions of an image or different segments of a video, which easily causes color distortion, noise, artifacts, and insufficient restoration; in addition, training the neural network model takes too long.
The purpose of the present invention is to provide a deep learning-based image toning method and apparatus, an electronic device, and a computer-readable storage medium, aiming to overcome the defects of existing deep learning-based image toning.
In a first aspect, the present invention provides a deep learning-based image toning method, comprising: acquiring image training samples; constructing an image toning neural network model; accelerating the image toning neural network model; inputting the image to be toned into the image toning neural network model for toning; and outputting the toned image.
In a second aspect, the present invention provides a deep learning-based image toning apparatus, comprising: an acquisition module for acquiring image training samples; a construction module for constructing an image toning neural network model; an acceleration module for accelerating the image toning neural network model; an input module for inputting the image to be toned into the image toning neural network model for toning; and an output module for outputting the toned image.
In a third aspect, the present invention provides an electronic device, comprising a memory for storing executable instructions and a processor for implementing the above deep learning-based image toning method when executing the executable instructions stored in the memory.
In a fourth aspect, a computer-readable storage medium stores a computer program which, when executed by a processor, implements the above deep learning-based image toning method.
Compared with the prior art, the deep learning-based image toning method of the present invention offers fast model construction, low implementation cost, and fast toning, and its toning results have a high aesthetic level, effectively reducing the time creators spend toning images or videos and improving both the quality of their work and their toning experience.
FIG. 1 is a flowchart of the deep learning-based image toning method in an embodiment of the present invention.
FIG. 2 is a structural block diagram of the image toning apparatus in an embodiment of the present invention.
FIG. 3 is a structural block diagram of the electronic device in an embodiment of the present invention.
To make the purpose, technical solution, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here merely explain the present invention and do not limit it.
To illustrate the technical solution of the present invention, specific embodiments are described below.
Embodiment 1
As shown in FIG. 1, the deep learning-based image toning method in this embodiment includes the following steps.
S1: Acquire image training samples.
In this embodiment, the image training samples are obtained as follows: an original image captured by a photographing device serves as the input sample, and a manually toned version of the input sample serves as the target sample. Specifically, multiple untoned plane images captured by photographing devices are collected; professional colorists tone the untoned plane images; other experts with a professional aesthetic level then compare the untoned and toned images and evaluate the toned images by subjective scoring; finally, the toned images whose scores exceed a set score threshold and the corresponding untoned original images are selected as target samples and input samples, respectively. The photographing device in this embodiment is any device capable of producing digital images, including but not limited to a single-lens reflex camera, a mirrorless camera, a mobile phone with photo and video functions, an action camera, and a panoramic camera. If the photographing device is a panoramic camera, the plane image is obtained by projecting or cropping the panoramic video captured by the panoramic camera. A minimal sketch of the score-based selection follows.
Image training samples can also be obtained as follows: acquire an image without toning defects, degrade it, and then use the defect-free image as the target sample and the corresponding degraded image as the input sample. Specifically, a degraded image is obtained by applying at least one operation such as overexposure, underexposure, contrast reduction, color saturation reduction, or resolution reduction to an aesthetically pleasing image with full image quality; the degraded image then serves as the input sample and the original high-quality image as the target sample. It follows that multiple input samples can be obtained by applying different degradation operations to the same image without toning defects, as the sketch below illustrates.
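The degradation operations listed above lend themselves to a short sketch. The following Python/OpenCV code is illustrative only; the specific gains, blend weights, and downscale factor are assumed values, not parameters from the disclosure.

```python
# Illustrative sketch: synthesizing degraded input samples from one
# defect-free target image. All numeric parameters are assumptions.
import numpy as np
import cv2

def degrade(img, mode):
    """img: uint8 BGR image; returns a degraded copy."""
    img = img.astype(np.float32) / 255.0
    if mode == "overexpose":
        out = np.clip(img * 1.8, 0.0, 1.0)            # push highlights up
    elif mode == "underexpose":
        out = img * 0.4                                # crush exposure
    elif mode == "low_contrast":
        out = 0.5 + (img - 0.5) * 0.5                  # compress around mid-gray
    elif mode == "low_saturation":
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)[..., None]
        out = 0.6 * gray + 0.4 * img                   # blend toward grayscale
    elif mode == "low_resolution":
        h, w = img.shape[:2]
        small = cv2.resize(img, (w // 4, h // 4))
        out = cv2.resize(small, (w, h))                # down- then upsample
    else:
        raise ValueError(mode)
    return (out * 255.0).astype(np.uint8)

# One defect-free target yields several (input, target) training pairs.
target = cv2.imread("clean.jpg")  # assumed example path
pairs = [(degrade(target, m), target)
         for m in ("overexpose", "underexpose", "low_contrast",
                   "low_saturation", "low_resolution")]
```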
S2: Construct the image toning neural network model.
In this embodiment, a deep neural network model with bilateral guided upsampling is first constructed; the image training samples from S1 are input into it for training to obtain a trained deep neural network model; images with toning defects are then input to test the toning effect of the trained model, and the trained model is optimized according to that effect to obtain the image toning neural network model.
The deep neural network model with bilateral guided upsampling in this embodiment is constructed as follows: downsample the input image, extract image features with a convolutional neural network to obtain a bilateral grid and smooth it, apply the bilateral grid for upsampling according to a guide map of the input image, and finally output the toned image. For the model construction, refer to: Gharbi M, Chen J, Barron J T, et al. Deep bilateral learning for real-time image enhancement[J]. ACM Transactions on Graphics (TOG), 2017, 36(4): 118.
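To make the structure above concrete, the following PyTorch sketch follows the cited Gharbi et al. design at a high level: a low-resolution path predicts a bilateral grid of per-pixel 3×4 affine color transforms, a full-resolution path predicts a single-channel guide map, and the grid is sliced at each pixel's (x, y, guide) coordinate to produce the toned output. Layer sizes, grid dimensions, and all names are illustrative assumptions, not the exact architecture of the paper or of this disclosure.

```python
# Hedged sketch of an HDRNet-style bilateral-grid toning network.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BilateralToner(nn.Module):
    def __init__(self, grid=(16, 16, 8)):  # assumed grid size
        super().__init__()
        self.gh, self.gw, self.gd = grid
        # Low-res path: predict a bilateral grid of 3x4 affine matrices.
        self.coeffs = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, self.gd * 12, 3, padding=1),
        )
        # Full-res path: a learned single-channel guide map in [0, 1].
        self.guide = nn.Sequential(nn.Conv2d(3, 1, 1), nn.Sigmoid())

    def forward(self, x):
        b, _, h, w = x.shape
        low = F.interpolate(x, size=(self.gh * 4, self.gw * 4))  # downsample
        grid = self.coeffs(low)                        # (b, gd*12, gh, gw)
        grid = grid.view(b, 12, self.gd, self.gh, self.gw)
        g = self.guide(x)                              # (b, 1, h, w)
        # Slice: sample the 3-D grid at (x, y, guide) for each pixel.
        ys, xs = torch.meshgrid(
            torch.linspace(-1, 1, h, device=x.device),
            torch.linspace(-1, 1, w, device=x.device), indexing="ij")
        loc = torch.stack([xs.expand(b, h, w), ys.expand(b, h, w),
                           g.squeeze(1) * 2 - 1], dim=-1)   # (b, h, w, 3)
        A = F.grid_sample(grid, loc.unsqueeze(1), align_corners=True)
        A = A.squeeze(2).view(b, 3, 4, h, w)           # per-pixel 3x4 affine
        ones = torch.ones(b, 1, h, w, device=x.device)
        xh = torch.cat([x, ones], dim=1)               # homogeneous RGB
        return torch.einsum("bijhw,bjhw->bihw", A, xh) # toned output
```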
The construction of the image toning neural network model in this embodiment includes the following steps:
S21: Input the image training samples into the deep neural network model and train until the model loss function converges, obtaining the trained deep neural network model.
Specifically, the original images captured by the photographing device or the degraded images serve as input samples, the corresponding toned images or images without toning defects serve as target samples, and these are input into the deep neural network model with bilateral guided upsampling for training. During training, the mean squared error (MSE) loss function evaluates the error between the model's output image and the target sample. When the loss value falls below 2×10⁻⁴, the loss function is considered to have converged and the training of the deep neural network model is complete.
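A minimal training loop consistent with the criterion above (MSE loss, trained until the loss falls below 2×10⁻⁴) might look as follows; the optimizer, learning rate, and `train_loader` are assumptions, and `BilateralToner` refers to the sketch given earlier.

```python
import torch

def train(model, train_loader, threshold=2e-4):
    """Train until the MSE loss falls below `threshold` (2e-4 in the text)."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)  # assumed settings
    mse = torch.nn.MSELoss()
    while True:
        for inp, tgt in train_loader:    # (input sample, target sample) pairs
            opt.zero_grad()
            loss = mse(model(inp), tgt)
            loss.backward()
            opt.step()
            if loss.item() < threshold:  # convergence criterion from the text
                return model
```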
S22: Input a certain number of images with toning defects, outside the image training samples, into the trained deep neural network model to obtain toned images.
Specifically, a certain number (e.g., 10) of images with toning defects outside the training samples are input into the trained neural network model to obtain multiple toned images.
S23: Judge whether the toning effect of every toned image reaches the predetermined toning effect; if not, go to step S24; if so, go to step S25.
Specifically, experts with a professional aesthetic level evaluate the toning effect of each toned image by subjective scoring. If the scores of all images are greater than or equal to the set score threshold, the toning effect of the image toning neural network model is considered to have a high aesthetic level, and the process goes to step S25, i.e., the trained neural network model can be used as the image toning neural network model; if at least one image scores below the set score threshold, the image toning neural network model is considered to still need optimization, and the process goes to step S24.
S24: Manually tone the images that did not reach the predetermined toning effect to obtain images with the ideal toning effect, apply different degradation operations to the toned images to obtain multiple degraded images, then use the degraded images and the manually toned images as image training samples and return to step S21.
Specifically, the one or more images that did not achieve the ideal toning effect are toned by professional colorists using toning software until the result is endorsed at a professional aesthetic level; the toned image is then subjected to different degradation operations (e.g., overexposure, underexposure, contrast reduction, color saturation reduction, resolution reduction) to obtain multiple degraded images; the degraded images serve as input samples and the toned image as the target sample, and the process returns to step S21, i.e., the toning effect of the deep neural network model is optimized.
S25: Use the trained deep neural network model as the image toning neural network model.
Since the trained neural network model achieved the ideal toning effect on a certain number of images, its toning effect can be considered to have a high aesthetic level; no further training is needed, and it can serve as the image toning neural network model.
As the above steps show, this embodiment improves the construction speed and toning effect of the image toning neural network model through targeted training and continuous optimization of the deep neural network model.
S3: Accelerate the image toning neural network model.
A GPU (Graphics Processing Unit) is applied to accelerate the deep neural network model with bilateral guided upsampling so that the model runs in real time. Specifically: export the parameters of the deep neural network model that has reached a high aesthetic level as a binary file; integrate an open-source deep neural network inference engine into the toning program and import the binary model parameters; run inference on the input image through the open-source inference engine API to obtain the bilateral grid and the guide map; implement the bilateral grid upsampling acceleration on a programmable GPU using a graphics engine API; and finally output the toned image. Accelerating the image toning neural network model in this way helps increase its image toning speed.
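The disclosure names no particular inference engine or graphics API. As one plausible realization of the export-and-integrate steps above, the sketch below wraps the network so that the exported graph yields only the bilateral grid and the guide map, leaving the slicing and upsampling to a GPU shader (not shown); ONNX and onnxruntime are illustrative choices, and all sizes and names are assumptions.

```python
# Hedged deployment sketch: the inference engine produces (grid, guide);
# the bilateral-grid slicing runs separately on the GPU via a shader.
import torch
import torch.nn.functional as F
import onnxruntime as ort

class CoeffAndGuide(torch.nn.Module):
    """Wraps BilateralToner so inference yields only (grid, guide)."""
    def __init__(self, toner):
        super().__init__()
        self.toner = toner

    def forward(self, x):
        b = x.shape[0]
        low = F.interpolate(x, size=(64, 64))                # assumed low-res size
        grid = self.toner.coeffs(low).view(b, 12, 8, 16, 16)
        return grid, self.toner.guide(x)

wrapped = CoeffAndGuide(BilateralToner().eval())
dummy = torch.randn(1, 3, 512, 512)
torch.onnx.export(wrapped, dummy, "toner.onnx",
                  input_names=["image"], output_names=["grid", "guide"])

sess = ort.InferenceSession("toner.onnx",
                            providers=["CUDAExecutionProvider",
                                       "CPUExecutionProvider"])
grid, guide = sess.run(None, {"image": dummy.numpy()})
# grid and guide would then be uploaded as textures and sliced in a
# fragment/compute shader to produce the toned full-resolution image.
```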
S4: Input the image to be toned into the image toning neural network model for toning.
Untoned plane image or video data is input into the image toning neural network model for toning. The untoned plane image or video is a digital plane image or video produced by a photographing device. If the photographing device is a panoramic camera, the plane image or video is obtained by projecting or cropping the panoramic picture or video captured by the panoramic camera. During toning, if the input is a plane image, the toned image is output after the model tones it; if the input is a plane video, the video is split into plane video frames, each frame is toned by the model, and the toned frames are re-stitched into a toned plane video, as sketched below.
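The split, tone, and re-stitch path for plane video described above can be sketched with OpenCV as follows; `tone_frame` stands in for the accelerated model and is assumed, as are the codec details.

```python
# Sketch of the video path: split into frames, tone each, re-stitch.
import cv2

def tone_video(src_path, dst_path, tone_frame):
    cap = cv2.VideoCapture(src_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    fourcc = cv2.VideoWriter_fourcc(*"mp4v")  # assumed codec
    out = cv2.VideoWriter(dst_path, fourcc, fps, (w, h))
    while True:
        ok, frame = cap.read()           # split into plane video frames
        if not ok:
            break
        out.write(tone_frame(frame))     # tone each frame, then re-stitch
    cap.release()
    out.release()
```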
S5: Output the toned image.
The toned image is output through the display of an electronic device such as a camera or mobile phone.
Embodiment 2
As shown in FIG. 2, this embodiment discloses a deep learning-based image toning apparatus, including: an acquisition module for acquiring image training samples; a construction module for constructing an image toning neural network model; an acceleration module for accelerating the image toning neural network model; an input module for inputting the image to be toned into the image toning neural network model for toning; and an output module for outputting the toned image.
Embodiment 3
As shown in FIG. 3, this embodiment discloses an electronic device including a memory and a processor. The memory stores executable instructions; the processor implements the deep learning-based image toning method of Embodiment 1 when executing the executable instructions stored in the memory.
Executable instructions in this embodiment may take the form of programs, software, software modules, scripts, or code, written in any form of programming language (including compiled or interpreted languages, or declarative or procedural languages), and may be deployed in any form, including as a standalone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
By way of example, executable instructions may, but need not, correspond to files in a file system; they may be stored as part of a file that holds other programs or data, for example in one or more scripts within a HyperText Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple cooperating files (e.g., files that store one or more modules, subprograms, or code sections). By way of example, executable instructions may be deployed to be executed on one computing device, on multiple computing devices located at one site, or on multiple computing devices distributed across multiple sites and interconnected by a communication network.
Embodiment 4
This embodiment provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the deep learning-based image toning method of Embodiment 1.
Those of ordinary skill in the art will understand that all or part of the steps of the above methods can be completed by a program instructing the relevant hardware. The storage medium may be a computer-readable storage medium, for example, ferroelectric random access memory (FRAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), flash memory, magnetic surface memory, an optical disc, or compact disc read-only memory (CD-ROM); it may also be any device including one of or any combination of the above memories.
The above are merely preferred embodiments of the present invention and are not intended to limit it; any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall be included within its scope of protection.
Claims (10)
- A deep learning-based image toning method, characterized by comprising: S1: acquiring image training samples; S2: constructing an image toning neural network model; S3: accelerating the image toning neural network model; S4: inputting the image to be toned into the image toning neural network model for toning; S5: outputting the toned image.
- The image toning method according to claim 1, characterized in that acquiring image training samples in step S1 comprises: first acquiring a panoramic image, and then converting the panoramic image into a plane image.
- The image toning method according to claim 1, characterized in that the image training samples in step S1 comprise input samples and target samples, wherein an input sample is an original image captured by a photographing device, and a target sample is an image obtained by manually toning the input sample.
- The image toning method according to claim 1, characterized in that the image training samples in step S1 comprise input samples and target samples, wherein a target sample is an image without toning defects, and an input sample is an image obtained by degrading the target sample.
- The image toning method according to claim 1, characterized in that constructing the image toning neural network model in step S2 comprises: S21: inputting the image training samples into a deep neural network model and training until the model loss function converges, to obtain a trained deep neural network model; S22: inputting a certain number of images with toning defects, outside the image training samples, into the trained deep neural network model to obtain toned images; S23: judging whether the toning effect of every toned image reaches a predetermined toning effect, and if not, proceeding to step S24, or if so, proceeding to step S25; S24: manually toning the images that did not reach the predetermined toning effect to obtain images with the ideal toning effect, applying different degradation operations to the toned images to obtain multiple degraded images, then using the degraded images and the manually toned images as image training samples and returning to step S21; S25: using the trained deep neural network model as the image toning neural network model.
- The image toning method according to claim 1, characterized in that accelerating the image toning neural network model in step S3 comprises: exporting the parameters of the deep neural network model as a binary file; integrating an open-source deep neural network inference engine into the toning program and importing the binary deep neural network model parameters; running inference on the input image through the open-source neural network inference engine API to obtain a bilateral grid and a guide map; and implementing the bilateral grid upsampling acceleration on a programmable GPU using a graphics engine API, finally outputting the toned image.
- The image toning method according to claim 1, characterized in that inputting the image to be toned into the image toning neural network model for toning in step S4 comprises: downsampling the image to be toned, extracting image features with a convolutional neural network to obtain a bilateral grid and smoothing it, and then applying the bilateral grid for upsampling according to a guide map of the input image.
- A deep learning-based image toning apparatus, characterized by comprising: an acquisition module for acquiring image training samples; a construction module for constructing an image toning neural network model; an acceleration module for accelerating the image toning neural network model; an input module for inputting the image to be toned into the image toning neural network model for toning; and an output module for outputting the toned image.
- An electronic device, characterized by comprising: a memory for storing executable instructions; and a processor for implementing the deep learning-based image toning method of any one of claims 1 to 7 when executing the executable instructions stored in the memory.
- A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and when executed by a processor, the computer program implements the deep learning-based image toning method of any one of claims 1 to 7.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011088622.6 | 2020-10-13 | ||
CN202011088622.6A CN114359058A (zh) | 2020-10-13 | 2020-10-13 | Deep learning-based image toning method and computer-readable storage medium
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022078413A1 (zh) | 2022-04-21 |
Family
ID=81089505
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/123631 WO2022078413A1 (zh) | 2021-10-13 | Deep learning-based image toning method and apparatus, electronic device, and computer-readable storage medium
Country Status (2)
Country | Link |
---|---|
CN (1) | CN114359058A (zh) |
WO (1) | WO2022078413A1 (zh) |
- 2020-10-13: CN application CN202011088622.6A filed; published as CN114359058A (status: active, pending)
- 2021-10-13: PCT application PCT/CN2021/123631 filed; published as WO2022078413A1 (status: active, application filing)
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200286213A1 (en) * | 2015-09-02 | 2020-09-10 | Irystec Software Inc. | System and method for real-time tone-mapping |
US20190164261A1 (en) * | 2017-11-28 | 2019-05-30 | Adobe Inc. | High dynamic range illumination estimation |
CN110612549A (zh) * | 2017-12-15 | 2019-12-24 | Google LLC | Machine learning-based technique for fast image enhancement
CN110634147A (zh) * | 2019-09-19 | 2019-12-31 | Yanfeng Visteon Electronic Technology (Shanghai) Co., Ltd. | Image matting method based on bilateral guided upsampling
CN111598799A (zh) * | 2020-04-30 | 2020-08-28 | Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences | Image toning enhancement method and image toning enhancement neural network training method
Non-Patent Citations (2)
Title |
---|
CHEN, JIAWEN ET AL.: "Bilateral Guided Upsampling", ACM TRANSACTIONS ON GRAPHICS, vol. 35, no. 6, 30 November 2016 (2016-11-30), pages 1 - 8, XP058306385, DOI: 10.1145/2980179.2982423 * |
GHARBI, MICHAEL ET AL.: "Deep Bilateral Learning for Real-Time Image Enhancement", ACM TRANSACTIONS ON GRAPHICS, vol. 36, no. 4, 31 July 2017 (2017-07-31), pages 1 - 11, XP058372892, DOI: 10.1145/3072959.3073592 *
Also Published As
Publication number | Publication date |
---|---|
CN114359058A (zh) | 2022-04-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11182877B2 (en) | Techniques for controlled generation of training data for machine learning enabled image enhancement | |
Yang et al. | Image correction via deep reciprocating HDR transformation | |
US9275445B2 (en) | High dynamic range and tone mapping imaging techniques | |
CN111669514B (zh) | High dynamic range imaging method and apparatus | |
CN108401154B (zh) | No-reference quality assessment method for image exposure | |
CN111292264A (zh) | Deep learning-based image high dynamic range reconstruction method | |
JP2020530920A (ja) | Image illumination method and apparatus, electronic device, and storage medium | |
US20230146181A1 (en) | Integrated machine learning algorithms for image filters | |
CN111598799A (zh) | Image toning enhancement method and image toning enhancement neural network training method | |
US20190362478A1 (en) | Machine learning techniques for increasing color consistency across videos | |
KR20040043157A (ko) | Method and system for modifying a digital image taking into account its noise | |
CN114862698B (zh) | Channel-guided real overexposed image correction method and apparatus | |
CN113096029A (zh) | High dynamic range image generation method based on a multi-branch encoder-decoder neural network | |
WO2021213336A1 (zh) | Image quality enhancement device and related method | |
CN104902168B (zh) | Image synthesis method and apparatus, and photographing device | |
Panetta et al. | Deep perceptual image enhancement network for exposure restoration | |
CN114298942A (zh) | Image deblurring method and apparatus, computer-readable medium, and electronic device | |
CN112819699A (zh) | Video processing method and apparatus, and electronic device | |
CN112991236B (zh) | Template-based image enhancement method and apparatus | |
CN112200737B (zh) | Reinforcement learning-based image processing method and apparatus, and storage medium | |
US20240013354A1 (en) | Deep SDR-HDR Conversion | |
CN117058019A (zh) | Pyramid enhancement network-based method for object detection in low light | |
WO2022078413A1 (zh) | Deep learning-based image toning method and apparatus, electronic device, and computer-readable storage medium | |
CN116614714A (zh) | Camera perception characteristic-guided real exposure correction method and system | |
CN114638764B (zh) | Artificial intelligence-based multi-exposure image fusion method and system | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the EPO has been informed by WIPO that EP was designated in this application |
Ref document number: 21879456 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
32PN | Ep: public notification in the EP bulletin as address of the addressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 01.09.2023) |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 21879456 Country of ref document: EP Kind code of ref document: A1 |