WO2020233129A1 - Image super-resolution and coloring method, system and electronic device - Google Patents
Image super-resolution and coloring method, system and electronic device
- Publication number
- WO2020233129A1 (PCT/CN2019/130536)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- resolution
- module
- feature map
- layer
- image
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/001—Texturing; Colouring; Generation of texture or colour
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4046—Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4053—Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
Definitions
- This application belongs to the field of image processing technology, and in particular relates to an image super-resolution and coloring method, system and electronic equipment.
- Image super-resolution technology refers to recovering a high-resolution image from a low-resolution image or image sequence. The technique was first proposed in the field of optics and is now widely used in image compression, medical imaging, and remote sensing imaging.
- Traditional super-resolution techniques include interpolation-based methods (e.g., nearest-neighbor, bilinear, and bicubic interpolation) and reconstruction-based methods (e.g., non-uniform interpolation, iterative back-projection, and maximum a posteriori estimation). At present, deep learning methods achieve the best results (see the brief example below).
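As a concrete illustration of the interpolation-based baseline, here is a minimal PyTorch sketch; the tensor size and the 2× scale factor are arbitrary choices for illustration, not values taken from the patent:

```python
import torch
import torch.nn.functional as F

# Interpolation-based super-resolution baseline: upscale a
# low-resolution grayscale tensor 2x with bicubic interpolation.
lr = torch.rand(1, 1, 64, 64)  # dummy (N, C, H, W) low-resolution image
hr = F.interpolate(lr, scale_factor=2, mode='bicubic', align_corners=False)
print(hr.shape)  # torch.Size([1, 1, 128, 128])
```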
- Image coloring is the process of pseudo-coloring black-and-white grayscale images to make them more perceptually informative and visually appealing.
- Image coloring technology has applications in industrial production, medical image processing, and the restoration of early film and television works and old photographs. Because the mapping from a grayscale image to a color image is one-to-many, colorization is a difficult task.
- Common image coloring methods can be divided into methods based on local color hints and methods based on color transfer. Deep learning techniques are increasingly being applied to the image coloring process, and more and more models have emerged, including the article [Wang F Y, Zhang J, Zheng X, et al.
- The cited article first computes a chrominance map, which adjusts the spatial positions of the chrominance samples provided by the low-resolution input image. The chrominance map is then used to colorize the final result based on the super-resolved luminance channel.
- When applying this method to learning-based super-resolution techniques, the article also introduces a back-projection step that first normalizes the luminance channel before the image is colored.
- This application provides an image super-resolution and coloring method, system, and electronic device, which aim to solve at least one of the above technical problems in the prior art to a certain extent.
- An image super-resolution and coloring method including the following steps:
- Step a: Design a new network model combining image super-resolution and coloring, the network model including an encoding module and a decoding module;
- Step b: Input the low-resolution grayscale image into the network model;
- Step c: Extract the semantic feature map of the low-resolution grayscale image through the encoding module and pass it to the decoding module; the decoding module superimposes the semantic feature maps of each level and outputs the high-resolution color image.
- Further, the encoding module includes 13 convolutional layers, 5 pooling layers, and 3 fully connected layers. The convolution kernel of every convolutional layer is 3×3; the convolution output channels of the 1st and 2nd layers are 64, those of the 3rd and 4th layers are 128, those of the 5th, 6th, and 7th layers are 256, those of the 8th, 9th, and 10th layers are 512, and those of the 11th, 12th, and 13th layers are 512. The downsampling rate of each pooling layer is 2; the first four pooling layers are max pooling and the final pooling layer is global average pooling. The numbers of nodes in the 3 fully connected layers are 4096, 4096, and 1000, respectively (a code sketch follows below).
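For illustration only, the following is a minimal PyTorch sketch of such a VGG16-style encoder. The layer arrangement follows standard VGG16 with the last pooling layer swapped for global average pooling as described; the single-channel input, the variable names, and the exact classifier wiring after pooling are assumptions:

```python
import torch.nn as nn

# Channel plan from the description: 13 conv layers (all 3x3),
# 5 pooling layers with downsampling rate 2 ('P' = 2x2 max pooling,
# 'G' = global average pooling), then 3 fully connected layers.
VGG16_PLAN = [64, 64, 'P', 128, 128, 'P', 256, 256, 256, 'P',
              512, 512, 512, 'P', 512, 512, 512, 'G']

def make_encoder(in_channels: int = 1) -> nn.Sequential:
    """Build the 13-conv VGG16-style encoder; the input is assumed to be
    a single-channel (grayscale) image. In the full model, the outputs
    of conv layers 2, 4, 7, 10, and 13 would also be retained as the
    skip feature maps A..E."""
    layers, c_in = [], in_channels
    for spec in VGG16_PLAN:
        if spec == 'P':
            layers.append(nn.MaxPool2d(kernel_size=2, stride=2))
        elif spec == 'G':
            layers.append(nn.AdaptiveAvgPool2d(1))
        else:
            layers += [nn.Conv2d(c_in, spec, kernel_size=3, padding=1),
                       nn.ReLU(inplace=True)]
            c_in = spec
    return nn.Sequential(*layers)

# The 3 fully connected layers with 4096, 4096, and 1000 nodes follow
# the global average pooling, as in the original VGG16 classifier head.
classifier = nn.Sequential(
    nn.Flatten(),
    nn.Linear(512, 4096), nn.ReLU(inplace=True),
    nn.Linear(4096, 4096), nn.ReLU(inplace=True),
    nn.Linear(4096, 1000),
)
```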
- Further, the decoding module includes up-sampling modules, dilated ("hole") residual modules, and a hybrid up-sampling module; the number of up-sampling modules and dilated residual modules is at least two.
- Each dilated residual module includes two multi-branch convolution modules with varying dilation rates.
- Each convolution module includes 4 branches, and each branch first compresses the channel dimension with an ordinary convolution and then applies a dilated convolution.
- The dilation rates of the four branch dilated convolutions are 1, 2, 3, and 5, respectively.
- The hybrid up-sampling module includes a bilinear interpolation branch with fixed parameters and a deconvolution (transposed convolution) branch with learnable parameters; see the sketch after this list.
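A minimal PyTorch sketch of these two modules, under stated assumptions: the per-branch compression width, the use of 1×1 convolutions for compression, concatenation of branch outputs, the residual placement, and additive fusion in the hybrid up-sampler are all choices made here for illustration; the patent specifies only the branch count, the dilation rates, and the two up-sampling branches.

```python
import torch
import torch.nn as nn

class MultiBranchBlock(nn.Module):
    """Four parallel branches; each first compresses channels with an
    ordinary 1x1 convolution, then applies a 3x3 dilated convolution.
    Branch outputs are concatenated back to the input width."""
    def __init__(self, channels: int, dilations=(1, 2, 3, 5)):
        super().__init__()
        mid = channels // len(dilations)  # compression width is an assumption
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(channels, mid, kernel_size=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(mid, mid, kernel_size=3, padding=d, dilation=d),
                nn.ReLU(inplace=True),
            ) for d in dilations)

    def forward(self, x):
        return torch.cat([b(x) for b in self.branches], dim=1)

class DilatedResidualModule(nn.Module):
    """Two multi-branch blocks in sequence, wrapped in a residual
    connection (the residual placement is an assumption)."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(MultiBranchBlock(channels),
                                  MultiBranchBlock(channels))

    def forward(self, x):
        return x + self.body(x)

class HybridUpsample(nn.Module):
    """A fixed bilinear interpolation branch plus a learnable
    transposed-convolution branch; the two outputs are fused here
    by addition, which is an assumption."""
    def __init__(self, channels: int):
        super().__init__()
        self.bilinear = nn.Upsample(scale_factor=2, mode='bilinear',
                                    align_corners=False)
        self.deconv = nn.ConvTranspose2d(channels, channels, kernel_size=4,
                                         stride=2, padding=1)

    def forward(self, x):
        return self.bilinear(x) + self.deconv(x)
```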
- The technical solution adopted in the embodiment of the application further includes: in step c, extracting the semantic feature map of the low-resolution grayscale image through the encoding module, passing it to the decoding module, and having the decoding module superimpose the semantic feature maps of each level and output the high-resolution color image is specifically: the 13 convolutional layers extract semantic feature maps of the low-resolution grayscale image; after the 5 pooling layers and 3 fully connected layers, the semantic feature maps A, B, C, D, and E output by the 2nd, 4th, 7th, 10th, and 13th convolutional layers are retained and passed in turn to the decoding module.
- The feature map output after E passes through a dilated residual module and the hybrid up-sampling module is stacked with the feature map output after D passes through a dilated residual module; the stacked feature map passes through a dilated residual module and the hybrid up-sampling module and is then stacked with the feature map produced after the next shallower map C passes through a dilated residual module, and so on, until finally the feature map produced after A, B, C, D, and E have passed through the decoding module is added to the feature map produced after the low-resolution grayscale image passes through the hybrid up-sampling module; the sum is the output high-resolution color image (outlined in code below).
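Reusing the two module sketches above, this top-down fusion can be outlined as a decoder forward pass. The 1×1 merge convolutions, the final 3-channel projection, and the channel plan are illustrative assumptions; only the stacking order and the global grayscale residual come from the text.

```python
import torch
import torch.nn as nn

class Decoder(nn.Module):
    """Top-down fusion of the retained maps A..E (from conv layers
    2, 4, 7, 10, 13). Channel widths follow the encoder; the 1x1
    merge convolutions and the 3-channel output projection are
    assumptions, since the text only specifies the stacking order."""
    def __init__(self, chans=(64, 128, 256, 512, 512)):
        super().__init__()
        ca, cb, cc, cd, ce = chans
        deep = (ce, cd, cc, cb)   # channels entering each fusion stage
        skip = (cd, cc, cb, ca)   # channels of D, C, B, A
        self.deep_dr = nn.ModuleList(DilatedResidualModule(c) for c in deep)
        self.skip_dr = nn.ModuleList(DilatedResidualModule(c) for c in skip)
        self.up = nn.ModuleList(HybridUpsample(c) for c in deep)
        self.merge = nn.ModuleList(
            nn.Conv2d(d + s, s, kernel_size=1) for d, s in zip(deep, skip))
        self.final_up = HybridUpsample(ca)
        self.to_color = nn.Conv2d(ca, 3, kernel_size=3, padding=1)
        self.gray_up = HybridUpsample(1)

    def forward(self, feats, gray):
        A, B, C, D, E = feats
        x = E
        for s, dr_d, dr_s, up, merge in zip(
                (D, C, B, A), self.deep_dr, self.skip_dr, self.up, self.merge):
            # refine + 2x upsample the deep path, then stack it with the
            # refined skip map and merge the channels
            x = merge(torch.cat([up(dr_d(x)), dr_s(s)], dim=1))
        out = self.to_color(self.final_up(x))
        # global residual: the grayscale input, upsampled by the hybrid
        # module, is added to the output (broadcast over the 3 channels)
        return out + self.gray_up(gray)
```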
- An image super-resolution and coloring system, in which a new network model combining image super-resolution and coloring is designed;
- the network model includes an encoding module and a decoding module;
- the low-resolution grayscale image is input to the network model; the encoding module extracts the semantic feature map of the low-resolution grayscale image and passes it to the decoding module, which superimposes the semantic feature maps of each level and outputs the high-resolution color image.
- The encoding module includes 13 convolutional layers, 5 pooling layers, and 3 fully connected layers. The convolution kernel of every convolutional layer is 3×3; the convolution output channels of the 1st and 2nd layers are 64, those of the 3rd and 4th layers are 128, those of the 5th, 6th, and 7th layers are 256, those of the 8th, 9th, and 10th layers are 512, and those of the 11th, 12th, and 13th layers are 512. The downsampling rate of each pooling layer is 2; the first four pooling layers are max pooling and the last pooling layer is global average pooling. The numbers of nodes in the 3 fully connected layers are 4096, 4096, and 1000, respectively.
- The technical solution adopted in the embodiment of the present application further includes: the decoding module includes up-sampling modules, dilated residual modules, and a hybrid up-sampling module, and the number of up-sampling modules and dilated residual modules is at least two;
- each dilated residual module includes two multi-branch convolution modules with varying dilation rates;
- each convolution module includes 4 branches, and each branch first compresses the channel dimension with an ordinary convolution and then applies a dilated convolution;
- the dilation rates of the four branch dilated convolutions are 1, 2, 3, and 5, respectively;
- the hybrid up-sampling module includes a bilinear interpolation branch with fixed parameters and a deconvolution branch with learnable parameters.
- The technical solution adopted in the embodiment of the application further includes: the image super-resolution and coloring procedure of the network model is specifically: the 13 convolutional layers extract semantic feature maps of the low-resolution grayscale image; after the 5 pooling layers and 3 fully connected layers, the semantic feature maps A, B, C, D, and E output by the 2nd, 4th, 7th, 10th, and 13th convolutional layers are retained and passed in turn to the decoding module.
- The feature map output after E passes through a dilated residual module and the hybrid up-sampling module is stacked with the feature map output after D passes through a dilated residual module; the stacked feature map passes through a dilated residual module and the hybrid up-sampling module and is then stacked with the feature map produced after the next shallower map C passes through a dilated residual module, and so on, until finally the feature map produced after A, B, C, D, and E have passed through the decoding module is added to the feature map produced after the low-resolution grayscale image passes through the hybrid up-sampling module; the sum is the output high-resolution color image.
- an electronic device including:
- at least one processor; and
- a memory communicatively connected with the at least one processor; wherein,
- the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the following operations of the above-mentioned image super-resolution and coloring method:
- Step a: Design a new network model combining image super-resolution and coloring, the network model including an encoding module and a decoding module;
- Step b: Input the low-resolution grayscale image into the network model;
- Step c: Extract the semantic feature map of the low-resolution grayscale image through the encoding module and pass it to the decoding module; the decoding module superimposes the semantic feature maps of each level and outputs the high-resolution color image.
- The beneficial effects produced by the embodiments of the present application are as follows: the image super-resolution and coloring method, system, and electronic device of the embodiments design a new network model for the super-resolution and coloring problems and process the two tasks jointly, directly mapping a low-resolution grayscale image to a high-resolution color image. This improves visual perception and saves resources and time.
- In addition, the network model of the present application can optimize all parameters jointly without biasing toward a particular sub-model.
- FIG. 1 is a flowchart of an image super-resolution and coloring method according to an embodiment of the present application.
- FIG. 2 is a schematic structural diagram of a network model according to an embodiment of the application.
- FIG. 3 is a structural diagram of a dilated residual module according to an embodiment of the application.
- FIG. 4 is a schematic structural diagram of an image super-resolution and coloring system according to an embodiment of the present application.
- FIG. 5 is a schematic diagram of the hardware device structure of an image super-resolution and coloring method provided by an embodiment of the present application.
- FIG. 1 is a flowchart of an image super-resolution and coloring method according to an embodiment of the present application.
- the image super-resolution and coloring method of the embodiment of the present application includes the following steps:
- Step 100: Design a new network model combining image super-resolution and coloring.
- Please refer to FIG. 2, which is a schematic structural diagram of the network model according to an embodiment of this application.
- the network model of the embodiment of the present application includes an encoding module and a decoding module.
- Specifically, the encoding module is VGG16 (a classic convolutional network model).
- VGG16 has been pre-trained on the large-scale ImageNet database and therefore has a certain ability to extract semantic information.
- The encoding module includes 13 convolutional layers, 5 pooling layers, and 3 fully connected layers. The convolution kernel of every convolutional layer is 3×3; the convolution output channels of the 1st and 2nd layers are 64, those of the 3rd and 4th layers are 128, those of the 5th, 6th, and 7th layers are 256, those of the 8th, 9th, and 10th layers are 512, and those of the 11th, 12th, and 13th layers are 512.
- The downsampling rate of each pooling layer is 2; the first four pooling layers are max pooling and the last is global average pooling. These are followed by 3 fully connected layers with 4096, 4096, and 1000 nodes, respectively.
- The decoding module includes multiple up-sampling modules (the specific number depends on the task), multiple dilated residual modules (likewise task-dependent), and a hybrid up-sampling module.
- The number of up-sampling modules and dilated residual modules is at least two; the specific number can be set according to the actual task.
- Please refer to FIG. 3, which is a structural diagram of the dilated residual module according to an embodiment of the application.
- The dilated residual module mainly includes two multi-branch convolution modules with varying dilation rates.
- Each convolution module includes 4 branches, and each branch first compresses the channel dimension with an ordinary convolution and then applies a dilated convolution.
- The dilation rates of the four branch dilated convolutions are 1, 2, 3, and 5, respectively (for a 3×3 kernel, these correspond to effective receptive fields of 3, 5, 7, and 11 pixels, so the branches capture context at several scales).
- The hybrid up-sampling module includes a bilinear interpolation branch with fixed parameters and a deconvolution branch with learnable parameters; the feature maps of the two branches are fused to form the module output.
- Step 200: Input the low-resolution grayscale image into the network model.
- Step 300: Extract the semantic feature map of the low-resolution grayscale image through the encoding module of the network model, and pass the semantic feature map to the decoding module.
- The decoding module uses the dilated residual modules and the hybrid up-sampling module to superimpose the semantic feature maps of each level and outputs the high-resolution color image.
- In step 300, it is assumed that the semantic feature maps output by the 2nd, 4th, 7th, 10th, and 13th convolutional layers are A, B, C, D, and E, respectively.
- The feature map overlay method is specifically: the semantic feature maps A, B, C, D, and E are passed in turn to at least two dilated residual modules. Since the sizes of A, B, C, D, and E differ, each time feature maps are stacked, the smaller feature map is first enlarged by a factor of 2 through the hybrid up-sampling module (all upsampling rates are 2).
- The feature map output after E passes through a dilated residual module and the hybrid up-sampling module is stacked with the feature map output after D passes through a dilated residual module.
- The stacked feature map passes through a dilated residual module and the hybrid up-sampling module and is then stacked with the feature map produced after the next shallower map C passes through a dilated residual module, and so on, until finally the feature map produced after A, B, C, D, and E have passed through the decoding module is added to the feature map produced after the input low-resolution grayscale image passes through the hybrid up-sampling module; the sum is the output high-resolution color image.
- After repeated training on the training set, the network model continuously updates its parameters and finally learns, to a certain extent, the mapping function from low-resolution grayscale images to high-resolution color images; a hedged sketch of one training step follows.
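The loss function, optimizer, and data pairing below are assumptions for illustration; the patent does not specify them.

```python
import torch
import torch.nn.functional as F

# Hypothetical training step. `model` maps a low-resolution grayscale
# batch to a high-resolution color batch; (lr_gray, hr_color) pairs are
# typically made by downsampling and desaturating ground-truth images.
def train_step(model, optimizer, lr_gray, hr_color):
    optimizer.zero_grad()
    pred = model(lr_gray)               # (N, 3, 2H, 2W)
    loss = F.l1_loss(pred, hr_color)    # pixel reconstruction loss (assumed)
    loss.backward()
    optimizer.step()
    return loss.item()

# e.g. optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```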
- FIG. 4 is a schematic structural diagram of an image super-resolution and coloring system according to an embodiment of the present application.
- The image super-resolution and coloring system of the embodiment of the application is built around a new network model designed by combining image super-resolution and coloring.
- the network model includes an encoding module and a decoding module.
- The low-resolution grayscale image is input to the network model; the encoding module extracts the semantic feature map of the low-resolution grayscale image and passes it to the decoding module.
- The decoding module uses the dilated residual modules and the hybrid up-sampling module to superimpose the semantic feature maps of each level and outputs the high-resolution color image.
- The encoding module of the network model is VGG16 (a classic convolutional network model).
- VGG16 has been pre-trained on the large-scale ImageNet database and therefore has a certain ability to extract semantic information.
- The encoding module includes 13 convolutional layers, 5 pooling layers, and 3 fully connected layers. The convolution kernel of every convolutional layer is 3×3; the convolution output channels of the 1st and 2nd layers are 64, those of the 3rd and 4th layers are 128, those of the 5th, 6th, and 7th layers are 256, those of the 8th, 9th, and 10th layers are 512, and those of the 11th, 12th, and 13th layers are 512.
- The downsampling rate of each pooling layer is 2; the first four pooling layers are max pooling and the last is global average pooling. These are followed by 3 fully connected layers with 4096, 4096, and 1000 nodes, respectively.
- The 13 convolutional layers extract semantic feature maps of the input image; after the 5 pooling layers and 3 fully connected layers, the semantic feature maps output by the 2nd, 4th, 7th, 10th, and 13th convolutional layers are retained and passed in turn to the decoding module.
- The decoding module includes multiple up-sampling modules (the specific number depends on the task), multiple dilated residual modules (likewise task-dependent), and a hybrid up-sampling module.
- The number of up-sampling modules and dilated residual modules is at least two; the specific number can be set according to the actual task.
- The dilated residual module mainly includes two multi-branch convolution modules with varying dilation rates. Each convolution module includes 4 branches, and each branch first compresses the channel dimension with an ordinary convolution and then applies a dilated convolution.
- The dilation rates of the four branch dilated convolutions are 1, 2, 3, and 5, respectively.
- The hybrid up-sampling module includes a bilinear interpolation branch with fixed parameters and a deconvolution branch with learnable parameters; the feature maps of the two branches are fused to form the module output.
- The image super-resolution and coloring procedure of the network model is specifically as follows: assuming the semantic feature maps output by the 2nd, 4th, 7th, 10th, and 13th convolutional layers are A, B, C, D, and E, respectively, the maps A, B, C, D, and E are passed in turn to at least two dilated residual modules. Since their sizes differ, each time feature maps are stacked, the smaller feature map is first enlarged by a factor of 2 through the hybrid up-sampling module (all upsampling rates are 2).
- The feature map output after E passes through a dilated residual module and the hybrid up-sampling module is stacked with the feature map output after D passes through a dilated residual module.
- The stacked feature map passes through a dilated residual module and the hybrid up-sampling module and is then stacked with the feature map produced after the next shallower map C passes through a dilated residual module, and so on, until finally the feature map produced after A, B, C, D, and E have passed through the decoding module is added to the feature map produced after the input low-resolution grayscale image passes through the hybrid up-sampling module; the sum is the output high-resolution color image.
- After repeated training on the training set, the network model continuously updates its parameters and finally learns, to a certain extent, the mapping function from low-resolution grayscale images to high-resolution color images.
- FIG. 5 is a schematic diagram of the hardware device structure of an image super-resolution and coloring method provided by an embodiment of the present application.
- The device includes one or more processors and a memory (one processor is taken as an example); the device may also include an input system and an output system.
- The processor, the memory, the input system, and the output system may be connected by a bus or in other ways.
- Connection by a bus is taken as an example.
- the memory can be used to store non-transitory software programs, non-transitory computer executable programs, and modules.
- the processor executes various functional applications and data processing of the electronic device by running non-transitory software programs, instructions, and modules stored in the memory, that is, realizing the processing methods of the foregoing method embodiments.
- the memory may include a program storage area and a data storage area, where the program storage area can store an operating system and an application program required by at least one function; the data storage area can store data and the like.
- the memory may include a high-speed random access memory, and may also include a non-transitory memory, such as at least one magnetic disk storage device, a flash memory device, or other non-transitory solid state storage devices.
- The memory may optionally include storage remotely located relative to the processor, and such remote storage may be connected to the processing system through a network. Examples of such networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
- the input system can receive input digital or character information, and generate signal input.
- the output system may include display devices such as a display screen.
- the one or more modules are stored in the memory, and when executed by the one or more processors, the following operations of any of the foregoing method embodiments are performed:
- Step a: Design a new network model combining image super-resolution and coloring, the network model including an encoding module and a decoding module;
- Step b: Input the low-resolution grayscale image into the network model;
- Step c: Extract the semantic feature map of the low-resolution grayscale image through the encoding module and pass it to the decoding module; the decoding module superimposes the semantic feature maps of each level and outputs the high-resolution color image.
- the embodiment of the present application provides a non-transitory (nonvolatile) computer storage medium, the computer storage medium stores computer executable instructions, and the computer executable instructions can perform the following operations:
- Step a: Design a new network model combining image super-resolution and coloring, the network model including an encoding module and a decoding module;
- Step b: Input the low-resolution grayscale image into the network model;
- Step c: Extract the semantic feature map of the low-resolution grayscale image through the encoding module and pass it to the decoding module; the decoding module superimposes the semantic feature maps of each level and outputs the high-resolution color image.
- The embodiment of the present application provides a computer program product; the computer program product includes a computer program stored on a non-transitory computer-readable storage medium, and the computer program includes program instructions which, when executed by a computer, cause the computer to perform the following operations:
- Step a: Design a new network model combining image super-resolution and coloring, the network model including an encoding module and a decoding module;
- Step b: Input the low-resolution grayscale image into the network model;
- Step c: Extract the semantic feature map of the low-resolution grayscale image through the encoding module and pass it to the decoding module; the decoding module superimposes the semantic feature maps of each level and outputs the high-resolution color image.
- In summary, the image super-resolution and coloring method, system, and electronic device of the embodiments of the application design a new network model for the super-resolution and coloring problem, process the two tasks jointly, and directly map a low-resolution grayscale image to a high-resolution color image, improving visual perception and saving resources and time.
- Moreover, the network model of the present application can optimize all parameters jointly without biasing toward a particular sub-model.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Image Processing (AREA)
- Image Analysis (AREA)
- Compression Of Band Width Or Redundancy In Fax (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Claims (9)
- 1. An image super-resolution and coloring method, characterized by comprising the following steps: step a: designing a new network model combining image super-resolution and coloring, the network model comprising an encoding module and a decoding module; step b: inputting a low-resolution grayscale image into the network model; step c: extracting a semantic feature map of the low-resolution grayscale image through the encoding module and passing the semantic feature map to the decoding module, the decoding module superimposing the semantic feature maps of each level and outputting a high-resolution color image.
- 2. The image super-resolution and coloring method according to claim 1, characterized in that, in step a, the encoding module comprises 13 convolutional layers, 5 pooling layers, and 3 fully connected layers; the convolution kernel of each convolutional layer is 3×3; the convolution output channels of the 1st and 2nd layers are 64, those of the 3rd and 4th layers are 128, those of the 5th, 6th, and 7th layers are 256, those of the 8th, 9th, and 10th layers are 512, and those of the 11th, 12th, and 13th layers are 512; the downsampling rate of the pooling layers is 2, the first four pooling layers are max pooling, and the final pooling layer is global average pooling; the numbers of nodes in the 3 fully connected layers are 4096, 4096, and 1000, respectively.
- 3. The image super-resolution and coloring method according to claim 1 or 2, characterized in that, in step a, the decoding module comprises up-sampling modules, dilated residual modules, and a hybrid up-sampling module, the number of up-sampling modules and dilated residual modules being at least two; each dilated residual module comprises two multi-branch convolution modules with varying dilation rates, each convolution module comprises 4 branches, each branch first compresses the channel dimension with an ordinary convolution and then applies a dilated convolution, and the dilation rates of the four branch dilated convolutions are 1, 2, 3, and 5, respectively; the hybrid up-sampling module comprises a bilinear interpolation branch with fixed parameters and a deconvolution branch with learnable parameters.
- 4. The image super-resolution and coloring method according to claim 3, characterized in that, in step c, extracting the semantic feature map of the low-resolution grayscale image through the encoding module, passing the semantic feature map to the decoding module, and the decoding module superimposing the semantic feature maps of each level and outputting the high-resolution color image is specifically: extracting semantic feature maps of the low-resolution grayscale image through the 13 convolutional layers; after the 5 pooling layers and 3 fully connected layers, retaining the semantic feature maps A, B, C, D, and E output by the 2nd, 4th, 7th, 10th, and 13th convolutional layers and passing them in turn to the decoding module; stacking the feature map output after feature map E passes through a dilated residual module and the hybrid up-sampling module with the feature map output after feature map D passes through a dilated residual module; stacking the feature map output after the stacked feature map passes through a dilated residual module and the hybrid up-sampling module with the feature map produced after the next shallower feature map C passes through a dilated residual module; and proceeding in turn until, finally, the feature map produced after feature maps A, B, C, D, and E have passed through the decoding module is added to the feature map produced after the low-resolution grayscale image passes through the hybrid up-sampling module, the sum being the output high-resolution color image.
- 5. An image super-resolution and coloring system, characterized by comprising a new network model designed by combining image super-resolution and coloring, the network model comprising an encoding module and a decoding module; a low-resolution grayscale image is input to the network model, the encoding module extracts a semantic feature map of the low-resolution grayscale image and passes the semantic feature map to the decoding module, and the decoding module superimposes the semantic feature maps of each level and outputs a high-resolution color image.
- 6. The image super-resolution and coloring system according to claim 5, characterized in that the encoding module comprises 13 convolutional layers, 5 pooling layers, and 3 fully connected layers; the convolution kernel of each convolutional layer is 3×3; the convolution output channels of the 1st and 2nd layers are 64, those of the 3rd and 4th layers are 128, those of the 5th, 6th, and 7th layers are 256, those of the 8th, 9th, and 10th layers are 512, and those of the 11th, 12th, and 13th layers are 512; the downsampling rate of the pooling layers is 2, the first four pooling layers are max pooling, and the final pooling layer is global average pooling; the numbers of nodes in the 3 fully connected layers are 4096, 4096, and 1000, respectively.
- 7. The image super-resolution and coloring system according to claim 5 or 6, characterized in that the decoding module comprises up-sampling modules, dilated residual modules, and a hybrid up-sampling module, the number of up-sampling modules and dilated residual modules being at least two; each dilated residual module comprises two multi-branch convolution modules with varying dilation rates, each convolution module comprises 4 branches, each branch first compresses the channel dimension with an ordinary convolution and then applies a dilated convolution, and the dilation rates of the four branch dilated convolutions are 1, 2, 3, and 5, respectively; the hybrid up-sampling module comprises a bilinear interpolation branch with fixed parameters and a deconvolution branch with learnable parameters.
- 8. The image super-resolution and coloring system according to claim 7, characterized in that the image super-resolution and coloring procedure of the network model is specifically: extracting semantic feature maps of the low-resolution grayscale image through the 13 convolutional layers; after the 5 pooling layers and 3 fully connected layers, retaining the semantic feature maps A, B, C, D, and E output by the 2nd, 4th, 7th, 10th, and 13th convolutional layers and passing them in turn to the decoding module; stacking the feature map output after feature map E passes through a dilated residual module and the hybrid up-sampling module with the feature map output after feature map D passes through a dilated residual module; stacking the feature map output after the stacked feature map passes through a dilated residual module and the hybrid up-sampling module with the feature map produced after the next shallower feature map C passes through a dilated residual module; and proceeding in turn until, finally, the feature map produced after feature maps A, B, C, D, and E have passed through the decoding module is added to the feature map produced after the low-resolution grayscale image passes through the hybrid up-sampling module, the sum being the output high-resolution color image.
- 9. An electronic device, comprising: at least one processor; and a memory communicatively connected with the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the following operations of the image super-resolution and coloring method according to any one of claims 1 to 4: step a: designing a new network model combining image super-resolution and coloring, the network model comprising an encoding module and a decoding module; step b: inputting a low-resolution grayscale image into the network model; step c: extracting a semantic feature map of the low-resolution grayscale image through the encoding module and passing the semantic feature map to the decoding module, the decoding module superimposing the semantic feature maps of each level and outputting a high-resolution color image.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910411068.1A CN110163801B (zh) | 2019-05-17 | 2019-05-17 | 一种图像超分辨和着色方法、系统及电子设备 |
CN201910411068.1 | 2019-05-17 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020233129A1 true WO2020233129A1 (zh) | 2020-11-26 |
Family
ID=67631101
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/130536 WO2020233129A1 (zh) | 2019-05-17 | 2019-12-31 | 一种图像超分辨和着色方法、系统及电子设备 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110163801B (zh) |
WO (1) | WO2020233129A1 (zh) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112967293A (zh) * | 2021-03-04 | 2021-06-15 | 首都师范大学 | 一种图像语义分割方法、装置及存储介质 |
US20210209732A1 (en) * | 2020-06-17 | 2021-07-08 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Face super-resolution realization method and apparatus, electronic device and storage medium |
CN113255675A (zh) * | 2021-04-13 | 2021-08-13 | 西安邮电大学 | 基于扩张卷积和残差路径的图像语义分割网络结构及方法 |
CN113505792A (zh) * | 2021-06-30 | 2021-10-15 | 中国海洋大学 | 面向非均衡遥感图像的多尺度语义分割方法及模型 |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110163801B (zh) * | 2019-05-17 | 2021-07-20 | 深圳先进技术研究院 | 一种图像超分辨和着色方法、系统及电子设备 |
CN111179283A (zh) * | 2019-12-30 | 2020-05-19 | 深圳市商汤科技有限公司 | 图像语义分割方法及装置、存储介质 |
CN111353940B (zh) * | 2020-03-31 | 2021-04-02 | 成都信息工程大学 | 一种基于深度学习迭代上下采样的图像超分辨率重建方法 |
CN111654721A (zh) * | 2020-04-17 | 2020-09-11 | 北京奇艺世纪科技有限公司 | 视频处理方法、系统、电子设备及存储介质 |
CN111950469A (zh) * | 2020-08-14 | 2020-11-17 | 上海云从汇临人工智能科技有限公司 | 一种道路标识检测方法、系统、设备和介质 |
CA3195077A1 (en) * | 2020-10-07 | 2022-04-14 | Dante DE NIGRIS | Systems and methods for segmenting 3d images |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106855996A (zh) * | 2016-12-13 | 2017-06-16 | 中山大学 | 一种基于卷积神经网络的灰阶图像着色方法及其装置 |
CN107657586A (zh) * | 2017-10-13 | 2018-02-02 | 深圳市唯特视科技有限公司 | 一种基于深度残差网络的单照片超分辨增强方法 |
CN107833183A (zh) * | 2017-11-29 | 2018-03-23 | 安徽工业大学 | 一种基于多任务深度神经网络的卫星图像同时超分辨和着色的方法 |
CN108830912A (zh) * | 2018-05-04 | 2018-11-16 | 北京航空航天大学 | 一种深度特征对抗式学习的交互式灰度图像着色方法 |
CN109118491A (zh) * | 2018-07-30 | 2019-01-01 | 深圳先进技术研究院 | 一种基于深度学习的图像分割方法、系统及电子设备 |
CN110163801A (zh) * | 2019-05-17 | 2019-08-23 | 深圳先进技术研究院 | 一种图像超分辨和着色方法、系统及电子设备 |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103489174B (zh) * | 2013-10-08 | 2016-06-29 | 武汉大学 | 一种基于残差保持的人脸超分辨率方法 |
WO2017029681A1 (en) * | 2015-08-14 | 2017-02-23 | Siddhartha Bhattacharyya | A led based colorimeter device |
CN105844589B (zh) * | 2016-03-21 | 2018-12-21 | 深圳市未来媒体技术研究院 | 一种基于混合成像系统的实现光场图像超分辨的方法 |
CN107767343B (zh) * | 2017-11-09 | 2021-08-31 | 京东方科技集团股份有限公司 | 图像处理方法、处理装置和处理设备 |
CN108596841B (zh) * | 2018-04-08 | 2021-01-19 | 西安交通大学 | 一种并行实现图像超分辨率及去模糊的方法 |
CN108717698A (zh) * | 2018-05-28 | 2018-10-30 | 深圳市唯特视科技有限公司 | 一种基于深度卷积生成对抗网络的高质量图像生成方法 |
-
2019
- 2019-05-17 CN CN201910411068.1A patent/CN110163801B/zh active Active
- 2019-12-31 WO PCT/CN2019/130536 patent/WO2020233129A1/zh active Application Filing
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106855996A (zh) * | 2016-12-13 | 2017-06-16 | 中山大学 | 一种基于卷积神经网络的灰阶图像着色方法及其装置 |
CN107657586A (zh) * | 2017-10-13 | 2018-02-02 | 深圳市唯特视科技有限公司 | 一种基于深度残差网络的单照片超分辨增强方法 |
CN107833183A (zh) * | 2017-11-29 | 2018-03-23 | 安徽工业大学 | 一种基于多任务深度神经网络的卫星图像同时超分辨和着色的方法 |
CN108830912A (zh) * | 2018-05-04 | 2018-11-16 | 北京航空航天大学 | 一种深度特征对抗式学习的交互式灰度图像着色方法 |
CN109118491A (zh) * | 2018-07-30 | 2019-01-01 | 深圳先进技术研究院 | 一种基于深度学习的图像分割方法、系统及电子设备 |
CN110163801A (zh) * | 2019-05-17 | 2019-08-23 | 深圳先进技术研究院 | 一种图像超分辨和着色方法、系统及电子设备 |
Non-Patent Citations (1)
Title |
---|
LIU, SHUAICHENG: "Colorization for Single Image Super Resolution", ECCV 2010, 31 December 2010 (2010-12-31), XP055755147 *
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210209732A1 (en) * | 2020-06-17 | 2021-07-08 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Face super-resolution realization method and apparatus, electronic device and storage medium |
US11710215B2 (en) * | 2020-06-17 | 2023-07-25 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Face super-resolution realization method and apparatus, electronic device and storage medium |
CN112967293A (zh) * | 2021-03-04 | 2021-06-15 | 首都师范大学 | 一种图像语义分割方法、装置及存储介质 |
CN113255675A (zh) * | 2021-04-13 | 2021-08-13 | 西安邮电大学 | 基于扩张卷积和残差路径的图像语义分割网络结构及方法 |
CN113255675B (zh) * | 2021-04-13 | 2023-10-10 | 西安邮电大学 | 基于扩张卷积和残差路径的图像语义分割网络结构及方法 |
CN113505792A (zh) * | 2021-06-30 | 2021-10-15 | 中国海洋大学 | 面向非均衡遥感图像的多尺度语义分割方法及模型 |
CN113505792B (zh) * | 2021-06-30 | 2023-10-27 | 中国海洋大学 | 面向非均衡遥感图像的多尺度语义分割方法及模型 |
Also Published As
Publication number | Publication date |
---|---|
CN110163801A (zh) | 2019-08-23 |
CN110163801B (zh) | 2021-07-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2020233129A1 (zh) | 一种图像超分辨和着色方法、系统及电子设备 | |
Li et al. | MDCN: Multi-scale dense cross network for image super-resolution | |
WO2023092813A1 (zh) | 一种基于通道注意力的Swin-Transformer图像去噪方法及系统 | |
CN111275618B (zh) | 一种基于双支感知的深度图超分辨率重建网络构建方法 | |
Jiang et al. | A progressively enhanced network for video satellite imagery superresolution | |
CN102142137B (zh) | 基于高分辨率字典的稀疏表征图像超分辨重建方法 | |
CN111861961A (zh) | 单幅图像超分辨率的多尺度残差融合模型及其复原方法 | |
CN110610526B (zh) | 一种基于wnet对单目人像进行分割和景深渲染的方法 | |
Chen et al. | Single image super-resolution using deep CNN with dense skip connections and inception-resnet | |
CN109359527B (zh) | 基于神经网络的头发区域提取方法及系统 | |
CN109934771B (zh) | 基于循环神经网络的无监督遥感图像超分辨率重建方法 | |
CN112070664A (zh) | 一种图像处理方法以及装置 | |
CN113222819B (zh) | 一种基于深度卷积神经网络的遥感图像超分辨重建方法 | |
CN113837946B (zh) | 一种基于递进蒸馏网络的轻量化图像超分辨率重建方法 | |
CN112991231B (zh) | 单图像超分与感知图像增强联合任务学习系统 | |
CN112489050A (zh) | 一种基于特征迁移的半监督实例分割算法 | |
CN114581347B (zh) | 无参考影像的光学遥感空谱融合方法、装置、设备及介质 | |
CN110930306A (zh) | 一种基于非局部感知的深度图超分辨率重建网络构建方法 | |
CN111414988B (zh) | 基于多尺度特征自适应融合网络的遥感影像超分辨率方法 | |
CN114841859A (zh) | 基于轻量神经网络和Transformer的单图像超分辨率重建方法 | |
CN116934592A (zh) | 一种基于深度学习的图像拼接方法、系统、设备及介质 | |
CN116958534A (zh) | 一种图像处理方法、图像处理模型的训练方法和相关装置 | |
CN116596822A (zh) | 基于自适应权重与目标感知的像素级实时多光谱图像融合方法 | |
CN116524121A (zh) | 一种单目视频三维人体重建方法、系统、设备及介质 | |
Hua et al. | Dynamic scene deblurring with continuous cross-layer attention transmission |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 19930053 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 19930053 Country of ref document: EP Kind code of ref document: A1 |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 09.06.2022) |
|