CN114170079A - Depth map super-resolution method based on attention guide mechanism - Google Patents
- Publication number
- CN114170079A (application number CN202111400288.8A)
- Authority
- CN
- China
- Prior art keywords
- layer
- convolutional layer
- module
- resolution
- branch
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS; G06—COMPUTING; G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL; G06T3/40—Scaling of whole images or parts thereof; G06T3/4053—Scaling based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
- G—PHYSICS; G06—COMPUTING; G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS; G06N3/04—Architecture, e.g. interconnection topology; G06N3/045—Combinations of networks
- G—PHYSICS; G06—COMPUTING; G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS; G06N3/02—Neural networks; G06N3/08—Learning methods
- G—PHYSICS; G06—COMPUTING; G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL; G06T3/40—Scaling of whole images or parts thereof; G06T3/4007—Scaling based on interpolation, e.g. bilinear interpolation
Abstract
The invention relates to a depth map super-resolution method based on a guided attention mechanism, comprising the following steps: analyzing the characteristics of the low-resolution depth map and establishing a degradation model for the captured image; building a data set; designing the network framework: the overall network consists of a lightweight network, HDSRnet-light, and an optional edge refinement network, ERnet; the low-resolution depth map, a depth map upsampled by bicubic interpolation, and the high-resolution color map are input into HDSRnet-light to obtain a high-resolution depth map. The high-resolution depth map and the high-resolution color map are then input into ERnet for further reconstruction, finally yielding a refined high-resolution output; the complete network is HDSRnet. The learning rate of the network and the weight of each loss term are set, and the convolutional neural network is trained with the deep learning framework PyTorch until the loss converges, producing a trained model.
Description
Technical Field
The invention belongs to the field of image restoration, and relates to a super-resolution method of a depth map.
Background
A depth map, also called a range image, is an image in which the distance from the camera to each point in the scene is recorded as the pixel value. Currently, depth maps are mainly acquired using structured-light illumination and ToF (time-of-flight) technology. However, owing to the physical limitations of current devices, the resolution of depth maps captured by depth cameras is low, making them difficult to apply in scenes with high fineness requirements.
Current depth cameras are typically configured with one depth lens paired with one color lens, and the corresponding captured data are a low-resolution depth map and a paired high-resolution color map. To obtain a high-resolution depth map, the high-resolution color image is typically used to guide the super-resolution of the low-resolution depth image.
Depth map super-resolution methods based on this guidance idea mainly fall into two categories:
(1) Conventional methods, for example: super-resolution schemes based on image filters; methods that introduce prior information and convert the super-resolution problem into an optimization problem using maximum a posteriori probability theory; and methods using dictionary learning, among others.
(2) Methods that use a convolutional neural network (CNN) to directly learn the mapping from low-resolution depth images to the corresponding high-resolution depth images. Such methods require a large number of pairs of low-resolution and high-resolution depth images as the training data set. However, in existing research, most methods simply concatenate the high-resolution color image features with the depth image features, which leads to low information utilization efficiency. Moreover, most current networks improve performance by stacking parameters, which results in large models, low operating efficiency, and difficulty in deployment on practical devices. Therefore, adopting a more efficient feature-fusion scheme to improve network performance, and thereby increasing the practical value of such methods, has become the development trend.
Disclosure of Invention
In order to overcome the shortcomings of the prior art, the invention aims to provide a depth map super-resolution method with high operating efficiency and excellent performance by means of a guided attention mechanism. The technical scheme is as follows:
A depth map super-resolution method based on a guided attention mechanism comprises the following steps:
1) analyzing the characteristics of the low-resolution depth map and establishing a degradation model for the captured image:
D_lr = H · D_hr + n_0
where D_lr is the low-resolution image captured by the depth camera, D_hr is the high-resolution depth image, H is the downsampling matrix, and n_0 is noise.
2) Building data sets
The high-resolution data of the training set consists of random crops of a number of RGB-D images from the MPI Sintel depth dataset and a number of RGB-D images from the Middlebury dataset; each pair of RGB-D data is cropped to a fixed size, and data diversity is increased by introducing random rotations with angle θ ∈ {0°, 90°, 180°, 270°}. The low-resolution data is obtained from the high-resolution depth images by bicubic interpolation downsampling.
3) Designing a network framework
31) The overall network consists of a lightweight network, HDSRnet-light, and an optional edge refinement network, ERnet. The low-resolution depth map, a depth map upsampled by bicubic interpolation, and the high-resolution color map are input into HDSRnet-light to obtain a high-resolution depth map. The high-resolution depth map and the high-resolution color map are then input into ERnet for further reconstruction, finally yielding a refined high-resolution output; the complete network is HDSRnet;
32) the lightweight network HDSRnet-light consists of 3 branches: a main branch, a structure side branch, and a detail side branch. The two side branches are composed of up-sampling and down-sampling modules; the main branch is composed of up-sampling modules, down-sampling modules, guided attention modules, and attention modules. The low-resolution depth map is input into the main branch, the bicubic-upsampled depth map into the structure side branch, and the high-resolution color map into the detail side branch.
321) Down-sampling module structure: convolutional layer 1 - pooling layer 1 - convolutional layer 2. Every convolutional layer in the module is followed by a ReLU activation.
322) Up-sampling module structure: deconvolution layer 1 - convolutional layer 1. Every convolutional and deconvolution layer in the module is followed by a ReLU activation.
323) Guided attention module structure: convolutional layer 1 - convolutional layer 2 - multiplication layer - convolutional layer 3. The main-branch features enter convolutional layer 1 and the side-branch features enter convolutional layer 2; the outputs of convolutional layers 1 and 2 are element-wise multiplied in the multiplication layer; the output of convolutional layer 2 also enters convolutional layer 3, whose output is added to the output of the multiplication layer. All convolutional layers are followed by ReLU activations.
324) Attention module structure: guided attention module 1 - guided attention module 2, where the main-branch features and side-branch features enter guided attention module 1, and the output of guided attention module 1 and side-branch features enter guided attention module 2.
325) Main branch structure: convolutional layer 1 - fusion layer 1 - guided attention module 1 - up-sampling module 1 - attention module 1 - up-sampling module 2 - attention module 2 - convolutional layer 2. Fusion layer 1 concatenates the outputs of main-branch convolutional layer 1 and structure-side-branch down-sampling module 2 in the channel dimension; guided attention module 1 fuses the outputs of detail-side-branch down-sampling module 2 and main-branch fusion layer 1; attention module 1 fuses the outputs of structure-side-branch down-sampling module 1, detail-side-branch down-sampling module 1, and main-branch up-sampling module 1; attention module 2 fuses the outputs of structure-side-branch convolutional layer 1, detail-side-branch convolutional layer 1, and main-branch up-sampling module 2. The final output is the sum of the output of main-branch convolutional layer 2 and the bicubic-interpolation-upsampled input;
326) Detail side branch structure: convolutional layer 1 - down-sampling module 1 - down-sampling module 2. This branch is responsible for processing the information of the high-resolution color image; the outputs of down-sampling modules 1 and 2 are sent to the attention modules of the main branch for feature fusion. Convolutional layer 1 is followed by a ReLU activation;
327) Structure side branch structure: convolutional layer 1 - down-sampling module 1 - down-sampling module 2. This branch is responsible for processing the bicubic-upsampled depth map information; the outputs of down-sampling modules 1 and 2 are sent to the attention modules of the main branch for feature fusion. Convolutional layer 1 is followed by a ReLU activation.
33) The edge refinement network ERnet consists of two branches: a depth main branch and a detail side branch. The output of HDSRnet-light is input to the depth main branch, and the high-resolution color image is input to the detail side branch. Both branches are composed of convolutional layers and achieve fine reconstruction of edges.
331) Detail side branch structure: convolutional layer C1 - C2 - C3 - C4 - C5 - C6 - C7 - C8 - C9. This branch is responsible for processing the high-resolution color image information; all convolutional layers are connected in series, each followed by a ReLU activation.
332) Depth main branch structure: convolutional layer D1 - fusion layer F1 - D2 - F2 - D3 - F3 - D4 - F4 - D5 - F5 - D6 - F6 - D7 - F7 - D8 - F8 - D9 - F9 - D10. This branch is responsible for processing the depth map information and fusing in the color-branch information; all convolutional layers are connected in series, and fusion layer Fi concatenates, in the channel dimension, the output of depth-branch convolutional layer Di with the output of color-branch convolutional layer Ci.
4) Setting the learning rate of the network and the weight of each loss term, and training the convolutional neural network with the deep learning framework PyTorch until the loss converges, producing a trained model;
41) after determining the network structure, inputting training data into the network;
42) in the first stage of network training, the initial learning rate of HDSRnet-light is set to 0.0001 and decays to 0.1 times its value after every epoch; HDSRnet-light is trained with a norm-based loss function;
43) in the second stage of network training, after HDSRnet-light has been trained for 30 epochs, ERnet is appended to HDSRnet-light and the combined network is trained with a norm-based loss function;
44) after training, HDSRnet-light provides the mapping from the low-resolution depth map to the high-resolution depth map, and ERnet provides the mapping from the high-resolution depth map to the refined high-resolution depth map.
For depth map super-resolution, this deep-learning-based method generates low-resolution data from public datasets and trains the convolutional neural network HDSRnet. The HDSRnet-light part achieves efficient depth map super-resolution, reaching superior performance with a very small parameter count. The ERnet part achieves finer edge reconstruction. The invention has the following characteristics:
1. An efficient lightweight depth map super-resolution network, HDSRnet-light, is proposed, in which the attention modules efficiently fuse the features of the high-resolution color image and the low-resolution depth image. The network can perform 8× super-resolution on a 135×240 depth map at 360 frames per second, exceeding related methods in both performance and speed.
2. An edge refinement network, ERnet, is proposed, which performs finer restoration on the basis of HDSRnet-light, correcting edge deviations and achieving better performance.
Drawings
FIG. 1 is an algorithm flow diagram;
FIG. 2 is a network framework diagram;
FIG. 3 is an attention module frame diagram;
FIG. 4 is a comparison of various methods: (a) the original high-resolution depth map; (b) the result after bicubic interpolation upsampling; (c) the result of the deep learning method DepthSR; (d) the result of the deep learning method PMBANet; (e) the result of the deep learning method DEIN; (f) the super-resolution result of the present method.
Detailed Description
In order to overcome the shortcomings of the prior art, the invention aims to provide a depth map super-resolution method with high operating efficiency and excellent performance by means of a guided attention mechanism:
1) Analyze the characteristics of the low-resolution depth map and establish a degradation model for the captured image:
D_lr = H · D_hr + n_0
where D_lr is the low-resolution image captured by the depth camera, D_hr is the high-resolution depth image, H is the downsampling matrix, and n_0 is noise.
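The degradation model above can be sketched in PyTorch as follows. Bicubic downsampling for H and additive Gaussian noise for n_0 are illustrative assumptions; the patent does not fix either operator.

```python
import torch
import torch.nn.functional as F

def degrade(d_hr: torch.Tensor, scale: int = 8, noise_std: float = 0.01) -> torch.Tensor:
    """D_lr = H * D_hr + n_0, with H taken as bicubic downsampling (an assumption).

    d_hr: high-resolution depth map of shape (N, 1, H, W).
    """
    d_lr = F.interpolate(d_hr, scale_factor=1.0 / scale,
                         mode="bicubic", align_corners=False)
    return d_lr + noise_std * torch.randn_like(d_lr)  # add the noise term n_0

d_hr = torch.rand(1, 1, 256, 256)
d_lr = degrade(d_hr)  # low-resolution observation
```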
2) Building data sets
The high-resolution data of the training set consists of random crops of 58 RGB-D images from the MPI Sintel depth dataset and 34 RGB-D images from the Middlebury dataset; each pair of RGB-D data is cropped to a fixed size, and data diversity is increased by introducing random rotations with angle θ ∈ {0°, 90°, 180°, 270°}. The low-resolution data is obtained from the high-resolution depth maps by bicubic interpolation downsampling.
The test set was generated in the same way using six images in the Middlebury 2005 dataset.
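The crop-and-rotate augmentation described in step 2) can be sketched as below; the crop size of 128 is a placeholder, since the exact patch size is not legible in this text.

```python
import random
import torch

def augment(rgb: torch.Tensor, depth: torch.Tensor, crop: int = 128):
    """Random crop plus a random rotation theta in {0, 90, 180, 270} degrees.

    rgb: (3, H, W) color image; depth: (1, H, W) depth map.
    The crop size is a placeholder value, not taken from the patent.
    """
    _, h, w = depth.shape
    top = random.randint(0, h - crop)
    left = random.randint(0, w - crop)
    rgb = rgb[:, top:top + crop, left:left + crop]
    depth = depth[:, top:top + crop, left:left + crop]
    k = random.randint(0, 3)  # number of 90-degree rotations
    return torch.rot90(rgb, k, dims=(1, 2)), torch.rot90(depth, k, dims=(1, 2))

rgb, depth = augment(torch.rand(3, 200, 220), torch.rand(1, 200, 220))
```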
3) Designing a network framework
31) The overall network consists of a lightweight network, HDSRnet-light, and an optional edge refinement network, ERnet. The low-resolution depth map, a depth map upsampled by bicubic interpolation, and the high-resolution color map are input into HDSRnet-light to obtain a high-resolution depth map. The high-resolution depth map and the high-resolution color map are then input into ERnet for further reconstruction, finally yielding a refined high-resolution output.
The whole network is HDSRnet;
32) The lightweight network HDSRnet-light consists of 3 branches: a main branch, a structure side branch, and a detail side branch. The two side branches are composed of up-sampling and down-sampling modules; the main branch is composed of up-sampling modules, down-sampling modules, guided attention modules, and attention modules. The low-resolution depth map is input into the main branch, the bicubic-upsampled depth map into the structure side branch, and the high-resolution color map into the detail side branch.
321) Down-sampling module structure: convolutional layer 1 - pooling layer 1 - convolutional layer 2. Every convolutional layer in the module is followed by a ReLU activation.
322) Up-sampling module structure: deconvolution layer 1 - convolutional layer 1. Every convolutional and deconvolution layer in the module is followed by a ReLU activation.
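The two modules of steps 321) and 322) can be sketched as below. The channel width, kernel sizes, and the choice of average pooling are assumptions not fixed by the text.

```python
import torch
import torch.nn as nn

class DownBlock(nn.Module):
    """Conv1 - Pool1 - Conv2, with a ReLU after every conv (step 321)."""
    def __init__(self, ch: int = 32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.AvgPool2d(2),  # pooling type is an assumption
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.body(x)

class UpBlock(nn.Module):
    """Deconv1 - Conv1, with a ReLU after each layer (step 322)."""
    def __init__(self, ch: int = 32):
        super().__init__()
        self.body = nn.Sequential(
            nn.ConvTranspose2d(ch, ch, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.body(x)
```

With these settings the down block halves the spatial resolution and the up block doubles it, so the two are mirror operations along a branch.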
323) Guided attention module structure: convolutional layer 1 - convolutional layer 2 - multiplication layer - convolutional layer 3. The main-branch features enter convolutional layer 1 and the side-branch features enter convolutional layer 2; the outputs of convolutional layers 1 and 2 are element-wise multiplied in the multiplication layer; the output of convolutional layer 2 also enters convolutional layer 3, whose output is added to the output of the multiplication layer. All convolutional layers are followed by ReLU activations.
324) Attention module structure: guided attention module 1 - guided attention module 2. The main-branch features and side-branch features enter guided attention module 1, and the output of guided attention module 1 together with side-branch features enters guided attention module 2.
325) Main branch structure: convolutional layer 1 - fusion layer 1 - guided attention module 1 - up-sampling module 1 - attention module 1 - up-sampling module 2 - attention module 2 - convolutional layer 2. Fusion layer 1 concatenates the outputs of main-branch convolutional layer 1 and structure-side-branch down-sampling module 2 in the channel dimension; guided attention module 1 fuses the outputs of detail-side-branch down-sampling module 2 and main-branch fusion layer 1; attention module 1 fuses the outputs of structure-side-branch down-sampling module 1, detail-side-branch down-sampling module 1, and main-branch up-sampling module 1; attention module 2 fuses the outputs of structure-side-branch convolutional layer 1, detail-side-branch convolutional layer 1, and main-branch up-sampling module 2. The final output is the sum of the output of main-branch convolutional layer 2 and the bicubic-interpolation-upsampled input;
326) Detail side branch structure: convolutional layer 1 - down-sampling module 1 - down-sampling module 2. This branch is responsible for processing the information of the high-resolution color image; the outputs of down-sampling modules 1 and 2 are sent to the attention modules of the main branch for feature fusion. Convolutional layer 1 is followed by a ReLU activation;
327) Structure side branch structure: convolutional layer 1 - down-sampling module 1 - down-sampling module 2. This branch is responsible for processing the bicubic-upsampled depth map information; the outputs of down-sampling modules 1 and 2 are sent to the attention modules of the main branch for feature fusion. Convolutional layer 1 is followed by a ReLU activation.
33) The edge refinement network ERnet consists of two branches: a depth main branch and a detail side branch. The output of HDSRnet-light is input to the depth main branch, and the high-resolution color image is input to the detail side branch. Both branches are composed of convolutional layers and achieve fine reconstruction of edges.
331) Detail side branch structure: convolutional layer C1 - C2 - C3 - C4 - C5 - C6 - C7 - C8 - C9. This branch is responsible for processing the high-resolution color image information; all convolutional layers are connected in series, each followed by a ReLU activation.
332) Depth main branch structure: convolutional layer D1 - fusion layer F1 - D2 - F2 - D3 - F3 - D4 - F4 - D5 - F5 - D6 - F6 - D7 - F7 - D8 - F8 - D9 - F9 - D10. This branch is responsible for processing the depth map information and fusing in the color-branch information; all convolutional layers are connected in series, and fusion layer Fi concatenates, in the channel dimension, the output of depth-branch convolutional layer Di with the output of color-branch convolutional layer Ci.
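The ERnet wiring of steps 331) and 332) can be sketched as follows. Channel width and kernel size are assumptions; the essential point is that fusion layer F_i concatenates the output of D_i with the output of C_i along the channel dimension before D_{i+1}.

```python
import torch
import torch.nn as nn

class ERnet(nn.Module):
    """Sketch of the edge refinement network: a 9-conv color branch C1..C9 and a
    depth branch D1..D10 whose fusion layer F_i concatenates D_i and C_i outputs
    in the channel dimension. Widths/kernels are assumptions, not from the patent."""
    def __init__(self, ch: int = 32):
        super().__init__()
        self.color = nn.ModuleList(
            [nn.Conv2d(3, ch, 3, padding=1)] +
            [nn.Conv2d(ch, ch, 3, padding=1) for _ in range(8)])
        # D1 takes the 1-channel depth map; D2..D9 take the concatenated features;
        # D10 maps back to a single depth channel.
        self.depth = nn.ModuleList(
            [nn.Conv2d(1, ch, 3, padding=1)] +
            [nn.Conv2d(2 * ch, ch, 3, padding=1) for _ in range(8)] +
            [nn.Conv2d(2 * ch, 1, 3, padding=1)])
        self.relu = nn.ReLU(inplace=True)

    def forward(self, depth, rgb):
        c, d = rgb, depth
        for i in range(9):
            c = self.relu(self.color[i](c))  # C_{i+1}
            d = self.relu(self.depth[i](d))  # D_{i+1}
            d = torch.cat([d, c], dim=1)     # fusion layer F_{i+1}
        return self.depth[9](d)              # D10: refined depth map
```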
4) Setting the learning rate of the network and the weight of each loss term, and training the convolutional neural network with the deep learning framework PyTorch until the loss converges, producing a trained model;
41) after determining the network structure, inputting training data into the network;
42) in the first stage of network training, the initial learning rate of HDSRnet-light is set to 0.0001 and decays to 0.1 times its value after every epoch; HDSRnet-light is trained with a norm-based loss function;
43) in the second stage of network training, after HDSRnet-light has been trained for 30 epochs, ERnet is appended to HDSRnet-light and the combined network is trained with a norm-based loss function;
44) after training, HDSRnet-light provides the mapping from the low-resolution depth map to the high-resolution depth map, and ERnet provides the mapping from the high-resolution depth map to the refined high-resolution depth map.
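The stage-one schedule of steps 41)-43) can be sketched as below: initial learning rate 1e-4, decayed to 0.1× after every epoch. A single convolution stands in for HDSRnet-light, the optimizer choice is an assumption, and the L1 loss is only a placeholder since the specific norm is not legible in this text.

```python
import torch
import torch.nn as nn

model = nn.Conv2d(1, 1, 3, padding=1)  # stand-in for HDSRnet-light
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
# decay the learning rate to 0.1x its value after every epoch
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.1)
criterion = nn.L1Loss()  # placeholder for the norm-based loss

for epoch in range(3):  # the patent trains for 30 epochs in stage one
    lr_in = torch.rand(4, 1, 32, 32)   # dummy low-resolution batch
    target = torch.rand(4, 1, 32, 32)  # dummy high-resolution target
    optimizer.zero_grad()
    loss = criterion(model(lr_in), target)
    loss.backward()
    optimizer.step()
    scheduler.step()  # learning rate -> 0.1 * learning rate
```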
5) Input the RGB-D pairs of the test set into the network to finally obtain the corresponding refined high-resolution depth images.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.
Claims (1)
1. A depth map super-resolution method based on a guided attention mechanism, comprising the following steps:
1) analyzing the characteristics of the low-resolution depth map and establishing a degradation model for the captured image:
D_lr = H · D_hr + n_0
where D_lr is the low-resolution image captured by the depth camera, D_hr is the high-resolution depth image, H is the downsampling matrix, and n_0 is noise;
2) building data sets
the high-resolution data of the training set consists of random crops of a number of RGB-D images from the MPI Sintel depth dataset and a number of RGB-D images from the Middlebury dataset; each pair of RGB-D data is cropped to a fixed size, and data diversity is increased by introducing random rotations with angle θ ∈ {0°, 90°, 180°, 270°}; the low-resolution data is obtained from the high-resolution depth images by bicubic interpolation downsampling;
3) designing a network framework
31) the overall network consists of a lightweight network, HDSRnet-light, and an optional edge refinement network, ERnet; the low-resolution depth map, a depth map upsampled by bicubic interpolation, and the high-resolution color map are input into HDSRnet-light to obtain a high-resolution depth map; the high-resolution depth map and the high-resolution color map are then input into ERnet for further reconstruction, finally yielding a refined high-resolution output, the complete network being HDSRnet;
32) the lightweight network HDSRnet-light consists of 3 branches: a main branch, a structure side branch, and a detail side branch; the two side branches are composed of up-sampling and down-sampling modules, and the main branch is composed of up-sampling modules, down-sampling modules, guided attention modules, and attention modules; the low-resolution depth map is input into the main branch, the bicubic-upsampled depth map into the structure side branch, and the high-resolution color map into the detail side branch;
321) down-sampling module structure: convolutional layer 1 - pooling layer 1 - convolutional layer 2, wherein every convolutional layer in the module is followed by a ReLU activation;
322) up-sampling module structure: deconvolution layer 1 - convolutional layer 1, wherein every convolutional and deconvolution layer in the module is followed by a ReLU activation;
323) guided attention module structure: convolutional layer 1 - convolutional layer 2 - multiplication layer - convolutional layer 3, wherein the main-branch features enter convolutional layer 1, the side-branch features enter convolutional layer 2, the outputs of convolutional layers 1 and 2 are element-wise multiplied in the multiplication layer, the output of convolutional layer 2 also enters convolutional layer 3, the output of convolutional layer 3 is added to the output of the multiplication layer, and all convolutional layers are followed by ReLU activations;
324) attention module structure: guided attention module 1 - guided attention module 2, wherein the main-branch features and side-branch features enter guided attention module 1, and the output of guided attention module 1 and side-branch features enter guided attention module 2;
325) main branch structure: convolutional layer 1 - fusion layer 1 - guided attention module 1 - up-sampling module 1 - attention module 1 - up-sampling module 2 - attention module 2 - convolutional layer 2, wherein fusion layer 1 concatenates the outputs of main-branch convolutional layer 1 and structure-side-branch down-sampling module 2 in the channel dimension, guided attention module 1 fuses the outputs of detail-side-branch down-sampling module 2 and main-branch fusion layer 1, attention module 1 fuses the outputs of structure-side-branch down-sampling module 1, detail-side-branch down-sampling module 1, and main-branch up-sampling module 1, and attention module 2 fuses the outputs of structure-side-branch convolutional layer 1, detail-side-branch convolutional layer 1, and main-branch up-sampling module 2; the final output is the sum of the output of main-branch convolutional layer 2 and the bicubic-interpolation-upsampled input;
326) detail side branch structure: convolutional layer 1 - down-sampling module 1 - down-sampling module 2, wherein the branch is responsible for processing the information of the high-resolution color image, and the outputs of down-sampling modules 1 and 2 are sent to the attention modules of the main branch for feature fusion; convolutional layer 1 is followed by a ReLU activation;
327) structure side branch structure: convolutional layer 1 - down-sampling module 1 - down-sampling module 2, wherein the branch is responsible for processing the bicubic-upsampled depth map information, and the outputs of down-sampling modules 1 and 2 are sent to the attention modules of the main branch for feature fusion; convolutional layer 1 is followed by a ReLU activation;
33) the edge refinement network ERnet consists of two branches: a depth main branch and a detail side branch; the output of HDSRnet-light is input to the depth main branch, and the high-resolution color image is input to the detail side branch; both branches are composed of convolutional layers and achieve fine reconstruction of edges;
331) detail side branch structure: convolutional layer C1 - convolutional layer C2 - convolutional layer C3 - convolutional layer C4 - convolutional layer C5 - convolutional layer C6 - convolutional layer C7 - convolutional layer C8 - convolutional layer C9. This branch is responsible for processing the high-resolution color image information; all convolutional layers are connected in series, each followed by a ReLU activation function;
332) depth main branch structure: convolutional layer D1 - fusion layer F1 - convolutional layer D2 - fusion layer F2 - convolutional layer D3 - fusion layer F3 - convolutional layer D4 - fusion layer F4 - convolutional layer D5 - fusion layer F5 - convolutional layer D6 - fusion layer F6 - convolutional layer D7 - fusion layer F7 - convolutional layer D8 - fusion layer F8 - convolutional layer D9 - fusion layer F9 - convolutional layer D10. This branch is responsible for processing the depth map information and fusing the color-branch information; all convolutional layers are connected in series, and each fusion layer Fi merges, in the channel dimension, the output of depth-branch convolutional layer Di and the output of color-branch convolutional layer Ci;
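The interleaving of depth-branch layers Di with fusion layers Fi can be sketched as below. Only the concatenate-then-convolve pattern comes from the description; the channel counts, 1x1 kernels and uniform weights are illustrative assumptions:

```python
import numpy as np

def conv(feat, out_ch):
    # Stand-in for a convolutional layer: a 1x1 projection to out_ch channels
    # (kernel sizes are not specified here, so this is only a shape sketch).
    c, h, w = feat.shape
    weight = np.ones((out_ch, c)) / c
    return np.tensordot(weight, feat, axes=([1], [0]))

depth = np.ones((1, 32, 32))   # output of HDSRnet-light
color = np.ones((1, 32, 32))   # high-resolution color guidance

d = conv(depth, 16)            # D1
for _ in range(8):             # D2..D9, each preceded by fusion F1..F8
    c_feat = conv(color, 16)   # corresponding color-branch layer Ci
    fused = np.concatenate([d, c_feat], axis=0)  # fusion layer Fi (channel dim)
    d = conv(fused, 16)
# F9 followed by D10, which maps back to a single-channel depth map
out = conv(np.concatenate([d, conv(color, 16)], axis=0), 1)
print(out.shape)  # (1, 32, 32)
```

Each fusion doubles the channel count, so every Di after a fusion also acts as a channel-reduction step before the next fusion.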
4) setting the learning rate of the network and the weight of each loss function, and training the convolutional neural network with the deep learning framework PyTorch until the loss converges, generating a trained model;
41) after determining the network structure, inputting training data into the network;
42) in the first stage of network training, the initial learning rate of the HDSRnet-light network is set to 0.0001, and the learning rate is reduced to 0.1 times its value after each epoch; HDSRnet-light is trained with a norm-based loss function;
43) in the second stage of network training, after HDSRnet-light has been trained for 30 epochs, ERnet is cascaded after HDSRnet-light and the combined network is trained with a norm-based loss function;
44) training is carried out: HDSRnet-light learns the mapping from the low-resolution depth map to the high-resolution depth map, and ERnet learns the mapping from the high-resolution depth map to the refined high-resolution depth map.
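The staged schedule in steps 42)-44) can be sketched as follows. The per-epoch decay factor of 0.1 is reconstructed from garbled source text, so treat it as an assumption rather than the patent's exact schedule; the 30-epoch first stage comes from step 43):

```python
# Two-stage training schedule sketch for HDSRnet-light followed by ERnet.
STAGE1_EPOCHS = 30  # HDSRnet-light is trained alone for 30 epochs (step 43)

def lr_at_epoch(epoch, base_lr=1e-4, gamma=0.1):
    """Learning rate after `epoch` decay steps, starting from base_lr."""
    return base_lr * gamma ** epoch

for e in range(3):
    stage = 1 if e < STAGE1_EPOCHS else 2
    print(f"epoch {e}: stage {stage}, lr = {lr_at_epoch(e):.1e}")
    # epoch 0: stage 1, lr = 1.0e-04
```

With so aggressive a decay the learning rate is effectively zero within a few epochs, which is consistent with the short 30-epoch first stage before ERnet is attached.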
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111400288.8A CN114170079A (en) | 2021-11-19 | 2021-11-19 | Depth map super-resolution method based on attention guide mechanism |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114170079A true CN114170079A (en) | 2022-03-11 |
Family
ID=80480419
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111400288.8A Pending CN114170079A (en) | 2021-11-19 | 2021-11-19 | Depth map super-resolution method based on attention guide mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114170079A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114998683A (en) * | 2022-06-01 | 2022-09-02 | 北京理工大学 | Attention mechanism-based ToF multipath interference removing method |
CN114998683B (en) * | 2022-06-01 | 2024-05-31 | 北京理工大学 | Attention mechanism-based ToF multipath interference removal method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||