CN114170079A - Depth map super-resolution method based on attention guide mechanism - Google Patents

Depth map super-resolution method based on attention guide mechanism

Info

Publication number
CN114170079A
CN114170079A
Authority
CN
China
Prior art keywords
layer
convolutional layer
module
resolution
branch
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111400288.8A
Other languages
Chinese (zh)
Inventor
杨敬钰 (Yang Jingyu)
陈昶佚 (Chen Changyi)
岳焕景 (Yue Huanjing)
李坤 (Li Kun)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University filed Critical Tianjin University
Priority to CN202111400288.8A
Publication of CN114170079A
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 - Geometric image transformations in the plane of the image
    • G06T3/40 - Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4053 - Scaling based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G06T3/4007 - Scaling based on interpolation, e.g. bilinear interpolation
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G06N3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to a depth map super-resolution method based on a guided attention mechanism, comprising the following steps: analyzing the characteristics of the low-resolution depth map and establishing the degradation model of the captured image; building a data set; designing a network framework, in which the whole network consists of a lightweight network HDSRnet-light and an optional edge-refinement network ERnet: the low-resolution depth map, the bicubic-interpolation-upsampled depth map, and the high-resolution color map are input into HDSRnet-light to obtain a high-resolution depth map, and the high-resolution depth map together with the high-resolution color map is input into ERnet for further reconstruction, finally yielding a fine high-resolution output, the whole network being HDSRnet; setting the network learning rate and the weight of each loss-function term, and training the convolutional neural network with the deep-learning framework PyTorch until the loss converges, producing a trained model.

Description

Depth map super-resolution method based on attention guide mechanism
Technical Field
The invention belongs to the field of image restoration and relates to a super-resolution method for depth maps.
Background
A depth map, also called a range image, is an image whose pixel values encode the distance from the camera to each point in the scene. Depth maps are currently acquired mainly with structured-light illumination and ToF (Time of Flight) technology. However, owing to the physical limitations of current devices, the depth maps captured by depth cameras have low resolution and are difficult to apply in scenarios that demand fine detail.
Current depth cameras typically pair one depth camera lens with one color camera lens, so the collected data are a low-resolution depth map and a paired high-resolution color map. To obtain a high-resolution depth map, the high-resolution color image is typically used to guide the super-resolution of the low-resolution depth image.
Depth map super-resolution methods based on this guidance idea fall mainly into two categories:
(1) Traditional methods, for example: super-resolution schemes based on image filtering; methods that introduce prior information and cast the super-resolution problem as an optimization problem using maximum a posteriori probability theory; and methods based on dictionary learning.
(2) Methods that use a convolutional neural network (CNN) to directly learn the mapping from low-resolution depth images to the corresponding high-resolution depth images. Such methods require a large number of pairs of low-resolution and high-resolution depth images as the network's training data set. However, most existing methods simply concatenate the high-resolution color features with the depth features, which uses the information inefficiently. Moreover, most current networks improve performance by stacking parameters, which makes the networks large, slow to run, and difficult to deploy on practical devices. Adopting a more efficient feature-fusion scheme to improve network performance, and thereby the practical value of the method, has therefore become the development trend.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention aims to provide a depth map super-resolution method that achieves high running efficiency and excellent performance through a guided attention mechanism. The technical scheme is as follows:
A depth map super-resolution method based on a guided attention mechanism comprises the following steps:
1) Analyzing the characteristics of the low-resolution depth map and establishing the degradation model of the captured image:
D_lr = H D_hr + n_0
where D_lr is the low-resolution depth image captured by the depth camera, D_hr is the high-resolution depth image, H is the downsampling matrix, and n_0 is noise.
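For illustration, a minimal PyTorch sketch of this degradation model; approximating the downsampling operator H by bicubic resampling and using Gaussian noise with an assumed level sigma (neither choice is fixed by the source):

```python
import torch
import torch.nn.functional as F

def degrade(d_hr: torch.Tensor, scale: int = 8, sigma: float = 0.01) -> torch.Tensor:
    """Simulate D_lr = H D_hr + n_0.

    d_hr: high-resolution depth map of shape (N, 1, H, W), values in [0, 1].
    H is approximated by bicubic downsampling; n_0 is Gaussian noise with an
    assumed standard deviation sigma (not specified in the source).
    """
    d_lr = F.interpolate(d_hr, scale_factor=1.0 / scale,
                         mode="bicubic", align_corners=False)
    return d_lr + sigma * torch.randn_like(d_lr)
```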
2) Building the data set
The high-resolution data of the training set are obtained by randomly cropping a number of RGB-D images from the MPI Sintel depth data set and a number of RGB-D images from the Middlebury data set; each RGB-D pair is cropped to a fixed size, and data diversity is increased by randomly rotating each pair by an angle θ ∈ {0°, 90°, 180°, 270°}. The low-resolution data are obtained from the high-resolution depth maps by bicubic-interpolation downsampling.
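A data-preparation sketch under stated assumptions: the crop size of 256 and the 8x scale factor are illustrative values (the source elides the exact crop size), and rotation is restricted to the four listed angles:

```python
import random
import torch
import torch.nn.functional as F

def make_training_pair(rgb: torch.Tensor, depth: torch.Tensor,
                       crop: int = 256, scale: int = 8):
    """Build one (low-res depth, bicubic-upsampled depth, high-res color) triple.

    rgb: (3, H, W), depth: (1, H, W). The crop size of 256 is an assumption.
    """
    # Random crop, applied identically to the RGB and depth maps.
    _, h, w = depth.shape
    top, left = random.randint(0, h - crop), random.randint(0, w - crop)
    rgb = rgb[:, top:top + crop, left:left + crop]
    depth = depth[:, top:top + crop, left:left + crop]

    # Random rotation by a multiple of 90 degrees for data diversity.
    k = random.choice([0, 1, 2, 3])
    rgb, depth = torch.rot90(rgb, k, dims=(1, 2)), torch.rot90(depth, k, dims=(1, 2))

    # Low-resolution depth via bicubic downsampling, plus its bicubic upsampling.
    d_lr = F.interpolate(depth[None], scale_factor=1.0 / scale,
                         mode="bicubic", align_corners=False)[0]
    d_bic = F.interpolate(d_lr[None], scale_factor=scale,
                          mode="bicubic", align_corners=False)[0]
    return d_lr, d_bic, rgb
```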
3) Designing a network framework
31) The whole network structure consists of a lightweight network HDSRnet-light and an optional edge-refinement network ERnet. The low-resolution depth map, the bicubic-interpolation-upsampled depth map, and the high-resolution color map are input into HDSRnet-light to obtain a high-resolution depth map. The high-resolution depth map and the high-resolution color map are then input into ERnet for further reconstruction, finally yielding a fine high-resolution output; the whole network is HDSRnet;
32) The lightweight network HDSRnet-light consists of 3 branches: a main branch, a structure side branch, and a detail side branch. The two side branches are built from upsampling and downsampling modules, and the main branch is built from upsampling modules, downsampling modules, a guided attention module, and attention modules. The low-resolution depth map is input into the main branch, the bicubic-interpolation-upsampled depth map into the structure side branch, and the high-resolution color map into the detail side branch.
321) Downsampling module structure: convolutional layer 1 - pooling layer 1 - convolutional layer 2. Every convolutional layer in the module is followed by a ReLU activation function.
322) Upsampling module structure: deconvolution layer 1 - convolutional layer 1. Every convolutional layer and deconvolution layer in the module is followed by a ReLU activation function.
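A minimal PyTorch sketch of these two modules; the channel width, kernel sizes, and pooling type (average pooling with stride 2) are assumptions not fixed by the source:

```python
import torch.nn as nn

class DownsampleModule(nn.Module):
    """Convolutional layer 1 - pooling layer 1 - convolutional layer 2,
    each convolution followed by ReLU."""
    def __init__(self, ch: int = 32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.AvgPool2d(2),  # pooling layer (assumed 2x average pooling)
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.body(x)

class UpsampleModule(nn.Module):
    """Deconvolution layer 1 - convolutional layer 1, each followed by ReLU."""
    def __init__(self, ch: int = 32):
        super().__init__()
        self.body = nn.Sequential(
            nn.ConvTranspose2d(ch, ch, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.body(x)
```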
323) Guided attention module structure: convolutional layer 1 - convolutional layer 2 - multiplication layer - convolutional layer 3. The main-branch features enter convolutional layer 1 and the side-branch features enter convolutional layer 2; the outputs of convolutional layers 1 and 2 are multiplied element-wise in the multiplication layer; the output of convolutional layer 2 also enters convolutional layer 3, and the output of convolutional layer 3 is added to the output of the multiplication layer. Every convolutional layer is followed by a ReLU activation function.
324) Attention module structure: guided attention module 1 - guided attention module 2. The main-branch features and one set of side-branch features enter guided attention module 1; the output of guided attention module 1, together with the other set of side-branch features, enters guided attention module 2.
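A sketch of the guided attention module of 323) and the attention module of 324), under the same assumed hyperparameters; only the wiring described above is taken from the source:

```python
import torch.nn as nn

class GuidedAttentionModule(nn.Module):
    """conv1(main) * conv2(side) + conv3(conv2(side)), per the text: the
    conv-1 and conv-2 outputs are multiplied element-wise, and conv-3
    applied to the conv-2 output is added to that product. The shared
    channel width is an assumed hyperparameter."""
    def __init__(self, ch: int = 32):
        super().__init__()
        self.conv1 = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True))
        self.conv2 = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True))
        self.conv3 = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True))

    def forward(self, main, side):
        g = self.conv2(side)  # side-branch guidance features
        return self.conv1(main) * g + self.conv3(g)

class AttentionModule(nn.Module):
    """Two guided attention modules in series: the first fuses the main
    features with one side branch, the second fuses that result with the
    other side branch (the order of the two sides is an assumption)."""
    def __init__(self, ch: int = 32):
        super().__init__()
        self.ga1 = GuidedAttentionModule(ch)
        self.ga2 = GuidedAttentionModule(ch)

    def forward(self, main, side1, side2):
        return self.ga2(self.ga1(main, side1), side2)
```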
325) Main branch structure: convolutional layer 1 - fusion layer 1 - guided attention module 1 - upsampling module 1 - attention module 1 - upsampling module 2 - attention module 2 - convolutional layer 2. Fusion layer 1 concatenates the outputs of main-branch convolutional layer 1 and structure-side-branch downsampling module 2 along the channel dimension; guided attention module 1 fuses the output of detail-side-branch downsampling module 2 with that of main-branch fusion layer 1; attention module 1 fuses the outputs of structure-side-branch downsampling module 1, detail-side-branch downsampling module 1, and main-branch upsampling module 1; attention module 2 fuses the outputs of structure-side-branch convolutional layer 1, detail-side-branch convolutional layer 1, and main-branch upsampling module 2. The final output is the sum of the output of main-branch convolutional layer 2 and the bicubic-interpolation-upsampled input (see the branch-layout sketch after 327);
326) Detail side branch structure: convolutional layer 1 - downsampling module 1 - downsampling module 2. This branch processes the information of the high-resolution color image; the outputs of downsampling module 1 and downsampling module 2 are sent to the attention modules of the main branch for feature fusion. Convolutional layer 1 is followed by a ReLU activation function;
327) Structure side branch structure: convolutional layer 1 - downsampling module 1 - downsampling module 2. This branch processes the bicubic-interpolation-upsampled depth map; the outputs of downsampling module 1 and downsampling module 2 are sent to the attention modules of the main branch for feature fusion. Convolutional layer 1 is followed by a ReLU activation function.
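Putting the pieces together, a sketch of the three-branch HDSRnet-light layout, reusing the module sketches above. It is written for a 4x configuration in which every sampling module changes resolution by 2x; the per-module strides for the 8x case, the channel width, and the 1x1 convolution after the channel concatenation in fusion layer 1 are all assumptions:

```python
import torch
import torch.nn as nn

# Assumes DownsampleModule, UpsampleModule, GuidedAttentionModule and
# AttentionModule from the sketches above are in scope.

class HDSRnetLight(nn.Module):
    """Three-branch layout of 32)-327) for an assumed 4x scale factor."""
    def __init__(self, ch: int = 32):
        super().__init__()
        # Side branches: convolutional layer 1 -> downsampling module 1 -> 2.
        self.s_conv = nn.Sequential(nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(inplace=True))
        self.s_down1, self.s_down2 = DownsampleModule(ch), DownsampleModule(ch)
        self.c_conv = nn.Sequential(nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(inplace=True))
        self.c_down1, self.c_down2 = DownsampleModule(ch), DownsampleModule(ch)
        # Main branch.
        self.m_conv1 = nn.Conv2d(1, ch, 3, padding=1)
        self.fuse1 = nn.Conv2d(2 * ch, ch, 1)  # channel concat + assumed 1x1 reduction
        self.ga1 = GuidedAttentionModule(ch)
        self.up1, self.up2 = UpsampleModule(ch), UpsampleModule(ch)
        self.att1, self.att2 = AttentionModule(ch), AttentionModule(ch)
        self.m_conv2 = nn.Conv2d(ch, 1, 3, padding=1)

    def forward(self, d_lr, d_bic, rgb):
        # Structure side branch (bicubic-upsampled depth) and detail side branch (color).
        s1 = self.s_conv(d_bic); s_d1 = self.s_down1(s1); s_d2 = self.s_down2(s_d1)
        c1 = self.c_conv(rgb);   c_d1 = self.c_down1(c1); c_d2 = self.c_down2(c_d1)
        # Main branch over the low-resolution depth map.
        x = self.m_conv1(d_lr)
        x = self.fuse1(torch.cat([x, s_d2], dim=1))  # fusion layer 1
        x = self.ga1(x, c_d2)                        # guided attention module 1
        x = self.att1(self.up1(x), s_d1, c_d1)       # attention module 1
        x = self.att2(self.up2(x), s1, c1)           # attention module 2
        return self.m_conv2(x) + d_bic               # residual over bicubic upsampling
```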
33) The edge-refinement network ERnet consists of two branches: a depth main branch and a detail side branch. The output of HDSRnet-light is input to the depth main branch, and the high-resolution color image is input to the detail side branch. Both branches are composed of convolutional layers and realize fine reconstruction of the edges.
331) Detail side branch structure: convolutional layers C1 - C2 - C3 - C4 - C5 - C6 - C7 - C8 - C9. This branch processes the high-resolution color image information; all convolutional layers are connected in series and each is followed by a ReLU activation function.
332) Depth main branch structure: convolutional layer D1 - fusion layer F1 - D2 - F2 - D3 - F3 - D4 - F4 - D5 - F5 - D6 - F6 - D7 - F7 - D8 - F8 - D9 - F9 - D10. This branch processes the depth map information and fuses in the color-branch information; all convolutional layers are connected in series, and fusion layer Fi concatenates, along the channel dimension, the output of depth-branch convolutional layer Di with the output of color-branch convolutional layer Ci.
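A sketch of ERnet under the same conventions; the channel width is assumed, as is the absence of a ReLU after the final layer D10:

```python
import torch
import torch.nn as nn

class ERnet(nn.Module):
    """Edge refinement sketch: color branch C1..C9 and depth branch D1..D10,
    where fusion layer Fi concatenates Di's output with Ci's output along
    the channel dimension before the next depth convolution."""
    def __init__(self, ch: int = 32):
        super().__init__()
        self.color = nn.ModuleList(
            [nn.Conv2d(3, ch, 3, padding=1)] +
            [nn.Conv2d(ch, ch, 3, padding=1) for _ in range(8)])       # C1..C9
        self.depth = nn.ModuleList(
            [nn.Conv2d(1, ch, 3, padding=1)] +                         # D1
            [nn.Conv2d(2 * ch, ch, 3, padding=1) for _ in range(8)] +  # D2..D9 take fused input
            [nn.Conv2d(2 * ch, 1, 3, padding=1)])                      # D10
        self.relu = nn.ReLU(inplace=True)

    def forward(self, depth, color):
        c = color
        d = self.relu(self.depth[0](depth))          # D1
        for i in range(9):
            c = self.relu(self.color[i](c))          # Ci
            d = torch.cat([d, c], dim=1)             # fusion layer Fi
            if i < 8:
                d = self.relu(self.depth[i + 1](d))  # D2..D9
        return self.depth[9](d)                      # D10 (no ReLU, assumed)
```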
4) Setting the network learning rate and the weight of each loss-function term, and training the convolutional neural network with the deep-learning framework PyTorch until the loss converges, producing a trained model;
41) after determining the network structure, inputting training data into the network;
42) In the first stage of network training, the initial learning rate of the HDSRnet-light network is set to 0.0001 and is reduced to 0.1 times its value after each epoch; HDSRnet-light is trained with a norm-based loss function (the norm's formula appears only as an image in the source);
43) In the second stage of network training, after HDSRnet-light has been trained for 30 epochs, ERnet is appended to HDSRnet-light and the cascade is trained with a second norm-based loss function (likewise given as a formula image in the source);
44) Training proceeds: HDSRnet-light provides the mapping from the low-resolution depth map to the high-resolution depth map, and ERnet provides the mapping from the high-resolution depth map to the fine high-resolution depth map.
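A sketch of this two-stage schedule. The optimizer (Adam), the concrete norms (L2 for stage one and L1 for stage two, standing in for the formulas that appear only as images in the source), and the stage-two epoch count are assumptions; the learning-rate decay of 0.1x per epoch follows the text literally:

```python
import torch

def train_two_stage(hdsr_light, ernet, loader, device="cuda"):
    """loader yields (d_lr, d_bic, rgb, d_hr) batches as in the data sketch."""
    # Stage 1: HDSRnet-light alone, lr = 1e-4, decayed to 0.1x after each epoch.
    opt = torch.optim.Adam(hdsr_light.parameters(), lr=1e-4)  # optimizer assumed
    sched = torch.optim.lr_scheduler.StepLR(opt, step_size=1, gamma=0.1)
    stage1_loss = torch.nn.MSELoss()  # assumed norm loss for stage 1
    for epoch in range(30):
        for d_lr, d_bic, rgb, d_hr in loader:
            d_lr, d_bic, rgb, d_hr = (t.to(device) for t in (d_lr, d_bic, rgb, d_hr))
            loss = stage1_loss(hdsr_light(d_lr, d_bic, rgb), d_hr)
            opt.zero_grad(); loss.backward(); opt.step()
        sched.step()

    # Stage 2: append ERnet and train the cascade (restart lr assumed).
    params = list(hdsr_light.parameters()) + list(ernet.parameters())
    opt = torch.optim.Adam(params, lr=1e-4)
    stage2_loss = torch.nn.L1Loss()  # assumed norm loss for stage 2
    for epoch in range(30):          # stage-2 length assumed
        for d_lr, d_bic, rgb, d_hr in loader:
            d_lr, d_bic, rgb, d_hr = (t.to(device) for t in (d_lr, d_bic, rgb, d_hr))
            coarse = hdsr_light(d_lr, d_bic, rgb)
            loss = stage2_loss(ernet(coarse, rgb), d_hr)
            opt.zero_grad(); loss.backward(); opt.step()
```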
Aiming at depth map super-resolution, the method is based on deep learning: it generates low-resolution data from public data sets and trains the convolutional neural network HDSRnet. The HDSRnet-light part achieves efficient depth map super-resolution, reaching excellent performance with a very small number of parameters, while the ERnet part realizes more precise edge reconstruction. The invention has the following characteristics:
1. An efficient lightweight depth map super-resolution network, HDSRnet-light, is proposed, in which the attention modules efficiently fuse the features of the high-resolution color image and the low-resolution depth image. The network can perform 8× super-resolution on a 135×240 depth map at 360 frames per second, exceeding related methods in both performance and speed.
2. An edge-refinement network, ERnet, is proposed that performs finer restoration on top of HDSRnet-light, correcting some edge deviations and achieving better performance.
3. According to the characteristics of different loss functions, a two-stage training strategy is proposed; exploiting the respective characteristics of the two norm losses (given as formula images in the source) accelerates convergence and improves performance.
Drawings
FIG. 1 is an algorithm flow diagram;
FIG. 2 is a network framework diagram;
FIG. 3 is an attention module frame diagram;
FIG. 4 is a comparison of various methods: (a) the original high-resolution depth map; (b) the result of bicubic-interpolation upsampling; (c) the result of the deep-learning method DepthSR; (d) the result of the deep-learning method PMBANet; (e) the result of the deep-learning method DEIN; (f) the super-resolution result of the proposed method.
Detailed Description
In order to overcome the defects of the prior art, the invention provides a depth map super-resolution method that achieves high running efficiency and excellent performance through a guided attention mechanism:
1) Analyzing the characteristics of the low-resolution depth map and establishing the degradation model of the captured image:
D_lr = H D_hr + n_0
where D_lr is the low-resolution depth image captured by the depth camera, D_hr is the high-resolution depth image, H is the downsampling matrix, and n_0 is noise.
2) Building the data set
The high-resolution data of the training set are obtained by randomly cropping 58 RGB-D images from the MPI Sintel depth data set and 34 RGB-D images from the Middlebury data set; each RGB-D pair is cropped to a fixed size, and data diversity is increased by randomly rotating each pair by an angle θ ∈ {0°, 90°, 180°, 270°}. The low-resolution data are obtained from the high-resolution depth maps by bicubic-interpolation downsampling.
The test set was generated in the same way using six images in the Middlebury 2005 dataset.
3) Designing a network framework
31) The whole network structure consists of a lightweight network HDSRnet-light and an optional edge-refinement network ERnet. The low-resolution depth map, the bicubic-interpolation-upsampled depth map, and the high-resolution color map are input into HDSRnet-light to obtain a high-resolution depth map. The high-resolution depth map and the high-resolution color map are then input into ERnet for further reconstruction, finally yielding a fine high-resolution output. The whole network is HDSRnet;
32) The lightweight network HDSRnet-light consists of 3 branches: a main branch, a structure side branch, and a detail side branch. The two side branches are built from upsampling and downsampling modules, and the main branch is built from upsampling modules, downsampling modules, a guided attention module, and attention modules. The low-resolution depth map is input into the main branch, the bicubic-interpolation-upsampled depth map into the structure side branch, and the high-resolution color map into the detail side branch.
321) Downsampling module structure: convolutional layer 1 - pooling layer 1 - convolutional layer 2. Every convolutional layer in the module is followed by a ReLU activation function.
322) Upsampling module structure: deconvolution layer 1 - convolutional layer 1. Every convolutional layer and deconvolution layer in the module is followed by a ReLU activation function.
323) Guided attention module structure: convolutional layer 1 - convolutional layer 2 - multiplication layer - convolutional layer 3. The main-branch features enter convolutional layer 1 and the side-branch features enter convolutional layer 2; the outputs of convolutional layers 1 and 2 are multiplied element-wise in the multiplication layer; the output of convolutional layer 2 also enters convolutional layer 3, and the output of convolutional layer 3 is added to the output of the multiplication layer. Every convolutional layer is followed by a ReLU activation function.
324) Attention module structure: guided attention module 1 - guided attention module 2. The main-branch features and one set of side-branch features enter guided attention module 1; the output of guided attention module 1, together with the other set of side-branch features, enters guided attention module 2.
325) Main branch structure: convolutional layer 1 - fusion layer 1 - guided attention module 1 - upsampling module 1 - attention module 1 - upsampling module 2 - attention module 2 - convolutional layer 2. Fusion layer 1 concatenates the outputs of main-branch convolutional layer 1 and structure-side-branch downsampling module 2 along the channel dimension; guided attention module 1 fuses the output of detail-side-branch downsampling module 2 with that of main-branch fusion layer 1; attention module 1 fuses the outputs of structure-side-branch downsampling module 1, detail-side-branch downsampling module 1, and main-branch upsampling module 1; attention module 2 fuses the outputs of structure-side-branch convolutional layer 1, detail-side-branch convolutional layer 1, and main-branch upsampling module 2. The final output is the sum of the output of main-branch convolutional layer 2 and the bicubic-interpolation-upsampled input;
326) Detail side branch structure: convolutional layer 1 - downsampling module 1 - downsampling module 2. This branch processes the information of the high-resolution color image; the outputs of downsampling module 1 and downsampling module 2 are sent to the attention modules of the main branch for feature fusion. Convolutional layer 1 is followed by a ReLU activation function;
327) Structure side branch structure: convolutional layer 1 - downsampling module 1 - downsampling module 2. This branch processes the bicubic-interpolation-upsampled depth map; the outputs of downsampling module 1 and downsampling module 2 are sent to the attention modules of the main branch for feature fusion. Convolutional layer 1 is followed by a ReLU activation function.
33) The edge-refinement network ERnet consists of two branches: a depth main branch and a detail side branch. The output of HDSRnet-light is input to the depth main branch, and the high-resolution color image is input to the detail side branch. Both branches are composed of convolutional layers and realize fine reconstruction of the edges.
331) Detail side branch structure: convolutional layers C1 - C2 - C3 - C4 - C5 - C6 - C7 - C8 - C9. This branch processes the high-resolution color image information; all convolutional layers are connected in series and each is followed by a ReLU activation function.
332) Depth main branch structure: convolutional layer D1 - fusion layer F1 - D2 - F2 - D3 - F3 - D4 - F4 - D5 - F5 - D6 - F6 - D7 - F7 - D8 - F8 - D9 - F9 - D10. This branch processes the depth map information and fuses in the color-branch information; all convolutional layers are connected in series, and fusion layer Fi concatenates, along the channel dimension, the output of depth-branch convolutional layer Di with the output of color-branch convolutional layer Ci.
4) Setting the network learning rate and the weight of each loss-function term, and training the convolutional neural network with the deep-learning framework PyTorch until the loss converges, producing a trained model;
41) after determining the network structure, inputting training data into the network;
42) In the first stage of network training, the initial learning rate of the HDSRnet-light network is set to 0.0001 and is reduced to 0.1 times its value after each epoch; HDSRnet-light is trained with a norm-based loss function (the norm's formula appears only as an image in the source);
43) In the second stage of network training, after HDSRnet-light has been trained for 30 epochs, ERnet is appended to HDSRnet-light and the cascade is trained with a second norm-based loss function (likewise given as a formula image in the source);
44) Training proceeds: HDSRnet-light provides the mapping from the low-resolution depth map to the high-resolution depth map, and ERnet provides the mapping from the high-resolution depth map to the fine high-resolution depth map.
5) The RGB-D pairs of the test set are input into the network, finally obtaining the corresponding fine high-resolution depth images.
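A test-time usage sketch matching step 5), assuming module interfaces as in the sketches above:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def super_resolve(hdsr_light, ernet, d_lr, rgb_hr, scale=8):
    """Run the trained cascade on one test pair.

    d_lr: (N, 1, h, w) low-resolution depth; rgb_hr: (N, 3, H, W) color guide.
    """
    d_bic = F.interpolate(d_lr, scale_factor=scale,
                          mode="bicubic", align_corners=False)
    coarse = hdsr_light(d_lr, d_bic, rgb_hr)  # HDSRnet-light output
    return ernet(coarse, rgb_hr)              # refined high-resolution depth
```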
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (1)

1. A depth map super-resolution method based on a guided attention mechanism, comprising the following steps:
1) analyzing the characteristics of the low-resolution depth map and establishing the degradation model of the captured image:
D_lr = H D_hr + n_0
where D_lr is the low-resolution depth image captured by the depth camera, D_hr is the high-resolution depth image, H is the downsampling matrix, and n_0 is noise;
2) building the data set
the high-resolution data of the training set are obtained by randomly cropping a number of RGB-D images from the MPI Sintel depth data set and a number of RGB-D images from the Middlebury data set; each RGB-D pair is cropped to a fixed size, and data diversity is increased by randomly rotating each pair by an angle θ ∈ {0°, 90°, 180°, 270°}; the low-resolution data are obtained from the high-resolution depth maps by bicubic-interpolation downsampling;
3) designing a network framework
31) the whole network structure consists of a lightweight network HDSRnet-light and an optional edge-refinement network ERnet; the low-resolution depth map, the bicubic-interpolation-upsampled depth map, and the high-resolution color map are input into HDSRnet-light to obtain a high-resolution depth map; the high-resolution depth map and the high-resolution color map are input into ERnet for further reconstruction, finally yielding a fine high-resolution output, the whole network being HDSRnet;
32) the lightweight network HDSRnet-light consists of 3 branches: a main branch, a structure side branch, and a detail side branch; the two side branches are built from upsampling and downsampling modules, and the main branch is built from upsampling modules, downsampling modules, a guided attention module, and attention modules; the low-resolution depth map is input into the main branch, the bicubic-interpolation-upsampled depth map into the structure side branch, and the high-resolution color map into the detail side branch;
321) downsampling module structure: convolutional layer 1 - pooling layer 1 - convolutional layer 2, every convolutional layer in the module being followed by a ReLU activation function;
322) upsampling module structure: deconvolution layer 1 - convolutional layer 1, every convolutional layer and deconvolution layer in the module being followed by a ReLU activation function;
323) guided attention module structure: convolutional layer 1 - convolutional layer 2 - multiplication layer - convolutional layer 3, wherein the main-branch features enter convolutional layer 1 and the side-branch features enter convolutional layer 2, the outputs of convolutional layers 1 and 2 are multiplied element-wise in the multiplication layer, the output of convolutional layer 2 also enters convolutional layer 3, the output of convolutional layer 3 is added to the output of the multiplication layer, and every convolutional layer is followed by a ReLU activation function;
324) attention module structure: guided attention module 1 - guided attention module 2, wherein the main-branch features and one set of side-branch features enter guided attention module 1, and the output of guided attention module 1, together with the other set of side-branch features, enters guided attention module 2;
325) main branch structure: convolutional layer 1 - fusion layer 1 - guided attention module 1 - upsampling module 1 - attention module 1 - upsampling module 2 - attention module 2 - convolutional layer 2, wherein fusion layer 1 concatenates the outputs of main-branch convolutional layer 1 and structure-side-branch downsampling module 2 along the channel dimension, guided attention module 1 fuses the output of detail-side-branch downsampling module 2 with that of main-branch fusion layer 1, attention module 1 fuses the outputs of structure-side-branch downsampling module 1, detail-side-branch downsampling module 1, and main-branch upsampling module 1, and attention module 2 fuses the outputs of structure-side-branch convolutional layer 1, detail-side-branch convolutional layer 1, and main-branch upsampling module 2; the final output is the sum of the output of main-branch convolutional layer 2 and the bicubic-interpolation-upsampled input;
326) detail side branch structure: convolutional layer 1 - downsampling module 1 - downsampling module 2, this branch processing the information of the high-resolution color image, with the outputs of downsampling module 1 and downsampling module 2 all sent to the attention modules of the main branch for feature fusion; convolutional layer 1 is followed by a ReLU activation function;
327) structure side branch structure: convolutional layer 1 - downsampling module 1 - downsampling module 2, this branch processing the bicubic-interpolation-upsampled depth map, with the outputs of downsampling module 1 and downsampling module 2 sent to the attention modules of the main branch for feature fusion; convolutional layer 1 is followed by a ReLU activation function;
33) the edge-refinement network ERnet consists of two branches: a depth main branch and a detail side branch; the output of HDSRnet-light is input to the depth main branch, and the high-resolution color image is input to the detail side branch; both branches are composed of convolutional layers and realize fine reconstruction of the edges;
331) detail side branch structure: convolutional layers C1 - C2 - C3 - C4 - C5 - C6 - C7 - C8 - C9, this branch processing the high-resolution color image information; all convolutional layers are connected in series and each is followed by a ReLU activation function;
332) depth main branch structure: convolutional layer D1 - fusion layer F1 - D2 - F2 - D3 - F3 - D4 - F4 - D5 - F5 - D6 - F6 - D7 - F7 - D8 - F8 - D9 - F9 - D10, this branch processing the depth map information and fusing in the color-branch information; all convolutional layers are connected in series, and fusion layer Fi concatenates, along the channel dimension, the output of depth-branch convolutional layer Di with the output of color-branch convolutional layer Ci;
4) setting the network learning rate and the weight of each loss-function term, and training the convolutional neural network with the deep-learning framework PyTorch until the loss converges, producing a trained model;
41) after determining the network structure, inputting training data into the network;
42) in the first stage of network training, the initial learning rate of the HDSRnet-light network is set to 0.0001 and is reduced to 0.1 times its value after each epoch; HDSRnet-light is trained with a norm-based loss function (the norm's formula appears only as an image in the source);
43) in the second stage of network training, after HDSRnet-light has been trained for 30 epochs, ERnet is appended to HDSRnet-light and the cascade is trained with a second norm-based loss function (likewise given as a formula image in the source);
44) training proceeds: HDSRnet-light provides the mapping from the low-resolution depth map to the high-resolution depth map, and ERnet provides the mapping from the high-resolution depth map to the fine high-resolution depth map.
CN202111400288.8A 2021-11-19 2021-11-19 Depth map super-resolution method based on attention guide mechanism Pending CN114170079A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111400288.8A CN114170079A (en) 2021-11-19 2021-11-19 Depth map super-resolution method based on attention guide mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111400288.8A CN114170079A (en) 2021-11-19 2021-11-19 Depth map super-resolution method based on attention guide mechanism

Publications (1)

Publication Number Publication Date
CN114170079A (en) 2022-03-11

Family

ID=80480419

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111400288.8A Pending CN114170079A (en) 2021-11-19 2021-11-19 Depth map super-resolution method based on attention guide mechanism

Country Status (1)

Country Link
CN (1) CN114170079A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114998683A (en) * 2022-06-01 2022-09-02 北京理工大学 Attention mechanism-based ToF multipath interference removing method
CN114998683B (en) * 2022-06-01 2024-05-31 北京理工大学 Attention mechanism-based ToF multipath interference removal method

Similar Documents

Publication Publication Date Title
CN108765296B (en) Image super-resolution reconstruction method based on recursive residual attention network
CN113362223B (en) Image super-resolution reconstruction method based on attention mechanism and two-channel network
CN114092330B (en) Light-weight multi-scale infrared image super-resolution reconstruction method
CN111275618A (en) Depth map super-resolution reconstruction network construction method based on double-branch perception
CN111179167B (en) Image super-resolution method based on multi-stage attention enhancement network
CN110136062B (en) Super-resolution reconstruction method combining semantic segmentation
CN110930342B (en) Depth map super-resolution reconstruction network construction method based on color map guidance
CN112435191B (en) Low-illumination image enhancement method based on fusion of multiple neural network structures
CN112365403B (en) Video super-resolution recovery method based on deep learning and adjacent frames
CN105678728A (en) High-efficiency super-resolution imaging device and method with regional management
CN116071243A (en) Infrared image super-resolution reconstruction method based on edge enhancement
CN112419191B (en) Image motion blur removing method based on convolution neural network
CN111654621B (en) Dual-focus camera continuous digital zooming method based on convolutional neural network model
CN113409190B (en) Video super-resolution method based on multi-frame grouping and feedback network
CN112017116B (en) Image super-resolution reconstruction network based on asymmetric convolution and construction method thereof
CN116486074A (en) Medical image segmentation method based on local and global context information coding
CN115100039B (en) Lightweight image super-resolution reconstruction method based on deep learning
Zhang et al. Deformable and residual convolutional network for image super-resolution
CN113538243A (en) Super-resolution image reconstruction method based on multi-parallax attention module combination
CN114170079A (en) Depth map super-resolution method based on attention guide mechanism
Wang et al. DDistill-SR: Reparameterized dynamic distillation network for lightweight image super-resolution
Gong et al. Learning deep resonant prior for hyperspectral image super-resolution
CN113610707A (en) Video super-resolution method based on time attention and cyclic feedback network
CN111861870B (en) End-to-end parallel generator network construction method for image translation
CN113362239A (en) Deep learning image restoration method based on feature interaction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination