CN109685842A - Sparse depth densification method based on a multi-scale network - Google Patents
Sparse depth densification method based on a multi-scale network
- Publication number: CN109685842A
- Application number: CN201811531022.5A
- Authority
- CN
- China
- Prior art keywords
- layer
- convolution
- depth
- branch
- convolution block
- Prior art date: 2018-12-14
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/50—Depth or shape recovery
- G06T7/55—Depth or shape recovery from multiple images
- G06T7/593—Depth or shape recovery from multiple images from stereo images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10028—Range image; Depth image; 3D point clouds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
Landscapes
- Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The invention discloses a sparse depth densification method based on a multi-scale network, belonging to the field of depth estimation in computer vision. The invention uses a multi-scale convolutional neural network to effectively fuse RGB image data with sparse point cloud data, finally obtaining a dense depth image. A sparse point cloud is mapped onto a two-dimensional plane to generate a sparse depth map, which is aligned with the RGB image; the sparse depth map and the RGB image are then concatenated into an RGBD image, which is input to the multi-scale convolutional neural network for training and testing, finally estimating a dense depth map. Estimating depth from the combination of an RGB image and a sparse point cloud allows the range information contained in the point cloud to guide the conversion of the RGB image into a depth map. The multi-scale network exploits the information of the raw data at different resolutions: on the one hand this enlarges the field of view, and on the other hand the input depth map at the smaller resolution is denser, so a higher accuracy can be obtained.
Description
Technical field
The invention belongs to the field of depth estimation in computer vision, and in particular relates to a sparse depth densification method based on a multi-scale convolutional neural network.
Background art
In autonomous driving, the perception system based on computer vision technology is the most fundamental component. Currently, the sensor most commonly used in autonomous-driving perception systems is the visible-light camera, which is inexpensive and relies on mature technology. However, visible-light cameras also have clear drawbacks. First, the RGB images captured by a camera carry only color information; if the target texture is complex, the perception system is prone to misjudgment. Second, visible-light cameras fail in certain environments, for example at night with insufficient illumination, where a camera can hardly work normally. Lidar is another sensor commonly used in autonomous-driving perception systems. Lidar is not easily affected by illumination conditions, and the point cloud data it acquires are three-dimensional; a depth image can be obtained directly from the point cloud data. A depth image is the image formed by mapping the point cloud onto a two-dimensional plane, where the value of each pixel represents the distance from that point to the sensor. Compared with RGB images, the range information contained in a depth image is more helpful for tasks such as object recognition and segmentation. However, lidar is expensive, the point clouds it acquires are too sparse, and the depth maps generated from them are likewise too sparse, which limits their usefulness to some degree.
Summary of the invention
The object of the invention is to address the above problems by providing a method that densifies sparse depth using a multi-scale network.
The sparse depth densification method based on a multi-scale network of the invention comprises the following steps:
Constructing a multi-scale network model:
The multi-scale network model comprises L (L >= 2) input branches; the outputs of the L branches are added pointwise and fed into an information fusion layer, and the information fusion layer is followed by an upsampling layer that serves as the output layer of the multi-scale network model.
Among the L input branches, one branch takes the original image as input, and the remaining L-1 branches take as input the downsampled images obtained from the original image under different downsampling factors; the output image of the output layer of the multi-scale network model has the same size as the original image.
The input data of the L input branches comprises an RGB image and a sparse depth map. The sparse depth map of the original image is downsampled as follows: given a preset downsampling factor K, the sparse depth map is divided into a grid of cells according to pixels, each cell containing K × K original input pixels. A mark value s_i is set for each original input pixel according to its depth value: if the depth value of the current original input pixel is 0, then s_i = 0; otherwise s_i = 1, where i indexes the K × K original input pixels contained in each cell. The depth value of each cell is then obtained according to the formula

p_new = (Σ_{i=1}^{K×K} p_i · s_i) / (Σ_{i=1}^{K×K} s_i),

where p_i denotes the depth value of original input pixel i.
The network structure of the branch whose input is the original image is the first network structure.
The network structure of a branch whose input is a downsampled image of the original image is the first network structure followed by K/2 appended 16-channel upsampling convolution blocks D, where K denotes the downsampling factor applied to the original image.
The first network structure comprises 14 layers, as follows:
The first layer consists of an input layer and a pooling layer; the convolution kernel size of the input layer is 7×7, the number of channels is 64, and the convolution stride is 2; the pooling layer uses max pooling with a 3×3 kernel and a stride of 2.
The second and third layers have the same structure; each is a 64-channel R1 residual convolution block.
The fourth layer is a 128-channel R2 residual convolution block.
The fifth layer is a 128-channel R1 residual convolution block.
The sixth layer is a 256-channel R2 residual convolution block.
The seventh layer is a 256-channel R1 residual convolution block.
The eighth layer is a 512-channel R2 residual convolution block.
The ninth layer is a 512-channel R1 residual convolution block.
The tenth layer is a convolutional layer with a 3×3 kernel, 256 channels, and a stride of 1.
The eleventh layer is a 128-channel upsampling convolution block D, and the output of the eleventh layer is concatenated channel-wise with the output of the seventh layer before being fed into the twelfth layer.
The twelfth layer is a 64-channel upsampling convolution block D, and the output of the twelfth layer is concatenated channel-wise with the output of the fifth layer before being fed into the thirteenth layer.
The thirteenth layer is a 32-channel upsampling convolution block D, and the output of the thirteenth layer is concatenated channel-wise with the output of the third layer before being fed into the fourteenth layer.
The fourteenth layer is a 16-channel upsampling convolution block D.
The R1 residual convolution block comprises two convolutional layers of identical structure, each with a 3×3 kernel and a stride of 1; the number of channels is adjustable. The input of the R1 block is added pointwise to the output of its second layer and passed through a ReLU activation function, which serves as the output layer of the R1 block.
The R2 residual convolution block comprises first, second, and third convolutional layers. The input of the R2 block enters two branches, and the outputs of the two branches are added pointwise and passed through a ReLU activation function, which serves as the output layer of the R2 block; one branch connects the first and second convolutional layers in sequence, and the other branch is the third convolutional layer.
The first convolutional layer has a 3×3 kernel and a stride of 2; the second convolutional layer has a 3×3 kernel and a stride of 1; the third convolutional layer has a 1×1 kernel and a stride of 2; the number of channels of each is adjustable.
The upsampling convolution block D comprises two amplification modules and one convolutional layer. The input of block D enters two branches, and the outputs of the two branches are added pointwise and passed through a ReLU activation function, which serves as the output layer of block D; one branch connects the first amplification module and the convolutional layer in sequence, and the other branch is the second amplification module.
The convolutional layer of the upsampling convolution block D has a 3×3 kernel, a stride of 1, and an adjustable number of channels.
The amplification module of the upsampling convolution block D comprises four parallel convolutional layers whose channel counts are identical, whose kernel sizes are 3×3, 3×2, 2×3, and 2×2, respectively, and whose stride is 1; the input of the amplification module passes through its four convolutional layers, and their outputs are stitched together to form the output of the amplification module.
The information fusion module is a convolutional layer with a 3×3 kernel, 1 channel, and a stride of 1.
Deep learning training is performed on the constructed multi-scale network model, and the trained multi-scale network model is then used to obtain the densified result of an image to be processed.
In conclusion by adopting the above-described technical solution, the beneficial effects of the present invention are: the present invention utilizes sparse cloud
The mode estimating depth combined with image, sparse depth instruct RGB image, and RGB image mends sparse depth
It fills, in conjunction with the advantages of two kinds of data modes, multiple scale network modelings in conjunction with set by the present invention carry out estimation of Depth, improve
The accuracy rate of estimation of Depth.
Brief description of the drawings
Fig. 1 is a schematic diagram of the downsampling in the specific embodiment.
Fig. 2 is a schematic diagram of the residual convolution blocks in the specific embodiment, where Fig. 2-a shows the type-one residual convolution block and Fig. 2-b shows the type-two residual convolution block.
Fig. 3 is a schematic diagram of the upsampling convolution block in the specific embodiment, where Fig. 3-a shows the amplification module and Fig. 3-b shows the complete upsampling convolution block.
Fig. 4 is a schematic diagram of the multi-scale network structure used in the specific embodiment.
Fig. 5 compares the results of the invention with those of an existing processing method in the specific embodiment, where Fig. 5-a is the input RGB image, Fig. 5-b is the sparse depth map, Fig. 5-c is the depth estimate of the existing method for Fig. 5-b, and Fig. 5-d is the depth estimation result of the invention for Fig. 5-b.
Specific embodiments
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the embodiments and the accompanying drawings.
To meet the higher demands on depth-image quality in special scenarios (such as autonomous driving), the present invention proposes a method that densifies sparse depth using a multi-scale network. Existing depth estimation methods mainly obtain a dense depth map directly from an RGB image, but estimating a depth image directly from a two-dimensional image is inherently ambiguous. To solve this problem, the present invention estimates depth from the combination of a sparse point cloud and an image: the sparse depth guides the RGB image and the RGB image supplements the sparse depth, combining the advantages of the two data modalities, while depth estimation is performed at multiple scales, which improves the accuracy of depth estimation.
The present invention uses a multi-scale convolutional neural network to effectively fuse RGB image data with sparse point cloud data, finally obtaining a dense depth image. A sparse point cloud is mapped onto a two-dimensional plane to generate a sparse depth map, which is aligned with the RGB image; the sparse depth map and the RGB image are then concatenated into an RGBD (RGB + Depth Map) image, which is input to the multi-scale convolutional neural network for training and testing, finally estimating a dense depth map. Estimating depth from the combination of an RGB image and a sparse point cloud allows the range information contained in the point cloud to guide the conversion of the RGB image into a depth map. The multi-scale network exploits the information of the raw data at different resolutions: on the one hand this enlarges the field of view, and on the other hand the input depth map at the smaller resolution is denser, so a higher accuracy can be obtained.
The specific implementation of the multi-scale sparse depth densification method proposed by the present invention is as follows:
(1) Downsampling of the input data:
The feasible downsampling factor depends strongly on the size of the input data. For an input image of size M×N, the feasible range of the downsampling factor is [2, min(M, N) · 2^-5].
The sampling procedure is as follows. Let K denote the selected downsampling factor. The input sparse depth map is divided into a grid of cells according to pixels, each cell containing K×K original input pixels, so the input image is divided into (M/K) × (N/K) cells. Fig. 1 is a schematic diagram of downsampling with a factor of 2. The K×K pixels in a cell are expressed as the pixel set P = {p_1, p_2, …, p_{K×K}}.
A sparse depth map contains pixels whose depth value is zero; these values are called invalid values. A mark value s is constructed to flag the invalid values: if the depth value of a pixel is not equal to 0, the pixel is considered valid and s is set to 1; otherwise it is an invalid value and s is set to 0. The set of mark values corresponding to the pixel set P is thus S = {s_1, s_2, …, s_{K×K}}.
The new depth value after the above downsampling is

p_new = (Σ_{n=1}^{K×K} p_n · s_n) / (Σ_{n=1}^{K×K} s_n),

where p_n denotes the depth value of an original pixel and s_n denotes the mark value of that pixel.
The above operation is applied to every cell of the grid, yielding a depth map of smaller resolution and higher density (referred to as the small-resolution depth map). Compared with traditional downsampling methods, the small-resolution depth map obtained in this way is denser, and since the influence of the invalid values is eliminated, its depth values are also more accurate. The RGB image is downsampled with the traditional bilinear-interpolation method. The result is a small-resolution image and a small-resolution sparse depth map.
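To make the cell computation concrete, the following is a minimal NumPy sketch of the validity-masked downsampling described above. The function name `downsample_sparse_depth` is introduced here for illustration, and returning 0 for a cell that contains no valid pixel is an assumed convention; the text does not specify that case.

```python
import numpy as np

def downsample_sparse_depth(depth, K):
    """Downsample a sparse depth map by K, averaging only valid (non-zero)
    depths in each K x K cell, as in the formula for p_new above."""
    M, N = depth.shape
    assert M % K == 0 and N % K == 0, "image size must be divisible by K"
    cells = depth.reshape(M // K, K, N // K, K)   # one K x K cell per entry
    s = (cells != 0).astype(depth.dtype)          # mark values s_n
    num = (cells * s).sum(axis=(1, 3))            # sum of p_n * s_n per cell
    den = s.sum(axis=(1, 3))                      # sum of s_n (valid count)
    # Cells with no valid pixel yield 0 (assumed convention).
    return np.divide(num, den, out=np.zeros_like(num), where=den > 0)

# Example: factor-2 downsampling of a 480 x 640 sparse depth map
rng = np.random.default_rng(0)
sparse = np.zeros((480, 640), dtype=np.float32)
idx = rng.choice(sparse.size, size=1000, replace=False)
sparse.ravel()[idx] = rng.uniform(0.5, 10.0, size=1000).astype(np.float32)
small = downsample_sparse_depth(sparse, 2)        # 240 x 320, denser
```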
(2) Constructing the residual convolution blocks:
The residual convolution block is an important component of the multi-scale network of the invention; it extracts features from the input data and comes in two types.
Type one: the residual convolution block R1 is built as follows. As shown in Fig. 2-a, the first layer of the block is a convolutional layer with a 3×3 kernel, n channels, and a stride of 1; the second layer has the same structure as the first. The input of the block is then added pointwise to the output of the second layer, and finally a ReLU activation function is applied. The structure of the residual convolution block is fixed, but the number of channels of its convolutional layers is variable; adjusting the channel count yields different residual convolution blocks, so a type-one residual convolution block is accordingly named an n-channel R1. The input and output of R1 have the same size; no downsampling is performed.
Type two: the residual convolution block R2 is built as follows. As shown in Fig. 2-b, the first layer of the block is a convolutional layer with a 3×3 kernel, n channels, and a stride of 2; the second layer is also a convolutional layer, with a 3×3 kernel, n channels, and a stride of 1. The input of the block additionally passes through a convolutional layer with a 1×1 kernel, n channels, and a stride of 2, whose output is added pointwise to the output of the second layer; finally a ReLU activation function is applied. Analogous to the naming of R1, a type-two residual convolution block is named an n-channel R2. The input of R2 is twice the size of its output; the purpose of this operation is to enlarge the receptive field of the convolution kernels and better extract global features.
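As a concrete illustration, here is a minimal PyTorch sketch of the two block types. The `padding=1` on the 3×3 convolutions (needed to preserve the stated sizes) and the ReLU between the two stacked convolutions are common-practice assumptions that the text does not state explicitly.

```python
import torch
import torch.nn as nn

class R1(nn.Module):
    """Type-one residual block: two 3x3 stride-1 convs plus identity skip;
    input and output sizes are equal."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, stride=1, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, stride=1, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.conv2(self.relu(self.conv1(x)))  # intermediate ReLU assumed
        return self.relu(x + out)                   # pointwise add, then ReLU

class R2(nn.Module):
    """Type-two residual block: 3x3 stride-2 conv then 3x3 stride-1 conv,
    with a 1x1 stride-2 conv on the shortcut; halves the spatial size."""
    def __init__(self, in_channels, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, channels, 3, stride=2, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, stride=1, padding=1)
        self.shortcut = nn.Conv2d(in_channels, channels, 1, stride=2)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.conv2(self.relu(self.conv1(x)))
        return self.relu(self.shortcut(x) + out)
```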
(3) Constructing the upsampling convolution block:
The upsampling convolution block is another important part of the multi-scale network; its role is to enlarge its input, and each upsampling convolution block doubles the input size. It is built as follows. The basic module of the upsampling convolution block is the amplification module. As shown in Fig. 3-a, the amplification module consists of four parallel convolutional layers, all with n channels, whose kernel sizes are 3×3, 3×2, 2×3, and 2×2, respectively; the input passes through these four convolutional layers, their outputs are stitched together, and the resulting output is twice as large as the input. As shown in Fig. 3-b, the upsampling convolution block consists of two branches. The first layer of branch one is an n-channel amplification module followed by a ReLU activation function, and its second layer is a convolutional layer with a 3×3 kernel and n channels. Branch two has only one layer, an n-channel amplification module. The output of branch one is added pointwise to the output of branch two, and finally a ReLU activation function is applied. Analogous to the naming of R1 and R2, the upsampling convolution block is named an n-channel D.
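The text says only that the four outputs are stitched together; the sketch below realizes this as a 2×2 spatial interleaving with asymmetric zero-padding so that each convolution output matches the input size, in the spirit of the fast up-convolution of Laina et al. Both the interleaving order and the padding scheme are assumptions.

```python
import torch.nn as nn
import torch.nn.functional as F

class Amplify(nn.Module):
    """Amplification module: four parallel convs (3x3, 3x2, 2x3, 2x2)
    whose outputs are interleaved into a map of twice the input size."""
    def __init__(self, in_channels, channels):
        super().__init__()
        self.c33 = nn.Conv2d(in_channels, channels, (3, 3))
        self.c32 = nn.Conv2d(in_channels, channels, (3, 2))
        self.c23 = nn.Conv2d(in_channels, channels, (2, 3))
        self.c22 = nn.Conv2d(in_channels, channels, (2, 2))

    def forward(self, x):
        # Asymmetric zero-padding (left, right, top, bottom) keeps each
        # conv output at the input's H x W despite the different kernels.
        a = self.c33(F.pad(x, (1, 1, 1, 1)))
        b = self.c32(F.pad(x, (0, 1, 1, 1)))
        c = self.c23(F.pad(x, (1, 1, 0, 1)))
        d = self.c22(F.pad(x, (0, 1, 0, 1)))
        n, ch, h, w = a.shape
        out = a.new_zeros(n, ch, 2 * h, 2 * w)
        out[:, :, 0::2, 0::2] = a       # assumed 2x2 interleaving order
        out[:, :, 0::2, 1::2] = b
        out[:, :, 1::2, 0::2] = c
        out[:, :, 1::2, 1::2] = d
        return out

class D(nn.Module):
    """n-channel upsampling block D: two amplification branches, one with an
    extra 3x3 conv, added pointwise and passed through a final ReLU."""
    def __init__(self, in_channels, channels):
        super().__init__()
        self.amp1 = Amplify(in_channels, channels)
        self.conv = nn.Conv2d(channels, channels, 3, stride=1, padding=1)
        self.amp2 = Amplify(in_channels, channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.conv(self.relu(self.amp1(x))) + self.amp2(x))
```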
(4) Constructing the multi-scale convolutional network:
The multi-scale network can be built with multiple scales, that is, with multiple branches. Like the downsampling factor, the number of branches that can be built is constrained by the input image size: for an image of size M×N, the upper limit on the number of branches is log2(min(M, N) · 2^-5) + 1. The construction method is illustrated with two branches. Two branches are established: the input of one branch is at the original resolution, and the input of the other branch is at 1/K of the original resolution, where K is the downsampling factor of the input image. Finally, the information of the two branches is fused.
The first branch, whose input is at the original resolution, is built as follows:
The first layer consists of an input layer and a pooling layer. The convolution kernel size of the input layer is 7×7, the number of channels is 64, and the convolution stride is 2; the pooling layer uses max pooling with a 3×3 kernel and a stride of 2. The original input has size M×N×4; after the first layer the size becomes (M/4) × (N/4) × 64, i.e., the spatial size becomes 1/4 of the original and the number of channels becomes 64.
The second layer is a 64-channel R1 residual convolution block, denoted R1-1.
The third layer has the same structure as the second layer and is denoted R1-2.
The fourth layer is a 128-channel R2 residual convolution block, denoted R2-1.
The fifth layer is a 128-channel R1 residual convolution block, denoted R1-3.
The sixth layer is a 256-channel R2 residual convolution block, denoted R2-2.
The seventh layer is a 256-channel R1 residual convolution block, denoted R1-4.
The eighth layer is a 512-channel R2 residual convolution block, denoted R2-3.
The ninth layer is a 512-channel R1 residual convolution block, denoted R1-5.
The tenth layer is a convolutional layer with a 3×3 kernel, 256 channels, and a stride of 1.
The eleventh layer is a 128-channel upsampling convolution block D, denoted D1.
The output of D1 is then concatenated channel-wise with the output of the seventh layer R1-4, where the output of R1-4 has size (M/16) × (N/16) × 256 and the output of D1 has size (M/16) × (N/16) × 128, so the concatenated size becomes (M/16) × (N/16) × 384. The purpose of the concatenation is to recover some of the raw information lost during convolution, making the result more accurate.
The twelfth layer is a 64-channel upsampling convolution block D, denoted D2; the output of D2 is concatenated channel-wise with the output of R1-3.
The thirteenth layer is a 32-channel upsampling convolution block D, denoted D3; the output of D3 is concatenated channel-wise with the output of R1-2.
The fourteenth layer is a 16-channel upsampling convolution block D, denoted D4.
This completes the construction of the branch whose input is at the original resolution; a code sketch of this branch is given below.
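Continuing the sketches above, a possible PyTorch assembly of this 14-layer branch, reusing the `R1`, `R2`, and `D` modules already defined. The input channel counts of `D2`, `D3`, and `D4` (384, 192, 96) follow from the channel-wise concatenations just described; the class name `Branch` is introduced here for illustration.

```python
import torch
import torch.nn as nn

class Branch(nn.Module):
    """First-branch backbone: RGBD input (4 channels) at M x N,
    16-channel output at M/2 x N/2."""
    def __init__(self):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(4, 64, 7, stride=2, padding=3),  # input layer
            nn.MaxPool2d(3, stride=2, padding=1),      # max pooling
        )
        self.r1_1, self.r1_2 = R1(64), R1(64)
        self.r2_1, self.r1_3 = R2(64, 128), R1(128)
        self.r2_2, self.r1_4 = R2(128, 256), R1(256)
        self.r2_3, self.r1_5 = R2(256, 512), R1(512)
        self.conv10 = nn.Conv2d(512, 256, 3, stride=1, padding=1)
        self.d1 = D(256, 128)   # concat with R1-4 output -> 384 channels
        self.d2 = D(384, 64)    # concat with R1-3 output -> 192 channels
        self.d3 = D(192, 32)    # concat with R1-2 output -> 96 channels
        self.d4 = D(96, 16)

    def forward(self, x):
        x = self.stem(x)                        # (M/4, N/4, 64)
        s3 = self.r1_2(self.r1_1(x))            # third-layer output (skip)
        s5 = self.r1_3(self.r2_1(s3))           # fifth-layer output (skip)
        s7 = self.r1_4(self.r2_2(s5))           # seventh-layer output (skip)
        x = self.conv10(self.r1_5(self.r2_3(s7)))
        x = torch.cat([self.d1(x), s7], dim=1)  # (M/16, N/16, 384)
        x = torch.cat([self.d2(x), s5], dim=1)  # (M/8,  N/8,  192)
        x = torch.cat([self.d3(x), s3], dim=1)  # (M/4,  N/4,  96)
        return self.d4(x)                       # (M/2,  N/2,  16)
```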
The second branch, whose input is at 1/K of the original resolution, is built as follows:
Its first fourteen layers are identical to those of the branch whose input is at the original resolution; after them, 16-channel upsampling convolution blocks D are appended in a number that depends on the input size of the branch. For a branch whose input is at 1/K of the original resolution (downsampling factor K), K/2 upsampling convolution blocks are appended. Fig. 4 shows the two-branch case, in which the input of the second branch is, as an example, at 1/2 of the original resolution (downsampling factor 2), so the number of upsampling convolution blocks D to be appended to the second branch is exactly 1. The case of further resolutions is analogous: if the input is at 1/4 of the original resolution, two 16-channel upsampling convolution blocks are appended, and so on.
After the branches have been built, the information of the two branches needs to be fused. The information fusion structure is as follows: the output of the first branch is added pointwise to the output of the second branch, and the sum serves as the input of the information fusion module. The network structure of the information fusion module is a convolutional layer with a 3×3 kernel and 1 channel; its output finally passes through bilinear upsampling to obtain the final result at the size of the original input.
Information fusion in the case of more than two branches proceeds in the same way: the outputs of all branches are added pointwise before being fed into the information fusion module. A sketch of the two-branch assembly follows.
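This is a minimal sketch of the two-branch model and its fusion stage, reusing `Branch` and `D` from above; `MultiScaleNet` is a name introduced here for illustration.

```python
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleNet(nn.Module):
    """Two-branch multi-scale model: one branch at the original resolution,
    one at 1/2 resolution with K/2 = 1 extra 16-channel D block appended."""
    def __init__(self):
        super().__init__()
        self.branch1 = Branch()
        self.branch2 = nn.Sequential(Branch(), D(16, 16))  # appended D block
        self.fuse = nn.Conv2d(16, 1, 3, stride=1, padding=1)

    def forward(self, full, half):
        y = self.branch1(full) + self.branch2(half)  # pointwise addition
        y = self.fuse(y)                             # information fusion
        return F.interpolate(y, scale_factor=2, mode='bilinear',
                             align_corners=False)    # back to M x N
```

Both branch outputs are 16-channel maps at half the original resolution, so they can be added directly; the final bilinear ×2 upsampling restores the original input size.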
(5) Setting of the loss function:
In this embodiment, the Smooth L1 loss function is used, i.e.

L = (1/N) Σ_j smoothL1(d_j - d_j^g), with smoothL1(x) = 0.5 x^2 if |x| < 1 and |x| - 0.5 otherwise,

where d denotes the depth value estimated by the convolutional neural network, d^g denotes the standard (ground-truth) depth value, and N denotes the total number of pixels in a depth map.
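A minimal PyTorch version of the Smooth L1 loss above; the threshold of 1 is the usual convention for this loss, assumed here.

```python
import torch

def smooth_l1_loss(d, d_gt):
    """Smooth L1 loss averaged over the N pixels of the depth map."""
    x = (d - d_gt).abs()
    return torch.where(x < 1, 0.5 * x ** 2, x - 0.5).mean()
```

With its default settings, `torch.nn.SmoothL1Loss()` computes the same quantity.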
(6) Training and testing of the model:
In this embodiment, the training data come from the public dataset NYU-Depth-v2. The dataset contains RGB images and dense depth maps of size 640×480. For the training process, 48,000 RGB images with their corresponding dense depth maps were selected as training data; for the test process, 654 RGB images with their corresponding dense depth maps were selected as test data. The input of the network is an RGB image and a sparse depth map; since the dataset contains no sparse depth maps, a sparse depth map is obtained by randomly sampling 1,000 points from each dense depth map, and it is combined with the RGB image into an RGBD image as input.
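A minimal NumPy sketch of this input preparation; `make_rgbd` is a helper name introduced here for illustration.

```python
import numpy as np

def make_rgbd(rgb, dense_depth, n_samples=1000, seed=None):
    """Simulate a sparse depth map by sampling n_samples points from a
    dense one, then stack it with the RGB image into an H x W x 4 input."""
    rng = np.random.default_rng(seed)
    sparse = np.zeros_like(dense_depth)
    idx = rng.choice(dense_depth.size, size=n_samples, replace=False)
    sparse.ravel()[idx] = dense_depth.ravel()[idx]   # keep sampled depths
    return np.concatenate([rgb, sparse[..., None]], axis=-1)

# Example with random stand-in data at the NYU-Depth-v2 size
rgb = np.random.default_rng(0).random((480, 640, 3), dtype=np.float32)
depth = np.random.default_rng(1).random((480, 640), dtype=np.float32) * 10
rgbd = make_rgbd(rgb, depth, seed=2)                 # 480 x 640 x 4
```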
For training, the RGBD image is downsampled to 320×240 and then center-cropped to obtain an RGBD image of size 304×228 (the original image fed to the multi-scale network model); this image serves as the input of the first branch. The image is then downsampled by a factor of two using the method described in step (1) to obtain an RGBD image of size 152×114, which serves as the input of the second branch. Eight images are trained at a time, so one pass over the full dataset takes 6,000 iterations; the entire dataset is trained 15 times, for 90,000 iterations in total. A varying learning rate is used during training: the initial learning rate is set to 0.01, and after every 5 passes over the dataset the learning rate is divided by 10, so the final learning rate is 0.0001. The parameters of the model are saved after training.
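A sketch of this training schedule, reusing `MultiScaleNet` and `smooth_l1_loss` from above. The SGD optimizer is an assumption (the text does not name one), and the stand-in batch uses a 32-divisible size so the sketch's skip concatenations align exactly; the embodiment's 304×228 crop would require size alignment (e.g., padding) at the skip connections, a detail the text does not cover.

```python
import torch

model = MultiScaleNet()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)   # optimizer assumed
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.1)

loader = [(torch.randn(8, 4, 256, 320),   # first-branch RGBD batch (stand-in)
           torch.randn(8, 4, 128, 160),   # second-branch input at 1/2 res
           torch.randn(8, 1, 256, 320))]  # dense ground-truth depth

for epoch in range(15):                   # 15 passes over the dataset
    for full, half, target in loader:     # 6,000 iterations per pass on NYU
        optimizer.zero_grad()
        loss = smooth_l1_loss(model(full, half), target)
        loss.backward()
        optimizer.step()
    scheduler.step()                      # lr / 10 every 5 epochs -> 0.0001

torch.save(model.state_dict(), "multiscale_depth.pt")  # save trained weights
```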
For testing, the model parameters are loaded, the data are processed in the same way as in the training process, and the processed data are input into the model, which outputs the final result. Fig. 5 shows some of the output results of the invention compared with an existing deep learning method. On the whole, the results of the invention are clearer, and a comparison of the regions inside the black boxes shows that the invention renders details better.
The above description is merely a specific embodiment of the invention. Any feature disclosed in this specification may, unless specifically stated otherwise, be replaced by an alternative feature that is equivalent or serves a similar purpose; and all of the disclosed features, or all of the steps of any method or process, may be combined in any way, except for mutually exclusive features and/or steps.
Claims (3)
1. A sparse depth densification method based on a multi-scale network, characterized in that it comprises the following steps:
constructing a multi-scale network model:
the multi-scale network model comprises L input branches; the outputs of the L branches are added pointwise and fed into an information fusion layer, and the information fusion layer is followed by an upsampling layer that serves as the output layer of the multi-scale network model;
wherein, among the L input branches, one branch takes the original image as input; the remaining L-1 branches take as input the downsampled images obtained from the original image under different downsampling factors; and the output image of the output layer of the multi-scale network model has the same size as the original image;
and the input data of the L input branches comprises an RGB image and a sparse depth map; wherein the sparse depth map of the original image is downsampled as follows: given a preset downsampling factor K, the sparse depth map is divided into a grid of cells according to pixels, each cell containing K × K original input pixels; a mark value s_i is set for each original input pixel according to its depth value: if the depth value of the current original input pixel is 0, then s_i = 0, otherwise s_i = 1, where i indexes the K × K original input pixels contained in each cell; and the depth value of each cell is obtained according to the formula p_new = (Σ_{i=1}^{K×K} p_i · s_i) / (Σ_{i=1}^{K×K} s_i), where p_i denotes the depth value of original input pixel i;
the network structure of the branch whose input is the original image is a first network structure;
the network structure of a branch whose input is a downsampled image of the original image is the first network structure followed by K/2 appended 16-channel upsampling convolution blocks D, where K denotes the downsampling factor applied to the original image;
the first network structure comprises 14 layers, as follows:
the first layer consists of an input layer and a pooling layer; the convolution kernel size of the input layer is 7×7, the number of channels is 64, and the convolution stride is 2; the pooling layer uses max pooling with a 3×3 kernel and a stride of 2;
the second and third layers have the same structure, each being a 64-channel R1 residual convolution block;
the fourth layer is a 128-channel R2 residual convolution block;
the fifth layer is a 128-channel R1 residual convolution block;
the sixth layer is a 256-channel R2 residual convolution block;
the seventh layer is a 256-channel R1 residual convolution block;
the eighth layer is a 512-channel R2 residual convolution block;
the ninth layer is a 512-channel R1 residual convolution block;
the tenth layer is a convolutional layer with a 3×3 kernel, 256 channels, and a stride of 1;
the eleventh layer is a 128-channel upsampling convolution block D, and the output of the eleventh layer is concatenated channel-wise with the output of the seventh layer before being fed into the twelfth layer;
the twelfth layer is a 64-channel upsampling convolution block D, and the output of the twelfth layer is concatenated channel-wise with the output of the fifth layer before being fed into the thirteenth layer;
the thirteenth layer is a 32-channel upsampling convolution block D, and the output of the thirteenth layer is concatenated channel-wise with the output of the third layer before being fed into the fourteenth layer;
the fourteenth layer is a 16-channel upsampling convolution block D;
the R1 residual convolution block comprises two convolutional layers of identical structure, each with a 3×3 kernel and a stride of 1, the number of channels being adjustable; the input of the R1 block is added pointwise to the output of its second layer and passed through a ReLU activation function, which serves as the output layer of the R1 block;
the R2 residual convolution block comprises first, second, and third convolutional layers; the input of the R2 block enters two branches, and the outputs of the two branches are added pointwise and passed through a ReLU activation function, which serves as the output layer of the R2 block; one branch connects the first and second convolutional layers in sequence, and the other branch is the third convolutional layer;
the first convolutional layer has a 3×3 kernel and a stride of 2; the second convolutional layer has a 3×3 kernel and a stride of 1; the third convolutional layer has a 1×1 kernel and a stride of 2; the number of channels of each is adjustable;
the upsampling convolution block D comprises two amplification modules and one convolutional layer, wherein the input of block D enters two branches, and the outputs of the two branches are added pointwise and passed through a ReLU activation function, which serves as the output layer of block D; one branch connects the first amplification module and the convolutional layer in sequence, and the other branch is the second amplification module;
wherein the convolutional layer of the upsampling convolution block D has a 3×3 kernel, a stride of 1, and an adjustable number of channels;
the amplification module of the upsampling convolution block D comprises four parallel convolutional layers whose channel counts are identical, whose kernel sizes are 3×3, 3×2, 2×3, and 2×2, respectively, and whose stride is 1; the input of the amplification module passes through its four convolutional layers, and their outputs are stitched together to form the output of the amplification module;
the information fusion module is a convolutional layer with a 3×3 kernel, 1 channel, and a stride of 1;
performing deep learning training on the constructed multi-scale network model, and obtaining the densified result of an image to be processed through the trained multi-scale network model.
2. The method according to claim 1, characterized in that the RGB image of the original image is downsampled using the bilinear-interpolation downsampling method.
3. The method according to claim 1, characterized in that, when deep learning training is performed on the multi-scale network model, the loss function used is L = (1/N) Σ_j smoothL1(d_j - d_j^g), with smoothL1(x) = 0.5 x^2 if |x| < 1 and |x| - 0.5 otherwise, where d_j denotes the depth value of each pixel output by the multi-scale network model, i.e., the estimated depth value, j indexes the pixels, d_j^g denotes the standard depth value of the pixel, i.e., the label value corresponding to the training sample, and N denotes the total number of pixels of a sparse depth map.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811531022.5A CN109685842B (en) | 2018-12-14 | 2018-12-14 | Sparse depth densification method based on multi-scale network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109685842A true CN109685842A (en) | 2019-04-26 |
CN109685842B CN109685842B (en) | 2023-03-21 |
Family
ID=66187804
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811531022.5A Active CN109685842B (en) | 2018-12-14 | 2018-12-14 | Sparse depth densification method based on multi-scale network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109685842B (en) |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140267243A1 (en) * | 2013-03-13 | 2014-09-18 | Pelican Imaging Corporation | Systems and Methods for Synthesizing Images from Image Data Captured by an Array Camera Using Restricted Depth of Field Depth Maps in which Depth Estimation Precision Varies |
US20140328535A1 (en) * | 2013-05-06 | 2014-11-06 | Disney Enterprises, Inc. | Sparse light field representation |
CN106408015A (en) * | 2016-09-13 | 2017-02-15 | 电子科技大学成都研究院 | Road fork identification and depth estimation method based on convolutional neural network |
CN107767413A (en) * | 2017-09-20 | 2018-03-06 | 华南理工大学 | A kind of image depth estimation method based on convolutional neural networks |
CN107944459A (en) * | 2017-12-09 | 2018-04-20 | 天津大学 | A kind of RGB D object identification methods |
CN108535675A (en) * | 2018-04-08 | 2018-09-14 | 朱高杰 | A kind of magnetic resonance multichannel method for reconstructing being in harmony certainly based on deep learning and data |
Non-Patent Citations (5)
Title |
---|
Chen, Zhao: "Estimating depth from RGB and sparse sensing", Computer Vision - ECCV 2018 |
Ma, Fangchang: "Sparse-to-dense: depth prediction from sparse depth samples and a single image", 2018 IEEE International Conference on Robotics and Automation (ICRA) |
Yang, Shuyuan: "Deep sparse tensor filtering network for synthetic aperture radar images classification", IEEE Transactions on Neural Networks and Learning Systems |
Zeng Xiangfeng: "Dynamic target detection and tracking with vehicle-mounted multi-sensor fusion", China Master's Theses Full-text Database, Engineering Science and Technology II |
Lei Jie et al.: "A survey of deep network model compression", Journal of Software |
Cited By (48)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110490118A (en) * | 2019-08-14 | 2019-11-22 | 厦门美图之家科技有限公司 | Image processing method and device |
CN110796105A (en) * | 2019-11-04 | 2020-02-14 | 中国矿业大学 | Remote sensing image semantic segmentation method based on multi-modal data fusion |
CN113034562B (en) * | 2019-12-09 | 2023-05-12 | 百度在线网络技术(北京)有限公司 | Method and apparatus for optimizing depth information |
CN113034562A (en) * | 2019-12-09 | 2021-06-25 | 百度在线网络技术(北京)有限公司 | Method and apparatus for optimizing depth information |
CN111062981A (en) * | 2019-12-13 | 2020-04-24 | 腾讯科技(深圳)有限公司 | Image processing method, device and storage medium |
CN111062981B (en) * | 2019-12-13 | 2023-05-05 | 腾讯科技(深圳)有限公司 | Image processing method, device and storage medium |
CN111079683B (en) * | 2019-12-24 | 2023-12-12 | 天津大学 | Remote sensing image cloud and snow detection method based on convolutional neural network |
CN111079683A (en) * | 2019-12-24 | 2020-04-28 | 天津大学 | Remote sensing image cloud and snow detection method based on convolutional neural network |
CN111199516A (en) * | 2019-12-30 | 2020-05-26 | 深圳大学 | Image processing method, system and storage medium based on image generation network model |
CN111199516B (en) * | 2019-12-30 | 2023-05-05 | 深圳大学 | Image processing method, system and storage medium based on image generation network model |
CN111179331A (en) * | 2019-12-31 | 2020-05-19 | 智车优行科技(上海)有限公司 | Depth estimation method, depth estimation device, electronic equipment and computer-readable storage medium |
CN111179331B (en) * | 2019-12-31 | 2023-09-08 | 智车优行科技(上海)有限公司 | Depth estimation method, depth estimation device, electronic equipment and computer readable storage medium |
CN110992271B (en) * | 2020-03-04 | 2020-07-07 | 腾讯科技(深圳)有限公司 | Image processing method, path planning method, device, equipment and storage medium |
WO2021174904A1 (en) * | 2020-03-04 | 2021-09-10 | 腾讯科技(深圳)有限公司 | Image processing method, path planning method, apparatus, device, and storage medium |
CN110992271A (en) * | 2020-03-04 | 2020-04-10 | 腾讯科技(深圳)有限公司 | Image processing method, path planning method, device, equipment and storage medium |
CN113496138A (en) * | 2020-03-18 | 2021-10-12 | 广州极飞科技股份有限公司 | Dense point cloud data generation method and device, computer equipment and storage medium |
CN111667522A (en) * | 2020-06-04 | 2020-09-15 | 上海眼控科技股份有限公司 | Three-dimensional laser point cloud densification method and equipment |
CN111815766A (en) * | 2020-07-28 | 2020-10-23 | 复旦大学附属华山医院 | Processing method and system for reconstructing blood vessel three-dimensional model based on 2D-DSA image |
CN111815766B (en) * | 2020-07-28 | 2024-04-30 | 复影(上海)医疗科技有限公司 | Processing method and system for reconstructing three-dimensional model of blood vessel based on 2D-DSA image |
CN114078149A (en) * | 2020-08-21 | 2022-02-22 | 深圳市万普拉斯科技有限公司 | Image estimation method, electronic equipment and storage medium |
CN112001914A (en) * | 2020-08-31 | 2020-11-27 | 三星(中国)半导体有限公司 | Depth image completion method and device |
CN112001914B (en) * | 2020-08-31 | 2024-03-01 | 三星(中国)半导体有限公司 | Depth image complement method and device |
CN112102472B (en) * | 2020-09-01 | 2022-04-29 | 北京航空航天大学 | Sparse three-dimensional point cloud densification method |
CN112102472A (en) * | 2020-09-01 | 2020-12-18 | 北京航空航天大学 | Sparse three-dimensional point cloud densification method |
CN112132880A (en) * | 2020-09-02 | 2020-12-25 | 东南大学 | Real-time dense depth estimation method based on sparse measurement and monocular RGB (red, green and blue) image |
CN112132880B (en) * | 2020-09-02 | 2024-05-03 | 东南大学 | Real-time dense depth estimation method based on sparse measurement and monocular RGB image |
CN112258626A (en) * | 2020-09-18 | 2021-01-22 | 山东师范大学 | Three-dimensional model generation method and system for generating dense point cloud based on image cascade |
CN112837262A (en) * | 2020-12-04 | 2021-05-25 | 国网宁夏电力有限公司检修公司 | Method, medium and system for detecting opening and closing states of disconnecting link |
CN112837262B (en) * | 2020-12-04 | 2023-04-07 | 国网宁夏电力有限公司检修公司 | Method, medium and system for detecting opening and closing states of disconnecting link |
CN112861729B (en) * | 2021-02-08 | 2022-07-08 | 浙江大学 | Real-time depth completion method based on pseudo-depth map guidance |
CN112861729A (en) * | 2021-02-08 | 2021-05-28 | 浙江大学 | Real-time depth completion method based on pseudo-depth map guidance |
CN113256546A (en) * | 2021-05-24 | 2021-08-13 | 浙江大学 | Depth map completion method based on color map guidance |
CN113344839A (en) * | 2021-08-06 | 2021-09-03 | 深圳市汇顶科技股份有限公司 | Depth image acquisition device, fusion method and terminal equipment |
US11928802B2 (en) | 2021-08-06 | 2024-03-12 | Shenzhen GOODIX Technology Co., Ltd. | Apparatus for acquiring depth image, method for fusing depth images, and terminal device |
WO2023010559A1 (en) * | 2021-08-06 | 2023-02-09 | 深圳市汇顶科技股份有限公司 | Depth image collection apparatus, depth image fusion method and terminal device |
CN113344839B (en) * | 2021-08-06 | 2022-01-07 | 深圳市汇顶科技股份有限公司 | Depth image acquisition device, fusion method and terminal equipment |
EP4156085A4 (en) * | 2021-08-06 | 2023-04-26 | Shenzhen Goodix Technology Co., Ltd. | Depth image collection apparatus, depth image fusion method and terminal device |
CN113807417B (en) * | 2021-08-31 | 2023-05-30 | 中国人民解放军战略支援部队信息工程大学 | Dense matching method and system based on deep learning visual field self-selection network |
CN113807417A (en) * | 2021-08-31 | 2021-12-17 | 中国人民解放军战略支援部队信息工程大学 | Dense matching method and system based on deep learning view self-selection network |
CN114627351A (en) * | 2022-02-18 | 2022-06-14 | 电子科技大学 | Fusion depth estimation method based on vision and millimeter wave radar |
CN114494023A (en) * | 2022-04-06 | 2022-05-13 | 电子科技大学 | Video super-resolution implementation method based on motion compensation and sparse enhancement |
CN116152066A (en) * | 2023-02-14 | 2023-05-23 | 苏州赫芯科技有限公司 | Point cloud detection method, system, equipment and medium for complete appearance of element |
CN115861401A (en) * | 2023-02-27 | 2023-03-28 | 之江实验室 | Binocular and point cloud fusion depth recovery method, device and medium |
CN115908531A (en) * | 2023-03-09 | 2023-04-04 | 深圳市灵明光子科技有限公司 | Vehicle-mounted distance measuring method and device, vehicle-mounted terminal and readable storage medium |
CN116503460A (en) * | 2023-04-07 | 2023-07-28 | 北京鉴智科技有限公司 | Depth map acquisition method and device, electronic equipment and storage medium |
CN116503460B (en) * | 2023-04-07 | 2024-11-05 | 北京鉴智科技有限公司 | Depth map acquisition method and device, electronic equipment and storage medium |
CN117953029A (en) * | 2024-03-27 | 2024-04-30 | 北京科技大学 | General depth map completion method and device based on depth information propagation |
CN117953029B (en) * | 2024-03-27 | 2024-06-07 | 北京科技大学 | General depth map completion method and device based on depth information propagation |
Also Published As
Publication number | Publication date |
---|---|
CN109685842B (en) | 2023-03-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109685842A (en) | A kind of thick densification method of sparse depth based on multiple dimensioned network | |
CN107204010B (en) | A kind of monocular image depth estimation method and system | |
CN111625608B (en) | Method and system for generating electronic map according to remote sensing image based on GAN model | |
CN110009674B (en) | Monocular image depth of field real-time calculation method based on unsupervised depth learning | |
CN110246181B (en) | Anchor point-based attitude estimation model training method, attitude estimation method and system | |
Tian et al. | Depth estimation using a self-supervised network based on cross-layer feature fusion and the quadtree constraint | |
CN108460403A (en) | The object detection method and system of multi-scale feature fusion in a kind of image | |
CN109598754A (en) | A kind of binocular depth estimation method based on depth convolutional network | |
CN110310317A (en) | A method of the monocular vision scene depth estimation based on deep learning | |
CN110570522A (en) | Multi-view three-dimensional reconstruction method | |
CN111143489B (en) | Image-based positioning method and device, computer equipment and readable storage medium | |
CN110110578A (en) | A kind of indoor scene semanteme marking method | |
CN117351363A (en) | Remote sensing image building extraction method based on transducer | |
CN108629763A (en) | A kind of evaluation method of disparity map, device and terminal | |
CN112767467A (en) | Double-image depth estimation method based on self-supervision deep learning | |
CN109345581A (en) | Augmented reality method, apparatus and system based on more mesh cameras | |
CN116519106A (en) | Method, device, storage medium and equipment for determining weight of live pigs | |
CN117078753A (en) | Progressive feature distribution sampling 6D pose estimation method and system based on camera | |
CN107909565A (en) | Stereo-picture Comfort Evaluation method based on convolutional neural networks | |
CN114494611A (en) | Intelligent three-dimensional reconstruction method, device, equipment and medium based on nerve basis function | |
CN112116646B (en) | Depth estimation method for light field image based on depth convolution neural network | |
CN110060212A (en) | A kind of multispectral photometric stereo surface normal restoration methods based on deep learning | |
CN107680070A (en) | A kind of layering weight image interfusion method based on original image content | |
CN111898607A (en) | Point cloud semantic segmentation method for color difference guided convolution | |
Li et al. | Design of the 3D Digital Reconstruction System of an Urban Landscape Spatial Pattern Based on the Internet of Things |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |