CN110211061A - Neural-network-based real-time depth map enhancement method and device for a single depth camera - Google Patents

Neural-network-based real-time depth map enhancement method and device for a single depth camera

Info

Publication number
CN110211061A
CN110211061A (application CN201910417886.2A)
Authority
CN
China
Prior art keywords
depth
image
neural network
feedforward neural
illumination
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910417886.2A
Other languages
Chinese (zh)
Inventor
刘烨斌
闫石
戴琼海
方璐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN201910417886.2A
Publication of CN110211061A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/02Affine transformations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/70Denoising; Smoothing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/30Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/38Registration of image sequences
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/80Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10024Color image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a neural-network-based method and device for real-time enhancement of depth maps from a single depth camera. The method comprises: capturing a depth image and an RGB image of a sample with a depth camera; aligning the depth image with the RGB image according to the calibrated intrinsic and extrinsic matrices of the depth camera; converting the RGB image to gray space and applying an affine transformation, obtaining a grayscale image aligned with the features of the depth image; constructing a feedforward neural network and a loss function, feeding the depth image and the grayscale image into the feedforward neural network for training, and updating the weights of the feedforward neural network by backpropagation of the loss function; and feeding the grayscale image into the feedforward neural network with the trained, updated weights to output an enhanced depth image. The method captures depth images of the sample directly with a depth camera, requires no high-precision scanning equipment to acquire ground-truth depth maps as supervision, eliminates the manual calibration process, and provides users with a good interactive 3D reconstruction experience.

Description

Neural-network-based real-time depth map enhancement method and device for a single depth camera
Technical field
The present invention relates to the technical fields of computer vision and computer graphics, and in particular to a neural-network-based method and device for real-time enhancement of depth maps from a single depth camera.
Background technique
Consumer-grade depth cameras have become increasingly popular in recent years; the latest iPhone X even has a built-in structured-light depth camera. This makes many brand-new mobile applications possible, from 3D scanning to virtual and mixed reality. Although the resolution and quality of the raw sensor data have improved, depth maps obtained from consumer-grade depth cameras at this stage still contain considerable noise and lack sufficient detail. Human body 3D reconstruction, for example, is an important problem in computer graphics and computer vision: high-quality 3D human models have broad application prospects and significant value in fields such as film and entertainment and demographic statistical analysis. However, acquiring high-quality 3D human models usually relies on expensive laser scanners or multi-camera array systems; although their precision is high, they are costly and cannot operate in real time, so they cannot spread into the daily life of the general public. Many methods exploit the high-frequency information in high-resolution RGB images while removing the large amounts of structured noise peculiar to depth sensors. Traditional bilateral-filtering-like methods cannot guarantee the fidelity of the depth map while enhancing detail; methods that recover 3D shape from shading usually require complex optimization and fail on particular inputs; temporal-smoothing fusion methods cannot enhance a single frame in real time; and data-driven machine learning algorithms cannot be trained unsupervised in the absence of ground-truth depth data.
Summary of the invention
The present invention aims to solve, at least to some extent, one of the technical problems in the related art.
To this end, one object of the present invention is to propose a neural-network-based real-time enhancement method for depth maps from a single depth camera. The method can rapidly acquire large numbers of depth images by photographing objects directly with a depth camera, requires no high-precision scanning equipment to acquire ground-truth depth maps as supervision, and eliminates the manual calibration process.
Another object of the present invention is to propose a neural-network-based real-time enhancement device for depth maps from a single depth camera.
To achieve the above objects, an embodiment of one aspect of the present invention proposes a neural-network-based real-time enhancement method for depth maps from a single depth camera, comprising: capturing a depth image and an RGB image of a sample with a depth camera; aligning the depth image with the RGB image according to the calibrated intrinsic and extrinsic matrices of the depth camera; converting the RGB image to gray space and applying an affine transformation, obtaining a grayscale image aligned with the features of the depth image; constructing a feedforward neural network and a loss function, feeding the depth image and the grayscale image into the feedforward neural network for training, and updating the weights of the feedforward neural network by backpropagation of the loss function; and feeding the grayscale image into the feedforward neural network with the trained, updated weights to output an enhanced depth image.
With the neural-network-based real-time depth map enhancement method of the embodiment of the present invention, RGB images and depth images are captured with a depth camera; the RGB and depth images are aligned according to the calibrated camera parameters; a neural network fusing multi-level, multi-scale outputs and an unsupervised loss function are constructed; the RGB and depth images are fed into the neural network simultaneously for training, and the network weights are updated by backpropagation of the loss function; the network weights are then fixed for the test and deployment phase, and depth maps are enhanced in real time using the RGB images captured by the depth camera. Objects can thus be photographed directly with a depth camera to rapidly acquire large numbers of depth images, with no high-precision scanning equipment needed to acquire ground-truth depth maps as supervision, while the manual calibration process is eliminated. The data required by this data-driven method is very easy to acquire, and training can be completed with simple end-to-end unsupervised training on a PC with a single graphics card.
In addition, the neural-network-based real-time depth map enhancement method for a single depth camera according to the above embodiment of the present invention may also have the following additional technical features:
Further, the affine transformation is:

$$Z_c p_c = R\, Z_d p_d + T$$

where $(R_c, T_c)$ and $(R_d, T_d)$ are the calibrated extrinsic matrices of the color and depth sensors respectively, $K_c$ and $K_d$ are the calibrated intrinsic matrices of the color and depth sensors respectively, $R$ and $T$ are the relative rotation and translation between the two sensors derived from these calibrated parameters, $Z_d$ is the depth value of a pixel $p_d$, and $Z_c p_c$ gives the homogeneous coordinates of the corresponding point in the color or grayscale image.
Further, the loss function is:

$$L(D_{dt}, D, I) = \lambda_{sh} l_{sh} + \lambda_{fid} l_{fid} + \lambda_{smo} l_{smo}$$

where $D_{dt}$ is the enhanced depth map, $D$ is the depth image, and $I$ is the grayscale image;

$l_{sh}$ is the illumination loss term; with $N_{dt}$ denoting the normal map computed from the enhanced depth map $D_{dt}$, $l_{sh}$ is defined as:

$$l_{sh}(l^*, N_{dt}, I) = \big| B(l^*, N_{dt})\, R - I \big|_1 + \big| \nabla\big(B(l^*, N_{dt})\, R\big) - \nabla I \big|_1$$

where $B(\cdot)$ is the irradiance function, $l^*$ is the vector of estimated illumination coefficients, $R$ is the albedo map, $I$ is the grayscale image, and $\nabla$ denotes the gradient difference;

$l_{fid}$ is the fidelity loss term, defined as:

$$l_{fid}(D_{dt}, D) = \big| D_{dt} - D \big|_1$$

$l_{smo}$ is the smoothness loss term, defined as the anisotropic total variation of $D_{dt}$:

$$l_{smo}(D_{dt}) = \big| \nabla_x D_{dt} \big|_1 + \big| \nabla_y D_{dt} \big|_1$$
Further, the feedforward neural network in which the RGB image assists the denoising and enhancement of the depth image contains multiple concatenation operations that stack feature maps of the same or adjacent scales after convolution.
Further, the illumination coefficients are estimated from the depth image, and the value of the illumination loss term is computed from the estimated illumination coefficients.
To achieve the above objects, an embodiment of another aspect of the present invention proposes a neural-network-based real-time enhancement device for depth maps from a single depth camera, comprising:
an alignment module, configured to capture a depth image and an RGB image of a sample with a depth camera, align the depth image with the RGB image according to the calibrated intrinsic and extrinsic matrices of the depth camera, and convert the RGB image to gray space and apply an affine transformation, obtaining a grayscale image aligned with the features of the depth image;
a training update module, configured to construct a feedforward neural network and a loss function, feed the depth image and the grayscale image into the feedforward neural network for training, and update the weights of the feedforward neural network by backpropagation of the loss function; and
an enhancement module, configured to feed the grayscale image into the feedforward neural network with the trained, updated weights to output an enhanced depth image.
With the neural-network-based real-time depth map enhancement device of the embodiment of the present invention, RGB images and depth images are captured with a depth camera; the RGB and depth images are aligned according to the calibrated camera parameters; a neural network fusing multi-level, multi-scale outputs and an unsupervised loss function are constructed; the RGB and depth images are fed into the neural network simultaneously for training, and the network weights are updated by backpropagation of the loss function; the network weights are then fixed for the test and deployment phase, and depth maps are enhanced in real time using the RGB images captured by the depth camera. Objects can thus be photographed directly with a depth camera to rapidly acquire large numbers of depth images, with no high-precision scanning equipment needed to acquire ground-truth depth maps as supervision, while the manual calibration process is eliminated. The data required by this data-driven method is very easy to acquire, and training can be completed with simple end-to-end unsupervised training on a PC with a single graphics card.
In addition, the neural-network-based real-time depth map enhancement device for a single depth camera according to the above embodiment of the present invention may also have the following additional technical features:
Further, the affine transformation is:

$$Z_c p_c = R\, Z_d p_d + T$$

where $(R_c, T_c)$ and $(R_d, T_d)$ are the calibrated extrinsic matrices of the color and depth sensors respectively, $K_c$ and $K_d$ are the calibrated intrinsic matrices of the color and depth sensors respectively, $R$ and $T$ are the relative rotation and translation between the two sensors derived from these calibrated parameters, $Z_d$ is the depth value of a pixel $p_d$, and $Z_c p_c$ gives the homogeneous coordinates of the corresponding point in the color or grayscale image.
Further, the loss function is:

$$L(D_{dt}, D, I) = \lambda_{sh} l_{sh} + \lambda_{fid} l_{fid} + \lambda_{smo} l_{smo}$$

where $D_{dt}$ is the enhanced depth map, $D$ is the depth image, and $I$ is the grayscale image;

$l_{sh}$ is the illumination loss term; with $N_{dt}$ denoting the normal map computed from the enhanced depth map $D_{dt}$, $l_{sh}$ is defined as:

$$l_{sh}(l^*, N_{dt}, I) = \big| B(l^*, N_{dt})\, R - I \big|_1 + \big| \nabla\big(B(l^*, N_{dt})\, R\big) - \nabla I \big|_1$$

where $B(\cdot)$ is the irradiance function, $l^*$ is the vector of estimated illumination coefficients, $R$ is the albedo map, $I$ is the grayscale image, and $\nabla$ denotes the gradient difference;

$l_{fid}$ is the fidelity loss term, defined as:

$$l_{fid}(D_{dt}, D) = \big| D_{dt} - D \big|_1$$

$l_{smo}$ is the smoothness loss term, defined as the anisotropic total variation of $D_{dt}$:

$$l_{smo}(D_{dt}) = \big| \nabla_x D_{dt} \big|_1 + \big| \nabla_y D_{dt} \big|_1$$
Further, the feedforward neural network in which the RGB image assists the denoising and enhancement of the depth image contains multiple concatenation operations that stack feature maps of the same or adjacent scales after convolution.
Further, the illumination coefficients are estimated from the depth image, and the value of the illumination loss term is computed from the estimated illumination coefficients.
Additional aspects and advantages of the present invention will be set forth in part in the following description, and in part will become apparent from the following description or be learned through practice of the invention.
Detailed description of the invention
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of embodiments taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a flowchart of a neural-network-based real-time depth map enhancement method for a single depth camera according to an embodiment of the present invention;
Fig. 2 is a flowchart of a neural-network-based real-time depth map enhancement method for a single depth camera according to another embodiment of the present invention;
Fig. 3 is a structural diagram of the constructed neural network fusing multi-level, multi-scale outputs according to an embodiment of the present invention;
Fig. 4 is a structural schematic diagram of a neural-network-based real-time depth map enhancement device for a single depth camera according to an embodiment of the present invention.
Specific embodiment
Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which identical or similar reference numerals throughout denote identical or similar elements or elements having identical or similar functions. The embodiments described below with reference to the drawings are exemplary and are intended to explain the present invention; they shall not be construed as limiting the present invention.
The neural-network-based real-time depth map enhancement method and device for a single depth camera proposed according to embodiments of the present invention are described below with reference to the accompanying drawings.
The neural-network-based real-time depth map enhancement method for a single depth camera is described first.
Fig. 1 is a flowchart of the neural-network-based real-time depth map enhancement method for a single depth camera according to an embodiment of the present invention.
As shown in Fig. 1, the neural-network-based real-time depth map enhancement method for a single depth camera comprises the following steps:
In step S101, a depth image and an RGB image of a sample are captured with a depth camera; the depth image is aligned with the RGB image according to the calibrated intrinsic and extrinsic matrices of the depth camera; and the RGB image is converted to gray space and an affine transformation is applied, obtaining a grayscale image aligned with the features of the depth image.
Further, when the depth image and RGB image of the sample are captured with the depth camera, the sample may be a human body or an object and may be static or dynamic. The human body or object is photographed to obtain a depth image and a corresponding RGB image, and the two input streams are aligned and preprocessed according to the calibrated parameters.
Specifically, as shown in Fig. 2, a static or dynamic object is photographed with a single depth camera, yielding a continuous sequence of depth images and corresponding RGB images. According to the calibrated extrinsic matrices $(R_c, T_c)$ and $(R_d, T_d)$ of the color and depth sensors and the calibrated intrinsic matrices $K_c$ and $K_d$, the input RGB image is preprocessed by grayscale conversion and affine transformation. The affine transformation is:

$$Z_c p_c = R\, Z_d p_d + T$$

where $Z_d$ is the depth value of a pixel $p_d$, $Z_c p_c$ gives the homogeneous coordinates of the corresponding point in the color (grayscale) image, and $R$ and $T$ are the relative rotation and translation between the two sensors derived from the calibrated parameters. After alignment, the depth map and RGB image have the same size. To augment the training data and introduce randomness, square regions with a fixed side length of 256 are randomly cropped as network inputs.
In step S102, a feedforward neural network and a loss function are constructed; the depth image and the grayscale image are fed into the feedforward neural network for training; and the weights of the feedforward neural network are updated by backpropagation of the loss function.
Specifically, a feedforward neural network fusing multi-level, multi-scale outputs is constructed. Its inputs are the depth map $D$ and the grayscale image $I$, and its output is the denoised, detail-enhanced depth map $D_{dt}$; the network has the capacity to fit the depth-map denoising and detail-enhancement functions simultaneously.
Several unsupervised loss terms $L$ are constructed, specifically including an illumination loss term $l_{sh}(D_{dt}, I)$, a fidelity loss term $l_{fid}(D_{dt}, D)$ and a smoothness loss term $l_{smo}(D_{dt})$.
When computing the loss function $L$, all of its arguments come from the captured data, so it is an unsupervised loss function. Its formula is:

$$L(D_{dt}, D, I) = \lambda_{sh} l_{sh} + \lambda_{fid} l_{fid} + \lambda_{smo} l_{smo}$$
The term $l_{sh}$ in the loss function is the illumination loss. Let $N_{dt}$ be the normal map computed from the enhanced depth map $D_{dt}$; the term is defined as:

$$l_{sh}(l^*, N_{dt}, I) = \big| B(l^*, N_{dt})\, R - I \big|_1 + \big| \nabla\big(B(l^*, N_{dt})\, R\big) - \nabla I \big|_1$$

where $B(\cdot)$ is the irradiance function, $l^*$ is the vector of estimated illumination coefficients, $R$ is the albedo map, and the second term measures the difference of the two gradients.
The term $l_{fid}$ in the loss function is the fidelity loss, defined as:

$$l_{fid}(D_{dt}, D) = \big| D_{dt} - D \big|_1$$
The term $l_{smo}$ in the loss function is the smoothness loss, defined as the anisotropic total variation of $D_{dt}$:

$$l_{smo}(D_{dt}) = \big| \nabla_x D_{dt} \big|_1 + \big| \nabla_y D_{dt} \big|_1$$
A neural network of special structure fusing multi-level, multi-scale outputs and an unsupervised loss function are constructed with the aim of making full use of the high-frequency detail in grayscale-image feature maps at different scales, blending it with the depth map in the feature maps of corresponding scales, so as to remove the noise mixed onto the essentially low-dimensional manifold of the depth map while recovering, at the corresponding locations, the detail lost by the depth map from the grayscale image according to the illumination equation. In an example of the present invention, the structure of the neural network fusing multi-level, multi-scale outputs is designed as shown in Fig. 3.
Specifically, in the figure the network inputs are $D$ and $C$ on the left; conv denotes convolution; concat denotes merging two inputs along the last dimension (the feature-map channel dimension); pool denotes max pooling; and resize denotes upsampling by bilinear interpolation. All operations in the network are differentiable, i.e., the error can update the convolution kernel parameters by backpropagation. For an input of scale 256, the network downsamples twice and upsamples twice, and feature maps at three different scales are fused after convolution; this design fully merges the features of the grayscale image and the original depth map within a single forward propagation. Since high-frequency detail is attenuated after multiple convolutions but can be inferred from local depth values, this embodiment of the present invention does not use a very deep convolutional neural network. With an adequately sized receptive field, this design saves computational resources, shortens network training time, and effectively avoids overfitting.
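The following PyTorch sketch illustrates a shallow encoder-decoder of the kind described: two max-pool downsamplings, two bilinear upsamplings, and concat-based fusion of feature maps at three scales. It is a hypothetical reconstruction for illustration only; the channel widths and layer counts are assumptions, not the exact architecture of Fig. 3.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv(in_ch, out_ch):
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True))

class FusionNet(nn.Module):
    """Shallow encoder-decoder that fuses gray-image detail into the depth map
    at three scales: two max-pool downsamplings, two bilinear upsamplings,
    with concat fusion of same-scale feature maps (skip connections)."""

    def __init__(self, ch=16):
        super().__init__()
        self.enc0 = conv(2, ch)                    # full scale: concat(depth, gray) -> features
        self.enc1 = conv(ch, ch * 2)               # 1/2 scale
        self.enc2 = conv(ch * 2, ch * 4)           # 1/4 scale (bottleneck)
        self.dec1 = conv(ch * 4 + ch * 2, ch * 2)  # fuse upsampled + skip at 1/2 scale
        self.dec0 = conv(ch * 2 + ch, ch)          # fuse upsampled + skip at full scale
        self.out = nn.Conv2d(ch, 1, 3, padding=1)  # enhanced depth map

    def forward(self, depth, gray):
        x0 = self.enc0(torch.cat([depth, gray], dim=1))  # concat along channel dim
        x1 = self.enc1(F.max_pool2d(x0, 2))
        x2 = self.enc2(F.max_pool2d(x1, 2))
        u1 = F.interpolate(x2, scale_factor=2, mode='bilinear', align_corners=False)
        x1 = self.dec1(torch.cat([u1, x1], dim=1))
        u0 = F.interpolate(x1, scale_factor=2, mode='bilinear', align_corners=False)
        x0 = self.dec0(torch.cat([u0, x0], dim=1))
        return self.out(x0)
```

With a small base channel width, this sketch stays within a few tens of thousands of weights, in the spirit of the compact design discussed later.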
In the training stage, the loss function consists of three terms: the illumination loss $l_{sh}$, the fidelity loss $l_{fid}$ and the smoothness loss $l_{smo}$. The total training loss is their weighted sum:

$$L(D_{dt}, D, I) = \lambda_{sh} l_{sh} + \lambda_{fid} l_{fid} + \lambda_{smo} l_{smo}$$
Every term of the loss function relies only on the input depth-map and grayscale-image data streams, so this is an unsupervised training method.
The illumination loss term is explained first. For a Lambertian surface under low-frequency illumination, the irradiance of an object can be approximated by second-order spherical harmonics and illumination coefficients:

$$I \approx R \sum_{b=1}^{9} l_b H_b(N) = R\, B(l, N)$$

where $H_b: \mathbb{R}^3 \to \mathbb{R}$ are the spherical-harmonic basis functions, $l$ is the vector of 9 second-order spherical-harmonic lighting coefficients decomposing the low-frequency scene illumination, $R$ is the albedo of the object, and $N$ is the normal map computed from the depth map. Based on the above formula, the present invention constructs an unsupervised illumination loss term that requires no ground-truth data:
$$l_{sh}(l^*, N_{dt}, I) = \big| B(l^*, N_{dt})\, R - I \big|_1 + \big| \nabla\big(B(l^*, N_{dt})\, R\big) - \nabla I \big|_1$$

where $N_{dt}$ is the normal map computed from the depth map $D_{dt}$; the second term of the illumination loss constrains the difference of the two gradients, and this gradient difference enhances the depth map more robustly under complex illumination. When computing the loss value, the albedo $R$ of a general object is assumed constant, and the illumination coefficients $l^*$ can be obtained by least-squares estimation:

$$l^* = \arg\min_{l} \big\| R\, B(l, N) - I \big\|_2^2$$
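The second-order spherical-harmonic irradiance model and the least-squares lighting estimate can be sketched in PyTorch as follows; the basis ordering and the uniform-albedo default are illustrative assumptions.

```python
import torch

def sh_basis(normals):
    """Second-order spherical-harmonic basis H_b(N), b = 1..9, evaluated per pixel.

    normals: (..., 3) unit normal vectors; returns (..., 9) basis values.
    """
    x, y, z = normals[..., 0], normals[..., 1], normals[..., 2]
    one = torch.ones_like(x)
    return torch.stack([
        one, x, y, z,
        x * y, x * z, y * z,
        x * x - y * y,
        3 * z * z - 1,
    ], dim=-1)

def estimate_lighting(normals, intensity, albedo=1.0):
    """Least-squares estimate of the 9 SH lighting coefficients l* under the
    uniform-albedo assumption: minimize || albedo * H(N) l - I ||^2."""
    H = albedo * sh_basis(normals).reshape(-1, 9)   # N x 9 design matrix
    I = intensity.reshape(-1, 1)                    # N x 1 gray values
    l = torch.linalg.lstsq(H, I).solution           # 9 x 1 coefficient vector
    return l.squeeze(-1)

def shade(normals, l, albedo=1.0):
    """Irradiance rendering: albedo * sum_b l_b H_b(N) = B(l, N) * R."""
    return albedo * (sh_basis(normals) @ l)
```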
To constrain the enhanced depth map to stay close to the noisy depth map that was originally input, the present invention designs a fidelity loss term:

$$l_{fid}(D_{dt}, D) = \big| D_{dt} - D \big|_1$$
This L1-norm loss term drives the error between the network output depth map $D_{dt}$ and the input $D$ to be as sparse as possible, ensuring that the enhanced depth map recovers the low-frequency part of $D$ as precisely as possible while filtering out the high-frequency structured noise produced by the depth sensor.
The last term of the loss function, the smoothness loss $l_{smo}$, acts as a regularizer and effectively reduces the artifacts introduced by the illumination term. The smoothness loss is defined as the anisotropic total variation of $D_{dt}$:

$$l_{smo}(D_{dt}) = \big| \nabla_x D_{dt} \big|_1 + \big| \nabla_y D_{dt} \big|_1$$
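Putting the three terms together, a hedged PyTorch sketch of the unsupervised loss might look as follows. The finite-difference normal computation and the default weights are simplifying assumptions; `shading` is the rendered irradiance image $B(l^*, N_{dt})\, R$ produced with the helpers sketched above.

```python
import torch
import torch.nn.functional as F

def normals_from_depth(depth, fx, fy):
    """Approximate per-pixel normals from a depth map via central differences
    (a simplification of computing the normal map N_dt from D_dt)."""
    dzdx = (depth[:, :, :, 2:] - depth[:, :, :, :-2]) / 2.0
    dzdy = (depth[:, :, 2:, :] - depth[:, :, :-2, :]) / 2.0
    dzdx = F.pad(dzdx, (1, 1, 0, 0))
    dzdy = F.pad(dzdy, (0, 0, 1, 1))
    n = torch.cat([-dzdx * fx, -dzdy * fy, torch.ones_like(depth)], dim=1)
    return F.normalize(n, dim=1)

def grad(img):
    """Forward differences along x and y (used for the gradient and TV terms)."""
    gx = img[:, :, :, 1:] - img[:, :, :, :-1]
    gy = img[:, :, 1:, :] - img[:, :, :-1, :]
    return gx, gy

def total_loss(d_out, d_in, gray, shading, w_sh=1.0, w_fid=1.0, w_smo=0.1):
    """L = w_sh*l_sh + w_fid*l_fid + w_smo*l_smo, computed from the input data
    stream only (no ground-truth depth), mirroring the unsupervised loss."""
    # Illumination loss: rendered irradiance vs. gray image, plus their gradient difference.
    sx, sy = grad(shading)
    ix, iy = grad(gray)
    l_sh = (shading - gray).abs().mean() + (sx - ix).abs().mean() + (sy - iy).abs().mean()
    # Fidelity loss: the enhanced depth must stay close to the input depth (L1).
    l_fid = (d_out - d_in).abs().mean()
    # Smoothness loss: anisotropic total variation of the enhanced depth.
    dx, dy = grad(d_out)
    l_smo = dx.abs().mean() + dy.abs().mean()
    return w_sh * l_sh + w_fid * l_fid + w_smo * l_smo
```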
With the overall loss function specified, the training procedure of the network is summarized as follows. In one embodiment of the present invention, the training batch size is 64 and the total number of iteration epochs is 20. The initial learning rate is 0.001, and Adam is chosen as the optimization strategy for the target loss; it adaptively adjusts the learning rate and momentum during training and requires no additional video memory. Once training reaches the set number of epochs, or the 2-norm of the gradient computed by backpropagation falls below a certain threshold, the network weights are no longer updated; the optimal network model weights are saved for repeated use or further fine-tuning on more data.
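A sketch of this training schedule, reusing the helper functions sketched above, might look as follows. The per-batch lighting estimate, the detached normals used for it, and the placeholder focal terms fx, fy are simplifying assumptions rather than the exact procedure.

```python
import torch

def train(net, loader, epochs=20, lr=1e-3, grad_tol=1e-6, ckpt='fusion_net.pt'):
    """Batches of 64 are assumed to come from the DataLoader; 20 epochs,
    Adam with initial lr 0.001, and early stop on a small gradient 2-norm."""
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for epoch in range(epochs):
        for depth, gray in loader:
            opt.zero_grad()
            d_out = net(depth, gray)
            # Normals from the enhanced depth (fx, fy are placeholder focal terms).
            normals = normals_from_depth(d_out, fx=1.0, fy=1.0)
            nmap = normals.permute(0, 2, 3, 1)            # (B, H, W, 3)
            # Estimate lighting on detached normals (the estimate itself is not backpropagated).
            l = estimate_lighting(nmap.detach(), gray)
            shading = shade(nmap, l).unsqueeze(1)         # rendered B(l*, N_dt) * R
            loss = total_loss(d_out, depth, gray, shading)
            loss.backward()
            # Stop once the 2-norm of all backpropagated gradients falls below the threshold.
            gnorm = torch.sqrt(sum((p.grad ** 2).sum()
                                   for p in net.parameters() if p.grad is not None))
            if gnorm < grad_tol:
                torch.save(net.state_dict(), ckpt)
                return net
            opt.step()
    torch.save(net.state_dict(), ckpt)
    return net
```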
In step S103, the grayscale image is fed into the feedforward neural network with the trained, updated weights to output the enhanced depth image.
Further, the network weights $W$ are fixed for the test and deployment phase. From the data captured by the depth camera, the original depth map is enhanced using the high-frequency information in the grayscale image $I$, obtaining the enhanced depth map $D_{dt}$. The forward-computation speed of the network meets the requirements of real-time applications, providing users with a good interactive 3D reconstruction experience and offering broad application prospects.
Specifically, after the feedforward neural network has been trained and updated, in the test or deployment phase the model can be extended to input depth maps and RGB images of arbitrary size and aspect ratio, since convolution and pooling are local operations; this further widens the application scenarios of the method of the present invention. As before, the network inputs are the preprocessed depth-map and grayscale-image data streams. In this phase the network weights are fixed, and the original depth map $D$ is enhanced using the high-frequency information in the grayscale image $I$, obtaining the enhanced depth map $D_{dt}$. Owing to the compact network structure of the design, the total number of weights in one embodiment of the present invention is only about 130,000, and the offline model occupies very little space, so it can readily be deployed on various mobile devices. For depth maps of 640 x 480 size captured by typical consumer-grade depth cameras, the forward-computation speed of the constructed network in one embodiment of the present invention fully meets the requirements of real-time applications, and it can even run on hardware systems such as dedicated smartphone processing units or mobile PCs.
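A deployment-phase sketch under the same assumptions: the weights are fixed and only the forward pass runs, so any input whose height and width survive the two poolings (i.e., are divisible by 4) can be fed in, such as full 640 x 480 frames.

```python
import torch

def enhance(net, depth, gray, weights='fusion_net.pt', device='cpu'):
    """Load the fixed weights and run a single forward pass on one frame;
    depth and gray are HxW arrays of matching size (H, W divisible by 4)."""
    net.load_state_dict(torch.load(weights, map_location=device))
    net.eval()
    with torch.no_grad():
        d = torch.as_tensor(depth, dtype=torch.float32, device=device)[None, None]
        g = torch.as_tensor(gray, dtype=torch.float32, device=device)[None, None]
        return net(d, g)[0, 0].cpu().numpy()  # enhanced depth map D_dt
```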
It will be understood that in the deployment phase a noise-free depth map of the object containing enhanced detail can be obtained in real time, ensuring the accuracy of the depth map. The neural network has good generalization ability: it can enhance depth maps of different human bodies and objects with good results, and its running speed exceeds real time, so it has broad application prospects; the trained model parameters can be deployed on hardware systems such as dedicated smartphone processing units or mobile PCs.
Further, the network in which the RGB image assists depth-map denoising and enhancement contains multiple concatenation operations that stack feature maps of the same or adjacent scales after convolution.
Further, regarding the evaluation of the illumination loss term $l_{sh}$ in the unsupervised loss function: under the uniform-albedo assumption, the illumination coefficients $l^*$ can first be estimated from the input $D$, and $l^*$ is then used to compute the value of the loss term $l_{sh}(l^*, N_{dt}, I)$; this process can be iterated several times to improve the estimation accuracy. That is, the illumination coefficients are estimated from the depth image, and the value of the illumination loss term is computed from the illumination coefficients.
Further, the denoising enhancement or detail generation of the depth map must not deviate substantially from the original depth map $D$; the fidelity loss term has a considerable impact on the convergence behavior of network training.
With the neural-network-based real-time depth map enhancement method for a single depth camera proposed according to embodiments of the present invention, a specific object is captured to obtain individual depth images and RGB images, or depth-map and RGB-image streams; the RGB image is converted to gray space and registered and aligned with the depth image according to the calibrated camera parameters; a neural network of special structure fusing multi-level, multi-scale outputs is constructed, along with an unsupervised loss function that needs no ground-truth depth maps in the training stage; in the training stage, the RGB images and depth images are fed in simultaneously to train the neural network and the network weights are updated by backpropagation of the loss function; in the test or actual deployment stage, the network weights are fixed, only the forward computation of the network is performed, and the depth map is enhanced in real time using the high-frequency information in the RGB image captured by the depth camera.
Objects can thus be photographed directly with a depth camera to rapidly acquire large numbers of depth images, with no high-precision scanning equipment needed to acquire ground-truth depth maps as supervision, and the manual calibration process is eliminated. The data required by this data-driven method is very easy to acquire, and training can be completed with simple end-to-end unsupervised training on a PC with a single graphics card.
The neural-network-based real-time depth map enhancement device for a single depth camera proposed according to embodiments of the present invention is described next with reference to the accompanying drawings.
Fig. 4 is a structural schematic diagram of the neural-network-based real-time depth map enhancement device for a single depth camera according to an embodiment of the present invention.
As shown in Fig. 4, the neural-network-based real-time depth map enhancement device for a single depth camera comprises: an alignment module 100, a training update module 200 and an enhancement module 300.
The alignment module 100 is configured to capture a depth image and an RGB image of a sample with a depth camera, align the depth image with the RGB image according to the calibrated intrinsic and extrinsic matrices of the depth camera, and convert the RGB image to gray space and apply an affine transformation, obtaining a grayscale image aligned with the features of the depth image.
The training update module 200 is configured to construct a feedforward neural network and a loss function, feed the depth image and the grayscale image into the feedforward neural network for training, and update the weights of the feedforward neural network by backpropagation of the loss function.
The enhancement module 300 is configured to feed the grayscale image into the feedforward neural network with the trained, updated weights to output an enhanced depth image.
The device can provide users with a good interactive 3D reconstruction experience and has broad application prospects.
Further, the affine transformation is:

$$Z_c p_c = R\, Z_d p_d + T$$

where $(R_c, T_c)$ and $(R_d, T_d)$ are the calibrated extrinsic matrices of the color and depth sensors respectively, $K_c$ and $K_d$ are the calibrated intrinsic matrices of the color and depth sensors respectively, $R$ and $T$ are the relative rotation and translation between the two sensors derived from these calibrated parameters, $Z_d$ is the depth value of a pixel $p_d$, and $Z_c p_c$ gives the homogeneous coordinates of the corresponding point in the color or grayscale image.
Further, the loss function is:

$$L(D_{dt}, D, I) = \lambda_{sh} l_{sh} + \lambda_{fid} l_{fid} + \lambda_{smo} l_{smo}$$

where $D_{dt}$ is the enhanced depth map, $D$ is the depth image, and $I$ is the grayscale image;

$l_{sh}$ is the illumination loss term; with $N_{dt}$ denoting the normal map computed from the enhanced depth map $D_{dt}$, $l_{sh}$ is defined as:

$$l_{sh}(l^*, N_{dt}, I) = \big| B(l^*, N_{dt})\, R - I \big|_1 + \big| \nabla\big(B(l^*, N_{dt})\, R\big) - \nabla I \big|_1$$

where $B(\cdot)$ is the irradiance function, $l^*$ is the vector of estimated illumination coefficients, $R$ is the albedo map, $I$ is the grayscale image, and $\nabla$ denotes the gradient difference;

$l_{fid}$ is the fidelity loss term, defined as:

$$l_{fid}(D_{dt}, D) = \big| D_{dt} - D \big|_1$$

$l_{smo}$ is the smoothness loss term, defined as the anisotropic total variation of $D_{dt}$:

$$l_{smo}(D_{dt}) = \big| \nabla_x D_{dt} \big|_1 + \big| \nabla_y D_{dt} \big|_1$$
Further, the feedforward neural network in which the RGB image assists the denoising and enhancement of the depth image contains multiple concatenation operations that stack feature maps of the same or adjacent scales after convolution.
Further, the illumination coefficients are estimated from the depth image, and the value of the illumination loss term is computed from the illumination coefficients.
It should be noted that the foregoing explanation of the embodiment of the neural-network-based real-time depth map enhancement method for a single depth camera also applies to the device of this embodiment and is not repeated here.
With the neural-network-based real-time depth map enhancement device for a single depth camera proposed according to embodiments of the present invention, a specific object is captured to obtain individual depth images and RGB images, or depth-map and RGB-image streams; the RGB image is converted to gray space and registered and aligned with the depth image according to the calibrated camera parameters; a neural network of special structure fusing multi-level, multi-scale outputs is constructed, along with an unsupervised loss function that needs no ground-truth depth maps in the training stage; in the training stage, the RGB images and depth images are fed in simultaneously to train the neural network and the network weights are updated by backpropagation of the loss function; in the test or actual deployment stage, the network weights are fixed, only the forward computation of the network is performed, and the depth map is enhanced in real time using the high-frequency information in the RGB image captured by the depth camera.
Objects can thus be photographed directly with a depth camera to rapidly acquire large numbers of depth images, with no high-precision scanning equipment needed to acquire ground-truth depth maps as supervision, and the manual calibration process is eliminated. The data required by this data-driven method is very easy to acquire, and training can be completed with simple end-to-end unsupervised training on a PC with a single graphics card.
In addition, the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly indicating the number of the indicated technical features. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "plurality" means at least two, for example two or three, unless specifically defined otherwise.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example" or "some examples" means that a specific feature, structure, material or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic expressions of the above terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict one another, those skilled in the art may combine the features of the different embodiments or examples described in this specification.
Although embodiments of the present invention have been shown and described above, it will be understood that the above embodiments are exemplary and shall not be construed as limiting the present invention; those of ordinary skill in the art may make changes, modifications, substitutions and variations to the above embodiments within the scope of the present invention.

Claims (10)

1. A neural-network-based real-time depth map enhancement method for a single depth camera, characterized by comprising the following steps:
capturing a depth image and an RGB image of a sample with a depth camera, aligning the depth image with the RGB image according to calibrated intrinsic and extrinsic matrices of the depth camera, and converting the RGB image to gray space and applying an affine transformation, obtaining a grayscale image aligned with features of the depth image;
constructing a feedforward neural network and a loss function, feeding the depth image and the grayscale image into the feedforward neural network for training, and updating weights of the feedforward neural network by backpropagation of the loss function; and
feeding the grayscale image into the feedforward neural network with the trained, updated weights to output an enhanced depth image.
2. The method according to claim 1, characterized in that the affine transformation is:

$$Z_c p_c = R\, Z_d p_d + T$$

where $(R_c, T_c)$ and $(R_d, T_d)$ are the calibrated extrinsic matrices of the color and depth sensors respectively, $K_c$ and $K_d$ are the calibrated intrinsic matrices of the color and depth sensors respectively, $Z_d$ is the depth value of a pixel, and $Z_c p_c$ gives the homogeneous coordinates of the corresponding point in the color or grayscale image.
3. The method according to claim 1, characterized in that the loss function is:

$$L(D_{dt}, D, I) = \lambda_{sh} l_{sh} + \lambda_{fid} l_{fid} + \lambda_{smo} l_{smo}$$

where $D_{dt}$ is the enhanced depth map, $D$ is the depth image, and $I$ is the grayscale image;

$l_{sh}$ is the illumination loss term; with $N_{dt}$ denoting the normal map computed from the enhanced depth map $D_{dt}$, $l_{sh}$ is defined as:

$$l_{sh}(l^*, N_{dt}, I) = \big| B(l^*, N_{dt})\, R - I \big|_1 + \big| \nabla\big(B(l^*, N_{dt})\, R\big) - \nabla I \big|_1$$

where $B(\cdot)$ is the irradiance function, $l^*$ is the vector of estimated illumination coefficients, $R$ is the albedo map, $I$ is the grayscale image, and $\nabla$ denotes the gradient difference;

$l_{fid}$ is the fidelity loss term, defined as:

$$l_{fid}(D_{dt}, D) = \big| D_{dt} - D \big|_1$$

$l_{smo}$ is the smoothness loss term, defined as the anisotropic total variation of $D_{dt}$:

$$l_{smo}(D_{dt}) = \big| \nabla_x D_{dt} \big|_1 + \big| \nabla_y D_{dt} \big|_1$$
4. The method according to claim 1, characterized in that the feedforward neural network in which the RGB image assists the denoising and enhancement of the depth image contains multiple concatenation operations that stack feature maps of the same or adjacent scales after convolution.
5. The method according to claim 3, characterized in that the illumination coefficients are estimated from the depth image, and the value of the illumination loss term is computed from the illumination coefficients.
6. A neural-network-based real-time depth map enhancement device for a single depth camera, characterized by comprising:
an alignment module, configured to capture a depth image and an RGB image of a sample with a depth camera, align the depth image with the RGB image according to calibrated intrinsic and extrinsic matrices of the depth camera, and convert the RGB image to gray space and apply an affine transformation, obtaining a grayscale image aligned with features of the depth image;
a training update module, configured to construct a feedforward neural network and a loss function, feed the depth image and the grayscale image into the feedforward neural network for training, and update weights of the feedforward neural network by backpropagation of the loss function; and
an enhancement module, configured to feed the grayscale image into the feedforward neural network with the trained, updated weights to output an enhanced depth image.
7. The device according to claim 6, characterized in that the affine transformation is:

$$Z_c p_c = R\, Z_d p_d + T$$

where $(R_c, T_c)$ and $(R_d, T_d)$ are the calibrated extrinsic matrices of the color and depth sensors respectively, $K_c$ and $K_d$ are the calibrated intrinsic matrices of the color and depth sensors respectively, $Z_d$ is the depth value of a pixel, and $Z_c p_c$ gives the homogeneous coordinates of the corresponding point in the color or grayscale image.
8. The device according to claim 6, characterized in that the loss function is:

$$L(D_{dt}, D, I) = \lambda_{sh} l_{sh} + \lambda_{fid} l_{fid} + \lambda_{smo} l_{smo}$$

where $D_{dt}$ is the enhanced depth map, $D$ is the depth image, and $I$ is the grayscale image;

$l_{sh}$ is the illumination loss term; with $N_{dt}$ denoting the normal map computed from the enhanced depth map $D_{dt}$, $l_{sh}$ is defined as:

$$l_{sh}(l^*, N_{dt}, I) = \big| B(l^*, N_{dt})\, R - I \big|_1 + \big| \nabla\big(B(l^*, N_{dt})\, R\big) - \nabla I \big|_1$$

where $B(\cdot)$ is the irradiance function, $l^*$ is the vector of estimated illumination coefficients, $R$ is the albedo map, $I$ is the grayscale image, and $\nabla$ denotes the gradient difference;

$l_{fid}$ is the fidelity loss term, defined as:

$$l_{fid}(D_{dt}, D) = \big| D_{dt} - D \big|_1$$

$l_{smo}$ is the smoothness loss term, defined as the anisotropic total variation of $D_{dt}$:

$$l_{smo}(D_{dt}) = \big| \nabla_x D_{dt} \big|_1 + \big| \nabla_y D_{dt} \big|_1$$
9. The device according to claim 6, characterized in that the feedforward neural network in which the RGB image assists the denoising and enhancement of the depth image contains multiple concatenation operations that stack feature maps of the same or adjacent scales after convolution.
10. The device according to claim 6, characterized in that the illumination coefficients are estimated from the depth image, and the value of the illumination loss term is computed from the illumination coefficients.
CN201910417886.2A 2019-05-20 2019-05-20 Neural-network-based real-time depth map enhancement method and device for a single depth camera Pending CN110211061A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910417886.2A CN110211061A (en) 2019-05-20 Neural-network-based real-time depth map enhancement method and device for a single depth camera

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910417886.2A CN110211061A (en) 2019-05-20 Neural-network-based real-time depth map enhancement method and device for a single depth camera

Publications (1)

Publication Number Publication Date
CN110211061A 2019-09-06

Family

ID=67787810

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910417886.2A Pending CN110211061A (en) 2019-05-20 2019-05-20 Neural-network-based real-time depth map enhancement method and device for a single depth camera

Country Status (1)

Country Link
CN (1) CN110211061A (en)

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110689558A (en) * 2019-09-30 2020-01-14 清华大学 Multi-sensor image enhancement method and device
CN111272290A (en) * 2020-03-13 2020-06-12 西北工业大学 Temperature measurement thermal infrared imager calibration method and device based on deep neural network
CN111275751A (en) * 2019-10-12 2020-06-12 浙江省北大信息技术高等研究院 Unsupervised absolute scale calculation method and system
CN111652966A (en) * 2020-05-11 2020-09-11 北京航空航天大学 Three-dimensional reconstruction method and device based on multiple visual angles of unmanned aerial vehicle
CN111784757A (en) * 2020-06-30 2020-10-16 北京百度网讯科技有限公司 Training method of depth estimation model, depth estimation method, device and equipment
CN112767294A (en) * 2021-01-14 2021-05-07 Oppo广东移动通信有限公司 Depth image enhancement method and device, electronic equipment and storage medium
CN112927154A (en) * 2021-03-05 2021-06-08 上海炬佑智能科技有限公司 ToF device, depth camera and gray scale image enhancement method
CN113052884A (en) * 2021-03-17 2021-06-29 Oppo广东移动通信有限公司 Information processing method, information processing apparatus, storage medium, and electronic device
CN113096172A (en) * 2021-03-22 2021-07-09 西安交通大学 Reverse generation method from iToF depth data to original raw data
CN113096228A (en) * 2021-06-09 2021-07-09 上海影创信息科技有限公司 Real-time illumination estimation and rendering method and system based on neural network
CN113126944A (en) * 2021-05-17 2021-07-16 北京的卢深视科技有限公司 Depth map display method, display device, electronic device, and storage medium
CN113362241A (en) * 2021-06-03 2021-09-07 太原科技大学 Depth map denoising method combining high-low frequency decomposition and two-stage fusion strategy
CN113658037A (en) * 2021-08-24 2021-11-16 凌云光技术股份有限公司 Method and device for converting depth image into gray image
CN114359123A (en) * 2022-01-12 2022-04-15 广东汇天航空航天科技有限公司 Image processing method and device
CN115375827A (en) * 2022-07-21 2022-11-22 荣耀终端有限公司 Illumination estimation method and electronic equipment
CN116612357A (en) * 2023-07-11 2023-08-18 睿尔曼智能科技(北京)有限公司 Method, system and storage medium for constructing unsupervised RGBD multi-mode data set

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107016704A (en) * 2017-03-09 2017-08-04 杭州电子科技大学 A kind of virtual reality implementation method based on augmented reality
CN108399610A (en) * 2018-03-20 2018-08-14 上海应用技术大学 A kind of depth image enhancement method of fusion RGB image information

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107016704A (en) * 2017-03-09 2017-08-04 杭州电子科技大学 A kind of virtual reality implementation method based on augmented reality
CN108399610A (en) * 2018-03-20 2018-08-14 上海应用技术大学 A kind of depth image enhancement method of fusion RGB image information

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SHI YAN et al.: "DDRNet: Depth Map Denoising and Refinement for Consumer Depth Cameras Using Cascaded CNNs", European Conference on Computer Vision *

Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110689558A (en) * 2019-09-30 2020-01-14 清华大学 Multi-sensor image enhancement method and device
CN110689558B (en) * 2019-09-30 2022-07-22 清华大学 Multi-sensor image enhancement method and device
CN111275751A (en) * 2019-10-12 2020-06-12 浙江省北大信息技术高等研究院 Unsupervised absolute scale calculation method and system
CN111275751B (en) * 2019-10-12 2022-10-25 浙江省北大信息技术高等研究院 Unsupervised absolute scale calculation method and system
CN111272290B (en) * 2020-03-13 2022-07-19 西北工业大学 Temperature measurement thermal infrared imager calibration method and device based on deep neural network
CN111272290A (en) * 2020-03-13 2020-06-12 西北工业大学 Temperature measurement thermal infrared imager calibration method and device based on deep neural network
CN111652966A (en) * 2020-05-11 2020-09-11 北京航空航天大学 Three-dimensional reconstruction method and device based on multiple visual angles of unmanned aerial vehicle
CN111652966B (en) * 2020-05-11 2021-06-04 北京航空航天大学 Three-dimensional reconstruction method and device based on multiple visual angles of unmanned aerial vehicle
CN111784757A (en) * 2020-06-30 2020-10-16 北京百度网讯科技有限公司 Training method of depth estimation model, depth estimation method, device and equipment
CN111784757B (en) * 2020-06-30 2024-01-23 北京百度网讯科技有限公司 Training method of depth estimation model, depth estimation method, device and equipment
CN112767294A (en) * 2021-01-14 2021-05-07 Oppo广东移动通信有限公司 Depth image enhancement method and device, electronic equipment and storage medium
CN112767294B (en) * 2021-01-14 2024-04-26 Oppo广东移动通信有限公司 Depth image enhancement method and device, electronic equipment and storage medium
CN112927154B (en) * 2021-03-05 2023-06-02 上海炬佑智能科技有限公司 ToF device, depth camera and gray image enhancement method
CN112927154A (en) * 2021-03-05 2021-06-08 上海炬佑智能科技有限公司 ToF device, depth camera and gray scale image enhancement method
CN113052884A (en) * 2021-03-17 2021-06-29 Oppo广东移动通信有限公司 Information processing method, information processing apparatus, storage medium, and electronic device
CN113096172A (en) * 2021-03-22 2021-07-09 西安交通大学 Reverse generation method from iToF depth data to original raw data
CN113096172B (en) * 2021-03-22 2023-10-27 西安交通大学 Reverse generation method from iToF depth data to original raw data
CN113126944A (en) * 2021-05-17 2021-07-16 北京的卢深视科技有限公司 Depth map display method, display device, electronic device, and storage medium
CN113362241B (en) * 2021-06-03 2022-04-05 太原科技大学 Depth map denoising method combining high-low frequency decomposition and two-stage fusion strategy
CN113362241A (en) * 2021-06-03 2021-09-07 太原科技大学 Depth map denoising method combining high-low frequency decomposition and two-stage fusion strategy
CN113096228B (en) * 2021-06-09 2021-08-31 上海影创信息科技有限公司 Real-time illumination estimation and rendering method and system based on neural network
CN113096228A (en) * 2021-06-09 2021-07-09 上海影创信息科技有限公司 Real-time illumination estimation and rendering method and system based on neural network
CN113658037A (en) * 2021-08-24 2021-11-16 凌云光技术股份有限公司 Method and device for converting depth image into gray image
CN113658037B (en) * 2021-08-24 2024-05-14 凌云光技术股份有限公司 Method and device for converting depth image into gray level image
CN114359123A (en) * 2022-01-12 2022-04-15 广东汇天航空航天科技有限公司 Image processing method and device
CN115375827A (en) * 2022-07-21 2022-11-22 荣耀终端有限公司 Illumination estimation method and electronic equipment
CN115375827B (en) * 2022-07-21 2023-09-15 荣耀终端有限公司 Illumination estimation method and electronic equipment
CN116612357A (en) * 2023-07-11 2023-08-18 睿尔曼智能科技(北京)有限公司 Method, system and storage medium for constructing unsupervised RGBD multi-mode data set
CN116612357B (en) * 2023-07-11 2023-11-24 睿尔曼智能科技(北京)有限公司 Method, system and storage medium for constructing unsupervised RGBD multi-mode data set

Similar Documents

Publication Publication Date Title
CN110211061A (en) Neural-network-based real-time depth map enhancement method and device for a single depth camera
CN108921926B (en) End-to-end three-dimensional face reconstruction method based on single image
Bilinski et al. Dense decoder shortcut connections for single-pass semantic segmentation
CN105787439B (en) A kind of depth image human synovial localization method based on convolutional neural networks
CN109410261B (en) Monocular image depth estimation method based on pyramid pooling module
CN109101975A (en) Image, semantic dividing method based on full convolutional neural networks
CN107358626A (en) A kind of method that confrontation network calculations parallax is generated using condition
CN111542861A (en) System and method for rendering an avatar using a depth appearance model
CN108022213A (en) Video super-resolution algorithm for reconstructing based on generation confrontation network
CN109584290A (en) A kind of three-dimensional image matching method based on convolutional neural networks
CN109087243A (en) A kind of video super-resolution generation method generating confrontation network based on depth convolution
CN113822982A (en) Human body three-dimensional model construction method and device, electronic equipment and storage medium
CN109035142A (en) A kind of satellite image ultra-resolution method fighting network integration Aerial Images priori
CN110473284A (en) A kind of moving object method for reconstructing three-dimensional model based on deep learning
CN107609638A (en) A kind of method based on line decoder and interpolation sampling optimization convolutional neural networks
CN109461177B (en) Monocular image depth prediction method based on neural network
CN110210524A (en) A kind of training method, image enchancing method and the device of image enhancement model
CN107944551A (en) One kind is used for electrowetting display screen defect identification method
CN110168572A (en) Information processing method, information processing unit, computer readable storage medium
CN113658040A (en) Face super-resolution method based on prior information and attention fusion mechanism
CN112634456B (en) Real-time high-realism drawing method of complex three-dimensional model based on deep learning
CN110599585A (en) Single-image human body three-dimensional reconstruction method and device based on deep learning
CN109447897A (en) A kind of real scene image composition method and system
CN114897728A (en) Image enhancement method and device, terminal equipment and storage medium
CN116670720A (en) Method and system for generating a three-dimensional (3D) model of an object

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190906