CN110211061A - Neural-network-based real-time depth map enhancement method and device for a single depth camera - Google Patents

Neural-network-based real-time depth map enhancement method and device for a single depth camera

Info

Publication number
CN110211061A
CN110211061A (application CN201910417886.2A)
Authority
CN
China
Prior art keywords
depth
image
neural network
feedforward neural
illumination
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910417886.2A
Other languages
Chinese (zh)
Inventor
刘烨斌
闫石
戴琼海
方璐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN201910417886.2A
Publication of CN110211061A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/02Affine transformations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/70Denoising; Smoothing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/30Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/38Registration of image sequences
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/80Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10024Color image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a neural-network-based method and device for real-time enhancement of depth maps from a single depth camera. The method comprises: capturing a depth image and an RGB image of a sample with a depth camera; aligning the depth image with the RGB image according to the calibrated intrinsic and extrinsic matrices of the depth camera; converting the RGB image to gray space and applying an affine transformation, obtaining a grayscale image aligned with the features of the depth image; constructing a feedforward neural network and a loss function, feeding the depth image and the grayscale image into the feedforward neural network for training, and updating the weights of the feedforward neural network by backpropagation of the loss function; and feeding the grayscale image into the feedforward neural network with the trained, updated weights to output an enhanced depth image. The method captures depth images of the sample directly with a depth camera, requires no high-precision scanning equipment to acquire ground-truth depth maps as supervision, eliminates the manual calibration process, and provides users with a good interactive 3D reconstruction experience.

Description

Neural-network-based real-time depth map enhancement method and device for a single depth camera
Technical field
The present invention relates to the technical fields of computer vision and computer graphics, and in particular to a neural-network-based method and device for real-time enhancement of depth maps from a single depth camera.
Background technique
Consumer-grade depth cameras have become increasingly popular in recent years; the latest iPhone X even has a built-in structured-light depth camera. This makes many brand-new mobile applications possible, from 3D scanning to virtual and mixed reality. Although the resolution and quality of the raw sensor data have improved, depth maps obtained from consumer-grade depth cameras at this stage still contain considerable noise and lack sufficient detail. Human body 3D reconstruction, for example, is an important problem in computer graphics and computer vision: high-quality 3D human models have broad application prospects and significant value in fields such as film and entertainment and demographic statistical analysis. However, acquiring high-quality 3D human models usually relies on expensive laser scanners or multi-camera array systems; although their precision is high, they are costly and cannot operate in real time, so they cannot spread into the daily life of the general public. Many methods exploit the high-frequency information in high-resolution RGB images while removing the large amounts of structured noise peculiar to depth sensors. Traditional bilateral-filtering-like methods cannot guarantee the fidelity of the depth map while enhancing detail; methods that recover 3D shape from shading usually require complex optimization and fail on particular inputs; temporal-smoothing fusion methods cannot enhance a single frame in real time; and data-driven machine learning algorithms cannot be trained unsupervised in the absence of ground-truth depth data.
Summary of the invention
The present invention aims to solve, at least to some extent, one of the technical problems in the related art.
To this end, one object of the present invention is to propose a neural-network-based real-time enhancement method for depth maps from a single depth camera. The method can rapidly acquire large numbers of depth images by photographing objects directly with a depth camera, requires no high-precision scanning equipment to acquire ground-truth depth maps as supervision, and eliminates the manual calibration process.
Another object of the present invention is to propose a neural-network-based real-time enhancement device for depth maps from a single depth camera.
To achieve the above objects, an embodiment of one aspect of the present invention proposes a neural-network-based real-time enhancement method for depth maps from a single depth camera, comprising: capturing a depth image and an RGB image of a sample with a depth camera; aligning the depth image with the RGB image according to the calibrated intrinsic and extrinsic matrices of the depth camera; converting the RGB image to gray space and applying an affine transformation, obtaining a grayscale image aligned with the features of the depth image; constructing a feedforward neural network and a loss function, feeding the depth image and the grayscale image into the feedforward neural network for training, and updating the weights of the feedforward neural network by backpropagation of the loss function; and feeding the grayscale image into the feedforward neural network with the trained, updated weights to output an enhanced depth image.
With the neural-network-based real-time depth map enhancement method of the embodiment of the present invention, RGB images and depth images are captured with a depth camera; the RGB and depth images are aligned according to the calibrated camera parameters; a neural network fusing multi-level, multi-scale outputs and an unsupervised loss function are constructed; the RGB and depth images are fed into the neural network simultaneously for training, and the network weights are updated by backpropagation of the loss function; the network weights are then fixed for the test and deployment phase, and depth maps are enhanced in real time using the RGB images captured by the depth camera. Objects can thus be photographed directly with a depth camera to rapidly acquire large numbers of depth images, with no high-precision scanning equipment needed to acquire ground-truth depth maps as supervision, while the manual calibration process is eliminated. The data required by this data-driven method is very easy to acquire, and training can be completed with simple end-to-end unsupervised training on a PC with a single graphics card.
In addition, the neural-network-based real-time depth map enhancement method for a single depth camera according to the above embodiment of the present invention may also have the following additional technical features:
Further, the affine transformation is:

$$Z_c p_c = R\, Z_d p_d + T$$

where $(R_c, T_c)$ and $(R_d, T_d)$ are the calibrated extrinsic matrices of the color and depth sensors respectively, $K_c$ and $K_d$ are the calibrated intrinsic matrices of the color and depth sensors respectively, $R$ and $T$ are the relative rotation and translation between the two sensors derived from these calibrated parameters, $Z_d$ is the depth value of a pixel $p_d$, and $Z_c p_c$ gives the homogeneous coordinates of the corresponding point in the color or grayscale image.
Further, the loss function is:

$$L(D_{dt}, D, I) = \lambda_{sh} l_{sh} + \lambda_{fid} l_{fid} + \lambda_{smo} l_{smo}$$

where $D_{dt}$ is the enhanced depth map, $D$ is the depth image, and $I$ is the grayscale image;

$l_{sh}$ is the illumination loss term; with $N_{dt}$ denoting the normal map computed from the enhanced depth map $D_{dt}$, $l_{sh}$ is defined as:

$$l_{sh}(l^*, N_{dt}, I) = \big| B(l^*, N_{dt})\, R - I \big|_1 + \big| \nabla\big(B(l^*, N_{dt})\, R\big) - \nabla I \big|_1$$

where $B(\cdot)$ is the irradiance function, $l^*$ is the vector of estimated illumination coefficients, $R$ is the albedo map, $I$ is the grayscale image, and $\nabla$ denotes the gradient difference;

$l_{fid}$ is the fidelity loss term, defined as:

$$l_{fid}(D_{dt}, D) = \big| D_{dt} - D \big|_1$$

$l_{smo}$ is the smoothness loss term, defined as the anisotropic total variation of $D_{dt}$:

$$l_{smo}(D_{dt}) = \big| \nabla_x D_{dt} \big|_1 + \big| \nabla_y D_{dt} \big|_1$$
Further, the feedforward neural network in which the RGB image assists the denoising and enhancement of the depth image contains multiple concatenation operations that stack feature maps of the same or adjacent scales after convolution.
Further, the illumination coefficients are estimated from the depth image, and the value of the illumination loss term is computed from the estimated illumination coefficients.
To achieve the above objects, an embodiment of another aspect of the present invention proposes a neural-network-based real-time enhancement device for depth maps from a single depth camera, comprising:
an alignment module, configured to capture a depth image and an RGB image of a sample with a depth camera, align the depth image with the RGB image according to the calibrated intrinsic and extrinsic matrices of the depth camera, and convert the RGB image to gray space and apply an affine transformation, obtaining a grayscale image aligned with the features of the depth image;
a training update module, configured to construct a feedforward neural network and a loss function, feed the depth image and the grayscale image into the feedforward neural network for training, and update the weights of the feedforward neural network by backpropagation of the loss function; and
an enhancement module, configured to feed the grayscale image into the feedforward neural network with the trained, updated weights to output an enhanced depth image.
With the neural-network-based real-time depth map enhancement device of the embodiment of the present invention, RGB images and depth images are captured with a depth camera; the RGB and depth images are aligned according to the calibrated camera parameters; a neural network fusing multi-level, multi-scale outputs and an unsupervised loss function are constructed; the RGB and depth images are fed into the neural network simultaneously for training, and the network weights are updated by backpropagation of the loss function; the network weights are then fixed for the test and deployment phase, and depth maps are enhanced in real time using the RGB images captured by the depth camera. Objects can thus be photographed directly with a depth camera to rapidly acquire large numbers of depth images, with no high-precision scanning equipment needed to acquire ground-truth depth maps as supervision, while the manual calibration process is eliminated. The data required by this data-driven method is very easy to acquire, and training can be completed with simple end-to-end unsupervised training on a PC with a single graphics card.
In addition, the neural-network-based real-time depth map enhancement device for a single depth camera according to the above embodiment of the present invention may also have the following additional technical features:
Further, the affine transformation is:

$$Z_c p_c = R\, Z_d p_d + T$$

where $(R_c, T_c)$ and $(R_d, T_d)$ are the calibrated extrinsic matrices of the color and depth sensors respectively, $K_c$ and $K_d$ are the calibrated intrinsic matrices of the color and depth sensors respectively, $R$ and $T$ are the relative rotation and translation between the two sensors derived from these calibrated parameters, $Z_d$ is the depth value of a pixel $p_d$, and $Z_c p_c$ gives the homogeneous coordinates of the corresponding point in the color or grayscale image.
Further, the loss function is:

$$L(D_{dt}, D, I) = \lambda_{sh} l_{sh} + \lambda_{fid} l_{fid} + \lambda_{smo} l_{smo}$$

where $D_{dt}$ is the enhanced depth map, $D$ is the depth image, and $I$ is the grayscale image;

$l_{sh}$ is the illumination loss term; with $N_{dt}$ denoting the normal map computed from the enhanced depth map $D_{dt}$, $l_{sh}$ is defined as:

$$l_{sh}(l^*, N_{dt}, I) = \big| B(l^*, N_{dt})\, R - I \big|_1 + \big| \nabla\big(B(l^*, N_{dt})\, R\big) - \nabla I \big|_1$$

where $B(\cdot)$ is the irradiance function, $l^*$ is the vector of estimated illumination coefficients, $R$ is the albedo map, $I$ is the grayscale image, and $\nabla$ denotes the gradient difference;

$l_{fid}$ is the fidelity loss term, defined as:

$$l_{fid}(D_{dt}, D) = \big| D_{dt} - D \big|_1$$

$l_{smo}$ is the smoothness loss term, defined as the anisotropic total variation of $D_{dt}$:

$$l_{smo}(D_{dt}) = \big| \nabla_x D_{dt} \big|_1 + \big| \nabla_y D_{dt} \big|_1$$
Further, the feedforward neural network in which the RGB image assists the denoising and enhancement of the depth image contains multiple concatenation operations that stack feature maps of the same or adjacent scales after convolution.
Further, the illumination coefficients are estimated from the depth image, and the value of the illumination loss term is computed from the estimated illumination coefficients.
Additional aspects and advantages of the present invention will be set forth in part in the following description, and in part will become apparent from the following description or be learned through practice of the invention.
Detailed description of the invention
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of embodiments taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a flowchart of a neural-network-based real-time depth map enhancement method for a single depth camera according to an embodiment of the present invention;
Fig. 2 is a flowchart of a neural-network-based real-time depth map enhancement method for a single depth camera according to another embodiment of the present invention;
Fig. 3 is a structural diagram of the constructed neural network fusing multi-level, multi-scale outputs according to an embodiment of the present invention;
Fig. 4 is a structural schematic diagram of a neural-network-based real-time depth map enhancement device for a single depth camera according to an embodiment of the present invention.
Specific embodiment
Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which identical or similar reference numerals throughout denote identical or similar elements or elements having identical or similar functions. The embodiments described below with reference to the drawings are exemplary and are intended to explain the present invention; they shall not be construed as limiting the present invention.
The neural-network-based real-time depth map enhancement method and device for a single depth camera proposed according to embodiments of the present invention are described below with reference to the accompanying drawings.
The neural-network-based real-time depth map enhancement method for a single depth camera is described first.
Fig. 1 is a flowchart of the neural-network-based real-time depth map enhancement method for a single depth camera according to an embodiment of the present invention.
As shown in Fig. 1, the neural-network-based real-time depth map enhancement method for a single depth camera comprises the following steps:
In step S101, a depth image and an RGB image of a sample are captured with a depth camera; the depth image is aligned with the RGB image according to the calibrated intrinsic and extrinsic matrices of the depth camera; and the RGB image is converted to gray space and an affine transformation is applied, obtaining a grayscale image aligned with the features of the depth image.
Further, when the depth image and RGB image of the sample are captured with the depth camera, the sample may be a human body or an object and may be static or dynamic. The human body or object is photographed to obtain a depth image and a corresponding RGB image, and the two input streams are aligned and preprocessed according to the calibrated parameters.
Specifically, as shown in Fig. 2, a static or dynamic object is photographed with a single depth camera, yielding a continuous sequence of depth images and corresponding RGB images. According to the calibrated extrinsic matrices $(R_c, T_c)$ and $(R_d, T_d)$ of the color and depth sensors and the calibrated intrinsic matrices $K_c$ and $K_d$, the input RGB image is preprocessed by grayscale conversion and affine transformation. The affine transformation is:

$$Z_c p_c = R\, Z_d p_d + T$$

where $Z_d$ is the depth value of a pixel $p_d$, $Z_c p_c$ gives the homogeneous coordinates of the corresponding point in the color (grayscale) image, and $R$ and $T$ are the relative rotation and translation between the two sensors derived from the calibrated parameters. After alignment, the depth map and RGB image have the same size. To augment the training data and introduce randomness, square regions with a fixed side length of 256 are randomly cropped as network inputs.
In step S102, a feedforward neural network and a loss function are constructed; the depth image and the grayscale image are fed into the feedforward neural network for training; and the weights of the feedforward neural network are updated by backpropagation of the loss function.
Specifically, a feedforward neural network fusing multi-level, multi-scale outputs is constructed. Its inputs are the depth map $D$ and the grayscale image $I$, and its output is the denoised, detail-enhanced depth map $D_{dt}$; the network has the capacity to fit the depth-map denoising and detail-enhancement functions simultaneously.
Several unsupervised loss terms $L$ are constructed, specifically including an illumination loss term $l_{sh}(D_{dt}, I)$, a fidelity loss term $l_{fid}(D_{dt}, D)$ and a smoothness loss term $l_{smo}(D_{dt})$.
When computing the loss function $L$, all of its arguments come from the captured data, so it is an unsupervised loss function. Its formula is:

$$L(D_{dt}, D, I) = \lambda_{sh} l_{sh} + \lambda_{fid} l_{fid} + \lambda_{smo} l_{smo}$$
The term $l_{sh}$ in the loss function is the illumination loss. Let $N_{dt}$ be the normal map computed from the enhanced depth map $D_{dt}$; the term is defined as:

$$l_{sh}(l^*, N_{dt}, I) = \big| B(l^*, N_{dt})\, R - I \big|_1 + \big| \nabla\big(B(l^*, N_{dt})\, R\big) - \nabla I \big|_1$$

where $B(\cdot)$ is the irradiance function, $l^*$ is the vector of estimated illumination coefficients, $R$ is the albedo map, and the second term measures the difference of the two gradients.
The term $l_{fid}$ in the loss function is the fidelity loss, defined as:

$$l_{fid}(D_{dt}, D) = \big| D_{dt} - D \big|_1$$
The term $l_{smo}$ in the loss function is the smoothness loss, defined as the anisotropic total variation of $D_{dt}$:

$$l_{smo}(D_{dt}) = \big| \nabla_x D_{dt} \big|_1 + \big| \nabla_y D_{dt} \big|_1$$
A neural network of special structure fusing multi-level, multi-scale outputs and an unsupervised loss function are constructed with the aim of making full use of the high-frequency detail in grayscale-image feature maps at different scales, blending it with the depth map in the feature maps of corresponding scales, so as to remove the noise mixed onto the essentially low-dimensional manifold of the depth map while recovering, at the corresponding locations, the detail lost by the depth map from the grayscale image according to the illumination equation. In an example of the present invention, the structure of the neural network fusing multi-level, multi-scale outputs is designed as shown in Fig. 3.
Specifically, in the figure the network inputs are $D$ and $C$ on the left; conv denotes convolution; concat denotes merging two inputs along the last dimension (the feature-map channel dimension); pool denotes max pooling; and resize denotes upsampling by bilinear interpolation. All operations in the network are differentiable, i.e., the error can update the convolution kernel parameters by backpropagation. For an input of scale 256, the network downsamples twice and upsamples twice, and feature maps at three different scales are fused after convolution; this design fully merges the features of the grayscale image and the original depth map within a single forward propagation. Since high-frequency detail is attenuated after multiple convolutions but can be inferred from local depth values, this embodiment of the present invention does not use a very deep convolutional neural network. With an adequately sized receptive field, this design saves computational resources, shortens network training time, and effectively avoids overfitting.
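The following PyTorch sketch illustrates a shallow encoder-decoder of the kind described: two max-pool downsamplings, two bilinear upsamplings, and concat-based fusion of feature maps at three scales. It is a hypothetical reconstruction for illustration only; the channel widths and layer counts are assumptions, not the exact architecture of Fig. 3.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv(in_ch, out_ch):
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True))

class FusionNet(nn.Module):
    """Shallow encoder-decoder that fuses gray-image detail into the depth map
    at three scales: two max-pool downsamplings, two bilinear upsamplings,
    with concat fusion of same-scale feature maps (skip connections)."""

    def __init__(self, ch=16):
        super().__init__()
        self.enc0 = conv(2, ch)                    # full scale: concat(depth, gray) -> features
        self.enc1 = conv(ch, ch * 2)               # 1/2 scale
        self.enc2 = conv(ch * 2, ch * 4)           # 1/4 scale (bottleneck)
        self.dec1 = conv(ch * 4 + ch * 2, ch * 2)  # fuse upsampled + skip at 1/2 scale
        self.dec0 = conv(ch * 2 + ch, ch)          # fuse upsampled + skip at full scale
        self.out = nn.Conv2d(ch, 1, 3, padding=1)  # enhanced depth map

    def forward(self, depth, gray):
        x0 = self.enc0(torch.cat([depth, gray], dim=1))  # concat along channel dim
        x1 = self.enc1(F.max_pool2d(x0, 2))
        x2 = self.enc2(F.max_pool2d(x1, 2))
        u1 = F.interpolate(x2, scale_factor=2, mode='bilinear', align_corners=False)
        x1 = self.dec1(torch.cat([u1, x1], dim=1))
        u0 = F.interpolate(x1, scale_factor=2, mode='bilinear', align_corners=False)
        x0 = self.dec0(torch.cat([u0, x0], dim=1))
        return self.out(x0)
```

With a small base channel width, this sketch stays within a few tens of thousands of weights, in the spirit of the compact design discussed later.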
In the training stage, the loss function consists of three terms: the illumination loss $l_{sh}$, the fidelity loss $l_{fid}$ and the smoothness loss $l_{smo}$. The total training loss is their weighted sum:

$$L(D_{dt}, D, I) = \lambda_{sh} l_{sh} + \lambda_{fid} l_{fid} + \lambda_{smo} l_{smo}$$
Every term of the loss function relies only on the input depth-map and grayscale-image data streams, so this is an unsupervised training method.
The illumination loss term is explained first. For a Lambertian surface under low-frequency illumination, the irradiance of an object can be approximated by second-order spherical harmonics and illumination coefficients:

$$I \approx R \sum_{b=1}^{9} l_b H_b(N) = R\, B(l, N)$$

where $H_b: \mathbb{R}^3 \to \mathbb{R}$ are the spherical-harmonic basis functions, $l$ is the vector of 9 second-order spherical-harmonic lighting coefficients decomposing the low-frequency scene illumination, $R$ is the albedo of the object, and $N$ is the normal map computed from the depth map. Based on the above formula, the present invention constructs an unsupervised illumination loss term that requires no ground-truth data:
$$l_{sh}(l^*, N_{dt}, I) = \big| B(l^*, N_{dt})\, R - I \big|_1 + \big| \nabla\big(B(l^*, N_{dt})\, R\big) - \nabla I \big|_1$$

where $N_{dt}$ is the normal map computed from the depth map $D_{dt}$; the second term of the illumination loss constrains the difference of the two gradients, and this gradient difference enhances the depth map more robustly under complex illumination. When computing the loss value, the albedo $R$ of a general object is assumed constant, and the illumination coefficients $l^*$ can be obtained by least-squares estimation:

$$l^* = \arg\min_{l} \big\| R\, B(l, N) - I \big\|_2^2$$
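The second-order spherical-harmonic irradiance model and the least-squares lighting estimate can be sketched in PyTorch as follows; the basis ordering and the uniform-albedo default are illustrative assumptions.

```python
import torch

def sh_basis(normals):
    """Second-order spherical-harmonic basis H_b(N), b = 1..9, evaluated per pixel.

    normals: (..., 3) unit normal vectors; returns (..., 9) basis values.
    """
    x, y, z = normals[..., 0], normals[..., 1], normals[..., 2]
    one = torch.ones_like(x)
    return torch.stack([
        one, x, y, z,
        x * y, x * z, y * z,
        x * x - y * y,
        3 * z * z - 1,
    ], dim=-1)

def estimate_lighting(normals, intensity, albedo=1.0):
    """Least-squares estimate of the 9 SH lighting coefficients l* under the
    uniform-albedo assumption: minimize || albedo * H(N) l - I ||^2."""
    H = albedo * sh_basis(normals).reshape(-1, 9)   # N x 9 design matrix
    I = intensity.reshape(-1, 1)                    # N x 1 gray values
    l = torch.linalg.lstsq(H, I).solution           # 9 x 1 coefficient vector
    return l.squeeze(-1)

def shade(normals, l, albedo=1.0):
    """Irradiance rendering: albedo * sum_b l_b H_b(N) = B(l, N) * R."""
    return albedo * (sh_basis(normals) @ l)
```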
To constrain the enhanced depth map to stay close to the noisy depth map that was originally input, the present invention designs a fidelity loss term:

$$l_{fid}(D_{dt}, D) = \big| D_{dt} - D \big|_1$$
This L1-norm loss term drives the error between the network output depth map $D_{dt}$ and the input $D$ to be as sparse as possible, ensuring that the enhanced depth map recovers the low-frequency part of $D$ as precisely as possible while filtering out the high-frequency structured noise produced by the depth sensor.
The last term of the loss function, the smoothness loss $l_{smo}$, acts as a regularizer and effectively reduces the artifacts introduced by the illumination term. The smoothness loss is defined as the anisotropic total variation of $D_{dt}$:

$$l_{smo}(D_{dt}) = \big| \nabla_x D_{dt} \big|_1 + \big| \nabla_y D_{dt} \big|_1$$
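Putting the three terms together, a hedged PyTorch sketch of the unsupervised loss might look as follows. The finite-difference normal computation and the default weights are simplifying assumptions; `shading` is the rendered irradiance image $B(l^*, N_{dt})\, R$ produced with the helpers sketched above.

```python
import torch
import torch.nn.functional as F

def normals_from_depth(depth, fx, fy):
    """Approximate per-pixel normals from a depth map via central differences
    (a simplification of computing the normal map N_dt from D_dt)."""
    dzdx = (depth[:, :, :, 2:] - depth[:, :, :, :-2]) / 2.0
    dzdy = (depth[:, :, 2:, :] - depth[:, :, :-2, :]) / 2.0
    dzdx = F.pad(dzdx, (1, 1, 0, 0))
    dzdy = F.pad(dzdy, (0, 0, 1, 1))
    n = torch.cat([-dzdx * fx, -dzdy * fy, torch.ones_like(depth)], dim=1)
    return F.normalize(n, dim=1)

def grad(img):
    """Forward differences along x and y (used for the gradient and TV terms)."""
    gx = img[:, :, :, 1:] - img[:, :, :, :-1]
    gy = img[:, :, 1:, :] - img[:, :, :-1, :]
    return gx, gy

def total_loss(d_out, d_in, gray, shading, w_sh=1.0, w_fid=1.0, w_smo=0.1):
    """L = w_sh*l_sh + w_fid*l_fid + w_smo*l_smo, computed from the input data
    stream only (no ground-truth depth), mirroring the unsupervised loss."""
    # Illumination loss: rendered irradiance vs. gray image, plus their gradient difference.
    sx, sy = grad(shading)
    ix, iy = grad(gray)
    l_sh = (shading - gray).abs().mean() + (sx - ix).abs().mean() + (sy - iy).abs().mean()
    # Fidelity loss: the enhanced depth must stay close to the input depth (L1).
    l_fid = (d_out - d_in).abs().mean()
    # Smoothness loss: anisotropic total variation of the enhanced depth.
    dx, dy = grad(d_out)
    l_smo = dx.abs().mean() + dy.abs().mean()
    return w_sh * l_sh + w_fid * l_fid + w_smo * l_smo
```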
With the overall loss function specified, the training procedure of the network is summarized as follows. In one embodiment of the present invention, the training batch size is 64 and the total number of iteration epochs is 20. The initial learning rate is 0.001, and Adam is chosen as the optimization strategy for the target loss; it adaptively adjusts the learning rate and momentum during training and requires no additional video memory. Once training reaches the set number of epochs, or the 2-norm of the gradient computed by backpropagation falls below a certain threshold, the network weights are no longer updated; the optimal network model weights are saved for repeated use or further fine-tuning on more data.
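A sketch of this training schedule, reusing the helper functions sketched above, might look as follows. The per-batch lighting estimate, the detached normals used for it, and the placeholder focal terms fx, fy are simplifying assumptions rather than the exact procedure.

```python
import torch

def train(net, loader, epochs=20, lr=1e-3, grad_tol=1e-6, ckpt='fusion_net.pt'):
    """Batches of 64 are assumed to come from the DataLoader; 20 epochs,
    Adam with initial lr 0.001, and early stop on a small gradient 2-norm."""
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for epoch in range(epochs):
        for depth, gray in loader:
            opt.zero_grad()
            d_out = net(depth, gray)
            # Normals from the enhanced depth (fx, fy are placeholder focal terms).
            normals = normals_from_depth(d_out, fx=1.0, fy=1.0)
            nmap = normals.permute(0, 2, 3, 1)            # (B, H, W, 3)
            # Estimate lighting on detached normals (the estimate itself is not backpropagated).
            l = estimate_lighting(nmap.detach(), gray)
            shading = shade(nmap, l).unsqueeze(1)         # rendered B(l*, N_dt) * R
            loss = total_loss(d_out, depth, gray, shading)
            loss.backward()
            # Stop once the 2-norm of all backpropagated gradients falls below the threshold.
            gnorm = torch.sqrt(sum((p.grad ** 2).sum()
                                   for p in net.parameters() if p.grad is not None))
            if gnorm < grad_tol:
                torch.save(net.state_dict(), ckpt)
                return net
            opt.step()
    torch.save(net.state_dict(), ckpt)
    return net
```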
In step S103, the grayscale image is fed into the feedforward neural network with the trained, updated weights to output the enhanced depth image.
Further, the network weights $W$ are fixed for the test and deployment phase. From the data captured by the depth camera, the original depth map is enhanced using the high-frequency information in the grayscale image $I$, obtaining the enhanced depth map $D_{dt}$. The forward-computation speed of the network meets the requirements of real-time applications, providing users with a good interactive 3D reconstruction experience and offering broad application prospects.
Specifically, after the feedforward neural network has been trained and updated, in the test or deployment phase the model can be extended to input depth maps and RGB images of arbitrary size and aspect ratio, since convolution and pooling are local operations; this further widens the application scenarios of the method of the present invention. As before, the network inputs are the preprocessed depth-map and grayscale-image data streams. In this phase the network weights are fixed, and the original depth map $D$ is enhanced using the high-frequency information in the grayscale image $I$, obtaining the enhanced depth map $D_{dt}$. Owing to the compact network structure of the design, the total number of weights in one embodiment of the present invention is only about 130,000, and the offline model occupies very little space, so it can readily be deployed on various mobile devices. For depth maps of 640 x 480 size captured by typical consumer-grade depth cameras, the forward-computation speed of the constructed network in one embodiment of the present invention fully meets the requirements of real-time applications, and it can even run on hardware systems such as dedicated smartphone processing units or mobile PCs.
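A deployment-phase sketch under the same assumptions: the weights are fixed and only the forward pass runs, so any input whose height and width survive the two poolings (i.e., are divisible by 4) can be fed in, such as full 640 x 480 frames.

```python
import torch

def enhance(net, depth, gray, weights='fusion_net.pt', device='cpu'):
    """Load the fixed weights and run a single forward pass on one frame;
    depth and gray are HxW arrays of matching size (H, W divisible by 4)."""
    net.load_state_dict(torch.load(weights, map_location=device))
    net.eval()
    with torch.no_grad():
        d = torch.as_tensor(depth, dtype=torch.float32, device=device)[None, None]
        g = torch.as_tensor(gray, dtype=torch.float32, device=device)[None, None]
        return net(d, g)[0, 0].cpu().numpy()  # enhanced depth map D_dt
```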
It will be understood that in the deployment phase a noise-free depth map of the object containing enhanced detail can be obtained in real time, ensuring the accuracy of the depth map. The neural network has good generalization ability: it can enhance depth maps of different human bodies and objects with good results, and its running speed exceeds real time, so it has broad application prospects; the trained model parameters can be deployed on hardware systems such as dedicated smartphone processing units or mobile PCs.
Further, the network in which the RGB image assists depth-map denoising and enhancement contains multiple concatenation operations that stack feature maps of the same or adjacent scales after convolution.
Further, regarding the evaluation of the illumination loss term $l_{sh}$ in the unsupervised loss function: under the uniform-albedo assumption, the illumination coefficients $l^*$ can first be estimated from the input $D$, and $l^*$ is then used to compute the value of the loss term $l_{sh}(l^*, N_{dt}, I)$; this process can be iterated several times to improve the estimation accuracy. That is, the illumination coefficients are estimated from the depth image, and the value of the illumination loss term is computed from the illumination coefficients.
Further, the denoising enhancement or detail generation of the depth map must not deviate substantially from the original depth map $D$; the fidelity loss term has a considerable impact on the convergence behavior of network training.
With the neural-network-based real-time depth map enhancement method for a single depth camera proposed according to embodiments of the present invention, a specific object is captured to obtain individual depth images and RGB images, or depth-map and RGB-image streams; the RGB image is converted to gray space and registered and aligned with the depth image according to the calibrated camera parameters; a neural network of special structure fusing multi-level, multi-scale outputs is constructed, along with an unsupervised loss function that needs no ground-truth depth maps in the training stage; in the training stage, the RGB images and depth images are fed in simultaneously to train the neural network and the network weights are updated by backpropagation of the loss function; in the test or actual deployment stage, the network weights are fixed, only the forward computation of the network is performed, and the depth map is enhanced in real time using the high-frequency information in the RGB image captured by the depth camera.
Objects can thus be photographed directly with a depth camera to rapidly acquire large numbers of depth images, with no high-precision scanning equipment needed to acquire ground-truth depth maps as supervision, and the manual calibration process is eliminated. The data required by this data-driven method is very easy to acquire, and training can be completed with simple end-to-end unsupervised training on a PC with a single graphics card.
The neural-network-based real-time depth map enhancement device for a single depth camera proposed according to embodiments of the present invention is described next with reference to the accompanying drawings.
Fig. 4 is a structural schematic diagram of the neural-network-based real-time depth map enhancement device for a single depth camera according to an embodiment of the present invention.
As shown in Fig. 4, the neural-network-based real-time depth map enhancement device for a single depth camera comprises: an alignment module 100, a training update module 200 and an enhancement module 300.
The alignment module 100 is configured to capture a depth image and an RGB image of a sample with a depth camera, align the depth image with the RGB image according to the calibrated intrinsic and extrinsic matrices of the depth camera, and convert the RGB image to gray space and apply an affine transformation, obtaining a grayscale image aligned with the features of the depth image.
The training update module 200 is configured to construct a feedforward neural network and a loss function, feed the depth image and the grayscale image into the feedforward neural network for training, and update the weights of the feedforward neural network by backpropagation of the loss function.
The enhancement module 300 is configured to feed the grayscale image into the feedforward neural network with the trained, updated weights to output an enhanced depth image.
The device can provide users with a good interactive 3D reconstruction experience and has broad application prospects.
Further, the affine transformation is:

$$Z_c p_c = R\, Z_d p_d + T$$

where $(R_c, T_c)$ and $(R_d, T_d)$ are the calibrated extrinsic matrices of the color and depth sensors respectively, $K_c$ and $K_d$ are the calibrated intrinsic matrices of the color and depth sensors respectively, $R$ and $T$ are the relative rotation and translation between the two sensors derived from these calibrated parameters, $Z_d$ is the depth value of a pixel $p_d$, and $Z_c p_c$ gives the homogeneous coordinates of the corresponding point in the color or grayscale image.
Further, the loss function is:

$$L(D_{dt}, D, I) = \lambda_{sh} l_{sh} + \lambda_{fid} l_{fid} + \lambda_{smo} l_{smo}$$

where $D_{dt}$ is the enhanced depth map, $D$ is the depth image, and $I$ is the grayscale image;

$l_{sh}$ is the illumination loss term; with $N_{dt}$ denoting the normal map computed from the enhanced depth map $D_{dt}$, $l_{sh}$ is defined as:

$$l_{sh}(l^*, N_{dt}, I) = \big| B(l^*, N_{dt})\, R - I \big|_1 + \big| \nabla\big(B(l^*, N_{dt})\, R\big) - \nabla I \big|_1$$

where $B(\cdot)$ is the irradiance function, $l^*$ is the vector of estimated illumination coefficients, $R$ is the albedo map, $I$ is the grayscale image, and $\nabla$ denotes the gradient difference;

$l_{fid}$ is the fidelity loss term, defined as:

$$l_{fid}(D_{dt}, D) = \big| D_{dt} - D \big|_1$$

$l_{smo}$ is the smoothness loss term, defined as the anisotropic total variation of $D_{dt}$:

$$l_{smo}(D_{dt}) = \big| \nabla_x D_{dt} \big|_1 + \big| \nabla_y D_{dt} \big|_1$$
Further, the feedforward neural network in which the RGB image assists the denoising and enhancement of the depth image contains multiple concatenation operations that stack feature maps of the same or adjacent scales after convolution.
Further, the illumination coefficients are estimated from the depth image, and the value of the illumination loss term is computed from the illumination coefficients.
It should be noted that the foregoing explanation of the embodiment of the neural-network-based real-time depth map enhancement method for a single depth camera also applies to the device of this embodiment and is not repeated here.
With the neural-network-based real-time depth map enhancement device for a single depth camera proposed according to embodiments of the present invention, a specific object is captured to obtain individual depth images and RGB images, or depth-map and RGB-image streams; the RGB image is converted to gray space and registered and aligned with the depth image according to the calibrated camera parameters; a neural network of special structure fusing multi-level, multi-scale outputs is constructed, along with an unsupervised loss function that needs no ground-truth depth maps in the training stage; in the training stage, the RGB images and depth images are fed in simultaneously to train the neural network and the network weights are updated by backpropagation of the loss function; in the test or actual deployment stage, the network weights are fixed, only the forward computation of the network is performed, and the depth map is enhanced in real time using the high-frequency information in the RGB image captured by the depth camera.
Objects can thus be photographed directly with a depth camera to rapidly acquire large numbers of depth images, with no high-precision scanning equipment needed to acquire ground-truth depth maps as supervision, and the manual calibration process is eliminated. The data required by this data-driven method is very easy to acquire, and training can be completed with simple end-to-end unsupervised training on a PC with a single graphics card.
In addition, the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly indicating the number of the indicated technical features. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "plurality" means at least two, for example two or three, unless specifically defined otherwise.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example" or "some examples" means that a specific feature, structure, material or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic expressions of the above terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict one another, those skilled in the art may combine the features of the different embodiments or examples described in this specification.
Although embodiments of the present invention have been shown and described above, it will be understood that the above embodiments are exemplary and shall not be construed as limiting the present invention; those of ordinary skill in the art may make changes, modifications, substitutions and variations to the above embodiments within the scope of the present invention.

Claims (10)

1. A neural-network-based real-time depth map enhancement method for a single depth camera, characterized by comprising the following steps:
capturing a depth image and an RGB image of a sample with a depth camera, aligning the depth image with the RGB image according to calibrated intrinsic and extrinsic matrices of the depth camera, and converting the RGB image to gray space and applying an affine transformation, obtaining a grayscale image aligned with features of the depth image;
constructing a feedforward neural network and a loss function, feeding the depth image and the grayscale image into the feedforward neural network for training, and updating weights of the feedforward neural network by backpropagation of the loss function; and
feeding the grayscale image into the feedforward neural network with the trained, updated weights to output an enhanced depth image.
2. The method according to claim 1, characterized in that the affine transformation is:

$$Z_c p_c = R\, Z_d p_d + T$$

where $(R_c, T_c)$ and $(R_d, T_d)$ are the calibrated extrinsic matrices of the color and depth sensors respectively, $K_c$ and $K_d$ are the calibrated intrinsic matrices of the color and depth sensors respectively, $Z_d$ is the depth value of a pixel, and $Z_c p_c$ gives the homogeneous coordinates of the corresponding point in the color or grayscale image.
3. The method according to claim 1, characterized in that the loss function is:

$$L(D_{dt}, D, I) = \lambda_{sh} l_{sh} + \lambda_{fid} l_{fid} + \lambda_{smo} l_{smo}$$

where $D_{dt}$ is the enhanced depth map, $D$ is the depth image, and $I$ is the grayscale image;

$l_{sh}$ is the illumination loss term; with $N_{dt}$ denoting the normal map computed from the enhanced depth map $D_{dt}$, $l_{sh}$ is defined as:

$$l_{sh}(l^*, N_{dt}, I) = \big| B(l^*, N_{dt})\, R - I \big|_1 + \big| \nabla\big(B(l^*, N_{dt})\, R\big) - \nabla I \big|_1$$

where $B(\cdot)$ is the irradiance function, $l^*$ is the vector of estimated illumination coefficients, $R$ is the albedo map, $I$ is the grayscale image, and $\nabla$ denotes the gradient difference;

$l_{fid}$ is the fidelity loss term, defined as:

$$l_{fid}(D_{dt}, D) = \big| D_{dt} - D \big|_1$$

$l_{smo}$ is the smoothness loss term, defined as the anisotropic total variation of $D_{dt}$:

$$l_{smo}(D_{dt}) = \big| \nabla_x D_{dt} \big|_1 + \big| \nabla_y D_{dt} \big|_1$$
4. The method according to claim 1, characterized in that the feedforward neural network in which the RGB image assists the denoising and enhancement of the depth image contains multiple concatenation operations that stack feature maps of the same or adjacent scales after convolution.
5. The method according to claim 3, characterized in that the illumination coefficients are estimated from the depth image, and the value of the illumination loss term is computed from the illumination coefficients.
6. A neural-network-based real-time depth map enhancement device for a single depth camera, characterized by comprising:
an alignment module, configured to capture a depth image and an RGB image of a sample with a depth camera, align the depth image with the RGB image according to calibrated intrinsic and extrinsic matrices of the depth camera, and convert the RGB image to gray space and apply an affine transformation, obtaining a grayscale image aligned with features of the depth image;
a training update module, configured to construct a feedforward neural network and a loss function, feed the depth image and the grayscale image into the feedforward neural network for training, and update weights of the feedforward neural network by backpropagation of the loss function; and
an enhancement module, configured to feed the grayscale image into the feedforward neural network with the trained, updated weights to output an enhanced depth image.
7. The device according to claim 6, characterized in that the affine transformation is:

$$Z_c p_c = R\, Z_d p_d + T$$

where $(R_c, T_c)$ and $(R_d, T_d)$ are the calibrated extrinsic matrices of the color and depth sensors respectively, $K_c$ and $K_d$ are the calibrated intrinsic matrices of the color and depth sensors respectively, $Z_d$ is the depth value of a pixel, and $Z_c p_c$ gives the homogeneous coordinates of the corresponding point in the color or grayscale image.
8. The device according to claim 6, characterized in that the loss function is:

$$L(D_{dt}, D, I) = \lambda_{sh} l_{sh} + \lambda_{fid} l_{fid} + \lambda_{smo} l_{smo}$$

where $D_{dt}$ is the enhanced depth map, $D$ is the depth image, and $I$ is the grayscale image;

$l_{sh}$ is the illumination loss term; with $N_{dt}$ denoting the normal map computed from the enhanced depth map $D_{dt}$, $l_{sh}$ is defined as:

$$l_{sh}(l^*, N_{dt}, I) = \big| B(l^*, N_{dt})\, R - I \big|_1 + \big| \nabla\big(B(l^*, N_{dt})\, R\big) - \nabla I \big|_1$$

where $B(\cdot)$ is the irradiance function, $l^*$ is the vector of estimated illumination coefficients, $R$ is the albedo map, $I$ is the grayscale image, and $\nabla$ denotes the gradient difference;

$l_{fid}$ is the fidelity loss term, defined as:

$$l_{fid}(D_{dt}, D) = \big| D_{dt} - D \big|_1$$

$l_{smo}$ is the smoothness loss term, defined as the anisotropic total variation of $D_{dt}$:

$$l_{smo}(D_{dt}) = \big| \nabla_x D_{dt} \big|_1 + \big| \nabla_y D_{dt} \big|_1$$
9. The device according to claim 6, characterized in that the feedforward neural network in which the RGB image assists the denoising and enhancement of the depth image contains multiple concatenation operations that stack feature maps of the same or adjacent scales after convolution.
10. The device according to claim 6, characterized in that the illumination coefficients are estimated from the depth image, and the value of the illumination loss term is computed from the illumination coefficients.
CN201910417886.2A 2019-05-20 2019-05-20 Neural-network-based real-time depth map enhancement method and device for a single depth camera Pending CN110211061A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910417886.2A CN110211061A (en) 2019-05-20 Neural-network-based real-time depth map enhancement method and device for a single depth camera

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910417886.2A CN110211061A (en) 2019-05-20 Neural-network-based real-time depth map enhancement method and device for a single depth camera

Publications (1)

Publication Number Publication Date
CN110211061A 2019-09-06

Family

ID=67787810

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910417886.2A Pending CN110211061A (en) 2019-05-20 2019-05-20 Neural-network-based real-time depth map enhancement method and device for a single depth camera

Country Status (1)

Country Link
CN (1) CN110211061A (en)

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110689558A (en) * 2019-09-30 2020-01-14 清华大学 Multi-sensor image enhancement method and device
CN111272290A (en) * 2020-03-13 2020-06-12 西北工业大学 Temperature measurement thermal infrared imager calibration method and device based on deep neural network
CN111275751A (en) * 2019-10-12 2020-06-12 浙江省北大信息技术高等研究院 Unsupervised absolute scale calculation method and system
CN111652966A (en) * 2020-05-11 2020-09-11 北京航空航天大学 Three-dimensional reconstruction method and device based on multiple visual angles of unmanned aerial vehicle
CN111784757A (en) * 2020-06-30 2020-10-16 北京百度网讯科技有限公司 Training method of depth estimation model, depth estimation method, device and equipment
CN112767294A (en) * 2021-01-14 2021-05-07 Oppo广东移动通信有限公司 Depth image enhancement method and device, electronic equipment and storage medium
CN112927154A (en) * 2021-03-05 2021-06-08 上海炬佑智能科技有限公司 ToF device, depth camera and gray scale image enhancement method
CN113052884A (en) * 2021-03-17 2021-06-29 Oppo广东移动通信有限公司 Information processing method, information processing apparatus, storage medium, and electronic device
CN113096172A (en) * 2021-03-22 2021-07-09 西安交通大学 Reverse generation method from iToF depth data to original raw data
CN113096228A (en) * 2021-06-09 2021-07-09 上海影创信息科技有限公司 Real-time illumination estimation and rendering method and system based on neural network
CN113126944A (en) * 2021-05-17 2021-07-16 北京的卢深视科技有限公司 Depth map display method, display device, electronic device, and storage medium
CN113362241A (en) * 2021-06-03 2021-09-07 太原科技大学 Depth map denoising method combining high-low frequency decomposition and two-stage fusion strategy
CN113658037A (en) * 2021-08-24 2021-11-16 凌云光技术股份有限公司 Method and device for converting depth image into gray image
CN114359123A (en) * 2022-01-12 2022-04-15 广东汇天航空航天科技有限公司 Image processing method and device
CN115375827A (en) * 2022-07-21 2022-11-22 荣耀终端有限公司 Illumination estimation method and electronic equipment
CN116612357A (en) * 2023-07-11 2023-08-18 睿尔曼智能科技(北京)有限公司 Method, system and storage medium for constructing unsupervised RGBD multi-mode data set

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107016704A (en) * 2017-03-09 2017-08-04 杭州电子科技大学 A kind of virtual reality implementation method based on augmented reality
CN108399610A (en) * 2018-03-20 2018-08-14 上海应用技术大学 A kind of depth image enhancement method of fusion RGB image information

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107016704A (en) * 2017-03-09 2017-08-04 杭州电子科技大学 A kind of virtual reality implementation method based on augmented reality
CN108399610A (en) * 2018-03-20 2018-08-14 上海应用技术大学 A kind of depth image enhancement method of fusion RGB image information

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SHI YAN et al.: "DDRNet: Depth Map Denoising and Refinement for Consumer Depth Cameras Using Cascaded CNNs", European Conference on Computer Vision *

Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110689558A (en) * 2019-09-30 2020-01-14 清华大学 Multi-sensor image enhancement method and device
CN110689558B (en) * 2019-09-30 2022-07-22 清华大学 Multi-sensor image enhancement method and device
CN111275751A (en) * 2019-10-12 2020-06-12 浙江省北大信息技术高等研究院 Unsupervised absolute scale calculation method and system
CN111275751B (en) * 2019-10-12 2022-10-25 浙江省北大信息技术高等研究院 Unsupervised absolute scale calculation method and system
CN111272290B (en) * 2020-03-13 2022-07-19 西北工业大学 Temperature measurement thermal infrared imager calibration method and device based on deep neural network
CN111272290A (en) * 2020-03-13 2020-06-12 西北工业大学 Temperature measurement thermal infrared imager calibration method and device based on deep neural network
CN111652966A (en) * 2020-05-11 2020-09-11 北京航空航天大学 Three-dimensional reconstruction method and device based on multiple visual angles of unmanned aerial vehicle
CN111652966B (en) * 2020-05-11 2021-06-04 北京航空航天大学 Three-dimensional reconstruction method and device based on multiple visual angles of unmanned aerial vehicle
CN111784757A (en) * 2020-06-30 2020-10-16 北京百度网讯科技有限公司 Training method of depth estimation model, depth estimation method, device and equipment
CN111784757B (en) * 2020-06-30 2024-01-23 北京百度网讯科技有限公司 Training method of depth estimation model, depth estimation method, device and equipment
CN112767294A (en) * 2021-01-14 2021-05-07 Oppo广东移动通信有限公司 Depth image enhancement method and device, electronic equipment and storage medium
CN112767294B (en) * 2021-01-14 2024-04-26 Oppo广东移动通信有限公司 Depth image enhancement method and device, electronic equipment and storage medium
CN112927154B (en) * 2021-03-05 2023-06-02 上海炬佑智能科技有限公司 ToF device, depth camera and gray image enhancement method
CN112927154A (en) * 2021-03-05 2021-06-08 上海炬佑智能科技有限公司 ToF device, depth camera and gray scale image enhancement method
CN113052884A (en) * 2021-03-17 2021-06-29 Oppo广东移动通信有限公司 Information processing method, information processing apparatus, storage medium, and electronic device
CN113096172A (en) * 2021-03-22 2021-07-09 西安交通大学 Reverse generation method from iToF depth data to original raw data
CN113096172B (en) * 2021-03-22 2023-10-27 西安交通大学 Reverse generation method from iToF depth data to original raw data
CN113126944A (en) * 2021-05-17 2021-07-16 北京的卢深视科技有限公司 Depth map display method, display device, electronic device, and storage medium
CN113362241B (en) * 2021-06-03 2022-04-05 太原科技大学 Depth map denoising method combining high-low frequency decomposition and two-stage fusion strategy
CN113362241A (en) * 2021-06-03 2021-09-07 太原科技大学 Depth map denoising method combining high-low frequency decomposition and two-stage fusion strategy
CN113096228B (en) * 2021-06-09 2021-08-31 上海影创信息科技有限公司 Real-time illumination estimation and rendering method and system based on neural network
CN113096228A (en) * 2021-06-09 2021-07-09 上海影创信息科技有限公司 Real-time illumination estimation and rendering method and system based on neural network
CN113658037A (en) * 2021-08-24 2021-11-16 凌云光技术股份有限公司 Method and device for converting depth image into gray image
CN113658037B (en) * 2021-08-24 2024-05-14 凌云光技术股份有限公司 Method and device for converting depth image into gray level image
CN114359123A (en) * 2022-01-12 2022-04-15 广东汇天航空航天科技有限公司 Image processing method and device
CN115375827A (en) * 2022-07-21 2022-11-22 荣耀终端有限公司 Illumination estimation method and electronic equipment
CN115375827B (en) * 2022-07-21 2023-09-15 荣耀终端有限公司 Illumination estimation method and electronic equipment
CN116612357A (en) * 2023-07-11 2023-08-18 睿尔曼智能科技(北京)有限公司 Method, system and storage medium for constructing unsupervised RGBD multi-mode data set
CN116612357B (en) * 2023-07-11 2023-11-24 睿尔曼智能科技(北京)有限公司 Method, system and storage medium for constructing unsupervised RGBD multi-mode data set

Similar Documents

Publication Publication Date Title
CN110211061A (en) Neural-network-based real-time depth map enhancement method and device for a single depth camera
CN108921926B (en) End-to-end three-dimensional face reconstruction method based on single image
Bilinski et al. Dense decoder shortcut connections for single-pass semantic segmentation
CN105787439B (en) A kind of depth image human synovial localization method based on convolutional neural networks
CN109410261B (en) Monocular image depth estimation method based on pyramid pooling module
CN109101975A (en) Image, semantic dividing method based on full convolutional neural networks
CN107358626A (en) A kind of method that confrontation network calculations parallax is generated using condition
CN111542861A (en) System and method for rendering an avatar using a depth appearance model
CN108022213A (en) Video super-resolution algorithm for reconstructing based on generation confrontation network
CN109584290A (en) A kind of three-dimensional image matching method based on convolutional neural networks
CN109087243A (en) A kind of video super-resolution generation method generating confrontation network based on depth convolution
CN113822982A (en) Human body three-dimensional model construction method and device, electronic equipment and storage medium
CN109035142A (en) A kind of satellite image ultra-resolution method fighting network integration Aerial Images priori
CN110473284A (en) A kind of moving object method for reconstructing three-dimensional model based on deep learning
CN107609638A (en) A kind of method based on line decoder and interpolation sampling optimization convolutional neural networks
CN109461177B (en) Monocular image depth prediction method based on neural network
CN110210524A (en) A kind of training method, image enchancing method and the device of image enhancement model
CN107944551A (en) One kind is used for electrowetting display screen defect identification method
CN110168572A (en) Information processing method, information processing unit, computer readable storage medium
CN113658040A (en) Face super-resolution method based on prior information and attention fusion mechanism
CN112634456B (en) Real-time high-realism drawing method of complex three-dimensional model based on deep learning
CN110599585A (en) Single-image human body three-dimensional reconstruction method and device based on deep learning
CN109447897A (en) A kind of real scene image composition method and system
CN114897728A (en) Image enhancement method and device, terminal equipment and storage medium
CN116670720A (en) Method and system for generating a three-dimensional (3D) model of an object

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190906