CN107578435A - Image depth prediction method and device - Google Patents

Image depth prediction method and device

Info

Publication number
CN107578435A
CN107578435A (application CN201710811182.4A)
Authority
CN
China
Prior art keywords
network
image
depth
test
depth map
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710811182.4A
Other languages
Chinese (zh)
Other versions
CN107578435B (en)
Inventor
戴琼海
刘侃
方璐
王好谦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen International Graduate School of Tsinghua University
Original Assignee
Tsinghua Berkeley Shenzhen College Preparatory Office
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua Berkeley Shenzhen College Preparatory Office filed Critical Tsinghua Berkeley Shenzhen College Preparatory Office
Priority to CN201710811182.4A priority Critical patent/CN107578435B/en
Publication of CN107578435A publication Critical patent/CN107578435A/en
Application granted granted Critical
Publication of CN107578435B publication Critical patent/CN107578435B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Processing (AREA)

Abstract

The embodiment of the invention discloses an image depth prediction method and device. The method includes: performing depth prediction on an image to be processed based on a first neural network to generate a global depth map; performing depth prediction on at least one local image of the image to be processed based on at least one second neural network to generate at least one local depth map; and weighting the global depth map and the at least one local depth map according to preset fusion weights to generate a fused depth map. The embodiments of the invention solve the problems of low image depth prediction accuracy and complicated operation in the prior art, and achieve high-accuracy depth prediction for images.

Description

Image depth prediction method and device
Technical field
Embodiments of the present invention relate to image processing technology, and in particular to an image depth prediction method and device.
Background technology
Depth prediction is a common problem in the fields of computer vision and image processing. Depth information can be used to convey 3D (three-dimensional) information and, further, to solve machine vision tasks such as scene understanding and object recognition.
Traditional approaches to extracting depth information usually require multiple input images, such as multi-view images, multi-view images from structure from motion, or multi-focus images for photometric stereo. Existing methods learn the correlation between 2D and 3D images and then obtain the depth information of an image. However, real images cover a large number of different scenes, and the gap between 2D and 3D images is wide, which leads to low depth prediction accuracy and poor results.
Summary of the invention
The present invention provides an image depth prediction method and device to achieve high-accuracy depth prediction for images.
In a first aspect, an embodiment of the invention provides an image depth prediction method, the method including:
performing depth prediction on an image to be processed based on a first neural network to generate a global depth map;
performing depth prediction on at least one local image of the image to be processed based on at least one second neural network to generate at least one local depth map;
weighting the global depth map and the at least one local depth map according to preset fusion weights to generate a fused depth map.
Further, before the depth prediction is performed on the at least one local image of the image to be processed based on the at least one second neural network, the method also includes:
performing feature recognition on the image to be processed based on a third neural network to generate a feature point image;
determining local image regions according to the feature point image, and cropping the local image regions to generate at least one local image.
Further, when the construction of the first neural network, the third neural network and the second neural network is completed, network parameter initialization is performed on the first neural network, the third neural network and the second neural network, and the initialized first neural network, third neural network and second neural network are trained by minimizing a loss function, where the initialized network parameters are set according to a one-dimensional Gaussian distribution.
Further, the preset fusion weights are set in advance, and the method for setting the preset fusion weights includes:
performing depth prediction on a test sample through the first neural network and the second neural network to generate a test global depth map and at least one test local depth map;
obtaining, respectively, a first test error of the test local depth map and a second test error of the region of the test global depth map corresponding to the test local depth map;
determining the preset fusion weights according to the first test error and the second test error.
Further, the first neural network and the third neural network form a multi-task neural network, and the front part of the multi-task neural network includes a first preset number of convolutional layers for extracting the feature information of an input image;
the rear part of the multi-task neural network includes a first branch and a second branch; the first branch includes a second preset number of deconvolution layers for generating the global depth map of the input image, and the second branch includes a pooling layer and a fully connected layer for cropping the input image to generate at least one local image, where a pooling layer, a batch normalization layer and an activation function layer are connected after each convolutional layer, and an activation function layer is connected after the fully connected layer.
Further, the second neural network includes a third preset number of convolutional layers and a fourth preset number of deconvolution layers, where a pooling layer, a batch normalization layer and an activation function layer are connected after each convolutional layer.
Further, the image to be processed is a face image, and the local images include at least one of the following: a left-eye image, a right-eye image, a nose image and a mouth image.
In a second aspect, an embodiment of the present invention further provides an image depth prediction device, the device including:
a global depth map generation module, configured to perform depth prediction on an image to be processed based on a first neural network to generate a global depth map;
a local depth map generation module, configured to perform depth prediction on at least one local image of the image to be processed based on at least one second neural network to generate at least one local depth map;
a fused depth map generation module, configured to weight the global depth map and the at least one local depth map according to preset fusion weights to generate a fused depth map.
Further, the device also includes:
a feature point determination module, configured to perform feature recognition on the image to be processed based on a third neural network to generate a feature point image before the depth prediction is performed on the at least one local image of the image to be processed based on the at least one second neural network;
a local image generation module, configured to determine local image regions according to the feature point image and to crop the local image regions to generate at least one local image.
Further, when the construction of the first neural network, the third neural network and the second neural network is completed, network parameter initialization is performed on the first neural network, the third neural network and the second neural network, and the initialized first neural network, third neural network and second neural network are trained by minimizing a loss function, where the initialized network parameters are set according to a one-dimensional Gaussian distribution.
Further, the preset fusion weights are set in advance, and the weight setting module includes:
a depth map determination unit, configured to perform depth prediction on a test sample through the first neural network and the second neural network to generate a test global depth map and at least one test local depth map;
a test error determination unit, configured to obtain, respectively, a first test error of the test local depth map and a second test error of the region of the test global depth map corresponding to the test local depth map;
a weight determination unit, configured to determine the preset fusion weights according to the first test error and the second test error.
Further, the first neural network and the third neural network form a multi-task neural network, and the front part of the multi-task neural network includes a first preset number of convolutional layers for extracting the feature information of an input image;
the rear part of the multi-task neural network includes a first branch and a second branch; the first branch includes a second preset number of deconvolution layers for generating the global depth map of the input image, and the second branch includes a pooling layer and a fully connected layer for cropping the input image to generate at least one local image, where a pooling layer, a batch normalization layer and an activation function layer are connected after each convolutional layer, and an activation function layer is connected after the fully connected layer.
Further, the second neural network includes a third preset number of convolutional layers and a fourth preset number of deconvolution layers, where a pooling layer, a batch normalization layer and an activation function layer are connected after each convolutional layer.
Further, the image to be processed is a face image, and the local images include at least one of the following: a left-eye image, a right-eye image, a nose image and a mouth image.
In the embodiments of the invention, depth prediction is performed on an image to be processed and on at least one local image of it through preset neural networks to generate a global depth map and local depth maps, and the global depth map and the local depth maps are weighted and fused according to preset fusion weights to generate a high-accuracy fused depth map, which solves the problem of low depth prediction accuracy in the prior art and achieves high-accuracy depth prediction for images.
Brief description of the drawings
Fig. 1 is a flowchart of an image depth prediction method provided by Embodiment 1 of the present invention;
Fig. 2A is a face image to be processed provided by Embodiment 1 of the present invention;
Fig. 2B is a global depth map generated from the image to be processed through the first neural network in Embodiment 1 of the present invention;
Fig. 2C is a schematic diagram of a feature point image generated from the image to be processed through the third neural network in Embodiment 1 of the present invention;
Fig. 2D is a schematic diagram of a multi-task neural network provided by Embodiment 1 of the present invention;
Fig. 3 is a schematic structural diagram of an image depth prediction device provided by Embodiment 2 of the present invention.
Detailed description of the embodiments
The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than the entire structure.
Embodiment 1
Fig. 1 is a flowchart of an image depth prediction method provided by Embodiment 1 of the present invention. This embodiment is applicable to automatically performing high-accuracy depth prediction on an image. The method can be executed by an image depth prediction device provided by an embodiment of the present invention, and the device can be implemented in software and/or hardware. The method specifically includes:
S110: performing depth prediction on an image to be processed based on a first neural network to generate a global depth map.
Here, depth prediction refers to the process of extracting depth information from the image to be processed. Depth information refers to the actual layering or distance information of each object in the image; an image with depth information has a sense of depth and dimension and a better visual effect.
The global depth map is an image containing the depth information of the image to be processed. The global depth map is a grayscale image in which the depth of each pixel is characterized by its gray value; for example, the larger the gray value of a pixel, the farther the actual object, and the smaller the gray value, the nearer the actual object.
In this embodiment, before depth prediction is performed on the image to be processed, the image is preprocessed. Preprocessing includes enlarging, shrinking or segmenting the image, where segmentation refers to deleting the background of the image to be processed, simplifying the input to the first neural network. Optionally, the resolution of the image input to the first neural network is fixed; for example, the resolution of the image to be processed can be 384x384. In this embodiment, preprocessing simplifies the image to be processed and standardizes its size, which helps the first neural network quickly extract useful feature information and avoids interference from background information.
In this embodiment, the first neural network is obtained by training in advance. The first neural network can include convolutional layers, deconvolution layers and pooling layers, and a pooling layer, an activation function layer and a batch normalization (BN) layer can be connected after each convolutional layer, where the connection order of the pooling layer, activation function layer and normalization layer is not limited. For example, the number of convolutional layers can be 4 and the number of deconvolution layers can be 5, with the 5 deconvolution layers connected after the 4 convolutional layers; the activation function can be, for example, the ReLU, PReLU or RReLU function.
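As a concrete illustration of such a layout, the following PyTorch sketch builds a network with 4 convolutional blocks (each followed by batch normalization, ReLU and pooling) and 5 deconvolution layers. The channel widths, kernel sizes and strides are illustrative assumptions, not values specified by this embodiment.

    import torch.nn as nn

    class GlobalDepthNet(nn.Module):
        """Sketch of the first neural network: 4 convolutional blocks,
        then 5 deconvolution layers emitting a one-channel depth map."""
        def __init__(self):
            super().__init__()
            chans = [3, 32, 64, 128, 256]   # channel widths are assumptions
            enc = []
            for c_in, c_out in zip(chans[:-1], chans[1:]):
                enc += [nn.Conv2d(c_in, c_out, 3, padding=1),
                        nn.BatchNorm2d(c_out),
                        nn.ReLU(inplace=True),
                        nn.MaxPool2d(2)]    # 384 -> 24 after four poolings
            self.encoder = nn.Sequential(*enc)
            self.decoder = nn.Sequential(   # 5 deconvolution layers
                nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.ConvTranspose2d(16, 1, 3, stride=1, padding=1))

        def forward(self, x):                      # x: (N, 3, 384, 384)
            return self.decoder(self.encoder(x))   # (N, 1, 384, 384)

Four stride-2 poolings reduce a 384x384 input to 24x24; four stride-2 deconvolutions restore 384x384, and a final stride-1 deconvolution produces the single-channel global depth map.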
S120: performing depth prediction on at least one local image of the image to be processed based on at least one second neural network to generate at least one local depth map.
In this embodiment, the topology of the second neural network can be the same as or different from that of the first neural network.
Optionally, there can be multiple second neural networks; they can have the same topology but different network parameters, obtained from different training samples. Different local images correspond to different second neural networks.
In this embodiment, the multiple local images can be generated by cropping manually determined local image regions, or by automatically recognizing the local image regions of the image to be processed and cropping them according to the recognition result. Optionally, the image to be processed is cropped based on a third neural network to generate at least one local image.
Here, cropping refers to identifying key information in the image to be processed and extracting the regions where the key information is located to form local images; a local image is an image that contains local information of the image to be processed. For example, if the image to be processed is a person image, the key information can be human limb information; if the image to be processed is a face image, the key information can be facial feature information.
In this embodiment, the third neural network is obtained by training in advance. The third neural network can include convolutional layers, a pooling layer and a fully connected layer, and an activation function layer and a batch normalization (BN) layer can be connected after each convolutional layer. For example, the number of convolutional layers can be 4 and the number of fully connected layers can be 1, with the pooling layer and the fully connected layer connected in turn after the 4 convolutional layers; the activation function can be, for example, the ReLU, PReLU or RReLU function.
Optionally, cropping the image to be processed based on the third neural network to generate at least one local image includes: performing feature recognition on the image to be processed based on the third neural network to generate a feature point image; determining local image regions according to the feature point image, and cropping the local image regions to generate at least one local image.
In this embodiment, the key information in the image to be processed is characterized by the feature points in the feature point image. For example, multiple feature points can form the contour of a piece of key information, or multiple feature points can cover it; the region enclosed by the lines connecting the feature points, or the region covered by the feature points, is determined as a local image region, and that region is cropped to generate a local image. Optionally, the feature points corresponding to different pieces of key information can be different; for example, different pieces of key information can be marked with feature points of different colors or shapes.
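A minimal sketch of this cropping step, assuming the feature points of one facial part are given as an array of (x, y) pixel coordinates; the padding margin is an assumed parameter:

    import numpy as np

    def crop_local_region(image, points, margin=8):
        """Cut out the local image region spanned by one group of feature
        points, enlarged by a small padding margin (assumed)."""
        xs, ys = points[:, 0], points[:, 1]
        h, w = image.shape[:2]
        x0 = max(int(xs.min()) - margin, 0)
        y0 = max(int(ys.min()) - margin, 0)
        x1 = min(int(xs.max()) + margin, w)
        y1 = min(int(ys.max()) + margin, h)
        return image[y0:y1, x0:x1], (x0, y0, x1, y1)

    # e.g. left_eye_img, left_eye_box = crop_local_region(face_img, left_eye_points)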
Optionally, the image to be processed is a face image, and the local images include at least one of the following: a left-eye image, a right-eye image, a nose image and a mouth image. Optionally, the local images can also include a left-eyebrow image and a right-eyebrow image.
For example, referring to Fig. 2A, Fig. 2B and Fig. 2C: Fig. 2A is a face image to be processed provided by Embodiment 1 of the present invention, where the face image has been obtained by preprocessing; Fig. 2B is the global depth map generated from the image to be processed through the first neural network; Fig. 2C is a schematic diagram of the feature point image generated from the image to be processed through the third neural network, where the dots in Fig. 2C are the feature points, and the eye regions, the nose region, the mouth region and the eyebrow regions all carry feature points. According to these feature points, the image to be processed can be cropped to generate multiple local images.
Optionally, before a local image is input into the second neural network, its resolution is adjusted to a preset resolution; for example, the preset resolution can be 384x384. Correspondingly, the resolution of the local depth map obtained by the depth prediction of the second neural network is restored to the initial resolution of the local image. In this embodiment, increasing the resolution of the local images helps to improve the accuracy of the local depth maps.
Optionally, the second neural network includes a third preset number of convolutional layers and a fourth preset number of deconvolution layers, where a pooling layer, a batch normalization layer and an activation function layer are connected after each convolutional layer; the connection order of the pooling layer, activation function layer and normalization layer is not limited. For example, the third preset number can be 4 and the fourth preset number can be 5; the activation function can be, for example, the ReLU, PReLU or RReLU function.
It should be noted that step S110 and step S120 can be executed simultaneously; there is no required execution order between them.
S130: weighting the global depth map and the at least one local depth map according to preset fusion weights to generate a fused depth map.
In this embodiment, both the global depth map and the local depth maps characterize depth information by gray values, and the same gray value corresponds to the same depth value. Weighting the global depth map and the at least one local depth map means weighting the gray value of each pixel of a local depth map against the gray value of the corresponding pixel of the global depth map based on the preset fusion weights, thereby determining the fused depth map. The preset fusion weights corresponding to different local images can be the same or different. For example, the left-eye local depth map is weighted pixel by pixel against the left-eye region of the global depth map, and the mouth local depth map is weighted pixel by pixel against the mouth region of the global depth map. Optionally, the preset fusion weights corresponding to these two local depth maps can be the same or different.
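The weighted fusion can be sketched as follows; it assumes each local depth map has been restored to the resolution of its region (see above) and that its position (x0, y0, x1, y1) in the global map is known from the cropping step:

    import numpy as np

    def fuse_depth_maps(global_depth, local_depths, boxes, local_weights):
        """Weighted fusion: inside each local region the fused gray value is
        w * local + (1 - w) * global; elsewhere the global map is kept."""
        fused = global_depth.astype(np.float64)
        for local, (x0, y0, x1, y1), w in zip(local_depths, boxes, local_weights):
            region = fused[y0:y1, x0:x1]
            fused[y0:y1, x0:x1] = w * local + (1.0 - w) * region
        return fused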
It should be noted that the accuracy of a local depth map is higher than the accuracy of the corresponding region of the global depth map. In this embodiment, high-accuracy local depth maps are obtained and fused with the global depth map based on the preset fusion weights, generating a high-accuracy fused depth map and solving the problem of low depth prediction accuracy in the prior art; at the same time, depth prediction based on neural networks is an end-to-end process and simple to operate.
Optionally, three-dimensional image reconstruction is performed according to the fused depth map, which can be applied to video conferencing, video telephony, virtual games, face recognition, or film and animation production, and helps to improve the definition of subsequent image or video production.
Optionally, the preset fusion weights are set in advance, and the method for setting them includes: performing depth prediction on a test sample through the first neural network and the second neural network to generate a test global depth map and at least one test local depth map; obtaining, respectively, a first test error of the test local depth map and a second test error of the region of the test global depth map corresponding to the test local depth map; and determining the preset fusion weights according to the first test error and the second test error.
In this embodiment, after the training of the first neural network and the second neural network is completed, the test sample is input into the first neural network to obtain the test global depth map, and the local images of the test sample are input into the corresponding second neural networks to generate the corresponding test local depth maps. The second test error of the test global depth map is determined based on a standard global depth map; for example, the error of each pixel is determined as the difference between the gray values of corresponding pixels in the standard global depth map and the test global depth map, and the mean of the pixel errors is taken as the test error. Optionally, the second test error is determined for the local region of the test global depth map corresponding to a local image. Similarly, the first test error of each test local depth map is determined based on a standard local depth map.
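A sketch of such a test error, taken here as the mean absolute gray-value difference between corresponding pixels; the exact error measure is an assumption consistent with this description:

    import numpy as np

    def depth_test_error(standard_depth, test_depth):
        """Mean absolute gray-value difference between corresponding pixels."""
        diff = standard_depth.astype(np.float64) - test_depth.astype(np.float64)
        return float(np.mean(np.abs(diff)))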
Optionally, the weight of the global depth map and of each local depth map is inversely proportional to its corresponding test error. For example, taking the left-eye image to illustrate the preset fusion weights: the second depth error and the first depth error are determined from the second test error and the first test error corresponding to the left-eye image. For instance, if the second depth error is 0.1 mm and the first depth error is 0.2 mm, the preset fusion weights of the corresponding local region of the global depth map and of the left-eye depth map can be 2/3 and 1/3, respectively. Optionally, different fusion weights can be set for different regions of the global depth map.
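The inverse-proportional weight rule can be sketched as follows; the example call reproduces the left-eye numbers above:

    def fusion_weights(errors):
        """Weights inversely proportional to the test errors, normalized to
        sum to 1, so the more accurate depth map gets the larger weight."""
        inverses = [1.0 / e for e in errors]
        total = sum(inverses)
        return [v / total for v in inverses]

    # Left-eye example: second depth error (global region) 0.1 mm, first depth
    # error (local map) 0.2 mm -> weights 2/3 and 1/3
    w_global_region, w_left_eye = fusion_weights([0.1, 0.2])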
The preset fusion weights are determined from a large number of test samples; for example, the number of test samples can be 1900.
In this embodiment, by determining the preset fusion weights according to the test errors of the global depth map and the local depth maps, the weight of the more accurate depth map is increased while the weight of the less accurate depth map is reduced, which further improves the depth prediction accuracy of the fused depth map.
In the technical scheme of this embodiment, depth prediction is performed on the image to be processed and on at least one local image of it through preset neural networks to generate a global depth map and local depth maps, and the global depth map and the local depth maps are weighted and fused according to preset fusion weights to generate a high-accuracy fused depth map, which solves the problem of low depth prediction accuracy in the prior art and achieves high-accuracy depth prediction for images.
On the basis of the above embodiment, before depth prediction is performed on the image to be processed, the first neural network, the third neural network and the second neural network are built and trained. Specifically, when the construction of the first neural network, the third neural network and the second neural network is completed, network parameter initialization is performed on the first neural network, the third neural network and the second neural network, and the initialized first neural network, third neural network and second neural network are trained by minimizing a loss function, where the initialized network parameters are set according to a one-dimensional Gaussian distribution.
Here, network parameter initialization refers to setting the initial network parameters of a neural network. In this embodiment, the initial network parameters of each neural network are set according to a one-dimensional Gaussian distribution instead of the random initialization used in the prior art, which helps to improve the training efficiency of the neural networks and avoids the slow convergence, or failure to converge, caused by random initialization.
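A minimal sketch of this initialization in PyTorch; the embodiment only specifies a one-dimensional Gaussian distribution, so the mean and standard deviation below are illustrative assumptions:

    import torch.nn as nn

    def init_gaussian(module, std=0.01):
        """Draw weights from a one-dimensional Gaussian; zero the biases."""
        if isinstance(module, (nn.Conv2d, nn.ConvTranspose2d, nn.Linear)):
            nn.init.normal_(module.weight, mean=0.0, std=std)
            if module.bias is not None:
                nn.init.zeros_(module.bias)

    # net.apply(init_gaussian) runs this on every submodule of `net`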
Optionally, the training method of the first neural network includes: performing depth prediction on a first sample image through the first neural network to be trained to generate a first training image; generating a first loss function according to the first training image and the standard global depth map corresponding to the first sample image; and adjusting the network parameters of the first neural network to be trained according to the first loss function.
In this embodiment, the first sample images can be, for example, a large number of face images; for instance, they can include 2500 color face images, of which 1100 are male face images and 1400 are female face images. Optionally, the first sample images are resized to a uniform size.
In this embodiment, the standard global depth map can be set in advance, or it can be extracted during the training of the first neural network, for example by a preset depth map extraction model that extracts the depth information of the training image; the preset depth map extraction model can be, for instance, the HourGlass model, which is obtained in advance. The first loss function characterizes the degree of inconsistency between the feature information of the training image generated by the neural network and the standard feature information of the standard global depth map; the smaller the value of the first loss function, the better the robustness of the first neural network generally is. For example, the first loss function can take the form of the mean squared error (MSE).
In this embodiment, the gradient of the first loss function is back-propagated, and the network parameters of the first neural network are adjusted according to the first loss function. Optionally, the network parameters include, but are not limited to, weights and bias values.
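One training step with an MSE loss and gradient back-propagation might look as follows; the optimizer choice and learning rate are assumptions, not values given by this embodiment:

    import torch
    import torch.nn as nn

    def train_step(net, optimizer, sample, standard_depth):
        """One step: back-propagate the MSE between the predicted and the
        standard depth map, then update the weights and bias values."""
        optimizer.zero_grad()
        pred = net(sample)                      # predicted depth map
        loss = nn.functional.mse_loss(pred, standard_depth)
        loss.backward()                         # gradient back-propagation
        optimizer.step()                        # adjust network parameters
        return loss.item()

    # e.g. optimizer = torch.optim.SGD(net.parameters(), lr=1e-3)  # assumed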
Optionally, the training method of the second neural network includes: performing depth prediction on a second sample image through the second neural network to be trained to generate a second training image; generating a second loss function according to the second training image and the standard local depth map corresponding to the second sample image; and adjusting the network parameters of the second neural network to be trained according to the second loss function.
In this embodiment, the second sample images are set so that they match the first sample images. For example, if the second neural network is used to perform depth prediction on left-eye images, the second sample images are the left-eye local images corresponding to the first sample images. The standard local depth map can be set in advance, or it can be extracted during the training of the second neural network. The second loss function characterizes the degree of inconsistency between the feature information of the second training image generated by the neural network and the standard feature information of the standard local depth map; the smaller the value of the second loss function, the better the robustness of the second neural network generally is. For example, the second loss function can take the form of the mean squared error.
In this embodiment, the gradient of the second loss function is back-propagated, and the network parameters of the second neural network are adjusted according to the second loss function. Optionally, the network parameters include, but are not limited to, weights and bias values.
Optionally, the training method of the third neural network includes: cropping a third sample image through the third neural network to be trained to generate at least one training local image; obtaining, respectively, the boundary coordinate information of the training local image and of the corresponding standard local image; determining a third loss function according to the boundary coordinate information of the training local image and the boundary coordinate information of the standard local image; and adjusting the network parameters of the third neural network according to the third loss function.
In this embodiment, the third sample images can be the same as the first sample images, which reduces the number of samples to collect. The third loss function characterizes the degree of inconsistency between the boundary coordinate information of the training local image generated by the neural network and the boundary coordinate information of the standard local image; for example, the mean error over the boundary pixels can be taken as the value of the third loss function. The gradient of the third loss function is back-propagated, and the network parameters of the third neural network are adjusted according to the third loss function. Optionally, the network parameters include, but are not limited to, weights and bias values.
Optionally, the first neural network and the third neural network form a multi-task neural network. For example, referring to Fig. 2D, a schematic diagram of a multi-task neural network provided by Embodiment 1 of the present invention: the front part of the multi-task neural network includes a first preset number of convolutional layers for extracting the feature information of the input image; the rear part includes a first branch and a second branch, where the first branch includes a second preset number of deconvolution layers for generating the global depth map of the input image, and the second branch includes a pooling layer and a fully connected layer for cropping the input image to generate at least one local image; a pooling layer, a batch normalization layer and an activation function layer are connected after each convolutional layer, and an activation function layer is connected after the fully connected layer.
In this embodiment, the multi-task neural network performs depth prediction and cropping on the image to be processed at the same time, generating the global depth map and at least one local image. It accomplishes multiple tasks simultaneously instead of one neural network accomplishing only one task, which simplifies the training of the neural networks.
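A sketch of such a two-branch multi-task network, following the layer layout described above; the channel widths and the number of regressed boundary coordinates (assumed to be four per local region) are illustrative assumptions:

    import torch.nn as nn

    class MultiTaskNet(nn.Module):
        """Sketch of the multi-task network of Fig. 2D: a shared convolutional
        front part, a deconvolution branch for the global depth map, and a
        pooling + fully connected branch that regresses the boundary
        coordinates of the local image regions."""
        def __init__(self, num_regions=4):
            super().__init__()
            trunk = []
            chans = [3, 32, 64, 128, 256]
            for c_in, c_out in zip(chans[:-1], chans[1:]):
                trunk += [nn.Conv2d(c_in, c_out, 3, padding=1),
                          nn.BatchNorm2d(c_out),
                          nn.ReLU(inplace=True),
                          nn.MaxPool2d(2)]
            self.trunk = nn.Sequential(*trunk)      # shared feature extractor
            self.depth_branch = nn.Sequential(      # first branch: 5 deconvolutions
                nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.ConvTranspose2d(16, 1, 3, stride=1, padding=1))
            self.crop_branch = nn.Sequential(       # second branch: pooling + FC
                nn.AdaptiveAvgPool2d(6), nn.Flatten(),
                nn.Linear(256 * 6 * 6, 4 * num_regions),
                nn.ReLU(inplace=True))              # activation after the FC layer

        def forward(self, x):                       # x: (N, 3, 384, 384)
            feat = self.trunk(x)
            return self.depth_branch(feat), self.crop_branch(feat)

The two branches share the feature information extracted by the front part, so a single forward pass yields both the global depth map and the crop boundary coordinates.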
Embodiment 2
Fig. 3 is a schematic structural diagram of an image depth prediction device provided by Embodiment 2 of the present invention. The device specifically includes:
a global depth map generation module 210, configured to perform depth prediction on an image to be processed based on a first neural network to generate a global depth map;
a local depth map generation module 220, configured to perform depth prediction on at least one local image of the image to be processed based on at least one second neural network to generate at least one local depth map;
a fused depth map generation module 230, configured to weight the global depth map and the at least one local depth map according to preset fusion weights to generate a fused depth map.
Optionally, the device also includes:
a feature point determination module, configured to perform feature recognition on the image to be processed based on a third neural network to generate a feature point image before the depth prediction is performed on the at least one local image of the image to be processed based on the at least one second neural network;
a local image generation module, configured to determine local image regions according to the feature point image and to crop the local image regions to generate at least one local image, where the feature point images corresponding to different image regions are different.
Optionally, when the construction of the first neural network, the third neural network and the second neural network is completed, network parameter initialization is performed on the first neural network, the third neural network and the second neural network, and the initialized first neural network, third neural network and second neural network are trained by minimizing a loss function, where the initialized network parameters are set according to a one-dimensional Gaussian distribution.
Optionally, the preset fusion weights are set in advance, and the weight setting module includes:
a depth map determination unit, configured to perform depth prediction on a test sample through the first neural network and the second neural network to generate a test global depth map and at least one test local depth map;
a test error determination unit, configured to obtain, respectively, a first test error of the test local depth map and a second test error of the region of the test global depth map corresponding to the test local depth map;
a weight determination unit, configured to determine the preset fusion weights according to the first test error and the second test error.
Optionally, the first neural network and the third neural network form a multi-task neural network; the front part of the multi-task neural network includes a first preset number of convolutional layers for extracting the feature information of an input image;
the rear part of the multi-task neural network includes a first branch and a second branch; the first branch includes a second preset number of deconvolution layers for generating the global depth map of the input image, and the second branch includes a pooling layer and a fully connected layer for cropping the input image to generate at least one local image, where a pooling layer, a batch normalization layer and an activation function layer are connected after each convolutional layer, and an activation function layer is connected after the fully connected layer.
Optionally, the second neural network includes a third preset number of convolutional layers and a fourth preset number of deconvolution layers, where a pooling layer, a batch normalization layer and an activation function layer are connected after each convolutional layer.
Optionally, the image to be processed is a face image, and the local images include at least one of the following: a left-eye image, a right-eye image, a nose image and a mouth image.
The image depth prediction device provided by the embodiments of the present invention can execute the image depth prediction method provided by any embodiment of the present invention, and has the corresponding functional modules and beneficial effects.
Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described here; various obvious changes, readjustments and substitutions can be made without departing from the protection scope of the present invention. Therefore, although the present invention has been described in further detail through the above embodiments, it is not limited to them and may include other equivalent embodiments without departing from the inventive concept; the scope of the present invention is determined by the scope of the appended claims.

Claims (14)

  1. An image depth prediction method, characterized in that it includes:
    performing depth prediction on an image to be processed based on a first neural network to generate a global depth map;
    performing depth prediction on at least one local image of the image to be processed based on at least one second neural network to generate at least one local depth map;
    weighting the global depth map and the at least one local depth map according to preset fusion weights to generate a fused depth map.
  2. The method according to claim 1, characterized in that, before the depth prediction is performed on the at least one local image of the image to be processed based on the at least one second neural network, the method also includes:
    performing feature recognition on the image to be processed based on a third neural network to generate a feature point image;
    determining local image regions according to the feature point image, and cropping the local image regions to generate at least one local image.
  3. The method according to claim 2, characterized in that, when the construction of the first neural network, the third neural network and the second neural network is completed, network parameter initialization is performed on the first neural network, the third neural network and the second neural network, and the initialized first neural network, third neural network and second neural network are trained by minimizing a loss function, wherein the initialized network parameters are set according to a one-dimensional Gaussian distribution.
  4. The method according to claim 1, characterized in that the preset fusion weights are set in advance, and the method for setting the preset fusion weights includes:
    performing depth prediction on a test sample based on the first neural network and the second neural network to generate a test global depth map and at least one test local depth map;
    obtaining, respectively, a first test error of the test local depth map and a second test error of the region of the test global depth map corresponding to the test local depth map;
    determining the preset fusion weights according to the first test error and the second test error.
  5. The method according to claim 2, characterized in that the first neural network and the third neural network form a multi-task neural network, the front part of which includes a first preset number of convolutional layers for extracting the feature information of an input image;
    the rear part of the multi-task neural network includes a first branch and a second branch; the first branch includes a second preset number of deconvolution layers for generating the global depth map of the input image, and the second branch includes a pooling layer and a fully connected layer for cropping the input image to generate at least one local image, wherein a pooling layer, a batch normalization layer and an activation function layer are connected after each convolutional layer, and an activation function layer is connected after the fully connected layer.
  6. The method according to claim 1, characterized in that the second neural network includes a third preset number of convolutional layers and a fourth preset number of deconvolution layers, wherein a pooling layer, a batch normalization layer and an activation function layer are connected after each convolutional layer.
  7. The method according to any one of claims 1-6, characterized in that the image to be processed is a face image, and the local images include at least one of the following: a left-eye image, a right-eye image, a nose image and a mouth image.
  8. An image depth prediction device, characterized in that it includes:
    a global depth map generation module, configured to perform depth prediction on an image to be processed based on a first neural network to generate a global depth map;
    a local depth map generation module, configured to perform depth prediction on at least one local image of the image to be processed based on at least one second neural network to generate at least one local depth map;
    a fused depth map generation module, configured to weight the global depth map and the at least one local depth map according to preset fusion weights to generate a fused depth map.
  9. The device according to claim 8, characterized in that the device also includes:
    a feature point determination module, configured to perform feature recognition on the image to be processed based on a third neural network to generate a feature point image before the depth prediction is performed on the at least one local image of the image to be processed based on the at least one second neural network;
    a local image generation module, configured to determine local image regions according to the feature point image and to crop the local image regions to generate at least one local image.
  10. The device according to claim 9, characterized in that, when the construction of the first neural network, the third neural network and the second neural network is completed, network parameter initialization is performed on the first neural network, the third neural network and the second neural network, and the initialized first neural network, third neural network and second neural network are trained by minimizing a loss function, wherein the initialized network parameters are set according to a one-dimensional Gaussian distribution.
  11. The device according to claim 8, characterized in that the preset fusion weights are set in advance, and the weight setting module includes:
    a depth map determination unit, configured to perform depth prediction on a test sample through the first neural network and the second neural network to generate a test global depth map and at least one test local depth map;
    a test error determination unit, configured to obtain, respectively, a first test error of the test local depth map and a second test error of the region of the test global depth map corresponding to the test local depth map;
    a weight determination unit, configured to determine the preset fusion weights according to the first test error and the second test error.
  12. The device according to claim 9, characterized in that the first neural network and the third neural network form a multi-task neural network, the front part of which includes a first preset number of convolutional layers for extracting the feature information of an input image;
    the rear part of the multi-task neural network includes a first branch and a second branch; the first branch includes a second preset number of deconvolution layers for generating the global depth map of the input image, and the second branch includes a pooling layer and a fully connected layer for cropping the input image to generate at least one local image, wherein a pooling layer, a batch normalization layer and an activation function layer are connected after each convolutional layer, and an activation function layer is connected after the fully connected layer.
  13. The device according to claim 8, characterized in that the second neural network includes a third preset number of convolutional layers and a fourth preset number of deconvolution layers, wherein a pooling layer, a batch normalization layer and an activation function layer are connected after each convolutional layer.
  14. The device according to any one of claims 8-13, characterized in that the image to be processed is a face image, and the local images include at least one of the following: a left-eye image, a right-eye image, a nose image and a mouth image.
CN201710811182.4A 2017-09-11 2017-09-11 Image depth prediction method and device Active CN107578435B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710811182.4A CN107578435B (en) Image depth prediction method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710811182.4A CN107578435B (en) Image depth prediction method and device

Publications (2)

Publication Number Publication Date
CN107578435A true CN107578435A (en) 2018-01-12
CN107578435B CN107578435B (en) 2019-11-29

Family

ID=61033100

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710811182.4A Active CN107578435B (en) 2017-09-11 2017-09-11 Image depth prediction method and device

Country Status (1)

Country Link
CN (1) CN107578435B (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109191514A (en) * 2018-10-23 2019-01-11 北京字节跳动网络技术有限公司 Method and apparatus for generating a depth detection model
CN109829886A (en) * 2018-12-25 2019-05-31 苏州江奥光电科技有限公司 PCB defect detection method based on depth information
WO2019149206A1 (en) * 2018-02-01 2019-08-08 深圳市商汤科技有限公司 Depth estimation method and apparatus, electronic device, program, and medium
CN110309706A (en) * 2019-05-06 2019-10-08 深圳市华付信息技术有限公司 Face key point detection method and apparatus, computer equipment and storage medium
CN110363296A (en) * 2019-06-28 2019-10-22 腾讯科技(深圳)有限公司 Task model acquisition method and device, storage medium and electronic device
CN111414923A (en) * 2020-03-05 2020-07-14 南昌航空大学 Indoor scene three-dimensional reconstruction method and system based on single RGB image
CN111428859A (en) * 2020-03-05 2020-07-17 北京三快在线科技有限公司 Depth estimation network training method and device for automatic driving scene and autonomous vehicle
CN112488104A (en) * 2020-11-30 2021-03-12 华为技术有限公司 Depth and confidence estimation system
CN116721143A (en) * 2023-08-04 2023-09-08 南京诺源医疗器械有限公司 Depth information processing device and method for 3D medical image

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103177440A (en) * 2012-12-20 2013-06-26 香港应用科技研究院有限公司 System and method of generating an image depth map
CN106204522A (en) * 2015-05-28 2016-12-07 奥多比公司 Joint depth estimation and semantic labeling of a single image
CN107038429A (en) * 2017-05-03 2017-08-11 四川云图睿视科技有限公司 Multi-task cascaded face alignment method based on deep learning

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103177440A (en) * 2012-12-20 2013-06-26 香港应用科技研究院有限公司 System and method of generating an image depth map
CN106204522A (en) * 2015-05-28 2016-12-07 奥多比公司 Joint depth estimation and semantic labeling of a single image
CN107038429A (en) * 2017-05-03 2017-08-11 四川云图睿视科技有限公司 Multi-task cascaded face alignment method based on deep learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
DAVID EIGEN 等: "Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture", 《PROC.IEEE ICCV》 *
JOSE M. FACIL 等: "Single-View and Multi-View Depth Fusion", 《IEEE ROBOTICS AND AUTOMATION LETTERS》 *

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20200049833A (en) * 2018-02-01 2020-05-08 선전 센스타임 테크놀로지 컴퍼니 리미티드 Depth estimation methods and apparatus, electronic devices, programs and media
US11308638B2 (en) 2018-02-01 2022-04-19 Shenzhen Sensetime Technology Co., Ltd. Depth estimation method and apparatus, electronic device, program, and medium
WO2019149206A1 (en) * 2018-02-01 2019-08-08 深圳市商汤科技有限公司 Depth estimation method and apparatus, electronic device, program, and medium
KR102295403B1 (en) 2018-02-01 2021-08-31 선전 센스타임 테크놀로지 컴퍼니 리미티드 Depth estimation method and apparatus, electronic device, program and medium
CN109191514B (en) * 2018-10-23 2020-11-24 北京字节跳动网络技术有限公司 Method and apparatus for generating a depth detection model
CN109191514A (en) * 2018-10-23 2019-01-11 北京字节跳动网络技术有限公司 Method and apparatus for generating a depth detection model
CN109829886A (en) * 2018-12-25 2019-05-31 苏州江奥光电科技有限公司 PCB defect detection method based on depth information
CN110309706A (en) * 2019-05-06 2019-10-08 深圳市华付信息技术有限公司 Face key point detection method and apparatus, computer equipment and storage medium
CN110363296A (en) * 2019-06-28 2019-10-22 腾讯科技(深圳)有限公司 Task model acquisition method and device, storage medium and electronic device
CN110363296B (en) * 2019-06-28 2022-02-08 腾讯医疗健康(深圳)有限公司 Task model obtaining method and device, storage medium and electronic device
CN111414923A (en) * 2020-03-05 2020-07-14 南昌航空大学 Indoor scene three-dimensional reconstruction method and system based on single RGB image
CN111428859A (en) * 2020-03-05 2020-07-17 北京三快在线科技有限公司 Depth estimation network training method and device for automatic driving scene and autonomous vehicle
CN111414923B (en) * 2020-03-05 2022-07-12 南昌航空大学 Indoor scene three-dimensional reconstruction method and system based on single RGB image
CN112488104A (en) * 2020-11-30 2021-03-12 华为技术有限公司 Depth and confidence estimation system
CN112488104B (en) * 2020-11-30 2024-04-09 华为技术有限公司 Depth and confidence estimation system
CN116721143A (en) * 2023-08-04 2023-09-08 南京诺源医疗器械有限公司 Depth information processing device and method for 3D medical image
CN116721143B (en) * 2023-08-04 2023-10-20 南京诺源医疗器械有限公司 Depth information processing device and method for 3D medical image

Also Published As

Publication number Publication date
CN107578435B (en) 2019-11-29

Similar Documents

Publication Publication Date Title
CN107578435B (en) Image depth prediction method and device
CN109255831B (en) Single-view face three-dimensional reconstruction and texture generation method based on multi-task learning
CN113012293B (en) Stone carving model construction method, device, equipment and storage medium
CN105427385B (en) High-fidelity face three-dimensional reconstruction method based on a multilayer deformation model
CN105404392B (en) Virtual try-on method and system based on a monocular camera
CN110084304B (en) Target detection method based on a synthetic data set
CN108495110A (en) Virtual viewpoint image generation method based on a generative adversarial network
WO2015188684A1 (en) Three-dimensional model reconstruction method and system
CN109978984A (en) Face three-dimensional reconstruction method and terminal device
CN107154032B (en) Image processing method and device
CN110223377A (en) High-accuracy three-dimensional reconstruction method based on a stereo vision system
CN110148217A (en) Real-time three-dimensional reconstruction method, device and equipment
JP2008535116A (en) Method and apparatus for three-dimensional rendering
CN110197462A (en) Real-time facial image beautification and texture synthesis method
CN104809638A (en) Virtual glasses try-on method and system based on a mobile terminal
CN110246209B (en) Image processing method and device
CN110189202A (en) Three-dimensional virtual fitting method and system
CN116109798A (en) Image data processing method, device, equipment and medium
CN108520510B (en) No-reference stereo image quality evaluation method based on global and local analysis
CN104599317A (en) Mobile terminal and method for realizing a 3D (three-dimensional) scanning and modeling function
CN110517306A (en) Method and system for binocular depth vision estimation based on deep learning
CN107578469A (en) 3D human body modeling method and device based on a single photo
CN109218706A (en) Method for generating a stereoscopic visual image from a single image
CN107469355A (en) Game image generation method and device, and terminal device
CN113144613B (en) Model-based method for generating volumetric clouds

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20221123

Address after: 518000 2nd floor, building a, Tsinghua campus, Shenzhen University Town, Xili street, Nanshan District, Shenzhen City, Guangdong Province

Patentee after: Tsinghua Shenzhen International Graduate School

Address before: 518000 Nanshan Zhiyuan 1001, Xue Yuan Avenue, Nanshan District, Shenzhen, Guangdong.

Patentee before: TSINGHUA-BERKELEY SHENZHEN INSTITUTE

TR01 Transfer of patent right