CN102263979B - Depth map generation method and device for plane video three-dimensional conversion


Info

Publication number
CN102263979B
Authority
CN
China
Prior art keywords
frame image
current frame
depth map
described current
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN 201110223804
Other languages
Chinese (zh)
Other versions
CN102263979A (en)
Inventor
戴琼海 (Dai Qionghai)
李唯一 (Li Weiyi)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority: CN 201110223804, filed 2011-08-05
Publication of CN102263979A
Application granted
Publication of CN102263979B
Legal status: Active
Anticipated expiration

Abstract

The invention discloses a depth map generation method for plane video three-dimensional conversion, comprising the following steps: segmenting a current frame image based on a scene static-model distribution to obtain segmented regions, labeling the pixels, and clustering them into regions to generate a static-model depth map of the current frame image; selecting feature points of the current frame image and a reference frame image, computing the motion vectors of moving objects, and performing motion analysis on those vectors to obtain a motion analysis result for the two frames; generating a dynamic-model depth map of the current frame image from its segmented regions and the motion analysis result; and adaptively fusing the static-model depth map and the dynamic-model depth map of the current frame image. The invention further discloses a depth map generation device for plane video three-dimensional conversion. By fusing a static model and a dynamic model of the scene, the method and device produce a more lifelike stereoscopic video and achieve a better stereoscopic effect.

Description

Depth map generation method and device for plane video three-dimensional conversion
Technical field
The present invention relates to the field of computer multimedia technology, and in particular to a depth map generation method and device for converting plane (2D) video to three-dimensional (3D) video.
Background technology
Three-dimensional video technology (also called 3D video technology) is a major direction in the development of multimedia: stereoscopic video is a novel video technique that provides a sense of depth. Compared with single-channel video, stereoscopic video generally carries two video channels, so its data volume is far larger than that of single-channel video. Stereoscopic video supplies the depth perception that human vision experiences. Today, 3D technology is widely applied in many fields, including communications, broadcasting, medicine, education, computer games, and animation, and the rapid development of 3D displays allows viewers to experience depth well and with maximal eye comfort.
Traditional methods for converting 2D video to stereoscopic video extract depth information from a single depth cue in an image, such as object texture, the spatial geometry of the scene, or the relative sizes of objects, and finally obtain the projection of the image into the three-dimensional world. However, the scenes in most videos change in complex ways, containing both dynamic and static content, so a single depth cue cannot apply to every shot from beginning to end.
In addition, conventional motion-parallax methods assume that objects close to the viewer move faster than distant objects, and derive a parallax depth map from the relative speeds of the objects. But the scenes of a video vary widely: it often happens that an object far from the viewer moves quickly while a nearby object moves slowly, and applying the conventional method in such cases yields an incorrect disparity map.
Summary of the invention
The purpose of the present invention is to solve at least one of the technical deficiencies described above.
A first object of the present invention is to provide a depth map generation method for plane video three-dimensional conversion that fuses a static model and a dynamic model to generate a depth map better adapted to the true state of the scene, thereby producing a more lifelike stereoscopic video.
A second object of the present invention is to provide a depth map generation device for plane video three-dimensional conversion.
To achieve the above objects, an embodiment of the first aspect of the present invention proposes a depth map generation method for plane video three-dimensional conversion, comprising the following steps: performing, on a current frame image, image segmentation based on a scene static-model distribution, which includes constructing a graph topology from the feature vectors of the pixels of the current frame image, segmenting the current frame image according to that graph topology to obtain segmented regions, and labeling the pixels belonging to the same region to generate a static-model depth map of the current frame image; selecting feature points of the current frame image and a reference frame image, tracking those feature points, computing the motion vectors of the moving objects in the two frames, and performing motion analysis on the motion vectors to obtain a motion analysis result of the current frame image and the reference frame image; generating a dynamic-model depth map of the current frame image from its segmented regions and the motion analysis result; and performing depth fusion of the static-model depth map and the dynamic-model depth map of the current frame image.
The depth map generation method according to this embodiment of the invention fuses a static model and a dynamic model to generate a depth map suitable for both moving and static scenes, thereby producing a more lifelike stereoscopic video and a better stereoscopic effect.
An embodiment of the second aspect of the present invention proposes a depth map generation device for plane video three-dimensional conversion, comprising: a static-model depth map generation module, which constructs a graph topology from the feature vectors of the pixels of the current frame image, segments the current frame image according to that graph topology to obtain segmented regions, and labels the pixels belonging to the same region to generate a static-model depth map of the current frame image; a motion analysis module, which selects feature points of the current frame image and a reference frame image, tracks them, computes the motion vectors of the moving objects in the two frames, and performs motion analysis on the motion vectors to obtain a motion analysis result of the current frame image and the reference frame image; a dynamic-model depth map generation module, connected to the static-model depth map generation module and the motion analysis module, which generates a dynamic-model depth map of the current frame image from its segmented regions and the motion analysis result; and a depth fusion module, connected to the static-model depth map generation module and the dynamic-model depth map generation module, which performs adaptive depth fusion of the static-model depth map and the dynamic-model depth map of the current frame image.
The depth map generation device according to this embodiment of the invention likewise fuses a static model and a dynamic model to generate a depth map suitable for both moving and static scenes, producing a more lifelike stereoscopic video and a better stereoscopic effect.
Additional aspects and advantages of the present invention are given in part in the description below, will in part become apparent from that description, or may be learned through practice of the invention.
Description of drawings
The above and additional aspects and advantages of the present invention will become apparent and readily understood from the following description of embodiments taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a flow chart of the depth map generation method for plane video three-dimensional conversion according to an embodiment of the invention;
Fig. 2 is the generation flow of the static-model depth map according to an embodiment of the invention;
Fig. 3 is the generation flow of the dynamic-model depth map according to an embodiment of the invention;
Fig. 4 is a schematic diagram of the motion information analysis according to an embodiment of the invention;
Fig. 5 is a schematic diagram of the depth map generation device for plane video three-dimensional conversion according to an embodiment of the invention.
Embodiment
Embodiments of the invention are described in detail below; examples of the embodiments are shown in the drawings, where identical or similar reference numerals denote identical or similar elements, or elements with identical or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the present invention and are not to be construed as limiting it.
In the depth map generation method for plane video three-dimensional conversion provided by this embodiment, a plane video is first input; the video is then split into two branches, the current frame alone and the current frame together with a reference frame; the static-model depth map and the dynamic-model depth map of the current frame image are generated respectively; and the two depth maps are then depth-fused to obtain the fused depth map of the current frame.
The depth map generation method according to an embodiment of the invention is described below with reference to Fig. 1 to Fig. 4.
As shown in Fig. 1, the depth map generation method for plane video three-dimensional conversion according to an embodiment of the invention comprises the following steps:
S101: perform, on the current frame image, image segmentation based on the scene static-model distribution. As shown in Fig. 2, the image segmentation comprises the following steps:
S1011: after the current frame image is loaded into memory, it is first converted to grayscale. The grayscale conversion maps the 24-bit, three-channel RGB color space of each pixel to a single-channel 8-bit color space, which reduces the amount of data to process in the subsequent steps. The grayscale image is then clustered. Note that this grayscale step is optional: the clustering operation can also be applied to the current frame image directly.
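For illustration, a minimal sketch of this grayscale step in Python with OpenCV; the patent names no library, so the use of OpenCV and its standard luminance weighting is an assumption:

```python
import cv2
import numpy as np

def to_gray(frame_bgr: np.ndarray) -> np.ndarray:
    """Convert a 24-bit, 3-channel frame to an 8-bit single-channel image.

    This is the data-reduction step S1011 describes: each pixel shrinks from
    three 8-bit channels to one. OpenCV applies the usual luminance weights
    (0.299 R + 0.587 G + 0.114 B).
    """
    return cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
```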
S1012: image segmentation can be treated as an element clustering problem. Concretely, the basic elements of the current frame image (for example, its pixels) or other components are grouped into chunks with similar composition, i.e., the current frame image is clustered.
Depending on the data structure used to process the image during segmentation, the clustering operation can take one of two forms:
1) In vector space: each element of the current frame image is treated as an independent vector. Part of the feature components of a pixel serve as its feature vector, and the feature vectors are grouped by a central clustering algorithm. The global features obtained this way are very effective, but the method has the following shortcoming: because it does not exploit the spatial structure of the vectors, it cannot preserve much of the image's detail and often merges regions that are actually disconnected into the same group.
2) In the spatial domain: the spatial relationships and connected boundary information of the elements are exploited by processing the current frame image in the spatial domain and converting it into a two-dimensional graph topology. Because this method uses the elements' spatial relationships and connectivity information, it preserves the image's detail.
Preferably, the second approach is used to cluster the current frame image.
The current frame image is converted into a graph topology G(V, E), where V is the set of nodes of the image and E is the set of weighted edges between adjacent nodes. A node V_i may be a single pixel, or a small subregion with uniform shape, similar texture, or other shared feature structure; for example, the image may be divided into 5 × 5 blocks, each block serving as one basic node. In one embodiment of the invention, a color vector is used as the feature vector. In feature space, X_i is the RGB color vector of node i, and the edge cost E_ij carries a positive weight W_ij that reflects the similarity between nodes i and j. The invention measures W_ij by the Euclidean distance between the two nodes:
W_ij = ||X_i - X_j||,

total = Σ W_ij,

where n is the number of nodes in the graph topology and total is the sum of the edge weights W_ij over all edges.
All the nodes of the image are linked together so that the total edge weight total is the minimum cost.
According to the graph topology of the current frame image obtained in step S1012, the current frame image is segmented into a number of effective regions. An effective region is a meaningful region, i.e., a region of pixels that the user needs in the image. Concretely, the graph topology obtained in step S1012 serves as the static model for the segmentation. In one embodiment of the invention, the minimum spanning tree method is applied to the graph topology of the current frame image to obtain the segmentation result: building the minimum spanning tree preserves the connectivity information of every node while connecting all nodes at minimum boundary cost. The graph topology can then be clustered, dividing the image into a number of effective regions, i.e., a number of meaningful image regions.
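The construction and clustering of G(V, E) can be sketched as follows; the 4-connected pixel grid and the fixed merge threshold `k` are assumptions, since the patent fixes neither the neighborhood nor the exact stopping rule of the minimum-spanning-tree grouping:

```python
import numpy as np

def segment_mst(img: np.ndarray, k: float = 10.0) -> np.ndarray:
    """Group pixels by merging minimum-weight edges first (Kruskal order).

    Nodes are pixels; edge weights are Euclidean distances between the RGB
    feature vectors of 4-connected neighbours, W_ij = ||X_i - X_j||. Edges
    are taken cheapest-first, so the merges trace a minimum spanning tree.
    """
    h, w = img.shape[:2]
    rgb = img.reshape(-1, 3).astype(np.float64)
    idx = np.arange(h * w).reshape(h, w)

    # Build the 4-connected edges (right and down neighbours).
    parts = []
    for a, b in [(idx[:, :-1], idx[:, 1:]), (idx[:-1, :], idx[1:, :])]:
        a, b = a.ravel(), b.ravel()
        wts = np.linalg.norm(rgb[a] - rgb[b], axis=1)
        parts.append(np.stack([wts, a, b], axis=1))
    edges = np.concatenate(parts)
    edges = edges[np.argsort(edges[:, 0])]          # cheapest edges first

    parent = np.arange(h * w)                       # union-find forest

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]           # path halving
            x = parent[x]
        return x

    for wt, a, b in edges:
        ra, rb = find(int(a)), find(int(b))
        if ra != rb and wt <= k:                    # merge only similar regions
            parent[rb] = ra

    labels = np.array([find(i) for i in range(h * w)])
    return labels.reshape(h, w)
```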
S102: generate the static-model depth map of the current frame image.
S1021: merge the image segmentation regions.
Building the minimum spanning tree is also a process of clustering the graph topology. Because the texture in an image can be very rich, the initial segmentation may be overly fragmented, so the preliminary segmentation needs to be optimized.
First, the minimum number of subregions that each effective region of step S1012 must contain is set; an effective region can be regarded as a larger-scope region, and the number of subregions it must contain at least is configured. The effective regions are then merged according to this setting. Merging the segmented regions effectively removes noise regions and broadly separates the moving foreground objects from the background; one possible post-pass is sketched below.
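The pixel-count threshold and the choice of absorbing neighbour in this sketch are illustrative, since the text only states that a minimum sub-region count is configured:

```python
import numpy as np

def merge_small_regions(labels: np.ndarray, min_size: int = 50) -> np.ndarray:
    """Absorb regions smaller than `min_size` pixels into an adjacent region.

    Removes the fragmentary noise regions left by the initial segmentation so
    that the moving foreground separates cleanly from the background.
    """
    out = labels.copy()
    ids, counts = np.unique(out, return_counts=True)
    small = set(ids[counts < min_size].tolist())
    h, w = out.shape
    for y in range(h):
        for x in range(w):
            if out[y, x] in small:
                # Adopt the label of the first large 4-neighbour, if any.
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and out[ny, nx] not in small:
                        out[y, x] = out[ny, nx]
                        break
    return out
```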
S1022: element labeling.
After the merging of step S1021, the image is divided into a number of meaningful regions; the pixels belonging to the same region are then given the same label. This completes the segmentation of the current frame image and determines the region to which each pixel belongs.
S1023: generate the static-model depth map of the current frame image.
The static-model depth map generated in this step is based on the following assumption: the top of a plane image is far from the viewer and the bottom is near, i.e., the top of the image is assigned the maximum distance and the bottom the minimum distance. All pixels belonging to the same segmented region belong to the interior of one object and are therefore given the same depth value. The depth value of each region varies with the region's vertical distance from the top of the image. The static-model depth map D_p of the current frame image is finally obtained as:
D_p = 255 × costY[i] / number_of_block[i],
where costY[i] is the sum of the Y coordinates of all pixels in region i, and number_of_block[i] is the number of pixels region i contains.
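The per-region computation of D_p can be sketched as follows; the division by the image height is added here so that the values fit an 8-bit range, a normalization the printed formula appears to leave implicit:

```python
import numpy as np

def static_depth_map(labels: np.ndarray) -> np.ndarray:
    """Assign every region the depth of its mean row position.

    Implements D_p = 255 * costY[i] / number_of_block[i]: costY[i] sums the
    Y coordinates of region i's pixels and number_of_block[i] counts them,
    so each region receives a value proportional to its mean distance from
    the top of the image (top = maximum distance, bottom = minimum).
    """
    h, w = labels.shape
    ys = np.repeat(np.arange(h), w).astype(np.float64)   # row index per pixel
    ids, inv, counts = np.unique(labels.ravel(),
                                 return_inverse=True, return_counts=True)
    cost_y = np.bincount(inv, weights=ys)                # costY per region
    depth = 255.0 * cost_y / (counts * max(h - 1, 1))    # normalized mean row
    return depth[inv].reshape(h, w).astype(np.uint8)
```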
S103: perform motion information analysis on the current frame and the reference frame based on the scene dynamic-model distribution. In step S103, the dynamic model of the scene is used to compute the motion vectors of the feature points in the current and reference frames and to analyze the relationship among the object motion component, the camera motion component, and the composite motion component. As shown in Fig. 3, the motion information analysis of the current and reference frames comprises the following steps:
S1031: feature point selection.
Every object has features of its own, for example sharp points, edge lines, or boundary curves; such features are called feature points, feature lines, feature curves, and so on. As long as a moving object stays within the observer's field of view, its features are reflected in the video image; in other words, the motion of an object can be observed and analyzed through the features of the moving object.
S1032: feature point tracking.
The feature points that appear in the current frame image and the reference frame image are tracked. A set of points is chosen on the moving object in the plane image; before the motion (at time t1) their coordinates are p_i(x_i, y_i), i = 1, 2, …; after the motion (at time t2) they move to p'_i(x'_i, y'_i), i = 1, 2, …. The two-dimensional displacement in the image plane is then d(x_i; t1, t2), i = 1, 2, ….
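The patent names no particular detector or tracker; the sketch below uses Shi-Tomasi corners and pyramidal Lucas-Kanade optical flow from OpenCV as one plausible realization of steps S1031 and S1032:

```python
import cv2
import numpy as np

def track_features(ref_gray: np.ndarray, curr_gray: np.ndarray):
    """Select feature points in the reference frame and track them into the
    current frame, returning matched pairs (p_i, p'_i) and displacements d."""
    p0 = cv2.goodFeaturesToTrack(ref_gray, maxCorners=500,
                                 qualityLevel=0.01, minDistance=7)
    if p0 is None:                                   # no trackable features
        empty = np.empty((0, 2), np.float32)
        return empty, empty, empty
    p1, status, _err = cv2.calcOpticalFlowPyrLK(ref_gray, curr_gray, p0, None)
    good = status.ravel() == 1                       # keep successful tracks
    p0, p1 = p0.reshape(-1, 2)[good], p1.reshape(-1, 2)[good]
    return p0, p1, p1 - p0                           # d = p'_i - p_i
```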
S1033: motion vector computation.
After the static-model segmentation of the preceding steps has divided the current frame image into a number of effective regions, the motion vector of each effective region is computed.
Let f_t and f_ref denote the current frame and the reference frame respectively. The n-th region of the current frame is denoted α_n, and the motion of a pixel x in α_n is d(x; β_n), where β_n is the motion vector of region α_n. In one embodiment of the invention, α_n may follow any of the following motion models: an affine model, a bilinear model, or a projective model.
The error function over region α_n is of the form E(β_n) = Σ_{x∈α_n} [f_t(x) - f_ref(x + d(x; β_n))]² (the exact formula survives only as an image in the source). Minimizing this error yields the motion vector β_n.
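Under the translational special case of the motion model (the patent also allows affine, bilinear, and projective models), the minimization of E(β_n) can be sketched as an exhaustive search:

```python
import numpy as np

def region_motion_vector(curr: np.ndarray, ref: np.ndarray,
                         mask: np.ndarray, radius: int = 7):
    """Estimate beta_n for one region alpha_n (given by `mask`) by minimizing
    the squared-difference error between the region in the current frame and
    its displaced counterpart in the reference frame.

    A pure translation d(x; beta_n) = beta_n is searched exhaustively over a
    (2*radius+1)^2 window; richer motion models would be fitted analogously.
    """
    ys, xs = np.nonzero(mask)
    h, w = curr.shape
    best_err, best_d = np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            ty, tx = ys + dy, xs + dx
            ok = (ty >= 0) & (ty < h) & (tx >= 0) & (tx < w)
            if not ok.any():
                continue
            diff = (curr[ys[ok], xs[ok]].astype(np.float64)
                    - ref[ty[ok], tx[ok]].astype(np.float64))
            err = np.mean(diff ** 2)                 # mean over valid pixels
            if err < best_err:
                best_err, best_d = err, (dy, dx)
    return best_d                                    # (dy, dx) = beta_n
```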
S1034: motion information analysis.
As shown in Fig. 4, the current frame image and the reference frame image are each divided into two parts: a central region (region I in Fig. 4) and a border region (region II in Fig. 4). Region I, the central field of the image, roughly reflects the main area occupied by the moving foreground objects in an image. Region II, the border of the image, comprises three bands along the top, left, and right of the image and represents the moving region of the camera shooting the moving objects: if the camera moves, the entire scene moves, so the border region mainly reflects camera motion. With image height H and image width W, the right boundary of region I is separated from the right boundary of region II by W/5, and the top boundary of region I is separated from the top boundary of region II by H/5.
Based on the motion vectors from step S1033, the central region and the border region are analyzed, yielding the following motion analysis result:
cost1 represents the object motion component (its defining formula survives only as an image in the source).
cost2 represents the camera motion component (likewise given as an image).
When cost2 exceeds the camera motion threshold, camera motion is present and both foreground and background move; otherwise the camera is static and the scene consists of a still background with moving foreground objects. The camera motion threshold is a preset value for judging whether the camera moves: cost2 above the threshold indicates camera motion; otherwise the camera is judged not to move.
cost3 represents the composite motion component (also given as an image).
A region of notable difference is a region whose histogram differs between the current frame and the reference frame by more than the preset composite motion threshold; when cost3 exceeds the composite motion threshold, disordered composite motion is present.
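The Fig. 4 partition, together with stand-in estimates for cost1 and cost2, can be sketched as follows; since the defining formulas survive only as images in the source, the mean normalized feature-point displacement per region is used here purely as an illustrative proxy:

```python
import numpy as np

def region_masks(h: int, w: int):
    """Boolean masks for the central region (I) and the border region (II).

    Region I is inset W/5 from the left and right edges and H/5 from the
    top, per the reading of Fig. 4; region II is everything else (the top,
    left, and right bands). The bottom edge carries no border band.
    """
    yy, xx = np.mgrid[0:h, 0:w]
    central = (yy >= h // 5) & (xx >= w // 5) & (xx < w - w // 5)
    return central, ~central

def motion_costs(points: np.ndarray, disp: np.ndarray,
                 central: np.ndarray, border: np.ndarray):
    """Illustrative proxies for cost1 (object motion) and cost2 (camera motion).

    points: (N, 2) array of (x, y) feature locations in the current frame;
    disp:   (N, 2) array of their tracked displacements.
    """
    h, w = central.shape
    mag = np.linalg.norm(disp, axis=1) / np.hypot(h, w)  # diagonal-normalized
    xs = points[:, 0].astype(int).clip(0, w - 1)
    ys = points[:, 1].astype(int).clip(0, h - 1)
    in_central, in_border = central[ys, xs], border[ys, xs]
    cost1 = float(mag[in_central].mean()) if in_central.any() else 0.0
    cost2 = float(mag[in_border].mean()) if in_border.any() else 0.0
    return cost1, cost2
```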
S104: generate the dynamic-model depth map of the current frame image.
Using the image segmentation result obtained in the preceding steps together with the dynamic model, the dynamic-model depth map D_m is obtained as:
D_m = d_cost1 × cost1 + d_cost2 × (1 - cost2) + d_cost3 × (1 - cost3),
where d_cost1 is the motion parallax value of the feature points in region I, d_cost2 is the motion parallax value of the feature points in region II, and d_cost3 is the motion parallax value of the feature points in regions of violent change.
The d_cost2 × (1 - cost2) term is present only when cost2 exceeds the set camera motion threshold, and is zero otherwise; likewise, the d_cost3 × (1 - cost3) term is present only when cost3 exceeds the set composite motion threshold, and is zero otherwise.
When computing the motion-model depth map, the invention jointly analyzes the object motion, camera motion, and composite motion components and adjusts the proportion of each part dynamically, thereby obtaining a reasonable dynamic-model depth map.
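Combining the formula and the threshold gating above gives the following sketch; the threshold values are illustrative, as the patent gives no numbers:

```python
def dynamic_depth(d1: float, d2: float, d3: float,
                  cost1: float, cost2: float, cost3: float,
                  cam_thresh: float = 0.2, comp_thresh: float = 0.3) -> float:
    """D_m = d_cost1*cost1 + d_cost2*(1-cost2) + d_cost3*(1-cost3).

    d1..d3 are the motion parallax values of the region-I, region-II, and
    high-change feature points; the second and third terms exist only when
    their cost exceeds its threshold, exactly as the text prescribes.
    """
    dm = d1 * cost1
    if cost2 > cam_thresh:                 # camera motion present
        dm += d2 * (1.0 - cost2)
    if cost3 > comp_thresh:                # disordered composite motion present
        dm += d3 * (1.0 - cost3)
    return dm
```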
S105: perform depth fusion of the static-model depth map and the dynamic-model depth map.
The static-model depth map D_p and the dynamic-model depth map D_m obtained in the preceding steps are fused according to their respective weights to obtain the final depth map.
First, the weight w1 of the static-model depth map of the current frame image and the weight w2 of the dynamic-model depth map of the current frame image are set; the depth-fused map D_depth of the current frame image is then computed as:
D_depth = w1 × D_p + w2 × D_m,

where w1 + w2 = 1.
Using the segmentation result of the image obtained in the steps above and referring to the scene-motion measure (whose formula survives only as an image in the source), the static-model depth map and the dynamic-model depth map are depth-fused as follows: when the measure is greater than the preset movement threshold, the image is a dynamic scene, so w2 is increased and w1 decreased; when the measure is less than the preset movement threshold but greater than zero, the image is a static scene, so w2 is decreased and w1 increased; and when the measure equals zero, the scene is completely still, so w2 = 0 and w1 = 1.
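This adaptive weighting can be sketched as follows; the concrete weight values are illustrative, since the text only prescribes the direction of adjustment, and `motion` stands for the scene-motion measure whose formula survives only as an image in the source:

```python
import numpy as np

def fuse_depth(d_static: np.ndarray, d_dynamic: np.ndarray, motion: float,
               move_thresh: float = 0.1) -> np.ndarray:
    """Adaptive fusion D_depth = w1 * D_p + w2 * D_m with w1 + w2 = 1."""
    if motion == 0.0:
        w2 = 0.0          # completely still scene: static model only
    elif motion > move_thresh:
        w2 = 0.7          # dynamic scene: favour the dynamic model
    else:
        w2 = 0.3          # mostly static scene: favour the static model
    w1 = 1.0 - w2
    fused = w1 * d_static.astype(np.float64) + w2 * d_dynamic.astype(np.float64)
    return fused.astype(np.uint8)
```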
The depth map generation method according to this embodiment of the invention thus fuses a static model and a dynamic model to generate a depth map suitable for both moving and static scenes, thereby producing a more lifelike stereoscopic video and a better stereoscopic effect.
The depth map generation device 500 for plane video three-dimensional conversion according to an embodiment of the invention is described below with reference to Fig. 5.
As shown in Fig. 5, the device 500 comprises a static-model depth map generation module 510, a motion analysis module 520, a dynamic-model depth map generation module 530, and a depth fusion module 540. The static-model depth map generation module 510 and the motion analysis module 520 are each connected to the dynamic-model depth map generation module 530, and the static-model depth map generation module 510 and the dynamic-model depth map generation module 530 are each connected to the depth fusion module 540.
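The module wiring of Fig. 5 can be pictured as a simple composition; the callables below are stand-ins for the sketches given earlier in this description, not the patented implementation itself:

```python
class DepthMapGenerator:
    """Four modules wired as in Fig. 5: segmentation feeds both depth models,
    motion analysis feeds the dynamic model, and fusion consumes both maps."""

    def __init__(self, segment, analyze_motion, dynamic_depth, fuse):
        self.segment = segment                # static-model depth map module 510
        self.analyze_motion = analyze_motion  # motion analysis module 520
        self.dynamic_depth = dynamic_depth    # dynamic-model depth map module 530
        self.fuse = fuse                      # depth fusion module 540

    def __call__(self, curr_frame, ref_frame):
        labels, d_static = self.segment(curr_frame)
        motion = self.analyze_motion(curr_frame, ref_frame, labels)
        d_dynamic = self.dynamic_depth(labels, motion)
        return self.fuse(d_static, d_dynamic, motion)
```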
After the current frame image is loaded into memory, the static-model depth map generation module 510 first converts it to grayscale, mapping the 24-bit three-channel RGB color space of each pixel to a single-channel 8-bit color space and thereby reducing the amount of data to process in subsequent steps; module 510 then clusters the grayscale image. As before, the grayscale step is optional: module 510 can also cluster the current frame image directly.
Treating segmentation as an element clustering problem, module 510 groups the basic elements of the current frame image (for example, its pixels) or other components into chunks with similar composition, i.e., it clusters the current frame image.
Depending on the data structure used to process the image during segmentation, module 510 can cluster in one of two ways:
1) In vector space: each element of the current frame image is treated as an independent vector; part of the feature components of a pixel serve as its feature vector, and the feature vectors are grouped by a central clustering algorithm to build the graph topology. The global features obtained this way are very effective, but because the method does not exploit the spatial structure of the vectors, it cannot preserve much of the image's detail and often merges actually disconnected regions into the same group.
2) In the spatial domain: the spatial relationships and connected boundary information of the elements are exploited by processing the current frame image in the spatial domain and converting it into a two-dimensional graph topology, which preserves the image's detail.
Preferably, module 510 uses the second approach to cluster the current frame image.
Module 510 converts the current frame image into a graph topology G(V, E), where V is the set of image nodes and E is the set of weighted edges between adjacent nodes. A node V_i may be a single pixel, or a small subregion with uniform shape, similar texture, or other shared feature structure; for example, the image may be divided into 5 × 5 blocks, each block serving as one basic node. In one embodiment of the invention, a color vector is used as the feature vector: in feature space, X_i is the RGB color vector of node i, and each edge carries a positive weight W_ij that reflects the similarity between nodes i and j. The invention measures W_ij by the Euclidean distance between the two nodes:
W_ij = ||X_i - X_j||,

total = Σ W_ij,

where n is the number of nodes in the graph topology and total is the sum of the edge weights W_ij over all edges.
All the nodes of the image are linked together so that the total edge weight total is the minimum cost.
Module 510 then segments the current frame image into a number of effective regions according to the graph topology of the current frame image; concretely, the graph topology serves as the static model for the segmentation. In one embodiment of the invention, module 510 applies the minimum spanning tree method to the graph topology of the current frame image to obtain the segmentation result: the minimum spanning tree preserves the connectivity information of every node and connects all nodes at minimum boundary cost, so the graph topology can be clustered and the image divided into a number of effective regions, i.e., a number of meaningful image regions.
Building the minimum spanning tree is also a process of clustering the graph topology. Because the texture in an image can be very rich, the initial segmentation may be overly fragmented, so module 510 needs to optimize the preliminary segmentation.
Module 510 first sets the minimum number of subregions that each effective region must contain, an effective region being regarded as a larger-scope region; it then merges the effective regions according to that setting. Merging the segmented regions effectively removes noise regions and broadly separates the moving foreground objects from the background.
Having divided the image into a number of meaningful regions, module 510 labels the pixels belonging to the same region with the same label, completing the segmentation of the current frame image and determining the region to which each pixel belongs.
The static-model depth map generated by module 510 is based on the following assumption: the top of a plane image is far from the viewer and the bottom is near, i.e., the top of the image is assigned the maximum distance and the bottom the minimum distance. All pixels belonging to the same segmented region belong to the interior of one object and are therefore given the same depth value, and the depth value of each region varies with the region's vertical distance from the top of the image. The static-model depth map D_p of the current frame image is finally obtained as:
D_p = 255 × costY[i] / number_of_block[i],
where costY[i] is the sum of the Y coordinates of all pixels in region i, and number_of_block[i] is the number of pixels region i contains.
The motion analysis module 520 performs motion information analysis on the current frame and the reference frame based on the scene dynamic-model distribution: using the dynamic model of the scene, it computes the motion vectors of the feature points in the two frames and analyzes the relationship among the object motion component, the camera motion component, and the composite motion component.
First, module 520 selects feature points. Every object has features of its own, for example sharp points, edge lines, or boundary curves, called feature points, feature lines, feature curves, and so on. As long as a moving object stays within the observer's field of view, its features are reflected in the video image; in other words, the motion of an object can be observed and analyzed through the features of the moving object.
Module 520 then tracks the feature points that appear in the current frame image and the reference frame image. A set of points is chosen on the moving object in the plane image; before the motion (at time t1) their coordinates are p_i(x_i, y_i), i = 1, 2, …; after the motion (at time t2) they move to p'_i(x'_i, y'_i), i = 1, 2, …; the two-dimensional displacement in the image plane is then d(x_i; t1, t2), i = 1, 2, ….
In the embodiments above, once module 510 has segmented the current frame image into effective regions using the static model, module 520 computes the motion vector of each effective region.
Let f_t and f_ref denote the current frame and the reference frame respectively. The n-th region of the current frame is denoted α_n, and the motion of a pixel x in α_n is d(x; β_n), where β_n is the motion vector of region α_n. In one embodiment of the invention, the motion model of α_n may be any of the following: an affine model, a bilinear model, or a projective model.
Module 520 defines an error function over region α_n, of the form E(β_n) = Σ_{x∈α_n} [f_t(x) - f_ref(x + d(x; β_n))]² (the exact formula survives only as an image in the source); minimizing this error yields the motion vector β_n.
Module 520 divides the current frame image and the reference frame image into two parts: a central region (region I in Fig. 4) and a border region (region II in Fig. 4). Region I, the central field of the image, roughly reflects the main area occupied by the moving foreground objects; region II, the border of the image, comprises three bands along the top, left, and right and represents the moving region of the camera shooting the moving objects: if the camera moves, the whole scene moves, so the border region mainly reflects camera motion. With image height H and image width W, the right boundary of region I is separated from the right boundary of region II by W/5, and the top boundary of region I is separated from the top boundary of region II by H/5.
From the motion vectors obtained, the central region and the border region are analyzed, yielding the following motion analysis result:
cost1 represents the object motion component (its defining formula survives only as an image in the source).
cost2 represents the camera motion component (likewise given as an image). When cost2 exceeds the camera motion threshold, camera motion is present and both foreground and background move; otherwise the camera is static and the scene consists of a still background with moving foreground objects. The camera motion threshold is a preset value for judging whether the camera moves: cost2 above the threshold indicates camera motion; otherwise the camera is judged not to move.
cost3 represents the composite motion component (also given as an image). A region of notable difference is a region whose histogram differs between the current frame and the reference frame by more than the preset composite motion threshold; when cost3 exceeds that threshold, disordered composite motion is present.
The dynamic-model depth map generation module 530 combines the image segmentation result obtained above with the dynamic model to obtain the dynamic-model depth map D_m:
D_m = d_cost1 × cost1 + d_cost2 × (1 - cost2) + d_cost3 × (1 - cost3),
where d_cost1 is the motion parallax value of the region-I feature points, d_cost2 is the motion parallax value of the region-II feature points, and d_cost3 is the motion parallax value of the feature points in regions of violent change.
The d_cost2 × (1 - cost2) term is present only when cost2 exceeds the set camera motion threshold, and is zero otherwise; likewise, the d_cost3 × (1 - cost3) term is present only when cost3 exceeds the set composite motion threshold, and is zero otherwise.
When computing the motion-model depth map, the invention jointly analyzes the object motion, camera motion, and composite motion components and adjusts the proportion of each part dynamically, yielding a reasonable dynamic-model depth map.
The depth fusion module 540 fuses the static-model depth map D_p and the dynamic-model depth map D_m obtained above according to their respective weights to obtain the final fused depth map.
First, module 540 sets the weight w1 of the static-model depth map of the current frame image and the weight w2 of the dynamic-model depth map of the current frame image, then computes the depth-fused map D_depth of the current frame image:
D_depth = w1 × D_p + w2 × D_m,

where w1 + w2 = 1.
Module 540 uses the image segmentation result obtained above and refers to the scene-motion measure (whose formula survives only as an image in the source) to depth-fuse the static-model and dynamic-model depth maps: when the measure exceeds the preset movement threshold, the image is a dynamic scene, so w2 is increased and w1 decreased; when the measure is below the preset movement threshold but greater than zero, the image is a static scene, so w2 is decreased and w1 increased; and when the measure equals zero, the scene is completely still, so w2 = 0 and w1 = 1.
The depth map generation device according to this embodiment of the invention fuses a static model and a dynamic model to generate a depth map suitable for both moving and static scenes, thereby producing a more lifelike stereoscopic video and a better stereoscopic effect.
In this specification, reference to the terms "an embodiment", "some embodiments", "an example", "a specific example", or "some examples" means that a particular feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. Schematic uses of these terms do not necessarily refer to the same embodiment or example, and the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present invention have been shown and described, persons of ordinary skill in the art will appreciate that various changes, modifications, substitutions, and variations can be made to these embodiments without departing from the principles and spirit of the invention; the scope of the invention is defined by the claims and their equivalents.

Claims (17)

1. A depth map generation method for plane video three-dimensional conversion, characterized by comprising the steps of:
performing, on a current frame image, image segmentation based on a scene static-model distribution, comprising constructing a graph topology from the feature vectors of the pixels of the current frame image, segmenting the current frame image according to the graph topology to obtain segmented regions, and labeling the pixels belonging to the same segmented region to generate a static-model depth map of the current frame image, wherein the static-model depth map D_p of the current frame image is

D_p = 255 × costY[i] / number_of_block[i],

where costY[i] is the sum of the Y coordinates of all pixels of segmented region i, and number_of_block[i] is the number of pixels segmented region i contains;
selecting feature points of the current frame image and a reference frame image, tracking the feature points, computing the motion vectors of the moving objects in the current frame image and the reference frame image, and performing motion analysis on the motion vectors to obtain a motion analysis result of the current frame image and the reference frame image;
generating a dynamic-model depth map of the current frame image according to the segmented regions of the current frame image and the motion analysis result of the current frame image and the reference frame image; and
performing adaptive depth fusion of the static-model depth map and the dynamic-model depth map of the current frame image.
2. The depth map generation method of claim 1, characterized in that, before constructing the graph topology, the method further comprises:
converting the current frame image to grayscale, comprising converting the 24-bit three-channel RGB color space of the pixels of the current frame image into a single-channel 8-bit color space.
3. The depth map generation method of claim 1, characterized in that constructing the graph topology of the current frame image comprises the steps of:
treating each element of the current frame image as an independent vector, and constructing the graph topology of the current frame image with part of the feature components of the pixels of the current frame image as feature vectors.
4. The depth map generation method of claim 1, characterized in that constructing the graph topology of the current frame image comprises the steps of:
using the spatial relationships and connected boundary information of the elements of the current frame image, processing the current frame image in the spatial domain, and converting the current frame image into a two-dimensional graph topology.
5. The depth map generation method of claim 1, characterized in that segmenting the current frame image comprises the steps of:
applying the minimum spanning tree method to the graph topology of the current frame image to divide the current frame image into a plurality of effective regions.
6. The depth map generation method of claim 5, characterized in that, before labeling the pixels belonging to the same region of the plurality of effective regions, the method further comprises the steps of:
setting the minimum number of subregions each effective region must contain; and
merging the plurality of effective regions according to the set number of subregions.
7. The depth map generation method of claim 1, characterized in that performing motion analysis on the motion vectors comprises the steps of:
dividing the current frame image and the reference frame image into a central region and a border region, wherein the central region is the central field of the image and represents the area occupied by the moving foreground objects, and the border region comprises the top, left, and right border bands of the image and represents the moving region of the camera shooting the moving objects; and
analyzing the central region and the border region to obtain the motion analysis result.
8. The depth map generation method of claim 7, characterized in that the motion analysis result comprises a moving-object motion component, a camera motion component, and a composite motion component,
wherein the moving-object motion component cost1, the camera motion component cost2, and the composite motion component cost3 are each defined by a formula that survives only as an image in the source document,
and wherein a region of notable difference is a region whose histogram differs between the current frame image and the reference frame image by more than a preset composite motion threshold.
9. The depth map generation method of claim 1, characterized in that performing depth fusion of the static-model depth map and the dynamic-model depth map of the current frame image comprises the steps of:
setting the weight of the static-model depth map of the current frame image and the weight of the dynamic-model depth map of the current frame image, and computing the depth-fused map D_depth of the current frame image as:

D_depth = w1 × D_p + w2 × D_m,

where D_p is the static-model depth map of the current frame image, D_m is the dynamic-model depth map of the current frame image, w1 is the weight of the static-model depth map, w2 is the weight of the dynamic-model depth map, and w1 and w2 satisfy w1 + w2 = 1.
10. A depth map generation device for plane video three-dimensional conversion, characterized by comprising:
a static-model depth map generation module that constructs a graph topology of the current frame image from the feature vectors of its pixels, segments the current frame image according to the graph topology to obtain segmented regions, and labels the pixels belonging to the same segmented region to generate a static-model depth map of the current frame image, wherein the static-model depth map D_p of the current frame image is

D_p = 255 × costY[i] / number_of_block[i],

where costY[i] is the sum of the Y coordinates of all pixels of segmented region i, and number_of_block[i] is the number of pixels segmented region i contains;
a motion analysis module that selects feature points of the current frame image and a reference frame image, tracks the feature points, computes the motion vectors of the moving objects in the current frame image and the reference frame image, and performs motion analysis on the motion vectors to obtain a motion analysis result of the current frame image and the reference frame image;
a dynamic-model depth map generation module, connected to the static-model depth map generation module and the motion analysis module, for generating a dynamic-model depth map of the current frame image according to the segmented regions of the current frame image and the motion analysis result of the current frame image and the reference frame image; and
a depth fusion module, connected to the static-model depth map generation module and the dynamic-model depth map generation module, for performing adaptive depth fusion of the static-model depth map and the dynamic-model depth map of the current frame image.
11. The depth map generation device of claim 10, characterized in that, before constructing the graph topology, the static-model depth map generation module converts the current frame image to grayscale, comprising converting the 24-bit three-channel RGB color space of the pixels of the current frame image into a single-channel 8-bit color space.
12. The depth map generation device of claim 10, characterized in that the static-model depth map generation module segments the image graph topology in one of the following two ways:
1) treating each element of the current frame image as an independent vector, taking part of the feature components of the pixels of the current frame image as feature vectors, and grouping the feature vectors with a central clustering algorithm to segment the image graph topology; or
2) using the spatial relationships and connected boundary information of the elements of the current frame image, processing the current frame image in the spatial domain, converting the current frame image into a two-dimensional graph topology, and segmenting the image based on that graph topology.
13. The depth map generation device of claim 10, characterized in that the static-model depth map generation module applies the minimum spanning tree method to the graph topology of the current frame image to divide the current frame image into a plurality of effective regions.
14. The depth map generation device of claim 13, characterized in that the static-model depth map generation module sets the minimum number of subregions each effective region must contain and merges the plurality of effective regions according to the set number of subregions to generate the static-model depth map of the current frame image.
15. The depth map generation device of claim 10, characterized in that the dynamic-model depth map generation module divides the current frame image and the reference frame image into a central region and a border region and analyzes the central region and the border region to obtain the motion analysis result, wherein the central region is the central field of the image and represents the area occupied by the moving foreground objects, and the border region comprises the top, left, and right border bands of the image and represents the moving region of the camera shooting the moving objects.
16. The depth map generation device of claim 15, characterized in that the motion analysis result comprises a moving-object motion component, a camera motion component, and a composite motion component,
wherein the moving-object motion component cost1, the camera motion component cost2, and the composite motion component cost3 are each defined by a formula that survives only as an image in the source document,
and wherein a region of notable difference is a region whose histogram differs between the current frame image and the reference frame image by more than a composite motion threshold.
17. The depth map generation device of claim 10, characterized in that the depth fusion module sets the weight of the static-model depth map of the current frame image and the weight of the dynamic-model depth map of the current frame image, and computes the depth-fused map D_depth of the current frame image as:

D_depth = w1 × D_p + w2 × D_m,

where D_p is the static-model depth map of the current frame image, D_m is the dynamic-model depth map of the current frame image, w1 is the weight of the static-model depth map, w2 is the weight of the dynamic-model depth map, and w1 and w2 satisfy w1 + w2 = 1.
CN 201110223804, filed 2011-08-05: Depth map generation method and device for plane video three-dimensional conversion. Granted as CN102263979B (Active).

Priority Applications (1)

Application Number: CN 201110223804 (CN102263979B)
Priority Date / Filing Date: 2011-08-05
Title: Depth map generation method and device for plane video three-dimensional conversion

Publications (2)

Publication Number - Publication Date
CN102263979A - 2011-11-30
CN102263979B - 2013-10-09

Family

ID=45010407

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201110223804 Active CN102263979B (en) 2011-08-05 2011-08-05 Depth map generation method and device for plane video three-dimensional conversion

Country Status (1)

Country Link
CN (1) CN102263979B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102609950B (en) * 2012-03-14 2014-04-02 浙江理工大学 Two-dimensional video depth map generation process
US9098911B2 (en) 2012-11-01 2015-08-04 Google Inc. Depth map generation from a monoscopic image based on combined depth cues
CN102999892B (en) * 2012-12-03 2015-08-12 东华大学 Based on the depth image of region mask and the intelligent method for fusing of RGB image
US9299152B2 (en) * 2012-12-20 2016-03-29 Hong Kong Applied Science And Technology Research Institute Co., Ltd. Systems and methods for image depth map generation
CN103248906B (en) * 2013-04-17 2015-02-18 清华大学深圳研究生院 Method and system for acquiring depth map of binocular stereo video sequence
CN103200417B (en) * 2013-04-23 2015-04-29 华录出版传媒有限公司 2D (Two Dimensional) to 3D (Three Dimensional) conversion method
CN103945211A (en) * 2014-03-13 2014-07-23 华中科技大学 Method for generating depth map sequence through single-visual-angle color image sequence
CN107509043B (en) * 2017-09-11 2020-06-05 Oppo广东移动通信有限公司 Image processing method, image processing apparatus, electronic apparatus, and computer-readable storage medium
CN107742296A (en) * 2017-09-11 2018-02-27 广东欧珀移动通信有限公司 Dynamic image generation method and electronic installation
CN108924408B (en) * 2018-06-15 2020-11-03 深圳奥比中光科技有限公司 Depth imaging method and system
CN110602479A (en) * 2019-09-11 2019-12-20 海林电脑科技(深圳)有限公司 Video conversion method and system
CN112954293B (en) * 2021-01-27 2023-03-24 北京达佳互联信息技术有限公司 Depth map acquisition method, reference frame generation method, encoding and decoding method and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101640809A (en) * 2009-08-17 2010-02-03 浙江大学 Depth extraction method of merging motion information and geometric information
CN101765022A (en) * 2010-01-22 2010-06-30 浙江大学 Depth representing method based on light stream and image segmentation
CN101785025A (en) * 2007-07-12 2010-07-21 汤姆森特许公司 System and method for three-dimensional object reconstruction from two-dimensional images


Also Published As

Publication number Publication date
CN102263979A (en) 2011-11-30


Legal Events

Date - Code - Title - Description
C06 - Publication
PB01 - Publication
C10 - Entry into substantive examination
SE01 - Entry into force of request for substantive examination
C14 - Grant of patent or utility model
GR01 - Patent grant