CN101640809B - Depth extraction method of merging motion information and geometric information - Google Patents

Depth extraction method of merging motion information and geometric information

Info

Publication number
CN101640809B
CN101640809B · CN2009101021536A · CN200910102153A
Authority
CN
China
Prior art keywords
depth
background
motion
image
foreground
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN2009101021536A
Other languages
Chinese (zh)
Other versions
CN101640809A (en)
Inventor
黄晓军
黄俊钧
王梁昊
李东晓
张明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wan D display technology (Shenzhen) Co., Ltd.
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN2009101021536A priority Critical patent/CN101640809B/en
Publication of CN101640809A publication Critical patent/CN101640809A/en
Application granted granted Critical
Publication of CN101640809B publication Critical patent/CN101640809B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Abstract

The invention discloses a depth extraction method fusing motion information and geometric information, which comprises the following steps: (1) carrying out scene segmentation on each frame of the two-dimensional video and separating the static background from the dynamic foreground; (2) binarizing and filtering the scene segmentation map; (3) generating a geometric depth map of the static background from geometric information; (4) calculating the motion vectors of the foreground objects and converting them into motion amplitudes; (5) linearly transforming the motion amplitude of each foreground object according to its position to obtain a motion depth map; and (6) fusing the motion depth map with the geometric depth map and filtering to obtain the final depth map. The method computes motion vectors only for the separated dynamic foreground objects, thereby eliminating mismatched points in the background and reducing the amount of computation. Meanwhile, the motion amplitude of each foreground object is linearly transformed according to its position and merged into the background depth, improving the quality of the depth map as a whole.

Description

A depth extraction method fusing motion information and geometric information
Technical field
The present invention relates to depth extraction methods for converting two-dimensional video into three-dimensional video, and in particular to a depth extraction method that fuses motion information and geometric information.
Background technology
Since the invention of television in the 1940s, the medium has undergone a first revolution from black-and-white to color television and a second revolution from analog to digital television; 3DTV will be the third revolution in television technology after digital television.
3DTV display is inseparable from the production of suitable content. Conventional two-dimensional video sources cannot be used directly by stereoscopic display systems, so 3D content that meets display requirements must be produced. One approach is to shoot 3D video directly with a stereo camera, but this is expensive. Another is to find a suitable algorithm that converts the original 2D video into 3D video usable for stereoscopic display. Given the vast amount of existing 2D video, research on 2D-to-3D conversion has great practical significance: it can provide abundant material for stereoscopic display while greatly reducing the cost of content production.
The 2D-to-3D conversion process mainly comprises:
(1) using various depth cues to generate a dense depth map sequence from the original 2D video;
(2) using DIBR (depth-image-based rendering) to reconstruct multiple channels of 2D video from one channel of 2D video and its corresponding depth map sequence;
(3) using an image composition algorithm to synthesize the generated multi-channel video into one channel of 3D video for input to a 3D display device.
Accurately generating dense depth maps for DIBR reconstruction is the first step of the entire 2D-to-3D pipeline and the core subject of 2D-to-3D research.
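For concreteness, a minimal sketch of the pixel-shift warping at the heart of the DIBR step (2) is given below; the linear depth-to-disparity scaling, the 255-means-nearest convention, and all names are assumptions of this illustration rather than details fixed by the method described here.

```python
# Minimal DIBR-style warp: shift each pixel horizontally in proportion to
# its depth value. Disocclusion holes are left unfilled in this sketch.
import numpy as np

def dibr_shift(view: np.ndarray, depth: np.ndarray, max_disp: int = 16) -> np.ndarray:
    h, w = depth.shape
    virtual = np.zeros_like(view)
    # assumed mapping: depth 255 (nearest) -> max_disp pixels of parallax
    disp = (depth.astype(np.float32) / 255.0 * max_disp).astype(int)
    for y in range(h):
        for x in range(w):
            nx = x + disp[y, x]
            if 0 <= nx < w:
                virtual[y, nx] = view[y, x]
    return virtual
```

In practice DIBR must also fill the disocclusion holes left by the warp, which is one reason the depth map fed to it should be smooth (see the Gaussian filtering in step (6) below).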
On the basis of analyzing human physiological and psychological vision, researchers have identified many depth cues: object motion, scene spatial geometry, surface texture and shape, imaging shadows, object edges, and the focus/defocus state of the capturing camera. Any single cue can handle only one class of 2D video; no universally applicable depth cue exists. However, since most videos contain moving foreground objects, object motion information combined with the temporal correlation of video can effectively produce dense depth maps, so object motion generally serves as the principal depth cue in depth map generation. On the other hand, in certain situations the objects in the scene do not move, and other depth cues must substitute for the motion cue. Indoors or outdoors, scenes usually contain rich geometric information, such as an outdoor road extending into the distance or the regular wall lines of a room. Scene geometric information complements object motion information well in depth map generation, especially when the scene is static.
Most current depth map generation algorithms use a single depth cue to handle a single class of video, while real video often contains multiple depth cues; fusing these cues into a single depth map effectively improves depth map quality.
Summary of the invention
The object of the present invention is to overcome the deficiencies of the prior art by fusing multiple depth cues, providing a depth extraction method that fuses motion information and geometric information.
The depth extraction method fusing motion information and geometric information comprises the following steps:
(1) establishing a statistical background model for the two-dimensional video to be converted and separating the static background part from the moving foreground part by moving-object detection;
(2) binarizing the image separated into static background and moving foreground parts, and applying median filtering and mathematical morphology filtering;
(3) generating a geometric depth map from the background image obtained by statistical background modeling, using a method based on scene geometric information;
(4) obtaining the motion vectors of the moving foreground objects by searching for matches between temporally adjacent frames of the original two-dimensional video, and converting them into motion amplitudes;
(5) linearly transforming the motion amplitude of the moving foreground part according to its position to obtain a motion depth map;
(6) fusing the motion depth map with the geometric depth map and applying Gaussian filtering to obtain the final depth map, used for representing the three-dimensional video.
The step of establishing a statistical background model for the two-dimensional video to be converted and separating the static background part from the moving foreground part by moving-object detection:
(a) Take N consecutive frames from the two-dimensional video file to be converted and scan these N frames I_f(x, y) in time and space; the mean of the N pixel values at each pixel coordinate is taken as the value of the background image B(x, y) at that position, computed as:

B(x, y) = \frac{1}{N} \sum_{f=1}^{N} I_f(x, y);

(b) Subtract the background image B(x, y) obtained in step (a) from each frame I_f(x, y) of the two-dimensional video to be converted, and determine the foreground points of the image by comparison with a preset threshold th, expressed as:

I_f(x, y) = \begin{cases} \text{Background}, & |I_f(x, y) - B(x, y)| < th \\ \text{Foreground}, & \text{otherwise}. \end{cases}
The step of binarizing the image separated into static background and moving foreground parts and applying median filtering and mathematical morphology filtering:
(c) Using the foreground/background decision made in step (b), produce a binary image for each frame I_f(x, y) of the video, in which the value 0 represents background and the value 255 represents foreground, that is:

I_f^1(x, y) = \begin{cases} 0, & \text{if } I_f(x, y) \text{ is Background} \\ 255, & \text{if } I_f(x, y) \text{ is Foreground}; \end{cases}
(d) Apply median filtering with a 3 × 3 window to the binary image obtained in step (c) to eliminate background noise;
(e) Apply the classical opening and closing operations of mathematical morphology filtering to the median-filtered image obtained in step (d) to eliminate small active regions and holes in the foreground image; a 3 × 3 square structuring element is used for both the erosion and the dilation that make up the opening and closing operations.
The step of generating a geometric depth map from the background image obtained by statistical background modeling, using a method based on scene geometric information:
(f) Apply edge detection with the Sobel operator to the luminance component B_y(x, y) of the background image obtained in step (a) to obtain a horizontal gradient map S_x(x, y) and a vertical gradient map S_y(x, y); add the two maps to obtain the gradient map S(x, y) and compare it with a threshold Th to obtain the binarized background edge map, where Th is chosen according to:

Th = \alpha \left[ S(x, y)_{max} - S(x, y)_{min} \right] + S(x, y)_{min}

where α is a weight coefficient taking a value between 0 and 1, S(x, y)_{max} is the maximum pixel value of the gradient map, and S(x, y)_{min} is the minimum pixel value of the gradient map;
(g) Apply the classical Hough transform of image processing to the binarized edge map obtained in step (f) to extract its principal straight lines; AND the result with the original binarized edge map to extract the vanishing lines of the background; the midpoint of the region where the intersections of the vanishing lines occur with the greatest frequency is taken as the vanishing point;
(h) Take the vanishing point obtained in step (g) as the deepest point of the background; along the vanishing lines, the depth deepens gradually toward the vanishing point in arithmetic steps of 2, yielding the background geometric depth map G(x, y).
The step of obtaining the motion vectors of the moving foreground objects by searching for matches between temporally adjacent frames of the original two-dimensional video and converting them into motion amplitudes:
(i) Scan the picture frame I_f(x, y) at the current time in the two-dimensional video to be converted; according to the filtered foreground/background separation binary image obtained in step (e), if the current pixel is judged to be a foreground point, search for its best-matching pixel on the frame I_{f-1}(x, y) of the previous time instant. To improve matching accuracy, the matching cost is computed over a W × W neighborhood window centered on the current pixel. Let the match search range S_{N×N} have size N × N; let u and v be the horizontal and vertical offsets of the pixel on the previous frame relative to the pixel on the current frame during the search; and let i and j be the horizontal and vertical offsets within the W × W neighborhood centered on the current pixel. The matching cost is defined as:

C(x, y; u, v) = \sum_{i=-(W-1)/2}^{(W-1)/2} \sum_{j=-(W-1)/2}^{(W-1)/2} \left| I_f(x+i, y+j) - I_{f-1}(x+u+i, y+v+j) \right|

(x, y) ∈ foreground, (u, v) ∈ S_{N×N}, f = 1, 2, 3, …

Traverse every pixel in the search range of the current pixel and compute the corresponding matching cost; the horizontal offset and vertical offset with the smallest matching cost are taken as the motion vector of the current pixel, with horizontal component MV_x and vertical component MV_y, formulated as:

C_{min}(x, y; MV_x, MV_y) = \min_{(u, v)} C(x, y; u, v);
(j) Let the horizontal component of the motion vector of each foreground pixel obtained in step (i) be MV_x(x, y) and the vertical component be MV_y(x, y); the motion amplitude is defined as:

F(x, y) = \sqrt{MV_x^2(x, y) + MV_y^2(x, y)};
The step of linearly transforming the motion amplitude of the moving foreground part according to its position to obtain the motion depth map:
(k) Apply a linear transformation and a floor operation to the motion amplitude obtained in step (j), so that the value of each foreground pixel of the motion depth map is an integer in the range [a, b];
here the lower bound a and the upper bound b of the linear transformation both take values between 0 and 255: a takes the minimum depth value of the geometric depth map over the background part occluded by the moving foreground, and b takes the depth value of the geometric depth map of the background part corresponding to the lowest point of the moving foreground part.
The step of fusing the motion depth map with the geometric depth map and applying Gaussian filtering to obtain the final depth map:
(l) According to the filtered foreground/background separation binary image A(x, y) obtained in step (e), fuse the motion depth map M(x, y) obtained in step (k) with the geometric depth map G(x, y) obtained in step (h) to obtain the fused depth map D(x, y), with the fusion formula defined as:

D(x, y) = \begin{cases} M(x, y), & A(x, y) = 255 \\ G(x, y), & A(x, y) = 0; \end{cases}
(m) Apply Gaussian filtering to the fused depth map obtained in step (l) to obtain the final depth map, used for representing the three-dimensional video.
The present invention is applicable to generating depth maps for uncompressed video files with a dynamic foreground and a static background. Among previous methods that generate depth maps from object motion information, block-based methods produce blocking artifacts at object edges in the resulting depth maps, while pixel-based methods yield smoother object edges than block-based ones but suffer strong background noise from even slight changes in illumination. By combining moving-object detection with pixel-level depth map generation based on object motion information, the present invention smooths object edges well while suppressing background noise, and computes motion vectors only for the moving foreground objects, significantly reducing the amount of computation. Moreover, the invention overcomes the limitation of conventional techniques that generate depth maps from a single depth cue: by fusing motion information and scene geometric information, it enlarges the scope of application and improves the quality of the depth map.
Description of drawings
Fig. 1 is the flow chart of fused depth map generation;
Fig. 2(a) is a schematic diagram of a point in image space for the Hough transform;
Fig. 2(b) is a schematic diagram of the corresponding sinusoid in parameter space for the Hough transform;
Fig. 2(c) is a schematic diagram of a straight line in image space for the Hough transform;
Fig. 2(d) is a schematic diagram of several sinusoids in parameter space intersecting at one point for the Hough transform;
Fig. 3 is a schematic diagram of motion estimation during motion vector computation;
Fig. 4 is a frame captured from the Hall Monitor video;
Fig. 5 is the background image obtained by modeling the Hall Monitor video;
Fig. 6 is the filtered foreground/background separation binary map corresponding to the video frame of Fig. 4;
Fig. 7 is the geometric depth map of the background image of Fig. 5;
Fig. 8 is the motion depth map corresponding to the video frame of Fig. 4;
Fig. 9 is the final depth map obtained by fusing the geometric depth map of Fig. 7 with the motion depth map of Fig. 8 and applying Gaussian filtering.
Embodiment
The depth extraction method fusing motion information and geometric information comprises the following steps (the overall flow chart is shown in Fig. 1):
(1) establishing a statistical background model for the two-dimensional video to be converted and separating the static background part from the moving foreground part by moving-object detection;
(2) binarizing the image separated into static background and moving foreground parts, and applying median filtering and mathematical morphology filtering;
(3) generating a geometric depth map from the background image obtained by statistical background modeling, using a method based on scene geometric information;
(4) obtaining the motion vectors of the moving foreground objects by searching for matches between temporally adjacent frames of the original two-dimensional video, and converting them into motion amplitudes;
(5) linearly transforming the motion amplitude of the moving foreground part according to its position to obtain a motion depth map;
(6) fusing the motion depth map with the geometric depth map and applying Gaussian filtering to obtain the final depth map, used for representing the three-dimensional video.
The step of establishing a statistical background model for the two-dimensional video to be converted and separating the static background part from the moving foreground part by moving-object detection:
(a) Take N consecutive frames from the two-dimensional video file to be converted and scan these N frames I_f(x, y) in time and space; the mean of the N pixel values at each pixel coordinate is taken as the value of the background image B(x, y) at that position, computed as:

B(x, y) = \frac{1}{N} \sum_{f=1}^{N} I_f(x, y);

(b) Subtract the background image B(x, y) obtained in step (a) from each frame I_f(x, y) of the two-dimensional video to be converted, and determine the foreground points of the image by comparison with a preset threshold th, expressed as:

I_f(x, y) = \begin{cases} \text{Background}, & |I_f(x, y) - B(x, y)| < th \\ \text{Foreground}, & \text{otherwise}. \end{cases}
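Steps (a)-(c) translate directly into a few lines of NumPy; the sketch below assumes the N frames are already loaded as grayscale uint8 arrays, and the threshold value th = 25 is an illustrative choice.

```python
import numpy as np

def build_background(frames):
    # step (a): B(x, y) is the per-pixel temporal mean of the N frames
    return np.stack(frames).astype(np.float32).mean(axis=0)

def foreground_mask(frame, background, th=25.0):
    # steps (b)-(c): 0 (background) where |I_f - B| < th, else 255 (foreground)
    diff = np.abs(frame.astype(np.float32) - background)
    return np.where(diff < th, 0, 255).astype(np.uint8)
```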
The step of binarizing the image separated into static background and moving foreground parts and applying median filtering and mathematical morphology filtering:
(c) Using the foreground/background decision made in step (b), produce a binary image for each frame I_f(x, y) of the video, in which the value 0 represents background and the value 255 represents foreground, that is:

I_f^1(x, y) = \begin{cases} 0, & \text{if } I_f(x, y) \text{ is Background} \\ 255, & \text{if } I_f(x, y) \text{ is Foreground}; \end{cases}
(d) Apply median filtering with a 3 × 3 window to the binary image obtained in step (c) to eliminate background noise;
Median filtering of an image generally uses a sliding window containing an odd number of points and replaces the gray value at the window center with the median of the gray values of the points in the window. A 3 × 3 window is adopted here: although a larger filter window removes noise more effectively, it over-smooths the image and erases the details of the moving regions, making subsequent processing more difficult.
(e) Apply the classical opening and closing operations of mathematical morphology filtering to the median-filtered image obtained in step (d) to eliminate small active regions and holes in the foreground image; a 3 × 3 square structuring element is used for both the erosion and the dilation that make up the opening and closing operations.
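Steps (d) and (e) correspond to standard OpenCV primitives; a sketch follows, using the 3 × 3 median window and the 3 × 3 square structuring element specified in the text.

```python
import cv2
import numpy as np

def clean_mask(mask):
    mask = cv2.medianBlur(mask, 3)                          # step (d): 3x3 median filter
    kernel = np.ones((3, 3), np.uint8)                      # 3x3 square structuring element
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)   # step (e): remove small regions
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)  # step (e): fill small holes
    return mask
```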
The step of generating a geometric depth map from the background image obtained by statistical background modeling, using a method based on scene geometric information:
(f) Apply edge detection with the Sobel operator to the luminance component B_y(x, y) of the background image obtained in step (a) to obtain a horizontal gradient map S_x(x, y) and a vertical gradient map S_y(x, y); add the two maps to obtain the gradient map S(x, y) and compare it with a threshold Th to obtain the binarized background edge map, where Th is chosen according to:

Th = \alpha \left[ S(x, y)_{max} - S(x, y)_{min} \right] + S(x, y)_{min}

where α is a weight coefficient taking a value between 0 and 1, S(x, y)_{max} is the maximum pixel value of the gradient map, and S(x, y)_{min} is the minimum pixel value of the gradient map;
The smaller α is, the clearer the image edges and details; conversely, the larger it is, the blurrier they become. Clearer edges favor the extraction of the vanishing lines, whereas clearer details hinder it.
(g) Apply the classical Hough transform of image processing to the binarized edge map obtained in step (f) to extract its principal straight lines; AND the result with the original binarized edge map to extract the vanishing lines of the background; the midpoint of the region where the intersections of the vanishing lines occur with the greatest frequency is taken as the vanishing point;
In essence, the Hough transform exploits the duality between points and lines. A point in image space (Fig. 2(a)) corresponds to a sinusoid in the (r, θ) parameter space (Fig. 2(b)), and a straight line in image space (Fig. 2(c)) corresponds to a point in (r, θ) parameter space (Fig. 2(d)), via the normal parametrization r = x cos θ + y sin θ. Since a straight line can be regarded as a set of points, a line in image space corresponds to a family of sinusoids in parameter space, and the coordinates of the point where these curves intersect are exactly the parameters of the corresponding line in image space.
In implementation, the number of curves meeting at each point of the parameter space is accumulated; if an accumulated value exceeds a threshold, the image space is deemed to contain the straight line characterized by the coordinates of the corresponding intersection in parameter space.
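The accumulator voting just described can be sketched from scratch as follows; the quantization of r to whole pixels and θ to whole degrees is an illustrative choice.

```python
import numpy as np

def hough_lines(edges, vote_threshold):
    h, w = edges.shape
    diag = int(np.hypot(h, w))
    thetas = np.deg2rad(np.arange(180))
    acc = np.zeros((2 * diag + 1, len(thetas)), np.int32)   # r may be negative
    ys, xs = np.nonzero(edges)                              # every edge point votes
    for x, y in zip(xs, ys):
        for t_idx, t in enumerate(thetas):
            r = int(round(x * np.cos(t) + y * np.sin(t)))   # one sinusoid per point
            acc[r + diag, t_idx] += 1
    rs, ts = np.nonzero(acc > vote_threshold)               # peaks = detected lines
    return [(r - diag, thetas[t]) for r, t in zip(rs, ts)]
```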
(h) Take the vanishing point obtained in step (g) as the deepest point of the background; along the vanishing lines, the depth deepens gradually toward the vanishing point in arithmetic steps of 2, yielding the background geometric depth map G(x, y).
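A condensed sketch of steps (f)-(h) is given below. The Sobel, thresholding, and Hough calls are standard OpenCV; averaging the pairwise line intersections as a stand-in for the midpoint of the densest intersection region, the fallback vanishing point, the radial depth ramp in place of the along-vanishing-line arithmetic progression, and the 0-means-deepest gray convention are simplifying assumptions of this sketch.

```python
import cv2
import numpy as np

def _intersect(l1, l2):
    (r1, t1), (r2, t2) = l1, l2
    a = np.array([[np.cos(t1), np.sin(t1)], [np.cos(t2), np.sin(t2)]])
    if abs(np.linalg.det(a)) < 1e-6:
        return None                                           # near-parallel lines
    return np.linalg.solve(a, np.array([r1, r2]))             # (x, y) intersection

def geometric_depth(background_y, alpha=0.3):
    sx = cv2.Sobel(background_y, cv2.CV_32F, 1, 0)            # horizontal gradient S_x
    sy = cv2.Sobel(background_y, cv2.CV_32F, 0, 1)            # vertical gradient S_y
    s = np.abs(sx) + np.abs(sy)                               # gradient map S
    th = alpha * (s.max() - s.min()) + s.min()                # Th = alpha[S_max - S_min] + S_min
    edges = (s > th).astype(np.uint8) * 255                   # binarized edge map, step (f)
    lines = cv2.HoughLines(edges, 1, np.pi / 180, 120)        # dominant lines, step (g)
    pts = []
    if lines is not None:
        rt = [tuple(l[0]) for l in lines[:10]]
        for i in range(len(rt)):
            for j in range(i + 1, len(rt)):
                p = _intersect(rt[i], rt[j])
                if p is not None:
                    pts.append(p)
    h, w = background_y.shape
    vx, vy = np.mean(pts, axis=0) if pts else (w / 2, h / 3)  # vanishing point estimate
    ys, xs = np.mgrid[0:h, 0:w]                               # step (h): depth ramp
    d = np.hypot(xs - vx, ys - vy)
    return (255 * d / d.max()).astype(np.uint8)               # 0 (deepest) at vanishing point
```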
The step of obtaining the motion vectors of the moving foreground objects by searching for matches between temporally adjacent frames of the original two-dimensional video and converting them into motion amplitudes:
(i) Scan the picture frame I_f(x, y) at the current time in the two-dimensional video to be converted; according to the filtered foreground/background separation binary image obtained in step (e), if the current pixel is judged to be a foreground point, search for its best-matching pixel on the frame I_{f-1}(x, y) of the previous time instant. To improve matching accuracy, the matching cost is computed over a W × W neighborhood window centered on the current pixel, as shown in Fig. 3. Let the match search range S_{N×N} have size N × N; let u and v be the horizontal and vertical offsets of the pixel on the previous frame relative to the pixel on the current frame during the search; and let i and j be the horizontal and vertical offsets within the W × W neighborhood centered on the current pixel. The matching cost is defined as:

C(x, y; u, v) = \sum_{i=-(W-1)/2}^{(W-1)/2} \sum_{j=-(W-1)/2}^{(W-1)/2} \left| I_f(x+i, y+j) - I_{f-1}(x+u+i, y+v+j) \right|

(x, y) ∈ foreground, (u, v) ∈ S_{N×N}, f = 1, 2, 3, …

Traverse every pixel in the search range of the current pixel and compute the corresponding matching cost; the horizontal offset and vertical offset with the smallest matching cost are taken as the motion vector of the current pixel, with horizontal component MV_x and vertical component MV_y, formulated as:

C_{min}(x, y; MV_x, MV_y) = \min_{(u, v)} C(x, y; u, v);
(j) Let the horizontal component of the motion vector of each foreground pixel obtained in step (i) be MV_x(x, y) and the vertical component be MV_y(x, y); the motion amplitude is defined as:

F(x, y) = \sqrt{MV_x^2(x, y) + MV_y^2(x, y)};
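Steps (i) and (j) amount to a full search over the N × N range with a W × W sum-of-absolute-differences cost, restricted to foreground pixels; a direct, unoptimized sketch follows, where N = 9 and W = 5 are illustrative parameter values.

```python
import numpy as np

def motion_amplitude(cur, prev, mask, n=9, w=5):
    h, wid = cur.shape
    hs, hw = n // 2, w // 2                                 # half search range, half window
    cur_f, prev_f = cur.astype(np.float32), prev.astype(np.float32)
    amp = np.zeros((h, wid), np.float32)
    for y in range(hs + hw, h - hs - hw):
        for x in range(hs + hw, wid - hs - hw):
            if mask[y, x] != 255:
                continue                                    # step (i): foreground points only
            block = cur_f[y - hw:y + hw + 1, x - hw:x + hw + 1]
            best, mv = np.inf, (0, 0)
            for v in range(-hs, hs + 1):
                for u in range(-hs, hs + 1):
                    cand = prev_f[y + v - hw:y + v + hw + 1,
                                  x + u - hw:x + u + hw + 1]
                    cost = np.abs(block - cand).sum()       # SAD matching cost C(x, y; u, v)
                    if cost < best:
                        best, mv = cost, (u, v)             # (MV_x, MV_y) at minimum cost
            amp[y, x] = np.hypot(*mv)                       # step (j): F = sqrt(MVx^2 + MVy^2)
    return amp
```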
The step of linearly transforming the motion amplitude of the moving foreground part according to its position to obtain the motion depth map:
(k) Apply a linear transformation and a floor operation to the motion amplitude obtained in step (j), so that the value of each foreground pixel of the motion depth map is an integer in the range [a, b];
here the lower bound a and the upper bound b of the linear transformation both take values between 0 and 255: a takes the minimum depth value of the geometric depth map over the background part occluded by the moving foreground, and b takes the depth value of the geometric depth map of the background part corresponding to the lowest point of the moving foreground part.
In three-dimensional space, objects that move at the same speed but lie at different depths exhibit different motion amplitudes on the two-dimensional image plane: objects at small depth show large motion amplitudes, and vice versa, so motion amplitude can be used to describe object depth. Since the motion depth map is a grayscale image, the computed motion amplitude must first be linearly transformed and floored before it can describe object depth. In addition, so that the motion depth map can later be blended well into the geometric depth map, the bounds of the linear transformation range are taken from the background depth values around the moving foreground object.
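A sketch of step (k) follows. The exact linear transformation formula is not reproduced in this text, so the min-max mapping of F(x, y) onto [a, b] below is a reconstruction from the stated constraints (a linear map followed by a floor, yielding integers in [a, b]); treat it as an assumption.

```python
import numpy as np

def motion_depth(amp, mask, a, b):
    fg = mask == 255
    f = amp[fg]
    depth = np.zeros(amp.shape, np.uint8)
    if f.size and f.max() > f.min():
        # assumed min-max form: M = floor(a + (b - a)(F - Fmin)/(Fmax - Fmin))
        depth[fg] = np.floor(a + (b - a) * (f - f.min()) / (f.max() - f.min())).astype(np.uint8)
    elif f.size:
        depth[fg] = a                                       # degenerate case: uniform amplitude
    return depth
```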
The step of fusing the motion depth map with the geometric depth map and applying Gaussian filtering to obtain the final depth map:
(l) According to the filtered foreground/background separation binary image A(x, y) obtained in step (e), fuse the motion depth map M(x, y) obtained in step (k) with the geometric depth map G(x, y) obtained in step (h) to obtain the fused depth map D(x, y), with the fusion formula defined as:

D(x, y) = \begin{cases} M(x, y), & A(x, y) = 255 \\ G(x, y), & A(x, y) = 0; \end{cases}
(m) Apply Gaussian filtering to the fused depth map obtained in step (l) to obtain the final depth map, used for representing the three-dimensional video.
When reconstructing virtual views with the DIBR technique, the input depth map is required to be fairly smooth; therefore, the fused depth map obtained by fusing the motion depth map and the geometric depth map is Gaussian-filtered once.
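Steps (l) and (m) reduce to a masked selection followed by a Gaussian blur; in the sketch below the 5 × 5 kernel is an illustrative choice, since the text specifies only that Gaussian filtering is applied.

```python
import cv2
import numpy as np

def fuse_depth(motion_d, geo_d, mask):
    # step (l): D = M on the foreground (A = 255), G on the background (A = 0)
    fused = np.where(mask == 255, motion_d, geo_d).astype(np.uint8)
    # step (m): smooth the fused map for DIBR
    return cv2.GaussianBlur(fused, (5, 5), 0)
```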
Embodiment:
(1) The Hall Monitor test sequence, with an image resolution of 352 × 288, is used as the 2D video file for which a depth map is to be generated. Fig. 4 is a frame captured from the Hall Monitor video.
(2) Establish the statistical background model and produce the background image. Fig. 5 is the background image obtained by modeling the Hall Monitor video.
(3) Detect the moving foreground of the Hall Monitor video by comparison with the background image, binarize the separated foreground and background, and apply median filtering and mathematical morphology filtering. Fig. 6 is the filtered foreground/background separation binary map corresponding to the video frame of Fig. 4.
(4) Generate the geometric depth map from the background image obtained by background modeling, using the method based on scene geometric information. Fig. 7 is the geometric depth map of the background image of Fig. 5.
(5) Compute the motion vectors of the foreground part of the Hall Monitor video file, convert them into motion amplitudes, and linearly transform the amplitudes according to the position of the foreground objects to obtain the motion depth map. Fig. 8 is the motion depth map corresponding to the video frame of Fig. 4.
(6) Fuse the motion depth map with the geometric depth map and apply Gaussian filtering to obtain the final depth map, used for representing the three-dimensional video. Fig. 9 is the final depth map obtained by fusing the geometric depth map of Fig. 7 with the motion depth map of Fig. 8 and applying Gaussian filtering.
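Tying the sketches above together, a hypothetical driver for the Hall Monitor example might look as follows; the file names, frame count N = 50, and the bounds [a, b] = [60, 120] are placeholders, since in the method a and b are read from the geometric depth map around the foreground.

```python
import cv2
import numpy as np

# placeholder file names for the 352 x 288 Hall Monitor frames
frames = [cv2.imread(f"hall_{i:03d}.png", cv2.IMREAD_GRAYSCALE) for i in range(1, 51)]

bg = build_background(frames)                             # step (a)
geo = geometric_depth(bg.astype(np.uint8))                # steps (f)-(h)
for f in range(1, len(frames)):
    mask = clean_mask(foreground_mask(frames[f], bg))     # steps (b)-(e)
    amp = motion_amplitude(frames[f], frames[f - 1], mask)          # steps (i)-(j)
    mdep = motion_depth(amp, mask, a=60, b=120)           # step (k), placeholder bounds
    cv2.imwrite(f"depth_{f:03d}.png", fuse_depth(mdep, geo, mask))  # steps (l)-(m)
```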

Claims (1)

1. A depth extraction method fusing motion information and geometric information, characterized by comprising the steps of:
(1) establishing a statistical background model for the two-dimensional video to be converted and separating the static background part from the moving foreground part by moving-object detection;
(2) binarizing the image separated into static background and moving foreground parts, and applying median filtering and mathematical morphology filtering;
(3) generating a geometric depth map from the background image obtained by statistical background modeling, using a method based on scene geometric information;
(4) obtaining the motion vectors of the moving foreground objects by searching for matches between temporally adjacent frames of the original two-dimensional video, and converting them into motion amplitudes;
(5) linearly transforming the motion amplitude of the moving foreground part according to its position to obtain a motion depth map;
(6) fusing the motion depth map with the geometric depth map and applying Gaussian filtering to obtain the final depth map, used for representing the three-dimensional video;
Wherein,
The step of establishing a statistical background model for the two-dimensional video to be converted and separating the static background part from the moving foreground part by moving-object detection is:
(a) Take N consecutive frames from the two-dimensional video file to be converted and scan these N frames I_f(x, y) in time and space; the mean of the N pixel values at each pixel coordinate is taken as the value of the background image B(x, y) at that position, computed as:

B(x, y) = \frac{1}{N} \sum_{f=1}^{N} I_f(x, y);

(b) Subtract the background image B(x, y) obtained in step (a) from each frame I_f(x, y) of the two-dimensional video to be converted, and determine the foreground points of the image by comparison with a preset threshold th, expressed as:

I_f(x, y) = \begin{cases} \text{Background}, & |I_f(x, y) - B(x, y)| < th \\ \text{Foreground}, & \text{otherwise}; \end{cases}
The step of binarizing the image separated into static background and moving foreground parts and applying median filtering and mathematical morphology filtering is:
(c) Using the foreground/background decision made in step (b), produce a binary image for each frame I_f(x, y) of the video, in which the value 0 represents background and the value 255 represents foreground, that is:

I_f^1(x, y) = \begin{cases} 0, & \text{if } I_f(x, y) \text{ is Background} \\ 255, & \text{if } I_f(x, y) \text{ is Foreground}; \end{cases}

(d) Apply median filtering with a 3 × 3 window to the binary image obtained in step (c) to eliminate background noise;
(e) Apply the classical opening and closing operations of mathematical morphology filtering to the median-filtered image obtained in step (d) to eliminate small active regions and holes in the foreground image; a 3 × 3 square structuring element is used for both the erosion and the dilation that make up the opening and closing operations;
The step of generating a geometric depth map from the background image obtained by statistical background modeling, using a method based on scene geometric information, is:
(f) Apply edge detection with the Sobel operator to the luminance component B_y(x, y) of the background image obtained in step (a) to obtain a horizontal gradient map S_x(x, y) and a vertical gradient map S_y(x, y); add the two maps to obtain the gradient map S(x, y) and compare it with a threshold Th to obtain the binarized background edge map, where Th is chosen according to:

Th = \alpha \left[ S(x, y)_{max} - S(x, y)_{min} \right] + S(x, y)_{min}

where α is a weight coefficient taking a value between 0 and 1, S(x, y)_{max} is the maximum pixel value of the gradient map, and S(x, y)_{min} is the minimum pixel value of the gradient map;
(g) Apply the classical Hough transform of image processing to the binarized edge map obtained in step (f) to extract its principal straight lines; AND the result with the original binarized edge map to extract the vanishing lines of the background; the midpoint of the region where the intersections of the vanishing lines occur with the greatest frequency is taken as the vanishing point;
(h) Take the vanishing point obtained in step (g) as the deepest point of the background; along the vanishing lines, the depth deepens gradually toward the vanishing point in arithmetic steps of 2, yielding the background geometric depth map G(x, y);
The step of obtaining the motion vectors of the moving foreground objects by searching for matches between temporally adjacent frames of the original two-dimensional video and converting them into motion amplitudes is:
(i) Scan the picture frame I_f(x, y) at the current time in the two-dimensional video to be converted; according to the filtered foreground/background separation binary image obtained in step (e), if the current pixel is judged to be a foreground point, search for its best-matching pixel on the frame I_{f-1}(x, y) of the previous time instant. To improve matching accuracy, the matching cost is computed over a W × W neighborhood window centered on the current pixel. Let the match search range S_{N×N} have size N × N; let u and v be the horizontal and vertical offsets of the pixel on the previous frame relative to the pixel on the current frame during the search; and let i and j be the horizontal and vertical offsets within the W × W neighborhood centered on the current pixel. The matching cost is defined as:

C(x, y; u, v) = \sum_{i=-(W-1)/2}^{(W-1)/2} \sum_{j=-(W-1)/2}^{(W-1)/2} \left| I_f(x+i, y+j) - I_{f-1}(x+u+i, y+v+j) \right|

(x, y) ∈ foreground, (u, v) ∈ S_{N×N}, f = 1, 2, 3, …

Traverse every pixel in the search range of the current pixel and compute the corresponding matching cost; the horizontal offset and vertical offset with the smallest matching cost are taken as the motion vector of the current pixel, with horizontal component MV_x and vertical component MV_y, formulated as:

C_{min}(x, y; MV_x, MV_y) = \min_{(u, v)} C(x, y; u, v);

(j) Let the horizontal component of the motion vector of each foreground pixel obtained in step (i) be MV_x(x, y) and the vertical component be MV_y(x, y); the motion amplitude is defined as:

F(x, y) = \sqrt{MV_x^2(x, y) + MV_y^2(x, y)};
The step of linearly transforming the motion amplitude of the moving foreground part according to its position to obtain the motion depth map is:
(k) Apply a linear transformation and a floor operation to the motion amplitude obtained in step (j), so that the value of each foreground pixel of the motion depth map is an integer in the range [a, b];
wherein the lower bound a and the upper bound b of the linear transformation both take values between 0 and 255: a takes the minimum depth value of the geometric depth map over the background part occluded by the moving foreground, and b takes the depth value of the geometric depth map of the background part corresponding to the lowest point of the moving foreground part;
The step of fusing the motion depth map with the geometric depth map and applying Gaussian filtering to obtain the final depth map is:
(l) According to the filtered foreground/background separation binary image A(x, y) obtained in step (e), fuse the motion depth map M(x, y) obtained in step (k) with the geometric depth map G(x, y) obtained in step (h) to obtain the fused depth map D(x, y), with the fusion formula defined as:

D(x, y) = \begin{cases} M(x, y), & A(x, y) = 255 \\ G(x, y), & A(x, y) = 0; \end{cases}

(m) Apply Gaussian filtering to the fused depth map obtained in step (l) to obtain the final depth map, used for representing the three-dimensional video.
CN2009101021536A 2009-08-17 2009-08-17 Depth extraction method of merging motion information and geometric information Active CN101640809B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2009101021536A CN101640809B (en) 2009-08-17 2009-08-17 Depth extraction method of merging motion information and geometric information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2009101021536A CN101640809B (en) 2009-08-17 2009-08-17 Depth extraction method of merging motion information and geometric information

Publications (2)

Publication Number Publication Date
CN101640809A CN101640809A (en) 2010-02-03
CN101640809B (en) 2010-11-03

Family

ID=41615552

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2009101021536A Active CN101640809B (en) 2009-08-17 2009-08-17 Depth extraction method of merging motion information and geometric information

Country Status (1)

Country Link
CN (1) CN101640809B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102663737A (en) * 2012-03-19 2012-09-12 西安交通大学 Vanishing point detection method for video signals rich in geometry information

Families Citing this family (49)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2499829B1 (en) 2009-10-14 2019-04-17 Dolby International AB Methods and devices for depth map processing
US8514269B2 (en) 2010-03-26 2013-08-20 Microsoft Corporation De-aliasing depth images
JP5227993B2 (en) 2010-03-31 2013-07-03 株式会社東芝 Parallax image generation apparatus and method thereof
CN101930606A (en) * 2010-05-14 2010-12-29 深圳市海量精密仪器设备有限公司 Field depth extending method for image edge detection
CN102314591B (en) * 2010-07-09 2014-07-23 株式会社理光 Method and equipment for detecting static foreground object
CN101969547A (en) * 2010-07-30 2011-02-09 新疆宏开电子系统集成有限公司 Method for processing infrared digital video signal at night
US9171372B2 (en) 2010-11-23 2015-10-27 Qualcomm Incorporated Depth estimation based on global motion
US9123115B2 (en) 2010-11-23 2015-09-01 Qualcomm Incorporated Depth estimation based on global motion and optical flow
CN102075777B (en) * 2011-01-26 2015-02-11 Tcl集团股份有限公司 Method for converting planar video image into three-dimensional video image based on moving object
TWI488494B (en) 2011-04-28 2015-06-11 Altek Corp Method of multi-frame image noise reduction
CN102905143B (en) * 2011-07-28 2015-04-15 瑞昱半导体股份有限公司 2D (two-dimensional)-3D (three-dimensional) image conversion device and method thereof
CN102263979B (en) * 2011-08-05 2013-10-09 清华大学 Depth map generation method and device for plane video three-dimensional conversion
CN102495907B (en) * 2011-12-23 2013-07-03 香港应用科技研究院有限公司 Video summary with depth information
KR20130084341A (en) * 2012-01-17 2013-07-25 삼성전자주식회사 Display system with image conversion mechanism and method of operation thereof
CN102682291B (en) * 2012-05-07 2016-10-05 深圳市贝尔信智能系统有限公司 A kind of scene demographic method, device and system
CN102750711B (en) * 2012-06-04 2015-07-29 清华大学 A kind of binocular video depth map calculating method based on Iamge Segmentation and estimation
CN103686136A (en) * 2012-09-18 2014-03-26 宏碁股份有限公司 Multimedia processing system and audio signal processing method
TW201432622A (en) * 2012-11-07 2014-08-16 Koninkl Philips Nv Generation of a depth map for an image
CN103218829B (en) * 2013-04-01 2016-04-13 上海交通大学 A kind of foreground extracting method being adapted to dynamic background
CN103413347B (en) * 2013-07-05 2016-07-06 南京邮电大学 Based on the extraction method of monocular image depth map that prospect background merges
CN104424649B (en) * 2013-08-21 2017-09-26 株式会社理光 Detect the method and system of moving object
CN103826032B (en) * 2013-11-05 2017-03-15 四川长虹电器股份有限公司 Depth map post-processing method
CN104735360B (en) * 2013-12-18 2017-12-22 华为技术有限公司 Light field image treating method and apparatus
CN103686139B (en) * 2013-12-20 2016-04-06 华为技术有限公司 Two field picture conversion method, frame video conversion method and device
TW201528775A (en) 2014-01-02 2015-07-16 Ind Tech Res Inst Depth map aligning method and system
CN103945211A (en) * 2014-03-13 2014-07-23 华中科技大学 Method for generating depth map sequence through single-visual-angle color image sequence
CN103985106B (en) * 2014-05-16 2017-08-25 三星电子(中国)研发中心 Apparatus and method for carrying out multiframe fusion to very noisy image
FR3028988B1 (en) * 2014-11-20 2018-01-19 Commissariat A L'energie Atomique Et Aux Energies Alternatives METHOD AND APPARATUS FOR REAL-TIME ADAPTIVE FILTERING OF BURNED DISPARITY OR DEPTH IMAGES
CN105005992B (en) * 2015-07-07 2016-03-30 南京华捷艾米软件科技有限公司 A kind of based on the background modeling of depth map and the method for foreground extraction
DE102016104732A1 (en) * 2016-03-15 2017-09-21 Connaught Electronics Ltd. Method for motion estimation between two images of an environmental region of a motor vehicle, computing device, driver assistance system and motor vehicle
KR102463702B1 (en) * 2016-12-15 2022-11-07 현대자동차주식회사 Apparatus for estimating location of vehicle, method for thereof, apparatus for constructing map thereof, and method for constructing map
EP3296749B1 (en) * 2017-01-27 2019-01-23 Sick IVP AB Motion encoder
CN107155101A (en) * 2017-06-20 2017-09-12 万维云视(上海)数码科技有限公司 The generation method and device for the 3D videos that a kind of 3D players are used
CN109213138B (en) * 2017-07-07 2021-09-14 北京臻迪科技股份有限公司 Obstacle avoidance method, device and system
CN107742296A (en) * 2017-09-11 2018-02-27 广东欧珀移动通信有限公司 Dynamic image generation method and electronic installation
CN107680169B (en) * 2017-09-28 2021-05-11 宝琳创展科技(佛山)有限公司 Method for manufacturing VR (virtual reality) stereoscopic image by adding curved surface to depth map
CN108389172B (en) * 2018-03-21 2020-12-18 百度在线网络技术(北京)有限公司 Method and apparatus for generating information
CN109697725B (en) * 2018-12-03 2020-10-02 浙江大华技术股份有限公司 Background filtering method and device and computer readable storage medium
CN109649384B (en) * 2019-02-15 2020-08-14 华域汽车系统股份有限公司 Parking assisting method
CN109900274B (en) * 2019-03-25 2022-09-16 哈尔滨工业大学 Image matching method and system
CN110602476B (en) * 2019-08-08 2021-08-06 南京航空航天大学 Hole filling method of Gaussian mixture model based on depth information assistance
CN110602479A (en) * 2019-09-11 2019-12-20 海林电脑科技(深圳)有限公司 Video conversion method and system
CN111208521B (en) * 2020-01-14 2021-12-07 武汉理工大学 Multi-beam forward-looking sonar underwater obstacle robust detection method
CN113139997B (en) * 2020-01-19 2023-03-21 武汉Tcl集团工业研究院有限公司 Depth map processing method, storage medium and terminal device
TWI736335B (en) * 2020-06-23 2021-08-11 國立成功大學 Depth image based rendering method, electrical device and computer program product
CN112489072B (en) * 2020-11-11 2023-10-13 广西大学 Vehicle-mounted video perception information transmission load optimization method and device
CN112203095B (en) * 2020-12-04 2021-03-09 腾讯科技(深圳)有限公司 Video motion estimation method, device, equipment and computer readable storage medium
CN112822479A (en) * 2020-12-30 2021-05-18 北京华录新媒信息技术有限公司 Depth map generation method and device for 2D-3D video conversion
CN113688849B (en) * 2021-08-30 2023-10-24 中国空空导弹研究院 Gray image sequence feature extraction method for convolutional neural network

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102663737A (en) * 2012-03-19 2012-09-12 西安交通大学 Vanishing point detection method for video signals rich in geometry information
CN102663737B (en) * 2012-03-19 2014-07-23 西安交通大学 Vanishing point detection method for video signals rich in geometry information

Also Published As

Publication number Publication date
CN101640809A (en) 2010-02-03

Similar Documents

Publication Publication Date Title
CN101640809B (en) Depth extraction method of merging motion information and geometric information
CN108648161B (en) Binocular vision obstacle detection system and method of asymmetric kernel convolution neural network
CN101765022B (en) Depth representing method based on light stream and image segmentation
CN102263979B (en) Depth map generation method and device for plane video three-dimensional conversion
CN101937578B (en) Method for drawing virtual view color image
CN102254348A (en) Block matching parallax estimation-based middle view synthesizing method
CN102223553A (en) Method for converting two-dimensional video into three-dimensional video automatically
CN101631256A (en) Method for converting 2D video into 3D video in three-dimensional television system
CN101287143A (en) Method for converting flat video to tridimensional video based on real-time dialog between human and machine
CN102609950B (en) Two-dimensional video depth map generation process
CN102098440A (en) Electronic image stabilizing method and electronic image stabilizing system aiming at moving object detection under camera shake
CN104065946B (en) Based on the gap filling method of image sequence
CN112019828B (en) Method for converting 2D (two-dimensional) video into 3D video
CN102665086A (en) Method for obtaining parallax by using region-based local stereo matching
CN106056622B (en) A kind of multi-view depth video restored method based on Kinect cameras
CN103581650A (en) Method for converting binocular 3D video into multicast 3D video
CN106447718B (en) A kind of 2D turns 3D depth estimation method
CN104992442A (en) Video three-dimensional drawing method specific to flat panel display device
CN104980726B (en) A kind of binocular video solid matching method of associated movement vector
CN103716615B (en) 2D video three-dimensional method based on sample learning and depth image transmission
CN101765019A (en) Stereo matching algorithm for motion blur and illumination change image
CN101557534A (en) Method for generating disparity map from video close frames
CN112822479A (en) Depth map generation method and device for 2D-3D video conversion
CN104778673B (en) A kind of improved gauss hybrid models depth image enhancement method
CN112634127B (en) Unsupervised stereo image redirection method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20160614

Address after: 518000 new energy building, Nanhai Road, Shenzhen, Guangdong, Nanshan District A838

Patentee after: Meng Qi media (Shenzhen) Co. Ltd.

Address before: 310027 Hangzhou, Zhejiang Province, Zhejiang Road, No. 38

Patentee before: Zhejiang University

C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20160831

Address after: 518000, 101, 2, Fengyun technology building, Fifth Industrial Zone, North Ring Road, Shenzhen, Guangdong, Nanshan District

Patentee after: World wide technology (Shenzhen) Limited

Address before: 518000 new energy building, Nanhai Road, Shenzhen, Guangdong, Nanshan District A838

Patentee before: Meng Qi media (Shenzhen) Co. Ltd.

TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20180903

Address after: 518000 B unit 101, Fengyun mansion 5, Xili street, Nanshan District, Shenzhen, Guangdong.

Patentee after: Wan D display technology (Shenzhen) Co., Ltd.

Address before: 518000 2 of Fengyun tower, Fifth Industrial Zone, Nanshan District North Ring Road, Shenzhen, Guangdong, 101

Patentee before: World wide technology (Shenzhen) Limited