CN103248906A - Method and system for acquiring depth map of binocular stereo video sequence - Google Patents


Info

Publication number
CN103248906A
Authority
CN
China
Prior art keywords
pixel
weights
module
image
zone
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2013101347151A
Other languages
Chinese (zh)
Other versions
CN103248906B (en
Inventor
王好谦
杜成立
张永兵
戴琼海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Graduate School Tsinghua University
Original Assignee
Shenzhen Graduate School Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Graduate School Tsinghua University filed Critical Shenzhen Graduate School Tsinghua University
Priority to CN201310134715.1A priority Critical patent/CN103248906B/en
Publication of CN103248906A publication Critical patent/CN103248906A/en
Priority to HK13110791.5A priority patent/HK1183577A1/en
Application granted granted Critical
Publication of CN103248906B publication Critical patent/CN103248906B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Abstract

The invention discloses a method and a system for acquiring a depth map of a binocular stereo video sequence. The method comprises the following steps: clustering the first of two images, converting each cluster into a connected region by region growing, and recording the average five-dimensional coordinates of each region and the adjacency information between regions; receiving manual marks input by an operator to label the foreground and background of the first image, and calculating the connection weights between regions and the mark weights of each region; taking the region connection weights and mark weights as input and invoking the GraphCut algorithm to segment the first image into foreground and background; and taking the second of the two images as the reference image and, according to the segmentation result, calculating the disparity map of the foreground with a local adaptive-weight stereo matching algorithm and the disparity map of the background with a Rank-transform stereo matching algorithm, then converting the disparity maps into depth maps. The system is a system that executes the method. The method and the system give accurate results at low computational complexity.

Description

Method and system for acquiring a depth map of a binocular stereo video sequence
Technical field
The invention belongs to the field of computer image processing, and in particular relates to a method and a system for acquiring a depth map of a binocular stereo video sequence.
Background technology
With the rapid development of modern society, the demand for entertainment keeps growing. Audiences watching films and television no longer expect only high-definition colour pictures but also a convincing three-dimensional effect. Research on and applications of stereo video have therefore become a current hot topic, and the related technical problems are regarded as urgent ones to solve.
Stereo video is divided into conventional binocular stereo video and multi-view video. Binocular stereo video consists of two video sequences. Although it can give viewers a stereoscopic impression, the effect is rather rigid: corresponding glasses must be worn, the viewpoint is fixed, and the result is far from the stereoscopic perception of real life. Multi-view video, in contrast, can satisfy the demand for naked-eye viewing; moreover, because of its multiple viewpoints, viewers watching from different angles see different video sequences and hence different perspectives of the scene, which comes much closer to a real three-dimensional impression.
Although multi-view video has advantages that binocular stereo video lacks, it is also much harder to capture, and obtaining high-quality multi-view video has become a key research question. Multi-view video can currently be converted directly from 2D video, but because so much information is missing, the resulting multi-view sequences have very poor depth perception. Binocular video, by comparison, is relatively simple to capture and relatively rich in information, so converting binocular video into multi-view video is an effective approach.
The basic idea of converting binocular stereo video into multi-view video is: first obtain depth information from the binocular video, then synthesise the multi-channel video sequences from the obtained depth maps by virtual view synthesis. The most critical part is how to obtain accurate depth information. The main present-day method is stereo matching: the offset (disparity) between corresponding pixels of the left and right video images at the same moment is searched for, and the depth is obtained from it, depth being inversely proportional to disparity.
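The inverse relation between depth and disparity mentioned above can be made concrete for a rectified stereo pair with the standard pinhole relation Z = f * B / d. The sketch below uses an illustrative focal length and baseline; neither value is taken from the patent.

```python
# Disparity-to-depth conversion for a rectified stereo pair.
# f (focal length in pixels) and B (baseline in metres) are illustrative
# values, not parameters from the patent.

def disparity_to_depth(disparity, focal_length_px, baseline_m):
    """Depth Z = f * B / d; depth is inversely proportional to disparity."""
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity

# A pixel that shifts 40 px between views, with f = 800 px and B = 0.1 m:
depth = disparity_to_depth(40.0, 800.0, 0.1)
```

Because of the reciprocal, a fixed disparity error causes a much larger depth error for distant (small-disparity) points than for near ones, which is one reason the foreground is matched with the more accurate algorithm below.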
Stereo matching algorithms fall mainly into two classes, global stereo matching algorithms and local stereo matching algorithms, and both share a common weakness: situations in which information is naturally missing, such as occlusions, are difficult to handle.
Summary of the invention
The technical problem to be solved by the invention is to provide a method and a system for acquiring a depth map of a binocular stereo video sequence that reduce the complexity of the acquisition process while guaranteeing accuracy.
The technical scheme of the invention is realised by the following technical means:
As shown in Fig. 1, a method for acquiring a depth map of a binocular stereo video sequence comprises the following steps:
S100) Pre-processing step: read in the two images of the same time point and perform clustering on the first image; then convert each cluster into a connected region by region growing, and record the average five-dimensional coordinates of each region and the adjacency information between regions.
Because the two images of the same time point are strongly correlated, the segmentation operation only needs to process one of the two images; the other serves as the reference image for the subsequent stereo matching algorithm. The first image in this step may be either the left or the right image.
S200) Foreground extraction step, comprising: S210) weight calculation step: receive the manual marks input by the operator for labelling the foreground and background of the first image, and calculate the region connection weights, which express the connectivity between regions, and the mark weights, which express the correlation between each region and the manual marks; S220) image segmentation step: take the region connection weights and mark weights as input and invoke the GraphCut algorithm to segment the first image into a foreground part and a background part.
S300) Depth map acquisition step: take the second of the two images as the reference image and, according to the segmentation result of the foreground extraction step, calculate the disparity map of the foreground part with a local adaptive-weight stereo matching algorithm and the disparity map of the background part with a Rank-transform stereo matching algorithm; then convert the two disparity maps into depth maps.
Preferably, the method also comprises: S400) post-processing step: for each pixel of the foreground part, correct the depth value as follows: choose a depth-map correction window and, according to the result of the pre-processing step, update the depth value of the pixel to be corrected with the mean depth value of all pixels in the correction window that belong to the same region as that pixel.
Preferably, the clustering in the pre-processing step comprises the following steps:
S110) denoise the image to be processed;
S120) apply the K-means clustering algorithm to cluster the first image according to the similarity of the five-dimensional coordinates, each pixel being assigned to the class of the cluster centre at minimum five-dimensional distance from it.
Preferably, step S210) comprises the following steps:
S211) According to the inter-region adjacency information obtained in step S100), set the region connection weight of non-adjacent regions to 0; for any adjacent regions a and b, the region connection weight is
W_ab = 1/D_ab
where D_ab = sqrt((R_a - R_b)^2 + (G_a - G_b)^2 + (B_a - B_b)^2), and (R_a, G_a, B_a) and (R_b, G_b, B_b) are the mean values of the RGB colour components over the pixels of region a and region b respectively;
S212) receive the foreground mark points s and background mark points t input by the operator;
S213) For each region k of the image, calculate its foreground mark weight foreW_ks and background mark weight backW_kt, where:
foreW_ks = 1/foreD_ks
foreD_ks is the minimum five-dimensional distance between region k and the foreground mark points s;
backW_kt = 1/backD_kt
backD_kt is the minimum five-dimensional distance between region k and the background mark points t.
Preferably, step S400) comprises the following steps:
S401) read the current pixel;
S402) judge whether the current pixel belongs to the foreground part; if not, return to step S401) to process the next pixel, otherwise go on to the next step;
S403) choose a matrix centred on the current pixel as the depth-map correction window and, according to the result of the pre-processing step, define the pixels in the correction window that do not belong to the same region as the current pixel as invalid pixels, and the remaining pixels as valid pixels;
S404) update the depth value of the current pixel to the mean depth value of all valid pixels in the depth-map correction window;
S405) check whether all pixels have been corrected; if not, go on to the next pixel; if so, finish the correction.
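The correction loop of steps S401)-S405) can be sketched as follows. The array layout, function name, and default window radius are illustrative assumptions, not details from the patent.

```python
# A minimal sketch of the depth-correction post-processing (steps S401-S405),
# assuming depth values, region labels, and a foreground mask are given as
# 2-D lists of equal shape. `win` is the correction-window radius.

def correct_depth(depth, regions, foreground, win=1):
    """For each foreground pixel, replace its depth with the mean depth of
    the window pixels that belong to the same region (the valid pixels)."""
    h, w = len(depth), len(depth[0])
    out = [row[:] for row in depth]
    for i in range(h):
        for j in range(w):
            if not foreground[i][j]:
                continue  # S402: only foreground pixels are corrected
            vals = []
            for m in range(max(0, i - win), min(h, i + win + 1)):
                for n in range(max(0, j - win), min(w, j + win + 1)):
                    if regions[m][n] == regions[i][j]:  # S403: valid pixel
                        vals.append(depth[m][n])
            out[i][j] = sum(vals) / len(vals)  # S404: mean of valid pixels
    return out
```

Writing the result to a separate output array keeps each correction based on the original depth values, so the scan order of S405 does not affect the result.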
A system for acquiring a depth map of a binocular stereo video sequence is characterised by comprising:
a pre-processing module (201), comprising a clustering module and a region growing module, the clustering module being used to read in the two images of the same time point and perform clustering on the first image, and the region growing module being used to convert each cluster into a connected region by region growing and to record the average five-dimensional coordinates of each region and the adjacency information between regions;
a foreground extraction module (202), comprising a weight calculation module and an image segmentation module, the weight calculation module being used to receive the manual marks input by the operator for labelling the foreground and background of the first image and to calculate the region connection weights, which express the connectivity between regions, and the mark weights, which express the correlation between each region and the manual marks, and the image segmentation module being used to take the region connection weights and mark weights as input and invoke the GraphCut algorithm to segment the first image into a foreground part and a background part;
a depth acquisition module (203), used to take the second of the two images as the reference image and, according to the segmentation result of the foreground extraction module, calculate the disparity map of the foreground part with a local adaptive-weight stereo matching algorithm and the disparity map of the background part with a Rank-transform stereo matching algorithm, and then convert the two disparity maps into depth maps.
Preferably, the system also comprises: a post-processing module (204), used to correct the depth value of each pixel of the foreground part as follows: choose a depth-map correction window and, according to the result of the pre-processing module, update the depth value of the pixel to be corrected with the mean depth value of all pixels in the correction window that belong to the same region as that pixel.
Preferably, the clustering module comprises: a denoising module, used to denoise the image to be processed; and a K-means module, used to apply the K-means clustering algorithm to cluster the first image according to the similarity of the five-dimensional coordinates, each pixel being assigned to the class of the cluster centre at minimum five-dimensional distance from it.
Preferably, the weight calculation module comprises:
a region connection weight calculation module, used to set, according to the inter-region adjacency information obtained by the pre-processing module, the region connection weight of non-adjacent regions to 0, and for any adjacent regions a and b the region connection weight to
W_ab = 1/D_ab
where D_ab = sqrt((R_a - R_b)^2 + (G_a - G_b)^2 + (B_a - B_b)^2), and (R_a, G_a, B_a) and (R_b, G_b, B_b) are the mean values of the RGB colour components over the pixels of region a and region b respectively;
a mark input module, used to receive the foreground mark points s and background mark points t input by the operator;
a mark weight calculation module, used to calculate, for each region k of the image, its foreground mark weight foreW_ks and background mark weight backW_kt, where:
foreW_ks = 1/foreD_ks
foreD_ks is the minimum five-dimensional distance between region k and the foreground mark points s;
backW_kt = 1/backD_kt
backD_kt is the minimum five-dimensional distance between region k and the background mark points t.
Preferably, the post-processing module comprises:
a reading module, used to read the current pixel;
a judging module, used to judge whether the current pixel belongs to the foreground part; if not, the reading module processes the next pixel, otherwise the next module takes over;
a valid-pixel selection module, used to choose a matrix centred on the current pixel as the depth-map correction window and, according to the result of the pre-processing module, to define the pixels in the correction window that do not belong to the same region as the current pixel as invalid pixels, and the remaining pixels as valid pixels;
a depth value update module, used to update the depth value of the current pixel to the mean depth value of all valid pixels in the depth-map correction window;
a stop judging module, used to check whether all pixels have been corrected; if not, the next pixel is processed; if so, the correction finishes.
Compared with the prior art, the invention first segments the foreground from the background, then calculates the depth values of the foreground part with a local adaptive-weight stereo matching algorithm, so that the depth information of the foreground, which matters more to the viewer, is more accurate, and calculates the depth values of the background part with the fast Rank-transform stereo matching algorithm, so that the complexity of the overall algorithm is reduced. Moreover, when segmenting foreground from background, the image is first partitioned by clustering and region growing; the user then interacts with the system by manually marking the foreground and background, and accurate segmentation is achieved through the region connectivity and the correlation between each region and the manual marks, which guarantees good results in the subsequent depth calculation.
In a preferred scheme, the depth values of the foreground pixels are further corrected using the clustering and region growing results of the pre-processing step, which brings the foreground depth values closer to the objective depth of the scene.
In a preferred scheme, the image is denoised before clustering, which weakens the influence of noise on the clustering algorithm, and the K-means clustering algorithm clusters the pixels by their five-dimensional distance, which guarantees the quality of the clustering.
Description of drawings
Fig. 1 is a flow chart of the technical scheme of the invention.
Fig. 2 is a schematic diagram of the module structure of the technical scheme of the invention.
Fig. 3 is a flow chart of the pre-processing module.
Fig. 4 is a flow chart of the foreground extraction module.
Fig. 5 is a flow chart of the depth map acquisition module.
Fig. 6 is a flow chart of the post-processing module.
Embodiment
The invention is further described below with reference to the accompanying drawings and in combination with preferred embodiments.
The present embodiment is a method and a system for acquiring a depth map of a binocular stereo video sequence. Fig. 2 is the module structure diagram of the system, which mainly comprises the following four modules:
Pretreatment module 201;
Foreground extracting module 202;
Depth map acquisition module 203; And
Post-processing module 204.
The pre-processing module is used to read in the left and right images and denoise the right image; a clustering algorithm then clusters similar pixels of the image and merges them into regions, and the region information is recorded. In this implementation the K-means algorithm is used for the clustering.
The flow of the pre-processing module, referring to Fig. 3, specifically comprises:
Image reading step 301: read in the left and right images of the current time point.
Image denoising step 302: the purpose is to weaken the influence of image noise on the clustering algorithm. In this example a Gaussian filter is used for the denoising. The left and right images of the same moment have very strong spatial correlation; for the image segmentation operation only the right image needs to be processed, the left image serving as the reference image for the subsequent stereo matching (the left image could equally be processed, with the right image then serving as the reference).
Steps 303-306 below perform the image clustering, in this example with the K-means clustering algorithm. The basic computation is based on the five-dimensional image coordinates (x, y, r, g, b), where (x, y) is the pixel position and (r, g, b) are the pixel's components in the RGB colour space. Clustering is performed according to the similarity of the five-dimensional coordinates, measured by the similarity variable D(x, y, r, g, b).
Initial cluster centre step 303: in this example the image is divided into rectangular blocks of fixed width and height, and the mean five-dimensional coordinates of all pixels of each block are used as the initial cluster centres. For convenience, C(x, y, r, g, b) denotes the position and colour-space values of a cluster centre; X_C and Y_C are the horizontal and vertical coordinates of the centre, and R_C, G_C and B_C its red, green and blue components. All formulas below use analogous symbols for the five-dimensional coordinates, which will not be repeated.
Step 304, clustering pixels by their five-dimensional coordinates: each pixel of the image is processed in turn. For the current pixel, the five-dimensional distance to each cluster centre within its search range is calculated, and the pixel belongs to the class of the centre with the smallest distance. Let the current pixel be P(x, y, r, g, b) and the corresponding cluster centre be C(x, y, r, g, b); the concrete computation is as follows:
Colour space distance:
D_pc_color = sqrt((R_p - R_c)^2 + (G_p - G_c)^2 + (B_p - B_c)^2)
Position space distance:
D_pc_position = sqrt((X_p - X_c)^2 + (Y_p - Y_c)^2)
Five-dimensional distance between the pixel and the cluster centre:
D(x, y, r, g, b) = sqrt(D_pc_color^2 + D_pc_position^2)
By calculating the five-dimensional distance of the pixel to the different cluster centres, the centre with the minimum distance minD(x, y, r, g, b) is chosen as the clustering result; that is, the pixel is assigned to the class of that cluster centre.
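The assignment rule of steps 303-304 can be sketched as follows. The tuple layout (x, y, r, g, b) follows the text; the helper names are illustrative.

```python
import math

# A sketch of the five-dimensional cluster assignment of steps 303-304:
# each pixel (x, y, r, g, b) goes to the cluster centre with the smallest
# combined colour/position distance.

def five_dim_distance(p, c):
    """D = sqrt(D_color^2 + D_position^2) for pixel p and centre c,
    both given as (x, y, r, g, b) tuples."""
    d_color = math.sqrt((p[2] - c[2])**2 + (p[3] - c[3])**2 + (p[4] - c[4])**2)
    d_pos = math.sqrt((p[0] - c[0])**2 + (p[1] - c[1])**2)
    return math.sqrt(d_color**2 + d_pos**2)

def assign_to_cluster(pixel, centres):
    """Index of the centre at minimal five-dimensional distance."""
    return min(range(len(centres)),
               key=lambda k: five_dim_distance(pixel, centres[k]))
```

Because position and colour enter the same Euclidean distance, the relative scale of the two terms implicitly balances spatial compactness against colour homogeneity of the clusters.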
Cluster centre update step 305: collect all pixels of each class, calculate the mean five-dimensional coordinates of these pixels and use them as the new cluster centre; count the number of completed iterations iter_num.
Step 306, judging whether the clustering has finished: the concrete method is as follows. Compute the sum sumD over all image pixels of the minimum five-dimensional distance minD(x, y, r, g, b) to their cluster centre:
sumD = Σ minD(x, y, r, g, b)
Let the distance sum obtained in the current iteration be sumD_current and that of the previous iteration be sumD_previous. The clustering finishes when either of the following holds:
sumD_previous - sumD_current ≤ T
iter_num > max_iter
where T is a given threshold (determined by the image content; it is preferably set to (1.2~2) × sumD_previous/iter_num), iter_num is the current iteration count, and max_iter is the preset maximum iteration count (usually set to 5~10). If the termination condition does not hold, return to step 304 and continue iterating; if it holds, enter region growing step 307.
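The termination test of step 306 reduces to a small predicate; the parameter names below follow the text, the function name is illustrative.

```python
# The loop-termination test of step 306: clustering stops when the summed
# minimum distance stops improving by more than T, or when the iteration
# budget is exhausted.

def clustering_finished(sum_d_previous, sum_d_current, iter_num,
                        threshold, max_iter):
    return (sum_d_previous - sum_d_current <= threshold) or (iter_num > max_iter)
```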
Region growing step 307: first convert each cluster into connected regions with a four-neighbourhood region growing algorithm, and count the number of pixels of each region. If the pixel count of a region is greater than a given upper threshold, set two or more cluster centres and call the K-means clustering algorithm again to split the region into two or more sub-regions; if the pixel count of a region is smaller than a given lower threshold, merge the region into its nearest neighbouring region in the five-dimensional space.
Finally, record the average five-dimensional coordinates of each region and the inter-region adjacency information (that is, the information expressing whether two regions are adjacent).
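The four-neighbourhood region growing of step 307 amounts to connected-component labelling of the cluster map; a flood-fill sketch is below. The splitting of over-sized regions and merging of under-sized ones described above is omitted for brevity, and the function name is illustrative.

```python
from collections import deque

# A sketch of step 307's region growing with 4-connectivity: pixels with the
# same cluster label are merged into connected regions by flood fill.

def grow_regions(labels):
    """labels: 2-D list of cluster ids. Returns a 2-D list of region ids
    such that each region is 4-connected and has a single cluster label."""
    h, w = len(labels), len(labels[0])
    region = [[-1] * w for _ in range(h)]
    next_id = 0
    for i in range(h):
        for j in range(w):
            if region[i][j] != -1:
                continue
            q = deque([(i, j)])
            region[i][j] = next_id
            while q:
                y, x = q.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < h and 0 <= nx < w and region[ny][nx] == -1
                            and labels[ny][nx] == labels[i][j]):
                        region[ny][nx] = next_id
                        q.append((ny, nx))
            next_id += 1
    return region
```

Note that two spatially separate patches of the same cluster end up as distinct regions, which is exactly the point of converting clusters into connected regions before building the adjacency graph.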
The foreground extraction module is used to receive the manual marks input by the operator for labelling the foreground and background of the first image and, according to these marks, to calculate the weights of each region, comprising the region connection weights, which express the connectivity between regions, and the mark weights, which express the correlation between each region and the manual marks; it is also used to take the calculated weights as input and invoke the GraphCut algorithm to segment the current image into a foreground part and a background part and extract the foreground region. The flow of the foreground extraction module, referring to Fig. 4, specifically comprises:
Region connection weight calculation step 401: according to step 307, spatially non-adjacent regions are given no connection weight, i.e. their connection weight is set to 0; for spatially adjacent regions, the colour correlation is taken into account and the connection weight is calculated as follows (taking adjacent regions a and b as an example):
Pixel colour difference value:
D_ab = sqrt((R_a - R_b)^2 + (G_a - G_b)^2 + (B_a - B_b)^2)
Connection weight between regions a and b:
W_ab = 1/D_ab
where (R_a, G_a, B_a) and (R_b, G_b, B_b) are the mean values of the RGB colour components over the pixels of regions a and b respectively.
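A sketch of step 401 follows. The reciprocal form 1/D_ab is an assumption where the published formula image is unreadable; it mirrors the mark-weight formulas foreW = 1/foreD given later in the text. The function name and the infinite weight for identical colours are likewise illustrative.

```python
import math

# A sketch of step 401's region connection weights: non-adjacent regions get
# weight 0; adjacent regions get a weight that grows as their mean RGB
# colours get closer (assumed reciprocal of the colour difference).

def connection_weight(mean_rgb_a, mean_rgb_b, adjacent):
    if not adjacent:
        return 0.0
    d_ab = math.sqrt(sum((x - y) ** 2 for x, y in zip(mean_rgb_a, mean_rgb_b)))
    return 1.0 / d_ab if d_ab > 0 else float("inf")
```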
Manual mark input step 402: receive the manual marks input by the operator. The marking method provided in this example is that the operator marks foreground objects with the left mouse button and background objects with the right mouse button. If the number of foreground mark points or background mark points is greater than a preset cluster threshold cluster_num, the K-means algorithm clusters them into cluster_num classes and the cluster centre of each class is taken as the final mark point; to preserve the validity of the manual intervention, no clustering is performed if the number of mark points is less than or equal to cluster_num.
Mark weight calculation step 403: for each region of the image, calculate its weights with respect to the foreground marks s and the background marks t.
Calculating the foreground mark weight means calculating the five-dimensional distance between the current region k and the foreground mark points s; the concrete computation is as follows:
Colour space distance:
D_ks_color = sqrt((R_k - R_s)^2 + (G_k - G_s)^2 + (B_k - B_s)^2)
Position space distance:
D_ks_position = sqrt((X_k - X_s)^2 + (Y_k - Y_s)^2)
Five-dimensional distance between the current region and a foreground mark point:
D_ks = sqrt(D_ks_color^2 + D_ks_position^2)
foreD_ks = min D_ks
foreW_ks = 1/foreD_ks
where (X_k, Y_k, R_k, G_k, B_k) and (X_s, Y_s, R_s, G_s, B_s) are the five-dimensional coordinates of region k and foreground mark point s respectively; D_ks_color is the colour space distance between region k and foreground mark point s; D_ks_position is their position distance; D_ks is their five-dimensional distance; foreD_ks is the minimum five-dimensional distance between region k and the foreground marks; and foreW_ks is the foreground mark weight of region k.
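The mark-weight computation of step 403 can be sketched as one helper; it applies unchanged to the background marks by passing the background mark points instead. The function name is illustrative.

```python
import math

# A sketch of step 403: the mark weight of region k is the reciprocal of its
# minimum five-dimensional distance to any mark point of the given set.
# Coordinates are (x, y, r, g, b) tuples.

def mark_weight(region_coord, mark_points):
    """foreW_k = 1 / min_s D_ks over all mark points s (same formula for
    the background weight with background mark points)."""
    def dist(p, c):
        d_color = math.sqrt(sum((p[i] - c[i]) ** 2 for i in (2, 3, 4)))
        d_pos = math.sqrt((p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2)
        return math.sqrt(d_color ** 2 + d_pos ** 2)
    fore_d = min(dist(region_coord, s) for s in mark_points)
    return 1.0 / fore_d if fore_d > 0 else float("inf")
```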
The background mark weight is computed analogously to the foreground mark weight; for the current region k and a background mark point t the computation is as follows:
D_kt_color = sqrt((R_k - R_t)^2 + (G_k - G_t)^2 + (B_k - B_t)^2)
D_kt_position = sqrt((X_k - X_t)^2 + (Y_k - Y_t)^2)
D_kt = sqrt(D_kt_color^2 + D_kt_position^2)
backD_kt = min D_kt
backW_kt = 1/backD_kt
where (X_k, Y_k, R_k, G_k, B_k) and (X_t, Y_t, R_t, G_t, B_t) are the five-dimensional coordinates of region k and background mark point t respectively; D_kt_color is the colour space distance between region k and background mark point t; D_kt_position is their position distance; D_kt is their five-dimensional distance; backD_kt is the minimum five-dimensional distance between region k and the background marks; and backW_kt is the background mark weight of region k.
GraphCut segmentation step 404: take the region connection weights and mark weights as input parameters and call the GraphCut algorithm to obtain the region segmentation result.
Step 405, judging whether the segmentation result is satisfactory: in this example, the operator judges whether the segmentation result separates the foreground and background of the image with sufficient accuracy, i.e. inspects the segmentation result. If it is not satisfactory, return to step 402, add manual marks again and repeat the region segmentation; if it is satisfactory, enter step 501 to carry out the subsequent stereo matching and depth-map acquisition stage.
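The GraphCut call of step 404 is an s-t min-cut on a graph whose nodes are the regions: connection weights become inter-region capacities and mark weights become capacities to the foreground source and background sink. The sketch below uses a plain Edmonds-Karp max-flow as a self-contained stand-in for the GraphCut library call; graph encoding and function names are illustrative.

```python
from collections import deque

# A self-contained sketch of the segmentation of step 404: the min cut of
# the region graph separates source-side (foreground) from sink-side
# (background) regions. Edmonds-Karp max-flow, adjacency-matrix capacities.

def min_cut_foreground(n_regions, conn, fore_w, back_w):
    """conn: {(a, b): connection weight}; fore_w/back_w: per-region mark
    weights. Returns the set of region ids on the foreground side."""
    s, t = n_regions, n_regions + 1
    cap = [[0.0] * (n_regions + 2) for _ in range(n_regions + 2)]
    for (a, b), wgt in conn.items():     # undirected region edges
        cap[a][b] += wgt
        cap[b][a] += wgt
    for k in range(n_regions):           # terminal edges from mark weights
        cap[s][k] += fore_w[k]
        cap[k][t] += back_w[k]
    while True:                          # augment along BFS shortest paths
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v in range(n_regions + 2):
                if v not in parent and cap[u][v] > 1e-12:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            break
        bottleneck, v = float("inf"), t
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            cap[parent[v]][v] -= bottleneck
            cap[v][parent[v]] += bottleneck
            v = parent[v]
    seen, q = {s}, deque([s])            # residual reachability = cut side
    while q:
        u = q.popleft()
        for v in range(n_regions + 2):
            if v not in seen and cap[u][v] > 1e-12:
                seen.add(v)
                q.append(v)
    return {k for k in range(n_regions) if k in seen}
```

Large connection weights (similar adjacent regions) make the cut avoid separating them, while large mark weights anchor marked regions firmly to their terminal, which is exactly the behaviour the weight definitions above aim for.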
The depth map acquisition module processes the foreground part and the background part with different stereo matching algorithms according to the region segmentation result, obtains the disparity maps and converts them into depth maps. Depth and disparity are inversely proportional, so a disparity map obtained by stereo matching can be converted into a depth map. The process of obtaining the disparity maps by stereo matching is described in detail below.
For convenience of description, in this example P_L denotes the left image read in and P_R the right image; P_L(i, j) denotes the pixel at row i, column j of the left image, and P_R(i, j) the pixel at row i, column j of the right image. For the current pixel P_L(i, j) of the left image, a disparity search range is set (DSR = 20), i.e. the search region is the set of right-image pixels P_R(i, j-d) on the same horizontal line, with d ∈ [0, DSR]. For each reference point P_R(i, j-d) with d ∈ [0, DSR] in turn, the matching cost Cost_d with respect to P_L(i, j) is calculated; the reference point with the minimum matching cost Cost_d is chosen as the best match, and the corresponding disparity d of that point is the disparity value of the current pixel P_L(i, j). The concrete steps are shown in Fig. 5:
Matching method selection step 501: judge, according to the result of the foreground extraction module above, whether the current pixel belongs to the foreground part or the background part; if the pixel belongs to the foreground part, enter step 502, otherwise enter step 505.
If the pixel is judged to belong to the foreground region, the more accurate stereo matching algorithm, i.e. the adaptive-weight matching algorithm, is used; the computation is given in steps 502-504 below:
Adaptive support window determination step 502: choose a support window (the target window) of size W × W centred on the current pixel P_L(i, j), the parameter W being chosen as an odd number in the range 27~37. For each pixel in the window, its correlation with P_L(i, j) is calculated from brightness, colour and distance information and used as its weight. The weight of the pixel P_L(i+m, j+n) in the target window is denoted Ω_L(p, q), where p denotes the centre pixel P_L(i, j) and q denotes the pixel P_L(i+m, j+n).
Weight calculation step 503: the original image is median-filtered to remove noise interference, with a filter window of 5 × 5 or 3 × 3 (the basic unit being pixels). Note: calculating the weight Ω_L(p, q) requires both the colour difference and the distance information; the more similar the colours and the nearer the distance, the larger the weight. To reduce the influence of noise, all colour information used in the weight calculation is taken from the median-filtered image; the median filter is used only for the weight calculation, and the matching itself still processes the original image.
Colour similarity: D_colour = sqrt((R_p − R_q)² + (G_p − G_q)² + (B_p − B_q)²), where R, G and B denote the RGB colour components of pixels p and q.
Distance: D_distance = sqrt((X_p − X_q)² + (Y_p − Y_q)²), where X and Y denote the horizontal and vertical coordinates of the pixel, respectively.
Weight: Ω_L(p, q) = exp[−(D_colour/γ_C + D_distance/γ_D)], where γ_C ∈ (6, 12) and γ_D ∈ (18, 28).
The calculation of the matching cost must consider the target window and the reference window (the support window centred on P_R(i, j−d)) simultaneously. To obtain more accurate weights, the weights of the pixels in the target window and the weights of the pixels in the reference window are both needed, each computed separately from the information of its own window, yielding Ω_L(p_L, q_L) and Ω_R(p_R, q_R).
For a pixel in the support window, if it does not lie in the foreground region it is excluded. This is done by multiplying the weight by the foreground mark value: the mark value is 1 for foreground pixels and 0 for background pixels, so the multiplication removes background pixels. Excluding background pixels that fall inside the support window makes the result more accurate.
In summary, the final weight formula is:
Ω(q_L, q_R) = Ω_L(p_L, q_L) × Ω_R(p_R, q_R) × P_Segment
where Ω_L(p_L, q_L) and Ω_R(p_R, q_R) are the weight matrices of the target window and the reference window, and P_Segment is a matrix indicating whether each pixel in the current window belongs to the foreground or the background: if the corresponding pixel is foreground, the value at the corresponding position of P_Segment is 1, otherwise 0;
Absolute difference (AD) calculation step 504: the formula is as follows:
Cost_AD(p, q) = |R_p − R_q| + |G_p − G_q| + |B_p − B_q|;
where R, G and B denote the RGB colour components of the pixels. The matching cost aggregation formula is as follows:
Cost_SAD(d) = [ Σ_{m=−W/2}^{W/2} Σ_{n=−W/2}^{W/2} Ω(q_L(i+m, j+n), q_R(i+m, j−d+n)) × Cost_AD(q_L(i+m, j+n), q_R(i+m, j−d+n)) ] / [ Σ_{m=−W/2}^{W/2} Σ_{n=−W/2}^{W/2} Ω(q_L(i+m, j+n), q_R(i+m, j−d+n)) ]
where:
Ω(q_L(i+m, j+n), q_R(i+m, j−d+n)) = Ω_L(p_L(i, j), q_L(i+m, j+n)) × Ω_R(p_R(i, j−d), q_R(i+m, j−d+n)) × P_Segment(i+m, j+n)
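As a concrete illustration of steps 502~504, the adaptive-weight cost can be sketched in Python as follows. This is a simplified sketch under stated assumptions: square windows, no image-border handling, and the hypothetical function names `support_weights` and `sad_cost` (not from the patent):

```python
import numpy as np

def support_weights(img, fi, fj, half, gamma_c=9.0, gamma_d=23.0):
    """Adaptive support weights for a (2*half+1)^2 window centred at (fi, fj).

    img is an H x W x 3 float RGB array (ideally median-filtered, as in
    step 503).  gamma_c and gamma_d fall inside the ranges (6, 12) and
    (18, 28) given in the text."""
    win = img[fi - half:fi + half + 1, fj - half:fj + half + 1].astype(float)
    centre = img[fi, fj].astype(float)
    d_colour = np.sqrt(((win - centre) ** 2).sum(axis=2))  # colour distance
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1]
    d_dist = np.sqrt(ys ** 2 + xs ** 2)                    # spatial distance
    return np.exp(-(d_colour / gamma_c + d_dist / gamma_d))

def sad_cost(left, right, seg, i, j, d, half):
    """Weighted AD cost of step 504: target window in the left image,
    reference window in the right image, masked by the foreground
    segmentation matrix seg (1 = foreground, 0 = background)."""
    w = (support_weights(left, i, j, half)
         * support_weights(right, i, j - d, half)
         * seg[i - half:i + half + 1, j - half:j + half + 1])
    tw = left[i - half:i + half + 1, j - half:j + half + 1].astype(float)
    rw = right[i - half:i + half + 1, j - d - half:j - d + half + 1].astype(float)
    ad = np.abs(tw - rw).sum(axis=2)                       # |dR|+|dG|+|dB|
    return (w * ad).sum() / w.sum()
```

Identical windows at the true disparity give a cost of zero, which the winner-take-all selection of step 508 then picks.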
If the current pixel is a background pixel, the Rank-transform-based stereo matching algorithm is applied; the computation proceeds as steps 505~507 below:
Step 505: centred on the current pixel P_L(i, j), choose a support window (target window) of size X × Y, where X and Y are odd numbers in the range 17~25 and need not be equal.
Step 506: first compute the brightness difference Diff, i.e. subtract the brightness value of the centre pixel from the brightness value of each pixel in the support window, and quantize the difference into 5 levels to obtain the Rank matrix; the computation is as follows:
Rank = −2 if Diff < −v;  −1 if −v ≤ Diff < −u;  0 if −u ≤ Diff < u;  1 if u ≤ Diff < v;  2 if Diff ≥ v
where u and v are threshold parameters, with u = 2, 3 or 4 and v = 8, 9 or 10.
Compute the Rank matrices of the target window and the reference window respectively, obtaining RankL(i, j) and RankR(i, j−d), both matrices of size X × Y.
Compute RankCost(m, n) = 0 if RankL(m, n) = RankR(m, n), and 1 if RankL(m, n) ≠ RankR(m, n), where m and n are the index variables used in the cost aggregation.
Step 507: compute the matching cost Cost_RT(d):
Cost_RT(d) = Σ_{m=−X/2}^{X/2} Σ_{n=−Y/2}^{Y/2} RankCost(m, n)
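The Rank-transform cost of steps 505~507 can be sketched similarly (a sketch under simplifying assumptions: single-channel brightness input, no border handling; `rank_matrix` and `rank_cost` are illustrative names):

```python
import numpy as np

def rank_matrix(img_gray, ci, cj, hx, hy, u=3, v=9):
    """Five-level Rank matrix of step 506 for a (2*hx+1) x (2*hy+1)
    window centred at (ci, cj); u and v are the thresholds of the text."""
    win = img_gray[ci - hx:ci + hx + 1, cj - hy:cj + hy + 1].astype(int)
    diff = win - int(img_gray[ci, cj])
    rank = np.zeros_like(diff)          # band -u <= Diff < u stays 0
    rank[diff < -v] = -2
    rank[(diff >= -v) & (diff < -u)] = -1
    rank[(diff >= u) & (diff < v)] = 1
    rank[diff >= v] = 2
    return rank

def rank_cost(left_gray, right_gray, i, j, d, hx, hy):
    """Cost_RT(d) of step 507: the number of positions where the two
    Rank matrices disagree."""
    rl = rank_matrix(left_gray, i, j, hx, hy)
    rr = rank_matrix(right_gray, i, j - d, hx, hy)
    return int((rl != rr).sum())
```

Because only the sign pattern of brightness differences is compared, the cost is robust to the radiometric differences typical of background regions.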
Step 508: from the matching costs obtained in step 504 or step 507, choose the disparity value corresponding to the smallest matching cost as the disparity value of the pixel;
Step 509: check whether all pixels of the current image have been processed; if so, go to step 510, otherwise return to step 501 to process the next pixel;
Step 510: convert the obtained disparity map into a depth map;
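The text does not spell out the disparity-to-depth conversion of step 510; for a rectified binocular rig the standard relation depth = f·B/d (focal length f in pixels, baseline B) is commonly used. A sketch under that assumption, with an illustrative function name:

```python
import numpy as np

def disparity_to_depth(disparity, focal_px, baseline, eps=1e-6):
    """Convert a disparity map to a depth map via depth = f * B / d.
    eps guards against zero disparity (which would mean infinite depth)."""
    disp = np.asarray(disparity, dtype=float)
    return focal_px * baseline / np.maximum(disp, eps)
```

For instance, with f = 1000 px and B = 0.1 m, a disparity of 20 px maps to a depth of 5 m.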
Post-processing module: used, according to the foreground information extracted by the foreground extraction module 202 and the initial depth map obtained by the depth map acquisition module 203, to further correct the disparity map of the foreground part so as to obtain more reliable foreground information and make the depth information of the entire image more reliable. The concrete steps are as follows:
For convenience, suppose the obtained depth map is denoted Depth, where Depth(i, j) is the depth value at row i, column j of the depth map. Each pixel of the depth map is processed in turn.
Step 601: read the current pixel;
Step 602: judge whether the current pixel belongs to the foreground region; if not, return to step 601 to process the next pixel, otherwise go to step 603 to select the corresponding depth map correcting window;
Step 603: choose the depth map correcting window, generally a matrix centred on the current depth point, of size N × N with N an odd number in the range 7~15. According to the clustering result of the pre-processing module, judge whether each pixel in the correcting window belongs to the same class as the current pixel; if not, remove that support pixel (i.e. define it as an invalid pixel). This information is recorded in an N × N matrix: a support pixel belonging to the same class (i.e. a valid pixel) takes the value 1 at the corresponding position, otherwise 0;
Step 604: correct the current pixel with all the valid pixels of the correcting window determined in step 603: the depth value of the current pixel is set to the mean of the depth values of all valid support pixels. Invalid pixels are excluded from the mean computation by multiplying each pixel's depth value by its position value in the 0/1 matrix.
Step 605: check whether all pixels have been processed; if not, move on to the next pixel; if so, the whole procedure is finished.
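The correction loop of steps 601~605 can be sketched as follows (illustrative names: `labels` stands for the cluster label map from the pre-processing module and `fg_mask` for the foreground segmentation; image borders are skipped for brevity):

```python
import numpy as np

def correct_foreground_depth(depth, labels, fg_mask, n=9):
    """Steps 601-605: for every foreground pixel, replace its depth by the
    mean depth of the valid pixels in an n x n correcting window, where a
    pixel is valid if it carries the same cluster label as the centre."""
    half = n // 2
    out = depth.astype(float).copy()
    h, w = depth.shape
    for i in range(half, h - half):
        for j in range(half, w - half):
            if not fg_mask[i, j]:
                continue  # background pixels are left untouched
            win = depth[i - half:i + half + 1, j - half:j + half + 1]
            valid = (labels[i - half:i + half + 1,
                            j - half:j + half + 1] == labels[i, j])
            # multiplying depth by the 0/1 validity matrix rejects the
            # invalid pixels from the mean, as in step 604
            out[i, j] = (win * valid).sum() / valid.sum()
    return out
```

The centre pixel is always valid (it matches its own label), so the denominator is never zero.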
The above describes the present invention further in conjunction with specific preferred embodiments, but the specific implementation of the invention shall not be regarded as limited to these descriptions. For those skilled in the art, equivalent substitutions or obvious modifications made without departing from the inventive concept, and having the same performance or use, shall all be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A depth map acquisition method for a binocular stereo video sequence, characterized in that it comprises the following steps:
S100) pre-processing step: read in two images of the same time point, and perform clustering on the first image; then use region growing to convert each cluster into a connected region, and record the average five-dimensional coordinates of each region and the adjacency information between regions;
S200) foreground extraction step, comprising: S210) weight calculation step: receive the manual marks input by the operator for marking the foreground and background of the first image, and compute the region connection weights, which represent the connectivity between regions, and the mark weights, which represent the correlation between each region and the manual marks; S220) image segmentation step: taking the region connection weights and mark weights as input, invoke the GraphCut algorithm to segment the first image into a foreground part and a background part;
S300) depth map obtaining step: taking the second image of the two images as the reference image, according to the segmentation result of the foreground extraction step, compute the disparity map of the foreground part with the local adaptive-weight stereo matching algorithm and the disparity map of the background part with the Rank-transform stereo matching algorithm, and then convert the disparity maps of the foreground part and the background part into depth maps.
2. The depth map acquisition method for a binocular stereo video sequence according to claim 1, characterized in that it further comprises:
S400) post-processing step: for the pixels of the foreground part, perform depth value correction with the following steps: choose a depth map correcting window and, according to the result of the pre-processing step, update the depth value of the pixel to be corrected with the mean depth value of all pixels in the correcting window that belong to the same region as the pixel to be corrected.
3. The depth map acquisition method for a binocular stereo video sequence according to claim 1 or 2, characterized in that the clustering in the pre-processing step comprises the following steps:
S110) denoise the image to be processed;
S120) use the K-means clustering algorithm to cluster the first image according to the similarity of the five-dimensional space coordinates, attributing each pixel to the class of the cluster centre with the minimum five-dimensional space distance to it.
4. The depth map acquisition method for a binocular stereo video sequence according to claim 1 or 2, characterized in that step S210) comprises the following steps:
S211) according to the inter-region adjacency information obtained in step S100), set the region connection weight between non-adjacent regions to 0; for any adjacent regions a and b, the region connection weight is
Figure FDA00003064261400011
wherein D_ab = sqrt((R_a − R_b)² + (G_a − G_b)² + (B_a − B_b)²), and (R_a, G_a, B_a) and (R_b, G_b, B_b) are respectively the means of the RGB colour components over the pixels of region a and region b;
S212) receive the foreground mark points s and background mark points t input by the operator;
S213) for each region k of the image, compute its foreground mark weight foreW_ks and background mark weight backW_kt, wherein:
Figure FDA00003064261400021
foreD_ks is the minimum five-dimensional space distance between region k and the foreground mark point s;
Figure FDA00003064261400022
backD_kt is the minimum five-dimensional space distance between region k and the background mark point t.
5. The depth map acquisition method for a binocular stereo video sequence according to claim 2, characterized in that step S400) comprises the following steps:
S401) read the current pixel;
S402) judge whether the current pixel belongs to the foreground part; if not, return to step S401) to process the next pixel, otherwise proceed to the next step;
S403) choose a matrix centred on the current pixel as the depth map correcting window; according to the result of the pre-processing step, define the pixels in the correcting window that do not belong to the same region as the current pixel as invalid pixels, and the remaining pixels as valid pixels;
S404) update the depth value of the current pixel to the mean of the depth values of all valid pixels in the depth map correcting window;
S405) check whether all pixels have been corrected; if not, move on to the next pixel; if so, finish the correction.
6. A depth map acquisition system for a binocular stereo video sequence, characterized in that it comprises:
a pre-processing module (201), comprising a clustering module and a region growing module, the clustering module being used to read in two images of the same time point and perform clustering on the first image, and the region growing module being used to convert each cluster into a connected region by region growing and to record the average five-dimensional coordinates of each region and the adjacency information between regions;
a foreground extraction module (202), comprising a weight calculation module and an image segmentation module, the weight calculation module being used to receive the manual marks input by the operator for marking the foreground and background of the first image and to compute the region connection weights, which represent the connectivity between regions, and the mark weights, which represent the correlation between each region and the manual marks, and the image segmentation module being used to take the region connection weights and mark weights as input and invoke the GraphCut algorithm to segment the first image into a foreground part and a background part;
a depth acquisition module (203), used to take the second image of the two images as the reference image and, according to the segmentation result of the foreground extraction, compute the disparity map of the foreground part with the local adaptive-weight stereo matching algorithm and the disparity map of the background part with the Rank-transform stereo matching algorithm, and then convert the disparity maps of the foreground part and the background part into depth maps.
7. The depth map acquisition system for a binocular stereo video sequence according to claim 6, characterized in that it further comprises:
a post-processing module (204), used, for the pixels of the foreground part, to perform depth value correction with the following steps: choose a depth map correcting window and, according to the result of the pre-processing module, update the depth value of the pixel to be corrected with the mean depth value of all pixels in the correcting window that belong to the same region as the pixel to be corrected.
8. The depth map acquisition system for a binocular stereo video sequence according to claim 6 or 7, characterized in that the clustering module comprises:
a denoising module, used to denoise the image to be processed;
a K-means module, used to cluster the first image with the K-means clustering algorithm according to the similarity of the five-dimensional space coordinates, attributing each pixel to the class of the cluster centre with the minimum five-dimensional space distance to it.
9. The depth map acquisition system for a binocular stereo video sequence according to claim 6 or 7, characterized in that the weight calculation module comprises:
a region connection weight calculation module, used, according to the inter-region adjacency information obtained by the pre-processing module, to set the region connection weight between non-adjacent regions to 0; for any adjacent regions a and b, the region connection weight is
Figure FDA00003064261400031
wherein D_ab = sqrt((R_a − R_b)² + (G_a − G_b)² + (B_a − B_b)²), and (R_a, G_a, B_a) and (R_b, G_b, B_b) are respectively the means of the RGB colour components over the pixels of region a and region b;
a mark input module, used to receive the foreground mark points s and background mark points t input by the operator;
a mark weight calculation module, used, for each region k of the image, to compute its foreground mark weight foreW_ks and background mark weight backW_kt, wherein:
foreD_ks is the minimum five-dimensional space distance between region k and the foreground mark point s;
Figure FDA00003064261400034
backD_kt is the minimum five-dimensional space distance between region k and the background mark point t.
10. The depth map acquisition system for a binocular stereo video sequence according to claim 7, characterized in that the post-processing module comprises:
a reading module, used to read the current pixel;
a judging module, used to judge whether the current pixel belongs to the foreground part; if not, the reading module processes the next pixel, otherwise processing passes to the next module;
a valid-pixel selection module, used to choose a matrix centred on the current pixel as the depth map correcting window and, according to the result of the pre-processing module, to define the pixels in the correcting window that do not belong to the same region as the current pixel as invalid pixels and the remaining pixels as valid pixels;
a depth value update module, used to update the depth value of the current pixel to the mean of the depth values of all valid pixels in the depth map correcting window;
a stop judging module, used to check whether all pixels have been corrected; if not, the next pixel is processed; if so, the correction is finished.
CN201310134715.1A 2013-04-17 2013-04-17 Method and system for acquiring depth map of binocular stereo video sequence Active CN103248906B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201310134715.1A CN103248906B (en) 2013-04-17 2013-04-17 Method and system for acquiring depth map of binocular stereo video sequence
HK13110791.5A HK1183577A1 (en) 2013-04-17 2013-09-20 An attainment method of the depth image of binocular stereo video sequences and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310134715.1A CN103248906B (en) 2013-04-17 2013-04-17 Method and system for acquiring depth map of binocular stereo video sequence

Publications (2)

Publication Number Publication Date
CN103248906A true CN103248906A (en) 2013-08-14
CN103248906B CN103248906B (en) 2015-02-18

Family

ID=48928094

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310134715.1A Active CN103248906B (en) 2013-04-17 2013-04-17 Method and system for acquiring depth map of binocular stereo video sequence

Country Status (2)

Country Link
CN (1) CN103248906B (en)
HK (1) HK1183577A1 (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103473743A (en) * 2013-09-12 2013-12-25 清华大学深圳研究生院 Method for obtaining image depth information
CN103888749A (en) * 2014-04-03 2014-06-25 清华大学深圳研究生院 Method for converting double-view video into multi-view video
CN103996206A (en) * 2014-02-24 2014-08-20 航天恒星科技有限公司 GraphCut-based interactive target extraction method in complicated background remote-sensing image
CN104463183A (en) * 2013-09-13 2015-03-25 株式会社理光 Cluster center selecting method and system
CN104639933A (en) * 2015-01-07 2015-05-20 前海艾道隆科技(深圳)有限公司 Real-time acquisition method and real-time acquisition system for depth maps of three-dimensional views
CN105025193A (en) * 2014-04-29 2015-11-04 钰创科技股份有限公司 Portable stereo scanner and method for generating stereo scanning result of corresponding object
CN105282375A (en) * 2014-07-24 2016-01-27 钰创科技股份有限公司 Attached Stereo Scanning Module
CN106991370A (en) * 2017-02-28 2017-07-28 中科唯实科技(北京)有限公司 Pedestrian retrieval method based on color and depth
CN107481250A (en) * 2017-08-30 2017-12-15 吉林大学 A kind of image partition method and its evaluation method and image interfusion method
CN109215044A (en) * 2017-06-30 2019-01-15 京东方科技集团股份有限公司 Image processing method and system, storage medium and mobile system
CN110263825A (en) * 2019-05-30 2019-09-20 湖南大学 Data clustering method, device, computer equipment and storage medium
CN110335389A (en) * 2019-07-01 2019-10-15 上海商汤临港智能科技有限公司 Car door unlocking method and device, system, vehicle, electronic equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101877128A (en) * 2009-12-23 2010-11-03 中国科学院自动化研究所 Method for segmenting different objects in three-dimensional scene
CN102263979A (en) * 2011-08-05 2011-11-30 清华大学 Depth map generation method and device for plane video three-dimensional conversion
CN102622768A (en) * 2012-03-14 2012-08-01 清华大学 Depth-map gaining method of plane videos


Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103473743B (en) * 2013-09-12 2016-03-02 清华大学深圳研究生院 A kind of method obtaining image depth information
CN103473743A (en) * 2013-09-12 2013-12-25 清华大学深圳研究生院 Method for obtaining image depth information
CN104463183B (en) * 2013-09-13 2017-10-10 株式会社理光 Cluster centre choosing method and system
CN104463183A (en) * 2013-09-13 2015-03-25 株式会社理光 Cluster center selecting method and system
CN103996206B (en) * 2014-02-24 2017-01-11 航天恒星科技有限公司 GraphCut-based interactive target extraction method in complicated background remote-sensing image
CN103996206A (en) * 2014-02-24 2014-08-20 航天恒星科技有限公司 GraphCut-based interactive target extraction method in complicated background remote-sensing image
CN103888749B (en) * 2014-04-03 2016-07-27 清华大学深圳研究生院 A kind of method of the many visual frequencies of binocular video conversion
CN103888749A (en) * 2014-04-03 2014-06-25 清华大学深圳研究生院 Method for converting double-view video into multi-view video
CN105025193A (en) * 2014-04-29 2015-11-04 钰创科技股份有限公司 Portable stereo scanner and method for generating stereo scanning result of corresponding object
CN105025193B (en) * 2014-04-29 2020-02-07 钰立微电子股份有限公司 Portable stereo scanner and method for generating stereo scanning result of corresponding object
CN105282375A (en) * 2014-07-24 2016-01-27 钰创科技股份有限公司 Attached Stereo Scanning Module
CN105282375B (en) * 2014-07-24 2019-12-31 钰立微电子股份有限公司 Attached stereo scanning module
CN104639933A (en) * 2015-01-07 2015-05-20 前海艾道隆科技(深圳)有限公司 Real-time acquisition method and real-time acquisition system for depth maps of three-dimensional views
CN106991370A (en) * 2017-02-28 2017-07-28 中科唯实科技(北京)有限公司 Pedestrian retrieval method based on color and depth
CN106991370B (en) * 2017-02-28 2020-07-31 中科唯实科技(北京)有限公司 Pedestrian retrieval method based on color and depth
CN109215044A (en) * 2017-06-30 2019-01-15 京东方科技集团股份有限公司 Image processing method and system, storage medium and mobile system
CN109215044B (en) * 2017-06-30 2020-12-15 京东方科技集团股份有限公司 Image processing method and system, storage medium, and mobile system
CN107481250A (en) * 2017-08-30 2017-12-15 吉林大学 A kind of image partition method and its evaluation method and image interfusion method
CN110263825A (en) * 2019-05-30 2019-09-20 湖南大学 Data clustering method, device, computer equipment and storage medium
CN110263825B (en) * 2019-05-30 2022-05-10 湖南大学 Data clustering method and device, computer equipment and storage medium
CN110335389A (en) * 2019-07-01 2019-10-15 上海商汤临港智能科技有限公司 Car door unlocking method and device, system, vehicle, electronic equipment and storage medium

Also Published As

Publication number Publication date
HK1183577A1 (en) 2013-12-27
CN103248906B (en) 2015-02-18

Similar Documents

Publication Publication Date Title
CN103248906B (en) Method and system for acquiring depth map of binocular stereo video sequence
CN104574375B (en) Image significance detection method combining color and depth information
CN102098526A (en) Depth map calculating method and device
CN102930296B An image recognition method and device
CN101443817B (en) Method and device for determining correspondence, preferably for the three-dimensional reconstruction of a scene
CN102665086B (en) Method for obtaining parallax by using region-based local stereo matching
CN101610425B (en) Method for evaluating stereo image quality and device
CN104756491A (en) Depth map generation from a monoscopic image based on combined depth cues
CN110189294B (en) RGB-D image significance detection method based on depth reliability analysis
CN101651772A (en) Method for extracting video interested region based on visual attention
CN110827312B (en) Learning method based on cooperative visual attention neural network
CN104994375A (en) Three-dimensional image quality objective evaluation method based on three-dimensional visual saliency
CN101877143A (en) Three-dimensional scene reconstruction method of two-dimensional image group
CN103136748B A feature-map-based objective quality evaluation method for stereo images
CN104574404A (en) Three-dimensional image relocation method
CN109493373B (en) Stereo matching method based on binocular stereo vision
CN107689060A (en) Visual processing method, device and the equipment of view-based access control model processing of destination object
CN103260043A (en) Binocular stereo image matching method and system based on learning
CN108021857B (en) Building detection method based on unmanned aerial vehicle aerial image sequence depth recovery
CN102223545B (en) Rapid multi-view video color correction method
CN104144339B (en) A kind of matter based on Human Perception is fallen with reference to objective evaluation method for quality of stereo images
CN111641822A (en) Method for evaluating quality of repositioning stereo image
CN105898279B An objective quality evaluation method for stereo images
CN107909611A A method for extracting curvature features of space curves using differential geometry
CN105138979A Method for detecting the head of a moving human body based on stereo vision

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 1183577

Country of ref document: HK

C14 Grant of patent or utility model
GR01 Patent grant
REG Reference to a national code

Ref country code: HK

Ref legal event code: GR

Ref document number: 1183577

Country of ref document: HK

EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20130814

Assignee: JIANGSU ORIGINAL FORCE COMPUTER ANIMATION PRODUCTION CO., LTD.

Assignor: Graduate School at Shenzhen, Tsinghua University

Contract record no.: 2016440020012

Denomination of invention: Method and system for acquiring depth map of binocular stereo video sequence

Granted publication date: 20150218

License type: Exclusive License

Record date: 20160308

LICC Enforcement, change and cancellation of record of contracts on the licence for exploitation of a patent or utility model