CN100592338C — Multi-view video image depth search method and depth estimation method
Publication number: CN100592338C; application number: CN200810300330A
 Authority
 CN
 China
Prior art keywords: search, pixel, depth, depth value, camera
Classifications

 G—PHYSICS
 G06—COMPUTING; CALCULATING; COUNTING
 G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
 G06T7/00—Image analysis
 G06T7/50—Depth or shape recovery
 G06T7/55—Depth or shape recovery from multiple images
 G06T7/593—Depth or shape recovery from multiple images from stereo images

 G—PHYSICS
 G06—COMPUTING; CALCULATING; COUNTING
 G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
 G06T2207/00—Indexing scheme for image analysis or image enhancement
 G06T2207/10—Image acquisition modality
 G06T2207/10004—Still image; Photographic image
 G06T2207/10012—Stereo images
Abstract
Description
Technical field
The present invention relates to multi-view video image processing technology.
Background art
In recent years researchers have gradually recognized that future advanced three-dimensional television and free-viewpoint video (FVV, Free Viewpoint Video) systems should use computer vision, video processing and depth-image-based scene synthesis to decouple acquisition from display: the viewing angle should not be restricted by the positions of the cameras that captured the video, thereby providing a high degree of flexibility, interactivity and operability. The European 3D-television project adopted the video-plus-depth data format (C. Fehn, "Depth-image-based rendering (DIBR), compression and transmission for a new approach on 3D-TV," in Proc. SPIE Conf. Stereoscopic Displays and Virtual Reality Systems XI, vol. 5291, CA, U.S.A., Jan. 2004, pp. 93-104), in which each pixel of an image carries a corresponding depth value. Using the depth-image-based view synthesis method (DIBR: Depth Image Based Rendering), the decoder at the receiving end can generate a stereo image pair matched to the display configuration and viewing angle, so that the viewing angle is not tied to the camera positions used for acquisition. A proposal at the April 2007 JVT meeting (A. Smolic and K. Mueller, et al., "Multi-View Video plus Depth (MVD) Format for Advanced 3D Video Systems", ISO/IEC JTC1/SC29/WG11, Doc. JVT-W100, San Jose, USA, April 2007) generalized video-plus-depth to multi-view video, proposing the multi-view video plus depth (MVD) coded data format. Because MVD satisfies an essential requirement of advanced 3D video and free-viewpoint video applications, namely that the decoder can generate views at continuously varying viewpoints within a certain range rather than only a limited number of discrete views, the MVD scheme was adopted by JVT and identified as the direction of future development.
Therefore, how to obtain the depth information of a scene from two or more views captured at different viewing angles has become one of the major problems in multi-view video processing.
The current depth-search approach uses a fixed search step (a uniform depth grid) over a fixed search range. With a fixed step, if the step is chosen so that it corresponds to a one-pixel offset at small depth values, then at large depth values the pixel offset corresponding to that step will be less than one pixel. If a projection at a given depth falls on a non-integer position and the nearest pixel is taken as the projection point, the same pixel will then be found at several different depth values during the search; that is, repeated search occurs. Conversely, if the step is chosen to correspond to a one-pixel offset at large depth values, then at small depth values the corresponding pixel offset exceeds one pixel, so two adjacent depth values map to two non-adjacent pixels; some pixels are skipped and the search is incomplete. Thus a search over the range [z_min, z_max] intended to visit N pixels will, because of repeated and missed pixels, actually visit fewer than N effective search points. To ensure that the search range covers all possible values of the true scene depth, the range is usually set large, and to guarantee a certain search precision the step is set small; this greatly increases the number of search steps and the computational load, and because of the missed and repeated searches the results are still poor.
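The repeat/miss behaviour described above can be reproduced numerically. The sketch below assumes a parallel camera pair, where the disparity of a point at depth z is d = f·B/z; all numbers (focal length times baseline, depth range) are illustrative, not from the patent.

```python
# Illustration (assumed parallel-camera numbers, not from the patent): with
# disparity d = f*B/z, a FIXED depth step over-samples far depths (repeated
# pixels) and under-samples near depths (skipped pixels).
f_times_B = 1000.0          # assumed focal-length * baseline, in pixel units

def disparity(z):
    """Pixel disparity of a point at depth z in a parallel camera pair."""
    return f_times_B / z

# Fixed-step search over [10, 100] with step 1.0 -> 91 depth candidates.
depths = [10.0 + k for k in range(91)]
pixels = [round(disparity(z)) for z in depths]

repeats = len(pixels) - len(set(pixels))                 # same pixel hit twice
covered = sorted(set(pixels))
gaps = sum(b - a - 1 for a, b in zip(covered, covered[1:]))  # pixels skipped

print(repeats, gaps)   # both are non-zero: repeated AND missed pixels
```

At the near end (z around 10) one depth step jumps several pixels at once, while at the far end (z around 100) many depth steps land on the same pixel, exactly the incompleteness and redundancy the paragraph describes.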
So far there has been much research on depth estimation algorithms, but most of it first performs disparity estimation on a rectified, parallel stereo pair and then computes depth from the disparity-depth relation. For example, in a parallel camera system only horizontal disparity exists between the two images; disparity is first estimated with feature-based or block-matching methods, and depth is then computed from the inverse proportionality between depth and disparity. For a non-parallel camera system, the depth map of the original view is obtained only after a series of steps such as image rectification, disparity matching, depth computation and de-rectification. In essence such depth estimation is disparity estimation, and its performance is determined mainly by the disparity estimation algorithm. As is well known, disparity estimation, or stereo matching, is a classical problem in computer vision; despite a large body of research and results, the matching ambiguity and uncertainty caused by lack of texture or by occlusion keep disparity matching a focus and a difficulty of computer vision research.
In 2006, a JVT meeting proposal (S. Yea, J. Oh, S. Ince, E. Martinian and A. Vetro, "Report on Core Experiment CE3 of Multiview Coding", ISO/IEC JTC1/SC29/WG11, Doc. JVT-T123, Klagenfurt, Austria, July 2006) proposed using the camera intrinsic and extrinsic parameters together with depth-based view synthesis to search, with a given step over a specified depth range, for the depth that minimizes the error between the synthesized view and the actual view, taking that depth as the estimate. M. Okutomi et al. proposed the multiple-baseline stereo matching method, which uses the inverse relation between depth and disparity to convert disparity estimation into a depth-solving problem, eliminating an uncertainty problem in disparity matching (M. Okutomi and T. Kanade, "A multiple-baseline stereo", IEEE Trans. on Pattern Analysis and Machine Intelligence 15(4): 353-363, 1993). N. Kim et al. proposed performing the depth search, matching and view-synthesis operations directly in range/depth space (N. Kim, M. Trivedi and H. Ishiguro, "Generalized multiple baseline stereo and direct view synthesis using range-space search, match, and render", International Journal of Computer Vision 47(1/2/3): 131-148, 2002): the search is carried out directly in depth space, no disparity matching is needed, image rectification is accomplished within the depth search itself, and the depth value is continuous, so its precision is not limited by the image pixel resolution as a disparity vector is. In practice, however, a depth search range and step size must be specified and an optimum found under some cost function, and whether the chosen range and step are suitable is critical to estimation performance.
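As a toy illustration of this kind of range-space search (assumed one-dimensional setup and synthetic intensities; not the cited authors' implementation), the sketch below scores each candidate depth by the difference between the target pixel and the reference sample it projects to, and keeps the minimizer.

```python
# Toy sketch of range-space depth search for ONE pixel (assumed 1-D setup,
# synthetic intensities; not the cited authors' implementation).  Each
# candidate depth is projected into the reference image via d = f*B/z and
# scored with an absolute-difference cost; the minimizer is the estimate.
f_times_B = 600.0
true_depth = 30.0

def ref_intensity(x):
    """Synthetic reference scanline: a quadratic dip centred at x = 20."""
    return (x - 20.0) ** 2

x_t = 40.0                                    # pixel column in the target view
target_value = ref_intensity(x_t - f_times_B / true_depth)   # value at x = 20

def cost(z):
    """Matching cost of candidate depth z for target pixel x_t."""
    return abs(ref_intensity(x_t - f_times_B / z) - target_value)

candidates = [20.0 + 0.5 * k for k in range(41)]    # search 20..40, step 0.5
best = min(candidates, key=cost)
print(best)  # -> 30.0, the true depth
```

Note that the quality of `best` depends entirely on the candidate grid: too coarse a step skips the true depth, which is precisely the range-and-step selection problem the following paragraphs discuss.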
In disparity matching the disparity search range is usually determined intuitively from the image properties, but in depth search, and particularly in a non-parallel camera system, the relation between depth change and image pixel offset is not obvious, so a reasonable search range is hard to determine.
Hence, determining a suitable depth search interval and step size for a given set of multi-view images becomes the key to estimating depth information effectively.
JVT-W059 ("Report of CE6 on View Synthesis Prediction"; S. Yea and A. Vetro, ISO/IEC JTC1/SC29/WG11, Doc. JVT-W059, San Jose, USA, April 2007) proposes using matched feature points between two views: from several candidate sets of depth-search minimum, maximum and step size, the set that minimizes the error at the matched feature points is chosen as the search range and step. This method needs the KLT (Kanade-Lucas-Tomasi) algorithm (C. Tomasi and T. Kanade, "Detection and tracking of point features", Technical Report CMU-CS-91-132, Carnegie Mellon University, 1991) for feature extraction and matching, and its performance depends on the correctness of the feature matching.
M. Okutomi, N. Kim and their co-authors mention taking the depth change corresponding to a one-pixel offset in the longest-baseline reference view as the step size, which guarantees that the pixel offset in every other reference view is less than one pixel.
Both of the above approaches use a fixed search step and do not adapt the step to changes in image content or scene.
Summary of the invention
The technical problem to be solved by the invention is to propose an adaptive method of determining the search step size that avoids repeated or missed pixel searches. The invention further proposes a depth estimation method based on the adaptive search step.
The technical solution adopted by the invention to solve the above technical problem is a depth search method for multi-view video images, characterized in that within the depth search range the step size of each step is dynamically adjusted according to the current depth value: the smaller the current depth value, the smaller the step size; the larger the current depth value, the larger the step size, so that every step corresponds to the same pixel search precision. The pixel search precision equals the length of the pixel-offset vector of each search step; it may be a fractional-pixel precision, such as half a pixel or a quarter pixel, or a whole-pixel precision, such as one pixel or two pixels. The search step size equals the depth change corresponding to one pixel-offset vector in the search.
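For the parallel-camera special case this adaptive rule has a simple closed form: stepping uniformly in disparity rather than in depth. The sketch below (assumed numbers, illustrative only) builds such a grid and checks that each step visits exactly one new pixel while the depth step itself grows with depth.

```python
# Sketch of the adaptive-step idea for the parallel-camera special case
# (assumed numbers): instead of stepping depth uniformly, step so that every
# depth increment moves the reference-view projection by exactly one pixel.
# With d = f*B/z this means stepping uniformly in DISPARITY: z = f*B/d.
f_times_B = 1000.0
z_min, z_max = 10.0, 100.0

d_max = f_times_B / z_min     # 100 px disparity at the near end
d_min = f_times_B / z_max     # 10 px disparity at the far end

# One candidate depth per integer disparity -> one new pixel per step.
depth_grid = [f_times_B / d for d in range(int(d_max), int(d_min) - 1, -1)]

pixels = [round(f_times_B / z) for z in depth_grid]
steps = [b - a for a, b in zip(depth_grid, depth_grid[1:])]

# Every step lands on exactly one new pixel (no repeats, no gaps), and the
# depth step grows with the current depth, as the method requires.
print(len(set(pixels)) == len(pixels), all(b > a for a, b in zip(steps, steps[1:])))
```

Compare with the fixed-step case over the same range: here the same 91 candidates cover every pixel from 10 to 100 exactly once.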
According to the relation between the depth change and the pixel-offset vector, the determination of the depth search range and step size is converted into the determination of a pixel search range and a pixel search precision.
The search step of the target view is determined, following the perspective projection principle of computer vision and the principle of depth-based view synthesis, by the current depth value, the pixel-offset vector in the reference view, and the camera intrinsic and extrinsic parameters of the views; each step in the target view then corresponds to a pixel-offset vector of equal length in the reference view. The target view is the image whose depth is currently being estimated; a reference view is any other image in the multi-view system. Reference views may be selected automatically during the depth search or specified by the user.
The search step size is obtained from the following formula:

Δz = ‖ΔP_r‖^2 (z b_3 P + c_3 Δt_r)^2 / (ΔP_r^T [(c_3 Δt_r) B_r P − (b_3 P) C_r Δt_r])

where P is a pixel in the target view whose depth is to be estimated; z is the current depth value of pixel P; Δz, the depth change of pixel P, is the search step size; ΔP_r is the pixel offset vector in reference view r corresponding to the depth change Δz of pixel P in the target view, with ‖ΔP_r‖^2 = ΔP_r^T ΔP_r. B_r = A_r R_r^-1 R A^-1 and C_r = A_r R_r^-1 are 3×3 matrices, and Δt_r = t − t_r is a 3-vector, where R is the 3D rotation matrix of the camera coordinate system of the target view with respect to the world coordinate system; t is the translation vector of the camera coordinate system of the target view with respect to the world coordinate system; A is the camera intrinsic parameter matrix of the target view; R_r is the 3D rotation matrix of the camera coordinate system of the reference view with respect to the world coordinate system; t_r is the translation vector of the camera coordinate system of the reference view with respect to the world coordinate system; A_r is the camera intrinsic parameter matrix of the reference view; and b_3 and c_3 are the third-row vectors of matrices B_r and C_r, respectively. For a parallel camera system, the depth change is directly proportional to the square of the current depth value.
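As a numerical illustration of this relation, the step size can be computed directly from the camera parameters. The following is a minimal Python sketch under a first-order approximation in Δz; the function name and matrix conventions are assumptions of the sketch, not part of the patent.

```python
import numpy as np

def adaptive_depth_step(z, dP, A, R, t, A_r, R_r, t_r, P):
    """First-order depth step dz whose re-projected offset in the
    reference view equals the pixel offset vector dP.  P and dP are
    homogeneous 3-vectors: P = [u, v, 1], dP = [du, dv, 0]."""
    B = A_r @ np.linalg.inv(R_r) @ R @ np.linalg.inv(A)   # B_r
    C = A_r @ np.linalg.inv(R_r)                          # C_r
    dt = t - t_r                                          # delta t_r
    b3P = B[2] @ P        # scalar: third row of B_r times P
    c3dt = C[2] @ dt      # scalar: third row of C_r times delta t_r
    D = z * b3P + c3dt    # projective denominator of the synthesis formula
    K = c3dt * (B @ P) - b3P * (C @ dt)
    # premultiply dP ~ dz*K/D^2 by dP^T and solve for dz:
    return (dP @ dP) * D ** 2 / (dP @ K)
```

For a parallel rig with focal length f and baseline B this reduces to Δz = ‖ΔP_r‖ z^2/(fB), consistent with the stated proportionality to the square of the depth.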
The pixel offset vector in the reference view satisfies the epipolar constraint equation of the target view and the reference view:

ΔP_r^T (C_r Δt_r × B_r) P = 0
The depth estimation method for multi-view video images uses depth-based view synthesis and block-matching depth search; the depth search range and search step size of the target view are decided by the pixel search range and pixel search precision of the reference view. Within the depth search range, the search step size of each step is dynamically adjusted according to the current depth value: the smaller the current depth value, the smaller the adopted step size; the larger the current depth value, the larger the adopted step size, so that each step corresponds to the same pixel search precision. Depth-based view synthesis means that, given a pixel of the target view and its depth value, the pixel is back-projected to a 3D scene point according to the intrinsic and extrinsic camera parameters of the target and reference views, and the scene point is then re-projected onto the image plane of the reference view, yielding the synthesized view of the target view at that reference viewpoint. The depth search based on view synthesis and block matching specifically means: use the current depth value to synthesize a view, compute the error between the pixel block of the synthesized view and the pixel block of the reference view, and adopt the depth value corresponding to the least error as the depth estimate of the target view;
The depth search step size is determined by the current depth value, the pixel offset vector in the reference view, and the intrinsic and extrinsic camera parameters of the views; each search step in the target view corresponds to a pixel offset vector of equal length in the reference view;
The depth estimation method for multi-view video images specifically comprises the following steps:
Step 1: estimate the initial depth search value z_k (k = 0) of the target view;
Step 2: determine the pixel search range and pixel search precision in the reference view corresponding to the depth search, and obtain the pixel offset vector ΔP_r in the reference view according to the pixel search precision;
Step 3: from the current depth value z_k and the pixel offset vector ΔP_r, obtain the corresponding depth change Δz; this depth change Δz is the next search step size;
Step 4: synthesize a view using the current depth value z_k, and compute the error e_k between the pixel block of the synthesized view and the pixel block of the reference view;
Step 5: update the current depth value z_k = z_k + Δz, and set k = k + 1;
Step 6: judge whether the given pixel search range has been exceeded; if so, go to step 7; if not, go to step 3;
Step 7: take as the estimate the depth value corresponding to the least of the errors e_k (k = 0, …, N−1, where N is the total number of search steps).
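The numbered steps above can be sketched as a loop. The fragment below is a simplified illustration: a stand-in cost function replaces the block-matching error between synthesized and reference views, and the parallel-camera rule Δz = Δ z^2/(fB) replaces the general step-size formula; both substitutions are assumptions of the sketch.

```python
def depth_search(cost, z0, f, B, n_pixels, prec=1.0):
    """Adaptive-step depth search: each step moves the reference-view
    projection by `prec` pixels, until `n_pixels` have been covered."""
    best_z, best_e = z0, cost(z0)
    for sign in (+1.0, -1.0):            # search both depth directions
        z, shifted = z0, 0.0
        while shifted < n_pixels:        # stay inside the pixel range
            dz = sign * prec * z * z / (f * B)   # adaptive step size
            z += dz                      # update the current depth
            shifted += prec
            e = cost(z)
            if e < best_e:               # keep the least-error depth
                best_z, best_e = z, e
    return best_z
```

With f = 1000 px and B = 100 mm, searching from z0 = 1000 for a cost minimized at 1010 reaches 1010 after a single 10 mm step.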
The search step size is obtained from the following formula:

Δz = ‖ΔP_r‖^2 (z b_3 P + c_3 Δt_r)^2 / (ΔP_r^T [(c_3 Δt_r) B_r P − (b_3 P) C_r Δt_r])

where P is a pixel in the target view whose depth is to be estimated; z is the current depth value of pixel P; Δz, the depth change of pixel P, is the search step size; ΔP_r is the pixel offset vector in reference view r corresponding to the depth change Δz of pixel P in the target view, with ‖ΔP_r‖^2 = ΔP_r^T ΔP_r. B_r = A_r R_r^-1 R A^-1 and C_r = A_r R_r^-1 are 3×3 matrices, and Δt_r = t − t_r is a 3-vector, where R is the 3D rotation matrix of the camera coordinate system of the target view with respect to the world coordinate system; t is the translation vector of the camera coordinate system of the target view with respect to the world coordinate system; A is the camera intrinsic parameter matrix of the target view; R_r is the 3D rotation matrix of the camera coordinate system of the reference view with respect to the world coordinate system; t_r is the translation vector of the camera coordinate system of the reference view with respect to the world coordinate system; A_r is the camera intrinsic parameter matrix of the reference view; and b_3 and c_3 are the third-row vectors of matrices B_r and C_r, respectively. For a parallel camera system, the depth change is directly proportional to the square of the current depth value. The pixel offset vector in the reference view satisfies the epipolar constraint equation of the target view and the reference view:

ΔP_r^T (C_r Δt_r × B_r) P = 0
The beneficial effects of the invention are that depth search with the adaptive search step size produces neither missed nor repeated pixel searches; in the depth estimation, the absolute difference between the synthesized image block and the reference image block is small, mis-estimates are few, and the amount of computation, that is, the number of depth search steps, is small.
Description of the Drawings
Fig. 1 is a schematic diagram of the arrangement of the coordinate systems in a multi-view video system;
Fig. 2 is a schematic diagram of depth-based view synthesis;
Fig. 3(a) is the view at the initial time of the video sequence of the 7th camera in the Uli test sequence;
Fig. 3(b) is the view at the initial time of the video sequence of the 8th camera in the Uli test sequence;
Fig. 3(c) is a partial view of Fig. 3(a); the 16 marked pixels lie in the image region from pixel [527, 430] to [590, 493];
Fig. 4 is a schematic diagram of the relation between the depth change and the square of the depth value;
Fig. 5 is a schematic diagram of the depth change and the pixel offset vectors of the present invention;
Fig. 6 is a schematic diagram of pixels missed by the search at small depth values;
Fig. 7 is a schematic diagram of pixels searched repeatedly at large depth values;
Fig. 8 is a schematic diagram of the adaptive adjustment of the depth search step size of the present invention;
Fig. 9 is a schematic diagram of the distribution of the pixels searched with the adaptive variable step size of the present invention;
Fig. 10 is a schematic diagram of the depth search performance of a fixed search step size and of the adaptive step size of the present invention.
Embodiment
The present invention proposes an adaptive method for determining the depth search step size. Using the intrinsic and extrinsic camera parameters and the perspective projection relation, it first derives the relation among the depth value of a pixel, the depth change, and the offset of the projected point in the synthesized view caused by that depth change. According to the derived relation between the depth change and the corresponding pixel offset, the determination of the depth search range is converted into the determination of a pixel search range; the pixel offset has an intuitive meaning in the image and is easy to determine reasonably. Further, according to the relation between pixel offset and depth value, namely that the larger the depth value, the smaller the pixel offset caused by the same depth change, the step size is dynamically adjusted so that each step corresponds to the same pixel search precision, avoiding repeated or missed pixel searches and thereby improving search efficiency and performance. In addition, the invention proposes a simple and effective initial depth estimation method: it solves for the convergence point of the camera optical axes in a converging camera system, regards this point as an epitome point of the scene, and thus obtains a rough estimate of the scene depth.
In multi-view video, three types of coordinate system are usually needed to describe the scene and its image position information: the world coordinate system of the scene, the camera coordinate systems, and the pixel coordinate systems, as shown in Fig. 1. The camera coordinate system takes the camera center as origin and the optical axis as z axis, with the xy plane parallel to the image plane; the pixel coordinate system takes the upper-left corner of the image as origin, with horizontal and vertical coordinates u and v.
Let the position of the camera coordinate system o_i x_i y_i z_i of camera c_i (i = 1, …, m) with respect to the world coordinate system oxyz be expressed by the 3D rotation matrix R_i and the translation vector t_i, where m is the number of cameras. A point in the scene expressed by the vector p = [x, y, z] of coordinates in the world coordinate system, and by the vector p_i = [x_i, y_i, z_i] of coordinates in camera coordinate system o_i x_i y_i z_i, then satisfies, according to space geometry and coordinate transformation, the following relation:

p = R_i p_i + t_i    (1)
According to the perspective projection principle of computer vision, the coordinates p_i in the camera coordinate system and the homogeneous pixel coordinates P_i = [u_i, v_i, 1] on the image plane satisfy the following relation:

z_i P_i = A_i p_i    (2)

where A_i is the intrinsic parameter matrix of camera c_i, mainly comprising parameters such as the focal length, principal point, and distortion coefficients.
The present invention performs depth search in depth space based on block matching: using the intrinsic and extrinsic camera parameters and depth-based view synthesis, it searches the depth search range with the search step size for the depth value that minimizes the error between the pixel block of the synthesized view and the pixel block of the corresponding actual reference view, and takes this depth value as the depth estimate of the pixel of the target view. The target view and target viewpoint are the image currently requiring depth estimation and the corresponding viewpoint; the reference views and reference viewpoints are the other images and viewpoints in the multi-view video system. The reference view and reference viewpoint may be selected automatically during the depth search or specified by the user.
When the depth value of a pixel in a view is given, a scene point can be obtained by back-projecting this pixel according to the intrinsic and extrinsic camera parameters; projecting this scene point again (re-projecting) onto the image plane of the desired view yields the synthesized view of that viewpoint. This is the depth-based view synthesis technique, shown in Fig. 2.
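A minimal sketch of this back-projection and re-projection for undistorted pin-hole cameras, assuming numpy; the function name and conventions are illustrative.

```python
import numpy as np

def synthesize_pixel(P1, z1, A1, R1, t1, A2, R2, t2):
    """Back-project pixel P1 (homogeneous [u, v, 1]) at depth z1 into
    the scene, then re-project the scene point into camera 2."""
    p1 = z1 * np.linalg.inv(A1) @ P1     # camera-1 coordinates
    pw = R1 @ p1 + t1                    # world coordinates
    p2 = np.linalg.inv(R2) @ (pw - t2)   # camera-2 coordinates
    P2 = A2 @ p2
    return P2 / P2[2]                    # homogeneous pixel in view 2
```

For a parallel pair with focal length 1000 px and baseline 100 mm, the principal-point pixel at depth 2000 mm re-projects with a disparity of 50 px, matching d = fB/z.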
Consider the case of two views, with view 1 the target view and view 2 a reference view. Pixel P_1 in view 1 has depth value z_1 under the coordinate system of its camera c_1; the corresponding pixel in view 2 is P_2, with depth value z_2 under the coordinate system of camera c_2. From equations (1) and (2) one can derive:

z_1 R_1 A_1^-1 P_1 + t_1 = z_2 R_2 A_2^-1 P_2 + t_2    (3)

From equation (3):

z_1 A_2 R_2^-1 R_1 A_1^-1 P_1 + A_2 R_2^-1 (t_1 − t_2) = z_2 P_2    (4)

For convenience of notation, write:

B = A_2 R_2^-1 R_1 A_1^-1,  C = A_2 R_2^-1,  t = t_1 − t_2    (5)

Then equation (4) becomes:

z_1 B P_1 + C t = z_2 P_2    (6)

where B and C are 3×3 matrices and t is the translation vector between the two cameras. Since P_1 and P_2 are homogeneous coordinates, z_2 can be eliminated from (6), giving the homogeneous pixel coordinates of pixel P_1 in view 2 as:

P_2 = (z_1 B P_1 + C t) / (z_1 b_3 P_1 + c_3 t)    (7)

where b_3 and c_3 are the third-row vectors of matrices B and C, respectively.
From equation (7) it follows that, with the intrinsic and extrinsic parameters of cameras c_1 and c_2 known, a pixel of view 2 is a function of the corresponding pixel in view 1 and its depth value. Equation (7) can thus be used to synthesize the view of view 1 at reference viewpoint 2.
Denoting this function by P_2 = f_2(z, P_1), the pixel P_1 in view 1 with given depth z yields, by back-projection and re-projection, the pixel P_2 of the synthesized view 2 at the viewpoint of camera c_2:

Synthesized_I_2(P_2) = Synthesized_I_2(f_2(z, P_1)) = I_1(P_1)    (8)

where I_1 is view 1, I_2 is view 2, and Synthesized_I_2 is the synthesized view of view 1 at reference camera viewpoint 2. The above description takes a camera system composed of two cameras as an example; it can further be shown that the same principle applies to a camera system composed of m cameras.
Suppose the pixels P_j in a local window W centered at pixel P have the same scene depth value. Then the absolute difference, within window W, between the synthesized view 2 of view 1 and the reference view actually captured by camera 2 at viewpoint 2 is:

e(z) = Σ_{P_j∈W} |Synthesized_I_2(f_2(z, P_j)) − I_2(f_2(z, P_j))|    (9)
Since the synthesized view 2 is computed using the camera parameters corresponding to reference view 2, under the true scene depth value the synthesized view 2 has, in theory, the same luminance and chrominance values as reference view 2. Solving for the depth value of view 1 at pixel P can therefore be converted into the following problem:

z* = argmin_{z∈[z_min, z_max]} e(z)    (10)

that is, within the given depth search range, the depth z minimizing the absolute difference between the synthesized view and the reference view is taken as the final depth estimate.
This method of searching directly in depth space requires no disparity matching; image rectification is completed implicitly during the depth search, and the depth value is continuous, so its precision is not limited by the image pixel resolution as a disparity vector is.
From equation (7), with the intrinsic and extrinsic camera parameters known, a pixel of synthesized view 2 is a function of the corresponding pixel in view 1 and its depth value. If the depth value of pixel P_1 in view 1 changes by Δz, its pixel coordinates in synthesized view 2 become:

P_2' = f_2(z + Δz, P_1)    (11)

Thus the depth change Δz of pixel P_1 in view 1 causes the following pixel offset vector in synthesized view 2:

ΔP = f_2(z + Δz, P_1) − f_2(z, P_1)    (12)
From equation (12), the relation between the depth change Δz of the pixel in view 1 and the corresponding pixel offset vector ΔP in synthesized view 2 can be derived, to first order in Δz, as:

ΔP = Δz [(c_3 t) B P_1 − (b_3 P_1) C t] / (z_1 b_3 P_1 + c_3 t)^2    (13)

Premultiplying both sides of (13) by ΔP^T and solving gives:

Δz = ‖ΔP‖^2 (z_1 b_3 P_1 + c_3 t)^2 / (ΔP^T [(c_3 t) B P_1 − (b_3 P_1) C t])    (14)

where ‖ΔP‖^2 = ΔP^T ΔP is the squared modulus of the pixel offset vector ΔP. Thus, when the camera parameters are known, the depth change Δz corresponding to the pixel offset vector ΔP at depth z_1 can be obtained from (14).
In addition, from equation (6) it follows that the corresponding pixels of the two views satisfy the following epipolar constraint equations:

P_2^T (Ct × B) P_1 = 0    (15)
P_2'^T (Ct × B) P_1 = 0    (16)

where × denotes the vector cross product, that is, (Ct × B) is the skew-symmetric matrix of the vector Ct multiplied by B. Subtracting (16) from (15) shows that the pixel offset vector ΔP must also satisfy the epipolar constraint:

ΔP^T (Ct × B) P_1 = 0    (17)

Given the camera parameters and the pixel P_1, equation (17) is a homogeneous linear equation in the two components Δu and Δv of the pixel offset vector ΔP.
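Solving this homogeneous linear equation amounts to choosing the in-image direction orthogonal to the first two components of (Ct × B)P_1. A sketch, writing the cross product with the skew-symmetric matrix of Ct; the function name and scaling convention are assumptions.

```python
import numpy as np

def epipolar_offset_dirs(P1, Bm, C, t, norm=1.0):
    """The two opposite pixel offset vectors dP of length `norm` that
    satisfy dP^T (Ct x Bm) P1 = 0, as homogeneous [du, dv, 0]."""
    Ct = C @ t
    skew = np.array([[0.0, -Ct[2], Ct[1]],
                     [Ct[2], 0.0, -Ct[0]],
                     [-Ct[1], Ct[0], 0.0]])
    q = skew @ Bm @ P1                  # coefficients of du and dv
    d = np.array([q[1], -q[0], 0.0])    # orthogonal to (q[0], q[1])
    d = norm * d / np.linalg.norm(d[:2])
    return d, -d
```

For a parallel pair the returned directions are horizontal, as expected for purely horizontal disparity.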
For a parallel camera system, the disparity d of a scene point in the two views is inversely proportional to its depth, namely:

d = fB / z    (18)

where d and z are the disparity and the depth, and f and B are the focal length and the baseline length of the cameras. Then, when the depth value of pixel P_1 in view 1 changes from z_1 to z_2, the offset of its corresponding projected point in synthesized view 2 is:

Δd = fB/z_1 − fB/z_2 = fB (z_2 − z_1)/(z_1 z_2) ≈ fB Δz / z^2    (19)

From (19), the pixel offset is directly proportional to the depth change and inversely proportional to the square of the depth value. For the same pixel offset, the larger the current depth value, the larger the corresponding depth change; the smaller the current depth value, the smaller the corresponding depth change. For a converging camera system, when the angle between the two cameras is not very large, equation (12) shows that an approximately similar relation holds among the depth change, the pixel offset, and the depth value.
To verify this conclusion, we use the Uli test sequence shown in Fig. 3 (this multi-view video data is provided by the Heinrich-Hertz-Institut (HHI) of Germany and can be downloaded from https://www.3dtvresearch.org/3dav_CfP_FhG_HHI/; the sequence is captured by 8 cameras arranged in a converging configuration, with video format 1024x768 at 25 fps; this description uses the views at the initial time of the video sequences of the 7th and 8th cameras). Using the parameters of the 7th and 8th cameras, the relation among the depth change, the square of the depth value, and the pixel offset is computed according to equations (14) and (17) at pixel P = [526, 429] in view 7 (Fig. 3(a)), corresponding to the clasp on the right of the shirt collar. Given a unit pixel offset vector satisfying the epipolar constraint (17), that is, ‖ΔP‖ = 1, the depth changes corresponding to different depth values are computed from (14); the relation between them is shown in Fig. 4, where the abscissa is the square of the depth value and the ordinate is the depth change. Fig. 4 shows that, for a fixed pixel offset in the synthesized view, the depth change is approximately linear in the square of the depth value; this means that at different depth values, the same depth change of a pixel in view 1 causes different pixel offsets in the synthesized view.
It should be noted that, since (17) is a homogeneous linear equation in the pixel offset vector ΔP, there are two mutually opposite solutions ΔP_+ and ΔP_−; substituting them into (14) yields one positive and one negative depth change, Δz_+ and Δz_−, that is, ΔP_+ and ΔP_− correspond respectively to the pixel offsets caused by an increase and a decrease of the depth value. From the preceding analysis, for a fixed pixel offset the depth change is approximately proportional to the square of the depth value, so the depth changes corresponding to the two opposite offset vectors ΔP of equal size are unequal: the depth decrease |Δz_−| is smaller than the depth increase |Δz_+|, as shown in Fig. 5. For example, for pixel P = [526, 429] in Uli view 7 (Fig. 3(a)) at depth value 3172 mm with a pixel offset of 64 pixels, that is, ‖ΔP‖ = 64, the depth changes corresponding to the two opposite pixel offset vectors obtained from (14) and (17) are Δz_+ = 930 and Δz_− = −593, respectively.
From the above analysis, for the same pixel offset the depth change is approximately proportional to the square of the depth value. Hence, with a fixed search step size, if the step is set at a small depth value to correspond to an offset of 1 pixel, then at large depth values the pixel offset corresponding to this step will be less than 1 pixel; assuming that when a projection falls on a non-integer pixel the nearest pixel is taken as the projected point, the depth search will then find the same pixel at several different depth values, that is, repeated search occurs. Conversely, if the step is set at a large depth value to correspond to an offset of 1 pixel, then at small depth values the pixel offset corresponding to this step will exceed 1 pixel, that is, two adjacent depth values will find two non-adjacent pixels, so some pixels are skipped and the search is incomplete. Thus, although one originally expects to search N pixels within the range [z_min, z_max], because of repeated or missed pixel searches the number of effective search points actually obtained is less than N.
For example, we perform a depth search for pixel P = [526, 429] in Uli view 7 over the range [2000, 4500] with a fixed step of 10 mm. As shown in Fig. 6, at small depth values the u coordinate of the pixel found at depth 2090 is 661 while that of the pixel found at depth 2080 is 663, so a pixel in between is skipped and never searched. And as shown in Fig. 7, at large depth values the same pixel with u coordinate 437 is found at the two different depth values 4450 and 4460, that is, the pixel is searched repeatedly. Since the 10 mm step corresponds to a search precision of 1 pixel only near the true depth value 3170, we originally expected to search 250 different pixels over [2000, 4500], but because of missed and repeated pixel searches, actual computation shows that only 200 pixels were searched.
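The mismatch can be reproduced with the parallel-camera model d = fB/z; f and B below are assumed values chosen so that a 10 mm step is worth roughly one pixel near z = 3170 mm.

```python
f, B = 1000.0, 1000.0        # assumed focal length (px) and baseline (mm)

def disparity(z_mm):
    return f * B / z_mm      # d = f*B/z

# pixel offset produced by the same fixed 10 mm step at different depths:
off_near = disparity(2000.0) - disparity(2010.0)   # about 2.5 px: pixels skipped
off_mid  = disparity(3170.0) - disparity(3180.0)   # about 1 px: precision as intended
off_far  = disparity(4500.0) - disparity(4510.0)   # about 0.5 px: pixels repeated
```

A fixed step thus under-samples near depths and over-samples far ones, exactly the behavior of Figs. 6 and 7.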
In order that, throughout the depth search, the step size corresponds to the same pixel search precision in the reference view, that is, always corresponds to a fixed offset of one pixel in the reference view, the step size must be adjusted dynamically according to the relation between the depth change and the depth value, and the corresponding search range must be determined. Suppose the initial search depth of pixel P_1 in view 1 is z_0; then the depth change Δz in view 1 corresponding to the pixel offset ΔP in reference view 2 at depth z_0 is easily obtained from equation (14). When the initial depth value z_0 does not deviate very much from the true depth value, the pixel offset between the true corresponding pixel of P_1 in reference view 2 and the pixel obtained at depth z_0 is usually confined to a certain range. The following shows how, within a pixel search range of N, to determine the step size adaptively according to the depth value so that each step always corresponds to a fixed pixel offset.
Given the pixel P_1 and the camera parameters, the two mutually opposite offset vectors ΔP_+ and ΔP_− corresponding to the pixel offset ‖ΔP‖ are easily solved from the epipolar constraint (17); the two corresponding depth changes Δz_{+1} and Δz_{−1} are then computed from (14) and used as the step sizes toward larger and smaller depth for the next depth values, as shown in Fig. 8:

z_{±1} = z_0 + Δz_{±1}    (20)

Then, at depth value z_{−1} with offset vector ΔP_−, the corresponding depth change Δz_{−2} is computed from (14); at depth value z_{+1} with offset vector ΔP_+, the corresponding depth change Δz_{+2} is computed from (14); these are taken as the next step sizes:

z_{+2} = z_{+1} + Δz_{+2}
z_{−2} = z_{−1} + Δz_{−2}    (21)

By analogy, the search depth and step size at step n are:

z_{+n} = z_{+(n−1)} + Δz_{+n}
z_{−n} = z_{−(n−1)} + Δz_{−n}    (22)

where the number of search steps n is determined by the search range N and the search precision Δ, that is, n satisfies nΔ ≤ N.
Thus, once the search range and the initial depth value z_0 are determined, the above method yields a variable step size adaptively adjusted as the depth value changes, keeping the same pixel search precision throughout the depth search and overcoming the repeated or missed pixel searches of a fixed step size. Since the depth search range is obtained by accumulating the step sizes, it too adapts to the variation of the depth value: when the depth value grows, the depth search range corresponding to the same pixel offset ‖ΔP‖ grows accordingly; when the depth value shrinks, the depth search range corresponding to the same pixel offset ‖ΔP‖ shrinks accordingly. In addition, the depth search precision is easily controlled through the pixel precision Δ: Δ = 1 corresponds to a search precision of one pixel, and Δ = 1/2 to a search precision of half a pixel.
Thus, with the relation between the depth change and the pixel offset vector, as in equation (14), the corresponding depth search step size can be determined by choosing the pixel search precision, and the determination of the depth search range is likewise converted into the determination of a corresponding pixel offset. Determining the pixel offset and search precision resembles determining the search range and precision in disparity estimation: it has an intuitive meaning and is easy, and the corresponding depth search range and step size can be determined dynamically, according to the image content or application demands, by adjusting the pixel offset and search precision.
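As an illustration of the two-sided recursion (20)-(22), the sketch below assumes a parallel rig, for which equation (14) reduces to Δz = ±Δ z^2/(fB); the geometry is an assumption, not the converging Uli setup.

```python
def step_sequence(z0, n, f, B, prec=1.0):
    """Two-sided adaptive step sequence: each step moves the
    reference-view projection by `prec` pixels."""
    ups, downs = [], []
    zp = zn = z0
    for _ in range(n):
        dzp = prec * zp * zp / (f * B)    # step toward larger depth
        zp += dzp
        ups.append(dzp)
        dzn = -prec * zn * zn / (f * B)   # step toward smaller depth
        zn += dzn
        downs.append(dzn)
    return ups, downs
```

As in Table 2, the step magnitudes shrink along the direction of decreasing depth and grow along the direction of increasing depth.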
The depth estimation process requires a given initial depth value z_0, and the quality of this initial value affects the depth search performance and effect. When z_0 deviates little from the true depth value, a smaller pixel offset, that is, a smaller search range, can be used, reducing the amount of search and increasing the search speed; when z_0 deviates greatly from the true depth value, a relatively large pixel offset must be used to guarantee that the true depth value can be found, so the amount of computation is larger. Although a poor initial depth can be compensated by setting a large search range and a high-precision step size, a good initial depth allows a small search range with a suitable step size, improving the efficiency and performance of the depth search. The estimation and determination of the initial depth is therefore also very important in the depth estimation process.
Determining the initial depth value of a video sequence image divides into two cases: the image at the initial time, and the subsequent images. Determining the initial depth of the initial-time image again divides into two cases: the first pixel and the other pixels. For the first pixel, since no depth search has yet been performed for any pixel, no scene depth information is known, and one must consider how to obtain a likely value of the scene depth from information such as image features and camera parameters to serve as the initial value. For the subsequent other pixels, the initial depth can be determined from the depth estimates of neighboring pixels in the image. For subsequent images, since the depth values of the video sequence images of the same viewpoint are strongly correlated, the depth of the motionless background regions remains unchanged and only the depth of a few moving regions changes, so the depth value at the same pixel position in the previous image can be used as the initial value. Hence, in determining the initial depth value, the key is to obtain the scene depth information of the initial-time image, providing a good initial depth for the first pixel.
In multi-view video, the differences between the images of different views or the position information of the cameras usually contain information about the scene depth. For the two cases of converging and parallel camera systems, the initial estimation of the scene depth from the camera parameters or image information, without any known depth information, is given below.
The main goal of multi-view video is to capture information about the same scene from multiple angles, so the cameras are usually placed on an arc with their optical axes converging at a point, that is, a converging system. In practical applications the cameras may not converge strictly at one point, but a point closest in distance to all camera optical axes can always be found, and this point is regarded as the convergence point. The convergence point usually lies where the scene is and can be considered an epitome point of the scene; by solving for the position of the convergence point, a likely value of the scene depth is obtained, and this value is used as the initial value in the depth search.
Let the coordinates of the convergence point in the world coordinate system be M_c = [x_c, y_c, z_c]. This point lies on the optical axis of each camera, so it can be expressed in each camera coordinate system, whose z axis is the optical axis, as:

M_1 = [0, 0, z_r1]
M_2 = [0, 0, z_r2]
……    (23)
M_m = [0, 0, z_rm]
where z_ri is the depth of the convergence point in the coordinate system of camera c_i. According to the relation between world coordinates and camera coordinates, the following equations hold:
M _{c}＝R _{1}M _{1}+t _{1}
M _{c}＝R _{2}M _{2}+t _{2}
……???????????????????????(24)
M _{c}＝R _{m}M _{m}+t _{m}
Eliminating M_c gives:
R _{1}[0，0，z _{r1}]+t _{1}＝R _{2}[0，0，z _{r2}]+t _{2}
R _{1}[0，0，z _{r1}]+t _{1}＝R _{3}[0，0，z _{r3}]+t _{3}
……??????????????????????(25)
R _{1}[0，0，z _{r1}]+t _{1}＝R _{m}[0，0，z _{rm}]+t _{m}
Equation (25) is a system of 3(m−1) linear equations in the depths z_r1, z_r2, …, z_rm, where t_1 … t_m are the translation vectors of camera coordinate systems c_1 … c_m with respect to the world coordinate system, R_1 … R_m are the corresponding 3D rotation matrices, and m is the number of cameras. Solving the system (25) by linear least squares yields the depth values z_r1, z_r2, …, z_rm of the convergence point in each camera coordinate system; these are likely values of the scene depth and can serve as the initial depth values in the depth search.
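The 3(m−1) linear equations of (25) stack into an ordinary least-squares problem. A sketch assuming numpy; the data layout (lists of rotation matrices and translation vectors) is illustrative.

```python
import numpy as np

def convergence_depths(Rs, ts):
    """Least-squares depths z_ri of the optical-axis convergence point
    in each camera frame, from R_1[0,0,z_r1]+t_1 = R_i[0,0,z_ri]+t_i."""
    m = len(Rs)
    A = np.zeros((3 * (m - 1), m))
    b = np.zeros(3 * (m - 1))
    for i in range(1, m):
        rows = slice(3 * (i - 1), 3 * i)
        A[rows, 0] = Rs[0][:, 2]     # optical axis of camera 1
        A[rows, i] = -Rs[i][:, 2]    # optical axis of camera i+1
        b[rows] = ts[i] - ts[0]
    z, *_ = np.linalg.lstsq(A, b, rcond=None)
    return z
```

For two cameras whose optical axes intersect at a point 10 units in front of each, the solver returns a depth of 10 for both.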
A parallel camera system has no convergence point, so depth information cannot be obtained by the above method; but disparity and depth then obey the simple inverse relation (18), so depth information can be obtained by computing the global disparity between the two views.
The global disparity can be defined as the pixel offset that minimizes the absolute difference between the two views, i.e. it is obtained as:

d_g = argmin_x (1/R) * sum over the overlap of |I_1(i + x, j) - I_2(i, j)|          (26)

where the sum runs over the overlapping region of view 1 and view 2, and R is the number of pixels in that overlapping region. Since the required accuracy of the global-disparity estimate is low, the search unit of the pixel offset x in formula (26) can be set fairly large, for example 8 or 16 pixels, which significantly reduces the amount of computation. After the global disparity is obtained, the initial depth value can be found from relation (18), by which the depth is inversely proportional to the disparity.
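As an illustration of the coarse search just described, the global-disparity computation of formula (26) can be sketched as follows; the function name and the horizontal-shift convention are assumptions for illustration.

```python
import numpy as np

def global_disparity(view1, view2, max_offset=256, step=8):
    """Coarse global-disparity search per formula (26): find the
    horizontal pixel offset x minimizing the mean absolute difference
    over the overlapping region of the two views.

    view1, view2: 2-D grayscale arrays of equal shape.
    step: coarse search unit (e.g. 8 or 16 pixels) -- the precision
    requirement is low, so a large step greatly reduces computation.
    """
    h, w = view1.shape
    best_x, best_mad = 0, np.inf
    for x in range(0, max_offset + 1, step):
        if x >= w:
            break
        # Overlapping region when view2 is shifted by x pixels.
        diff = np.abs(view1[:, x:].astype(float) - view2[:, :w - x].astype(float))
        mad = diff.mean()  # divide by R, the overlap pixel count
        if mad < best_mad:
            best_mad, best_x = mad, x
    return best_x
```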
A scene point provided in the Uli video sequence parameter document is used: the real-world coordinates [35.07, 433.93, 1189.78] (unit: mm) of the high-brightness point on the left side of the glasses. The coordinates and true depth of this scene point in each camera coordinate system can be obtained from relation (1) between world coordinates and camera coordinates. Then, applying the above method for finding the convergence point of two cameras, i.e. solving the system of linear equations (25), gives the depth values of the convergence point in the coordinate systems of camera 7 and camera 8; the calculation results are shown in Table 1. Human observation shows that the depth of field of the Uli scene varies little, and the initial depth estimates in Table 1 differ little from the true depth of the scene point, indicating that the initial depth estimate is effective and reasonable and provides a good initial value for depth estimation.
Table 1
The Uli view shown in Fig. 3(c) is the 64x64 image region from pixel [527, 430] to [590, 493]. In this region, depth search with fixed step sizes and with the adaptive step size is performed for pixels sampled every 15 pixels, 16 pixels in total. Three fixed-step searches are performed over the fixed search range [2000, 5000], with step sizes of 20, 35, and 50 respectively. For the adaptive search step, the initial depth is 2817, the pixel offset is set to 32 pixels, the search precision is 1 pixel, and the initial depth of each subsequent pixel is set to the estimated depth of its neighboring pixel. Using the adaptive step-size determination method of the present invention, the search step sizes corresponding to a search precision of one pixel per unit can be obtained for pixel [527, 430] within a search range of 32 pixels from the initial search pixel, as shown in Table 2; the pixels searched with these step sizes are shown in Fig. 9. Table 2 shows that the step size in the direction of decreasing depth is negative, and its absolute value decreases gradually as the pixel offset increases and the depth decreases, while the step size in the direction of increasing depth is positive, and its absolute value increases gradually as the pixel offset and depth increase. Fig. 9 shows that when the depth search is performed with the variable step sizes of Table 2, the corresponding pixel search precision remains constant at 1 pixel throughout.
Table 2
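The variable step-size behavior described above can be illustrated with a minimal sketch, assuming the inverse depth-disparity relation z = k/d of relation (18); the disparity value d0 assigned to the initial depth is a hypothetical normalization for illustration, not a value from the document.

```python
def adaptive_depth_steps(z0, d0=100.0, max_offset=32, precision=1.0):
    """Illustrative schedule of adaptive depth-search step sizes.

    Assuming z = k / d (relation (18)), each step changes the disparity
    by exactly `precision` pixels, so the pixel search precision stays
    constant. z0: initial depth; d0: hypothetical disparity at z0.
    Returns (steps_down, steps_up) for the two search directions.
    """
    k = z0 * d0
    steps_down, steps_up = [], []
    z_prev = z0
    for n in range(1, max_offset + 1):
        z = k / (d0 + n * precision)   # larger disparity -> smaller depth
        steps_down.append(z - z_prev)  # negative; |step| shrinks with offset
        z_prev = z
    z_prev = z0
    for n in range(1, max_offset + 1):
        z = k / (d0 - n * precision)   # smaller disparity -> larger depth
        steps_up.append(z - z_prev)    # positive; |step| grows with offset
        z_prev = z
    return steps_down, steps_up
```

The signs and the monotonic change of the step magnitudes reproduce the qualitative behavior attributed to Table 2.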
Depth estimation is performed using the block-matching-based method; the depth search results are shown in Fig. 10. Each point in the figure represents the absolute difference between the block synthesized at the depth value found by the search and the actual block; the smaller this value, the more accurate the depth estimate generally is. With fixed search step sizes, a smaller step means higher search accuracy and thus better depth estimation: the absolute difference at the depth found with step 20 mm is smaller than that with step 35 mm, and that of 35 mm is smaller than that of 50 mm. But the depth value found with the adaptive search step gives the best result, with the smallest corresponding absolute difference.
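A minimal sketch of the block-matching depth search just described, simplified to a rectified two-view setup in which a candidate depth implies a horizontal disparity via d = k/z (the constant k, the function name, and the block layout are assumptions for illustration):

```python
import numpy as np

def block_match_depth(ref, tgt, y, x, bs, depths, k=100000.0):
    """For each candidate depth, synthesize a block from the target
    view at the disparity implied by that depth, compare it with the
    reference block by sum of absolute differences (SAD), and return
    the depth with minimal SAD.

    ref, tgt: 2-D grayscale arrays; (y, x): block corner; bs: block size.
    """
    ref_blk = ref[y:y + bs, x:x + bs].astype(float)
    best_z, best_sad = None, np.inf
    for z in depths:
        d = int(round(k / z))          # disparity implied by depth z
        if x + d + bs > tgt.shape[1]:
            continue                   # synthesized block out of bounds
        syn = tgt[y:y + bs, x + d:x + d + bs].astype(float)
        sad = np.abs(ref_blk - syn).sum()
        if sad < best_sad:
            best_sad, best_z = sad, z
    return best_z
```

The candidate set `depths` can be generated either with a fixed step over a fixed range or with the adaptive step schedule, which is exactly the comparison the experiments above make.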
For the 16 pixels used for depth estimation in the image region from pixel [527, 430] to [590, 493] of the view in Fig. 3(c), searches are performed with the adaptive step size and with fixed steps of 20, 35, and 50 respectively. Table 3 shows that with the adaptive step size all 16 pixels find the correct depth value, whereas with the fixed steps there are erroneous depth estimates. This is because several of these pixels lie in a region lacking texture, and in the large fixed search range the minimum-absolute-difference point found does not correspond to the correct pixel. With the adaptive step, because the initial value is determined from neighbor information, the pixel offset can be set small, i.e. the search is confined to a relatively small sub-range; this reduces the probability of finding an erroneous pixel and preserves a degree of depth smoothness. Table 3 lists the depth estimates, the number of depth searches, and the number of erroneous estimates for the adaptive and fixed step sizes; the framed entries in the table are erroneous data. The results in Table 3 show that the adaptive-step search requires few searches and produces no erroneous estimates, while the fixed-step searches require many searches and still contain erroneous estimates. For example, the adaptive depth search with a 32-pixel offset needs to search only 64 depth values, while a fixed step of 20 mm must search 150 depth values over the range [2000, 5000].
Table 3
From the results of Table 3 and Fig. 10 it is concluded that the adaptive search step outperforms the fixed search step in depth-search performance: the absolute difference between the image block synthesized with the estimated depth and the reference image block is small, erroneous estimates are few, and the amount of computation, i.e. the number of depth searches, is small.
Claims (17)
Priority Applications (1)
Application Number  Priority Date  Filing Date  Title 

CN200810300330A CN100592338C (en)  20080203  20080203  Multivisual angle video image depth detecting method and depth estimating method 
Applications Claiming Priority (2)
Application Number  Priority Date  Filing Date  Title 

CN200810300330A CN100592338C (en)  20080203  20080203  Multivisual angle video image depth detecting method and depth estimating method 
PCT/CN2008/072141 WO2009097714A1 (en)  20080203  20080826  Depth searching method and depth estimating method for multiviewing angle video image 
Publications (2)
Publication Number  Publication Date 

CN101231754A CN101231754A (en)  20080730 
CN100592338C true CN100592338C (en)  20100224 
Family
ID=39898199
Family Applications (1)
Application Number  Title  Priority Date  Filing Date 

CN200810300330A CN100592338C (en)  20080203  20080203  Multivisual angle video image depth detecting method and depth estimating method 
Country Status (2)
Country  Link 

CN (1)  CN100592338C (en) 
WO (1)  WO2009097714A1 (en) 
Cited By (1)
Publication number  Priority date  Publication date  Assignee  Title 

US9524556B2 (en)  20140520  20161220  Nokia Technologies Oy  Method, apparatus and computer program product for depth estimation 
Families Citing this family (6)
Publication number  Priority date  Publication date  Assignee  Title 

CN100592338C (en) *  20080203  20100224  四川虹微技术有限公司  Multivisual angle video image depth detecting method and depth estimating method 
JP5713624B2 (en) *  20091112  20150507  キヤノン株式会社  3D measurement method 
CN101710423B (en) *  20091207  20120104  青岛海信网络科技股份有限公司  Matching search method for stereo image 
KR101640404B1 (en) *  20100920  20160718  엘지전자 주식회사  Mobile terminal and operation control method thereof 
JP6565188B2 (en)  20140228  20190828  株式会社リコー  Parallax value deriving apparatus, device control system, moving body, robot, parallax value deriving method, and program 
TWI528783B (en) *  20140721  20160401  由田新技股份有限公司  Methods and systems for generating depth images and related computer products 
Citations (3)
Publication number  Priority date  Publication date  Assignee  Title 

EP0871144A2 (en) *  19970411  19981014  Nec Corporation  Maximum flow method for stereo correspondence 
CN1522542A (en) *  20010706  20040818  皇家菲利浦电子有限公司  Methods of and units for motion or depth estimation and image processing apparatus provided with such motion estimation unit 
CN1851752A (en) *  20060330  20061025  东南大学  Dual video camera calibrating method for threedimensional reconfiguration system 
Family Cites Families (3)
Publication number  Priority date  Publication date  Assignee  Title 

US6384859B1 (en) *  19950329  20020507  Sanyo Electric Co., Ltd.  Methods for creating an image for a threedimensional display, for calculating depth information and for image processing using the depth information 
US6606406B1 (en) *  20000504  20030812  Microsoft Corporation  System and method for progressive stereo matching of digital images 
CN100592338C (en) *  20080203  20100224  四川虹微技术有限公司  Multivisual angle video image depth detecting method and depth estimating method 

2008
 20080203 CN CN200810300330A patent/CN100592338C/en not_active IP Right Cessation
 20080826 WO PCT/CN2008/072141 patent/WO2009097714A1/en active Application Filing
Also Published As
Publication number  Publication date 

CN101231754A (en)  20080730 
WO2009097714A1 (en)  20090813 
Legal Events
Date  Code  Title  Description 

C06  Publication  
PB01  Publication  
C10  Entry into substantive examination  
SE01  Entry into force of request for substantive examination  
C14  Grant of patent or utility model  
GR01  Patent grant  
CF01  Termination of patent right due to nonpayment of annual fee
Granted publication date: 20100224 Termination date: 20160203