WO2005015497A1 - Determining depth information - Google Patents

Determining depth information

Info

Publication number
WO2005015497A1
WO2005015497A1 (PCT/IB2004/051359)
Authority
WO
WIPO (PCT)
Prior art keywords
junction
depth information
image
determining
segments
Prior art date
Application number
PCT/IB2004/051359
Other languages
French (fr)
Inventor
Christiaan Varekamp
Fabian E. Ernst
Original Assignee
Koninklijke Philips Electronics N.V.
Priority date
Filing date
Publication date
Application filed by Koninklijke Philips Electronics N.V. filed Critical Koninklijke Philips Electronics N.V.
Publication of WO2005015497A1 publication Critical patent/WO2005015497A1/en


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/50Depth or shape recovery
    • G06T7/55Depth or shape recovery from multiple images
    • G06T7/564Depth or shape recovery from multiple images from contours

Definitions

  • the invention relates to a method and apparatus for determining depth information and in particular for determining depth information for at least one image.
  • 3D information may be used for enhancing object grasping and video compression for video signals.
  • 3DTV three dimensional video or television
  • 3DTV is promising as a means for enhancing the user experience of the presentation of visual content and 3DTV could potentially be as significant as the introduction of colour TV.
  • the most commercially interesting 3DTV systems are based on re-use of existing 2D video infrastructure, thereby minimising the cost and compatibility problems associated with a gradual roll-out. For these systems conventional 2D video is distributed and is converted to 3D video at the location of the consumer.
  • the 2D-to-3D conversion process adds (depth) structure to 2D video and may also be used for video compression.
  • the conversion of 2D video into video comprising 3D information is, however, a major image processing challenge. Consequently, significant research has been undertaken in this area and a number of algorithms and approaches have been suggested for extracting 3D information from 2D images.
  • Known methods for deriving depth or occlusion relations from monoscopic video comprise the structure from motion approach and the dynamic occlusion approach.
  • points of an object are tracked as the object moves and are used to derive a 3D model of the object.
  • the 3D model is determined as that which would most closely result in the observed movement of the tracked points.
  • the dynamic occlusion approach utilises the fact that as different objects move within the picture, the occlusion (i.e. the overlap of one object over another in a 2D picture) provides information indicative of the relative depth of the objects.
  • structure from motion requires the presence of camera motion and cannot deal with independently moving objects (non-static scene).
  • both approaches rely on the existence of moving objects and fail in situations where there is very little or no apparent motion in the video sequence.
  • a depth cue which may provide static information is the T-junction which may correspond to an intersection between objects.
  • T-junctions as a depth cue for vision
  • computational methods for detecting T-junctions in video and use of T-junctions for automatic depth extraction have had very limited success so far.
  • Previous research into the use of T-junctions has mainly focussed on the T-junction detection task, and examples of schemes for detecting T-junctions are given in "Filtering, Segmentation and Depth" by M. Nitzberg, D. Mumford and T. Shiota, 1991, Lecture Notes in Computer Science 662, Springer-Verlag, Berlin, and "Steerable-scalable kernels for edge detection and junction analysis" by P. Perona, 1992, 2nd European Conference on Computer Vision, pages 3-18, and Image and Vision Computing, vol. 10, pp. 663-672.
  • the described method is rather complex and requires significant computational resources. Furthermore, since the disclosed method is contour based, it is not easily integrated with other region based depth cues such as depth from motion and depth from dynamic occlusion. Hence, an improved system for determining depth information would be advantageous and in particular a system allowing for reduced complexity, reduced computational burden and/or improved performance would be advantageous.
  • the invention preferably seeks to mitigate, alleviate or eliminate one or more of the above mentioned disadvantages singly or in any combination.
  • a method of determining depth information for at least one image comprising the steps of: segmenting the at least one image into a plurality of image segments; determining at least one junction associated with overlapping objects of the at least one image in response to the plurality of image segments; and determining depth information associated with objects of the at least one image in response to the at least one junction.
  • the invention allows for a very efficient method of determining depth information. Improved performance may be achieved and the quality of the derived depth information may be improved.
  • the invention may allow for determination of depth information using low complexity image processing.
  • the invention allows for the possibility of re-use of segmentation provided for other region based depth cues.
  • the step of segmenting comprises grouping picture elements having similar brightness levels in the same segment.
  • a brightness characteristic may provide a particularly suitable parameter for segmentation resulting in image segments suited for deriving junctions and depth information.
  • the step of segmenting comprises grouping picture elements having similar colours in the same segment.
  • a colour characteristic may provide a particularly suitable parameter for segmentation resulting in image segments suited for deriving junctions and depth information.
  • the step of segmenting comprises grouping picture elements in response to motion characteristics of objects of the at least one image.
  • a motion characteristic may provide a particularly suitable parameter for segmentation resulting in image segments suited for deriving junctions and depth information.
  • the motion characteristic may specifically be a characteristic determined as part of a motion estimation process.
  • the step of determining the at least one junction comprises dividing at least a part of the at least one image into a plurality of 2 by 2 picture element matrices and comparing each matrix to a predetermined junction detection criterion. This provides for a low complexity method of determining a junction. The computational requirement may be kept low. Furthermore, accurate junction detection may be achieved.
  • the criterion may specifically be that a 2 by 2 element matrix comprises a junction if the matrix comprises picture elements of three different segments arranged in accordance with one of a plurality of predetermined configurations.
  • the step of determining depth information comprises fitting a junction model to the at least one junction and determining the depth information in response to characteristics of the fitted junction model. This provides for a low complexity method of determining depth information.
  • the model may be a predetermined model having a predetermined association with depth information. Thus depth information may be determined directly by fitting a junction model without requiring specific depth analysis processing steps.
  • the fitting of the junction model comprises selecting the junction model from a plurality of predefined junction models.
  • junction models may be predefined and have associated depth information and specifically depth order information.
  • depth information may directly be determined in response to which junction model out of the predefined junction models provides the best fit.
  • the junction models may comprise a number of variable parameters that may be adjusted to provide a close fit.
  • the approach allows for a very simple method of relatively accurately determining depth information.
  • the step of determining depth information comprises determining a plurality of edge points associated with edges between segments forming the at least one junction. Edge points provide an excellent parameter for determining boundaries between segments forming junctions. Specifically, edge points provide suitable points for determining depth order information based on curve and/or junction model fitting.
  • the plurality of edge points comprises only edge points within a given radius of the at least one junction. Only considering local edge points allows for reduced complexity and thus a reduced computational load and/or faster execution times. Specifically, a radius of between 3 and 20 picture elements provides suitable accuracy yet results in a low processing burden.
  • the step of determining depth information comprises fitting a first and second curve to the plurality of edge points.
  • the first and second curves may specifically be a first and second straight line.
  • the fitting of a first and second curve may correspond to or be comprised in a fitting of a junction model. The curve fitting allows for depth information to be derived e.g. by comparing the fitted curves to known relationships between curves and depth information.
  • the at least one junction is a T-junction and the first curve corresponds to a stem of the T-junction and the second curve corresponds to a top of the T-junction and the step of determining depth information comprises determining that two segments separated by the first curve are behind the segment separated from the two segments by the second curve.
  • the T-junction may be formed by an intersection of two curves corresponding to boundaries of objects.
  • a T-junction may be a point in the image plane where object edges form a "T" structure with one edge terminating on a second edge.
  • the invention may allow for relative depth information between two objects to be determined simply by detecting a T-junction and easily derivable parameters of this T-junction. Thus a very efficient process is achieved.
  • the method of determining depth information further comprises the step of determining a certainty measure of the depth information in response to an accuracy of a fit of the first and second curve.
  • the certainty measure may be determined in response to a closeness of fit of one of the first and second curves but is preferably determined in response to a closeness of fit of both curves.
  • the fit of the curves may be (part of) fitting a junction model and the certainty measure may be determined in response to how closely the model fits the detected junction.
  • the geometric fit of curves to a detected junction provides a suitable measure of the reliability of the detected junction being a suitable junction for providing depth information. For example, it may provide an excellent measure of the probability of the detected junction being a junction formed by edges of different objects.
  • the certainty measure furthermore provides valuable information that can be used to improve depth information. For example, if a plurality of junctions are detected and used to derive combined depth information, the weighting of each junction may be determined in response to the certainty measure.
  • the method of determining depth information further comprises determining a certainty measure of the depth information in response to a colour variation between at least a first and second image segment associated with the at least one junction.
  • the colour variation between image segments of a detected junction provides a suitable measure of the reliability of the detected junction being a suitable junction for providing depth information. For example, it may provide an excellent measure of the probability of the detected junction being a junction formed by edges of different objects.
  • the certainty measure furthermore provides valuable information that can be used to improve depth information.
  • the weighting of each junction may be determined in response to the certainty measure.
  • a certainty measure may for example be based on a geometric fit of curves or colour variation of associated segments but is preferably based on both in order to increase the accuracy of the certainty measure.
  • the plurality of image segments are disjoint regions of the at least one image and the at least one junction is a T-junction.
  • the step of determining at least one junction determines a plurality of junctions and the step of determining depth information comprises determining a depth map of at least part of the at least one image in response to the plurality of junctions.
  • an apparatus for determining depth information for at least one image comprising: means for segmenting the at least one image into a plurality of image segments; means for determining at least one junction associated with overlapping objects of the at least one image in response to the plurality of image segments; and means for determining depth information associated with objects of the at least one image in response to the at least one junction.
  • FIG. 1 illustrates an example of a T-junction in an image
  • FIG. 2 illustrates an apparatus for determining depth information for at least one image in accordance with a preferred embodiment of the invention
  • FIG. 3 illustrates a method of determining depth information of at least one image in accordance with an embodiment of the invention
  • FIG. 4 illustrates an example of a relation between the placement of edge points and the segmentation matrix in accordance with an embodiment of the invention.
  • FIG. 1 illustrates an example of a T-junction in an image.
  • the image comprises a first rectangle 101 and a second rectangle 103.
  • the first rectangle 101 overlaps the second rectangle 103 and accordingly edges of the objects form an intersection known as a T-junction 105.
  • a first edge 107 of the second rectangle 103 is cut short by a second edge 109 of the first rectangle. Accordingly, the first edge 107 forms a stem 111 of the T-junction 105 and the second edge 109 forms a top 113 of the T-junction.
  • the T-junction 105 is the point in the image plane where the object edges 107, 109 form a "T" by one edge 107 terminating on a second edge 109.
  • Humans are capable of identifying that some objects are nearer than others just by the presence of T-junctions.
  • FIG. 2 illustrates an apparatus 200 for determining depth information for at least one image in accordance with a preferred embodiment of the invention.
  • the apparatus 200 comprises a segmentation processor 201 which receives one or more images for which depth information is to be provided.
  • the segmentation processor 201 is operable to segment the at least one image into a plurality of image segments.
  • the segmentation processor 201 is coupled to a junction extraction processor 203 and receives the plurality of image segments.
  • the junction extraction processor 203 is operable to determine at least one junction associated with overlapping objects of the at least one image in response to the plurality of image segments. In the preferred embodiment, the junction extraction processor 203 determines a plurality of T-junctions corresponding to intersections of edges between the segments.
  • the junction extraction processor 203 is coupled to a depth information processor 205 which receives characteristics of the detected junctions from the junction extraction processor 203 and image segmentation information from the segmentation processor 201.
  • the depth information processor 205 is operable to determine depth information associated with objects of the at least one image in response to the at least one junction. In the preferred embodiment, the depth information processor 205 generates a depth map for the image in response to a plurality of T-junctions determined by the junction extraction processor 203.
  • FIG. 3 illustrates a method of determining depth information of at least one image in accordance with an embodiment of the invention. The method is applicable to the apparatus of FIG. 2 and will be described with reference to this.
  • the segmentation processor 201 receives an image from a suitable source.
  • Step 301 is followed by step 303 wherein the image is segmented into a plurality of image segments.
  • the aim of image segmentation is to group pixels together into image segments which are unlikely to contain depth discontinuities.
  • picture segmentation thus comprises the process of a spatial grouping of pixels based on a common property.
  • the segmentation includes detecting disjoint regions of the image in response to a common characteristic and subsequently tracking this object from one picture to the next.
  • the segmentation comprises grouping picture elements having similar brightness levels in the same image segment.
  • Contiguous groups of picture elements having similar brightness levels tend to belong to the same underlying object.
  • contiguous groups of picture elements having similar colour levels also tend to belong to the same underlying object and the segmentation may alternatively or additionally comprise grouping picture elements having similar colours in the same segment.
  • the segmentation comprises grouping picture elements in response to motion characteristics of objects of the at least one image.
  • Conventional motion estimation compression techniques, such as MPEG 2 based video compression, utilise motion estimation to identify and track moving objects between images thereby allowing for regions associated with the moving objects to be coded simply by a motion vector and differential information.
  • typical video compression techniques comprise processing of image segments based on motion estimation characteristics.
  • Step 303 thus results in the segmentation processor 201 generating a plurality of image segments.
  • the segmentation allows for a high probability that boundaries between objects in the image will correspond to boundaries between image segments.
  • edges of objects in the image may be analysed by investigating the edges of the determined image segments.
  • Step 303 is followed by step 305 wherein the junction extraction processor 203 determines at least one junction associated with overlapping objects of the at least one image in response to the plurality of image segments.
  • the image is divided into 2 by 2 picture element matrices and each matrix is compared to a predetermined junction detection criterion. If the matrix meets the junction detection criterion it is assumed that the matrix comprises a T-junction, and if the junction detection criterion is not met it is assumed that the matrix does not comprise a T-junction.
  • only part of the picture is divided into 2 by 2 pixel matrices and specifically a lower computational resource use may be achieved by only considering 2 by 2 groups of pixels along the boundaries between the image segments.
  • the T-junctions are identified by analysing all 2x2 sub-matrices of the NxM segmentation matrix S. Since the T-junctions are to be detected, the analysis focuses on 3-junctions, which are junctions at which exactly three different image segments meet. It should be noted that a 3-junction is not necessarily a T-junction, but may also indicate a fork or an arrow shape (which may for example occur in the image of a cube). A further geometric analysis is therefore needed to determine whether a detected 3-junction may be considered a T-junction.
  • segmentation matrix S: an element S(i, j) of this matrix contains the segment number at pixel location (i, j).
  • segment number itself is arbitrary and in the following we only use the property that the segment number changes at edges and the property that the segmentation is four-connected.
  • in order to extract 3-junctions from the segmentation matrix S, the structure of all possible 2x2 sub-matrices is examined.
  • a sub-matrix contains a 3-junction if exactly one of the four differences between horizontally or vertically adjacent elements of the sub-matrix is equal to zero.
  • This sub-matrix is not considered to be a 3-junction because region number 1, which occurs twice, is not 4-connected. This violates the basic assumption that regions in the segmentation must be 4-connected on a square sampling grid.
  • a 2 by 2 sub-matrix is considered a 3-junction if the four elements correspond to exactly three image segments and the two samples from the same image segment are next to each other either vertically or horizontally (but not diagonally).
  • the actual junction point in the Cartesian coordinates of the image is placed in the centre of the 2x2 sub-matrix.
  • Step 305 is followed by step 307 wherein the depth information processor 205 determines depth information of one or more objects in the image.
  • the depth information is determined in response to the detected 3-junctions and in the preferred embodiment a large number of T-junctions are used to develop a depth map for a plurality of objects in the image.
  • a T-junction has a characteristic geometry where one edge, known as the stem 111, ends abruptly in the middle of a second edge, known as the top 113. Identification of the top and stem is used in deriving a possible depth order. To identify the top and the stem, it is in the preferred embodiment assumed that both are straight lines which pass through the junction point (x_jun, y_jun), but with an arbitrary orientation angle.
  • first and second curves which in the preferred embodiment are straight lines. In other embodiments more complex curves may be used.
  • edge points are extracted from the segmentation matrix.
  • Figure 4 illustrates an example of a relation between the placement of edge points and the segmentation matrix 400.
  • the three image segments surrounding a junction point are identified by the set {1, 2, 3}.
  • Edge points are now placed between rows and columns in the segmentation matrix, only if the values of the segmentation matrix change from one row to the next or from one column to the next. As illustrated in FIG. 4, this results in three edge points 401 between image segment 1 and image segment 2, three edge points 403 between image segment 2 and image segment 3, and four edge points 405 between image segment 1 and image segment 3.
  • edge points within a given radius of the junction are determined. This allows for reduced complexity and computational requirement yet results in good performance. Specifically, only the subset of edge points that fall inside a circle with radius R are used in the calculations. It has been found that using a radius R of 3-20 pixels provides desirable performance yet achieves low complexity and computational burden. Edge points between rows are calculated from matrix indices (i, j) and (i+1, j).
  • Edge points between columns are calculated from matrix indices (i, j) and (i, j+1).
  • edge points are then split into three subsets: (x_k^(1), y_k^(1)) for 1 ≤ k ≤ N^(1) (edge points that lie on the edge between image segments 1 and 2), (x_k^(2), y_k^(2)) for 1 ≤ k ≤ N^(2) (edge points that lie on the edge between image segments 1 and 3), and (x_k^(3), y_k^(3)) for 1 ≤ k ≤ N^(3) (edge points that lie on the edge between image segments 2 and 3), where the superscript indicates the edge number.
  • Model 1: edges 1 and 2 form the top and edge 3 forms the stem
  • Model 2: edges 1 and 3 form the top and edge 2 forms the stem
  • Model 3: edges 2 and 3 form the top and edge 1 forms the stem
  • Each model has two parameters, the line orientation angles φ_top and φ_stem, which can vary between 0 and π. These parameters are in the preferred embodiment determined by minimizing the sum of squared perpendicular distances between edge points and the line. The sum of squared distances may be determined from the separate contributions from edges 1, 2 and 3.
  • For instance for edge 1, this sum as a function of the orientation angle is given by d^(1)(φ) = Σ_{k=1..N^(1)} [ −(x_k^(1) − x_jun)·sin φ + (y_k^(1) − y_jun)·cos φ ]². Similar expressions result for edges 2 and 3.
  • For T-junction model 1 the total squared distance is the sum of the contribution from the top and the contribution from the stem. These contributions are given by d_top^(1)(φ) = d^(1)(φ) + d^(2)(φ) and d_stem^(1)(φ) = d^(3)(φ).
  • a plurality of junction models have been predefined or predetermined. Specifically, the junction models comprising straight lines and corresponding to the possible alignments of these lines and the stem and top of a T-junction have been determined.
  • the step of determining depth information then comprises fitting each junction model to the data and selecting the model with the best fit.
  • depth information corresponding to the relative depth order of the adjacent image segments is readily available. As is clear from FIG. 1, the image segment which partly forms the top but not the stem is inherently in front of the image segments forming the stem. Depth information between the two image segments forming the stem cannot directly be derived from the T-junction.
  • in the preferred embodiment, many T-junctions are determined and specifically a given object may have many corresponding T-junctions. Therefore, relative depth information may be determined by considering the relative depth information of all objects.
  • a certainty or reliability measure is further determined for each T-junction.
  • the certainty measure is indicative of the certainty or reliability of the generated depth information.
  • the certainty measure may thus increase the accuracy of processes considering a plurality of T-junctions and specifically may increase the accuracy of a depth map of the picture.
  • each T-junction may be weighted in accordance with the certainty measure thereby providing for conflicts between different depth estimates to be resolved and/or taken into account.
  • the certainty measure is determined by combining a geometric certainty measure and a photometric certainty measure.
  • the geometric certainty measure is determined in response to an accuracy of the fit of the first and second straight lines and the photometric certainty measure is determined in response to a colour variation between at least two image segments of the junction.
  • only one of the geometric or photometric uncertainty measures may be used.
  • the error of the best model may be compared to the error of the worst model.
  • the three sides adjacent a corner may have different shades of blue due to light reflections.
  • the corner may be detected as a 3-junction but it will result in a very low photometric certainty measure as the colour contrast between segments will be low.
  • if the blue cube occludes a yellow object, any T-junctions between the two objects will include a colour contrast between blue and yellow, thus resulting in a high photometric certainty measure.
  • a suitable photometric certainty measure p may be proportional to the minimum colour difference vector of the three possible image segments of the junction.
  • the certainty measures may be used to reject a candidate T-junction. For example, only 3-junctions having a certainty measure above a given threshold may be included in determining depth information used in a depth map.
  • the invention can be implemented in any suitable form including hardware, software, firmware or any combination of these. However, preferably, the invention is implemented as computer software running on one or more data processors and/or digital signal processors.
  • the elements and components of an embodiment of the invention may be physically, functionally and logically implemented in any suitable way. Indeed the functionality may be implemented in a single unit, in a plurality of units or as part of other functional units.
  • the invention may be implemented in a single unit or may be physically and functionally distributed between different units and processors.
  • although the present invention has been described in connection with the preferred embodiment, it is not intended to be limited to the specific form set forth herein. Rather, the scope of the present invention is limited only by the accompanying claims.
  • the term comprising does not exclude the presence of other elements or steps.
  • a plurality of means, elements or method steps may be implemented by e.g. a single unit or processor.
  • although individual features may be included in different claims, these may possibly be advantageously combined, and the inclusion in different claims does not imply that a combination of features is not feasible and/or advantageous.
  • singular references do not exclude a plurality. Thus references to "a", "an", "first", "second" etc. do not preclude a plurality.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a method and apparatus for determining depth information of an image by use of T-junctions. An apparatus comprises a segmentation processor (201) which receives an image and segments it into a plurality of image segments having similar characteristics. The segments are fed to a junction extraction processor (203) which determines T-junction candidates. The image segments and T-junction candidates are fed to a depth information processor (205) which determines depth information associated with objects of the image in response to the T-junction candidates. Specifically, predetermined junction models are fitted to the T-junction candidates and the model having the closest fit is selected. Depth information for the T-junction is then determined from known depth information of the junction model. In addition, certainty measures are determined for the T-junction candidates and the associated depth information.

Description

Determining depth information
FIELD OF THE INVENTION The invention relates to a method and apparatus for determining depth information and in particular for determining depth information for at least one image.
BACKGROUND OF THE INVENTION Conventional video and TV systems distribute video signals which inherently are two dimensional (2D) in nature. However, it would in many applications be desirable to further provide three dimensional (3D) information. For example, 3D information may be used for enhancing object grasping and video compression for video signals. In particular three dimensional video or television (3DTV) is promising as a means for enhancing the user experience of the presentation of visual content and 3DTV could potentially be as significant as the introduction of colour TV. The most commercially interesting 3DTV systems are based on re-use of existing 2D video infrastructure, thereby minimising the cost and compatibility problems associated with a gradual roll-out. For these systems conventional 2D video is distributed and is converted to 3D video at the location of the consumer. The 2D-to-3D conversion process adds (depth) structure to 2D video and may also be used for video compression. However, the conversion of 2D video into video comprising 3D information is a major image processing challenge. Consequently, significant research has been undertaken in this area and a number of algorithms and approaches have been suggested for extracting 3D information from 2D images. Known methods for deriving depth or occlusion relations from monoscopic video comprise the structure from motion approach and the dynamic occlusion approach. In the structure from motion approach, points of an object are tracked as the object moves and are used to derive a 3D model of the object. The 3D model is determined as that which would most closely result in the observed movement of the tracked points. The dynamic occlusion approach utilises the fact that as different objects move within the picture, the occlusion (i.e. the overlap of one object over another in a 2D picture) provides information indicative of the relative depth of the objects. However, structure from motion requires the presence of camera motion and cannot deal with independently moving objects (non-static scene). Furthermore, both approaches rely on the existence of moving objects and fail in situations where there is very little or no apparent motion in the video sequence. A depth cue which may provide static information is the T-junction which may correspond to an intersection between objects. However, although the possibility of using T-junctions as a depth cue for vision has been known for a long time, computational methods for detecting T-junctions in video and use of T-junctions for automatic depth extraction have had very limited success so far. Previous research into the use of T-junctions has mainly focussed on the T-junction detection task, and examples of schemes for detecting T-junctions are given in "Filtering, Segmentation and Depth" by M. Nitzberg, D. Mumford and T. Shiota, 1991, Lecture Notes in Computer Science 662, Springer-Verlag, Berlin; "Steerable-scalable kernels for edge detection and junction analysis" by P. Perona, 1992, 2nd European Conference on Computer Vision, pages 3-18, and Image and Vision Computing, vol. 10, pp. 663-672; and
"Junctions: Detection, Classification, and Reconstruction", L. Parida, D. Geiger, R. Hummel, 1998 IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.20, no.7, pp.687- 698. However, most of the methods disclosed in these documents cannot be used directly to derive depth order or occlusion relations between adjacent image regions. The only document disclosing a system for determining depth information based on the detected T-junctions is given in "Filtering, Segmentation and Depth" by M. Nitzberg, D. Mumford and T.Shiota. This approach is based on determination of contours after non-linear filtering, curve smoothing, corner and junction detection and curve continuation. Hence, the described method is rather complex and requires significant computational resources. Furthermore, since the disclosed method is contour based, it is not easily integrated with other region based depth cues such as depth from motion and depth from dynamic occlusion. Hence, an improved system for determining depth information would be advantageous and in particular a system allowing for reduced complexity, reduced computational burden and/or improved performance would be advantageous. SUMMARY OF THE INVENTION Accordingly, the Invention preferably seeks to mitigate, alleviate or eliminate one or more of the above mentioned disadvantages singly or in any combination. According to a first aspect of the invention, there is provided a method of determining depth information for at least one image comprising the steps of: segmenting the at least one image into a plurality of image segments; determining at least one junction associated with overlapping objects of the at least one image in response to the plurality of image segments; and determining depth information associated with objects of the at least one image in response to the at least one junction. The invention allows for a very efficient method of determining depth information. Improved performance is achieved and the quality of the derived depth information may be used. Specifically, the invention may allow for determination of depth information using low complexity image processing. Furthermore, the invention allows for the possibility of re-use of segmentation provided for other region based depth cues. Thereby the most complex and resource demanding step may simply be implemented by re-using data already generated and will accordingly not add significantly to the computational load. Furthermore, an approach based on segmentation into image segments provides for a more robust and less noise sensitive system than a pixel based method. According to a feature of the invention, the step of segmenting comprises grouping picture elements having similar brightness levels in the same segment. A brightness characteristic may provide a particular suitable parameter for segmentation resulting in image segments suited for deriving junctions and depth information. According to another feature of the invention, the step of segmenting comprises grouping picture elements having similar colours in the same segment. A colour characteristic may provide a particular suitable parameter for segmentation resulting in image segments suited for deriving junctions and depth information. According to another feature of the invention, the step of segmenting comprises grouping picture elements in response to motion characteristics of objects of the at least one image. 
A motion characteristic may provide a particularly suitable parameter for segmentation resulting in image segments suited for deriving junctions and depth information. The motion characteristic may specifically be a characteristic determined as part of a motion estimation process. According to another feature of the invention, the step of determining the at least one junction comprises dividing at least a part of the at least one image into a plurality of 2 by 2 picture element matrices and comparing each matrix to a predetermined junction detection criterion. This provides for a low complexity method of determining a junction. The computational requirement may be kept low. Furthermore, accurate junction detection may be achieved. The criterion may specifically be that a 2 by 2 element matrix comprises a junction if the matrix comprises picture elements of three different segments arranged in accordance with one of a plurality of predetermined configurations. According to another feature of the invention, the step of determining depth information comprises fitting a junction model to the at least one junction and determining the depth information in response to characteristics of the fitted junction model. This provides for a low complexity method of determining depth information. Specifically, the model may be a predetermined model having a predetermined association with depth information. Thus depth information may be determined directly by fitting a junction model without requiring specific depth analysis processing steps. According to another feature of the invention, the fitting of the junction model comprises selecting the junction model from a plurality of predefined junction models. A number of junction models may be predefined and have associated depth information and specifically depth order information. For example, depth information may directly be determined in response to which junction model out of the predefined junction models provides the best fit. The junction models may comprise a number of variable parameters that may be adjusted to provide a close fit. The approach allows for a very simple method of relatively accurately determining depth information. According to another feature of the invention, the step of determining depth information comprises determining a plurality of edge points associated with edges between segments forming the at least one junction. Edge points provide an excellent parameter for determining boundaries between segments forming junctions. Specifically, edge points provide suitable points for determining depth order information based on curve and/or junction model fitting. According to another feature of the invention, the plurality of edge points comprises only edge points within a given radius of the at least one junction. Only considering local edge points allows for reduced complexity and thus a reduced computational load and/or faster execution times. Specifically, a radius of between 3 and 20 picture elements provide for a suitable accuracy yet results in a low processing burden. According to another feature of the invention, the step of determining depth information comprises fitting a first and second curve to the plurality of edge points. The first and second curves may specifically be a first and second straight line. Furthermore, the fitting of a first and second curve may correspond to or be comprised in a fitting of a junction model. The curve fitting allows for depth information to be derived e.g. 
by comparing the fitted curves to known relationships between curves and depth information. According to another feature of the invention, the at least one junction is a T-junction and the first curve corresponds to a stem of the T-junction and the second curve corresponds to a top of the T-junction and the step of determining depth information comprises determining that two segments separated by the first curve are behind the segment separated from the two segments by the second curve. The T-junction may be formed by an intersection of two curves corresponding to boundaries of objects. Specifically, a T-junction may be a point in the image plane where object edges form a "T" structure with one edge terminating on a second edge. Thus the invention may allow for relative depth information between two objects to be determined simply by detecting a T-junction and easily derivable parameters of this T-junction. Thus a very efficient process is achieved. According to another feature of the invention, the method of determining depth information further comprises the step of determining a certainty measure of the depth information in response to an accuracy of a fit of the first and second curve. The certainty measure may be determined in response to a closeness of fit of one of the first and second curves but is preferably determined in response to a closeness of fit of both curves. Specifically, the fit of the curves may be (part of) fitting a junction model and the certainty measure may be determined in response to how closely the model fits the detected junction. The geometric fit of curves to a detected junction provides a suitable measure of the reliability of the detected junction being a suitable junction for providing depth information. For example, it may provide an excellent measure of the probability of the detected junction being a junction formed by edges of different objects. The certainty measure furthermore provides valuable information that can be used to improve depth information. For example, if a plurality of junctions are detected and used to derive combined depth information, the weighting of each junction may be determined in response to the certainty measure. According to another feature of the invention, the method of determining depth information further comprises determining a certainty measure of the depth information in response to a colour variation between at least a first and second image segment associated with the at least one junction. The colour variation between image segments of a detected junction provides a suitable measure of the reliability of the detected junction being a suitable junction for providing depth information. For example, it may provide an excellent measure of the probability of the detected junction being a junction formed by edges of different objects. The certainty measure furthermore provides valuable information that can be used to improve depth information. For example, if a plurality of junctions are detected and used to derive combined depth information, the weighting of each junction may be determined in response to the certainty measure. A certainty measure may for example be based on a geometric fit of curves or colour variation of associated segments but is preferably based on both in order to increase the accuracy of the certainty measure. Preferably the plurality of image segments are disjoint regions of the at least one image and the at least one junction is a T-junction.
According to another feature of the invention, the step of determining at least one junction determines a plurality of junctions and the step of determining depth information comprises determining a depth map of at least part of the at least one image in response to the plurality of junctions. Thus the invention allows for a low complexity method of determining a depth map of preferably the whole image. This depth information may for example be used in generating pseudo 3D images from 2D images. According to a second aspect of the invention, there is provided an apparatus for determining depth information for at least one image comprising: means for segmenting the at least one image into a plurality of image segments; means for determining at least one junction associated with overlapping objects of the at least one image in response to the plurality of image segments; and means for determining depth information associated with objects of the at least one image in response to the at least one junction. These and other aspects, features and advantages of the invention will be apparent from and elucidated with reference to the embodiment(s) described hereinafter.
BRIEF DESCRIPTION OF THE DRAWINGS An embodiment of the invention will be described, by way of example only, with reference to the drawings, in which FIG. 1 illustrates an example of a T-junction in an image; FIG. 2 illustrates an apparatus for determining depth information for at least one image in accordance with a preferred embodiment of the invention; FIG. 3 illustrates a method of determining depth information of at least one image in accordance with an embodiment of the invention; and FIG. 4 illustrates an example of a relation between the placement of edge points and the segmentation matrix in accordance with an embodiment of the invention.
DESCRIPTION OF PREFERRED EMBODIMENTS FIG. 1 illustrates an example of a T-junction in an image. In the illustrated example, the image comprises a first rectangle 101 and a second rectangle 103. The first rectangle 101 overlaps the second rectangle 103 and accordingly edges of the objects form an intersection known as a T-junction 105. Specifically, a first edge 107 of the second rectangle 103 is cut short by a second edge 109 of the first rectangle. Accordingly, the first edge 107 forms a stem 111 of the T-junction 105 and the second edge 109 forms a top 113 of the T- junction. Thus, in the example the T-junction 105 is the point in the image plane where the object edges 107, 109 form a "T" by one edge 107 terminating on a second edge 109. Humans are capable of identifying that some objects are nearer than others just by the presence of T-junctions. In the example of FIG. 1, it is clear that the first rectangle 101 occludes the second rectangle 103 and thus that the object corresponding to the first rectangle 101 is in front of the object corresponding to the second rectangle 103. FIG. 2 illustrates an apparatus 200 for determining depth information for at least one image in accordance with a preferred embodiment of the invention. The apparatus 200 comprises a segmentation processor 201 which receives one or more images for which depth information is to be provided. The segmentation processor 201 is operable to segment the at least one image into a plurality of image segments. The segmentation processor 201 is coupled to a junction extraction processor 203 and receives the plurality of image segments. The junction extraction processor 203 is operable to determine at least one junction associated with overlapping objects of the at least one image in response to the plurality of image segments. In the preferred embodiment, the junction extraction processor 203 determines a plurality of T-junctions corresponding to intersections of edges between the segments. The junction extraction processor 203 is coupled to a depth information processor 205 which receives characteristics of the detected junctions from the junction extraction processor 203 and image segmentation information from the segmentation processor 201. The depth information processor 205 is operable to determine depth information associated with objects of the at least one image in response to the at least one junction. In the preferred embodiment, the depth information processor 205 generates a depth map for the image in response to a plurality of T-junctions determined by the junction extraction processor 203. FIG. 3 illustrates a method of determining depth information of at least one image in accordance with an embodiment of the invention. The method is applicable to the apparatus of FIG. 2 and will be described with reference to this. In step 301 the segmentation processor 201 receives an image from a suitable source. Step 301 is followed by step 303 wherein the image is segmented into a plurality of image segments. The aim of image segmentation is to group pixels together into image segments which are unlikely to contain depth discontinuities. A basic assumption is that a depth discontinuity causes a sharp change of brightness or colour in the image. Pixels with similar brightness and/or colour are therefore grouped together resulting in brightness/colour edges between regions. In the preferred embodiment, picture segmentation thus comprises the process of a spatial grouping of pixels based on a common property. 
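As a rough illustration of the kind of segmentation step 303 relies on, the following sketch groups 4-connected pixels whose brightness stays within a threshold of a region seed. It is only a minimal flood-fill example under assumed conventions (a numpy greyscale image, arbitrary integer labels); the patent does not prescribe a particular segmentation algorithm, and the function and parameter names are illustrative.

```python
import numpy as np
from collections import deque

def segment_by_brightness(img, threshold=10.0):
    """Group 4-connected pixels whose brightness stays within `threshold` of the
    seed pixel of their region.  Returns an integer segmentation matrix in which
    each element holds the (arbitrary) segment number of that pixel."""
    n_rows, n_cols = img.shape
    seg = np.zeros((n_rows, n_cols), dtype=int)   # 0 means "not yet assigned"
    next_label = 1
    for si in range(n_rows):
        for sj in range(n_cols):
            if seg[si, sj]:
                continue
            seed = float(img[si, sj])
            seg[si, sj] = next_label
            queue = deque([(si, sj)])
            while queue:                           # breadth-first flood fill
                i, j = queue.popleft()
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ni, nj = i + di, j + dj
                    if (0 <= ni < n_rows and 0 <= nj < n_cols
                            and not seg[ni, nj]
                            and abs(float(img[ni, nj]) - seed) <= threshold):
                        seg[ni, nj] = next_label
                        queue.append((ni, nj))
            next_label += 1
    return seg
```

Any segmentation that yields four-connected regions with consistent labels could be substituted here, including a segmentation re-used from a motion estimation stage as described above.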
There exist several approaches to picture- and video segmentation, and the effectiveness of each will generally depend on the application. It will be appreciated that any known method or algorithm for segmentation of a picture may be used without detracting from the invention. In the preferred embodiment, the segmentation includes detecting disjoint regions of the image in response to a common characteristic and subsequently tracking this object from one picture to the next. In one embodiment, the segmentation comprises grouping picture elements having similar brightness levels in the same image segment. Contiguous groups of picture elements having similar brightness levels tend to belong to the same underlying object. Similarly, contiguous groups of picture elements having similar colour levels also tend to belong to the same underlying object and the segmentation may alternatively or additionally comprise grouping picture elements having similar colours in the same segment. In some embodiments, the segmentation comprises grouping picture elements in response to motion characteristics of objects of the at least one image. Conventional motion estimation compression techniques, such as MPEG 2 based video compression, utilise motion estimation to identify and track moving objects between images thereby allowing for regions associated with the moving objects to be coded simply by a motion vector and differential information. Thus, typical video compression techniques comprise processing of image segments based on motion estimation characteristics. In the preferred embodiment, such segmentation is re-used by the segmentation processor 201 thus allowing for the segmentation process to re-use existing segmentations. As segmentation typically is a very resource demanding process, this allows for an efficient method and minimisation of resource usage. Step 303 thus results in the segmentation processor 201 generating a plurality of image segments. The segmentation allows for a high probability that boundaries between objects in the image will correspond to boundaries between image segments. Thus edges of objects in the image may be analysed by investigating the edges of the determined image segments. Step 303 is followed by step 305 wherein the junction extraction processor 203 determines at least one junction associated with overlapping objects of the at least one image in response to the plurality of image segments. In the preferred embodiment, the image is divided into 2 by 2 picture element matrices and each matrix is compared to a predetermined junction detection criterion. If the matrix meets the junction detection criterion it is assumed that the matrix comprises a T- junction, and if the junction detection criterion is not met it is assumed that the matrix does not comprise a T-junction. In some embodiments only part of the picture is divided into 2 by 2 pixel matrices and specifically a lower computational resource use may be achieved by only considering 2 by 2 groups of pixels along the boundaries between the image segments. An approach for determining T-junctions in accordance with the preferred embodiment will be described in more detail in the following. The description will use a notation wherein both the image and segmentation are represented by matrices of size NxM . It is important to define how these matrices relate to the Cartesian coordinate system that is needed to define the location of sample points, junction points and edge points in the underlying image. 
The origin of the Cartesian coordinate system of the underlying image is centred on element (i, j) = (N, 1) of the matrix representing the image. For a given row i and column j of the image matrix, the sample location (i.e. the pixel position in the underlying image) then follows from x = j, y = N − i.
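For concreteness, the matrix-to-Cartesian convention just stated can be captured in a small helper. This is a sketch only; the 1-based row/column indexing and the name sample_location are assumptions for illustration.

```python
def sample_location(i, j, n_rows):
    """Cartesian position of matrix element (i, j) under the convention above:
    x = j, y = N - i, with N (n_rows) the number of rows of the image matrix."""
    return float(j), float(n_rows - i)

# e.g. for a 480-row image, sample_location(1, 3, 480) gives (3.0, 479.0)
```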
In the preferred embodiment, the T-junctions are identified by analysing all 2x2 sub-matrices of the NxM segmentation matrix S. Since the T-junctions are to be detected, the analysis focuses on 3-junctions which are junctions at which exactly three different image segments meet. It should be noted that a 3-junction is not necessarily a T-junction, but may also indicate a fork or an arrow shape (which may for example occur in the image of a cube). A further geometric analysis is therefore needed to determine whether a detected 3-junction may be considered a T-junction. We now introduce the so called segmentation matrix S. An element S(i, j) of this matrix contains the segment number at pixel location (i, j). The segment number itself is arbitrary and in the following we only use the property that the segment number changes at edges and the property that the segmentation is four-connected. In order to extract 3-junctions from the segmentation matrix S, the structure of all possible 2x2 sub-matrices is examined. A sub-matrix contains a 3-junction if exactly one of the four differences S(i, j) − S(i, j+1), S(i, j+1) − S(i+1, j+1), S(i+1, j+1) − S(i+1, j) and S(i+1, j) − S(i, j) is equal to zero. This is for example the case whenever the two equal elements are horizontally or vertically adjacent, but not for example for the following sub-matrix:
1 2
3 1
This sub-matrix is not considered to be a 3-junction because region number 1, which occurs twice, is not 4-connected. This violates the basic assumption that regions in the segmentation must be 4-connected on a square sampling grid. In other words, a 2 by 2 sub-matrix is considered a 3-junction if the four elements correspond to exactly three image segments and the two samples from the same image segment are next to each other either vertically or horizontally (but not diagonally). Finally, the actual junction point in the Cartesian coordinates of the image is placed in the centre of the 2x2 sub-matrix:
x_jun = j + 1/2, y_jun = N − i − 1/2,
where i and j are the first row and first column of the 2x2 sub-matrix.
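A minimal sketch of this 2x2 sub-matrix scan is given below. It assumes the segmentation matrix is a numpy integer array and uses the junction-point placement just derived; numpy's 0-based indexing is used, so absolute coordinates may be offset by one pixel relative to the text's convention, and the function name and return format are illustrative rather than taken from the patent.

```python
import numpy as np

def find_3_junctions(seg):
    """Scan every 2x2 sub-matrix of the segmentation matrix `seg` and return
    candidate 3-junctions as ((x_jun, y_jun), (label_a, label_b, label_c))."""
    n_rows = seg.shape[0]
    junctions = []
    for i in range(seg.shape[0] - 1):
        for j in range(seg.shape[1] - 1):
            block = seg[i:i + 2, j:j + 2]
            labels = np.unique(block)
            if labels.size != 3:
                continue                 # exactly three different segments needed
            # The duplicated segment number must form a horizontal or vertical
            # (4-connected) pair; a diagonal pair violates the 4-connectivity
            # assumption and is rejected.
            if block[0, 0] == block[1, 1] or block[0, 1] == block[1, 0]:
                continue
            x_jun = j + 0.5              # centre of the 2x2 sub-matrix
            y_jun = n_rows - i - 0.5
            junctions.append(((x_jun, y_jun), tuple(int(v) for v in labels)))
    return junctions
```

Restricting the scan to 2x2 groups along segment boundaries, as mentioned above, would reduce the computational load further.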
Step 305 is followed by step 307 wherein the depth information processor 205 determines depth information of one or more objects in the image. The depth information is determined in response to the detected 3-junctions and in the preferred embodiment a large number of T-junctions are used to develop a depth map for a plurality of objects in the image. As illustrated in FIG. 1, a T-junction has a characteristic geometry where one edge, known as the stem 111, ends abruptly in the middle of a second edge, known as the top 113. Identification of the top and stem is used in deriving a possible depth order. To identify the top and the stem, it is in the preferred embodiment assumed that both are straight lines which pass through the junction point (x_jun, y_jun), but with an arbitrary orientation angle.
Accordingly, the junction is fitted to first and second curves which in the preferred embodiment are straight lines. In other embodiments more complex curves may be used. Initially, a plurality of edge points associated with edges between the image segments forming the at least one junction is determined. Thus, edge points are extracted from the segmentation matrix. Figure 4 illustrates an example of a relation between the placement of edge points and the segmentation matrix 400. In FIG. 4, the three image segments surrounding a junction point are identified by the set {l,2,3}. Edge points are now placed between rows and columns in the segmentation matrix, only if the values of the segmentation matrix changes from one row to the next or from one column to the next. As illustrated in FIG. 4 this results in three edge points 401 between image segment 1 and image segment 2, three edge points 403 between image segment 2 and image segment 3 and four edge points 405 between image segment 1 and image segment 3. Preferably, only edge points within a given radius of the junction are determined. This allows for a reduced complexify and computational requirement yet results in good performance. Specifically, only the subset of edge points that fall inside a circle with radius R are used in the calculations. It has been found that using a radius R of 3-20 pixels provide desirable performance yet achieves low complexify and computational burden. Edge points between rows are calculated from matrix indices (i,j) and
(i+1, j) as

x = j,   y = N - (i + 1/2)

with the condition that L_{i,j} ≠ L_{i+1,j}, L_{i,j} ∈ {1,2,3}, L_{i+1,j} ∈ {1,2,3} and (x - x_jun)^2 + (y - y_jun)^2 ≤ R^2, where L denotes the segmentation matrix with the three segments surrounding the junction labelled 1, 2 and 3. Edge points between columns are calculated from matrix indices (i, j) and (i, j+1) as

x = j + 1/2,   y = N - i

with the condition that L_{i,j} ≠ L_{i,j+1}, L_{i,j} ∈ {1,2,3}, L_{i,j+1} ∈ {1,2,3} and (x - x_jun)^2 + (y - y_jun)^2 ≤ R^2.
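Purely as an illustration, this edge-point extraction could be sketched as follows; the helper name collect_edge_points, the label array L (with the three segments around the junction relabelled 1, 2 and 3) and the grouping into a dictionary keyed by segment pairs are assumptions of the sketch rather than part of the described method.

    def collect_edge_points(L, x_jun, y_jun, R=10.0):
        """Collect edge points around a junction, grouped per segment pair.

        An edge point is placed midway between two vertically or horizontally
        neighbouring elements of L whose labels differ, both labels lying in
        {1, 2, 3}, provided the point falls inside a circle of radius R
        around the junction point.
        """
        N, M = L.shape
        edges = {}                     # e.g. edges[(1, 2)] -> list of (x, y)
        valid = (1, 2, 3)
        for i in range(N - 1):         # edge points between rows i and i+1
            for j in range(M):
                p, q = L[i, j], L[i + 1, j]
                if p != q and p in valid and q in valid:
                    x, y = float(j), N - (i + 0.5)
                    if (x - x_jun) ** 2 + (y - y_jun) ** 2 <= R ** 2:
                        edges.setdefault(tuple(sorted((p, q))), []).append((x, y))
        for i in range(N):             # edge points between columns j and j+1
            for j in range(M - 1):
                p, q = L[i, j], L[i, j + 1]
                if p != q and p in valid and q in valid:
                    x, y = j + 0.5, float(N - i)
                    if (x - x_jun) ** 2 + (y - y_jun) ** 2 <= R ** 2:
                        edges.setdefault(tuple(sorted((p, q))), []).append((x, y))
        return edges

The three lists keyed by the pairs (1, 2), (1, 3) and (2, 3) then correspond to the three edge-point subsets introduced next.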
The edge points are then split into three subsets:

(x_k^(1), y_k^(1)), 1 ≤ k ≤ N^(1)   (edge points that lie on the edge between image segments 1 and 2)
(x_k^(2), y_k^(2)), 1 ≤ k ≤ N^(2)   (edge points that lie on the edge between image segments 1 and 3)
(x_k^(3), y_k^(3)), 1 ≤ k ≤ N^(3)   (edge points that lie on the edge between image segments 2 and 3)

where the superscript indicates the edge number. Given the three sets of edge points, there are three possibilities for assigning the T-junction top and stem:

Model 1: edges 1 and 2 form the top and edge 3 forms the stem
Model 2: edges 1 and 3 form the top and edge 2 forms the stem
Model 3: edges 2 and 3 form the top and edge 1 forms the stem

Each model has two parameters, the line orientation angles φ_top and φ_stem, which can vary between 0 and π. These parameters are in the preferred embodiment determined by minimizing the sum of squared perpendicular distances between the edge points and the line. The sum of squared distances may be determined from the separate contributions from edges 1, 2 and 3. For instance for edge 1, this sum as a function of the orientation angle φ is given by

d^(1)(φ) = Σ_{k=1..N^(1)} [ -(x_k^(1) - x_jun) sin φ + (y_k^(1) - y_jun) cos φ ]^2

Similar expressions result for edges 2 and 3. For T-junction model 1, the total squared distance is the sum of the contribution from the top and the contribution from the stem. These contributions are given by

d_top^(1)(φ) = d^(1)(φ) + d^(2)(φ),   d_stem^(1)(φ) = d^(3)(φ)
where the superscript now denotes the model. For T-junction model 1, the solution angles for the T-junction top and stem are given by

φ_top = argmin_φ d_top^(1)(φ),   φ_stem = argmin_φ d_stem^(1)(φ)
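One way to carry out this minimisation is sketched below; it uses the closed-form minimiser of the sum of squared perpendicular distances to a line through the junction point (the standard principal-orientation formula), and the function names are illustrative only.

    import math

    def line_cost(points, x_jun, y_jun, phi):
        """Sum of squared perpendicular distances from the points to a line
        through (x_jun, y_jun) with orientation angle phi."""
        return sum((-(x - x_jun) * math.sin(phi) + (y - y_jun) * math.cos(phi)) ** 2
                   for x, y in points)

    def best_angle(points, x_jun, y_jun):
        """Angle in [0, pi) minimising line_cost for a line through the junction."""
        sxx = sum((x - x_jun) ** 2 for x, _ in points)
        syy = sum((y - y_jun) ** 2 for _, y in points)
        sxy = sum((x - x_jun) * (y - y_jun) for x, y in points)
        return (0.5 * math.atan2(2.0 * sxy, sxx - syy)) % math.pi

    def fit_model(top_points, stem_points, x_jun, y_jun):
        """Fit one T-junction model and return (phi_top, phi_stem, total cost)."""
        phi_top = best_angle(top_points, x_jun, y_jun)
        phi_stem = best_angle(stem_points, x_jun, y_jun)
        cost = (line_cost(top_points, x_jun, y_jun, phi_top)
                + line_cost(stem_points, x_jun, y_jun, phi_stem))
        return phi_top, phi_stem, cost

For model 1, top_points would be the union of edge sets 1 and 2 and stem_points would be edge set 3; the analogous groupings apply to models 2 and 3, and the total costs returned by fit_model can then be compared across the three models as described next.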
Similar expressions result for T-junction models 2 and 3. Let m ∈ {1, 2, 3} denote the model number. The best model is selected as the one that minimizes the sum of all perpendicular distances between the edge points and the T-junction:

m_best = argmin_m [ d_top^(m)(φ_top) + d_stem^(m)(φ_stem) ]

Thus, in the preferred embodiment a plurality of junction models have been predefined or predetermined. Specifically, junction models comprising straight lines and corresponding to the possible assignments of these lines to the stem and top of a T-junction have been determined. The step of determining depth information then comprises fitting each junction model to the data and selecting the model with the best fit.

Once the T-junction top and stem are known, depth information corresponding to the relative depth order of the adjacent image segments is readily available. As is clear from FIG. 1, the image segment which partly forms the top but not the stem is inherently in front of the image segments forming the stem. Depth information between the two image segments forming the stem cannot directly be derived from the T-junction. However, in the preferred embodiment many T-junctions are determined, and specifically a given object may have many corresponding T-junctions. Therefore, relative depth information may be determined by considering the relative depth information of all objects.

In the preferred embodiment, a certainty or reliability measure is further determined for each T-junction. The certainty measure is indicative of the certainty or reliability of the generated depth information. The certainty measure may thus increase the accuracy of processes considering a plurality of T-junctions and specifically may increase the accuracy of a depth map of the picture. For example, each T-junction may be weighted in accordance with the certainty measure, thereby providing for conflicts between different depth estimates to be resolved and/or taken into account. In the preferred embodiment the certainty measure is determined by combining a geometric certainty measure and a photometric certainty measure. The geometric certainty measure is determined in response to an accuracy of the fit of the first and second straight lines, and the photometric certainty measure is determined in response to a colour variation between at least two image segments of the junction. In some embodiments, only one of the geometric or photometric certainty measures may be used. As an example of the geometric certainty measure, the error of the best model may be compared to the error of the worst model. A suitable measure is

P_G = 1 - min_m [ d_top^(m) + d_stem^(m) ] / max_m [ d_top^(m) + d_stem^(m) ]
For a perfect T-shape, min_m [ d_top^(m) + d_stem^(m) ] = 0 and therefore P_G = 1. For a Y-shape that has 120 degree angles between the edges, min_m [ d_top^(m) + d_stem^(m) ] = max_m [ d_top^(m) + d_stem^(m) ] ≠ 0 and therefore P_G = 0. The division by the error of the worst model may be excluded in some embodiments but provides the advantage of normalising P_G between 0 and 1.

A preferred photometric certainty measure determines the colour variations around the T-junction point. If the colour contrast is high, it is more likely that there is indeed a depth step than when the colour contrast is low, as similar colours typically are indicative of the same object and therefore not of a depth step. For example, if the image comprises a blue cube, the three sides adjacent a corner may have different shades of blue due to light reflections. The corner may be detected as a 3-junction, but it will result in a very low photometric certainty measure as the colour contrast between the segments will be low. In contrast, if the blue cube occludes a yellow object, any T-junctions between the two objects will include a colour contrast between the blue and the yellow, thus resulting in a high photometric certainty measure. More specifically, a suitable photometric certainty measure P_P may be proportional to the minimum colour difference vector of the three possible image segments of the junction:

P_P ∝ min( |I_1 - I_2|, |I_1 - I_3|, |I_2 - I_3| )

where I = (I_r, I_g, I_b), with r, g and b denoting the red, green and blue colour channels and |I| denoting the magnitude of the vector I. In order to scale P_P between 0 and 1, the right hand side may be normalised by the magnitude of the largest possible difference vector. Normalisation will not be required for applications where only the relative strength in an image is used. The mean colour vectors I_1, I_2 and I_3 are calculated using pixel locations that lie within a given distance (e.g. measured in pixels) from the bifurcation point, i.e. from the junction point. As mentioned previously, the preferred embodiment comprises determining a combined certainty measure from the geometric and photometric certainty measures. Specifically, the individual normalised measures may be multiplied:

P = P_G · P_P
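As an illustration only, the certainty measures and their combination could be sketched as follows; the mean segment colours are assumed to be available as 8-bit RGB vectors, and the function names are illustrative rather than part of the claimed method.

    import math

    def geometric_certainty(model_costs):
        """P_G = 1 - (cost of best model) / (cost of worst model)."""
        best, worst = min(model_costs), max(model_costs)
        if worst == 0.0:    # degenerate case: all edge points lie exactly on the fitted lines
            return 1.0
        return 1.0 - best / worst

    def photometric_certainty(mean_colours, max_diff=math.sqrt(3.0) * 255.0):
        """P_P: smallest pairwise distance between the three mean segment
        colours, normalised by the largest possible 8-bit RGB difference."""
        i1, i2, i3 = mean_colours
        def dist(a, b):
            return math.sqrt(sum((ca - cb) ** 2 for ca, cb in zip(a, b)))
        return min(dist(i1, i2), dist(i1, i3), dist(i2, i3)) / max_diff

    def combined_certainty(model_costs, mean_colours):
        """Combined certainty P = P_G * P_P."""
        return geometric_certainty(model_costs) * photometric_certainty(mean_colours)

Multiplying the two normalised measures means that a junction only receives a high combined certainty when it scores well both geometrically and photometrically.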
In some embodiments, the certainty measures may be used to reject a candidate T-junction. For example, only 3-junctions having a certainty measure above a given threshold may be included in determining depth information used in a depth map.

The invention can be implemented in any suitable form, including hardware, software, firmware or any combination of these. However, preferably, the invention is implemented as computer software running on one or more data processors and/or digital signal processors. The elements and components of an embodiment of the invention may be physically, functionally and logically implemented in any suitable way. Indeed, the functionality may be implemented in a single unit, in a plurality of units or as part of other functional units. As such, the invention may be implemented in a single unit or may be physically and functionally distributed between different units and processors.

Although the present invention has been described in connection with the preferred embodiment, it is not intended to be limited to the specific form set forth herein. Rather, the scope of the present invention is limited only by the accompanying claims. In the claims, the term comprising does not exclude the presence of other elements or steps. Furthermore, although individually listed, a plurality of means, elements or method steps may be implemented by e.g. a single unit or processor. Additionally, although individual features may be included in different claims, these may possibly be advantageously combined, and the inclusion in different claims does not imply that a combination of features is not feasible and/or advantageous. In addition, singular references do not exclude a plurality. Thus references to "a", "an", "first", "second" etc. do not preclude a plurality.

Claims

CLAIMS:
1. A method of determining depth information for at least one image comprising the steps of:
segmenting (303) the at least one image into a plurality of image segments;
determining (305) at least one junction associated with overlapping objects of the at least one image in response to the plurality of image segments; and
determining (307) depth information associated with objects of the at least one image in response to the at least one junction.
2. A method of determining depth information as claimed in claim 1 wherein the step of segmenting (303) comprises grouping picture elements having similar brightness levels in the same image segment.
3. A method of determining depth information as claimed in claim 1 wherein the step of segmenting (303) comprises grouping picture elements having similar colours in the same image segment.
4. A method of determining depth information as claimed in claim 1 wherein the step of segmenting (303) comprises grouping picture elements in response to motion characteristics of objects of the at least one image.
5. A method of determining depth information as claimed in claim 1 wherein the step (305) of determining the at least one junction comprises dividing at least a part of the at least one image into a plurality of 2 by 2 picture element matrices and comparing each matrix to a predetermined junction detection criterion.
6. A method of determining depth information as claimed in claim 1 wherein the step of determining depth information (307) comprises fitting a junction model to the at least one junction and determining the depth information in response to characteristics of the fitted junction model.
7. A method of determining depth information as claimed in claim 6 wherein the fitting of the junction model comprises selecting the junction model from a plurality of predefined junction models.
8. A method of determining depth information as claimed in claim 1 wherein the step of determining depth information (307) comprises determining a plurality of edge points (401, 403, 405) associated with edges between segments forming the at least one junction.
9. A method of determining depth information as claimed in claim 8 wherein the plurality of edge points (401, 403, 405) comprises only edge points within a given radius of the at least one junction.
10. A method of determining depth information as claimed in claim 8 wherein the step of determining depth information (307) comprises fitting a first and second curve to the plurality of edge points (401, 403, 405).
11. A method of determining depth information as claimed in claim 10 wherein the at least one junction is a T-junction and the first curve corresponds to a stem of the T- junction and the second curve corresponds to a top of the T-junction, and the step of determining depth information (307) comprises determining that two segments separated by the first curve are behind the segment separated from the two segments by the second curve.
12. A method of determining depth information as claimed in claim 10 further comprising the step of determining a certainty measure of the depth information in response to an accuracy of a fit of the first and second curve.
13. A method of determining depth information as claimed in claim 1 further comprising the step of determining a certainty measure of the depth information in response to a colour variation between at least a first and second segment associated with the at least one junction.
14. A method of determining depth information as claimed in claim 1 wherein the plurality of image segments are disjoint regions of the at least one image.
15. A method of determining depth information as claimed in claim 1 wherein the at least one junction is a T-junction.
16. A computer program enabling the carrying out of a method according to claim 1.
17. A record carrier comprising a computer program as claimed in claim 16.
18. An apparatus for determining depth information for at least one image comprising:
means (201) for segmenting the at least one image into a plurality of image segments;
means (203) for determining at least one junction associated with overlapping objects of the at least one image in response to the plurality of image segments; and
means (205) for determining depth information associated with objects of the at least one image in response to the at least one junction.
PCT/IB2004/051359 2003-08-07 2004-08-02 Determining depth information WO2005015497A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP03102463 2003-08-07
EP03102463.1 2003-08-07

Publications (1)

Publication Number Publication Date
WO2005015497A1 true WO2005015497A1 (en) 2005-02-17

Family

ID=34130287

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2004/051359 WO2005015497A1 (en) 2003-08-07 2004-08-02 Determining depth information

Country Status (1)

Country Link
WO (1) WO2005015497A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2747028A1 (en) 2012-12-18 2014-06-25 Universitat Pompeu Fabra Method for recovering a relative depth map from a single image or a sequence of still images

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
AKHRIEV A, BONCH-OSMOLOVSKY A, PRUSAKOV A: "Objects extraction and occlusion relations in a scene from its single color image Part 2", SEMANTIC SEGMENTATION TECHNOLOGY, 2000, MOSCOW, pages 1 - 22, XP002307081, Retrieved from the Internet <URL:http://topazk.ru/ssl/Project1/downloads/Report1_2.PDF> *
AKHRIEV A, BONCH-OSMOLOVSKY A, PRUSAKOV A: "Objects extraction and occlusion relations in a scene from its single color image", SEMANTIC SEGMENTATION TECHNOLOGY, 2000, MOSCOW, pages 1 - 29, XP002307082, Retrieved from the Internet <URL:http://topazk.ru/ssl/Project1/downloads/Report2.PDF> *
MAHADEVAN S ET AL: "Detection of triple junction parameters in microscope images", PROCEEDINGS OF THE SPIE - THE INTERNATIONAL SOCIETY FOR OPTICAL ENGINEERING SPIE-INT. SOC. OPT. ENG USA, vol. 4387, 19 April 2001 (2001-04-19), pages 204 - 214, XP002307520, ISSN: 0277-786X *
NITZBERG M ET AL: "The 2.1-D sketch", PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON COMPUTER VISION. OSAKA, DEC. 4 - 7, 1990, LOS ALAMITOS, IEEE COMP. SOC. PRESS, US, vol. CONF. 3, 4 December 1990 (1990-12-04), pages 138 - 144, XP010020045, ISBN: 0-8186-2057-9 *
ROHR K: "RECOGNIZING CORNERS BY FITTING PARAMETRIC MODELS", INTERNATIONAL JOURNAL OF COMPUTER VISION, KLUWER ACADEMIC PUBLISHERS, NORWELL, US, vol. 9, no. 3, 1 December 1992 (1992-12-01), pages 213 - 230, XP000328502, ISSN: 0920-5691 *
SHEN F ET AL: "Corner detection based on modified Hough transform", PATTERN RECOGNITION LETTERS, NORTH-HOLLAND PUBL. AMSTERDAM, NL, vol. 23, no. 8, June 2002 (2002-06-01), pages 1039 - 1049, XP004345423, ISSN: 0167-8655 *

Similar Documents

Publication Publication Date Title
US11562498B2 (en) Systems and methods for hybrid depth regularization
Ruta et al. In-vehicle camera traffic sign detection and recognition
US7684592B2 (en) Realtime object tracking system
US9165211B2 (en) Image processing apparatus and method
CN104217208B (en) Object detection method and device
US8737769B2 (en) Reconstruction of sparse data
Wang et al. Evaluating edge detection through boundary detection
JP2002288658A (en) Object extracting device and method on the basis of matching of regional feature value of segmented image regions
Yalic et al. Automatic Object Segmentation on RGB-D Data using Surface Normals and Region Similarity.
Yammine et al. Novel similarity-invariant line descriptor and matching algorithm for global motion estimation
Liu et al. Dense stereo correspondence with contrast context histogram, segmentation-based two-pass aggregation and occlusion handling
Ruzon et al. Corner detection in textured color images
Kovacs et al. Orientation based building outline extraction in aerial images
US7440636B2 (en) Method and apparatus for image processing
US20060251337A1 (en) Image object processing
WO2005015497A1 (en) Determining depth information
Kovacs et al. Edge detection in discretized range images
KR101454692B1 (en) Apparatus and method for object tracking
Bergevin et al. Detection and characterization of junctions in a 2D image
KR101357581B1 (en) A Method of Detecting Human Skin Region Utilizing Depth Information
Zhu et al. Occlusion registration in video-based augmented reality
WO2005052858A1 (en) Determining depth information for a video image
Huang et al. Stereo object proposals
Tatematsu et al. Detection and segmentation of moving objects from dynamic RGB and depth images
Kim et al. Depth estimation with manhattan world cues on a monocular image

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
122 Ep: pct application non-entry in european phase