CROSSREFERENCE TO RELATED APPLICATIONS

[0001]
The present application claims the benefit of and priority from the U.S. Provisional Patent Application No. 61/647,794 filed on May 16, 2012 and titled “ThreeDimensional Imaging from Visible Light Endoscopic Images.” The disclosure of this provisional patent application is incorporated herein by reference in its entirety.
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH

[0002]
This invention was made with government support under grant R21 IR21EB00849001A1 awarded by the National Institute of Health. The government has certain rights in the invention.
TECHNICAL FIELD

[0003]
The present invention relates to imaging system and method and, in particular, to an endoscopic imaging system enabled to form an image of the closed bodily cavity on a true metric scale determined from a series of images of the cavity and data received from an auxiliary device and describing the repositioning of the endoscopic imaging system.
BACKGROUND

[0004]
Medical treatment of damaged tissue (especially that of ear, nose, throat) often requires physiological measurement. Upper airway conditions such as subglottic stenosis and sleep apnea restricts breathing and may require surgical repair to widen the airway. Accordingly, the measurement of the diameter of an airway facilitates the diagnosis of the severity of the illness and the determination of the required treatment.

[0005]
There are two common options for measuring the size of the pediatric airway, each less then perfect. Computed Tomography (CT) scanning can be used to provide accurate measurements. While this method provides substantially accurate results, the radiation exposure of the bodily tissue has significant associated risks, especially in pediatric patients. As an alternative, the anatomy of the airway can be measured by fitting a series of graduated endotracheal tubes into the airway during an endoscopic examination. This method is not accurate and may cause further damage to the airway.

[0006]
Surgical robotics (particularly when used for minimally invasice and natural orifice translumenal endoscopic surgery) has driven the need for threedimensional (3D) modeling of visible light images obtained from endoscopes and laparoscopes. Stereo laparoscopes exist and are used clinically. No widely used commercial stereo endoscope currently exists, in part due to the smaller diameter (4.5 mm as opposed to 5 to 10 mm for laparoscopes).

[0007]
State of the art reconstruction techniques such as, for example, those discussed by Agarwal et al. (Reconstructing rome, in IEEE Computer, pp. 4047, 2010), Goesele et al. (Scene reconstruction from community photo collections, in Computer, vol. 43, pp, 4853, 2010), and Frahm et al. (Building rome on a cloudless day, in ECCV. Berlin, Heidelberg, pp. 368381; SpringerVerlag, 2010) rely heavily on the ability to extract and match salient image features. While such approach may work reasonably well with images obtained under the conditions of consistent lighting of the imaged scene and the scene that has welldefined edges and corners, the biological tissue is smooth and specularly reflective (especially when wet). Therefore, features such as SIFT and RIFT become unreliable.

[0008]
Similarly, while methods such as SCRAMSAC (see, for example, Sattler et al., Scramsac: Improving ransac's efficiency with a spatial consistency filter, in ICCV, pp. 20902097, 2009) and ASKC (see Wang et al., A generalized kernel consensusbased robust estimator, in IEEE TPAMI, vol. 32, pp.178184, January 2010) can be used to compute camera motion with much more than 50 percent outlier data, the small amount of remaining inlier data is not sufficient for estimating the measured surface.

[0009]
Another approach (devised to determine the shape of the imaged surface with the use of fast shape from the shading method based on the estimates of local surface changes, see Wang et al., in Measurement Science and Technology, vol. 20, no. 12, p. 125801, 2009) relies on a measurement of relative surface deformation and, as a result, does not contain metric descriptors and characteristics of the imaged surface, thereby limiting the utility of the application of the approach to the recovery of the airway diameter. Moreover, this approach assumes a constant albedo. Under such an assumption, specularities and blood vessels may cause imaging artifacts in the resulting reconstruction of the surface.

[0010]
There remains a need in a system and method that enable a reconstruction of an imaged surface in association with true, metric scale and facilitate absolute measurement of the geometry of the imaged specularly reflective and smooth anatomical regions.
SUMMARY

[0011]
Embodiments of the invention provide a method for determining geometrical characteristics of a closed cavity associated with a tissue with the use of an imaging system. The method includes receiving, at a dataprocessing unit of the imaging system, first data acquired by an optical camera of the imaging system (the camera being repositionable within the closed cavity), where such first data represents the closed cavity on a processdetermined scale associated with the closed cavity. The method further includes (i) tracking repositioning of the camera in threedimensional (3D) space with the use of a tracking element of the imaging system to generate second data representing position and orientation of the camera in 3D (the tracking element being in operable communication with the dataprocessing unit; and (ii) with the dataprocessing unit, recovering a metric scale of the closed cavity based on the first and second data, wherein the dataprocessing unit is programmed to derive position and orientation of the camera with respect to a camera's reference frame. The method may additionally include determining structure of the closed cavity on a true geometrical scale and with respect to the derived position and orientation of the camera. Alternatively or in addition, the method may include producing a visuallyperceivable representation of a structure of the closed cavity in reference to the recovered metric scale. In one implementation, the method is devoid of using xray radiation and/or sizing the closed cavity with the use of a mechanical means.

[0012]
The method optionally includes processing the first data to recover features that are capable of being matched to detect the repositioning of the camera in reference to a view of a surface of the closed cavity. The processing of the first data may includes determining of the surface of the closed cavity based at least in part on intensity of light characterizing the surface in reflectance, wherein the determining comprises adjusting of surface properties that have been computed with the use of said features to define a recovered surface and computation of geometry of local patches of the surface. Furthermore, the method may include combining the recovered surface and the computed geometry of local patches to form the surface of the closed cavity. The camera employed in the method may contain a camera associated with an endoscope unit that has been positioned in a tubular bodily cavity, and the geometrical characteristics include a diameter of the tubular bodily cavity graded on the recovered true geometrical scale.

[0013]
Embodiments of the invention additionally provide an imaging system enabled to generate an image representing a closed bodily cavity. The imaging system includes an optical camera configured to acquire first data from inside the closed cavity, where the acquired first data represents a threedimensional (3D) structure of said closed cavity on a local topographical scale and the optical camera is enabled to move with respect to the closed cavity. The imaging system further includes a tracking device in operable communication with the optical camera and enabling an acquisition of second data that represents repositioning of the optical camera with respect to the closed cavity; and a computer processor in operable cooperation with the optical camera and the tracking element. The computer processor is programmed, with program code stored on a tangible, nontransitory computerreadable medium that is in operable communication with the processor, to recover a metric scale associated with geometry of the closed cavity based on the first and second data; and to determine geometrical characteristics of the closed cavity based on the 3D structure of the closed cavity and with respect to the recovered metric scale.

[0014]
The imaging system optionally further contains a display device, and the computer processor is further programmed to cause a formation, on the display device, of a visuallyperceivable rendering of the closed cavity, said rendering being formed in correspondence with the determined geometrical characteristics. Alternatively or in addition, the computer processor may be programmed to enable identification of scaleinvariant features of the closed cavity based on stereotriangulation of the first data and determination of a threedimensional (3D) surface of a portion of the closed cavity based on image intensity data from the first data and the identified scaleinvariant features. Furthermore, in one embodiment the computer processor may be programmed to determine a metric scale of motion of the optical camera within the closed cavity based on the second data and to determine a geometrical measure associated with the closed cavity based on the recovered metric scale and the determined 3D surface. In a related embodiment, an article of manufacture is provided that comprises an elongated body and a portion of the imaging system, wherein the optical system is housed inside the elongated body, and wherein a portion of the tracking device includes an electrical coil juxtaposed with the elongated body.

[0015]
Embodiments of the invention additionally provide a method for forming a threedimensional (3D) rendering a closed cavity of a patient's anatomy, which method includes the steps of (i) acquiring a series of images of the closed cavity with an imaging system repositionably located therein (different images from the series corresponding to different positions of the imaging system); (ii) determining a motion of the imaging system based on the images from the series with a dataprocessing system that is operably cooperated with the imaging system; (iv) computing a graded dimension of the closed cavity based on a metric scale of the motion of the imaging system that has been determined based at least in part on data representing the repositioning of the imaging system and acquired with the use of a tracking device located externally with respect to the closed cavity; and (v) forming the 3D image of at least a portion of the closed cavity in proportion to the computed graded dimension. The method may additionally include determining the metric scale of the motion of the imaging system based on the data received from the tracking device that is enabled to track at least one of orientation and position of the imaging system in relation to the closed cavity. Such determination of the metric scale, in one implementation, includes determining the metric scale based on the data received from at least one of an electromagnetic tracker, an accelerometer, a gyroscope, and an optical tracker. Alternatively or in addition, the method comprises determining a structure associated with the closed cavity based on the determined motion of the imaging system and the data representing the repositioning of the imaging system.
BRIEF DESCRIPTION OF THE DRAWINGS

[0016]
The invention will be more fully understood by referring to the following Detailed Description in conjunction with the Drawings, of which:

[0017]
FIG. 1A is a diagram of an embodiment of an imaging system of the invention.

[0018]
FIG. 1B is a diagram of an alternative embodiment of the imaging system of the invention.

[0019]
FIG. 2A is a flowchart illustrating an embodiment of the method of the invention;

[0020]
FIG. 2B is a flowchart providing additional illustration of an embodiment of the invention.

[0021]
FIG. 3A shows plots illustrating levelset area measurements of the initial, corrected, and groundtruth surfaces of a closed cavity of choice.

[0022]
FIG. 3B shows plots illustrating the expected and actual brightness differential of image pixels (under initial and final surface estimates).

[0023]
FIG. 4 is an endoscopic image of a larynx cavity.

[0024]
FIGS. 5A and 5B each show a set of diagrams I, II, and III. The diagrams I are images of larynxes recorded with a system of the invention that utilized an endoscope. The diagrams II show the QE surfaces of the corresponding larynxes with the reprojected image. The diagrams III are shaded surfaces of the larynxes rendered with the use of Phong reflectance model according to an embodiment of the invention.

[0025]
FIGS. 6A, 6B provide examples of registration between the surface of a closed cavity obtained with an embodiment of the invention and that obtained with a different, reference imaging modality. 6A: trachea; 6B: tracheal bifurcation.

[0026]
FIG. 7 is a diagram illustrating an example of a levelset curve used to compute the diameter of an airway cavity according to an embodiment of the invention. The principal axis of the volume of the cavity is orthogonal to the plane that is shown to intersect the cavity.

[0027]
FIGS. 8A through 8D illustrate the comparison between the diameters of the airway cavity measured based on the CT scan data and data obtained with an embodiment of the invention (QE measurement).
DETAILED DESCRIPTION

[0028]
A person of skill in the art will readily recognize that, as a corollary to the endoscopic imaging, the true scale of the motions of the imaging camera is not known due to the ambiguity of scale of the imaged scene that is caused by the relative motion of the imaging system and the scene.

[0029]
The present invention is directed to a system and method devised to enable accurate reconstruction of human anatomy with the use of images acquired with a single moving endoscope. The method, referred to herein as quantitative endoscopy (or QE), effectuates the estimate of the geometry of an imaged surface (and, in particular, of an imaged bodily cavity such as a pediatric airway) on a true metric scale based on relative motion between an imaging camera and the tissue, sparse bundle adjustment, and estimation of shape of the surface. To independently verify the results of such reconstruction, the devised geometry is compared with that obtained with the use of a computed topography (CT) approach. While the details pertaining to the invention are discussed below mostly in reference to the regionofinterest (ROI) that includes noncollapse tubular cavities such as subglottic and paraglottic regions, it is appreciated that, generally, the reconstruction of anatomy corresponding to any closed bodily cavity is within the scope of the invention.

[0030]
In particular, images acquired during the endoscopic exploration of a subglottic airway were used to generate a 3D mesh representation of the inside (the hollow) of the airway. Embodiments of the invention, which are specifically structured to characterize featurepoor and highlyvariant to illumination changes surfaces, provide a practical alternative to the mechanical measurement of a dimension of the bodily cavity and do not require such mechanical measurement to produce the metricallyaccurate dimension(s) of the bodily cavity.

[0031]
Embodiments of the invention employ a realistic photometric model for surface reconstruction and refinement algorithm. Specifically, according to an embodiment, a sparse set of surface features is extracted from the smooth and specular images acquired with an imaging system of the invention to generate a lowfrequency thinplate spline representation. The representation of the surface is then refined based on the surface's reflective properties. The resulting rendering of the surface is sufficiently accurate for clinical use, specifically in the planning of reconstructive surgery of the subglottic and paraglottic airway.

[0032]
As discussed below, volumetric reconstructions of anatomic regions based on visible light images acquired during exploration of the region with monocular endoscopic system. Using visible light images provides a cheap alternative to Xray based imaging and prevents the need for radiation exposure or potentially harmful mechanical sizing of the imaged anatomical region. The results of metricallyaccurate volume reconstruction is further compared with measurements obtained from the CT scans.

[0033]
References throughout this specification to “one embodiment,” “an embodiment,” “a related embodiment,” or similar language mean that a particular feature, structure, or characteristic described in connection with the referred to “embodiment” is included in at least one embodiment of the present invention. Thus, appearances of the phrases “in one embodiment,” “in an embodiment,” and similar language throughout this specification may, but do not necessarily, all refer to the same embodiment. It is to be understood that no portion of disclosure, taken on its own and in possible connection with a figure, is intended to provide a complete description of all features of the invention.

[0034]
In the drawings like numbers are used to represent the same or similar elements wherever possible. The depicted structural elements are generally not to scale, and certain components are enlarged relative to the other components for purposes of emphasis and understanding. It is to be understood that no single drawing is intended to support a complete description of all features of the invention. In other words, a given drawing is generally descriptive of only some, and generally not all, features of the invention. A given drawing and an associated portion of the disclosure containing a description referencing such drawing do not, generally, contain all elements of a particular view or all features that can be presented is this view, for purposes of simplifying the given drawing and discussion, and to direct the discussion to particular elements that are featured in this drawing. A skilled artisan will recognize that the invention may possibly be practiced without one or more of the specific features, elements, components, structures, details, or characteristics, or with the use of other methods, components, materials, and so forth. Therefore, although a particular detail of an embodiment of the invention may not be necessarily shown in each and every drawing describing such embodiment, the presence of this detail in the drawing may be implied unless the context of the description requires otherwise. In other instances, well known structures, details, materials, or operations may be not shown in a given drawing or described in detail to avoid obscuring aspects of an embodiment of the invention that are being discussed. Furthermore, the described single features, structures, or characteristics of the invention may be combined in any suitable manner in one or more further embodiments.

[0035]
Moreover, if the schematic flow chart diagram is included, it is generally set forth as a logical flowchart diagram. As such, the depicted order and labeled steps of the logical flow are indicative of one embodiment of the presented method. Other steps and methods may be conceived that are equivalent in function, logic, or effect to one or more steps, or portions thereof, of the illustrated method. Additionally, the format and symbols employed are provided to explain the logical steps of the method and are understood not to limit the scope of the method. Although various arrow types and line types may be employed in the flowchart diagrams, they are understood not to limit the scope of the corresponding method. Indeed, some arrows or other connectors may be used to indicate only the logical flow of the method. For instance, an arrow may indicate a waiting or monitoring period of unspecified duration between enumerated steps of the depicted method. Without loss of generality, the order in which processing steps or particular methods occur may or may not strictly adhere to the order of the corresponding steps shown.

[0036]
The invention as recited in claims appended to this disclosure is intended to be assessed in light of the disclosure as a whole, including features disclosed in prior art to which reference is made.

[0037]
FIGS. 1A, 1B show schematically an embodiment 100 of the endoscopic system of the invention, that included a unit 110 to which a tube 114 (containing an imaging system 116 therein) was operably connected. The imaging system 116 was structured to define a camera with a lens having a clear outwardlydirected fieldofview (FOV) at the distal end 118. In operation, the system 100 (that has been inserted into a bodily cavity such as an airway, for example) was moved to sequentially position the distal end 118 at a plurality of points within the cavity to take an image of the cavity from each of the plurality of points. Juxtaposed with the unit 110 there was an electromagnetic (EM) tracker coil 120. Operational aggregately with the tracker coil 120, an external electromagnetic field generator 124 was used to augment the system 100. The combination of the coil 120 and the generator 124 facilitated the recordation of the motion of the endoscope, in operation, and synchronization of such motion with the video capture. The data cable 122 connected the system 100 with an external dataprocessing system (not shown). FIG. 1B illustrates the coordination of the frames of reference used during the image data acquisition. Here, the reference frame B corresponds to the EM tracker coil 120 and the reference frame F corresponds to the imaging system 116.

[0038]
In practice, prior to image acquisition with the endoscope 100 the distal end 118 of which has been inserted into a hollow of the bodily tissue, a short calibration procedure is performed on the endoscope and the tracker 120, 124. In one implementation, the tracker included an Aurora electromagnetic measurement system. In the course of videoassisted endoscopy, video/set of image frames from the endoscope 100 is recorded using an output from the video tower. The recorded image/video data are decomposed into timestamped images. The timestamp and the recorded EM data from the tracker are used to determine the relative position of the endoscope at the moment when a given image has been acquired.

[0039]
The electromagnetic measurement system and video equipment (not shown) were connected to an isolating power supply to protect equipment from power surges ad suppress electrical noise. The sensor coil 120 was additionally sterilized before the use of the endoscope.

[0040]
To achieve the abovementioned reconstruction of the cavity surface on a true metric scale, an embodiment of the method of the invention incorporates steps for recovering the motion of the camera and the “uptoscale” 3D positions of salient features of the imaged scene.

[0041]
In reference to FIG. 2A, to effectuate the process of the invention, the following steps are taken:

[0042]
Based on the sequence of acquired images of the anatomic ROI, salient features are identified and the motion of the endoscopic camera with respect to the ROI is estimated in reference to salient regions of the imaged anatomy, at steps 210 and 220;

[0043]
A sparse geometry camera bundle is generated, which consists of the estimated camera positions at a common motion scale, a sparse set of estimated 3D points and the views in which they are visible, at step 230;

[0044]
An external tracking device (such as, for example, an accelerometer, a gyroscope, an optical tracker, or an electromagnetic tracker) is used to determine the correct (metric) scale of the endoscope motion: data from the external tracking advice are used to apply a true metric scale to the estimated geometry using the image and tracker timestamps, at steps 240 and 250;

[0045]
The measured motion of the endoscope is used to determine the structure of the biological tissue with respect to the position of the endoscope, at step 260;

[0046]
The sparse geometry is used to estimate the interior surface of the ROI. The commutation of a mesh is effectuated with the use of the reprojection error of the feature points as well as an (optional) establishing photoconsistency between the 3D surface and its image projections. Refinement of the mesh is performed by solving a constrained optimization problem with GaussNewton iterations, at step 270; and

[0047]
Graded diameter of the ROI is further computed, at step 280.

[0048]
The block scheme of FIG. 2B provides additional illustration to the process flow of the embodiment of the invention. The calibration data comes from preprocessing steps using standard mathematical methods and software. The result is a set of intrinsic camera properties (several—for example, 4 to 6distortion coefficients and 5 values of the camera matrix). The bidirectional arrows in FIG. 2B describe the intermediate results that are used in the reconstruction process. The embodiment of the method of the invention and results of empirical implementation of the embodiment are discussed below.

[0049]
Notation. In the following, a 3D point Pi having coordinates {X_{i}, Y_{i}, Z_{i}}^{T }that has been imaged is denoted by its projected pixel coordinates, in the image j, as {circumflex over (p)}_{i,j}={u_{i,j}, v_{i,j}}^{T }and in normalized homogeneous image coordinates as p_{i,j}={x_{i,j},y_{i,j}}^{T}. The scaleinvariant feature transform (SIFT) features are extracted from the endoscopic images and the fast approximate nearest neighbors method is used to compute matches across image pairs. In one embodiment, the SIFT implementation of Vedaldi (see “Vlfeat—an open and portable library of computer vision algorithms”, in Proc. ACM Int. Conf on Multimedia, 2010) was used. This feature detection step provides a set of matched SIFT features in pixel coordinates {{circumflex over (p)}_{i,1},{circumflex over (p)}_{i,2}i ∈ 1 . . . n}. In normalized homogeneous coordinates p_{i,1}={x_{i,1},y_{i,1}}^{T},p_{i,2}={x_{i,2},y_{i,2}}^{T}, a fivepoint motion estimation algorithm and a random sample consensus algorithm (see Fischler et al., “Random sample consensus: a paradigm for model fitting with aliation to image analysis and automated cartography”, in Commun. ACM, vol. 24, pp. 381395, June 1981; and Nister, “An efficient solution to the fivepoint relative pose problem”, in CVPR, 2003) was used to arrive at

[0000]
argmin_{R,T}Σ_{i=0} ^{n} p _{i,1} ^{T}(R·sk(T))p _{i,2 } (1)

[0050]
The result is decomposed into an orthogonal rotations matrix R ∈ S03 and a translation vector T ∈
^{3}, which describe the repositioning of the imaging camera. Accordingly, the generalized motion associated with the camera between the acquisition of a first and a following second image is expressed as

[0000]
$\begin{array}{cc}F=\left(\begin{array}{cc}R& T\\ 0& 1\end{array}\right)& \left(2\right)\end{array}$

[0051]
According to a method of the invention, the motion of the camera among different locations is estimated between the sequentially acquired images in each pair of the image frames. The problem of calibration (or resectioning) of a camera includes the determination of a position and aiming (i.e., translation and rotation) of the camera when a particular image in the series of images is taken. One of the positions of the camera can be used as a reference, with zero translation and rotation. When the scale of translation of the camera between different viewpoints is not known (even if the direction of the translation and the angle or the corresponding rotation are known), such scale can be introduced into the calculation as a constant unknown multiplier λ. The common scale is computed using the estimated motion between a pair of frames as a reference scale. In particular, an arbitrary scale can be chosen for the translation of the camera from the first viewpoint to the second viewpoint and the location of the point of the imaged scene is reconstructed based on triangulation. When the same operation is carried out for each of the points of the imaged scene, the estimated scene will be principally correct but off by the scaling constant λ. This is often referred to as a reconstruction that is only accurate “up to scale” and results in “bundle” output data representing the camera motion and sparse geometry estimates. The images and sparse geometry form the input to the surface refinement algorithm.

[0052]
Scale Recovery. The first position of the camera is selected as a reference frame C0, and the previously computed normalized scale for the translation motion of the camera is used. In order to determine the true metric scale of the reconstruction, the following handeye calibration problem is solved:

[0000]
AX=XB (3)

[0000]
θ_{A} t _{x} +t _{A}=θ_{x} t _{B} +t _{x } (4)

[0000]
θ_{A} t _{x} +γt _{F}=θ_{x} t _{B} +t _{x } (5),

[0053]
where

[0000]
$B=\left(\begin{array}{cc}{\theta}_{B}& {t}_{B}\\ 0& 1\end{array}\right)$

[0000]
is the homogeneous transformation operator describing the position of the tracker coil 120 with respect to the field generator 124;

[0000]
$X=\left(\begin{array}{cc}{\theta}_{X}& {t}_{X}\\ 0& 1\end{array}\right)$

[0000]
is the homogeneous transformation from the tracker coil 120 to the camera 116;

[0000]
$A=\left(\begin{array}{cc}{\theta}_{A}& {t}_{A}\\ 0& 1\end{array}\right)$

[0000]
describe a correctly scaled position of the camera 116 with respect to the reference frame A_{0 }and

[0000]
$F=\left(\begin{array}{cc}{\theta}_{A}& \gamma \ue89e\phantom{\rule{0.3em}{0.3ex}}\ue89e{t}_{F}\\ 0& 1\end{array}\right)$

[0000]
is the uptoscale position of the camera 116 with respect to the reference frame A_{0}. In addition to the function t_{F}, describing the translational motion of the imaging camera, the operator F includes the additional term γ (which is unknown due to “uptoscale” motion estimation described below). A general solution to AX=XB is provided by Park et al. (in IEEE Trans. Robotics and Automation, v. 10, no. 5, pp. 717721, October 1994) for the case where B and A are known exactly. X is computed as a preprocessing step along with the intrinsic camera and distortion parameters, using a calibration pattern with known size. For each relative camera motion estimate, F_{ij }the corresponding tracker positions F_{i }and F_{j }are computed by selecting the two closest time stamps and using spherical quaternion interpolation to estimate the positions B_{i}, B_{j }of the EM tracker.

[0054]
The relative positions of the camera centers are computed using the equation

[0000]
A=XBX ^{−1 } (6)

[0055]
and the relative translation scale is computed by solving

[0000]
γtF_{ij}=tA_{ij } (7),

[0056]
where Aij is the homogeneous transform from A_{i }to A_{j }. This approach facilitates the consistent scaling of all camera motion estimates. The estimated relative translation scale γ is expected to be more or less noisy depending on the precision of the tracking system, but the maximum likelihood estimate of geometry produced by the sparse bundle adjustment is expected to be preserved. By applying the estimated scale to the reconstructed points, the reconstructed surface becomes accurately scaled to the units supplied by the tracking system.

[0057]
Spline Surface Representation. Points of the imaged scene viewed from the different positions by the camera are now mapped onto a 3D surface representing the imaged scene. Given a set of 3D points (C_{1 }. . . C_{k}) in a reference frame F_{0 }of the camera occupying a predetermined position, the surface viewed from F_{0 }can be represented using a thin plate spline function using projected homogeneous coordinates (c_{1 }. . . c_{k}) as control points. For this purpose, the thinplate spline parameterization was adopted (as described, for example, by Richta et al., “Threedimensional motion tracking for beating heart surgery using thinplate spline deformable model”, in Int. J. Rob. Res., v. 29, pp. 218230, February 2010; or by Bookstein, “Principal warps: Thinplate splines and the decomposition of deformation”, in IEEE TPAMI., v. 11, pp. 567585, June 1989). This parameterization approach maps points p=(x, y, 1)^{T }in homogeneous image coordinates to distance in the Z direction along their associated ray, T P S(p; W)=Z. Given control points (c_{1 }. . . c_{k}), the mapping function T P S(·; W) is defined by a weighing vector W, where

[0000]
W=(w_{1} , . . . w _{n} ,r _{1} ,r _{2} ,r _{3}) (8).

[0058]
The mapping function TPS is used to estimate on which contour the mapped points pi lie. The mappedontothe surface value of a given point p is given by

[0000]
TPS(p,W)=r _{1} +r _{2} x+r _{3} y+Σ _{i=0} ^{k} w _{i} U(∥c _{i} −p∥) (9),

[0059]
Here, the U(x)=x^{2 }log(x) is the radial basis function. Accordingly, a point P on a 3D surface representing the surface of the imaged scene and associated with the point p is characterized as

[0000]
P=TPS(p;W)·p=Z·(x,y,1)^{T}=(X,Y,Z)^{T } (10)

[0060]
The Optimization of the Determined Surface may be required to incrementally correct both the camera position estimates {F_{0 }. . . F_{n}} and the mapping from the image plane to R^{3 }(the mapping defined as a weighted sum of the control point positions in 3D). Alternatively or in addition, the optimization may be required to correct for photometric errors caused by different photometric properties of different portions of the imaged surface.

[0061]
To effectuate such correction, the following definitions are used:

[0062]
The equation

[0000]
$\begin{array}{cc}{K}^{*}\ue8a0\left(P\right)={K\ue8a0\left(\frac{X}{Z},\frac{Y}{Z},1\right)}^{T}=\hat{p}& \left(11\right)\end{array}$

[0063]
is defined to transform a 3D point, P=(X,Y,Z)^{T }to its projected pixel location in the reference frame; F_{n }is defined as the homogeneous transformation matrix (which describes the n^{th }position of the camera); I_{0 }denote the image at the reference frame, while I_{n }is used to denote the intensity image of camera at position n; for a 3D point P, the brightness of its projection in the nth image is given by I_{n}(K*(F_{n}P)). Assuming, in addition, that the brightness distribution of the imaged surface of the ROI is constant (invariable with respect to the position across the surface) the image of an arbitrary point P_{i }should have the same intensity when projected into any image. By selecting a uniform set of mesh surface points P_{i}, a photometric error term D_{i }is defined as a modification of the Helmholtz stereopsis equation (discussed, for example, by Zickler et al., in Int'l J. of Computer Vision, pp. 869884, 2002) as

[0000]
D _{i} =I _{0}(K*(P _{i}))−I _{n}(K*(F _{n} P _{i})) (12)

[0000]
D _{i} =I _{0}(K*(p _{i}))−I _{n}(K*(F _{n} TPS(p _{i} ;W)·p _{i}) (13)

[0064]
As the parameter vector W that defines TPS is derived from (c_{1 }. . . c_{k}) (see above), the computational error associated with imaging of the surface of the ROI is expressed with the following error Jacobian (computed through the Taylor series expansion of the error terms D_{i }about the current estimate of (c_{1 }. . . c_{k}) and R, T):

[0000]
$\begin{array}{cc}{J}_{D}=\left(\begin{array}{cccc}\frac{\partial {D}_{1}}{\partial {c}_{1}}& \dots & \frac{\partial {D}_{1}}{\partial {c}_{k}}& \frac{\partial {D}_{1}}{\partial R,t}\\ \vdots & \dots & \vdots & \vdots \\ \frac{\partial {D}_{n}}{\partial {c}_{1}}& \dots & \frac{\partial {D}_{n}}{\partial {c}_{k}}& \frac{\partial {D}_{n}}{\partial R,t}\end{array}\right)& \left(14\right)\end{array}$

[0065]
Under the assumption that that the detected image feature matches c_{j,0 }and c_{j,n}, having come from fiducial markers (such as, for example, anatomical landmarks), or robust matches, are accurate, the mapping function TPS and homogeneous transform F_{n }should map c_{j,0 }to the detected position of c_{j,n}. One can, therefore, define the additional error terms as

[0000]
E _{i} =∥K*(F _{n}(TPS(c _{j,0} ;W)·c _{j,0}))−c _{j,n}∥=0 (15)

[0066]
These error terms can be written as equality constraints in the minimization of J_{D}. The gradient of these error terms is computed with respect to the translation and rotation parameters T={X,Y,Z} and ω={θ_{x}, θ_{y}, θ_{z}}. The resulting constraint Jacobian J_{E }and the error Jacobian J_{D }can be combined into a KarushKuhnTucker matrix (see Kuhn, Nonlinear programming: a historical view, in SIGMAP Bull., pp. 618, June 1982) yielding the following system of equations

[0000]
$\begin{array}{cc}\left(\begin{array}{cc}{J}_{D}^{T}\ue89e{J}_{D}& {J}_{E}^{T}\\ {J}_{E}& 0\end{array}\right)\ue89e\left(\begin{array}{c}\Delta \ue89e\phantom{\rule{0.3em}{0.3ex}}\ue89ec\\ \Delta \ue8a0\left(w,T\right)\\ \lambda \end{array}\right)=\left(\begin{array}{c}{J}_{D}^{T}\ue89eD\\ E\end{array}\right)& \left(16\right)\end{array}$

[0067]
In the above equation, D represents the vector of error terms and λ are the Lagrange multipliers associated with the constraint error, E. This formulation combines the photometric error terms D_{i }with the traditional bundle adjustment terms defined by E_{j }to compute the optimal correction terms Δc and Δ(w,T).

[0068]
Optionally, additional corrective measures can be employed to correct the impractical assumption (made above at the stage of optimizing the determined surface) that the brightness of the imaged surface is invariant with respect to a position across the surface (which in the context of endoscopy and other invivo procedures is substantially incorrect). While various measures (such as, for example, normalized cross correlation, mutual information and sum of conditional variance are matching metrics) exist to robustly deal with this problem, these methods are not applicable to the case at hand because they do not directly model or inform the three dimensional geometry of the scene imaged with an endoscope.

[0069]
According to an embodiment of the invention, therefore, an optional step of Modeling the Reflectance and Brightness Attenuation may be taken as part of dataprocessing algorithm. The attenuation of intensity {I_{0}(K*(P_{i})) . . . I_{n}(K*(F_{n}P_{i}))} at a given point P_{i }of the surface with respect to the angle and distance between the camera and the point to P_{i }is carried out under the Lambertian assumption of reflectance (stating, in short, that the color intensity emitted by an imaged point P_{i }is proportional to the dotproduct of the unit surface normal vector and light source direction at P_{i}:

[0000]
$\begin{array}{cc}{I}_{n}\ue8a0\left({K}^{*}\ue8a0\left({F}_{n}\ue89e{P}_{i}\right)\right)=\alpha \ue89e\frac{\mathrm{cos}\ue8a0\left({\theta}_{i.n}\right)}{{r}_{i,n}^{2}}& \left(17\right)\end{array}$

[0070]
Here, the term r is simply the Zcoordinate of P_{i}, which represents the distance from the point P_{i }to the camera 116 of he endoscope system of the invention. It should be noted that under the assumption of Lambertian reflectance, the intensity is substantially independent of the direction of the viewer and the constant α, which represents the unknown albedo of the surface. In the case of an endoscope, the only source of light is at the optical center of the camera, and moves along with it. In other words, the light source and viewer are colocated. Using the estimated camera motions and surface representation determined with the method of the invention, one can further estimate the camera positions {F0 . . . Fn} and the surface normal at any point P_{i}. As a result, the terms θ_{i }and Z_{i }can be easily determined.

[0071]
The resulting augmentation of the D_{i }terms to reflect the correct attenuation is represented by the following corrected (geometric brightness) error equation:

[0000]
D _{i,Lambert} =I _{0}(K*(F _{n} P _{i}))−I _{n}(K*(F _{n} P _{i})) (18),

[0072]
which should replace the Eq. (13) to account for the change in color intensity of the imaged surface as a function of lighting direction.

[0073]
Empirical Results. A method of the invention has been applied to both synthesized and actual experiment.

[0074]
In the synthetic experiment, the method was used to render images of a sphere from several viewpoints under the assumption of Lambertian distribution of reflectance of the sphere. According to the algorithm of FIG. 2A, the initial surface was computed with noisefree motions of the camera and sparse geometry; then, the surfacecorrection (step 270) was applied to the data with noise added to the sparse geometry. FIGS. 3A, 3B show the results of this simulated experiment. In FIG. 3B, the xaxis represent the difference between the value corresponding to a surface point and its projected pixel value, while the yaxis represents the expected difference under the lighting law and the estimated surface geometry. If the estimate of the geometry of the imaged surface, obtained according to the method of the invention, is correct, these values match and all points lie on the line x=y. Points 310 show the values at initialization (start of iteration procedure), while points 320 correspond to the values at the termination of the surface optimization. It can be seen that the predicted brightness differential corresponding to the refined geometry better matches the actual difference in pixel value between the two images than the unrefined geometry. FIG. 3A illustrates the measured area of the scene along the Zaxis of the reference camera. Line 330 shows the values of the noisefree surface, line 340 shows the area measurements of the surface with noise, and line 350 represents the results of surface optimization. It can be seen that the refinement carried out with the algorithm of the invention caused the area of the surface to deform to be more similar to the ground truth value.

[0075]
In a real experiment, images were taken with the system 100 of FIG. 1A from inside a larynx. A typical image is shown in FIG. 4, with areas 410 indicating the presence of visuallyperceivable specular reflections. The acquired image data were used according to the dataprocessing method of the invention to reconstruct the image scene. The results of surface reconstruction for two independently measured larynxes, which vary greatly in shape, texture of the surface, and color, is summarized in FIGS. 5A and 5B.

[0076]
It is worth noting that Lambertian reflectance is the most basic reflectance model and does not model the specular reflectance that is present in the endoscopic images. In computer graphics, the relationships among a surface, view and light are used to define geometry shades for a surface or material type. In general, any such model of surface reflectance can be inserted into Eq. (18) by simply deriving a change in the brightness function in response to the normal vector camera position, and light position distance to point, in order to model the surface properties of the larynx. The Phong reflection model used in the real experiment is approximated as follows:

[0000]
α(cos θ_{i} +e ^{ρ}−1) (19)

[0077]
where ρ=(cos θ_{i})^{β} and β accounts for the falloff of the specular reflectance value as cos θ_{i }decreases. To define D_{i,Phong}, the Lambertian lighting model of Eqs. (7, (18) was replaced with this geometric brightness model.

[0078]
Each of FIGS. 5A, 5B presents a set of three images (I, II, and III). Images I illustrate the images recorded with the endoscope 100 of FIG. 1A. Images II present the corresponding QE surfaces with the reprojected image. Images III show the shaded surfaces that are rendered with the Phong reflectance model.

[0079]
Comparison With Reference Data and Sensitivity Analysis. In cases where additional data representing the closed cavity of interest is available that have been independently acquired with a different imaging modality (for example, computed tomography, CT), the resulting QE surfaces can be registered to the surface determined based on the additional data using an initial manual alignment (for example, by using the tracheal bifurcation as a landmark). The registration is then refined iteratively with, for example, the closest point algorithm. FIG. 6A illustrates the generated QE surface 610 corresponding to the trachea that is registered with the surface 620 determined using the CT scan. FIG. 6B provides additional illustration of the registration of the area of the QE surface 630 near the tracheal bifurcation and that obtained with the CT scan, 640. In order to compare the QE and CT surfaces, the corresponding volumes of the trachea were sliced into polygons along the principal axes of the volumes and computed the areas of each of the polygons.

[0080]
During the comparison of the imaging data obtained with the system and method of the invention with the reference data (such as the CT scan data referred to as the “ground truth” data), the criteria that are clinically important are the accuracy of the measured surface area and diameter of the spline volume. Here, the isosurface of the airway from the voicebox to the carina was extracted with the algorithm of the invention, the comparison between the CT extracted volume and the QE computed volume of the airway an initial alignment by unrotating and centering the QE volume was performed. To unrotate, the eigenvectors of the volume's point covariance matrix were computed and the volume was rotated so that the longest eigenvector was aligned with the principal zaxis. Lastly, the mean was subtracted from the point volume. The resulting volume was centered at the origin, with the principal zaxis extending through the trachea. This step is performed on both the CT extracted volume and the QE computed volume, and provides and initial alignment.

[0081]
Using the axes alignment as a starting point, we use the iterative closest point algorithm to refine the registration. Both axis alignment and ICP are rigid body transforms, so the scale of the QE computed volume does not change.

[0082]
The comparison between the QE and CT volumes is performed as follows. We compute a set of level curves for each volume, at intervals along the principal axis (as shown in FIG. 7, where the plane 710, that is transverse to the airway volume 720, defines the principal axis of the volume as a line that is substantially perpendicular to the plane 710). For each of these curves, we compute the area of the contour using Green's theorem. We then compute the diameter of each contour and compare difference between the two volumes. The results are shown in FIGS. 8A through 8D for three sets of level curves, where the solid lines represent the CT scan and the dashed lines show the QE measurements.

[0083]
Determination of the quality of the measurement of an area of a closed cavity with an embodiment of the invention is carried out by propagating the covariance of the measurement results. Given that, according to Eq. (9),

[0000]
TPS(p,W,R,T)=r _{1} +r _{2} x+r _{3} y+Σ _{i=0} ^{k} w _{i} U(∥c _{i} −p∥) (20)

[0084]
using the Eq. (10), and denoting Σz as the measurement covariance of the output values Z, of TPS(·), the measurement covariance matrix is determined empirically using the reprojection error of the sparse geometry bundle (based, for example, on Triggs et al., “Bundle adjustment a modern synthesis”, in Vision Algorithms: Theory and Practice, LNCS; Springer Verlag, 2000, pp. 298375). Let J_{TPS }define the Jacobian of the mapped values with respect to the thin plate spline parameters:

[0000]
$\begin{array}{cc}{J}_{\mathrm{TPS}}=\left(\begin{array}{cccc}\frac{\partial {Z}_{1}}{\partial {c}_{1}}& \cdots & \frac{\partial {Z}_{1}}{\partial {c}_{k}}& \frac{\partial {Z}_{1}}{\partial R,t}\\ \vdots & \cdots & \vdots & \vdots \\ \frac{\partial {Z}_{n}}{\partial {c}_{1}}& \cdots & \frac{\partial {Z}_{n}}{\partial {c}_{k}}& \frac{\partial {Z}_{n}}{\partial R,t}\end{array}\right)& \left(21\right)\end{array}$

[0085]
The parameter covariance Σ_{TPS }is determined using back propagation of the measurement covariance Σz, as

[0000]
Σ_{TPS}=(J _{TPS}*Σ_{Z} ^{−1}*(J _{TPS})^{T})^{−1 } (22)

[0086]
If Ji is chosen to denote the Jacobian of the area measurement A_{i }at distance Z_{i }

[0000]
$\begin{array}{cc}{J}_{i}=(\frac{\partial {A}_{i}}{\partial {X}_{i}}\ue89e\frac{\partial {A}_{i}}{\partial {Y}_{i}}\ue89e\frac{\partial {A}_{i}}{\partial {Z}_{i}}\ue89e\phantom{\rule{0.8em}{0.8ex}}\ue89e\dots \ue89e\phantom{\rule{0.8em}{0.8ex}}\ue89e\frac{\partial {A}_{i}}{\partial {X}_{n}}\ue89e\frac{\partial {A}_{i}}{\partial {Y}_{n}}\ue89e\frac{\partial {A}_{i}}{\partial {Z}_{n}},& \left(23\right)\end{array}$

[0087]
where [(X_{j}, Y_{j}, Z_{j}) are the points in the level set Z_{i}, then the covariance matrix of Σi can be approximated as

[0000]
Σ_{i}=(J _{i} ^{T}*Σ_{TPS} *J _{i}) (24)

[0088]
The total variance of the measurement is the trace of Σi.

[0089]
FIGS. 8A through 8D show the variation of a diameter of the CT volume measured along the principal axis of the airway cavity, as well as the aligned QE volumes. The value of the diameter was computed without knowledge of the CT surface, and would serve to provide the clinician an estimate of the measurement quality. Note that this value provides a conservative estimate of the deviation between the QE volume diameter and the CT volume diameter. These data sets each represent a different traversal of the airway from 2 different patients (sets 1 and 2 are from the first patient, sets 3 and 4 are from the second patient). The results illustrated in Table 2 show that the mean deviation is close to zero compared to the diameter. These data sets are based on the alignment of the tracheal bifurcation, which provides an unambiguous registration landmark.

[0090]
As discussed above, the proposed method of measurement of a true, metric scale based geometric parameter of a closed cavity—for example, a diameter of the corresponding hollow—is an alternative for measurement such parameter with the mechanical means. The proposed experimental methodology for a cavity surface recovery and refinement causes both the surface backprojection and constraint errors to converge and produces a better approximation of the true surface. The advantage of the thinplate spine function, chosen as a base model for 3D surface fitting, is that the generated surface directly coincides with the sparse geometry (namely, 3D points reconstructed from feature matches), for which the subpixel accuracy is achieved at the bundleadjustment step of the embodiment of the invention. This produces better results than, for example, a NURBS surface that provides an approximation without necessarily touching the control points.

[0091]
While the preferred embodiments of the present invention have been illustrated in detail, it should be apparent that modifications and adaptations to those embodiments may occur to one skilled in the art without departing from the scope of the present invention.