2D/3D IMAGE REGISTRATION IN IMAGE-GUIDED RADIOSURGERY
BACKGROUND
[0001] Radiosurgery can be used to treat tumors and other lesions by delivering a prescribed dose of high-energy radiation to a target area while minimizing radiation exposure to the surrounding tissue. In radiosurgery, precisely focused beams of radiation (e.g. very intense x-ray beams) are delivered to a target region, in order to destroy tumors or to perform other types of treatment. One goal is to apply a lethal amount of radiation to one or more tumors, without damaging the surrounding healthy tissue. Radiosurgery therefore calls for an ability to accurately target a tumor, so as to deliver high doses of radiation in such a way as to cause only the tumor to receive the desired dose, while avoiding critical structures such as the spinal cord.
[0002] Conventional radiosurgery may use a rigid and invasive stereotactic frame to immobilize the patient prior to diagnostic CT or MRI scanning. Treatment planning may then be conducted from the diagnostic images. The treatment planning software determines the number, intensity, and direction of the radiosurgical beams that should be cross-fired at the target, in order to ensure that a sufficient dose is administered throughout the tumor so as to destroy it, while minimizing damage to adjacent healthy tissue. Radiation treatment may typically be accomplished on the same day treatment planning takes place. Immobilization of the patient may be necessary in order to maintain the spatial relationship between the target and the radiation source to ensure accurate dose delivery. The frame may be fixed on the patient during the whole treatment process.
[0003] Image-guided radiosurgery can eliminate the use of invasive frame fixation during treatment, by frequently and quasi-continuously correcting the patient position or aligning the radiation beam with the target. To do so, the change in target position at the time of treatment, as compared to the position at the time of diagnostic treatment planning, is detected. This can be accomplished by registering the x-ray image acquired at the treatment time with the diagnostic 3D scan data obtained pre-operatively at the time of treatment planning. Medical image registration can be useful in many areas of medicine, including but not limited to radiosurgery and radiotherapy. The positions of the target can be defined by physicians at the time of treatment planning, using the diagnostic 3D scans.
[0004] Typically, 3D imaging modalities, such as computed tomography (CT), magnetic resonance imaging (MRI), or positron emission tomography (PET), can be used to generate diagnostic 3D images of the anatomical region containing the targeted area, for treatment planning purposes. These tools enable practitioners to identify the anatomical organs of a patient, and to precisely locate any abnormalities such as tumors. For example, CT scans allow an image of the internal structure of a target object to be generated, one cross-sectional slice at a time. The 3D scan data (e.g., CT, MRI, or PET scan data) may be used as a reference, in order to determine the patient position change during treatment. Typically, synthesized 2D images such as digitally reconstructed radiographs (DRRs) may be generated from the 3D scan data, and may be used as 2D reference images. In the field of medical image registration, this problem is categorized as a 2D/3D registration. In the 2D/3D registration process, similarity measures can be useful for comparing the image intensities in the x-ray images and the DRR images, so that the change in patient position (and thus in target region position) that has occurred between the diagnostic scanning and the taking of real-time images can be accurately detected.
[0005] Image-guided radiosurgery typically requires precise and fast positioning of the target at the treatment time. In practice, it is desirable that the accuracy be below 1 mm, and the computation time be on the order of a few seconds. Unfortunately, it can be difficult to meet both requirements simultaneously.
[0006] It is desirable that a method and system be provided in image-guided radiosurgery for tracking the position of the treatment target throughout treatment, that allow for a fast computation time while maintaining sufficient accuracy and stability. In particular, there is a need for a method and system for performing 2D/3D medical image registration using as little computing time as possible, while at the same time meeting the requisite accuracy for radiosurgical applications. In order to optimize the 2D/3D registration process in image-guided radiosurgery, it is necessary to use an accurate, robust, and efficient similarity measure method and system.
SUMMARY
[0007] An accurate and rapid method and system are presented for tracking target position in image-guided radiosurgery. A hierarchical and iterative framework is used to register 2D x-ray images with images that have been reconstructed from 3D scan data. The hierarchical and iterative 2D/3D registration algorithm allows for an accurate and rapid correction of target position, and an accurate and rapid alignment of radiosurgical beams, throughout the treatment procedure. High accuracy can be achieved in both the translational and rotational adjustments of the target position. The total computation time is about an order of magnitude faster than other techniques known in the art. An improved method and system are used to compute a measure of similarity between two digital images. The similarity measure is based on pattern intensity, and provides a robust, accurate, and efficient solution to the 2D/3D medical image registration problem in image-guided radiosurgery.
[0008] A method in image-guided radiosurgery for aligning the position of a treatment target relative to a radiosurgical beam generator during treatment includes performing 2D/3D image registration between one or more near real-time 2D x-ray images of the treatment target, and one or more 2D reconstructed images of the target based on pre-treatment 3D scan data. A pre-treatment 3D scan of the target is performed, treating the target as a rigid body, and describing its position with six degrees of freedom. The 3D scan (for example a CT scan, an MRI scan, or a PET scan) shows the position of the target at treatment planning time. One or more 2D x-ray images of the target are generated during treatment, in near real time. The x-ray images show the position of the target at a current time during treatment. Preferably, two orthogonal x-ray projection images are generated, using imaging beams having known positions, angles, and intensities.
[0009] A set of 2D reconstructed images, preferably DRRs (digitally reconstructed radiographs), is generated offline, based on the 3D scan data. Preferably, the 2D reconstructed images are DRRs that are generated using the same positions and angles of the imaging beams that are used for the x-ray images, i.e. using the known positions, angles, and intensities of the imaging
beams that are used to generate the near real-time x-ray images.
[0010] The DRRs are registered with the x-ray images, to generate the 3D rigid body transformation parameters that represent the change in position of the target between the 3D scan and the x-ray images. The registration is performed for each orthogonal projection individually, and the results are subsequently combined.
[0011] During the 2D/3D registration, in-plane rotations of the DRRs are performed within the image plane of the x-ray images, thereby generating reference DRRs. The x-ray images are processed so that the orientation, image size, and bit depth of the x-ray images match the orientation, image size, and bit depth of the reference DRRs.
[0012] In-plane and out-of-plane transformation parameters are estimated using different search methods, including 3D multi-level matching and 1D searching, then iteratively refined until a desired accuracy is reached. The relative position of the radiosurgical beams and the target is continuously adjusted in near real time, throughout the treatment, in accordance with the 3D transformation parameters obtained via the 2D/3D registration process.
[0013] The 2D/3D registration process involves determining the values of the parameters (x, y, θ) and (r, φ) that are required in order to register the x-ray image of the target with the reference DRRs of the target. (x, y, θ) represent the in-plane translational and rotational parameters within the image plane of the x-ray images, (x, y) indicating the requisite amount of translation within the image plane in the directions of the x- and y-axes, respectively, and θ indicating the requisite amount of rotation within the image plane. (r, φ) represent the out-of-plane rotational parameters, and indicate the requisite amount of out-of-plane rotations about mutually orthogonal axes that are defined in a 3D coordinate system, and that are orthogonal to the image plane.
[0014] In order to determine these parameters, a 3D multi-level matching is first performed, in order to determine an initial estimate for the in-plane
transformation parameters (x, y, θ). Based on these parameters (x, y, θ) obtained by 3D multi-level matching, an initial 1-D search is performed for each of the pair of out-of-plane rotation parameters (r, φ). The in-plane translation parameters (x, y) are then refined, using 2D sub-pixel matching, to increase the accuracy of these parameters.
[0015] The in-plane rotation parameter (θ) is then refined, based on the out-of-plane rotation parameters (r, φ) obtained from the initial 1D search, and on the updated in-plane translation parameters (x, y), in order to increase the accuracy of the in-plane rotation parameter θ. 1D interpolation is used in this step. Next, each of the out-of-plane rotation parameters (r, φ) is refined separately, based on the refined in-plane translation and rotation parameters. The refining steps are iteratively repeated, until a predetermined accuracy is reached. Finally, the out-of-plane rotation parameters (r, φ) are refined, using 1D interpolation, in order to achieve the desired resolution.
[0016] The 2D/3D image registration algorithm also features a method for determining a measure of similarity between a first image and a second image of an object. In an exemplary embodiment, the first image is a real-time 2D x-ray image of the object, and the second image is an artificially synthesized DRR, constructed from pre-treatment 3D scan data of the object. The similarity measure method includes forming a difference image by subtracting the corresponding pixel values of the second (DRR) image from each pixel value of the first image, i.e. the "live" or near real-time x-ray image. The method further includes applying upon each pixel of the difference image a pattern intensity function, where the pattern intensity function is an asymptotic function of the gradients of the difference image.
[0017] An image-guided radiosurgical system includes means for generating pre-treatment 3D scan data of the target, for example a CT scanner or an MRI scanner. The system includes a radiosurgical beam generator for generating at least one radiosurgical beam for treating the target. Imaging means are provided for generating 2D x-ray images of the target in near real time. The imaging means include one or more (preferably two) imaging x-ray beam sources for generating at least one imaging beam having a known intensity, position, and angle. The imaging means direct the imaging beams toward and through the target, so that the beams can be detected by corresponding image receivers (e.g. cameras) after the beams have traversed the target. The detection signals are processed by an image processor, which generates the x-ray images. Preferably, a pair of x-ray sources and a corresponding pair of x-ray cameras are provided, so that two orthogonal x-ray projection images are generated.
[0018] Means are provided for generating a set of 2D DRRs for each x-ray projection image. The DRRs are generated using the known intensity, location, and angle of the imaging beams. Image registration means are provided for registering the DRRs with the x-ray images. The image registration means include a processor for computing a set of 3D transformation parameters that represent the change in position of the target between the 3D scan and the near real-time x-ray images. The processor contains software for estimating in-plane and out-of-plane transformation parameters for each projection, using a number of search methods including 3D multi-level matching, 2D sub-pixel matching, and 1D searching, and using two different similarity methods (sum of absolute differences and pattern intensity) at different phases of the registration process.
[0019] The image registration means includes another processor for determining the measure of similarity of a 2D x-ray image of an object and a 2D DRR of the object generated from previously obtained 3D scan data. The processor for determining the measure of similarity between the 2D x-ray image and the 2D DRR contains software for subtracting each pixel value of the second image from a corresponding pixel value of the first image to form a difference image, and then applying a gradient operator upon each pixel of the difference image to form a pattern intensity function. The pattern intensity function is an asymptotic function of the gradients of the difference image, and permits the pixel values within a neighborhood R defined about each pixel in the difference image to be considered. The gradients are defined over at least four directions.
[0020] The image-guided radiosurgical system also includes positioning means, responsive to a controller, for adjusting in near real time the relative position of the radiosurgical beams and the target, in accordance with the 3D transformation parameters obtained by the 2D/3D registration process.
BRIEF DESCRIPTION OF THE DRAWINGS
[0021] FIG. 1 illustrates the geometric relations between a 3D treatment target and two orthogonal x-ray projections, including the in-plane translational and rotational parameters (x, y, θ), and the out-of-plane rotational parameters (r, φ), for registering a 2D radiographic image with previously generated 3D scan data.
[0022] FIG. 2 is a schematic diagram of a methodology for tracking a treatment target during image-guided radiosurgery, in accordance with one embodiment.
[0023] FIG. 3 illustrates a flowchart of a registration algorithm used in a 2D/3D registration method, in accordance with one embodiment.
[0024] FIG. 4 illustrates the generation of 2D DRRs from 3D CT scan data of a treatment target within an anatomical region of a patient.
[0025] FIG. 5 illustrates a multi-resolution image representation for a multi-level matching process used to estimate the in-plane transformation parameters.
[0026] FIG. 6 schematically illustrates a neighborhood for calculating pattern intensity, in one embodiment.
[0027] FIG. 7 schematically illustrates an image-guided radiosurgery system, constructed in accordance with one embodiment.
DETAILED DESCRIPTION
[0028] An accurate and rapid method and system in image-guided radiosurgery are presented for tracking the position of a treatment target in six degrees of freedom. The tracking method and system allow for patient position correction and radiation beam alignment during radiosurgery/radiotherapy of a treatment target, for example a tumor within the brain or skull. A fully automatic tracking process is made possible, with no need for user interaction. The tracking method and system include an improved 2D/3D image registration algorithm. The 2D/3D registration algorithm can also be used in applications other than radiosurgery and radiotherapy, i.e. in any application in which there is a need to track a rigid object by registering 2D radiographic images onto 3D scan data. A similarity measure, based on pattern intensity, is used during the 2D/3D medical image registration. The similarity measure allows selected phases of the 2D/3D registration process in image-guided radiosurgery to be carried out in a robust, efficient, and powerful manner.
[0029] In one embodiment, the radiosurgery target is treated as a rigid body. As is well known, a rigid body is defined as an object whose internal geometric relationships remain static or unchanged over time. Because no external forces are imposed on a radiosurgical target during radiation treatment, it is reasonable to treat the target as a rigid body, and to use a 3D rigid transformation for the registration process. The 3D rigid transformation is described using six degrees of freedom: three translations along three mutually orthogonal axes in a 3D scan coordinate system (conventionally labeled the x-, y-, and z-axes), and three rotations (roll, pitch, yaw) about these three axes. The six degrees of freedom are thus represented by six 3D transformation parameters: (x, y, z, r, p, w), where r represents the rotation about the x-axis, p represents the rotation about the y-axis, and w represents the rotation about the z-axis.
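For illustration only, the sketch below (Python/numpy) composes six such parameters into a single 4 × 4 homogeneous transform. The rotation order (roll about x, then pitch about y, then yaw about z) and the sign conventions are assumptions, since the description does not fix them.

```python
# Illustrative sketch (not from the original disclosure): compose the six
# rigid-body parameters (x, y, z, r, p, w) into a 4x4 homogeneous transform.
import numpy as np

def rigid_transform(x, y, z, r, p, w):
    """Translations in mm; rotations in radians about the x-, y-, z-axes."""
    cr, sr = np.cos(r), np.sin(r)
    cp, sp = np.cos(p), np.sin(p)
    cw, sw = np.cos(w), np.sin(w)
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rz = np.array([[cw, -sw, 0], [sw, cw, 0], [0, 0, 1]])
    T = np.eye(4)
    T[:3, :3] = Rz @ Ry @ Rx    # roll applied first, then pitch, then yaw
    T[:3, 3] = (x, y, z)
    return T
```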
[0030] In one embodiment, two orthogonal x-ray projections are used to solve for these six parameters. FIG. 1 illustrates the geometric relations between a three-dimensional treatment target and two orthogonal 2D x-ray projections (labeled A and B in FIG. 1), in an image-guided radiosurgery method and system in
accordance with one embodiment. Cameras (or image receivers) A and B receive their x-ray projections from respective x-ray sources (not shown). The 2D x-ray projection images of the target are formed by transmitting imaging beams (having a known intensity, and having known positions and angles with respect to the target), generated from a respective pair of x-ray sources, through the target and onto cameras A and B.
[0031] As illustrated in FIG. 1, a 3D CT coordinate system, i.e. a coordinate system for the target as viewed in the frame of the CT scan study (taken at the time of treatment planning), can be defined. During treatment, the patient assumes a position within the real-time camera coordinate frames (defined by the two x-ray cameras A and B, respectively) that does not necessarily match the position of the patient as seen within the 3D CT coordinate system. The difference in the position and orientation of the target within the respective radiographs (real-time x-ray versus DRR) corresponds to the difference in the three-dimensional position and orientation of the target between the camera and CT coordinate frames, and is found by solving for the parameters (x, y, z, r, p, w).
[0032] In the embodiment illustrated in FIG. 1, the x-axis in the 3D CT coordinate system is directed inward into the paper, and is not referenced. In FIG. 1, the orthogonal 2D projections A and B are viewed from the directions oAsA and oBsB, respectively. For each of the projections A and B, FIG. 1 illustrates respective 2D planar coordinate systems that are fixed with respect to the image plane that characterizes each projection. The image planes A and B for the projections A and B are thus defined by mutually orthogonal axes within the respective coordinate systems. These axes are shown in FIG. 1 as (xA, yA) for projection A, and (xB, yB) for projection B. In other words, each x-ray image for each projection is characterized by a respective image plane, defined by mutually orthogonal x- and y-axes in a coordinate frame defined by the two x-ray cameras A and B: xA and yA for projection A, and xB and yB for projection B. The direction of the axis xA in the 2D coordinate system for projection A, and the direction of the x-axis in the 3D scan coordinate system, are opposite with respect to each other. The direction of the axis xB in the coordinate system for projection B, and the direction of the axis x in the 3D scan coordinate system, are the same.
[0033] As shown in FIG. 1, each projection is characterized by a respective set of transformation parameters, namely (xA, yA, θA, rA, φA) for projection A, and (xB, yB, θB, rB, φB) for projection B. The two out-of-plane rotations (with respect to the image plane) in projections A and B are denoted by (rA, φA) and (rB, φB) respectively, where r denotes the amount of rotation about the x-axis (in the 3D scan coordinate system), and φ denotes the amount of rotation about the oAsA axis (for projection B) or the oBsB axis (for projection A). The in-plane translations and rotation in projections A and B are denoted (xA, yA, θA) and (xB, yB, θB), respectively. As easily seen, (xA, yA) and (xB, yB) denote the amounts of translation within the image planes for each projection (A and B) in the directions of the x- and y-axes that define each image plane (xA- and yA- for projection A, and xB- and yB- for projection B), while θA and θB denote the amount of rotation within each image plane about an axis (not shown) that is perpendicular to both the xA- (or xB-) and yA- (or yB-) axes.
[0034] As can be seen from FIG. 1, the out-of-plane rotation φA in projection A is the same as the in-plane rotation θB in projection B, and the out-of-plane rotation φB in projection B is the same as the in-plane rotation θA in projection A. The use of the two projections A and B thus over-constrains the problem of solving for the six degrees of freedom. As seen from FIG. 1, xA = xB, rA = rB, θA = φB, and θB = φA.
[0035] For projection A, given a set of reference DRR images which correspond to different combinations of the two out-of-plane rotations (rA, φA), the 2D in-plane transformation (xA, yA, θA) can be estimated by the 2D image comparison. Determining the two out-of-plane rotations (rA, φA) relies on which reference DRR is used for the best similarity match. Similarly, the 2D in-plane transformation (xB, yB, θB) and the out-of-plane rotations (rB, φB) can be estimated for projection B.
[0036] FIG. 2 is a schematic diagram of a methodology for tracking a treatment target during image-guided radiosurgery, in accordance with one embodiment. In overview, two sets of DRRs (or other 2D reconstructed images) are generated as a first step, one set for each of the projections A and B. The process of generating DRRs is carried out after the radiation treatment planning is completed, but before treatment delivery. Before patient treatment, DRR initialization is performed on the initial DRRs, to create a set of in-plane rotated reference DRR images. In the course of radiosurgical treatment, the real-time x-ray projection images are acquired and pre-processed. The processed x-ray images for each projection are registered with the corresponding set of reference DRR images. The results of the registration, (xA, yA, θA, rA, φA) and (xB, yB, θB, rB, φB) for projections A and B, are combined to produce the final six rigid transformation parameters (x, y, z, r, p, w).
[0037] The step of generating DRRs is performed offline, and involves specifying a set of rotation angles for each of the out-of-plane rotations r and φ, for each projection A and B. Each set of DRRs for each projection includes DRRs that correspond to different combinations of these out-of-plane rotation angles. The total number of DRR images is therefore Nr × Nφ, where Nr and Nφ respectively denote the number of rotation angles for the two out-of-plane rotations r and φ. Because the out-of-plane rotations are expected to approach zero after patient alignment, the angles are more densely sampled in the range close to zero, and more sparsely sampled in the range of larger rotation angles.
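A minimal sketch of such a non-uniform angle grid is shown below; the specific range and step sizes are illustrative assumptions, not values taken from the system described.

```python
# Hypothetical sampling of out-of-plane rotation angles: dense steps near zero,
# coarser steps toward the limits of the search range.
import numpy as np

def out_of_plane_angles(max_deg=5.0, fine_limit=1.0, fine_step=0.25, coarse_step=1.0):
    fine = np.arange(-fine_limit, fine_limit + 1e-9, fine_step)
    coarse = np.arange(fine_limit + coarse_step, max_deg + 1e-9, coarse_step)
    return np.sort(np.concatenate([-coarse, fine, coarse]))

r_angles = out_of_plane_angles()
phi_angles = out_of_plane_angles()
n_drrs = len(r_angles) * len(phi_angles)   # Nr * Nphi reference DRRs per projection
```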
[0038] Once the DRRs for out-of-plane rotations are generated, DRR initialization is performed, by computing a set of in-plane rotated reference DRR images (at 0 degrees). The most computationally intensive part of registration is the in-plane rotation computation. To achieve a fast computation, therefore, it is desirable to compute as many in-plane rotations as possible for the reference DRRs, before the registration process. All the reference DRR images are stored in a memory unit in the radiosurgical system, and are used for registration at each x-ray image acquisition, during target alignment and treatment.
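A rough sketch of this offline initialization follows; the rotation routine and the data layout are illustrative assumptions rather than the system's actual implementation.

```python
# Hypothetical sketch: precompute in-plane rotated copies of each reference DRR
# offline, so that no image rotation is computed during live registration.
from scipy.ndimage import rotate

def build_reference_set(drrs_by_rotation, inplane_angles_deg):
    """drrs_by_rotation: dict mapping an out-of-plane pair (r, phi) to a 2D DRR
    array (assumed layout). Returns a dict keyed by (r, phi, theta)."""
    refs = {}
    for (r, phi), drr in drrs_by_rotation.items():
        for theta in inplane_angles_deg:
            # reshape=False keeps the image size fixed for later matching
            refs[(r, phi, theta)] = rotate(drr, theta, reshape=False, order=1)
    return refs
```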
[0039] The 2D x-ray images of the target that represent the two orthogonal projections A and B onto the respective image planes (shown in FIG. 1) are acquired in near real time, by transmitting respective imaging beams through the target and onto the cameras A and B. The imaging beams for each projection (attenuated by virtue of having passed through the target) are detected by the respective cameras. The detection signals are sent to an image processor, which generates the 2D x-ray projection images.
[0040] For projection A, given a set of reference DRR images which correspond to different combinations of the two out-of-plane rotations (rA, φA), the 2D in-plane transformation (xA, yA, θA) can be estimated by the 2D image comparison. Determining the two out-of-plane rotations (rA, φA) relies on which reference DRR is used for an optimal similarity match. Similarly, the 2D in-plane transformation (xB, yB, θB) and the out-of-plane rotations (rB, φB) can be estimated for projection B. Given the results (xA, yA, θA, rA, φA) for projection A and (xB, yB, θB, rB, φB) for projection B, the 3D transformation can be obtained by the following expressions:
x = (xA + xB)/2,   y = (yA − yB)/√2,   z = (yA + yB)/√2,
r = (rA + rB)/2,   p = (θB − θA)/2,   w = (θB + θA)/2.
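The sketch below is a direct transcription of these expressions; the per-projection φ values are omitted from the argument list because, as noted in [0034], they duplicate the θ values of the opposite projection.

```python
# Combine the per-projection registration results into the six 3D rigid
# transformation parameters, per the expressions in [0041].
import math

def combine_projections(xA, yA, thetaA, rA, xB, yB, thetaB, rB):
    x = (xA + xB) / 2.0
    y = (yA - yB) / math.sqrt(2.0)
    z = (yA + yB) / math.sqrt(2.0)
    r = (rA + rB) / 2.0
    p = (thetaB - thetaA) / 2.0
    w = (thetaB + thetaA) / 2.0
    return x, y, z, r, p, w
```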
[0041] In one embodiment, 2D/3D registration is performed using an algorithm that is designed in a unique hierarchical and iterative framework. FIG. 3 illustrates a flowchart of an algorithm used in a 2D/3D registration method, in accordance with one embodiment. The change in the position of the target (or other rigid object) in the radiographic image, as compared to the position of the target in the 3D scan data (as indicated in the reconstructed 2D image) is described using 3D rigid body transformations. Registration is performed by determining the value of the 3D rigid body transformation parameters that represent the difference in the position of the target as shown in the x-ray images, as compared to the position of the target as shown by the 2D images reconstructed from pre-treatment 3D scan data. By using different search methods for different transformation parameters, and by optimizing similarity
measures for different phases of the registration procedure, an increased accuracy is achieved in the target tracking process, with a significantly reduced computing time.
[0042] In a preliminary step (step 110 in FIG. 3), the raw x-ray images are pre-processed, before beginning the 2D/3D registration process. Pre-processing the raw x-ray images is necessary in order to make the x-ray and DRR images have the same orientation, image size, and bit depth.
[0043] It has been observed that the out-of-plane rotations can be detected with good accuracy only after the in-plane parameters have already been well estimated. It has also been found that the out-of-plane rotations are able to safely converge to the correct values when starting from the nominal position of the DRRs. Accordingly, a separate computation is carried out for the out-of-plane versus the in-plane transformation parameters during the registration process: the two out-of-plane rotations (r, φ) are estimated from the exact reference DRR images, while the in-plane transformation parameters (x, y, θ) are computed directly from the 2D images. The in-plane parameters (x, y, θ) are first computed using the nominal reference DRRs. An initial estimate for the out-of-plane rotations (r, φ) is then carried out, based on the previously obtained values of the in-plane transformation parameters (x, y, θ). In order to maximize efficiency and accuracy, different search strategies are used for estimating the out-of-plane transformations and the in-plane transformations, respectively. Also, multiple similarity measure criteria are used that have been optimized at the different phases during the registration.
[0044] In the embodiment illustrated in FIG. 3, the registration process is described in terms of six distinct phases (illustrated in FIG. 3 as steps 120, 130, 140, 150, 160, and 170). In phase 1 (step 120 in FIG. 3), the in-plane transformation parameters (x, y, θ) are initially estimated using a set of in-plane rotated DRR images, which are generated offline from the nominal reference DRR (at 0 degrees). The most intensive computation in the registration process is the computation of the in-plane rotation. To achieve a rapid computation, it is desirable to compute as many in-plane rotations as possible for the reference DRRs, before starting the registration process. The process of generating in-plane rotated DRRs is thus carried out offline, after the reference DRRs for out-of-plane rotations are generated. All the reference DRR images are stored in memory, and used for registering each real-time x-ray image that is acquired during patient alignment and treatment.
[0045] In step 120, the three parameters are rapidly searched using a 3D multi-level matching method (described in connection with FIG. 5 below). A sum of absolute differences method ("SAD") is used as the similarity measure. In this step, no floating-point computation is needed. Pixel accuracy for the translations (x, y) and half-degree accuracy for the in-plane rotation (θ) are achieved.
[0046] In the next step, i.e. step 130 (phase 2 of the registration process), the two out-of-plane rotations (r, φ) are separately searched in one dimension, based on the values of the in-plane parameters (x, y, θ) determined in previous step 120. A more complex similarity measure, based on pattern intensity and described in later paragraphs below, is used to detect the reference DRR image that corresponds to a combination of the two out-of-plane rotations (r, φ). The search space for the possible rotation angles is the full search range of out-of-plane rotation angles. For an initial estimate, the full search range is sampled at every one-degree interval. In step 140 (phase 3), the in-plane translation parameters (x, y) are refined using 2D sub-pixel matching, a full-range search method. Based on the updated transformation parameters (x, y, θ, r, φ) obtained from the previous step in the registration, a set of DRR images (3 × 3 or 5 × 5) is generated by translating the reference DRR, one sub-pixel at a time. The in-plane translations (x, y) are refined to sub-pixel accuracy by finding the best match between the x-ray image and the DRR images.
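The sub-pixel refinement of phase 3 might be sketched as follows; the interpolation routine, the similarity callable, and the 0.5-pixel step are illustrative assumptions.

```python
# Hypothetical sketch of 2D sub-pixel matching: candidate DRRs are produced by
# shifting the selected reference DRR on a 3x3 grid of sub-pixel offsets, and
# the offset whose shifted DRR scores best against the x-ray image wins.
import numpy as np
from scipy.ndimage import shift

def refine_subpixel(xray, ref_drr, x, y, score, step=0.5):
    """score(xray, drr) -> similarity value, higher = better (e.g. the pattern
    intensity measure described later)."""
    best = (-np.inf, x, y)
    for dx in (-step, 0.0, step):
        for dy in (-step, 0.0, step):
            cand = shift(ref_drr, (y + dy, x + dx), order=1)  # bilinear shift
            s = score(xray, cand)
            if s > best[0]:
                best = (s, x + dx, y + dy)
    return best[1], best[2]
```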
[0047] In step 150 (phase 4), the in-plane rotation parameter θ is refined using 1D interpolation, based on the updated values for the in-plane translation parameters (x, y) from step 140, and the updated values of the out-of-plane rotation parameters (r, φ) from step 130. In step 160 (phase 5), the out-of-plane rotations are separately refined to a better accuracy using a 1D search, based on the updated values for the in-plane transformation parameters (x, y, θ) from steps 140 and 150. In steps 140, 150, and 160 (phases 3, 4, and 5), a similarity measure method based on pattern intensity, described in more detail in later paragraphs, is used to ensure higher accuracy.
[0048] Steps 140, 150, and 160 are iteratively repeated until a sufficient accuracy is obtained. Once the desired accuracy is reached, the final out-of-plane rotations are refined by 1D interpolation, in the final step 170 (the sixth and last phase) of the registration process.
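The overall six-phase flow can be outlined structurally as below; each phase's search routine is passed in as a callable, since the text specifies the phases but not their internals, and the convergence test on the parameter changes is an assumption.

```python
# Schematic outline of the six-phase registration flow (steps 120-170, FIG. 3).
# All search routines are hypothetical callables supplied by the caller.
def register_projection(xray, ref_drrs,
                        multilevel_match, search_out_of_plane, subpixel_match,
                        refine_theta, refine_out_of_plane, interp_out_of_plane,
                        tol=0.1, max_iter=10):
    x, y, theta = multilevel_match(xray, ref_drrs)                    # phase 1 (SAD)
    r, phi = search_out_of_plane(xray, ref_drrs, x, y, theta)         # phase 2
    for _ in range(max_iter):
        xn, yn = subpixel_match(xray, ref_drrs, x, y, theta, r, phi)  # phase 3
        tn = refine_theta(xray, ref_drrs, xn, yn, r, phi)             # phase 4
        rn, pn = refine_out_of_plane(xray, ref_drrs, xn, yn, tn)      # phase 5
        delta = max(abs(xn - x), abs(yn - y), abs(tn - theta),
                    abs(rn - r), abs(pn - phi))
        x, y, theta, r, phi = xn, yn, tn, rn, pn
        if delta < tol:                                               # converged
            break
    r, phi = interp_out_of_plane(xray, ref_drrs, x, y, theta, r, phi) # phase 6
    return x, y, theta, r, phi
```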
[0049] FIG. 4 illustrates the generation of a 2D DRR from 3D CT scan data of a treatment target within an anatomical region of a patient. In FIG. 4, the volumetric 3D CT image of the target is referred to with the aid of reference numeral 260. The DRRs 265A and 265B, shown in FIG. 4, are artificial, synthesized 2D images that represent the radiographic image of the target that would be obtained, if imaging beams were used having the same intensity, position and angle as the beams used to generate the real-time x-ray projection images, and if the target were positioned in accordance with the 3D CT scan data. In other words, the DRRs are calculated from prior 3D CT data, in an exact emulation of the real-time camera perspectives. The reference numerals 250A and 250B illustrate the hypothetical positions and angles from which the imaging beams would be directed through a target positioned in accordance with the CT volumetric image 260 of the target.
[0050] Typically, DRRs are generated by casting hypothetical beams or rays through the CT volumetric image of the target. Each ray goes through a number of voxels of the 3D CT image 260. By integrating the CT numbers for these voxels along each ray, and projecting onto an imaging plane (shown as 270A and 270B, respectively, in FIG. 4), the resultant image emulates the radiograph that would be obtained by passing rays from the hypothetical camera locations and angles (shown schematically as 250A and 250B, respectively) through a target positioned in accordance with the volumetric 3D image 260. Ray tracing algorithms, known in the art, are generally used to generate the DRRs.
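A deliberately simplified, parallel-beam sketch of this ray-casting idea is given below; the real system emulates the actual divergent camera geometry, and the attenuation mapping here is an illustrative simplification.

```python
# Minimal parallel-beam sketch of DRR generation: integrate CT numbers along
# rays (here, along one volume axis) and map the line integrals to a
# radiograph-like image.
import numpy as np

def drr_parallel(ct_volume, axis=0):
    line_integrals = ct_volume.astype(np.float64).sum(axis=axis)
    scale = max(line_integrals.max(), 1e-9)
    transmission = np.exp(-line_integrals / scale)   # Beer-Lambert-style falloff
    img = 1.0 - transmission                         # brighter where denser
    return (255.0 * img / max(img.max(), 1e-9)).astype(np.uint8)
```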
[0051] FIG. 5 illustrates a multi-resolution image representation for the multi-level matching process, used in the first phase (step 120 in FIG. 3) to initially estimate the in-plane transformation parameters. The full-size image is at the bottom (Level 1). The upper images (Level 2, Level 3, and Level 4) have lower spatial resolution. The lower resolution images are obtained by low-pass filtering and sub-sampling of the full-size image.
[0052] As a fast search method, multi-level matching is used for an initial estimate of the in-plane transformation parameters. The basic idea of multi-level matching is to match the images at each level successively, starting with the lowest image resolution level (Level 4). The results at the lower resolution levels serve to provide rough estimates for the in-plane transformation parameters (x, y, θ). The output at a lower level is then passed to the subsequent level, characterized by a higher resolution. The parameters (x, y, θ) are refined using the higher resolution images. In the final results obtained through multi-level matching, the accuracy of the translations depends on the spatial resolution of the image having the highest resolution (Level 1). The accuracy of the rotations depends on the sampling intervals of the in-plane rotations during the DRR initialization process described above.
[0053] There may be some risks inherent in multi-level matching. The estimates at lower levels may fall into local minima far away from the global minimum. In this case, further matching at subsequent levels (at higher resolutions) may not converge to the global minimum. To overcome this risk, multiple candidate estimates are used. Many candidates for an optimal matching at a lower level are passed on to the higher resolution level. The higher the number of candidates used, the more reliable the estimates are. The best candidates are ranked by their SAD values.
[0054] In FIG. 5, denoting the full image size in Level 1 by W × H, the image sizes are W/2 × H/2, W/4 × H/4, and W/8 × H/8 in Level 2, Level 3, and Level 4, respectively. For translations, the search range in the lowest resolution level is the full search range, calculated from the difference between the DRR and x-ray image sizes. Because of the small image size (W/8 × H/8) at the lowest level, the full range search can be completed in a very short time. A small search range of (−2, +2) pixels is then used for each of the remaining resolution levels. Because of the small search range, the search can be completed quickly, even at the larger image sizes. For the rotations, the search range in the lowest resolution level is the full search range, at a denser sampling rate. In the higher resolution levels, partial search ranges are used, at a less dense sampling rate.
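The translation portion of the multi-level search might be sketched as follows; the box-filter downsampling, wrap-around shifting, and single-candidate propagation are simplifications of the scheme described above (which also searches rotation and carries multiple candidates between levels).

```python
# Coarse-to-fine translation search over a 4-level image pyramid, using the
# sum of absolute differences (SAD). A simplified sketch of the scheme above.
import numpy as np

def downsample(img):
    """2x2 box average plus sub-sampling (a stand-in for low-pass filtering)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return 0.25 * (img[0::2, 0::2] + img[1::2, 0::2] +
                   img[0::2, 1::2] + img[1::2, 1::2])

def shifted(img, dx, dy):
    # np.roll wraps at the borders, which is acceptable for a sketch
    return np.roll(np.roll(img, dy, axis=0), dx, axis=1)

def multilevel_translation(xray, drr, levels=4, coarse_range=8):
    pyramid = [(np.asarray(xray, float), np.asarray(drr, float))]
    for _ in range(levels - 1):
        pyramid.append((downsample(pyramid[-1][0]), downsample(pyramid[-1][1])))
    dx = dy = 0
    rng = coarse_range                   # full search at the coarsest level
    for live, ref in reversed(pyramid):  # coarsest -> finest
        cands = [(np.abs(live - shifted(ref, 2 * dx + ex, 2 * dy + ey)).sum(),
                  ex, ey)
                 for ex in range(-rng, rng + 1) for ey in range(-rng, rng + 1)]
        _, ex, ey = min(cands)
        dx, dy = 2 * dx + ex, 2 * dy + ey
        rng = 2                          # (-2, +2) pixel refinement afterwards
    return dx, dy
```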
[0055] Applications such as image-guided radiosurgery require that the comparison between the DRRs (that contain the 3D CT scan information) and the real-time x-ray images, and the consequent adjustment of the position of the x-ray source, be made very rapidly and accurately. In practice, the accuracy should be below 1 mm, and the computation time should be on the order of a few seconds. Unfortunately, it is difficult to meet both requirements simultaneously, for several reasons. First, the two different modality images, i.e. CT scan images and x-ray images, have different spatial resolution and image quality. Generally, x-ray image resolution and quality are superior to the resolution and quality of DRR images, which are only synthesized images. Typically, some structures in the DRR may appear more blurred (especially normal to the CT slice plane), compared to the x-ray image. Ideally, an optimal similarity measure for a 2D/3D registration process should allow for an accurate registration to be achieved, despite such differences.
[0056] Second, DRR generation relies on a proper attenuation model. Because attenuation is proportional to the mass intensity of the target volume through which the beam passes, the exact relationship between the traversed mass intensity and the CT image intensity needs to be known, in order to obtain an accurate modeling. Establishing this relationship is difficult, however, so the
linear attenuation model is often used. As is known, the linear attenuation coefficient of a material is dependent on x-ray energy. CT machines and x-ray machines work at different effective energies, however. As a result, the attenuation coefficients measured by a CT scanner are different from the attenuation of a beam of x-rays passing through the target. The skeletal structures in DRR images cannot be reconstructed very well using the linear model, the DRRs being only synthetic x-ray projection images. At CT energies, the ratio of bone-to-soft-tissue attenuation is much lower than at x-ray radiographic energies. Thus, in a DRR produced from a 3D CT volume, the image contrast from soft tissue will be comparable with the image contrast from bone, reducing the clarity of bone details, for example.
[0057] Finally, x-ray images usually have a large image size (512 x 512). For better registration accuracy, it is desirable to use the full resolution image. Full resolution images are rarely used, in practice, however, because the resulting increase in computation time is excessive, and is incompatible with the requirements of image-guided radiosurgery.
[0058] Generally, similarity measure methods used in 2D/3D registration can be divided into two categories. The first method is based on image features. The image features could be anatomical edges or segmented objects. The registration accuracy depends on the accuracy of edge detection or object segmentation. The main advantage of this method is its fast computation. Feature-based similarity methods register on salient features that have been segmented from each image. They use a reduced amount of data, which makes the algorithms fast, once the segmentation has been undertaken. Because the full information content of the image is not used, however, the accuracy is sacrificed. Errors in the segmentation stage can lead to an error in the final registration.
[0059] The second method is based on image intensity content. Intensity-based methods compare the voxel and pixel values directly, using measures based on image statistics. The original images are used for registration. Usually, a good
accuracy can be achieved. Although these methods require little or no segmentation, intensity-based methods are typically much slower. Because a long computation time is required, it is hard to apply intensity-based similarity measures in clinical practice.
[0060] In one embodiment, a similarity measure method is used that is designed to optimize the 2D/3D image registration procedure described above. This similarity measure method is based on pattern intensity, and provides a powerful and efficient way to solve the 2D/3D image registration problem. In particular, the pattern intensity based method and system described below is designed for the 1D search phase (for the out-of-plane parameters), and for the iterative refining phases, of the 2D/3D image registration procedure described above.
[0061] For the 3D multi-level search phase, the "sum of absolute differences" (SAD) measure is used, which is a known, simple similarity measure. The SAD measure is widely used in medical image processing and video processing, in cases where the two images to be matched have high image quality. The main advantages of using SAD are its fast computation and its easy optimization in parallel computation. Its main disadvantage is that the solution is sensitive to image noise, artifacts, and intensity differences between the live and DRR images. As a result, SAD is used only in the first search phase, to get approximate results. SAD can be expressed as

SAD = Σi,j | ILive(i, j) − IDRR(i, j) |,

where ILive(i, j) represents the intensity of the "live" real-time x-ray image, and IDRR(i, j) represents the intensity of the reconstructed DRR image.
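In numpy terms, this measure is a direct transcription of the formula above (a sketch; equal image sizes and prior alignment are assumed to be handled by the caller):

```python
# Sum of absolute differences between the live x-ray image and the DRR,
# per the SAD expression above. Both inputs are equal-size 2D arrays.
import numpy as np

def sad(live, drr):
    return np.abs(live.astype(np.float64) - drr.astype(np.float64)).sum()
```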
[0062] The pattern intensity similarity measure is more accurate, and less sensitive to image noise, artifacts, and intensity differences between the images being compared, than other similarity measures known in the art. In the exemplary embodiment described in the following paragraphs, two images are compared, the first image being a 2D x-ray image of a radiosurgical treatment target, and the second image being a 2D DRR that is reconstructed from 3D CT scan data generated at the time of treatment planning. In one embodiment, the two images are discretized, digital images, characterized by first and second 2D arrays of pixel values. The pixel arrays are equi-dimensional, i.e. the number of rows and columns of the first array is equal to the number of rows and columns of the second array. As is well known, each pixel value of an image is a number representative of the intensity of the image at a unique corresponding 2D area element forming the image.
[0063] A difference image is formed from the real-time x-ray image and the DRR image, by subtracting the corresponding pixel values of the second image (the DRR image) from each pixel value of the first image (the real-time x-ray image):

Idiff(i, j) = ILive(i, j) − IDRR(i, j),

where Idiff(i, j) represents the intensity or pixel value of the ij-th pixel of the difference image; ILive(i, j) represents the intensity or pixel value of the ij-th pixel of the live x-ray image; and IDRR(i, j) represents the intensity or pixel value of the ij-th pixel of the artificial DRR image.
[0064] A pattern intensity function is defined, which operates on the difference image. The pattern intensity function is expressed as an asymptotic function of the gradients of the difference image:

Σi,j Σ(k,l)∈R  σ² / (σ² + (Idiff(i, j) − Idiff(i + k, j + l))²),     (1)

where σ is a weighting constant and R is a neighborhood that is defined using the pixel (i, j) as the center point. The mathematical formulation in equation (1) results in the similarity measure tending to a maximum value as the number of structures tends to zero, and asymptotically tending to zero as the number of structures increases. Because of the asymptotic nature of the pattern intensity measure, large differences in intensity have the same effect on the measure, regardless of their magnitude. This makes the measure robust to large differences in pixel intensity.
[0065] The function is weighted by the weighting constant σ. The constant σ is used to weight the function so that small deviations in intensity (caused by noise, by way of example) result in the measure remaining proximate to its maximum value. The sensitivity of the solution to variations in the x-ray image can be minimized by careful selection of this constant. The larger the weighting constant, the more stable the results become. However, the choice of the weighting constant is a tradeoff between stability and accuracy. If the value of the weighting constant is too large, the smaller details in the images cannot be reflected in the similarity measure. Based on experimentation, the empirical value of σ is determined to be in the range from about 4 to about 16, although other values of σ are also within the scope of the method and system described above.
[0066] The pattern intensity function considers a selected neighborhood for each pixel. FIG. 6 schematically illustrates a neighborhood for calculating pattern intensity, in one embodiment. In the embodiment illustrated in FIG. 6, the neighborhood R is defined such that the gradients in four directions are considered: horizontal, vertical, 45° diagonal, and −45° diagonal. As shown in FIG. 6, in the horizontal direction, the (i, j−1)-th pixel is considered. In the vertical direction, the (i−1, j)-th pixel is considered. In the 45° diagonal direction, the (i−1, j+1)-th pixel is considered. In the −45° direction, the (i−1, j−1)-th pixel is considered.
[0067] Based on the definition of the neighborhood R as shown in FIG. 6, the pattern intensity expression is given as the sum below:

Σi,j [ σ² / (σ² + (Idiff(i, j) − Idiff(i, j−1))²)
     + σ² / (σ² + (Idiff(i, j) − Idiff(i−1, j))²)
     + σ² / (σ² + (Idiff(i, j) − Idiff(i−1, j+1))²)
     + σ² / (σ² + (Idiff(i, j) − Idiff(i−1, j−1))²) ].     (2)
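Equation (2) translates almost line for line into numpy; the sketch below forms the difference image of [0063] and sums the asymptotic terms over the four neighbors of FIG. 6, skipping the border pixels for simplicity (an assumption, since the text does not specify border handling).

```python
# Pattern intensity measure of equation (2): for each pixel of the difference
# image, gradients to the four neighbors -- (i, j-1), (i-1, j), (i-1, j+1),
# (i-1, j-1) -- are passed through sigma^2 / (sigma^2 + g^2) and summed.
import numpy as np

def pattern_intensity(live, drr, sigma=4.0):
    diff = live.astype(np.float64) - drr.astype(np.float64)   # difference image
    c = diff[1:-1, 1:-1]                                      # center pixels
    neighbors = (diff[1:-1, :-2],    # horizontal: (i, j-1)
                 diff[:-2, 1:-1],    # vertical:   (i-1, j)
                 diff[:-2, 2:],      # 45 deg:     (i-1, j+1)
                 diff[:-2, :-2])     # -45 deg:    (i-1, j-1)
    s2 = sigma * sigma
    return sum((s2 / (s2 + (c - n) ** 2)).sum() for n in neighbors)
```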
[0068] The formulation of the pattern intensity function, given in equation (2) above, provides a number of advantages over other known similarity measures.
First, the difference image filters out the low-frequency part of the image, which is mostly soft tissue, and keeps the high-frequency part, which is mostly skeletal structure. This feature makes the algorithm robust to brightness differences between the live and DRR images. Second, because of the asymptotic nature of the pattern intensity function, the similarity measure is less affected by pixels whose intensity values deviate only slightly from their neighboring pixels. These kinds of pixels are thought to contain random noise, and are hence undesirable. Third, because the asymptotic function quickly approaches zero as its variable increases, large intensity differences such as image artifacts have the same effect on the similarity measure, regardless of their magnitude. Accordingly, the pattern intensity measure is less sensitive to image artifacts.
[0069] FIG. 7 schematically illustrates an image-guided radiosurgery system, constructed in accordance with one embodiment. In overview, the image-guided radiosurgery system 300 includes a means 301 for generating pre-treatment 3D scan data of the target; a radiosurgical beam generator 302; a positioning system 304; imaging means 306; and a controller 308. The system 300 may also include an operator control console and display 340. The means 301 may be a CT scanner, for example, or an MRI system or a PET system.
[0070] The radiosurgical beam generator 302 generates, when activated, a plurality of collimated radiosurgical beams (e.g. x-ray beams). The cumulative effect of the radiosurgical beams, when properly directed to and focused onto the target, is to necrotize or perform other treatments in a target within the patient's anatomy. The positioning system 304 may be an industrial robot, by way of example, and the beam generator 302 may be a small x-ray linac mounted to an arm of the industrial robot 304.
[0071] The imaging means 306 is preferably an x-ray imaging system for generating a pair of orthogonal x-ray projection images of the target. The imaging means 306 preferably has a pair of x-ray sources for generating diagnostic imaging beams (having known positions, angles, and intensities), and a corresponding pair of x-ray image detectors which detect the beams after the
beams have passed through the target.
[0072] The controller 308 includes software for generating a set of reconstructed 2D images (preferably DRRs) of the target, based on the 3D scan data from the 3D scanner 301, and the known intensity, location, and angle of the imaging beams. The controller 308 includes software for registering the DRRs with the real time x-ray images. The registration software is able to compute a set of 3D transformation parameters that represent the change in position of the target between the 3D scan and the near real-time x-ray images.
[0073] The positioning system 304 is responsive to commands from the controller 308, to continuously adjust, in near real time, the relative position of the radiosurgical beam generator and the target by the amount prescribed by the 3D transformation parameters obtained through the registration process.
[0074] In one embodiment, the controller 308 includes a processor 408 for performing 2D/3D registration. The 2D/3D registration processor 408 includes software for determining a set of in-plane transformation parameters (x, y, θ) and out-of-plane rotational parameters (r, φ), the parameters representing the difference in the position of the target as shown in the x-ray image, as compared to the position of the target as shown by the 2D reconstructed images.
[0075] The processor 408 further includes: 1) software for performing a 3D multi-level matching to determine an estimate for the in-plane transformation parameters (x, y, θ); 2) software for performing a 1D search for each of the pair of out-of-plane rotation parameters (r, φ), based on the estimated in-plane parameters (x, y, θ); and 3) software for iteratively refining the in-plane parameters (x, y, θ) and the out-of-plane parameters (r, φ), until a desired accuracy is reached.
[0076] In practice, a high accuracy is obtained for both translations and rotations after just a few iterations, using the method and system described above. For translations, an accuracy of 0.5 mm or better is reached, and for rotations, an accuracy of 0.5 degrees or better is reached. The total computing time is a few seconds, which is an order of magnitude faster than other methods in the prior art.
[0077] In one embodiment, the controller 308 also includes a processor 508 for determining a measure of similarity between two images. The similarity measure processor 508 is equipped with software for determining the measure of similarity between a first image and a second image, by subtracting each pixel value of the second image from a corresponding pixel value of the first image to form a difference image, and then applying a gradient operator upon each pixel of the difference image to form a pattern intensity function. The pattern intensity function is an asymptotic function of the gradients of the difference image, and permits the pixel values within a neighborhood R defined about each pixel in the difference image to be considered. The gradients are defined over at least four directions.
[0078] The method and system described above provide numerous advantages over the prior art. For example, a fully automatic tracking process is achieved, and no user intervention is necessary. Also, the registration process is optimized to allow the use of images having a full resolution. In this way, the full informational content of each image can be utilized, without having to leave out any image features. These results are achieved while reducing the total computation time for the tracking process to a few seconds, which is an order of magnitude faster than existing prior art methods. At the same time, a high accuracy is achieved for both the translation parameters (below about 0.5 mm) and the rotation parameters (below about 0.5 degrees).
[0079] While the invention has been particularly shown and described with reference to specific preferred embodiments, it should be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.