US20210110581A1 - Tomographic reconstruction system - Google Patents
- Publication number
- US20210110581A1 US20210110581A1 US17/033,065 US202017033065A US2021110581A1 US 20210110581 A1 US20210110581 A1 US 20210110581A1 US 202017033065 A US202017033065 A US 202017033065A US 2021110581 A1 US2021110581 A1 US 2021110581A1
- Authority
- US
- United States
- Prior art keywords
- voxel
- data
- vem
- voxels
- memory
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/003—Reconstruction from projections, e.g. tomography
- G06T11/006—Inverse problem, transformation from projection-space into object-space, e.g. transform methods, back-projection, algebraic methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2211/00—Image generation
- G06T2211/40—Computed tomography
- G06T2211/424—Iterative
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2211/00—Image generation
- G06T2211/40—Computed tomography
- G06T2211/432—Truncation
Definitions
- Steps of an MBIR process according to one embodiment are summarized in Table 1 below.
- Given a set of 2D measurements (y) as inputs, the process produces the reconstructed 3D volume (x) at the output.
- the voxels in x are initialized at random (line 1).
- the error sinogram (e) is computed as the difference between the 2D measurements and the 2D views obtained by projecting the current 3D volume.
- the process iteratively updates the voxels (lines 4-13) until the convergence criterion is met. In each iteration, every voxel in the volume is updated once, in random order.
- Lines 5-11 of the process in Table 1 describe the steps involved in updating a voxel.
- the parameters θ1 and θ2 are computed using the A matrix and the error sinogram e (line 5).
- the quadratic surrogate function is evaluated for each of the voxel neighbors (lines 6-8). These are utilized to compute the new value of the voxel z* (line 9).
- the error sinogram (e) and the 3D volume (x) are then updated using the new voxel value (lines 10-11).
- ICD-MBIR offers several advantages over other MBIR variants: (i) it requires fewer iterations to converge, enabling faster runtimes, and (ii) it is general and can easily be adapted to a variety of applications with different geometries, noise statistics, and image models, without custom algorithmic tuning for each application.
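- The ICD-MBIR process of Table 1 can be sketched in NumPy. This is a minimal illustration, using a dense A matrix for clarity and a caller-supplied routine standing in for the voxel update of lines 5-9; the function and variable names are hypothetical, not taken from the patent:

```python
import numpy as np

def icd_mbir(y, A, update_voxel, max_iters=500, tol=1e-9):
    """Sketch of the ICD-MBIR loop of Table 1 (dense A for clarity).

    y: (M,) measurements; A: (M, N) system matrix;
    update_voxel(s, x, e, A): returns the new value z* for voxel s.
    """
    rng = np.random.default_rng(0)
    N = A.shape[1]
    x = rng.random(N)                 # line 1: random initialization
    e = y - A @ x                     # lines 2-3: error sinogram e = y - Ax
    for _ in range(max_iters):        # lines 4-13: sweep until convergence
        x_prev = x.copy()
        for s in rng.permutation(N):  # each voxel once, in random order
            z_star = update_voxel(s, x, e, A)   # lines 5-9
            e -= A[:, s] * (z_star - x[s])      # line 10: keep e consistent
            x[s] = z_star                       # line 11: write back voxel
        if np.max(np.abs(x - x_prev)) < tol:    # convergence criterion
            break
    return x
```

With a plain least-squares coordinate update supplied for `update_voxel`, the loop converges to a measurement-consistent volume; the full MBIR update of lines 5-9 additionally folds in the prior term.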
- each voxel update to create a 3D image 208 involves accessing three key data structures, as illustrated in FIG. 2 : (i) a column 212 of the A matrix 210 , wherein the column 212 is indexed by the x and z co-ordinates of a voxel 214 , (ii) the voxel neighborhood 216 , which refers to the voxels adjacent to the current voxel 214 along all directions, and (iii) portions of the error sinogram, which are determined by the slice ID or y co-ordinate of the voxel 214 .
- An A matrix column 212 is typically sparse (sparsity ratio of 1000:1), the per-voxel update computations are relatively small (time per voxel update: ~26 μs), and the overheads of parallelization, such as task startup time, synchronization between threads, and off-chip memory bandwidth, significantly limit performance. In summary, parallelizing computations within each voxel update yields very little performance improvement.
- the present disclosure provides a specialized hardware architecture and associated control system to simultaneously improve both the runtime and energy consumption of the MBIR algorithm by exploiting its computational characteristics.
- FIG. 3 shows a block diagram of the tomography hardware accelerator unit 302 according to one embodiment.
- the tomography hardware accelerator unit 302 receives as input: 2D measurement data 304 , reconstructed 3D volume data 306 , A matrix data 308 , and error sinogram data 310 .
- the 2D measurement data 304 , reconstructed 3D volume data 306 , A matrix data 308 , and error sinogram 310 may be stored in memory blocks external to and operatively connected to the tomography hardware accelerator unit 302 .
- the tomography hardware accelerator unit 302 may also include a global control unit 312 containing appropriate control registers and logic to initialize the location of the external memory blocks and generate interface signals 314 for sending/receiving inputs/outputs to and from the tomography hardware accelerator unit 302 .
- the global control unit 312 generates a random voxel ID (x,y,z co-ordinates). Based on the co-ordinates, the tomography hardware accelerator unit 302 retrieves the following data from the system memory 106 which is required to update the voxel 214 : a column 212 of the A matrix 210 , a portion of the error sinogram 310 , and the voxel neighbor data 216 (which is a portion of the volume data 306 ). The tomography hardware accelerator unit 302 may include internal memory blocks to store these data structures. The updated value of the voxel 214 is then computed by the tomography hardware accelerator unit 302 and stored back to the external system memory 106 . This process is repeated until the convergence criterion is met.
- the tomography hardware accelerator unit 302 may also comprise one or more voxel evaluation modules 316 .
- Each voxel evaluation module 316 may comprise a theta evaluation module 318 , a neighborhood processing element 320 , and a voxel update element 322 .
- Each of the theta evaluation module 318 , neighborhood processing element 320 , and voxel update element 322 may comprise one or more computer processors and associated memory for performing calculations on the received data.
- the TEM 318 evaluates the variables θ1 and θ2 of the MBIR algorithm.
- the NPE 320 applies a complex one-to-one function on each of the neighbor voxels.
- the VUE 322 uses the outputs of TEM and NPE to compute the updated value of the voxel 214 and the error sinogram 310 .
- the processing elements 318 , 320 and 322 may comprise hardware functional blocks such as adders, multipliers, registers etc. that are interconnected to achieve the desired functionality, in some cases, over multiple cycles of operation.
- the VEM 316 may also contain memory blocks that store the column 212 of A matrix 210 , portions of the error sinogram 310 and the voxel neighbor data, all of which may be accessed by the TEM 318 , NPE 320 and VUE 322 . Since the A matrix is sparse, it may be stored as an adjacency list using First-In-First-Out (FIFO) buffers in certain embodiments.
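- Because the column is roughly 1000:1 sparse, the adjacency-list/FIFO storage above reduces both memory and the work of the later reduction to the nonzero count. A small sketch of the compressed representation (helper names are illustrative):

```python
import numpy as np

def compress_column(a_col, eps=0.0):
    """Store one A-matrix column as parallel (index, value) arrays --
    the adjacency-list form that would be streamed through a FIFO."""
    idx = np.flatnonzero(np.abs(a_col) > eps)
    return idx, a_col[idx]

def sparse_dot(idx, val, e):
    """Dot product of the compressed column with the error sinogram,
    touching only the entries at the stored nonzero indices."""
    return float(val @ e[idx])
```

For a column with a 1000:1 sparsity ratio, only about 0.1% of the entries are stored and traversed.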
- the VEM 316 operates as follows. First, the elements of the A matrix column 212 are transferred into the TEM 318 .
- the TEM 318 utilizes the index of the A matrix elements to address the error sinogram 310 memory to obtain the corresponding error sinogram 310 value.
- the TEM 318 performs a vector reduction operation on the A matrix column 212 and error sinogram 310 values to obtain θ1 and θ2.
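- The reduction can be pictured as one pass over the compressed column, fetching the matching error-sinogram entry for each nonzero. The exact formulas for θ1 and θ2 are not spelled out in this passage, so the sketch below assumes the conventional ICD definitions θ1 = −eᵀΛA∗,s and θ2 = A∗,sᵀΛA∗,s, with Λ stored as a vector of diagonal weights:

```python
def theta_reduction(col_idx, col_val, e, lam):
    """Sketch of the TEM reduction: accumulate theta1 and theta2 over the
    nonzeros of the A-matrix column (conventional ICD definitions assumed:
    theta1 = -e^T Lambda A_col, theta2 = A_col^T Lambda A_col)."""
    theta1 = 0.0
    theta2 = 0.0
    for i, a in zip(col_idx, col_val):  # one column element per "cycle"
        w = lam[i]                      # diagonal noise weight for row i
        theta1 -= w * e[i] * a
        theta2 += w * a * a
    return theta1, theta2
```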
- the NPE 320 operates on each of the voxel's neighbors and stores the processed neighbor values in a FIFO memory located in the VEM 316 . Since the TEM 318 and NPE 320 operate in parallel, the performance of the VEM 316 is maximized when their latencies are equal.
- the output of the TEM 318 and NPE 320 is directed to the VUE 322 , which computes the updated value of the voxel 214 .
- the entries in the error sinogram 310 memory are also updated based on the updated value of the voxel 214 .
- the voxel 214 data is written back to the system memory 106 .
- the VEM 316 efficiently computes the updated value of a voxel.
- the performance of the computation engine can be further improved by operating it as a two-level nested pipeline.
- the first-level pipeline is within the TEM 318 .
- the VEM 316 leverages the pipeline parallelism across the different elements of the vector reduction.
- While the TEM 318 computes on a given A matrix element, the error sinogram value for the successive element is fetched from the error sinogram memory in a pipelined manner.
- the second level pipelining exploits the parallelism across successive voxels.
- the VEM 316 concurrently transfers data required by the subsequent voxel, even as the previous voxel is being processed by the VEM 316 .
- both pipeline levels improve performance by overlapping data communication with computation.
- Each execution of the VEM 316 requires the A matrix column 212 , the error sinogram 310 and the voxel neighbor data 216 to be transferred from the system memory 106 to the VEM 316 .
- the VEM 316 reuses the data stored in the internal memory blocks of the VEM 316 across multiple voxels. Since voxels in a slice 218 share the same portion of the error sinogram 310 ( FIG. 2 ), the VEM 316 constrains the sequence in which the voxels 214 are updated in the VEM as follows: First, a slice 218 is selected from the volume 208 at random. Then, all voxels 214 in the slice 218 are updated in a random sequence before the next slice is chosen.
- the error sinogram 310 needs to be fetched only once per slice, and all voxels within the slice can re-use the data.
- the data transfer cost for the error sinogram 310 is amortized across all voxels 214 in the slice 218 .
- This optimization can be simply realized by modifying the global control unit 312 that generates the voxel ID.
- the global control unit 312 may be programmed to select voxels for updating in an order that processes a majority of voxels from the same slice together.
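- The slice-constrained ordering above (a random slice, then all of its voxels in random order) can be sketched as a generator; the names are illustrative:

```python
import random

def slice_ordered_schedule(nx, ny, nz, seed=0):
    """Yield (x, y, z) voxel IDs so that all voxels of one slice
    (fixed y) are visited, in random order, before the next slice."""
    rng = random.Random(seed)
    slices = list(range(ny))
    rng.shuffle(slices)                      # pick slices at random
    for y in slices:
        voxels = [(x, y, z) for x in range(nx) for z in range(nz)]
        rng.shuffle(voxels)                  # random order within slice
        yield from voxels
```

The error-sinogram portion for a slice is fetched once when y changes and is then reused for the next nx*nz updates.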
- the VEMs 316 are arranged as an array of L lanes, each containing a dedicated TEM 318 , NPE 320 , and VUE 322 .
- the voxels that are to be updated in parallel are chosen to be located far apart in the 3D volume.
- the voxels may be selected from different slices 218 that are equally and entirely spread out across the y dimension ( FIG. 2 ) of the 3D volume 208 .
- the tomography hardware accelerator unit 302 restricts concurrently updated voxels to lie on a straight line in the volume.
- the line 502 is parallel to the y axis of the volume 208 (i.e. they have the same x,z coordinates), although any straight line in the volume may be used. Since A matrix columns are indexed only using the x and z co-ordinates ( FIG. 2 ), concurrently updated voxels 214 share the same A matrix column 212 , thereby linearly reducing the A matrix data transfer time.
- an A matrix memory 324 may be placed outside the VEMs 316 and shared by all VEMs 316 during operation. Before each execution of the VEM array, the neighborhoods for all voxels 214 and one A matrix column 212 are transferred from the system memory 106 to the memory 324 . This results in a net reduction of L ⁇ 1 A matrix column transfers per execution, where L is the number of VEMs 316 in the tomography hardware accelerator unit 302 .
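- Restricting the L concurrently updated voxels to one line with fixed (x, z), and therefore to one shared A-matrix column, can be sketched with a hypothetical helper:

```python
import random

def pick_lane_voxels(x, z, ny, num_lanes, seed=0):
    """Choose num_lanes voxels on the line with fixed (x, z): all lanes
    then share the single A-matrix column indexed by (x, z), while each
    lane works on a distinct y slice."""
    rng = random.Random(seed)
    ys = rng.sample(range(ny), num_lanes)   # distinct slices
    return [(x, y, z) for y in ys]
```

One column transfer then serves all L lanes, saving L − 1 transfers per execution of the VEM array, as stated above.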
- the neighborhood voxel data transfer time may be reduced by concurrently updating the neighborhood volume 216 of size a ⁇ b ⁇ c (x,y,z directions) around the voxel 214 .
- Each VEM 316 is then used to update one of the voxels 214 in the volume 216 in a parallel fashion.
- the capacity of the neighborhood memory is sized to hold all of the data for voxels in the neighborhood volume 216 . Also, once the updated value of a voxel is computed, it needs to be written to the neighborhood memory within the VEM 316 (in addition to the system memory 106 ), as subsequent voxels in the volume use the updated value.
- the error sinogram memory in the VEM 316 may be replicated to hold data corresponding to each slice 218 in the volume 216 .
- voxels within the slice 218 use different A matrix columns, and correspondingly the A matrix memory is also replicated.
- the TEM 318 , NPE 320 and VUE 322 are not replicated in the VEM 316 in such embodiments, as voxels in the volume 216 are evaluated sequentially.
- the global control unit 312 and VEMs 316 are modified to appropriately index these memories and evaluate all voxels within the neighborhood volume 216 .
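- Enumerating the a×b×c neighborhood volume around a selected voxel, whose members are then evaluated while the block's data stays resident in the VEM memories, can be sketched with a hypothetical helper (clipped at the volume boundary):

```python
def neighborhood_block(cx, cy, cz, a, b, c, shape):
    """Enumerate the voxels of an a x b x c block centered at (cx, cy, cz),
    clipped to the volume bounds; these are the voxels updated while the
    block's data stays resident in the neighborhood memory."""
    nx, ny, nz = shape
    for x in range(max(cx - a // 2, 0), min(cx + a // 2 + 1, nx)):
        for y in range(max(cy - b // 2, 0), min(cy + b // 2 + 1, ny)):
            for z in range(max(cz - c // 2, 0), min(cz + c // 2 + 1, nz)):
                yield (x, y, z)
```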
- Steps of various methods described herein can be performed in any order except when otherwise specified, or when data from an earlier step is used in a later step.
- Exemplary method(s) described herein are not limited to being carried out by components particularly identified in discussions of those methods.
- a technical effect is to improve the functioning of a CT scanner by substantially reducing the time required to process CT data, e.g., by performing MBIR or other processes to determine voxel values.
- a further technical effect is to transform measured data from a CT scanner into voxel data corresponding to the scanned object.
- aspects described herein may be embodied as systems or methods. Accordingly, various aspects herein may take the form of an entirely hardware aspect, an entirely software aspect (including firmware, resident software, micro-code, etc.), or an aspect combining software and hardware aspects. These aspects can all generally be referred to herein as a “service,” “circuit,” “circuitry,” “module,” or “system.”
- various aspects herein may be embodied as computer program products including computer readable program code (“program code”) stored on a computer readable medium, e.g., a tangible non-transitory computer storage medium or a communication medium.
- a computer storage medium can include tangible storage units such as volatile memory, nonvolatile memory, or other persistent or auxiliary computer storage media, removable and non-removable computer storage media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data.
- a computer storage medium can be manufactured as is conventional for such articles, e.g., by pressing a CD-ROM or electronically writing data into a Flash memory.
- communication media may embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transmission mechanism.
- a modulated data signal such as a carrier wave or other transmission mechanism.
- “computer storage media” do not include communication media. That is, computer storage media do not include communications media consisting solely of a modulated data signal, a carrier wave, or a propagated signal, per se.
- the program code can include computer program instructions that can be loaded into processor 186 (and possibly also other processors), and that, when loaded into processor 186, cause functions, acts, or operational steps of various aspects herein to be performed by processor 186 (or another processor).
- the program code for carrying out operations for various aspects described herein may be written in any combination of one or more programming language(s), and can be loaded from disk 143 into code memory 141 for execution.
- the program code may execute, e.g., entirely on processor 186 , partly on processor 186 and partly on a remote computer connected to network 150 , or entirely on the remote computer.
Landscapes
- Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Algebra (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Mathematical Physics (AREA)
- Pure & Applied Mathematics (AREA)
- Apparatus For Radiation Diagnosis (AREA)
Abstract
A tomography system having a central processing unit, a system memory communicatively connected to the central processing unit, and a hardware acceleration unit communicatively connected to the central processing unit and the system memory, the hardware accelerator configured to perform at least a portion of an MBIR process on computer tomography data. The hardware accelerator unit may include one or more voxel evaluation modules which evaluate an updated value of a voxel given a voxel location in a reconstructed volume. By processing voxel data for voxels in a voxel neighborhood, processing time is reduced.
Description
- The present patent application is a continuation of U.S. patent application Ser. No. 16/233,066, filed Dec. 26, 2018, which is a continuation of U.S. patent application Ser. No. 15/063,054, filed Mar. 7, 2016, which claims the priority benefit of U.S. Provisional Patent Application Ser. No. 62/129,018, filed Mar. 5, 2015. The contents of all of these applications are hereby incorporated by reference in their entirety into the present disclosure.
- This invention was made with government support under Grant Number CNS-1018621 awarded by the National Science Foundation. The government has certain rights in the invention.
- The present application relates to tomography imaging devices, e.g., computed tomography devices, and to tomography control systems.
- Tomographic reconstruction is an important inverse problem in a wide range of imaging systems, including medical scanners, explosive detection systems and electron and X-ray microscopy for scientific and materials imaging. The objective of tomographic reconstruction is to compute a three-dimensional volume (a physical object or a scene) from two-dimensional observations that are acquired using an imaging system. An example of tomographic reconstruction is found in computed tomography (CT) scans, in which X-ray radiation is passed from several angles to record 2D radiographic images of specific parts of the scanned patient. These radiographic images are then processed using a reconstruction algorithm to form a 3D volumetric view of the scanned region, which is subsequently used for medical diagnosis.
- Model Based Iterative Reconstruction (MBIR) is a promising approach to realize tomographic reconstruction. The MBIR framework formulates the problem of reconstruction as minimization of a high-dimensional cost function, in which each voxel in the 3D volume is a variable. An iterative algorithm is employed to optimize the cost function such that a pre-specified error threshold is met.
- MBIR has demonstrated state-of-the art reconstruction quality on various applications and has been utilized commercially in GE's healthcare systems. In addition to improved image quality, MBIR has enabled significant reduction in X-ray dosage in the context of lung cancer screening (˜80% reduction) and pediatric imaging (30-50% reduction). In other application domains, MBIR offers additional advantages such as improved output resolution, precise definition with reduced impact of undesired artifacts in images, and the ability to reconstruct even with sparse view angles. These capabilities are extremely critical in applications such as explosive detection systems (e.g. baggage and cargo scanners), where there is a need to reduce cost due to false alarm rates, operate under non-ideal view angles, and extend deployed systems to cover new threat scenarios.
- While MBIR shows great potential, its high compute and data requirements are key bottlenecks to its widespread commercial adoption. For instance, reconstructing a 512×512×256 volume of nanoparticles viewed from different angles through an electron microscope requires 50.33 GOPS (giga-operations) and 15G memory accesses per iteration of MBIR. Further, the algorithm may take tens of iterations to converge, depending on the threshold. Clearly, this places a significant compute demand. One tested software implementation required ˜1700 seconds per iteration on a 2.3 GHz AMD Opteron server with 196 GB memory, which is unacceptable for many practical applications. Thus, technologies that enable orders-of-magnitude improvement in MBIR's implementation efficiency are needed.
- According to various aspects, a tomography system is provided, comprising a central processing unit, a system memory communicatively connected to the central processing unit, and a hardware acceleration unit communicatively connected to the central processing unit and the system memory, the hardware accelerator configured to perform at least a portion of an MBIR process on computer tomography data. The system may comprise one or more voxel evaluation modules that evaluate an updated value of a voxel given a voxel location in a reconstructed volume. The operations may further include determining a reconstructed image for the selected voxel using the updated value of the voxel. The system may also include a computer tomography scanner, wherein the computer tomography scanner is configured to irradiate a test object, measure resulting radiation, and provide measured data corresponding to the resulting radiation.
- In the following description and drawings, identical reference numerals have been used, where possible, to designate identical features that are common to the drawings.
- FIG. 1 is a diagram showing the components of an example tomography system according to one embodiment.
- FIG. 2 is a diagram showing an access pattern for voxel update in an MBIR algorithm useful with various aspects.
- FIG. 3 is a block diagram of an example implementation of hardware specialized to execute the MBIR algorithm useful with various aspects.
- FIG. 4 is a block diagram of a computation engine/Voxel Evaluation Module of FIG. 3 used for voxel update.
- FIG. 5 shows a scheme where constraining voxels on an x-z line enables sharing of an A-matrix column.
- FIG. 6 shows a scheme where a neighborhood is reused if each computation engine is designed to update a volume around the selected voxel.
- The attached drawings are for purposes of illustration and are not necessarily to scale.
- X-ray computed tomography, positron-emission tomography, and other tomography imaging systems are referred to herein generically as “CT” systems.
- Throughout this description, some aspects are described in terms that would ordinarily be implemented as software programs. Those skilled in the art will readily recognize that the equivalent of such software can also be constructed in hardware, firmware, or micro-code. Because data-manipulation algorithms and systems are well known, the present description is directed in particular to algorithms and systems forming part of, or cooperating more directly with, systems and methods described herein. Other aspects of such algorithms and systems, and hardware or software for producing and otherwise processing signals or data involved therewith, not specifically shown or described herein, are selected from such systems, algorithms, components, and elements known in the art. Given the systems and methods as described herein, software not specifically shown, suggested, or described herein that is useful for implementation of any aspect is conventional and within the ordinary skill in such arts.
- FIG. 1 shows a tomography system 100 according to one embodiment. As shown, the system 100 includes a CT scanner 102, a central processing unit 104, a system memory 106, a tomography hardware accelerator unit 302, and a user interface 108. The CT scanner 102 may include a rotating gantry having an x-ray radiation source and sensors which direct radiation to an object being scanned at various angles to record 2D radiographic images. The various components shown in FIG. 1 may be communicatively connected by an electronic network.
- Central processing unit 104, hardware accelerator unit 302, and other processors described herein can each include one or more microprocessors, microcontrollers, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), programmable logic devices (PLDs), programmable logic arrays (PLAs), programmable array logic devices (PALs), or digital signal processors (DSPs).
- System memory 106 can be a tangible non-transitory computer-readable storage medium, i.e., a non-transitory device or article of manufacture that participates in storing instructions that can be provided to central processing unit 104 or hardware accelerator unit 302 for execution. In one example, the system memory 106 comprises random access memory (RAM). In other examples, the system memory 106 may comprise a hard disk drive. The phrase “communicatively connected” includes any type of connection, wired or wireless, for communicating data between devices or processors. These devices or processors can be located in physical proximity or not.
- To better illustrate the technical challenges involved in the implementation of MBIR, an explanation of the mathematical concepts behind the algorithm will be provided. In order to describe the MBIR approach, it is useful to think of all the 2D images (measurements) as well as the unknown 3D volume of voxels as one-dimensional vectors. If y is an M×1 vector containing all the measurements, x is an N×1 vector containing all the voxels in the 3D volume, and A is a sparse M×N matrix implementing the line integral through the 3D volume, then the MBIR reconstruction is obtained by minimizing the following function,
$$\hat{x} \;=\; \underset{x}{\operatorname{arg\,min}} \left\{ \tfrac{1}{2}\,(y - Ax)^{T}\,\Lambda\,(y - Ax) \;+\; \sum_{(r,s)\in\mathcal{N}} \omega_{rs}\,\rho(x_{r} - x_{s}) \right\} \qquad (1)$$
- where 𝒩 represents the set of all pairs of neighboring voxels in 3D (using, say, a 26-point neighborhood system), ρ(·) is a potential function that incorporates a model for the underlying image, Λ is a diagonal matrix whose entries weight each term by a factor inversely proportional to the noise in the measurement, and ω_rs is a set of normalized weights depending on the physical distance between neighboring voxels. The first term in equation (1) has the interpretation of enforcing consistency of the desired reconstruction with the measurements, while the second term enforces certain desirable characteristics in the reconstruction (sharp edges, low noise, etc.). The term y − Ax, which represents the difference between the original 2D measurements and the 2D projections obtained from the 3D volume, is called the error sinogram (e).
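- The cost function just described can be made concrete with a short sketch. The following Python snippet is illustrative only (the function and variable names are not from the patent): it evaluates the two terms of equation (1) for a toy system, storing the diagonal of Λ as a vector and A in sparse form, as the text assumes.

```python
import numpy as np
from scipy.sparse import csc_matrix

def mbir_cost(y, A, x, lam, neighbor_pairs, weights, rho):
    """Evaluate the MBIR cost: 1/2 (y - Ax)^T Lambda (y - Ax)
    plus the weighted potential over neighboring voxel pairs."""
    e = y - A @ x                          # error sinogram e = y - Ax
    data_term = 0.5 * np.sum(lam * e * e)  # Lambda is diagonal, kept as a vector
    reg_term = sum(w * rho(x[r] - x[s])
                   for (r, s), w in zip(neighbor_pairs, weights))
    return data_term + reg_term

# Toy system: M = 2 measurements, N = 3 voxels, quadratic potential.
A = csc_matrix(np.array([[1.0, 0.0, 1.0],
                         [0.0, 1.0, 0.0]]))
y = np.array([2.0, 1.0])
x = np.array([1.0, 0.0, 0.0])
cost = mbir_cost(y, A, x, lam=np.ones(2),
                 neighbor_pairs=[(0, 1), (1, 2)], weights=[0.5, 0.5],
                 rho=lambda d: d * d)  # cost = 1.0 (data) + 0.5 (regularizer)
```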
- While several variants of the MBIR algorithm exist based on how the cost function is minimized, a popular variant called the Iterative Coordinate Descent MBIR (ICD-MBIR) is considered. The basic idea in ICD is to update the voxels one at a time so as to monotonically decrease the value of the original function (equation (1)) with each update. Since the cost function is convex and is bounded from below, this method converges to the global minimum.
- The cost function in equation (1), restricted to a single voxel indexed by s (and ignoring constants), is given by
$$c_{s}(u) \;=\; \tfrac{1}{2}\,\big(e - A_{*,s}(u - x_{s})\big)^{T}\,\Lambda\,\big(e - A_{*,s}(u - x_{s})\big) \;+\; \sum_{r\in\partial s} \omega_{rs}\,\rho(u - x_{r}) \qquad (2)$$
- where A_{*,s} is the s-th column of A, e = y − Ax is the error sinogram, x_s is the current value of voxel s, u is the candidate updated value of voxel s, and ∂s is the set of neighbors of s.
- Due to the complicated nature of the function ρ(·), it is typically not possible to find a simple closed-form expression for the minimum of (2). Hence, ρ(·) is often replaced by a quadratic surrogate function which makes (2) simpler to minimize. In particular, if each potential term is replaced by the quadratic surrogate
$$\rho(\Delta;\Delta') \;=\; \rho(\Delta') + \frac{\rho'(\Delta')}{2\,\Delta'}\left(\Delta^{2} - \Delta'^{2}\right) \;\ge\; \rho(\Delta), \quad \text{with equality at } \Delta = \Delta' \qquad (3)$$
- then an overall surrogate function to (2) is given by
$$q_{s}(u) \;=\; \tfrac{1}{2}\,\big(e - A_{*,s}(u - x_{s})\big)^{T}\,\Lambda\,\big(e - A_{*,s}(u - x_{s})\big) \;+\; \sum_{r\in\partial s} \frac{\omega_{rs}\,a_{r}}{2}\,(u - x_{r})^{2} \;+\; \text{const}, \qquad a_{r} = \frac{\rho'(x_{s} - x_{r})}{x_{s} - x_{r}} \qquad (4)$$
- Taking the derivative of this surrogate function and setting it to zero, it can be verified that the minimum of the function is
$$u^{*} \;=\; \frac{\theta_{1} + \theta_{2}\,x_{s} + \sum_{r\in\partial s}\omega_{rs}\,a_{r}\,x_{r}}{\theta_{2} + \sum_{r\in\partial s}\omega_{rs}\,a_{r}}, \qquad \theta_{1} = A_{*,s}^{T}\,\Lambda\,e, \quad \theta_{2} = A_{*,s}^{T}\,\Lambda\,A_{*,s} \qquad (5)$$
- Note that minimizing the surrogate (4) ensures a decrease of (2), and hence of the original function (1). The algorithm can be efficiently implemented by keeping track of the error sinogram e along with each update.
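- A single ICD voxel update following this surrogate derivation can be sketched as below. This is a minimal illustration, not the patent's implementation: only the nonzero entries of the sparse column A_{*,s} and the matching error-sinogram entries are passed in, and the d = 0 fallback for the surrogate coefficient assumes the quadratic potential used in the example.

```python
import numpy as np

def icd_voxel_update(e, A_col, lam, x_s, x_nbrs, w, rho_prime):
    """Closed-form ICD update of one voxel via the quadratic surrogate.
    e, A_col, lam: entries of the error sinogram, A column, and diagonal
    noise weights touched by this voxel; x_s: current voxel value;
    x_nbrs, w: neighbor values and normalized weights."""
    theta1 = np.dot(A_col, lam * e)        # A_{*,s}^T Lambda e
    theta2 = np.dot(A_col, lam * A_col)    # A_{*,s}^T Lambda A_{*,s}
    d = x_s - x_nbrs
    # Surrogate coefficients a_r = rho'(x_s - x_r) / (x_s - x_r);
    # the d == 0 fallback (2.0) is rho''(0) for the quadratic rho below.
    a = np.where(d != 0.0, rho_prime(d) / np.where(d != 0.0, d, 1.0), 2.0)
    z_star = (theta1 + theta2 * x_s + np.sum(w * a * x_nbrs)) \
             / (theta2 + np.sum(w * a))
    e_new = e - (z_star - x_s) * A_col     # keep e = y - Ax consistent
    return z_star, e_new

# One update with rho(d) = d^2: minimizes 1/2 (1 - u)^2 + u^2, so u* = 1/3.
z, e_new = icd_voxel_update(e=np.array([1.0]), A_col=np.array([1.0]),
                            lam=np.array([1.0]), x_s=0.0,
                            x_nbrs=np.array([0.0, 0.0]),
                            w=np.array([0.5, 0.5]),
                            rho_prime=lambda t: 2.0 * t)
```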
- Steps of an MBIR process according to one embodiment are summarized in Table 1 below.
-
TABLE 1

    Input:  2D measurements y
    Output: reconstructed 3D volume x
     1: Initialize x at random
     2: Error sinogram: e = y − Ax
     3: while convergence criterion not met do
     4:   for each voxel v in random order do
     5:     θ1, θ2 = f(e, A_{*,v})
     6:     for each voxel u in neighborhood N_v of v do
     7:       Compute surrogate coefficient a_uv for u (Eqn. 4)
     8:     end for
     9:     Compute z* = g(θ1, θ2, x_v, a_{*,v}) (Eqn. 5)
    10:     Update error sinogram: e ← e − (z* − x_v) A_{*,v}
    11:     Update voxel: x_v ← z*
    12:   end for
    13: end while

- Given a set of 2D measurements (y) as input, the process produces the reconstructed 3D volume (x) as output. First, the voxels in x are initialized at random (line 1). Next, the error sinogram (e) is computed as the difference between the 2D measurements and the 2D views obtained by projecting the current 3D volume (line 2). The process then iteratively updates the voxels (lines 3-13) until the convergence criterion is met. In each iteration, every voxel in the volume is updated once, in random order.
- Lines 5-11 of the process in Table 1 describe the steps involved in updating a voxel. First, the parameters θ1 and θ2 are computed using the A matrix and the error sinogram e (line 5). Next, the quadratic surrogate function is evaluated for each of the voxel's neighbors (lines 6-8). These are used to compute the new value of the voxel, z* (line 9). The error sinogram (e) and the 3D volume (x) are then updated with the new voxel value (lines 10-11).
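- The loop of Table 1 can be sketched end to end in software. The sketch below is illustrative only, with assumed simplifications not stated in the patent: Λ = I, a quadratic potential ρ(d) = d² (so every surrogate coefficient is 2), and a 1-D chain neighborhood in place of the 26-point system. Comments refer to the line numbering of Table 1.

```python
import numpy as np
from scipy.sparse import csc_matrix

def icd_mbir(y, A, n_sweeps=10, w_reg=0.1, seed=0):
    """ICD-MBIR sketch: Lambda = I, rho(d) = d^2, 1-D chain neighborhood."""
    rng = np.random.default_rng(seed)
    M, N = A.shape
    x = rng.random(N)                        # line 1: initialize x at random
    e = y - A @ x                            # line 2: error sinogram
    for _ in range(n_sweeps):                # line 3: until convergence
        for v in rng.permutation(N):         # line 4: voxels in random order
            lo, hi = A.indptr[v], A.indptr[v + 1]     # sparse column A_{*,v}
            idx, vals = A.indices[lo:hi], A.data[lo:hi]
            theta1 = np.dot(vals, e[idx])    # line 5
            theta2 = np.dot(vals, vals)
            nbrs = [u for u in (v - 1, v + 1) if 0 <= u < N]
            a = 2.0                          # lines 6-8: quadratic surrogate
            z = (theta1 + theta2 * x[v] + w_reg * a * sum(x[u] for u in nbrs)) \
                / (theta2 + w_reg * a * len(nbrs))    # line 9
            e[idx] -= (z - x[v]) * vals      # line 10: update error sinogram
            x[v] = z                         # line 11: update voxel
    return x, e

# With A = I and no regularization, one sweep recovers x = y exactly.
x_rec, e_rec = icd_mbir(np.array([1.0, 2.0, 3.0]),
                        csc_matrix(np.eye(3)), n_sweeps=2, w_reg=0.0)
```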
- ICD-MBIR offers several advantages over other MBIR variants: (i) it requires fewer iterations to converge, thereby enabling faster runtimes, and (ii) it is general and can be easily adapted to a variety of applications with different geometries, noise statistics and image models, without the need for custom algorithmic tuning for each application.
- However, a key challenge with ICD-MBIR is that it is not easily amenable to efficient parallel execution on modern multi-core and many-core accelerators such as general-purpose graphics processing units (GPGPUs), for the following reasons. First, there is limited data parallelism within the core computations that evaluate the updated value of a voxel. From a computational standpoint, each voxel update to create a 3D image 208 involves accessing three key data structures, as illustrated in FIG. 2: (i) a column 212 of the A matrix 210, wherein the column 212 is indexed by the x and z co-ordinates of a voxel 214, (ii) the voxel neighborhood 216, which refers to the voxels adjacent to the current voxel 214 along all directions, and (iii) portions of the error sinogram, which are determined by the slice ID or y co-ordinate of the voxel 214. Since the A matrix column 212 is typically sparse (sparsity ratio of 1000:1), the per-voxel update computations are relatively small (time per voxel update: ~26 μs), and the overheads of parallelization, such as task startup time, synchronization between threads, and off-chip memory bandwidth, significantly limit performance. In summary, parallelizing computations within each voxel update yields very little performance improvement. - The present disclosure provides a specialized hardware architecture and associated control system to simultaneously improve both the runtime and energy consumption of the MBIR algorithm by exploiting its computational characteristics.
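- The per-voxel working set described above can be checked numerically. The snippet below is illustrative (the 1000:1 sparsity ratio is the one quoted in the text; the matrix sizes are arbitrary): it builds a random sparse A in compressed-sparse-column form and measures the working set of one voxel update.

```python
import scipy.sparse as sp

# A matrix with the ~1000:1 sparsity ratio quoted above:
# 10,000 sinogram entries, 5,000 voxels, 0.1% of entries nonzero.
M, N = 10_000, 5_000
A = sp.random(M, N, density=0.001, format="csc", random_state=1)

# Working set of one voxel update: the nonzeros of a single CSC column
# (expected M * density = 10 entries per column).
v = 42
working_set = A.indptr[v + 1] - A.indptr[v]
# Tiny compared with the M-entry sinogram, so per-update compute is small
# and parallelization overheads dominate.
```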
-
FIG. 3 shows a block diagram of the tomography hardware accelerator unit 302 according to one embodiment. The tomography hardware accelerator unit 302 receives as input: 2D measurement data 304, reconstructed 3D volume data 306, A matrix data 308, and error sinogram data 310. In certain embodiments, the 2D measurement data 304, reconstructed 3D volume data 306, A matrix data 308, and error sinogram 310 may be stored in memory blocks external to and operatively connected to the tomography hardware accelerator unit 302. The tomography hardware accelerator unit 302 may also include a global control unit 312 containing appropriate control registers and logic to initialize the location of the external memory blocks and generate interface signals 314 for sending/receiving inputs/outputs to and from the tomography hardware accelerator unit 302. - At a high level, operation of the tomography
hardware accelerator unit 302 can be summarized as follows. First, the global control unit 312 generates a random voxel ID (x,y,z co-ordinates). Based on the co-ordinates, the tomography hardware accelerator unit 302 retrieves the following data from the system memory 106, which is required to update the voxel 214: a column 212 of the A matrix 210, a portion of the error sinogram 310, and the voxel neighbor data 216 (which is a portion of the volume data 306). The tomography hardware accelerator unit 302 may include internal memory blocks to store these data structures. The updated value of the voxel 214 is then computed by the tomography hardware accelerator unit 302 and stored back to the external system memory 106. This process is repeated until the convergence criterion is met. - The tomography
hardware accelerator unit 302 may also comprise one or more voxel evaluation modules (VEMs) 316. Each voxel evaluation module 316 may comprise a theta evaluation module (TEM) 318, a neighborhood processing element (NPE) 320, and a voxel update element (VUE) 322. Each of the theta evaluation module 318, neighborhood processing element 320, and voxel update element 322 may comprise one or more computer processors and associated memory for performing calculations on the received data. - The
TEM 318 evaluates the variables θ1 and θ2 of the MBIR algorithm. The NPE 320 applies a complex one-to-one function on each of the neighbor voxels. The VUE 322 uses the outputs of the TEM and NPE to compute the updated value of the voxel 214 and the error sinogram 310. The VEM 316 may also contain memory blocks that store the column 212 of the A matrix 210, portions of the error sinogram 310, and the voxel neighbor data, all of which may be accessed by the TEM 318, NPE 320 and VUE 322. Since the A matrix is sparse, it may be stored as an adjacency list using First-In-First-Out (FIFO) buffers in certain embodiments. A controller present within the engine is designed to fetch the necessary data if it is not already available in the internal memory blocks of the VEM 316. - The
VEM 316 operates as follows. First, the elements of the A matrix column 212 are transferred into the TEM 318. The TEM 318 utilizes the index of the A matrix elements to address the error sinogram 310 memory to obtain the corresponding error sinogram 310 value. The TEM 318 performs a vector reduction operation on the A matrix column 212 and error sinogram 310 values to obtain θ1 and θ2. In parallel to the TEM 318, the NPE 320 operates on the data of each of the voxel's neighbors and stores the processed neighbor values in a FIFO memory located in the VEM 316. Since the TEM 318 and NPE 320 operate in parallel, the performance of the VEM 316 is maximized when their latencies are equal. This is achieved by proportionately allocating hardware resources in their implementation. The output of the TEM 318 and NPE 320 is directed to the VUE 322, which computes the updated value of the voxel 214. This involves performing a vector reduction operation on the voxel neighborhood 216, followed by multiple scalar operations. The entries in the error sinogram 310 memory are also updated based on the updated value of the voxel 214. Finally, the voxel 214 data is written back to the system memory 106. Thus, the VEM 316 efficiently computes the updated value of a voxel. - In certain embodiments, the performance of the computation engine can be further improved by operating it as a two-level nested pipeline. The first-level pipeline is within the
TEM 318. In this case, the VEM 316 leverages the pipeline parallelism across the different elements of the vector reduction. When the TEM 318 computes on a given A matrix element, the error sinogram value for the successive element is fetched from the error sinogram memory in a pipelined manner. The second-level pipelining exploits the parallelism across successive voxels. In this case, the VEM 316 concurrently transfers the data required by the subsequent voxel, even as the previous voxel is being processed by the VEM 316. Thus, both pipeline levels improve performance by overlapping data communication with computation. - Each execution of the
VEM 316 requires the A matrix column 212, the error sinogram 310 and the voxel neighbor data 216 to be transferred from the system memory 106 to the VEM 316. To minimize data transfer overhead, in certain embodiments, the VEM 316 reuses the data stored in the internal memory blocks of the VEM 316 across multiple voxels. Since voxels in a slice 218 share the same portion of the error sinogram 310 (FIG. 2), the VEM 316 constrains the sequence in which the voxels 214 are updated in the VEM as follows: first, a slice 218 is selected from the volume 208 at random; then, all voxels 214 in the slice 218 are updated in a random sequence before the next slice is chosen. In this case, the error sinogram 310 needs to be fetched only once per slice, and all voxels within the slice can re-use the data. Thus the data transfer cost for the error sinogram 310 is amortized across all voxels 214 in the slice 218. This optimization can be realized simply by modifying the global control unit 312 that generates the voxel ID. In other words, the global control unit 312 may be programmed to select voxels for updating in an order that processes a majority of voxels from the same slice together. - In certain embodiments, the
VEMs 316 are arranged as an array of L lanes, each containing a dedicated TEM 318, NPE 320, and VUE 322. To ensure convergence of the MBIR algorithm, the voxels that are to be updated in parallel are chosen to be located far apart in the 3D volume. To maximize the distance of separation, the voxels may be selected from different slices 218 that are spread evenly across the y dimension (FIG. 2) of the 3D volume 208. - In certain embodiments, as shown in
FIG. 5, the tomography hardware accelerator unit 302 restricts concurrently updated voxels to lie on a straight line in the volume. In the embodiment illustrated in FIG. 5, the line 502 is parallel to the y axis of the volume 208 (i.e., the voxels have the same x,z co-ordinates), although any straight line in the volume may be used. Since A matrix columns are indexed only by the x and z co-ordinates (FIG. 2), concurrently updated voxels 214 share the same A matrix column 212, thereby linearly reducing the A matrix data transfer time. This optimization does not impact convergence, as the slices from which the voxels 214 are picked lie sufficiently far apart (the y dimension of the volume is much larger than the number of parallel voxel updates). As shown in FIG. 3, an A matrix memory 324 may be placed outside the VEMs 316 and shared by all VEMs 316 during operation. Before each execution of the VEM array, the neighborhoods for all voxels 214 and one A matrix column 212 are transferred from the system memory 106 to the memory 324. This results in a net reduction of L−1 A matrix column transfers per execution, where L is the number of VEMs 316 in the tomography hardware accelerator unit 302. - In certain embodiments, the neighborhood voxel data transfer time may be reduced by concurrently updating the
neighborhood volume 216 of size a×b×c (in the x, y, and z directions) around the voxel 214. Each VEM 316 is then used to update one of the voxels 214 in the volume 216 in a parallel fashion. To facilitate neighbor voxel reuse, in certain embodiments, the capacity of the neighborhood memory is sized to hold all of the data for voxels in the neighborhood volume 216. Also, once the updated value of a voxel is computed, it needs to be written to the neighborhood memory within the VEM 316 (in addition to the system memory 106), as subsequent voxels in the volume use the updated value. Also, since the adjacent voxels along the y direction belong to different slices, the error sinogram memory in the VEM 316 may be replicated to hold data corresponding to each slice 218 in the volume 216. Along similar lines, voxels within the slice 218 use different A matrix columns, and correspondingly the A matrix memory is also replicated. The TEM 318, NPE 320 and VUE 322 are not replicated in the VEM 316 in such embodiments, as voxels in the volume 216 are evaluated sequentially. Finally, the global control unit 312 and VEMs 316 are modified to appropriately index these memories and evaluate all voxels within the neighborhood volume 216. - Steps of various methods described herein can be performed in any order except when otherwise specified, or when data from an earlier step is used in a later step. Exemplary method(s) described herein are not limited to being carried out by components particularly identified in discussions of those methods.
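- The control-unit ordering constraints described above (slice-major update order so the error sinogram is fetched once per slice, and concurrently updated voxels that share one A matrix column while lying in y slices spread evenly apart) can be sketched in software. The function names and parameters here are illustrative only, not from the patent.

```python
import numpy as np

def slice_major_order(nx, ny, nz, rng):
    """Visit order for one VEM: pick a slice (fixed y) at random, then
    update every voxel in it in random order before the next slice, so
    the slice's error-sinogram portion is fetched once and reused."""
    order = []
    for y in rng.permutation(ny):
        for flat in rng.permutation(nx * nz):
            x, z = divmod(int(flat), nz)
            order.append((x, int(y), z))
    return order

def parallel_lane_voxels(x, z, ny, lanes, rng):
    """Voxels updated concurrently across `lanes` VEM lanes: all share
    (x, z) -- hence one A matrix column -- and their y slices are spread
    evenly across the volume so concurrent updates stay far apart."""
    stride = ny // lanes
    offset = int(rng.integers(stride))   # random phase within one stride
    return [(x, offset + lane * stride, z) for lane in range(lanes)]

rng = np.random.default_rng(7)
order = slice_major_order(4, 3, 4, rng)              # 48 voxels, slice by slice
lanes = parallel_lane_voxels(5, 7, ny=64, lanes=8, rng=rng)
```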
- Various aspects provide more effective processing of CT data. A technical effect is to improve the functioning of a CT scanner by substantially reducing the time required to process CT data, e.g., by performing MBIR or other processes to determine voxel values. A further technical effect is to transform measured data from a CT scanner into voxel data corresponding to the scanned object.
- Various aspects described herein may be embodied as systems or methods. Accordingly, various aspects herein may take the form of an entirely hardware aspect, an entirely software aspect (including firmware, resident software, micro-code, etc.), or an aspect combining software and hardware aspects. These aspects can all generally be referred to herein as a “service,” “circuit,” “circuitry,” “module,” or “system.”
- Furthermore, various aspects herein may be embodied as computer program products including computer readable program code (“program code”) stored on a computer readable medium, e.g., a tangible non-transitory computer storage medium or a communication medium. A computer storage medium can include tangible storage units such as volatile memory, nonvolatile memory, or other persistent or auxiliary computer storage media, removable and non-removable computer storage media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data. A computer storage medium can be manufactured as is conventional for such articles, e.g., by pressing a CD-ROM or electronically writing data into a Flash memory. In contrast to computer storage media, communication media may embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transmission mechanism. As defined herein, “computer storage media” do not include communication media. That is, computer storage media do not include communications media consisting solely of a modulated data signal, a carrier wave, or a propagated signal, per se.
- The program code can include computer program instructions that can be loaded into processor 186 (and possibly also other processors), and that, when loaded into processor 186, cause functions, acts, or operational steps of various aspects herein to be performed by processor 186 (or other processor). The program code for carrying out operations for various aspects described herein may be written in any combination of one or more programming language(s), and can be loaded from disk 143 into code memory 141 for execution. The program code may execute, e.g., entirely on processor 186, partly on processor 186 and partly on a remote computer connected to network 150, or entirely on the remote computer.
- The invention is inclusive of combinations of the aspects described herein. References to “a particular aspect” (or “embodiment” or “version”) and the like refer to features that are present in at least one aspect of the invention. Separate references to “an aspect” (or “embodiment”) or “particular aspects” or the like do not necessarily refer to the same aspect or aspects; however, such aspects are not mutually exclusive, unless otherwise explicitly noted. The use of singular or plural in referring to “method” or “methods” and the like is not limiting. The word “or” is used in this disclosure in a non-exclusive sense, unless otherwise explicitly noted.
- The invention has been described in detail with particular reference to certain preferred aspects thereof, but it will be understood that variations, combinations, and modifications can be effected within the spirit and scope of the invention.
Claims (18)
1. A tomography system, comprising:
a central processing unit;
a system memory communicatively connected to the central processing unit; and
a hardware acceleration unit communicatively connected to the central processing unit and the system memory, the hardware acceleration unit configured to perform at least a portion of an MBIR process on computer tomography data.
2. The system according to claim 1 , further comprising one or more voxel evaluation modules that evaluate an updated value of a voxel given a voxel location in a reconstructed volume.
3. The system according to claim 1 , further comprising an electronic display, the display operatively connected to the central processing unit, the electronic display configured to display a reconstructed image based on the computer tomography data.
4. The system according to claim 1 , further comprising a computer tomography scanner, wherein the computer tomography scanner is configured to irradiate a test object, measure resulting radiation, and provide measured data corresponding to the resulting radiation.
5. The system according to claim 1 , wherein the hardware acceleration unit comprises a control unit which generates a pseudo-random sequence of voxel locations.
6. The system of claim 1 , wherein the hardware acceleration unit is configured to identify voxel data required to update a voxel and fetch the voxel data from the system memory.
7. The system of claim 2 , wherein the VEM contains VEM memory blocks internal to the VEM, which store the data needed to compute the updated voxel value.
8. The system of claim 2 , wherein the VEM is configured to assess the data stored in the VEM memory blocks, re-use said data stored in the VEM memory blocks across multiple voxel evaluations, and partially fetch unavailable data from the system memory.
9. The system of claim 2 , wherein the VEM is configured to perform data transfer operations and data processing operations in parallel.
10. The system of claim 2 wherein the VEM is configured to perform data transfer operations and data processing operations in a pipelined manner.
11. The system of claim 2 , wherein the hardware accelerator unit fetches data required for a voxel from the system memory, while computations corresponding to a different voxel are in progress.
12. The system of claim 2 , wherein the hardware accelerator unit comprises a plurality of VEMs, the VEMs configured to update multiple voxels in parallel.
13. The system of claim 12 , wherein the sequence of voxels updated on a VEM is constrained to enhance data reuse within the accelerator.
14. The system of claim 12 in which at least one next voxel processed on a given VEM is constrained to lie within a common slice as the previous voxel processed on the VEM, thereby enabling the error sinogram memory to be reused.
15. The system of claim 12 in which voxels updated concurrently on multiple VEMs are constrained such that they share at least one entry of an A matrix of the tomography data.
16. The system of claim 15 where the said shared entry of the A matrix is fetched only once from the system memory and used by multiple VEMs.
17. The system of claim 12 in which adjacent voxels are updated on the same VEM, enabling neighborhood voxel data to be shared between the voxels.
18. The system of claim 12 where each VEM is configured to update a voxel neighborhood around a given voxel, the neighborhood comprising voxels adjacent to the given voxel.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/033,065 US20210110581A1 (en) | 2015-03-05 | 2020-09-25 | Tomographic reconstruction system |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201562129018P | 2015-03-05 | 2015-03-05 | |
US15/063,054 US10163232B2 (en) | 2015-03-05 | 2016-03-07 | Tomographic reconstruction system |
US16/233,066 US20190251712A1 (en) | 2015-03-05 | 2018-12-26 | Tomographic reconstruction system |
US17/033,065 US20210110581A1 (en) | 2015-03-05 | 2020-09-25 | Tomographic reconstruction system |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/233,066 Continuation US20190251712A1 (en) | 2015-03-05 | 2018-12-26 | Tomographic reconstruction system |
Publications (1)
Publication Number | Publication Date |
---|---|
US20210110581A1 true US20210110581A1 (en) | 2021-04-15 |
Family
ID=56850666
Family Applications (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/063,054 Expired - Fee Related US10163232B2 (en) | 2015-03-05 | 2016-03-07 | Tomographic reconstruction system |
US16/233,066 Abandoned US20190251712A1 (en) | 2015-03-05 | 2018-12-26 | Tomographic reconstruction system |
US17/033,065 Abandoned US20210110581A1 (en) | 2015-03-05 | 2020-09-25 | Tomographic reconstruction system |
Family Applications Before (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/063,054 Expired - Fee Related US10163232B2 (en) | 2015-03-05 | 2016-03-07 | Tomographic reconstruction system |
US16/233,066 Abandoned US20190251712A1 (en) | 2015-03-05 | 2018-12-26 | Tomographic reconstruction system |
Country Status (1)
Country | Link |
---|---|
US (3) | US10163232B2 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2024054952A (en) * | 2022-10-06 | 2024-04-18 | 浜松ホトニクス株式会社 | Image processing device and image processing method |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080095300A1 (en) * | 2006-10-05 | 2008-04-24 | General Electric Company | System and method for iterative reconstruction using parallel processing |
US8503754B2 (en) * | 2010-06-07 | 2013-08-06 | Calgary Scientific Inc. | Parallel process for level set segmentation of volume data |
US8379948B2 (en) * | 2010-12-21 | 2013-02-19 | General Electric Company | Methods and systems for fast iterative reconstruction using separable system models |
- 2016-03-07: US 15/063,054 filed; granted as US 10,163,232 B2 (Expired - Fee Related)
- 2018-12-26: US 16/233,066 filed; published as US 2019/0251712 A1 (Abandoned)
- 2020-09-25: US 17/033,065 filed; published as US 2021/0110581 A1 (Abandoned)
Also Published As
Publication number | Publication date |
---|---|
US10163232B2 (en) | 2018-12-25 |
US20190251712A1 (en) | 2019-08-15 |
US20160260230A1 (en) | 2016-09-08 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | STPP | Information on status: patent application and granting procedure in general | APPLICATION DISPATCHED FROM PREEXAM, NOT YET DOCKETED |
| 2021-02-01 | AS | Assignment | Owner: NATIONAL SCIENCE FOUNDATION, VIRGINIA. Free format text: CONFIRMATORY LICENSE; ASSIGNOR: PURDUE UNIVERSITY; REEL/FRAME: 056269/0119 |
| | STPP | Information on status: patent application and granting procedure in general | DOCKETED NEW CASE - READY FOR EXAMINATION |
| | STPP | Information on status: patent application and granting procedure in general | NON FINAL ACTION MAILED |
| | STCB | Information on status: application discontinuation | ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |