CN101596113B - Computer tomography (CT) parallel reconstructing system and imaging method thereof - Google Patents

Computer tomography (CT) parallel reconstructing system and imaging method thereof

Info

Publication number
CN101596113B
Authority
CN
China
Prior art keywords
reconstructed image
angle
centroid
child node
reconstruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN2008101144781A
Other languages
Chinese (zh)
Other versions
CN101596113A (en
Inventor
孟凡勇
陈飞国
王维
李静海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Process Engineering of CAS
Original Assignee
Institute of Process Engineering of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Process Engineering of CAS filed Critical Institute of Process Engineering of CAS
Priority to CN2008101144781A priority Critical patent/CN101596113B/en
Publication of CN101596113A publication Critical patent/CN101596113A/en
Application granted granted Critical
Publication of CN101596113B publication Critical patent/CN101596113B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention provides a computed tomography (CT) parallel reconstruction system and an imaging method thereof. The system comprises a front-end sampler, a central node, and a plurality of sub-nodes connected to the central node, each sub-node being equipped with a graphics processing unit (GPU). The imaging method comprises the following steps: (1) the reconstructed image is divided into a plurality of block regions; (2) the central node receives the raw projection data acquired by the front-end sampler at each angle; (3) the central node assigns computation tasks to the sub-nodes, and each sub-node that has been assigned a task computes the reconstruction values of one or more block regions; (4) the sub-nodes complete their reconstruction computations; and (5) the central node combines the results into a complete reconstructed image. By dividing the reconstructed image into sub-regions, the invention fully exploits the parallelism inherent in CT scanning and reconstruction and performs GPU-parallel reconstruction while data are still being acquired. Reconstruction time falls from the order of minutes in the prior art to the order of milliseconds, so that the reconstructed cross-sectional image of the measured object can be displayed accurately in near real time.

Description

A CT parallel reconstruction system and imaging method
Technical field
The invention belongs to the technical field of nondestructive testing. Specifically, it relates to a CT parallel reconstruction imaging system and a parallel reconstruction imaging method.
Background art
Computed tomography (CT) is widely used in medical imaging and industrial nondestructive inspection. Its basic principle is to acquire the projected attenuation of radiation by the measured object at different angles and then, using a filtered back-projection (FBP) algorithm, reconstruct two-dimensional or three-dimensional tomographic information about the object. During image reconstruction, the FBP algorithm first filters the raw data, then computes the detector pixel that corresponds to each reconstruction coordinate point, and assigns the detector data at the corresponding angle to that point; this last step is back projection. Back projection involves a large number of trigonometric and address-lookup operations and is time-consuming. For example, reconstructing a tomographic image of 1024 × 1024 pixels from 1800 projection angles requires more than $10^{10}$ trigonometric operations, and performing the reconstruction directly takes more than 1 minute. When CT is applied to measurements in process engineering, the measured object is usually in motion, so only quasi-real-time scanning and reconstruction on the millisecond scale can provide true information about it. One way to speed up image reconstruction is to precompute the addresses that involve trigonometric operations and save them in an address file that is queried during reconstruction. This effectively shortens reconstruction time, but at the cost of greatly increased memory use: for the 1024 × 1024 × 1800 reconstruction above, if each address is stored as a float (4 bytes), the full address table occupies about 7.5 GB of memory. Another drawback of the address lookup table method is that whenever the CT geometric parameters change, all addresses must be recalculated.
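The memory figure quoted for the address lookup table can be checked with a few lines of arithmetic. This is only a back-of-envelope sketch; the constant and function names are illustrative and do not come from the patent.

```python
# Back-of-envelope memory estimate for a precomputed address lookup table.
IMAGE_SIDE = 1024        # reconstructed image is 1024 x 1024 pixels
NUM_ANGLES = 1800        # number of projection angles
BYTES_PER_ADDRESS = 4    # one float-type address per (pixel, angle) pair


def lookup_table_bytes(side, angles, bytes_per_entry):
    """Total bytes needed to store one address per pixel per angle."""
    return side * side * angles * bytes_per_entry


table_bytes = lookup_table_bytes(IMAGE_SIDE, NUM_ANGLES, BYTES_PER_ADDRESS)
print(f"{table_bytes / 1e9:.2f} GB")
```

The result, roughly 7.5 × 10^9 bytes, matches the "about 7.5 GB" stated in the text.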
On the other hand, high-performance numerical computing on graphics processing units (GPUs) has become a research focus in recent years. A CPU typically devotes more than half of its chip area (60% in some processors) to cache and control logic, whereas a GPU devotes the overwhelming majority of its transistors to numerical computation. In addition, the total transistor count of a GPU is much larger than that of a CPU: the dual-core AMD Athlon 64 FX CPU has 227 million transistors, the quad-core Intel DP processor Harpertown (45 nm, code name Penryn) has 820 million transistors, and the NVIDIA GeForce 9800 GX2 (G92 core) GPU has 1.508 billion transistors. Because the GPU concentrates on numerical computation, its computing power is far greater than that of a CPU. GPUs with a peak computing power of 500 GFlops are already on the market, about 100 times the peak performance of a 16-processor C90 supercomputer (K. Mueller, F. Xu, and N. Neophytou, "Why do GPUs work so well for acceleration of CT?", SPIE Electronic Imaging '07, San Jose, January 2007). GPUs are therefore particularly well suited to fields that combine a high degree of parallelism with a large volume of numerical computation.
Summary of the invention
The object of the invention is to address the deficiencies of existing CT image reconstruction systems by providing a multi-process, multi-threaded CT image parallel reconstruction system based on a graphics processing unit (GPU) cluster, together with a multi-process, multi-threaded parallel reconstruction method designed around the characteristics of the GPU cluster, so that image reconstruction time is greatly shortened and reconstructed images can be displayed in quasi real time on the millisecond scale.
To achieve the above object, the CT parallel reconstruction system provided by the invention comprises a front-end sampler connected to the CT detector array, a central node connected to the front-end sampler, and a plurality of sub-nodes connected to the central node, each sub-node being equipped with a graphics processing unit. The central node divides the CT reconstructed image into a plurality of block regions and assigns the computation tasks of the divided block regions to the sub-nodes; each sub-node computes the reconstruction value of every pixel in the block regions assigned to it.
In the above technical solution, the central node is also equipped with a graphics processing unit.
To achieve the above object, the imaging method based on the CT parallel reconstruction system provided by the invention comprises the following steps:
1) the reconstructed image is divided into a plurality of block regions;
2) the central node receives the raw projection data acquired by the front-end sampler at each angle;
3) the central node assigns computation tasks to the nodes and transmits the raw projection data to each sub-node that has been assigned a task; each such sub-node computes the reconstruction values of one or more block regions;
4) each sub-node completes the reconstruction computation for every pixel in its assigned block regions and returns the results to the central node;
5) the central node combines the results from the nodes into a complete reconstructed image.
In the above technical solution, in step 1), the number of pixels contained in each block region is a multiple of 64.
In the above technical solution, in step 1), the pixels of each block region form a square array.
In the above technical solution, in step 1), each block region contains 256 pixels arranged as a 16*16 square array.
In the above technical solution, in step 2), the front-end sampler acquires the raw projection data at each angle in turn during the CT scan, and the central node receives the raw projection data in successive batches, the data for one angle forming one batch. After receiving the raw projection data of every detector element at one angle, the central node dynamically assigns the computation task to a node whose current computational load is low, based on the current load of each node.
In the above technical solution, in step 3), each time the central node receives the raw projection data for one angle, it assigns the computation tasks for all block regions of the reconstructed image at that angle to the same node; in step 4), each node returns the reconstructed image for each angle to the central node; and in step 5), the central node combines the reconstructed images for all angles into the final reconstructed image.
In the above technical solution, in step 3), each time the central node receives the raw projection data for one angle, it assigns the computation tasks for the individual block regions of the reconstructed image at that angle to different nodes; in step 4), each node returns the reconstructed image of the block regions it is responsible for to the central node; and in step 5), the central node first combines the block-region images returned by the nodes into the reconstructed image for each angle and then obtains the final reconstructed image.
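The two assignment strategies described above (a whole angle per node, or separate block regions per node) should yield the same final image, because the central node only stitches and sums. The following minimal sketch illustrates this with stand-in data; `angle_image`, `split_into_blocks`, `assemble`, and the sizes are hypothetical names and values chosen for illustration, not the patent's implementation.

```python
import numpy as np

BLOCK = 4   # hypothetical block side (the patent prefers 16)
SIDE = 8    # hypothetical image side, a multiple of BLOCK


def angle_image(angle_index):
    """Stand-in for one per-angle reconstruction; the values are arbitrary."""
    rng = np.random.default_rng(angle_index)
    return rng.integers(0, 10, size=(SIDE, SIDE)).astype(np.float32)


def split_into_blocks(image):
    """Return {(block_row, block_col): sub-array} for every block region."""
    return {(r // BLOCK, c // BLOCK): image[r:r + BLOCK, c:c + BLOCK].copy()
            for r in range(0, SIDE, BLOCK) for c in range(0, SIDE, BLOCK)}


def assemble(blocks):
    """Central-node step: stitch returned block regions into one image."""
    image = np.zeros((SIDE, SIDE), dtype=np.float32)
    for (br, bc), tile in blocks.items():
        image[br * BLOCK:(br + 1) * BLOCK, bc * BLOCK:(bc + 1) * BLOCK] = tile
    return image


angles = range(6)
# Strategy 1: one node returns the whole per-angle image.
whole = sum(angle_image(a) for a in angles)
# Strategy 2: different nodes return individual block regions per angle.
tiled = sum(assemble(split_into_blocks(angle_image(a))) for a in angles)
print(np.array_equal(whole, tiled))
```

The choice between the two strategies therefore affects only load balance and data traffic, not the reconstructed result.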
In the above technical solution, in step 3), each sub-node is responsible for the reconstruction computation of its block region, and each thread of the sub-node's graphics processing unit is responsible for computing the reconstruction value of one pixel.
Compared with the prior art, the invention provides a multi-process, multi-threaded CT image parallel reconstruction system based on a graphics processing unit (GPU) cluster, together with a multi-process, multi-threaded CT image parallel reconstruction method developed around the hardware characteristics of the GPU cluster. By dividing the reconstructed image into sub-regions and thread blocks, it fully exploits the parallelism of CT scanning and reconstruction and performs GPU-parallel reconstruction while data are being acquired. The invention can increase reconstruction speed by more than 1000 times, reducing reconstruction time from the order of minutes in the prior art to the order of milliseconds, so that the reconstructed tomographic image of the measured object can be displayed in quasi real time.
Description of the drawings
Embodiments of the invention are described in detail below with reference to the accompanying drawings, in which:
Fig. 1 is a schematic structural diagram of the image reconstruction system of the invention, where 1 is the radiation source, 2 is the measured object, 3 is the turntable, and 4 is the detector.
Fig. 2 is a schematic diagram of the GPU operating principle in the invention.
Fig. 3 is the FBP reconstruction of third-generation CT scan data produced by the GPU-based parallel reconstruction system of the invention; the measured objects are a plexiglass tube and a wedge-shaped model.
Fig. 4 is the reconstruction of third-generation CT scan data produced by standard FBP on a single CPU; the measured objects are a plexiglass tube and a wedge-shaped model.
Fig. 5 is the reconstruction of fifth-generation CT scan data produced by the GPU-based parallel reconstruction system of the invention; the measured object is a plexiglass disc model.
Fig. 6 is the reconstruction of fifth-generation CT scan data produced by standard FBP on a single CPU; the measured object is a plexiglass disc model.
Detailed description of the embodiments
Embodiment 1
This embodiment provides a parallel reconstruction system based on a GPU cluster. It comprises one front-end data acquisition computer connected to the detector of the CT system, and N back-end data processing computers (N is not specifically limited) that run the Linux operating system and form a local area network, each equipped with a Tesla™ graphics processing unit produced by NVIDIA. The system also includes the hardware required for CT scanning, such as the radiation source, detector array, and turntable. The central node is connected to a storage device and an image output device.
In this embodiment, each GPU contains L multiprocessors, each of which can launch M threads for computation simultaneously (M differs between GPU models). In theory, therefore, with a CT image reconstruction method designed around the characteristics of the GPU, N GPUs can raise computation speed by a factor of L × N × M.
This embodiment also provides a parallel reconstruction method based on the GPU cluster, comprising the following steps:
1) Determine the reconstructed image size and allocate the corresponding memory on each node. The allocated memory size is the reconstruction size multiplied by the size of the data type. For example, if the reconstructed image is 1024*1024 pixels and the data type is float (4 bytes), each node needs to allocate 1024*1024*4 bytes to buffer the image reconstructed by that node.
2) Pre-allocate the reconstruction computation tasks. Each node is assigned the reconstruction computation for the data of one projection angle, and the reconstructed image of each node is further divided into m*m sub-image regions, each consisting of n*n pixels. The GPU on each node is divided into m*m thread blocks; each sub-image region corresponds to one GPU thread block, each thread block is divided into n*n threads, and each thread computes one pixel.
3) Set the turntable to its initial angle. The turntable has two operating modes: in one, the source and detector are stationary and the turntable rotates the measured object; in the other, the measured object is stationary and the turntable rotates the source and detector around it. The choice between the two has no material effect on image reconstruction;
4) Record the raw projection data at the current angle and send them to the central node. The central node probes the computational load of each sub-node, assigns the computation tasks accordingly, and sends the raw projection data for the current angle to the sub-node x that has been assigned the task. The computation of the reconstructed image for the current angle may be assigned to a single sub-node or shared among several sub-nodes; when the central node also has a GPU, the central node itself may be assigned computation tasks. To improve computational efficiency, in this step the GPU of one sub-node is responsible for computing one block region, and each GPU thread computes the reconstruction value of one pixel in that block region. Note, however, that this task assignment is not the only possibility; for example, each thread could compute two or more pixels, as those skilled in the art will readily understand.
5) Sub-node x uploads the received raw projection data for the current angle to the GPU memory of that node and filters the raw projection data using a Fourier transform. Then, according to the thread block regions divided in step 2, it allocates the corresponding sub-image storage in GPU memory and invokes the GPU program, using the graphics processing unit connected to the sub-node to compute in parallel the reconstruction values in the specified block region. Each processor unit in the graphics processing unit is responsible for one thread region within the specified block region;
6) Call the kernel function that drives the GPU and, according to the thread blocks divided in step 2, launch the reconstruction threads in parallel, each thread computing one pixel of the reconstructed image;
7) Once all threads are detected to have finished, merge the sub-image data and return them to the central node for synthesis of the final reconstructed image;
8) Determine whether all angles have been traversed. If not, advance the turntable to the next angle by a fixed step and return to step 4). If so, the central node combines the reconstructed images received from the nodes, outputs the result to a display device as an image, and stores it on a designated storage device such as a hard disk, optical disc, or flash drive.
Note that the nodes take different amounts of time to complete their computation tasks. To further improve computational efficiency, in step 4 the central node can therefore assign tasks dynamically and in real time according to the load of each sub-node, selecting the node x with the lowest computational load at the current moment, so that the parallel reconstruction system achieves the best possible computation speed.
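A greedy least-loaded policy of the kind described here can be sketched as follows. The load measure used (the time at which a node becomes free, derived from an assumed relative speed) and the name `assign_angles` are assumptions for illustration; the patent only requires that the node with the lowest current load be chosen.

```python
def assign_angles(num_angles, node_speeds):
    """Greedy dispatch: each new angle goes to the node that is free earliest.

    node_speeds maps a node id to a hypothetical relative speed
    (angles processed per unit time). Ties go to the lowest node id.
    """
    busy_until = {node: 0.0 for node in node_speeds}
    assignment = []
    for _ in range(num_angles):
        # The least-loaded node is the one whose queue drains first.
        node = min(busy_until, key=lambda n: (busy_until[n], n))
        busy_until[node] += 1.0 / node_speeds[node]
        assignment.append(node)
    return assignment, busy_until


# Node 2 is assumed to be twice as fast as node 1.
assignment, busy_until = assign_angles(6, {1: 1.0, 2: 2.0})
print(assignment)
```

With these assumed speeds the faster node receives twice as many angles as the slower one, and both finish at the same time.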
In the invention, an improper choice of the number of threads per block can cause GPU memory address conflicts. With the NVIDIA Tesla C870 GPU, register memory conflicts are avoided to the greatest extent when the number of threads in each block is a multiple of 64. If a block has too few threads, some registers in the processor sit idle while threads perform read/write and synchronization operations, and computational performance is poor. As the number of threads in a block increases, however, the number of registers available to each thread decreases, and performance also drops. The number of threads per block therefore has an optimum, which for the reconstruction system of this embodiment is 256. To suit image reconstruction, the 256 threads are arranged as a 16*16 square array, each array corresponding to one part of the reconstructed image. If there are too few blocks, some multiprocessors (processor units) sit idle; in general the number of blocks should be at least twice the number of multiprocessors, so that while one block performs read/write and synchronization operations, the multiprocessor can switch to computing another block. CT reconstructed images are generally square, with side length B typically 64, 128, 256, 512, 1024, or 2048 pixels. Because the threads in a block are generally arranged as a 16*16 matrix, the blocks are correspondingly arranged as an m*m matrix with m = B/16; for a reconstructed image of 1024*1024 pixels, the number of blocks is 64*64.
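The relationship between block index, thread index, and pixel described above can be written out explicitly. This is a plain-Python sketch of the index arithmetic only (a GPU kernel would derive the same indices from its block and thread identifiers); the function names are illustrative.

```python
THREADS_PER_SIDE = 16  # each block is a 16 x 16 square of threads


def blocks_per_side(image_side):
    """m = B / 16 for a square image of side B pixels."""
    if image_side % THREADS_PER_SIDE:
        raise ValueError("image side must be a multiple of 16")
    return image_side // THREADS_PER_SIDE


def pixel_of(block_row, block_col, thread_row, thread_col):
    """Pixel (row, col) computed by one thread of one block."""
    return (block_row * THREADS_PER_SIDE + thread_row,
            block_col * THREADS_PER_SIDE + thread_col)


m = blocks_per_side(1024)
print(m, m * m, pixel_of(m - 1, m - 1, 15, 15))
```

For a 1024*1024 image this gives m = 64, so 64*64 = 4096 blocks of 256 threads each cover every pixel exactly once.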
The division of the reconstruction image region is an important step in improving computational efficiency in the invention. The number of threads per block is a multiple of 64, with 256 and 512 being preferred values. In addition, to facilitate targeted reconstruction (reconstruction carried out on a specific part of the reconstructed image, which is often used in practice), each block region of the reconstructed image is made square, and a block of 16*16 = 256 threads is best. A block could also be arranged as 1*256 or in some other non-square layout, but such layouts are inconvenient for targeted reconstruction.
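One reason square blocks suit targeted reconstruction is that a rectangular region of interest maps onto a small, compact set of blocks. The helper below is a hypothetical illustration of that selection (the name `blocks_covering` and the half-open pixel ranges are assumptions, not part of the patent).

```python
def blocks_covering(row_start, row_stop, col_start, col_stop, block=16):
    """Block indices that intersect pixel rows [row_start, row_stop)
    and pixel columns [col_start, col_stop)."""
    rows = range(row_start // block, (row_stop - 1) // block + 1)
    cols = range(col_start // block, (col_stop - 1) // block + 1)
    return [(r, c) for r in rows for c in cols]


# A 40 x 16 pixel region of interest touches only three 16 x 16 blocks.
roi_blocks = blocks_covering(100, 140, 0, 16)
print(roi_blocks)
```

Only the listed blocks would need to be launched to reconstruct that region.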
With a suitable division, the invention lets each thread compute one pixel, which makes full use of the SIMD (single instruction, multiple data) architecture for parallel computation and increases the parallelism of the reconstruction.
Embodiment 2
In this embodiment, a plexiglass tube and a wedge-shaped model are measured with a third-generation single-source equidistant fan-beam CT. The detector has 1536 pixels, each 0.4 mm wide. A stepper motor drives the turntable through a full 360-degree scan around the measured object, acquiring projection data at 3600 angles in total. The raw data are therefore a 1536*3600 two-dimensional matrix stored as int values, and each raw data file is about 22 MB. The reconstructed image, image, is 1024*1024. The image parallel reconstruction system of the invention has 30 nodes, each equipped with one GPU.
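The stated raw data size follows directly from the matrix dimensions. The sketch below assumes 4-byte integers (`int32`), which the text implies but does not state explicitly.

```python
import numpy as np

NUM_DETECTOR_PIXELS = 1536
NUM_ANGLES = 3600

# One reading per detector pixel per angle; int32 (4 bytes) is an assumption.
sinogram = np.zeros((NUM_ANGLES, NUM_DETECTOR_PIXELS), dtype=np.int32)
raw_bytes = sinogram.nbytes
print(f"{raw_bytes / 1e6:.1f} MB")
```

This gives about 22 MB, in line with the file size quoted in the embodiment.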
As shown in Fig. 1, the front-end sampler controls the rotation of the turntable and the synchronized operation of the radiation source and the detector. After the rays emitted by radiation source 1 pass through measured object 2, the attenuated ray signal is collected by detector 4 and passed to the front-end sampler. The front-end sampler performs analog-to-digital conversion, turning the analog voltage signal into a digital signal, and then passes the data in real time to node 0. Node 0 (the central node) is responsible for distributing the raw data to the other 29 nodes and for collecting the reconstructed data; the synthesis of the final reconstructed image is also carried out by node 0, and the final image data are stored in binary form. According to the characteristics of the hardware system and the size of the reconstructed image, the reconstructed image is divided into 64*64 blocks, each consisting of 16*16 threads, as shown in Fig. 2.
The GPU-based parallel reconstruction method is as follows:
1) Initialize the image matrix image as a zero matrix. After the front-end sampler reads the attenuation of the rays by the measured object at the first angle (a 1536*1 row vector, denoted $P_1$), $P_1$ is transmitted over gigabit Ethernet to node 0 of the back-end data processing cluster;
2) After detecting that the data transfer has finished, node 0 probes the computational load of each node at that moment and sends $P_1$ to the least-loaded node x. The probing method is as follows: before sending a new computation task, the central node first issues an instruction to count the number of pixels each node has finished computing; the new task is sent to the node that has finished the most pixels, that is, the node with the lowest current computational load.
3) Node x calls the GPU to filter $P_1$ using an FFT. The filter may be the standard ramp filter, or a filter such as Ram-Lak, cosine, or Hamming may be selected here;
4) Node x initializes the GPU on that node, uploads $P_1$ to GPU memory, and calls the data processing kernel. The GPU threads compute in parallel the reconstruction regions under their charge (that is, the regions corresponding to the blocks into which the reconstructed image is divided, with each thread computing one pixel) using the formula below, where d is the projected detector pixel corresponding to point (x, y) of the reconstructed image, x and y are image coordinates, D is the source-to-detector distance (1000 mm in this example), a is the pixel width (0.4 mm in this example), and β is the projection angle (β = 0 for the first angle in this step):
$$ d = \frac{D \times (x \cos\beta + y \sin\beta)}{(D + x \sin\beta - y \cos\beta) \times a} $$
5) Assign the value $P_{1,d}$ of the detector pixel computed in step 4 to point (x, y) of the reconstructed image;
6) Synchronize all threads on this GPU to ensure that every thread has completed the computation of step 4 and the assignment of step 5, giving the 1024*1024 matrix ima{1};
7) Node x sends the reconstruction matrix ima{1} for the first angle to node 0; once node 0 detects that the data have been received, it adds ima{1} to image element by element;
8) Drive the stepper motor to rotate 0.1 degree and, following steps 1 to 7, acquire the projection attenuation data at the next angle and reconstruct the projection data at the current angle. When data acquisition and reconstruction have been completed for all 3600 angles, node 0 has also completed the operation $\mathrm{image} = \sum_{n=1}^{3600} \mathrm{ima}\{n\}$, yielding the image matrix of the reconstructed image;
9) Finally, apply post-processing operations such as smoothing and removal of negative values to the reconstructed image, send the image matrix to the image display computer, and display the final reconstructed image.
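The per-angle steps above (filter, compute d for every pixel, assign the detector value, accumulate) can be sketched on the CPU with NumPy, where one array element plays the role of one GPU thread. This is a minimal sketch under stated assumptions, not the patent's kernel: the centred pixel grid with physical spacing `pitch`, the nearest-neighbour rounding of d, measuring d from the central detector element, and the unpadded frequency-domain ramp are all choices made here for illustration, and only the steps spelled out in the embodiment are reproduced.

```python
import numpy as np


def ramp_filter(projection):
    """Simplified frequency-domain ramp filter for one projection row."""
    freqs = np.abs(np.fft.fftfreq(len(projection)))
    return np.real(np.fft.ifft(np.fft.fft(projection) * freqs))


def detector_index(x, y, beta, source_to_detector, pixel_width):
    """d = D (x cos b + y sin b) / ((D + x sin b - y cos b) a), as in step 4."""
    num = source_to_detector * (x * np.cos(beta) + y * np.sin(beta))
    den = (source_to_detector + x * np.sin(beta) - y * np.cos(beta)) * pixel_width
    return num / den


def backproject_angle(filtered, beta, side, pitch, source_to_detector, pixel_width):
    """Per-angle image ima{n}: each pixel takes the filtered value at its d."""
    half = (side - 1) / 2.0
    coords = (np.arange(side) - half) * pitch          # assumed centred grid
    x, y = np.meshgrid(coords, coords, indexing="xy")
    d = detector_index(x, y, beta, source_to_detector, pixel_width)
    idx = np.rint(d).astype(int) + len(filtered) // 2  # offset from centre element
    valid = (idx >= 0) & (idx < len(filtered))         # rays that hit the detector
    image = np.zeros((side, side), dtype=np.float32)
    image[valid] = filtered[idx[valid]]
    return image


def reconstruct(projections, betas, side, pitch, source_to_detector, pixel_width):
    """Central-node accumulation: image = sum over angles of ima{n}."""
    image = np.zeros((side, side), dtype=np.float32)
    for row, beta in zip(projections, betas):
        image += backproject_angle(ramp_filter(row), beta, side, pitch,
                                   source_to_detector, pixel_width)
    return image


# Tiny check of the geometry at beta = 0 with D = 1000 mm and a = 0.4 mm.
demo = backproject_angle(np.arange(11, dtype=np.float32), 0.0,
                         side=5, pitch=1.0,
                         source_to_detector=1000.0, pixel_width=0.4)
print(demo[2])
```

In the demo, the centre pixel maps to d = 0 and therefore picks up the value of the central detector element, which is the behaviour the formula predicts for a point on the rotation axis.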
Fig. 3 shows the two-dimensional image reconstructed with the invention, and Fig. 4 shows the reconstruction result of the standard FBP algorithm on a single CPU. Because the two rest on exactly the same theoretical basis, the image quality is identical.
With the reconstruction system and algorithm provided by the invention, reconstruction takes about 0.5 ms per angle, and reconstruction of all angles takes about 40 ms. Reconstruction on a single CPU takes about 160 seconds, and reconstruction on a 30-node CPU cluster takes about 8 seconds. The invention therefore improves reconstruction efficiency by about 4000 times relative to a single CPU and about 200 times relative to the 30-node CPU cluster.
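The speed-up factors follow from the quoted timings. A short check, with an illustrative helper name:

```python
def speedup(baseline_seconds, accelerated_seconds):
    """Ratio of baseline time to accelerated time."""
    return baseline_seconds / accelerated_seconds


GPU_CLUSTER_SECONDS = 0.040   # about 40 ms for all angles

single_cpu_gain = speedup(160.0, GPU_CLUSTER_SECONDS)   # single CPU: about 160 s
cpu_cluster_gain = speedup(8.0, GPU_CLUSTER_SECONDS)    # 30-node CPU cluster: about 8 s
print(round(single_cpu_gain), round(cpu_cluster_gain))
```

The ratios come out at roughly 4000 and 200, matching the figures in the text.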
Embodiment 3
In this embodiment, a plexiglass disc model is measured with a fifth-generation equidistant fan-beam CT. The CT system has 18 source-detector pairs and can therefore acquire raw data at 18 projection angles simultaneously. Each detector has 640 pixels, each 0.4 mm wide. The sources and detectors are mounted in pairs on the turntable, and the data reach the front-end sampler through a slip ring. The drive system rotates the turntable at 667 rpm for a full 360-degree scan around the measured object, acquiring projection data at 900 angles in total. The raw data are therefore a 640*900 two-dimensional matrix. The reconstructed image, image, is 1024*1024. The image parallel reconstruction system of the invention has 30 nodes, each equipped with one GPU; the reconstructed image is divided into 64*64 blocks, each consisting of 16*16 threads (one pixel per thread), as shown in Fig. 2.
The GPU-based parallel reconstruction method is as follows:
10) Initialize the image matrix image as a zero matrix. The front-end sampler simultaneously reads 18 sets of projection data, for angles 1, 51, 101, ..., 851, 900 (each a 640*1 row vector, denoted $P_b$, $b \in \{1, 51, 101, \ldots, 851, 900\}$), and transmits $P_b$ over gigabit Ethernet to node 0 of the back-end data processing cluster;
11) After detecting that the data transfer has finished, node 0 probes the current computational load of each node and sends $P_b$ to the 18 least-loaded sub-nodes $G_n$;
12) Each node $G_n$ calls its GPU to filter $P_b$ using an FFT. The filter may be the standard ramp filter, or a filter such as Ram-Lak, cosine, or Hamming may be selected here;
13) Initialize the GPU on node $G_n$, upload $P_b$ to GPU memory, and call the data processing kernel. The threads compute in parallel the regions under their charge (that is, the regions corresponding to the threads in the blocks into which the reconstructed image is divided) using the formula below, where d is the projected detector pixel corresponding to point (x, y) of the reconstructed image, D is the source-to-detector distance (1200 mm in this example), β is the projection angle, and a is the pixel width (0.4 mm in this example):
$$ d = \frac{D \times (x \cos\beta + y \sin\beta)}{(D + x \sin\beta - y \cos\beta) \times a} $$
14) Assign the value of the detector pixel computed in step 13, namely $P_{b,d}$, to point (x, y);
15) Synchronize all threads on this GPU to ensure that every thread has completed the computation of step 13 and the assignment of step 14, giving 18 matrices $I_b$ of size 1024*1024, $b \in \{1, 51, 101, \ldots, 851, 900\}$;
16) Each sub-node that has finished its computation sends its reconstruction matrix $I_b$ to node 0; once node 0 detects that the data have been received, it adds $I_b$ to image element by element;
17) The turntable drive system rotates the source-detector assembly by 0.4 degree and, following steps 10 to 16, the 18 sets of projection attenuation data at the next angle are acquired and reconstructed. After the data at all 50 angular positions, 900 sets of projection data in total, have been acquired and reconstructed, node 0 has also completed the operation $\mathrm{image} = \sum_{n=1}^{900} I_n$, yielding the image matrix of the reconstructed image;
18) Apply post-processing operations such as smoothing and removal of negative values to the reconstructed image, send the image matrix to the image display computer, and display the final reconstructed image.
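The way 18 simultaneous source-detector pairs cover 900 projection angles in 50 turntable positions can be sketched as an index schedule. The layout used here, angle index = 1 + step + 50 × pair, is an assumption based on evenly spaced pairs and 0.4-degree steps; under it the first batch runs from 1 to 851. The function name is illustrative.

```python
NUM_PAIRS = 18          # source-detector pairs acquiring simultaneously
STEPS_PER_SCAN = 50     # turntable positions, 0.4 degree apart
ANGLE_STEP_DEG = 0.4


def angle_batch(step):
    """1-based angle indices acquired together at one turntable position."""
    return [1 + step + STEPS_PER_SCAN * pair for pair in range(NUM_PAIRS)]


first_batch = angle_batch(0)
all_indices = sorted(b for s in range(STEPS_PER_SCAN) for b in angle_batch(s))
print(first_batch[:3], first_batch[-1], len(all_indices))
```

Over the 50 positions every angle index from 1 to 900 is visited exactly once, and index b corresponds to a projection angle of (b - 1) × 0.4 degrees.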
Fig. 5 shows the two-dimensional image reconstructed with the invention, and Fig. 6 shows the reconstruction result of the standard FBP algorithm on a single CPU.
Because this CT system has 18 source-detector pairs, the turntable needs to rotate only 1/18 of a revolution to complete data acquisition. Sampling the projections at each angular position takes about 0.1 ms, and the complete sampling takes about 5 ms. With this GPU-cluster-based image parallel reconstruction system, reconstruction at one angular position takes about 0.2 ms, and image reconstruction for all positions takes about 10 ms. Data acquisition and reconstruction are not performed serially: once the 18 projections at one angular position have been acquired, their reconstruction proceeds while the data for the next position are being acquired. Acquisition and reconstruction therefore run in parallel, and scanning and reconstructing one slice takes 12 ms in total.
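The acquisition time can be cross-checked against the rotation speed given for this embodiment (667 rpm). The sketch below uses only figures stated in the text; the helper name is illustrative.

```python
def sweep_time_ms(rpm, num_pairs):
    """Time for the turntable to cover 1/num_pairs of a revolution, in ms."""
    ms_per_revolution = 60_000.0 / rpm
    return ms_per_revolution / num_pairs


acquisition_ms = sweep_time_ms(667, 18)     # rotate 1/18 of a revolution
positions = round(360 / 18 / 0.4)           # 20 degrees in 0.4-degree steps
reconstruction_ms = positions * 0.2         # about 0.2 ms per position
print(f"{acquisition_ms:.2f} ms, {positions} positions, {reconstruction_ms:.0f} ms")
```

This gives about 5 ms of acquisition over 50 positions and about 10 ms of reconstruction, consistent with the stated figures; because the two overlap, the 12 ms total is close to the reconstruction time rather than the sum of the two.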
Finally, the specific embodiments described above further illustrate the object, technical solution, and beneficial effects of the invention. It should be understood that the above are only specific embodiments of the invention and do not limit it; any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (7)

1. An imaging method of a CT parallel reconstruction system, the CT parallel reconstruction system comprising a front-end sampler connected to a CT detector array, a central node connected to the front-end sampler, and a plurality of child nodes connected to the central node, each child node being equipped with a graphics processing unit (GPU), the central node being used to divide the CT reconstructed image into a plurality of block regions and to distribute computation tasks to the child nodes according to the divided block regions, and each child node being used to compute the reconstructed value of each pixel in its assigned block regions, the imaging method of the CT parallel reconstruction system comprising the following steps:
1) the reconstructed image is divided into a plurality of block regions;
2) the central node receives the original projection data at each angle acquired by the front-end sampler;
3) the central node assigns computation tasks to the child nodes and transfers the original projection data to each child node that has been assigned a computation task; each child node that has been assigned a computation task computes the reconstructed values of one or more block regions;
4) each child node completes the reconstruction computation for every pixel in its assigned block regions and returns the results to the central node;
5) the central node combines the results from the child nodes into a complete reconstructed image;
in step 2), the front-end sampler acquires the original projection data at each angle in turn during the CT detection process, and the central node receives the original projection data at each angle in successive batches, the original projection data at one angle forming one batch;
in step 3), after receiving the original projection data of every detector unit at one angle, the central node assigns the computation tasks to the child nodes with the lower current computational load, according to the current computational load of each child node.
2. The imaging method according to claim 1, characterized in that, in step 1), the number of pixels contained in each block region is a multiple of 64.
3. The imaging method according to claim 1, characterized in that, in step 1), the pixels of each block region form a square array.
4. The imaging method according to claim 2, characterized in that, in step 1), each block region contains 256 pixels and is a 16*16 square array.
5. The imaging method according to claim 4, characterized in that, in step 3), each time the central node receives the original projection data at one angle, it assigns the computation tasks for all block regions of the reconstructed image at that angle to the same child node; in step 4), each child node returns the reconstructed image at each angle to the central node; in step 5), the central node combines the reconstructed images at all angles into the final reconstructed image.
6. The imaging method according to claim 4, characterized in that, in step 3), each time the central node receives the original projection data at one angle, it assigns the computation tasks for the individual block regions of the reconstructed image at that angle to different child nodes; in step 4), each child node returns the reconstructed image of the block regions it is responsible for to the central node; in step 5), the reconstructed images of the block regions returned by the child nodes are first combined into the reconstructed image at each angle, from which the final reconstructed image is then obtained.
7. The imaging method according to claim 6, characterized in that, in step 3), each child node is responsible for the reconstruction computation of its block region, and one thread of the graphics processing unit of each child node is responsible for computing the reconstructed value of one pixel.
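The block partitioning, load-based task assignment, and image assembly described in the claims can be sketched on the CPU side as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: all function names are hypothetical, the load of a child node is assumed to be measured in pending pixels, and the per-pixel GPU threads of claim 7 are not modelled.

```python
import heapq
from typing import Dict, List, Tuple

BLOCK = 16  # claim 4: each block region is a 16*16 square of 256 pixels
Block = Tuple[int, int]  # (row, col) of a block region's top-left pixel

def split_into_blocks(height: int, width: int, block: int = BLOCK) -> List[Block]:
    """Return the top-left corners of the block regions covering the image."""
    if height % block or width % block:
        raise ValueError("image size must be a multiple of the block size")
    return [(r, c) for r in range(0, height, block)
                   for c in range(0, width, block)]

def assign_least_loaded(blocks: List[Block], loads: List[int],
                        cost: int = BLOCK * BLOCK) -> Dict[int, List[Block]]:
    """Give each block to the child node with the smallest current load.
    The load unit (pending pixels) is an assumption of this sketch."""
    heap = [(load, node) for node, load in enumerate(loads)]
    heapq.heapify(heap)
    plan: Dict[int, List[Block]] = {node: [] for node in range(len(loads))}
    for blk in blocks:
        load, node = heapq.heappop(heap)
        plan[node].append(blk)
        heapq.heappush(heap, (load + cost, node))
    return plan

def assemble(height: int, width: int,
             results: Dict[Block, List[List[float]]]) -> List[List[float]]:
    """Central node pastes each returned block back into the full image."""
    image = [[0.0] * width for _ in range(height)]
    for (r0, c0), tile in results.items():
        for dr, row in enumerate(tile):
            image[r0 + dr][c0:c0 + len(row)] = row
    return image
```

For example, `split_into_blocks(32, 32)` yields four block regions; with two idle child nodes, `assign_least_loaded` gives two blocks to each, while a node that already carries a load receives fewer.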
CN2008101144781A 2008-06-06 2008-06-06 Computer tomography (CT) parallel reconstructing system and imaging method thereof Active CN101596113B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2008101144781A CN101596113B (en) 2008-06-06 2008-06-06 Computer tomography (CT) parallel reconstructing system and imaging method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2008101144781A CN101596113B (en) 2008-06-06 2008-06-06 Computer tomography (CT) parallel reconstructing system and imaging method thereof

Publications (2)

Publication Number Publication Date
CN101596113A CN101596113A (en) 2009-12-09
CN101596113B true CN101596113B (en) 2011-09-07

Family

ID=41417841

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2008101144781A Active CN101596113B (en) 2008-06-06 2008-06-06 Computer tomography (CT) parallel reconstructing system and imaging method thereof

Country Status (1)

Country Link
CN (1) CN101596113B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant