CN102831629A - Graphic processor based mammary gland CT (Computerized Tomography) image reconstruction method - Google Patents

Info

Publication number
CN102831629A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012103037600A
Other languages
Chinese (zh)
Other versions
CN102831629B (en)
Inventor
郭境峰
王海潮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shantou Ultrasonic Testing Technology Co., Ltd.
Original Assignee
SHANTOU DONGFANG ULTRASONIC TECHNOLOGY CO LTD
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SHANTOU DONGFANG ULTRASONIC TECHNOLOGY CO LTD
Priority to CN201210303760.0A
Publication of CN102831629A
Application granted
Publication of CN102831629B
Legal status: Active

Landscapes

  • Image Processing (AREA)

Abstract

The invention relates to a graphics processor based mammary gland CT (Computerized Tomography) image reconstruction method. The method comprises the following steps in sequence: (1) receiving and transmitting data: an analog signal detected by a scanning detection device is converted into a digital signal by an analog-to-digital converter and transmitted to system memory, and the digital signal in system memory is then transmitted to the video memory of the graphics processor; (2) performing data operations in parallel on the graphics processor according to a parallel limited-angle reconstruction imaging algorithm, wherein the data operations performed in parallel on the graphics processor are iterative computations; (3) judging whether the computation result of step (2) reaches the expected goal; if so, executing step (4), and if not, returning to step (2); and (4) the graphics processor transmitting the result of the iterative computation to system memory for image post-processing. Because the reconstruction is carried out on the graphics processor, the speed of the iterative computation can be greatly increased, and limited-angle mammary gland CT image reconstruction can be realized quickly and accurately.

Description

Mammary gland CT image reconstruction method based on a graphics processor
Technical field
The present invention relates to image processing methods, and in particular to a mammary gland CT image reconstruction method based on a graphics processor.
Background art
Breast disease is common among women. In particular, breast cancer has the highest incidence among malignant tumours in women and poses a great threat to women's health.
When conventional medical CT equipment is used for breast examination, the patient usually lies on a scanning bed for a whole-body or local examination. The scanning angle is often greater than 180 degrees and may reach 360 degrees, and the imaging mode is complete-data reconstruction. This approach not only takes more time for image reconstruction but also exposes the patient to a higher radiation dose, which is detrimental to the patient's health.
When the scanning angle is less than 120 degrees, the imaging mode is limited-angle reconstruction. Considering radiation dose and image contrast in particular, breast imaging generally uses limited-angle reconstruction to reduce the radiation dose received by the patient. However, limited-angle reconstruction algorithms, such as the well-known GP (Gerchberg-Papoulis) algorithm, iterate between Fourier space and image space and need many iterations to converge well, so the iterative computation takes considerable time. The resulting slow convergence and slow image reconstruction reduce examination efficiency.
Summary of the invention
The technical problem to be solved by the present invention is to provide a mammary gland CT image reconstruction method based on a graphics processor, which makes convenient and effective use of the graphics processor to achieve limited-angle mammary gland CT image reconstruction quickly and accurately. The technical solution adopted is as follows:
A mammary gland CT image reconstruction method based on a graphics processor, characterized by comprising the following steps in sequence:
(1) Data reception and transmission: the analog signal detected by the scanning detection device of the mammary gland CT equipment is converted into a digital signal by an analog-to-digital (A/D) converter and transmitted to system memory over the system bus; video memory is then allocated according to the data size of the digital signal, and the digital signal in system memory is transmitted to the video memory of the graphics processor (GPU for short);
The above digital signal is a signal acquired in limited-angle reconstruction imaging mode.
Usually, the digital signal in system memory first undergoes processing such as data prediction (that is, predicting the projection data of adjacent parts by an equal-ratio generation method) and FIR low-pass filtering, in that order. Video memory is then allocated according to the data size of the digital signal, and the data are transferred to the video memory of the graphics processor through the PCIe x16 interface.
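The host-side preparation described above can be sketched in plain Python as a CPU-side illustration only; it is not the patent's CUDA code. The function names, the 3-tap filter coefficients, the 4-byte sample size and the 120 x 512 data set are all assumptions made for the example.

```python
def fir_lowpass(samples, taps):
    """Causal FIR filter: y[n] = sum_k taps[k] * samples[n - k], zero history before n = 0."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * samples[n - k]
        out.append(acc)
    return out


def video_memory_bytes(num_views, num_detectors, bytes_per_sample=4):
    """Buffer size to request for one projection data set (hypothetical layout)."""
    return num_views * num_detectors * bytes_per_sample


taps = [0.25, 0.5, 0.25]                  # hypothetical 3-tap low-pass kernel
row = [0.0, 4.0, 0.0, 4.0, 0.0, 4.0]      # one made-up detector row
filtered = fir_lowpass(row, taps)         # alternating samples are smoothed
needed = video_memory_bytes(120, 512)     # bytes to allocate before the upload
```

Computing the buffer size from the data dimensions before the transfer mirrors the order in the text: filter first, then allocate, then copy.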
The present invention preferably uses the CUDA (Compute Unified Device Architecture) framework released by NVIDIA for programming. The graphics processor is first initialized, and a check is made that the current graphics processor and its driver meet the requirements for running CUDA. Before data are transferred between system memory and video memory, the CUDA environment is set up through the following steps: (a) download and install the CUDA Toolkit and the CUDA SDK; (b) include the necessary CUDA header files, library files and link libraries in the new project; (c) load the CUDA nvcc compiler, which converts the GPU portion of a CUDA program into PTX code that is finally turned into a program executable on the graphics processor; (d) create files with the .cu suffix, so that at build time files with the .cu suffix are handed to the CUDA nvcc compiler while other files are still compiled by the VC compiler.
(2) Data operations are performed in parallel on the graphics processor according to a parallel limited-angle reconstruction imaging algorithm;
The basic idea of the limited-angle reconstruction algorithm is iteration. Data from limited-angle imaging are band-limited in Fourier space, so the missing data can be recovered with the GP algorithm.
Define the operators B and C. The GP iteration in the limited-angle reconstruction algorithm proceeds as follows:
$B = T_F, \quad C = F\,T_I\,F^{-1}$ ①
$x_0 = k$ ②
$x_{i+1} = C\,k + (I - CB)\,x_i$ ③
where $k$ denotes the known data in Fourier space, the total data are denoted by a symbol that is written $x$ here, $F$ is the Fourier transform, $F^{-1}$ is the inverse Fourier transform, $T_I$ and $T_F$ are the binary (two-valued) function matrices of the image space and the frequency space respectively, and $I$ is the identity matrix.
The GP iteration ultimately converges to $x$ at a rate of $(1-\lambda_i)^n$, where $\{\lambda_i\}$ are the eigenvalues of $CB$ and $0 < \lambda_i < 1$.
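To make the alternation between Fourier space and image space concrete, here is a small CPU-side sketch in pure Python. It is an illustration under stated assumptions, not the patent's GPU implementation: it uses the textbook Gerchberg-Papoulis update x_{i+1} = k + (I - B) C x_i, which may differ in form from equation ③, and the names `dft`, `gp_reconstruct`, the masks and the 16-sample test signal are all hypothetical.

```python
import cmath


def dft(values, inverse=False):
    """Naive O(N^2) discrete Fourier transform; the inverse divides by N."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    out = []
    for m in range(n):
        acc = 0j
        for s, v in enumerate(values):
            acc += v * cmath.exp(sign * 2j * cmath.pi * m * s / n)
        out.append(acc / n if inverse else acc)
    return out


def gp_reconstruct(known, freq_mask, image_mask, iterations):
    """Textbook GP: x_{i+1} = k + (I - B) C x_i with B = T_F and C = F T_I F^-1."""
    x = list(known)                                    # x_0 = k
    for _ in range(iterations):
        image = dft(x, inverse=True)                   # F^-1: to image space
        image = [v if keep else 0.0                    # T_I: enforce image support
                 for v, keep in zip(image, image_mask)]
        cx = dft(image)                                # F: back to Fourier space
        x = [k if measured else c                      # keep measured bins, fill the rest
             for k, measured, c in zip(known, freq_mask, cx)]
    return x


n = 16
truth_image = [1.0, 2.0, 3.0, 4.0] + [0.0] * 12       # object confined to 4 samples
image_mask = [s < 4 for s in range(n)]                # T_I
freq_mask = [m not in (7, 8, 9) for m in range(n)]    # T_F: bins 7..9 are missing
truth = dft(truth_image)
known = [v if measured else 0.0 for v, measured in zip(truth, freq_mask)]


def max_error(x):
    return max(abs(a - b) for a, b in zip(x, truth))


err_before = max_error(known)
err_after = max_error(gp_reconstruct(known, freq_mask, image_mask, 80))
```

In this toy case the missing Fourier samples are recovered because the object is known to occupy only a small part of the image; each pass costs two transforms, which is the part a GPU would accelerate.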
The data operations performed in parallel on the graphics processor are iterative computations, which comprise the following steps: (2-1) Fourier transform and inverse transform; (2-2) computing the eigenvalues of the spatial domain and the frequency domain; (2-3) computing the eigenvalues of the operators B and C. Specifically:
Step (2-1) comprises: (2-1-1) each stream processor of the graphics processor receives data, that is, the digital signal data received in the video memory of the graphics processor are distributed to the stream processors; (2-1-2) one-dimensional Fourier transform; (2-1-3) two-dimensional Fourier transform; (2-1-4) two-dimensional inverse Fourier transform; (2-1-5) writing the computation result to shared memory.
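The relationship between the one-dimensional and two-dimensional transforms in steps (2-1-2) to (2-1-4) can be illustrated with a small pure-Python sketch. It assumes the usual row-column decomposition (a 2D transform built from 1D transforms over rows, then columns); the patent does not state which decomposition or library it uses, and the function names and the 2 x 2 grid are hypothetical.

```python
import cmath


def dft1d(values, inverse=False):
    """Naive 1D discrete Fourier transform; the inverse divides by N."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    out = []
    for m in range(n):
        acc = 0j
        for s, v in enumerate(values):
            acc += v * cmath.exp(sign * 2j * cmath.pi * m * s / n)
        out.append(acc / n if inverse else acc)
    return out


def dft2d(grid, inverse=False):
    """2D transform as 1D transforms over every row, then over every column."""
    rows = [dft1d(row, inverse) for row in grid]
    cols = [dft1d(list(col), inverse) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]           # transpose back to row-major


grid = [[1.0, 2.0], [3.0, 4.0]]
spectrum = dft2d(grid)                                  # forward 2D transform
restored = dft2d(spectrum, inverse=True)                # inverse recovers the input
```

Every row transform is independent of the others, as is every column transform, which is why this stage maps naturally onto many stream processors.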
In step (2-1): before the one-dimensional Fourier transform of step (2-1-2), the kernel is designed so that the GPU meets the warp launch conditions when computing the one-dimensional Fourier transform, which ensures that threads belonging to the same warp need no barrier synchronization when they communicate, thereby improving running speed. Before the two-dimensional Fourier transform of step (2-1-3), atomic operations are used so that, when multiple threads simultaneously access the same address in global memory or shared memory, each thread performs a mutually exclusive operation on the shared writable data; until one thread has completed its operation, no other thread can access that address, thereby improving the speed of accessing thread data. Before the two-dimensional inverse Fourier transform of step (2-1-4), the kernel is designed so that the GPU meets the warp launch conditions when computing the two-dimensional inverse Fourier transform, which again ensures that threads belonging to the same warp need no barrier synchronization when they communicate, thereby improving running speed. Before the computation result is written to shared memory in step (2-1-5), a synchronization instruction is issued to ensure that all threads in the same thread block have reached the same position: any thread that reaches the synchronization point pauses, and the thread block continues with the following statements only once all of its threads have reached that point.
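The two coordination mechanisms named above, mutually exclusive updates and a block-wide synchronization point, can be mimicked with CPU threads in Python. This is only an analogy under stated assumptions: `threading.Lock` stands in for an atomic update, `threading.Barrier` stands in for the block-level synchronization instruction, and the four-thread "block" and all variable names are hypothetical. It says nothing about GPU performance.

```python
import threading

NUM_THREADS = 4                          # hypothetical thread-block size
partial = [0] * NUM_THREADS              # stands in for shared memory
seen_totals = [0] * NUM_THREADS
state = {"counter": 0}                   # one address that every thread updates
lock = threading.Lock()                  # stands in for an atomic operation
barrier = threading.Barrier(NUM_THREADS)


def worker(tid):
    partial[tid] = tid + 1               # each thread writes its own slot
    with lock:                           # only one thread at a time touches the counter
        state["counter"] += 1
    barrier.wait()                       # pause here until every thread has arrived
    seen_totals[tid] = sum(partial)      # safe: all slots are written by now


threads = [threading.Thread(target=worker, args=(t,)) for t in range(NUM_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Without the barrier, a fast thread could sum the slots before a slow thread has written its value; without the lock, two increments of the counter could overlap and one could be lost.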
Step (2-2) comprises: (2-2-1) initializing the operators B and C, including allocating video memory for the operator matrices and assigning initial values, to avoid null pointers; (2-2-2) calling cuBLAS library functions; (2-2-3) computing the spatial-domain eigenvalues; (2-2-4) computing the frequency-domain eigenvalues; (2-2-5) writing the computation result to shared memory.
In step (2-2): before the cuBLAS library functions are called in step (2-2-2), the kernel is designed so that the GPU meets the warp launch conditions when computing the spatial-domain eigenvalues, which ensures that threads belonging to the same warp need no barrier synchronization when they communicate, thereby improving running speed. Before the spatial-domain eigenvalues are computed in step (2-2-3), atomic operations are used so that, when multiple threads simultaneously access the same address in global memory or shared memory, each thread performs a mutually exclusive operation on the shared writable data; until one thread has completed its operation, no other thread can access that address, thereby improving the speed of accessing thread data. Before the frequency-domain eigenvalues are computed in step (2-2-4), asynchronous stream operations are used so that the host CPU thread does not have to wait while the GPU computes and can carry out other computation, so that the CPU and the GPU work at the same time and resource utilization improves. Before the computation result is written to shared memory in step (2-2-5), a synchronization instruction is issued to ensure that all threads in the same thread block have reached the same position: any thread that reaches the synchronization point pauses, and the thread block continues with the following statements only once all of its threads have reached that point.
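The text does not say which numerical method is used to obtain the eigenvalues. As one possible illustration, the sketch below uses power iteration, which needs only repeated matrix-vector products (the kind of operation a BLAS library provides). The choice of power iteration, the function names and the 2 x 2 test matrix are assumptions made for the example.

```python
def matvec(matrix, vector):
    """Dense matrix-vector product."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


def power_iteration(matrix, iterations=100):
    """Estimate the dominant eigenvalue of a symmetric matrix by power iteration."""
    vector = [1.0] * len(matrix)
    eigenvalue = 0.0
    for _ in range(iterations):
        product = matvec(matrix, vector)
        norm = sum(c * c for c in product) ** 0.5
        vector = [c / norm for c in product]           # renormalize each pass
        # Rayleigh quotient of the unit vector gives the eigenvalue estimate
        eigenvalue = sum(v * w for v, w in zip(vector, matvec(matrix, vector)))
    return eigenvalue


# Hypothetical symmetric operator with eigenvalues 0.7 and 0.2, both inside (0, 1)
operator = [[0.6, 0.2], [0.2, 0.3]]
dominant = power_iteration(operator)
```

For an operator whose eigenvalues lie strictly between 0 and 1, as stated for CB, the smallest eigenvalue governs the slowest-decaying error component, since its factor 1 - λ is closest to 1.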
Step (2-3) comprises: (2-3-1) reading the shared memory variables, that is, the variables written to shared memory when the computation of step (2-2) completed; (2-3-2) computing the inverse matrix; (2-3-3) computing the conjugate matrix; (2-3-4) obtaining the eigenvalues of the operators B and C; (2-3-5) writing the computation result to shared memory.
In step (2-3): before the inverse matrix is computed in step (2-3-2), the kernel is designed so that the GPU meets the warp launch conditions when computing the inverse matrix, which ensures that threads belonging to the same warp need no barrier synchronization when they communicate, thereby improving running speed. Before the conjugate matrix is computed in step (2-3-3), the instruction that reads the inverse-matrix result is marked for priority access, which gives that instruction the highest priority for shared memory access so that the required data are obtained as quickly as possible without waiting. Before the eigenvalues of the operators B and C are obtained in step (2-3-4), asynchronous execution is used so that computation in one stream can proceed at the same time as data transfer in another stream, improving resource utilization. Before the computation result is written to shared memory in step (2-3-5), the offsets are designed so that the data are aligned on 4-byte or 8-byte boundaries, the alignment best suited to GPU computation.
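The offset alignment mentioned for step (2-3-5) amounts to rounding each byte offset up to the next multiple of 4 or 8. A minimal Python sketch of that arithmetic follows; the helper names and the example record sizes are hypothetical, and the bit trick assumes the alignment is a power of two.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment (alignment must be a power of two)."""
    return (offset + alignment - 1) & ~(alignment - 1)


def layout(sizes, alignment):
    """Starting offsets for consecutive records, each placed on an aligned boundary."""
    offsets = []
    cursor = 0
    for size in sizes:
        cursor = align_up(cursor, alignment)
        offsets.append(cursor)
        cursor += size
    return offsets


offsets_4 = layout([3, 5, 8], 4)    # records of 3, 5 and 8 bytes on 4-byte boundaries
offsets_8 = layout([3, 5, 8], 8)    # the same records on 8-byte boundaries
```

Padding wastes a few bytes between records but lets each record start on a boundary that the hardware can read in a single aligned access.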
The graphics processor computes in parallel and is suited to data operations in which the same computation is applied to large amounts of data. In other words, the graphics processor can be thought of as a CPU with many stream processors (from tens to hundreds) that can all compute at the same time. The goal in designing a CUDA algorithm is to send data that require the same computation to different stream processors, in order to save computation time.
(3) A predetermined condition is used to judge whether the computation result of step (2) reaches the expected goal; if it does, step (4) is executed; otherwise the method returns to step (2) for further iterative computation.
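The control flow of steps (2) and (3) is a loop with a stopping test. The text does not specify the predetermined condition, so the sketch below assumes one common choice: stop when successive results differ by less than a tolerance, with a cap on the number of iterations. The function name, the tolerance and the toy update rule are hypothetical.

```python
def iterate_until(step, start, tolerance, max_iterations):
    """Repeat step() until successive results differ by less than tolerance."""
    current = start
    for count in range(1, max_iterations + 1):
        updated = step(current)                  # step (2): one iterative computation
        if abs(updated - current) < tolerance:   # step (3): expected goal reached
            return updated, count
        current = updated                        # otherwise return to step (2)
    return current, max_iterations


# Toy update whose fixed point is 2.0, standing in for one reconstruction pass
result, used = iterate_until(lambda x: 0.5 * x + 1.0, 0.0, 1e-8, 100)
```

The iteration cap guards against a run that never meets the tolerance, so the loop always terminates and hands a result to step (4).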
(4) The graphics processor sends the result of the iterative computation to system memory for image post-processing.
Image post-processing may include logarithmic compression, windowing and the like; the image is output and displayed after post-processing.
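As a rough illustration of the two post-processing operations named above, the following Python sketch maps pixel values to the display range [0, 1]. The exact formulas used by the method are not given in the text, so the log1p-based compression, the linear window and all parameter names are assumptions.

```python
import math


def log_compress(value, max_value):
    """Map [0, max_value] to [0, 1] on a logarithmic scale."""
    return math.log1p(value) / math.log1p(max_value)


def window(value, center, width):
    """Linear window: values below or above the window clip to 0 or 1."""
    low = center - width / 2.0
    return min(1.0, max(0.0, (value - low) / width))


pixels = [0.0, 40.0, 50.0, 60.0, 100.0]                      # made-up reconstructed values
compressed = [log_compress(p, 100.0) for p in pixels]        # dim detail is lifted
windowed = [window(p, center=50.0, width=20.0) for p in pixels]
```

Logarithmic compression spreads out low values, while windowing devotes the whole display range to the band of values of interest.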
The present invention exploits the programmability and high-performance parallel computing capability of the graphics processor (GPU). Performing the reconstruction on the graphics processor greatly increases the speed of the iterative computation, so that limited-angle mammary gland CT image reconstruction is achieved quickly and accurately and images are displayed sooner. The time saved can be spent on additional image post-processing, which improves display quality, makes images clearer and better supports the detection of breast disease.
Description of drawings
Fig. 1 is the overall flow chart of the preferred embodiment of the present invention;
Fig. 2 is the flow chart of step (1), data reception and transmission;
Fig. 3 is a schematic diagram of the parallel computing mechanism of the graphics processor;
Fig. 4 is the flow chart of step (2-1), Fourier transform and inverse transform;
Fig. 5 is the flow chart of step (2-2), computing the eigenvalues of the spatial domain and the frequency domain;
Fig. 6 is the flow chart of step (2-3), computing the eigenvalues of the operators B and C.
Embodiment
With reference to Fig. 1, the mammary gland CT image reconstruction method based on a graphics processor comprises the following steps in sequence:
(1) Data reception and transmission: the analog signal detected by the scanning detection device of the mammary gland CT equipment is converted into a digital signal by an analog-to-digital (A/D) converter (the digital signal being acquired in limited-angle reconstruction imaging mode) and transmitted to system memory over the system bus; video memory is then allocated according to the data size of the digital signal, and the digital signal in system memory is transmitted to the video memory of the graphics processor. With reference to Fig. 2, in this embodiment, after the digital signal has been transmitted to system memory, it first undergoes processing such as data prediction (that is, predicting the projection data of adjacent parts by an equal-ratio generation method) and FIR low-pass filtering, in that order; video memory is then allocated according to the data size of the digital signal, and the data are transferred to the video memory of the graphics processor through the PCIe x16 interface.
This embodiment uses the CUDA (Compute Unified Device Architecture) framework released by NVIDIA for programming. The graphics processor is first initialized, and a check is made that the current graphics processor and its driver meet the requirements for running CUDA. Before data are transferred between system memory and video memory, the CUDA environment is set up through the following steps: (a) download and install the CUDA Toolkit and the CUDA SDK; (b) include the necessary CUDA header files, library files and link libraries in the new project; (c) load the CUDA nvcc compiler, which converts the GPU portion of a CUDA program into PTX code that is finally turned into a program executable on the graphics processor; (d) create files with the .cu suffix, so that at build time files with the .cu suffix are handed to the CUDA nvcc compiler while other files are still compiled by the VC compiler.
The CUDA code for exchanging data between system memory and video memory is as follows:
[Figure BDA0000204966631: code listing image]
(2) Data operations are performed in parallel on the graphics processor according to a parallel limited-angle reconstruction imaging algorithm;
Define the operators B and C. The GP iteration in the limited-angle reconstruction algorithm proceeds as follows:
$B = T_F, \quad C = F\,T_I\,F^{-1}$ ①
$x_0 = k$ ②
$x_{i+1} = C\,k + (I - CB)\,x_i$ ③
where $k$ denotes the known data in Fourier space, the total data are denoted by a symbol that is written $x$ here, $F$ is the Fourier transform, $F^{-1}$ is the inverse Fourier transform, $T_I$ and $T_F$ are the binary (two-valued) function matrices of the image space and the frequency space respectively, and $I$ is the identity matrix.
The GP iteration ultimately converges to $x$ at a rate of $(1-\lambda_i)^n$, where $\{\lambda_i\}$ are the eigenvalues of $CB$ and $0 < \lambda_i < 1$.
With reference to Fig. 1, the data operations performed in parallel on the graphics processor are iterative computations, which comprise the following steps: (2-1) Fourier transform and inverse transform; (2-2) computing the eigenvalues of the spatial domain and the frequency domain; (2-3) computing the eigenvalues of the operators B and C. Specifically:
With reference to Fig. 4, step (2-1) (Fourier transform and inverse transform) comprises: (2-1-1) the stream processors of the graphics processor receive data, that is, the digital signal data received in the video memory of the graphics processor are distributed to the stream processors; (2-1-2) one-dimensional Fourier transform; (2-1-3) two-dimensional Fourier transform; (2-1-4) two-dimensional inverse Fourier transform; (2-1-5) writing the computation result to shared memory.
In step (2-1): before the one-dimensional Fourier transform of step (2-1-2), the kernel is designed so that the GPU meets the warp launch conditions when computing the one-dimensional Fourier transform, which ensures that threads belonging to the same warp need no barrier synchronization when they communicate, thereby improving running speed. Before the two-dimensional Fourier transform of step (2-1-3), atomic operations are used so that, when multiple threads simultaneously access the same address in global memory or shared memory, each thread performs a mutually exclusive operation on the shared writable data; until one thread has completed its operation, no other thread can access that address, thereby improving the speed of accessing thread data. Before the two-dimensional inverse Fourier transform of step (2-1-4), the kernel is designed so that the GPU meets the warp launch conditions when computing the two-dimensional inverse Fourier transform, which again ensures that threads belonging to the same warp need no barrier synchronization when they communicate, thereby improving running speed. Before the computation result is written to shared memory in step (2-1-5), a synchronization instruction is issued to ensure that all threads in the same thread block have reached the same position: any thread that reaches the synchronization point pauses, and the thread block continues with the following statements only once all of its threads have reached that point.
The CUDA code for step (2-1), Fourier transform and inverse transform, is as follows:
[Figure BDA0000204966632: code listing image]
With reference to Fig. 5, step (2-2) (computing the eigenvalues of the spatial domain and the frequency domain) comprises: (2-2-1) initializing the operators B and C, including allocating video memory for the operator matrices and assigning initial values; (2-2-2) calling cuBLAS library functions; (2-2-3) computing the spatial-domain eigenvalues; (2-2-4) computing the frequency-domain eigenvalues; (2-2-5) writing the computation result to shared memory.
In step (2-2): before the cuBLAS library functions are called in step (2-2-2), the kernel is designed so that the GPU meets the warp launch conditions when computing the spatial-domain eigenvalues, which ensures that threads belonging to the same warp need no barrier synchronization when they communicate, thereby improving running speed. Before the spatial-domain eigenvalues are computed in step (2-2-3), atomic operations are used so that, when multiple threads simultaneously access the same address in global memory or shared memory, each thread performs a mutually exclusive operation on the shared writable data; until one thread has completed its operation, no other thread can access that address, thereby improving the speed of accessing thread data. Before the frequency-domain eigenvalues are computed in step (2-2-4), asynchronous stream operations are used so that the host CPU thread does not have to wait while the GPU computes and can carry out other computation, so that the CPU and the GPU work at the same time and resource utilization improves. Before the computation result is written to shared memory in step (2-2-5), a synchronization instruction is issued to ensure that all threads in the same thread block have reached the same position: any thread that reaches the synchronization point pauses, and the thread block continues with the following statements only once all of its threads have reached that point.
The CUDA code for step (2-2), computing the eigenvalues of the spatial domain and the frequency domain, is as follows:
[Figure BDA0000204966633: code listing image]
With reference to Fig. 6, step (2-3) (computing the eigenvalues of the operators B and C) comprises: (2-3-1) reading the shared memory variables, that is, the variables written to shared memory when the computation of step (2-2) completed; (2-3-2) computing the inverse matrix; (2-3-3) computing the conjugate matrix; (2-3-4) obtaining the eigenvalues of the operators B and C; (2-3-5) writing the computation result to shared memory.
In step (2-3): before the inverse matrix is computed in step (2-3-2), the kernel is designed so that the GPU meets the warp launch conditions when computing the inverse matrix, which ensures that threads belonging to the same warp need no barrier synchronization when they communicate, thereby improving running speed. Before the conjugate matrix is computed in step (2-3-3), the instruction that reads the inverse-matrix result is marked for priority access, which gives that instruction the highest priority for shared memory access so that the required data are obtained as quickly as possible without waiting. Before the eigenvalues of the operators B and C are obtained in step (2-3-4), asynchronous execution is used so that computation in one stream can proceed at the same time as data transfer in another stream, improving resource utilization. Before the computation result is written to shared memory in step (2-3-5), the offsets are designed so that the data are aligned on 4-byte or 8-byte boundaries, the alignment best suited to GPU computation.
The CUDA code for step (2-3), computing the eigenvalues of the operators B and C, is as follows:
[Figure BDA0000204966634: code listing image]
With reference to Fig. 3, the graphics processor computes in parallel and is suited to data operations in which the same computation is applied to large amounts of data. In other words, the graphics processor can be thought of as a CPU with many stream processors (from tens to hundreds) that can all compute at the same time.
(3) A predetermined condition is used to judge whether the computation result of step (2) reaches the expected goal; if it does, step (4) is executed; otherwise the method returns to step (2) for further iterative computation.
(4) The graphics processor sends the result of the iterative computation to system memory for image post-processing.
Image post-processing may include logarithmic compression, windowing and the like; the image is output and displayed after post-processing.

Claims (9)

1. A mammary gland CT image reconstruction method based on a graphics processor, characterized by comprising the following steps in sequence:
(1) Data reception and transmission: the analog signal detected by the scanning detection device of the mammary gland CT equipment is converted into a digital signal by an analog-to-digital converter and transmitted to system memory over the system bus; video memory is then allocated according to the data size of the digital signal, and the digital signal in system memory is transmitted to the video memory of the graphics processor;
the digital signal being a signal acquired in limited-angle reconstruction imaging mode;
(2) Data operations are performed in parallel on the graphics processor according to a parallel limited-angle reconstruction imaging algorithm;
Define the operators B and C. The GP iteration in the limited-angle reconstruction algorithm proceeds as follows:
$B = T_F, \quad C = F\,T_I\,F^{-1}$ ①
$x_0 = k$ ②
$x_{i+1} = C\,k + (I - CB)\,x_i$ ③
where $k$ denotes the known data in Fourier space, the total data are denoted by a symbol that is written $x$ here, $F$ is the Fourier transform, $F^{-1}$ is the inverse Fourier transform, $T_I$ and $T_F$ are the binary (two-valued) function matrices of the image space and the frequency space respectively, and $I$ is the identity matrix; the GP iteration ultimately converges to $x$ at a rate of $(1-\lambda_i)^n$, where $\{\lambda_i\}$ are the eigenvalues of $CB$ and $0 < \lambda_i < 1$;
the data operations performed in parallel on the graphics processor are iterative computations, which comprise the following steps: (2-1) Fourier transform and inverse transform; (2-2) computing the eigenvalues of the spatial domain and the frequency domain; (2-3) computing the eigenvalues of the operators B and C;
(3) A predetermined condition is used to judge whether the computation result of step (2) reaches the expected goal; if it does, step (4) is executed; otherwise the method returns to step (2) for further iterative computation;
(4) The graphics processor sends the result of the iterative computation to system memory for image post-processing.
2. the mammary gland CT image rebuilding method based on graphic process unit according to claim 1; It is characterized in that: in the step (1); Digital data transmission is behind Installed System Memory; After digital signal in the Installed System Memory carried out data prediction, FIR LPF successively, be transferred in the video memory of graphic process unit through the PCIEx16 interface again according to the data volume size application video memory of digital signal, and data.
3. the mammary gland CT image rebuilding method based on graphic process unit according to claim 1; It is characterized in that: in the step (1); Before carrying out data transmission between Installed System Memory and the video memory, set up the CUDA environment: (a) download and install CUDA TooKit and CUDA SDK through following step; (b) in new project, comprise necessary CUDA header file, library file and chained library; (c) the nvcc compiler of loading CUDA, the nvcc compiler converts the part of the graphic process unit program of CUDA to PTX code, becomes the program that can carry out in graphic process unit at last; (d) generate the file that suffix is called .cu.
4. the mammary gland CT image rebuilding method based on graphic process unit according to claim 1; It is characterized in that: step (2-1) comprises the steps: that specifically (2-1-1) receives data by the stream handle of graphic process unit; That is to say that the digital signal data that the video memory of graphic process unit is received is assigned in each stream handle of graphic process unit; (2-1-2) one dimension Fourier conversion; (2-1-3) two-dimensional fourier transform; (2-1-4) TWO-DIMENSIONAL FOURIER inverse transformation; (2-1-5) result of calculation is write shared video memory.
5. the mammary gland CT image rebuilding method based on graphic process unit according to claim 4 is characterized in that:
In the step (2-1): before carrying out step (2-1-2) one dimension Fourier conversion; Design through kernel; Meet the warp launching condition when making GPU carry out one dimension Fourier transformation calculations, the cross-thread that guarantees to be subordinated to same warp need not carry out fence when communicating synchronous; Before carrying out step (2-1-3) two-dimensional fourier transform; Pass through atomic operation; Guarantee when making a plurality of threads visit the same address of overall video memory or shared video memory simultaneously that but each thread can realize sharing the mutually exclusive operation of write data; Before thread complete operation, other any thread all can't be visited this address therein; Before carrying out the inverse transformation of step (2-1-4) TWO-DIMENSIONAL FOURIER; Design through kernel; Make GPU carry out meeting the warp launching condition when TWO-DIMENSIONAL FOURIER inverse transformation is calculated, the cross-thread that guarantees to be subordinated to same warp need not carry out fence when communicating synchronous; Carry out step (2-1-5) result of calculation is write share video memory before; Send synchronic command; Guarantee that all threads in the same thread block all implement same position; Meeting operation suspension after wherein any thread runs to the synchronic command mark, threads all in whole thread block all run to same position, and whole thread block just can continue to carry out following statement.
6. The graphics-processor-based mammary gland CT image reconstruction method according to claim 1, characterized in that step (2-2) specifically comprises the following steps: (2-2-1) initializing operators B and C, including allocating video memory for the operator matrices and assigning their initial values; (2-2-2) calling cuBLAS library functions; (2-2-3) computing the spatial-domain eigenvalues; (2-2-4) computing the frequency-domain eigenvalues; (2-2-5) writing the calculation result to shared video memory.
7. The graphics-processor-based mammary gland CT image reconstruction method according to claim 6, characterized in that:
In step (2-2): before the cuBLAS library functions are called in step (2-2-2), the kernel is designed so that the GPU satisfies the warp launch condition when computing the spatial-domain eigenvalues, ensuring that threads belonging to the same warp need no barrier (fence) synchronization when communicating with one another; before the spatial-domain eigenvalues are computed in step (2-2-3), atomic operations are used so that, when multiple threads simultaneously access the same address in global video memory or shared video memory, each thread performs a mutually exclusive operation on the shared writable data, and no other thread can access that address until the operating thread has completed its operation; before the frequency-domain eigenvalues are computed in step (2-2-4), asynchronous stream operations are used so that the host CPU thread does not have to wait while the GPU computes and can carry out other calculations; before the calculation result is written to shared video memory in step (2-2-5), a synchronization instruction is issued to ensure that all threads in the same thread block reach the same position: any thread that reaches the synchronization instruction suspends execution, and only after every thread in the thread block has reached that position does the thread block continue with the subsequent statements.
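The asynchronous stream operation described before step (2-2-4), where the host CPU thread continues with other work while the GPU computes, can be sketched with a CPU-side analogy. The worker thread below merely stands in for the GPU, and `device_work` and `host_work` are hypothetical placeholders rather than functions from the patent or from any GPU library.

```python
from concurrent.futures import ThreadPoolExecutor


def device_work(values):
    """Stand-in for a computation queued on the GPU."""
    return sum(v * v for v in values)


def host_work(values):
    """Stand-in for unrelated work done by the host CPU thread."""
    return max(values)


with ThreadPoolExecutor(max_workers=1) as pool:
    # submit() returns immediately, like an asynchronous launch into a stream.
    pending = pool.submit(device_work, range(1000))
    # The host does not block here; it carries on with its own calculation.
    host_result = host_work([3, 9, 4])
    # The host waits only at the point where the device result is needed.
    device_result = pending.result()
```

The design point is that the wait is deferred to the last possible moment, so host and device work overlap instead of running back to back.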
8. The graphics-processor-based mammary gland CT image reconstruction method according to claim 1, characterized in that step (2-3) specifically comprises the following steps: (2-3-1) reading the shared video memory variables; (2-3-2) computing the inverse matrix; (2-3-3) computing the conjugate matrix; (2-3-4) obtaining the eigenvalues of operators B and C; (2-3-5) writing the calculation result to shared video memory.
9. The graphics-processor-based mammary gland CT image reconstruction method according to claim 8, characterized in that:
In step (2-3): before the inverse matrix is computed in step (2-3-2), the kernel is designed so that the GPU satisfies the warp launch condition when computing the inverse matrix, ensuring that threads belonging to the same warp need no barrier (fence) synchronization when communicating with one another; before the conjugate matrix is computed in step (2-3-3), the instruction that reads the inverse matrix result is marked with a priority-access flag, so that the flagged instruction obtains the highest priority for access to shared video memory; before the eigenvalues of operators B and C are obtained in step (2-3-4), asynchronous execution instructions are used so that the computation in one stream can proceed at the same time as the data transfer of another stream; before the calculation result is written to shared video memory in step (2-3-5), the offsets are designed so that the data is aligned on 4-byte or 8-byte boundaries, the alignment that suits most GPU computation.
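The offset alignment mentioned before step (2-3-5) amounts to rounding each byte offset up to the next multiple of 4 or 8. The following is a minimal sketch of that arithmetic, assuming power-of-two alignments; the helper name `align_up` is hypothetical and does not come from the patent.

```python
def align_up(offset, alignment):
    """Round a byte offset up to the next multiple of a power-of-two alignment."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    # Adding (alignment - 1) and clearing the low bits rounds upward.
    return (offset + alignment - 1) & ~(alignment - 1)


# Offsets that are already aligned stay unchanged; the others move up.
offsets_4 = [align_up(n, 4) for n in (0, 1, 4, 13)]
offsets_8 = [align_up(n, 8) for n in (0, 1, 8, 13)]
```

Here `offsets_4` evaluates to `[0, 4, 4, 16]` and `offsets_8` to `[0, 8, 8, 16]`. Padding a buffer in this way wastes at most `alignment - 1` bytes per element.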
CN201210303760.0A 2012-08-23 2012-08-23 Graphic processor based mammary gland CT (Computerized Tomography) image reconstruction method Active CN102831629B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210303760.0A CN102831629B (en) 2012-08-23 2012-08-23 Graphic processor based mammary gland CT (Computerized Tomography) image reconstruction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210303760.0A CN102831629B (en) 2012-08-23 2012-08-23 Graphic processor based mammary gland CT (Computerized Tomography) image reconstruction method

Publications (2)

Publication Number Publication Date
CN102831629A true CN102831629A (en) 2012-12-19
CN102831629B CN102831629B (en) 2014-12-17

Family

ID=47334743

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210303760.0A Active CN102831629B (en) 2012-08-23 2012-08-23 Graphic processor based mammary gland CT (Computerized Tomography) image reconstruction method

Country Status (1)

Country Link
CN (1) CN102831629B (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102783967A (en) * 2012-08-23 2012-11-21 汕头市超声仪器研究所有限公司 Breast CT (Computed Tomography) apparatus
CN105684043A (en) * 2013-06-06 2016-06-15 唯盼健康科技有限公司 A method of reconstruction of an object from projection views
CN110012252A (en) * 2019-04-09 2019-07-12 北京奥特贝睿科技有限公司 Rapid image storage method and system suitable for an autonomous driving simulation platform
CN110537209A (en) * 2017-04-17 2019-12-03 皇家飞利浦有限公司 Real-time diagnostic imaging preview
CN110766704A (en) * 2019-10-22 2020-02-07 深圳瀚维智能医疗科技有限公司 Breast point cloud segmentation method, device, storage medium and computer equipment
CN112051285A (en) * 2020-08-18 2020-12-08 大连理工大学 Intelligent nondestructive detection system integrating X-ray real-time imaging and CT (computed tomography) tomography
CN112258378A (en) * 2020-10-15 2021-01-22 武汉易维晟医疗科技有限公司 Real-time three-dimensional measurement system and method based on GPU acceleration

Citations (3)

Publication number Priority date Publication date Assignee Title
WO2008017076A2 (en) * 2006-08-03 2008-02-07 The Regents Of The University Of California Iterative methods for dose reduction and image enhancement in tomography
CN101336831A (en) * 2008-08-13 2009-01-07 汕头超声仪器研究所 Rebuilding method of real-time three-dimensional medical ultrasonic image
US20110267054A1 (en) * 2010-04-30 2011-11-03 Qiang He Magnetic resonance imaging water-fat separation method

Non-Patent Citations (2)

Title
GANG-RONG QU, ET AL.: "An Iterative Algorithm for Angle-limited Three-dimensional Image Reconstruction", Acta Mathematicae Applicatae Sinica, English Series, vol. 24, no. 1, 31 December 2008 (2008-12-31), pages 157-166, XP019567397 *
TAKESHI FUJITA, ET AL.: "Phase recovery for electron holography using Gerchberg–Papoulis iterative algorithm", Ultramicroscopy, vol. 102, 31 December 2005 (2005-12-31), pages 279-286 *

Cited By (10)

Publication number Priority date Publication date Assignee Title
CN102783967A (en) * 2012-08-23 2012-11-21 汕头市超声仪器研究所有限公司 Breast CT (Computed Tomography) apparatus
CN102783967B (en) * 2012-08-23 2014-06-04 汕头市超声仪器研究所有限公司 Breast CT (Computed Tomography) apparatus
CN105684043A (en) * 2013-06-06 2016-06-15 唯盼健康科技有限公司 A method of reconstruction of an object from projection views
CN110537209A (en) * 2017-04-17 2019-12-03 皇家飞利浦有限公司 Real-time diagnostic imaging preview
CN110012252A (en) * 2019-04-09 2019-07-12 北京奥特贝睿科技有限公司 Rapid image storage method and system suitable for an autonomous driving simulation platform
CN110766704A (en) * 2019-10-22 2020-02-07 深圳瀚维智能医疗科技有限公司 Breast point cloud segmentation method, device, storage medium and computer equipment
CN110766704B (en) * 2019-10-22 2022-03-08 深圳瀚维智能医疗科技有限公司 Breast point cloud segmentation method, device, storage medium and computer equipment
CN112051285A (en) * 2020-08-18 2020-12-08 大连理工大学 Intelligent nondestructive detection system integrating X-ray real-time imaging and CT (computed tomography) tomography
CN112051285B (en) * 2020-08-18 2021-08-20 大连理工大学 Intelligent nondestructive detection system integrating X-ray real-time imaging and CT (computed tomography) tomography
CN112258378A (en) * 2020-10-15 2021-01-22 武汉易维晟医疗科技有限公司 Real-time three-dimensional measurement system and method based on GPU acceleration

Also Published As

Publication number Publication date
CN102831629B (en) 2014-12-17

Similar Documents

Publication Publication Date Title
CN102831629B (en) Graphic processor based mammary gland CT (Computerized Tomography) image reconstruction method
US10699447B2 (en) Multi-level image reconstruction using one or more neural networks
Shackleford et al. On developing B-spline registration algorithms for multi-core processors
Shamonin et al. Fast parallel image registration on CPU and GPU for diagnostic classification of Alzheimer's disease
Xu et al. Impulse C vs. VHDL for accelerating tomographic reconstruction
JP2021521993A (en) Image enhancement using a generative adversarial network
Blas et al. Surfing the optimization space of a multiple-GPU parallel implementation of a X-ray tomography reconstruction algorithm
CN108122263B (en) Image reconstruction system and method
CN103310484B (en) Computed tomography (CT) image rebuilding accelerating method based on compute unified device architecture (CUDA)
CN107223267B (en) Image reconstruction method and system
Choi et al. Acceleration of EM-based 3D CT reconstruction using FPGA
Flores et al. Parallel CT image reconstruction based on GPUs
Li et al. Parallel iterative cone beam CT image reconstruction on a PC cluster
Li et al. cuMBIR: An efficient framework for low-dose X-ray CT image reconstruction on GPUs
Goel et al. Computecovid19+: accelerating Covid-19 diagnosis and monitoring via high-performance deep learning on CT images
Jimenez et al. An irregular approach to large-scale computed tomography on multiple graphics processors improves voxel processing throughput
KR20120127214A (en) Method and apparatus for monte-carlo simulation gamma-ray scattering estimation in positron emission tomography using graphics processing unit
CN109410187B (en) Systems, methods, and media for detecting cancer metastasis in a full image
Chilingaryan et al. Reviewing GPU architectures to build efficient back projection for parallel geometries
Ma et al. CUDA parallel implementation of image reconstruction algorithm for positron emission tomography
Yoshida et al. Scalable, high-performance 3D imaging software platform: system architecture and application to virtual colonoscopy
KR20140130786A (en) Super-resolution Apparatus and Method using LOR reconstruction based cone-beam in PET image
Zhang et al. High performance parallel backprojection on multi-GPU
Ha et al. A GPU-accelerated multivoxel update scheme for iterative coordinate descent (ICD) optimization in statistical iterative CT reconstruction (SIR)
Isosalo et al. Local edge computing for radiological image reconstruction and computer-assisted detection: A feasibility study

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C56 Change in the name or address of the patentee
CP03 Change of name, title or address

Address after: 515041 Guangdong City, Longhu Province Wan Industrial Zone, Longjiang Road, Mount Everest road and the junction of the northwest side of the area, building 2, on the east side of the East

Patentee after: Shantou Ultrasonic Testing Technology Co., Ltd.

Address before: 1st Floor, No. 77 Jinsha Road, Jinping District, Shantou, Guangdong, China

Patentee before: Shantou Dongfang Ultrasonic Technology Co.,Ltd.