CN103810670A - DVH (dose volume histogram) parallel statistical method based on CUDA (compute unified device architecture) stream and shared memory - Google Patents

DVH (dose volume histogram) parallel statistical method based on CUDA (compute unified device architecture) stream and shared memory

Info

Publication number
CN103810670A
CN103810670A
Authority
CN
China
Prior art keywords
dvh
dose
stream
cuda
shared memory
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410033988.1A
Other languages
Chinese (zh)
Other versions
CN103810670B (en)
Inventor
王阳萍
党建武
蒋偑钊
杜晓刚
王松
杨景玉
陈永
郭治成
邓冲
赵庶旭
闵永智
张鑫
罗维薇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lanzhou Jiaotong University
Original Assignee
Lanzhou Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lanzhou Jiaotong University filed Critical Lanzhou Jiaotong University
Priority to CN201410033988.1A priority Critical patent/CN103810670B/en
Publication of CN103810670A publication Critical patent/CN103810670A/en
Application granted granted Critical
Publication of CN103810670B publication Critical patent/CN103810670B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Landscapes

  • Image Processing (AREA)
  • Image Generation (AREA)

Abstract

The invention discloses a DVH (dose volume histogram) parallel statistical method based on CUDA (compute unified device architecture) streams and shared memory. The method includes the steps of: firstly, sampling the organs on the host, transferring the sampling point positions to the device, and processing the dose statistics of each organ with one stream; secondly, loading the dose matrix via texture memory; thirdly, performing a texture fetch at the position point assigned to each thread, with the texture filter mode set to linear interpolation, i.e. linearly interpolating the eight neighbouring voxels of the three-dimensional texture according to distance and returning the interpolated value; fourthly, storing the statistical results in shared memory. By allocating N sub dose bins in shared memory, the bank conflict problem of shared memory is solved and the statistics speed is increased.

Description

Parallel statistical method for DVH based on CUDA streams and shared memory
Technical field
The present invention relates to the field of medical image data processing, and in particular to a parallel statistical method for DVH based on CUDA streams and shared memory.
Background art
The dose-volume histogram (Dose Volume Histogram, DVH) is a powerful tool for evaluating the quality of a radiotherapy treatment plan. Inverse treatment planning systems such as intensity modulated radiation therapy place higher demands on the speed of DVH computation.
In radiotherapy planning the DVH summarizes the three-dimensional dose distribution in a two-dimensional plot, showing how much of the volume of a region of interest, such as the tumour target volume or an organ at risk (OAR), receives at least a given dose. Because it intuitively and effectively reflects the dose distribution of a treatment plan and thus the quality of the plan, it has become the main basis for evaluating the quality of radiotherapy treatment plans. When computing the DVH, the dose values at the spatial points of an organ are accumulated according to the positional relationship between the contour lines of the target volume or organ at risk and the three-dimensional dose data. The organ volume is usually sampled, the dose at each sampled point is obtained by trilinear interpolation of the dose data, and the result is recorded in discrete dose bins. With the increasing number of patient CT slices during treatment and the needs of intensity modulated radiation therapy, a large number of trilinear interpolations must be performed during the statistics, and the computing speed of the CPU cannot meet real-time requirements; the present invention therefore uses parallel processing to achieve fast DVH computation.
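For reference, the cumulative DVH that such statistics produce can be written as follows; this is the standard formulation of a cumulative DVH, added here for clarity rather than taken from the original patent text:

$$ V(D) = \frac{100\%}{N_s} \sum_{i=1}^{N_s} \mathbf{1}\left[\, d(\mathbf{p}_i) \ge D \,\right] $$

where N_s is the number of points sampled within the organ, d(p_i) is the trilinearly interpolated dose at sample point p_i, and V(D) is the percentage of the organ volume receiving a dose of at least D; the 100 dose bins discretize the dose axis of this curve.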
In the field of parallel processing, the two main options are multicore central processing unit (multicore CPU) parallelism and graphics processing unit (GPU) parallelism. Because the CPU is responsible for transaction processing with strong logical dependencies and is not well suited to highly intensive computation, the NVIDIA GPU, a high-performance computing platform designed for compute-intensive, highly parallel applications, is chosen as the platform for parallel DVH computation. The NVIDIA GPU has clear advantages over the CPU in arithmetic capability and memory bandwidth. Through the Compute Unified Device Architecture (CUDA), the GPU can exert its powerful computing capability under the single-instruction multiple-thread (SIMT) programming model.
There are two difficulties in implementing DVH statistics in parallel on the GPU. First, determining whether a sampling point lies within the contour lines of each organ requires a large number of if-else statements; this is a very simple task on the CPU, but on the GPU such conditional statements may cause parallel threads to execute serially, greatly reducing the execution performance inside the streaming multiprocessor. Second, during the statistics the results of the sampled points are written into 100 bins; in particular, for dose distributions with specific regularities, such as heavy-ion radiotherapy, this causes a large number of write conflicts and becomes a computational bottleneck.
Summary of the invention
The object of the present invention is to address the above problems by proposing a parallel statistical method for DVH based on CUDA streams and shared memory, so as to increase computation speed and avoid memory write conflicts.
To achieve the above object, the technical solution adopted by the present invention is as follows:
A parallel statistical method for DVH based on CUDA streams and shared memory, comprising the following steps:
Step 1: sample the organs on the host side and transfer the sampling point positions to the device side.
When computing the DVH, it is necessary to determine whether a sampling point lies within the contour lines of an organ; the CPU performs this judgment, obtains all sampling point positions of each organ and stores them in an array, each position being represented by a three-dimensional vector pos=(x, y, z).
The CUDA stream mechanism is used to transfer the resulting position arrays into the GPU for computation; different streams execute independently of one another, and each stream processes the position array of one organ.
Step 2: load the dose matrix using texture memory:
The dose matrix data are copied from the host side to the device side and stored in texture memory.
Step 3: each thread performs a texture fetch at the position point assigned to it; the texture filter mode is set to linear interpolation, so that the eight neighbouring voxels of the three-dimensional texture are linearly interpolated according to distance and the interpolated value is returned.
Step 4: store the statistical results in shared memory; by allocating N sub dose bins in shared memory, the bank conflict problem of shared memory is resolved and the statistics are accelerated.
According to a preferred embodiment of the invention, storing the statistical results in shared memory in the above step 4 mainly consists in allocating N sub dose bins in shared memory so that corresponding bins of different sub dose bins do not reside in the same bank, with adjacent threads of the same warp each updating their own sub dose bin.
According to a preferred embodiment of the invention, the value range of the above N is given by a formula that appears only as an image in the original publication.
According to a preferred embodiment of the invention, N=8.
The technical solution of the present invention has the following beneficial effects:
The technical solution of the present invention adopts the heterogeneous programming model of CUDA and the CUDA stream mechanism to transfer the sampling points of each organ into the GPU for computation. Two levels of parallelism are realized, namely parallelism between organs and parallelism between interpolation operations, thereby increasing the computation speed. Shared memory is used to store the data, avoiding write conflicts in memory.
The technical solution of the present invention is described in further detail below with reference to the drawings and embodiments.
Brief description of the drawings
Fig. 1 is a flow chart of the parallel statistical method for DVH based on CUDA streams and shared memory according to an embodiment of the present invention;
Fig. 2a and Fig. 2b show the relationship between an organ and the dose matrix;
Fig. 3 is a flow chart of single-threaded DVH statistics;
Fig. 4 is a schematic diagram of the operation of the CUDA framework;
Fig. 5 is a schematic diagram of multi-threaded voting updates during DVH statistics;
Fig. 6 is a schematic diagram of bank conflicts in shared memory;
Fig. 7a is a schematic diagram of the dose distribution from 1% to 100%;
Fig. 7b is a schematic diagram of the dose distribution from 90% to 100%;
Fig. 8a is a DVH generated by the CPU;
Fig. 8b is a DVH generated by the GPU.
Detailed description of the embodiments
The preferred embodiments of the present invention are described below with reference to the drawings. It should be understood that the preferred embodiments described herein are intended only to describe and explain the present invention, and are not intended to limit it.
As shown in Fig. 1, a parallel statistical method for DVH based on CUDA streams and shared memory comprises the following steps:
Step 1: sample the organs on the host side and transfer the sampling point positions to the device side.
When computing the DVH, it is necessary to determine whether a sampling point lies within the contour lines of an organ; the CPU performs this judgment, obtains all sampling point positions of each organ and stores them in an array, each position being represented by a three-dimensional vector pos=(x, y, z).
The CUDA stream mechanism is used to transfer the resulting position arrays into the GPU for computation; different streams execute independently of one another, and each stream processes the position array of one organ.
Step 2: load the dose matrix using texture memory:
The dose matrix data are copied from the host side to the device side and stored in texture memory.
Step 3: each thread performs a texture fetch at the position point assigned to it; the texture filter mode is set to linear interpolation, so that the eight neighbouring voxels of the three-dimensional texture are linearly interpolated according to distance and the interpolated value is returned.
Step 4: store the statistical results in shared memory; by allocating N sub dose bins in shared memory, the bank conflict problem of shared memory is resolved and the statistics are accelerated.
In step 4, storing the statistical results in shared memory mainly consists in allocating N sub dose bins in shared memory so that corresponding bins of different sub dose bins do not reside in the same bank, with adjacent threads of the same warp each updating their own sub dose bin. The value range of N is given by a formula that appears only as an image in the original publication. Because the sub dose bins must afterwards be summed and shared memory is limited, a larger N is not always better; experiments show the best effect at N=8, and for N>8 the statistics time actually becomes longer.
The specific processing steps are as follows:
1.1 Heterogeneous programming model
CUDA is not simply a GPU language; it coordinates the concurrent use of both the CPU and the GPU, and the CUDA architecture targets a heterogeneous computing network formed by a CPU and a GPU. The CPU contains complex logic for control, such as branch prediction, a program stack and loop optimization, while the simpler structure of the GPU makes it suitable for sequential code with single statements, few loops and few jumps. The CPU side runs the serial part of the program and controls the launching of the GPU (the newer Kepler GK110 architecture supports a dynamic parallelism mechanism in which new threads can be created dynamically without returning data to the host CPU, i.e. any kernel can launch another kernel), while the GPU side runs the tasks that can be parallelized. In general, the GPU needs the control and coordination of the CPU, so the GPU is commonly called the device and the CPU is called the host.
1.2 Threads
In CUDA, NVIDIA organizes threads with a hierarchical programming model. The user creates as many threads as needed and determines the mapping between threads and data. The computing threads are divided into a number of thread blocks (blocks); the hardware unit corresponding to a thread block is a streaming multiprocessor (SM), also called a multiprocessor. An SM contains several scalar processors, also called CUDA cores; a scalar processor corresponds to a thread that is actually executed. All blocks together form a grid, i.e. one grid corresponds to one graphics card.
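As an illustration of this thread-to-data mapping (an added sketch, not code from the patent; the kernel name and the one-dimensional launch layout are assumptions), each thread can derive the index of the sample point it is responsible for from its block and thread indices:

// Minimal sketch: each thread derives the index of "its" sample point
// from the block/thread hierarchy described above.
__global__ void exampleKernel(const float3* samplePoints, int numPoints)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (idx >= numPoints) return;                     // guard against out-of-range threads
    float3 pos = samplePoints[idx];                   // sample point assigned to this thread
    // ... process pos ...
}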
1.3 Memory
CUDA memory is organized as a hierarchy: global memory, constant memory and texture memory, which can be accessed by all threads in the program; shared memory, which can be accessed by all threads of a block; and registers, which can be accessed by a single thread.
(1) Registers: used to store automatic variables, i.e. variables declared without a memory-space qualifier in global and device functions. Registers are allocated independently on each scalar processor, providing private storage for each thread. Their capacity is small and their access speed is fast.
(2) Shared memory: located on each multiprocessor and belonging to on-chip memory; its access speed is similar to that of registers, and reading or writing 4 B takes about two clock cycles. Threads on the same multiprocessor can access the same piece of shared memory.
(3) Global memory: belongs to off-chip memory; a direct access takes 400 to 800 clock cycles. Multiprocessors of compute capability 3.x have an on-chip cache, making access much faster than on compute capability 1.x devices.
(4) Texture memory: belongs to off-chip memory. Texture memory is a concept taken from graphics; on earlier graphics cards it was the core memory for image display, so it enjoys strong hardware support: it has an on-chip cache; addressing calculations are performed by dedicated hardware units and do not consume kernel computation time; accesses need not follow a sequential pattern; contiguous data exhibit higher bandwidth; and hardware-accelerated interpolation is provided, although only with 9-bit precision.
2 DVH statistics
2.1 Existing DVH statistics algorithm
The inputs for DVH statistics are the dose data and the organ contour data; their relationship is shown in Fig. 2a and Fig. 2b.
In Fig. 2a the rectangle represents the three-dimensional dose matrix and the solid part intuitively represents the extent of the organ; in Fig. 2b the curves represent the contour lines of the organ. Each contour line is assumed to have a certain thickness, so that each sampled point represents a certain volume in the statistics. In a single-threaded DVH computation, the organs are processed serially in order; the general flow is shown in Fig. 3. After the dose data and the contour data are obtained, all relevant organs and the tumour target volume are sampled. The organs are processed in order; within each organ, sampling and interpolation are carried out according to the relationship between its contour lines and the dose data, and the statistics of each organ are finally output as an array of size 100.
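As an illustration of this serial flow (an added sketch, not code from the patent; trilinearInterpolate() and maxDose are assumed helpers and values, and the mapping of doses to 100 bins is likewise an assumption), the per-organ CPU loop might look like this:

#include <vector>
#include <vector_types.h>   // CUDA's float3 type

// assumed helper, not from the patent: trilinear interpolation in the dose matrix
float trilinearInterpolate(const float* dose, int nx, int ny, int nz, float3 p);

// Minimal sketch of the single-threaded per-organ statistics described above.
void statOrganCPU(const std::vector<float3>& samplePoints,
                  const float* doseMatrix, int nx, int ny, int nz,
                  float maxDose, int bins[100])
{
    for (int b = 0; b < 100; ++b) bins[b] = 0;
    for (const float3& pos : samplePoints) {
        float dose = trilinearInterpolate(doseMatrix, nx, ny, nz, pos); // dose at the sampled point
        int b = (int)(dose / maxDose * 99.0f);                          // map the dose to one of 100 bins
        if (b < 0) b = 0;
        if (b > 99) b = 99;
        ++bins[b];                                                       // record the sampled point
    }
}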
3. CUDA-based parallel DVH computation
In the parallel DVH implementation, the sample point coordinates of all volumes within the organs are first transferred into GPU global memory and the dose matrix is placed in texture memory; interpolation is then performed according to the sample point coordinates, the statistical results are stored in shared memory, and finally the results are summed and transferred back to the host. Specifically:
Step 1: sample the organs on the host side and transfer the sampling point positions to the device side.
When computing the DVH, it is necessary to determine whether a sampling point lies within the contour lines of an organ; this task requires a large number of conditional statements and is suited to processing on the CPU side. The CPU performs this judgment, obtains all sampling point positions of each organ and stores them in an array, each position being represented by a three-dimensional vector pos=(x, y, z).
Copying the resulting position arrays directly to the device with the cudaMemcpy() function is relatively inefficient, because the kernel can only process the data after all of them have been copied to the device. This is where the CUDA stream mechanism plays an important role: GPU computation can be overlapped with CPU/GPU memory transfers.
As shown in Fig. 4, a typical CUDA program can be divided into three phases: host-to-device data transfer (part h), kernel execution (part i) and device-to-host data transfer (part j). In the figure, the kernel execution time and the data transfer time of each stream are assumed to be equal. Different phases of different streams may overlap on the time axis; the overlapping regions in the figure (for example between the yellow and green parts and between the green and blue parts) represent the portions executed in parallel between streams.
Because different streams are independent of one another during execution, the present invention processes the statistics of each organ with a separate stream in the DVH computation. The core CUDA code excerpt is as follows:
// set up stream according to organ and target area quantity
cudaStream_t stream[ORGAN_COUNT];
for(int i=0; i<ORGAN_COUNT; i++)
cudaStreamCreate(&stream[i]);
// use cudaMemcpyAsync () function to realize asynchronous data transfer
for(int i=0; i<ORGAN_COUNT; i++)
cudaMemcpyAsync(); // arguments elided in the excerpt; see the fuller sketch below
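For completeness, a fuller sketch of the per-organ pipeline implied by the excerpt above is given below. The array names, the block size, the kernel name dvhKernel and its parameter list are assumptions for illustration and do not appear in the patent; doseTex and maxDose are set up as described in steps 2 and 3.

// Sketch (assumed names): asynchronously copy each organ's sample points and
// launch its kernel in the same stream, so transfers and kernels of different
// organs can overlap.
for (int i = 0; i < ORGAN_COUNT; i++) {
    // h_points[i]/d_points[i]: host/device arrays holding pointCount[i] sample positions
    cudaMemcpyAsync(d_points[i], h_points[i],
                    pointCount[i] * sizeof(float3),
                    cudaMemcpyHostToDevice, stream[i]);
    int threads = 256;
    int blocks = (pointCount[i] + threads - 1) / threads;
    dvhKernel<<<blocks, threads, 0, stream[i]>>>(doseTex, d_points[i], pointCount[i],
                                                 maxDose, d_bins[i]);
}
for (int i = 0; i < ORGAN_COUNT; i++)
    cudaStreamSynchronize(stream[i]); // wait until all organs have been processed

Note that for the asynchronous copies to actually overlap with kernel execution, the host arrays h_points[i] would have to be allocated as page-locked (pinned) memory, for example with cudaMallocHost().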
Step 2: load the dose matrix into texture memory.
The dose matrix data need to be copied from the host side to the device side, and there are two choices: global memory or texture memory. Because the three-dimensional dose matrix will subsequently be trilinearly interpolated frequently according to the sampling point positions, and each access usually involves eight neighbouring points of the dose matrix, texture memory has a clear advantage over global memory, as described above; texture memory is therefore chosen to hold the dose matrix data.
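A minimal sketch of how the dose matrix might be loaded into a 3D texture with linear filtering is given below. The patent does not specify which texture API is used; the texture-object API of the CUDA runtime and all variable names here are assumptions for illustration.

// Sketch: copy the dose matrix into a cudaArray and create a 3D texture object
// whose filter mode is linear, so that fetches are trilinearly interpolated in hardware.
cudaExtent extent = make_cudaExtent(nx, ny, nz);              // dose matrix dimensions
cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
cudaArray_t doseArray;
cudaMalloc3DArray(&doseArray, &desc, extent);

cudaMemcpy3DParms copyParms = {0};
copyParms.srcPtr = make_cudaPitchedPtr(hostDose, nx * sizeof(float), nx, ny);
copyParms.dstArray = doseArray;
copyParms.extent = extent;
copyParms.kind = cudaMemcpyHostToDevice;
cudaMemcpy3D(&copyParms);

cudaResourceDesc resDesc = {};
resDesc.resType = cudaResourceTypeArray;
resDesc.res.array.array = doseArray;
cudaTextureDesc texDesc = {};
texDesc.filterMode = cudaFilterModeLinear;                    // hardware trilinear interpolation
texDesc.addressMode[0] = cudaAddressModeClamp;
texDesc.addressMode[1] = cudaAddressModeClamp;
texDesc.addressMode[2] = cudaAddressModeClamp;
texDesc.readMode = cudaReadModeElementType;
cudaTextureObject_t doseTex = 0;
cudaCreateTextureObject(&doseTex, &resDesc, &texDesc, NULL);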
Step 3: implement the kernel.
In the kernel, each thread performs a texture fetch at the position point assigned to it. Because the texture filter mode has been set to linear interpolation, the eight neighbouring voxels of the three-dimensional texture are linearly interpolated according to distance and the interpolated value is returned.
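A minimal kernel sketch for this step is given below. The kernel source is not shown in the patent; the kernel name, the 0.5f voxel-centre offset, the mapping of doses to 100 bins and the maxDose parameter are assumptions for illustration (the naive global atomicAdd here is the behaviour that step 4 refines).

// Sketch: one thread per sample point; the trilinear interpolation is done by the
// texture hardware because the filter mode is cudaFilterModeLinear.
__global__ void dvhKernel(cudaTextureObject_t doseTex,
                          const float3* points, int numPoints,
                          float maxDose, unsigned int* bins /* 100 entries */)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx >= numPoints) return;
    float3 p = points[idx];
    // +0.5f addresses the voxel centre in unnormalized texture coordinates
    float dose = tex3D<float>(doseTex, p.x + 0.5f, p.y + 0.5f, p.z + 0.5f);
    int b = max(0, min(99, (int)(dose / maxDose * 99.0f)));
    atomicAdd(&bins[b], 1u);   // naive global-memory update; refined in step 4
}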
Step 4: store the statistical results in shared memory.
If global memory is used to store the voting updates of the threads, only an array of size 100 needs to be allocated in global memory, and each thread's vote directly updates the corresponding array element. However, if several threads vote for the same bin at the same time, write conflicts occur. Suppose N parallel threads are used to accumulate the statistics of one organ, with the results written into the 100 bins; the case N=7 is shown in Fig. 5.
For such write conflicts caused by simultaneous accesses to the same address, CUDA cannot guarantee correctness; reliable access is possible only through the atomic operations provided by CUDA, whose principle is to lock the device bus, which inevitably has a serious impact on the speed of the statistics.
Combining the characteristics of the dose distribution in radiotherapy planning, the present technical solution proposes a method that greatly reduces the delay caused by bank conflicts. The specific method is to allocate N sub dose bins in shared memory so that corresponding bins of different sub dose bins do not reside in the same bank, with adjacent threads of the same warp each updating their own sub dose bin. When N=4 there are 4 sub dose bins in shared memory; if the values voted by four adjacent threads of a warp are 98, 97, 98 and 97, the algorithm places these four counts into four different banks, avoiding access conflicts, as shown in Fig. 6.
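A sketch of this shared-memory scheme is given below for N=4 and a block size that is a multiple of the warp size. The layout subBins[b * N_SUB + sub], the names and the final reduction into the global histogram are assumptions for illustration; the patent describes the scheme only at the level of the preceding paragraph.

// Sketch: N sub dose bins in shared memory. Adjacent threads of a warp use
// different copies, so the corresponding bins fall into different banks.
#define N_SUB 4
__global__ void dvhKernelShared(cudaTextureObject_t doseTex,
                                const float3* points, int numPoints,
                                float maxDose, unsigned int* globalBins)
{
    __shared__ unsigned int subBins[100 * N_SUB];
    // cooperative initialization of the sub dose bins
    for (int i = threadIdx.x; i < 100 * N_SUB; i += blockDim.x)
        subBins[i] = 0;
    __syncthreads();

    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < numPoints) {
        float3 p = points[idx];
        float dose = tex3D<float>(doseTex, p.x + 0.5f, p.y + 0.5f, p.z + 0.5f);
        int b = max(0, min(99, (int)(dose / maxDose * 99.0f)));
        int sub = threadIdx.x % N_SUB;              // adjacent threads pick different copies
        atomicAdd(&subBins[b * N_SUB + sub], 1u);   // copies of bin b sit in different banks
    }
    __syncthreads();

    // sum the N sub dose bins and accumulate into the global 100-bin histogram
    for (int b = threadIdx.x; b < 100; b += blockDim.x) {
        unsigned int total = 0;
        for (int s = 0; s < N_SUB; s++)
            total += subBins[b * N_SUB + s];
        atomicAdd(&globalBins[b], total);
    }
}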
The present invention is explained by taking the statistics of heavy-ion radiotherapy dose data as an example.
Fig. 7a and Fig. 7b are schematic diagrams of the layered heavy-ion irradiation dose distribution.
Because heavy ions exhibit a Bragg peak, the high-dose region can be concentrated in the tumour target region, as shown in Fig. 7a. During radiotherapy a layered irradiation method is adopted; after the high-dose region is visualized, as in Fig. 7b, the data show a corrugated distribution close to degenerate, which can cause severe bank conflicts.
The experiments are based on C++ and compiled in the VS2010 environment. The experimental platform uses an Intel Celeron E3300 @ 2.5 GHz CPU and an NVIDIA GeForce GTX 650 Ti GPU, running Microsoft Windows XP Professional Service Pack 3 with 2 GB of memory.
This experiment mainly analyses the acceleration obtained with the traditional CPU-based method and with the present GPU-based method using CUDA streams for DVH statistics. The experimental data comprise 4 organs, namely skin, left lung, right lung and the tumour target volume, with 262 contour lines in total. The dose data form a 116*60*84 three-dimensional matrix with 0.4 cm spacing along the x, y and z axes. The CPU statistics were first run 10 times, giving an average running time of 436.56 ms, as shown in Table 1. The statistics were then parallelized on the GPU using CUDA streams, texture memory and shared memory, and additionally improved with the bank-conflict-avoidance method; each configuration was run 10 times, and the speed-up ratios relative to the traditional CPU-based method are shown in Table 2. As shown in Fig. 8a and Fig. 8b, the DVHs produced by the two methods are identical.
Table 1, CPU DVH statistics timing (reproduced only as an image in the original publication).
Table 2, acceleration comparison of the CUDA-stream-based DVH statistics (reproduced only as an image in the original publication).
Table 2 shows that a good speed-up ratio is obtained by using the CUDA stream mechanism, texture memory and shared memory for the statistics. The bank-conflict-avoidance algorithm proposed by the present invention provides a further small speed-up on top of this.
Finally, it should be noted that the above are only preferred embodiments of the present invention and are not intended to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing embodiments or replace some of the technical features with equivalents. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (4)

1. A parallel statistical method for DVH based on CUDA streams and shared memory, characterized in that it comprises the following steps:
Step 1: sample the organs on the host side and transfer the sampling point positions to the device side;
when computing the DVH, it is necessary to determine whether a sampling point lies within the contour lines of an organ; the CPU performs this judgment, obtains all sampling point positions of each organ and stores them in an array, each position being represented by a three-dimensional vector pos=(x, y, z);
the CUDA stream mechanism is used to transfer the resulting position arrays into the GPU for computation; different streams execute independently of one another, and each stream processes the position array of one organ;
Step 2: load the dose matrix using texture memory:
the dose matrix data are copied from the host side to the device side and stored in texture memory;
Step 3: each thread performs a texture fetch at the position point assigned to it; the texture filter mode is set to linear interpolation, so that the eight neighbouring voxels of the three-dimensional texture are linearly interpolated according to distance and the interpolated value is returned;
Step 4: store the statistical results in shared memory; by allocating N sub dose bins in shared memory, the bank conflict problem of shared memory is resolved and the statistics are accelerated.
2. The parallel statistical method for DVH based on CUDA streams and shared memory according to claim 1, characterized in that storing the statistical results in shared memory in step 4 mainly consists in allocating N sub dose bins in shared memory so that corresponding bins of different sub dose bins do not reside in the same bank, with adjacent threads of the same warp each updating their own sub dose bin.
3. The parallel statistical method for DVH based on CUDA streams and shared memory according to claim 2, characterized in that the value range of the above N is given by a formula that appears only as an image in the original publication.
4. The parallel statistical method for DVH based on CUDA streams and shared memory according to claim 3, characterized in that N=8.
CN201410033988.1A 2014-01-24 2014-01-24 DVH (dose volume histogram) parallel statistical method based on CUDA (compute unified device architecture) stream and shared memory Expired - Fee Related CN103810670B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410033988.1A CN103810670B (en) 2014-01-24 2014-01-24 DVH (dose volume histogram) parallel statistical method based on CUDA (compute unified device architecture) stream and shared memory

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410033988.1A CN103810670B (en) 2014-01-24 2014-01-24 DVH (dose volume histogram) parallel statistical method based on CUDA (compute unified device architecture) stream and shared memory

Publications (2)

Publication Number Publication Date
CN103810670A true CN103810670A (en) 2014-05-21
CN103810670B CN103810670B (en) 2017-01-18

Family

ID=50707392

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410033988.1A Expired - Fee Related CN103810670B (en) 2014-01-24 2014-01-24 DVH (dose volume histogram) parallel statistical method based on CUDA (compute unified device architecture) stream and shared memory

Country Status (1)

Country Link
CN (1) CN103810670B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105302580A (en) * 2015-11-28 2016-02-03 武汉斗鱼网络科技有限公司 Method and system for rapidly acquiring game graphics through GPU (Graphics Processing Unit) texture sharing
CN107278303A (en) * 2014-12-04 2017-10-20 皇家飞利浦有限公司 Initialization and the gradual QA planned automatically based on shape

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101882311A (en) * 2010-06-08 2010-11-10 中国科学院自动化研究所 Background modeling acceleration method based on CUDA (Compute Unified Device Architecture) technology
US20130204067A1 (en) * 2012-02-07 2013-08-08 Varian Medical Systems International Ag Method and Apparatus Pertaining to the Optimization of Radiation-Treatment Plans Using Automatic Changes to Treatment Objectives

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
JOSIEN P. W. PLUIM et al.: "Mutual information based registration of medical", IEEE Transactions on Medical Imaging *
党建武 et al.: "GPU-based 2D-3D medical image registration", Computer Science *
翟凤文 et al.: "Research on medical image registration based on mutual information and an electromagnetism-like mechanism optimization algorithm", Journal of Lanzhou Jiaotong University *

Also Published As

Publication number Publication date
CN103810670B (en) 2017-01-18

Similar Documents

Publication Publication Date Title
Pratx et al. GPU computing in medical physics: A review
Jia et al. GPU-based high-performance computing for radiation therapy
Zhang et al. Fast tridiagonal solvers on the GPU
CN102375800B (en) For the multiprocessor systems on chips of machine vision algorithm
Chou et al. A fast forward projection using multithreads for multirays on GPUs in medical image reconstruction
Chen et al. Ultrafast convolution/superposition using tabulated and exponential kernels on GPU
Hissoiny et al. A convolution‐superposition dose calculation engine for GPUs
US20210368656A1 (en) Intelligent control and distribution of a liquid in a data center
Xu et al. ARCHER, a new Monte Carlo software tool for emerging heterogeneous computing environments
CN110504016B (en) Monte Carlo grid parallel dose calculation method, equipment and storage medium
Liu et al. GPU-based branchless distance-driven projection and backprojection
WO2007106815A2 (en) Methods and apparatus for hardware based radiation dose calculation
Tickner Monte Carlo simulation of X-ray and gamma-ray photon transport on a graphics-processing unit
US20210267095A1 (en) Intelligent and integrated liquid-cooled rack for datacenters
WO2022010745A1 (en) Intelligent multiple mode cooling unit for datacenter racks
Neylon et al. A nonvoxel‐based dose convolution/superposition algorithm optimized for scalable GPU architectures
CN103810670A (en) DVH (dose volume histogram) parallel statistical method based on CUDA (compute unified device architecture) stream and shared memory
Mensmann et al. An advanced volume raycasting technique using GPU stream processing
Hansen et al. Synthetic aperture beamformation using the GPU
US11681341B2 (en) Intelligent repurposable cooling systems for mobile datacenter
Bruckner Efficient volume visualization of large medical datasets
Gu et al. Accurate and efficient GPU ray‐casting algorithm for volume rendering of unstructured grid data
Zhang et al. High performance parallel backprojection on multi-GPU
CN102122323A (en) Method for quickly realizing Gamma analysis method based on GPU (graphic processing unit)
Wang et al. Compute Unified Device Architecture-Based Parallel Dose-Volume Histogram Computation

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20170118

Termination date: 20180124