Background
With the development of medical imaging technology and the innovation of computer equipment, the resolution and imaging precision of medical images are continuously improved, so that the medical images are widely applied to clinical and medical research. The image registration technology is an indispensable important technology in a plurality of medical image applications, and various types of information are matched in the same space, so that disease analysis and diagnosis are facilitated.
The three-dimensional medical image registration process mainly comprises five parts of image preprocessing, image transformation, gray interpolation, target function optimization and similarity measure calculation. The ultimate goal of image registration is to maximize the similarity between the two images to be registered, which is usually determined quantitatively using a similarity measure. Common similarity measurement methods include sum of squares of errors, mean square error, root mean square error, cross-correlation coefficient, mutual information amount, and the like. The mutual information concept is derived from information theory and is used for expressing the amount of information contained in one system and the information contained in another system. In image registration, mutual information is often used to express the degree of correlation between one image and another image, and when two images are geometrically aligned, the statistical correlation (i.e., mutual information) of the gray values of the pixels corresponding to the images is maximized. Mutual information is widely applied to image registration technology because of the advantages of no need of preprocessing images, high registration accuracy, good robustness and the like.
Mutual information is a common similarity measurement method at present, and when the joint entropy is calculated by the traditional mutual information, only the probability that the same gray level appears in the same coordinate of a single pixel point in two images is considered, but the relative relation between the pixel and other pixels in the neighborhood space of the pixel is ignored, and the image space information is not grasped. Under the condition that the image has noise or is incomplete and the resolution ratio is low, the calculation of mutual information is easy to cause the image to fall into a local extremum. Therefore, in order to better solve the problem of similarity measurement, a method for registering images by using a regional mutual information measurement concept is provided, and the gray information of the image field space is also used as an important measurement standard to introduce mutual information calculation, so that the registration accuracy and robustness are improved. Compared with the traditional method for directly reading the gray level occurrence times of a single point by using mutual information, the method for calculating and counting the gray level image and the probability density has the advantages that the calculation process of the regional mutual information is more complicated, a large number of high-dimensional matrix calculations exist, the time consumption is far larger than that of the traditional mutual information, the cost of the paid time is very large, and how to calculate the high-dimensional matrix more quickly is the key for solving the problem.
Disclosure of Invention
In order to solve the problem that the time consumption for calculating the high-dimensional matrix in the regional mutual information measurement is long, the invention provides a three-dimensional medical image registration similarity calculation method based on a GPU, which can greatly shorten the running time, thereby improving the registration accuracy and robustness and enabling the regional mutual information measurement method to be better applied to the registration technology.
The technical scheme adopted by the invention for solving the technical problems is as follows:
a three-dimensional medical image registration similarity calculation method based on a GPU comprises the following steps:
reading data, opening a memory space in a CPU memory, reading two three-dimensional medical image data with the same resolution, marking the two images as a reference image R and a floating image F respectively, and marking the three-dimensional medical image, wherein the total pixel number N of the three-dimensional image is (x-2R) (y-2R) (z-2R), wherein x, y and z are the resolution of the three-dimensional image, and R is the radius of an adjacent point; creating column vector arrays based on neighborhood space information in a CPU memory, wherein each column vector array comprises d-2 (2r +1)3The voxels of the reference image R and the floating image F are R, respectively, 54 elementsi,j,kAnd Fi,j,kConnecting each corresponding voxel point of the two images with the information of the neighborhood voxel points to obtain a combined voxel Vi,j,kRepresents a d-dimensional space point PiAll of PiA joint distribution, i.e. a d × N sized matrix P, denoted P ═ P1,…,PN];
Opening a space in the video memory, and transmitting data to the video memory;
opening up a video memory space for data to be operated and variables needing to store results in a video memory, reading image data into a system memory by a CPU (central processing unit), obtaining the size of an image, opening up a memory space with the same size on a GPU (graphics processing unit) by using a cudaMalloc function, initializing the memory space, and opening up two d × N-sized video memory spaces for storing a combined gray-scale histogram P and an average value
And a space of d × d size for storing a covariance matrix C, using a cudaMemcpy function to convert image information data to be operated in a CPU memoryCopying to a GPU video memory;
step three, calculating the average value of the joint voxels
Using atomic operations in GPU, statistics
To obtain
Step four, distributing threads, executing kernel function and calculating the average value of the joint voxels
Difference P from joint distribution P
0;
Each thread block is allocated with nthreamedPerBlock 16 × 16 256 threads, and each thread grid is allocated with
The thread index in the thread block is threadedIndex, x + threadedX, y + blockDim, the thread index in the thread grid is blockIndex, x + blockIdx, y grid Dim, the thread index is i threadedIndex, blockDim, x blockDim, and each thread i is responsible for calculating:
after all threads finish computing, K
iForm d × N matrix P
0;
Step five, calculating the covariance of each point in the neighborhood:
decomposing a large matrix into a plurality of sub-matrixes by introducing a shared memory and a matrix blocking concept, wherein each thread block only reads the sub-matrixes in the shared memory in the block for calculation, and then the matrixes in the shared memory are combined to obtain a final result, so as to obtain a covariance C, wherein C is a d x d matrix;
step six, transmitting the calculation result to a CPU memory by using a cudaMemcpy function, and calculating the joint entropy based on the CPU;
Rdthe information entropy of the covariance matrix for a set of positive-distribution points in space is calculated as follows:
substituting the covariance C obtained in the fifth step into the formula to obtain a joint information entropy H (R, F);
seventhly, solving the information entropies H (R), H (F) of the reference image and the floating image
Get C
RAs a covariance matrix of the reference image, C
FAnd (4) substituting the covariance matrix of the floating image into a formula to obtain the information entropy H (R) and H (F) of the two images respectively. Wherein C is
RIs the upper left part of the covariance matrix C
Size submatrix, C
FIs the lower right part of the covariance matrix C
A sub-matrix of size;
step eight, calculating the region mutual information RMI
And the final region mutual information is RMI (R, F) ═ H (R) + H (F) -H (R, F), and the similarity degree of the two three-dimensional medical images, namely the similarity degree of the reference image and the floating image, can be quantitatively analyzed and judged according to the RMI value of the region mutual information.
It can be seen from the above steps that the region mutual information needs to be calculated by using d × N54 × (x-2) (y-2) (z-2) data to perform joint information entropy, a large amount of matrix calculation needs to be performed, the complexity is high, and due to independence of a data processing process, GPU multithreading parallel processing can be used, and the registration efficiency, accuracy and robustness can be greatly improved by the GPU-based three-dimensional medical image registration similarity calculation method.
Further, the third stepIn the method, the kernel function thread index of the atomic operation is (threadidx.x + threadidx.y × blockdim.x) + (blockidx.x + blockidx.y × gridddim.x) × blockdim.x × blockdim.y, wherein threadidx.x and threadidx.y are thread indexes in the thread blocks in the x and y dimensions, respectively, blockidx.x and blockidx.y are thread block indexes in the x and y dimensions, respectively, blockdim.x and blockidx.y are thread block numbers in the x and y dimensions, respectively, and griddim.x and griddim.y are thread block numbers in the y dimension, respectively
Still further, in the fourth step, dim3 type thread blocks are set, each thread block has a dimension of 16 × 16, that is, 256 threads in total, and in order to ensure that the total prime number and the number of allocated threads are in an integer multiple relationship, and the number of bus thread blocks required by the allocated threads is solved by an upward rounding strategy to be equal to or greater than the total prime number
And ensuring that each voxel point has a unique corresponding thread and kernel function parameter and calculation, each thread performs independent and same operation, and acquiring data and processing the data according to the thread index i to complete calculation.
Due to the adoption of the technical scheme, the invention has the following advantages: the registration accuracy and robustness of the three-dimensional medical image can be effectively improved; the registration efficiency can be improved through GPU parallel acceleration; the method can provide a stable, efficient and repeatable analysis method for the research of the three-dimensional medical image.
Detailed Description
The invention will be further explained with reference to the drawings.
Referring to fig. 1 to 3, a method for calculating the registration similarity of three-dimensional medical images based on a GPU includes the following steps:
reading data, opening a memory space in a CPU memory, reading two three-dimensional medical image data with the same resolution, marking the two images as a reference image R and a floating image F respectively, marking the three-dimensional medical image, neglecting the influence of an image edge voxel point on final information entropy as the main information of the image in the medical image is positioned in the center of the image, and obtaining the total pixel number N of the three-dimensional image which is (x-2R) (y-2R) (z-2R), wherein x, y and z are the resolution of the three-dimensional image and R is the radius of an adjacent point;
creating column vector arrays based on neighborhood space information in a CPU memory, wherein each column vector array comprises d-2 (2r +1)3The voxels of the reference image R and the floating image F are R, respectively, 54 elementsi,j,kAnd Fi,j,kConnecting each corresponding voxel point of the two images with the information of the neighborhood voxel points to obtain a combined voxel Vi,j,kRepresents a d-dimensional space point PiAll of PiA joint distribution, i.e. a d × N sized matrix P, denoted P ═ P1,…,PN];
Opening a space in the video memory, and transmitting data to the video memory;
opening up the video memory space for the data to be operated and the variable of the result to be stored in the video memory, and reading the image data into the system memory by the CPU to obtain the image size, so that the image size is ensuredOpening up memory spaces with the same size on a GPU by using a cudaMalloc function, initializing the memory spaces, and opening up two d × N-sized memory spaces for storing a combined gray-scale histogram matrix P and an average value
A space with the size of d × d is used for storing a covariance matrix C, and image information data to be operated in a CPU memory is copied to a GPU video memory by using a cudaMemcpy function;
step three, calculating the average value of the joint voxels
Using atomic operations in GPU, statistics
To obtain
Step four, distributing threads, executing kernel function and calculating the average value of the joint voxels
Difference P from joint distribution P
0;
Each thread block is allocated with nthreamedPerBlock 16 × 16 256 threads, and each thread grid is allocated with
The thread index in the thread block is threadedIndex, x + threadedX, y + blockDim, the thread index in the thread grid is blockIndex, x + blockIdx, y grid Dim, the thread index is i threadedIndex, blockDim, x blockDim, and each thread i is responsible for calculating:
after all threads finish computing, K
iForm d × N matrix P
0;
Step five, calculating the covariance of each point in the neighborhood:
The covariance calculation is matrix multiplication, a large matrix is decomposed into a plurality of sub-matrixes by introducing a shared memory and a matrix blocking concept, each thread block only reads the sub-matrixes in the shared memory in the block for calculation, and then the matrixes in the shared memory are combined to obtain a final result, so that the covariance C is obtained, wherein C is a d x d matrix;
step six, transmitting the calculation result to a CPU memory by using a cudaMemcpy function, and calculating the joint entropy based on the CPU;
Rdthe information entropy of the covariance matrix for a set of positive-distribution points in space is calculated as follows:
substituting the covariance C obtained in the fifth step into the formula to obtain a joint information entropy H (R, F);
step seven, solving the information entropies H (R), H (F) of the reference image and the floating image;
get C
RAs a covariance matrix of the reference image, C
FAnd (4) substituting the covariance matrix of the floating image into a formula to obtain the information entropy H (R) and H (F) of the two images respectively. Wherein C is
RIs the upper left part of the covariance matrix C
Size submatrix, C
FIs the lower right part of the covariance matrix C
A sub-matrix of size;
step eight, calculating the region mutual information RMI;
and the final region mutual information is RMI (R, F) ═ H (R) + H (F) -H (R, F), and the similarity degree of the two three-dimensional medical images, namely the similarity degree of the reference image and the floating image, can be quantitatively analyzed and judged according to the RMI value of the region mutual information.
It can be seen from the above steps that the region mutual information needs to be calculated by using d × N54 × (x-2) (y-2) (z-2) data to perform joint information entropy, a large amount of matrix calculation needs to be performed, the complexity is high, and due to independence of a data processing process, GPU multithreading parallel processing can be used, and the registration efficiency, accuracy and robustness can be greatly improved by the GPU-based three-dimensional medical image registration similarity calculation method.
Further, in the third step, the atomic operation has a kernel function thread index of (threadidx.x + threadidx.y × blockdim.x) + (blockidx.x + blockidx.y × griddim.x) × blockdim.x × blockdim.y, where threadidx.x and threadidx.y are thread indices in the x and y dimensions of the thread block, blockidx.x and blockidx.y are thread indices in the x and y dimensions of the thread block, respectively, blockdix.x and blockdiy.y are thread numbers in the x and y dimensions, respectively, and griddim.x and griddim.y are thread numbers in the x and y dimensions, respectively
In the fourth step, dim3 type thread blocks are set, each thread block selects the dimension of 16 × 16, namely 256 threads in total, in order to ensure that the total prime number is in integral multiple relation with the number of the allocated threads, and the allocated threads are greater than or equal to the total prime number, the number of the bus thread blocks required by the allocated threads is solved by a strategy of rounding up
And ensuring that each voxel point has a unique corresponding thread and kernel function parameter and calculation, each thread performs independent and same operation, and acquiring data and processing the data according to the thread index i to complete calculation.