WO2024082441A1 - Deep learning-based multi-modal image registration method and system, and medium - Google Patents


Info

Publication number
WO2024082441A1
WO2024082441A1 (PCT/CN2022/142807)
Authority
WO
WIPO (PCT)
Prior art keywords
image
feature points
image feature
points
similarity
Prior art date
Application number
PCT/CN2022/142807
Other languages
French (fr)
Chinese (zh)
Inventor
刘洁
王涛
顾力栩
Original Assignee
上海精劢医疗科技有限公司
精劢医疗科技南通有限公司
上海偌劢机器人科技有限公司
Priority date
Filing date
Publication date
Application filed by 上海精劢医疗科技有限公司, 精劢医疗科技南通有限公司, 上海偌劢机器人科技有限公司
Publication of WO2024082441A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10: Complex mathematical operations
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/0464: Convolutional networks [CNN, ConvNet]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00: Geometric image transformations in the plane of the image
    • G06T3/40: Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/30: Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/33: Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/40: Extraction of image or video features
    • G06V10/46: Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74: Image or video pattern matching; Proximity measures in feature spaces
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Definitions

  • the present application relates to the field of image processing technology, and in particular, to a multimodal image registration method, system and medium based on deep learning.
  • CT: computed tomography
  • MRI: magnetic resonance imaging
  • US: ultrasound
  • CT has obvious imaging advantages for high-density tissues in the human body, such as bones;
  • MRI has better resolution for soft tissues, etc.
  • the fusion of multiple modal images can provide complementary information to better achieve the purpose of diagnosis, evaluation or intervention.
  • the fusion of multimodal images can fully combine the tissue characteristics reflected by different modal images to give a more accurate judgment on whether there is a lesion, the nature of the lesion, and the range.
  • the fusion of preoperative images and intraoperative images can achieve the superposition of preoperative planning and intraoperative images, which can provide doctors with richer and more intuitive information, improve the quality of image guidance during intervention, thereby improving the quality of surgery and clinical outcomes.
  • images of different modalities are usually acquired at different time points using different scanning instruments. This process is accompanied by changes in the patient's posture and internal anatomical structure. Therefore, the prerequisite for achieving multimodal image fusion is to perform multimodal medical image registration, and the accuracy of the registration directly determines the effect of the fusion.
  • Multimodal medical image registration is a challenging problem.
  • the relationship between the grayscale distribution of medical images of different modalities is often complex and uncertain.
  • structures and features that exist in one modality may be missing in another modality.
  • Traditional multimodal registration methods can be roughly divided into grayscale-based registration methods and anatomical feature-based registration methods.
  • Grayscale-based registration methods mainly use multimodal similarity measures, such as mutual information and cross-correlation; anatomical feature-based registration methods mainly rely on landmarks identified in images of different modalities.
  • deep learning technology has developed rapidly, and has also been increasingly studied and applied in the field of image registration, which is expected to solve the problems of slow registration speed and insufficient registration accuracy in traditional registration.
  • a multimodal image registration method, system and medium based on deep learning are provided.
  • Multimodal image registration methods based on deep learning include:
  • the three-dimensional images include at least one reference image and at least one floating image; acquire a region to be registered of the three-dimensional image, detect image feature points in the region to be registered of the reference image, wherein the image feature points are points that can be distinguished from image features of other points in a neighborhood; obtain image blocks of a preset size with each of the image feature points as the center; input the image blocks into a similarity network to obtain a similarity graph within a corresponding range of the floating image; input the coordinate information of the image feature points, the image blocks of the reference image and the corresponding similarity graph into a displacement network to obtain a displacement vector; interpolate the region without image feature points based on the displacement vector to obtain a displacement vector field; and perform spatial transformation on the floating image according to the displacement vector field to obtain a registration result.
  • the region to be registered is determined through manual interaction, or is determined based on a grayscale threshold of the image, or is determined by automatically detecting and segmenting a specific structure in the image.
  • the method of acquiring the image feature points includes:
  • a specific structure in the area to be registered of the reference image is segmented, a feature score is obtained based on the positional relationship between each boundary point of the specific structure and the boundary points around it, and the boundary points with feature scores greater than a second preset value are used as the image feature points.
  • the feature score S(p) of the voxel point at coordinate p in the image I is determined according to the Foerstner operator; in its standard form the expression is S(p) = 1 / Tr( ( K_σ ∗ (∇I(p) ∇I(p)^T) )^(-1) ), where ∇I denotes the spatial gradient of the image and ∗ denotes convolution;
  • K_σ represents the Gaussian kernel function with variance σ;
  • Tr(·) represents the trace of the matrix.
  • the registration method further comprises:
  • the number and distribution of the image feature points are adjusted, and the adjustment includes any of the following:
  • Adjust the distribution of the image feature points: scan the reference image with a sampling window of a set size, and when two or more image feature points appear in the sampling window, retain only the image feature point with the largest feature score;
  • adjust the number and distribution of the image feature points: when the number of the image feature points is greater than a third preset value, randomly select a point from the detected image feature points to initialize an adjustment point set, and each time select the point farthest from the adjustment point set among the remaining image feature points and add it to the adjustment point set, until the number of image feature points in the adjustment point set reaches a fourth preset value, wherein the distance of an image feature point from the adjustment point set is the minimum Euclidean distance from that point to all image feature points in the adjustment point set;
  • adjust the number and distribution of the image feature points: when the number of the image feature points is greater than a third preset value, construct an octree from all the image feature points and traverse it according to the breadth-first principle; if the point with the largest feature score in the current subtree is not in the adjustment point set, add it to the adjustment point set, until the number of image feature points in the adjustment point set reaches a fourth preset value.
  • the input of the similarity network is the image blocks corresponding to the reference image and the floating image
  • the image block of the reference image has a size of W1×H1×D1;
  • the image block of the floating image includes a specified detection range, has a size of W2×H2×D2, and satisfies W1<W2, H1<H2, D1<D2;
  • the output of the similarity network is a similarity map corresponding to the image feature points, and the size of the similarity map is [(W2-W1)/q+1]×[(H2-H1)/q+1]×[(D2-D1)/q+1], where q is a downsampling coefficient.
  • the displacement network includes an encoding part, an interaction part and a decoding part;
  • the input of the encoding part includes a similarity map of each image feature point, an image block of the corresponding reference image and coordinate information of the corresponding image feature point,
  • the interaction part receives the encoding results of all image feature points, and encodes the interaction information between different image feature points
  • the decoding part receives the output of the encoding part, the interaction part and some intermediate states, and outputs a displacement vector corresponding to each image feature point;
  • the displacement vector is obtained by first obtaining a displacement probability map, and then taking each pixel value in the displacement probability map as the weight of that pixel's coordinate and computing a weighted average;
  • the encoding part and the decoding part are connected by a skip connection.
  • the registration method further comprises:
  • the similarity of the local structure between the reference image and the floating image is computed from specified features; an objective function is constructed from the similarity and a smoothness constraint, and the displacement vector field is locally adjusted by minimizing the objective function.
  • the present application also provides a multimodal image registration system based on deep learning, comprising:
  • Region-to-be-registered acquisition module: acquires three-dimensional images of different modalities, wherein the three-dimensional images include at least one reference image and at least one floating image, and acquires a region to be registered of the three-dimensional images;
  • Image feature point detection module: detects image feature points in the region to be registered of the reference image, wherein the image feature points are points whose image features can be distinguished from those of other points in a neighborhood;
  • Similarity map acquisition module: obtains an image block of a preset size centered on each image feature point, and inputs the image block into the similarity network to obtain a similarity map within the corresponding range of the floating image;
  • Displacement vector field acquisition module: inputs the coordinate information of the image feature points, the image blocks of the reference image and the corresponding similarity maps into the displacement network to obtain displacement vectors, and interpolates the regions without image feature points based on the displacement vectors to obtain a displacement vector field;
  • Registration module: performs spatial transformation on the floating image according to the displacement vector field to obtain a registration result.
  • the present application also provides a computer-readable storage medium, on which a deep learning-based multimodal image registration program is stored.
  • when the deep learning-based multimodal image registration program is executed by a processor, the above-mentioned deep learning-based multimodal image registration method is implemented.
  • FIG1 is a flow chart of a multimodal image registration method based on deep learning according to an embodiment of the present application;
  • FIG2 is a schematic diagram of a similarity network structure according to an embodiment of the present application.
  • FIG3 is a schematic diagram of a displacement network structure according to an embodiment of the present application.
  • the present application discloses a multimodal image registration method based on deep learning, as shown in FIG1 , comprising the following steps:
  • Step S1 Acquire three-dimensional images of different modalities, wherein the three-dimensional images include at least one reference image and at least one floating image; and acquire areas to be registered of the two images.
  • modality 1 is designated as the reference image
  • modality 2 is designated as the floating image and interpolated to the same spatial resolution as modality 1.
  • the three-dimensional image can be CT, MRI, ultrasound (three-dimensional ultrasound or three-dimensional ultrasound image reconstructed from a series of two-dimensional ultrasound images), etc.
  • the area to be registered can be determined by manual interaction, or by the grayscale threshold of the image, or by automatic detection and segmentation of specific structures in the image.
  • a special case is that the area to be registered is the entire image.
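As an illustration of the grayscale-threshold option above, the following minimal Python sketch crops a bounding-box region to be registered out of a 3-D volume. The threshold value and the bounding-box strategy are illustrative assumptions of this sketch, not taken from the patent:

```python
import numpy as np

def roi_from_threshold(img, threshold):
    """Bounding box of all voxels brighter than `threshold` (illustrative)."""
    mask = img > threshold
    if not mask.any():
        return None  # nothing exceeds the threshold
    coords = np.argwhere(mask)
    lo = coords.min(axis=0)          # inclusive lower corner
    hi = coords.max(axis=0) + 1      # exclusive upper corner
    return tuple(slice(int(a), int(b)) for a, b in zip(lo, hi))

# usage: crop the region to be registered out of a small synthetic volume
vol = np.zeros((8, 8, 8))
vol[2:5, 3:6, 1:4] = 100.0
roi = roi_from_threshold(vol, 50.0)
```

The returned slices can then be applied directly, e.g. `vol[roi]`, so that all subsequent feature detection runs only inside the region to be registered.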
  • Step S2 Detect image feature points in the area to be registered of the reference image. Sample points in the area to be registered of the reference image, obtain feature scores based on the neighborhood information of the sampled points, and use points with feature scores greater than a set threshold as image feature points.
  • the image feature points are obtained in the following way:
  • Grid sampling or random sampling is performed from the area to be registered in the reference image, and a three-dimensional operator constructed based on the grayscale variance, gradient value, etc. in the neighborhood of the sampling point is used to determine the feature score, and the point with a feature score higher than the first preset value is used as the image feature point.
  • the Foerstner operator is a commonly used three-dimensional feature point detection operator, which can be used to obtain the feature score S(p) of the pixel point at coordinate p in the image I; in its standard form the expression is S(p) = 1 / Tr( ( K_σ ∗ (∇I(p) ∇I(p)^T) )^(-1) ), where ∇I denotes the spatial gradient of the image and ∗ denotes convolution;
  • K_σ represents the Gaussian kernel function with variance σ;
  • Tr(·) represents the trace of the matrix.
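A minimal numpy sketch of the Foerstner score described above. It uses the identity 1/Tr(A⁻¹) = det(A) / (sum of the principal 2×2 minors of A) for the symmetric structure tensor A = K_σ ∗ (∇I ∇I^T), which avoids inverting a 3×3 matrix at every voxel; the Gaussian kernel radius and ε are illustrative choices:

```python
import numpy as np

def _gauss_smooth(a, sigma, radius=3):
    """Separable Gaussian smoothing along all three axes (illustrative kernel size)."""
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    for ax in range(3):
        a = np.apply_along_axis(lambda v: np.convolve(v, k, mode="same"), ax, a)
    return a

def foerstner_score(img, sigma=1.0, eps=1e-12):
    """S(p) = 1 / Tr(A(p)^-1) with A = K_sigma * (grad I grad I^T)."""
    gx, gy, gz = np.gradient(img.astype(float))
    # smoothed structure-tensor components (symmetric 3x3 per voxel)
    xx = _gauss_smooth(gx * gx, sigma); yy = _gauss_smooth(gy * gy, sigma)
    zz = _gauss_smooth(gz * gz, sigma); xy = _gauss_smooth(gx * gy, sigma)
    xz = _gauss_smooth(gx * gz, sigma); yz = _gauss_smooth(gy * gz, sigma)
    det = xx*(yy*zz - yz*yz) - xy*(xy*zz - yz*xz) + xz*(xy*yz - yy*xz)
    minors = (yy*zz - yz*yz) + (xx*zz - xz*xz) + (xx*yy - xy*xy)
    return det / (minors + eps)  # = det(A)/sum of principal minors = 1/Tr(A^-1)
```

Voxels in flat regions receive a score near zero (the structure tensor is singular there), while voxels near corner-like structures score high; thresholding this map at the first preset value yields the image feature points.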
  • Another implementation method for obtaining image feature points is to segment the specific structure in the to-be-registered region of the reference image, obtain a feature score based on the positional relationship between each boundary point of the specific structure and its surrounding boundary points, and use the boundary points with a feature score greater than a second preset value as image feature points. For example, a curvature value is obtained through the positional relationship between each boundary point and its surrounding boundary points, and a point with a curvature value greater than a set threshold is used as an image feature point.
  • Step S3 Adjust the number and distribution of image feature points. To prevent the image feature points from being concentrated in the same area or distributed too unevenly across the region to be registered, the number and distribution of the image feature points are adjusted; this also prevents large numbers of feature points from falling within the same subsequent image blocks.
  • the adjustment includes any of the following:
  • Adjust the distribution of image feature points: scan the reference image with a sampling window of a set size; when two or more image feature points appear in the sampling window, retain only the image feature point with the largest feature score.
  • Adjust the number and distribution of image feature points: when the number of image feature points is greater than a third preset value, randomly select a point from the detected image feature points to initialize the adjustment point set, and each time select the point farthest from the adjustment point set among the remaining image feature points and add it to the adjustment point set, until the number of image feature points in the adjustment point set reaches a fourth preset value; the distance of an image feature point from the adjustment point set is the minimum Euclidean distance from that point to all image feature points in the adjustment point set.
  • Adjust the number and distribution of image feature points: when the number of image feature points is greater than the third preset value, construct an octree from all the image feature points and traverse it according to the breadth-first principle; if the point with the largest feature score in the current subtree is not in the adjustment point set, add it to the adjustment point set, until the number of image feature points in the adjustment point set reaches a fourth preset value.
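The second adjustment strategy above is greedy farthest-point sampling. A self-contained sketch (the random initialization seed is an illustrative choice):

```python
import numpy as np

def farthest_point_sampling(points, k, seed=0):
    """Greedily pick k well-spread feature points.

    Starts from one random point; each round adds the remaining point whose
    minimum Euclidean distance to the already-chosen set is largest."""
    pts = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    chosen = [int(rng.integers(len(pts)))]
    # distance of every point to the current adjustment point set
    dist = np.linalg.norm(pts - pts[chosen[0]], axis=1)
    while len(chosen) < k:
        nxt = int(np.argmax(dist))
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(pts - pts[nxt], axis=1))
    return chosen

# usage: 50 clustered points plus one far outlier; the outlier is always kept
cloud = np.random.default_rng(1).normal(0.0, 1.0, size=(50, 3))
cloud = np.vstack([cloud, [100.0, 100.0, 100.0]])   # index 50, far away
picked = farthest_point_sampling(cloud, k=5)
```

Each iteration updates the point-to-set distance with a single `np.minimum`, so the overall cost is O(n·k) rather than recomputing all pairwise distances.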
  • Step S4 Taking each image feature point as the center, extract an image block containing a specified neighborhood range around the feature point, input the image block into the similarity network, and obtain a similarity graph within the corresponding range of the floating image.
  • the input of the similarity network is the image blocks corresponding to the reference image and the floating image
  • the image block size of the reference image is W1×H1×D1;
  • the image block of the floating image includes the specified detection range, the size of which is W2×H2×D2, and satisfies W1<W2, H1<H2, D1<D2.
  • the output of the similarity network is the similarity graph of the corresponding image feature points;
  • the size of the similarity graph is [(W2-W1)/q+1]×[(H2-H1)/q+1]×[(D2-D1)/q+1], where q is the downsampling coefficient.
  • the value at any point in the similarity graph indicates the possibility, predicted from local image features, that the corresponding position in the floating image corresponds to the same anatomical point as the image feature point in the reference image.
  • the similarity network is a convolutional neural network based on self-supervised training.
  • the peak value of the similarity graph is used to construct a contrastive loss function, which determines whether the floating image block contains the anatomical structure corresponding to the feature point of the reference image.
  • the image blocks of the reference image and the image blocks of the floating image can be feature encoded respectively through two convolutional neural network branches to obtain a first feature map after the image blocks of the reference image are encoded and a second feature map after the image blocks of the floating image are encoded.
  • a sliding-window convolution is then performed, using the first feature map as the kernel sliding over the second feature map, and the similarity map is obtained after normalization.
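The sliding-window step can be sketched as follows. The two CNN encoder branches are omitted here, raw patch intensities stand in for the feature maps, and a downsampling coefficient q = 1 is assumed; normalization is done per window (cosine similarity), which is one plausible reading of "obtained after normalization":

```python
import numpy as np

def similarity_map(ref_feat, flo_feat, q=1, eps=1e-8):
    """Slide ref_feat over flo_feat; normalized dot product per offset.

    Output size is [(W2-W1)/q+1, (H2-H1)/q+1, (D2-D1)/q+1]."""
    w1, h1, d1 = ref_feat.shape
    w2, h2, d2 = flo_feat.shape
    out = np.zeros(((w2 - w1)//q + 1, (h2 - h1)//q + 1, (d2 - d1)//q + 1))
    rnorm = np.linalg.norm(ref_feat) + eps
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            for k in range(out.shape[2]):
                win = flo_feat[i*q:i*q+w1, j*q:j*q+h1, k*q:k*q+d1]
                out[i, j, k] = (ref_feat * win).sum() / (rnorm * (np.linalg.norm(win) + eps))
    return out

# usage: plant the reference patch inside the floating patch at offset (2, 1, 3)
rng = np.random.default_rng(0)
ref = rng.normal(size=(4, 4, 4))
flo = rng.normal(size=(9, 9, 9))
flo[2:6, 1:5, 3:7] = ref
sim = similarity_map(ref, flo)
```

The map peaks where the window best matches the reference patch; in a real system the two inputs would be the CNN-encoded feature maps and the loop would be a strided 3-D convolution.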
  • Step S5 input the coordinate information of the image feature points, the image block of the reference image and the corresponding similarity graph into the displacement network to obtain a displacement vector.
  • the displacement network includes an encoding part, an interaction part and a decoding part.
  • the input of the encoding part includes a similarity map of each image feature point, an image block of the corresponding reference image and the coordinate information of the corresponding image feature point.
  • the above three inputs can be encoded separately and output to the interaction part, or all three or any two of them can be combined (e.g., by concatenation or addition) and then jointly encoded and output to the interaction part.
  • the interaction part receives the encoding results of all image feature points and encodes the interaction information between different image feature points.
  • the decoding part receives the output of the encoding part, the interaction part and some intermediate states, and outputs the displacement vector corresponding to each image feature point.
  • the encoding part encodes the similarity graph of each image feature point and the image block of the corresponding reference image through two convolutional neural network branches respectively, encodes the position of the image feature point with a fixed positional encoding, and adds it to the image-block encoding.
  • the interaction part can be constructed by a self-attention mechanism: the relationship between different image feature points is obtained and encoded through the self-attention layer, and the feature transformation is further performed through the feedforward layer.
  • the structure composed of the above self-attention layer and the feedforward layer can be cascaded in multiple levels.
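A minimal numpy sketch of one self-attention layer over per-feature-point encodings, as described for the interaction part above. The token dimension and the random weights are illustrative; the patent does not fix a specific architecture, and a real implementation would train these weights and stack several such layers with feedforward blocks:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_layer(tokens, Wq, Wk, Wv):
    """One self-attention pass: every feature-point token attends to all others."""
    Q, K, V = tokens @ Wq, tokens @ Wk, tokens @ Wv
    attn = softmax(Q @ K.T / np.sqrt(Q.shape[-1]))   # (n_points, n_points)
    return attn @ V, attn

# usage: 6 feature-point encodings of dimension 16 (illustrative sizes)
rng = np.random.default_rng(0)
tokens = rng.normal(size=(6, 16))
Wq, Wk, Wv = (rng.normal(size=(16, 16)) * 0.1 for _ in range(3))
out, attn = self_attention_layer(tokens, Wq, Wk, Wv)
```

The attention matrix is exactly the "interaction information between different image feature points": row i weights how much feature point i draws on every other point's encoding.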
  • the decoding part is constructed based on the convolutional neural network, and a normalized displacement probability map can be obtained.
  • the value at any point in the displacement probability map indicates the possibility, predicted from the position distribution of all image feature points, the image blocks of the reference image and the corresponding similarity maps, that the corresponding position in the floating image and the feature point in the reference image correspond to the same anatomical point.
  • each pixel value in the displacement probability map is used as the weight of that pixel's coordinate, and the weighted average gives the displacement vector; alternatively, the coordinate corresponding to the maximum value in the displacement probability map is taken as the displacement vector.
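The two readout options above (probability-weighted average vs. argmax) can be sketched as follows. Measuring offsets from the map center and rescaling by the downsampling coefficient q are assumptions of this sketch:

```python
import numpy as np

def displacement_from_prob(prob, q=1):
    """Soft and hard readouts of a displacement probability map.

    Offsets are measured from the map center; multiplying by q maps them
    back to image resolution. `soft` is the probability-weighted average of
    voxel offsets, `hard` is the offset of the maximum."""
    p = prob / prob.sum()                                 # ensure normalization
    center = (np.array(prob.shape) - 1) / 2.0
    coords = np.indices(prob.shape).reshape(3, -1).T      # all voxel coordinates
    soft = (p.reshape(-1, 1) * (coords - center)).sum(axis=0) * q
    hard = (np.array(np.unravel_index(np.argmax(prob), prob.shape)) - center) * q
    return soft, hard

# usage: a one-hot probability map puts all mass at one offset
prob = np.zeros((5, 5, 5))
prob[4, 2, 0] = 1.0
soft, hard = displacement_from_prob(prob, q=2)
```

For a one-hot map both readouts coincide; for a diffuse map the weighted average gives a sub-voxel displacement while argmax snaps to the grid.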
  • the encoding part and the decoding part are connected by a skip connection. In practical applications, multiple displacement network structures can also be cascaded.
  • the interaction part can also be constructed based on graph neural network.
  • Step S6 interpolating the region without image feature points based on the displacement vector to obtain a displacement vector field.
  • One way to store the displacement vector field is as a 6-dimensional matrix, where the first three dimensions are the same size as the modality 1 image, and the last three dimensions represent the displacement vectors that map the corresponding pixel points to the modality 2 image.
  • cubic linear interpolation can be used.
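As a stand-in for the interpolation step, a dense displacement field can be filled in from the sparse feature-point vectors as below. Inverse-distance weighting is used here only to keep the sketch dependency-free; the patent's own choice is linear interpolation, and the field layout (W, H, D, 3) is an assumption:

```python
import numpy as np

def dense_field_idw(points, vectors, shape, power=2, eps=1e-9):
    """Inverse-distance-weighting stand-in for the interpolation step.

    Every voxel's displacement is a distance-weighted average of the feature
    points' displacement vectors; a voxel that coincides with a feature
    point reproduces its vector (up to eps)."""
    pts = np.asarray(points, float)                  # (n, 3) feature coordinates
    vec = np.asarray(vectors, float)                 # (n, 3) their displacements
    grid = np.indices(shape).reshape(3, -1).T.astype(float)   # (V, 3)
    d2 = ((grid[:, None, :] - pts[None, :, :]) ** 2).sum(-1)  # (V, n) squared dists
    w = 1.0 / (d2 ** (power / 2) + eps)
    w /= w.sum(axis=1, keepdims=True)
    return (w @ vec).reshape(*shape, 3)

# usage: two feature points with known displacement vectors on an 8x8x8 grid
pts = [(1, 1, 1), (6, 6, 6)]
vecs = [(2.0, 0.0, 0.0), (0.0, -3.0, 0.0)]
dvf = dense_field_idw(pts, vecs, (8, 8, 8))
```

The O(V·n) distance matrix is fine for a handful of feature points; a production system would use the patent's linear interpolation (or a B-spline fit) instead.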
  • Step S7 locally adjust the displacement vector field to obtain the final displacement vector field.
  • a specific adjustment method is to obtain the similarity of the local structure between the reference image and the floating image by specifying features; construct an objective function based on the similarity and smoothness constraints, and locally adjust the displacement vector field by minimizing the objective function.
  • the modality-independent neighborhood descriptor (MIND) is a common multimodal image feature; it can be used as the specified feature and extracted from the two modal images respectively, and the squared difference of the descriptors of the two images is used to measure the similarity of the local structure. It is also possible to use a similarity network and use its output in place of the above specified-feature similarity to construct the objective function for locally adjusting the displacement vector field.
  • Step S8 Perform spatial transformation on the floating image according to the optimized displacement vector field to obtain a registration result.
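Step S8 can be sketched as follows. Nearest-neighbour resampling and a (W, H, D, 3) field layout are simplifying assumptions of this sketch; a practical implementation would use trilinear resampling:

```python
import numpy as np

def warp_nearest(floating, dvf):
    """Resample the floating image through the displacement vector field.

    dvf[x, y, z] maps reference voxel (x, y, z) to coordinates in the
    floating image; out-of-bounds lookups are clamped to the volume edge."""
    idx = np.indices(floating.shape) + np.rint(np.moveaxis(dvf, -1, 0)).astype(int)
    for ax, size in enumerate(floating.shape):
        idx[ax] = np.clip(idx[ax], 0, size - 1)
    return floating[idx[0], idx[1], idx[2]]

# usage: a constant displacement of +2 along the first axis
flo = np.zeros((6, 6, 6))
flo[4, 3, 3] = 7.0
dvf = np.zeros((6, 6, 6, 3))
dvf[..., 0] = 2.0          # every reference voxel samples 2 voxels ahead
warped = warp_nearest(flo, dvf)
```

The bright voxel originally at index 4 appears at index 2 of the warped volume, i.e. the registered result is the floating image pulled back through the optimized field.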
  • the present application also discloses a multimodal image registration system based on deep learning, comprising:
  • Region-to-be-registered acquisition module: acquires three-dimensional images of different modalities, wherein the three-dimensional images include at least one reference image and at least one floating image, and acquires a region to be registered of the three-dimensional images;
  • Image feature point detection module: detects image feature points in the region to be registered of the reference image, wherein the image feature points are points whose image features can be distinguished from those of other points in a neighborhood;
  • Similarity map acquisition module: obtains an image block of a preset size centered on each image feature point, and inputs the image block into the similarity network to obtain a similarity map within the corresponding range of the floating image;
  • Displacement vector field acquisition module: inputs the coordinate information of the image feature points, the image blocks of the reference image and the corresponding similarity maps into the displacement network to obtain displacement vectors, and interpolates the regions without image feature points based on the displacement vectors to obtain a displacement vector field;
  • Registration module: performs spatial transformation on the floating image according to the displacement vector field to obtain a registration result.
  • the present application also discloses a computer-readable storage medium, such as a computer hard disk, etc., on which a multimodal image registration program based on deep learning is stored.
  • when the multimodal image registration program based on deep learning is executed by a processor, the above-mentioned multimodal image registration method based on deep learning is implemented.
  • This application reduces the interference of low-information points by extracting image feature points, and improves the efficiency of registration, especially when the image size is large.
  • Using the similarity graph of all feature points for global optimization removes the prerequisite of accurately detecting corresponding points in two modalities.
  • the spatial distribution information of all feature points is considered to improve the robustness of the algorithm.
  • the introduction of deep learning, on the one hand, fully extracts the correspondence information of the same anatomical structure between different modalities, and on the other hand avoids the large time overhead caused by iterative solving in traditional methods.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Medical Informatics (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Algebra (AREA)
  • Pure & Applied Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Image Analysis (AREA)

Abstract

A deep learning-based multi-modal image registration method and system, and a medium. The method comprises: acquiring three-dimensional images of different modalities, wherein the three-dimensional images comprise a reference image and a floating image; acquiring areas to be registered of the three-dimensional images, and detecting image feature points in an area to be registered of the reference image; obtaining image blocks according to the image feature points and inputting the image blocks into a similarity network to obtain a similarity graph in a corresponding range of the floating image; inputting the coordinates of the image feature points, the image blocks and the similarity graph into a displacement network to obtain a displacement vector, and performing interpolation on an area having no image feature point to obtain a displacement vector field; and performing spatial transformation on the floating image according to the displacement vector field.

Description

基于深度学习的多模态影像配准方法、系统及介质Multimodal image registration method, system and medium based on deep learning
相关申请Related Applications
本申请要求2022年10月21日申请的,申请号为202211296302.9,发明名称为“基于深度学习的跨模块非刚体配准方法、系统及介质”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。This application claims priority to the Chinese patent application filed on October 21, 2022, with application number 202211296302.9, and invention name “Cross-module non-rigid registration method, system and medium based on deep learning”, the entire contents of which are incorporated by reference in this application.
技术领域Technical Field
本申请涉及图像处理技术领域,具体地,涉及一种基于深度学习的多模态影像配准方法、系统及介质。The present application relates to the field of image processing technology, and in particular, to a multimodal image registration method, system and medium based on deep learning.
背景技术Background technique
现代医疗诊断需要各种医学影像的支持,常见的医学影像模态包含计算机断层扫描(Computed Tomography,CT)、磁共振成像(Magnetic Resonance Imaging,MRI)以及超声成像(Ultrasound,US)等,其成像各有特点。CT对于人体密度高的组织,如骨骼等,成像优势明显;MRI对软组织分辨能力更好等。多种模态影像的融合可以提供互补信息,更好地达到诊断、评估或介入的目的。例如,在计算机辅助诊断中,多模态影像的融合可以充分结合不同模态影像反映的组织特征,对是否存在病灶、病灶的性质以及范围给出更加准确的判断。而在微创手术导航中,术前影像和术中影像的融合可以实现术前规划与术中影像的叠加,可以为医生提供更加丰富直观的信息,提高介入过程中图像引导的质量,从而提高手术质量,改善临床结果。然而,不同模态的影像通常是使用不同的扫描仪器在不同时间点获得的,这一过程中伴有患者姿态和内部解剖结构的变化,因此,实现多模态影像融合的前提是进行多模态医学影像的配准,配准的精度直接决定着融合的效果。Modern medical diagnosis requires the support of various medical images. Common medical imaging modalities include computed tomography (CT), magnetic resonance imaging (MRI), and ultrasound (US), each with its own imaging characteristics. CT has obvious imaging advantages for high-density tissues in the human body, such as bones; MRI has better resolution for soft tissues, etc. The fusion of multiple modal images can provide complementary information to better achieve the purpose of diagnosis, evaluation or intervention. For example, in computer-aided diagnosis, the fusion of multimodal images can fully combine the tissue characteristics reflected by different modal images to give a more accurate judgment on whether there is a lesion, the nature of the lesion, and the range. In minimally invasive surgical navigation, the fusion of preoperative images and intraoperative images can achieve the superposition of preoperative planning and intraoperative images, which can provide doctors with richer and more intuitive information, improve the quality of image guidance during intervention, thereby improving the quality of surgery and clinical outcomes. However, images of different modalities are usually acquired at different time points using different scanning instruments. This process is accompanied by changes in the patient's posture and internal anatomical structure. 
Therefore, the prerequisite for achieving multimodal image fusion is to perform multimodal medical image registration, and the accuracy of the registration directly determines the effect of the fusion.
多模态医学影像配准是一个具有挑战性的问题,不同模态医学影像灰度分布之间的关系往往是复杂而不确定的,此外,在一种模态中存在结构和特征,可能在另一种模态中缺失。传统的多模态配准方法可以大致分为基于灰度值的配准方法和基于解剖特征的配准方法。基于灰度值的配准方法主要使用多模态相似性测度,例如互信息、互相关等;基于解剖特征的配准方法主要依赖于在不同模态影像中识别的标志点。近年来,深度学习技术发展迅速,在图像配准领域也得到越来越多的研究和应用,有望解决传统配准中配准速度慢、配准精度不足等问题。Multimodal medical image registration is a challenging problem. The relationship between the grayscale distributions of medical images of different modalities is often complex and uncertain; moreover, structures and features present in one modality may be missing in another. Traditional multimodal registration methods can be roughly divided into grayscale-based methods and anatomical-feature-based methods. Grayscale-based registration methods mainly use multimodal similarity measures such as mutual information and cross-correlation, while anatomical-feature-based methods mainly rely on landmarks identified in the images of the different modalities. In recent years, deep learning has developed rapidly and has been increasingly studied and applied in image registration, promising to address problems of traditional registration such as slow speed and insufficient accuracy.
发明内容Summary of the invention
根据本申请的各种实施例,提供一种基于深度学习的多模态影像配准方法、系统及介质。According to various embodiments of the present application, a multimodal image registration method, system and medium based on deep learning are provided.
基于深度学习的多模态影像配准方法,包括:A deep learning-based multimodal image registration method, comprising:
获取不同模态的三维影像,所述三维影像包括至少一幅参考影像和至少一幅浮动影像;获取三维影像的待配准区域,在所述参考影像的待配准区域内检测影像特征点,所述影像特征点为在邻域内能够与其他点的影像特征进行区分的点;以每个所述影像特征点为中心,得到预设大小的图像块;将所述图像块输入到相似性网络,得到所述浮动影像的对应范围内的相似性图;将所述影像特征点的坐标信息、所述参考图像的图像块和对应的所述相似性图输入位移网络,得到位移向量;基于所述位移向量对无影像特征点区域进行插值,得到位移向量场;根据所述位移向量场对所述浮动影像进行空间变换,得到配准结果。Acquire three-dimensional images of different modalities, wherein the three-dimensional images include at least one reference image and at least one floating image; acquire a region to be registered of the three-dimensional image, detect image feature points in the region to be registered of the reference image, wherein the image feature points are points that can be distinguished from image features of other points in a neighborhood; obtain image blocks of a preset size with each of the image feature points as the center; input the image blocks into a similarity network to obtain a similarity graph within a corresponding range of the floating image; input the coordinate information of the image feature points, the image blocks of the reference image and the corresponding similarity graph into a displacement network to obtain a displacement vector; interpolate the region without image feature points based on the displacement vector to obtain a displacement vector field; and perform spatial transformation on the floating image according to the displacement vector field to obtain a registration result.
在一些实施例中,所述待配准区域通过人工交互确定,或根据图像的灰度阈值确定,或对图像中的特定结构进行自动检测分割确定。In some embodiments, the region to be registered is determined through manual interaction, or is determined based on a grayscale threshold of the image, or is determined by automatically detecting and segmenting a specific structure in the image.
在一些实施例中,所述影像特征点的获取方式包括:In some embodiments, the method of acquiring the image feature points includes:
从所述参考影像的待配准区域进行体素点采样,根据采样点邻域内的灰度方差和梯度值获取特征评分,将所述特征评分高于第一预设值的点作为所述影像特征点;Sampling voxel points from the to-be-registered region of the reference image, obtaining feature scores according to the grayscale variance and gradient value in the neighborhood of the sampling points, and taking points with feature scores higher than a first preset value as the image feature points;
或,对参考影像的待配准区域中的特定结构进行分割,根据所述特定结构每个边界点与其周围的边界点的位置关系获取特征评分,将所述特征评分大于第二预设值的边界点作为所述影像特征点。Alternatively, a specific structure in the area to be registered of the reference image is segmented, a feature score is obtained based on the positional relationship between each boundary point of the specific structure and the boundary points around it, and the boundary points with feature scores greater than a second preset value are used as the image feature points.
在一些实施例中,根据Foerstner算子确定图像I中位于坐标p处的体素点的特征评分S(p),其表达式为:In some embodiments, the feature score S(p) of the voxel point at coordinate p in image I is determined according to the Foerstner operator, expressed as:

S(p) = 1 / Tr[(K_σ ∗ (∇I(p)∇I(p)^T))^(-1)]

其中,K_σ表示方差为σ的高斯核函数,∗表示卷积,∇I(p)为图像I的空间梯度在坐标p处的值,Tr(·)表示求矩阵的迹。Here, K_σ denotes a Gaussian kernel function with variance σ, ∗ denotes convolution, ∇I(p) is the value of the spatial gradient of image I at coordinate p, and Tr(·) denotes the trace of a matrix.
在一些实施例中,所述配准方法还包括:In some embodiments, the registration method further comprises:
对所述影像特征点的数目和分布进行调整,调整包括以下任意一种:The number and distribution of the image feature points are adjusted, and the adjustment includes any of the following:
调整所述影像特征点的分布:使用设定大小的采样窗口对所述参考影像扫描,当所述采样窗口中出现两个及以上影像特征点时,只保留特征评分最大的影像特征点;Adjust the distribution of the image feature points: use a sampling window of a set size to scan the reference image, and when two or more image feature points appear in the sampling window, only retain the image feature point with the largest feature score;
或,调整所述影像特征点的数目和分布:当所述影像特征点的数目大于第三预设值时,从检测到的所述影像特征点中随机选取一个点作为调整点集,每次从剩余的影像特征点中选择距离调整点集最远的点加入所述调整点集中,直到所述调整点集中所述影像特征点的数目达到第四预设值,所述影像特征点距离所述调整点集的距离为该点到所述调整点集中所有影像特征点的欧式距离的最小值;Or, adjusting the number and distribution of the image feature points: when the number of the image feature points is greater than a third preset value, randomly selecting a point from the detected image feature points as an adjustment point set, and selecting a point farthest from the adjustment point set from the remaining image feature points each time to add to the adjustment point set, until the number of the image feature points in the adjustment point set reaches a fourth preset value, and the distance of the image feature point from the adjustment point set is the minimum value of the Euclidean distance from the point to all the image feature points in the adjustment point set;
或,调整所述影像特征点的数目和分布:当所述影像特征点的数目大于第三预设值时,利用所有的影像特征点构建八叉树,根据宽度优先原则遍历所述八叉树,若当前子树中特征评分最大的点不在调整点集中,则将当前子树中特征评分最大的点加入所述调整点集,直到所述调整点集中所述影像特征点的数目达到第四预设值。Or, adjust the number and distribution of the image feature points: when the number of the image feature points is greater than a third preset value, use all the image feature points to construct an octree, traverse the octree according to the breadth-first principle, and if the point with the largest feature score in the current subtree is not in the adjustment point set, then add the point with the largest feature score in the current subtree to the adjustment point set until the number of the image feature points in the adjustment point set reaches a fourth preset value.
在一些实施例中,所述相似性网络的输入为所述参考影像和所述浮动影像对应的图像块,所述参考影像的图像块尺寸为W₁×H₁×D₁,所述浮动影像的图像块包含指定检测范围,其尺寸为W₂×H₂×D₂,并且满足W₁≤W₂,H₁≤H₂,D₁≤D₂;In some embodiments, the inputs to the similarity network are the image blocks corresponding to the reference image and the floating image; the image block of the reference image has a size of W₁×H₁×D₁, and the image block of the floating image covers a specified detection range and has a size of W₂×H₂×D₂, with W₁≤W₂, H₁≤H₂, D₁≤D₂;
所述相似性网络的输出为对应影像特征点的相似性图,所述相似性图的尺寸为[(W₂-W₁)/q+1]×[(H₂-H₁)/q+1]×[(D₂-D₁)/q+1],其中q为降采样系数。The output of the similarity network is a similarity map for the corresponding image feature point, and the size of the similarity map is [(W₂-W₁)/q+1]×[(H₂-H₁)/q+1]×[(D₂-D₁)/q+1], where q is the downsampling coefficient.
在一些实施例中,所述位移网络包括编码部分、相互作用部分和解码部分;所述编码部分的输入包括每个影像特征点的相似性图、对应的参考影像的图像块和对应影像特征点的坐标信息,所述相互作用部分接收所有影像特征点的编码结果,并对不同影像特征点之间的相互作用信息进行编码,所述解码部分接收所述编码部分、所述相互作用部分的输出以及部分中间状态,输出每个影像特征点对应的位移向量;所述位移向量通过先获得位移概率图,再将所述位移概率图中的像素值作为像素对应坐标的权重进行加权平均得到;所述编码部分和所述解码部分之间有跳接相连。In some embodiments, the displacement network includes an encoding part, an interaction part, and a decoding part. The input of the encoding part includes the similarity map of each image feature point, the corresponding image block of the reference image, and the coordinate information of the corresponding image feature point; the interaction part receives the encoding results of all image feature points and encodes the interaction information between different image feature points; the decoding part receives the outputs of the encoding part and the interaction part, as well as some intermediate states, and outputs the displacement vector corresponding to each image feature point. The displacement vector is obtained by first obtaining a displacement probability map and then taking a weighted average in which the pixel values of the displacement probability map serve as weights for the pixels' corresponding coordinates. The encoding part and the decoding part are connected by skip connections.
在一些实施例中,所述配准方法还包括:In some embodiments, the registration method further comprises:
通过指定特征获取所述参考影像和所述浮动影像之间局部结构的相似性;根据所述相似性和平滑约束构建目标函数,通过最小化所述目标函数对所述位移向量场进行局部调整。The similarity of the local structure between the reference image and the floating image is obtained by specifying features; an objective function is constructed according to the similarity and a smooth constraint, and the displacement vector field is locally adjusted by minimizing the objective function.
本申请还提供一种基于深度学习的多模态影像配准系统,包括:The present application also provides a multimodal image registration system based on deep learning, comprising:
待配准区域获取模块:获取不同模态的三维影像,所述三维影像包括至少一幅参考影像和至少一幅浮动影像,获取三维影像的待配准区域;A module for acquiring a region to be registered: acquiring three-dimensional images of different modalities, wherein the three-dimensional images include at least one reference image and at least one floating image, and acquiring a region to be registered of the three-dimensional images;
影像特征点检测模块:在所述参考影像的待配准区域内检测影像特征点,所述影像特征点为在邻域内能够与其他点的影像特征进行区分的点;Image feature point detection module: detects image feature points in the to-be-registered area of the reference image, wherein the image feature points are points that can be distinguished from the image features of other points in the neighborhood;
相似性图获取模块:以每个所述影像特征点为中心,得到预设大小的图像块;将所述图像块输入到相似性网络,得到所述浮动影像的对应范围内的相似性图;Similarity graph acquisition module: taking each of the image feature points as the center, obtaining an image block of a preset size; inputting the image block into a similarity network to obtain a similarity graph within the corresponding range of the floating image;
位移向量场获取模块:将所述影像特征点的坐标信息、所述参考图像的图像块和对应的所述相似性图输入位移网络,得到位移向量;基于所述位移向量对无影像特征点区域进行插值,得到位移向量场;A displacement vector field acquisition module: inputs the coordinate information of the image feature point, the image block of the reference image and the corresponding similarity map into the displacement network to obtain a displacement vector; interpolates the area without image feature points based on the displacement vector to obtain a displacement vector field;
配准模块:根据所述位移向量场对所述浮动影像进行空间变换,得到配准结果。Registration module: performs spatial transformation on the floating image according to the displacement vector field to obtain a registration result.
本申请还提供一种计算机可读存储介质,所述计算机可读存储介质上存储有基于深度学习的多模态影像配准程序,所述基于深度学习的多模态影像配准程序被处理器执行时实现上述的基于深度学习的多模态影像配准方法。The present application also provides a computer-readable storage medium, on which a deep learning-based multimodal image registration program is stored. When the deep learning-based multimodal image registration program is executed by a processor, the above-mentioned deep learning-based multimodal image registration method is implemented.
本申请的一个或多个实施例的细节在以下附图和描述中提出,以使本申请的其他特征、目的和优点更加简明易懂。Details of one or more embodiments of the present application are set forth in the following drawings and description to make other features, objects, and advantages of the present application more readily apparent.
附图说明BRIEF DESCRIPTION OF THE DRAWINGS
为了更好地描述和说明这里公开的本申请的实施例和/或示例,可以参考一幅或多幅附图。用于描述附图的附加细节或示例不应当被认为是对所公开的申请、目前描述的实施例和/或示例以及目前理解的这些申请的最佳模式中的任何一者的范围的限制。In order to better describe and illustrate the embodiments and/or examples of the present application disclosed herein, reference may be made to one or more drawings. The additional details or examples used to describe the drawings should not be considered as limiting the scope of any of the disclosed applications, the embodiments and/or examples currently described, and the best modes of these applications currently understood.
图1为本申请实施例的基于深度学习的多模态影像配准方法的流程图;FIG. 1 is a flow chart of a deep learning-based multimodal image registration method according to an embodiment of the present application;
图2为本申请实施例的相似性网络结构示意图;FIG2 is a schematic diagram of a similarity network structure according to an embodiment of the present application;
图3为本申请实施例的位移网络结构示意图。FIG. 3 is a schematic diagram of a displacement network structure according to an embodiment of the present application.
具体实施方式Detailed Description
为了使本申请的目的、技术方案及优点更加清楚明白,以下结合附图及实施例,对本申请进行描述和说明。应当理解,此处所描述的具体实施例仅仅用以解释本申请,并不用于限定本申请。基于本申请提供的实施例,本领域普通技术人员在没有作出创造性劳动的前提下所获得的所有其他实施例,都属于本申请保护的范围。此外,还可以理解的是,虽然这种开发过程中所作出的努力可能是复杂并且冗长的,然而对于与本申请公开的内容相关的本领域的普通技术人员而言,在本申请揭露的技术内容的基础上进行的一些设计,制造或者生产等变更只是常规的技术手段,不应当理解为本申请公开的内容不充分。In order to make the purpose, technical solutions and advantages of the present application clearer, the present application is described and illustrated below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only used to explain the present application and are not intended to limit the present application. Based on the embodiments provided in the present application, all other embodiments obtained by ordinary technicians in this field without making creative work are within the scope of protection of the present application. In addition, it can also be understood that although the efforts made in this development process may be complex and lengthy, for ordinary technicians in the field related to the contents disclosed in the present application, some changes such as design, manufacturing or production based on the technical contents disclosed in the present application are only conventional technical means, and should not be understood as insufficient contents disclosed in the present application.
在本申请中提及“实施例”意味着,结合实施例描述的特定特征、结构或特性可以包含在本申请的至少一个实施例中。在说明书中的各个位置出现该短语并不一定均是指相同的实施例,也不是与其它实施例互斥的独立的或备选的实施例。本领域普通技术人员显式地和隐式地理解的是,本申请所描述的实施例在不冲突的情况下,可以与其它实施例相结合。Reference to "embodiments" in this application means that a particular feature, structure, or characteristic described in conjunction with the embodiments may be included in at least one embodiment of the present application. The appearance of the phrase in various places in the specification does not necessarily refer to the same embodiment, nor is it an independent or alternative embodiment that is mutually exclusive with other embodiments. It is explicitly and implicitly understood by those of ordinary skill in the art that the embodiments described in this application may be combined with other embodiments without conflict.
除非另作定义,本申请所涉及的技术术语或者科学术语应当为本申请所属技术领域内具有一般技能的人士所理解的通常意义。本申请所涉及的“一”、“一个”、“一种”、“该”等类似词语并不表示数量限制,可表示单数或复数。本申请所涉及的“多个”是指大于或者等于两个。本申请所涉及的术语“包括”、“包含”、“具有”以及它们任何变形,意图在于覆盖不排他的包含。本申请所涉及的“第一”、“第二”等仅为了区分对象,并非对对象的限定。Unless otherwise defined, the technical terms or scientific terms involved in this application should be the usual meanings understood by people with ordinary skills in the technical field to which this application belongs. The words "one", "a", "a", "the" and the like involved in this application do not indicate a quantitative limitation and may represent the singular or plural. The "multiple" involved in this application means greater than or equal to two. The terms "including", "comprising", "having" and any variations thereof involved in this application are intended to cover non-exclusive inclusions. The "first", "second", etc. involved in this application are only for distinguishing objects and are not limitations on the objects.
本申请公开了一种基于深度学习的多模态影像配准方法,参照图1所示,包括以下步骤:The present application discloses a multimodal image registration method based on deep learning, as shown in FIG1 , comprising the following steps:
步骤S1:获取不同模态的三维影像,所述三维影像包括至少一幅参考影像和至少一幅浮动影像;获取两种影像的待配准区域。Step S1: Acquire three-dimensional images of different modalities, wherein the three-dimensional images include at least one reference image and at least one floating image; and acquire areas to be registered of the two images.
通过读取包含同一病人同一区域的不同模态的三维影像,指定模态1为参考影像,模态2为浮动影像并插值设置与模态1相同的空间分辨率。By reading three-dimensional images of different modalities covering the same area of the same patient, modality 1 is designated as the reference image, modality 2 is designated as the floating image and interpolated to set the same spatial resolution as modality 1.
三维影像可以是CT、MRI、超声(三维超声或者由一系列二维超声影像重建的三维超声影像)等。配准过程中会寻求对浮动影像的最优空间变换,将其映射到参考影像的坐标系中,使两种模态影像中对应的人体解剖点达到空间上的一致。The three-dimensional image can be CT, MRI, ultrasound (three-dimensional ultrasound or three-dimensional ultrasound image reconstructed from a series of two-dimensional ultrasound images), etc. During the registration process, the optimal spatial transformation of the floating image is sought and mapped to the coordinate system of the reference image so that the corresponding human anatomical points in the two modal images are spatially consistent.
待配准区域可以通过人工交互确定,也可以根据图像的灰度阈值确定,还可以对图像中的特定结构进行自动检测分割确定。一个特例是待配准区域为整幅影像。The area to be registered can be determined by manual interaction, or by the grayscale threshold of the image, or by automatic detection and segmentation of specific structures in the image. A special case is that the area to be registered is the entire image.
步骤S2:在参考影像的待配准区域内检测影像特征点。从参考影像的待配准区域进行点采样,根据采样点的邻域信息得到特征评分,将特征评分大于设定阈值的点作为影像特征点。影像特征点的获取方式为:Step S2: Detect image feature points in the area to be registered of the reference image. Sample points in the area to be registered of the reference image, obtain feature scores based on the neighborhood information of the sampled points, and use points with feature scores greater than a set threshold as image feature points. The image feature points are obtained in the following way:
从参考影像的待配准区域进行网格采样或随机采样,使用基于采样点邻域内的灰度方差、梯度值等构建的三维算子确定特征评分,将特征评分高于第一预设值的点作为影像特征点。例如Foerstner算子是一种常用的三维特征点检测算子,可以用于获取图像I中位于坐标p处的像素点的特征评分S(p),其表达式如下:Grid sampling or random sampling is performed from the area to be registered in the reference image, and a three-dimensional operator constructed based on the grayscale variance, gradient value, etc. in the neighborhood of the sampling point is used to determine the feature score, and the point with a feature score higher than the first preset value is used as the image feature point. For example, the Foerstner operator is a commonly used three-dimensional feature point detection operator, which can be used to obtain the feature score S(p) of the pixel point at the coordinate p in the image I, and its expression is as follows:
S(p) = 1 / Tr[(K_σ ∗ (∇I(p)∇I(p)^T))^(-1)]

其中,K_σ表示方差为σ的高斯核函数,∗表示卷积,∇I(p)为图像I的空间梯度在坐标p处的值,Tr(·)表示求矩阵的迹。Here, K_σ denotes a Gaussian kernel function with variance σ, ∗ denotes convolution, ∇I(p) is the value of the spatial gradient of image I at coordinate p, and Tr(·) denotes the trace of a matrix.
获取影像特征点的另一种实现方式是对参考影像的待配准区域中的特定结构进行分割,根据特定结构每个边界点与其周围的边界点的位置关系获取特征评分,将特征评分大于第二预设值的边界点作为影像特征点。例如,通过每个边界点与其周围边界点的位置关系获取曲率值,将曲率值大于设定阈值的点作为影像特征点。Another implementation method for obtaining image feature points is to segment the specific structure in the to-be-registered region of the reference image, obtain a feature score based on the positional relationship between each boundary point of the specific structure and its surrounding boundary points, and use the boundary points with a feature score greater than a second preset value as image feature points. For example, a curvature value is obtained through the positional relationship between each boundary point and its surrounding boundary points, and a point with a curvature value greater than a set threshold is used as an image feature point.
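The Foerstner-style scoring described above can be sketched as follows. This is an illustration only, not the patent's exact implementation: the function name, the value of σ, and the small regularization term are assumptions, and single-voxel gradients stand in for whatever gradient estimator a production system would use.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def foerstner_scores(image, sigma=1.5):
    """Foerstner-style feature score S(p) = 1 / Tr(G(p)^-1) for every voxel,
    where G = K_sigma * (grad I . grad I^T) is the smoothed structure tensor."""
    grads = np.gradient(image.astype(np.float64))  # gradients along each axis
    # Smooth each structure-tensor component with a Gaussian kernel K_sigma.
    G = [[gaussian_filter(gi * gj, sigma) for gj in grads] for gi in grads]
    # Assemble per-voxel 3x3 tensors, shape (..., 3, 3).
    T = np.stack([np.stack(row, axis=-1) for row in G], axis=-2)
    T = T + 1e-9 * np.eye(3)  # tiny ridge so flat regions stay invertible
    inv = np.linalg.inv(T)
    trace_inv = inv[..., 0, 0] + inv[..., 1, 1] + inv[..., 2, 2]
    return 1.0 / trace_inv
```

Voxels whose score exceeds the first preset threshold would then be retained as image feature points.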
步骤S3:调整影像特征点的数目及分布。为了防止影像特征点集中在同一个区域或在待配准区域中分布过于不均衡,对影像特征点数目及分布进行调整,防止后续所得到的图像块中存有大量的影像特征点。调整包括以下任意一种:Step S3: Adjust the number and distribution of image feature points. To prevent image feature points from concentrating in the same region or being distributed too unevenly in the region to be registered, the number and distribution of image feature points are adjusted so that subsequently obtained image blocks do not contain an excessive number of image feature points. The adjustment includes any of the following:
调整影像特征点的分布:使用设定大小的采样窗口对参考影像扫描,当采样窗口中出现两个及以上影像特征点时,只保留特征评分最大的影像特征点。Adjust the distribution of image feature points: Use a sampling window of a set size to scan the reference image. When two or more image feature points appear in the sampling window, only the image feature point with the largest feature score is retained.
或,调整影像特征点的数目和分布:当影像特征点的数目大于第三预设值时,从检测到的影像特征点中随机选取一个点作为调整点集,每次从剩余的影像特征点中选择距离调整点集最远的点加入调整点集中,直到调整点集中影像特征点的数目达到第四预设值,所述影像特征点距离调整点集的距离为该点到调整点集中所有影像特征点的欧式距离的最小值。Or, adjust the number and distribution of image feature points: when the number of image feature points is greater than a third preset value, randomly select a point from the detected image feature points as the adjustment point set, and each time select the point farthest from the adjustment point set from the remaining image feature points to add to the adjustment point set until the number of image feature points in the adjustment point set reaches a fourth preset value, and the distance of the image feature point from the adjustment point set is the minimum value of the Euclidean distance from the point to all image feature points in the adjustment point set.
或,调整影像特征点的数目和分布:当影像特征点的数目大于第三预设值时,利用所有的影像特征点构建八叉树,根据宽度优先原则遍历八叉树,若当前子树中特征评分最大的点不在调整点集中,则将当前子树中特征评分最大的点加入调整点集,直到调整点集中影像特征点的数目达到第四预设值。Or, adjust the number and distribution of image feature points: when the number of image feature points is greater than the third preset value, use all the image feature points to construct an octree, and traverse the octree according to the breadth-first principle. If the point with the largest feature score in the current subtree is not in the adjustment point set, then add the point with the largest feature score in the current subtree to the adjustment point set until the number of image feature points in the adjustment point set reaches a fourth preset value.
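The farthest-point variant of the adjustment above can be sketched in a few lines. Function and parameter names are illustrative; the starting point is chosen at random, as the text specifies, and ties are broken by `argmax` order.

```python
import numpy as np

def farthest_point_subset(points, k, seed=0):
    """Greedily pick k spread-out feature points: start from one random point
    and repeatedly add the remaining point whose minimum Euclidean distance
    to the already-chosen set is largest (a point's distance to the set is
    the minimum of its distances to all chosen points)."""
    points = np.asarray(points, dtype=np.float64)
    rng = np.random.default_rng(seed)
    chosen = [int(rng.integers(len(points)))]
    d = np.linalg.norm(points - points[chosen[0]], axis=1)
    while len(chosen) < min(k, len(points)):
        nxt = int(np.argmax(d))  # farthest from the current chosen set
        chosen.append(nxt)
        d = np.minimum(d, np.linalg.norm(points - points[nxt], axis=1))
    return chosen
```

Returned values are indices into `points`, so feature scores and coordinates stay associated.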
步骤S4:以每个影像特征点为中心,取以其为中心包含指定范围邻域的图像块,将图像块输入到相似性网络,得到浮动影像的对应范围内的相似性图。Step S4: Taking each image feature point as the center, taking an image block that contains a specified range of neighborhood with the feature point as the center, inputting the image block into a similarity network, and obtaining a similarity graph within the corresponding range of the floating image.
参照图2所示,相似性网络的输入为参考影像和浮动影像对应的图像块,所述参考影像的图像块尺寸为W₁×H₁×D₁,所述浮动影像的图像块包含指定检测范围,其尺寸为W₂×H₂×D₂,并且满足W₁≤W₂,H₁≤H₂,D₁≤D₂。所述相似性网络的输出为对应影像特征点的相似性图,所述相似性图的尺寸为[(W₂-W₁)/q+1]×[(H₂-H₁)/q+1]×[(D₂-D₁)/q+1],其中q为降采样系数。相似性图中任一点值的大小表示:基于局部影像特征预测的该值在浮动影像中所对应位置与参考影像中影像特征点对应于同一解剖点的可能性大小。As shown in FIG. 2, the inputs to the similarity network are the image blocks corresponding to the reference image and the floating image; the image block of the reference image has a size of W₁×H₁×D₁, and the image block of the floating image covers the specified detection range and has a size of W₂×H₂×D₂, with W₁≤W₂, H₁≤H₂, D₁≤D₂. The output of the similarity network is a similarity map for the corresponding image feature point, whose size is [(W₂-W₁)/q+1]×[(H₂-H₁)/q+1]×[(D₂-D₁)/q+1], where q is the downsampling coefficient. The value at any point of the similarity map indicates how likely it is, as predicted from local image features, that the corresponding position in the floating image and the image feature point in the reference image correspond to the same anatomical point.
上述相似性网络为基于自监督训练的卷积神经网络,使用相似性图峰值构建对比损失函数,判断浮动影像图像块中是否包含与参考影像特征点所对应的解剖结构。The similarity network described above is a convolutional neural network trained by self-supervision; a contrastive loss function is constructed from the peak value of the similarity map to judge whether the floating-image block contains the anatomical structure corresponding to the reference-image feature point.
具体的,可通过两个卷积神经网络分支分别对参考影像的图像块和浮动影像的图像块进行特征编码,得到参考影像的图像块编码后的第一特征图和浮动影像的图像块编码后的第二特征图,将第一特征图在第二特征图上进行滑动窗口卷积运算,并进行归一化后得到相似性图。Specifically, the image blocks of the reference image and of the floating image can be feature-encoded by two convolutional neural network branches, yielding a first feature map encoding the reference-image block and a second feature map encoding the floating-image block; a sliding-window convolution operation of the first feature map over the second feature map is then performed, and the result is normalized to obtain the similarity map.
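The sliding-window matching step can be illustrated without the learned branches: the sketch below slides a reference block over the floating block and scores each offset with normalized cross-correlation. Raw intensities stand in for the CNN feature maps, and a stride (downsampling coefficient) q = 1 is assumed, so this is only an analogue of the similarity network, not the patent's network itself.

```python
import numpy as np

def similarity_map(ref_block, mov_block):
    """Slide ref_block (W1,H1,D1) over mov_block (W2,H2,D2) and score every
    offset with normalized cross-correlation; with q = 1 the output size is
    (W2-W1+1, H2-H1+1, D2-D1+1)."""
    w1, h1, d1 = ref_block.shape
    w2, h2, d2 = mov_block.shape
    t = (ref_block - ref_block.mean()) / (ref_block.std() + 1e-8)
    out = np.empty((w2 - w1 + 1, h2 - h1 + 1, d2 - d1 + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            for k in range(out.shape[2]):
                win = mov_block[i:i + w1, j:j + h1, k:k + d1]
                w = (win - win.mean()) / (win.std() + 1e-8)
                out[i, j, k] = (t * w).mean()  # correlation score at this offset
    return out
```

The peak of the resulting map marks the offset at which the reference patch best matches the floating image.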
步骤S5:将影像特征点的坐标信息、参考影像的图像块和对应的所述相似性图输入位移网络,得到位移向量。Step S5: input the coordinate information of the image feature points, the image block of the reference image and the corresponding similarity graph into the displacement network to obtain a displacement vector.
参照图3所示,所述位移网络包括编码部分、相互作用部分和解码部分。其中,编码部分的输入包括每个影像特征点的相似性图、对应的参考影像的图像块和对应影像特征点的坐标信息。可对上述三种输入项分别进行编码并输出至相互作用部分,也可将上述三种输入项的全部或其中两种整合(如拼接、相加等),再进行联合编码并输出至相互作用部分。相互作用部分接收所有影像特征点的编码结果,并对不同影像特征点之间的相互作用信息进行编码。解码部分接收编码部分、相互作用部分的输出以及部分中间状态,输出每个影像特征点对应的位移向量。As shown in Figure 3, the displacement network includes an encoding part, an interaction part and a decoding part. Among them, the input of the encoding part includes a similarity map of each image feature point, an image block of the corresponding reference image and the coordinate information of the corresponding image feature point. The above three input items can be encoded separately and output to the interaction part, or all or two of the above three input items can be integrated (such as splicing, addition, etc.), and then jointly encoded and output to the interaction part. The interaction part receives the encoding results of all image feature points and encodes the interaction information between different image feature points. The decoding part receives the output of the encoding part, the interaction part and some intermediate states, and outputs the displacement vector corresponding to each image feature point.
具体的,编码部分通过两个卷积神经网络分支对影像特征点的相似性图以及对应的参考影像的图像块分别进行编码,采用固定方式对影像特征点的位置编码,并与图像块编码相加。在相互作用部分可以通过自注意力机制构建:通过自注意力层获取不同影像特征点之间的关系并进行编码,并通过前馈层进一步进行特征变换。上述自注意力层与前馈层组成的结构可进行多级级联。解码部分基于卷积神经网络构建,可以获得归一化后的位移概率图。位移概率图中任一点值的大小表示:基于所有影像特征点的位置分布、参考图像的图像块及对应相似性图预测的该值在浮动影像中所对应位置与参考影像中特征点对应于同一解剖点的可能性大小。将所述位移概率图中的像素值作为像素对应坐标的权重进行加权平均,得到位移向量;或者,取位移概率图中的最大值对应的坐标作为位移向量。编码部分和解码部分之间有跳接相连,实际应用中,还可以对多个位移网络结构进行级联。Specifically, the encoding part encodes the similarity graph of the image feature points and the image block of the corresponding reference image respectively through two convolutional neural network branches, encodes the position of the image feature points in a fixed manner, and adds it to the image block code. The interaction part can be constructed by a self-attention mechanism: the relationship between different image feature points is obtained and encoded through the self-attention layer, and the feature transformation is further performed through the feedforward layer. The structure composed of the above self-attention layer and the feedforward layer can be cascaded in multiple levels. The decoding part is constructed based on the convolutional neural network, and a normalized displacement probability map can be obtained. The size of the value of any point in the displacement probability map indicates: the possibility that the corresponding position of the value in the floating image and the feature point in the reference image correspond to the same anatomical point based on the position distribution of all image feature points, the image block of the reference image and the corresponding similarity map prediction. The pixel value in the displacement probability map is used as the weight of the corresponding coordinate of the pixel for weighted averaging to obtain a displacement vector; or, the coordinate corresponding to the maximum value in the displacement probability map is taken as the displacement vector. 
The encoding part and the decoding part are connected by skip connections. In practical applications, multiple displacement network structures can also be cascaded.
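The weighted-average readout of the displacement probability map (a "soft argmax") can be sketched as follows. Treating each voxel's candidate displacement as its offset from the map center is an illustrative convention; the patent does not fix the coordinate convention.

```python
import numpy as np

def soft_argmax_displacement(prob):
    """Weighted average over a normalized displacement probability map:
    each voxel's probability weights its candidate displacement, taken
    here as the voxel's offset from the map center."""
    axes = [np.arange(s, dtype=np.float64) for s in prob.shape]
    coords = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)
    center = (np.asarray(prob.shape, dtype=np.float64) - 1.0) / 2.0
    disp = coords - center  # candidate displacement at every voxel
    return (prob[..., None] * disp).sum(axis=(0, 1, 2))
```

Taking the coordinate of the map's maximum instead recovers the hard-argmax variant mentioned above.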
进一步的,相互作用部分还可基于图神经网络构建。Furthermore, the interaction part can also be constructed based on graph neural network.
步骤S6:基于所述位移向量对无影像特征点区域进行插值,得到位移向量场。Step S6: interpolating the region without image feature points based on the displacement vector to obtain a displacement vector field.
位移向量场的一种存储方式为6维矩阵,其中前3个维度与模态1影像尺寸相同,后三个维度表示将对应像素点映射到模态2影像的位移向量。One way to store the displacement vector field is as a 6-dimensional matrix, where the first three dimensions are the same size as the modality 1 image, and the last three dimensions represent the displacement vectors that map the corresponding pixel points to the modality 2 image.
为保证位移向量场的平滑,可以使用三次线性插值。To ensure the smoothness of the displacement vector field, cubic linear interpolation can be used.
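The interpolation step can be realized with scattered-data interpolation over the feature-point displacements. The sketch below uses `scipy.interpolate.griddata` with linear interpolation inside the convex hull of the keypoints and nearest-neighbour fill outside it; these are illustrative choices, not the patent's exact scheme.

```python
import numpy as np
from scipy.interpolate import griddata

def dense_displacement_field(keypoints, displacements, shape):
    """Interpolate (N, 3) keypoint displacements to a dense field of shape
    (*shape, 3): linear interpolation where possible, nearest-neighbour
    fill where the voxel lies outside the keypoints' convex hull."""
    grid = np.stack(np.meshgrid(*[np.arange(s) for s in shape],
                                indexing="ij"), axis=-1).reshape(-1, 3)
    comps = []
    for c in range(3):  # interpolate each displacement component separately
        lin = griddata(keypoints, displacements[:, c], grid, method="linear")
        near = griddata(keypoints, displacements[:, c], grid, method="nearest")
        comps.append(np.where(np.isnan(lin), near, lin))
    return np.stack(comps, axis=-1).reshape(*shape, 3)
```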
步骤S7:对位移向量场进行局部调整,得到最终的位移向量场。Step S7: locally adjust the displacement vector field to obtain the final displacement vector field.
一种具体的调整方式是通过指定特征获取所述参考影像和所述浮动影像之间局部结构的相似性;根据所述相似性和平滑约束构建目标函数,通过最小化所述目标函数对位移向量场进行局部调整。例如,模态无关邻域描述符(modality independent neighborhood descriptor,MIND)为一种常见的多模态图像特征,可以将其作为指定特征,分别从两个模态的图像中进行提取,并用两个模态影像的模态无关邻域描述符的平方差衡量局部结构的相似性。也可以使用相似性网络,用其输出代替上述指定特征相似性构建目标函数,进行位移向量场局部调整。A specific adjustment method is to obtain the similarity of the local structure between the reference image and the floating image through specified features, construct an objective function from this similarity and a smoothness constraint, and locally adjust the displacement vector field by minimizing the objective function. For example, the modality independent neighborhood descriptor (MIND) is a common multimodal image feature; it can be used as the specified feature and extracted separately from the images of the two modalities, with the squared difference of the two modalities' MIND descriptors measuring the similarity of the local structure. Alternatively, the similarity network can be used, with its output replacing the specified-feature similarity in the objective function, to locally adjust the displacement vector field.
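A much-simplified version of the MIND feature mentioned above can be sketched as follows: single-voxel squared differences to the six face neighbours stand in for the Gaussian-weighted patch distances of the real descriptor, so this is only an illustration of the idea, not the published MIND formulation.

```python
import numpy as np

SIX_NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                  (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def mind_like_descriptor(image):
    """Per-voxel descriptor: squared difference to each face neighbour,
    mapped through exp(-d / local variance) and max-normalized, so it
    depends on local structure rather than absolute intensity."""
    img = np.asarray(image, dtype=np.float64)
    d = np.stack([(img - np.roll(img, off, axis=(0, 1, 2))) ** 2
                  for off in SIX_NEIGHBOURS], axis=-1)
    var = d.mean(axis=-1, keepdims=True) + 1e-8
    desc = np.exp(-d / var)
    return desc / desc.max(axis=-1, keepdims=True)

def mind_dissimilarity(img_a, img_b):
    """Mean squared difference of descriptors (lower means more similar)."""
    return float(((mind_like_descriptor(img_a) - mind_like_descriptor(img_b)) ** 2).mean())
```

Because the descriptor encodes relative, not absolute, intensities, an image and its intensity-inverted copy score as structurally similar, which is the property that makes this kind of feature usable across modalities.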
步骤S8:根据优化后的位移向量场对浮动影像进行空间变换,得到配准结果。Step S8: Perform spatial transformation on the floating image according to the optimized displacement vector field to obtain a registration result.
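The final spatial transformation can be sketched with `scipy.ndimage.map_coordinates`: each output voxel p is sampled from the floating image at p + field[p]. Linear interpolation and border clamping are illustrative choices here.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def warp_floating(floating, field):
    """Resample `floating` through a dense displacement field of shape
    (*floating.shape, 3): output[p] = floating[p + field[p]], linearly
    interpolated, with out-of-range samples clamped to the border."""
    idx = np.stack(np.meshgrid(*[np.arange(s) for s in floating.shape],
                               indexing="ij"), axis=-1).astype(np.float64)
    sample = np.moveaxis(idx + field, -1, 0)  # (3, W, H, D) sample coordinates
    return map_coordinates(floating, sample, order=1, mode="nearest")
```

With the displacement field from the previous steps, the warped floating image is then spatially aligned with the reference image.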
The present application also discloses a deep-learning-based multimodal image registration system, comprising:
a region-to-be-registered acquisition module: acquires three-dimensional images of different modalities, the three-dimensional images including at least one reference image and at least one floating image, and acquires the region to be registered of the three-dimensional images;
an image feature point detection module: detects image feature points in the region to be registered of the reference image, an image feature point being a point whose image features can be distinguished from those of other points within its neighborhood;
a similarity map acquisition module: obtains an image block of preset size centered on each image feature point, and inputs the image blocks into a similarity network to obtain similarity maps within the corresponding range of the floating image;
a displacement vector field acquisition module: inputs the coordinate information of the image feature points, the image blocks of the reference image, and the corresponding similarity maps into a displacement network to obtain displacement vectors, and interpolates regions without image feature points based on the displacement vectors to obtain a displacement vector field;
a registration module: spatially transforms the floating image according to the displacement vector field to obtain the registration result.
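The displacement vector field acquisition module interpolates regions without image feature points from the sparse per-keypoint displacement vectors. The patent does not fix a particular interpolation scheme; the following is a hedged sketch using normalized Gaussian (Nadaraya-Watson) weighting as one possible stand-in:

```python
import numpy as np

def interpolate_field(points, vectors, grid_shape, sigma=10.0):
    """Dense displacement field from sparse keypoint displacements.

    points  : (N, 3) keypoint coordinates
    vectors : (N, 3) displacement vector at each keypoint
    Normalized Gaussian weights make the field blend smoothly between
    keypoints and reproduce each keypoint's vector near that keypoint.
    """
    grid = np.stack(np.meshgrid(*[np.arange(s) for s in grid_shape],
                                indexing="ij"), axis=-1)   # (W,H,D,3)
    diff = grid[..., None, :] - points                      # (W,H,D,N,3)
    w = np.exp(-(diff ** 2).sum(-1) / (2 * sigma ** 2))     # (W,H,D,N)
    w /= w.sum(-1, keepdims=True) + 1e-12
    return np.einsum("...n,nc->...c", w, vectors)           # (W,H,D,3)
```

With two keypoints carrying different vectors, the field equals each vector at its own keypoint and transitions smoothly in between; B-spline or thin-plate-spline interpolation would be equally valid choices here.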
The present application also discloses a computer-readable storage medium, such as a computer hard disk, on which a deep-learning-based multimodal image registration program is stored; when the program is executed by a processor, it implements the deep-learning-based multimodal image registration method described above.
Those skilled in the art will appreciate that, in addition to implementing the system and its devices, modules, and units provided by the present application purely as computer-readable program code, the method steps can be logically programmed so that the same functions are realized in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. The system and its devices, modules, and units provided by the present application may therefore be regarded either as hardware components or as software modules that implement the method.
By extracting image feature points, the present application reduces interference from low-information points and improves registration efficiency, an effect that is especially pronounced for large images. Using the similarity maps of all feature points for global optimization removes the prerequisite of accurately detecting corresponding points in the two modalities and, by taking the spatial distribution of all feature points into account, improves the robustness of the algorithm. The introduction of deep learning serves, on the one hand, to fully extract the correspondence information of the same anatomical structure across modalities and, on the other, to avoid the large time overhead of the iterative solvers used in traditional methods.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not every possible combination of these features is described; nevertheless, any combination of them that involves no contradiction should be regarded as within the scope of this specification.
The above embodiments express only several implementations of the present application, and while their descriptions are relatively specific and detailed, they should not be construed as limiting the scope of the patent. A person of ordinary skill in the art may make a number of variations and improvements without departing from the concept of the present application, all of which fall within its scope of protection. The scope of protection of this patent shall therefore be governed by the appended claims.

Claims (10)

  1. A deep-learning-based multimodal image registration method, characterized by comprising:
    acquiring three-dimensional images of different modalities, the three-dimensional images including at least one reference image and at least one floating image; acquiring a region to be registered of the three-dimensional images, and detecting image feature points in the region to be registered of the reference image, an image feature point being a point whose image features can be distinguished from those of other points within its neighborhood; obtaining an image block of preset size centered on each image feature point; inputting the image blocks into a similarity network to obtain similarity maps within the corresponding range of the floating image; inputting the coordinate information of the image feature points, the image blocks of the reference image, and the corresponding similarity maps into a displacement network to obtain displacement vectors; interpolating regions without image feature points based on the displacement vectors to obtain a displacement vector field; and spatially transforming the floating image according to the displacement vector field to obtain a registration result.
  2. The deep-learning-based multimodal image registration method according to claim 1, wherein the region to be registered is determined through manual interaction, according to a grayscale threshold of the image, or by automatic detection and segmentation of a specific structure in the image.
  3. The deep-learning-based multimodal image registration method according to claim 1, wherein the image feature points are acquired by:
    sampling voxel points from the region to be registered of the reference image, obtaining a feature score from the grayscale variance and gradient values within the neighborhood of each sampled point, and taking points whose feature score is higher than a first preset value as the image feature points;
    or, segmenting a specific structure in the region to be registered of the reference image, obtaining a feature score from the positional relationship between each boundary point of the specific structure and its surrounding boundary points, and taking boundary points whose feature score is greater than a second preset value as the image feature points.
  4. The deep-learning-based multimodal image registration method according to claim 3, wherein the feature score S(p) of the voxel point at coordinate p in an image I is determined according to the Foerstner operator, expressed as:
    S(p) = 1 / Tr( ( K_σ * ( ∇I(p) ∇I(p)ᵀ ) )⁻¹ )
    where K_σ denotes a Gaussian kernel function with variance σ, ∇I(p) is the value of the spatial gradient of image I at coordinate p, * denotes convolution, and Tr(·) denotes the trace of a matrix.
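A minimal numerical sketch of the Foerstner score in claim 4: build the Gaussian-smoothed structure tensor K_σ * (∇I ∇Iᵀ) at every voxel, then take the reciprocal of the trace of its inverse. The small regularization term is an assumption added here so the tensor stays invertible in flat regions:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def foerstner_score(img, sigma=1.0):
    """Foerstner distinctiveness S(p) = 1 / Tr(T(p)^-1), where T is the
    Gaussian-smoothed structure tensor K_sigma * (grad I grad I^T)."""
    grads = np.gradient(img.astype(float))        # one array per axis
    # smoothed structure tensor at every voxel: shape (..., 3, 3)
    t = np.empty(img.shape + (3, 3))
    for i in range(3):
        for j in range(3):
            t[..., i, j] = gaussian_filter(grads[i] * grads[j], sigma)
    # tiny diagonal regularizer (illustrative) keeps T invertible
    t += 1e-9 * np.eye(3)
    t_inv = np.linalg.inv(t)                      # batched inverse
    return 1.0 / np.trace(t_inv, axis1=-2, axis2=-1)
```

Because T is symmetric positive definite after regularization, S(p) is strictly positive, and it is large exactly where the gradient varies strongly in all three directions, i.e. at distinctive corner-like points.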
  5. The deep-learning-based multimodal image registration method according to claim 3, further comprising:
    adjusting the number and distribution of the image feature points, the adjustment including any one of the following:
    adjusting the distribution of the image feature points: scanning the reference image with a sampling window of set size and, when two or more image feature points appear in the sampling window, retaining only the image feature point with the largest feature score;
    or, adjusting the number and distribution of the image feature points: when the number of image feature points is greater than a third preset value, randomly selecting one of the detected image feature points to form an adjustment point set, and each time adding to the set the remaining image feature point farthest from the adjustment point set, until the number of image feature points in the adjustment point set reaches a fourth preset value, the distance of an image feature point from the adjustment point set being the minimum of the Euclidean distances from that point to all image feature points in the set;
    or, adjusting the number and distribution of the image feature points: when the number of image feature points is greater than a third preset value, constructing an octree from all the image feature points and traversing it breadth-first; if the point with the largest feature score in the current subtree is not in the adjustment point set, that point is added to the adjustment point set, until the number of image feature points in the adjustment point set reaches a fourth preset value.
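The second adjustment variant in claim 5 is a farthest-point sampling strategy. A minimal sketch (the random starting point follows the claim; the function name is illustrative):

```python
import numpy as np

def farthest_point_subset(points, k, seed=0):
    """Pick k points that cover the region evenly.

    Starts from one randomly chosen point, then repeatedly adds the
    point whose minimum Euclidean distance to the already-chosen set
    is largest, as described in claim 5 (second variant).
    """
    rng = np.random.default_rng(seed)
    chosen = [int(rng.integers(len(points)))]
    # current min distance from every point to the chosen set
    d = np.linalg.norm(points - points[chosen[0]], axis=1)
    while len(chosen) < k:
        nxt = int(d.argmax())
        chosen.append(nxt)
        d = np.minimum(d, np.linalg.norm(points - points[nxt], axis=1))
    return points[chosen]
```

Given two tight clusters of candidate points, selecting two points this way always yields one point from each cluster, which is exactly the even-coverage behaviour the claim targets.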
  6. The deep-learning-based multimodal image registration method according to claim 1, wherein the inputs of the similarity network are image blocks corresponding to the reference image and the floating image, the image block of the reference image having size W₁×H₁×D₁ and the image block of the floating image containing a specified detection range and having size W₂×H₂×D₂, with W₁ ≤ W₂, H₁ ≤ H₂, and D₁ ≤ D₂;
    the output of the similarity network is the similarity map corresponding to an image feature point, the size of the similarity map being [(W₂−W₁)/q+1]×[(H₂−H₁)/q+1]×[(D₂−D₁)/q+1], where q is a downsampling coefficient.
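The output-size formula in claim 6 can be checked numerically; `similarity_map_size` is an illustrative helper (not part of the patent), assuming each side difference is divisible by q:

```python
def similarity_map_size(ref_size, flo_size, q=1):
    """Similarity map size for one feature point per claim 6:
    [(W2-W1)/q + 1] x [(H2-H1)/q + 1] x [(D2-D1)/q + 1].

    ref_size / flo_size are (W, H, D) of the reference / floating
    image blocks; (f - r) is assumed divisible by q.
    """
    return tuple((f - r) // q + 1 for r, f in zip(ref_size, flo_size))
```

For example, a 15×15×15 reference block searched inside a 31×31×31 floating block with downsampling q = 2 yields a 9×9×9 similarity map, and equal block sizes collapse to a single similarity value per feature point.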
  7. The deep-learning-based multimodal image registration method according to claim 1, wherein the displacement network comprises an encoding part, an interaction part, and a decoding part; the input of the encoding part includes the similarity map of each image feature point, the corresponding image block of the reference image, and the coordinate information of the corresponding image feature point; the interaction part receives the encoding results of all image feature points and encodes the interaction information between different image feature points; the decoding part receives the outputs of the encoding part and the interaction part together with some intermediate states, and outputs the displacement vector corresponding to each image feature point; the displacement vector is obtained by first producing a displacement probability map and then taking a weighted average in which each pixel value of the displacement probability map serves as the weight of that pixel's coordinates; and the encoding part and the decoding part are connected by skip connections.
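The weighted average over the displacement probability map in claim 7 amounts to a soft-argmax. A sketch under the assumption (not stated explicitly in the claim) that the map is centred, i.e. its middle voxel corresponds to zero displacement:

```python
import numpy as np

def expected_displacement(prob_map, spacing=1.0):
    """Soft-argmax over a displacement probability map: each voxel's
    value weights that voxel's (centred) displacement coordinates."""
    p = prob_map / prob_map.sum()                  # normalise weights
    coords = np.meshgrid(*[(np.arange(s) - (s - 1) / 2) * spacing
                           for s in prob_map.shape], indexing="ij")
    return np.array([(c * p).sum() for c in coords])
```

A probability map that is a delta at the centre yields a zero displacement vector, and a delta at an off-centre voxel yields exactly that voxel's offset; for smoother maps the result interpolates between voxel offsets, which is why the weighted average is preferred over a hard argmax.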
  8. The deep-learning-based multimodal image registration method according to claim 1, further comprising:
    obtaining the similarity of local structure between the reference image and the floating image through specified features; constructing an objective function from the similarity and a smoothness constraint; and locally adjusting the displacement vector field by minimizing the objective function.
  9. A deep-learning-based multimodal image registration system, characterized by comprising:
    a region-to-be-registered acquisition module: acquires three-dimensional images of different modalities, the three-dimensional images including at least one reference image and at least one floating image, and acquires the region to be registered of the three-dimensional images;
    an image feature point detection module: detects image feature points in the region to be registered of the reference image, an image feature point being a point whose image features can be distinguished from those of other points within its neighborhood;
    a similarity map acquisition module: obtains an image block of preset size centered on each image feature point, and inputs the image blocks into a similarity network to obtain similarity maps within the corresponding range of the floating image;
    a displacement vector field acquisition module: inputs the coordinate information of the image feature points, the image blocks of the reference image, and the corresponding similarity maps into a displacement network to obtain displacement vectors, and interpolates regions without image feature points based on the displacement vectors to obtain a displacement vector field;
    a registration module: spatially transforms the floating image according to the displacement vector field to obtain the registration result.
  10. A computer-readable storage medium, characterized in that a deep-learning-based multimodal image registration program is stored on the computer-readable storage medium, and when the program is executed by a processor, it implements the deep-learning-based multimodal image registration method according to any one of claims 1 to 8.
PCT/CN2022/142807 2022-10-21 2022-12-28 Deep learning-based multi-modal image registration method and system, and medium WO2024082441A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211296302.9 2022-10-21
CN202211296302.9A CN115690178A (en) 2022-10-21 2022-10-21 Cross-modal non-rigid registration method, system and medium based on deep learning

Publications (1)

Publication Number Publication Date
WO2024082441A1 (en)

Family

ID=85066044

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/142807 WO2024082441A1 (en) 2022-10-21 2022-12-28 Deep learning-based multi-modal image registration method and system, and medium

Country Status (2)

Country Link
CN (1) CN115690178A (en)
WO (1) WO2024082441A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104867126A (en) * 2014-02-25 2015-08-26 西安电子科技大学 Method for registering synthetic aperture radar image with change area based on point pair constraint and Delaunay
CN109035315A (en) * 2018-08-28 2018-12-18 武汉大学 Merge the remote sensing image registration method and system of SIFT feature and CNN feature
CN109064502A (en) * 2018-07-11 2018-12-21 西北工业大学 The multi-source image method for registering combined based on deep learning and artificial design features
US20210192758A1 (en) * 2018-12-27 2021-06-24 Shanghai Sensetime Intelligent Technology Co., Ltd. Image processing method and apparatus, electronic device, and computer readable storage medium
CN113763441A (en) * 2021-08-25 2021-12-07 中国科学院苏州生物医学工程技术研究所 Medical image registration method and system for unsupervised learning

Non-Patent Citations (1)

Title
LI HAO: "A Harris Corner Matching Optimization Algorithm Combing Adaptive Threshold and Forstner", TELECOMMUNICATION ENGINEERING, DIANXUN JISHU ZAZHISHE, CN, vol. 58, no. 9, 1 September 2018 (2018-09-01), CN , pages 1079 - 1085, XP093162876, ISSN: 1001-893X, DOI: 10.3969/j.issn.1001-893x.2018.09.015 *

Also Published As

Publication number Publication date
CN115690178A (en) 2023-02-03

Similar Documents

Publication Publication Date Title
Tobon-Gomez et al. Benchmark for algorithms segmenting the left atrium from 3D CT and MRI datasets
JP6059261B2 (en) Intelligent landmark selection to improve registration accuracy in multimodal image integration
US9275432B2 (en) Method of, and apparatus for, registration of medical images
US9155470B2 (en) Method and system for model based fusion on pre-operative computed tomography and intra-operative fluoroscopy using transesophageal echocardiography
US8160316B2 (en) Medical image-processing apparatus and a method for processing medical images
CN111260786A (en) Intelligent ultrasonic multi-mode navigation system and method
Zheng et al. Multi-part modeling and segmentation of left atrium in C-arm CT for image-guided ablation of atrial fibrillation
US20220207742A1 (en) Image segmentation method, device, equipment and storage medium
JP2015047506A (en) Method and apparatus for registering medical images
CN111311655B (en) Multi-mode image registration method, device, electronic equipment and storage medium
WO2023186133A1 (en) System and method for puncture path planning
US20040136584A1 (en) Method for matching and registering medical image data
KR102537214B1 (en) Method and apparatus for determining mid-sagittal plane in magnetic resonance images
US11633235B2 (en) Hybrid hardware and computer vision-based tracking system and method
US20220301224A1 (en) Systems and methods for image segmentation
CN115511997A (en) Angiography image processing method and system
Hao et al. Magnetic resonance image segmentation based on multi-scale convolutional neural network
WO2024082441A1 (en) Deep learning-based multi-modal image registration method and system, and medium
Mitra et al. A thin-plate spline based multimodal prostate registration with optimal correspondences
CN114757894A (en) Bone tumor focus analysis system
KR20150026354A (en) Method and Appartus for registering medical images
JP5403431B2 (en) Tomographic image processing method and apparatus
Jucevicius et al. Automated 2D Segmentation of Prostate in T2-weighted MRI Scans
Peng et al. 3D Segment and Pickup Framework for Pancreas Segmentation
CN117408908B (en) Preoperative and intraoperative CT image automatic fusion method based on deep neural network