CN116740215A - Four-dimensional CT image reconstruction method, device, medical equipment and storage medium - Google Patents

Four-dimensional CT image reconstruction method, device, medical equipment and storage medium

Info

Publication number
CN116740215A
Authority
CN
China
Prior art keywords
time
dimensional
space
module
deformation
Prior art date
Legal status
Pending
Application number
CN202310841681.3A
Other languages
Chinese (zh)
Inventor
邢宇翔
杜牧歌
张丽
高河伟
王振天
Current Assignee
Tsinghua University
Original Assignee
Tsinghua University
Priority date
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN202310841681.3A
Publication of CN116740215A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G06T11/003 Reconstruction from projections, e.g. tomography
    • G06T11/008 Specific post-processing after tomographic reconstruction, e.g. voxelisation, metal artifact correction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to the technical field of imaging, and in particular to a four-dimensional CT image reconstruction method, a device, medical equipment and a storage medium. The method comprises the following steps: obtaining projection data of a four-dimensional CT to be reconstructed; extracting a time coordinate and a space coordinate of a target in the projection data of the four-dimensional CT to be reconstructed; and inputting the time coordinate and the space coordinate into a pre-constructed reconstruction model and outputting a reconstructed image of the four-dimensional CT to be reconstructed. The reconstruction model comprises a motion field INR and a static template image INR, each composed of an implicit neural representation (INR); the motion field INR generates a deformation motion offset and a space-time coupling feature according to the time coordinate and the space coordinate, and the static template image INR performs image reconstruction according to the space coordinate after the deformation motion offset and/or the space-time coupling feature, to obtain the reconstructed image of the four-dimensional CT to be reconstructed. This addresses problems such as the poor learning ability of reconstruction models on high-resolution data, the ill-posedness of limited-angle reconstruction, and the narrow application scope of reconstruction models.

Description

Four-dimensional CT image reconstruction method, device, medical equipment and storage medium
Technical Field
The present application relates to the field of imaging technologies, and in particular, to a four-dimensional CT (Computed Tomography) image reconstruction method, apparatus, medical device, and storage medium.
Background
The 4DCT (4-Dimensional Computed Tomography) technology can image moving objects, and is used in the medical field for imaging regions in motion such as the lungs and heart; INR (Implicit Neural Representation) is a neural-network-based data representation method in the field of deep learning, and is suitable for representing multidimensional data such as images.
In the related art, INR-based 4DCT image representation methods may be used. In a first method, the static image is represented by an INR and the four-dimensional motion field is represented by a conventional polynomial with discrete matrix parameters; the feature encoding module in the INR is a non-learnable fixed frequency encoding function that converts the four-dimensional space-time coordinates of the complex image into a feature vector, which is input into a learnable MLP (multi-layer perceptron) to obtain the image value. In a second method, for the four-dimensional motion field, a previous high-quality 4DCT reconstructed image of the scanned object must first be obtained as a prior dynamic image; the motion fields between phases in the prior dynamic image are extracted, motion principal components are extracted from these motion fields by PCA (Principal Component Analysis), the time coordinate is converted into a weight vector over the principal components by an INR whose feature encoding module is not learnable, and the principal components are linearly weighted, thereby obtaining the motion field from the static image to each phase of the prior image; a 4D (4-Dimension) reconstructed image is then obtained through the deformation motion transformation. In the training process, on the one hand the 4D reconstructed image is re-projected according to the scanning relation and compared with the projections obtained in the actual scan to compute a loss function, and on the other hand the static image INR is compared with the data of a special time phase in the actually scanned projections, in both the image domain and the projection domain, to compute a loss function.
However, the methods in the related art have limited representation capability: it is difficult to represent dynamic images with complex structure and high resolution, and difficult to fit motion fields with complex motion and high spatial and temporal resolution; the computational efficiency is low, high-resolution data are difficult to process, the application scenarios are limited, and practical use requirements cannot be met.
Disclosure of Invention
The application provides a four-dimensional CT image reconstruction method, a device, medical equipment and a storage medium, which are used to solve the problems in the related art that four-dimensional CT image reconstruction has poor learning ability on high-resolution data, limited-angle reconstruction is restrictive, and the reconstruction model has a large computational load and low operational efficiency, so that the application scope of the model in actual scenarios is narrow and practical use requirements cannot be met.
An embodiment of a first aspect of the present application provides a four-dimensional CT image reconstruction method, including the steps of: obtaining projection data of a four-dimensional CT to be reconstructed; extracting a time coordinate and a space coordinate of a target in projection data of the four-dimensional CT to be reconstructed; inputting the time coordinates and the space coordinates into a pre-constructed reconstruction model, and outputting a reconstructed image of the four-dimensional CT to be reconstructed, wherein the reconstruction model comprises a motion field INR and a static template image INR, the motion field INR is composed of an implicit neural representation INR, the motion field INR generates deformation motion offset and space-time coupling characteristics according to the time coordinates and the space coordinates, and the static template image INR performs image reconstruction according to the space coordinates and/or the space-time coupling characteristics after deformation motion offset to obtain the reconstructed image of the four-dimensional CT to be reconstructed.
Optionally, the time coordinates adopt physically meaningful projection data acquisition time points, or motion phase signals recorded by a motion monitoring sensor; the motion field INR comprises a first feature encoding module, a second feature encoding module, a coupling module and a first neural network module, wherein the first feature encoding module is used for encoding the time coordinates to obtain time features; the second feature encoding module is used for encoding the space coordinates to obtain space features; the coupling module is used for coupling the time features and the space features to obtain space-time coupling features; and the first neural network module is used for converting the space-time coupling features into the deformation motion offset.
Optionally, the calculation formula of the space-time coupling feature is:

e_M(x⃗, t) = χ( s⃗(x⃗), s(t) )

wherein M is the motion field INR, e_M is the space-time coupling feature encoding module, x⃗ is the space vector, t is time, s⃗(x⃗) is the spatial feature vector, s(t) is the temporal feature vector, and χ is the coupling module.
Optionally, the calculation formula of the deformation motion offset is:

Δx⃗ = φ_M( e_M(x⃗, t) )

wherein φ_M is the first neural network module and Δx⃗ is the deformation motion offset.
Optionally, the static template image INR includes a third feature encoding module, a second neural network module and a non-deformation time-varying module, where the third feature encoding module is configured to encode the spatial coordinates after the deformation motion is deviated to obtain a template spatial feature; the non-deformation time-varying module is used for combining the space-time coupling characteristic obtained by the motion field INR with the template space characteristic of deformation motion only when the target to be reconstructed has time-varying effect caused by factors other than additional deformation motion, so as to obtain the template space characteristic with additional non-deformation time-varying; the second neural network module is used for reconstructing and obtaining the reconstructed image of the four-dimensional CT to be reconstructed according to the template space features which are only in deformation motion or have additional non-deformation changes along with time.
Optionally, for a target V_1 having only deformation motion in the four-dimensional CT image to be reconstructed, the image reconstruction expression is:

V_1(x⃗, t) = φ_V( h(x⃗ + Δx⃗) )

wherein h is the third feature encoding module, V_1(x⃗, t) is the image reconstruction expression of the four-dimensional CT for the target V_1 having deformation motion, and φ_V is the second neural network module;

for a target V_2 having additional non-deformation variation over time in the four-dimensional CT image to be reconstructed, the image reconstruction expression is:

V_2(x⃗, t) = φ_V( [ h(x⃗ + Δx⃗), e_M(x⃗, t) ] )

wherein V_2(x⃗, t) is the image reconstruction expression of the four-dimensional CT for the target V_2 having additional non-deformation variation over time, h is the third feature encoding module, φ_V is the second neural network module, and [ · , · ] denotes the merged expression obtained by splicing the template spatial feature vector h(x⃗ + Δx⃗) and the space-time coupling feature vector e_M(x⃗, t).
An embodiment of the second aspect of the present application provides a four-dimensional CT image reconstruction apparatus, including: an acquisition module, used for acquiring projection data of the four-dimensional computed tomography CT to be reconstructed; an extraction module, used for extracting the time coordinates and the space coordinates of the target in the projection data of the four-dimensional CT to be reconstructed; and a reconstruction module, used for inputting the time coordinates and the space coordinates into a pre-constructed reconstruction model and outputting a reconstructed image of the four-dimensional CT to be reconstructed, wherein the reconstruction model comprises a motion field INR and a static template image INR, the motion field INR is composed of an implicit neural representation INR, the motion field INR generates deformation motion offsets and space-time coupling features according to the time coordinates and the space coordinates, and the static template image INR performs image reconstruction according to the space coordinates after the deformation motion offset and/or the space-time coupling features, to obtain the reconstructed image of the four-dimensional CT to be reconstructed.
Optionally, the time coordinates adopt physically meaningful projection data acquisition time points, or motion phase signals recorded by a motion monitoring sensor; the motion field INR comprises a first feature encoding module, a second feature encoding module, a coupling module and a first neural network module, wherein the first feature encoding module is used for encoding the time coordinates to obtain time features; the second feature encoding module is used for encoding the space coordinates to obtain space features; the coupling module is used for coupling the time features and the space features to obtain space-time coupling features; and the first neural network module is used for converting the space-time coupling features into the deformation motion offset.
Optionally, the calculation formula of the space-time coupling feature is:

e_M(x⃗, t) = χ( s⃗(x⃗), s(t) )

wherein M is the motion field INR, e_M is the space-time coupling feature encoding module, x⃗ is the space vector, t is time, s⃗(x⃗) is the spatial feature vector, s(t) is the temporal feature vector, and χ is the coupling module.
Optionally, the calculation formula of the deformation motion offset is:

Δx⃗ = φ_M( e_M(x⃗, t) )

wherein φ_M is the first neural network module and Δx⃗ is the deformation motion offset.
Optionally, the static template image INR includes a third feature encoding module, a second neural network module and a non-deformation time-varying module, where the third feature encoding module is configured to encode the spatial coordinates after the deformation motion is deviated to obtain a template spatial feature; the non-deformation time-varying module is used for combining the space-time coupling characteristic obtained by the motion field INR with the template space characteristic of the deformation-only motion to obtain the template space characteristic with additional non-deformation time variation when the target to be reconstructed has the time-varying effect caused by factors other than the additional deformation motion; the second neural network module is used for reconstructing and obtaining the reconstructed image of the four-dimensional CT to be reconstructed according to the template space features which are only in deformation motion or have additional non-deformation changes along with time.
Optionally, for a target V_1 having only deformation motion in the four-dimensional CT image to be reconstructed, the image reconstruction expression is:

V_1(x⃗, t) = φ_V( h(x⃗ + Δx⃗) )

wherein h is the third feature encoding module, V_1(x⃗, t) is the image reconstruction expression of the four-dimensional CT for the target V_1 having deformation motion, and φ_V is the second neural network module.

For a target V_2 having additional non-deformation variation over time in the four-dimensional CT image to be reconstructed, the image reconstruction expression is:

V_2(x⃗, t) = φ_V( [ h(x⃗ + Δx⃗), e_M(x⃗, t) ] )

wherein V_2(x⃗, t) is the image reconstruction expression of the four-dimensional CT for the target V_2 having additional non-deformation variation over time, h is the third feature encoding module, φ_V is the second neural network module, and [ · , · ] denotes the merged expression obtained by splicing the template spatial feature vector h(x⃗ + Δx⃗) and the space-time coupling feature vector e_M(x⃗, t).
An embodiment of a third aspect of the present application provides a medical device comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the four-dimensional CT image reconstruction method described in the above embodiments.
An embodiment of a fourth aspect of the present application provides a computer-readable storage medium having stored thereon a computer program for execution by a processor for implementing the four-dimensional CT image reconstruction method as described in the above embodiment.
Therefore, the application has at least the following beneficial effects:
the model can be directly calculated at any single position by constructing and applying the reconstruction model of the feature coding module containing the learnable features, the complete static image and the four-dimensional motion field do not need to be calculated first and then the deformation operation is carried out, the operation consumption is reduced, the solving difficulty is reduced, the limitation of limited angle reconstruction is overcome, the accuracy of high-time resolution reconstruction is further improved, meanwhile, the applicability of the model to different application scenes is improved, the use experience is improved, and the actual use needs are met.
Therefore, the problems in the related art are solved that four-dimensional CT image reconstruction has poor learning ability on high-resolution data, limited-angle reconstruction is restrictive, and the reconstruction model has a large computational load and low operational efficiency, so that the application scope of the model in actual scenarios is narrow and practical use requirements cannot be met.
Additional aspects and advantages of the application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the application.
Drawings
The foregoing and/or additional aspects and advantages of the application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a flow chart of a four-dimensional CT image reconstruction method according to an embodiment of the present application;
FIG. 2 is a flow chart of a four-dimensional CT image reconstruction method according to one embodiment of the present application;
FIG. 3 is a schematic diagram of a cone beam CT system according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a learnable feature encoding implementation according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a four-dimensional CT image reconstruction apparatus according to an embodiment of the present application;
fig. 6 is a schematic structural view of a medical device according to an embodiment of the present application.
Detailed Description
Embodiments of the present application are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative and intended to explain the present application and should not be construed as limiting the application.
The X-ray CT imaging technique is widely used in the fields of medical treatment, security inspection, industrial nondestructive testing, etc. The 3DCT (3-Dimensional Computed Tomography) technique can only image a stationary object, while the 4DCT technique developed in recent years can image a moving object, and is used in the medical field for imaging regions in motion such as the lungs and heart. Existing 4DCT reconstruction methods generally divide a complete motion period into several phases and select phases with smaller motion amplitude for imaging. For a phase of interest, to ensure imaging quality, reconstruction algorithms generally require the acquisition angles to be distributed over a complete range of 0° to 180° plus the fan angle, avoiding the occurrence of limited-angle artifacts. This is typically achieved by providing a scan time long enough to cover several motion periods. However, a long scan results in a higher dose, and when the projections acquired for the same phase contain motion from multiple cycles, motion mismatch causes additional motion artifacts.
For 4DCT, reducing the scanning time so that it covers only a single motion period of the moving object to be reconstructed can greatly reduce the dose, improve the scanning efficiency, and at the same time reduce the motion artifacts caused by multi-cycle motion mismatch. However, reducing the scanning time reduces the angular range scanned for each phase, so that in the limiting case where the scanning time covers only a single motion cycle, the reconstruction becomes a severely ill-posed limited-angle problem. In addition, when the same motion cycle is divided into phases, the shorter the phase width, the lighter the motion artifacts contained in the phase image, and thus the higher the achievable temporal resolution.
However, at the same CT apparatus rotational speed, the shorter the phase width, the smaller the angular range scanned within the corresponding time window. Since the rotation speed of the CT gantry is limited, when reconstruction with higher temporal resolution is pursued, each time phase of the scanned three-dimensional body can only correspond to a small angular range, and the reconstruction becomes a very severe ill-posed limited-angle reconstruction problem. Therefore, the limited-angle reconstruction problem in 4DCT reconstruction is a key bottleneck for shortening scan time and improving the temporal resolution of reconstructed images.
To solve the limited-angle reconstruction problem in 4DCT reconstruction, the correlation between the images to be reconstructed at each phase must be exploited. Among existing 4DCT reconstruction algorithms, one class of methods mines the temporal and spatial correlation between different phase images and between the dynamic image and a prior image, and designs prior regularization terms for iterative reconstruction algorithms. Another class of methods considers that all phase images can be obtained from one static image through the deformation motion transformation of a motion field, and thus designs algorithms that jointly reconstruct the static image and the 4D motion field. Although these methods differ at the algorithmic level, at the level of data representation the static image is represented as a discrete 3D matrix, and the motion field is represented as a discrete 4D matrix or as a parameterized model with low-dimensional discrete matrix parameters. In the discrete matrix representation of static images and motion fields, the correlation between different positions is weak, the spatio-temporal correlation cannot be fully modeled, and the performance of these methods is difficult to improve by algorithms alone. Solving the limited-angle problem of 4DCT requires reducing the dimensionality of the problem to be solved from 4D to a lower dimension and sufficiently restricting the scale of the solution space; therefore, a dynamic image model can be built and a 4DCT reconstruction algorithm designed in a targeted manner, so as to solve the limited-angle reconstruction problem in 4DCT reconstruction more effectively.
A novel neural-network-based data representation method called implicit neural representation has been introduced in the field of deep learning, and is suitable for representing multidimensional data such as images. In an INR, a piece of multidimensional data is defined as a continuous field; for any coordinate in the data, the coordinate is first converted into a feature vector by a feature encoding module, and the feature vector is then converted into the value at that position by a neural network. An INR can not only model a discrete image as a smoother continuous field, but also obtains the image values at all positions through the feature conversion module and the neural network, so that the positions have a stronger mutual correlation than in a discrete matrix form. Therefore, the dynamic object in 4DCT can be modeled in the INR manner, introducing stronger spatio-temporal correlation and a more sufficient and effective dimensionality reduction at the data representation level. In addition, a model built in this way allows more flexible and deeper interaction between the data representation and the reconstruction algorithm to improve the effect of the algorithm. Furthermore, common deep learning methods often require training on several data sets before being applied to new test data, with the generalization problem of mismatch between training and testing. Most INR methods do not need to be trained on any data set; the data obtained on the spot in an application can be used directly for training, in a manner similar to traditional iterative reconstruction algorithms, so there is no generalization problem, and the acquisition cost of prior training data is also avoided.
In the related art, two INR-based 4DCT image representation methods may be used. In the first method, the static image is represented by an INR and the four-dimensional motion field is represented by a conventional polynomial with discrete matrix parameters. The feature coding module in INR is a fixed frequency coding function which can not be learned, four-dimensional space-time coordinates of complex images are converted into a feature vector, and a learnable MLP is input to obtain image values. This approach suffers from several drawbacks: the feature coding module in the static image INR can not learn, so that the representation capability of the INR is limited, and the dynamic image with complex structure and high resolution is difficult to represent; four-dimensional motion fields are still represented by conventional polynomials with discrete matrix parameters, which have limited representation capabilities and are difficult to fit with motion fields with complex motion, high spatial resolution and temporal resolution; the interaction between the static image INR and the motion field is completed through the traditional deformation transformation, the 4D single-point pixel value is determined by the static image value and the motion field value of all points in a larger neighborhood after deformation transformation, so that the processing of each ray in each training step involves calculating the discrete form of the complete static image and the four-dimensional motion field from a model, the characteristics of the INR are not fully utilized in the mode, the calculation cannot be limited in the range of the direct associated pixels, the calculation efficiency is low, and high-resolution data are difficult to process; the method can only learn dynamic objects generated by deformation motions, and cannot handle a wider variety of motions.
The second method is similar to the first method in the way of representing the still image, and is also represented by INR that the feature encoding module cannot learn. For four-dimensional motion fields, the former high-quality 4DCT reconstructed image of the scanned object is required to be firstly obtained as a priori dynamic image, the motion fields among phases in the priori dynamic image are firstly extracted, the motion main components are extracted through the PCA technology, then the time coordinates are converted into weight vectors of the main components through INR which cannot be learned by a feature coding module, and the main components are subjected to linear weighting, so that the motion fields from the static image to each phase in the priori image are obtained, and the 4D reconstructed image is obtained through deformation motion transformation. In the training process, on one hand, the 4D reconstructed image is re-projected according to the scanning relation to be compared with the projection obtained in the actual scanning to calculate the loss function, and on the other hand, the static image INR is compared with the data of a special time phase in the projection obtained in the actual scanning to perform image domain and projection domain to calculate the loss function. This approach suffers from several drawbacks: firstly, the static image INR has the same defects as the first method that the representation capability is limited, and the structure is complex and the high-resolution dynamic image is difficult to represent; secondly, the motion field needs to acquire the previous 4DCT data of the scanned object to extract the main component of the motion field, so that the application scene is limited, and when the motion components in the two scans have larger phase difference, the method has lower precision; furthermore, only INR is applied to represent the linear weighted weight of the principal component in the motion field, on one hand, the principal component can not be learned, and on the other hand, the feature coding module in the INR of the weight can not be learned, so that the representation capability of the motion field INR is limited, and complex motion fields are difficult to represent; in addition, the calculation process of the model for the pixel value of the single dynamic image is similar to that of the first method, in order to calculate the pixel value of the single point, the static image values and the motion field values of all points in a larger adjacent area need to be calculated from the model, and then deformation transformation is carried out, so that the training process is similar to that of the first method, and the calculation of the global static image and the motion field and the calculation of projection and reconstruction of the global image are involved, so that the calculation amount is large, the efficiency is low, and the training process is difficult to be used for high-resolution data; finally, the method still assumes that the motion only has deformation motion, and limits the application scene.
Aiming at the problems mentioned in the background that, in the related art, four-dimensional CT image reconstruction has poor learning ability on high-resolution data, limited-angle reconstruction is restrictive, and the reconstruction model has a large computational load and low operational efficiency, so that the application scope of the model in actual scenarios is narrow and practical use requirements cannot be met, the application provides a four-dimensional CT image reconstruction method. The four-dimensional CT image reconstruction method, apparatus, medical device and storage medium according to the embodiments of the present application are described below with reference to the accompanying drawings.
Specifically, fig. 1 is a schematic flow chart of a four-dimensional CT image reconstruction method according to an embodiment of the present application.
As shown in fig. 1, the four-dimensional CT image reconstruction method includes the steps of:
in step S101, projection data of a four-dimensional electronic computed tomography CT to be reconstructed is acquired.
It can be appreciated that the embodiment of the present application may first obtain the projection data of the four-dimensional CT to be reconstructed, to facilitate the use of the projection data in subsequent embodiments; the obtained projection data relate to the time coordinates, the space coordinates, the motion field INR composed of an implicit neural representation INR, the static template image INR, and the like, which will be described in detail in the following embodiments and are not repeated here.
In step S102, the time coordinates and the space coordinates of the target in the projection data of the four-dimensional CT to be reconstructed are extracted.
It can be understood that, as shown in fig. 2, the embodiment of the present application may extract the time coordinates and the space coordinates from the projection data obtained in step S101, so that the subsequent embodiment uses the data, for example, performs coupling processing on the space features and the time features as space-time coupling features, which are not described herein.
In step S103, the time coordinates and the space coordinates are input into a pre-constructed reconstruction model, and a reconstructed image of the four-dimensional CT to be reconstructed is output, wherein the reconstruction model includes a motion field INR and a static template image INR, the motion field INR is composed of an implicit neural representation INR, the motion field INR generates deformation motion offset and space-time coupling characteristics according to the time coordinates and the space coordinates, and the static template image INR performs image reconstruction according to the space coordinates and/or the space-time coupling characteristics after deformation motion offset, so as to obtain the reconstructed image of the four-dimensional CT to be reconstructed.
It can be appreciated that, as shown in fig. 2, the embodiment of the present application may construct a four-dimensional CT reconstruction model in advance, where the model includes two main modules: a motion field INR as shown in fig. 2 (1), and a static template image INR as shown in fig. 2 (2). In the design of the motion field INR, the space characteristic and the time characteristic are subjected to coupling treatment to be used as space-time coupling characteristics; in the design of the static template image INR, a deformation motion mechanism and optionally a non-deformation time-varying mechanism are modeled. In the model of the embodiment of the application, a plurality of INRs are used as basic units, and each INR is provided with a learnable feature coding module which respectively codes space features, time features and template space features.
In the embodiment of the application, the time coordinates adopt projection data acquisition time points with physical significance, or motion phase signals recorded by a motion monitoring sensor; the motion field INR comprises a first feature coding module, a second feature coding module, a coupling module and a first neural network module, wherein the first feature coding module is used for coding the time coordinates to obtain time features; the second feature coding module is used for coding the space coordinates to obtain space features; the coupling module is used for coupling the time characteristics and the space characteristics to obtain space-time coupling characteristics; the first neural network module is used for converting the space-time coupling characteristic into deformation motion offset.
The embodiment of the application can select at least one neural network module to realize the conversion of the time-space coupling characteristic, for example, an MLP module can be selected as the neural network module and the like; in the following embodiments, an MLP module will be specifically described as an example of a neural network module.
It can be appreciated that the embodiment of the present application may use an INR with a learnable feature encoding module, where the feature encoding module converts input coordinates into a feature vector, and the neural network module comprises several layers of neural networks (the embodiment of the present application may use an MLP module as the neural network module), receives the vector output by the feature encoding module as input, and outputs a new vector or value. The INR used in the present application has a learnable feature encoding module containing parameters that can be adjusted during training, which enhances the representation capability of the INR. As shown in fig. 4, the learnable feature encoding module may be implemented in a variety of forms commonly used in the art, such as a fixed frequency encoding plus a learnable MLP, values queried by coordinates in a learnable feature matrix, values queried by coordinates in a set of learnable feature matrices of different resolutions, and so on.
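As a concrete reading of this basic unit, the following minimal sketch shows an "INR with learnable feature encoding": a learnable encoder followed by a small MLP. The PyTorch framing, module names and layer sizes are illustrative assumptions, not the patent's reference implementation.

```python
import torch
import torch.nn as nn

class LearnableFeatureINR(nn.Module):
    """Minimal sketch of the basic unit: a learnable feature encoding module
    followed by a small MLP that maps the feature vector to a value."""
    def __init__(self, encoder: nn.Module, feat_dim: int, out_dim: int = 1):
        super().__init__()
        self.encoder = encoder            # learnable feature encoding module
        self.mlp = nn.Sequential(         # neural network module
            nn.Linear(feat_dim, 64), nn.ReLU(),
            nn.Linear(64, out_dim),
        )

    def forward(self, coords: torch.Tensor) -> torch.Tensor:
        # coords: (batch, d) normalized coordinates -> (batch, out_dim) values
        return self.mlp(self.encoder(coords))
```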
In the embodiment of the application, the calculation formula of the space-time coupling characteristic is as follows:
e_M(x⃗, t) = χ( s⃗(x⃗), s(t) )

wherein M is the motion field INR, e_M is the space-time coupling feature encoding module, x⃗ is the space vector, t is time, s⃗(x⃗) is the spatial feature vector, s(t) is the temporal feature vector, and χ is the coupling module.
In the embodiment of the application, the calculation formula of the deformation motion offset is as follows:
Δx⃗ = φ_M( e_M(x⃗, t) )

wherein φ_M is the first neural network module and Δx⃗ is the deformation motion offset.
It will be appreciated that the motion field INR comprises a learnable space-time coupling feature encoding module e_M and a neural network module φ_M, which convert the time coordinate and the spatial coordinate into the deformation motion offset. The space-time coupling feature encoding module e_M is modeled as a coupling of spatial features and temporal features: the spatial feature vector s⃗(x⃗) and the temporal feature vector s(t) are converted into a new feature vector by a coupling module χ. The coupling module χ may be a fixed operation, such as a simple concatenation or an element-wise multiplication, or a function with learnable parameters, for example concatenating the two vectors and inputting them into a neural network. Both the spatial feature vector s⃗(x⃗) and the temporal feature vector s(t) are designed with learnable feature encodings, which may take the same or different forms; for example, the spatial feature vector s⃗(x⃗) may take the form of a query in a set of multi-resolution feature matrices, while the temporal feature vector s(t) takes the form of a fixed frequency encoding plus a learnable MLP.
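The coupling module χ admits the three readings mentioned above; the sketch below illustrates them (simple concatenation, element-wise multiplication, and a small learnable network). It is an illustrative assumption rather than the patent's own code.

```python
import torch
import torch.nn as nn

def couple_concat(s_x: torch.Tensor, s_t: torch.Tensor) -> torch.Tensor:
    # fixed coupling: simple concatenation of spatial and temporal features
    return torch.cat([s_x, s_t], dim=-1)

def couple_mul(s_x: torch.Tensor, s_t: torch.Tensor) -> torch.Tensor:
    # fixed coupling: element-wise multiplication (requires equal lengths)
    return s_x * s_t

class LearnableCoupling(nn.Module):
    """Coupling with learnable parameters: concatenate, then a small network."""
    def __init__(self, dim_x: int, dim_t: int, out_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim_x + dim_t, out_dim), nn.ReLU())

    def forward(self, s_x: torch.Tensor, s_t: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([s_x, s_t], dim=-1))
```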
In the embodiment of the application, the static template image INR comprises a third feature coding module, a second neural network module and a non-deformation time-varying module, wherein the third feature coding module is used for coding the space coordinates after deformation movement deviation to obtain template space features;
the non-deformation time-varying module is used for combining the space-time coupling characteristics obtained by the motion field INR with the template space characteristics of the deformation-only motion when the target to be reconstructed has the time-varying effect caused by factors other than the additional deformation motion, so as to obtain the template space characteristics with the additional non-deformation time-varying effect; the second neural network module is used for reconstructing and obtaining a reconstructed image of the four-dimensional CT to be reconstructed according to the template space features which are only in deformation motion or have additional non-deformation changes along with time.
It will be appreciated that embodiments of the present application decouple the 4D moving image into a static template image INR (denoted V) and a motion field INR (denoted M), the relationship between which includes both the necessary deformation motion effect and an optional non-deformation time-varying effect; as shown in fig. 2, the attenuation coefficient of the 4D object to be reconstructed is μ(x⃗, t), where x⃗ is the spatial coordinate of a point on the object and t is the time coordinate.

In the connection describing the deformation motion effect, for an object μ, the model assumes that the value of μ at time t and position x⃗ corresponds to the value of the static template image INR at the spatial position x⃗ + M(x⃗, t), that is, μ(x⃗, t) = V(x⃗ + M(x⃗, t)), where the value M(x⃗, t) of the motion field INR at time t and position x⃗ is modeled as the deformation motion offset pointing from x⃗ to x⃗ + M(x⃗, t). Since the object value at time t and position x⃗ is related only to a single post-motion position, this relation models the deformation motion.

The embodiment of the application can also establish a second model in the additionally added connection describing the non-deformation time-varying effect: it is assumed that the value at x⃗ + M(x⃗, t) depends not only on the position x⃗ + M(x⃗, t) but also on (x⃗, t) itself. Therefore, the static template image INR receives not only x⃗ + M(x⃗, t) as input, but also the feature vector e_M(x⃗, t) representing (x⃗, t) as an additional input.
In the embodiment of the application, for a target V_1 having only deformation motion in the four-dimensional CT image to be reconstructed, the image reconstruction expression is:

V_1(x⃗, t) = φ_V( h(x⃗ + Δx⃗) )

wherein h is the third feature encoding module, V_1(x⃗, t) is the image reconstruction expression of the four-dimensional CT for the target V_1 having deformation motion, and φ_V is the second neural network module.

For a target V_2 having additional non-deformation variation over time in the four-dimensional CT image to be reconstructed, the image reconstruction expression is:

V_2(x⃗, t) = φ_V( [ h(x⃗ + Δx⃗), e_M(x⃗, t) ] )

wherein V_2(x⃗, t) is the image reconstruction expression of the four-dimensional CT for the target V_2 having additional non-deformation variation over time, h is the third feature encoding module, φ_V is the second neural network module, and [ · , · ] denotes the merged expression obtained by splicing the template spatial feature vector h(x⃗ + Δx⃗) and the space-time coupling feature vector e_M(x⃗, t).
It can be appreciated that the feature encoding module in the static template image INR is of a parameter-learnable design; expanding V_1 into its deformation-motion-only INR form yields the template third feature encoding module h and the neural network module φ_V, both with learnable parameters.
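A minimal sketch of the static template image INR forward pass, covering both the deformation-only target V_1 and the target V_2 with an additional non-deformation time-varying input, might look as follows; the module interface and layer sizes are assumptions for illustration.

```python
import torch
import torch.nn as nn

class StaticTemplateINR(nn.Module):
    def __init__(self, h: nn.Module, feat_dim: int, coupling_dim: int,
                 time_varying: bool = False):
        super().__init__()
        self.h = h                               # third feature encoding module
        self.time_varying = time_varying         # enable non-deformation time variation
        in_dim = feat_dim + (coupling_dim if time_varying else 0)
        self.phi_V = nn.Sequential(              # second neural network module
            nn.Linear(in_dim, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, x_warped: torch.Tensor, e_M: torch.Tensor = None):
        # x_warped = x + delta_x (spatial coordinate after the deformation offset)
        feat = self.h(x_warped)                  # template spatial feature
        if self.time_varying:
            feat = torch.cat([feat, e_M], dim=-1)    # V_2: splice the coupling feature
        return self.phi_V(feat)                  # reconstructed attenuation value
```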
Therefore, after the loss function is trained, the embodiment of the application can substitute any group of space and time coordinates to be queried in any group of four-dimensional CT into model reasoning to obtain a reconstruction value of the position; the loss function defining method in the training process of the embodiment of the application can be specifically as follows:
the loss function includes both a data fidelity term and an optional multi-level regularization term, wherein obtaining the data fidelity term to compare the model predictions to the actual projection data may be accomplished in at least one manner, such as distance functions including, but not limited to, l-norms Weighted least squares method, conditional probability, etc. under CT projection noise modeling to facilitate denoising, where +.>And p is a real label and is a model predictive value.
The multi-level regularization term is implemented by simultaneously constraining the properties of intermediate variables of the model, including the output of the motion field INR and the output of the template object INR, to be substituted into a smoothing characterization function and an edge extraction function, and the sparsity, smoothness and other properties of the output are improved through optimization of these feature metrics, where the functions include, but are not limited to, the aforementioned l-norm, and Total-Variation (Total-Variation) functions commonly used in the field, and the like.
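A loss of this shape can be sketched as below, using a 1-norm data fidelity term and a (mean squared) 2-norm regularization on the deformation motion offsets; the specific weights mirror the example values given later in the embodiment, and the function signature is an assumption.

```python
import torch

def reconstruction_loss(pred_proj: torch.Tensor, true_proj: torch.Tensor,
                        motion_offsets: torch.Tensor,
                        w_data: float = 1.0, w_reg: float = 0.1) -> torch.Tensor:
    # data fidelity term: 1-norm between model-predicted and measured projections
    data_term = torch.mean(torch.abs(pred_proj - true_proj))
    # regularization term: mean squared 2-norm of the deformation motion offsets
    reg_term = torch.mean(motion_offsets ** 2)
    return w_data * data_term + w_reg * reg_term
```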
The parameter training process of the method of the embodiment of the application can be specifically as follows:
(1) Acquiring 4DCT projections of a 4D object to be reconstructed, wherein each time point corresponds to a projection angle;
(2) In each iteration step in training, a batch of projection data point subsets (comprising different time stamps and coordinates on a two-dimensional projection graph, can follow different rules, such as complete randomness or the same time stamp in each batch of data points, etc.) are randomly selected from four-dimensional CT projections according to a certain rule;
(3) Calculating paths of X rays corresponding to the projection data points; selecting a plurality of inquiry space coordinates in the superposition area of the X-ray path and the reconstruction area, and inquiring the predicted value of the proposed model under the space coordinates and the time stamp;
(4) According to a line integral rendering equation of X-ray projection, calculating a projection value of a proposed model prediction result on each projection data point position, namely a projection value of model prediction;
(5) Substituting the projection value predicted by the model, the real projection label and the intermediate variable predicted by the model into a loss function, and training the model in the method through error back transmission;
(6) After training, substituting any group of space and time coordinates to be queried in the four-dimensional CT into a model for reasoning, and obtaining a reconstruction value of the position.
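Steps (2) to (5) above can be summarized in the following training skeleton. The ray_sampler interface, the assumption that the model also returns its intermediate motion offsets, and the inlined loss are hypothetical placeholders; only the step ordering follows the list above.

```python
import torch

def train_model(model, projections, ray_sampler, n_iters=500 * 720, lr=3e-3):
    """Skeleton of steps (2)-(5); ray_sampler and the model interface are hypothetical."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(n_iters):
        # (2) randomly select a batch of projection data points (timestamp + detector coords)
        det_points, labels, t = ray_sampler.sample_batch(projections)
        # (3) compute each point's X-ray path and pick query coordinates inside
        #     the reconstruction region
        coords, spacing = ray_sampler.query_points(det_points)   # (rays, samples, 3)
        mu, offsets = model(coords, t)                           # attenuation + motion offsets
        # (4) line-integral rendering: predicted projection value for each data point
        pred = (mu * spacing).sum(dim=1)
        # (5) data fidelity plus regularization on the intermediate motion offsets
        loss = torch.mean(torch.abs(pred - labels)) + 0.1 * torch.mean(offsets ** 2)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```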
The four-dimensional CT image reconstruction method according to the embodiment of the present application will be described in the following with a specific embodiment, specifically as follows:
(1) The embodiment can be applied to 4DCT of any scanning mode or system architecture without losing generality; a cone beam circular orbit CT imaging system will be exemplified in the following procedure. In this system, as shown in fig. 3, the flat panel detector and the light source are moved around the scanned object at a certain rotational speed, the rotational center is set as the origin of coordinates, the flat panel detector is always perpendicular to the XY plane, and the Z coordinate of the light source is 0.
In this embodiment, the scanned object is the patient's lung region and a single breathing cycle lasts 5 seconds (s). The gantry rotation speed is set to 2.5 s per revolution, so that during the continuous 5 s scan the gantry rotates 2 full revolutions and scans 720° in total; 360 evenly distributed view angles are acquired per revolution, 720 views in total, each view corresponding to a time window of length 5/720 s.
Assume that the cone beam projection data obtained by the detector are p ∈ ℝ^(C×D×V), where ℝ denotes real space, C and D are respectively the numbers of columns and rows of a single projection, and V is the total number of views; the time-dependent attenuation coefficient of the object to be reconstructed at a point can be represented by a field μ(x⃗, t), where x⃗ = (x, y, z) is the spatial coordinate in the object coordinate system, with x, y, z assumed for convenience to be normalized to [0,1]; t is the time coordinate (timestamp) during the scanned period, assumed for convenience to be normalized to [0,1].

In medical diagnostic applications and the like, to facilitate visualization, μ is often stored and displayed in its discrete form, which can be represented as a sampling of the continuous field on a regular grid of size N_x × N_y × N_z × N_t:

μ_d[i, j, k, n] = μ(x_i, y_j, z_k, t_n),  i = 1, …, N_x, j = 1, …, N_y, k = 1, …, N_z, n = 1, …, N_t,

where (x_i, y_j, z_k, t_n) are the normalized grid coordinates. Assuming that the physical volume of the object to be reconstructed is S_x × S_y × S_z and the scanned time is T_scan, the ideal spatial resolution that the discrete form can represent is S_x/N_x × S_y/N_y × S_z/N_z and the ideal temporal resolution is T_scan/N_t, whereas for the continuous form μ(x⃗, t) there is formally no limitation on temporal and spatial resolution. In this embodiment, the continuous form of the attenuation coefficient μ(x⃗, t) can be used; after the model is trained, the discrete-form linear attenuation coefficient map of the object to be reconstructed can be obtained in actual use by inferring the model's predicted value at each discrete grid point position.
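To make the example scan timing concrete, the snippet below maps each of the 720 views to an acquisition time, a normalized timestamp and a gantry angle under the rotation speed stated above; the use of the window midpoint as the view's timestamp is an assumption for illustration.

```python
import numpy as np

T_SCAN = 5.0          # total scan time (s), one breathing cycle
T_ROT = 2.5           # gantry rotation period (s)
N_VIEWS = 720         # 360 views per revolution x 2 revolutions

view = np.arange(N_VIEWS)
time_s = (view + 0.5) * T_SCAN / N_VIEWS      # acquisition time of each view (s)
t_norm = time_s / T_SCAN                      # timestamp normalized to [0, 1]
angle_deg = (time_s / T_ROT) * 360.0 % 360.0  # gantry angle of the source (deg)
```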
(2) The learnable feature encodings have various forms; in this embodiment, form 1, "fixed frequency encoding + learnable MLP", and form 3, "a set of learnable feature matrices with different resolutions", can be used, as shown in fig. 4. Fig. 4 gives examples of learnable feature encoding implementations in the basic unit "INR with learnable feature encoding". Form 3, "a set of learnable feature matrices with different resolutions", is composed of a series of instances of form 2, "learnable feature matrix".
In form 1, for example, each dimension of the input coordinates (x, y) is first converted into a frequency-domain vector by a fixed frequency encoding function; for example, x is converted into

γ(x) = [sin(2^0 x), sin(2^1 x), …, sin(2^(K−1) x), cos(2^0 x), cos(2^1 x), …, cos(2^(K−1) x)],

where the integer K is the order of the frequency-domain vector, which may be set to 16 in this embodiment. Next, the frequency-domain vectors of the dimensions are spliced into one large vector, such as [γ(x), γ(y)], and input into a parameter-learnable MLP, which converts it into a new vector f(x, y) of length F representing the feature vector. In this embodiment, the parameter-learnable MLP is a neural network with two fully connected layers, the first followed by a ReLU activation function and the second followed by no activation function.
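A sketch of form 1 under the stated assumptions (K = 16, two fully connected layers with a ReLU after the first) is given below; the feature length and the PyTorch implementation details are illustrative.

```python
import torch
import torch.nn as nn

class FrequencyEncoding(nn.Module):
    """Form 1 sketch: fixed frequency encoding followed by a learnable MLP."""
    def __init__(self, in_dim: int, K: int = 16, feat_dim: int = 32):
        super().__init__()
        self.K = K
        self.mlp = nn.Sequential(
            nn.Linear(2 * K * in_dim, 64), nn.ReLU(),   # first layer + ReLU
            nn.Linear(64, feat_dim),                    # second layer, no activation
        )

    def forward(self, coords: torch.Tensor) -> torch.Tensor:
        # gamma(x) = [sin(2^k x), cos(2^k x)] for k = 0..K-1, per input dimension
        freqs = 2.0 ** torch.arange(self.K, device=coords.device)
        angles = coords.unsqueeze(-1) * freqs            # (batch, in_dim, K)
        gamma = torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)
        return self.mlp(gamma.flatten(start_dim=1))      # feature vector f(coords)
```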
In form 2, for example, a learnable-parameter matrix H ∈ ℝ^(N×N×F) is included, where the first two dimensions correspond to the dimensions of the input coordinates and the size F of the last dimension is the number of channels of the feature matrix. Assume that the input coordinates (x, y) have been normalized to [0,1]; each dimension is scaled to [0, N−1] to obtain the scaled coordinates, where N is the side length of the learnable-parameter matrix. Next, the adjacent integer lattice points are found; for the two-dimensional coordinates of this embodiment there are four adjacent integer lattice points. The values at the four adjacent integer lattice points are looked up in the learnable-parameter matrix, and finally d-linear interpolation (bilinear interpolation in this example) is performed according to the positional relationship between the scaled d-dimensional coordinates and the four adjacent integer lattice points, yielding the value of (x, y) in the matrix, a vector of length F representing the feature vector.
In form 3, for example, L instances of form 2 with different feature matrix resolutions are combined, where L is the number of resolution levels and may be set to 16 in this embodiment; the first d dimensions of each matrix have different sizes, for example in this embodiment the first 2 dimensions of the first matrix are 16 and the first 2 dimensions of the last matrix are 512. The input coordinates (x, y) are respectively input into the L feature encodings of form 2 to obtain L feature vectors of length F, and finally the multi-resolution feature vectors are spliced to obtain a feature vector of length L × F.
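Forms 2 and 3 can be sketched as follows, restricted to 2D input coordinates for brevity: one learnable N × N × F matrix queried by bilinear interpolation, and L such grids of different resolutions whose outputs are spliced. Shapes and initialization are assumptions.

```python
import torch
import torch.nn as nn

class FeatureGrid2D(nn.Module):
    """Form 2 sketch: an N x N x F learnable matrix queried by bilinear interpolation."""
    def __init__(self, N: int, F: int):
        super().__init__()
        self.N = N
        self.table = nn.Parameter(torch.randn(N, N, F) * 1e-2)

    def forward(self, xy: torch.Tensor) -> torch.Tensor:
        # xy in [0, 1]^2, shape (batch, 2); scale to [0, N-1]
        p = xy * (self.N - 1)
        i0 = p.floor().long().clamp(0, self.N - 2)       # lower adjacent lattice point
        w = p - i0.float()                               # interpolation weights
        x0, y0 = i0[:, 0], i0[:, 1]
        wx, wy = w[:, 0:1], w[:, 1:2]
        f00 = self.table[x0, y0];     f10 = self.table[x0 + 1, y0]
        f01 = self.table[x0, y0 + 1]; f11 = self.table[x0 + 1, y0 + 1]
        return ((1 - wx) * (1 - wy) * f00 + wx * (1 - wy) * f10 +
                (1 - wx) * wy * f01 + wx * wy * f11)     # (batch, F)

class MultiResFeatureGrid2D(nn.Module):
    """Form 3 sketch: L grids of increasing resolution, outputs spliced to length L*F."""
    def __init__(self, resolutions, F: int = 2):
        super().__init__()
        self.levels = nn.ModuleList(FeatureGrid2D(N, F) for N in resolutions)

    def forward(self, xy: torch.Tensor) -> torch.Tensor:
        return torch.cat([level(xy) for level in self.levels], dim=-1)
```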
In particular, for the larger feature matrices present in form 2 and form 3, the storage space may be compressed by some means. For example, the first d dimensions of a larger (d+1)-dimensional matrix can be compressed into a single dimension of size T, forming a smaller matrix H ∈ ℝ^(T×F) for storage, with the input d-dimensional integer coordinates mapped to an integer index in [0, T−1] by the following hash coordinate mapping function:

index(x_1, …, x_d) = ( x_1·π_1 ⊕ x_2·π_2 ⊕ … ⊕ x_d·π_d ) mod T

wherein ⊕ denotes the bitwise exclusive-or operation and π_1, …, π_d are sufficiently large prime numbers that differ from each other. The value of the input coordinate in H is then looked up with the new integer index. Although different d-dimensional coordinates may be mapped to the same integer index, i.e. a "hash collision" may occur, employing the hash coordinate mapping function can effectively reduce the storage size of the matrix with only a small loss of performance. For example, in the present embodiment T = 2^19 may be set, and feature matrices whose total size over the first d dimensions is larger than T are compressed in this way.
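The hash coordinate mapping can be sketched as below; the prime values and table size match those quoted later in the embodiment, while the tensor handling is an illustrative assumption.

```python
import torch

PRIMES = (1, 2654435761, 805459861)    # pi_1, pi_2, pi_3
T = 2 ** 19                            # compressed table size

def hash_index(lattice: torch.Tensor) -> torch.Tensor:
    """Map integer d-dimensional lattice coordinates to an index in [0, T-1]
    via bitwise XOR of coordinate-times-prime products (sketch)."""
    idx = torch.zeros_like(lattice[..., 0])
    for i in range(lattice.shape[-1]):
        idx = idx ^ (lattice[..., i] * PRIMES[i])
    return idx % T
```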
(3) Defining a motion field INR of the model:
an embodiment of the application may define the motion field INR in the model as shown in the diagram (1) in fig. 2, using the following embodiment.
In this embodiment, the coupling module χ in the motion field INR is set to be an element-wise multiplication of the temporal feature vector and the spatial feature vector; to perform this operation, the temporal feature vector and the spatial feature vector are set to the same length in this example. After the time coordinate t is normalized to [0,1], it is input into a learnable feature encoding module, where the learnable feature encoding takes form 1 described in step (2) above, with parameter K = 16 for the frequency encoding part and an MLP comprising 2 fully connected layers, the first of size 32×64 and the second of size 64×32, thereby encoding the time coordinate into a feature vector s(t) of length 32. After the spatial coordinate x⃗ is normalized to [0,1], it is input into a learnable feature encoding module, where the learnable feature encoding takes form 3 described in step (2) above, comprising 16 feature matrices of different resolutions, the size of the i-th matrix being N_i × N_i × N_i × F, where {N_i}, i = 1, …, 16, is a geometric sequence from 32 to 1024 and F = 2. Furthermore, the hash encoding described in step (2) is used to compress the parameters of the matrices, with parameters T = 2^19, π_1 = 1, π_2 = 2654435761, π_3 = 805459861. The spatial coordinate is thereby encoded into a feature vector s⃗(x⃗) of length 32. s(t) and s⃗(x⃗) pass through the coupling module χ (element-wise multiplication in this embodiment) to obtain the space-time coupling feature vector e_M(x⃗, t), which is input into an MLP defined by 3 convolution layers of sizes 32×64, 64×64 and 64×3 respectively, with a ReLU activation function following the first two layers. Finally, a vector of length 3 is obtained, which is the deformation motion offset Δx⃗.
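Putting the pieces of this step together, a motion field INR with the sizes quoted above (length-32 temporal and spatial features, element-wise coupling, and a 32-64-64-3 network) might look like the sketch below; the encoder modules are passed in from the feature-encoding sketches above, and none of this is the patent's reference implementation.

```python
import torch
import torch.nn as nn

class MotionFieldINR(nn.Module):
    def __init__(self, space_encoder: nn.Module, time_encoder: nn.Module):
        super().__init__()
        self.space_encoder = space_encoder     # multi-resolution grid -> length 32
        self.time_encoder = time_encoder       # frequency encoding + MLP -> length 32
        self.phi_M = nn.Sequential(            # 32 -> 64 -> 64 -> 3, ReLU after first two
            nn.Linear(32, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, 3),
        )

    def forward(self, x: torch.Tensor, t: torch.Tensor):
        e_M = self.space_encoder(x) * self.time_encoder(t)   # element-wise coupling
        delta_x = self.phi_M(e_M)                            # deformation motion offset
        return delta_x, e_M
```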
(4) Defining a static template image INR of the model:
an embodiment of the present application can define a static template image INR as shown in the diagram (1) in fig. 2 by using the following embodiment.
For the connection of deformation movement, the result of the previous step is firstly usedTo calculate the coordinates after deformation movementThen inputted into a module for learning feature coding, wherein the learning feature coding adopts the form 3 described in the previous eight-2, the specific definition mode is the same as the learning feature coding of space coordinates in the motion field INR, the output of the module is a vector with length of 32, and the vector is template space feature->To add an optional non-deforming time-dependent relation, the intermediate variable +.>Short-circuiting to the aforementioned template spatial features->Then, the new vectors with the length of 64 are spliced to be used as the input of the next MLP; otherwise, consider only template spatial features +.>As input to the next MLP. Finally, the resulting vector of length α is input to an MLP defined by 3 convolutional layers of size α×64, 64×64, 64×1, respectively, with a ReLU activation function following the first two layers. Finally, a floating point number is obtained, and an image is reconstructed +. >
(5) The method comprises the following training processes:
first, obtaining 4DCT projection obtained by scanning an example cone beam CT system, wherein the size of the projection is as followsWhere the number of views v=720, i.e. the total number of time stamps.
In a second step, training is set to 500 epochs, each epoch containing 720 iterations. For each iteration, one frame is randomly selected from the 720 projection frames, and 1024 projection data points are randomly selected on its C × D two-dimensional projection data; the coincident line segment of the X-ray path of each projection data point (the line connecting the detector element and the light source) with the reconstruction region is calculated, and several spatial coordinate points to be queried are selected on this line segment in a certain manner. For example, if the absolute value of the X coordinate of the light source is larger than the absolute value of its Y coordinate, the points to be queried are selected on all coincident line segments at the same X-coordinate spacing, set to the X-direction pixel size S_x/N_x of the region to be reconstructed; otherwise, the points to be queried are selected on all coincident line segments at the same Y-coordinate spacing, set to the Y-direction pixel size S_y/N_y of the region to be reconstructed. The spatial coordinates and timestamps of all query points are input into the model defined above to obtain a series of predicted linear attenuation coefficient values.
In a third step, according to the line-integral rendering equation of CT projection, the projection value of the proposed model's prediction is computed at each projection data point position, i.e. the model-predicted projection value. For example, for the segment on the X-ray path corresponding to any projection data point, the predicted line attenuation coefficient of each query point is multiplied by the distance to the adjacent query point, and the products are summed to obtain the predicted projection value.
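In code, the predicted projection value for one ray can be approximated by the discrete line integral sketched below (an illustration only; `mu_pred` denotes the attenuation coefficients predicted at the query points and `points` their coordinates).

```python
import torch

def render_projection(mu_pred, points):
    """Discrete line-integral rendering: each predicted attenuation value is weighted by the
    distance to the adjacent query point and the products are summed along the ray."""
    mu_pred = mu_pred.reshape(-1)                                    # ensure a 1-D vector
    seg_len = torch.linalg.norm(points[1:] - points[:-1], dim=-1)    # neighbour spacings
    return (mu_pred[:-1] * seg_len).sum()                            # model-predicted projection value
```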
In a fourth step, the model-predicted projection values, the actual projection labels and the intermediate variables of the model prediction are substituted into the loss function. For example, the loss function includes a data fidelity term and a regularization term: the data fidelity term is defined as the 1-norm of the difference between the model-predicted projection values and the true projection labels, and the regularization term is defined as the 2-norm of the final output of the motion field INR in eight-3, i.e. the deformation motion offset $\Delta\vec{x}(\vec{x},t)$. Taking a data fidelity weight of 1 and a regularization weight of 0.1 as an example, the loss value is passed to a deep-learning optimizer and the model parameters are updated by back-propagation; this embodiment may employ the Adam optimizer with a learning rate of 3e-3.
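The loss and update of this step could look roughly like the sketch below (weights 1 and 0.1 and Adam with learning rate 3e-3 as in the example); averaging over the sampled rays and query points is an implementation choice, and `model` is assumed to bundle the two INRs defined earlier.

```python
import torch

# Assumed setup, created once before training:
# optimizer = torch.optim.Adam(model.parameters(), lr=3e-3)

def training_step(optimizer, pred_proj, true_proj, offsets):
    """One iteration: L1 data fidelity on the projections plus a weighted L2 penalty on the
    deformation motion offsets, followed by back-propagation and an Adam update."""
    data_fidelity = torch.abs(pred_proj - true_proj).mean()      # 1-norm data fidelity term
    regularizer = torch.linalg.norm(offsets, dim=-1).mean()      # 2-norm of the deformation offsets
    loss = 1.0 * data_fidelity + 0.1 * regularizer
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```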
(6) 4DCT inference after training:
After training, according to the visualization requirement, this embodiment computes the 4D coordinates of each pixel in the discrete form of the 4D image; for a given dimension of size N, N coordinates are taken along that dimension. The 4D coordinates of each pixel are input into the trained model to obtain the predicted line attenuation coefficient of that pixel. Finally, all values are arranged by index into the discrete form of the 4D image.
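For this inference step, a sketch of generating and querying the 4D pixel coordinates is given below; the pixel-centre convention (i + 0.5)/N is an assumption, since the exact coordinate formula is not reproduced in this text, and the whole grid is queried at once for brevity (in practice it would be processed in chunks).

```python
import torch

def reconstruct_4d(model, shape):
    """Query the trained model at every 4D pixel and arrange the predictions by index
    into the discrete 4D image; `shape` = (T, Z, Y, X)."""
    axes = [(torch.arange(n, dtype=torch.float32) + 0.5) / n for n in shape]  # assumed pixel centres
    t, z, y, x = torch.meshgrid(*axes, indexing="ij")
    coords = torch.stack([x, y, z], dim=-1).reshape(-1, 3)     # spatial query coordinates
    times = t.reshape(-1, 1)                                   # matching time coordinates
    with torch.no_grad():
        mu = model(coords, times)                              # predicted line attenuation coefficients
    return mu.reshape(shape)                                   # discrete 4D reconstructed image
```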
In summary, the four-dimensional CT image reconstruction method provided by the embodiments of the application constructs and applies a reconstruction model containing learnable feature encoding modules, so that the model can be evaluated directly at any single position without first computing a complete static image and a four-dimensional motion field and then performing the deformation operation. This reduces the computational cost and the difficulty of solving, overcomes the limitation of limited-angle reconstruction, improves the accuracy of high-temporal-resolution reconstruction, and at the same time improves the applicability of the model to different application scenarios, thereby improving the user experience and meeting practical needs.
Next, a four-dimensional CT image reconstruction apparatus according to an embodiment of the present application will be described with reference to the accompanying drawings.
Fig. 5 is a block diagram of a four-dimensional CT image reconstruction apparatus according to an embodiment of the present application.
As shown in fig. 5, the four-dimensional CT image reconstruction apparatus 10 includes: an acquisition module 100, an extraction module 200 and a reconstruction module 300.
The acquisition module 100 is configured to acquire projection data of a four-dimensional computed tomography CT to be reconstructed; the extraction module 200 is used for extracting the time coordinates and the space coordinates of the target in the projection data of the four-dimensional CT to be reconstructed; the reconstruction module 300 is configured to input the time coordinates and the space coordinates into a pre-constructed reconstruction model, and output a reconstructed image of the four-dimensional CT to be reconstructed, where the reconstruction model includes a motion field INR and a static template image INR, the motion field INR is configured by an implicit neural representation INR, the motion field INR generates a deformation motion offset and a space-time coupling feature according to the time coordinates and the space coordinates, and the static template image INR performs image reconstruction according to the space coordinates and/or the space-time coupling feature after the deformation motion offset, so as to obtain the reconstructed image of the four-dimensional CT to be reconstructed.
In the embodiment of the application, the time coordinates adopt projection data acquisition time points with physical significance, or motion phase signals recorded by a motion monitoring sensor; the motion field INR comprises a first feature coding module, a second feature coding module, a coupling module and a first neural network module, wherein the first feature coding module is used for coding the time coordinates to obtain time features; the second feature coding module is used for coding the space coordinates to obtain space features; the coupling module is used for coupling the time characteristics and the space characteristics to obtain space-time coupling characteristics; the first neural network module is used for converting the space-time coupling characteristic into deformation motion offset.
In the embodiment of the application, the space-time coupling feature is computed as:

$$e_M(\vec{x}, t) = \chi\big(\vec{s}(\vec{x}),\, s(t)\big)$$

wherein M is the motion field INR, and $e_M$ is the space-time coupling feature encoding module; $\vec{x}$ is the spatial coordinate vector, t is the time, $\vec{s}(\vec{x})$ is the spatial feature vector, s(t) is the temporal feature vector; and χ is the coupling module.
In the embodiment of the application, the deformation motion offset is computed as:

$$\Delta\vec{x}(\vec{x}, t) = \phi_M\big(e_M(\vec{x}, t)\big)$$

wherein $\phi_M$ is the first neural network module, and $\Delta\vec{x}(\vec{x}, t)$ is the deformation motion offset.
In the embodiment of the application, the static template image INR comprises a third feature encoding module, a second neural network module and a non-deformation time-varying module, wherein the third feature encoding module is used for encoding the spatial coordinates after the deformation motion offset to obtain the template spatial feature; the non-deformation time-varying module is used, when the target to be reconstructed exhibits additional time variation caused by factors other than deformation motion, for combining the space-time coupling feature obtained by the motion field INR with the template spatial feature of deformation motion only, so as to obtain a template spatial feature with additional non-deformation time variation; and the second neural network module is used for reconstructing the reconstructed image of the four-dimensional CT to be reconstructed from the template spatial feature, either of deformation motion only or with additional non-deformation time variation.
In the embodiment of the application, for a target $V_1$ with deformation motion in the four-dimensional CT image to be reconstructed, the image reconstruction is expressed as:

$$V_1(\vec{x}, t) = \phi_{V_1}\Big(h\big(\vec{x} + \Delta\vec{x}(\vec{x}, t)\big)\Big)$$

wherein h is the third feature encoding module, $V_1(\vec{x}, t)$ is the image reconstruction expression of the four-dimensional CT for the target $V_1$ with deformation motion, and $\phi_{V_1}$ is the second neural network module;

for a target $V_2$ with additional non-deformation time variation in the four-dimensional CT image to be reconstructed, the image reconstruction is expressed as:

$$V_2(\vec{x}, t) = \phi_{V_2}\Big(h\big(\vec{x} + \Delta\vec{x}(\vec{x}, t)\big) \oplus e_M(\vec{x}, t)\Big)$$

wherein $V_2(\vec{x}, t)$ is the image reconstruction expression of the four-dimensional CT for the target $V_2$ with additional non-deformation time variation, h is the third feature encoding module, $\phi_{V_2}$ is the second neural network module, and $\oplus$ denotes the splicing (concatenation) of the vectors $h\big(\vec{x} + \Delta\vec{x}(\vec{x}, t)\big)$ and $e_M(\vec{x}, t)$.
It should be noted that the foregoing explanation of the four-dimensional CT image reconstruction method embodiment is also applicable to the four-dimensional CT image reconstruction apparatus of this embodiment, and will not be repeated here.
The four-dimensional CT image reconstruction device provided by the embodiments of the application constructs and applies a reconstruction model containing learnable feature encoding modules, so that the model can be evaluated directly at any single position without first computing a complete static image and a four-dimensional motion field and then performing the deformation operation. This reduces the computational cost and the difficulty of solving, overcomes the limitation of limited-angle reconstruction, improves the accuracy of high-temporal-resolution reconstruction, and at the same time improves the applicability of the model to different application scenarios, thereby improving the user experience and meeting practical needs.
Fig. 6 is a schematic structural diagram of a medical device according to an embodiment of the present application. The medical device may include:
a memory 601, a processor 602, and a computer program stored on the memory 601 and executable on the processor 602.
The processor 602 implements the four-dimensional CT image reconstruction method provided in the above embodiment when executing a program.
Further, the medical device further includes:
a communication interface 603 for communication between the memory 601 and the processor 602.
A memory 601 for storing a computer program executable on the processor 602.
The memory 601 may include a high-speed random access memory (RAM), and may further include a nonvolatile memory, such as at least one magnetic disk memory.
If the memory 601, the processor 602, and the communication interface 603 are implemented independently, the communication interface 603, the memory 601, and the processor 602 may be connected to each other through a bus and communicate with each other. The bus may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, or an EISA (Extended Industry Standard Architecture) bus, among others. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in fig. 6, but this does not mean that there is only one bus or only one type of bus.
Alternatively, in a specific implementation, if the memory 601, the processor 602, and the communication interface 603 are integrated on a chip, the memory 601, the processor 602, and the communication interface 603 may perform communication with each other through internal interfaces.
The processor 602 may be a CPU (Central Processing Unit ) or ASIC (Application Specific Integrated Circuit, application specific integrated circuit) or one or more integrated circuits configured to implement embodiments of the present application.
The embodiment of the application also provides a computer readable storage medium, on which a computer program is stored, which when being executed by a processor, implements the four-dimensional CT image reconstruction method as above.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or N embodiments or examples. Furthermore, the different embodiments or examples described in this specification and the features of the different embodiments or examples may be combined and combined by those skilled in the art without contradiction.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In the description of the present application, "N" means at least two, for example, two, three, etc., unless specifically defined otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and additional implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order from that shown or discussed, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the embodiments of the present application.
It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the N steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. As with the other embodiments, if implemented in hardware, may be implemented using any one or combination of the following techniques, as is well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application specific integrated circuits having suitable combinational logic gates, programmable gate arrays, field programmable gate arrays, and the like.
Those of ordinary skill in the art will appreciate that all or a portion of the steps carried out in the method of the above-described embodiments may be implemented by a program to instruct related hardware, where the program may be stored in a computer readable storage medium, and where the program, when executed, includes one or a combination of the steps of the method embodiments.
While embodiments of the present application have been shown and described above, it will be understood that the above embodiments are illustrative and not to be construed as limiting the application, and that variations, modifications, alternatives and variations may be made to the above embodiments by one of ordinary skill in the art within the scope of the application.

Claims (14)

1. A four-dimensional CT image reconstruction method, characterized by comprising the following steps:
obtaining projection data of a four-dimensional CT to be reconstructed;
extracting a time coordinate and a space coordinate of a target in projection data of the four-dimensional CT to be reconstructed;
inputting the time coordinates and the space coordinates into a pre-constructed reconstruction model, and outputting a reconstructed image of the four-dimensional CT to be reconstructed, wherein the reconstruction model comprises a motion field INR and a static template image INR, the motion field INR is composed of an implicit neural representation INR, the motion field INR generates deformation motion offset and space-time coupling characteristics according to the time coordinates and the space coordinates, and the static template image INR performs image reconstruction according to the space coordinates and/or the space-time coupling characteristics after deformation motion offset to obtain the reconstructed image of the four-dimensional CT to be reconstructed.
2. The four-dimensional CT image reconstruction method according to claim 1, wherein the time coordinates employ projection data acquisition time points of physical significance, or motion phase signals recorded by a motion monitoring sensor; the motion field INR includes a first feature encoding module, a second feature encoding module, a coupling module, and a first neural network module, wherein,
the first feature encoding module is used for encoding the time coordinates to obtain time features;
the second feature encoding module is used for encoding the space coordinates to obtain space features;
the coupling module is used for coupling the time characteristic and the space characteristic to obtain a space-time coupling characteristic;
the first neural network module is used for converting the space-time coupling characteristic into deformation motion offset.
3. The four-dimensional CT image reconstruction method according to claim 2, wherein the space-time coupling feature has a calculation formula:
$$e_M(\vec{x}, t) = \chi\big(\vec{s}(\vec{x}),\, s(t)\big)$$

wherein M is the motion field INR, and $e_M$ is the space-time coupling feature encoding module; $\vec{x}$ is the spatial coordinate vector, t is the time, $\vec{s}(\vec{x})$ is the spatial feature vector, s(t) is the temporal feature vector; and χ is the coupling module.
4. The four-dimensional CT image reconstruction method as set forth in claim 3, wherein the deformation motion offset is calculated by the formula:
$$\Delta\vec{x}(\vec{x}, t) = \phi_M\big(e_M(\vec{x}, t)\big)$$

wherein $\phi_M$ is the first neural network module, and $\Delta\vec{x}(\vec{x}, t)$ is the deformation motion offset.
5. The four-dimensional CT image reconstruction method according to claim 1, wherein the static template image INR comprises a third feature encoding module, a second neural network module, and a non-deformation time-varying module, wherein,
the third feature encoding module is used for encoding the spatial coordinates after the deformation motion offset to obtain the template spatial feature of deformation motion only;

the non-deformation time-varying module is used, when the target to be reconstructed exhibits additional time variation caused by factors other than deformation motion, for combining the space-time coupling feature obtained by the motion field INR with the template spatial feature of deformation motion only, so as to obtain a template spatial feature with additional non-deformation time variation;

the second neural network module is used for reconstructing the reconstructed image of the four-dimensional CT to be reconstructed from the template spatial feature, either of deformation motion only or with additional non-deformation time variation.
6. The four-dimensional CT image reconstruction method as recited in claim 5, wherein for a target $V_1$ with deformation motion in the four-dimensional CT image to be reconstructed, the image reconstruction is expressed as:

$$V_1(\vec{x}, t) = \phi_{V_1}\Big(h\big(\vec{x} + \Delta\vec{x}(\vec{x}, t)\big)\Big)$$

wherein h is the third feature encoding module, $V_1(\vec{x}, t)$ is the image reconstruction expression of the four-dimensional CT for the target $V_1$ with deformation motion, and $\phi_{V_1}$ is the second neural network module;

for a target $V_2$ with additional non-deformation time variation in the four-dimensional CT image to be reconstructed, the image reconstruction is expressed as:

$$V_2(\vec{x}, t) = \phi_{V_2}\Big(h\big(\vec{x} + \Delta\vec{x}(\vec{x}, t)\big) \oplus e_M(\vec{x}, t)\Big)$$

wherein $V_2(\vec{x}, t)$ is the image reconstruction expression of the four-dimensional CT for the target $V_2$ with additional non-deformation time variation, h is the third feature encoding module, $\phi_{V_2}$ is the second neural network module, and $\oplus$ denotes the splicing (concatenation) of the vectors $h\big(\vec{x} + \Delta\vec{x}(\vec{x}, t)\big)$ and $e_M(\vec{x}, t)$.
7. A four-dimensional CT image reconstruction apparatus, comprising:
the acquisition module is used for acquiring projection data of the four-dimensional computed tomography (CT) to be reconstructed;
the extraction module is used for extracting the time coordinates and the space coordinates of the target in the projection data of the four-dimensional CT to be reconstructed;
the reconstruction module is used for inputting the time coordinates and the space coordinates into a pre-constructed reconstruction model and outputting a reconstructed image of the four-dimensional CT to be reconstructed, wherein the reconstruction model comprises a motion field INR and a static template image INR, the motion field INR is composed of an implicit nerve representation INR, the motion field INR generates deformation motion offset and space-time coupling characteristics according to the time coordinates and the space coordinates, and the static template image INR performs image reconstruction according to the space coordinates and/or the space-time coupling characteristics after the deformation motion offset to obtain the reconstructed image of the four-dimensional CT to be reconstructed.
8. The four-dimensional CT image reconstruction device according to claim 7, wherein the time coordinates employ projection data acquisition time points of physical significance, or motion phase signals recorded by a motion monitoring sensor; the motion field INR includes a first feature encoding module, a second feature encoding module, a coupling module, and a first neural network module, wherein,
the first feature encoding module is used for encoding the time coordinates to obtain time features;
the second feature encoding module is used for encoding the space coordinates to obtain space features;
the coupling module is used for coupling the time characteristic and the space characteristic to obtain a space-time coupling characteristic;
the first neural network module is used for converting the space-time coupling characteristic into deformation motion offset.
9. The four-dimensional CT image reconstruction apparatus according to claim 8, wherein the space-time coupling characteristic is calculated by the formula:
$$e_M(\vec{x}, t) = \chi\big(\vec{s}(\vec{x}),\, s(t)\big)$$

wherein M is the motion field INR, and $e_M$ is the space-time coupling feature encoding module; $\vec{x}$ is the spatial coordinate vector, t is the time, $\vec{s}(\vec{x})$ is the spatial feature vector, s(t) is the temporal feature vector; and χ is the coupling module.
10. The four-dimensional CT image reconstruction apparatus according to claim 9, wherein the deformation motion offset is calculated by the formula:
$$\Delta\vec{x}(\vec{x}, t) = \phi_M\big(e_M(\vec{x}, t)\big)$$

wherein $\phi_M$ is the first neural network module, and $\Delta\vec{x}(\vec{x}, t)$ is the deformation motion offset.
11. The four-dimensional CT image reconstruction apparatus according to claim 7, wherein the static template image INR comprises a third feature encoding module, a second neural network module, and a non-deformation time-varying module, wherein,
the third feature encoding module is used for encoding the spatial coordinates after the deformation motion offset to obtain the template spatial feature;

the non-deformation time-varying module is used, when the target to be reconstructed exhibits additional time variation caused by factors other than deformation motion, for combining the space-time coupling feature obtained by the motion field INR with the template spatial feature of deformation motion only, so as to obtain a template spatial feature with additional non-deformation time variation;

the second neural network module is used for reconstructing the reconstructed image of the four-dimensional CT to be reconstructed from the template spatial feature, either of deformation motion only or with additional non-deformation time variation.
12. The four-dimensional CT image reconstruction apparatus according to claim 11, wherein for a target $V_1$ with deformation motion in the four-dimensional CT image to be reconstructed, the image reconstruction is expressed as:

$$V_1(\vec{x}, t) = \phi_{V_1}\Big(h\big(\vec{x} + \Delta\vec{x}(\vec{x}, t)\big)\Big)$$

wherein h is the third feature encoding module, $V_1(\vec{x}, t)$ is the image reconstruction expression of the four-dimensional CT for the target $V_1$ with deformation motion, and $\phi_{V_1}$ is the second neural network module;

for a target $V_2$ with additional non-deformation time variation in the four-dimensional CT image to be reconstructed, the image reconstruction is expressed as:

$$V_2(\vec{x}, t) = \phi_{V_2}\Big(h\big(\vec{x} + \Delta\vec{x}(\vec{x}, t)\big) \oplus e_M(\vec{x}, t)\Big)$$

wherein $V_2(\vec{x}, t)$ is the image reconstruction expression of the four-dimensional CT for the target $V_2$ with additional non-deformation time variation, h is the third feature encoding module, $\phi_{V_2}$ is the second neural network module, and $\oplus$ denotes the splicing (concatenation) of the vectors $h\big(\vec{x} + \Delta\vec{x}(\vec{x}, t)\big)$ and $e_M(\vec{x}, t)$.
13. A medical device, comprising: a memory, a processor and a computer program stored on the memory and executable on the processor, the processor executing the program to implement the four-dimensional CT image reconstruction method as claimed in any one of claims 1-7.
14. A computer readable storage medium having stored thereon a computer program, wherein the program is executed by a processor for implementing the four-dimensional CT image reconstruction method as claimed in any one of claims 1 to 7.
CN202310841681.3A 2023-07-10 2023-07-10 Four-dimensional CT image reconstruction method, device, medical equipment and storage medium Pending CN116740215A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310841681.3A CN116740215A (en) 2023-07-10 2023-07-10 Four-dimensional CT image reconstruction method, device, medical equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310841681.3A CN116740215A (en) 2023-07-10 2023-07-10 Four-dimensional CT image reconstruction method, device, medical equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116740215A true CN116740215A (en) 2023-09-12

Family

ID=87904522

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310841681.3A Pending CN116740215A (en) 2023-07-10 2023-07-10 Four-dimensional CT image reconstruction method, device, medical equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116740215A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117372565A (en) * 2023-12-06 2024-01-09 合肥锐视医疗科技有限公司 Respiration gating CT imaging method based on neural network time phase discrimination
CN117372565B (en) * 2023-12-06 2024-03-15 合肥锐视医疗科技有限公司 Respiration gating CT imaging method based on neural network time phase discrimination

Similar Documents

Publication Publication Date Title
CN111492406B (en) Method for training machine learning algorithm, image processing system and image reconstruction method
CN110462689B (en) Tomographic reconstruction based on deep learning
Zhang et al. Image prediction for limited-angle tomography via deep learning with convolutional neural network
Sun et al. Coil: Coordinate-based internal learning for tomographic imaging
Trinh et al. Novel example-based method for super-resolution and denoising of medical images
Reed et al. Dynamic ct reconstruction from limited views with implicit neural representations and parametric motion fields
Zhou et al. Limited view tomographic reconstruction using a cascaded residual dense spatial-channel attention network with projection data fidelity layer
US20210233244A1 (en) System and method for image segmentation using a joint deep learning model
JP2017094097A (en) Medical image processing device, x-ray computer tomographic imaging device, and medical image processing method
CN113450396B (en) Three-dimensional/two-dimensional image registration method and device based on bone characteristics
Krebs et al. Probabilistic motion modeling from medical image sequences: application to cardiac cine-MRI
US12023192B2 (en) Single or a few views computed tomography imaging with deep neural network
Gifani et al. Temporal super resolution enhancement of echocardiographic images based on sparse representation
CN116740215A (en) Four-dimensional CT image reconstruction method, device, medical equipment and storage medium
Hauptmann et al. Image reconstruction in dynamic inverse problems with temporal models
Wang et al. Deep tomographic image reconstruction: yesterday, today, and tomorrow—editorial for the 2nd special issue “Machine Learning for Image Reconstruction”
JP2003529423A (en) Fast hierarchical reprojection algorithm for tomography
US10970885B2 (en) Iterative image reconstruction
Lahiri et al. Sparse-view cone beam CT reconstruction using data-consistent supervised and adversarial learning from scarce training data
Zhang et al. DREAM-Net: Deep residual error iterative minimization network for sparse-view CT reconstruction
Bazrafkan et al. Deep neural network assisted iterative reconstruction method for low dose ct
JP7139397B2 (en) Systems and methods for reconstructing medical images using deep neural networks and recursive decimation of measured data
Xie et al. 3D few-view CT image reconstruction with deep learning
Xu A Robust and Efficient Framework for Slice-to-Volume Reconstruction: Application to Fetal MRI
US20240029324A1 (en) Method for image reconstruction, computer device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination