CN114708353B - Image reconstruction method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN114708353B
Authority
CN
China
Prior art keywords
image
model
training
sample
trained
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210628118.3A
Other languages
Chinese (zh)
Other versions
CN114708353A (en)
Inventor
崔玥
李超
余山
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science
Priority to CN202210628118.3A
Publication of CN114708353A
Application granted
Publication of CN114708353B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G06T11/001 Texturing; Colouring; Generation of texture or colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/088 Non-supervised learning, e.g. competitive learning

Abstract

The invention relates to the technical field of artificial intelligence and provides an image reconstruction method, an image reconstruction device, electronic equipment and a storage medium. The image reconstruction model is obtained by training a target pre-training model on a first sample neural image and its corresponding sample reconstruction result. The target pre-training model is in turn obtained through three pre-training modes: contrastive learning unsupervised pre-training, cross-modal image conversion supervised pre-training, and image reconstruction unsupervised pre-training. This avoids the overfitting problem and greatly improves the performance and generalization of the model on the image reconstruction task. On this basis, reconstructing the input neural image with the image reconstruction model can greatly improve the accuracy of the reconstruction result.

Description

Image reconstruction method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of artificial intelligence technologies, and in particular, to an image reconstruction method and apparatus, an electronic device, and a storage medium.
Background
The image reconstruction task refers to repairing images that have defects such as partially missing regions, artifacts or low resolution, and is generally carried out by a neural network model.
At present, most image reconstruction models based on transfer learning are trained in the conventional way, combining a single pre-training task with fine-tuning. However, for brain neuroimage reconstruction tasks the data set is usually small, containing only tens or hundreds of samples, which makes the conventional training approach prone to overfitting and results in poor performance of the image reconstruction model.
Disclosure of Invention
The invention provides an image reconstruction method, an image reconstruction device, electronic equipment and a storage medium, which are used to overcome the defect in the prior art that the training approach for neuroimage reconstruction models is prone to overfitting.
The invention provides an image reconstruction method, which comprises the following steps:
determining a neural image to be reconstructed;
inputting the neural image into an image reconstruction model to obtain a reconstruction result of the neural image output by the image reconstruction model;
the image reconstruction model is obtained by training a target pre-training model based on a first sample neural image and its corresponding sample reconstruction result, and the target pre-training model is obtained from an initial model by joint training in the following three ways:
contrastive learning unsupervised pre-training based on a second sample neural image;
cross-modal image conversion supervised pre-training based on a third sample neural image and the corresponding target modal image;
and image reconstruction unsupervised pre-training based on a fourth sample neural image.
According to the image reconstruction method provided by the invention, the contrastive learning unsupervised pre-training based on the second sample neural image comprises the following steps:
constructing a positive sample pair and a negative sample pair based on the second sample neuroimage;
inputting each image in the positive sample pair to a first model to be trained to obtain each feature vector corresponding to the positive sample pair output by the first model to be trained; the first model to be trained is determined based on the initial model;
inputting each image in the negative sample pair to the first model to be trained to obtain each feature vector corresponding to the negative sample pair output by the first model to be trained;
and training the first model to be trained by taking the consistency of the feature vectors corresponding to the positive sample pairs and the difference of the feature vectors corresponding to the negative sample pairs as targets.
According to the image reconstruction method provided by the invention, the cross-modal image conversion supervised pre-training based on the third sample neural image and the corresponding target modal image comprises the following steps:
inputting the third sample neural image into a second model to be trained to obtain a conversion image output by the second model to be trained; the second model to be trained is determined based on the initial model;
and calculating the voxel-wise mean square error between the converted image and the target modal image, and training the second model to be trained with the voxel-wise mean square error as the objective.
According to the image reconstruction method provided by the invention, the image reconstruction unsupervised pre-training based on the fourth sample neuroimage comprises the following steps:
preprocessing the fourth sample neural image based on image reconstruction task information to obtain a preprocessing result;
inputting the preprocessing result into a third model to be trained to obtain a prediction image output by the third model to be trained; the third model to be trained is determined based on the initial model;
and calculating the voxel-wise mean square error between the prediction image and the fourth sample neural image, and training the third model to be trained with the voxel-wise mean square error as the objective.
According to the image reconstruction method provided by the invention, the initial model is constructed based on the first initial network and/or the second initial network.
According to the image reconstruction method provided by the invention, the image reconstruction model comprises a first encoder, a linear flattening layer, a second encoder and a multilayer perceptron;
the first encoder corresponds to a first initial encoder in the first initial network;
the linear flattening layer, the second encoder, and the multilayer perceptron correspond to an initial linear flattening layer, a second initial encoder, and an initial multilayer perceptron, respectively, in the second initial network;
the first encoder, the linear flattening layer, the second encoder and the multilayer perceptron are connected in sequence.
According to the image reconstruction method provided by the invention, the neural image is a multi-modal neural image.
The present invention also provides an image reconstruction apparatus comprising:
the determining unit is used for determining a neural image to be reconstructed;
the reconstruction unit is used for inputting the neural image into an image reconstruction model to obtain a reconstruction result of the neural image output by the image reconstruction model;
the image reconstruction model is obtained by training a target pre-training model based on a first sample neural image and its corresponding sample reconstruction result, and the target pre-training model is obtained from an initial model by training in the following three ways:
contrastive learning unsupervised pre-training based on a second sample neural image;
cross-modal image conversion supervised pre-training based on a third sample neural image and the corresponding target modal image;
and image reconstruction unsupervised pre-training based on a fourth sample neural image.
The present invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the image reconstruction method as described in any of the above when executing the program.
The invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements an image reconstruction method as described in any of the above.
The invention also provides a computer program product comprising a computer program which, when executed by a processor, implements an image reconstruction method as described in any one of the above.
According to the image reconstruction method, the image reconstruction device, the electronic equipment and the storage medium provided by the invention, a neural image to be reconstructed is determined and input into the image reconstruction model to obtain the reconstruction result of the neural image. The image reconstruction model is obtained by training a target pre-training model on a first sample neural image and its corresponding sample reconstruction result. The target pre-training model is in turn obtained through three pre-training modes: contrastive learning unsupervised pre-training, cross-modal image conversion supervised pre-training, and image reconstruction unsupervised pre-training. This avoids the overfitting problem and greatly improves the performance and generalization of the model on the image reconstruction task. On this basis, reconstructing the input neural image with the image reconstruction model can greatly improve the accuracy of the reconstruction result.
Drawings
In order to illustrate the technical solutions of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Those skilled in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a schematic flow chart of an image reconstruction method provided by the present invention;
FIG. 2 is a schematic diagram of a training process of an image reconstruction model provided by the present invention;
FIG. 3 is a schematic diagram of a network structure of an image reconstruction model provided by the present invention;
FIG. 4 is a schematic structural diagram of an image reconstruction apparatus provided by the present invention;
FIG. 5 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
At present, deep-learning-based neural image reconstruction relies on a neural network model. The model input is an image to be reconstructed that has partially missing regions, artifacts, low resolution or similar defects, and the model output is the reconstructed neural image, for example an image with the missing part completed, with artifacts removed, or at high resolution. Training data is generally obtained by processing good-quality neural images through artificial masking (synthesizing partially missing images), artifact addition (synthesizing artifact images), downsampling (synthesizing low-resolution images) and the like, and the prediction target during training is the unprocessed, defect-free original image.
Most existing pre-training methods for neuroimage neural network models applied to image reconstruction tasks are based on contrastive learning or image reconstruction. They are unsupervised methods and are limited to a single pre-training task, which leads to poor image reconstruction performance of the final fine-tuned neural network model. Moreover, the pre-training data sets used by existing neural network models are small, so the performance gain that pre-training brings to the neural network model is limited.
Based on this, the embodiment of the invention provides an image reconstruction method.
FIG. 1 is a schematic flow chart of an image reconstruction method provided by the present invention. As shown in FIG. 1, the method includes:
S11, determining a neural image to be reconstructed;
S12, inputting the neural image into an image reconstruction model to obtain a reconstruction result of the neural image output by the image reconstruction model;
the image reconstruction model is obtained by training a target pre-training model based on a first sample neural image and its corresponding sample reconstruction result, and the target pre-training model is obtained from an initial model by joint training in the following three ways:
contrastive learning unsupervised pre-training based on a second sample neural image;
cross-modal image conversion supervised pre-training based on a third sample neural image and the corresponding target modal image;
and image reconstruction unsupervised pre-training based on a fourth sample neural image.
Specifically, the execution subject of the image reconstruction method provided in the embodiment of the present invention is an image reconstruction device. The device may be configured in a server, which may be a local server, such as a computer, or a cloud server; this is not specifically limited in the embodiment of the present invention.
First, step S11 is executed to determine a neural image to be reconstructed. The neural image to be reconstructed is a three-dimensional neural image that needs to be reconstructed, and may be a partial missing image, an artifact image, a low-resolution image, or the like, which is not specifically limited herein.
Then, step S12 is executed to input the neural image into the image reconstruction model, and analyze the neural image through the image reconstruction model to obtain and output a reconstruction result of the neural image. The reconstruction result may be a complete image corresponding to a partially missing image, a de-artifact image corresponding to an artifact image, a high-resolution image corresponding to a low-resolution image, or the like.
Unsupervised pre-training can make full use of large amounts of unlabeled neuroimage data, extract better representations from that data, and improve performance on downstream tasks. The embodiment of the invention therefore introduces an unsupervised pre-training process, which may comprise contrastive learning unsupervised pre-training and image reconstruction unsupervised pre-training. This allows large unlabeled data sets to be used effectively, improves the representation learning of neuroimages, and improves the performance and generalization of the model on the reconstruction task, while reducing dependence on labeled data sets and saving data labeling cost.
In addition, since pre-training on a single pre-training task yields only a limited improvement in the reconstruction performance of the model, the embodiment of the invention adopts a multi-stage pre-training strategy that fuses unsupervised and supervised pre-training: cross-modal image conversion supervised pre-training is introduced on top of contrastive learning unsupervised pre-training and image reconstruction unsupervised pre-training, which can further improve the performance and generalization of the model on the reconstruction task.
Based on this, the target pre-training model in the embodiment of the present invention is obtained by joint training on the basis of the initial model in the following three pre-training modes: one) contrastive learning unsupervised pre-training based on the second sample neural image; two) cross-modal image conversion supervised pre-training based on the third sample neural image and the corresponding target modal image; three) image reconstruction unsupervised pre-training based on the fourth sample neural image. The initial model may use a single neural network or a combination of multiple neural networks, which is not specifically limited in this embodiment of the present invention. The neural network may include a Convolutional Neural Network (CNN) such as ResNet or Inception, a Transformer, and the like; the initial model may be a single CNN or Transformer, or a model combining a CNN and a Transformer, and is not specifically limited here.
The three pre-training modes pre-train the initial model serially, and their execution order may be set as required. For example, pre-training modes one), two) and three) may be executed in that order, or pre-training modes two), one) and three) may be executed in that order; the execution order of the three pre-training modes is not specifically limited.
The following description takes only the case where pre-training modes one), two) and three) are executed in that order as an example; other execution orders are not described again. In this case, contrastive learning unsupervised pre-training is first performed on the initial model using the second sample neural image to obtain a first pre-training model. A layer specific to the cross-modal image conversion task is then added to the first pre-training model, and training is performed with the third sample neural image and the corresponding target modal image to obtain a second pre-training model. Next, a layer specific to the image reconstruction task is added to the second pre-training model, and training is performed with the fourth sample neural image to obtain a third pre-training model. Finally, the third pre-training model is fine-tuned with the first sample neural image and its corresponding sample reconstruction result to obtain the final image reconstruction model.
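The serial ordering described above can be sketched schematically as follows. This is a minimal illustration rather than the patented implementation: the stage functions are placeholders that only record what would be trained, and all names (`make_stage`, `train_reconstruction_model`, the head labels) are hypothetical.

```python
from typing import Callable, Dict, List

# A "model" is represented only by the task-specific heads added so far and
# the list of training stages it has passed through (placeholders, no weights).
Model = Dict[str, List[str]]
Stage = Callable[[Model], Model]


def make_stage(name: str, task_head: str = "") -> Stage:
    """Return a placeholder stage that records its name instead of training."""
    def stage(model: Model) -> Model:
        updated = {key: list(value) for key, value in model.items()}
        if task_head:
            # A task-specific layer is added before this stage is trained.
            updated["heads"].append(task_head)
        updated["history"].append(name)
        return updated
    return stage


def train_reconstruction_model(initial: Model,
                               pretrain_stages: List[Stage],
                               fine_tune: Stage) -> Model:
    """Run the pre-training stages serially, then fine-tune the result."""
    model = initial
    for stage in pretrain_stages:
        model = stage(model)  # each stage starts from the previous stage's output
    return fine_tune(model)   # the last pre-trained model is the target pre-training model


initial_model: Model = {"heads": [], "history": []}
stages = [
    make_stage("contrastive_pretraining"),
    make_stage("cross_modal_pretraining", task_head="conversion_head"),
    make_stage("reconstruction_pretraining", task_head="reconstruction_head"),
]
final_model = train_reconstruction_model(initial_model, stages, make_stage("fine_tuning"))
print(final_model["history"])
```

Because the stages are passed as a list, a different execution order (for example two), one), three)) only requires reordering the list.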
It is understood that the third pre-training model is the target pre-training model. The first sample neural image, the second sample neural image, the third sample neural image and the fourth sample neural image adopted in the embodiment of the present invention may be taken from the same sample neural image set, and are not specifically limited here.
It should be noted that the image reconstruction model is obtained with a multi-stage pre-training strategy that fuses unsupervised and supervised pre-training. Cross-modal image conversion supervised pre-training is introduced in the supervised pre-training stage, which strongly encourages the model to learn feature representations that generalize better. The multi-stage pre-training strategy can alleviate overfitting of the model to a single pre-training task, and thereby greatly improves the performance and generalization of the model on the image reconstruction task.
In the image reconstruction method provided by the embodiment of the invention, a neural image to be reconstructed is first determined and then input into the image reconstruction model to obtain its reconstruction result. The image reconstruction model is obtained by training a target pre-training model on a first sample neural image and its corresponding sample reconstruction result. The target pre-training model is in turn obtained through three pre-training modes: contrastive learning unsupervised pre-training, cross-modal image conversion supervised pre-training, and image reconstruction unsupervised pre-training. This saves data labeling cost, avoids the overfitting problem, and greatly improves the performance and generalization of the model on the image reconstruction task. On this basis, reconstructing the input neural image with the image reconstruction model can greatly improve the accuracy of the reconstruction result.
On the basis of the foregoing embodiment, in the image reconstruction method provided in the embodiment of the present invention, the contrastive learning unsupervised pre-training based on the second sample neural image includes the following steps:
constructing a positive sample pair and a negative sample pair based on the second sample neuroimage;
inputting each image in the positive sample pair to a first model to be trained to obtain each feature vector corresponding to the positive sample pair output by the first model to be trained; the first model to be trained is determined based on the initial model;
inputting each image in the negative sample pair to the first model to be trained to obtain each feature vector corresponding to the negative sample pair output by the first model to be trained;
and training the first model to be trained by taking the consistency of the feature vectors corresponding to the positive sample pairs and the difference of the feature vectors corresponding to the negative sample pairs as targets.
Specifically, in order to learn latent-space features from a large unlabeled data set, so that the features of same-class data are as similar as possible and the features of different-class data are as different as possible, and to use the learned features to improve the performance of the model on downstream tasks, the embodiment of the invention uses a contrastive learning algorithm to perform contrastive learning unsupervised pre-training on a first model to be trained, obtaining a first pre-training model. Here, the first model to be trained is the training object of the contrastive learning unsupervised pre-training. It is determined based on the initial model and specifically depends on the execution order of the three pre-training modes. When pre-training modes one), two) and three) are executed in that order, the first model to be trained is the initial model.
The specific process of the contrastive learning unsupervised pre-training can be as follows:
first, data augmentation operations such as rotation, flipping, color conversion and blurring are applied to the original second sample neural images, and a positive sample pair and a negative sample pair are constructed from the augmented images, where a positive sample pair consists of two images derived from the same sample neural image and a negative sample pair consists of two images derived from different sample neural images; each image in the positive sample pair is input into the first model to be trained to obtain the feature vector of each image output by the first model to be trained, giving the two feature vectors corresponding to the positive sample pair; each image in the negative sample pair is likewise input into the first model to be trained to obtain the feature vector of each image, giving the two feature vectors corresponding to the negative sample pair;
on this basis, the first model to be trained can be trained with the consistency between the two feature vectors of the positive sample pair and the difference between the two feature vectors of the negative sample pair as objectives. That is, during training the parameters of the first model to be trained are updated by combining a consistency loss between the two feature vectors of the positive sample pair with a difference loss between the two feature vectors of the negative sample pair, finally yielding the first pre-training model.
In the embodiment of the invention, during contrastive learning unsupervised pre-training the model can learn differences in underlying semantic information among different sample neural images, and the extracted features are more broadly applicable across downstream tasks. Meanwhile, compared with supervised pre-training, which requires label information, contrastive learning unsupervised pre-training on the second sample neural image costs less: no additional labels or other modal information are needed, only the sample neural images themselves.
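The source does not name a specific contrastive loss. As one possible illustration, the sketch below uses an InfoNCE-style loss in NumPy, which rewards similarity within positive pairs and dissimilarity within negative pairs in a single term. The function name `info_nce_loss`, the `temperature` parameter and the batch layout (row `i` of each array holds one augmented view of sample `i`) are assumptions, not details from the patent.

```python
import numpy as np


def info_nce_loss(z_a: np.ndarray, z_b: np.ndarray, temperature: float = 0.1) -> float:
    """InfoNCE-style contrastive loss (an assumed choice, not from the source).

    z_a[i] and z_b[i] are feature vectors of two augmented views of the same
    sample image (a positive pair); z_a[i] and z_b[j], j != i, come from
    different sample images (negative pairs).
    """
    # L2-normalize so that dot products are cosine similarities.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)

    logits = z_a @ z_b.T / temperature                   # (N, N), diagonal = positive pairs
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Maximizing the diagonal log-probability pulls positive pairs together
    # and pushes negative pairs apart.
    return float(-np.mean(np.diag(log_prob)))


rng = np.random.default_rng(0)
features = rng.normal(size=(8, 16))
aligned = info_nce_loss(features, features + 0.01 * rng.normal(size=(8, 16)))
misaligned = info_nce_loss(features, np.roll(features, 1, axis=0))
print(aligned < misaligned)
```

In this toy check the loss is lower when each row is paired with a slightly perturbed copy of itself than when the rows are paired with features from a different sample.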
On the basis of the foregoing embodiment, the image reconstruction method provided in the embodiment of the present invention, in which the cross-modal image conversion supervised pre-training based on the third sample neuroimage and the corresponding target modal image, includes the following steps:
inputting the third sample neural image into a second model to be trained to obtain a conversion image output by the second model to be trained; the second model to be trained is determined based on the initial model;
and calculating the voxel-wise mean square error between the converted image and the target modal image, and training the second model to be trained with the voxel-wise mean square error as the objective.
Specifically, in the embodiment of the present invention, when performing cross-modal image conversion supervised pre-training, the third sample neural image is first input into the second model to be trained, which performs modality conversion on it and outputs the corresponding converted image. The second model to be trained is the training object of the cross-modal image conversion supervised pre-training. It is determined based on the initial model and specifically depends on the execution order of the three pre-training modes. When pre-training modes one), two) and three) are executed in that order, the second model to be trained is the first pre-training model obtained through contrastive learning unsupervised pre-training.
Then, the voxel-wise mean square error between the converted image and the target modal image corresponding to the third sample neural image can be calculated, and the second model to be trained can be trained with this error as the objective. The voxel-wise mean square error can be used as the loss function to optimize the second model to be trained and update its parameters, thereby training the second model to be trained and obtaining the second pre-training model.
In the embodiment of the invention, the second model to be trained can learn the relationships between different modalities during cross-modal conversion. To a certain extent, the cross-modal conversion task can also be regarded as providing information about other modalities to the model in the form of labels.
In addition, if the second pre-training model obtained from cross-modal image conversion supervised pre-training converts image modalities well, its output can be used to augment the first sample neural images, expanding the modality types of the first sample neural images and providing more modal information for the final image reconstruction model when it reconstructs images.
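A minimal NumPy sketch of the voxel-wise mean square error between two three-dimensional volumes is given below. The function name `voxelwise_mse` and the requirement that both volumes have the same shape are assumptions made for illustration.

```python
import numpy as np


def voxelwise_mse(predicted: np.ndarray, target: np.ndarray) -> float:
    """Mean of the squared differences taken over every voxel of two volumes."""
    predicted = np.asarray(predicted, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    if predicted.shape != target.shape:
        raise ValueError("converted image and target modal image must have the same shape")
    return float(np.mean((predicted - target) ** 2))


converted = np.zeros((2, 2, 2))
target_modal = np.ones((2, 2, 2))
print(voxelwise_mse(converted, target_modal))  # every voxel differs by 1, so the mean is 1.0
```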
On the basis of the foregoing embodiment, the image reconstruction method provided in the embodiment of the present invention, in which the image reconstruction unsupervised pre-training based on the fourth sample neuroimage, includes the following steps:
preprocessing the fourth sample neural image based on image reconstruction task information to obtain a preprocessing result;
inputting the preprocessing result into a third model to be trained to obtain a prediction image output by the third model to be trained; the third model to be trained is determined based on the initial model;
and calculating the voxel-wise mean square error between the prediction image and the fourth sample neural image, and training the third model to be trained with the voxel-wise mean square error as the objective.
Specifically, in the embodiment of the present invention, when performing image reconstruction unsupervised pre-training, the third model to be trained may be trained with the fourth sample neural image. The third model to be trained is the training object of the image reconstruction unsupervised pre-training. It is determined based on the initial model and specifically depends on the execution order of the three pre-training modes. When pre-training modes one), two) and three) are executed in that order, the third model to be trained is the second pre-training model obtained through cross-modal image conversion supervised pre-training.
Because the fourth sample neural image normally serves as the reconstruction target, it can be preprocessed before training, based on the image reconstruction task information, to obtain a preprocessing result. The image reconstruction task information refers to the specific type of image reconstruction task, for example a super-resolution image reconstruction task, a partially missing image reconstruction task, or an artifact repair task. The preprocessing mode can be selected according to the image reconstruction task information. If the task is super-resolution image reconstruction, the preprocessing may be to downsample the fourth sample neural image. If the task is partially missing image reconstruction, the preprocessing may be to randomly select several points in the fourth sample neural image as center points and mask a cube of random size around each of them. If the task is artifact repair, the preprocessing may be to blur the fourth sample neural image to simulate artifacts.
Therefore, the purpose of the preprocessing operation is to provide a training sample for the third model to be trained, and the preprocessing result is the sample neuroimage to be reconstructed. Furthermore, after the preprocessing operation, the preprocessing result may be input to the third model to be trained, and the prediction image corresponding to the preprocessing result is obtained and output by the third model to be trained.
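The task-dependent preprocessing described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: volumes are plain nested lists, the function names, task keys, cube count, cube size range and the simple neighbour-averaging blur are all hypothetical choices made for the example, and a practical system would use an array library.

```python
import random

def make_volume(h, w, d, fill=1.0):
    """Build an h x w x d volume as nested lists (stand-in for a real array)."""
    return [[[fill for _ in range(d)] for _ in range(w)] for _ in range(h)]

def downsample(vol, factor=2):
    """Super-resolution task: keep every `factor`-th voxel along each axis."""
    return [[row[::factor] for row in plane[::factor]] for plane in vol[::factor]]

def mask_random_cubes(vol, n_cubes=2, max_half=1, rng=None):
    """Partial-missing task: zero cubes of random size around random centres."""
    rng = rng or random.Random()
    h, w, d = len(vol), len(vol[0]), len(vol[0][0])
    out = [[row[:] for row in plane] for plane in vol]  # copy, keep the target intact
    for _ in range(n_cubes):
        ci, cj, ck = rng.randrange(h), rng.randrange(w), rng.randrange(d)
        r = rng.randint(0, max_half)  # random half-width of this cube
        for i in range(max(0, ci - r), min(h, ci + r + 1)):
            for j in range(max(0, cj - r), min(w, cj + r + 1)):
                for k in range(max(0, ck - r), min(d, ck + r + 1)):
                    out[i][j][k] = 0.0
    return out

def blur_along_depth(vol):
    """Artifact-repair task: crude blur averaging each voxel with its depth neighbours."""
    out = []
    for plane in vol:
        new_plane = []
        for row in plane:
            new_row = []
            for k in range(len(row)):
                window = row[max(0, k - 1):k + 2]  # voxel plus in-range neighbours
                new_row.append(sum(window) / len(window))
            new_plane.append(new_row)
        out.append(new_plane)
    return out

# Hypothetical task keys mapped to the preprocessing used for that task.
PREPROCESS = {
    "super_resolution": downsample,
    "partial_missing": mask_random_cubes,
    "artifact_repair": blur_along_depth,
}

def preprocess(vol, task):
    """Return the degraded input; the untouched `vol` remains the reconstruction target."""
    return PREPROCESS[task](vol)
```

In each case the original volume is left unmodified, so it can serve as the reconstruction target while the returned volume is fed to the model.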
Thereafter, the voxel-wise mean square error of the prediction image and the fourth sample neural image can be calculated, and the third model to be trained is trained with the voxel-wise mean square error as the target. The voxel-wise mean square error serves as the loss function: the parameters of the third model to be trained are updated to minimize it, so that the third model to be trained is trained and the third pre-training model is obtained.
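As a small worked example of the loss used here, the voxel-wise mean square error and its gradient with respect to the prediction can be written as below. The function names are hypothetical and volumes are nested lists for illustration only; the sketch shows the quantity being minimized, not the optimizer of the embodiment.

```python
def voxelwise_mse(pred, target):
    """Mean of squared differences over all voxels of two equally shaped volumes."""
    total, count = 0.0, 0
    for p_plane, t_plane in zip(pred, target):
        for p_row, t_row in zip(p_plane, t_plane):
            for p, t in zip(p_row, t_row):
                total += (p - t) ** 2
                count += 1
    if count == 0:
        raise ValueError("empty volume")
    return total / count

def mse_gradient(pred, target):
    """Per-voxel derivative of the MSE w.r.t. the prediction: 2 * (pred - target) / N."""
    n = sum(len(row) for plane in pred for row in plane)
    return [[[2.0 * (p - t) / n for p, t in zip(p_row, t_row)]
             for p_row, t_row in zip(p_plane, t_plane)]
            for p_plane, t_plane in zip(pred, target)]

pred = [[[1.0, 2.0], [3.0, 4.0]]]
target = [[[1.0, 4.0], [3.0, 0.0]]]
print(voxelwise_mse(pred, target))  # 5.0
```

The squared differences in the example are 0, 4, 0 and 16, which average to 5.0 over the four voxels.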
On the basis of the above embodiment, in the image reconstruction method provided in the embodiment of the present invention, the first sample neuroimage is obtained by preprocessing a sample reconstruction result based on the image reconstruction task information, where the sample reconstruction result may be a reconstruction target manually selected from a target data set.
On the basis of the above embodiments, in the image reconstruction method provided in the embodiments of the present invention, the initial model is constructed based on the first initial network and/or the second initial network.
Specifically, in the embodiment of the present invention, the initial model may be constructed based on the first initial network alone or on the second initial network alone. The first initial network may be constructed using a mainstream CNN architecture such as ResNet or Inception, and the second initial network may be constructed using a Transformer.
A single neural network has defects and limitations as an initial model; for example, a CNN has a strong ability to extract local position information but cannot easily model long-distance information. Based on this, the initial model may also be constructed by combining the first initial network and the second initial network, that is, the initial model may be a model combining CNN and Transformer.
The method provided by the embodiment of the invention uses a model combining CNN and Transformer, taking into account that the CNN is good at extracting local information and the Transformer is good at extracting global information, so that its performance is superior to that of a model architecture using only the CNN or only the Transformer.
On the basis of the above embodiment, in the image reconstruction method provided in the embodiment of the present invention, the image reconstruction model includes a first encoder, a linear flattening layer, a second encoder, and a multi-layer perceptron;
the first encoder corresponds to a first initial encoder in the first initial network;
the linear flattening layer, the second encoder, and the multilayer perceptron correspond to an initial linear flattening layer, a second initial encoder, and an initial multilayer perceptron, respectively, in the second initial network;
the first encoder, the linear flattening layer, the second encoder and the multilayer perceptron are connected in sequence.
Specifically, in the embodiment of the present invention, the initial model may be constructed based on a first initial network and a second initial network, the first initial network may include a first initial encoder, and the second initial network may include an initial linear flattening layer, a second initial encoder, and an initial multi-layer perceptron. The initial model can be obtained by sequentially splicing the first initial encoder, the initial linear flattening layer, the second initial encoder and the initial multi-layer perceptron.
Then, throughout the process of pre-training the initial model to obtain the target pre-training model and training the target pre-training model to obtain the image reconstruction model, the first initial network and the second initial network are trained jointly to obtain a first network and a second network, respectively, and the first encoder in the first network is combined with the linear flattening layer, the second encoder and the multilayer perceptron in the second network to form the image reconstruction model.
On the basis of the above embodiment, in the image reconstruction method provided in the embodiment of the present invention, in consideration of the fact that the neural images of different modalities contain different information, the neural image to be reconstructed may be a multi-modality neural image, so that more data modalities may be introduced by using a multi-modality fusion strategy, and further, the image reconstruction model may utilize complementary information provided by different modalities to further improve the reconstruction performance of the image reconstruction model. Correspondingly, the various types of sample neuroimages used for training the model may also be multi-modal sample neuroimages.
Here, the multi-modal neuroimage may be, for example, a brain neuroimage obtained by CT (Computed Tomography), MRI (Magnetic Resonance Imaging), or the like, which is not particularly limited in the embodiment of the present invention.
Fig. 2 is a schematic diagram of a complete training process of the image reconstruction apparatus provided in the embodiment of the present invention. As shown in Fig. 2, the process includes:
S21, obtaining a first pre-training model by comparison learning unsupervised pre-training:
Comparison learning unsupervised pre-training is performed on the initial model on a large-scale neuroimage dataset. Data enhancement operations such as rotation, flipping, color transformation and blurring are applied to the original second sample neuroimages, and positive and negative sample pairs are constructed from the enhanced images: a positive sample pair consists of images derived from the same second sample neuroimage, and a negative sample pair consists of images derived from different second sample neuroimages. During training, the two images of each sampled pair are input into the initial model separately, the initial model outputs a feature vector of the same length for each input image, and the cosine distance of the two feature vectors is calculated [formula image 1]. For a positive sample pair, [formula image 2] is taken as its loss function value (loss); for a negative sample pair, [formula image 3] is taken as its loss.
S22, obtaining a second pre-training model by cross-modal image conversion supervised pre-training:
Cross-modal image conversion supervised pre-training is performed on the first pre-training model on the large-scale neuroimage dataset. That is, a neuroimage from the large-scale neuroimage dataset in the same modality as the third sample neuroimage is taken as the input of the first pre-training model, and the output of the first pre-training model is the image of another modality corresponding to the input image; the first pre-training model is expected to convert neuroimages of one or more modalities into a neuroimage of the target modality. During training, the voxel-wise mean square error between the converted image produced by the first pre-training model and the neuroimage in the target modality can be used as the loss for optimizing the first pre-training model, thereby obtaining the second pre-training model.
S23, obtaining a third pre-training model by image reconstruction unsupervised pre-training:
Image reconstruction unsupervised pre-training is performed on the second pre-training model on the large-scale neuroimage dataset, with different preprocessing applied to the fourth sample neuroimage according to the image reconstruction task information. In this process, the voxel-by-voxel mean square error of the prediction image and the fourth sample neural image is again used as the loss function value, and the second pre-training model is trained to obtain the third pre-training model.
S24, fine-tuning the third pre-training model, namely the target pre-training model, on a downstream task to obtain an image reconstruction model:
The third pre-training model can be applied to a downstream task for fine-tuning. For example, for an image reconstruction task, a layer specific to the image reconstruction task is added on top of the third pre-training model, and training is performed with the first sample neuroimage and its corresponding sample reconstruction result, finally yielding the fine-tuned image reconstruction model. Since the third pre-training model has already converged on the original data, a smaller learning rate (e.g., ≤ 0.0001) should be set when training on the first sample neural image.
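The order of stages S21 to S24 can be summarized in a short sketch. The `Stage` record, the `build_schedule` and `run_schedule` helpers and the pre-training learning rate of 1e-3 are assumptions introduced for illustration; only the stage order, the data used per stage and the fine-tuning rate bound of 0.0001 come from the description above. The trainer here is a placeholder that merely records which stage was applied.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Stage:
    name: str
    data: str
    supervised: bool
    learning_rate: float

def build_schedule(pretrain_lr=1e-3, finetune_lr=1e-4):
    """Stages in the order S21 -> S24; pretrain_lr is an assumed value."""
    return [
        Stage("S21", "second sample neuroimages", False, pretrain_lr),
        Stage("S22", "third sample neuroimages + target-modality images", True, pretrain_lr),
        Stage("S23", "fourth sample neuroimages", False, pretrain_lr),
        Stage("S24", "first sample neuroimages + sample reconstruction results", True, finetune_lr),
    ]

def run_schedule(model, schedule, train_fn):
    """Each stage starts from the model produced by the previous stage."""
    for stage in schedule:
        model = train_fn(model, stage)
    return model

def record_stage(model, stage):
    """Placeholder trainer: only records which stage touched the model."""
    return model + [stage.name]

final = run_schedule([], build_schedule(), record_stage)
print(final)  # ['S21', 'S22', 'S23', 'S24']
```

Passing the output of one stage as the input of the next mirrors how the first, second and third pre-training models are chained before fine-tuning.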
FIG. 3 is a schematic diagram of the network structure of the image reconstruction model provided by the present invention. As shown in FIG. 3, when the image reconstruction model is applied, the multi-modal neuroimage to be reconstructed (with channel, height, width and depth of C, H, W and D, respectively) is input into the image reconstruction model. The first encoder of the first network (3D CNN) performs feature extraction on the neuroimage to obtain a feature map, which has its own channel, height, width and depth. The linear flattening layer (Linear Flat) of the second network (Transformer) performs a Flatten operation on each patch into which the feature map is divided and maps it to obtain each patch vector (Patch Embedding) and the corresponding position encoding (Position Encoding). Each patch vector and its position encoding are then input, in the form of a sequence, into the second encoder of the second network to obtain the encoded vectors output by the second encoder. The head of the shared multi-layer perceptron (MLP Head) maps the encoded vector of each patch of the feature map, and the feature map is finally converted into the reconstruction result. In FIG. 3, Patch Embedding is indicated by small circles without reference numbers, and Position Encoding is indicated by small circles numbered 1 to N.
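The Flatten step applied to the feature map can be illustrated with a small sketch. It assumes non-overlapping cubic patches of side `p` and a feature map stored as nested lists indexed [channel][height][width][depth]; both the patch shape and the helper names are assumptions, and the learned linear mapping that follows flattening is omitted. Positions are numbered from 1 to N, matching the numbering of the circles in FIG. 3.

```python
def flatten_patches(fmap, p):
    """Split a [C][H][W][D] feature map into cubic patches and flatten each one."""
    c, h, w, d = len(fmap), len(fmap[0]), len(fmap[0][0]), len(fmap[0][0][0])
    if h % p or w % p or d % p:
        raise ValueError("spatial dimensions must be divisible by the patch size")
    patches = []
    for i0 in range(0, h, p):
        for j0 in range(0, w, p):
            for k0 in range(0, d, p):
                # One patch vector holds all channels of a p x p x p block.
                vec = [fmap[ch][i][j][k]
                       for ch in range(c)
                       for i in range(i0, i0 + p)
                       for j in range(j0, j0 + p)
                       for k in range(k0, k0 + p)]
                patches.append(vec)
    return patches

def with_positions(patches):
    """Attach a position index 1..N to each patch vector, in sequence order."""
    return list(enumerate(patches, start=1))

# Toy feature map with C=2, H=4, W=4, D=2; channel ch is filled with float(ch).
fmap = [[[[float(ch) for _ in range(2)] for _ in range(4)] for _ in range(4)]
        for ch in range(2)]
sequence = with_positions(flatten_patches(fmap, 2))
print(len(sequence), len(sequence[0][1]))  # 4 16
```

With patch side 2, the 4 x 4 x 2 spatial grid yields 2 x 2 x 1 = 4 patches, and each patch vector has 2 x 2 x 2 x 2 = 16 entries.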
In summary, the image reconstruction method provided in the embodiment of the present invention uses a multi-stage pre-training strategy that fuses unsupervised and supervised learning, and introduces several pre-training tasks, namely comparison learning unsupervised pre-training, cross-modal image conversion supervised pre-training, and image reconstruction unsupervised pre-training. Large unlabeled datasets can thus be used effectively, which improves the performance and generalization of the image reconstruction model on downstream tasks and improves its ability to learn representations of neuroimages.
As shown in fig. 4, on the basis of the above embodiment, an embodiment of the present invention provides an image reconstruction apparatus, including:
a determination unit 41 for determining a neural image to be reconstructed;
the reconstruction unit 42 is configured to input the neuroimage into an image reconstruction model, and obtain a reconstruction result of the neuroimage output by the image reconstruction model;
the image reconstruction model is obtained by training based on a first sample neural image and a corresponding sample reconstruction result on the basis of a target pre-training model, and the target pre-training model is obtained by joint training in the following three ways on the basis of an initial model:
performing comparison learning unsupervised pre-training based on the second sample neural image;
cross-modal image conversion supervised pre-training based on the third sample neural image and the corresponding target modal image;
image reconstruction unsupervised pre-training based on the fourth sample neuroimage.
On the basis of the foregoing embodiment, an embodiment of the present invention provides an image reconstruction apparatus, further including a first pre-training module, configured to:
constructing a positive sample pair and a negative sample pair based on the second sample neural image;
inputting each image in the positive sample pair to a first model to be trained to obtain each feature vector corresponding to the positive sample pair output by the first model to be trained; the first model to be trained is determined based on the initial model;
inputting each image in the negative sample pair to the first model to be trained to obtain each feature vector corresponding to the negative sample pair output by the first model to be trained;
and training the first model to be trained by taking the consistency of the feature vectors corresponding to the positive sample pairs and the difference of the feature vectors corresponding to the negative sample pairs as targets.
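The pair construction and training objective handled by the first pre-training module can be sketched as follows. The loss expressions of the embodiment are not available in this text, so the two used here (1 minus cosine similarity for positive pairs, cosine similarity clipped at zero for negative pairs) are assumptions chosen only because they reward consistency within positive pairs and difference within negative pairs. The helper names are hypothetical, and the views are assumed to be feature vectors already produced by the model.

```python
import itertools
import math

def build_pairs(views_per_image):
    """views_per_image maps an image id to the views derived from that image.

    Positive pairs: two views of the same image.
    Negative pairs: one view from each of two different images.
    """
    positives, negatives = [], []
    for views in views_per_image.values():
        positives.extend(itertools.combinations(views, 2))
    for (_, va), (_, vb) in itertools.combinations(views_per_image.items(), 2):
        negatives.extend(itertools.product(va, vb))
    return positives, negatives

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)

def pair_loss(u, v, is_positive):
    """Assumed losses: pull positive pairs together, push negative pairs apart."""
    sim = cosine_similarity(u, v)
    if is_positive:
        return 1.0 - sim       # 0 when the two feature vectors point the same way
    return max(0.0, sim)       # 0 when the vectors are orthogonal or opposed

# Toy feature vectors: two views per image, already encoded.
features = {
    "image_a": [(1.0, 0.0), (1.0, 0.0)],
    "image_b": [(0.0, 1.0), (0.0, 1.0)],
}
positives, negatives = build_pairs(features)
total = (sum(pair_loss(u, v, True) for u, v in positives)
         + sum(pair_loss(u, v, False) for u, v in negatives))
print(len(positives), len(negatives), total)  # 2 4 0.0
```

In the toy data the two images are encoded as orthogonal vectors, so every positive pair is perfectly consistent and every negative pair is fully separated, giving a total loss of zero.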
On the basis of the foregoing embodiment, an embodiment of the present invention provides an image reconstruction apparatus, further including a second pre-training module, configured to:
inputting the third sample neural image into a second model to be trained to obtain a conversion image output by the second model to be trained; the second model to be trained is determined based on the initial model;
and calculating the voxel-by-voxel mean square error of the converted image and the target modal image, and training the second model to be trained by taking the voxel-by-voxel mean square error as a target.
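To make the training target of the second pre-training module concrete, the sketch below trains a deliberately tiny stand-in model with the voxel-wise mean square error. The per-voxel affine model, the flattened voxel lists, the learning rate and the step count are all assumptions for illustration; the second model to be trained in the embodiment is a neural network, not this two-parameter map.

```python
def predict(params, source):
    """Toy 'conversion model': the same affine map a * x + b applied to every voxel."""
    a, b = params
    return [a * x + b for x in source]

def mse(pred, target):
    """Voxel-wise mean square error over flattened voxel lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def train(source, target, lr=0.1, steps=500):
    """Plain gradient descent on the voxel-wise MSE."""
    a, b = 0.0, 0.0
    n = len(source)
    for _ in range(steps):
        pred = predict((a, b), source)
        # Gradients of the MSE with respect to a and b.
        grad_a = sum(2.0 * (p - t) * x for p, t, x in zip(pred, target, source)) / n
        grad_b = sum(2.0 * (p - t) for p, t in zip(pred, target)) / n
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b

# Flattened voxels of a source-modality image and its paired target-modality image.
source = [0.0, 1.0, 2.0, 3.0]
target = [2.0 * x + 1.0 for x in source]
a, b = train(source, target)
final_loss = mse(predict((a, b), source), target)
```

After training, `a` and `b` approach 2 and 1, the values used to generate the toy target, and the loss approaches zero; the paired source and target images play the role of the third sample neural image and its target modal image.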
On the basis of the foregoing embodiment, an embodiment of the present invention provides an image reconstruction apparatus, further including a third pre-training module, configured to:
preprocessing the fourth sample neural image based on image reconstruction task information to obtain a preprocessing result;
inputting the preprocessing result into a third model to be trained to obtain a prediction image output by the third model to be trained; the third model to be trained is determined based on the initial model;
calculating the voxel-wise mean square error of the predicted image and the fourth sample neural image, and training the third model to be trained by taking the voxel-wise mean square error as a target.
On the basis of the above embodiments, in an embodiment of the present invention, an image reconstruction apparatus is provided, where the initial model is constructed based on the first initial network and/or the second initial network.
On the basis of the above embodiments, an embodiment of the present invention provides an image reconstruction apparatus, where the image reconstruction model includes a first encoder, a linear flattening layer, a second encoder, and a multi-layer perceptron;
the first encoder corresponds to a first initial encoder in the first initial network;
the linear flattening layer, the second encoder, and the multilayer perceptron correspond to an initial linear flattening layer, a second initial encoder, and an initial multilayer perceptron, respectively, in the second initial network;
the first encoder, the linear flattening layer, the second encoder and the multilayer perceptron are connected in sequence.
On the basis of the above embodiments, an embodiment of the present invention provides an image reconstruction apparatus, where the neuroimage is a multi-modal neuroimage.
Specifically, the functions of the modules in the image reconstruction apparatus provided in the embodiment of the present invention correspond to the operation flows of the steps in the above method embodiments one to one, and the achieved effects are also consistent.
Fig. 5 illustrates the physical structure of an electronic device. As shown in Fig. 5, the electronic device may include a processor 510, a communications interface 520, a memory 530 and a communication bus 540, where the processor 510, the communications interface 520 and the memory 530 communicate with one another via the communication bus 540. The processor 510 may invoke logic instructions in the memory 530 to perform the image reconstruction method provided in the above embodiments, the method comprising: determining a neural image to be reconstructed; inputting the neural image into an image reconstruction model to obtain a reconstruction result of the neural image output by the image reconstruction model; the image reconstruction model is obtained by training based on a first sample neural image and a corresponding sample reconstruction result on the basis of a target pre-training model, and the target pre-training model is obtained by joint training in the following three ways on the basis of an initial model: performing comparison learning unsupervised pre-training based on the second sample neural image; cross-modal image conversion supervised pre-training based on the third sample neural image and the corresponding target modal image; image reconstruction unsupervised pre-training based on the fourth sample neuroimage.
Furthermore, the logic instructions in the memory 530 may be implemented in the form of software functional units and, when sold or used as independent products, stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
In another aspect, the present invention also provides a computer program product comprising a computer program that can be stored on a non-transitory computer-readable storage medium. When executed by a processor, the computer program performs the image reconstruction method provided in the above embodiments, the method comprising: determining a neural image to be reconstructed; inputting the neural image into an image reconstruction model to obtain a reconstruction result of the neural image output by the image reconstruction model; the image reconstruction model is obtained by training based on a first sample neural image and a corresponding sample reconstruction result on the basis of a target pre-training model, and the target pre-training model is obtained by joint training in the following three ways on the basis of an initial model: performing comparison learning unsupervised pre-training based on the second sample neural image; cross-modal image conversion supervised pre-training based on the third sample neural image and the corresponding target modal image; image reconstruction unsupervised pre-training based on the fourth sample neuroimage.
In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the image reconstruction method provided in the above embodiments, the method comprising: determining a neural image to be reconstructed; inputting the neural image into an image reconstruction model to obtain a reconstruction result of the neural image output by the image reconstruction model; the image reconstruction model is obtained by training based on a first sample neural image and a corresponding sample reconstruction result on the basis of a target pre-training model, and the target pre-training model is obtained by joint training in the following three ways on the basis of an initial model: performing comparison learning unsupervised pre-training based on the second sample neural image; cross-modal image conversion supervised pre-training based on the third sample neural image and the corresponding target modal image; image reconstruction unsupervised pre-training based on the fourth sample neuroimage.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and can certainly also be implemented by hardware. Based on this understanding, the above technical solutions, in essence or in the part contributing to the prior art, may be embodied in the form of a software product, which may be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disk, and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the various embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (9)

1. An image reconstruction method, comprising:
determining a neural image to be reconstructed;
inputting the neural image into an image reconstruction model to obtain a reconstruction result of the neural image output by the image reconstruction model;
the image reconstruction model is obtained by training based on a first sample neural image and a corresponding sample reconstruction result on the basis of a target pre-training model, and the target pre-training model is obtained by joint training in the following three ways on the basis of an initial model:
performing comparison learning unsupervised pre-training based on the second sample neural image;
cross-modal image conversion supervised pre-training based on the third sample neural image and the corresponding target modal image;
image reconstruction unsupervised pre-training based on the fourth sample neuroimage;
the cross-modal image conversion supervised pre-training based on the third sample neural image and the corresponding target modal image comprises the following steps:
inputting the third sample neural image into a second model to be trained to obtain a conversion image output by the second model to be trained; the second model to be trained is determined based on the initial model;
and calculating the voxel-by-voxel mean square error of the converted image and the target modal image, and training the second model to be trained by taking the voxel-by-voxel mean square error as a target.
2. The image reconstruction method according to claim 1, wherein the comparison learning unsupervised pre-training based on the second sample neuroimage comprises the following steps:
constructing a positive sample pair and a negative sample pair based on the second sample neuroimage;
inputting each image in the positive sample pair to a first model to be trained to obtain each feature vector corresponding to the positive sample pair output by the first model to be trained; the first model to be trained is determined based on the initial model;
inputting each image in the negative sample pair to the first model to be trained to obtain each feature vector corresponding to the negative sample pair output by the first model to be trained;
and training the first model to be trained by taking the consistency of the feature vectors corresponding to the positive sample pairs and the difference of the feature vectors corresponding to the negative sample pairs as targets.
3. The image reconstruction method according to claim 1, wherein the unsupervised pre-training of the image reconstruction based on the fourth sample neuroimage comprises the steps of:
preprocessing the fourth sample neural image based on image reconstruction task information to obtain a preprocessing result;
inputting the preprocessing result into a third model to be trained to obtain a prediction image output by the third model to be trained; the third model to be trained is determined based on the initial model;
and calculating the voxel-by-voxel mean square error of the prediction image and the fourth sample neural image, and training the third model to be trained by taking the voxel-by-voxel mean square error as a target.
4. The image reconstruction method according to claim 1, characterized in that the initial model is constructed based on a first initial network and/or a second initial network.
5. The image reconstruction method according to claim 4, wherein the image reconstruction model comprises a first encoder, a linear flattening layer, a second encoder, and a multi-layer perceptron;
the first encoder corresponds to a first initial encoder in the first initial network;
the linear flattening layer, the second encoder, and the multilayer perceptron correspond to an initial linear flattening layer, a second initial encoder, and an initial multilayer perceptron, respectively, in the second initial network;
the first encoder, the linear flattening layer, the second encoder and the multilayer perceptron are connected in sequence.
6. The image reconstruction method according to any one of claims 1 to 5, wherein the neuroimage is a multi-modal neuroimage.
7. An image reconstruction apparatus, comprising:
the determining unit is used for determining a neural image to be reconstructed;
the reconstruction unit is used for inputting the neural image into an image reconstruction model to obtain a reconstruction result of the neural image output by the image reconstruction model;
the image reconstruction model is obtained by training based on a first sample neural image and a corresponding sample reconstruction result on the basis of a target pre-training model, and the target pre-training model is obtained by joint training in the following three ways on the basis of an initial model:
performing comparison learning unsupervised pre-training based on the second sample neural image;
cross-modal image conversion supervised pre-training based on the third sample neural image and the corresponding target modal image;
image reconstruction unsupervised pre-training based on the fourth sample neuroimage;
further comprising a second pre-training module for:
inputting the third sample neural image into a second model to be trained to obtain a conversion image output by the second model to be trained; the second model to be trained is determined based on the initial model;
and calculating the voxel-by-voxel mean square error of the converted image and the target modal image, and training the second model to be trained by taking the voxel-by-voxel mean square error as a target.
8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the image reconstruction method according to any one of claims 1 to 6 when executing the program.
9. A non-transitory computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the image reconstruction method according to any one of claims 1 to 6.
CN202210628118.3A 2022-06-06 2022-06-06 Image reconstruction method and device, electronic equipment and storage medium Active CN114708353B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210628118.3A CN114708353B (en) 2022-06-06 2022-06-06 Image reconstruction method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN114708353A CN114708353A (en) 2022-07-05
CN114708353B true CN114708353B (en) 2022-09-06

Family

ID=82177729

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210628118.3A Active CN114708353B (en) 2022-06-06 2022-06-06 Image reconstruction method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114708353B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115619647B (en) * 2022-12-20 2023-05-09 Beihang University Cross-modal super-resolution reconstruction method based on variational inference

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109523584A (en) * 2018-10-26 2019-03-26 Shanghai United Imaging Healthcare Co., Ltd. Image processing method, device, multi-mode imaging system, storage medium and equipment
CN113314205A (en) * 2021-05-28 2021-08-27 Beihang University Efficient medical image labeling and learning system
CN113868459A (en) * 2021-06-25 2021-12-31 Zhejiang Lab Model training method, cross-modal characterization method, unsupervised image text matching method and unsupervised image text matching device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Measuring the Effects of Data Parallelism on Neural Network Training; Christopher J. Shallue et al.; Journal of Machine Learning Research; 20190719; pp. 1-30 *
Research Progress of Deep Belief Networks; Zhou Tao; Computer Engineering and Applications; 20200324; pp. 24-30 *

Also Published As

Publication number Publication date
CN114708353A (en) 2022-07-05

Similar Documents

Publication Publication Date Title
EP3298576B1 (en) Training a neural network
EP3916635B1 (en) Defect detection method and apparatus
CN109919838B (en) Ultrasonic image super-resolution reconstruction method for improving outline definition based on attention mechanism
WO2021048607A1 (en) Motion deblurring using neural network architectures
CN113658051A (en) Image defogging method and system based on cyclic generation countermeasure network
Zhang et al. Gated fusion network for degraded image super resolution
CN112862830B (en) Multi-mode image segmentation method, system, terminal and readable storage medium
KR20200084434A (en) Machine Learning Method for Restoring Super-Resolution Image
CN111861884B (en) Satellite cloud image super-resolution reconstruction method based on deep learning
CN111080591A (en) Medical image segmentation method based on combination of coding and decoding structure and residual error module
Vu et al. Perception-enhanced image super-resolution via relativistic generative adversarial networks
CN114708465B (en) Image classification method and device, electronic equipment and storage medium
CN113362310A (en) Medical image liver segmentation method based on unsupervised learning
Khan et al. An encoder–decoder deep learning framework for building footprints extraction from aerial imagery
CN114708353B (en) Image reconstruction method and device, electronic equipment and storage medium
Yang et al. A survey of super-resolution based on deep learning
CN113724136A (en) Video restoration method, device and medium
Bao et al. SCTANet: a spatial attention-guided CNN-transformer aggregation network for deep face image super-resolution
CN112686830B (en) Super-resolution method of single depth map based on image decomposition
Chen et al. Towards real-world blind face restoration with generative diffusion prior
Lee et al. Wide receptive field and channel attention network for jpeg compressed image deblurring
Graham et al. Unsupervised 3d out-of-distribution detection with latent diffusion models
CN116563100A (en) Blind super-resolution reconstruction method based on kernel guided network
CN116016953A (en) Dynamic point cloud attribute compression method based on depth entropy coding
CN115761358A (en) Method for classifying myocardial fibrosis based on residual capsule network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant