CN110211079B - Medical image fusion method and device - Google Patents

Medical image fusion method and device

Info

Publication number
CN110211079B
CN110211079B (application CN201910430661.0A)
Authority
CN
China
Prior art keywords
image
modal
modality
fused
network model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201910430661.0A
Other languages
Chinese (zh)
Other versions
CN110211079A (en)
Inventor
黄运有
张知非
范帆达
叶海男
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jianfeng Xinrui Information Technology Research Institute Co ltd
Capital Medical University
Original Assignee
Beijing Jianfeng Xinrui Information Technology Research Institute Co ltd
Capital Medical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jianfeng Xinrui Information Technology Research Institute Co ltd, Capital Medical University filed Critical Beijing Jianfeng Xinrui Information Technology Research Institute Co ltd
Priority to CN201910430661.0A priority Critical patent/CN110211079B/en
Publication of CN110211079A publication Critical patent/CN110211079A/en
Application granted granted Critical
Publication of CN110211079B publication Critical patent/CN110211079B/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00: Image enhancement or restoration
    • G06T5/50: Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/10: Image acquisition modality
    • G06T2207/10072: Tomographic images
    • G06T2207/10081: Computed x-ray tomography [CT]
    • G06T2207/10088: Magnetic resonance imaging [MRI]
    • G06T2207/20: Special algorithmic details
    • G06T2207/20081: Training; Learning
    • G06T2207/20084: Artificial neural networks [ANN]
    • G06T2207/20212: Image combination
    • G06T2207/20221: Image fusion; Image merging
    • G06T2207/30: Subject of image; Context of image processing
    • G06T2207/30004: Biomedical image processing
    • G06T2207/30016: Brain

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Magnetic Resonance Imaging Apparatus (AREA)
  • Apparatus For Radiation Diagnosis (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

The invention discloses a medical image fusion method and device. The method comprises: acquiring pixel information of a first-modality image and a second-modality image to be fused, the first-modality image and the second-modality image being images of different modalities; and inputting the pixel information of the first-modality image and the second-modality image into a pre-trained image fusion network model and outputting a fused image of the two, where the image fusion network model is a model that fuses images of different modalities based on semantic information, and the semantic information represents the meaning of the pixel values in the images of different modalities. The invention fuses images of different modalities based on semantic information, so that the fused multi-modality image is easy to read, achieving the technical effect of providing clinicians with medical image information from different modalities to assist diagnosis and treatment.

Description

Medical image fusion method and device
Technical Field
The invention relates to the field of image processing, in particular to a medical image fusion method and device.
Background
This section is intended to provide a background or context to the embodiments of the invention that are recited in the claims. The description herein is not admitted to be prior art by inclusion in this section.
A medical image is an image of a human organ or diseased tissue acquired by a medical imaging apparatus for clinical medicine or medical research. Since each medical imaging device can acquire only a single-modality medical image, a single-modality image often fails to provide sufficient information. For example, an MR image acquired by a magnetic resonance imaging apparatus clearly depicts structures such as soft tissue but is insensitive to calcifications, whereas a CT image acquired by computed tomography equipment has higher spatial resolution and stronger geometric fidelity and clearly depicts structures such as bone, but offers low soft-tissue contrast. Therefore, fusing the single-modality medical images obtained by different medical imaging devices into multi-modality medical images, so as to obtain more comprehensive information, is of great significance for medical research and medical diagnosis.
At present, the mainstream prior-art approach to acquiring a multi-modality medical image directly fuses two single-modality medical images into one image using an image fusion technique, so that boundary and structural information in the fused image becomes clearer. Because the semantic information of the images is not considered, the fused image is difficult to understand.
For example, figs. 1a and 1b show a CT image and an MR image of a patient's skull; the images shown in figs. 1a and 1b are taken as the two original images to be fused. Fig. 1c shows the image obtained by fusing the CT image of fig. 1a with the MR image of fig. 1b based on a CNN-LP network structure, fig. 1d shows the image obtained based on an NSCT-PCDC network structure, and fig. 1e shows the image obtained based on an NSST-PAPCNN network structure.
Analysis shows that existing image fusion methods carry the colors of the original images directly into the fused image, which causes two problems: ① local regions (e.g., the regions marked by rectangular boxes in figs. 1a, 1b, 1c, 1d, and 1e) are blurred; ② some structures (such as cerebrospinal fluid and the skull) are difficult to distinguish in the fused image because they have the same color in the original images.
Disclosure of Invention
The embodiment of the invention provides a medical image fusion method, which is used for solving the prior-art technical problem that semantic missing or semantic conflict arises in the medical image fusion process, and which comprises the following steps: acquiring pixel information of a first-modality image and a second-modality image to be fused, wherein the first-modality image and the second-modality image are images of different modalities; inputting the pixel information of the first-modality image and the second-modality image into a pre-trained image fusion network model, and outputting a fused image of the first-modality image and the second-modality image, wherein the image fusion network model is a model for fusing images of different modalities based on semantic information, and the semantic information is used for representing the meaning of pixel values in the images of different modalities;
inputting the pixel information of the first modality image and the second modality image into an image fusion network model obtained by pre-training, and outputting a fusion image of the first modality image and the second modality image, wherein the method comprises the following steps:
extracting first semantic information of the first modality image according to pixel information of the first modality image, and extracting second semantic information of the second modality image according to pixel information of the second modality image;
mapping the first semantic information and the second semantic information to a target image space;
and fusing the first semantic information and the second semantic information based on the target image space to obtain a fused image of the first modal image and the second modal image.
The embodiment of the invention also provides a medical image fusion device, which is used for solving the technical problem that the semantic meaning of the medical image is lost or conflicted in the fusion process in the prior art, and comprises the following components: the image information acquiring unit is used for acquiring pixel information of a first modal image and a second modal image to be fused, wherein the first modal image and the second modal image are images of different modalities; the image information processing unit is used for inputting the pixel information of the first modal image and the second modal image into an image fusion network model obtained by pre-training and outputting the fusion image of the first modal image and the second modal image, wherein the image fusion network model is a model for fusing different modal images based on semantic information, and the semantic information is used for representing the meaning of pixel values in the different modal images;
wherein the image information processing unit includes:
the semantic extraction module is used for extracting first semantic information of the first modality image according to the pixel information of the first modality image and extracting second semantic information of the second modality image according to the pixel information of the second modality image;
an image space mapping module for mapping the first semantic information and the second semantic information to a target image space;
and the image fusion module is used for fusing the first semantic information and the second semantic information based on the target image space to obtain a fused image of the first modal image and the second modal image.
The embodiment of the invention also provides a computer device for solving the prior-art technical problem of semantic missing or semantic conflict in the fusion of medical images. The computer device comprises a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor implements the above medical image fusion method when executing the computer program.
An embodiment of the present invention further provides a computer-readable storage medium, which is used for solving the technical problem of semantic missing or semantic conflict in the fusion process of medical images in the prior art, and the computer-readable storage medium stores a computer program for executing the medical image fusion method.
In the embodiment of the invention, an image fusion network model for fusing images in different modes based on semantic information is obtained in advance through machine learning training, and when the images in different modes are fused, only the pixel information of the images in different modes is required to be input into the image fusion network model, so that the fused images of the images in different modes can be obtained.
According to the embodiment of the invention, the images in different modes can be fused based on the semantic information, so that the fused multi-mode images are easy to read, and the technical effect of providing medical image information in different modes for clinicians to assist in treatment is realized.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed for the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort. In the drawings:
FIG. 1a is a schematic diagram of a skull CT image provided in an embodiment of the present invention;
FIG. 1b is a schematic diagram of a head MR image provided in an embodiment of the present invention;
fig. 1c is a schematic view of a multi-modal image obtained by fusing fig. 1a and fig. 1b based on a CNN-LP network structure provided in an embodiment of the present invention;
fig. 1d is a schematic view of a multi-modal image obtained by fusing the images of fig. 1a and 1b based on an NSCT-PCDC network structure according to an embodiment of the present invention;
fig. 1e is a schematic diagram of a multi-modal image obtained by fusing the images in fig. 1a and 1b based on an NSST-PAPCNN network structure according to an embodiment of the present invention;
FIG. 2 is a flow chart of a medical image fusion method provided in an embodiment of the present invention;
FIG. 3 is a schematic diagram of an image fusion process based on semantic information according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a multi-modal image fusion model based on a W-Net network structure according to an embodiment of the present invention;
FIG. 5a is a schematic diagram of a coding network model using a U-Net network structure according to an embodiment of the present invention;
fig. 5b is a schematic diagram of a decoding network model adopting a U-Net network structure according to an embodiment of the present invention;
FIG. 6a is a schematic view of a CT image provided in an embodiment of the present invention;
FIG. 6b is a schematic diagram of an MR image provided in an embodiment of the present invention;
fig. 6c is a schematic view of a multi-modal image obtained by fusing fig. 6a and fig. 6b based on a CNN-LP network structure according to an embodiment of the present invention;
fig. 6d is a schematic view of a multi-modal image obtained by fusing the images of fig. 6a and 6b based on the NSCT-PCDC network structure according to an embodiment of the present invention;
fig. 6e is a schematic view of a multi-modal image obtained by fusing the images of fig. 6a and 6b based on the NSST-PAPCNN network structure according to an embodiment of the present invention;
fig. 6f is a schematic diagram of a multi-modal image obtained by fusing fig. 6a and fig. 6b based on a GF network structure according to an embodiment of the present invention;
fig. 6g is a schematic view of a multi-modal image obtained by fusing the images of fig. 6a and 6b based on the NSCT-RPCNN network structure provided in the embodiment of the present invention;
fig. 6h is a schematic diagram of a multi-modal image obtained by fusing the images in fig. 6a and 6b based on a W-Net network structure according to an embodiment of the present invention;
fig. 7 is a schematic diagram of a medical image fusion apparatus provided in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the embodiments of the present invention are further described in detail below with reference to the accompanying drawings. The exemplary embodiments and descriptions of the present invention are provided to explain the present invention, but not to limit the present invention.
In the description of the present specification, the terms "comprising," "including," "having," "containing," and the like are used in an open-ended fashion, i.e., to mean including, but not limited to. Reference to the description of the terms "one embodiment," "a particular embodiment," "some embodiments," "for example," etc., means that a particular feature, structure, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. The sequence of steps involved in the embodiments is for illustrative purposes to illustrate the implementation of the present application, and the sequence of steps is not limited and can be adjusted as needed.
An embodiment of the present invention provides a method for fusing medical images, and fig. 2 is a flowchart of the method for fusing medical images provided in the embodiment of the present invention, as shown in fig. 2, the method includes the following steps:
s201, acquiring pixel information of a first modality image and a second modality image to be fused, wherein the first modality image and the second modality image are images of different modalities;
s202, inputting the pixel information of the first mode image and the second mode image into an image fusion network model obtained by pre-training, and outputting a fusion image of the first mode image and the second mode image, wherein the image fusion network model is a model for fusing different mode images based on semantic information, and the semantic information is used for representing the meaning of pixel values in the different mode images.
It should be noted that, in the embodiment of the present invention, the first-modality image and the second-modality image represent images of two different modalities and may be, without limitation, single-modality images obtained by any two of the following kinds of medical imaging devices: magnetic resonance equipment, CT equipment, ultrasound equipment, X-ray machines, infrared instruments, microscopes, and the like.
As a preferred embodiment, the embodiment of the present invention is described in detail taking the fusion of an MR image (an image obtained by a magnetic resonance apparatus) and a CT image (an image obtained by a CT apparatus) as an example. An MR image and a CT image look very similar (gray values ranging from black to white), but because the imaging principles are completely different, the meanings of the pixel values in the two images are completely different. Directly fusing an MR image and a CT image therefore causes semantic missing or semantic conflict. The medical image fusion method provided by the embodiment of the invention fuses the MR image and the CT image based on semantic information, so that the fused image clearly reflects structural information such as bone while also reflecting structures such as soft tissue.
As can be seen from the above, the medical image fusion method provided in the embodiment of the present invention obtains an image fusion network model fusing images of different modalities based on semantic information in advance through machine learning training, and when fusing images of different modalities, only needs to input pixel information of the images of different modalities into the image fusion network model, so as to obtain a fused image of the images of different modalities.
According to the embodiment of the invention, the images in different modes can be fused based on the semantic information, so that the fused multi-mode images are easy to read, and the technical effect of providing medical image information in different modes for clinicians to assist in treatment is realized.
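As a concrete illustration of steps S201 and S202 at inference time, the following minimal sketch assumes PyTorch as the framework, grayscale source images, and a trained fusion network serialized to fusion_model.pt; the file names and the model's output convention (fused image plus two reconstructions) are illustrative assumptions rather than details published with the patent.

```python
import numpy as np
import torch
from PIL import Image

def load_pixels(path):
    """S201: read one modality image and normalize its pixel values to [0, 1]."""
    img = np.asarray(Image.open(path).convert("L"), dtype=np.float32) / 255.0
    return torch.from_numpy(img)[None, None]        # shape (1, 1, H, W)

# S202: feed both modality images to the pre-trained fusion network model.
# "ct.png", "mr.png", "fusion_model.pt" and the (z, reconstructions) output
# convention are illustrative assumptions, not artifacts of the patent.
x_ct = load_pixels("ct.png")                        # first-modality image
x_mr = load_pixels("mr.png")                        # second-modality image
model = torch.load("fusion_model.pt")               # pre-trained fusion model
model.eval()
with torch.no_grad():
    z, _, _ = model(x_ct, x_mr)                     # z: fused image
Image.fromarray((z[0, 0].numpy() * 255).astype(np.uint8)).save("fused.png")
```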
It should be noted that the image fusion network model may be a network model obtained in advance through machine learning. Thus, as an optional implementation, the medical image fusion method provided in the embodiment of the present invention may train the image fusion network model through the following steps: training the image fusion network model with training data; and, during training, adjusting the parameters of the image fusion network model until its loss function satisfies a preset convergence condition. The image fusion network model comprises an encoding network model and a decoding network model. The input data of the encoding network model are the first-modality image and the second-modality image to be fused, and its output data is the fused image of the two; the input data of the decoding network model is the fused image, and its output data are the first-modality image and the second-modality image reconstructed from the fused image. The loss function includes at least the reconstruction errors of the first-modality image and the second-modality image.
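The training procedure described above can be sketched as follows, assuming PyTorch, the Adam optimizer, and a relative-change test as the "preset convergence condition" (the publication discloses neither the optimizer nor the stopping criterion); the model is assumed to return the fused image and both reconstructions, and loss_fn to follow the signature of the loss sketch given below.

```python
import torch

def train_fusion_model(model, loader, loss_fn, lr=1e-4,
                       max_epochs=200, tol=1e-4):
    """Adjust the model parameters until the loss function satisfies a
    convergence condition (here: relative change below tol)."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    prev = float("inf")
    for epoch in range(max_epochs):
        total = 0.0
        for x_ct, x_mr in loader:                   # registered image pairs
            z, x_ct_hat, x_mr_hat = model(x_ct, x_mr)
            loss = loss_fn(x_ct, x_mr, z, x_ct_hat, x_mr_hat,
                           params=list(model.parameters()))
            opt.zero_grad()
            loss.backward()
            opt.step()
            total += loss.item()
        if abs(prev - total) <= tol * max(abs(total), 1e-12):  # converged
            break
        prev = total
    return model
```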
Optionally, the loss function may further include a sparse penalty term and an L2 regularization term. As an alternative embodiment, the loss function may be expressed as:

$$\mathcal{L}(\theta,\phi)=\left\|x_{ct}-\hat{x}_{ct}\right\|_{2}^{2}+\left\|x_{mr}-\hat{x}_{mr}\right\|_{2}^{2}+\alpha\sum_{i,j}\mathrm{KL}\!\left(\rho\,\middle\|\,z_{ij}\right)+\beta\left\|W\right\|_{2}^{2}$$

where

$$\mathrm{KL}\!\left(\rho\,\middle\|\,z_{ij}\right)=\rho\log\frac{\rho}{z_{ij}}+(1-\rho)\log\frac{1-\rho}{1-z_{ij}}$$

and where $x_{ct}$ represents the first-modality image to be fused; $x_{mr}$ represents the second-modality image to be fused; $\hat{x}_{ct}$ represents the reconstructed first-modality image; $\hat{x}_{mr}$ represents the reconstructed second-modality image; $\left\|x_{ct}-\hat{x}_{ct}\right\|_{2}^{2}$ represents the Euclidean distance between the reconstructed first-modality image and the first-modality image to be fused; $\left\|x_{mr}-\hat{x}_{mr}\right\|_{2}^{2}$ represents the Euclidean distance between the reconstructed second-modality image and the second-modality image to be fused; the sparse penalty term $\sum_{i,j}\mathrm{KL}(\rho\,\|\,z_{ij})$ represents the KL divergence between the fused image and a constant image; $z_{ij}$ represents the pixel value at coordinate $(i,j)$ of the fused image; $\rho$ represents a constant; $\left\|W\right\|_{2}^{2}$ represents the L2 regularization term; $\alpha$ represents the weight of the sparse penalty term; and $\beta$ represents the weight of the L2 regularization term.
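One possible PyTorch rendering of this loss is sketched below; ρ, α, and β are set to illustrative values, since the publication does not disclose the values used.

```python
import torch
import torch.nn.functional as F

def fusion_loss(x_ct, x_mr, z, x_ct_hat, x_mr_hat, params,
                rho=0.05, alpha=1e-3, beta=1e-4, eps=1e-6):
    # Squared Euclidean reconstruction errors for both modalities
    recon = (F.mse_loss(x_ct_hat, x_ct, reduction="sum")
             + F.mse_loss(x_mr_hat, x_mr, reduction="sum"))

    # Sparse penalty: KL divergence between the constant rho and each
    # fused-image pixel z_ij (the sparse-autoencoder form of the term above)
    z = z.clamp(eps, 1.0 - eps)
    kl = (rho * torch.log(rho / z)
          + (1.0 - rho) * torch.log((1.0 - rho) / (1.0 - z))).sum()

    # L2 regularization over the network weights W
    l2 = sum((w ** 2).sum() for w in params)

    return recon + alpha * kl + beta * l2
```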
Since the first modality image and the second modality image are images of different modalities and the meanings represented by image pixel values are different, in order to achieve the purpose of fusing images of different modalities based on semantic information, the fusion method of medical images provided by the embodiment of the present invention can specifically achieve the fusion of the first modality image and the second modality image based on the image fusion network model by the following steps: extracting first semantic information of the first modality image according to pixel information of the first modality image, and extracting second semantic information of the second modality image according to pixel information of the second modality image; mapping the first semantic information and the second semantic information to a target image space; and fusing the first semantic information and the second semantic information based on the target image space to obtain a fused image of the first modal image and the second modal image.
Taking the first-modality image to be a CT image and the second-modality image to be an MR image as an example, fig. 3 is a schematic diagram of the semantic-information-based image fusion process according to an embodiment of the present invention. As shown in fig. 3, first semantic information is extracted from the CT image in modality 1 and second semantic information from the MR image in modality 2; an image space of modality 3 is constructed; and the first and second semantic information are fused in modality 3 to obtain the fused image, where each pixel value in modality 3 carries both the meaning of the corresponding modality-1 pixel value and the meaning of the corresponding modality-2 pixel value.
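Read as network stages, the extract-map-fuse pipeline of fig. 3 could be sketched as below; the per-modality branches and the shared fusion head are purely illustrative, since the W-Net model described next realizes these steps jointly inside one encoder.

```python
import torch
import torch.nn as nn

class SemanticFusion(nn.Module):
    """Sketch of fig. 3: extract per-modality semantic features, then map
    and fuse them into one target image space ("modality 3")."""

    def __init__(self, feat=16):
        super().__init__()
        self.extract_ct = nn.Sequential(nn.Conv2d(1, feat, 3, padding=1), nn.ReLU())
        self.extract_mr = nn.Sequential(nn.Conv2d(1, feat, 3, padding=1), nn.ReLU())
        self.map_and_fuse = nn.Sequential(nn.Conv2d(2 * feat, 1, 3, padding=1), nn.Sigmoid())

    def forward(self, x_ct, x_mr):
        s_ct = self.extract_ct(x_ct)            # first semantic information
        s_mr = self.extract_mr(x_mr)            # second semantic information
        return self.map_and_fuse(torch.cat([s_ct, s_mr], dim=1))  # fused image
```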
As a preferred implementation, the embodiment of the present invention uses the U-Net network structure to build the semantic-information-based multi-modality image fusion model, yielding the W-Net-based multi-modality image fusion model shown in fig. 4, where $x=[x_{ct},x_{mr}]$ denotes the two input images to be fused, with $x_{ct}$ the first-modality image (e.g., a CT image) and $x_{mr}$ the second-modality image (e.g., an MR image); $E_{\theta}$ denotes the encoding network model; $D_{\phi}$ denotes the decoding network model; $Z$ denotes the fused image; $\hat{x}=[\hat{x}_{ct},\hat{x}_{mr}]$ denotes the two images that the decoding network model $D_{\phi}$ reconstructs from the fused image $Z$, with $\hat{x}_{ct}$ the reconstructed first-modality image (e.g., a CT image) and $\hat{x}_{mr}$ the reconstructed second-modality image (e.g., an MR image); and $\mathcal{L}(\theta,\phi)$ denotes the loss function of the whole W-Net network structure model.

In the multi-modality image fusion model provided by the embodiment of the invention, the encoding network model $E_{\theta}$ adopts a multi-input-channel, single-output-channel U-Net network (fig. 5a shows only two input channels) to fuse the first-modality image $x_{ct}$ and the second-modality image $x_{mr}$ into the fused image $Z$; the decoding network model $D_{\phi}$ adopts a single-input-channel, multi-output-channel U-Net network (fig. 5b shows only two output channels) to reconstruct the first-modality image $\hat{x}_{ct}$ and the second-modality image $\hat{x}_{mr}$ from the fused image.
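A compact sketch of this W-Net arrangement follows. The publication does not give layer configurations, so plain convolutional stacks stand in for the U-Nets of figs. 5a and 5b; the points being illustrated are the two-channel input, the single-channel fused image Z in between, and the two-channel reconstruction. Combined with the fusion_loss sketch above, such a model could be trained with the train_fusion_model loop given earlier.

```python
import torch
import torch.nn as nn

class WNetFusion(nn.Module):
    """W-Net sketch: encoder E_theta maps the two modality channels to one
    fused channel; decoder D_phi maps the fused channel back to two channels.
    Plain convolutional stacks stand in for the U-Nets of figs. 5a/5b."""

    def __init__(self, hidden=32):
        super().__init__()
        self.encoder = nn.Sequential(               # E_theta: 2 -> 1 channel
            nn.Conv2d(2, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, 1, 3, padding=1), nn.Sigmoid(),  # z_ij in (0,1)
        )
        self.decoder = nn.Sequential(               # D_phi: 1 -> 2 channels
            nn.Conv2d(1, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, 2, 3, padding=1),
        )

    def forward(self, x_ct, x_mr):
        x = torch.cat([x_ct, x_mr], dim=1)          # x = [x_ct, x_mr]
        z = self.encoder(x)                         # fused image Z
        x_hat = self.decoder(z)                     # reconstruction of both
        return z, x_hat[:, :1], x_hat[:, 1:]        # Z, x_ct_hat, x_mr_hat

# Shape check: two 1-channel inputs yield one fused image and two outputs.
model = WNetFusion()
z, ct_hat, mr_hat = model(torch.rand(1, 1, 256, 256), torch.rand(1, 1, 256, 256))
```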
Assume the images to be fused are the CT image shown in fig. 6a and the MR image shown in fig. 6b. Fig. 6c is a schematic view of the multi-modal image obtained by fusing figs. 6a and 6b based on the CNN-LP network structure; fig. 6d, based on the NSCT-PCDC network structure; fig. 6e, based on the NSST-PAPCNN network structure; fig. 6f, based on the GF network structure; fig. 6g, based on the NSCT-RPCNN network structure; and fig. 6h, based on the W-Net network structure provided in the embodiment of the present invention. Comparing figs. 6c, 6d, 6e, 6f, 6g, and 6h shows that the multi-modal fused image obtained by the W-Net-based image fusion method of the embodiment of the present invention clearly reflects the information of each single-modality image.
Table 1 compares the evaluation indices obtained when images are fused with image fusion methods built on different network structures. $Q_{MI}$ denotes the mutual-information evaluation index, an entropy-based metric that measures how much source-image information the fused image retains; $Q^{AB/F}$ denotes the edge-information evaluation index, a gradient-based metric that measures how well edge information is preserved in the fused image; SSIM denotes the structural-similarity evaluation index, which measures the structural similarity between the fused image and the original images; $Q_{D}$ is a Human Visual System (HVS) based evaluation index using a Daly filter, which measures the visual difference between the fused image and the source images; and Semantic Loss denotes the semantic loss index, which measures the reconstruction error of the fused image so as to characterize semantic conflicts in the fused image. For $Q_{MI}$ and $Q^{AB/F}$, the index values on both CT and MR-T2 should be as high as possible; for $Q_{D}$ and Semantic Loss, lower index values are better.
TABLE 1 evaluation index of different image fusion methods
As can be seen from Table 1, because the medical image fusion method provided by the embodiment of the present invention takes semantic loss into account, it achieves the best fusion effect on the semantic loss index while remaining comparable to existing image fusion methods on the visual evaluation indices.
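For readers who wish to reproduce indices of this kind, the sketch below computes a histogram-based mutual-information score in the spirit of $Q_{MI}$ together with SSIM from scikit-image; these are generic formulations, not necessarily the exact definitions behind Table 1.

```python
import numpy as np
from skimage.metrics import structural_similarity as ssim

def mutual_information(a, b, bins=64):
    """Histogram estimate of MI(a; b) for two grayscale images in [0, 1]."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p = joint / joint.sum()
    px, py = p.sum(axis=1), p.sum(axis=0)       # marginal distributions
    nz = p > 0                                  # avoid log(0)
    return float((p[nz] * np.log(p[nz] / (px[:, None] * py[None, :])[nz])).sum())

def evaluate(fused, ct, mr):
    """Higher MI / SSIM against each source image = more information retained
    (generic stand-ins for the Q_MI and SSIM columns of Table 1)."""
    return {
        "MI_ct": mutual_information(fused, ct),
        "MI_mr": mutual_information(fused, mr),
        "SSIM_ct": ssim(fused, ct, data_range=1.0),
        "SSIM_mr": ssim(fused, mr, data_range=1.0),
    }
```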
The embodiment of the invention also provides a medical image fusion device, which is described in the following embodiment. Because the principle of solving the problems of the embodiment of the device is similar to the fusion method of the medical images, the implementation of the embodiment of the device can be referred to the implementation of the method, and repeated details are not repeated.
Fig. 7 is a schematic diagram of a medical image fusion apparatus provided in an embodiment of the present invention, as shown in fig. 7, the apparatus includes:
an image information obtaining unit 71, configured to obtain pixel information of a first modality image and a second modality image to be fused, where the first modality image and the second modality image are images of different modalities;
the image information processing unit 72 is configured to input pixel information of the first modality image and the second modality image into an image fusion network model obtained through pre-training, and output a fusion image of the first modality image and the second modality image, where the image fusion network model is a model for fusing different modality images based on semantic information, and the semantic information is used to represent meanings of pixel values in the different modality images.
As can be seen from the above, the medical image fusion device according to the embodiment of the present invention obtains an image fusion network model fusing images in different modalities based on semantic information in advance through machine learning training, obtains pixel information of images in different modalities to be fused through the image information obtaining unit 71 when images in different modalities are fused, and inputs the pixel information of images in different modalities to be fused into the image fusion network model obtained through pre-training through the image information processing unit 72, so as to obtain a fusion image of images in different modalities.
According to the embodiment of the invention, the images in different modes can be fused based on the semantic information, so that the fused multi-mode images are easy to read, and the technical effect of providing medical image information in different modes for clinicians to assist in treatment is realized.
As an optional implementation manner, in the medical image fusion apparatus provided in the embodiment of the present invention, the image information processing unit 72 may specifically include: a semantic extracting module 721, configured to extract first semantic information of the first modality image according to the pixel information of the first modality image, and extract second semantic information of the second modality image according to the pixel information of the second modality image; an image space mapping module 722 for mapping the first semantic information and the second semantic information to a target image space; the image fusion module 723 is configured to fuse the first semantic information and the second semantic information based on the target image space to obtain a fusion image of the first modality image and the second modality image.
In an optional embodiment, the medical image fusion apparatus provided in the embodiment of the present invention may further include: a model training unit 73 for training the image fusion network model using the training data; the model adjusting unit 74 is configured to adjust parameters of the image fusion network model until a loss function of the image fusion network model meets a preset convergence condition in the training process; wherein, the image fusion network model comprises: an encoding network model and a decoding network model; the input data of the coding network model is a first modal image and a second modal image to be fused, and the output data of the coding network model is a fused image of the first modal image and the second modal image; decoding input data of the network model into a fused image of a first modal image and a second modal image, and decoding output data of the network model into the first modal image and the second modal image which are reconstructed according to the fused image; wherein the loss function comprises at least a reconstruction error determined by an error between the reconstructed first modality image and the first modality image to be fused, and an error between the reconstructed second modality image and the second modality image to be fused.
Optionally, the loss function may further include a sparse penalty term and an L2 regularization term. As an alternative embodiment, the loss function may be expressed as:

$$\mathcal{L}(\theta,\phi)=\left\|x_{ct}-\hat{x}_{ct}\right\|_{2}^{2}+\left\|x_{mr}-\hat{x}_{mr}\right\|_{2}^{2}+\alpha\sum_{i,j}\mathrm{KL}\!\left(\rho\,\middle\|\,z_{ij}\right)+\beta\left\|W\right\|_{2}^{2}$$

where

$$\mathrm{KL}\!\left(\rho\,\middle\|\,z_{ij}\right)=\rho\log\frac{\rho}{z_{ij}}+(1-\rho)\log\frac{1-\rho}{1-z_{ij}}$$

and where $x_{ct}$ represents the first-modality image to be fused; $x_{mr}$ represents the second-modality image to be fused; $\hat{x}_{ct}$ represents the reconstructed first-modality image; $\hat{x}_{mr}$ represents the reconstructed second-modality image; $\left\|x_{ct}-\hat{x}_{ct}\right\|_{2}^{2}$ represents the Euclidean distance between the reconstructed first-modality image and the first-modality image to be fused; $\left\|x_{mr}-\hat{x}_{mr}\right\|_{2}^{2}$ represents the Euclidean distance between the reconstructed second-modality image and the second-modality image to be fused; the sparse penalty term $\sum_{i,j}\mathrm{KL}(\rho\,\|\,z_{ij})$ represents the KL divergence between the fused image and a constant image; $z_{ij}$ represents the pixel value at coordinate $(i,j)$ of the fused image; $\rho$ represents a constant; $\left\|W\right\|_{2}^{2}$ represents the L2 regularization term; $\alpha$ represents the weight of the sparse penalty term; and $\beta$ represents the weight of the L2 regularization term.
The embodiment of the present invention further provides a computer device for solving the prior-art technical problem of semantic missing or semantic conflict in the fusion of medical images. The computer device comprises a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor implements any one of the optional or preferred medical image fusion methods when executing the computer program.
An embodiment of the present invention further provides a computer-readable storage medium, which is used to solve the technical problem of semantic missing or semantic conflict in the fusion process of medical images in the prior art, and the computer-readable storage medium stores a computer program for executing any one of the optional or preferred medical image fusion methods.
In summary, the embodiment of the present invention provides a medical image fusion scheme based on image semantics, which performs semantic extraction, semantic transformation, and semantic fusion on a medical image, and can obtain a fusion image with consistent semantics and easy reading, so that a medical image fusion technology can truly provide medical image information from different modalities in the fusion image for a clinician to assist diagnosis and treatment.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are only exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (8)

1. A method of fusing medical images, comprising:
acquiring pixel information of a first modal image and a second modal image to be fused, wherein the first modal image and the second modal image are images of different modalities;
inputting pixel information of the first modal image and the second modal image into an image fusion network model obtained by pre-training, and outputting a fusion image of the first modal image and the second modal image, wherein the image fusion network model is a model for fusing different modal images based on semantic information, and the semantic information is used for representing meanings of pixel values in the different modal images;
inputting the pixel information of the first modality image and the second modality image into an image fusion network model obtained by pre-training, and outputting a fusion image of the first modality image and the second modality image, wherein the method comprises the following steps:
extracting first semantic information of the first modality image according to pixel information of the first modality image, and extracting second semantic information of the second modality image according to pixel information of the second modality image;
mapping the first semantic information and the second semantic information to a target image space;
and fusing the first semantic information and the second semantic information based on the target image space to obtain a fused image of the first modal image and the second modal image.
2. The method of claim 1, wherein the method further comprises:
training the image fusion network model by using training data;
in the training process, adjusting parameters of the image fusion network model until a loss function of the image fusion network model meets a preset convergence condition;
wherein the image fusion network model comprises: an encoding network model and a decoding network model; the input data of the coding network model is a first modal image and a second modal image to be fused, and the output data of the coding network model is a fused image of the first modal image and the second modal image; the input data of the decoding network model is a fused image of a first modal image and a second modal image, and the output data of the decoding network model is the first modal image and the second modal image which are reconstructed according to the fused image;
the loss function includes at least a reconstruction error of the first modality image and the second modality image.
3. The method of claim 2, wherein the loss function further comprises a sparse penalty term and an L2 regularization term, the loss function being expressed as:

$$\mathcal{L}(\theta,\phi)=\left\|x_{ct}-\hat{x}_{ct}\right\|_{2}^{2}+\left\|x_{mr}-\hat{x}_{mr}\right\|_{2}^{2}+\alpha\sum_{i,j}\mathrm{KL}\!\left(\rho\,\middle\|\,z_{ij}\right)+\beta\left\|W\right\|_{2}^{2}$$

wherein

$$\mathrm{KL}\!\left(\rho\,\middle\|\,z_{ij}\right)=\rho\log\frac{\rho}{z_{ij}}+(1-\rho)\log\frac{1-\rho}{1-z_{ij}}$$

and wherein $x_{ct}$ represents the first-modality image to be fused; $x_{mr}$ represents the second-modality image to be fused; $\hat{x}_{ct}$ represents the reconstructed first-modality image; $\hat{x}_{mr}$ represents the reconstructed second-modality image; $\left\|x_{ct}-\hat{x}_{ct}\right\|_{2}^{2}$ represents the Euclidean distance between the reconstructed first-modality image and the first-modality image to be fused; $\left\|x_{mr}-\hat{x}_{mr}\right\|_{2}^{2}$ represents the Euclidean distance between the reconstructed second-modality image and the second-modality image to be fused; the sparse penalty term $\sum_{i,j}\mathrm{KL}(\rho\,\|\,z_{ij})$ represents the KL divergence between the fused image and a constant image; $z_{ij}$ represents the pixel value at coordinate $(i,j)$ of the fused image; $\rho$ represents a constant; $\left\|W\right\|_{2}^{2}$ represents the L2 regularization term; $\alpha$ represents the weight of the sparse penalty term; and $\beta$ represents the weight of the L2 regularization term.
4. A medical image fusion apparatus, comprising:
the image information acquiring unit is used for acquiring pixel information of a first modal image and a second modal image to be fused, wherein the first modal image and the second modal image are images of different modalities;
the image information processing unit is used for inputting the pixel information of the first modal image and the second modal image into an image fusion network model obtained by pre-training and outputting a fusion image of the first modal image and the second modal image, wherein the image fusion network model is a model for fusing different modal images based on semantic information, and the semantic information is used for representing the meaning of pixel values in the different modal images;
wherein the image information processing unit includes:
the semantic extraction module is used for extracting first semantic information of the first modality image according to the pixel information of the first modality image and extracting second semantic information of the second modality image according to the pixel information of the second modality image;
an image space mapping module for mapping the first semantic information and the second semantic information to a target image space;
and the image fusion module is used for fusing the first semantic information and the second semantic information based on the target image space to obtain a fused image of the first modal image and the second modal image.
5. The apparatus of claim 4, wherein the apparatus further comprises:
the model training unit is used for training the image fusion network model by using training data;
the model adjusting unit is used for adjusting the parameters of the image fusion network model in the training process until the loss function of the image fusion network model meets a preset convergence condition;
wherein the image fusion network model comprises: an encoding network model and a decoding network model; the input data of the coding network model is a first modal image and a second modal image to be fused, and the output data of the coding network model is a fused image of the first modal image and the second modal image; the input data of the decoding network model is a fused image of a first modal image and a second modal image, and the output data of the decoding network model is the first modal image and the second modal image which are reconstructed according to the fused image;
the loss function includes at least a reconstruction error determined from an error between the reconstructed first-modality image and the first-modality image to be fused, and an error between the reconstructed second-modality image and the second-modality image to be fused.
6. The apparatus of claim 5, wherein the loss function further comprises a sparse penalty term and an L2 regularization term, the loss function being expressed as:

$$\mathcal{L}(\theta,\phi)=\left\|x_{ct}-\hat{x}_{ct}\right\|_{2}^{2}+\left\|x_{mr}-\hat{x}_{mr}\right\|_{2}^{2}+\alpha\sum_{i,j}\mathrm{KL}\!\left(\rho\,\middle\|\,z_{ij}\right)+\beta\left\|W\right\|_{2}^{2}$$

wherein

$$\mathrm{KL}\!\left(\rho\,\middle\|\,z_{ij}\right)=\rho\log\frac{\rho}{z_{ij}}+(1-\rho)\log\frac{1-\rho}{1-z_{ij}}$$

and wherein $x_{ct}$ represents the first-modality image to be fused; $x_{mr}$ represents the second-modality image to be fused; $\hat{x}_{ct}$ represents the reconstructed first-modality image; $\hat{x}_{mr}$ represents the reconstructed second-modality image; $\left\|x_{ct}-\hat{x}_{ct}\right\|_{2}^{2}$ represents the Euclidean distance between the reconstructed first-modality image and the first-modality image to be fused; $\left\|x_{mr}-\hat{x}_{mr}\right\|_{2}^{2}$ represents the Euclidean distance between the reconstructed second-modality image and the second-modality image to be fused; the sparse penalty term $\sum_{i,j}\mathrm{KL}(\rho\,\|\,z_{ij})$ represents the KL divergence between the fused image and a constant image; $z_{ij}$ represents the pixel value at coordinate $(i,j)$ of the fused image; $\rho$ represents a constant; $\left\|W\right\|_{2}^{2}$ represents the L2 regularization term, used for characterizing the complexity of the model; $\alpha$ represents the weight of the sparse penalty term; and $\beta$ represents the weight of the L2 regularization term.
7. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method for fusing medical images according to any one of claims 1 to 3 when executing the computer program.
8. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program for executing the medical image fusion method according to any one of claims 1 to 3.
CN201910430661.0A 2019-05-22 2019-05-22 Medical image fusion method and device Expired - Fee Related CN110211079B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910430661.0A CN110211079B (en) 2019-05-22 2019-05-22 Medical image fusion method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910430661.0A CN110211079B (en) 2019-05-22 2019-05-22 Medical image fusion method and device

Publications (2)

Publication Number Publication Date
CN110211079A CN110211079A (en) 2019-09-06
CN110211079B true CN110211079B (en) 2021-07-13

Family

ID=67788139

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910430661.0A Expired - Fee Related CN110211079B (en) 2019-05-22 2019-05-22 Medical image fusion method and device

Country Status (1)

Country Link
CN (1) CN110211079B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111784706B (en) * 2020-06-28 2021-06-04 广州柏视医疗科技有限公司 Automatic identification method and system for primary tumor image of nasopharyngeal carcinoma
CN111832656A (en) * 2020-07-17 2020-10-27 复旦大学 Medical human-computer interaction assistance system and computer-readable storage medium containing the same
CN113255756B (en) * 2021-05-20 2024-05-24 联仁健康医疗大数据科技股份有限公司 Image fusion method and device, electronic equipment and storage medium
CN113888663B (en) * 2021-10-15 2022-08-26 推想医疗科技股份有限公司 Reconstruction model training method, anomaly detection method, device, equipment and medium
CN116682565B (en) * 2023-07-28 2023-11-10 济南蓝博电子技术有限公司 Digital medical information on-line monitoring method, terminal and medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106203488A (en) * 2016-07-01 2016-12-07 福州大学 A kind of galactophore image Feature fusion based on limited Boltzmann machine
CN108961196A (en) * 2018-06-21 2018-12-07 华中科技大学 A kind of 3D based on figure watches the conspicuousness fusion method of point prediction attentively
CN109360633A (en) * 2018-09-04 2019-02-19 北京市商汤科技开发有限公司 Medical imaging processing method and processing device, processing equipment and storage medium
CN109544554A (en) * 2018-10-18 2019-03-29 中国科学院空间应用工程与技术中心 A kind of segmentation of plant image and blade framework extracting method and system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108986115B (en) * 2018-07-12 2020-12-18 佛山生物图腾科技有限公司 Medical image segmentation method and device and intelligent terminal

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106203488A (en) * 2016-07-01 2016-12-07 福州大学 A kind of galactophore image Feature fusion based on limited Boltzmann machine
CN108961196A (en) * 2018-06-21 2018-12-07 华中科技大学 A kind of 3D based on figure watches the conspicuousness fusion method of point prediction attentively
CN109360633A (en) * 2018-09-04 2019-02-19 北京市商汤科技开发有限公司 Medical imaging processing method and processing device, processing equipment and storage medium
CN109544554A (en) * 2018-10-18 2019-03-29 中国科学院空间应用工程与技术中心 A kind of segmentation of plant image and blade framework extracting method and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on CT and MRI brain image fusion technology; Ye Derong (叶德荣); Medical Equipment Information (《医疗设备信息》); 2002-05-31 (No. 5); pp. 9-19 *

Also Published As

Publication number Publication date
CN110211079A (en) 2019-09-06

Similar Documents

Publication Publication Date Title
CN110211079B (en) Medical image fusion method and device
Kaji et al. Overview of image-to-image translation by use of deep neural networks: denoising, super-resolution, modality conversion, and reconstruction in medical imaging
Liu et al. Learning disentangled representations in the imaging domain
Zhou et al. Handbook of medical image computing and computer assisted intervention
Mali et al. Making radiomics more reproducible across scanner and imaging protocol variations: a review of harmonization methods
Kolařík et al. Optimized high resolution 3d dense-u-net network for brain and spine segmentation
Masoudi et al. Quick guide on radiology image pre-processing for deep learning applications in prostate cancer research
US10129553B2 (en) Dynamic digital image compression based on digital image characteristics
Larobina et al. Medical image file formats
Bhateja et al. Multimodal medical image sensor fusion framework using cascade of wavelet and contourlet transform domains
Jog et al. MR image synthesis by contrast learning on neighborhood ensembles
JP2022025095A (en) System and method for translation of medical imaging using machine learning
Muzammil et al. CSID: A novel multimodal image fusion algorithm for enhanced clinical diagnosis
EP3611699A1 (en) Image segmentation using deep learning techniques
Caban et al. Rapid development of medical imaging tools with open-source libraries
CN104487978B (en) For the system and method based on report is generated from the input of radiation doctor
KR102149369B1 (en) Method for visualizing medical image and apparatus using the same
CN110400617A (en) The combination of imaging and report in medical imaging
CN108074270A (en) A kind of PET attenuation correction methods and device
Wang et al. Unsupervised learning for cross-domain medical image synthesis using deformation invariant cycle consistency networks
WO2023207416A1 (en) Image completion method and apparatus, device, and storage medium
Wang et al. TPSDicyc: Improved deformation invariant cross-domain medical image synthesis
KR101885562B1 (en) Method for mapping region of interest in first medical image onto second medical image and apparatus using the same
CN114897756A (en) Model training method, medical image fusion method, device, equipment and medium
Kumar et al. Structural similarity based anatomical and functional brain imaging fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee
Granted publication date: 20210713