CN113870259B - Multi-modal medical data fusion assessment method, device, equipment and storage medium - Google Patents

Multi-modal medical data fusion assessment method, device, equipment and storage medium

Info

Publication number
CN113870259B
CN113870259B (application CN202111454543.7A)
Authority
CN
China
Prior art keywords
matrix
fusion
data
modal
medical data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111454543.7A
Other languages
Chinese (zh)
Other versions
CN113870259A (en)
Inventor
王玉峰 (Wang Yufeng)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin Yujin Artificial Intelligence Medical Technology Co ltd
Original Assignee
Tianjin Yujin Artificial Intelligence Medical Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin Yujin Artificial Intelligence Medical Technology Co ltd
Priority to CN202111454543.7A (CN113870259B)
Publication of CN113870259A
Application granted
Publication of CN113870259B
Priority to PCT/CN2022/133614 (WO2023098524A1)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G06T 7/0012 Biomedical image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H 50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10072 Tomographic images
    • G06T 2207/10081 Computed x-ray tomography [CT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30004 Biomedical image processing
    • G06T 2207/30028 Colon; Small intestine
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30004 Biomedical image processing
    • G06T 2207/30096 Tumor; Lesion

Abstract

The application relates to the field of medical technology and discloses a multi-modal medical data fusion assessment method, apparatus, device and storage medium. The method comprises the following steps: acquiring medical data to be evaluated in multiple modalities for a target object; performing feature extraction on the medical data of each modality to obtain a plurality of feature vectors, and fusing these feature vectors to obtain a fused feature vector; and inputting the fused feature vector into a trained multi-modal fusion evaluation model to obtain the evaluation result output by the model. By performing artificial-intelligence-based feature extraction and feature fusion on the multi-modal medical data to obtain a fused feature vector, and using the multi-modal fusion evaluation model to predict the target object's degree of disease remission from that vector, the method can assist in accurately assessing the degree of remission at the pathological level, improving judgment accuracy and reducing medical risk. The application also discloses an apparatus, a device and a storage medium for multi-modal medical data fusion assessment.

Description

Multi-modal medical data fusion assessment method, device, equipment and storage medium
Technical Field
The present application relates to the field of medical technology, and for example, to an evaluation method, apparatus, device and storage medium for multi-modal medical data fusion.
Background
Rectal cancer is one of the principal cancers threatening the lives and health of residents in China and imposes a serious social burden. Its main treatments comprise comprehensive means such as surgery, radiotherapy, chemotherapy and targeted therapy. Even with standardized comprehensive treatment, in patients with low rectal cancer the tumor itself or the operation can cause loss of anal function, loss of the anus and colostomy, severely affecting patients' survival and quality of life. Many patients with locally advanced rectal cancer cannot achieve radical cure through primary surgery and are not suitable for surgical treatment. At present, the standard treatment for locally advanced rectal cancer (≥ cT3 or N+) is neoadjuvant chemoradiotherapy combined with total mesorectal excision and adjuvant chemotherapy. Neoadjuvant therapy can effectively downstage the tumor and improve the resection rate and the anus-preservation rate; it also offers a better option for preserving organ function in patients with low rectal cancer. Evaluating the efficacy of neoadjuvant therapy for rectal cancer, that is, whether clinical remission has been achieved after treatment and how likely pathological remission is, is a key link in clinical decision-making and patient prognosis evaluation.
In current evaluation of the efficacy of neoadjuvant therapy for rectal cancer, most clinical guidelines and expert consensus recommend comprehensively judging whether a patient has achieved, or is close to, clinical remission from multi-modal data such as endoscopy, digital rectal examination, rectal MRI, serum tumor marker levels, and contrast-enhanced CT of the chest, abdomen and pelvis. The evaluation relies on a multidisciplinary oncology diagnosis and treatment team with highly experienced specialists in surgery, internal medicine, radiotherapy, imaging, digestive endoscopy, pathology and other departments. A shortage of specialists in certain fields has left many medical institutions unable to offer neoadjuvant therapy for rectal cancer. Dependence on expert experience also means that efficacy evaluation of neoadjuvant therapy for rectal cancer may suffer judgment errors and inconsistent decision criteria owing to human factors. What is urgently needed clinically is a tool and method that integrates multi-modal medical data to evaluate the efficacy of neoadjuvant therapy for rectal cancer objectively and consistently.
Disclosure of Invention
The following presents a simplified summary in order to provide a basic understanding of some aspects of the disclosed embodiments. This summary is not an extensive overview, nor is it intended to identify key or critical elements or to delineate the scope of those embodiments; rather, it serves as a prelude to the more detailed description presented later.
The embodiments of the disclosure provide an assessment method, apparatus, device and storage medium for multi-modal medical data fusion, aiming to solve the technical problem that, in the related art, it is difficult for clinicians to accurately assess a patient's degree of disease remission manually, which exposes the patient to high medical risk.
In some embodiments, the disclosed embodiments provide a method for assessing multimodal medical data fusion, comprising:
acquiring medical data to be evaluated of multiple modalities of a target object;
performing feature extraction on the medical data to be evaluated of each modality respectively to obtain a plurality of feature vectors;
fusing the plurality of feature vectors to obtain a fused feature vector;
and inputting the fused feature vector into a pre-trained multi-modal fusion evaluation model to obtain the evaluation result of the multi-modal medical data to be evaluated output by the pre-trained multi-modal fusion evaluation model.
In some embodiments, inputting the fused feature vector into a pre-trained multi-modal fusion evaluation model to obtain the evaluation result of the multi-modal medical data to be evaluated output by the pre-trained multi-modal fusion evaluation model includes:

horizontally splicing all feature vectors in the fused feature vector to obtain a feature vector first matrix W(in), and position-coding W(in) with a first function to obtain a feature vector second matrix W(P), using the formula

$$P(t)=\begin{cases}\sin\!\left(\dfrac{pos}{10000^{2i/d}}\right), & i \text{ even}\\[4pt] \cos\!\left(\dfrac{pos}{10000^{2i/d}}\right), & i \text{ odd}\end{cases}$$

wherein t denotes a subvector in the feature vector first matrix W(in); P(t) denotes the coding result for t; pos denotes the position of the feature vector to which the subvector t belongs; i denotes the index of the subvector t in W(in); and d denotes the horizontal dimension of W(in);

inputting the feature vector second matrix W(P) into a second function and computing a high-dimensional feature representation matrix W(M) over the subspaces with the formula

$$W(M)=\mathrm{CONCAT}\big(F(1),F(2),\ldots,F(i)\big)\,W(\mathrm{in})^{T},\qquad F(x)=\mathrm{softmax}\!\left(\frac{Q(x)K(x)^{T}}{\sqrt{n}}\right)V(x)$$

wherein CONCAT denotes the second function; F(1), F(2), …, F(i) denote the formula F evaluated on the ith feature vector in W(P); W(in)^T denotes the transpose of the feature vector first matrix W(in); x in F(i) denotes the ith feature vector in the input matrix W(P); Q, K and V denote linear perception layers with hidden-layer parameter n of the multi-modal fusion evaluation model; and Q(x) denotes a linear regression of x;

and encoding each feature vector with the encoder of the multi-modal fusion evaluation model, feeding the encoder output W(out) into a linear regression layer that converts W(out) into a low-dimensional feature representation matrix, and finally outputting the evaluation result through a softmax operation.
In some embodiments, acquiring the medical data to be evaluated of multiple modalities of the target object includes at least three of the following:
acquiring a rectal cancer image data set of the target object as first modality data, wherein the rectal cancer image data set at least comprises a macroscopic view image, a near view image and a microscopic view image determined according to a tumor region or regressed tumor region;
acquiring a rectal cancer magnetic resonance image data set of the target object as second modality data, wherein the rectal cancer magnetic resonance image data set comprises initial rectal cancer magnetic resonance image data and target rectal cancer magnetic resonance image data; and labeling the tumor region or regressed tumor region in the initial and target rectal cancer magnetic resonance image data respectively to obtain a plurality of slice images containing the tumor region or regressed tumor region;
acquiring an initial clinical data set and a target clinical data set of the target object as third modality data, wherein the initial clinical data set and the target clinical data set at least comprise personal information and case information of the target object;
and acquiring initial tumor marker information, target tumor marker information, initial blood information and target blood information of the target object as fourth modality data.
In some embodiments, performing feature extraction on the medical data to be evaluated of each modality to obtain a plurality of feature vectors includes:
inputting the first mode data and the second mode data into a pre-trained neural network model respectively;
respectively performing matrix connection of the medical images in the first modality data and the second modality data through a hard-wired layer of the neural network model;
performing convolution calculation and maximum pooling operation on the medical images after matrix connection through alpha three-dimensional convolution modules of the neural network model to extract a high-dimensional feature map;
and converting the high-dimensional feature map extracted by the last three-dimensional convolution module into one-dimensional feature vectors through beta up-sampling modules and a fully connected layer of the neural network model to obtain a first feature vector and a second feature vector respectively.
In some embodiments, performing feature extraction on the medical data to be evaluated of each modality to obtain a plurality of feature vectors includes:
mapping the textual description features in the third modality data and the fourth modality data to corresponding numerical features;
and mapping the numerical features into a two-dimensional matrix to obtain a third feature vector and a fourth feature vector respectively.
In some embodiments, the training process of the neural network model comprises:
inputting acquired preset medical data to be evaluated into a corresponding initial neural network model as training samples, so that the initial neural network model outputs a corresponding initial feature vector;
if the initial feature vector meets a preset requirement, the initial neural network model has been trained successfully, and the pre-trained neural network model is obtained;
if the initial feature vector does not meet the preset requirement, continuing to train the initial neural network model by adjusting the loss parameter in the initial neural network model until the loss converges and reaches a preset loss parameter threshold, thereby obtaining the pre-trained neural network model.
In some embodiments, during training of the multi-modal fusion evaluation model, a cross-entropy loss function is used for parameter back-propagation and updating until the cross-entropy loss converges.
In some embodiments, the disclosed embodiments provide an evaluation apparatus for multi-modal medical data fusion, comprising:
a medical data acquisition module configured to acquire medical data to be evaluated of a plurality of modalities of a target object;
the feature vector extraction module is configured to respectively perform feature extraction on the medical data to be evaluated of each modality to obtain a plurality of feature vectors;
the feature vector fusion module is configured to fuse the plurality of feature vectors to obtain a fused feature vector;
and the multi-modal fusion evaluation module is configured to input the fused feature vector into a pre-trained multi-modal fusion evaluation model to obtain the evaluation result of the multi-modal medical data to be evaluated output by the pre-trained multi-modal fusion evaluation model.
In some embodiments, the disclosed embodiments provide an electronic device, including a processor, a communication interface, a memory and a communication bus, where the processor, the communication interface and the memory communicate with one another through the communication bus;
a memory for storing a computer program;
and the processor is used for realizing the steps of the method when executing the program stored in the memory.
In some embodiments, the disclosed embodiments provide a computer-readable storage medium, on which a computer program is stored, which when executed by a processor, performs the above-described method steps.
The assessment method, apparatus, device and storage medium for multi-modal medical data fusion provided by the embodiments of the disclosure can achieve the following technical effects:
feature extraction is performed on multi-modal medical data based on artificial intelligence to obtain a plurality of feature vectors, which are fused into a fused feature vector; a multi-modal fusion evaluation model trained on such fused feature vectors then predicts the target object's degree of disease remission. This can assist in accurately assessing, at the pathological level, the target object's degree of remission after treatment, improving judgment accuracy and reducing the target object's medical risk.
The foregoing general description and the following description are exemplary and explanatory only and are not restrictive of the application.
Drawings
At least one embodiment is illustrated in the accompanying drawings, which are not limiting; elements bearing the same reference numerals denote similar elements, and the figures are not drawn to scale:
FIG. 1 is a flow chart diagram of an evaluation method for multi-modal medical data fusion provided by an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of feature extraction and data evaluation for multi-modal medical data provided by an embodiment of the present disclosure;
FIG. 3 is a schematic structural diagram of an evaluation apparatus for multi-modal medical data fusion provided in an embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of an electronic device provided in an embodiment of the present disclosure.
Detailed Description
So that the manner in which the features and elements of the disclosed embodiments can be understood in detail, a more particular description of the disclosed embodiments, briefly summarized above, may be had by reference to the embodiments, some of which are illustrated in the appended drawings. In the following description of the technology, for purposes of explanation, numerous details are set forth in order to provide a thorough understanding of the disclosed embodiments. However, at least one embodiment may be practiced without these specific details. In other instances, well-known structures and devices may be shown in simplified form in order to simplify the drawing.
The embodiment of the present disclosure provides an evaluation method for multi-modal medical data fusion, as shown in fig. 1, including the following steps:
s101, acquiring medical data to be evaluated of multiple modalities of a target object.
S102, feature extraction is respectively carried out on the medical data to be evaluated of each mode, and a plurality of feature vectors are obtained.
S103, fusing the plurality of feature vectors to obtain fused feature vectors.
S104, inputting the fused feature vector into the pre-trained multi-modal fusion evaluation model to obtain the evaluation result of the multi-modal medical data to be evaluated output by the pre-trained multi-modal fusion evaluation model.
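In rough Python terms, the flow of S101 to S104 can be sketched as below; every name here is an editorial placeholder rather than an identifier from this application, and the per-modality extractors are assumed to be callables that return feature vectors.

```python
import numpy as np

def assess(modal_data: dict, extractors: dict, fusion_model) -> np.ndarray:
    """Minimal sketch of S102-S104; modal_data is assumed already acquired (S101)."""
    # S102: per-modality feature extraction
    vectors = [extractors[name](data) for name, data in modal_data.items()]
    # S103: fuse by horizontal splicing (concatenation) into one fused vector
    fused = np.concatenate([np.ravel(v) for v in vectors])
    # S104: the pre-trained multi-modal fusion evaluation model yields the result
    return fusion_model(fused)
```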
In some embodiments, acquiring the medical data to be evaluated of multiple modalities of the target object includes at least three of the following:
acquiring a rectal cancer image data set of the target object through an endoscope as first modality data, wherein the rectal cancer image data set at least comprises a macroscopic view image (typically one), a near view image (typically one) and a microscopic view image (typically two) determined according to the tumor region or regressed tumor region. The macroscopic view image is a panoramic image facing the center of the intestinal cavity within a first preset distance interval from the tumor region or regressed tumor region; for example, a panoramic image taken 0.8-20 mm from the tumor region or regressed tumor region, facing the center of the intestinal cavity. The near view image is an image in which the longest boundary of the tumor region or regressed tumor region is smaller than a preset proportion of the field-of-view boundary; for example, an image taken when the longest boundary of the region is smaller than 10% of the field-of-view boundary. The microscopic view image is a local image facing the tumor surface within a preset threshold distance (e.g., within 0.8 mm) of the tumor region or regressed region.
Acquiring a rectal cancer magnetic resonance image data set of the target object as second modality data, wherein the data set comprises initial rectal cancer magnetic resonance image data and target rectal cancer magnetic resonance image data. The tumor region or regressed tumor region in the initial and target magnetic resonance image data may be labeled, automatically or manually, to obtain a plurality of slice images containing the tumor region or regressed tumor region. The initial rectal cancer magnetic resonance image data may be data acquired before the target object receives treatment, and the target rectal cancer magnetic resonance image data may be data acquired after treatment.
Acquiring an initial clinical data set and a target clinical data set of the target object as third modality data, wherein the initial clinical data set and the target clinical data set comprise at least personal information and case information of the target object. The initial clinical data set may be data before treatment and the target clinical data set data after treatment. The personal information may include, but is not limited to, age, height, weight, and the like; the case information may include, but is not limited to, family history of malignant tumor, personal tumor history, treatment protocol, tumor location, degree of tumor differentiation, pre-treatment T stage, pre-treatment N stage, tumor infiltration depth, distance of the tumor from the anal margin, and the like.
Acquiring initial tumor marker information, target tumor marker information, initial blood information and target blood information of the target object as fourth modality data. The initial tumor marker information and initial blood information may be data before treatment, and the target tumor marker information and target blood information data after treatment. Optionally, the tumor marker information may include, but is not limited to, data for carbohydrate antigen 125 (CA125), carbohydrate antigen 153 (CA153), carbohydrate antigen 199 (CA199), carcinoembryonic antigen (CEA) and alpha-fetoprotein (AFP); the blood information may include, but is not limited to, routine blood test data such as red blood cells, hemoglobin, platelets, platelet volume, white blood cells, neutrophils, lymphocytes, monocytes, C-reactive protein, high-sensitivity C-reactive protein, total protein, albumin and prealbumin.
In some embodiments, performing feature extraction on the medical data to be evaluated of each modality to obtain a plurality of feature vectors includes:
inputting the first modality data into a pre-trained first neural network model;
performing matrix connection of the macroscopic view image, the near view image and the microscopic view image through a hard-wired layer of the first neural network model;
performing convolution calculation and maximum pooling on the matrix-connected images through alpha three-dimensional convolution modules of the first neural network model to extract a high-dimensional feature map;
and converting the high-dimensional feature map extracted by the last three-dimensional convolution module into a one-dimensional feature vector through beta up-sampling modules and a fully connected layer of the first neural network model to obtain a first feature vector, where alpha may take the value 7 and beta the value 5.
In some embodiments, performing feature extraction on the medical data to be evaluated of each modality to obtain a plurality of feature vectors includes:
inputting the second modality data into a pre-trained second neural network model;
performing matrix connection of the labeled slice images containing the tumor region or regressed tumor region in the second modality data through a hard-wired layer of the second neural network model;
performing convolution calculation and maximum pooling on the matrix-connected slice images through alpha three-dimensional convolution modules of the second neural network model to extract a high-dimensional feature map;
and converting the high-dimensional feature map extracted by the last three-dimensional convolution module into a one-dimensional feature vector through beta up-sampling modules and a fully connected layer of the second neural network model to obtain a second feature vector, where alpha may take the value 5 and beta the value 3.
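The two extractors differ only in their inputs and in the values of alpha and beta, so a single hedged PyTorch sketch can cover both. The hard-wired layer is modeled here as channel-wise stacking of the input images (endoscopic views for the first model, labeled MRI slices for the second); channel widths, kernel sizes and the global pooling before the fully connected layer are assumptions, as the description fixes only alpha and beta.

```python
import torch
import torch.nn as nn

class VolumeFeatureExtractor(nn.Module):
    """Sketch of the alpha x (3D conv + max pool) -> beta x up-sampling -> FC
    pipeline; channel sizes and kernel choices are assumptions."""
    def __init__(self, in_channels: int = 4, alpha: int = 7, beta: int = 5,
                 feat_dim: int = 256):
        super().__init__()
        convs, ch = [], in_channels
        for k in range(alpha):
            out_ch = min(16 * 2 ** k, 256)
            convs += [nn.Conv3d(ch, out_ch, kernel_size=3, padding=1),
                      nn.ReLU(inplace=True),
                      nn.MaxPool3d(kernel_size=2, ceil_mode=True)]
            ch = out_ch
        self.convs = nn.Sequential(*convs)      # extracts high-dimensional feature map
        ups = []
        for _ in range(beta):
            ups += [nn.Upsample(scale_factor=2, mode="trilinear",
                                align_corners=False),
                    nn.Conv3d(ch, ch, kernel_size=3, padding=1),
                    nn.ReLU(inplace=True)]
        self.ups = nn.Sequential(*ups)          # beta up-sampling modules
        self.fc = nn.Linear(ch, feat_dim)       # fully connected layer

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, images, depth, height, width); the images are stacked
        # channel-wise by the hard-wired layer
        h = self.ups(self.convs(x))
        return self.fc(h.mean(dim=(2, 3, 4)))   # one-dimensional feature vector
```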
In some embodiments, the training process of the first and second neural network models comprises:
inputting acquired preset medical data to be evaluated into the corresponding initial neural network model as training samples, so that the initial neural network model outputs a corresponding initial feature vector;
if the initial feature vector meets a preset requirement, the initial neural network model has been trained successfully, and a pre-trained neural network model is obtained;
if the initial feature vector does not meet the preset requirement, continuing to train the initial neural network model by adjusting the loss parameter in the initial neural network model until the loss converges and reaches a preset loss parameter threshold, thereby obtaining the pre-trained neural network model.
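The "adjust the loss parameter until it converges below a preset threshold" rule might be realized as in the following sketch; the optimizer, learning rate and the MSE objective are assumptions, since no concrete loss is named for this stage.

```python
import torch
import torch.nn as nn

def pretrain_extractor(model: nn.Module, loader, loss_threshold: float = 0.05,
                       max_epochs: int = 100) -> nn.Module:
    """Train until the epoch loss falls below the preset threshold (assumed MSE)."""
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    criterion = nn.MSELoss()
    for _ in range(max_epochs):
        epoch_loss, count = 0.0, 0
        for volumes, targets in loader:
            optimizer.zero_grad()
            loss = criterion(model(volumes), targets)
            loss.backward()
            optimizer.step()
            epoch_loss += loss.item() * volumes.size(0)
            count += volumes.size(0)
        if epoch_loss / count < loss_threshold:   # preset requirement met
            break
    return model
```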
Optionally, the first neural network model and the second neural network model may adopt a three-dimensional convolutional network (3D CNN), which is not limited by the embodiments of the present disclosure.
In some embodiments, performing feature extraction on the medical data to be evaluated of each modality to obtain a plurality of feature vectors includes:
mapping the textual description features in the third modality data and the fourth modality data to corresponding numerical features;
and mapping the numerical features into a two-dimensional matrix to obtain the third feature vector and the fourth feature vector respectively.
Optionally, in the feature extraction process for the third modality data, if the target object has no family history of malignant tumor, this feature is mapped to the number 0; if there is a family history of malignant tumor, it is mapped to the number 1. Other textual description features are mapped to corresponding numerical features in the same way:
personal tumor history (yes 1, no 0), recurrent tumor (yes 1, no 0), neoadjuvant chemotherapy (yes 1, no 0), neoadjuvant radiotherapy (yes 1, no 0), treatment protocol (single-drug 1, double-drug 2, triple-drug 3), tumor location (upper rectum 1, mid rectum 2, lower rectum 3), degree of tumor differentiation (high 1, moderate 2, low 3), and size (0 for up to 1/3 of the intestinal circumference, 1 for up to 2/3, and 2 for the full circumference).
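This mapping can be transcribed directly into a small lookup table; in the sketch below the field and category names are editorial translations of the terms above, not identifiers from this application.

```python
# Text-to-number mapping for the textual description features.
CATEGORICAL_MAPS = {
    "family_history_of_malignancy": {"no": 0, "yes": 1},
    "personal_tumor_history":       {"no": 0, "yes": 1},
    "recurrent_tumor":              {"no": 0, "yes": 1},
    "neoadjuvant_chemotherapy":     {"no": 0, "yes": 1},
    "neoadjuvant_radiotherapy":     {"no": 0, "yes": 1},
    "treatment_protocol":    {"single": 1, "double": 2, "triple": 3},
    "tumor_location":        {"upper rectum": 1, "mid rectum": 2, "lower rectum": 3},
    "tumor_differentiation": {"high": 1, "moderate": 2, "low": 3},
    "size": {"<=1/3 circumference": 0, "<=2/3 circumference": 1,
             "full circumference": 2},
}

def encode_textual_features(record: dict) -> list:
    """Map one clinical record's textual fields to numerical features."""
    return [CATEGORICAL_MAPS[field][value] for field, value in record.items()]
```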
As shown in fig. 2, in some embodiments, inputting the fused feature vector into the pre-trained multi-modal fusion evaluation model to obtain the evaluation result of the multi-modal medical data to be evaluated output by the pre-trained multi-modal fusion evaluation model includes:
horizontally splicing all feature vectors in the fused feature vector to obtain a feature vector first matrix W(in), and position-coding W(in) with a first function to obtain a feature vector second matrix W(P), using the formula

$$P(t)=\begin{cases}\sin\!\left(\dfrac{pos}{10000^{2i/d}}\right), & i \text{ even}\\[4pt] \cos\!\left(\dfrac{pos}{10000^{2i/d}}\right), & i \text{ odd}\end{cases}$$

wherein t denotes a subvector in the feature vector first matrix W(in); P(t) denotes the coding result for t; pos denotes the position of the feature vector to which the subvector t belongs; i denotes the index of the subvector t in W(in); and d denotes the horizontal dimension of W(in).
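Assuming the first function is the standard sinusoidal position coding that these symbol definitions describe, a minimal sketch is:

```python
import numpy as np

def position_encode(w_in: np.ndarray) -> np.ndarray:
    """Add sinusoidal position codes to the horizontally spliced matrix W(in).

    Assumes W(in) has one row per subvector position and d horizontal
    dimensions, matching the symbols pos, i and d defined above.
    """
    positions, d = w_in.shape
    p = np.zeros_like(w_in, dtype=np.float64)
    for pos in range(positions):
        for i in range(d):
            angle = pos / 10000.0 ** (2 * (i // 2) / d)
            p[pos, i] = np.sin(angle) if i % 2 == 0 else np.cos(angle)
    return w_in + p   # feature vector second matrix W(P)
```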
inputting the feature vector second matrix W(P) into a second function and computing a high-dimensional feature representation matrix W(M) over the subspaces with the formula

$$W(M)=\mathrm{CONCAT}\big(F(1),F(2),\ldots,F(i)\big)\,W(\mathrm{in})^{T},\qquad F(x)=\mathrm{softmax}\!\left(\frac{Q(x)K(x)^{T}}{\sqrt{n}}\right)V(x)$$

wherein CONCAT denotes the second function; F(1), F(2), …, F(i) denote the formula F evaluated on the ith feature vector in W(P); W(in)^T denotes the transpose of the feature vector first matrix W(in); x in F(i) denotes the ith feature vector in the input matrix W(P); Q, K and V denote linear perception layers with hidden-layer parameter n of the multi-modal fusion evaluation model; and Q(x) denotes a linear regression of x.
Encoding each feature vector with the encoder of the multi-modal fusion evaluation model, feeding the encoder output W(out) into a linear regression layer that converts W(out) into a low-dimensional feature representation matrix, and finally outputting the evaluation result through a softmax operation. The first, second, third and fourth feature vectors are input into the pre-trained multi-modal fusion evaluation model to complete the decision, finally yielding an evaluation result of complete remission or incomplete remission of the target object's condition, together with the corresponding probabilities, i.e., the probability of complete remission and the probability of incomplete remission.
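A compact PyTorch sketch of this decision stage is given below: per-head Q, K, V linear perception layers, scaled dot-product attention (the formula F), concatenation of the head outputs, a linear regression layer and a final softmax over complete versus incomplete remission. The head count, the pooling over rows and the layer sizes are assumptions not fixed by the description.

```python
import torch
import torch.nn as nn

class FusionEvaluator(nn.Module):
    """Sketch of the second function plus the output head described above."""
    def __init__(self, d: int, n: int, heads: int = 4):
        super().__init__()
        self.n = n
        # Per-head Q, K, V linear perception layers with hidden parameter n
        self.heads = nn.ModuleList(
            nn.ModuleDict({name: nn.Linear(d, n) for name in ("Q", "K", "V")})
            for _ in range(heads))
        self.regress = nn.Linear(heads * n, 2)   # low-dimensional representation

    def forward(self, w_p: torch.Tensor) -> torch.Tensor:
        # w_p: (rows, d), the position-coded feature vector second matrix W(P)
        outs = []
        for head in self.heads:
            q, k, v = head["Q"](w_p), head["K"](w_p), head["V"](w_p)
            att = torch.softmax(q @ k.transpose(0, 1) / self.n ** 0.5, dim=-1)
            outs.append(att @ v)                  # F(x) for this head
        w_m = torch.cat(outs, dim=-1)             # CONCAT(F(1), ..., F(i))
        scores = self.regress(w_m.mean(dim=0))    # pool rows, linear regression
        return torch.softmax(scores, dim=-1)      # [P(complete), P(incomplete)]
```

In use, the four feature vectors would be horizontally spliced into W(in), position-coded with a function like position_encode above to give W(P), and passed to this module to obtain the two remission probabilities.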
Optionally, during training of the multi-modal fusion evaluation model, a cross-entropy loss function is used for parameter back-propagation and updating until the cross-entropy loss converges.
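Training with cross-entropy until convergence might then look as follows; because the evaluator above already ends in a softmax, the sketch applies the negative log-likelihood to its log-probabilities, which amounts to the same cross-entropy objective. The optimizer, learning rate and label encoding are assumptions.

```python
import torch
import torch.nn.functional as nnf

def train_fusion_model(model, samples, epochs: int = 50, lr: float = 1e-4):
    """samples: iterable of (w_p, label) pairs; label is a LongTensor scalar,
    with 0 = complete remission and 1 = incomplete remission (assumed encoding)."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for w_p, label in samples:
            optimizer.zero_grad()
            probs = model(w_p)                             # softmax output
            log_probs = torch.log(probs + 1e-9).unsqueeze(0)
            loss = nnf.nll_loss(log_probs, label.view(1))  # cross-entropy loss
            loss.backward()                                # parameter back-propagation
            optimizer.step()                               # parameter update
    return model
```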
The embodiment of the present disclosure further provides an evaluation apparatus for multi-modal medical data fusion, as shown in fig. 3, including:
a medical data acquisition module 301 configured to acquire medical data to be evaluated of a plurality of modalities of a target object;
a feature vector extraction module 302 configured to perform feature extraction on the medical data to be evaluated of each modality respectively to obtain a plurality of feature vectors;
a feature vector fusion module 303 configured to fuse the plurality of feature vectors to obtain a fused feature vector;
and the multi-modal fusion evaluation module 304 is configured to input the fused feature vector into the pre-trained multi-modal fusion evaluation model to obtain the evaluation result of the multi-modal medical data to be evaluated output by the pre-trained multi-modal fusion evaluation model.
An embodiment of the present disclosure further provides an electronic device, a structure of which is shown in fig. 4, including:
a processor 400 and a memory 401, and may further include a communication interface 402 and a communication bus 403. The processor 400, the communication interface 402 and the memory 401 communicate with one another through the communication bus 403. The communication interface 402 may be used for information transfer. The processor 400 may invoke logic instructions in the memory 401 to perform the multi-modal medical data fusion assessment method of the above embodiments.
In addition, the logic instructions in the memory 401 may be implemented in the form of software functional units and, when sold or used as an independent product, may be stored in a computer-readable storage medium.
The memory 401 is a computer-readable storage medium and can be used for storing software programs, computer-executable programs, such as program instructions/modules corresponding to the methods in the embodiments of the present disclosure. The processor 400 executes the program instructions/modules stored in the memory 401 to execute the functional application and data processing, i.e., to implement the evaluation method of multimodal medical data fusion in the above-described method embodiments.
The memory 401 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the terminal device, and the like. Further, the memory 401 may include a high-speed random access memory, and may also include a nonvolatile memory.
The disclosed embodiment also provides a computer-readable storage medium storing computer-executable instructions configured to execute the above-mentioned multi-modal medical data fusion assessment method.
The disclosed embodiments provide a computer program product comprising a computer program stored on a computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the above-mentioned assessment method for multi-modal medical data fusion.
The computer-readable storage medium described above may be a transitory computer-readable storage medium or a non-transitory computer-readable storage medium.
The embodiments of the disclosure provide an assessment method, apparatus, device and storage medium for multi-modal medical data fusion that use a three-dimensional convolutional network (3D CNN) to fuse multi-view images, extracting fused features from the macroscopic, near and microscopic endoscopic view images of the rectal cancer. Conventional machine-learning prediction models require standardized input data formats and degrade sharply when that requirement is not met. By contrast, the artificial-intelligence-based multi-modal fusion evaluation model provided by this application performs well and, with its self-attention weights and the model's own perception capability, remains relatively robust when part of the data is missing (of the four modality types of this invention, at least three types should be provided). It can output evaluation results quickly and accurately and is closer to clinical usage scenarios. The method can assist in accurately assessing, at the pathological level, the target object's degree of disease remission after treatment, thereby improving judgment accuracy and reducing the target object's medical risk.
The technical solution of the embodiments of the present disclosure may be embodied in the form of a software product stored in a storage medium, including at least one instruction for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods of the embodiments of the present disclosure. The aforementioned storage medium may be a non-transitory storage medium, including: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or various other media capable of storing program code; it may also be a transient storage medium.
The above description and drawings sufficiently illustrate embodiments of the disclosure to enable those skilled in the art to practice them. Other embodiments may incorporate structural, logical, electrical, process, and other changes. The examples merely typify possible variations. Individual components and functions are optional unless explicitly required, and the sequence of operations may vary. Portions and features of some embodiments may be included in or substituted for those of others. The scope of the disclosed embodiments includes the full ambit of the claims, as well as all available equivalents of the claims.

Although the terms "first," "second," etc. may be used in this application to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without changing the meaning of the description, so long as all occurrences of the "first element" are renamed consistently and all occurrences of the "second element" are renamed consistently. The first and second elements are both elements, but may not be the same element.

Furthermore, the words used in the specification are words of description only and are not intended to limit the claims. As used in the description of the embodiments and the claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. Similarly, the term "and/or" as used in this application is meant to encompass any and all possible combinations of one or more of the associated listed items. Furthermore, the terms "comprises" and/or "comprising," when used in this application, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. Without further limitation, an element defined by the phrase "comprising an …" does not exclude the presence of other like elements in a process, method or apparatus that comprises the element.

In this document, each embodiment may be described with emphasis on its differences from other embodiments, and the same and similar parts among the embodiments may be referred to one another. For the methods, products, etc. disclosed in the embodiments, if they correspond to a method section disclosed herein, reference may be made to the description of that method section.
Those of skill in the art would appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software may depend upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the disclosed embodiments. It is clear to those skilled in the art that, for convenience and brevity of description, the working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the embodiments disclosed herein, the disclosed methods, products (including but not limited to devices, apparatuses, etc.) may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, a division of a unit may be merely a division of a logical function, and an actual implementation may have another division, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form. Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to implement the present embodiment. In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises at least one executable instruction for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. In the description corresponding to the flowcharts and block diagrams in the figures, operations or steps corresponding to different blocks may also occur in different orders than disclosed in the description, and sometimes there is no specific order between the different operations or steps. For example, two sequential operations or steps may in fact be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved. Each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

Claims (9)

1. A method for multi-modal medical data fusion assessment, comprising:
acquiring medical data to be evaluated of multiple modalities of a target object;
performing feature extraction on the medical data to be evaluated of each modality respectively to obtain a plurality of feature vectors;
fusing the plurality of feature vectors to obtain a fused feature vector;
inputting the fused feature vector into a pre-trained multi-modal fusion evaluation model to obtain an evaluation result of the multi-modal medical data to be evaluated output by the pre-trained multi-modal fusion evaluation model;
wherein inputting the fused feature vector into the pre-trained multi-modal fusion evaluation model to obtain the evaluation result of the multi-modal medical data to be evaluated output by the pre-trained multi-modal fusion evaluation model comprises:
horizontally splicing all feature vectors in the fused feature vector to obtain a feature vector first matrix W(in), and position-coding W(in) with a first function to obtain a feature vector second matrix W(P), using the formula

$$P(t)=\begin{cases}\sin\!\left(\dfrac{pos}{10000^{2i/d}}\right), & i \text{ even}\\[4pt] \cos\!\left(\dfrac{pos}{10000^{2i/d}}\right), & i \text{ odd}\end{cases}$$

wherein t denotes a subvector in the feature vector first matrix W(in); P(t) denotes the coding result for t; pos denotes the position of the feature vector to which the subvector t belongs; i denotes the index of the subvector t in W(in); and d denotes the horizontal dimension of W(in);

inputting the feature vector second matrix W(P) into a second function and computing a high-dimensional feature representation matrix W(M) over the subspaces with the formula

$$W(M)=\mathrm{CONCAT}\big(F(1),F(2),\ldots,F(i)\big)\,W(\mathrm{in})^{T},\qquad F(x)=\mathrm{softmax}\!\left(\frac{Q(x)K(x)^{T}}{\sqrt{n}}\right)V(x)$$

wherein CONCAT denotes the second function; F(1), F(2), …, F(i) denote the formula F evaluated on the ith feature vector in W(P); W(in)^T denotes the transpose of the feature vector first matrix W(in); x in F(i) denotes the ith feature vector in the input matrix W(P); Q, K and V denote linear perception layers with hidden-layer parameter n of the multi-modal fusion evaluation model; and Q(x) denotes a linear regression of x;

and encoding each feature vector with the encoder of the multi-modal fusion evaluation model, feeding the encoder output W(out) into a linear regression layer that converts W(out) into a low-dimensional feature representation matrix, and finally outputting the evaluation result through a softmax operation.
2. The method according to claim 1, wherein acquiring the medical data to be evaluated of multiple modalities of the target object comprises at least three of the following:
acquiring a rectal cancer image data set of the target object as first modality data, wherein the rectal cancer image data set at least comprises a macroscopic view image, a near view image and a microscopic view image determined according to a tumor region or regressed tumor region;
acquiring a rectal cancer magnetic resonance image data set of the target object as second modality data, wherein the rectal cancer magnetic resonance image data set comprises initial rectal cancer magnetic resonance image data and target rectal cancer magnetic resonance image data; and labeling the tumor region or regressed tumor region in the initial and target rectal cancer magnetic resonance image data respectively to obtain a plurality of slice images containing the tumor region or regressed tumor region;
acquiring an initial clinical data set and a target clinical data set of the target object as third modality data, wherein the initial clinical data set and the target clinical data set at least comprise personal information and case information of the target object;
and acquiring initial tumor marker information, target tumor marker information, initial blood information and target blood information of the target object as fourth modality data.
3. The method according to claim 2, wherein performing feature extraction on the medical data to be evaluated of each modality to obtain a plurality of feature vectors comprises:
inputting the first mode data and the second mode data into a pre-trained neural network model respectively;
respectively performing matrix connection of the medical images in the first modality data and the second modality data through a hard-wired layer of the neural network model;
performing convolution calculation and maximum pooling operation on the medical images after matrix connection through alpha three-dimensional convolution modules of the neural network model to extract a high-dimensional feature map;
and converting the high-dimensional feature map extracted by the last three-dimensional convolution module into one-dimensional feature vectors through beta up-sampling modules and a fully connected layer of the neural network model to obtain a first feature vector and a second feature vector respectively.
4. The method according to claim 2, wherein performing feature extraction on the medical data to be evaluated of each modality to obtain a plurality of feature vectors comprises:
mapping the textual description features in the third modality data and the fourth modality data to corresponding numerical features;
and mapping the numerical features into a two-dimensional matrix to obtain a third feature vector and a fourth feature vector respectively.
5. The method of claim 3, wherein the training process of the neural network model comprises:
inputting acquired preset medical data to be evaluated into the corresponding initial neural network model as training samples, so that the initial neural network model outputs a corresponding initial feature vector;
if the initial feature vector meets a preset requirement, the initial neural network model has been trained successfully, and the pre-trained neural network model is obtained;
if the initial feature vector does not meet the preset requirement, continuing to train the initial neural network model by adjusting the loss parameter in the initial neural network model until the loss converges and reaches a preset loss parameter threshold, thereby obtaining the pre-trained neural network model.
6. The method according to claim 1, wherein, during training of the multi-modal fusion evaluation model, a cross-entropy loss function is used for parameter back-propagation and updating until the cross-entropy loss converges.
7. An apparatus for multi-modal medical data fusion assessment, comprising:
a medical data acquisition module configured to acquire medical data to be evaluated of a plurality of modalities of a target object;
the feature vector extraction module is configured to respectively perform feature extraction on the medical data to be evaluated of each modality to obtain a plurality of feature vectors;
the feature vector fusion module is configured to fuse the plurality of feature vectors to obtain a fused feature vector;
the multi-modal fusion evaluation module is configured to input the fused feature vector into a pre-trained multi-modal fusion evaluation model to obtain the evaluation result of the multi-modal medical data to be evaluated output by the pre-trained multi-modal fusion evaluation model; and is specifically configured to:
horizontally splice all feature vectors in the fused feature vector to obtain a feature vector first matrix W(in), and position-code W(in) with a first function to obtain a feature vector second matrix W(P), using the formula

$$P(t)=\begin{cases}\sin\!\left(\dfrac{pos}{10000^{2i/d}}\right), & i \text{ even}\\[4pt] \cos\!\left(\dfrac{pos}{10000^{2i/d}}\right), & i \text{ odd}\end{cases}$$

wherein t denotes a subvector in the feature vector first matrix W(in); P(t) denotes the coding result for t; pos denotes the position of the feature vector to which the subvector t belongs; i denotes the index of the subvector t in W(in); and d denotes the horizontal dimension of W(in);

input the feature vector second matrix W(P) into a second function and compute a high-dimensional feature representation matrix W(M) over the subspaces with the formula

$$W(M)=\mathrm{CONCAT}\big(F(1),F(2),\ldots,F(i)\big)\,W(\mathrm{in})^{T},\qquad F(x)=\mathrm{softmax}\!\left(\frac{Q(x)K(x)^{T}}{\sqrt{n}}\right)V(x)$$

wherein CONCAT denotes the second function; F(1), F(2), …, F(i) denote the formula F evaluated on the ith feature vector in W(P); W(in)^T denotes the transpose of the feature vector first matrix W(in); x in F(i) denotes the ith feature vector in the input matrix W(P); Q, K and V denote linear perception layers with hidden-layer parameter n of the multi-modal fusion evaluation model; and Q(x) denotes a linear regression of x;

and encode each feature vector with the encoder of the multi-modal fusion evaluation model, feed the encoder output W(out) into a linear regression layer that converts W(out) into a low-dimensional feature representation matrix, and finally output the evaluation result through a softmax operation.
8. An electronic device, comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other through the communication bus;
a memory for storing a computer program;
a processor for implementing the method steps of any one of claims 1 to 6 when executing a program stored on a memory.
9. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method steps of any one of claims 1 to 6.
CN202111454543.7A 2021-12-02 2021-12-02 Multi-modal medical data fusion assessment method, device, equipment and storage medium Active CN113870259B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202111454543.7A CN113870259B (en) 2021-12-02 2021-12-02 Multi-modal medical data fusion assessment method, device, equipment and storage medium
PCT/CN2022/133614 WO2023098524A1 (en) 2021-12-02 2022-11-23 Multi-modal medical data fusion evaluation method and apparatus, device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111454543.7A CN113870259B (en) 2021-12-02 2021-12-02 Multi-modal medical data fusion assessment method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113870259A CN113870259A (en) 2021-12-31
CN113870259B true CN113870259B (en) 2022-04-01

Family

ID=78985439

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111454543.7A Active CN113870259B (en) 2021-12-02 2021-12-02 Multi-modal medical data fusion assessment method, device, equipment and storage medium

Country Status (2)

Country Link
CN (1) CN113870259B (en)
WO (1) WO2023098524A1 (en)

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113870259B (en) * 2021-12-02 2022-04-01 天津御锦人工智能医疗科技有限公司 Multi-modal medical data fusion assessment method, device, equipment and storage medium
CN114358662B (en) * 2022-03-17 2022-09-13 北京闪马智建科技有限公司 Data quality evaluation method and device, storage medium and electronic device
CN114678105B (en) * 2022-03-21 2023-10-17 南京圣德医疗科技有限公司 Method for automatically calculating balloon parameters by combining artificial intelligence technology
CN115068737B (en) * 2022-07-27 2022-11-11 深圳市第二人民医院(深圳市转化医学研究院) Chemotherapy infusion drug dosage control method, device, system and storage medium
CN115130623B (en) * 2022-09-01 2022-11-25 浪潮通信信息系统有限公司 Data fusion method and device, electronic equipment and storage medium
CN115564203B (en) * 2022-09-23 2023-04-25 杭州国辰智企科技有限公司 Equipment real-time performance evaluation system and method based on multidimensional data collaboration
CN115762796A (en) * 2022-09-27 2023-03-07 京东方科技集团股份有限公司 Target model acquisition method, prognosis evaluation value determination method, device, equipment and medium
CN115579130B (en) * 2022-11-10 2023-03-14 中国中医科学院望京医院(中国中医科学院骨伤科研究所) Method, device, equipment and medium for evaluating limb function of patient
CN115830017B (en) * 2023-02-09 2023-07-25 智慧眼科技股份有限公司 Tumor detection system, method, equipment and medium based on image-text multi-mode fusion
CN116524248B (en) * 2023-04-17 2024-02-13 首都医科大学附属北京友谊医院 Medical data processing device, method and classification model training device
CN116228206A (en) * 2023-04-25 2023-06-06 宏景科技股份有限公司 Data center operation and maintenance management method and device, electronic equipment and operation and maintenance management system
CN116313019B (en) * 2023-05-19 2023-08-11 青岛市妇女儿童医院(青岛市妇幼保健院、青岛市残疾儿童医疗康复中心、青岛市新生儿疾病筛查中心) Medical care data processing method and system based on artificial intelligence
CN116630386B (en) * 2023-06-12 2024-02-20 新疆生产建设兵团医院 CTA scanning image processing method and system thereof
CN116452593B (en) * 2023-06-16 2023-09-05 武汉大学中南医院 Method, device and system for constructing AI evaluation model of vascular cognitive disorder
CN117495693A (en) * 2023-10-24 2024-02-02 北京仁馨医疗科技有限公司 Image fusion method, system, medium and electronic device for endoscope
CN117370933A (en) * 2023-10-31 2024-01-09 中国人民解放军总医院 Multi-mode unified feature extraction method, device, equipment and medium
CN117422704A (en) * 2023-11-23 2024-01-19 南华大学附属第一医院 Cancer prediction method, system and equipment based on multi-mode data
CN117350926B (en) * 2023-12-04 2024-02-13 北京航空航天大学合肥创新研究院 Multi-mode data enhancement method based on target weight
CN117391847A (en) * 2023-12-08 2024-01-12 国任财产保险股份有限公司 User risk assessment method and system based on multi-layer and multi-view learning
CN117524501B (en) * 2024-01-04 2024-03-19 长春职业技术学院 Multi-mode medical data analysis system and method based on feature mining
CN117649344A (en) * 2024-01-29 2024-03-05 之江实验室 Magnetic resonance brain image super-resolution reconstruction method, device, equipment and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110533683A (en) * 2019-08-30 2019-12-03 东南大学 A kind of image group analysis method merging traditional characteristic and depth characteristic
CN110727824A (en) * 2019-10-11 2020-01-24 浙江大学 Method for solving question-answering task of object relationship in video by using multiple interaction attention mechanism
CN111079864A (en) * 2019-12-31 2020-04-28 杭州趣维科技有限公司 Short video classification method and system based on optimized video key frame extraction
CN111145718A (en) * 2019-12-30 2020-05-12 中国科学院声学研究所 Chinese mandarin character-voice conversion method based on self-attention mechanism
CN111931795A (en) * 2020-09-25 2020-11-13 湖南大学 Multi-modal emotion recognition method and system based on subspace sparse feature fusion
CN113486395A (en) * 2021-07-02 2021-10-08 南京大学 Scientific research data anonymization method and system adopting multivariate information fusion
CN113657503A (en) * 2021-08-18 2021-11-16 上海交通大学 Malignant liver tumor classification method based on multi-modal data fusion

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10602242B2 (en) * 2017-06-14 2020-03-24 GM Global Technology Operations LLC Apparatus, method and system for multi-mode fusion processing of data of multiple different formats sensed from heterogeneous devices
CN113450294A (en) * 2021-06-07 2021-09-28 刘星宇 Multi-modal medical image registration and fusion method and device and electronic equipment
CN113870259B (en) * 2021-12-02 2022-04-01 天津御锦人工智能医疗科技有限公司 Multi-modal medical data fusion assessment method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN113870259A (en) 2021-12-31
WO2023098524A1 (en) 2023-06-08

Similar Documents

Publication Publication Date Title
CN113870259B (en) Multi-modal medical data fusion assessment method, device, equipment and storage medium
Afshar et al. From handcrafted to deep-learning-based cancer radiomics: challenges and opportunities
Mei et al. SANet: A slice-aware network for pulmonary nodule detection
CN112768072B (en) Cancer clinical index evaluation system constructed based on imaging omics qualitative algorithm
Christ et al. SurvivalNet: Predicting patient survival from diffusion weighted magnetic resonance images using cascaded fully convolutional and 3D convolutional neural networks
Nomani et al. PSOWNNs-CNN: a computational radiology for breast cancer diagnosis improvement based on image processing using machine learning methods
KR20190105460A (en) Apparatus and Method for Generating Medical Diagonosis Report
CN112561869B (en) Pancreatic neuroendocrine tumor postoperative recurrence risk prediction method
Aonpong et al. Genotype-guided radiomics signatures for recurrence prediction of non-small cell lung cancer
CN112927246A (en) Lung contour segmentation and tumor immune infiltration classification system and method
Xuan et al. Prenatal prediction and typing of placental invasion using MRI deep and radiomic features
Szentimrey et al. Automated 3D U‐net based segmentation of neonatal cerebral ventricles from 3D ultrasound images
Kuang et al. Towards simultaneous segmentation of liver tumors and intrahepatic vessels via cross-attention mechanism
WO2022059315A1 (en) Image encoding device, method and program, image decoding device, method and program, image processing device, learning device, method and program, and similar image search device, method and program
Hakim et al. Microcalcification detection in mammography image using computer-aided detection based on convolutional neural network
Oh et al. PET-based deep-learning model for predicting prognosis of patients with non-small cell lung cancer
Wang et al. Triplanar convolutional neural network for automatic liver and tumor image segmentation
Peng et al. Predicting distant metastases in soft-tissue sarcomas from PET-CT scans using constrained hierarchical multi-modality feature learning
Affan et al. Detecting multi-class kidney abnormalities using Deep learning
Ibrahim et al. Liver Multi-class Tumour Segmentation and Detection Based on Hyperion Pre-trained Models.
Keikha et al. Breast Cancer Detection Using Deep Multilayer Neural Networks
Chen et al. Deeply supervised vestibule segmentation network for CT images with global context‐aware pyramid feature extraction
Mahmoodian et al. Segmentation of Living and ablated Tumor parts in CT images Using ResLU-Net
CN112884759B (en) Method and related device for detecting metastasis state of axillary lymph nodes of breast cancer
CN117524501B (en) Multi-mode medical data analysis system and method based on feature mining

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant