CN117370933A - Multi-mode unified feature extraction method, device, equipment and medium

Info

Publication number
CN117370933A
Authority
CN
China
Prior art keywords
feature extraction
mode
feature
medical data
data
Prior art date
Legal status
Granted
Application number
CN202311434500.1A
Other languages
Chinese (zh)
Other versions
CN117370933B (en)
Inventor
何昆仑 (He Kunlun)
赵亚威 (Zhao Yawei)
柳青河 (Liu Qinghe)
Current Assignee
Chinese PLA General Hospital
Original Assignee
Chinese PLA General Hospital
Priority date
Filing date
Publication date
Application filed by Chinese PLA General Hospital
Priority to CN202311434500.1A
Publication of CN117370933A
Application granted
Publication of CN117370933B
Legal status: Active

Classifications

    • G06F18/253 — Pattern recognition; analysing; fusion techniques of extracted features
    • G06F18/213 — Pattern recognition; analysing; feature extraction, e.g. by transforming the feature space; summarisation; mappings, e.g. subspace methods
    • G06F18/24 — Pattern recognition; analysing; classification techniques
    • G16H50/70 — ICT specially adapted for medical diagnosis, medical simulation or medical data mining; for mining of medical data, e.g. analysing previous cases of other patients

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

The invention provides a multi-mode unified feature extraction method, device, equipment and medium, wherein the multi-mode unified feature extraction method comprises the following steps: acquiring multi-modal target medical data; extracting the target medical data with a feature extraction model selected according to the modality of the target medical data to obtain a first feature extraction result; complementing the missing features through the first feature extraction result to obtain a second feature extraction result; fusing the second feature extraction results to obtain fusion features; and identifying the fusion features to obtain a medical feature identification result of the target medical data. The beneficial effects of the invention are as follows: the multi-modal data are projected through a large model by expert-knowledge prompt engineering, the large model is induced to generate the corresponding features, and cyclic aggregation is performed, so that data of different modalities are fused into a unified representation, fusion and complementation of the medical multi-modal data are realized, and the stability and accuracy of feature extraction and identification of medical data are improved.

Description

Multi-mode unified feature extraction method, device, equipment and medium
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a method, an apparatus, a device, and a medium for extracting multi-mode unified features.
Background
Most multi-modal intelligent learning algorithms extract multi-modal data into a unified feature representation through a dedicated neural network and then execute downstream tasks on that representation. However, medical data typically exist in multiple modalities, and the absence of a data modality can degrade the accuracy of medical data identification.
Disclosure of Invention
The embodiment of the invention mainly aims to provide a multi-mode unified feature extraction method, device, equipment and medium, which realize fusion and complementation among medical multi-mode data and improve the stability and accuracy of feature extraction and identification of the medical data.
One aspect of the present invention provides a method for extracting multi-modal unified features, including:
acquiring target medical data according to a medical feature extraction request, wherein the target medical data is multi-mode data, and the modes of the target medical data comprise multimedia data, text data and structured data;
according to the mode of the target medical data, extracting the target medical data by adopting a feature extraction model to obtain a first feature extraction result;
determining missing features in the target medical data according to the first feature extraction result, and complementing the missing features through the first feature extraction result to obtain a second feature extraction result;
fusing the second feature extraction results to obtain fusion features, wherein the fusion features represent the unified characterization of the target medical data;
and identifying the fusion characteristics to obtain a medical characteristic identification result of the target medical data.
The method for extracting the multi-mode unified feature according to the embodiment, wherein extracting the target medical data by using a feature extraction model according to the mode of the target medical data to obtain a first feature extraction result comprises the following steps:
and identifying and classifying the modes of the target medical data, and selecting a corresponding type of feature extraction model to perform feature extraction based on each classification to obtain a first feature extraction result of each mode.
The multi-mode unified feature extraction method according to the present invention, wherein determining missing features in the target medical data according to the first feature extraction result to obtain a second feature extraction result includes:
inducing, from the first feature extraction result of a first modality and the first feature extraction results of the modalities other than the first modality, an induced feature of the feature extraction model, wherein the first modality is any one modality of the multi-modal data;
if the first feature extraction result of the first modality cannot characterize the induced feature, complementing the induced feature of the first modality with the first feature extraction results of the other modalities, and determining the second feature extraction result; and determining the second feature extraction result according to the comparison of the first feature extraction result of the first modality and the first feature extraction results of the modalities other than the first modality with the induced feature.
The multi-mode unified feature extraction method according to the present invention, wherein determining the second feature extraction result according to the comparison of the first feature extraction result of the first modality and the first feature extraction results of the modalities other than the first modality with the induced feature includes:
if the first feature extraction result of the first modality is inconsistent with at least one of the first feature extraction results of the other modalities, and the first feature extraction results of the other modalities are inconsistent with one another, taking the first feature extraction result of the first modality as the second feature extraction result;
and if the first feature extraction result of the first modality is inconsistent with at least one of the first feature extraction results of the other modalities, and the first feature extraction results of the other modalities are consistent with one another, taking the first feature extraction results of the other modalities as the second feature extraction result.
The multi-mode unified feature extraction method according to the present invention, wherein the method further comprises:
determining a modality-missing prompt library and a question-answer library according to at least one of disease guidelines, treatment principles and expert discussions, wherein the modality-missing prompt library is used for detecting whether a modality is missing, and the question-answer library is used for determining the modality of medical data;
detecting the medical data by means of the question-answer library to determine a first modality, acquiring the feature vector of the first modality and the feature vectors of the other modalities, and complementing the feature vector of the first modality, when that modality is missing, by means of the modality-missing prompt library and the feature vectors of the other modalities.
According to the multi-mode unified feature extraction method, fusing the second feature extraction results to obtain the fusion features comprises:
performing cyclic aggregation on the first feature extraction results of all first modalities to obtain second feature extraction results that no longer change, wherein the cyclic aggregation serves to complement and associate the representative features;
and fusing the second feature extraction results that no longer change across all first modalities to obtain the fusion features.
Another aspect of the embodiments of the present invention provides a multi-mode unified feature extraction device, including:
the first module is used for acquiring target medical data according to the medical feature extraction request, wherein the target medical data is multi-mode data, and the modes of the target medical data comprise multimedia data, text data and structured data;
the second module is used for extracting the target medical data by adopting a feature extraction model according to the mode of the target medical data to obtain a first feature extraction result;
a third module, configured to determine a missing feature in the target medical data according to the first feature extraction result, and complement the missing feature through the first feature extraction result to obtain a second feature extraction result;
a fourth module, configured to fuse the second feature result to obtain a fusion feature, where the fusion feature represents a unified representation of the target medical data;
and a fifth module, configured to identify the fusion feature, and obtain a medical feature identification result of the target medical data.
Another aspect of an embodiment of the present invention provides an electronic device, including a processor and a memory;
the memory is used for storing programs;
the processor executes the program to implement the method as described above.
Embodiments of the present invention also disclose a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. A processor of a computer device may read the computer instructions from the computer-readable storage medium and execute them, causing the computer device to perform the method described previously.
The beneficial effects of the invention are as follows: clinical data covering multiple modalities such as images, text and structured data are projected through a large model by expert-knowledge prompt engineering, the large model is induced to generate the corresponding features, and cyclic aggregation is performed, so that data of different modalities are fused into a unified representation, fusion and complementation among medical multi-modal data are realized, and the stability and accuracy of feature extraction and identification of medical data are improved.
Drawings
The foregoing and/or additional aspects and advantages of the invention will become apparent and may be better understood from the following description of embodiments taken in conjunction with the accompanying drawings in which:
FIG. 1 is a schematic diagram of a multimodal unified feature extraction system in accordance with an embodiment of the invention.
FIG. 2 is a flow chart of a method for multimodal unified feature extraction in accordance with an embodiment of the invention.
FIG. 3 is a schematic diagram of a missing feature completion and association flow in an embodiment of the invention.
Fig. 4 is a schematic diagram of a multi-modal feature unified feature extraction flow based on a modal missing library and a question-answer library according to an embodiment of the present invention.
FIG. 5 is a schematic diagram of a multi-modal feature fusion process according to an embodiment of the invention.
FIG. 6 is a diagram illustrating multi-modal unified feature extraction in accordance with an embodiment of the invention.
FIG. 7 is a schematic diagram of multi-modal unified feature extraction in the absence of a modality according to an embodiment of the present invention.
FIG. 8 is a schematic diagram of an apparatus for multimodal unified feature extraction in accordance with an embodiment of the invention.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or to elements having like or similar functions throughout. In the following description, suffixes such as "module", "component" or "unit" used to denote elements serve only to facilitate the description of the present invention and have no particular meaning in themselves; thus "module", "component" and "unit" may be used interchangeably. "First", "second" and the like are used only to distinguish technical features and are not to be construed as indicating or implying relative importance, the number of the indicated technical features, or their precedence. In the following description, consecutive numbering of the method steps is used to facilitate examination and understanding; provided the overall technical scheme and the logical relations among the steps are respected, adjusting the order of implementation of the steps does not affect the technical effects achieved. The embodiments described below by reference to the drawings are illustrative only and are not to be construed as limiting the invention.
Referring to fig. 1, fig. 1 is a schematic diagram of a multi-mode unified feature extraction system according to an embodiment of the present invention, which includes a terminal 100 and a server 200. The terminal 100 includes medical terminals, medical devices and the like, and is configured to generate medical data of different modalities, such as image data, text data and structured data. The server 200 obtains the multi-modal data and a feature extraction or identification request from the terminal 100, and acquires target medical data according to the medical feature extraction request, wherein the target medical data is multi-modal data whose modalities include multimedia data, text data and structured data; extracts the target medical data with a feature extraction model according to the modality of the target medical data to obtain a first feature extraction result; determines missing features in the target medical data according to the first feature extraction result, and complements the missing features through the first feature extraction result to obtain a second feature extraction result; fuses the second feature extraction results to obtain fusion features, wherein the fusion features represent the unified characterization of the target medical data; and identifies the fusion features to obtain a medical feature identification result of the target medical data.
Referring to fig. 2, fig. 2 is a flowchart of a multi-mode unified feature extraction method according to an embodiment of the invention, which may include, but is not limited to, steps S100-S500:
s100, acquiring target medical data according to a medical feature extraction request, wherein the target medical data is multi-mode data, and the modes of the target medical data comprise multimedia data, text data and structured data.
And S200, extracting the target medical data by adopting a feature extraction model according to the mode of the target medical data to obtain a first feature extraction result.
In some embodiments, the modalities of the target medical data are identified and classified, and based on each classification, a feature extraction model of a corresponding type is selected for feature extraction, so as to obtain a first feature extraction result of each modality.
In some embodiments, the feature extraction model is a large model, that is, a model capable of processing complex, large-scale data and completing a variety of complex tasks.
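To make step S200 concrete, the following Python sketch shows one way the modality classification and per-modality model selection could be wired together; the record key names, the classify_modality heuristic and the stub extractor callables are illustrative assumptions, not part of the patent text.

```python
from typing import Any, Callable, Dict, List

def classify_modality(record: Dict[str, Any]) -> str:
    """Crude modality classifier; the key names are hypothetical."""
    if "pixels" in record:
        return "image"
    if "report_text" in record:
        return "text"
    return "structured"

def extract_first_results(
    records: List[Dict[str, Any]],
    extractors: Dict[str, Callable[[Dict[str, Any]], Dict[str, str]]],
) -> Dict[str, Dict[str, str]]:
    """Step S200: select the feature extraction model matching each record's
    modality and collect the first feature extraction result per modality."""
    first_results: Dict[str, Dict[str, str]] = {}
    for record in records:
        modality = classify_modality(record)
        first_results[modality] = extractors[modality](record)
    return first_results

# Toy usage with stubs standing in for the per-modality large models.
extractors = {
    "image": lambda r: {"lobe_status": "consolidation, right lower lobe"},
    "text": lambda r: {"symptoms": "fever, cough"},
    "structured": lambda r: {"inflammatory_status": "WBC elevated"},
}
records = [{"pixels": b"..."}, {"report_text": "..."}, {"wbc": 13.2}]
print(extract_first_results(records, extractors))
```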
S300, determining missing features in the target medical data according to the first feature extraction result, and complementing the missing features through the first feature extraction result to obtain a second feature extraction result.
In some embodiments, reference is made to the missing feature completion and association flow diagram shown in fig. 3, which includes, but is not limited to, the steps of:
s310, inducing a first feature extraction result of a first mode and a first feature extraction result corresponding to the first mode, so as to obtain a feature extraction model induction feature, wherein the first mode is any one of multi-mode data.
S320, if the first feature extraction result of the first mode cannot characterize the induced feature, complementing the induced feature of the first mode with the first feature extraction result except the first mode, and determining a second feature extraction result; and determining a second feature extraction result according to the first feature extraction result of the first modality, the first feature extraction result except for the first modality accident, and the comparison result of the induced features.
In some embodiments, taking the multi-modal data of lobar pneumonia as an example, in the feature extraction of the text modality, the case data are input together with the already extracted CT image features and routine blood features, inducing the large model to output the following features: "sex, age, symptoms, signs".
The following determination is then made: if the case data are insufficient to embody these features, the features are complemented through the extracted CT image features and routine blood features.
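A minimal sketch of this induction step follows; `llm` is a placeholder callable (prompt in, text out) rather than any specific model API, and the prompt wording is an assumption made for illustration.

```python
from typing import Callable, Dict

def induce_features(
    llm: Callable[[str], str],
    target_modality: str,
    first_results: Dict[str, str],
) -> str:
    """Prompt the large model with the target modality's first result plus the
    other modalities' results, inducing it to output features such as
    'sex, age, symptoms, signs' and to complete what the target lacks."""
    own = first_results.get(target_modality, "(missing)")
    others = {m: r for m, r in first_results.items() if m != target_modality}
    prompt = (
        f"Features extracted from the {target_modality} data: {own}\n"
        f"Features extracted from the other modalities: {others}\n"
        "Summarise the patient's sex, age, symptoms and signs. Where the "
        f"{target_modality} data are insufficient, complete the missing "
        "features from the other modalities."
    )
    return llm(prompt)

# Toy usage with an echo stub in place of a real large model.
print(induce_features(lambda p: p[:60] + "...", "text",
                      {"text": "male, 62", "image": "lobar consolidation"}))
```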
In some embodiments, if the first feature extraction result of the first modality is inconsistent with at least one of the first feature extraction results of the other modalities, and the first feature extraction results of the other modalities are inconsistent with one another, the first feature extraction result of the first modality is used as the second feature extraction result;
and if the first feature extraction result of the first modality is inconsistent with at least one of the first feature extraction results of the other modalities, and the first feature extraction results of the other modalities are consistent with one another, the first feature extraction results of the other modalities are used as the second feature extraction result.
Illustratively, taking the multi-modal data of lobar pneumonia as an example:
if the feature result extracted from the case data is inconsistent with the feature results induced from the CT image features and the routine blood features, and the CT-derived and blood-derived results are also inconsistent with each other, the feature result extracted from the case data is taken as the second feature extraction result;
if the feature result extracted from the case data is inconsistent with the feature results induced from the CT image features and the routine blood features, but the CT-derived and blood-derived results are consistent with each other, that consistent result is taken as the second feature extraction result.
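Expressed as code, the two rules reduce to a small consistency vote. This is a sketch of the comparison logic only, with string equality standing in for whatever feature-consistency test the embodiment actually uses.

```python
from typing import List

def resolve_second_result(own: str, others: List[str]) -> str:
    """Pick the second feature extraction result for one modality.
    `own` is the result extracted from that modality's data (e.g. case data);
    `others` are the results induced from the remaining modalities
    (e.g. CT image features and routine blood features)."""
    if own in others:
        return own  # at least one other modality agrees; no conflict
    others_agree = len(set(others)) == 1
    # Others agree with each other but not with us -> adopt their result;
    # others also disagree among themselves -> keep our own result.
    return others[0] if others_agree else own

print(resolve_second_result("age 62", ["age 65", "age 65"]))  # -> age 65
print(resolve_second_result("age 62", ["age 65", "age 70"]))  # -> age 62
```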
In some embodiments, referring to the multi-modal unified feature extraction flow diagram based on the modality-missing library and the question-answer library shown in fig. 4, the flow includes, but is not limited to, steps S330 to S340:
s330, determining a mode missing prompt library and a question-answer library according to at least one of disease guidelines, treatment principles and expert discussions, wherein the mode missing prompt library is used for detecting whether a mode is true, and the question-answer library is used for determining the mode of medical data;
s340, detecting the medical data by adopting a question-answer library, determining a first mode, acquiring a feature vector of the first mode and a feature vector except the first mode, and complementing the feature vector with the mode missing first mode by the mode missing prompt library and the feature vector except the first mode.
Illustratively, a question-answer library Q_i and a modality-missing prompt library L_i are constructed based on prior knowledge such as disease guidelines, treatment principles and expert discussions, and the feature vectors under the different modalities (written here as h_i^0) are randomly initialized. At the t-th cycle, the large model N_i corresponding to the modality of Q_i is selected, and the prompt-engineering data Q_i, together with the other modalities' feature vectors h_j^{t-1}, are extracted into the feature vector h_i^t. If a modality is missing, the large model instead invokes the missing-prompt library L_i and the other modalities' feature vectors to extract h_i^t. The cycle is repeated until the features converge, and the converged features are fused into a feature matrix W for downstream tasks. Here Q_i denotes the question-answer library constructed from prior knowledge such as disease guidelines, treatment principles and expert discussions; L_i denotes the modality-missing prompt library; i denotes a given modality; h_i^t denotes the features extracted from the modality-i data after t rounds of aggregation; and W denotes the feature matrix obtained by concatenating all modality features h_i.
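The loop just described can be sketched in Python as follows; the vector dimension, the convergence test (element-wise closeness) and the model call signatures are all assumptions made for illustration, with stub callables standing in for the per-modality large models N_i.

```python
from typing import Any, Callable, Dict
import numpy as np

def cyclic_aggregation(
    models: Dict[str, Callable[..., np.ndarray]],   # stand-ins for N_i
    qa_library: Dict[str, str],                     # Q_i per modality
    miss_library: Dict[str, str],                   # L_i per modality
    data: Dict[str, Any],                           # raw data, None if missing
    max_rounds: int = 10,
) -> np.ndarray:
    modalities = list(models)
    h = {i: np.random.randn(64) for i in modalities}  # random init, h_i^0
    for _ in range(max_rounds):
        h_new = {}
        for i in modalities:
            others = [h[j] for j in modalities if j != i]  # h_j^(t-1)
            if data.get(i) is None:
                # modality i is missing: fall back on the missing-prompt library L_i
                h_new[i] = models[i](miss_library[i], others)
            else:
                h_new[i] = models[i](qa_library[i], others, data[i])
        converged = all(np.allclose(h[i], h_new[i]) for i in modalities)
        h = h_new
        if converged:
            break
    return np.concatenate([h[i] for i in modalities])  # feature matrix W
```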
S400, fusing the second characteristic results to obtain fusion characteristics, wherein the fusion characteristics represent unified characterization of the target medical data.
In some embodiments, referring to the feature fusion flow diagram shown in fig. 5, the flow includes, but is not limited to, steps S410-S420:
S410, performing cyclic aggregation on the first feature extraction results of all first modalities to obtain second feature extraction results that no longer change, wherein the cyclic aggregation serves to complement and associate the representative features;
S420, fusing the second feature extraction results that no longer change across all first modalities to obtain the fusion features.
In some embodiments, with reference to the missing-feature complementation and association flow diagram shown in fig. 3 and taking the multi-modal data of lobar pneumonia as an example, the processing flow includes: performing image-modality feature extraction, in which the CT images, together with the already extracted case data and routine blood features, induce the large model to output the following features: "lung lobe status, mediastinal status, main bronchus status, skeletal status, neurological status", and performing the association and complementation of lobar pneumonia features; performing structured-data feature extraction, in which the routine blood data, together with the extracted case data and CT image features, induce the large model to output the following features: "inflammatory status, coagulation status, immune status", and performing the association and complementation of lobar pneumonia features; cyclically processing the data of the three modalities until the output features no longer change; and fusing the features that no longer change to obtain the unified features for downstream tasks.
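Wiring the lobar pneumonia example through the `cyclic_aggregation` sketch above might look like this (it reuses that function and numpy); the stub models just return fixed vectors, purely to show the control flow when the structured (routine blood) modality is missing.

```python
import numpy as np

# Stub "large models": ignore their inputs and return a fixed vector, so the
# loop converges on the second round. Real models would encode prompts + data.
stub = lambda *args: np.ones(64)
models = {"text": stub, "image": stub, "structured": stub}
qa_lib = {m: f"Q_{m}" for m in models}      # question-answer library entries
miss_lib = {m: f"L_{m}" for m in models}    # modality-missing prompt entries
data = {"text": "case record", "image": "CT series", "structured": None}

W = cyclic_aggregation(models, qa_lib, miss_lib, data)
print(W.shape)  # (192,) -- unified representation for the downstream task
```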
S500, identifying the fusion characteristics to obtain medical characteristic identification results of the target medical data.
In some embodiments, the fusion features characterize the fused lobar pneumonia features of a case, such as sex, age, symptoms, signs, lung lobe status, mediastinal status, main bronchus status, skeletal status, neurological status, inflammatory status, coagulation status, immune status, and the like.
In some embodiments, referring to the multi-mode unified feature extraction schematic diagram shown in fig. 6, the clinical data in this embodiment include data of multiple modalities such as images, text and structured data, and expert-knowledge prompt engineering induces the large model to generate the corresponding features. The generated feature vector and the data of another modality are then input into the large model to generate, under that modality, features associated with the previous modality, and this is repeated cyclically until the features converge.
In some embodiments, referring to the schematic diagram of multi-mode unified feature extraction in the absence of a modality shown in fig. 7, this embodiment addresses the problem of modality missing: through the association with the other modalities' features, the large model is induced to complement the features of the missing modality.
FIG. 8 is a diagram of a multi-mode unified feature extraction device in accordance with an embodiment of the invention. The device includes a first module 810, a second module 820, a third module 830, a fourth module 840 and a fifth module 850.
The first module is used for acquiring target medical data according to the medical feature extraction request, wherein the target medical data is multi-mode data, and the modes of the target medical data comprise multimedia data, text data and structured data; the second module is used for extracting the target medical data by adopting a feature extraction model according to the mode of the target medical data to obtain a first feature extraction result; the third module is used for determining missing features in the target medical data according to the first feature extraction result, and complementing the missing features through the first feature extraction result to obtain a second feature extraction result; a fourth module, configured to fuse the second feature result to obtain a fused feature, where the fused feature represents a unified representation of the target medical data; and a fifth module for identifying the fusion characteristics to obtain the medical characteristic identification result of the target medical data.
The device according to this embodiment, through the cooperation of the first module, the second module, the third module, the fourth module and the fifth module, can implement any of the foregoing multi-mode unified feature extraction methods, that is: acquiring target medical data according to a medical feature extraction request, wherein the target medical data is multi-modal data whose modalities include multimedia data, text data and structured data; extracting the target medical data with a feature extraction model according to the modality of the target medical data to obtain a first feature extraction result; determining missing features in the target medical data according to the first feature extraction result, and complementing the missing features through the first feature extraction result to obtain a second feature extraction result; fusing the second feature extraction results to obtain fusion features, wherein the fusion features represent the unified characterization of the target medical data; and identifying the fusion features to obtain a medical feature identification result of the target medical data. The beneficial effects of the invention are as follows: clinical data covering multiple modalities such as images, text and structured data are projected through a large model by expert-knowledge prompt engineering, the large model is induced to generate the corresponding features, and cyclic aggregation is performed, so that data of different modalities are fused into a unified representation, fusion and complementation among medical multi-modal data are realized, and the stability and accuracy of feature extraction and identification of medical data are improved.
The embodiment of the invention also provides electronic equipment, which comprises a processor and a memory;
the memory stores a program;
the processor executes the program to implement the multi-mode unified feature extraction method described above; the electronic device has the functionality of hosting and running the software system for multi-mode unified feature extraction provided by the embodiments of the invention, and may be, for example, a personal computer, a minicomputer, a mainframe, a workstation, a network or distributed computing environment, a separate or integrated computer platform, or a platform in communication with a charged particle tool or other imaging device.
Embodiments of the present invention also provide a computer-readable storage medium storing a program that is executed by a processor to implement a multi-modal unified feature extraction method as described above.
In some alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flowcharts of the present invention are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed, and in which sub-operations described as part of a larger operation are performed independently.
Embodiments of the present invention also disclose a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. A processor of a computer device may read the computer instructions from the computer-readable storage medium and execute them, causing the computer device to perform the multi-mode unified feature extraction method described previously.
Furthermore, while the invention is described in the context of functional modules, it should be appreciated that, unless otherwise indicated, one or more of the described functions and/or features may be integrated in a single physical device and/or software module or one or more functions and/or features may be implemented in separate physical devices or software modules. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary to an understanding of the present invention. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be apparent to those skilled in the art from consideration of their attributes, functions and internal relationships. Accordingly, one of ordinary skill in the art can implement the invention as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative and are not intended to be limiting upon the scope of the invention, which is to be defined in the appended claims and their full scope of equivalents.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
Logic and/or steps represented in the flowcharts or otherwise described herein, for example an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus or device, such as a computer-based system, a processor-containing system, or another system that can fetch the instructions from the instruction execution system, apparatus or device and execute them. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate or transport the program for use by or in connection with the instruction execution system, apparatus or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). In addition, the computer readable medium may even be paper or other suitable medium on which the program is printed, as the program may be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory.
It is to be understood that portions of the present invention may be implemented in hardware, software, firmware or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, they may be implemented using any one or a combination of the following techniques well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and the like.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the spirit and principles of the invention, the scope of which is defined by the claims and their equivalents.
While the preferred embodiment of the present invention has been described in detail, the present invention is not limited to the embodiments described above, and those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit of the present invention, and these equivalent modifications or substitutions are included in the scope of the present invention as defined in the appended claims.

Claims (10)

1. A method for multi-modal unified feature extraction, comprising:
acquiring target medical data according to a medical feature extraction request, wherein the target medical data is multi-mode data, and the modes of the target medical data comprise multimedia data, text data and structured data;
according to the mode of the target medical data, extracting the target medical data by adopting a feature extraction model to obtain a first feature extraction result;
determining missing features in the target medical data according to the first feature extraction result, and complementing the missing features through the first feature extraction result to obtain a second feature extraction result;
fusing the second feature extraction results to obtain fusion features, wherein the fusion features represent the unified characterization of the target medical data;
and identifying the fusion characteristics to obtain a medical characteristic identification result of the target medical data.
2. The method of claim 1, wherein extracting the target medical data using a feature extraction model according to the modality of the target medical data to obtain a first feature extraction result comprises:
and identifying and classifying the modes of the target medical data, and selecting a corresponding type of feature extraction model to perform feature extraction based on each classification to obtain a first feature extraction result of each mode.
3. The method of claim 2, wherein determining missing features in the target medical data according to the first feature extraction result, to obtain a second feature extraction result, comprises:
inducing, from the first feature extraction result of a first modality and the first feature extraction results of the modalities other than the first modality, an induced feature of the feature extraction model, wherein the first modality is any one modality of the multi-modal data;
if the first feature extraction result of the first modality cannot characterize the induced feature, complementing the induced feature of the first modality with the first feature extraction results of the other modalities, and determining the second feature extraction result; and determining the second feature extraction result according to the comparison of the first feature extraction result of the first modality and the first feature extraction results of the modalities other than the first modality with the induced feature.
4. The method of claim 3, wherein determining the second feature extraction result according to the comparison of the first feature extraction result of the first modality and the first feature extraction results of the modalities other than the first modality with the induced feature comprises:
if the first feature extraction result of the first modality is inconsistent with at least one of the first feature extraction results of the other modalities, and the first feature extraction results of the other modalities are inconsistent with one another, taking the first feature extraction result of the first modality as the second feature extraction result;
and if the first feature extraction result of the first modality is inconsistent with at least one of the first feature extraction results of the other modalities, and the first feature extraction results of the other modalities are consistent with one another, taking the first feature extraction results of the other modalities as the second feature extraction result.
5. The method of claim 3, further comprising:
determining a modality-missing prompt library and a question-answer library according to at least one of disease guidelines, treatment principles and expert discussions, wherein the modality-missing prompt library is used for detecting whether a modality is missing, and the question-answer library is used for determining the modality of medical data;
detecting the medical data by means of the question-answer library to determine a first modality, acquiring the feature vector of the first modality and the feature vectors of the other modalities, and complementing the feature vector of the first modality, when that modality is missing, by means of the modality-missing prompt library and the feature vectors of the other modalities.
6. The method of claim 3, wherein fusing the second feature extraction results to obtain the fusion features comprises:
performing cyclic aggregation on the first feature extraction results of all first modalities to obtain second feature extraction results that no longer change, wherein the cyclic aggregation serves to complement and associate the representative features;
and fusing the second feature extraction results that no longer change across all first modalities to obtain the fusion features.
7. The method of claim 1, wherein the feature extraction model is a large model.
8. A multi-modal unified feature extraction apparatus, comprising:
the first module is used for acquiring target medical data according to the medical feature extraction request, wherein the target medical data is multi-mode data, and the modes of the target medical data comprise multimedia data, text data and structured data;
the second module is used for extracting the target medical data by adopting a feature extraction model according to the mode of the target medical data to obtain a first feature extraction result;
a third module, configured to determine a missing feature in the target medical data according to the first feature extraction result, and complement the missing feature through the first feature extraction result to obtain a second feature extraction result;
a fourth module, configured to fuse the second feature result to obtain a fusion feature, where the fusion feature represents a unified representation of the target medical data;
and a fifth module, configured to identify the fusion feature, and obtain a medical feature identification result of the target medical data.
9. An electronic device comprising a processor and a memory;
the memory is used for storing programs;
execution of the program by the processor implements the multimodal unified feature extraction method of any one of claims 1-7.
10. A computer-readable storage medium, wherein the storage medium stores a program that is executed by a processor to implement the multi-modal unified feature extraction method of any one of claims 1-7.
CN202311434500.1A 2023-10-31 2023-10-31 Multi-mode unified feature extraction method, device, equipment and medium Active CN117370933B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311434500.1A CN117370933B (en) 2023-10-31 2023-10-31 Multi-mode unified feature extraction method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311434500.1A CN117370933B (en) 2023-10-31 2023-10-31 Multi-mode unified feature extraction method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN117370933A 2024-01-09
CN117370933B 2024-05-07

Family ID: 89402148

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311434500.1A Active CN117370933B (en) 2023-10-31 2023-10-31 Multi-mode unified feature extraction method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN117370933B (en)

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112819052A (en) * 2021-01-25 2021-05-18 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Multi-modal fine-grained mixing method, system, device and storage medium
CN113870259A (en) * 2021-12-02 2021-12-31 天津御锦人工智能医疗科技有限公司 Multi-modal medical data fusion assessment method, device, equipment and storage medium
CN114519898A (en) * 2020-11-02 2022-05-20 北京眼神智能科技有限公司 Biological characteristic multi-mode fusion recognition method and device, storage medium and equipment
CN114564593A (en) * 2022-02-21 2022-05-31 北京百度网讯科技有限公司 Completion method and device of multi-mode knowledge graph and electronic equipment
CN115545093A (en) * 2022-09-13 2022-12-30 珠海高凌信息科技股份有限公司 Multi-mode data fusion method, system and storage medium
CN115952466A (en) * 2022-06-28 2023-04-11 电子科技大学 Communication radiation source cross-mode identification method based on multi-mode information fusion
CN116344028A (en) * 2023-02-14 2023-06-27 北京深睿博联科技有限责任公司 Method and device for automatically identifying lung diseases based on multi-mode heterogeneous data
US20230206121A1 (en) * 2020-06-23 2023-06-29 Huawei Cloud Computing Technologies Co., Ltd. Modal information completion method, apparatus, and device
CN116383766A (en) * 2023-04-12 2023-07-04 平安科技(深圳)有限公司 Auxiliary diagnosis method, device, equipment and storage medium based on multi-mode data
CN116487031A (en) * 2023-04-17 2023-07-25 莆田市数字集团有限公司 Multi-mode fusion type auxiliary diagnosis method and system for pneumonia
CN116628263A (en) * 2023-06-07 2023-08-22 平安科技(深圳)有限公司 Video retrieval method and device based on multiple modes, electronic equipment and storage medium
CN116758397A (en) * 2023-06-27 2023-09-15 华东师范大学 Single-mode induced multi-mode pre-training method and system based on deep learning
CN116861363A (en) * 2023-07-12 2023-10-10 中国电信股份有限公司技术创新中心 Multi-mode feature processing method and device, storage medium and electronic equipment

Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230206121A1 (en) * 2020-06-23 2023-06-29 Huawei Cloud Computing Technologies Co., Ltd. Modal information completion method, apparatus, and device
CN114519898A (en) * 2020-11-02 2022-05-20 北京眼神智能科技有限公司 Biological characteristic multi-mode fusion recognition method and device, storage medium and equipment
CN112819052A (en) * 2021-01-25 2021-05-18 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Multi-modal fine-grained mixing method, system, device and storage medium
WO2023098524A1 (en) * 2021-12-02 2023-06-08 天津御锦人工智能医疗科技有限公司 Multi-modal medical data fusion evaluation method and apparatus, device, and storage medium
CN113870259A (en) * 2021-12-02 2021-12-31 天津御锦人工智能医疗科技有限公司 Multi-modal medical data fusion assessment method, device, equipment and storage medium
CN114564593A (en) * 2022-02-21 2022-05-31 北京百度网讯科技有限公司 Completion method and device of multi-mode knowledge graph and electronic equipment
CN115952466A (en) * 2022-06-28 2023-04-11 电子科技大学 Communication radiation source cross-mode identification method based on multi-mode information fusion
CN115545093A (en) * 2022-09-13 2022-12-30 珠海高凌信息科技股份有限公司 Multi-mode data fusion method, system and storage medium
CN116344028A (en) * 2023-02-14 2023-06-27 北京深睿博联科技有限责任公司 Method and device for automatically identifying lung diseases based on multi-mode heterogeneous data
CN116383766A (en) * 2023-04-12 2023-07-04 平安科技(深圳)有限公司 Auxiliary diagnosis method, device, equipment and storage medium based on multi-mode data
CN116487031A (en) * 2023-04-17 2023-07-25 莆田市数字集团有限公司 Multi-mode fusion type auxiliary diagnosis method and system for pneumonia
CN116628263A (en) * 2023-06-07 2023-08-22 平安科技(深圳)有限公司 Video retrieval method and device based on multiple modes, electronic equipment and storage medium
CN116758397A (en) * 2023-06-27 2023-09-15 华东师范大学 Single-mode induced multi-mode pre-training method and system based on deep learning
CN116861363A (en) * 2023-07-12 2023-10-10 中国电信股份有限公司技术创新中心 Multi-mode feature processing method and device, storage medium and electronic equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
徐曼 (XU MAN); 沈江 (SHEN JIANG); 余海燕 (YU HAIYAN): "A Review of Data-Driven Decision Support for Medicine and Health" (数据驱动的医疗与健康决策支持研究综述), Industrial Engineering and Management (工业工程与管理), no. 01, 10 February 2017 (2017-02-10) *

Also Published As

Publication number Publication date
CN117370933B (en) 2024-05-07

Similar Documents

Publication Publication Date Title
Hu et al. SA-Net: A scale-attention network for medical image segmentation
KR102177568B1 (en) Method for semi supervised reinforcement learning using data with label and data without label together and apparatus using the same
CN107273883B (en) Decision tree model training method, and method and device for determining data attributes in OCR (optical character recognition) result
Wong et al. Smartannotator an interactive tool for annotating indoor rgbd images
Ahmad et al. SiNC: Saliency-injected neural codes for representation and efficient retrieval of medical radiographs
JP2009528117A (en) Identifying image feature sets for assessing image similarity
Tang et al. Research on medical image classification based on machine learning
CN112802013B (en) Brain disease detection method and device based on graph neural network and multi-task learning
EP4173000A2 (en) Machine learning based medical data checker
CN113469981A (en) Image processing method, device and storage medium
Ogiela et al. Natural user interfaces in medical image analysis
CN115410717A (en) Model training method, data retrieval method, image data retrieval method and device
CN114003758B (en) Training method and device of image retrieval model and retrieval method and device
Lin et al. Contrastive pre-training and linear interaction attention-based transformer for universal medical reports generation
CN117370933B (en) Multi-mode unified feature extraction method, device, equipment and medium
WO2021116011A1 (en) Medical image segmentation and atlas image selection
Dastider et al. Rescovnet: A deep learning-based architecture for covid-19 detection from chest ct scan images
Gu et al. The effect of pulmonary vessel suppression on computerized detection of nodules in chest CT scans
CN112241470A (en) Video classification method and system
Han et al. Multimodal 3D convolutional neural networks for classification of brain disease using structural MR and FDG-PET images
CN116958957A (en) Training method of multi-mode feature extraction network and three-dimensional feature representation method
Kalyani et al. Deep learning-based detection and classification of adenocarcinoma cell nuclei
CN113361584B (en) Model training method and device, and pulmonary arterial hypertension measurement method and device
Mursalin et al. EpNet: A deep neural network for ear detection in 3D point clouds
CN113903433A (en) Image processing method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant