CN116344028A - Method and device for automatically identifying lung diseases based on multi-mode heterogeneous data - Google Patents

Method and device for automatically identifying lung diseases based on multi-mode heterogeneous data

Info

Publication number
CN116344028A
CN116344028A CN202310123255.6A CN202310123255A CN116344028A CN 116344028 A CN116344028 A CN 116344028A CN 202310123255 A CN202310123255 A CN 202310123255A CN 116344028 A CN116344028 A CN 116344028A
Authority
CN
China
Prior art keywords
data
image
preprocessing
feature extraction
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310123255.6A
Other languages
Chinese (zh)
Inventor
俞益洲
马杰超
张树
李一鸣
乔昕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Shenrui Bolian Technology Co Ltd
Shenzhen Deepwise Bolian Technology Co Ltd
Original Assignee
Beijing Shenrui Bolian Technology Co Ltd
Shenzhen Deepwise Bolian Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Shenrui Bolian Technology Co Ltd and Shenzhen Deepwise Bolian Technology Co Ltd
Priority to CN202310123255.6A
Publication of CN116344028A
Legal status: Pending

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H30/00 ICT specially adapted for the handling or processing of medical images
    • G16H30/20 ICT specially adapted for the handling or processing of medical images for handling medical images, e.g. DICOM, HL7 or PACS
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Public Health (AREA)
  • Primary Health Care (AREA)
  • Biomedical Technology (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Epidemiology (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Radiology & Medical Imaging (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Pathology (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

The invention provides a method and a device for automatically identifying lung diseases based on multi-modal heterogeneous data. The method comprises the following steps: preprocessing unstructured text data using globally unified character embedding features; preprocessing structured text data; performing feature extraction on medical image data using an image feature extraction model to obtain image features, wherein the image feature extraction model uses a Transformer structure as its backbone model; performing relation mapping expression among vocabulary items; performing feature extraction and analysis in multiple dimensions and performing feature fusion on the multi-modal data, wherein the multi-modal data comprises the image features and the text features obtained by preprocessing the unstructured text data and the structured text data; and classifying the fused features to obtain an output result.

Description

Method and device for automatically identifying lung diseases based on multi-mode heterogeneous data
Technical Field
The invention relates to the field of computers, in particular to an automatic lung disease identification method and device based on multi-mode heterogeneous data.
Background
With the rapid development of medical informatization and the iterative upgrading of medical equipment, a vast variety of medical data has been generated, which can be roughly divided into clinical text data and image data. The text data mainly comprises structured test data, such as hemoglobin, routine urine tests and gene detection results, and unstructured text data, such as patient complaints and pathology texts recorded by doctors. The image data includes images such as ultrasound, CT, X-ray and magnetic resonance images, and signal data such as electrocardiograms and electroencephalograms. Currently, most applications of artificial intelligence in medicine use single-modality data to handle a specific task, such as single-disease diagnosis from computed tomography (CT) or retinal images. Such applications neglect the broader clinical context, which inevitably limits the potential of artificial intelligence models. In contrast, clinicians routinely process multi-modal data from multiple sources when diagnosing lung infections, performing prognostic evaluations and determining treatment plans. Medical data of different modalities provides diagnosis and treatment information about a patient from different angles, and combining the various kinds of medical information further improves the accuracy of diagnosis and treatment and brings artificial intelligence closer to clinical practice. In theory, an artificial intelligence model should also be able to use the data resources generally available to clinicians, and even resources that are unavailable to most of them (for example, most clinicians do not review thousands of multi-modal records from different regions, hospitals and departments), and integrating data of different modalities often increases the robustness and accuracy of diagnosis.
However, the information in data of different modalities is both complementary and redundant. Effectively using the complementary information among modalities to compensate for the shortcomings of each single modality, reducing the influence of redundant information among modalities, and improving the grasp of the patient's overall state is therefore a key problem for multi-modal lung disease identification in research on common lung diseases.
Disclosure of Invention
The present invention aims to provide an automatic pulmonary disease identification method and device based on multimodal heterogeneous data, which overcomes or at least partially solves the above-mentioned problems.
In order to achieve the above purpose, the technical scheme of the invention is specifically realized as follows:
One aspect of the present invention provides an automatic pulmonary disease recognition method based on multimodal heterogeneous data, comprising: preprocessing unstructured text data using globally unified character embedding features; preprocessing structured text data; performing feature extraction on medical image data using an image feature extraction model to obtain image features, wherein the image feature extraction model uses a Transformer structure as its backbone model; performing relation mapping expression among vocabulary items; performing feature extraction and analysis in multiple dimensions and performing feature fusion on the multi-modal data, wherein the multi-modal data comprises the image features and the text features obtained by preprocessing the unstructured text data and the structured text data; and classifying the fused features to obtain an output result.
Wherein preprocessing unstructured text data comprises: unstructured text data is converted using a rule-oriented structuring algorithm.
Wherein preprocessing the structured data comprises: judging whether each value lies within a reasonable interval, and normalizing data of different orders of magnitude.
The image feature extraction model comprises: a plurality of multi-layer encoders, wherein the input of each encoder first flows into a Self-Attention layer, and the convolution kernel is 16 × 16.
The image feature extraction model of the structure comprises the following steps: a symptom-based abnormality detection model and a disease-based diagnostic model, wherein the symptom-based abnormality detection model is used to feature enhance the disease-based diagnostic model.
The feature extraction and analysis of multiple dimensions are performed, and the feature fusion of the multi-mode data comprises the following steps: feature fusion is performed by using a multi-modal attention fusion mechanism, which is expressed as:
$$Y(i) = \frac{1}{C(x)} \sum_{\forall j} \theta(x_i, y_j)\, g(x_j)$$
where $Y(i)$ represents the output relating a given character to all other characters; $x_i$ and $y_j$ represent two characters in the fused vector; $i$ is the index of the position whose response is computed, and $j$ is the index that enumerates all possible positions; $\theta(x_i, y_j)$ computes the relationship between two different feature positions; $g(x_j)$ computes the feature at position $j$; and the final relationship result is normalized by the factor $1/C(x)$.
In another aspect, the present invention provides an automatic lung disease recognition apparatus based on multimodal heterogeneous data, comprising: a data structuring module for preprocessing unstructured text data using globally unified character embedding features; a data preprocessing module for preprocessing structured text data; a convolutional neural network module for performing feature extraction on medical image data using an image feature extraction model to obtain image features, wherein the image feature extraction model uses a Transformer structure as its backbone model; a text embedding module for performing relation mapping expression among vocabulary items; a feature fusion module for performing feature extraction and analysis in multiple dimensions and performing feature fusion on the multi-modal data, wherein the multi-modal data comprises the image features and the text features obtained by preprocessing the unstructured text data and the structured text data; and a classifier for classifying the fused features to obtain an output result.
The data structuring module preprocesses unstructured text data in the following mode: unstructured text data is converted using a rule-oriented structuring algorithm.
The data preprocessing module preprocesses the structured data in the following manner: judging whether each value lies within a reasonable interval, and normalizing data of different orders of magnitude.
The image feature extraction model comprises: a plurality of multi-layer encoders, wherein the input of each encoder first flows into a Self-Attention layer, and the convolution kernel is 16 × 16.
The image feature extraction model of the structure comprises the following steps: a symptom-based abnormality detection model and a disease-based diagnostic model, wherein the symptom-based abnormality detection model is used to feature enhance the disease-based diagnostic model.
The feature fusion module performs feature extraction and analysis of multiple dimensions in the following manner, and performs feature fusion on the multi-mode data: feature fusion is performed by using a multi-modal attention fusion mechanism, which is expressed as:
$$Y(i) = \frac{1}{C(x)} \sum_{\forall j} \theta(x_i, y_j)\, g(x_j)$$
where $Y(i)$ represents the output relating a given character to all other characters; $x_i$ and $y_j$ represent two characters in the fused vector; $i$ is the index of the position whose response is computed, and $j$ is the index that enumerates all possible positions; $\theta(x_i, y_j)$ computes the relationship between two different feature positions; $g(x_j)$ computes the feature at position $j$; and the final relationship result is normalized by the factor $1/C(x)$.
Therefore, by fusing text information, such as patient complaints, clinical medical records and laboratory examinations, with image information such as CT, the method and the device for automatically identifying lung diseases based on multi-modal heterogeneous data provided by the invention can identify multiple common chest diseases, their shared and distinct signs, and finely classified diseases. The multi-modal heterogeneous model for common chest diseases observes the patient from a macroscopic perspective, thereby effectively overcoming the limitations of the prior art and assisting clinicians in making more accurate judgments.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a method for automatically identifying pulmonary diseases based on multimodal heterogeneous data according to an embodiment of the present invention;
FIG. 2 is a flowchart of a specific implementation of a method for automatically identifying pulmonary diseases based on multimodal heterogeneous data according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of an automatic lung disease recognition device based on multi-mode heterogeneous data according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
The invention relates to a scheme for automatically identifying lung diseases based on multi-modal heterogeneous data. The diseases specifically include chronic obstructive pulmonary disease, bronchiectasis, pneumothorax, malignant lung tumor, emphysema, pneumonia, pulmonary tuberculosis, lung tumor, pleural effusion, interstitial lung disease and the like. The invention mainly uses heterogeneous data of several modalities, namely imaging data (CT, X-ray and the like) and clinical text data (medical records, epidemiology and statistics, and laboratory blood biochemical examinations), to perform the final identification of multiple chest diseases. On this basis, for image data, a deep neural network is used to efficiently extract image texture features and the semantic sign information present in the image (bronchial obstruction, bronchiectasis, bronchially enlarged lymph node, bronchial stenosis, pneumothorax, mediastinal lymph node enlargement, atelectasis, pneumoconiosis, bulla, lung consolidation, patchy lung shadow, streaky lung shadow, emphysema, ground-glass opacity, lung cavitation, lung cavity, reticular lung shadow, honeycomb lung shadow, hilar lymph node enlargement, pleural effusion, pleural thickening, augmentation lymph node enlargement, calcification, tumor, nodule and the like) in order to enhance the features used for image diagnosis. For clinical text data, the inputs are the patient's general clinical characteristics (age, body temperature and maximum temperature at admission), complaints, the case diagnosis and treatment process, laboratory blood biochemical tests (albumin, serum lactate dehydrogenase (LDH), indirect bilirubin, thrombin time, activated partial thromboplastin time (APTT), platelet count, C-reactive protein (CRP), white blood cells, lymphocyte count, neutrophil count, PCT, IL-6, etc.), the patient's routine epidemiological information, and the like.
Aiming at the phenomena of homonymy and heterology, the invention uses the text analysis and image processing capabilities of artificial intelligence, together with a big data platform that gathers multi-modal data from the whole diagnosis and treatment process, to construct a fine-grained, accurate diagnosis and treatment model for pulmonary diseases based on multi-modal data. For example, within the broad class of pulmonary infectious diseases, fine-grained identification covers common respiratory viruses (influenza virus, H1N1, H5N1, H7N9, respiratory syncytial virus, coronavirus), bacteria (Acinetobacter baumannii, Haemophilus influenzae, Klebsiella pneumoniae), mycotic pneumonia, pulmonary tuberculosis and the like.
The following describes a method for automatically identifying pulmonary diseases based on multimodal heterogeneous data according to an embodiment of the present invention with reference to fig. 1 and fig. 2, where the method for automatically identifying pulmonary diseases based on multimodal heterogeneous data according to the embodiment of the present invention includes:
S1, preprocessing unstructured text data using globally unified character embedding features.
Specifically, globally unified character embedding features are used to model unstructured text data, such as medical records, complaints and discharge diagnosis reports, in a unified way. For this kind of weakly structured data, data quality is critical to the performance of the system.
As an optional implementation of the embodiment of the present invention, preprocessing unstructured text data includes: converting the unstructured text data using a rule-oriented structuring algorithm. Specifically, the input data is first converted by the rule-oriented structuring algorithm so that its format is as orderly as possible; in this way, the invention establishes a structured preprocessing method for original clinical records.
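The rule-oriented structuring step can be illustrated with a minimal sketch. The patent does not disclose its rules, so the field names, the regular expressions in `FIELD_PATTERNS` and the sample note below are hypothetical, chosen only to show how free-text records might be converted into ordered fields.

```python
import re

# Hypothetical extraction rules; a real system would need rules tuned to
# the record templates of each hospital. None of these come from the patent.
FIELD_PATTERNS = {
    "age_years": re.compile(r"(\d{1,3})\s*-?\s*year-old", re.IGNORECASE),
    "chief_complaint": re.compile(r"Chief complaint:\s*(.+?)(?:\.|$)", re.IGNORECASE),
    "max_temperature_c": re.compile(
        r"(?:max(?:imum)? temperature|Tmax)\D*(\d{2}(?:\.\d)?)", re.IGNORECASE
    ),
}


def structure_record(text: str) -> dict:
    """Apply each rule to the free text; fields with no match are set to None."""
    record = {}
    for field, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        record[field] = match.group(1).strip() if match else None
    return record


note = (
    "A 63-year-old man. Chief complaint: cough and fever for 5 days. "
    "Maximum temperature 38.9 C at admission."
)
structured = structure_record(note)
```

Keeping unmatched fields as `None` rather than dropping them gives every record the same keys, which makes the later fusion step simpler.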
S2, preprocessing the structured text data.
Specifically, data that is already structured, such as epidemiological information and laboratory blood biochemical examinations, also needs to be preprocessed.
As an optional implementation of the embodiment of the present invention, preprocessing the structured data includes: judging whether each value lies within a reasonable interval, and normalizing data of different orders of magnitude. Specifically, values such as the white blood cell count and platelet count are checked to determine whether they lie within a reasonable interval, and data of different orders of magnitude is normalized.
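A minimal sketch of the interval check and normalization might look as follows. The patent gives neither the intervals nor the normalization formula, so the limits in `PLAUSIBLE_RANGES` are illustrative placeholders (not clinical reference ranges) and min-max scaling is an assumed choice.

```python
# Hypothetical plausibility limits, for illustration only.
PLAUSIBLE_RANGES = {
    "wbc_10e9_per_l": (0.1, 100.0),
    "platelets_10e9_per_l": (1.0, 2000.0),
    "crp_mg_per_l": (0.0, 500.0),
}


def preprocess_labs(labs: dict) -> dict:
    """Reject values outside the plausible interval, then min-max scale to [0, 1]."""
    features = {}
    for name, (low, high) in PLAUSIBLE_RANGES.items():
        value = labs.get(name)
        if value is None or not (low <= value <= high):
            features[name] = None  # missing or implausible, e.g. an entry error
            continue
        features[name] = (value - low) / (high - low)
    return features


scaled = preprocess_labs(
    {"wbc_10e9_per_l": 7.5, "platelets_10e9_per_l": 250.0, "crp_mg_per_l": 9000.0}
)
```

After scaling, a platelet count in the hundreds and a white blood cell count in the single digits both fall in the same [0, 1] range, so neither dominates the fused feature vector purely because of its unit.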
S3, performing feature extraction on the medical image data using an image feature extraction model to obtain image features, wherein the image feature extraction model uses a Transformer structure as its backbone model.
As an optional implementation of the embodiment of the present invention, the image feature extraction model includes: a plurality of multi-layer encoders, wherein the input of each encoder first flows into a Self-Attention layer, and the convolution kernel is 16 × 16. Specifically, for medical images, the invention uses an image feature extraction model with a Transformer structure as its backbone. The model is composed of a plurality of multi-layer encoders, and the input of each encoder first flows into the Self-Attention layer. In the present invention, the image is first passed through a 16 × 16 convolution kernel to generate a sequence of tokens, which the encoders then encode. The resulting image features are stored in an intelligent multi-modal database for subsequent fusion with text features.
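The tokenization by a 16 × 16 convolution kernel can be sketched in NumPy. The text states only the kernel size; the stride of 16 (non-overlapping patches, as in common Vision Transformer designs), the single-channel 224 × 224 input and the embedding width of 64 are assumptions made for this illustration.

```python
import numpy as np


def patch_embed(image, kernel, patch=16):
    """Convolve with a patch x patch kernel at stride `patch` (assumed stride).

    image:  (H, W) single-channel slice
    kernel: (patch, patch, D) convolution weights, D = token width
    returns (num_patches, D) token sequence
    """
    h, w = image.shape
    rows, cols = h // patch, w // patch
    image = image[: rows * patch, : cols * patch]  # crop to a whole number of patches
    # Split into non-overlapping patches: (rows, cols, patch, patch)
    patches = image.reshape(rows, patch, cols, patch).transpose(0, 2, 1, 3)
    patches = patches.reshape(rows * cols, patch * patch)
    # A strided convolution equals one linear projection per patch
    return patches @ kernel.reshape(patch * patch, -1)


rng = np.random.default_rng(0)
ct_slice = rng.standard_normal((224, 224))        # stand-in for a CT slice
kernel = rng.standard_normal((16, 16, 64)) * 0.02  # random, untrained weights
tokens = patch_embed(ct_slice, kernel)             # 14 x 14 = 196 tokens
```

Each row of `tokens` corresponds to one 16 × 16 image region, which is the form in which a Self-Attention layer can relate regions to one another.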
As an optional implementation manner of the embodiment of the present invention, the image feature extraction model of the structure includes: a symptom-based abnormality detection model and a disease-based diagnostic model, wherein the symptom-based abnormality detection model is used to feature enhance the disease-based diagnostic model. Specifically, the present invention designs a symptom-based abnormality detection model and a disease-based diagnostic model: the chest disease diagnosis thinking of clinical specialists is simulated, namely firstly, the abnormal region appearing in the image is judged, and the final disease diagnosis is carried out on the patient based on the abnormal priori knowledge appearing on the patient. In particular, for disease diagnosis of images, final model diagnosis is made on the premise of referring to characteristic results detected by abnormality, a diagnosis model is enhanced by using a network model based on multiple symptoms, and the diagnosis is finally made by identifying the symptoms of the images.
S4, carrying out relation mapping expression among vocabularies.
Specifically, in conventional NLP, words are treated as discrete symbols and represented by one-hot vectors. However, different words are related in meaning, so a relation mapping between the context and the target word needs to be found, and this complex contextual relationship is expressed by a network model. In the invention, Chinese BERT is used to perform the relation mapping expression among vocabulary items, thereby obtaining the text features to be fused with the image features.
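The limitation of one-hot vectors can be seen in a small example. The three-word vocabulary and the dense vectors below are hand-written toy values standing in for learned embeddings; they are not outputs of Chinese BERT.

```python
import numpy as np

vocab = ["cough", "fever", "fracture"]


def one_hot(word):
    """Discrete-symbol representation: a 1 at the word's index, 0 elsewhere."""
    vec = np.zeros(len(vocab))
    vec[vocab.index(word)] = 1.0
    return vec


def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Toy dense vectors (assumed values) in which related words lie close together
dense = {
    "cough": np.array([0.9, 0.8, 0.1]),
    "fever": np.array([0.8, 0.9, 0.2]),
    "fracture": np.array([0.1, 0.0, 0.9]),
}

one_hot_sim = cosine(one_hot("cough"), one_hot("fever"))
dense_sim_related = cosine(dense["cough"], dense["fever"])
dense_sim_unrelated = cosine(dense["cough"], dense["fracture"])
```

Any two distinct one-hot vectors are orthogonal, so they carry no notion of relatedness, whereas a dense embedding can place "cough" nearer to "fever" than to "fracture".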
S5, performing feature extraction and analysis in multiple dimensions and performing feature fusion on the multi-modal data, wherein the multi-modal data comprises the image features and the text features obtained by preprocessing the unstructured text data and the structured text data.
Specifically, feature extraction and analysis in multiple dimensions are performed in the intelligent multi-modal feature database: the data belonging to the same case is matched, and feature fusion is performed on the multi-modal data. First, a one-dimensional convolution operation and a Dropout operation are applied in sequence to the image features and the text features; the convolution layer extracts the main features, and the Dropout layer prevents the network model from over-fitting and reduces computational complexity. Then, the resulting image features and the corresponding text features are concatenated along the vector dimension, that is, they are fused at the feature level.
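The convolution, Dropout and concatenation sequence can be sketched as below. The feature lengths, the smoothing kernel and the dropout rate are illustrative assumptions, and inverted dropout is one common formulation rather than necessarily the one used in the invention.

```python
import numpy as np


def conv1d_valid(x, kernel):
    """One-dimensional valid cross-correlation of a feature vector with a kernel."""
    k = len(kernel)
    return np.array([x[i:i + k] @ kernel for i in range(len(x) - k + 1)])


def dropout(x, rate, rng, training=True):
    """Inverted dropout: zero a fraction `rate` of entries and rescale the rest."""
    if not training or rate == 0.0:
        return x
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)


def fuse(image_feat, text_feat, kernel, rate, rng, training=True):
    """Conv1D then Dropout on each modality, then concatenate (feature-level fusion)."""
    img = dropout(conv1d_valid(image_feat, kernel), rate, rng, training)
    txt = dropout(conv1d_valid(text_feat, kernel), rate, rng, training)
    return np.concatenate([img, txt])


rng = np.random.default_rng(0)
image_feat = rng.standard_normal(64)    # assumed image feature length
text_feat = rng.standard_normal(32)     # assumed text feature length
kernel = np.array([0.25, 0.5, 0.25])    # assumed fixed kernel; normally learned
fused = fuse(image_feat, text_feat, kernel, rate=0.1, rng=rng, training=False)
```

With these sizes the fused vector has 62 + 30 = 92 entries, image part first, and Dropout is active only when `training=True`.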
As an optional implementation manner of the embodiment of the present invention, performing feature extraction and analysis of multiple dimensions, and performing feature fusion on multi-modal data includes: feature fusion is performed by using a multi-modal attention fusion mechanism, which is expressed as:
$$Y(i) = \frac{1}{C(x)} \sum_{\forall j} \theta(x_i, y_j)\, g(x_j)$$
where $Y(i)$ represents the output relating a given character to all other characters; $x_i$ and $y_j$ represent two characters in the fused vector; $i$ is the index of the position whose response is computed, and $j$ is the index that enumerates all possible positions; $\theta(x_i, y_j)$ computes the relationship between two different feature positions; $g(x_j)$ computes the feature at position $j$; and the final relationship result is normalized by the factor $1/C(x)$.
In particular, the present invention utilizes multiple focused attention mechanisms to achieve relevance information extraction between images and text. The multimodal attention fusion mechanism can be expressed as:
$$Y(i) = \frac{1}{C(x)} \sum_{\forall j} \theta(x_i, y_j)\, g(x_j)$$
where $Y(i)$ represents the output relating a given token to all other tokens; $x_i$ and $y_j$ represent two tokens in the fused vector; $i$ is the index of the position whose response is computed, and $j$ is the index that enumerates all possible positions; $\theta(x_i, y_j)$ computes the relationship between two different feature positions; $g(x_j)$ computes the feature at position $j$; and the final relationship result is normalized by the factor $1/C(x)$.
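The fusion formula, in which the output at position i is the sum over j of θ(x_i, y_j) · g(x_j) normalized by 1/C(x), can be sketched numerically. The text does not fix the form of θ, g or C(x); the sketch assumes θ is the exponential of a dot product, g is a linear map with a hypothetical weight matrix `w_g`, and C(x) is the sum of θ over j, which together amount to a softmax attention. It also takes both x and y from the same fused token sequence, following the statement that they are two tokens in the fused vector.

```python
import numpy as np


def attention_fusion(x, y, w_g):
    """Y(i) = (1 / C(x)) * sum_j theta(x_i, y_j) * g(x_j).

    x, y: (N, d) token features; w_g: (d, d_out) weights of the assumed linear g.
    """
    scores = x @ y.T                                      # relation logits, (N, N)
    scores = scores - scores.max(axis=1, keepdims=True)   # for numerical stability
    theta = np.exp(scores)                                # assumed form of theta
    c = theta.sum(axis=1, keepdims=True)                  # assumed normalizer C(x)
    g = x @ w_g                                           # feature at each position j
    return (theta / c) @ g


rng = np.random.default_rng(0)
image_tokens = rng.standard_normal((4, 8))   # stand-in image features
text_tokens = rng.standard_normal((3, 8))    # stand-in text features
fused_tokens = np.concatenate([image_tokens, text_tokens], axis=0)
w_g = rng.standard_normal((8, 8))
Y = attention_fusion(fused_tokens, fused_tokens, w_g)
```

Because image and text tokens sit in one sequence, each output row mixes information from both modalities, weighted by how strongly each pair of positions is related.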
Therefore, the features in the image and the features in the text data can each be regarded as nodes in the Transformer network, so that the multiple modalities can be learned directly and simultaneously in real time, yielding an efficient learned representation of the heterogeneous clinical data.
S6, classifying the fused features to obtain an output result.
Specifically, a plurality of classification rules are used to classify a plurality of diseases, and the classification error is back-propagated to optimize the classifier. The diagnosis extracted from the discharge report is taken as the gold standard, and common lung diseases are diagnosed by feature analysis across multiple dimensions.
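Back-propagating a classification error to optimize a classifier over several diseases can be illustrated with a single linear layer. The loss is not specified in the text; the sketch assumes a multi-label setup with sigmoid outputs and binary cross-entropy, and uses random stand-ins for the fused features and the discharge-report labels.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def bce_loss(probs, labels):
    """Mean binary cross-entropy over all samples and disease labels."""
    eps = 1e-12
    return float(-np.mean(labels * np.log(probs + eps)
                          + (1 - labels) * np.log(1 - probs + eps)))


def train_step(w, b, feats, labels, lr=0.5):
    """One gradient-descent step: back-propagate the classification error."""
    probs = sigmoid(feats @ w + b)
    grad_logits = (probs - labels) / labels.size  # d(mean BCE) / d(logits)
    w = w - lr * feats.T @ grad_logits
    b = b - lr * grad_logits.sum(axis=0)
    return w, b


rng = np.random.default_rng(0)
feats = rng.standard_normal((16, 10))                # stand-in fused features
labels = (rng.random((16, 3)) > 0.5).astype(float)   # stand-in gold-standard labels
w, b = np.zeros((10, 3)), np.zeros(3)

loss_before = bce_loss(sigmoid(feats @ w + b), labels)
for _ in range(100):
    w, b = train_step(w, b, feats, labels)
loss_after = bce_loss(sigmoid(feats @ w + b), labels)
```

One sigmoid output per disease lets a patient carry several diagnoses at once, which fits the stated aim of detecting multiple lung diseases simultaneously; a softmax layer would be the alternative if the diagnoses were mutually exclusive.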
Therefore, the invention provides a method and device for automatically detecting and identifying common pulmonary diseases from heterogeneous in-hospital clinical data, including multi-modal image data and text data such as electronic clinical case reports, laboratory blood examinations and imaging reports. It combines features from biochemical indicators and image detection, and provides several data fusion strategies based on the multi-modal data, which improves the model's ability to diagnose a patient from multiple angles and gives clinicians more comprehensive support in diagnosis.
For example, respiratory infections often have similar clinical symptoms, signs, laboratory findings and imaging manifestations. Typical clinical manifestations are: fever, cough, chest tightness, shortness of breath and dyspnea; throat congestion and dry or moist rales in the lungs; increased heart rate; decreased blood oxygen saturation; and so on. Laboratory tests for the same type of pathogen also share certain similarities. For viral infections, for example: normal total white blood cell (WBC) count on routine blood tests, decreased lymphocyte count, increased C-reactive protein (CRP), normal procalcitonin (PCT), increased lactate dehydrogenase (LDH), and so on. Pulmonary imaging findings include interstitial or ground-glass changes in both lungs. Building on the identification of common chest diseases, the invention uses multi-modal data as far as possible to perform identification at fine classification granularity, thereby providing further support for clinicians in diagnosis.
Therefore, the invention achieves automatic detection of various common lung diseases. Based on clinical data from different departments, it combines medical records and images so that images can be analyzed under the guidance of doctors, which prevents the imaging technique from attending to too many irrelevant regions and addresses the identification and diagnosis problem accurately. For image data, the input is a chest CT image or X-ray image, and a deep neural network is used to efficiently extract image texture features and the semantic sign information present in the image. For clinical text data, the inputs are the patient's general clinical characteristics (age, body temperature and highest temperature at admission), complaints and the case diagnosis and treatment process, laboratory blood biochemical examinations (albumin, serum lactate dehydrogenase (LDH), indirect bilirubin, thrombin time, activated partial thromboplastin time (APTT), platelet count, C-reactive protein (CRP), white blood cells, lymphocyte count, neutrophil count, PCT, IL-6, etc.), the patient's routine epidemiological information, and so on; the text data is first structured, and feature analysis is then performed on the text using a natural language processing algorithm.
By the invention, more than 10 lung typical abnormal symptoms or diseases can be detected simultaneously. The detection system can assist doctors to obtain higher sensitivity in focus discovery, and can provide more comprehensive support for clinicians in diagnosis.
Fig. 3 is a schematic structural diagram of a device for automatically identifying pulmonary diseases based on multimodal heterogeneous data according to an embodiment of the present invention. The device applies the method described above, so only its structure is briefly described below; for other details, refer to the description of the method above. Referring to Fig. 3, the device according to the embodiment of the present invention includes:
the data structuring module is used for preprocessing unstructured text data by using globally unified character embedding features;
the data preprocessing module is used for preprocessing the structured text data;
the convolutional neural network module is used for carrying out feature extraction on medical image data by using an image feature extraction model of the structure to obtain image features, wherein the image feature extraction model of the structure uses a transformer structure as a trunk model;
the text embedding module is used for carrying out relation mapping expression among vocabularies;
the feature fusion module is used for carrying out feature extraction and analysis of multiple dimensions and carrying out feature fusion on multi-modal data, wherein the multi-modal data comprises text features and image features, and the text features are obtained by preprocessing the unstructured text data and preprocessing the structured text data;
and the classifier is used for classifying the fused features to obtain an output result.
As an optional implementation of the embodiment of the present invention, the data structuring module preprocesses unstructured text data in the following manner: the unstructured text data is converted using a rule-oriented structuring algorithm.
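As an illustration of what a rule-oriented structuring step could look like, the following minimal sketch extracts a few numeric fields from a free-text note with regular expressions. The field names, the patterns, and the sample note are all hypothetical; the text above does not specify which rules are used.

```python
import re

# Hypothetical rules: each maps a field name to a regex with one numeric capture group.
RULES = {
    "age": re.compile(r"(\d{1,3})\s*(?:years? old|y/o)", re.IGNORECASE),
    "temperature_c": re.compile(r"temperature\s*(?:of|:)?\s*(\d{2}(?:\.\d)?)", re.IGNORECASE),
    "crp_mg_l": re.compile(r"\bCRP\s*(?:of|:)?\s*(\d+(?:\.\d+)?)", re.IGNORECASE),
}


def structure_note(note: str) -> dict:
    """Convert a free-text note into a flat record; fields with no match become None."""
    record = {}
    for field, pattern in RULES.items():
        match = pattern.search(note)
        record[field] = float(match.group(1)) if match else None
    return record


note = "Patient is 63 years old, temperature 38.5 on admission, CRP: 42.1 mg/L."
print(structure_note(note))
```

Keeping the rules in a table makes it straightforward to add or audit fields without touching the extraction loop.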
As an optional implementation of the embodiment of the present invention, the data preprocessing module preprocesses structured data in the following manner: determining whether the preset value lies within a reasonable interval, and normalizing data of different orders of magnitude.
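The interval check and normalization can be sketched as follows. The intervals below are placeholder values chosen for illustration (they are not clinical reference ranges and do not come from the text above), and min-max scaling is one assumed choice of normalization among several.

```python
# Hypothetical plausibility intervals, for illustration only (not clinical guidance).
PLAUSIBLE_RANGES = {
    "temperature_c": (30.0, 45.0),
    "platelets_1e9_per_l": (1.0, 2000.0),
    "crp_mg_l": (0.0, 500.0),
}


def preprocess(record: dict) -> dict:
    """Reject values outside their interval, then min-max scale the rest to [0, 1]."""
    out = {}
    for field, (low, high) in PLAUSIBLE_RANGES.items():
        value = record.get(field)
        if value is None or not (low <= value <= high):
            out[field] = None  # treat implausible or absent values as missing
        else:
            out[field] = (value - low) / (high - low)
    return out


result = preprocess({"temperature_c": 37.5, "platelets_1e9_per_l": 250.0, "crp_mg_l": 900.0})
print(result)
```

Scaling every field to the same range keeps quantities of very different magnitude, such as platelet counts and body temperature, from dominating one another downstream.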
As an optional implementation of the embodiment of the present invention, the image feature extraction model of the structure includes a plurality of multi-layer encoders, wherein the input of each encoder first flows into a Self-Attention layer, and the convolution kernel size is 16 × 16.
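A minimal NumPy sketch of these two ingredients follows. It assumes a single-channel image, a stride equal to the kernel size (so the 16 × 16 convolution reduces to cutting the image into non-overlapping patches and projecting each one), and a single attention head with random weights; none of these details are stated in the text above.

```python
import numpy as np


def patch_embed(image, weight, patch=16):
    """Project non-overlapping patch x patch tiles of a 2-D image to token vectors.

    Equivalent to a convolution whose kernel size and stride both equal `patch`.
    image: (H, W); weight: (patch * patch, dim).
    """
    h, w = image.shape
    gh, gw = h // patch, w // patch
    patches = (
        image[: gh * patch, : gw * patch]
        .reshape(gh, patch, gw, patch)
        .transpose(0, 2, 1, 3)
        .reshape(gh * gw, patch * patch)
    )
    return patches @ weight


def self_attention(tokens, wq, wk, wv):
    """Single-head scaled dot-product self-attention over a token sequence."""
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v


rng = np.random.default_rng(0)
image = rng.standard_normal((64, 64))  # stand-in for one image slice
dim = 32
tokens = patch_embed(image, rng.standard_normal((256, dim)) * 0.02)
out = self_attention(tokens, *(rng.standard_normal((dim, dim)) * 0.1 for _ in range(3)))
print(tokens.shape, out.shape)  # (16, 32) (16, 32)
```

A full encoder would add multiple heads, residual connections, layer normalization and a feed-forward block after the attention layer.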
As an optional implementation of the embodiment of the present invention, the image feature extraction model of the structure includes: a symptom-based abnormality detection model and a disease-based diagnostic model, wherein the symptom-based abnormality detection model is used to perform feature enhancement on the disease-based diagnostic model.
As an optional implementation of the embodiment of the present invention, the feature fusion module performs feature extraction and analysis of multiple dimensions and performs feature fusion on the multi-modal data in the following manner: feature fusion is performed by using a multi-modal attention fusion mechanism, which is expressed as:
Figure BDA0004080790870000081
where Y(i) represents the output describing the relationship between a given character and all other characters; x_i and y_j denote two of the character vectors being fused; i is the index of the position whose response is to be computed, and j is the index that enumerates all possible positions; θ(x_i, y_j) computes the relationship between two different feature positions; g(x_j) computes the feature at position j; and the final relationship result is normalized by the factor 1/C(x).
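The formula itself appears only as an image reference in this text. One reading consistent with the symbol definitions above is Y(i) = (1/C(x)) · Σ_j θ(x_i, y_j) · g(x_j); this is an assumption, not the published formula. The sketch below implements that reading with further assumed choices: θ is the exponential of a dot product, C(x) is the sum of θ over j for each i, g is a linear map, and x and y have the same number of positions and the same dimension.

```python
import numpy as np


def attention_fusion(x, y, w_g):
    """Fuse two feature sequences under the assumed reading of the fusion formula.

    x, y: (N, d) features from two modalities; w_g: (d, d_out) weights for g.
    Returns Y with shape (N, d_out).
    """
    logits = x @ y.T                           # pairwise dot products, shape (N, N)
    logits -= logits.max(axis=1, keepdims=True)  # per-row shift cancels after normalization
    theta = np.exp(logits)                     # theta(x_i, y_j), assumed exponential form
    c = theta.sum(axis=1, keepdims=True)       # C(x): normalization factor for each i
    g = x @ w_g                                # g(x_j), assumed linear
    return (theta / c) @ g                     # Y(i) = (1/C) * sum_j theta_ij * g_j


rng = np.random.default_rng(0)
image_feats = rng.standard_normal((8, 16))  # hypothetical image-side features
text_feats = rng.standard_normal((8, 16))   # hypothetical text-side features
fused = attention_fusion(image_feats, text_feats, rng.standard_normal((16, 16)) * 0.1)
print(fused.shape)  # (8, 16)
```

With this choice of θ and C(x), each row of weights is a softmax, so every Y(i) is a weighted average of the g(x_j) values.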
Therefore, the invention provides a method and device for automatically detecting and identifying common lung diseases from the heterogeneous data found in hospital clinical practice, including multi-modal image data and text data such as clinical reports in electronic medical records, laboratory blood examinations and imaging reports. It incorporates features of both biochemical signs and image detection, and provides several data fusion strategies based on the multi-modal data, which improves the model's ability to diagnose a patient from multiple angles and gives clinicians more comprehensive support in diagnosis.
For example, respiratory infections often have similar clinical symptoms, signs, laboratory tests, and imaging manifestations. Typical clinical manifestations are: fever, cough, chest tightness, shortness of breath and dyspnea; pharyngeal congestion, dry or moist rales in the lungs; increased heart rate; decreased blood oxygen saturation, etc. Laboratory tests for the same pathogen also share certain similarities; for viral infections, for example: normal total white blood cell (WBC) count in routine blood tests, decreased total lymphocyte count, increased C-reactive protein (CRP), normal procalcitonin (PCT), increased lactate dehydrogenase (LDH), etc. Pulmonary imaging suggests: interstitial or ground-glass changes in both lungs, etc. Building on research into the identification of common chest diseases, multi-modal data is used as fully as possible to perform identification at a fine classification granularity, thereby providing further support for clinicians in diagnosis.
Therefore, the invention provides automatic detection of a variety of common lung diseases. Based on clinical data from different departments, it can analyze images under the guidance of doctors by combining medical records with images, which avoids the excessive attention to irrelevant regions typical of image-only techniques and addresses the differential diagnosis problem more accurately. For image data, the input is a chest CT image or an X-ray image, and a deep neural network is used to efficiently extract the texture features and semantic sign information present in the image. For clinical text data, the input includes the patient's general clinical characteristics (age, body temperature and highest temperature at admission), the chief complaint and the diagnosis and treatment course recorded in the medical record, laboratory blood biochemical examinations (albumin, serum lactate dehydrogenase (LDH), indirect bilirubin, thrombin time, activated partial thromboplastin time (APTT), platelet count, C-reactive protein (CRP), white blood cells, lymphocyte count, neutrophil count, PCT, IL-6, etc.), routine epidemiological information about the patient, and so on; the text data is first structured, and feature analysis is then performed on the text using a natural language processing algorithm.
With the invention, more than 10 typical abnormal lung signs or diseases can be detected simultaneously. The detection system can help doctors achieve higher sensitivity in lesion detection and can give clinicians more comprehensive support in diagnosis.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and changes may be made to the present application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc. which are within the spirit and principles of the present application are intended to be included within the scope of the claims of the present application.

Claims (12)

1. An automatic lung disease identification method based on multi-modal heterogeneous data, which is characterized by comprising the following steps:
preprocessing unstructured text data by using a global unified character embedding feature;
preprocessing the structured text data;
performing feature extraction on medical image data by using an image feature extraction model of a structure to obtain image features, wherein the image feature extraction model of the structure uses a transformer structure as a trunk model;
carrying out relation mapping expression among vocabularies;
performing feature extraction and analysis of multiple dimensions, and performing feature fusion on multi-modal data, wherein the multi-modal data comprises text features and image features, wherein the text features are obtained by preprocessing unstructured text data and preprocessing structured text data;
and classifying the fused features to obtain an output result.
2. The method of claim 1, wherein the preprocessing unstructured text data comprises:
the unstructured text data is converted using a rule-oriented structuring algorithm.
3. The method of claim 1, wherein the preprocessing of structured data comprises:
determining whether the preset value lies within a reasonable interval, and normalizing data of different orders of magnitude.
4. The method of claim 1, wherein the image feature extraction model of the structure comprises: a plurality of multi-layer encoders, wherein the input of each encoder first flows into a Self-Attention layer, and the convolution kernel is a 16 × 16 convolution kernel.
5. The method of claim 1, wherein the image feature extraction model of the structure comprises:
a symptom-based anomaly detection model and a disease-based diagnostic model, wherein the symptom-based anomaly detection model is used to perform feature enhancement on the disease-based diagnostic model.
6. The method of claim 1, wherein the performing feature extraction and analysis of multiple dimensions and performing feature fusion on the multi-modal data comprises:
feature fusion is performed by using a multi-modal attention fusion mechanism, which is expressed as:
Figure FDA0004080790860000011
where Y(i) represents the output describing the relationship between a given character and all other characters; x_i and y_j denote two of the character vectors being fused; i is the index of the position whose response is to be computed, and j is the index that enumerates all possible positions; θ(x_i, y_j) computes the relationship between two different feature positions; g(x_j) computes the feature at position j; and the final relationship result is normalized by the factor 1/C(x).
7. An automatic lung disease recognition device based on multi-modal heterogeneous data, comprising:
the data structuring module is used for preprocessing unstructured text data by using global unified character embedding characteristics;
the data preprocessing module is used for preprocessing the structured text data;
the convolutional neural network module is used for carrying out feature extraction on medical image data by using an image feature extraction model of a structure to obtain image features, wherein the image feature extraction model of the structure uses a transformer structure as a trunk model;
the text embedding module is used for carrying out relation mapping expression among vocabularies;
the feature fusion module is used for carrying out feature extraction and analysis of multiple dimensions and carrying out feature fusion on multi-mode data, wherein the multi-mode data comprises text features and image features, wherein the text features are obtained by preprocessing unstructured text data and preprocessing structured text data;
and the classifier is used for classifying the fused features to obtain an output result.
8. The apparatus of claim 7, wherein the data structuring module pre-processes unstructured text data by:
the unstructured text data is converted using a rule-oriented structuring algorithm.
9. The apparatus of claim 7, wherein the data preprocessing module preprocesses structured data by:
determining whether the preset value lies within a reasonable interval, and normalizing data of different orders of magnitude.
10. The apparatus of claim 7, wherein the image feature extraction model of the structure comprises: a plurality of multi-layer encoders, wherein the input of each encoder first flows into a Self-Attention layer, and the convolution kernel is a 16 × 16 convolution kernel.
11. The apparatus of claim 7, wherein the image feature extraction model of the structure comprises:
a symptom-based anomaly detection model and a disease-based diagnostic model, wherein the symptom-based anomaly detection model is used to perform feature enhancement on the disease-based diagnostic model.
12. The apparatus of claim 7, wherein the feature fusion module performs feature extraction and analysis of multiple dimensions and performs feature fusion on the multi-modal data by:
feature fusion is performed by using a multi-modal attention fusion mechanism, which is expressed as:
Figure FDA0004080790860000021
where Y(i) represents the output describing the relationship between a given character and all other characters; x_i and y_j denote two of the character vectors being fused; i is the index of the position whose response is to be computed, and j is the index that enumerates all possible positions; θ(x_i, y_j) computes the relationship between two different feature positions; g(x_j) computes the feature at position j; and the final relationship result is normalized by the factor 1/C(x).
CN202310123255.6A 2023-02-14 2023-02-14 Method and device for automatically identifying lung diseases based on multi-mode heterogeneous data Pending CN116344028A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310123255.6A CN116344028A (en) 2023-02-14 2023-02-14 Method and device for automatically identifying lung diseases based on multi-mode heterogeneous data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310123255.6A CN116344028A (en) 2023-02-14 2023-02-14 Method and device for automatically identifying lung diseases based on multi-mode heterogeneous data

Publications (1)

Publication Number Publication Date
CN116344028A true CN116344028A (en) 2023-06-27

Family

ID=86883025

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310123255.6A Pending CN116344028A (en) 2023-02-14 2023-02-14 Method and device for automatically identifying lung diseases based on multi-mode heterogeneous data

Country Status (1)

Country Link
CN (1) CN116344028A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117370933A (en) * 2023-10-31 2024-01-09 中国人民解放军总医院 Multi-mode unified feature extraction method, device, equipment and medium
CN117370933B (en) * 2023-10-31 2024-05-07 中国人民解放军总医院 Multi-mode unified feature extraction method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination