CN117636099A - Medical image and medical report pairing training model - Google Patents
- Publication number
- CN117636099A CN117636099A CN202410090308.3A CN202410090308A CN117636099A CN 117636099 A CN117636099 A CN 117636099A CN 202410090308 A CN202410090308 A CN 202410090308A CN 117636099 A CN117636099 A CN 117636099A
- Authority
- CN
- China
- Prior art keywords
- medical
- medical image
- image
- report
- training model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/761—Proximity, similarity or dissimilarity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Abstract
The invention discloses a medical image and medical report pairing training model that is trained on a set of registered medical image–medical report pairs. The comprehensive training model is trained in four steps: S1, image encoding; S2, text encoding; S3, attention-weighted image representation; S4, building the training model function. The invention automatically learns useful feature representations from medical images and medical reports. Through joint learning, the model can capture the complex relationship between medical images and report data, improving the characterization capability and information-extraction effect of the data, and it introduces modern deep learning technology into the medical field, thereby realizing integrated analysis of multi-modal medical data and providing more accurate and comprehensive support for clinical diagnosis and disease monitoring.
Description
Technical Field
The invention relates to the technical field of medical information processing, in particular to a medical image and medical report pairing training model.
Background
Image and text data in the medical field have accumulated dramatically, and these data sources cover a wide variety of information, from X-ray and MRI scans to clinical reports and medical records. These data are rich in anatomical and pathological features and also carry the clinical experience and professional judgment of physicians. However, the efficient use of such data, particularly the combination of medical images with text data, remains one of the important challenges facing the medical field.
The interpretation of medical images is critical to accurate clinical diagnosis. However, due to the variety and complexity of medical image data, accurate identification of lesions, localization of abnormalities, and analysis of anatomical structures require a great deal of experience and expertise from a physician. In addition, medical texts contain a large amount of important information about patient condition, treatment regimen, and doctor diagnosis. However, correlating and combining these two data types to extract more comprehensive information remains challenging.
In recent years, the application of deep learning techniques in the fields of medical images and texts has been increasing. However, current deep learning methods focus mainly on single-modality data analysis, ignoring the rich correlations between medical images and text. To fully exploit multimodal data, in particular paired medical image and text data, an innovative approach is needed to achieve joint analysis and comprehensive interpretation.
Disclosure of Invention
In order to solve the problems, the invention provides a medical image and medical report pairing training model, which is realized by the following technical scheme.
A medical image and medical report pairing training model employing a set of registered medical images and medical reports $\{(a_i, b_i)\}_{i=1}^{K}$ to achieve training, wherein K represents the number of paired medical images and medical reports;
wherein $a_i$ and $b_i$ represent a medical image and a medical report, respectively, W and H represent the width and length of the medical image, respectively, and C represents the number of color channels of the medical image source file;
wherein $a_i \in \mathbb{R}^{W \times H \times C}$ and $b_i$ is the text of the paired medical report;
the training steps of the comprehensive training model are as follows:
s1, image coding, namely partitioning a medical image, and coding sub-regions to obtain feature vectors of the sub-regions;
s2, text coding, namely extracting entity information in the medical report to code so as to acquire embedded representation of the entity information;
s3, attention weighted image representation, namely weighting the subareas of the medical image according to the importance of each subarea in the medical image relative to each medical report to obtain the final representation of the medical image;
s4, building a training model function.
Preferably, in the step S1, a target detection segmentation model is used to identify key entity regions and weak semantic feature regions in the medical image, and a ResNet-50 model is used as an encoder to encode the key entity regions and the weak semantic feature regions respectively, so as to obtain $f$ and $\tilde{f}$, wherein $f$ represents the feature vectors of the key entity regions and $\tilde{f}$ represents the feature vectors of the weak semantic feature regions;
wherein M represents the number of key entity regions on each medical image, and M is 5;
the global feature extracted by the ResNet-50 model in the final adaptive average pooling layer is denoted as $f_g$.
Preferably, in the step S2, entity information is extracted from the medical report by using the existing MetaMap model, and the entity information extracted from the i-th medical report $b_i$ is expressed as $e_i = \{e_{i,1}, e_{i,2}, \dots\}$, wherein $e_{i,j}$ represents the j-th piece of entity information extracted from $b_i$; subsequently the BioClinicalBERT model is used as an encoder to encode the entity information, so as to obtain embedded representations of the entity information and of the overall report, comprising:
mapping each representation into a 128-dimensional feature vector by projection mapping.
Preferably, in said step S3, the final representation $A_i$ of the medical image is:
$A_i = \lambda_1 f_{att,i} + \lambda_2 f_{g,i}$, where $\lambda_1$ and $\lambda_2$ are hyper-parameters and $\lambda_1 + \lambda_2 = 1$.
Preferably, the attention-weighted representation is
$f_{att,i} = \sum_{j=1}^{M} \alpha_{ij} f_{ij}$,
wherein $f_{att,i}$ weights the key entity regions on the i-th medical image based on the attention of the paired medical report; $\alpha_{ij}$ is the effect of the entity information in the i-th medical report on the j-th key entity region of the i-th medical image, i.e. an attention weight; and $f_{ij}$ is the feature vector of the j-th key entity region on the medical image paired with the i-th medical report.
Preferably, the attention weights are computed as
$\alpha_{ij} = \exp(s_{ij}/\tau) \big/ \sum_{k=1}^{M} \exp(s_{ik}/\tau)$,
wherein $\tau$ is a hyper-parameter; $s_{ij}$ is the similarity between the embedded representation of the entity information of the i-th report and the j-th sub-region of the i-th medical image, computed as the dot product $s_{ij} = e_i^{\top} f_{ij}$, wherein $\top$ represents the transpose of the vector.
Preferably, the training model is optimized using a weighted contrastive loss function,
wherein $w(z)$ is a weight function, $z$ is equal to $f$ or $\tilde{f}$ (i.e. a key entity feature or a weak semantic feature), and $\tau$ is a hyper-parameter.
Preferably, in the step S4, each medical image and the corresponding medical report are used as a positive sample pair, and each medical image and the other medical reports are used as negative sample pairs, finally obtaining the noise-contrastive estimation loss function of contrastive learning, which is the function of the training model:
wherein $\gamma$ is a hyper-parameter used to control the weight of weak semantic negative samples, and neg represents the set of negative sample pairs formed by each image with the other examination reports.
The invention can automatically learn useful feature representations from medical images and medical reports. Through joint learning of the image and report data, it captures the complex relationships between them, improving the characterization capability and information-extraction effect of the data, and it introduces modern deep learning technology into the medical field, thereby realizing integrated analysis of multi-modal medical data and providing more accurate and comprehensive support for clinical diagnosis and disease monitoring. The invention has the following beneficial effects:
Extraction of weak semantic features: image regions with weak semantic information are extracted explicitly, which helps the model better capture local low-level features and improves its performance when processing scenes such as medical images.
Consideration of local features: the generated weak semantic negative samples contain local features and other texture features of the target, so the local structure of the medical image can be described more comprehensively and the model can understand image details more accurately.
Enhanced semantic information: the medical report embedding is combined with the medical image representation, the image representation is weighted with an attention mechanism, and the important regions related to the text entities are further extracted. This helps the model better capture semantic information in the medical image.
Enhanced image–report association: an attention-weighted image representation is generated by computing the similarity between medical image sub-regions and the medical report. This strengthens the association between image and report and improves the model's similarity computation between medical reports and images.
Contrastive learning framework: the model is trained by contrasting the differences between positive and negative samples. This allows the model to better distinguish different samples in the feature space, improving its robustness and performance.
Drawings
In order to more clearly illustrate the technical solutions of the present invention, the drawings that are needed in the description of the specific embodiments will be briefly described below, it being obvious that the drawings in the following description are only some examples of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1: the invention relates to a process for establishing a medical image and medical report pairing training model;
fig. 2: the medical image processing flow comprises a medical image processing flow;
fig. 3: the invention relates to a medical report processing flow.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in figures 1-3 of the drawings,
example 1
A medical image and medical report pairing training model employing a set of registered medical images and medical reports $\{(a_i, b_i)\}_{i=1}^{K}$ to achieve training, wherein K represents the number of paired medical images and medical reports;
wherein $a_i$ and $b_i$ represent a medical image and a medical report, respectively, W and H represent the width and length of the medical image, respectively, and C represents the number of color channels of the medical image source file;
wherein $a_i \in \mathbb{R}^{W \times H \times C}$ and $b_i$ is the text of the paired medical report;
the training steps of the comprehensive training model are as follows:
s1, image coding, namely partitioning a medical image, and coding sub-regions to obtain feature vectors of the sub-regions;
s2, text coding, namely extracting entity information in the medical report to code so as to acquire embedded representation of the entity information;
s3, attention weighted image representation, namely weighting the subareas of the medical image according to the importance of each subarea in the medical image relative to each medical report to obtain the final representation of the medical image;
s4, building a training model function.
According to the invention, the medical report can be embedded into the medical image representation, the image representation can be weighted, and the important regions related to the entity information in the medical report can be extracted, so that the semantic information in the medical image is better captured. Weighting the sub-regions of the medical image helps to strengthen the association between the medical image and the medical report and improves the model's similarity computation between them.
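The four-step flow described above (S1 image encoding, S2 text encoding, S3 attention-weighted representation, S4 training function) can be sketched end to end. This is a minimal runnable illustration, not the patented implementation: the patent's ResNet-50 and BioClinicalBERT encoders are replaced by random stubs, and all dimensions (M = 5 regions, 128-d embeddings) follow the values stated in the text.

```python
import numpy as np

rng = np.random.default_rng(0)

D = 128   # shared embedding dimension (the patent projects features to 128-d)
M = 5     # number of key entity regions per image (M = 5 in the patent)

def encode_image_regions(image, boxes):
    """S1 stub: 'encode' each key entity region as a D-dim feature vector."""
    return np.stack([rng.standard_normal(D) for _ in boxes])   # shape (M, D)

def encode_report_entities(entities):
    """S2 stub: 'encode' extracted report entities into one D-dim embedding."""
    return rng.standard_normal(D)

def attention_weighted_representation(f_regions, e_report, f_global,
                                      lam1=0.7, lam2=0.3, tau=0.1):
    """S3: weight sub-regions by report attention; combine with global feature."""
    s = f_regions @ e_report                 # dot-product similarities, shape (M,)
    a = np.exp(s / tau)
    a = a / a.sum()                          # softmax attention weights
    f_att = a @ f_regions                    # weighted sum over regions
    return lam1 * f_att + lam2 * f_global    # final image representation A_i

image = rng.standard_normal((224, 224, 3))
boxes = [None] * M                           # placeholder detector output
f = encode_image_regions(image, boxes)
e = encode_report_entities(["pleural effusion", "cardiomegaly"])
A = attention_weighted_representation(f, e, rng.standard_normal(D))
print(A.shape)   # (128,)
```

In a full system the S4 contrastive loss would then compare each `A` against its paired and unpaired report embeddings.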
Example 2
In step S1, a target detection segmentation model is used to identify key entity regions and weak semantic feature regions in the medical image, and a ResNet-50 model is used as an encoder to encode the key entity regions and the weak semantic feature regions respectively, obtaining $f$ and $\tilde{f}$, wherein $f$ represents the feature vectors of the key entity regions and $\tilde{f}$ represents the feature vectors of the weak semantic feature regions;
wherein M represents the number of key entity regions on each medical image, and M is 5. The fixed value of M can be set manually according to task requirements, and is set to 5 in the invention, i.e. five key entity regions are selected on each medical image.
The global feature extracted by the ResNet-50 model in the final adaptive average pooling layer is denoted as $f_g$.
In the present application, the target detection segmentation model adopts an existing model such as Faster R-CNN, YOLO or U-Net. These models can identify key entity regions in medical images and help to locate anatomical structures, lesion regions and other areas of interest; the ResNet-50 model is then used as an encoder to encode the key entity regions, obtaining their feature vectors $f$.
In the medical imaging field, weak semantic feature regions refer to regions that may not be the main lesions or structures in the image, such as blood vessels, bones, organ boundaries, textures and the color distribution of the image. The weak semantic feature regions are identified with the target detection segmentation model and encoded by the ResNet-50 model to obtain their feature vectors $\tilde{f}$.
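The crop-then-encode structure of S1 can be shown with toy stand-ins. The bounding boxes below are hard-coded where a detector (Faster R-CNN / YOLO / U-Net in the patent) would propose them, and channel-wise mean pooling stands in for ResNet-50; only the shapes of $f$, $\tilde{f}$ and $f_g$ are meant to be illustrative.

```python
import numpy as np

def crop(img, box):
    """Extract a sub-region of the image given a (x0, y0, x1, y1) box."""
    x0, y0, x1, y1 = box
    return img[y0:y1, x0:x1]

def toy_encoder(patch):
    """Stand-in for ResNet-50: one feature per color channel (mean pooling)."""
    return patch.mean(axis=(0, 1))   # shape (C,)

img = np.arange(32 * 32 * 3, dtype=float).reshape(32, 32, 3)
key_boxes = [(0, 0, 8, 8), (8, 8, 16, 16), (16, 0, 24, 8),
             (0, 16, 8, 24), (24, 24, 32, 32)]   # M = 5 key entity regions
weak_boxes = [(4, 4, 12, 12)]                     # one weak semantic region

f = np.stack([toy_encoder(crop(img, b)) for b in key_boxes])        # (5, C)
f_tilde = np.stack([toy_encoder(crop(img, b)) for b in weak_boxes]) # (1, C)
f_g = toy_encoder(img)   # stand-in for the adaptive-average-pooled global feature
print(f.shape, f_tilde.shape, f_g.shape)
```

With a real ResNet-50, each crop would instead pass through the convolutional backbone and the feature dimension would be the network's, not the channel count.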
Example 3
In step S2, entity information is extracted from the medical report by using the existing MetaMap model, and the entity information extracted from the i-th medical report $b_i$ is expressed as $e_i = \{e_{i,1}, e_{i,2}, \dots\}$, wherein $e_{i,j}$ represents the j-th piece of entity information extracted from $b_i$; subsequently the BioClinicalBERT model is used as an encoder to encode the entity information, so as to obtain embedded representations of the entity information and of the overall report, comprising:
mapping each representation into a 128-dimensional feature vector by projection mapping.
This process helps better capture key semantic information in medical reports.
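The projection-mapping step can be sketched numerically. The 768-d input size matches typical BERT-family encoders such as BioClinicalBERT; the random projection matrix and the unit-normalization are assumptions for illustration (in training the matrix would be a learned parameter, and whether embeddings are normalized before the dot-product similarity is not specified in the text).

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical learned linear projection: 768-d BERT output -> 128-d shared space.
W_p = rng.standard_normal((128, 768)) / np.sqrt(768)

def project(h):
    """Map an encoder output into the 128-d feature space and unit-normalize."""
    e = W_p @ h
    return e / np.linalg.norm(e)

h_entity = rng.standard_normal(768)   # pretend BioClinicalBERT output for one entity
e_entity = project(h_entity)
print(e_entity.shape)   # (128,)
```

The same projection would be applied to the overall-report embedding so that report and image features live in one comparable space.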
Further, in step S3, the final representation $A_i$ of the medical image is:
$A_i = \lambda_1 f_{att,i} + \lambda_2 f_{g,i}$, where $\lambda_1$ and $\lambda_2$ are hyper-parameters and $\lambda_1 + \lambda_2 = 1$.
Further, the attention-weighted representation is
$f_{att,i} = \sum_{j=1}^{M} \alpha_{ij} f_{ij}$,
wherein $f_{att,i}$ weights the key entity regions on the i-th medical image based on the attention of the paired medical report; $\alpha_{ij}$ is the effect of the entity information in the i-th medical report on the j-th key entity region of the i-th medical image, i.e. an attention weight; and $f_{ij}$ is the feature vector of the j-th key entity region on the medical image paired with the i-th medical report.
Further, the attention weights are computed as
$\alpha_{ij} = \exp(s_{ij}/\tau) \big/ \sum_{k=1}^{M} \exp(s_{ik}/\tau)$,
wherein $\tau$ is a hyper-parameter; $s_{ij}$ is the similarity between the embedded representation of the entity information of the i-th report and the j-th sub-region of the i-th medical image, computed as the dot product $s_{ij} = e_i^{\top} f_{ij}$, wherein $\top$ represents the transpose of the vector.
Entity information refers to terms defined in conformity with the Unified Medical Language System (UMLS), i.e. the keyword information in the medical report.
In the present embodiment, unlike natural images, the regions of interest in medical images are often indicated only by subtle visual cues. Using global features alone may not adequately capture these regions of interest, so a different approach is adopted: a learned attention mechanism weights the key entity regions of different medical images according to their importance to the given entity information.
To generate an attention-weighted image representation based on the entity information, the similarity between all key entity regions and the entity information is first calculated using the dot-product similarity formula:
$s_{ij} = e_i^{\top} f_{ij}$.
For each medical report, an attention-weighted image representation is calculated based on its similarity to all key entity regions in the paired medical image,
$f_{att,i} = \sum_{j=1}^{M} \alpha_{ij} f_{ij}$,
where the attention weight $\alpha_{ij}$ reflects the impact of the medical report on the different key entity regions, giving the final representation of the image:
$A_i = \lambda_1 f_{att,i} + \lambda_2 f_{g,i}$.
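The S3 computation above can be walked through numerically under the reconstructed notation: $s_{ij} = e_i^{\top} f_{ij}$, $\alpha = \mathrm{softmax}(s/\tau)$, $f_{att} = \sum_j \alpha_{ij} f_{ij}$, $A_i = \lambda_1 f_{att} + \lambda_2 f_g$. The hyper-parameter values below are illustrative, not from the patent.

```python
import numpy as np

rng = np.random.default_rng(2)
M, D = 5, 128
f_regions = rng.standard_normal((M, D))   # key entity region features f_ij
e_report = rng.standard_normal(D)         # report entity embedding e_i
f_g = rng.standard_normal(D)              # global image feature
tau, lam1, lam2 = 0.1, 0.6, 0.4           # note lam1 + lam2 = 1

s = f_regions @ e_report                  # (M,) dot-product similarities s_ij
alpha = np.exp((s - s.max()) / tau)       # subtract max for numerical stability
alpha = alpha / alpha.sum()               # attention weights, sum to 1
f_att = alpha @ f_regions                 # (D,) attention-weighted image feature
A_i = lam1 * f_att + lam2 * f_g           # final representation
print(alpha.round(3), A_i.shape)
```

Subtracting the maximum before exponentiating leaves the softmax unchanged but avoids overflow when similarities divided by a small temperature become large.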
example 4
The training model is optimized using a loss function in which $w(z)$ is a weight function that dynamically adjusts the weights of the positive and weak semantic negative samples according to feature distance; this weight function can be calculated based on the distance between feature vectors. $z$ is equal to $f$ or $\tilde{f}$, and $\tau$ is a hyper-parameter.
The purpose of this loss function is to learn a better feature representation of the medical image.
Because $w(z)$ dynamically adjusts the weights according to the feature distance between samples, the model can learn weak semantic features in a more targeted way. This increases the model's sensitivity to weak semantic information, enabling better contrastive learning and feature learning, and the loss can be optimized with the standard back-propagation algorithm.
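The patent only states that $w(z)$ "can be calculated based on the distance of the feature vector"; its concrete form is not given. One plausible sketch, assuming a Gaussian kernel on the distance between a candidate feature $z$ and an anchor representation, is shown below: nearer (harder) weak-semantic negatives receive larger weight, pushing the model to discriminate them. Both the kernel and the `sigma` parameter are assumptions.

```python
import numpy as np

def weight(z, anchor, sigma=1.0):
    """Hypothetical distance-based weight w(z): Gaussian kernel on ||z - anchor||."""
    d = np.linalg.norm(z - anchor)
    return np.exp(-(d ** 2) / (2 * sigma ** 2))

anchor = np.zeros(4)
near = np.array([0.1, 0.0, 0.0, 0.0])   # a hard negative, close in feature space
far = np.array([3.0, 0.0, 0.0, 0.0])    # an easy negative, far away
w_near, w_far = weight(near, anchor), weight(far, anchor)
print(w_near > w_far)   # True: nearer samples weigh more
```

Any monotone decreasing function of distance would serve the same purpose; the choice affects how aggressively hard negatives dominate the gradient.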
Example 5
In step S4, each medical image and the corresponding medical report are used as a positive sample pair, and each medical image and the other medical reports are used as negative sample pairs, finally obtaining the noise-contrastive estimation loss function of contrastive learning, which is the function of the training model:
wherein $\gamma$ is a hyper-parameter used to control the weight of weak semantic negative samples, and neg represents the set of negative sample pairs formed by each image with the other examination reports.
The loss function here optimizes the pairing training of the medical image and the medical report to meet the task requirements of subsequent automated interpretation of medical images.
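A hedged sketch of the S4 objective: the exact formula in the patent is not reproduced (its image did not survive extraction), so the code below uses the standard InfoNCE form with a hyper-parameter `gamma` weighting the negatives. In the patent, `gamma` applies specifically to weak-semantic negatives; here it is applied uniformly to all negatives for simplicity.

```python
import numpy as np

def nce_loss(A, E, tau=0.1, gamma=0.5):
    """InfoNCE-style loss. A: (K, D) image reps; E: (K, D) report embeddings,
    where row i of A pairs with row i of E; gamma scales negative terms."""
    A = A / np.linalg.norm(A, axis=1, keepdims=True)
    E = E / np.linalg.norm(E, axis=1, keepdims=True)
    sims = A @ E.T / tau                     # (K, K) cosine similarities / tau
    losses = []
    for i in range(len(A)):
        pos = np.exp(sims[i, i])             # positive pair (image i, report i)
        neg = gamma * sum(np.exp(sims[i, j]) for j in range(len(A)) if j != i)
        losses.append(-np.log(pos / (pos + neg)))
    return float(np.mean(losses))

rng = np.random.default_rng(3)
E = rng.standard_normal((8, 16))
loss_aligned = nce_loss(E.copy(), E.copy())             # perfectly paired features
loss_random = nce_loss(rng.standard_normal((8, 16)), E) # unrelated features
print(round(loss_aligned, 3), round(loss_random, 3))
```

Minimizing this loss pulls each image representation toward its paired report embedding and pushes it away from the other reports in the batch, which is the behavior the patent's S4 describes.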
The preferred embodiments of the invention disclosed above are intended only to assist in the explanation of the invention. The preferred embodiments are not intended to be exhaustive or to limit the invention to the precise form disclosed. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention and the practical application, to thereby enable others skilled in the art to best understand and utilize the invention. The invention is limited only by the claims and the full scope and equivalents thereof.
Claims (8)
1. A medical image and medical report pairing training model, characterized in that the model employs a set of registered medical images and medical reports $\{(a_i, b_i)\}_{i=1}^{K}$ to achieve training, wherein K represents the number of paired medical images and medical reports;
wherein $a_i$ and $b_i$ represent a medical image and a medical report, respectively, W and H represent the width and length of the medical image, respectively, and C represents the number of color channels of the medical image source file;
wherein $a_i \in \mathbb{R}^{W \times H \times C}$ and $b_i$ is the text of the paired medical report;
the training steps of the comprehensive training model are as follows:
s1, image coding, namely partitioning a medical image, and coding sub-regions to obtain feature vectors of the sub-regions;
s2, text coding, namely extracting entity information in the medical report to code so as to acquire embedded representation of the entity information;
s3, attention weighted image representation, namely weighting the subareas of the medical image according to the importance of each subarea in the medical image relative to each medical report to obtain the final representation of the medical image;
s4, building a training model function.
2. The medical image and medical report pairing training model according to claim 1, wherein in step S1, a target detection segmentation model is used to identify the key entity regions and weak semantic feature regions in the medical image, and a ResNet-50 model is used as an encoder to encode the key entity regions and the weak semantic feature regions respectively, to obtain $f$ and $\tilde{f}$, wherein $f$ represents the feature vectors of the key entity regions and $\tilde{f}$ represents the feature vectors of the weak semantic feature regions;
wherein M represents the number of key entity regions on each medical image, and M is 5;
the global feature extracted by the ResNet-50 model in the final adaptive average pooling layer is denoted as $f_g$.
3. The medical image and medical report pairing training model according to claim 2, wherein in step S2, entity information is extracted from the medical report by using the existing MetaMap model, and the entity information extracted from the i-th medical report $b_i$ is expressed as $e_i = \{e_{i,1}, e_{i,2}, \dots\}$, wherein $e_{i,j}$ represents the j-th piece of entity information extracted from $b_i$; subsequently the BioClinicalBERT model is used as an encoder to encode the entity information, so as to obtain embedded representations of the entity information and of the overall report, comprising:
mapping each representation into a 128-dimensional feature vector by projection mapping.
4. A medical image and medical report pairing training model according to claim 3, characterized in that in step S3 the final representation $A_i$ of the medical image is:
$A_i = \lambda_1 f_{att,i} + \lambda_2 f_{g,i}$, where $\lambda_1$ and $\lambda_2$ are hyper-parameters and $\lambda_1 + \lambda_2 = 1$.
5. A medical image and medical report pairing training model according to claim 4, wherein the attention-weighted representation is $f_{att,i} = \sum_{j=1}^{M} \alpha_{ij} f_{ij}$,
wherein $f_{att,i}$ weights the key entity regions on the i-th medical image based on the attention of the paired medical report; $\alpha_{ij}$ is the effect of the entity information in the i-th medical report on the j-th key entity region of the i-th medical image, i.e. an attention weight; and $f_{ij}$ is the feature vector of the j-th key entity region on the medical image paired with the i-th medical report.
6. A medical image and medical report pairing training model according to claim 4, wherein the attention weights are computed as $\alpha_{ij} = \exp(s_{ij}/\tau) \big/ \sum_{k=1}^{M} \exp(s_{ik}/\tau)$,
wherein $\tau$ is a hyper-parameter; $s_{ij}$ is the similarity between the embedded representation of the entity information of the i-th report and the j-th sub-region of the i-th medical image, computed as the dot product $s_{ij} = e_i^{\top} f_{ij}$, wherein $\top$ represents the transpose of the vector.
7. The medical image and medical report pairing training model of claim 6, wherein the training model is optimized using a weighted contrastive loss function,
wherein $w(z)$ is a weight function, $z$ is equal to $f$ or $\tilde{f}$, and $\tau$ is a hyper-parameter.
8. The medical image and medical report pairing training model according to claim 7, wherein in step S4, each medical image and the corresponding medical report are used as a positive sample pair, and each medical image and the other medical reports are used as negative sample pairs, finally obtaining the noise-contrastive estimation loss function of contrastive learning, which is the function of the training model:
wherein $\gamma$ is a hyper-parameter used to control the weight of weak semantic negative samples, and neg represents the set of negative sample pairs formed by each image with the other examination reports.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410090308.3A CN117636099B (en) | 2024-01-23 | 2024-01-23 | Medical image and medical report pairing training model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117636099A true CN117636099A (en) | 2024-03-01 |
CN117636099B CN117636099B (en) | 2024-04-12 |
Family
ID=90021849
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410090308.3A Active CN117636099B (en) | 2024-01-23 | 2024-01-23 | Medical image and medical report pairing training model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117636099B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200093455A1 (en) * | 2017-03-24 | 2020-03-26 | The United States of America, as represented by the Secretary, Department of Health and | Method and system of building hospital-scale chest x-ray database for entity extraction and weakly-supervised classification and localization of common thorax diseases |
CN112992308A (en) * | 2021-03-25 | 2021-06-18 | 腾讯科技(深圳)有限公司 | Training method of medical image report generation model and image report generation method |
US20220122250A1 (en) * | 2020-10-19 | 2022-04-21 | Northwestern University | Brain feature prediction using geometric deep learning on graph representations of medical image data |
CN115861641A (en) * | 2022-10-31 | 2023-03-28 | 浙江工业大学 | Medical image report generation method based on fine-grained attention |
CN115910264A (en) * | 2022-11-10 | 2023-04-04 | 上海师范大学 | Medical image classification method, device and system based on CT and medical report |
WO2023204944A1 (en) * | 2022-04-19 | 2023-10-26 | Microsoft Technology Licensing, Llc | Training of text and image models |
CN117392473A (en) * | 2023-10-30 | 2024-01-12 | 齐鲁工业大学(山东省科学院) | Interpretable medical image classification system based on multi-modal prototype network |
CN117391092A (en) * | 2023-12-12 | 2024-01-12 | 中南大学 | Electronic medical record multi-mode medical semantic alignment method based on contrast learning |
Non-Patent Citations (1)
Title |
---|
ZIFENG WANG et al.: "MedCLIP: Contrastive Learning from Unpaired Medical Images and Text", ARXIV:2210.10163, 18 October 2022 (2022-10-18), pages 1 - 12 * |
Also Published As
Publication number | Publication date |
---|---|
CN117636099B (en) | 2024-04-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Qi et al. | Automated diagnosis of breast ultrasonography images using deep neural networks | |
Chang et al. | Thyroid segmentation and volume estimation in ultrasound images | |
WO2019062846A1 (en) | Medical image aided diagnosis method and system combining image recognition and report editing | |
Guo et al. | Multi-level semantic adaptation for few-shot segmentation on cardiac image sequences | |
Yu et al. | Early melanoma diagnosis with sequential dermoscopic images | |
WO2015106374A1 (en) | Multidimensional texture extraction method based on brain nuclear magnetic resonance images | |
CN114782307A (en) | Enhanced CT image colorectal cancer staging auxiliary diagnosis system based on deep learning | |
JP2021144675A (en) | Method and program | |
CN114266786A (en) | Gastric lesion segmentation method and system based on generation countermeasure network | |
CN113298830A (en) | Acute intracranial ICH region image segmentation method based on self-supervision | |
Liu et al. | Accurate and robust pulmonary nodule detection by 3D feature pyramid network with self-supervised feature learning | |
CN117036288A (en) | Tumor subtype diagnosis method for full-slice pathological image | |
CN117274147A (en) | Lung CT image segmentation method based on mixed Swin Transformer U-Net | |
CN117218127B (en) | Ultrasonic endoscope auxiliary monitoring system and method | |
Ruan et al. | An efficient tongue segmentation model based on u-net framework | |
Xu et al. | Application of artificial intelligence technology in medical imaging | |
Wu et al. | Human identification with dental panoramic images based on deep learning | |
CN117636099B (en) | Medical image and medical report pairing training model | |
Liu et al. | U2F-GAN: weakly supervised super-pixel segmentation in thyroid ultrasound images | |
CN115409812A (en) | CT image automatic classification method based on fusion time attention mechanism | |
CN115526898A (en) | Medical image segmentation method | |
KR102601970B1 (en) | Apparatus and method for detecting leison region and gland region in medical image | |
Diamantis et al. | This Intestine Does Not Exist: Multiscale Residual Variational Autoencoder for Realistic Wireless Capsule Endoscopy Image Generation | |
Pang et al. | Correlation matters: multi-scale fine-grained contextual information extraction for hepatic tumor segmentation | |
CN116092643A (en) | Interactive semi-automatic labeling method based on medical image |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||